GET DATA: Get rid of lex_put_back().

diff --git a/doc/language.texi b/doc/language.texi

index d71ecc8a3868d73a4f27d9fa3bdc70edbd6b5cb7..78d38acdc6f0813fe2e1666fa7ddfa8b5bb69c31 100644
--- a/doc/language.texi
+++ b/doc/language.texi
@@ -3,18 +3,13 @@
  @cindex language, PSPP
  @cindex PSPP, language
  
-@quotation
-@strong{Please note:} PSPP is not even close to completion.
-Only a few statistical procedures are implemented.  PSPP
-is a work in progress.
-@end quotation
-
  This chapter discusses elements common to many PSPP commands.
  Later chapters will describe individual commands in detail.
  
  @menu
  * Tokens::                      Characters combine to form tokens.
  * Commands::                    Tokens combine to form commands.
+* Syntax Variants::             Batch vs. Interactive mode
  * Types of Commands::           Commands come in several flavors.
  * Order of Commands::           Commands combine to form syntax files.
  * Missing Observations::        Handling missing observations.
@@ -24,6 +19,7 @@ Later chapters will describe individual commands in detail.
  * BNF::                         How command syntax is described.
  @end menu
  
+
  @node Tokens
  @section Tokens
  @cindex language, lexical analysis
@@ -116,8 +112,7 @@ significant inside strings.
  
  Strings can be concatenated using @samp{+}, so that @samp{"a" + 'b' +
  'c'} is equivalent to @samp{'abc'}.  Concatenation is useful for
-splitting a single string across multiple source lines. The maximum
-length of a string, after concatenation, is 255 characters.
+splitting a single string across multiple source lines.
  
  Strings may also be expressed as hexadecimal, octal, or binary
  character values by prefixing the initial quote character by @samp{X},
@@ -152,11 +147,6 @@ punctuator only as the last character on a line (except white space).
  When it is the last non-space character on a line, a period is not
  treated as part of another token, even if it would otherwise be part
  of, e.g.@:, an identifier or a floating-point number.
-
-Actually, the character that ends a command can be changed with
-@cmd{SET}'s ENDCMD subcommand (@pxref{SET}), but we do not recommend
-doing so.  Throughout the remainder of this manual we will assume that
-the default setting is in effect.
  @end table
  
  @node Commands
@@ -184,19 +174,36 @@ by a forward slash (@samp{/}).
  There are multiple ways to mark the end of a command.  The most common
  way is to end the last line of the command with a period (@samp{.}) as
  described in the previous section (@pxref{Tokens}).  A blank line, or
-one that consists only of white space or comments, also ends a command
-by default, although you can use the NULLINE subcommand of @cmd{SET}
-to disable this feature (@pxref{SET}).
-
-In batch mode only, that is, when reading commands from a file instead
-of an interactive user, any line that contains a non-space character
-in the leftmost column begins a new command.  Thus, each command
-consists of a flush-left line followed by any number of lines indented
-from the left margin.  In this mode, a plus or minus sign
-(@samp{+}, @samp{@minus{}}) as the first character
-in a line is ignored and causes that line to begin a new command,
-which allows for visual indentation of a command without that command
-being considered part of the previous command.
+one that consists only of white space or comments, also ends a command.
+
+@node Syntax Variants
+@section Variants of syntax.
+
+@cindex Batch syntax
+@cindex Interactive syntax
+
+There are two variants of command syntax, @i{viz}: @dfn{batch} mode and
+@dfn{interactive} mode.
+Batch mode is the default when reading commands from a file.
+Interactive mode is the default when commands are typed at a prompt
+by a user.
+Certain commands, such as @cmd{INSERT} (@pxref{INSERT}), may explicitly
+change the syntax mode. 
+
+In batch mode, any line that contains a non-space character
+in the leftmost column begins a new command. 
+Thus, each command consists of a flush-left line followed by any
+number of lines indented from the left margin. 
+In this mode, a plus or minus sign (@samp{+}, @samp{@minus{}}) as the
+first character in a line is ignored and causes that line to begin a
+new command, which allows for visual indentation of a command without
+that command being considered part of the previous command. 
+The period terminating the end of a command is optional but recommended.
+
+In interactive mode, each command must be terminated with a period
+or by a blank line.
+The use of @samp{+} and @samp{@minus{}} as continuation characters is not
+permitted.
  
  @node Types of Commands
  @section Types of Commands
@@ -360,9 +367,7 @@ spaces.
  Variables, whether numeric or string, can have designated
  @dfn{user-missing values}.  Every user-missing value is an actual value
  for that variable.  However, most of the time user-missing values are
-treated in the same way as the system-missing value.  String variables
-that are wider than a certain width, usually 8 characters (depending on
-computer architecture), cannot have user-missing values.
+treated in the same way as the system-missing value.
  
  For more information on missing values, see the following sections:
  @ref{Variables}, @ref{MISSING VALUES}, @ref{Expressions}.  See also the
@@ -428,13 +433,9 @@ Numeric or string.
  @item Width
  (string variables only) String variables with a width of 8 characters or
  fewer are called @dfn{short string variables}.  Short string variables
-can be used in many procedures where @dfn{long string variables} (those
+may be used in a few contexts where @dfn{long string variables} (those
  with widths greater than 8) are not allowed.
  
-Certain systems may consider strings longer than 8
-characters to be short strings.  Eight characters represents a minimum
-figure for the maximum length of a short string.
-
  @item Position
  Variables in the dictionary are arranged in a specific order.
  @cmd{DISPLAY} can be used to show this order: see @ref{DISPLAY}.
@@ -476,6 +477,11 @@ they are displayed.  Example: a width of 8, with 2 decimal places.
  @item Write format
  Similar to print format, but used by the @cmd{WRITE} command
  (@pxref{WRITE}).
+
+@cindex custom attributes
+@item Custom attributes
+User-defined associations between names and values.  @xref{VARIABLE
+ATTRIBUTE}.
  @end table
  
  @node System Variables
@@ -1146,6 +1152,7 @@ trailing white space.
  The maximum width for time and date formats is 40 columns.  Minimum
  input and output width for each of the time and date formats is shown
  below:
+
  @float
  @multitable {DATETIME} {Min. Input Width} {Min. Output Width} {4-digit year}
  @headitem Format @tab Min. Input Width @tab Min. Output Width @tab Option 
@@ -1331,6 +1338,14 @@ file, or scratch file.  Most often, a file handle is specified as the
  name of a file as a string, that is, enclosed within @samp{'} or
  @samp{"}.
  
+A file name string that begins or ends with @samp{|} is treated as the
+name of a command to pipe data to or from.  You can use this feature
+to read data over the network using a program such as @samp{curl}
+(e.g.@: @code{GET '|curl -s -S http://example.com/mydata.sav'}), to
+read compressed data from a file using a program such as @samp{zcat}
+(e.g.@: @code{GET '|zcat mydata.sav.gz'}), and for many other
+purposes.
+
  PSPP also supports declaring named file handles with the @cmd{FILE
  HANDLE} command.  This command associates an identifier of your choice
  (the file handle's name) with a file.  Later, the file handle name can
@@ -1459,4 +1474,3 @@ The first nonterminal defined in a set of productions is called the
  @dfn{start symbol}.  The start symbol defines the entire syntax for
  that command.
  @end itemize
-@setfilename ignored