Add tty and listing devices that use VT100 (and xterm) line-drawing

diff --git a/doc/data-io.texi b/doc/data-io.texi
index 93f071568f98ac023b80b70ddea9a93fb76c4f45..dedf1c1d03a39473d4233c82ec252295fddc0257 100644
--- a/doc/data-io.texi
+++ b/doc/data-io.texi
@@ -1,4 +1,4 @@
-@node Data Input and Output, System and Portable Files, Expressions, Top
+@node Data Input and Output
  @chapter Data Input and Output
  @cindex input
  @cindex output
@@ -30,7 +30,6 @@ actually be read until a procedure is executed.
  * FILE HANDLE::                 Support for special file formats.
  * INPUT PROGRAM::               Support for complex input programs.
  * LIST::                        List cases in the active file.
-* MATRIX DATA::                 Read matrices in text format.
  * NEW FILE::                    Clear the active file and dictionary.
  * PRINT::                       Display values in print formats.
  * PRINT EJECT::                 Eject the current page then print.
@@ -138,9 +137,10 @@ Each form of @cmd{DATA LIST} is described in detail below.
  @display
  DATA LIST [FIXED]
          @{TABLE,NOTABLE@}
-        FILE='filename'
-        RECORDS=record_count
-        END=end_var
+        [FILE='file-name']
+        [RECORDS=record_count]
+        [END=end_var]
+        [SKIP=record_count]
          /[line_no] var_spec@dots{}
  
  where each var_spec takes one of the forms
@@ -153,7 +153,7 @@ positions on each line of single-line or multiline records.  The
  keyword FIXED is optional.
  
  The FILE subcommand must be used if input is to be taken from an
-external file.  It may be used to specify a filename as a string or a
+external file.  It may be used to specify a file name as a string or a
  file handle (@pxref{File Handles}).  If the FILE subcommand is not used,
  then input is assumed to be specified within the command file using
  @cmd{BEGIN DATA}@dots{}@cmd{END DATA} (@pxref{BEGIN DATA}).
@@ -166,6 +166,10 @@ the list of variable specifications later in @cmd{DATA LIST}.
  The END subcommand is only useful in conjunction with @cmd{INPUT
  PROGRAM}.  @xref{INPUT PROGRAM}, for details.
  
+The optional SKIP subcommand specifies a number of records to skip at
+the beginning of an input file.  It can be used to skip over a row
+that contains variable names, for example.
+
  @cmd{DATA LIST} can optionally output a table describing how the data file
  will be read.  The TABLE subcommand enables this output, and NOTABLE
  disables it.  The default is to output the table.
@@ -186,7 +190,7 @@ In columnar style, the starting column and ending column for the field
  are specified after the variable name, separated by a dash (@samp{-}).
  For instance, the third through fifth columns on a line would be
  specified @samp{3-5}.  By default, variables are considered to be in
-@samp{F} format (@pxref{Input/Output Formats}).  (This default can be
+@samp{F} format (@pxref{Input and Output Formats}).  (This default can be
  changed; see @ref{SET} for more information.)
  
  In columnar style, to use a variable format other than the default,
@@ -218,7 +222,7 @@ Implied decimal places also exist in FORTRAN style.  A format
  specification with @var{d} decimal places also has @var{d} implied
  decimal places.
  
-In addition to the standard format specifiers (@pxref{Input/Output
+In addition to the standard format specifiers (@pxref{Input and Output
  Formats}), FORTRAN style defines some extensions:
  
  @table @asis
@@ -346,8 +350,9 @@ This example shows keywords abbreviated to their first 3 letters.
  DATA LIST FREE
          [(@{TAB,'c'@}, @dots{})]
          [@{NOTABLE,TABLE@}]
-        FILE='filename'
-        END=end_var
+        [FILE='file-name']
+        [END=end_var]
+        [SKIP=record_cnt]
          /var_spec@dots{}
  
  where each var_spec takes one of the forms
@@ -376,12 +381,12 @@ of quoting is allowed.
  The NOTABLE and TABLE subcommands are as in @cmd{DATA LIST FIXED} above.
  NOTABLE is the default.
  
-The FILE and END subcommands are as in @cmd{DATA LIST FIXED} above.
+The FILE, END, and SKIP subcommands are as in @cmd{DATA LIST FIXED} above.
  
  The variables to be parsed are given as a single list of variable names.
  This list must be introduced by a single slash (@samp{/}).  The set of
  variable names may contain format specifications in parentheses
-(@pxref{Input/Output Formats}).  Format specifications apply to all
+(@pxref{Input and Output Formats}).  Format specifications apply to all
  variables back to the previous parenthesized format specification.  
  
  In addition, an asterisk may be used to indicate that all variables
@@ -398,8 +403,9 @@ on field width apply, but they are honored on output.
  DATA LIST LIST
          [(@{TAB,'c'@}, @dots{})]
          [@{NOTABLE,TABLE@}]
-        FILE='filename'
-        END=end_var
+        [FILE='file-name']
+        [END=end_var]
+        [SKIP=record_count]
          /var_spec@dots{}
  
  where each var_spec takes one of the forms
@@ -442,13 +448,13 @@ the current input program.  @xref{INPUT PROGRAM}.
  @display
  For text files:
          FILE HANDLE handle_name
-                /NAME='filename'
+                /NAME='file-name'
                  [/MODE=CHARACTER]
                  /TABWIDTH=tab_width
  
  For binary files with fixed-length records:
          FILE HANDLE handle_name
-                /NAME='filename'
+                /NAME='file-name'
                  /MODE=IMAGE
                  [/LRECL=rec_len]
  
@@ -482,10 +488,11 @@ exception).  By default, each tab is 4 characters wide, but an
  alternate width may be specified on TABWIDTH.  A tab width of 0
  suppresses tab expansion entirely.
  
-In IMAGE mode, the data file is opened in ANSI C binary mode and records
-are fixed in length.  In IMAGE mode, LRECL specifies the record length in
-bytes, with a default of 1024.  Tab characters are never expanded to
-spaces in binary mode.
+In IMAGE mode, the data file is opened in ANSI C binary mode.  Record
+length is fixed, with output data truncated or padded with spaces to
+the record length.  LRECL specifies the record length in bytes, with a
+default of 1024.  Tab characters are never expanded to spaces in
+binary mode.  Records
  
  The NAME subcommand specifies the name of the file associated with the
  handle.  It is required in CHARACTER and IMAGE modes.
@@ -687,108 +694,6 @@ cannot fit on a single line, then a multi-line format will be used.
  
  @cmd{LIST} is a procedure.  It causes the data to be read.
  
-@node MATRIX DATA
-@section MATRIX DATA
-@vindex MATRIX DATA
-
-@display
-MATRIX DATA
-        /VARIABLES=var_list
-        /FILE='filename'
-        /FORMAT=@{LIST,FREE@} @{LOWER,UPPER,FULL@} @{DIAGONAL,NODIAGONAL@}
-        /SPLIT=@{new_var,var_list@}
-        /FACTORS=var_list
-        /CELLS=n_cells
-        /N=n
-        /CONTENTS=@{N_VECTOR,N_SCALAR,N_MATRIX,MEAN,STDDEV,COUNT,MSE,
-                   DFE,MAT,COV,CORR,PROX@}
-@end display
-
-@cmd{MATRIX DATA} command reads square matrices in one of several textual
-formats.  @cmd{MATRIX DATA} clears the dictionary and replaces it and
-reads a
-data file.
-
-Use VARIABLES to specify the variables that form the rows and columns of
-the matrices.  You may not specify a variable named @code{VARNAME_}.  You
-should specify VARIABLES first.
-
-Specify the file to read on FILE, either as a file name string or a file
-handle (@pxref{File Handles}).  If FILE is not specified then matrix data
-must immediately follow @cmd{MATRIX DATA} with a @cmd{BEGIN
-DATA}@dots{}@cmd{END DATA}
-construct (@pxref{BEGIN DATA}).
-
-The FORMAT subcommand specifies how the matrices are formatted.  LIST,
-the default, indicates that there is one line per row of matrix data;
-FREE allows single matrix rows to be broken across multiple lines.  This
-is analogous to the difference between @cmd{DATA LIST FREE} and
-@cmd{DATA LIST LIST}
-(@pxref{DATA LIST}).  LOWER, the default, indicates that the lower
-triangle of the matrix is given; UPPER indicates the upper triangle; and
-FULL indicates that the entire matrix is given.  DIAGONAL, the default,
-indicates that the diagonal is part of the data; NODIAGONAL indicates
-that it is omitted.  DIAGONAL/NODIAGONAL have no effect when FULL is
-specified.
-
-The SPLIT subcommand is used to specify @cmd{SPLIT FILE} variables for the
-input matrices (@pxref{SPLIT FILE}).  Specify either a single variable
-not specified on VARIABLES, or one or more variables that are specified
-on VARIABLES.  In the former case, the SPLIT values are not present in
-the data and ROWTYPE_ may not be specified on VARIABLES.  In the latter
-case, the SPLIT values are present in the data.
-
-Specify a list of factor variables on FACTORS.  Factor variables must
-also be listed on VARIABLES.  Factor variables are used when there are
-some variables where, for each possible combination of their values,
-statistics on the matrix variables are included in the data.
-
-If FACTORS is specified and ROWTYPE_ is not specified on VARIABLES, the
-CELLS subcommand is required.  Specify the number of factor variable
-combinations that are given.  For instance, if factor variable A has 2
-values and factor variable B has 3 values, specify 6.
-
-The N subcommand specifies a population number of observations.  When N
-is specified, one N record is output for each @cmd{SPLIT FILE}.
-
-Use CONTENTS to specify what sort of information the matrices include.
-Each possible option is described in more detail below.  When ROWTYPE_
-is specified on VARIABLES, CONTENTS is optional; otherwise, if CONTENTS
-is not specified then /CONTENTS=CORR is assumed.
-
-@table @asis
-@item N
-@item N_VECTOR
-Number of observations as a vector, one value for each variable.
-@item N_SCALAR
-Number of observations as a single value.
-@item N_MATRIX
-Matrix of counts.
-@item MEAN
-Vector of means.
-@item STDDEV
-Vector of standard deviations.
-@item COUNT
-Vector of counts.
-@item MSE
-Vector of mean squared errors.
-@item DFE
-Vector of degrees of freedom.
-@item MAT
-Generic matrix.
-@item COV
-Covariance matrix.
-@item CORR
-Correlation matrix.
-@item PROX
-Proximities matrix.
-@end table
-
-The exact semantics of the matrices read by @cmd{MATRIX DATA} are complex.
-Right now @cmd{MATRIX DATA} isn't too useful due to a lack of procedures
-accepting or producing related data, so these semantics aren't
-documented.  Later, they'll be described here in detail.
-
  @node NEW FILE
  @section NEW FILE
  @vindex NEW FILE
@@ -805,10 +710,10 @@ NEW FILE.
  
  @display
  PRINT 
-        OUTFILE='filename'
+        OUTFILE='file-name'
          RECORDS=n_lines
          @{NOTABLE,TABLE@}
-        /[line_no] arg@dots{}
+        [/[line_no] arg@dots{}]
  
  arg takes one of the following forms:
          'string' [start-end]
@@ -817,17 +722,20 @@ arg takes one of the following forms:
          var_list *
  @end display
  
-The @cmd{PRINT} transformation writes variable data to an output file.
-@cmd{PRINT} is executed when a procedure causes the data to be read.
-Follow @cmd{PRINT} by @cmd{EXECUTE} to print variable data without
-invoking a procedure (@pxref{EXECUTE}).
+The @cmd{PRINT} transformation writes variable data to the listing
+file or an output file.  @cmd{PRINT} is executed when a procedure
+causes the data to be read.  Follow @cmd{PRINT} by @cmd{EXECUTE} to
+print variable data without invoking a procedure (@pxref{EXECUTE}).
  
-All @cmd{PRINT} subcommands are optional.
+All @cmd{PRINT} subcommands are optional.  If no strings or variables
+are specified, PRINT outputs a single blank line.
  
  The OUTFILE subcommand specifies the file to receive the output.  The
  file may be a file name as a string or a file handle (@pxref{File
-Handles}).  If OUTFILE is not present then output will be sent to PSPP's
-output listing file.
+Handles}).  If OUTFILE is not present then output will be sent to
+PSPP's output listing file.  When OUTFILE is present, a space is
+inserted at beginning of each output line, even lines that otherwise
+would be blank.
  
  The RECORDS subcommand specifies the number of lines to be output.  The
  number of lines may optionally be surrounded by parentheses.
@@ -868,7 +776,7 @@ again extend the line to that length.
  
  @display
  PRINT EJECT 
-        OUTFILE='filename'
+        OUTFILE='file-name'
          RECORDS=n_lines
          @{NOTABLE,TABLE@}
          /[line_no] arg@dots{}
@@ -880,8 +788,20 @@ arg takes one of the following forms:
          var_list *
  @end display
  
-@cmd{PRINT EJECT} writes data to an output file.  Before the data is
-written, the current page in the listing file is ejected.
+@cmd{PRINT EJECT} advances to the beginning of a new output page in
+the listing file or output file.  It can also output data in the same
+way as @cmd{PRINT}.
+
+All @cmd{PRINT EJECT} subcommands are optional.
+
+Without OUTFILE, PRINT EJECT ejects the current page in
+the listing file, then it produces other output, if any is specified.
+
+With OUTFILE, PRINT EJECT writes its output to the specified file.
+The first line of output is written with @samp{1} inserted in the
+first column.  Commonly, this is the only line of output.  If
+additional lines of output are specified, these additional lines are
+written with a space inserted in the first column, as with PRINT.
  
  @xref{PRINT}, for more information on syntax and usage.
  
@@ -890,7 +810,7 @@ written, the current page in the listing file is ejected.
  @vindex PRINT SPACE
  
  @display
-PRINT SPACE OUTFILE='filename' n_lines.
+PRINT SPACE OUTFILE='file-name' n_lines.
  @end display
  
  @cmd{PRINT SPACE} prints one or more blank lines to an output file.
@@ -940,7 +860,7 @@ file.  Instead, it will re-read the same line multiple times.
  REPEATING DATA
          /STARTS=start-end
          /OCCURS=n_occurs
-        /FILE='filename'
+        /FILE='file-name'
          /LENGTH=length
          /CONTINUED[=cont_start-cont_end]
          /ID=id_start-id_end=id_var
@@ -1020,7 +940,7 @@ structure (@pxref{LOOP}).  Use @cmd{DATA LIST} before, not after,
  
  @display
  WRITE 
-        OUTFILE='filename'
+        OUTFILE='file-name'
          RECORDS=n_lines
          @{NOTABLE,TABLE@}
          /[line_no] arg@dots{}
@@ -1034,11 +954,29 @@ arg takes one of the following forms:
  
  @code{WRITE} writes text or binary data to an output file.  
  
-@xref{PRINT}, for more information on syntax and usage.  The main
-difference between @code{PRINT} and @code{WRITE} is that @cmd{WRITE}
-uses write formats by default, where PRINT uses print formats.
+@xref{PRINT}, for more information on syntax and usage.  @cmd{PRINT}
+and @cmd{WRITE} differ in only a few ways:
+
+@itemize @bullet
+@item
+@cmd{WRITE} uses write formats by default, whereas @cmd{PRINT} uses
+print formats.
+
+@item
+@cmd{PRINT} inserts a space between variables unless a format is
+explicitly specified, but @cmd{WRITE} never inserts space between
+variables in output.
+
+@item
+@cmd{PRINT} inserts a space at the beginning of each line that it
+writes to an output file (and @cmd{PRINT EJECT} inserts @samp{1} at
+the beginning of each line that should begin a new page), but
+@cmd{WRITE} does not.
  
-The sole additional difference is that if @cmd{WRITE} is used to send output
-to a binary file, carriage control characters will not be output.
-@xref{FILE HANDLE}, for information on how to declare a file as binary.
+@item
+@cmd{PRINT} outputs the system-missing value according to its
+specified output format, whereas @cmd{WRITE} outputs the
+system-missing value as a field filled with spaces.  Binary formats
+are an exception.
+@end itemize
  @setfilename ignored