X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fdata-io.texi;h=81911983c83b284c8ccabcfab01c1869b36f9089;hb=308c33df7c1be1edd6b2ebdf10b901fe05903438;hp=a5ba26f0186eace4fed20f708ae4f9b802916c71;hpb=cc0b5800fcdde6126c4fc65b656f39c1459bf17c;p=pspp

diff --git a/doc/data-io.texi b/doc/data-io.texi
index a5ba26f018..81911983c8 100644
--- a/doc/data-io.texi
+++ b/doc/data-io.texi
@@ -1,3 +1,12 @@
+@c PSPP - a program for statistical analysis.
+@c Copyright (C) 2017 Free Software Foundation, Inc.
+@c Permission is granted to copy, distribute and/or modify this document
+@c under the terms of the GNU Free Documentation License, Version 1.3
+@c or any later version published by the Free Software Foundation;
+@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+@c A copy of the license is included in the section entitled "GNU
+@c Free Documentation License".
+@c
 @c (modify-syntax-entry ?_ "w")
 @c (modify-syntax-entry ?' "'")
 @c (modify-syntax-entry ?@ "'")
@@ -11,7 +20,7 @@
 @cindex cases
 @cindex observations
 
-Data are the focus of the @pspp{} language.
+Data are the focus of the @pspp{} language.
 Each datum belongs to a @dfn{case} (also called an @dfn{observation}).
 Each case represents an individual or ``experimental unit''. For
 example, in the results of a survey, the names of the respondents,
@@ -320,7 +329,7 @@ changed; see @ref{SET} for more information.)
 
 In columnar style, to use a variable format other than the default,
 specify the format type in parentheses after the column numbers. For
-instance, for alphanumeric @samp{A} format, use @samp{(A)}.
+instance, for alphanumeric @samp{A} format, use @samp{(A)}.
 
 In addition, implied decimal places can be specified in parentheses
 after the column numbers. As an example, suppose that a data file has a
@@ -376,7 +385,7 @@ FORTRAN and columnar styles may be freely intermixed. Columnar style
 leaves the active column immediately after the ending column specified.
 Record motion using @code{NEWREC} in FORTRAN style also applies to later
 FORTRAN and columnar specifiers.
-
+
 @menu
 * DATA LIST FIXED Examples:: Examples of DATA LIST FIXED.
 @end menu
@@ -515,7 +524,7 @@ The variables to be parsed are given as a single list of variable names.
 This list must be introduced by a single slash (@samp{/}). The set of
 variable names may contain format specifications in parentheses
 (@pxref{Input and Output Formats}). Format specifications apply to all
-variables back to the previous parenthesized format specification.
+variables back to the previous parenthesized format specification.
 
 In addition, an asterisk may be used to indicate that all variables
 preceding it are to have input/output format @samp{F8.0}.
@@ -823,7 +832,7 @@ the extra data in the longer file is ignored.
 @example
 INPUT PROGRAM.
 NUMERIC #A #B.
-
+
 DO IF NOT #A.
 DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10.
 END IF.
@@ -966,17 +975,18 @@ active dataset.
 @display
 MATRIX DATA
         VARIABLES = @var{columns}
-        [eFILE='@var{file_name}'| INLINE @}
+        [FILE='@var{file_name}'| INLINE @}
         [/FORMAT= [@{LIST | FREE@}]
                   [@{UPPER | LOWER | FULL@}]
                   [@{DIAGONAL | NODIAGONAL@}]]
+        [/N= @var{n}]
         [/SPLIT= @var{split_variables}].
 @end display
 
 The @cmd{MATRIX DATA} command is used to input data in the form of matrices
 which can subsequently be used by other commands. If the
 @subcmd{FILE} is omitted or takes the value @samp{INLINE} then the command
-should immediately followed by @cmd{BEGIN DATA}, @xref{BEGIN DATA}.
+should immediately followed by @cmd{BEGIN DATA} (@pxref{BEGIN DATA}).
 
 There is one mandatory subcommand, @i{viz:} @subcmd{VARIABLES}, which defines
 the @var{columns} of the matrix.
@@ -1049,6 +1059,40 @@ single line. If you pass the keyword @var{FREE} to @subcmd{FORMAT} then the data
 may be data for several matrix rows may be specified on the same line, or
 a single row may be split across lines.
 
+The @subcmd{N} subcommand may be used to specify the number
+of valid cases for each variable. It should not be used if the
+data contains a record whose ROWTYPE_ column is @samp{N} or @samp{N_VECTOR}.
+It implies a @samp{N} record whose values are all @var{n}.
+That is to say,
+@example
+matrix data
+    variables = rowtype_ var01 TO var04
+    /format = upper nodiagonal
+    /n = 99.
+begin data
+mean 34 35 36 37
+sd 22 11 55 66
+corr 9 8 7
+corr 6 5
+corr 4
+end data.
+@end example
+produces an effect identical to
+@example
+matrix data
+    variables = rowtype_ var01 TO var04
+    /format = upper nodiagonal
+begin data
+n 99 99 99 99
+mean 34 35 36 37
+sd 22 11 55 66
+corr 9 8 7
+corr 6 5
+corr 4
+end data.
+@end example
+
+
 The @subcmd{SPLIT} is used to indicate that variables are to be
 considered as split variables. For example, the following
 defines two matrices using the variable @samp{S1} to distinguish
@@ -1083,7 +1127,7 @@ end data.
 @vindex PRINT
 
 @display
-PRINT
+PRINT
         [OUTFILE='@var{file_name}']
         [RECORDS=@var{n_lines}]
         [@{NOTABLE,TABLE@}]
@@ -1108,9 +1152,10 @@ are specified, @cmd{PRINT} outputs a single blank line.
 The @subcmd{OUTFILE} subcommand specifies the file to receive the output. The
 file may be a file name as a string or a file handle (@pxref{File
 Handles}). If @subcmd{OUTFILE} is not present then output will be sent to
-@pspp{}'s output listing file. When @subcmd{OUTFILE} is present, a space is
-inserted at beginning of each output line, even lines that otherwise
-would be blank.
+@pspp{}'s output listing file. When @subcmd{OUTFILE} is present, the
+output is written to @var{file_name} in a plain text format, with a
+space inserted at beginning of each output line, even lines that
+otherwise would be blank.
 
 The @subcmd{ENCODING} subcommand may only be used if the
 @subcmd{OUTFILE} subcommand is also used. It specifies the character
@@ -1154,7 +1199,7 @@ again extend the line to that length.
 @vindex PRINT EJECT
 
 @display
-PRINT EJECT
+PRINT EJECT
         OUTFILE='@var{file_name}'
         RECORDS=@var{n_lines}
         @{NOTABLE,TABLE@}
@@ -1326,7 +1371,7 @@ structure (@pxref{LOOP}). Use @cmd{DATA LIST} before, not after,
 @vindex WRITE
 
 @display
-WRITE
+WRITE
         OUTFILE='@var{file_name}'
         RECORDS=@var{n_lines}
         @{NOTABLE,TABLE@}
@@ -1339,7 +1384,7 @@ WRITE
         @var{var_list} *
 @end display
 
-@code{WRITE} writes text or binary data to an output file.
+@code{WRITE} writes text or binary data to an output file.
 
 @xref{PRINT}, for more information on syntax and usage. @cmd{PRINT}
 and @cmd{WRITE} differ in only a few ways: