+@c PSPP - a program for statistical analysis.
+@c Copyright (C) 2017, 2020 Free Software Foundation, Inc.
+@c Permission is granted to copy, distribute and/or modify this document
+@c under the terms of the GNU Free Documentation License, Version 1.3
+@c or any later version published by the Free Software Foundation;
+@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+@c A copy of the license is included in the section entitled "GNU
+@c Free Documentation License".
+@c
@c (modify-syntax-entry ?_ "w")
@c (modify-syntax-entry ?' "'")
@c (modify-syntax-entry ?@ "'")
@cindex cases
@cindex observations
-Data are the focus of the @pspp{} language.
+Data are the focus of the @pspp{} language.
Each datum belongs to a @dfn{case} (also called an @dfn{observation}).
Each case represents an individual or ``experimental unit''.
For example, in the results of a survey, the names of the respondents,
``empty,'' that is, it has no dictionary or data. If a dataset with
the given name already exists, this has no effect. The new dataset
can be used with commands that support output to a dataset,
-e.g. AGGREGATE (@pxref{AGGREGATE}).
+@i{e.g.} AGGREGATE (@pxref{AGGREGATE}).
@vindex DATASET CLOSE
The DATASET CLOSE command deletes a dataset. If the active dataset is
that contains variable names, for example.
@cmd{DATA LIST} can optionally output a table describing how the data file
-will be read. The @subcmd{TABLE} subcommand enables this output, and
+is read. The @subcmd{TABLE} subcommand enables this output, and
@subcmd{NOTABLE} disables it. The default is to output the table.
The list of variables to be read from the data list must come last.
In columnar style, to use a variable format other than the default,
specify the format type in parentheses after the column numbers. For
-instance, for alphanumeric @samp{A} format, use @samp{(A)}.
+instance, for alphanumeric @samp{A} format, use @samp{(A)}.
In addition, implied decimal places can be specified in parentheses
after the column numbers. As an example, suppose that a data file has a
leaves the active column immediately after the ending column
specified. Record motion using @code{NEWREC} in FORTRAN style also
applies to later FORTRAN and columnar specifiers.
-
+
@menu
* DATA LIST FIXED Examples:: Examples of DATA LIST FIXED.
@end menu
This list must be introduced by a single slash (@samp{/}). The set of
variable names may contain format specifications in parentheses
(@pxref{Input and Output Formats}). Format specifications apply to all
-variables back to the previous parenthesized format specification.
+variables back to the previous parenthesized format specification.
In addition, an asterisk may be used to indicate that all variables
preceding it are to have input/output format @samp{F8.0}.
for ENCODING on the INSERT command are supported (@pxref{INSERT}).
For reading in other file-based modes, encoding autodetection is not
supported; if the specified encoding requests autodetection then the
-default encoding will be used. This is also true when a file handle
+default encoding is used. This is also true when a file handle
is used for writing a file in any mode.
@node INPUT PROGRAM
@example
INPUT PROGRAM.
NUMERIC #A #B.
-
+
DO IF NOT #A.
DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10.
END IF.
@display
MATRIX DATA
VARIABLES = @var{columns}
- [eFILE='@var{file_name}'| INLINE @}
+ [FILE=@{'@var{file_name}' | INLINE@}]
[/FORMAT= [@{LIST | FREE@}]
[@{UPPER | LOWER | FULL@}]
[@{DIAGONAL | NODIAGONAL@}]]
+ [/N= @var{n}]
[/SPLIT= @var{split_variables}].
@end display
The @cmd{MATRIX DATA} command is used to input data in the form of matrices
which can subsequently be used by other commands. If the
@subcmd{FILE} is omitted or takes the value @samp{INLINE} then the command
-should immediately followed by @cmd{BEGIN DATA}, @xref{BEGIN DATA}.
+should be immediately followed by @cmd{BEGIN DATA} (@pxref{BEGIN DATA}).
There is one mandatory subcommand, @i{viz:} @subcmd{VARIABLES}, which defines
the @var{columns} of the matrix.
then the data for several matrix rows may be specified on
the same line, or a single row may be split across lines.
+The @subcmd{N} subcommand may be used to specify the number
+of valid cases for each variable. It should not be used if the
+data contains a record whose ROWTYPE_ column is @samp{N} or @samp{N_VECTOR}.
+It implies an @samp{N} record whose values are all @var{n}.
+That is to say,
+@example
+matrix data
+ variables = rowtype_ var01 TO var04
+ /format = upper nodiagonal
+ /n = 99.
+begin data
+mean 34 35 36 37
+sd 22 11 55 66
+corr 9 8 7
+corr 6 5
+corr 4
+end data.
+@end example
+produces an effect identical to
+@example
+matrix data
+ variables = rowtype_ var01 TO var04
+ /format = upper nodiagonal
+begin data
+n 99 99 99 99
+mean 34 35 36 37
+sd 22 11 55 66
+corr 9 8 7
+corr 6 5
+corr 4
+end data.
+@end example
+
+
The @subcmd{SPLIT} subcommand is used to indicate that variables are to be
considered as split variables. For example, the following
defines two matrices using the variable @samp{S1} to distinguish
@vindex PRINT
@display
-PRINT
+PRINT
[OUTFILE='@var{file_name}']
[RECORDS=@var{n_lines}]
[@{NOTABLE,TABLE@}]
The @subcmd{OUTFILE} subcommand specifies the file to receive the output. The
file may be a file name as a string or a file handle (@pxref{File
-Handles}). If @subcmd{OUTFILE} is not present then output will be sent to
-@pspp{}'s output listing file. When @subcmd{OUTFILE} is present, a space is
-inserted at beginning of each output line, even lines that otherwise
-would be blank.
+Handles}). If @subcmd{OUTFILE} is not present then output is sent to
+@pspp{}'s output listing file. When @subcmd{OUTFILE} is present, the
+output is written to @var{file_name} in a plain text format, with a
+space inserted at the beginning of each output line, even lines that
+otherwise would be blank.
The @subcmd{ENCODING} subcommand may only be used if the
@subcmd{OUTFILE} subcommand is also used. It specifies the character
Introduce the strings and variables to be printed with a slash
(@samp{/}). Optionally, the slash may be followed by a number
-indicating which output line will be specified. In the absence of this
-line number, the next line number will be specified. Multiple lines may
+indicating which output line is specified. In the absence of this
+line number, the next line number is specified. Multiple lines may
be specified using multiple slashes with the intended output for a line
following its respective slash.
Literal strings may be printed. Specify the string itself.
Optionally the string may be followed by a column number, specifying
the column on the line where the string should start. Otherwise, the
-string will be printed at the current position on the line.
+string is printed at the current position on the line.
Variables to be printed can be specified in the same ways as available
for @cmd{DATA LIST FIXED} (@pxref{DATA LIST FIXED}). In addition, a
list may be followed by an asterisk (@samp{*}), which indicates that the
variables should be printed in their dictionary print formats, separated
by spaces. A variable list followed by a slash or the end of command
-will be interpreted the same way.
+is interpreted in the same way.
If a FORTRAN type specification is used to move backwards on the current
-line, then text is written at that point on the line, the line will be
+line, then text is written at that point on the line and the line is
truncated to that length, although any text added afterward extends
the line again.
@vindex PRINT EJECT
@display
-PRINT EJECT
+PRINT EJECT
OUTFILE='@var{file_name}'
RECORDS=@var{n_lines}
@{NOTABLE,TABLE@}
The @subcmd{OUTFILE} subcommand is optional. It may be used to direct output to
a file specified by file name as a string or file handle (@pxref{File
-Handles}). If OUTFILE is not specified then output will be directed to
+Handles}). If @subcmd{OUTFILE} is not specified then output is directed to
the listing file.
The @subcmd{ENCODING} subcommand may only be used if @subcmd{OUTFILE}
The @subcmd{FILE} subcommand, which is optional, is used to specify the file to
have its line re-read. The file must be specified as the name of a file
handle (@pxref{File Handles}). If FILE is not specified then the last
-file specified on @cmd{DATA LIST} will be assumed (last file specified
+file specified on @cmd{DATA LIST} is assumed (last file specified
lexically, not in terms of flow-of-control).
By default, the line is re-read in its entirety. With the
@vindex WRITE
@display
-WRITE
+WRITE
OUTFILE='@var{file_name}'
RECORDS=@var{n_lines}
@{NOTABLE,TABLE@}
@var{var_list} *
@end display
-@code{WRITE} writes text or binary data to an output file.
+@cmd{WRITE} writes text or binary data to an output file.
@xref{PRINT}, for more information on syntax and usage. @cmd{PRINT}
and @cmd{WRITE} differ in only a few ways: