From: Ben Pfaff Date: Sun, 24 Oct 2021 23:40:04 +0000 (-0700) Subject: MATRIX command documentation is complete (but needs examples) X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=5f379b736b5797b0583e99652d51a35a23ba38f8;p=pspp MATRIX command documentation is complete (but needs examples) --- diff --git a/doc/matrices.texi b/doc/matrices.texi index af055c71b5..1f650c773c 100644 --- a/doc/matrices.texi +++ b/doc/matrices.texi @@ -659,10 +659,10 @@ The following matrix commands support matrix input and output: @t{READ} @i{variable}[@t{(}@i{index}[@t{,}@i{index}]@t{)}] [@t{/FILE}@t{=}@i{file}] @t{/FIELD}@t{=}@i{first} @t{TO} @i{last} [@t{BY} @i{width}] + [@t{/FORMAT}@t{=}@i{format}] [@t{/SIZE}@t{=}@i{expression}] [@t{/MODE}@t{=}@{@t{RECTANGULAR} @math{|} @t{SYMMETRIC}@}] - [@t{/REREAD}] - [@t{/FORMAT}@t{=}@i{format}]@t{.} + [@t{/REREAD}]@t{.} @t{WRITE} @i{expression} [@t{/OUTFILE}@t{=}@i{file}] @t{/FIELD}@t{=}@i{first} @t{TO} @i{last} [@t{BY} @i{width}] @@ -741,8 +741,8 @@ first, the syntax of matrix expressions, then each of the supported commands. The @code{COMMENT} command (@pxref{COMMENT}) is also supported. -@node Matrix Operators -@subsection Matrix Operators +@node Matrix Expressions +@subsection Matrix Expressions Many matrix commands use expressions. A matrix expression may use the following operators, listed in descending order of operator @@ -787,6 +787,13 @@ Logical @t{OR} and @t{XOR} @xref{Matrix Functions}, for the available matrix functions. The remaining operators are described in more detail below. +@cindex restricted expressions +Expressions appear in the matrix language in some contexts where there +would be ambiguity whether @samp{/} is an operator or a separator +between subcommands. In these contexts, only the operators with +higher precedence than @samp{/} are allowed outside parentheses. +Later sections call these @dfn{restricted expressions}. 
+ @node Matrix Construction Operator @subsubsection Matrix Construction Operator @t{@{@}} @@ -1486,6 +1493,8 @@ If @math{@var{M}_{kk} = 0}, then: @subsubsection I/O @deffn {Matrix Function} EOF (@var{file}) +@anchor{EOF Matrix Function} + Given a file handle or file name @var{file}, returns an integer scalar that indicates whether the last record in the file has been read. Determining this requires attempting reading past the current record, @@ -1581,15 +1590,9 @@ matrix with a specified main diagonal. @end display The @code{PRINT} command is commonly used to display a matrix. It -evaluates the @i{expression}, if present, and outputs it either as -text or a pivot table, depending on the setting of @code{MDISPLAY} -(@pxref{SET MDISPLAY}). - -Any matrix expression is allowed as @var{expression}, but an -expression with operators with lower precedence than exponentiation -(@pxref{Matrix Operators}) must be parenthesized. (This avoids -ambiguity between @t{/} as an operator and @t{/} to separate -subcommands.) +evaluates the restricted @var{expression}, if present, and outputs it +either as text or a pivot table, depending on the setting of +@code{MDISPLAY} (@pxref{SET MDISPLAY}). Use the @code{FORMAT} subcommand to specify a format, such as @code{F8.2}, for displaying the matrix elements. @code{FORMAT} is @@ -1720,31 +1723,151 @@ terminates immediately, jumping to the command just following @code{END LOOP}. When multiple @code{LOOP} commands nest, @code{BREAK} terminates the innermost loop. +@node Matrix READ and WRITE Commands +@subsection The @code{READ} and @code{WRITE} Commands + +The @code{READ} and @code{WRITE} commands perform matrix input and +output with text files. They share the following syntax for +specifying how data is divided among input lines: + +@display +@t{/FIELD}@t{=}@i{first} @t{TO} @i{last} [@t{BY} @i{width}] +[@t{/FORMAT}@t{=}@i{format}] +@end display + +Both commands require the @code{FIELD} subcommand. 
It specifies the +range of columns, from @var{first} to @var{last}, inclusive, that the +data occupies on each line of the file. The leftmost column is column +1. The columns must be literal numbers, not expressions. To use +entire lines, even if they might be very long, specify a column range +such as @code{1 TO 100000}. + +The @code{FORMAT} subcommand is optional for numerical matrices. For +string matrix input and output, specify an @code{A} format. In +addition to @code{FORMAT}, the optional @code{BY} specification on +@code{FIELD} determines the meaning of each text line: + +@itemize @bullet +@item +Without @code{BY} and @code{FORMAT}, the numbers in the text file are +in @code{F} format separated by spaces or commas. For @code{WRITE}, +@pspp{} uses as many digits of precision as needed to represent the +numbers in the matrix. + +@item +With @code{BY @i{width}}, the input area is divided into fixed-width +fields with the given @i{width}. The input area must be a multiple of +@i{width} columns wide. Numbers are read or written as +@code{F@i{width}.0} format. + +@item +With @code{FORMAT=@i{count}F}, the input area is divided into +@i{count} equal-width fields per line. The input area must be a +multiple of @i{count} columns wide. Another format type may be +substituted for @code{F}. + +@item +@code{FORMAT=F@i{w}.@i{d}} divides the input area into fixed-width +fields with width @i{w}. The input area must be a multiple of @i{w} +columns wide. Another format type may be substituted for @code{F}. + +@item +If @code{BY} and @code{FORMAT} are both used, then they must agree on +the field width. 
+@end itemize + @node Matrix READ Command -@subsection The @code{READ} Command +@subsubsection The @code{READ} Command @display @t{READ} @i{variable}[@t{(}@i{index}[@t{,}@i{index}]@t{)}] [@t{/FILE}@t{=}@i{file}] @t{/FIELD}@t{=}@i{first} @t{TO} @i{last} [@t{BY} @i{width}] + [@t{/FORMAT}@t{=}@i{format}] [@t{/SIZE}@t{=}@i{expression}] [@t{/MODE}@t{=}@{@t{RECTANGULAR} @math{|} @t{SYMMETRIC}@}] - [@t{/REREAD}] - [@t{/FORMAT}@t{=}@i{format}]@t{.} + [@t{/REREAD}]@t{.} @end display +The @code{READ} command reads from a text file into a matrix variable. +Specify the target variable just after the command name, either just a +variable name to create or replace an entire variable, or a variable +name followed by an indexing expression to replace a submatrix of an +existing variable. + +The @code{FILE} subcommand is required in the first @code{READ} +command that appears within @code{MATRIX}. It specifies the text file +to be read, either as a file name in quotes or a file handle +previously declared on @code{FILE HANDLE} (@pxref{FILE HANDLE}). +Later @code{READ} commands (in syntax order) use the previously +referenced file if @code{FILE} is omitted. + +The @code{FIELD} and @code{FORMAT} subcommands specify how input lines +are interpreted. @xref{Matrix READ and WRITE Commands}, for details. + +The @code{SIZE} subcommand is required for reading into an entire +variable. Its restricted expression argument should evaluate to a +2-element vector @code{@{@var{n},@w{ }@var{m}@}} or +@code{@{@var{n};@w{ }@var{m}@}}, which indicates a +@math{@var{n}@times{}@var{m}} matrix destination. A scalar @var{n} is +also allowed and indicates a @math{@var{n}@times{}1} column vector +destination. When the destination is a submatrix, @code{SIZE} is +optional, and if it is present then it must match the size of the +submatrix. + +By default, or with @code{MODE=RECTANGULAR}, the command reads an +entry for every row and column. 
With @code{MODE=SYMMETRIC}, the +command reads only the entries on and below the matrix's main +diagonal, and copies the entries above the main diagonal from the +corresponding symmetric entries below it. Only square matrices +may use @code{MODE=SYMMETRIC}. + +Ordinarily, each @code{READ} command starts from a new line in the +text file. Specify the @code{REREAD} subcommand to instead start from +the last line read by the previous @code{READ} command. This has no +effect for the first @code{READ} command to read from a particular +file. It is also ineffective just after a command that uses the +@code{EOF} matrix function (@pxref{EOF Matrix Function}) on a +particular file, because @code{EOF} has to try to read the next line +from the file to determine whether the file contains more input. + @node Matrix WRITE Command -@subsection The @code{WRITE} Command +@subsubsection The @code{WRITE} Command @display @t{WRITE} @i{expression} [@t{/OUTFILE}@t{=}@i{file}] @t{/FIELD}@t{=}@i{first} @t{TO} @i{last} [@t{BY} @i{width}] + [@t{/FORMAT}@t{=}@i{format}] [@t{/MODE}@t{=}@{@t{RECTANGULAR} @math{|} @t{TRIANGULAR}@}] - [@t{/HOLD}] - [@t{/FORMAT}@t{=}@i{format}]@t{.} + [@t{/HOLD}]@t{.} @end display +The @code{WRITE} command evaluates @i{expression} and writes it to a +text file in a specified format. Write the expression to evaluate +just after the command name. + +The @code{OUTFILE} subcommand is required in the first @code{WRITE} +command that appears within @code{MATRIX}. It specifies the text file +to be written, either as a file name in quotes or a file handle +previously declared on @code{FILE HANDLE} (@pxref{FILE HANDLE}). +Later @code{WRITE} commands (in syntax order) use the previously +referenced file if @code{OUTFILE} is omitted. + +The @code{FIELD} and @code{FORMAT} subcommands specify how output +lines are formed. @xref{Matrix READ and WRITE Commands}, for details. + +By default, or with @code{MODE=RECTANGULAR}, the command writes an +entry for every row and column. 
With @code{MODE=TRIANGULAR}, the +command writes only the entries on and below the matrix's main +diagonal. Entries above the diagonal are not written. Only square +matrices may be written with @code{MODE=TRIANGULAR}. + +Ordinarily, each @code{WRITE} command starts a new line in the output +file. With @code{HOLD}, the next @code{WRITE} command will write to +the same line as the current one. This can be useful to write more +than one matrix on a single output line. + @node Matrix GET Command @subsection The @code{GET} Command @@ -1752,11 +1875,76 @@ terminates immediately, jumping to the command just following @t{GET} @i{variable}[@t{(}@i{index}[@t{,}@i{index}]@t{)}] [@t{/FILE}@t{=}@{@i{file} @math{|} @t{*}@}] [@t{/VARIABLES}@t{=}@i{variable}@dots{}] - [@t{/NAMES}@t{=}@i{expression}] + [@t{/NAMES}@t{=}@i{variable}] [@t{/MISSING}@t{=}@{@t{ACCEPT} @math{|} @t{OMIT} @math{|} @i{number}@}] [@t{/SYSMIS}@t{=}@{@t{OMIT} @math{|} @i{number}@}]@t{.} @end display +The @code{GET} command reads numeric data from an SPSS system file, +SPSS/PC+ system file, or SPSS portable file into a matrix variable or +submatrix: + +@itemize @bullet +@item +To read data into a variable, specify just its name following +@code{GET}. The variable need not already exist; if it does, it is +replaced. The variable will have as many columns as there are +variables specified on the @code{VARIABLES} subcommand and as many +rows as there are cases in the input file. + +@item +To read data into a submatrix, specify the name of an existing +variable, followed by an indexing expression, just after @code{GET}. +The submatrix must have as many columns as variables specified on +@code{VARIABLES} and as many rows as cases in the input file. +@end itemize + +Specify the name or handle of the file to be read on @code{FILE}. Use +@samp{*}, or simply omit the @code{FILE} subcommand, to read from the +active file. Reading from the active file is only permitted if it was +already defined outside @code{MATRIX}. 
+ +List the variables to be read as columns in the matrix on the +@code{VARIABLES} subcommand. The list can use @code{TO} for +collections of variables or @code{ALL} for all variables. If +@code{VARIABLES} is omitted, all variables are read. Only numeric +variables may be read. + +If a variable is named on @code{NAMES}, then the names of the +variables read as data columns are stored in a string vector with +the given name, replacing any existing matrix variable with that name. +Variable names are truncated to 8 bytes. + +The @code{MISSING} and @code{SYSMIS} subcommands control the treatment +of missing values in the input file. By default, any user- or +system-missing data in the variables being read from the input causes +an error that prevents @code{GET} from executing. To accept missing +values, specify one of the following settings on @code{MISSING}: + +@table @asis +@item @code{ACCEPT} +Accept user-missing values with no change. By default, system-missing +values still yield an error. Use the @code{SYSMIS} subcommand to +change this treatment: + +@table @asis +@item @code{OMIT} +Skip any case that contains a system-missing value. + +@item @i{number} +Recode the system-missing value to @i{number}. +@end table + +@item @code{OMIT} +Skip any case that contains any user- or system-missing value. + +@item @i{number} +Recode all user- and system-missing values to @i{number}. +@end table + +The @code{SYSMIS} subcommand has an effect only with +@code{MISSING=ACCEPT}. + @node Matrix SAVE Command @subsection The @code{SAVE} Command @@ -1767,25 +1955,175 @@ terminates immediately, jumping to the command just following [@t{/NAMES}@t{=}@i{expression}] [@t{/STRINGS}@t{=}@i{variable}@dots{}]@t{.} @end display + +The @code{SAVE} matrix command evaluates @i{expression} and writes the +resulting matrix to an SPSS system file. In the system file, each +matrix row becomes a case and each column becomes a variable. 
+ +Specify the name or handle of the SPSS system file on the +@code{OUTFILE} subcommand, or @samp{*} to write the output as the new +active file. The @code{OUTFILE} subcommand is required on the first +@code{SAVE} command, in syntax order, within @code{MATRIX}. For +@code{SAVE} commands after the first, the default output file is the +same as the previous. + +When multiple @code{SAVE} commands write to one destination within a +single @code{MATRIX}, the later commands append to the same output +file. All the matrices written to the file must have the same number +of columns. The @code{VARIABLES}, @code{NAMES}, and @code{STRINGS} +subcommands are honored only for the first @code{SAVE} command that +writes to a given file. + +By default, @code{SAVE} names the variables in the output file +@code{COL1} through @code{COL@i{n}}. Use @code{VARIABLES} or +@code{NAMES} to give the variables meaningful names. The +@code{VARIABLES} subcommand accepts a comma-separated list of variable +names. Its alternative, @code{NAMES}, instead accepts an expression +that must evaluate to a row or column string vector of names. The +number of names need not exactly match the number of columns in the +matrix to be written: extra names are ignored; extra columns use +default names. + +By default, @code{SAVE} assumes that the matrix to be written is all +numeric. To write string columns, specify a comma-separated list of +the string columns' variable names on @code{STRINGS}. + +@node Matrix MGET Command +@subsection The @code{MGET} Command + @display @t{MGET} [@t{/FILE}@t{=}@i{file}] [@t{/TYPE}@t{=}@{@t{COV} @math{|} @t{CORR} @math{|} @t{MEAN} @math{|} @t{STDDEV} @math{|} @t{N} @math{|} @t{COUNT}@}]@t{.} @end display +The @code{MGET} command reads the data from a matrix file +(@pxref{Matrix Files}) into matrix variables. Specify the name or +handle of the matrix file to be read on the @code{FILE} subcommand; if +it is omitted, then the command reads the active file. 
+ +By default, @code{MGET} reads all of the data from the matrix file. +Specify a space-delimited list of matrix types on @code{TYPE} to limit the +kinds of data to the ones specified: + +@table @code +@item COV +Covariance matrix. + +@item CORR +Correlation coefficient matrix. + +@item MEAN +Vector of means. + +@item STDDEV +Vector of standard deviations. + +@item N +Vector of case counts. + +@item COUNT +Vector of counts. +@end table + +@code{MGET} reads the entire matrix file and automatically names, +creates, and populates matrix variables using its contents. It +constructs the name of each variable by concatenating the following: + +@itemize @bullet +@item +A 2-character prefix that identifies the type of the matrix: + +@table @code +@item CV +Covariance matrix. + +@item CR +Correlation coefficient matrix. + +@item MN +Vector of means. + +@item SD +Vector of standard deviations. + +@item NC +Vector of case counts. + +@item CN +Vector of counts. +@end table + +@item +If the matrix file has factor variables, @code{F@i{n}}, where @i{n} +is a number identifying a group of factors: @code{F1} for the first +group, @code{F2} for the second, and so on. + +@item +If the matrix file has split file variables, @code{S@i{n}}, where +@i{n} is a number identifying a split group: @code{S1} for the first +group, @code{S2} for the second, and so on. +@end itemize + +If @code{MGET} chooses the name of an existing variable, it issues a +warning and does not change the variable. 
+ @node Matrix MSAVE Command @subsection The @code{MSAVE} Command @display @t{MSAVE} @i{expression} @t{/TYPE}@t{=}@{@t{COV} @math{|} @t{CORR} @math{|} @t{MEAN} @math{|} @t{STDDEV} @math{|} @t{N} @math{|} @t{COUNT}@} + [@t{/FACTOR}@t{=}@i{expression}] + [@t{/SPLIT}@t{=}@i{expression}] [@t{/OUTFILE}@t{=}@i{file}] [@t{/VARIABLES}@t{=}@i{variable}@dots{}] [@t{/SNAMES}@t{=}@i{variable}@dots{}] - [@t{/SPLIT}@t{=}@i{expression}] - [@t{/FNAMES}@t{=}@i{variable}@dots{}] - [@t{/FACTOR}@t{=}@i{expression}]@t{.} + [@t{/FNAMES}@t{=}@i{variable}@dots{}]@t{.} @end display +The @code{MSAVE} command evaluates the @i{expression} specified just +after the command name, and writes the resulting matrix to a matrix +file (@pxref{Matrix Files}). + +The @code{TYPE} subcommand is required. It specifies the +@code{ROWTYPE_} to write along with this matrix. + +The @code{FACTOR} and @code{SPLIT} subcommands are required if and +only if the matrix file has factor or split variables, respectively. +Each one takes an expression that must evaluate to a vector with the +same number of entries as the matrix has factor or split variables, +respectively. Each @code{MSAVE} only writes data for a single +combination of factor and split variables, so many @code{MSAVE} +commands (or one inside a loop) may be needed to write a complete set. + +The remaining @code{MSAVE} subcommands define the format of the matrix +file. All of the @code{MSAVE} commands within a given matrix program +write to the same matrix file, so these subcommands are only +meaningful on the first @code{MSAVE} command within a matrix program. +(If they are given again on later @code{MSAVE} commands, then they +must have the same values as on the first.) + +The @code{OUTFILE} subcommand specifies the name or handle of the +matrix file to be written. Output must go to an external file, not a +data set or the active file. 
+ +The @code{VARIABLES} subcommand specifies a comma-separated list of +the names of the continuous variables to be written to the matrix +file. The @code{TO} keyword can be used to define variables named +with consecutive integer suffixes. These names become column names +and names that appear in @code{VARNAME_} in the matrix file. +@code{ROWTYPE_} and @code{VARNAME_} are not allowed on +@code{VARIABLES}. If @code{VARIABLES} is omitted, then @pspp{} uses +the names @code{COL1}, @code{COL2}, and so on. + +The @code{FNAMES} subcommand may be used to supply a comma-separated +list of factor variable names. The default names are @code{FAC1}, +@code{FAC2}, and so on. + +The @code{SNAMES} subcommand can supply a comma-separated list of +split variable names. The default names are @code{SPL1}, @code{SPL2}, +and so on. + @node Matrix DISPLAY Command @subsection The @code{DISPLAY} Command @@ -1793,9 +2131,19 @@ terminates immediately, jumping to the command just following @t{DISPLAY} [@{@t{DICTIONARY} @math{|} @t{STATUS}@}]@t{.} @end display +The @code{DISPLAY} command makes @pspp{} display a table with the name +and dimensions of each matrix variable. The @code{DICTIONARY} and +@code{STATUS} keywords are accepted but have no effect. + @node Matrix RELEASE Command @subsection The @code{RELEASE} Command @display @t{RELEASE} @i{variable}@dots{}@t{.} @end display + +The @code{RELEASE} command accepts a comma-separated list of matrix +variable names. It deletes each variable and releases the memory +associated with it. + +The @code{END MATRIX} command releases all matrix variables. diff --git a/src/language/stats/matrix.c b/src/language/stats/matrix.c index be635f9bad..636c242ee1 100644 --- a/src/language/stats/matrix.c +++ b/src/language/stats/matrix.c @@ -3360,7 +3360,7 @@ struct matrix_cmd int w; int d; bool triangular; - bool hold; + bool hold; /* XXX */ } write;