From: Ben Pfaff Date: Thu, 8 May 2025 23:00:54 +0000 (-0700) Subject: work on manual X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=1b5b1e14b55e3cf17c3669307accbec851fa2838;p=pspp work on manual --- diff --git a/rust/doc/src/SUMMARY.md b/rust/doc/src/SUMMARY.md index 1224196c48..1654d2ebca 100644 --- a/rust/doc/src/SUMMARY.md +++ b/rust/doc/src/SUMMARY.md @@ -132,6 +132,17 @@ - [LOGISTIC REGRESSION](commands/statistics/logistic-regression.md) - [MEANS](commands/statistics/means.md) - [NPAR TESTS](commands/statistics/npar-tests.md) + - [T-TEST](commands/statistics/t-test.md) + - [ONEWAY](commands/statistics/oneway.md) + - [QUICK CLUSTER](commands/statistics/quick-cluster.md) + - [RANK](commands/statistics/rank.md) + - [REGRESSION](commands/statistics/regression.md) + - [RELIABILITY](commands/statistics/reliability.md) + - [ROC](commands/statistics/roc.md) +- [Matrices](commands/matrix/index.md) + - [MATRIX DATA](commands/matrix/matrix-data.md) + - [MCONVERT](commands/matrix/mconvert.md) + - [MATRIX](commands/matrix/matrix.md) # Developer Documentation diff --git a/rust/doc/src/commands/matrix/index.md b/rust/doc/src/commands/matrix/index.md new file mode 100644 index 0000000000..f4d938043b --- /dev/null +++ b/rust/doc/src/commands/matrix/index.md @@ -0,0 +1,115 @@ +# Matrices + +Some PSPP procedures work with matrices by producing numeric matrices +that report results of data analysis, or by consuming matrices as a +basis for further analysis. This chapter documents the [format of +data files](#matrix-files) that store these matrices and commands for +working with them, as well as PSPP's general-purpose facility for +matrix operations. + +## Matrix Files + +A matrix file is an SPSS system file that conforms to the dictionary and +case structure described in this section. Procedures that read matrices +from files expect them to be in the matrix file format. Procedures that +write matrices also use this format. 
+ +Text files that contain matrices can be converted to matrix file +format. The [MATRIX DATA](matrix-data.md) command can read a text +file as a matrix file. + +A matrix file's dictionary must have the following variables in the +specified order: + +1. Zero or more numeric split variables. These are included by + procedures when [`SPLIT + FILE`](../../commands/selection/split-file.md) is active. [`MATRIX + DATA`](matrix-data.md) assigns split variables format `F4.0`. + +2. `ROWTYPE_`, a string variable with width 8. This variable + indicates the kind of matrix or vector that a given case + represents. The supported row types are listed below. + +3. Zero or more numeric factor variables. These are included by + procedures that divide data into cells. For within-cell data, + factor variables are filled with non-missing values; for pooled + data, they are missing. [`MATRIX DATA`](matrix-data.md) assigns + factor variables format `F4.0`. + +4. `VARNAME_`, a string variable. Matrix data includes one row per + continuous variable (see below), naming each continuous variable in + order. This column is blank for vector data. [`MATRIX + DATA`](matrix-data.md) makes `VARNAME_` wide enough for the name of + any of the continuous variables, but at least 8 bytes. + +5. One or more numeric continuous variables. These are the variables + whose data was analyzed to produce the matrices. [`MATRIX + DATA`](matrix-data.md) assigns continuous variables format `F10.4`. + +Case weights are ignored in matrix files. + +### Row Types + +Matrix files support a fixed set of types of matrix and vector data. +The `ROWTYPE_` variable in each case of a matrix file indicates its row +type. + +The supported matrix row types are listed below. Each type is listed +with the keyword that identifies it in `ROWTYPE_`. 
All supported types +of matrices are square, meaning that each matrix must include one row +per continuous variable, with the `VARNAME_` variable indicating each +continuous variable in turn in the same order as the dictionary. + +* `CORR` + Correlation coefficients. + +* `COV` + Covariance coefficients. + +* `MAT` + General-purpose matrix. + +* `N_MATRIX` + Counts. + +* `PROX` + Proximities matrix. + +The supported vector row types are listed below, along with their +associated keyword. Vector row types only require a single row, whose +`VARNAME_` is blank: + +* `COUNT` + Unweighted counts. + +* `DFE` + Degrees of freedom. + +* `MEAN` + Means. + +* `MSE` + Mean squared errors. + +* `N` + Counts. + +* `STDDEV` + Standard deviations. + +Only the row types listed above may appear in matrix files. The +[`MATRIX DATA`](matrix-data.md) command, however, accepts the additional row types +listed below, which it changes into matrix file row types as part of +its conversion process: + +* `N_VECTOR` + Synonym for `N`. + +* `SD` + Synonym for `STDDEV`. + +* `N_SCALAR` + Accepts a single number from the [`MATRIX DATA`](matrix-data.md) + input and writes it as an `N` row with the number replicated across + all the continuous variables. 
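The alias handling described above can be illustrated with a short Python sketch. This is not PSPP's implementation: the function name `normalize_row` and the `ALIASES` table are hypothetical, and the sketch assumes that each input row arrives as a row type keyword plus a list of numbers.

```python
# Hypothetical illustration of how MATRIX DATA's extra row types could be
# mapped onto matrix file row types. Not taken from PSPP's source code.
ALIASES = {"N_VECTOR": "N", "SD": "STDDEV"}


def normalize_row(rowtype, values, n_continuous):
    """Return (matrix_file_rowtype, values) for one input row."""
    key = rowtype.upper()
    if key == "N_SCALAR":
        # N_SCALAR carries a single number that is replicated across
        # all of the continuous variables and written as an N row.
        if len(values) != 1:
            raise ValueError("N_SCALAR takes exactly one number")
        return "N", [values[0]] * n_continuous
    # Synonyms are renamed; every other row type passes through unchanged.
    return ALIASES.get(key, key), list(values)
```

For instance, under these assumptions an `N_SCALAR` row holding 92 with three continuous variables would become an `N` row holding 92 three times.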
+ +diff --git a/rust/doc/src/commands/matrix/matrix-data.md b/rust/doc/src/commands/matrix/matrix-data.md new file mode 100644 index 0000000000..bbe5f3b4db --- /dev/null +++ b/rust/doc/src/commands/matrix/matrix-data.md @@ -0,0 +1,421 @@ +MATRIX DATA +================ + +``` +MATRIX DATA + VARIABLES=VARIABLES + [FILE={'FILE_NAME' | INLINE}] + [/FORMAT=[{LIST | FREE}] + [{UPPER | LOWER | FULL}] + [{DIAGONAL | NODIAGONAL}]] + [/SPLIT=SPLIT_VARS] + [/FACTORS=FACTOR_VARS] + [/N=N] + +The following subcommands are only needed when ROWTYPE_ is not +specified on the VARIABLES subcommand: + [/CONTENTS={CORR,COUNT,COV,DFE,MAT,MEAN,MSE, + N_MATRIX,N|N_VECTOR,N_SCALAR,PROX,SD|STDDEV}] + [/CELLS=N_CELLS] +``` + +The `MATRIX DATA` command converts matrices and vectors from text +format into the [matrix file format](index.md#matrix-files) for use by +procedures that read matrices. It reads a text file or inline data and +outputs to the active file, replacing any data already in the active +dataset. The matrix file may then be used by other commands directly +from the active file, or it may be written to a `.sav` file using the +`SAVE` command. + +The text data read by `MATRIX DATA` can be delimited by spaces or +commas. A plus or minus sign, except immediately following a `d` or +`e`, also begins a new value. Optionally, values may be enclosed in +single or double quotes. + +`MATRIX DATA` can read the types of matrix and vector data supported +in matrix files (see [Row Types](index.md#row-types)). + +The `FILE` subcommand specifies the source of the command's input. To +read input from a text file, specify its name in quotes. To supply +input inline, omit `FILE` or specify `INLINE`. Inline data must +directly follow `MATRIX DATA`, inside [`BEGIN +DATA`](../../commands/data-io/begin-data.md). + +`VARIABLES` is the only required subcommand. It names the variables +present in each input record in the order that they appear. 
(`MATRIX +DATA` reorders the variables in the matrix file it produces, if needed +to fit the matrix file format.) The variable list must include split +variables and factor variables, if they are present in the data, in +addition to the continuous variables that form matrix rows and columns. +It may also include a special variable named `ROWTYPE_`. + +Matrix data may include split variables or factor variables or both. +List split variables, if any, on the `SPLIT` subcommand and factor +variables, if any, on the `FACTORS` subcommand. Split and factor +variables must be numeric. Split and factor variables must also be +listed on `VARIABLES`, with one exception: if `VARIABLES` does not +include `ROWTYPE_`, then `SPLIT` may name a single variable that is not +in `VARIABLES` (see [Example 8](#example-8-split-variable-with-sequential-values)). + +The `FORMAT` subcommand accepts settings to describe the format of +the input data: + +* `LIST` (default) + `FREE` + + `LIST` requires each row to begin at the start of a new input line. + `FREE` allows rows to begin in the middle of a line. Either setting + allows a single row to continue across multiple input lines. + +* `LOWER` (default) + `UPPER` + `FULL` + + With `LOWER`, only the lower triangle is read from the input data and + the upper triangle is mirrored across the main diagonal. `UPPER` + behaves similarly for the upper triangle. `FULL` reads the entire + matrix. + +* `DIAGONAL` (default) + `NODIAGONAL` + + With `DIAGONAL`, the main diagonal is read from the input data. With + `NODIAGONAL`, which is incompatible with `FULL`, the main diagonal is + not read from the input data but instead set to 1 for correlation + matrices and system-missing for others. + +The `N` subcommand is a way to specify the size of the population. +It is equivalent to specifying an `N` vector with the specified value +for each split file. 
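The triangle settings of `FORMAT` described above can be sketched in Python. This is only an illustration under stated assumptions, not PSPP's implementation: the function name `expand_triangle` is hypothetical, the triangle's values are assumed to arrive as one flat list in reading order, and `None` stands in for the system-missing value.

```python
def expand_triangle(values, n, triangle="LOWER", diagonal=True, corr=True):
    """Expand triangle input into a full n-by-n symmetric matrix.

    values   -- the triangle's entries, row by row, as a flat list
    triangle -- "LOWER" or "UPPER" (FULL input needs no expansion)
    diagonal -- True for DIAGONAL, False for NODIAGONAL
    corr     -- True if the matrix is a correlation matrix
    """
    full = [[None] * n for _ in range(n)]
    if not diagonal:
        # NODIAGONAL: the diagonal is not read; it is 1 for correlation
        # matrices and system-missing (None here) for other matrices.
        for i in range(n):
            full[i][i] = 1.0 if corr else None
    it = iter(values)
    for i in range(n):
        if triangle == "LOWER":
            cols = range(0, i + 1 if diagonal else i)
        else:  # "UPPER"
            cols = range(i if diagonal else i + 1, n)
        for j in cols:
            v = next(it)
            full[i][j] = v  # the value as read
            full[j][i] = v  # its mirror image across the main diagonal
    return full
```

Under these assumptions, `LOWER DIAGONAL` input and `UPPER NODIAGONAL` input for the same correlation matrix expand to the same full matrix.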
+ +`MATRIX DATA` supports two different ways to indicate the kinds of +matrices and vectors present in the data, depending on whether a +variable with the special name `ROWTYPE_` is present in `VARIABLES`. +The following subsections explain `MATRIX DATA` syntax and behavior in +each case. + + + +## With `ROWTYPE_` + +If `VARIABLES` includes `ROWTYPE_`, each case's `ROWTYPE_` indicates +the type of data contained in the row. See [Row +Types](index.md#row-types) for a list of supported row types. + +### Example 1: Defaults with `ROWTYPE_` + +This example shows a simple use of `MATRIX DATA` with `ROWTYPE_` plus 8 +variables named `var01` through `var08`. + +Because `ROWTYPE_` is the first variable in `VARIABLES`, it appears +first on each line. The first three lines in the example data have +`ROWTYPE_` values of `MEAN`, `SD`, and `N`. These indicate that these +lines contain vectors of means, standard deviations, and counts, +respectively, for `var01` through `var08` in order. + +The remaining 8 lines have a `ROWTYPE_` of `CORR`, which indicates that +the values are correlation coefficients. Each of the lines corresponds +to a row in the correlation matrix: the first line is for `var01`, the +next line for `var02`, and so on. The input only contains values for +the lower triangle, including the diagonal, since `FORMAT=LOWER +DIAGONAL` is the default. + +With `ROWTYPE_`, the `CONTENTS` subcommand is optional and the +`CELLS` subcommand may not be used. + +``` +MATRIX DATA + VARIABLES=ROWTYPE_ var01 TO var08. +BEGIN DATA. +MEAN 24.3 5.4 69.7 20.1 13.4 2.7 27.9 3.7 +SD 5.7 1.5 23.5 5.8 2.8 4.5 5.4 1.5 +N 92 92 92 92 92 92 92 92 +CORR 1.00 +CORR .18 1.00 +CORR -.22 -.17 1.00 +CORR .36 .31 -.14 1.00 +CORR .27 .16 -.12 .22 1.00 +CORR .33 .15 -.17 .24 .21 1.00 +CORR .50 .29 -.20 .32 .12 .38 1.00 +CORR .17 .29 -.05 .20 .27 .20 .04 1.00 +END DATA. 
+``` + +### Example 2: `FORMAT=UPPER NODIAGONAL` + +This syntax produces the same matrix file as example 1, but it uses +`FORMAT=UPPER NODIAGONAL` to specify the upper triangle and omit the +diagonal. Because the matrix's `ROWTYPE_` is `CORR`, PSPP automatically +fills in the diagonal with 1. + +``` +MATRIX DATA + VARIABLES=ROWTYPE_ var01 TO var08 + /FORMAT=UPPER NODIAGONAL. +BEGIN DATA. +MEAN 24.3 5.4 69.7 20.1 13.4 2.7 27.9 3.7 +SD 5.7 1.5 23.5 5.8 2.8 4.5 5.4 1.5 +N 92 92 92 92 92 92 92 92 +CORR .17 .50 -.33 .27 .36 -.22 .18 +CORR .29 .29 -.20 .32 .12 .38 +CORR .05 .20 -.15 .16 .21 +CORR .20 .32 -.17 .12 +CORR .27 .12 -.24 +CORR -.20 -.38 +CORR .04 +END DATA. +``` + +### Example 3: `N` subcommand + +This syntax uses the `N` subcommand in place of an `N` vector. It +produces the same matrix file as examples 1 and 2. + +``` +MATRIX DATA + VARIABLES=ROWTYPE_ var01 TO var08 + /FORMAT=UPPER NODIAGONAL + /N 92. +BEGIN DATA. +MEAN 24.3 5.4 69.7 20.1 13.4 2.7 27.9 3.7 +SD 5.7 1.5 23.5 5.8 2.8 4.5 5.4 1.5 +CORR .17 .50 -.33 .27 .36 -.22 .18 +CORR .29 .29 -.20 .32 .12 .38 +CORR .05 .20 -.15 .16 .21 +CORR .20 .32 -.17 .12 +CORR .27 .12 -.24 +CORR -.20 -.38 +CORR .04 +END DATA. +``` + +### Example 4: Split variables + +This syntax defines two matrices, using the variable `s1` to distinguish +between them. Notice how the order of variables in the input matches +their order on `VARIABLES`. This example also uses `FORMAT=FULL`. + +``` +MATRIX DATA + VARIABLES=s1 ROWTYPE_ var01 TO var04 + /SPLIT=s1 + /FORMAT=FULL. +BEGIN DATA. +0 MEAN 34 35 36 37 +0 SD 22 11 55 66 +0 N 99 98 99 92 +0 CORR 1 .9 .8 .7 +0 CORR .9 1 .6 .5 +0 CORR .8 .6 1 .4 +0 CORR .7 .5 .4 1 +1 MEAN 44 45 34 39 +1 SD 23 15 51 46 +1 N 98 34 87 23 +1 CORR 1 .2 .3 .4 +1 CORR .2 1 .5 .6 +1 CORR .3 .5 1 .7 +1 CORR .4 .6 .7 1 +END DATA. +``` + +### Example 5: Factor variables + +This syntax defines a matrix file that includes a factor variable `f1`. 
+The data includes mean, standard deviation, and count vectors for two +values of the factor variable, plus a correlation matrix for pooled +data. + +``` +MATRIX DATA + VARIABLES=ROWTYPE_ f1 var01 TO var04 + /FACTOR=f1. +BEGIN DATA. +MEAN 0 34 35 36 37 +SD 0 22 11 55 66 +N 0 99 98 99 92 +MEAN 1 44 45 34 39 +SD 1 23 15 51 46 +N 1 98 34 87 23 +CORR . 1 +CORR . .9 1 +CORR . .8 .6 1 +CORR . .7 .5 .4 1 +END DATA. +``` + +## Without `ROWTYPE_` + +If `VARIABLES` does not contain `ROWTYPE_`, the `CONTENTS` subcommand +defines the row types that appear in the file and their order. If +`CONTENTS` is omitted, `CONTENTS=CORR` is assumed. + +Factor variables without `ROWTYPE_` introduce special requirements, +illustrated below in Examples 9 and 10. + +### Example 6: Defaults without `ROWTYPE_` + +This example shows a simple use of `MATRIX DATA` with 8 variables named +`var01` through `var08`, without `ROWTYPE_`. This yields the same +matrix file as [Example 1](#example-1-defaults-with-rowtype_). + +``` +MATRIX DATA + VARIABLES=var01 TO var08 + /CONTENTS=MEAN SD N CORR. +BEGIN DATA. +24.3 5.4 69.7 20.1 13.4 2.7 27.9 3.7 + 5.7 1.5 23.5 5.8 2.8 4.5 5.4 1.5 + 92 92 92 92 92 92 92 92 +1.00 + .18 1.00 +-.22 -.17 1.00 + .36 .31 -.14 1.00 + .27 .16 -.12 .22 1.00 + .33 .15 -.17 .24 .21 1.00 + .50 .29 -.20 .32 .12 .38 1.00 + .17 .29 -.05 .20 .27 .20 .04 1.00 +END DATA. +``` + +### Example 7: Split variables with explicit values + +This syntax defines two matrices, using the variable `s1` to distinguish +between them. Each line of data begins with `s1`. This yields the same +matrix file as [Example 4](#example-4-split-variables). + +``` +MATRIX DATA + VARIABLES=s1 var01 TO var04 + /SPLIT=s1 + /FORMAT=FULL + /CONTENTS=MEAN SD N CORR. +BEGIN DATA. +0 34 35 36 37 +0 22 11 55 66 +0 99 98 99 92 +0 1 .9 .8 .7 +0 .9 1 .6 .5 +0 .8 .6 1 .4 +0 .7 .5 .4 1 +1 44 45 34 39 +1 23 15 51 46 +1 98 34 87 23 +1 1 .2 .3 .4 +1 .2 1 .5 .6 +1 .3 .5 1 .7 +1 .4 .6 .7 1 +END DATA. 
+``` + +### Example 8: Split variable with sequential values + +Like the previous example, this syntax defines two matrices with split +variable `s1`. In this case, though, `s1` is not listed in `VARIABLES`, +which means that its value does not appear in the data. Instead, +`MATRIX DATA` reads matrix data until the input is exhausted, supplying +1 for the first split, 2 for the second, and so on. + +``` +MATRIX DATA + VARIABLES=var01 TO var04 + /SPLIT=s1 + /FORMAT=FULL + /CONTENTS=MEAN SD N CORR. +BEGIN DATA. +34 35 36 37 +22 11 55 66 +99 98 99 92 + 1 .9 .8 .7 +.9 1 .6 .5 +.8 .6 1 .4 +.7 .5 .4 1 +44 45 34 39 +23 15 51 46 +98 34 87 23 + 1 .2 .3 .4 +.2 1 .5 .6 +.3 .5 1 .7 +.4 .6 .7 1 +END DATA. +``` + +### Factor variables without `ROWTYPE_` + +Without `ROWTYPE_`, factor variables introduce two new wrinkles to +`MATRIX DATA` syntax. First, the `CELLS` subcommand must declare the +number of combinations of factor variables present in the data. If +there is, for example, one factor variable for which the data contains +three values, one would write `CELLS=3`; if there are two (or more) +factor variables for which the data contains five combinations, one +would use `CELLS=5`; and so on. + +Second, the `CONTENTS` subcommand must distinguish within-cell data +from pooled data by enclosing within-cell row types in parentheses. +When different within-cell row types for a single factor appear in +subsequent lines, enclose the row types in a single set of parentheses; +when different factors' values for a given within-cell row type appear +in subsequent lines, enclose each row type in individual parentheses. + +Without `ROWTYPE_`, input lines for pooled data do not include factor +values, not even as missing values, but input lines for within-cell data +do. + +The following examples aim to clarify this syntax. 
+ +#### Example 9: Factor variables, grouping within-cell records by factor + +This syntax defines the same matrix file as [Example +5](#example-5-factor-variables), without using `ROWTYPE_`. It +declares `CELLS=2` because the data contains two values (0 and 1) for +factor variable `f1`. Within-cell vector row types `MEAN`, `SD`, and +`N` are in a single set of parentheses on `CONTENTS` because they are +grouped together in subsequent lines for a single factor value. The +data lines with the pooled correlation matrix do not have any factor +values. + +``` +MATRIX DATA + VARIABLES=f1 var01 TO var04 + /FACTOR=f1 + /CELLS=2 + /CONTENTS=(MEAN SD N) CORR. +BEGIN DATA. +0 34 35 36 37 +0 22 11 55 66 +0 99 98 99 92 +1 44 45 34 39 +1 23 15 51 46 +1 98 34 87 23 + 1 + .9 1 + .8 .6 1 + .7 .5 .4 1 +END DATA. +``` + +#### Example 10: Factor variables, grouping within-cell records by row type + +This syntax defines the same matrix file as the previous example. The +only difference is that the within-cell vector rows are grouped +differently: two rows of means (one for each factor), followed by two +rows of standard deviations, followed by two rows of counts. + +``` +MATRIX DATA + VARIABLES=f1 var01 TO var04 + /FACTOR=f1 + /CELLS=2 + /CONTENTS=(MEAN) (SD) (N) CORR. +BEGIN DATA. +0 34 35 36 37 +1 44 45 34 39 +0 22 11 55 66 +1 23 15 51 46 +0 99 98 99 92 +1 98 34 87 23 + 1 + .9 1 + .8 .6 1 + .7 .5 .4 1 +END DATA. +``` diff --git a/rust/doc/src/commands/matrix/matrix.md b/rust/doc/src/commands/matrix/matrix.md new file mode 100644 index 0000000000..85fdac56a5 --- /dev/null +++ b/rust/doc/src/commands/matrix/matrix.md @@ -0,0 +1,1835 @@ +# MATRIX + + + +## Summary + +``` +MATRIX. +…matrix commands… +END MATRIX. +``` + +The following basic matrix commands are supported: + +``` +COMPUTE variable[(index[,index])]=expression. +CALL procedure(argument, …). 
+PRINT [expression] + [/FORMAT=format] + [/TITLE=title] + [/SPACE={NEWPAGE | n}] + [{/RLABELS=string… | /RNAMES=expression}] + [{/CLABELS=string… | /CNAMES=expression}]. +``` + +The following matrix commands offer support for flow control: + +``` +DO IF expression. + …matrix commands… +[ELSE IF expression. + …matrix commands…]… +[ELSE + …matrix commands…] +END IF. + +LOOP [var=first TO last [BY step]] [IF expression]. + …matrix commands… +END LOOP [IF expression]. + +BREAK. +``` + +The following matrix commands support matrix input and output: + +``` +READ variable[(index[,index])] + [/FILE=file] + /FIELD=first TO last [BY width] + [/FORMAT=format] + [/SIZE=expression] + [/MODE={RECTANGULAR | SYMMETRIC}] + [/REREAD]. +WRITE expression + [/OUTFILE=file] + /FIELD=first TO last [BY width] + [/MODE={RECTANGULAR | TRIANGULAR}] + [/HOLD] + [/FORMAT=format]. +GET variable[(index[,index])] + [/FILE={file | *}] + [/VARIABLES=variable…] + [/NAMES=expression] + [/MISSING={ACCEPT | OMIT | number}] + [/SYSMIS={OMIT | number}]. +SAVE expression + [/OUTFILE={file | *}] + [/VARIABLES=variable…] + [/NAMES=expression] + [/STRINGS=variable…]. +MGET [/FILE=file] + [/TYPE={COV | CORR | MEAN | STDDEV | N | COUNT}]. +MSAVE expression + /TYPE={COV | CORR | MEAN | STDDEV | N | COUNT} + [/OUTFILE=file] + [/VARIABLES=variable…] + [/SNAMES=variable…] + [/SPLIT=expression] + [/FNAMES=variable…] + [/FACTOR=expression]. +``` + +The following matrix commands provide additional support: + +``` +DISPLAY [{DICTIONARY | STATUS}]. +RELEASE variable…. +``` + +`MATRIX` and `END MATRIX` enclose a special PSPP sub-language, called +the matrix language. The matrix language does not require an active +dataset to be defined and only a few of the matrix language commands +work with any datasets that are defined. Each instance of +`MATRIX`…`END MATRIX` is a separate program whose state is independent +of any other instance, so that variables declared within a matrix program are +forgotten at its end. 
+ +The matrix language works with matrices, where a "matrix" is a +rectangular array of real numbers. An `N`×`M` matrix has `N` rows and +`M` columns. Some special cases are important: an `N`×1 matrix is a +"column vector", a 1×`N` matrix is a "row vector", and a 1×1 matrix is a +"scalar". + +The matrix language also has limited support for matrices that +contain 8-byte strings instead of numbers. Strings longer than 8 bytes +are truncated, and shorter strings are padded with spaces. String +matrices are mainly useful for labeling rows and columns when printing +numerical matrices with the `MATRIX PRINT` command. Arithmetic +operations on string matrices will not produce useful results. The user +should not mix strings and numbers within a matrix. + +The matrix language does not work with cases. A variable in the +matrix language represents a single matrix. + +The matrix language does not support missing values. + +`MATRIX` is a procedure, so it cannot be enclosed inside `DO IF`, +`LOOP`, etc. + +Macros defined before a matrix program may be used within a matrix +program, and macros may expand to include entire matrix programs. The +[`DEFINE`](../../control/define.md) command to define new macros may +not appear within a matrix program. + +The following sections describe the details of the matrix language: +first, the syntax of matrix expressions, then each of the supported +commands. The `COMMENT` command is also supported. + +## Matrix Expressions + +Many matrix commands use expressions. A matrix expression may use the +following operators, listed in descending order of operator precedence. +Within a single level, operators associate from left to right. 
+ +- [Function call `()`](#matrix-functions) and [matrix construction `{}`](#matrix-construction-operator-) + +- [Indexing `()`](#index-operator-) + +- [Unary `+` and `-`](#unary-operators) + +- [Integer sequence `:`](#integer-sequence-operator-) + +- Matrix [`**`](#matrix-exponentiation-operator-) and elementwise [`&**`](#elementwise-binary-operators) exponentiation. + +- Matrix [`*`](#matrix-multiplication-operator-) and elementwise [`&*`](#elementwise-binary-operators) multiplication; [elementwise division `/` and `&/`](#elementwise-binary-operators). + +- [Addition `+` and subtraction `-`](#elementwise-binary-operators) + +- [Relational `<` `<=` `=` `>=` `>` `<>`](#elementwise-binary-operators) + +- [Logical `NOT`](#unary-operators) + +- [Logical `AND`](#elementwise-binary-operators) + +- [Logical `OR` and `XOR`](#elementwise-binary-operators) + +The operators are described in more detail below. [Matrix +Functions](#matrix-functions) documents matrix functions. + +Expressions appear in the matrix language in some contexts where there +would be ambiguity whether `/` is an operator or a separator between +subcommands. In these contexts, only the operators with higher +precedence than `/` are allowed outside parentheses. Later sections +call these "restricted expressions". + +### Matrix Construction Operator `{}` + +Use the `{}` operator to construct matrices. Within the curly braces, +commas separate elements within a row and semicolons separate rows. The +following examples show a 2×3 matrix, a 1×4 row vector, a 3×1 column +vector, and a scalar. + +``` +{1, 2, 3; 4, 5, 6} ⇒ [1 2 3] + [4 5 6] +{3.14, 6.28, 9.42, 12.57} ⇒ [3.14 6.28 9.42 12.57] +{1.41; 1.73; 2} ⇒ [1.41] + [1.73] + [2.00] +{5} ⇒ 5 +``` + + Curly braces are not limited to holding numeric literals. They can +contain calculations, and they can paste together matrices and vectors +in any way as long as the result is rectangular. 
For example, if `m` is +matrix `{1, 2; 3, 4}`, `r` is row vector `{5, 6}`, and `c` is column +vector `{7; 8}`, then curly braces can be used as follows: + +``` +{m, c; r, 10} ⇒ [1 2 7] + [3 4 8] + [5 6 10] +{c, 2 * c, T(r)} ⇒ [7 14 5] + [8 16 6] +``` + + The final example above uses the transposition function `T`. + +### Integer Sequence Operator `:` + +The syntax `FIRST:LAST:STEP` yields a row vector of consecutive integers +from `FIRST` to `LAST` counting by `STEP`. The final `:STEP` is optional and +defaults to 1 when omitted. + +`FIRST`, `LAST`, and `STEP` must each be a scalar and should be an +integer (any fractional part is discarded). Because `:` has a high +precedence, operands other than numeric literals must usually be +parenthesized. + +When `STEP` is positive (or omitted) and `LAST < FIRST`, or if `STEP` +is negative and `LAST > FIRST`, then the result is an empty matrix. If +`STEP` is 0, then PSPP reports an error. + +Here are some examples: + +``` +1:6 ⇒ {1, 2, 3, 4, 5, 6} +1:6:2 ⇒ {1, 3, 5} +-1:-5:-1 ⇒ {-1, -2, -3, -4, -5} +-1:-5 ⇒ {} +2:1:0 ⇒ (error) +``` + +### Index Operator `()` + +The result of the submatrix or indexing operator, written `M(RINDEX, +CINDEX)`, contains the rows of `M` whose indexes are given in vector +`RINDEX` and the columns whose indexes are given in vector `CINDEX`. 
+ + In the simplest case, if `RINDEX` and `CINDEX` are both scalars, the +result is also a scalar: + +``` +{10, 20; 30, 40}(1, 1) ⇒ 10 +{10, 20; 30, 40}(1, 2) ⇒ 20 +{10, 20; 30, 40}(2, 1) ⇒ 30 +{10, 20; 30, 40}(2, 2) ⇒ 40 +``` + +If the index arguments have multiple elements, then the result +includes multiple rows or columns: + +``` +{10, 20; 30, 40}(1:2, 1) ⇒ {10; 30} +{10, 20; 30, 40}(2, 1:2) ⇒ {30, 40} +{10, 20; 30, 40}(1:2, 1:2) ⇒ {10, 20; 30, 40} +``` + +The special argument `:` may stand in for all the rows or columns in +the matrix being indexed, like this: + +``` +{10, 20; 30, 40}(:, 1) ⇒ {10; 30} +{10, 20; 30, 40}(2, :) ⇒ {30, 40} +{10, 20; 30, 40}(:, :) ⇒ {10, 20; 30, 40} +``` + +The index arguments do not have to be in order, and they may contain +repeated values, like this: + +``` +{10, 20; 30, 40}({2, 1}, 1) ⇒ {30; 10} +{10, 20; 30, 40}(2, {2; 2; 1}) ⇒ {40, 40, 30} +{10, 20; 30, 40}(2:1:-1, :) ⇒ {30, 40; 10, 20} +``` + +When the matrix being indexed is a row or column vector, only a +single index argument is needed, like this: + +``` +{11, 12, 13, 14, 15}(2:4) ⇒ {12, 13, 14} +{11; 12; 13; 14; 15}(2:4) ⇒ {12; 13; 14} +``` + +When an index is not an integer, PSPP discards the fractional part. +It is an error for an index to be less than 1 or greater than the number +of rows or columns: + +``` +{11, 12, 13, 14}({2.5, 4.6}) ⇒ {12, 14} +{11; 12; 13; 14}(0) ⇒ (error) +``` + +### Unary Operators + +The unary operators take a single operand of any dimensions and operate +on each of its elements independently. The unary operators are: + +* `-`: Inverts the sign of each element. +* `+`: No change. +* `NOT`: Logical inversion: each positive value becomes 0 and each + zero or negative value becomes 1. + +Examples: + +``` +-{1, -2; 3, -4} ⇒ {-1, 2; -3, 4} ++{1, -2; 3, -4} ⇒ {1, -2; 3, -4} +NOT {1, 0; -1, 1} ⇒ {0, 1; 1, 0} +``` + +### Elementwise Binary Operators + +The elementwise binary operators require their operands to be matrices +with the same dimensions. 
Alternatively, if one operand is a scalar, +then its value is treated as if it were duplicated to the dimensions of +the other operand. The result is a matrix of the same size as the +operands, in which each element is the result of applying the +operator to the corresponding elements of the operands. + +The elementwise binary operators are listed below. + +- The arithmetic operators, for familiar arithmetic operations: + + - `+`: Addition. + + - `-`: Subtraction. + + - `*`: Multiplication, if one operand is a scalar. (Otherwise this + is matrix multiplication, described below.) + + - `/` or `&/`: Division. + + - `&*`: Multiplication. + + - `&**`: Exponentiation. + +- The relational operators, whose results are 1 when a comparison is + true and 0 when it is false: + + - `<` or `LT`: Less than. + + - `<=` or `LE`: Less than or equal. + + - `=` or `EQ`: Equal. + + - `>` or `GT`: Greater than. + + - `>=` or `GE`: Greater than or equal. + + - `<>` or `~=` or `NE`: Not equal. + +- The logical operators, which treat positive operands as true and + nonpositive operands as false. They yield 0 for false and 1 for + true: + + - `AND`: True if both operands are true. + + - `OR`: True if at least one operand is true. + + - `XOR`: True if exactly one operand is true. + +Examples: + +``` +1 + 2 ⇒ 3 +1 + {3; 4} ⇒ {4; 5} +{66, 77; 88, 99} + 5 ⇒ {71, 82; 93, 104} +{4, 8; 3, 7} + {1, 0; 5, 2} ⇒ {5, 8; 8, 9} +{1, 2; 3, 4} < {4, 3; 2, 1} ⇒ {1, 1; 0, 0} +{1, 3; 2, 4} >= 3 ⇒ {0, 1; 0, 1} +{0, 0; 1, 1} AND {0, 1; 0, 1} ⇒ {0, 0; 0, 1} +``` + +### Matrix Multiplication Operator `*` + +If `A` is an `M`×`N` matrix and `B` is an `N`×`P` matrix, then `A*B` is the +`M`×`P` matrix multiplication product `C`. PSPP reports an error if the +number of columns in `A` differs from the number of rows in `B`. + +The `*` operator performs elementwise multiplication (see above) if +one of its operands is a scalar. + +No built-in operator yields the inverse of matrix multiplication. 
+Instead, multiply by the result of `INV` or `GINV`. + +Some examples: + +``` +{1, 2, 3} * {4; 5; 6} ⇒ 32 +{4; 5; 6} * {1, 2, 3} ⇒ {4, 8, 12; + 5, 10, 15; + 6, 12, 18} +``` + +### Matrix Exponentiation Operator `**` + +The result of `A**B` is defined as follows when `A` is a square matrix +and `B` is an integer scalar: + + - For `B > 0`, `A**B` is `A*…*A`, where there are `B` `A`s. (PSPP + implements this efficiently for large `B`, using exponentiation by + squaring.) + + - For `B < 0`, `A**B` is `INV(A**(-B))`. + + - For `B = 0`, `A**B` is the identity matrix. + +PSPP reports an error if `A` is not square or `B` is not an integer. + +Examples: + +``` +{2, 5; 1, 4}**3 ⇒ {48, 165; 33, 114} +{2, 5; 1, 4}**0 ⇒ {1, 0; 0, 1} +10*{4, 7; 2, 6}**-1 ⇒ {6, -7; -2, 4} +``` + +## Matrix Functions + +The matrix language supports numerous functions in multiple categories. +The following subsections document each of the currently supported +functions. The first letter of each parameter's name indicates the +required argument type: + +* `S`: A scalar. + +* `N`: A nonnegative integer scalar. (Non-integers are accepted and + silently rounded down to the nearest integer.) + +* `V`: A row or column vector. + +* `M`: A matrix. + +### Elementwise Functions + +These functions act on each element of their argument independently, +like the [elementwise operators](#elementwise-binary-operators). + +* `ABS(M)` + Takes the absolute value of each element of `M`. + + ``` + ABS({-1, 2; -3, 0}) ⇒ {1, 2; 3, 0} + ``` + +* `ARSIN(M)` + `ARTAN(M)` + Computes the inverse sine or tangent, respectively, of each + element in `M`. The results are in radians, between \\(-\pi/2\\) + and \\(+\pi/2\\), inclusive. + + The value of \\(\pi\\) can be computed as `4*ARTAN(1)`. 
+ + ``` + ARSIN({-1, 0, 1}) ⇒ {-1.57, 0, 1.57} (approximately) + + ARTAN({-5, -1, 1, 5}) ⇒ {-1.37, -.79, .79, 1.37} (approximately) + ``` + +* `COS(M)` + `SIN(M)` + Computes the cosine or sine, respectively, of each element in `M`, + which must be in radians. + + ``` + COS({0.785, 1.57; 3.14, 1.57 + 3.14}) ⇒ {.71, 0; -1, 0} + (approximately) + ``` + +* `EXP(M)` + Computes \\(e^x\\) for each element \\(x\\) in `M`. + + ``` + EXP({2, 3; 4, 5}) ⇒ {7.39, 20.09; 54.6, 148.4} (approximately) + ``` + +* `LG10(M)` + `LN(M)` + Takes the logarithm with base 10 or base \\(e\\), respectively, of each + element in `M`. + + ``` + LG10({1, 10, 100, 1000}) ⇒ {0, 1, 2, 3} + LG10(0) ⇒ (error) + + LN({EXP(1), 1, 2, 3, 4}) ⇒ {1, 0, .69, 1.1, 1.39} (approximately) + LN(0) ⇒ (error) + ``` + +* `MOD(M, S)` + Takes each element in `M` modulo nonzero scalar value `S`, that + is, the remainder of division by `S`. The sign of the result is + the same as the sign of the dividend. + + ``` + MOD({5, 4, 3, 2, 1, 0}, 3) ⇒ {2, 1, 0, 2, 1, 0} + MOD({5, 4, 3, 2, 1, 0}, -3) ⇒ {2, 1, 0, 2, 1, 0} + MOD({-5, -4, -3, -2, -1, 0}, 3) ⇒ {-2, -1, 0, -2, -1, 0} + MOD({-5, -4, -3, -2, -1, 0}, -3) ⇒ {-2, -1, 0, -2, -1, 0} + MOD({5, 4, 3, 2, 1, 0}, 1.5) ⇒ {.5, 1.0, .0, .5, 1.0, .0} + MOD({5, 4, 3, 2, 1, 0}, 0) ⇒ (error) + ``` + +* `RND(M)` + `TRUNC(M)` + Rounds each element of `M` to an integer. `RND` rounds to the + nearest integer, with halves rounded to even integers, and + `TRUNC` rounds toward zero. + + ``` + RND({-1.6, -1.5, -1.4}) ⇒ {-2, -2, -1} + RND({-.6, -.5, -.4}) ⇒ {-1, 0, 0} + RND({.4, .5, .6}) ⇒ {0, 0, 1} + RND({1.4, 1.5, 1.6}) ⇒ {1, 2, 2} + + TRUNC({-1.6, -1.5, -1.4}) ⇒ {-1, -1, -1} + TRUNC({-.6, -.5, -.4}) ⇒ {0, 0, 0} + TRUNC({.4, .5, .6}) ⇒ {0, 0, 0} + TRUNC({1.4, 1.5, 1.6}) ⇒ {1, 1, 1} + ``` + +* `SQRT(M)` + Takes the square root of each element of `M`, which must not be + negative. 
+ + ``` + SQRT({0, 1, 2, 4, 9, 81}) ⇒ {0, 1, 1.41, 2, 3, 9} (approximately) + SQRT(-1) ⇒ (error) + ``` + +### Logical Functions + +* `ALL(M)` + Returns a scalar with value 1 if all of the elements in `M` are + nonzero, or 0 if at least one element is zero. + + ``` + ALL({1, 2, 3} < {2, 3, 4}) ⇒ 1 + ALL({2, 2, 3} < {2, 3, 4}) ⇒ 0 + ALL({2, 3, 3} < {2, 3, 4}) ⇒ 0 + ALL({2, 3, 4} < {2, 3, 4}) ⇒ 0 + ``` + +* `ANY(M)` + Returns a scalar with value 1 if any of the elements in `M` is + nonzero, or 0 if all of them are zero. + + ``` + ANY({1, 2, 3} < {2, 3, 4}) ⇒ 1 + ANY({2, 2, 3} < {2, 3, 4}) ⇒ 1 + ANY({2, 3, 3} < {2, 3, 4}) ⇒ 1 + ANY({2, 3, 4} < {2, 3, 4}) ⇒ 0 + ``` + +### Matrix Construction Functions + +* `BLOCK(M1, …, MN)` + Returns a block diagonal matrix with as many rows as the sum of + its arguments' row counts and as many columns as the sum of their + columns. Each argument matrix is placed along the main diagonal + of the result, and all other elements are zero. + + ``` + BLOCK({1, 2; 3, 4}, 5, {7; 8; 9}, {10, 11}) ⇒ + 1 2 0 0 0 0 + 3 4 0 0 0 0 + 0 0 5 0 0 0 + 0 0 0 7 0 0 + 0 0 0 8 0 0 + 0 0 0 9 0 0 + 0 0 0 0 10 11 + ``` + +* `IDENT(N)` + `IDENT(NR, NC)` + Returns an identity matrix, whose main diagonal elements are one + and whose other elements are zero. The returned matrix has `N` + rows and columns or `NR` rows and `NC` columns, respectively. + + ``` + IDENT(1) ⇒ 1 + IDENT(2) ⇒ + 1 0 + 0 1 + IDENT(3, 5) ⇒ + 1 0 0 0 0 + 0 1 0 0 0 + 0 0 1 0 0 + IDENT(5, 3) ⇒ + 1 0 0 + 0 1 0 + 0 0 1 + 0 0 0 + 0 0 0 + ``` + +* `MAGIC(N)` + Returns an `N`×`N` matrix that contains each of the integers 1 + through \\(N^2\\) once, in which each column, each row, and each + diagonal sums to \\(N(N^2+1)/2\\). There are many magic squares with + given dimensions, but this function always returns the same one for + a given value of `N`.
+ + ``` + MAGIC(3) ⇒ {8, 1, 6; 3, 5, 7; 4, 9, 2} + MAGIC(4) ⇒ {1, 5, 12, 16; 15, 11, 6, 2; 14, 8, 9, 3; 4, 10, 7, 13} + ``` + +* `MAKE(NR, NC, S)` + Returns an `NR`×`NC` matrix whose elements are all `S`. + + ``` + MAKE(1, 2, 3) ⇒ {3, 3} + MAKE(2, 1, 4) ⇒ {4; 4} + MAKE(2, 3, 5) ⇒ {5, 5, 5; 5, 5, 5} + ``` + +* `MDIAG(V)` + Given `N`-element vector `V`, returns an `N`×`N` matrix whose main + diagonal is copied from `V`. The other elements in the returned + matrix are zero. + + Use [`CALL SETDIAG`](#setdiag) to replace the main diagonal of a + matrix in place. + + ``` + MDIAG({1, 2, 3, 4}) ⇒ + 1 0 0 0 + 0 2 0 0 + 0 0 3 0 + 0 0 0 4 + ``` + +* `RESHAPE(M, NR, NC)` + Returns an `NR`×`NC` matrix whose elements come from `M`, which + must have the same number of elements as the new matrix, copying + elements from `M` to the new matrix row by row. + + ``` + RESHAPE(1:12, 1, 12) ⇒ + 1 2 3 4 5 6 7 8 9 10 11 12 + RESHAPE(1:12, 2, 6) ⇒ + 1 2 3 4 5 6 + 7 8 9 10 11 12 + RESHAPE(1:12, 3, 4) ⇒ + 1 2 3 4 + 5 6 7 8 + 9 10 11 12 + RESHAPE(1:12, 4, 3) ⇒ + 1 2 3 + 4 5 6 + 7 8 9 + 10 11 12 + ``` + +* `T(M)` + `TRANSPOS(M)` + Returns `M` with rows exchanged for columns. + + ``` + T({1, 2, 3}) ⇒ {1; 2; 3} + T({1; 2; 3}) ⇒ {1, 2, 3} + ``` + +* `UNIFORM(NR, NC)` + Returns an `NR`×`NC` matrix in which each element is randomly + chosen from a uniform distribution of real numbers between 0 + and 1. Random number generation honors the current seed setting + (see `SET SEED`).
+ + The following example shows one possible output, but of course + every result will be different (given different seeds): + + ``` + UNIFORM(4, 5)*10 ⇒ + 7.71 2.99 .21 4.95 6.34 + 4.43 7.49 8.32 4.99 5.83 + 2.25 .25 1.98 7.09 7.61 + 2.66 1.69 2.64 .88 1.50 + ``` + +### Minimum, Maximum, and Sum Functions + +* `CMIN(M)` + `CMAX(M)` + `CSUM(M)` + `CSSQ(M)` + Returns a row vector with the same number of columns as `M`, in + which each element is the minimum, maximum, sum, or sum of + squares, respectively, of the elements in the same column of `M`. + + ``` + CMIN({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {1, 2, 3} + CMAX({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {7, 8, 9} + CSUM({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {12, 15, 18} + CSSQ({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {66, 93, 126} + ``` + +* `MMIN(M)` + `MMAX(M)` + `MSUM(M)` + `MSSQ(M)` + Returns the minimum, maximum, sum, or sum of squares, respectively, + of the elements of `M`. + + ``` + MMIN({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 1 + MMAX({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 9 + MSUM({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 45 + MSSQ({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 285 + ``` + +* `RMIN(M)` + `RMAX(M)` + `RSUM(M)` + `RSSQ(M)` + Returns a column vector with the same number of rows as `M`, in + which each element is the minimum, maximum, sum, or sum of + squares, respectively, of the elements in the same row of `M`. + + ``` + RMIN({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {1; 4; 7} + RMAX({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {3; 6; 9} + RSUM({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {6; 15; 24} + RSSQ({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {14; 77; 194} + ``` + +* `SSCP(M)` + Returns \\({\bf M}^{\bf T} × \bf M\\). + + ``` + SSCP({1, 2, 3; 4, 5, 6}) ⇒ {17, 22, 27; 22, 29, 36; 27, 36, 45} + ``` + +* `TRACE(M)` + Returns the sum of the elements along `M`'s main diagonal, + equivalent to `MSUM(DIAG(M))`. + + ``` + TRACE(MDIAG(1:5)) ⇒ 15 + ``` + +### Matrix Property Functions + +* `NROW(M)` + `NCOL(M)` + Returns the number of rows or columns, respectively, in `M`.
+ + ``` + NROW({1, 0; -2, -3; 3, 3}) ⇒ 3 + NROW(1:5) ⇒ 1 + + NCOL({1, 0; -2, -3; 3, 3}) ⇒ 2 + NCOL(1:5) ⇒ 5 + ``` + +* `DIAG(M)` + Returns a column vector containing a copy of `M`'s main diagonal. + The vector's length is the lesser of `NCOL(M)` and `NROW(M)`. + + ``` + DIAG({1, 0; -2, -3; 3, 3}) ⇒ {1; -3} + ``` + +### Matrix Rank Ordering Functions + +The `GRADE` and `RNKORDER` functions each take a matrix `M` and return a +matrix `R` with the same dimensions. Each element in `R` ranges +between 1 and the number of elements `N` in `M`, inclusive. When the +elements in `M` all have unique values, both of these functions yield +the same results: the smallest element in `M` corresponds to value 1 +in `R`, the next smallest to 2, and so on, up to the largest to `N`. +When multiple elements in `M` have the same value, these functions use +different rules for handling the ties. + +* `GRADE(M)` + Returns a ranking of `M`, turning duplicate values into sequential + ranks. The returned matrix always contains each of the integers 1 + through the number of elements in the matrix exactly once. + + ``` + GRADE({1, 0, 3; 3, 1, 2; 3, 0, 5}) ⇒ {3, 1, 6; 7, 4, 5; 8, 2, 9} + ``` + +* `RNKORDER(M)` + Returns a ranking of `M`, turning duplicate values into the mean + of their sequential ranks. + + ``` + RNKORDER({1, 0, 3; 3, 1, 2; 3, 0, 5}) + ⇒ {3.5, 1.5, 7; 7, 3.5, 5; 7, 1.5, 9} + ``` + +One may use `GRADE` to sort a vector: + +``` +COMPUTE v(GRADE(v))=v. /* Sort v in ascending order. +COMPUTE v(GRADE(-v))=v. /* Sort v in descending order. +``` + +### Matrix Algebra Functions + +* `CHOL(M)` + Matrix `M` must be an `N`×`N` symmetric positive-definite matrix. + Returns an `N`×`N` matrix `B` such that \\({\bf B}^{\bf T}×{\bf + B}=\bf M\\). + + ``` + CHOL({4, 12, -16; 12, 37, -43; -16, -43, 98}) ⇒ + 2 6 -8 + 0 1 5 + 0 0 3 + ``` + +* `DESIGN(M)` + Returns a design matrix for `M`. The design matrix has the same + number of rows as `M`.
Each column `C` in `M`, from left to right, + yields a group of columns in the output. For each unique value + `V` in `C`, from top to bottom, add a column to the output in + which `V` becomes 1 and other values become 0. + + PSPP issues a warning if a column only contains a single unique + value. + + ``` + DESIGN({1; 2; 3}) ⇒ {1, 0, 0; 0, 1, 0; 0, 0, 1} + DESIGN({5; 8; 5}) ⇒ {1, 0; 0, 1; 1, 0} + DESIGN({1, 5; 2, 8; 3, 5}) + ⇒ {1, 0, 0, 1, 0; 0, 1, 0, 0, 1; 0, 0, 1, 1, 0} + DESIGN({5; 5; 5}) ⇒ (warning) + ``` + +* `DET(M)` + Returns the determinant of square matrix `M`. + + ``` + DET({3, 7; 1, -4}) ⇒ -19 + ``` + +* `EVAL(M)` + Returns a column vector containing the eigenvalues of symmetric + matrix `M`, sorted in descending order. + + Use [`CALL EIGEN`](#eigen) to compute eigenvalues and eigenvectors + of a matrix. + + ``` + EVAL({2, 0, 0; 0, 3, 4; 0, 4, 9}) ⇒ {11; 2; 1} + ``` + +* `GINV(M)` + Returns the `K`×`N` matrix `A` that is the "generalized inverse" + of `N`×`K` matrix `M`, defined such that \\({\bf M}×{\bf A}×{\bf + M}={\bf M}\\) and \\({\bf A}×{\bf M}×{\bf A}={\bf A}\\). + + ``` + GINV({1, 2}) ⇒ {.2; .4} (approximately) + {1:9} * GINV(1:9) * {1:9} ⇒ {1:9} (approximately) + ``` + +* `GSCH(M)` + `M` must be a `N`×`M` matrix, `M` ≥ `N`, with rank `N`. Returns + an `N`×`N` orthonormal basis for `M`, obtained using the + [Gram-Schmidt + process](https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process). + + ``` + GSCH({3, 2; 1, 2}) * SQRT(10) ⇒ {3, -1; 1, 3} (approximately) + ``` + +* `INV(M)` + Returns the `N`×`N` matrix `A` that is the inverse of `N`×`N` matrix `M`, + defined such that \\({\bf M}×{\bf A} = {\bf A}×{\bf M} = {\bf I}\\), where \\({\bf I}\\) is the identity matrix. `M` + must not be singular, that is, \\(\det({\bf M}) ≠ 0\\).
+ + ``` + INV({4, 7; 2, 6}) ⇒ {.6, -.7; -.2, .4} (approximately) + ``` + +* `KRONEKER(MA, MB)` + Returns the `PM`×`QN` matrix P that is the [Kronecker + product](https://en.wikipedia.org/wiki/Kronecker_product) of `M`×`N` + matrix `MA` and `P`×`Q` matrix `MB`. One may view P as the + concatenation of multiple `P`×`Q` blocks, each of which is the + scalar product of `MB` by a different element of `MA`. For example, + when `A` is a 2×2 matrix, `KRONEKER(A, B)` is equivalent to + `{A(1,1)*B, A(1,2)*B; A(2,1)*B, A(2,2)*B}`. + + ``` + KRONEKER({1, 2; 3, 4}, {0, 5; 6, 7}) ⇒ + 0 5 0 10 + 6 7 12 14 + 0 15 0 20 + 18 21 24 28 + ``` + +* `RANK(M)` + Returns the rank of matrix `M`, an integer scalar whose value is the + dimension of the vector space spanned by its columns or, + equivalently, by its rows. + + ``` + RANK({1, 0, 1; -2, -3, 1; 3, 3, 0}) ⇒ 2 + RANK({1, 1, 0, 2; -1, -1, 0, -2}) ⇒ 1 + RANK({1, -1; 1, -1; 0, 0; 2, -2}) ⇒ 1 + RANK({1, 2, 1; -2, -3, 1; 3, 5, 0}) ⇒ 2 + RANK({1, 0, 2; 2, 1, 0; 3, 2, 1}) ⇒ 3 + ``` + +* `SOLVE(MA, MB)` + `MA` must be an `N`×`N` matrix, with \\(\det({\bf MA}) ≠ 0\\), and `MB` an `N`×`K` matrix. + Returns an `N`×`K` matrix `X` such that \\({\bf MA} × {\bf X} = {\bf MB}\\). + + All of the following examples show approximate results: + + ``` + SOLVE({2, 3; 4, 9}, {6, 2; 15, 5}) ⇒ + 1.50 .50 + 1.00 .33 + SOLVE({1, 3, -2; 3, 5, 6; 2, 4, 3}, {5; 7; 8}) ⇒ + -15.00 + 8.00 + 2.00 + SOLVE({2, 1, -1; -3, -1, 2; -2, 1, 2}, {8; -11; -3}) ⇒ + 2.00 + 3.00 + -1.00 + ``` + +* `SVAL(M)` + + Given `P`×`Q` matrix `M`, returns a \\(\min(P,Q)\\)-element column vector + containing the singular values of `M` in descending order. + + Use [`CALL SVD`](#svd) to compute the full singular value + decomposition of a matrix.
+ + ``` + SVAL({1, 1; 0, 0}) ⇒ {1.41; .00} + SVAL({1, 0, 1; 0, 1, 1; 0, 0, 0}) ⇒ {1.73; 1.00; .00} + SVAL({2, 4; 1, 3; 0, 0; 0, 0}) ⇒ {5.46; .37} + ``` + +* `SWEEP(M, NK)` + Given `P`×`Q` matrix `M` and integer scalar \\(k\\) = `NK` such that \\(1 ≤ k ≤ + \min(P,Q)\\), returns the `P`×`Q` sweep matrix A. + + If \\({\bf M}_{kk} ≠ 0\\), then: + + \\[ + \begin{align} + A_{kk} &= 1/M_{kk},\\\\ + A_{ik} &= -M_{ik}/M_{kk} \text{ for } i ≠ k,\\\\ + A_{kj} &= M_{kj}/M_{kk} \text{ for } j ≠ k,\\\\ + A_{ij} &= M_{ij} - M_{ik}M_{kj}/M_{kk} \text{ for } i ≠ k \text{ and } j ≠ k. + \end{align} + \\] + + If \\({\bf M}_{kk}\\) = 0, then: + + \\[ + \begin{align} + A_{ik} &= A_{ki} = 0, \\\\ + A_{ij} &= M_{ij}, \text{ for } i ≠ k \text{ and } j ≠ k. + \end{align} + \\] + + Given `M = {0, 1, 2; 3, 4, 5; 6, 7, 8}`, then (approximately): + + ``` + SWEEP(M, 1) ⇒ + .00 .00 .00 + .00 4.00 5.00 + .00 7.00 8.00 + SWEEP(M, 2) ⇒ + -.75 -.25 .75 + .75 .25 1.25 + .75 -1.75 -.75 + SWEEP(M, 3) ⇒ + -1.50 -.75 -.25 + -.75 -.38 -.63 + .75 .88 .13 + ``` + +### Matrix Statistical Distribution Functions + +The matrix language can calculate several functions of standard +statistical distributions using the same syntax and semantics as in PSPP +transformation expressions. See Statistical Distribution Functions +for details. + + The matrix language extends the `PDF`, `CDF`, `SIG`, `IDF`, `NPDF`, +and `NCDF` functions by allowing the first parameter to each of these +functions to be a vector or matrix with any dimensions. In addition, +`CDF.BVNOR` and `PDF.BVNOR` allow either or both of their first two +parameters to be vectors or matrices; if both are non-scalar then they +must have the same dimensions. In each case, the result is a matrix +or vector with the same dimensions as the input populated with +elementwise calculations. + +### `EOF` Function + +This function works with files being used on the `READ` statement.
+ +* `EOF(FILE)` + + Given a file handle or file name `FILE`, returns an integer scalar 1 + if the last line in the file has been read or 0 if more lines are + available. Determining this requires attempting to read another + line, which means that `REREAD` on the next `READ` command + following `EOF` on the same file will be ineffective. + +The `EOF` function gives a matrix program the flexibility to read a +file with text data without knowing the length of the file in advance. +For example, the following program will read all the lines of data in +`data.txt`, each consisting of three numbers, as rows in matrix `data`: + +``` +MATRIX. +COMPUTE data={}. +LOOP IF NOT EOF('data.txt'). + READ row/FILE='data.txt'/FIELD=1 TO 1000/SIZE={1,3}. + COMPUTE data={data; row}. +END LOOP. +PRINT data. +END MATRIX. +``` + +## The `COMPUTE` Command + +``` +COMPUTE variable[(index[,index])]=expression. +``` + + The `COMPUTE` command evaluates an expression and assigns the +result to a variable or a submatrix of a variable. Assigning to a +submatrix uses the same syntax as the [index +operator](#index-operator-). + +## The `CALL` Command + +A matrix function returns a single result. The `CALL` command +implements procedures, which take a similar syntactic form to functions +but yield results by modifying their arguments rather than returning a +value. + +Each output argument to a `CALL` procedure must be a single variable +name. + +The following procedures are implemented via `CALL` to allow them to +return multiple results. For these procedures, the output arguments +need not name existing variables; if they do, then their previous +values are replaced: + +* `CALL EIGEN(M, EVEC, EVAL)` + + Computes the eigenvalues and eigenvectors of symmetric `N`×`N` matrix `M`. + Assigns the eigenvectors of `M` to the columns of `N`×`N` matrix `EVEC` and + the eigenvalues in descending order to `N`-element column vector + `EVAL`.
+ + Use the [`EVAL`](#eval) function to compute just the eigenvalues of + a symmetric matrix. + + For example, the following matrix language commands: + + ``` + CALL EIGEN({1, 0; 0, 1}, evec, eval). + PRINT evec. + PRINT eval. + + CALL EIGEN({3, 2, 4; 2, 0, 2; 4, 2, 3}, evec2, eval2). + PRINT evec2. + PRINT eval2. + ``` + + yield this output: + + ``` + evec + 1 0 + 0 1 + + eval + 1 + 1 + + evec2 + -.6666666667 .0000000000 .7453559925 + -.3333333333 -.8944271910 -.2981423970 + -.6666666667 .4472135955 -.5962847940 + + eval2 + 8.0000000000 + -1.0000000000 + -1.0000000000 + ``` + +* `CALL SVD(M, U, S, V)` + + Computes the singular value decomposition of `P`×`Q` matrix `M`, + assigning `S` a `P`×`Q` diagonal matrix and to `U` and `V` unitary `P`×`P` + and `Q`×`Q` matrices, respectively, such that \\({\bf M} = {\bf U}×{\bf S}×{\bf V}^{\bf T}\\). + The main diagonal of `S` contains the singular values of `M`. + + Use the [`SVAL`](#sval) function to compute just the singular values + of a matrix. + + For example, the following matrix program: + + ``` + CALL SVD({3, 2, 2; 2, 3, -2}, u, s, v). + PRINT (u * s * T(v))/FORMAT F5.1. + ``` + + yields this output: + + ``` + (u * s * T(v)) + 3.0 2.0 2.0 + 2.0 3.0 -2.0 + ``` + +The final procedure is implemented via `CALL` to allow it to modify a +matrix instead of returning a modified version. For this procedure, +the output argument must name an existing variable. + +* `CALL SETDIAG(M, V)` + + Replaces the main diagonal of `N`×`P` matrix `M` by the contents of + `K`-element vector `V`. If `K` = 1, so that `V` is a scalar, replaces all + of the diagonal elements of `M` by `V`. If \\(K < \min(N,P)\\), only the + upper `K` diagonal elements are replaced; if \\(K > \min(N,P)\\), then the + extra elements of `V` are ignored. + + Use the [`MDIAG`](#mdiag) function to construct a new matrix with a + specified main diagonal. + + For example, this matrix program: + + ``` + COMPUTE x={1, 2, 3; 4, 5, 6; 7, 8, 9}. + CALL SETDIAG(x, 10). + PRINT x.
+ ``` + + outputs the following: + + ``` + x + 10 2 3 + 4 10 6 + 7 8 10 + ``` + +## The `PRINT` Command + +``` +PRINT [expression] + [/FORMAT=format] + [/TITLE=title] + [/SPACE={NEWPAGE | n}] + [{/RLABELS=string… | /RNAMES=expression}] + [{/CLABELS=string… | /CNAMES=expression}]. +``` + + The `PRINT` command is commonly used to display a matrix. It +evaluates the restricted EXPRESSION, if present, and outputs it either +as text or a pivot table, depending on the setting of `MDISPLAY` (see +`SET MDISPLAY`). + + Use the `FORMAT` subcommand to specify a format, such as `F8.2`, for +displaying the matrix elements. `FORMAT` is optional for numerical +matrices. When it is omitted, PSPP chooses how to format entries +automatically using \\(m\\), the magnitude of the largest-magnitude element in +the matrix to be displayed: + + 1. If \\(m < 10^{11}\\) and the matrix's elements are all integers, + PSPP chooses the narrowest `F` format that fits \\(m\\) plus a + sign. For example, if the matrix is `{1:10}`, then \\(m = 10\\), + which fits in 3 columns with room for a sign, so the format is + `F3.0`. + + 2. Otherwise, if \\(m ≥ 10^9\\) or \\(m ≤ 10^{-4}\\), PSPP scales + all of the numbers in the matrix by \\(10^x\\), where \\(x\\) is + the exponent that would be used to display \\(m\\) in scientific + notation. For example, for \\(m = 5.123×10^{20}\\), the scale + factor is \\(10^{20}\\). PSPP displays the scaled values in + format `F13.10` and notes the scale factor in the output. + + 3. Otherwise, PSPP displays the matrix values, without scaling, in + format `F13.10`. + + The optional `TITLE` subcommand specifies a title for the output text +or table, as a quoted string. When it is omitted, the syntax of the +matrix expression is used as the title. + + Use the `SPACE` subcommand to request extra space above the matrix +output. With a numerical argument, it adds the specified number of +lines of blank space above the matrix.
With `NEWPAGE` as an argument, +it prints the matrix at the top of a new page. The `SPACE` subcommand +has no effect when a matrix is output as a pivot table. + + The `RLABELS` and `RNAMES` subcommands, which are mutually exclusive, +can supply a label to accompany each row in the output. With `RLABELS`, +specify the labels as comma-separated strings or other tokens. With +`RNAMES`, specify a single expression that evaluates to a vector of +strings. Either way, if there are more labels than rows, the extra +labels are ignored, and if there are more rows than labels, the extra +rows are unlabeled. For output to a pivot table with `RLABELS`, the +labels can be any length; otherwise, the labels are truncated to 8 +bytes. + + The `CLABELS` and `CNAMES` subcommands work for labeling columns as +`RLABELS` and `RNAMES` do for labeling rows. + + When the EXPRESSION is omitted, `PRINT` does not output a matrix. +Instead, it outputs only the text specified on `TITLE`, if any, preceded +by any space specified on the `SPACE` subcommand, if any. Any other +subcommands are ignored, and the command acts as if `MDISPLAY` is set to +`TEXT` regardless of its actual setting. + +### Example + + The following syntax demonstrates two different ways to label the +rows and columns of a matrix with `PRINT`: + +``` +MATRIX. +COMPUTE m={1, 2, 3; 4, 5, 6; 7, 8, 9}. +PRINT m/RLABELS=a, b, c/CLABELS=x, y, z. + +COMPUTE rlabels={"a", "b", "c"}. +COMPUTE clabels={"x", "y", "z"}. +PRINT m/RNAMES=rlabels/CNAMES=clabels. +END MATRIX. +``` + +With `MDISPLAY=TEXT` (the default), this program outputs the following +(twice): + +``` +m + x y z +a 1 2 3 +b 4 5 6 +c 7 8 9 +``` + +With `SET MDISPLAY=TABLES.` added above `MATRIX.`, the output becomes +the following (twice): + +``` + m +┌─┬─┬─┬─┐ +│ │x│y│z│ +├─┼─┼─┼─┤ +│a│1│2│3│ +│b│4│5│6│ +│c│7│8│9│ +└─┴─┴─┴─┘ +``` + + +## The `DO IF` Command + +``` +DO IF expression. + …matrix commands… +[ELSE IF expression. 
+ …matrix commands…]… +[ELSE. + …matrix commands…] +END IF. +``` + + A `DO IF` command evaluates its expression argument. If the `DO IF` +expression evaluates to true, then PSPP executes the associated +commands. Otherwise, PSPP evaluates the expression on each `ELSE IF` +clause (if any) in order, and executes the commands associated with the +first one that yields a true value. Finally, if the `DO IF` and +`ELSE IF` expressions all evaluate to false, PSPP executes the commands +following the `ELSE` clause (if any). + + Each expression on `DO IF` and `ELSE IF` must evaluate to a scalar. +Positive scalars are considered to be true, and scalars that are zero or +negative are considered to be false. + +### Example + + The following matrix language fragment sets `b` to the term +following `a` in the [Juggler +sequence](https://en.wikipedia.org/wiki/Juggler_sequence): + +``` +DO IF MOD(a, 2) = 0. + COMPUTE b = TRUNC(a &** (1/2)). +ELSE. + COMPUTE b = TRUNC(a &** (3/2)). +END IF. +``` + +## The `LOOP` and `BREAK` Commands + +``` +LOOP [var=first TO last [BY step]] [IF expression]. + …matrix commands… +END LOOP [IF expression]. + +BREAK. +``` + + The `LOOP` command executes a nested group of matrix commands, called +the loop's "body", repeatedly. It has three optional clauses that +control how many times the loop body executes. Regardless of these +clauses, the global `MXLOOPS` setting, which defaults to 40, also limits +the number of iterations of a loop. To iterate more times, raise the +maximum with `SET MXLOOPS` outside of the `MATRIX` command (see `SET +MXLOOPS`). + + The optional index clause causes VAR to be assigned successive +values on each trip through the loop: first `FIRST`, then `FIRST + +STEP`, then `FIRST + 2 × STEP`, and so on. The loop ends when `VAR > +LAST`, for positive `STEP`, or `VAR < LAST`, for negative `STEP`. If +`STEP` is not specified, it defaults to 1.
All the index clause +expressions must evaluate to scalars, and non-integers are rounded +toward zero. If `STEP` evaluates as zero (or rounds to zero), then +the loop body never executes. + + The optional `IF` on `LOOP` is evaluated before each iteration +through the loop body. If its expression, which must evaluate to a +scalar, is zero or negative, then the loop terminates without executing +the loop body. + + The optional `IF` on `END LOOP` is evaluated after each iteration +through the loop body. If its expression, which must evaluate to a +scalar, is zero or negative, then the loop terminates. + +### Example + + The following computes and prints \\(l(n)\\), whose value is the +number of steps in the [Juggler +sequence](https://en.wikipedia.org/wiki/Juggler_sequence) for \\(n\\), +for \\( 2 \le n \le 10\\): + +``` +COMPUTE l = {}. +LOOP n = 2 TO 10. + COMPUTE a = n. + LOOP i = 1 TO 100. + DO IF MOD(a, 2) = 0. + COMPUTE a = TRUNC(a &** (1/2)). + ELSE. + COMPUTE a = TRUNC(a &** (3/2)). + END IF. + END LOOP IF a = 1. + COMPUTE l = {l; i}. +END LOOP. +PRINT l. +``` + +### The `BREAK` Command + +The `BREAK` command may be used inside a loop body, ordinarily within a +`DO IF` command. If it is executed, then the loop terminates +immediately, jumping to the command just following `END LOOP`. When +multiple `LOOP` commands nest, `BREAK` terminates the innermost loop. + +#### Example + +The following example is a revision of the one above that shows how +`BREAK` could substitute for the index and `IF` clauses on `LOOP` and +`END LOOP`: + +``` +COMPUTE l = {}. +LOOP n = 2 TO 10. + COMPUTE a = n. + COMPUTE i = 1. + LOOP. + DO IF MOD(a, 2) = 0. + COMPUTE a = TRUNC(a &** (1/2)). + ELSE. + COMPUTE a = TRUNC(a &** (3/2)). + END IF. + DO IF a = 1. + BREAK. + END IF. + COMPUTE i = i + 1. + END LOOP. + COMPUTE l = {l; i}. +END LOOP. +PRINT l. +``` + +## The `READ` and `WRITE` Commands + +The `READ` and `WRITE` commands perform matrix input and output with +text files. 
They share the following syntax for specifying how data is +divided among input lines: + +``` +/FIELD=first TO last [BY width] +[/FORMAT=format] +``` + +Both commands require the `FIELD` subcommand. It specifies the range +of columns, from FIRST to LAST, inclusive, that the data occupies on +each line of the file. The leftmost column is column 1. The columns +must be literal numbers, not expressions. To use entire lines, even if +they might be very long, specify a column range such as `1 TO 100000`. + +The `FORMAT` subcommand is optional for numerical matrices. For +string matrix input and output, specify an `A` format. In addition to +`FORMAT`, the optional `BY` specification on `FIELD` determines the +meaning of each text line: + +- With neither `BY` nor `FORMAT`, the numbers in the text file are in + `F` format separated by spaces or commas. For `WRITE`, PSPP uses + as many digits of precision as needed to accurately represent the + numbers in the matrix. + +- `BY width` divides the input area into fixed-width fields with the + given width. The input area must be a multiple of width columns + wide. Numbers are read or written as `Fwidth.0` format. + +- `FORMAT="countF"` divides the input area into integer count + equal-width fields per line. The input area must be a multiple of + count columns wide. Another format type may be substituted for + `F`. + +- `FORMAT=Fw`[`.d`] divides the input area into fixed-width fields + with width `w`. The input area must be a multiple of `w` columns + wide. Another format type may be substituted for `F`. The + `READ` command disregards `d`. + +- `FORMAT=F` specifies format `F` without indicating a field width. + Another format type may be substituted for `F`. The `WRITE` + command accepts this form, but it has no effect unless `BY` is also + used to specify a field width. + +If `BY` and `FORMAT` both specify or imply a field width, then they +must indicate the same field width.
+ +### The `READ` Command + +``` +READ variable[(index[,index])] + [/FILE=file] + /FIELD=first TO last [BY width] + [/FORMAT=format] + [/SIZE=expression] + [/MODE={RECTANGULAR | SYMMETRIC}] + [/REREAD]. +``` + +The `READ` command reads from a text file into a matrix variable. +Specify the target variable just after the command name, either just a +variable name to create or replace an entire variable, or a variable +name followed by an indexing expression to replace a submatrix of an +existing variable. + +The `FILE` subcommand is required in the first `READ` command that +appears within `MATRIX`. It specifies the text file to be read, either +as a file name in quotes or a file handle previously declared on `FILE +HANDLE` (see `FILE HANDLE`). Later `READ` commands (in syntax order) +use the previously referenced file if `FILE` is omitted. + +The `FIELD` and `FORMAT` subcommands specify how input lines are +interpreted. `FIELD` is required, but `FORMAT` is optional. See +[The `READ` and `WRITE` Commands](#the-read-and-write-commands) for details. + +The `SIZE` subcommand is required for reading into an entire +variable. Its restricted expression argument should evaluate to a +2-element vector `{N, M}` or `{N; M}`, which indicates an `N`×`M` +matrix destination. A scalar `N` is also allowed and indicates an +`N`×1 column vector destination. When the destination is a submatrix, +`SIZE` is optional, and if it is present then it must match the size +of the submatrix. + +By default, or with `MODE=RECTANGULAR`, the command reads an entry +for every row and column. With `MODE=SYMMETRIC`, the command reads only +the entries on and below the matrix's main diagonal, and copies the +entries above the main diagonal from the corresponding symmetric entries +below it. Only square matrices may use `MODE=SYMMETRIC`. + +Ordinarily, each `READ` command starts from a new line in the text +file. Specify the `REREAD` subcommand to instead start from the last +line read by the previous `READ` command.
This has no effect for the +first `READ` command to read from a particular file. It is also +ineffective just after a command that uses the `EOF` matrix function +(see [`EOF` Function](#eof-function)) on a particular file, because `EOF` has to +try to read the next line from the file to determine whether the file +contains more input. + +#### Example 1: Basic Use + +The following matrix program reads the same matrix `{1, 2, 4; 2, 3, 5; +4, 5, 6}` into matrix variables `v`, `w`, and `x`: + +``` +READ v /FILE='input.txt' /FIELD=1 TO 100 /SIZE={3, 3}. +READ w /FIELD=1 TO 100 /SIZE={3; 3} /MODE=SYMMETRIC. +READ x /FIELD=1 TO 100 BY 1/SIZE={3, 3} /MODE=SYMMETRIC. +``` +given that `input.txt` contains the following: + +``` +1, 2, 4 +2, 3, 5 +4, 5, 6 +1 +2 3 +4 5 6 +1 +23 +456 +``` +The `READ` command will read as many lines of input as needed for a +particular row, so it's also acceptable to break any of the lines above +into multiple lines. For example, the first line `1, 2, 4` could be +written with a line break following either or both commas. + +#### Example 2: Reading into a Submatrix + +The following reads a 5×5 matrix from `input2.txt`, reversing the order +of the rows: + +``` +COMPUTE m = MAKE(5, 5, 0). +LOOP r = 5 TO 1 BY -1. + READ m(r, :) /FILE='input2.txt' /FIELD=1 TO 100. +END LOOP. +``` +#### Example 3: Using `REREAD` + +Suppose each of the 5 lines in a file `input3.txt` starts with an +integer COUNT followed by COUNT numbers, e.g.: + +``` +1 5 +3 1 2 3 +5 6 -1 2 5 1 +2 8 9 +3 1 3 2 +``` +Then, the following reads this file into a matrix `m`: + +``` +COMPUTE m = MAKE(5, 5, 0). +LOOP i = 1 TO 5. + READ count /FILE='input3.txt' /FIELD=1 TO 1 /SIZE=1. + READ m(i, 1:count) /FIELD=3 TO 100 /REREAD. +END LOOP. +``` +### The `WRITE` Command + +``` +WRITE expression + [/OUTFILE=file] + /FIELD=first TO last [BY width] + [/FORMAT=format] + [/MODE={RECTANGULAR | TRIANGULAR}] + [/HOLD].
+``` +The `WRITE` command evaluates expression and writes its value to a +text file in a specified format. Write the expression to evaluate just +after the command name. + +The `OUTFILE` subcommand is required in the first `WRITE` command +that appears within `MATRIX`. It specifies the text file to be written, +either as a file name in quotes or a file handle previously declared on +`FILE HANDLE` (see `FILE HANDLE`). Later `WRITE` commands (in syntax +order) use the previously referenced file if `OUTFILE` is omitted. + +The `FIELD` and `FORMAT` subcommands specify how output lines are +formed. `FIELD` is required, but `FORMAT` is optional. See +[The `READ` and `WRITE` Commands](#the-read-and-write-commands) for details. + +By default, or with `MODE=RECTANGULAR`, the command writes an entry +for every row and column. With `MODE=TRIANGULAR`, the command writes +only the entries on and below the matrix's main diagonal. Entries above +the diagonal are not written. Only square matrices may be written with +`MODE=TRIANGULAR`. + +Ordinarily, each `WRITE` command writes complete lines to the output +file. With `HOLD`, the final line written by `WRITE` will be held back +for the next `WRITE` command to augment. This can be useful to write +more than one matrix on a single output line. + +#### Example 1: Basic Usage + +This matrix program: + +``` +WRITE {1, 2; 3, 4} /OUTFILE='matrix.txt' /FIELD=1 TO 80. +``` +writes the following to `matrix.txt`: + +``` + 1 2 + 3 4 +``` +#### Example 2: Triangular Matrix + +This matrix program: + +``` +WRITE MAGIC(5) /OUTFILE='matrix.txt' /FIELD=1 TO 80 BY 5 /MODE=TRIANGULAR. +``` +writes the following to `matrix.txt`: + +``` + 17 + 23 5 + 4 6 13 + 10 12 19 21 + 11 18 25 2 9 +``` +## The `GET` Command + +``` +GET variable[(index[,index])] + [/FILE={file | *}] + [/VARIABLES=variable…] + [/NAMES=variable] + [/MISSING={ACCEPT | OMIT | number}] + [/SYSMIS={OMIT | number}].
+``` + The `GET` command reads numeric data from an SPSS system file, +SPSS/PC+ system file, or SPSS portable file into a matrix variable or +submatrix: + +- To read data into a variable, specify just its name following + `GET`. The variable need not already exist; if it does, it is + replaced. The variable will have as many columns as there are + variables specified on the `VARIABLES` subcommand and as many rows + as there are cases in the input file. + +- To read data into a submatrix, specify the name of an existing + variable, followed by an indexing expression, just after `GET`. + The submatrix must have as many columns as variables specified on + `VARIABLES` and as many rows as cases in the input file. + +Specify the name or handle of the file to be read on `FILE`. Use +`*`, or simply omit the `FILE` subcommand, to read from the active file. +Reading from the active file is only permitted if it was already defined +outside `MATRIX`. + +List the variables to be read as columns in the matrix on the +`VARIABLES` subcommand. The list can use `TO` for collections of +variables or `ALL` for all variables. If `VARIABLES` is omitted, all +variables are read. Only numeric variables may be read. + +If a variable is named on `NAMES`, then the names of the variables +read as data columns are stored in a string vector with the given +name, replacing any existing matrix variable with that name. Variable +names are truncated to 8 bytes. + +The `MISSING` and `SYSMIS` subcommands control the treatment of +missing values in the input file. By default, any user- or +system-missing data in the variables being read from the input causes an +error that prevents `GET` from executing. To accept missing values, +specify one of the following settings on `MISSING`: + +* `ACCEPT`: Accept user-missing values with no change. + + By default, system-missing values still yield an error.
Use the + `SYSMIS` subcommand to change this treatment: + + - `OMIT`: Skip any case that contains a system-missing value. + + - `number`: Recode the system-missing value to `number`. + +* `OMIT`: Skip any case that contains any user- or system-missing value. + +* `number`: Recode all user- and system-missing values to `number`. + +The `SYSMIS` subcommand has an effect only with `MISSING=ACCEPT`. + +## The `SAVE` Command + +``` +SAVE expression + [/OUTFILE={file | *}] + [/VARIABLES=variable…] + [/NAMES=expression] + [/STRINGS=variable…]. +``` +The `SAVE` matrix command evaluates expression and writes the +resulting matrix to an SPSS system file. In the system file, each +matrix row becomes a case and each column becomes a variable. + +Specify the name or handle of the SPSS system file on the `OUTFILE` +subcommand, or `*` to write the output as the new active file. The +`OUTFILE` subcommand is required on the first `SAVE` command, in syntax +order, within `MATRIX`. For `SAVE` commands after the first, the +default output file is the same as the previous. + +When multiple `SAVE` commands write to one destination within a +single `MATRIX`, the later commands append to the same output file. All +the matrices written to the file must have the same number of columns. +The `VARIABLES`, `NAMES`, and `STRINGS` subcommands are honored only for +the first `SAVE` command that writes to a given file. + +By default, `SAVE` names the variables in the output file `COL1` +through `COLn`. Use `VARIABLES` or `NAMES` to give the variables +meaningful names. The `VARIABLES` subcommand accepts a comma-separated +list of variable names. Its alternative, `NAMES`, instead accepts an +expression that must evaluate to a row or column string vector of names. +The number of names need not exactly match the number of columns in the +matrix to be written: extra names are ignored; extra columns use default +names. + +By default, `SAVE` assumes that the matrix to be written is all +numeric. 
To write string columns, specify a comma-separated list of the +string columns' variable names on `STRINGS`. + +## The `MGET` Command + +``` +MGET [/FILE=file] + [/TYPE={COV | CORR | MEAN | STDDEV | N | COUNT}]. +``` +The `MGET` command reads the data from a matrix file (*note Matrix +Files::) into matrix variables. + +All of `MGET`'s subcommands are optional. Specify the name or handle +of the matrix file to be read on the `FILE` subcommand; if it is +omitted, then the command reads the active file. + +By default, `MGET` reads all of the data from the matrix file. +Specify a space-delimited list of matrix types on `TYPE` to limit the +kinds of data to those specified: + +* `COV`: Covariance matrix. +* `CORR`: Correlation coefficient matrix. +* `MEAN`: Vector of means. +* `STDDEV`: Vector of standard deviations. +* `N`: Vector of case counts. +* `COUNT`: Vector of counts. + +`MGET` reads the entire matrix file and automatically names, creates, +and populates matrix variables using its contents. It constructs the +name of each variable by concatenating the following: + +- A 2-character prefix that identifies the type of the matrix: + + * `CV`: Covariance matrix. + * `CR`: Correlation coefficient matrix. + * `MN`: Vector of means. + * `SD`: Vector of standard deviations. + * `NC`: Vector of case counts. + * `CN`: Vector of counts. + +- If the matrix file has factor variables, `Fn`, where `n` is a number + identifying a group of factors: `F1` for the first group, `F2` for + the second, and so on. This part is omitted for pooled data (where + the factors all have the system-missing value). + +- If the matrix file has split file variables, `Sn`, where `n` is a + number identifying a split group: `S1` for the first group, `S2` + for the second, and so on. + +If `MGET` chooses the name of an existing variable, it issues a +warning and does not change the variable.
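+
+As a hedged illustration of the naming scheme above, the following
+sketch assumes a hypothetical matrix file `corr.sav` that has no
+factor or split variables, so that `MGET` would be expected to create
+matrix variables named just `CR` and `MN`:
+
+```
+MATRIX.
+MGET /FILE='corr.sav' /TYPE=CORR MEAN. /* corr.sav is a hypothetical file.
+DISPLAY. /* List the matrix variables that MGET created.
+END MATRIX.
+```
+
+If the file had both factor and split variables, the same command
+would instead be expected to create names such as `CRF1S1` and
+`MNF1S1`, one set per combination of factor group and split group.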
+ +## The `MSAVE` Command + +``` +MSAVE expression + /TYPE={COV | CORR | MEAN | STDDEV | N | COUNT} + [/FACTOR=expression] + [/SPLIT=expression] + [/OUTFILE=file] + [/VARIABLES=variable…] + [/SNAMES=variable…] + [/FNAMES=variable…]. +``` +The `MSAVE` command evaluates the expression specified just after the +command name, and writes the resulting matrix to a matrix file (*note +Matrix Files::). + +The `TYPE` subcommand is required. It specifies the `ROWTYPE_` to +write along with this matrix. + +The `FACTOR` and `SPLIT` subcommands are required on the first +`MSAVE` if and only if the matrix file has factor or split variables, +respectively. After that, their values are carried along from one +`MSAVE` command to the next in syntax order as defaults. Each one takes +an expression that must evaluate to a vector with the same number of +entries as the matrix has factor or split variables, respectively. Each +`MSAVE` only writes data for a single combination of factor and split +variables, so many `MSAVE` commands (or one inside a loop) may be needed +to write a complete set. + +The remaining `MSAVE` subcommands define the format of the matrix +file. All of the `MSAVE` commands within a given matrix program write +to the same matrix file, so these subcommands are only meaningful on the +first `MSAVE` command within a matrix program. (If they are given again +on later `MSAVE` commands, then they must have the same values as on the +first.) + +The `OUTFILE` subcommand specifies the name or handle of the matrix +file to be written. Output must go to an external file, not a data set +or the active file. + +The `VARIABLES` subcommand specifies a comma-separated list of the +names of the continuous variables to be written to the matrix file. The +`TO` keyword can be used to define variables named with consecutive +integer suffixes. These names become column names and names that appear +in `VARNAME_` in the matrix file. 
`ROWTYPE_` and `VARNAME_` are not +allowed on `VARIABLES`. If `VARIABLES` is omitted, then PSPP uses the +names `COL1`, `COL2`, and so on. + +The `FNAMES` subcommand may be used to supply a comma-separated list +of factor variable names. The default names are `FAC1`, `FAC2`, and so +on. + +The `SNAMES` subcommand can supply a comma-separated list of split +variable names. The default names are `SPL1`, `SPL2`, and so on. + +## The `DISPLAY` Command + +``` +DISPLAY [{DICTIONARY | STATUS}]. +``` +The `DISPLAY` command makes PSPP display a table with the name and +dimensions of each matrix variable. The `DICTIONARY` and `STATUS` +keywords are accepted but have no effect. + +## The `RELEASE` Command + +``` +RELEASE variable…. +``` +The `RELEASE` command accepts a comma-separated list of matrix +variable names. It deletes each variable and releases the memory +associated with it. + +The `END MATRIX` command releases all matrix variables. diff --git a/rust/doc/src/commands/matrix/matrix.md.2 b/rust/doc/src/commands/matrix/matrix.md.2 new file mode 100644 index 0000000000..49cab3ae98 --- /dev/null +++ b/rust/doc/src/commands/matrix/matrix.md.2 @@ -0,0 +1,1846 @@ +# MATRIX + + + +## Summary + +``` +MATRIX. +...matrix commands... +END MATRIX. +``` + +The following basic matrix commands are supported: + +``` +COMPUTE variable[(index[,index])]=expression. +CALL procedure(argument, ...). +PRINT [expression] + [/FORMAT=format] + [/TITLE=title] + [/SPACE={NEWPAGE | n}] + [{/RLABELS=string... | /RNAMES=expression}] + [{/CLABELS=string... | /CNAMES=expression}]. +``` + +The following matrix commands offer support for flow control: + +``` +DO IF expression. + ...matrix commands... +[ELSE IF expression. + ...matrix commands...]... +[ELSE + ...matrix commands...] +END IF. + +LOOP [var=first TO last [BY step]] [IF expression]. + ...matrix commands... +END LOOP [IF expression]. + +BREAK. 
+``` + +The following matrix commands support matrix input and output: + +``` +READ variable[(index[,index])] + [/FILE=file] + /FIELD=first TO last [BY width] + [/FORMAT=format] + [/SIZE=expression] + [/MODE={RECTANGULAR | SYMMETRIC}] + [/REREAD]. +WRITE expression + [/OUTFILE=file] + /FIELD=first TO last [BY width] + [/MODE={RECTANGULAR | TRIANGULAR}] + [/HOLD] + [/FORMAT=format]. +GET variable[(index[,index])] + [/FILE={file | *}] + [/VARIABLES=variable...] + [/NAMES=expression] + [/MISSING={ACCEPT | OMIT | number}] + [/SYSMIS={OMIT | number}]. +SAVE expression + [/OUTFILE={file | *}] + [/VARIABLES=variable...] + [/NAMES=expression] + [/STRINGS=variable...]. +MGET [/FILE=file] + [/TYPE={COV | CORR | MEAN | STDDEV | N | COUNT}]. +MSAVE expression + /TYPE={COV | CORR | MEAN | STDDEV | N | COUNT} + [/OUTFILE=file] + [/VARIABLES=variable...] + [/SNAMES=variable...] + [/SPLIT=expression] + [/FNAMES=variable...] + [/FACTOR=expression]. +``` + +The following matrix commands provide additional support: + +``` +DISPLAY [{DICTIONARY | STATUS}]. +RELEASE variable.... +``` + +`MATRIX` and `END MATRIX` enclose a special PSPP sub-language, called +the matrix language. The matrix language does not require an active +dataset to be defined and only a few of the matrix language commands +work with any datasets that are defined. Each instance of +`MATRIX`...`END MATRIX` is a separate program whose state is independent +of any other instance, so that variables declared within a matrix program +are forgotten at its end. + +The matrix language works with matrices, where a "matrix" is a +rectangular array of real numbers. An N×M matrix has N rows and M +columns. Some special cases are important: an N×1 matrix is a "column +vector", a 1×N matrix is a "row vector", and a 1×1 matrix is a "scalar". + +The matrix language also has limited support for matrices that +contain 8-byte strings instead of numbers. Strings longer than 8 bytes +are truncated, and shorter strings are padded with spaces.
String +matrices are mainly useful for labeling rows and columns when printing +numerical matrices with the `MATRIX PRINT` command. Arithmetic +operations on string matrices will not produce useful results. The user +should not mix strings and numbers within a matrix. + +The matrix language does not work with cases. A variable in the +matrix language represents a single matrix. + +The matrix language does not support missing values. + +`MATRIX` is a procedure, so it cannot be enclosed inside `DO IF`, +`LOOP`, etc. + +Macros defined before a matrix program may be used within a matrix +program, and macros may expand to include entire matrix programs. The +[`DEFINE`](../../control/define.md) command to define new macros may +not appear within a matrix program. + +The following sections describe the details of the matrix language: +first, the syntax of matrix expressions, then each of the supported +commands. The `COMMENT` command (*note COMMENT::) is also supported. + +## Matrix Expressions + +Many matrix commands use expressions. A matrix expression may use the +following operators, listed in descending order of operator precedence. +Within a single level, operators associate from left to right. + +- Function call () and matrix construction {} + +- Indexing () + +- Unary + and - + +- Integer sequence : + +- Exponentiation ** and &** + +- Multiplication * and &*, and division / and &/ + +- Addition + and subtraction - + +- Relational < <= = >= > <> + +- Logical NOT + +- Logical AND + +- Logical OR and XOR + +[Matrix Functions](#matrix-functions) documents the available matrix +functions. The remaining operators are described in more detail +below. + +Expressions appear in the matrix language in some contexts where +there would be ambiguity whether `/` is an operator or a separator +between subcommands. In these contexts, only the operators with higher +precedence than `/` are allowed outside parentheses. Later sections +call these "restricted expressions". 
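+
+As a minimal sketch of this restriction, and assuming that the
+expression on `PRINT` is one of these restricted expressions, a
+division must be parenthesized there so that `/` is not taken as the
+start of a subcommand, while an operator with higher precedence, such
+as `**`, needs no parentheses (the variable `a` is only illustrative):
+
+```
+MATRIX.
+COMPUTE a = {1, 2; 3, 4}. /* Illustrative matrix.
+PRINT (a / 2) /TITLE='Half of a'. /* Parentheses assumed to be required.
+PRINT a**2 /TITLE='a squared'. /* ** has higher precedence than /.
+END MATRIX.
+```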
+ +### Matrix Construction Operator `{}` + +Use the `{}` operator to construct matrices. Within the curly braces, +commas separate elements within a row and semicolons separate rows. The +following examples show a 2×3 matrix, a 1×4 row vector, a 3×1 column +vector, and a scalar. + +``` +{1, 2, 3; 4, 5, 6} ⇒ [1 2 3] + [4 5 6] +{3.14, 6.28, 9.42, 12.57} ⇒ [3.14 6.28 9.42 12.57] +{1.41; 1.73; 2} ⇒ [1.41] + [1.73] + [2.00] +{5} ⇒ 5 +``` + + Curly braces are not limited to holding numeric literals. They can +contain calculations, and they can paste together matrices and vectors +in any way as long as the result is rectangular. For example, if `m` is +matrix `{1, 2; 3, 4}`, `r` is row vector `{5, 6}`, and `c` is column +vector `{7; 8}`, then curly braces can be used as follows: + +``` +{m, c; r, 10} ⇒ [1 2 7] + [3 4 8] + [5 6 10] +{c, 2 * c, T(r)} ⇒ [7 14 5] + [8 16 6] +``` + + The final example above uses the transposition function `T`. + +### Integer Sequence Operator `:` + +The syntax `FIRST:LAST:STEP` yields a row vector of consecutive integers +from FIRST to LAST counting by STEP. The final `:STEP` is optional and +defaults to 1 when omitted. + + Each of FIRST, LAST, and STEP must be a scalar and should be an +integer (any fractional part is discarded). Because `:` has a high +precedence, operands other than numeric literals must usually be +parenthesized. + + When STEP is positive (or omitted) and LAST < FIRST, or if STEP is +negative and LAST > FIRST, then the result is an empty matrix. If STEP +is 0, then PSPP reports an error. + + Here are some examples: + +``` +1:6 ⇒ {1, 2, 3, 4, 5, 6} +1:6:2 ⇒ {1, 3, 5} +-1:-5:-1 ⇒ {-1, -2, -3, -4, -5} +-1:-5 ⇒ {} +2:1:0 ⇒ (error) +``` + +### Index Operator `()` + +The result of the submatrix or indexing operator, written `M(RINDEX, +CINDEX)`, contains the rows of M whose indexes are given in vector +RINDEX and the columns whose indexes are given in vector CINDEX.
+ + In the simplest case, if RINDEX and CINDEX are both scalars, the +result is also a scalar: + +``` +{10, 20; 30, 40}(1, 1) ⇒ 10 +{10, 20; 30, 40}(1, 2) ⇒ 20 +{10, 20; 30, 40}(2, 1) ⇒ 30 +{10, 20; 30, 40}(2, 2) ⇒ 40 +``` + + If the index arguments have multiple elements, then the result +includes multiple rows or columns: + +``` +{10, 20; 30, 40}(1:2, 1) ⇒ {10; 30} +{10, 20; 30, 40}(2, 1:2) ⇒ {30, 40} +{10, 20; 30, 40}(1:2, 1:2) ⇒ {10, 20; 30, 40} +``` + + The special argument `:` may stand in for all the rows or columns in +the matrix being indexed, like this: + +``` +{10, 20; 30, 40}(:, 1) ⇒ {10; 30} +{10, 20; 30, 40}(2, :) ⇒ {30, 40} +{10, 20; 30, 40}(:, :) ⇒ {10, 20; 30, 40} +``` + + The index arguments do not have to be in order, and they may contain +repeated values, like this: + +``` +{10, 20; 30, 40}({2, 1}, 1) ⇒ {30; 10} +{10, 20; 30, 40}(2, {2; 2; 1}) ⇒ {40, 40, 30} +{10, 20; 30, 40}(2:1:-1, :) ⇒ {30, 40; 10, 20} +``` + + When the matrix being indexed is a row or column vector, only a +single index argument is needed, like this: + +``` +{11, 12, 13, 14, 15}(2:4) ⇒ {12, 13, 14} +{11; 12; 13; 14; 15}(2:4) ⇒ {12; 13; 14} +``` + + When an index is not an integer, PSPP discards the fractional part. +It is an error for an index to be less than 1 or greater than the number +of rows or columns: + +``` +{11, 12, 13, 14}({2.5, 4.6}) ⇒ {12, 14} +{11; 12; 13; 14}(0) ⇒ (error) +``` + +### Unary Operators + +The unary operators take a single operand of any dimensions and operate +on each of its elements independently. The unary operators are: + +`-` + Inverts the sign of each element. + +`+` + No change. + +`NOT` + Logical inversion: each positive value becomes 0 and each zero or + negative value becomes 1.
+ +Examples: + +``` +-{1, -2; 3, -4} ⇒ {-1, 2; -3, 4} ++{1, -2; 3, -4} ⇒ {1, -2; 3, -4} +NOT {1, 0; -1, 1} ⇒ {0, 1; 1, 0} +``` + +### Elementwise Binary Operators + +The elementwise binary operators require their operands to be matrices +with the same dimensions. Alternatively, if one operand is a scalar, +then its value is treated as if it were duplicated to the dimensions of +the other operand. The result is a matrix of the same size as the +operands, in which each element is the result of applying the +operator to the corresponding elements of the operands. + + The elementwise binary operators are listed below. + + - The arithmetic operators, for familiar arithmetic operations: + + `+` + Addition. + + `-` + Subtraction. + + `*` + Multiplication, if one operand is a scalar. (Otherwise this + is matrix multiplication, described below.) + + `/` or `&/` + Division. + + `&*` + Multiplication. + + `&**` + Exponentiation. + + - The relational operators, whose results are 1 when a comparison is + true and 0 when it is false: + + `<` or `LT` + Less than. + + `<=` or `LE` + Less than or equal. + + `=` or `EQ` + Equal. + + `>` or `GT` + Greater than. + + `>=` or `GE` + Greater than or equal. + + `<>` or `~=` or `NE` + Not equal. + + - The logical operators, which treat positive operands as true and + nonpositive operands as false. They yield 0 for false and 1 for + true: + + `AND` + True if both operands are true. + + `OR` + True if at least one operand is true. + + `XOR` + True if exactly one operand is true. + + Examples: + +``` +1 + 2 ⇒ 3 +1 + {3; 4} ⇒ {4; 5} +{66, 77; 88, 99} + 5 ⇒ {71, 82; 93, 104} +{4, 8; 3, 7} + {1, 0; 5, 2} ⇒ {5, 8; 8, 9} +{1, 2; 3, 4} < {4, 3; 2, 1} ⇒ {1, 1; 0, 0} +{1, 3; 2, 4} >= 3 ⇒ {0, 1; 0, 1} +{0, 0; 1, 1} AND {0, 1; 0, 1} ⇒ {0, 0; 0, 1} +``` + +### Matrix Multiplication Operator `*` + +If `A` is an M×N matrix and `B` is an N×P matrix, then `A*B` is the M×P +matrix multiplication product `C`.
PSPP reports an error if the number +of columns in `A` differs from the number of rows in `B`. + + The `*` operator performs elementwise multiplication (see above) if +one of its operands is a scalar. + + No built-in operator yields the inverse of matrix multiplication. +Instead, multiply by the result of `INV` or `GINV`. + + Some examples: + +``` +{1, 2, 3} * {4; 5; 6} ⇒ 32 +{4; 5; 6} * {1, 2, 3} ⇒ {4, 8, 12; + 5, 10, 15; + 6, 12, 18} +``` + +### Matrix Exponentiation Operator `**` + +The result of `A**B` is defined as follows when `A` is a square matrix +and `B` is an integer scalar: + + - For `B > 0`, `A**B` is `A*...*A`, where there are `B` `A`s. (PSPP + implements this efficiently for large `B`, using exponentiation by + squaring.) + + - For `B < 0`, `A**B` is `INV(A**(-B))`. + + - For `B = 0`, `A**B` is the identity matrix. + +PSPP reports an error if `A` is not square or `B` is not an integer. + + Examples: + +``` +{2, 5; 1, 4}**3 ⇒ {48, 165; 33, 114} +{2, 5; 1, 4}**0 ⇒ {1, 0; 0, 1} +10*{4, 7; 2, 6}**-1 ⇒ {6, -7; -2, 4} +``` + +## Matrix Functions + +The matrix language supports numerous functions in multiple categories. +The following subsections document each of the currently supported +functions. The first letter of each parameter's name indicates the +required argument type: + +S + A scalar. + +N + A nonnegative integer scalar. (Non-integers are accepted and + silently rounded down to the nearest integer.) + +V + A row or column vector. + +M + A matrix. + +### Elementwise Functions + +These functions act on each element of their argument independently, +like the elementwise operators (*note Matrix Elementwise Binary +Operators::). + +* `ABS (M)` + Takes the absolute value of each element of M. + + ``` + ABS({-1, 2; -3, 0}) ⇒ {1, 2; 3, 0} + ``` + +* `ARSIN (M)` +* `ARTAN (M)` + Computes the inverse sine or tangent, respectively, of each element + in M. The results are in radians, between -\pi/2 and +\pi/2, + inclusive.
+ + The value of \pi can be computed as `4*ARTAN(1)`. + + ``` + ARSIN({-1, 0, 1}) ⇒ {-1.57, 0, 1.57} (approximately) + + ARTAN({-5, -1, 1, 5}) ⇒ {-1.37, -.79, .79, 1.37} (approximately) + ``` + +* `COS (M)` +* `SIN (M)` + Computes the cosine or sine, respectively, of each element in M, + which must be in radians. + + ``` + COS({0.785, 1.57; 3.14, 1.57 + 3.14}) ⇒ {.71, 0; -1, 0} + (approximately) + ``` + +* `EXP (M)` + Computes e^x for each element X in M. + + ``` + EXP({2, 3; 4, 5}) ⇒ {7.39, 20.09; 54.6, 148.4} (approximately) + ``` + +* `LG10 (M)` +* `LN (M)` + Takes the logarithm with base 10 or base e, respectively, of each + element in M. + + ``` + LG10({1, 10, 100, 1000}) ⇒ {0, 1, 2, 3} + LG10(0) ⇒ (error) + + LN({EXP(1), 1, 2, 3, 4}) ⇒ {1, 0, .69, 1.1, 1.39} (approximately) + LN(0) ⇒ (error) + ``` + +* `MOD (M, S)` + Takes each element in M modulo nonzero scalar value S, that is, the + remainder of division by S. The sign of the result is the same as + the sign of the dividend. + + ``` + MOD({5, 4, 3, 2, 1, 0}, 3) ⇒ {2, 1, 0, 2, 1, 0} + MOD({5, 4, 3, 2, 1, 0}, -3) ⇒ {2, 1, 0, 2, 1, 0} + MOD({-5, -4, -3, -2, -1, 0}, 3) ⇒ {-2, -1, 0, -2, -1, 0} + MOD({-5, -4, -3, -2, -1, 0}, -3) ⇒ {-2, -1, 0, -2, -1, 0} + MOD({5, 4, 3, 2, 1, 0}, 1.5) ⇒ {.5, 1.0, .0, .5, 1.0, .0} + MOD({5, 4, 3, 2, 1, 0}, 0) ⇒ (error) + ``` + +* `RND (M)` +* `TRUNC (M)` + Rounds each element of M to an integer. `RND` rounds to the + nearest integer, with halves rounded to even integers, and `TRUNC` + rounds toward zero. + + ``` + RND({-1.6, -1.5, -1.4}) ⇒ {-2, -2, -1} + RND({-.6, -.5, -.4}) ⇒ {-1, 0, 0} + RND({.4, .5, .6}) ⇒ {0, 0, 1} + RND({1.4, 1.5, 1.6}) ⇒ {1, 2, 2} + + TRUNC({-1.6, -1.5, -1.4}) ⇒ {-1, -1, -1} + TRUNC({-.6, -.5, -.4}) ⇒ {0, 0, 0} + TRUNC({.4, .5, .6}) ⇒ {0, 0, 0} + TRUNC({1.4, 1.5, 1.6}) ⇒ {1, 1, 1} + ``` + +* `SQRT (M)` + Takes the square root of each element of M, which must not be + negative.
+ + ``` + SQRT({0, 1, 2, 4, 9, 81}) ⇒ {0, 1, 1.41, 2, 3, 9} (approximately) + SQRT(-1) ⇒ (error) + ``` + +### Logical Functions + +* `ALL (M)` + Returns a scalar with value 1 if all of the elements in M are + nonzero, or 0 if at least one element is zero. + + ``` + ALL({1, 2, 3} < {2, 3, 4}) ⇒ 1 + ALL({2, 2, 3} < {2, 3, 4}) ⇒ 0 + ALL({2, 3, 3} < {2, 3, 4}) ⇒ 0 + ALL({2, 3, 4} < {2, 3, 4}) ⇒ 0 + ``` + +* `ANY (M)` + Returns a scalar with value 1 if any of the elements in M is + nonzero, or 0 if all of them are zero. + + ``` + ANY({1, 2, 3} < {2, 3, 4}) ⇒ 1 + ANY({2, 2, 3} < {2, 3, 4}) ⇒ 1 + ANY({2, 3, 3} < {2, 3, 4}) ⇒ 1 + ANY({2, 3, 4} < {2, 3, 4}) ⇒ 0 + ``` + +### Matrix Construction Functions + +* `BLOCK (M1, ..., MN)` + Returns a block diagonal matrix with as many rows as the sum of its + arguments' row counts and as many columns as the sum of their + column counts. Each argument matrix is placed along the main diagonal + of the result, and all other elements are zero. + + ``` + BLOCK({1, 2; 3, 4}, 5, {7; 8; 9}, {10, 11}) ⇒ + 1 2 0 0 0 0 + 3 4 0 0 0 0 + 0 0 5 0 0 0 + 0 0 0 7 0 0 + 0 0 0 8 0 0 + 0 0 0 9 0 0 + 0 0 0 0 10 11 + ``` + +* `IDENT (N)` +* `IDENT (NR, NC)` + Returns an identity matrix, whose main diagonal elements are one + and whose other elements are zero. The returned matrix has N rows + and columns or NR rows and NC columns, respectively. + + ``` + IDENT(1) ⇒ 1 + IDENT(2) ⇒ + 1 0 + 0 1 + IDENT(3, 5) ⇒ + 1 0 0 0 0 + 0 1 0 0 0 + 0 0 1 0 0 + IDENT(5, 3) ⇒ + 1 0 0 + 0 1 0 + 0 0 1 + 0 0 0 + 0 0 0 + ``` + +* `MAGIC (N)` + Returns an N×N matrix that contains each of the integers 1...N^2 + once, in which each column, each row, and each diagonal sums to + N(N^2+1)/2. There are many magic squares with given dimensions, + but this function always returns the same one for a given value of + N.
+ + ``` + MAGIC(3) ⇒ {8, 1, 6; 3, 5, 7; 4, 9, 2} + MAGIC(4) ⇒ {1, 5, 12, 16; 15, 11, 6, 2; 14, 8, 9, 3; 4, 10, 7, 13} + ``` + +* `MAKE (NR, NC, S)` + Returns an NR×NC matrix whose elements are all S. + + ``` + MAKE(1, 2, 3) ⇒ {3, 3} + MAKE(2, 1, 4) ⇒ {4; 4} + MAKE(2, 3, 5) ⇒ {5, 5, 5; 5, 5, 5} + ``` + +* `MDIAG (V)` + Given N-element vector V, returns an N×N matrix whose main diagonal + is copied from V. The other elements in the returned matrix are + zero. + + Use `CALL SETDIAG` (*note CALL SETDIAG::) to replace the main + diagonal of a matrix in-place. + + ``` + MDIAG({1, 2, 3, 4}) ⇒ + 1 0 0 0 + 0 2 0 0 + 0 0 3 0 + 0 0 0 4 + ``` + +* `RESHAPE (M, NR, NC)` + Returns an NR×NC matrix whose elements come from M, which must have + the same number of elements as the new matrix, copying elements + from M to the new matrix row by row. + + ``` + RESHAPE(1:12, 1, 12) ⇒ + 1 2 3 4 5 6 7 8 9 10 11 12 + RESHAPE(1:12, 2, 6) ⇒ + 1 2 3 4 5 6 + 7 8 9 10 11 12 + RESHAPE(1:12, 3, 4) ⇒ + 1 2 3 4 + 5 6 7 8 + 9 10 11 12 + RESHAPE(1:12, 4, 3) ⇒ + 1 2 3 + 4 5 6 + 7 8 9 + 10 11 12 + ``` + +* `T (M)` +* `TRANSPOS (M)` + Returns M with rows exchanged for columns. + + ``` + T({1, 2, 3}) ⇒ {1; 2; 3} + T({1; 2; 3}) ⇒ {1, 2, 3} + ``` + +* `UNIFORM (NR, NC)` + Returns an NR×NC matrix in which each element is randomly chosen + from a uniform distribution of real numbers between 0 and 1. + Random number generation honors the current seed setting (*note SET + SEED::).
+ + The following example shows one possible output, but of course + every result will be different (given different seeds): + + ``` + UNIFORM(4, 5)*10 ⇒ + 7.71 2.99 .21 4.95 6.34 + 4.43 7.49 8.32 4.99 5.83 + 2.25 .25 1.98 7.09 7.61 + 2.66 1.69 2.64 .88 1.50 + ``` + +### Minimum, Maximum, and Sum Functions + +* `CMIN (M)` +* `CMAX (M)` +* `CSUM (M)` +* `CSSQ (M)` + Returns a row vector with the same number of columns as M, in which + each element is the minimum, maximum, sum, or sum of squares, + respectively, of the elements in the same column of M. + + ``` + CMIN({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {1, 2, 3} + CMAX({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {7, 8, 9} + CSUM({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {12, 15, 18} + CSSQ({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {66, 93, 126} + ``` + +* `MMIN (M)` +* `MMAX (M)` +* `MSUM (M)` +* `MSSQ (M)` + Returns the minimum, maximum, sum, or sum of squares, respectively, + of the elements of M. + + ``` + MMIN({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 1 + MMAX({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 9 + MSUM({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 45 + MSSQ({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ 285 + ``` + +* `RMIN (M)` +* `RMAX (M)` +* `RSUM (M)` +* `RSSQ (M)` + Returns a column vector with the same number of rows as M, in which + each element is the minimum, maximum, sum, or sum of squares, + respectively, of the elements in the same row of M. + + ``` + RMIN({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {1; 4; 7} + RMAX({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {3; 6; 9} + RSUM({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {6; 15; 24} + RSSQ({1, 2, 3; 4, 5, 6; 7, 8, 9}) ⇒ {14; 77; 194} + ``` + +* `SSCP (M)` + Returns M^T × M. + + ``` + SSCP({1, 2, 3; 4, 5, 6}) ⇒ {17, 22, 27; 22, 29, 36; 27, 36, 45} + ``` + +* `TRACE (M)` + Returns the sum of the elements along M's main diagonal, equivalent + to `MSUM(DIAG(M))`. + + ``` + TRACE(MDIAG(1:5)) ⇒ 15 + ``` + +### Matrix Property Functions + +* `NROW (M)` +* `NCOL (M)` + Returns the number of rows or columns, respectively, in M.
+ + ``` + NROW({1, 0; -2, -3; 3, 3}) ⇒ 3 + NROW(1:5) ⇒ 1 + + NCOL({1, 0; -2, -3; 3, 3}) ⇒ 2 + NCOL(1:5) ⇒ 5 + ``` + +* `DIAG (M)` + Returns a column vector containing a copy of M's main diagonal. + The vector's length is the lesser of `NCOL(M)` and `NROW(M)`. + + ``` + DIAG({1, 0; -2, -3; 3, 3}) ⇒ {1; -3} + ``` + +### Matrix Rank Ordering Functions + +The `GRADE` and `RNKORDER` functions each take a matrix M and return a +matrix R with the same dimensions. Each element in R ranges between 1 +and the number of elements N in M, inclusive. When the elements in M +all have unique values, both of these functions yield the same results: +the smallest element in M corresponds to value 1 in R, the next smallest +to 2, and so on, up to the largest, which corresponds to N. When multiple +elements in M have the same value, these functions use different rules +for handling the ties. + +* `GRADE (M)` + Returns a ranking of M, turning duplicate values into sequential + ranks. The returned matrix always contains each of the integers 1 + through the number of elements in the matrix exactly once. + + ``` + GRADE({1, 0, 3; 3, 1, 2; 3, 0, 5}) ⇒ {3, 1, 6; 7, 4, 5; 8, 2, 9} + ``` + +* `RNKORDER (M)` + Returns a ranking of M, turning duplicate values into the mean of + their sequential ranks. + + ``` + RNKORDER({1, 0, 3; 3, 1, 2; 3, 0, 5}) + ⇒ {3.5, 1.5, 7; 7, 3.5, 5; 7, 1.5, 9} + ``` + +One may use `GRADE` to sort a vector: + +``` +COMPUTE v(GRADE(v))=v. /* Sort v in ascending order. +COMPUTE v(GRADE(-v))=v. /* Sort v in descending order. +``` + +### Matrix Algebra Functions + +* `CHOL (M)` + Matrix M must be an N×N symmetric positive-definite matrix. + Returns an N×N matrix B such that B^T×B=M. + + ``` + CHOL({4, 12, -16; 12, 37, -43; -16, -43, 98}) ⇒ + 2 6 -8 + 0 1 5 + 0 0 3 + ``` + +* `DESIGN (M)` + Returns a design matrix for M. The design matrix has the same + number of rows as M. Each column C in M, from left to right, + yields a group of columns in the output.
For each unique value V + in C, from top to bottom, add a column to the output in which V + becomes 1 and other values become 0. + + PSPP issues a warning if a column only contains a single unique + value. + + ``` + DESIGN({1; 2; 3}) ⇒ {1, 0, 0; 0, 1, 0; 0, 0, 1} + DESIGN({5; 8; 5}) ⇒ {1, 0; 0, 1; 1, 0} + DESIGN({1, 5; 2, 8; 3, 5}) + ⇒ {1, 0, 0, 1, 0; 0, 1, 0, 0, 1; 0, 0, 1, 1, 0} + DESIGN({5; 5; 5}) ⇒ (warning) + ``` + +* `DET (M)` + Returns the determinant of square matrix M. + + ``` + DET({3, 7; 1, -4}) ⇒ -19 + ``` + +* `EVAL (M)` + Returns a column vector containing the eigenvalues of symmetric + matrix M, sorted in descending order. + + Use `CALL EIGEN` (*note CALL EIGEN::) to compute eigenvalues and + eigenvectors of a matrix. + + ``` + EVAL({2, 0, 0; 0, 3, 4; 0, 4, 9}) ⇒ {11; 2; 1} + ``` + +* `GINV (M)` + Returns the K×N matrix A that is the "generalized inverse" of N×K + matrix M, defined such that M×A×M=M and A×M×A=A. + + ``` + GINV({1, 2}) ⇒ {.2; .4} (approximately) + {1:9} * GINV(1:9) * {1:9} ⇒ {1:9} (approximately) + ``` + +* `GSCH (M)` + M must be a N×M matrix, M ≥ N, with rank N. Returns an N×N + orthonormal basis for M, obtained using the Gram-Schmidt process. + + ``` + GSCH({3, 2; 1, 2}) * SQRT(10) ⇒ {3, -1; 1, 3} (approximately) + ``` + +* `INV (M)` + Returns the N×N matrix A that is the inverse of N×N matrix M, + defined such that M×A = A×M = I, where I is the identity matrix. M + must not be singular, that is, \det(M) ≠ 0. + + ``` + INV({4, 7; 2, 6}) ⇒ {.6, -.7; -.2, .4} (approximately) + ``` + +* `KRONEKER (MA, MB)` + Returns the PM×QN matrix P that is the "Kronecker product" of M×N + matrix MA and P×Q matrix MB. One may view P as the concatenation + of multiple P×Q blocks, each of which is the scalar product of MB + by a different element of MA. For example, when `A` is a 2×2 + matrix, `KRONEKER(A, B)` is equivalent to `{A(1,1)*B, A(1,2)*B; + A(2,1)*B, A(2,2)*B}`.
+ + ``` + KRONEKER({1, 2; 3, 4}, {0, 5; 6, 7}) ⇒ + 0 5 0 10 + 6 7 12 14 + 0 15 0 20 + 18 21 24 28 + ``` + +* `RANK (M)` + Returns the rank of matrix M, an integer scalar whose value is the + dimension of the vector space spanned by its columns or, + equivalently, by its rows. + + ``` + RANK({1, 0, 1; -2, -3, 1; 3, 3, 0}) ⇒ 2 + RANK({1, 1, 0, 2; -1, -1, 0, -2}) ⇒ 1 + RANK({1, -1; 1, -1; 0, 0; 2, -2}) ⇒ 1 + RANK({1, 2, 1; -2, -3, 1; 3, 5, 0}) ⇒ 2 + RANK({1, 0, 2; 2, 1, 0; 3, 2, 1}) ⇒ 3 + ``` + +* `SOLVE (MA, MB)` + MA must be an N×N matrix, with det(MA) ≠ 0, and MB an N×K matrix. + Returns an N×K matrix X such that MA × X = MB. + + All of the following examples show approximate results: + + ``` + SOLVE({2, 3; 4, 9}, {6, 2; 15, 5}) ⇒ + 1.50 .50 + 1.00 .33 + SOLVE({1, 3, -2; 3, 5, 6; 2, 4, 3}, {5; 7; 8}) ⇒ + -15.00 + 8.00 + 2.00 + SOLVE({2, 1, -1; -3, -1, 2; -2, 1, 2}, {8; -11; -3}) ⇒ + 2.00 + 3.00 + -1.00 + ``` + +* `SVAL (M)` + + Given N×K matrix M, returns a min(N,K)-element column vector + containing the singular values of M in descending order. + + Use `CALL SVD` (see the `CALL` command) to compute the full singular + value decomposition of a matrix. + + ``` + SVAL({1, 1; 0, 0}) ⇒ {1.41; .00} + SVAL({1, 0, 1; 0, 1, 1; 0, 0, 0}) ⇒ {1.73; 1.00; .00} + SVAL({2, 4; 1, 3; 0, 0; 0, 0}) ⇒ {5.46; .37} + ``` + +* `SWEEP (M, NK)` + Given R×C matrix M and integer scalar k = NK such that 1 ≤ k ≤ + min(R,C), returns the R×C sweep matrix A. + + If M_{kk} ≠ 0, then: + + ``` + A_{kk} = 1/M_{kk}, + A_{ik} = -M_{ik}/M_{kk} for i ≠ k, + A_{kj} = M_{kj}/M_{kk} for j ≠ k, and + A_{ij} = M_{ij} - M_{ik}M_{kj}/M_{kk} for i ≠ k and j ≠ k. + ``` + + If M_{kk} = 0, then: + + ``` + A_{ik} = A_{ki} = 0 and + A_{ij} = M_{ij}, for i ≠ k and j ≠ k.
+ ``` + + Given M = {0, 1, 2; 3, 4, 5; 6, 7, 8}, then (approximately): + + ``` + SWEEP(M, 1) ⇒ + .00 .00 .00 + .00 4.00 5.00 + .00 7.00 8.00 + SWEEP(M, 2) ⇒ + -.75 -.25 .75 + .75 .25 1.25 + .75 -1.75 -.75 + SWEEP(M, 3) ⇒ + -1.50 -.75 -.25 + -.75 -.38 -.63 + .75 .88 .13 + ``` + +### Matrix Statistical Distribution Functions + +The matrix language can calculate several functions of standard +statistical distributions using the same syntax and semantics as in PSPP +transformation expressions. See the documentation of the statistical distribution functions +for details. + + The matrix language extends the PDF, CDF, SIG, IDF, NPDF, and NCDF +functions by allowing the first parameter of each of these functions to +be a vector or matrix with any dimensions. In addition, `CDF.BVNOR` and +`PDF.BVNOR` allow either or both of their first two parameters to be +vectors or matrices; if both are non-scalar then they must have the same +dimensions. In each case, the result is a matrix or vector with the +same dimensions as the input populated with elementwise calculations. + +### EOF Function + +This function works with files being used on the `READ` statement. + +* `EOF (FILE)` + + Given a file handle or file name FILE, returns an integer scalar 1 + if the last line in the file has been read or 0 if more lines are + available. Determining this requires attempting to read another + line, which means that `REREAD` on the next `READ` command + following `EOF` on the same file will be ineffective. + + The `EOF` function gives a matrix program the flexibility to read a +file with text data without knowing the length of the file in advance. +For example, the following program will read all the lines of data in +`data.txt`, each consisting of three numbers, as rows in matrix `data`: + +``` +MATRIX. +COMPUTE data={}. +LOOP IF NOT EOF('data.txt'). + READ row/FILE='data.txt'/FIELD=1 TO 1000/SIZE={1,3}. + COMPUTE data={data; row}. +END LOOP. +PRINT data. +END MATRIX.
+``` + +## The `COMPUTE` Command + + COMPUTE variable[(index[,index])]=expression. + + The `COMPUTE` command evaluates an expression and assigns the result +to a variable or a submatrix of a variable. Assigning to a submatrix +uses the same syntax as the matrix index operator. + +## The `CALL` Command + +A matrix function returns a single result. The `CALL` command +implements procedures, which take a similar syntactic form to functions +but yield results by modifying their arguments rather than returning a +value. + + Each output argument to a `CALL` procedure must be a single variable +name. + + The following procedures are implemented via `CALL` to allow them to +return multiple results. For these procedures, the output arguments +need not name existing variables; if they do, then their previous values +are replaced: + +CALL EIGEN(M, EVEC, EVAL) + + Computes the eigenvalues and eigenvectors of symmetric N×N matrix M. + Assigns the eigenvectors of M to the columns of N×N matrix EVEC and + the eigenvalues in descending order to N-element column vector + EVAL. + + Use the `EVAL` function to compute just the + eigenvalues of a symmetric matrix. + + For example, the following matrix language commands: + ``` + CALL EIGEN({1, 0; 0, 1}, evec, eval). + PRINT evec. + PRINT eval. + + CALL EIGEN({3, 2, 4; 2, 0, 2; 4, 2, 3}, evec2, eval2). + PRINT evec2. + PRINT eval2. + ``` + + yield this output: + + ``` + evec + 1 0 + 0 1 + + eval + 1 + 1 + + evec2 + -.6666666667 .0000000000 .7453559925 + -.3333333333 -.8944271910 -.2981423970 + -.6666666667 .4472135955 -.5962847940 + + eval2 + 8.0000000000 + -1.0000000000 + -1.0000000000 + ``` + +CALL SVD(M, U, S, V) + + Computes the singular value decomposition of N×K matrix M, + assigning to S an N×K diagonal matrix, to U a unitary N×N matrix, and + to V a unitary K×K matrix, such that M = U×S×V^T. The main diagonal of S contains the + singular values of M.
+ + Use the `SVAL` function to compute just the singular + values of a matrix. + + For example, the following matrix program: + + CALL SVD({3, 2, 2; 2, 3, -2}, u, s, v). + PRINT (u * s * T(v))/FORMAT F5.1. + + yields this output: + + (u * s * T(v)) + 3.0 2.0 2.0 + 2.0 3.0 -2.0 + + The final procedure is implemented via `CALL` to allow it to modify a +matrix instead of returning a modified version. For this procedure, the +output argument must name an existing variable. + +CALL SETDIAG(M, V) + + Replaces the main diagonal of N×P matrix M by the contents of + K-element vector V. If K = 1, so that V is a scalar, replaces all + of the diagonal elements of M by V. If K < min(N,P), only the + upper K diagonal elements are replaced; if K > min(N,P), then the + extra elements of V are ignored. + + Use the `MDIAG` function to construct a new matrix + with a specified main diagonal. + + For example, this matrix program: + + COMPUTE x={1, 2, 3; 4, 5, 6; 7, 8, 9}. + CALL SETDIAG(x, 10). + PRINT x. + + outputs the following: + + x + 10 2 3 + 4 10 6 + 7 8 10 + +## The `PRINT` Command + + PRINT [expression] + [/FORMAT=format] + [/TITLE=title] + [/SPACE={NEWPAGE | n}] + [{/RLABELS=string... | /RNAMES=expression}] + [{/CLABELS=string... | /CNAMES=expression}]. + + The `PRINT` command is commonly used to display a matrix. It +evaluates the restricted EXPRESSION, if present, and outputs it either +as text or a pivot table, depending on the setting of `MDISPLAY` (see +`SET MDISPLAY`). + + Use the `FORMAT` subcommand to specify a format, such as `F8.2`, for +displaying the matrix elements. `FORMAT` is optional for numerical +matrices. When it is omitted, PSPP chooses how to format entries +automatically using M, the magnitude of the largest-magnitude element in +the matrix to be displayed: + + 1. If M < 10^{11} and the matrix's elements are all integers, PSPP + chooses the narrowest `F` format that fits M plus a sign.
For + example, if the matrix is {1:10}, then M = 10, which fits in 3 + columns with room for a sign, so the format is `F3.0`. + + 2. Otherwise, if M ≥ 10^9 or M ≤ 10^{-4}, PSPP scales all of the + numbers in the matrix by 10^X, where X is the exponent that would + be used to display M in scientific notation. For example, for M = + 5.123×10^{20}, the scale factor is 10^{20}. PSPP displays the + scaled values in format `F13.10` and notes the scale factor in the + output. + + 3. Otherwise, PSPP displays the matrix values, without scaling, in + format `F13.10`. + + The optional `TITLE` subcommand specifies a title for the output text +or table, as a quoted string. When it is omitted, the syntax of the +matrix expression is used as the title. + + Use the `SPACE` subcommand to request extra space above the matrix +output. With a numerical argument, it adds the specified number of +lines of blank space above the matrix. With `NEWPAGE` as an argument, +it prints the matrix at the top of a new page. The `SPACE` subcommand +has no effect when a matrix is output as a pivot table. + + The `RLABELS` and `RNAMES` subcommands, which are mutually exclusive, +can supply a label to accompany each row in the output. With `RLABELS`, +specify the labels as comma-separated strings or other tokens. With +`RNAMES`, specify a single expression that evaluates to a vector of +strings. Either way, if there are more labels than rows, the extra +labels are ignored, and if there are more rows than labels, the extra +rows are unlabeled. For output to a pivot table with `RLABELS`, the +labels can be any length; otherwise, the labels are truncated to 8 +bytes. + + The `CLABELS` and `CNAMES` subcommands work for labeling columns as +`RLABELS` and `RNAMES` do for labeling rows. + + When the EXPRESSION is omitted, `PRINT` does not output a matrix. +Instead, it outputs only the text specified on `TITLE`, if any, preceded +by any space specified on the `SPACE` subcommand, if any.
Any other +subcommands are ignored, and the command acts as if `MDISPLAY` is set to +`TEXT` regardless of its actual setting. + + The following syntax demonstrates two different ways to label the +rows and columns of a matrix with `PRINT`: + +``` +MATRIX. +COMPUTE m={1, 2, 3; 4, 5, 6; 7, 8, 9}. +PRINT m/RLABELS=a, b, c/CLABELS=x, y, z. + +COMPUTE rlabels={"a", "b", "c"}. +COMPUTE clabels={"x", "y", "z"}. +PRINT m/RNAMES=rlabels/CNAMES=clabels. +END MATRIX. +``` + +With `MDISPLAY=TEXT` (the default), this program outputs the following +(twice): + + m + x y z + a 1 2 3 + b 4 5 6 + c 7 8 9 + +With `SET MDISPLAY=TABLES.` added above `MATRIX.`, the output becomes +the following (twice): + +``` + m ++-+-+-+-+ +| |x|y|z| ++-+-+-+-+ +|a|1|2|3| +|b|4|5|6| +|c|7|8|9| ++-+-+-+-+ +``` + + +## The `DO IF` Command + + DO IF expression. + ...matrix commands... + [ELSE IF expression. + ...matrix commands...]... + [ELSE. + ...matrix commands...] + END IF. + + A `DO IF` command evaluates its expression argument. If the `DO IF` +expression evaluates to true, then PSPP executes the associated +commands. Otherwise, PSPP evaluates the expression on each `ELSE IF` +clause (if any) in order, and executes the commands associated with the +first one that yields a true value. Finally, if the `DO IF` and all the +`ELSE IF` expressions evaluate to false, PSPP executes the commands +following the `ELSE` clause (if any). + + Each expression on `DO IF` and `ELSE IF` must evaluate to a scalar. +Positive scalars are considered to be true, and scalars that are zero or +negative are considered to be false. + + The following matrix language fragment sets `b` to the term following +`a` in the [Juggler sequence](https://en.wikipedia.org/wiki/Juggler_sequence): + +``` +DO IF MOD(a, 2) = 0. + COMPUTE b = TRUNC(a &** (1/2)). +ELSE. + COMPUTE b = TRUNC(a &** (3/2)). +END IF.
+``` + +## The `LOOP` and `BREAK` Commands + +``` +LOOP [var=first TO last [BY step]] [IF expression]. + ...matrix commands... +END LOOP [IF expression]. + +BREAK. +``` + + The `LOOP` command executes a nested group of matrix commands, called +the loop's "body", repeatedly. It has three optional clauses that +control how many times the loop body executes. Regardless of these +clauses, the global `MXLOOPS` setting, which defaults to 40, also limits +the number of iterations of a loop. To iterate more times, raise the +maximum with `SET MXLOOPS` outside of the `MATRIX` command. + + The optional index clause causes VAR to be assigned successive values +on each trip through the loop: first FIRST, then FIRST + STEP, then +FIRST + 2 × STEP, and so on. The loop ends when VAR > LAST, for +positive STEP, or VAR < LAST, for negative STEP. If STEP is not +specified, it defaults to 1. All the index clause expressions must +evaluate to scalars, and non-integers are rounded toward zero. If STEP +evaluates as zero (or rounds to zero), then the loop body never +executes. + + The optional `IF` on `LOOP` is evaluated before each iteration +through the loop body. If its expression, which must evaluate to a +scalar, is zero or negative, then the loop terminates without executing +the loop body. + + The optional `IF` on `END LOOP` is evaluated after each iteration +through the loop body. If its expression, which must evaluate to a +scalar, is zero or negative, then the loop terminates. + + The following computes and prints l(n), whose value is the number of +steps in the [Juggler sequence](https://en.wikipedia.org/wiki/Juggler_sequence) for n, for n from 2 to +10 inclusive: + +``` +COMPUTE l = {}. +LOOP n = 2 TO 10. + COMPUTE a = n. + LOOP i = 1 TO 100. + DO IF MOD(a, 2) = 0. + COMPUTE a = TRUNC(a &** (1/2)). + ELSE. + COMPUTE a = TRUNC(a &** (3/2)). + END IF. + END LOOP IF a = 1. + COMPUTE l = {l; i}. +END LOOP. +PRINT l.
+``` + +### The `BREAK` Command + +The `BREAK` command may be used inside a loop body, ordinarily within a +`DO IF` command. If it is executed, then the loop terminates +immediately, jumping to the command just following `END LOOP`. When +multiple `LOOP` commands nest, `BREAK` terminates the innermost loop. + + The following example is a revision of the one above that shows how +`BREAK` could substitute for the index and `IF` clauses on `LOOP` and +`END LOOP`: + +``` +COMPUTE l = {}. +LOOP n = 2 TO 10. + COMPUTE a = n. + COMPUTE i = 1. + LOOP. + DO IF MOD(a, 2) = 0. + COMPUTE a = TRUNC(a &** (1/2)). + ELSE. + COMPUTE a = TRUNC(a &** (3/2)). + END IF. + DO IF a = 1. + BREAK. + END IF. + COMPUTE i = i + 1. + END LOOP. + COMPUTE l = {l; i}. +END LOOP. +PRINT l. +``` + +## The `READ` and `WRITE` Commands + +The `READ` and `WRITE` commands perform matrix input and output with +text files. They share the following syntax for specifying how data is +divided among input lines: + +``` +/FIELD=first TO last [BY width] +[/FORMAT=format] +``` + + Both commands require the `FIELD` subcommand. It specifies the range +of columns, from FIRST to LAST, inclusive, that the data occupies on +each line of the file. The leftmost column is column 1. The columns +must be literal numbers, not expressions. To use entire lines, even if +they might be very long, specify a column range such as `1 TO 100000`. + + The `FORMAT` subcommand is optional for numerical matrices. For +string matrix input and output, specify an `A` format. Together, +`FORMAT` and the optional `BY` specification on `FIELD` determine the +meaning of each text line: + + - With neither `BY` nor `FORMAT`, the numbers in the text file are in + `F` format separated by spaces or commas. For `WRITE`, PSPP uses + as many digits of precision as needed to accurately represent the + numbers in the matrix. + + - `BY width` divides the input area into fixed-width fields with the + given width.
The input area must be a multiple of width columns + wide. Numbers are read or written as `Fwidth.0` format. + + - `FORMAT="countF"` divides the input area into integer count + equal-width fields per line. The input area must be a multiple of + count columns wide. Another format type may be substituted for + `F`. + + - `FORMAT=Fw`[`.d`] divides the input area into fixed-width fields + with width w. The input area must be a multiple of w columns wide. + Another format type may be substituted for `F`. The `READ` command + disregards d. + + - `FORMAT=F` specifies format `F` without indicating a field width. + Another format type may be substituted for `F`. The `WRITE` + command accepts this form, but it has no effect unless `BY` is also + used to specify a field width. + + If `BY` and `FORMAT` both specify or imply a field width, then they +must indicate the same field width. + +### The `READ` Command + +``` +READ variable[(index[,index])] + [/FILE=file] + /FIELD=first TO last [BY width] + [/FORMAT=format] + [/SIZE=expression] + [/MODE={RECTANGULAR | SYMMETRIC}] + [/REREAD]. +``` + + The `READ` command reads from a text file into a matrix variable. +Specify the target variable just after the command name, either just a +variable name to create or replace an entire variable, or a variable +name followed by an indexing expression to replace a submatrix of an +existing variable. + + The `FILE` subcommand is required in the first `READ` command that +appears within `MATRIX`. It specifies the text file to be read, either +as a file name in quotes or a file handle previously declared on `FILE +HANDLE`. Later `READ` commands (in syntax order) +use the previously referenced file if `FILE` is omitted. + + The `FIELD` and `FORMAT` subcommands specify how input lines are +interpreted. `FIELD` is required, but `FORMAT` is optional. See +the `READ` and `WRITE` commands above for details. + + The `SIZE` subcommand is required for reading into an entire +variable.
Its restricted expression argument should evaluate to a +2-element vector `{N, M}` or `{N; M}`, which indicates an N×M matrix +destination. A scalar N is also allowed and indicates an N×1 column +vector destination. When the destination is a submatrix, `SIZE` is +optional, and if it is present then it must match the size of the +submatrix. + + By default, or with `MODE=RECTANGULAR`, the command reads an entry +for every row and column. With `MODE=SYMMETRIC`, the command reads only +the entries on and below the matrix's main diagonal, and copies the +entries above the main diagonal from the corresponding symmetric entries +below it. Only square matrices may use `MODE=SYMMETRIC`. + + Ordinarily, each `READ` command starts from a new line in the text +file. Specify the `REREAD` subcommand to instead start from the last +line read by the previous `READ` command. This has no effect for the +first `READ` command to read from a particular file. It is also +ineffective just after a command that uses the `EOF` matrix function +(see the `EOF` function) on a particular file, because `EOF` has to +try to read the next line from the file to determine whether the file +contains more input. + +Example 1: Basic Use + +The following matrix program reads the same matrix `{1, 2, 4; 2, 3, 5; +4, 5, 6}` into matrix variables `v`, `w`, and `x`: + +``` +READ v /FILE='input.txt' /FIELD=1 TO 100 /SIZE={3, 3}. +READ w /FIELD=1 TO 100 /SIZE={3; 3} /MODE=SYMMETRIC. +READ x /FIELD=1 TO 100 BY 1/SIZE={3, 3} /MODE=SYMMETRIC. +``` +given that `input.txt` contains the following: + +``` +1, 2, 4 +2, 3, 5 +4, 5, 6 +1 +2 3 +4 5 6 +1 +23 +456 +``` + The `READ` command will read as many lines of input as needed for a +particular row, so it's also acceptable to break any of the lines above +into multiple lines. For example, the first line `1, 2, 4` could be +written with a line break following either or both commas.
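
The lower-triangle layout that `MODE=SYMMETRIC` expects can be easy to misread, so the following is a minimal sketch, in Python rather than PSPP syntax, of how such input expands into a full matrix. The function name `expand_symmetric` and the flat-list input are assumptions made for illustration only; PSPP's own implementation may well differ.

```python
def expand_symmetric(values, n):
    """Build an n-by-n symmetric matrix from its lower triangle.

    `values` lists the entries on and below the main diagonal in row
    order: row 1 contributes 1 entry, row 2 contributes 2, and so on.
    (Hypothetical helper for illustration; not part of PSPP.)
    """
    expected = n * (n + 1) // 2
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    matrix = [[0] * n for _ in range(n)]
    it = iter(values)
    for r in range(n):
        for c in range(r + 1):
            v = next(it)
            matrix[r][c] = v  # entry on or below the diagonal
            matrix[c][r] = v  # mirrored entry above the diagonal
    return matrix


# The second READ in Example 1 consumes the lines "1", "2 3", "4 5 6".
full = expand_symmetric([1, 2, 3, 4, 5, 6], 3)
print(full)
```

Under these assumptions, the six values read for `w` in Example 1 expand to the same 3×3 matrix that the first `READ` obtains in rectangular mode.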
+ +Example 2: Reading into a Submatrix + +The following reads a 5×5 matrix from `input2.txt`, reversing the order +of the rows: + +``` +COMPUTE m = MAKE(5, 5, 0). +LOOP r = 5 TO 1 BY -1. + READ m(r, :) /FILE='input2.txt' /FIELD=1 TO 100. +END LOOP. +``` +Example 3: Using `REREAD` + +Suppose each of the 5 lines in a file `input3.txt` starts with an +integer COUNT followed by COUNT numbers, e.g.: + +``` +1 5 +3 1 2 3 +5 6 -1 2 5 1 +2 8 9 +3 1 3 2 +``` +Then, the following reads this file into a matrix `m`: + +``` +COMPUTE m = MAKE(5, 5, 0). +LOOP i = 1 TO 5. + READ count /FILE='input3.txt' /FIELD=1 TO 1 /SIZE=1. + READ m(i, 1:count) /FIELD=3 TO 100 /REREAD. +END LOOP. +``` +### The `WRITE` Command + +``` +WRITE expression + [/OUTFILE=file] + /FIELD=first TO last [BY width] + [/FORMAT=format] + [/MODE={RECTANGULAR | TRIANGULAR}] + [/HOLD]. +``` + The `WRITE` command evaluates expression and writes its value to a +text file in a specified format. Write the expression to evaluate just +after the command name. + + The `OUTFILE` subcommand is required in the first `WRITE` command +that appears within `MATRIX`. It specifies the text file to be written, +either as a file name in quotes or a file handle previously declared on +`FILE HANDLE`. Later `WRITE` commands (in syntax +order) use the previously referenced file if `OUTFILE` is omitted. + + The `FIELD` and `FORMAT` subcommands specify how output lines are +formed. `FIELD` is required, but `FORMAT` is optional. See the +`READ` and `WRITE` commands above for details. + + By default, or with `MODE=RECTANGULAR`, the command writes an entry +for every row and column. With `MODE=TRIANGULAR`, the command writes +only the entries on and below the matrix's main diagonal. Entries above +the diagonal are not written. Only square matrices may be written with +`MODE=TRIANGULAR`. + + Ordinarily, each `WRITE` command writes complete lines to the output +file.
With `HOLD`, the final line written by `WRITE` will be held back +for the next `WRITE` command to augment. This can be useful to write +more than one matrix on a single output line. + +Example 1: Basic Usage + +This matrix program: + +``` +WRITE {1, 2; 3, 4} /OUTFILE='matrix.txt' /FIELD=1 TO 80. +``` +writes the following to `matrix.txt`: + +``` + 1 2 + 3 4 +``` +Example 2: Triangular Matrix + +This matrix program: + +``` +WRITE MAGIC(5) /OUTFILE='matrix.txt' /FIELD=1 TO 80 BY 5 /MODE=TRIANGULAR. +``` +writes the following to `matrix.txt`: + +``` + 17 + 23 5 + 4 6 13 + 10 12 19 21 + 11 18 25 2 9 +``` +## The `GET` Command + +``` +GET variable[(index[,index])] + [/FILE={file | *}] + [/VARIABLES=variable...] + [/NAMES=variable] + [/MISSING={ACCEPT | OMIT | number}] + [/SYSMIS={OMIT | number}]. +``` + The `GET` command reads numeric data from an SPSS system file, +SPSS/PC+ system file, or SPSS portable file into a matrix variable or +submatrix: + + - To read data into a variable, specify just its name following + `GET`. The variable need not already exist; if it does, it is + replaced. The variable will have as many columns as there are + variables specified on the `VARIABLES` subcommand and as many rows + as there are cases in the input file. + + - To read data into a submatrix, specify the name of an existing + variable, followed by an indexing expression, just after `GET`. + The submatrix must have as many columns as variables specified on + `VARIABLES` and as many rows as cases in the input file. + + Specify the name or handle of the file to be read on `FILE`. Use +`*`, or simply omit the `FILE` subcommand, to read from the active file. +Reading from the active file is only permitted if it was already defined +outside `MATRIX`. + + List the variables to be read as columns in the matrix on the +`VARIABLES` subcommand. The list can use `TO` for collections of +variables or `ALL` for all variables. If `VARIABLES` is omitted, all +variables are read.
Only numeric variables may be read. + + If a variable is named on `NAMES`, then the names of the variables +read as data columns are stored in a string vector with the given +name, replacing any existing matrix variable with that name. Variable +names are truncated to 8 bytes. + + The `MISSING` and `SYSMIS` subcommands control the treatment of +missing values in the input file. By default, any user- or +system-missing data in the variables being read from the input causes an +error that prevents `GET` from executing. To accept missing values, +specify one of the following settings on `MISSING`: + +`ACCEPT` + Accept user-missing values with no change. + + By default, system-missing values still yield an error. Use the + `SYSMIS` subcommand to change this treatment: + + `OMIT` + Skip any case that contains a system-missing value. + + number + Recode the system-missing value to number. + +`OMIT` + Skip any case that contains any user- or system-missing value. + +number + Recode all user- and system-missing values to number. + + The `SYSMIS` subcommand has an effect only with `MISSING=ACCEPT`. + +## The `SAVE` Command + +``` +SAVE expression + [/OUTFILE={file | *}] + [/VARIABLES=variable...] + [/NAMES=expression] + [/STRINGS=variable...]. +``` + The `SAVE` matrix command evaluates expression and writes the +resulting matrix to an SPSS system file. In the system file, each +matrix row becomes a case and each column becomes a variable. + + Specify the name or handle of the SPSS system file on the `OUTFILE` +subcommand, or `*` to write the output as the new active file. The +`OUTFILE` subcommand is required on the first `SAVE` command, in syntax +order, within `MATRIX`. For `SAVE` commands after the first, the +default output file is the same as the previous. + + When multiple `SAVE` commands write to one destination within a +single `MATRIX`, the later commands append to the same output file. All +the matrices written to the file must have the same number of columns.
+The `VARIABLES`, `NAMES`, and `STRINGS` subcommands are honored only for +the first `SAVE` command that writes to a given file. + + By default, `SAVE` names the variables in the output file `COL1` +through `COLn`. Use `VARIABLES` or `NAMES` to give the variables +meaningful names. The `VARIABLES` subcommand accepts a comma-separated +list of variable names. Its alternative, `NAMES`, instead accepts an +expression that must evaluate to a row or column string vector of names. +The number of names need not exactly match the number of columns in the +matrix to be written: extra names are ignored; extra columns use default +names. + + By default, `SAVE` assumes that the matrix to be written is all +numeric. To write string columns, specify a comma-separated list of the +string columns' variable names on `STRINGS`. + +## The `MGET` Command + +``` +MGET [/FILE=file] + [/TYPE={COV | CORR | MEAN | STDDEV | N | COUNT}]. +``` + The `MGET` command reads the data from a matrix file (see [Matrix +Files](index.md#matrix-files)) into matrix variables. + + All of `MGET`'s subcommands are optional. Specify the name or handle +of the matrix file to be read on the `FILE` subcommand; if it is +omitted, then the command reads the active file. + + By default, `MGET` reads all of the data from the matrix file. +Specify a space-delimited list of matrix types on `TYPE` to limit the +kinds of data read to those specified: + +`COV` + Covariance matrix. + +`CORR` + Correlation coefficient matrix. + +`MEAN` + Vector of means. + +`STDDEV` + Vector of standard deviations. + +`N` + Vector of case counts. + +`COUNT` + Vector of counts. + + `MGET` reads the entire matrix file and automatically names, creates, +and populates matrix variables using its contents. It constructs the +name of each variable by concatenating the following: + + - A 2-character prefix that identifies the type of the matrix: + + `CV` + Covariance matrix. + + `CR` + Correlation coefficient matrix. + + `MN` + Vector of means.
+ + `SD` + Vector of standard deviations. + + `NC` + Vector of case counts. + + `CN` + Vector of counts. + + - If the matrix file has factor variables, `Fn`, where n is a number + identifying a group of factors: `F1` for the first group, `F2` for + the second, and so on. This part is omitted for pooled data (where + the factors all have the system-missing value). + + - If the matrix file has split file variables, `Sn`, where n is a + number identifying a split group: `S1` for the first group, `S2` + for the second, and so on. + + If `MGET` chooses the name of an existing variable, it issues a +warning and does not change the variable. + +## The `MSAVE` Command + +``` +MSAVE expression + /TYPE={COV | CORR | MEAN | STDDEV | N | COUNT} + [/FACTOR=expression] + [/SPLIT=expression] + [/OUTFILE=file] + [/VARIABLES=variable...] + [/SNAMES=variable...] + [/FNAMES=variable...]. +``` + The `MSAVE` command evaluates the expression specified just after the +command name, and writes the resulting matrix to a matrix file (see +[Matrix Files](index.md#matrix-files)). + + The `TYPE` subcommand is required. It specifies the `ROWTYPE_` to +write along with this matrix. + + The `FACTOR` and `SPLIT` subcommands are required on the first +`MSAVE` if and only if the matrix file has factor or split variables, +respectively. After that, their values are carried along from one +`MSAVE` command to the next in syntax order as defaults. Each one takes +an expression that must evaluate to a vector with the same number of +entries as the matrix has factor or split variables, respectively. Each +`MSAVE` only writes data for a single combination of factor and split +variables, so many `MSAVE` commands (or one inside a loop) may be needed +to write a complete set. + + The remaining `MSAVE` subcommands define the format of the matrix +file.
All of the `MSAVE` commands within a given matrix program write +to the same matrix file, so these subcommands are only meaningful on the +first `MSAVE` command within a matrix program. (If they are given again +on later `MSAVE` commands, then they must have the same values as on the +first.) + + The `OUTFILE` subcommand specifies the name or handle of the matrix +file to be written. Output must go to an external file, not a data set +or the active file. + + The `VARIABLES` subcommand specifies a comma-separated list of the +names of the continuous variables to be written to the matrix file. The +`TO` keyword can be used to define variables named with consecutive +integer suffixes. These names become column names and names that appear +in `VARNAME_` in the matrix file. `ROWTYPE_` and `VARNAME_` are not +allowed on `VARIABLES`. If `VARIABLES` is omitted, then PSPP uses the +names `COL1`, `COL2`, and so on. + + The `FNAMES` subcommand may be used to supply a comma-separated list +of factor variable names. The default names are `FAC1`, `FAC2`, and so +on. + + The `SNAMES` subcommand can supply a comma-separated list of split +variable names. The default names are `SPL1`, `SPL2`, and so on. + +## The `DISPLAY` Command + +``` +DISPLAY [{DICTIONARY | STATUS}]. +``` + The `DISPLAY` command makes PSPP display a table with the name and +dimensions of each matrix variable. The `DICTIONARY` and `STATUS` +keywords are accepted but have no effect. + +## The `RELEASE` Command + +``` +RELEASE variable.... +``` + The `RELEASE` command accepts a comma-separated list of matrix +variable names. It deletes each variable and releases the memory +associated with it. + + The `END MATRIX` command releases all matrix variables. 
diff --git a/rust/doc/src/commands/matrix/mconvert.md b/rust/doc/src/commands/matrix/mconvert.md new file mode 100644 index 0000000000..ada82e3f5d --- /dev/null +++ b/rust/doc/src/commands/matrix/mconvert.md @@ -0,0 +1,31 @@ +# MCONVERT + +``` +MCONVERT + [[MATRIX=] + [IN({'*'|'FILE'})] + [OUT({'*'|'FILE'})]] + [/{REPLACE,APPEND}]. +``` + +The `MCONVERT` command converts matrix data from a correlation matrix +and a vector of standard deviations into a covariance matrix, or vice +versa. + +By default, `MCONVERT` both reads and writes the active file. Use +the `MATRIX` subcommand to specify other files. To read a matrix file, +specify its name inside parentheses following `IN`. To write a matrix +file, specify its name inside parentheses following `OUT`. Use `*` to +explicitly specify the active file for input or output. + +When `MCONVERT` reads the input, by default it substitutes a +correlation matrix and a vector of standard deviations each time it +encounters a covariance matrix, and vice versa. Specify `/APPEND` to +instead have `MCONVERT` add the other form of data without removing the +existing data. Use `/REPLACE` to explicitly request removing the +existing data. + +The `MCONVERT` command requires its input to be a matrix file. Use +[`MATRIX DATA`](matrix-data.md) to convert text input into matrix file +format. + diff --git a/rust/doc/src/commands/statistics/oneway.md b/rust/doc/src/commands/statistics/oneway.md new file mode 100644 index 0000000000..37c52e76c2 --- /dev/null +++ b/rust/doc/src/commands/statistics/oneway.md @@ -0,0 +1,58 @@ +# ONEWAY + +``` +ONEWAY + [/VARIABLES = ] VAR_LIST BY VAR + /MISSING={ANALYSIS,LISTWISE} {EXCLUDE,INCLUDE} + /CONTRAST= VALUE1 [, VALUE2] ... [,VALUEN] + /STATISTICS={DESCRIPTIVES,HOMOGENEITY} + /POSTHOC={BONFERRONI, GH, LSD, SCHEFFE, SIDAK, TUKEY, ALPHA ([VALUE])} +``` + +The `ONEWAY` procedure performs a one-way analysis of variance of +variables factored by a single independent variable.
It is used to +compare the means of a population divided into more than two groups. + +The dependent variables to be analysed should be given in the +`VARIABLES` subcommand. The list of variables must be followed by the +`BY` keyword and the name of the independent (or factor) variable. + +You can use the `STATISTICS` subcommand to tell PSPP to display +ancillary information. The options accepted are: +- `DESCRIPTIVES`: Displays descriptive statistics about the groups +factored by the independent variable. +- `HOMOGENEITY`: Displays the Levene test of Homogeneity of Variance for +the variables and their groups. + +The `CONTRAST` subcommand is used when you anticipate certain +differences between the groups. The subcommand must be followed by a +list of numerals which are the coefficients of the groups to be tested. +The number of coefficients must correspond to the number of distinct +groups (or values of the independent variable). If the sum of the +coefficients is not zero, then PSPP will display a warning, but will +proceed with the analysis. The `CONTRAST` subcommand may be given up to +10 times in order to specify different contrast tests. + +The `MISSING` +subcommand defines how missing values are handled. If `LISTWISE` is +specified then cases which have missing values for the independent +variable or any dependent variable are ignored. If `ANALYSIS` is +specified, then cases are ignored if the independent variable is missing +or if the dependent variable currently being analysed is missing. The +default is `ANALYSIS`. A setting of `EXCLUDE` means that variables +whose values are user-missing are to be excluded from the analysis. A +setting of `INCLUDE` means they are to be included. The default is +`EXCLUDE`. + +Using the `POSTHOC` subcommand you can perform multiple pairwise +comparisons on the data. The following comparison methods are +available: +- `LSD`: Least Significant Difference. +- `TUKEY`: Tukey Honestly Significant Difference.
+- `BONFERRONI`: Bonferroni test. +- `SCHEFFE`: Scheffé's test. +- `SIDAK`: Sidak test. +- `GH`: The Games-Howell test. + +Use the optional syntax `ALPHA(VALUE)` to indicate that `ONEWAY` should +perform the posthoc tests at a significance level of VALUE. If +`ALPHA(VALUE)` is not specified, then the significance level used is 0.05. + diff --git a/rust/doc/src/commands/statistics/quick-cluster.md b/rust/doc/src/commands/statistics/quick-cluster.md new file mode 100644 index 0000000000..7f1df26e96 --- /dev/null +++ b/rust/doc/src/commands/statistics/quick-cluster.md @@ -0,0 +1,69 @@ +# QUICK CLUSTER + +``` +QUICK CLUSTER VAR_LIST + [/CRITERIA=CLUSTERS(K) [MXITER(MAX_ITER)] CONVERGE(EPSILON) [NOINITIAL]] + [/MISSING={EXCLUDE,INCLUDE} {LISTWISE, PAIRWISE}] + [/PRINT={INITIAL} {CLUSTER}] + [/SAVE[=[CLUSTER[(MEMBERSHIP_VAR)]] [DISTANCE[(DISTANCE_VAR)]]] +``` + +The `QUICK CLUSTER` command performs k-means clustering on the +dataset. This is useful when you wish to allocate cases into clusters +of similar values and you already know the number of clusters. + +The minimum specification is `QUICK CLUSTER` followed by the names of +the variables which contain the cluster data. Normally you will also +want to specify `/CRITERIA=CLUSTERS(K)` where `K` is the number of +clusters. If this is not specified, then `K` defaults to 2. + +If you use `/CRITERIA=NOINITIAL` then a naive algorithm to select the +initial clusters is used. This will provide for faster execution but +less well separated initial clusters and hence possibly an inferior +final result. + +`QUICK CLUSTER` uses an iterative algorithm to select the cluster +centers. The subcommand `/CRITERIA=MXITER(MAX_ITER)` sets the maximum +number of iterations. During classification, PSPP will continue +iterating until `MAX_ITER` iterations have been done or the +convergence criterion (see below) is fulfilled. The default value of +`MAX_ITER` is 2.
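The interplay between the iteration limit and the convergence criterion (described under `CONVERGE` in this section) can be sketched in a few lines of Python. This is only an illustration, not PSPP's implementation: the function name `kmeans` is hypothetical, the initial centers are supplied by the caller, at least two centers are assumed, and "mean cluster distance between one iteration and the next" is read here as the average distance that the centers move, which is an assumption.

```python
import math


def kmeans(points, centers, max_iter=2, epsilon=0.0):
    """Refine `centers` until max_iter passes are done or the centers settle."""
    # Threshold: epsilon times the minimum distance between the initial centers.
    threshold = epsilon * min(
        math.dist(a, b)
        for i, a in enumerate(centers)
        for b in centers[i + 1:]
    )
    for _ in range(max_iter):
        # Assign every point to its nearest current center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            groups[nearest].append(p)
        # Move each center to the mean of its members (empty clusters stay put).
        updated = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else old
            for g, old in zip(groups, centers)
        ]
        # Assumed reading of "mean cluster distance": average center movement.
        shift = sum(math.dist(o, n) for o, n in zip(centers, updated)) / len(centers)
        centers = updated
        if shift < threshold:
            break
    return centers
```

With `epsilon` left at zero the threshold is zero, so the loop always runs for the full `max_iter` passes, which is consistent with the defaults documented for `MXITER` and `CONVERGE`.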
+ +If, however, you specify `/CRITERIA=NOUPDATE` then after selecting the +initial centers, no further update to the cluster centers is done. In +this case, `MAX_ITER`, if specified, is ignored. + +The subcommand `/CRITERIA=CONVERGE(EPSILON)` is used to set the +convergence criterion. The value of the convergence criterion is +`EPSILON` times the minimum distance between the _initial_ cluster +centers. Iteration stops when the mean cluster distance between one +iteration and the next is less than the convergence criterion. The +default value of `EPSILON` is zero. + +The `MISSING` subcommand determines the handling of missing +values. If `INCLUDE` is set, then user-missing values are considered +at their face value and not as missing values. If `EXCLUDE` is set, +which is the default, user-missing values are excluded as well as +system-missing values. + +If `LISTWISE` is set, then the entire case is excluded from the +analysis whenever any of the clustering variables contains a missing +value. If `PAIRWISE` is set, then a case is considered missing only if +all the clustering variables contain missing values. Otherwise it is +clustered on the basis of the non-missing values. The default is +`LISTWISE`. + +The `PRINT` subcommand requests additional output to be printed. If +`INITIAL` is set, then the initial cluster memberships will be printed. +If `CLUSTER` is set, the cluster memberships of the individual cases are +displayed (potentially generating lengthy output). + +You can specify the subcommand `SAVE` to ask that each case's cluster +membership and the Euclidean distance between the case and its cluster +center be saved to a new variable in the active dataset. To save the +cluster membership use the `CLUSTER` keyword and to save the distance +use the `DISTANCE` keyword. Each keyword may optionally be followed by +a variable name in parentheses to specify the new variable which is to +contain the saved parameter.
If no variable name is specified, then +PSPP will create one. + diff --git a/rust/doc/src/commands/statistics/rank.md b/rust/doc/src/commands/statistics/rank.md new file mode 100644 index 0000000000..150239908f --- /dev/null +++ b/rust/doc/src/commands/statistics/rank.md @@ -0,0 +1,57 @@ +# RANK + +``` +RANK + [VARIABLES=] VAR_LIST [{A,D}] [BY VAR_LIST] + /TIES={MEAN,LOW,HIGH,CONDENSE} + /FRACTION={BLOM,TUKEY,VW,RANKIT} + /PRINT[={YES,NO}] + /MISSING={EXCLUDE,INCLUDE} + + /RANK [INTO VAR_LIST] + /NTILES(k) [INTO VAR_LIST] + /NORMAL [INTO VAR_LIST] + /PERCENT [INTO VAR_LIST] + /RFRACTION [INTO VAR_LIST] + /PROPORTION [INTO VAR_LIST] + /N [INTO VAR_LIST] + /SAVAGE [INTO VAR_LIST] +``` + +The `RANK` command ranks variables and stores the results into new +variables. + +The `VARIABLES` subcommand, which is mandatory, specifies one or more +variables whose values are to be ranked. After each variable, `A` or +`D` may appear, indicating that the variable is to be ranked in +ascending or descending order. Ascending is the default. If a `BY` +keyword appears, it should be followed by a list of variables which are +to serve as group variables. In this case, the cases are gathered into +groups, and ranks calculated for each group. + +The `TIES` subcommand specifies how tied values are to be treated. +The default is to take the mean value of all the tied cases. + +The `FRACTION` subcommand specifies how proportional ranks are to be +calculated. This only has any effect if the `NORMAL` or `PROPORTION` rank +functions are requested. + +The `PRINT` subcommand may be used to specify that a summary of the +rank variables created should appear in the output. + +The function subcommands are `RANK`, `NTILES`, `NORMAL`, `PERCENT`, +`RFRACTION`, `PROPORTION`, `N`, and `SAVAGE`. Any number of function +subcommands may appear. If none are given, then the default is `RANK`. +The `NTILES` subcommand must take an integer specifying the number of +partitions into which values should be ranked.
Each subcommand may be +followed by the `INTO` keyword and a list of variables which are the +variables to be created and receive the rank scores. There may be as +many variables specified as there are variables named on the +`VARIABLES` subcommand. If fewer are specified, then the variable +names are automatically created. + +The `MISSING` subcommand determines how user missing values are to be +treated. A setting of `EXCLUDE` means that variables whose values are +user-missing are to be excluded from the rank scores. A setting of +`INCLUDE` means they are to be included. The default is `EXCLUDE`. + diff --git a/rust/doc/src/commands/statistics/regression.md b/rust/doc/src/commands/statistics/regression.md new file mode 100644 index 0000000000..a6c4def803 --- /dev/null +++ b/rust/doc/src/commands/statistics/regression.md @@ -0,0 +1,117 @@ +# REGRESSION + +The `REGRESSION` procedure fits linear models to data via least-squares +estimation. The procedure is appropriate for data which satisfy those +assumptions typical in linear regression: + +- The data set contains \\(n\\) observations of a dependent variable, + say \\(y_1,...,y_n\\), and \\(n\\) observations of one or more + explanatory variables. Let \\(x_{11}, x_{12}, ..., x_{1n}\\) + denote the \\(n\\) observations of the first explanatory variable; + \\(x_{21},...,x_{2n}\\) denote the \\(n\\) observations of the + second explanatory variable; \\(x_{k1},...,x_{kn}\\) denote the + \\(n\\) observations of the kth explanatory variable. + +- The dependent variable \\(y\\) has the following relationship to the + explanatory variables: \\(y_i = b_0 + b_1 x_{1i} + ... + b_k + x_{ki} + z_i\\) where \\(b_0, b_1, ..., b_k\\) are unknown + coefficients, and \\(z_1,...,z_n\\) are independent, normally + distributed "noise" terms with mean zero and common variance. The + noise, or "error" terms are unobserved. This relationship is called + the "linear model".
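For the special case of a single explanatory variable (k = 1), the least-squares estimates for the linear model above have a simple closed form. The sketch below is only an illustration of that textbook formula and says nothing about how PSPP computes its estimates; the function name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y_i = b0 + b1 * x_i + z_i."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_mean - b1 * x_mean  # intercept
    return b0, b1
```

The residuals are then `y[i] - (b0 + b1 * x[i])`, and the predicted values are the bracketed term.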
+ + The `REGRESSION` procedure estimates the coefficients +\\(b_0,...,b_k\\) and produces output relevant to inferences for the +linear model. + +## Syntax + +``` +REGRESSION + /VARIABLES=VAR_LIST + /DEPENDENT=VAR_LIST + /STATISTICS={ALL, DEFAULTS, R, COEFF, ANOVA, BCOV, CI[CONF, TOL]} + { /ORIGIN | /NOORIGIN } + /SAVE={PRED, RESID} +``` + +The `REGRESSION` procedure reads the active dataset and outputs +statistics relevant to the linear model specified by the user. + +The `VARIABLES` subcommand, which is required, specifies the list of +variables to be analyzed. Keyword `VARIABLES` is required. The +`DEPENDENT` subcommand specifies the dependent variable of the linear +model. The `DEPENDENT` subcommand is required. All variables listed +in the `VARIABLES` subcommand, but not listed in the `DEPENDENT` +subcommand, are treated as explanatory variables in the linear model. + +All other subcommands are optional: + +The `STATISTICS` subcommand specifies which statistics are to be +displayed. The following keywords are accepted: + +* `ALL` + All of the statistics below. +* `R` + The ratio of the sums of squares due to the model to the total sums + of squares for the dependent variable. +* `COEFF` + A table containing the estimated model coefficients and their + standard errors. +* `CI (CONF)` + This item is only relevant if `COEFF` has also been selected. It + specifies that the confidence interval for the coefficients should + be printed. The optional value `CONF`, which must be in + parentheses, is the desired confidence level expressed as a + percentage. +* `ANOVA` + Analysis of variance table for the model. +* `BCOV` + The covariance matrix for the estimated model coefficients. +* `TOL` + The variance inflation factor and its reciprocal. This has no + effect unless `COEFF` is also given. +* `DEFAULTS` + The same as if `R`, `COEFF`, and `ANOVA` had been selected.
This is + what you get if the `/STATISTICS` subcommand is not specified, or if it + is specified without any parameters. + +The `ORIGIN` and `NOORIGIN` subcommands are mutually exclusive. +`ORIGIN` indicates that the regression should be performed through the +origin. You should use this option if, and only if, you have reason to +believe that the regression does indeed pass through the origin -- that +is to say, the coefficient \\(b_0\\) above is zero. The default is `NOORIGIN`. + +The `SAVE` subcommand causes PSPP to save the residuals or predicted +values from the fitted model to the active dataset. PSPP will store the +residuals in a variable called `RES1` if no such variable exists, `RES2` +if `RES1` already exists, `RES3` if `RES1` and `RES2` already exist, +etc. It will choose the name of the variable for the predicted values +similarly, but with `PRED` as a prefix. When `SAVE` is used, PSPP +ignores `TEMPORARY`, treating temporary transformations as permanent. + +## Example + +The following PSPP syntax will generate the default output and save the +predicted values and residuals to the active dataset. + +``` +title 'Demonstrate REGRESSION procedure'. +data list / v0 1-2 (A) v1 v2 3-22 (10). +begin data. +b 7.735648 -23.97588 +b 6.142625 -19.63854 +a 7.651430 -25.26557 +c 6.125125 -16.57090 +a 8.245789 -25.80001 +c 6.031540 -17.56743 +a 9.832291 -28.35977 +c 5.343832 -16.79548 +a 8.838262 -29.25689 +b 6.200189 -18.58219 +end data. +list. +regression /variables=v0 v1 v2 /statistics defaults /dependent=v2 + /save pred resid /method=enter.
+ +``` diff --git a/rust/doc/src/commands/statistics/reliability.md b/rust/doc/src/commands/statistics/reliability.md new file mode 100644 index 0000000000..fc1e3e6d10 --- /dev/null +++ b/rust/doc/src/commands/statistics/reliability.md @@ -0,0 +1,95 @@ +# RELIABILITY + +``` +RELIABILITY + /VARIABLES=VAR_LIST + /SCALE (NAME) = {VAR_LIST, ALL} + /MODEL={ALPHA, SPLIT[(N)]} + /SUMMARY={TOTAL,ALL} + /MISSING={EXCLUDE,INCLUDE} +``` + +The `RELIABILITY` command performs reliability analysis on the data. + +The `VARIABLES` subcommand is required. It determines the set of +variables upon which analysis is to be performed. + +The `SCALE` subcommand determines the variables for which reliability +is to be calculated. If `SCALE` is omitted, then all +variables named in the `VARIABLES` subcommand are used. Optionally, the +NAME parameter may be specified to set a string name for the scale. + +The `MODEL` subcommand determines the type of analysis. If `ALPHA` +is specified, then Cronbach's Alpha is calculated for the scale. If the +model is `SPLIT`, then the variables are divided into 2 subsets. An +optional parameter `N` may be given, to specify how many variables are to be +in the first subset. If `N` is omitted, then it defaults to one half of +the variables in the scale, or one half minus one if there is an odd +number of variables. The default model is `ALPHA`. + +By default, any cases with user missing, or system missing values for +any variables given in the `VARIABLES` subcommand are omitted from the +analysis. The `MISSING` subcommand determines whether user missing +values are included or excluded in the analysis. + +The `SUMMARY` subcommand determines the type of summary analysis to +be performed. Currently there is only one type: `SUMMARY=TOTAL`, which +displays per-item analysis tested against the totals.
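As a rough illustration of what `MODEL=ALPHA` reports, the standard textbook formula for Cronbach's Alpha can be sketched in Python as below. The function name `cronbach_alpha` is hypothetical, the sketch assumes complete data with no missing values, and it is not taken from PSPP's source.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per variable, all of the same length."""
    k = len(items)
    # Total score of each case across all items in the scale.
    totals = [sum(case) for case in zip(*items)]
    # Sum of the individual item variances (sample variances).
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))
```

Items that move together make the variance of the totals large relative to the sum of the item variances, which pushes the result towards 1; unrelated items push it towards 0.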
+ +## Example + +Before analysing the results of a survey, particularly a multiple +choice survey, it is desirable to know whether the respondents have +considered their answers or simply provided random answers. + +In the following example the survey results from the file `hotel.sav` +are used. All five survey questions are included in the reliability +analysis. However, before running the analysis, the data must be +preprocessed. An examination of the survey questions reveals that two +questions, v3 and v5, are negatively worded, whereas the others +are positively worded. All questions must be based upon the same +scale for the analysis to be meaningful. One could use the +[`RECODE`](../../commands/data/recode.md) command; however, a simpler +way is to use [`COMPUTE`](../../commands/compute.md), which is what +is done in the syntax below. + +``` +get file="hotel.sav". + +* Recode V3 and V5 inverting the sense of the values. +compute v3 = 6 - v3. +compute v5 = 6 - v5. + +reliability + /variables= all + /model=alpha. +``` + +In this case, all variables in the data set are used, so we can use +the special keyword `ALL`. + +The output, below, shows that Cronbach's Alpha is 0.11, which is a +value normally considered too low to indicate consistency within the +data. This is possibly due to the small number of survey questions. +The survey should be redesigned before its results are put to +serious use.
+ +``` +Scale: ANY + +Case Processing Summary +┌────────┬──┬───────┐ +│Cases │ N│Percent│ +├────────┼──┼───────┤ +│Valid │17│ 100.0%│ +│Excluded│ 0│ .0%│ +│Total │17│ 100.0%│ +└────────┴──┴───────┘ + + Reliability Statistics +┌────────────────┬──────────┐ +│Cronbach's Alpha│N of Items│ +├────────────────┼──────────┤ +│ .11│ 5│ +└────────────────┴──────────┘ +``` diff --git a/rust/doc/src/commands/statistics/roc.md b/rust/doc/src/commands/statistics/roc.md new file mode 100644 index 0000000000..41426a3662 --- /dev/null +++ b/rust/doc/src/commands/statistics/roc.md @@ -0,0 +1,66 @@ +# ROC + +``` +ROC + VAR_LIST BY STATE_VAR (STATE_VALUE) + /PLOT = { CURVE [(REFERENCE)], NONE } + /PRINT = [ SE ] [ COORDINATES ] + /CRITERIA = [ CUTOFF({INCLUDE,EXCLUDE}) ] + [ TESTPOS ({LARGE,SMALL}) ] + [ CI (CONFIDENCE) ] + [ DISTRIBUTION ({FREE, NEGEXPO }) ] + /MISSING={EXCLUDE,INCLUDE} +``` + +The `ROC` command is used to plot the receiver operating +characteristic curve of a dataset, and to estimate the area under the +curve. This is useful for analysing the efficacy of a variable as a +predictor of a state of nature. + +The mandatory `VAR_LIST` is the list of predictor variables. The +variable `STATE_VAR` is the variable whose values represent the actual +states, and `STATE_VALUE` is the value of this variable which represents +the positive state. + +The optional subcommand `PLOT` is used to determine if and how the +`ROC` curve is drawn. The keyword `CURVE` means that the `ROC` curve +should be drawn, and the optional keyword `REFERENCE`, which should be +enclosed in parentheses, says that the diagonal reference line should be +drawn. If the keyword `NONE` is given, then no `ROC` curve is drawn. +By default, the curve is drawn with no reference line. + +The optional subcommand `PRINT` determines which additional tables +should be printed. Two additional tables are available. 
The `SE` +keyword says that the standard error of the area under the curve should be +printed as well as the area itself. In addition, a p-value for the null +hypothesis that the area under the curve equals 0.5 is printed. The +`COORDINATES` keyword says that a table of coordinates of the `ROC` +curve should be printed. + +The `CRITERIA` subcommand has four optional parameters: + +- The `TESTPOS` parameter may be `LARGE` or `SMALL`. `LARGE` is the + default, and says that larger values in the predictor variables are + to be considered positive. `SMALL` indicates that smaller values + should be considered positive. + +- The `CI` parameter specifies the confidence interval that should be + printed. It has no effect if the `SE` keyword in the `PRINT` + subcommand has not been given. + +- The `DISTRIBUTION` parameter determines the method to be used when + estimating the area under the curve. There are two possibilities, + viz: `FREE` and `NEGEXPO`. The `FREE` method uses a non-parametric + estimate, and the `NEGEXPO` method a bi-negative exponential + distribution estimate. The `NEGEXPO` method should only be used + when the number of positive actual states is equal to the number of + negative actual states. The default is `FREE`. + +- The `CUTOFF` parameter is for compatibility and is ignored. + +The `MISSING` subcommand determines whether user missing values are to +be included or excluded in the analysis. The default behaviour is to +exclude them. Cases are excluded on a listwise basis; if any of the +variables in `VAR_LIST` or if the variable `STATE_VAR` is missing, +then the entire case is excluded. + diff --git a/rust/doc/src/commands/statistics/t-test.md b/rust/doc/src/commands/statistics/t-test.md new file mode 100644 index 0000000000..3586a364a7 --- /dev/null +++ b/rust/doc/src/commands/statistics/t-test.md @@ -0,0 +1,238 @@ +# T-TEST + +``` +T-TEST + /MISSING={ANALYSIS,LISTWISE} {EXCLUDE,INCLUDE} + /CRITERIA=CI(CONFIDENCE) + + +(One Sample mode.)
+ TESTVAL=TEST_VALUE + /VARIABLES=VAR_LIST + + +(Independent Samples mode.) + GROUPS=var(VALUE1 [, VALUE2]) + /VARIABLES=VAR_LIST + + +(Paired Samples mode.) + PAIRS=VAR_LIST [WITH VAR_LIST [(PAIRED)] ] +``` + +The `T-TEST` procedure outputs tables used in testing hypotheses +about means. It operates in one of three modes: +- [One Sample mode](#one-sample-mode). +- [Independent Samples mode](#independent-samples-mode). +- [Paired Samples mode](#paired-samples-mode). + +Each of these modes are described in more detail below. There are two +optional subcommands which are common to all modes. + +The `/CRITERIA` subcommand tells PSPP the confidence interval used in +the tests. The default value is 0.95. + +The `MISSING` subcommand determines the handling of missing +variables. If `INCLUDE` is set, then user-missing values are included +in the calculations, but system-missing values are not. If `EXCLUDE` is +set, which is the default, user-missing values are excluded as well as +system-missing values. This is the default. + +If `LISTWISE` is set, then the entire case is excluded from analysis +whenever any variable specified in the `/VARIABLES`, `/PAIRS` or +`/GROUPS` subcommands contains a missing value. If `ANALYSIS` is set, +then missing values are excluded only in the analysis for which they +would be needed. This is the default. + +## One Sample Mode + +The `TESTVAL` subcommand invokes the One Sample mode. This mode is used +to test a population mean against a hypothesized mean. The value given +to the `TESTVAL` subcommand is the value against which you wish to test. +In this mode, you must also use the `/VARIABLES` subcommand to tell PSPP +which variables you wish to test. + +### Example + +A researcher wishes to know whether the weight of persons in a +population is different from the national average. The samples are +drawn from the population under investigation and recorded in the file +`physiology.sav`. 
From the Department of Health, she knows that the +national average weight of healthy adults is 76.8kg. Accordingly the +`TESTVAL` is set to 76.8. The null hypothesis therefore is that the +mean weight of the population from which the sample was drawn is +76.8kg. + + As previously noted (see "Identifying incorrect data"), one sample +in the dataset contains a weight value which is clearly incorrect. So +this is excluded from the analysis using the `SELECT IF` command. + +``` +GET FILE='physiology.sav'. + +SELECT IF (weight > 0). + +T-TEST TESTVAL = 76.8 + /VARIABLES = weight. +``` + +The output below shows that the mean of our sample differs from the +test value by -1.40kg. However, the significance is very high (0.610). +So one cannot reject the null hypothesis, and must conclude there is +not enough evidence to suggest that the mean weight of the persons in +our population is different from 76.8kg. + +``` + One─Sample Statistics +┌───────────────────┬──┬─────┬──────────────┬─────────┐ +│ │ N│ Mean│Std. Deviation│S.E. Mean│ +├───────────────────┼──┼─────┼──────────────┼─────────┤ +│Weight in kilograms│39│75.40│ 17.08│ 2.73│ +└───────────────────┴──┴─────┴──────────────┴─────────┘ + + One─Sample Test +┌──────────────┬──────────────────────────────────────────────────────────────┐ +│ │ Test Value = 76.8 │ +│ ├────┬──┬────────────┬────────────┬────────────────────────────┤ +│ │ │ │ │ │ 95% Confidence Interval of │ +│ │ │ │ │ │ the Difference │ +│ │ │ │ Sig. (2─ │ Mean ├──────────────┬─────────────┤ +│ │ t │df│ tailed) │ Difference │ Lower │ Upper │ +├──────────────┼────┼──┼────────────┼────────────┼──────────────┼─────────────┤ +│Weight in │─.51│38│ .610│ ─1.40│ ─6.94│ 4.13│ +│kilograms │ │ │ │ │ │ │ +└──────────────┴────┴──┴────────────┴────────────┴──────────────┴─────────────┘ +``` + +## Independent Samples Mode + +The `GROUPS` subcommand invokes Independent Samples mode or 'Groups' +mode.
This mode is used to test whether two groups of values have the +same population mean. In this mode, you must also use the `/VARIABLES` +subcommand to tell PSPP the dependent variables you wish to test. + +The variable given in the `GROUPS` subcommand is the independent +variable which determines to which group the samples belong. The values +in parentheses are the specific values of the independent variable for +each group. If the parentheses are omitted and no values are given, the +default values of 1.0 and 2.0 are assumed. + +If the independent variable is numeric, it is acceptable to specify +only one value inside the parentheses. If you do this, cases where the +independent variable is greater than or equal to this value belong to +the first group, and cases less than this value belong to the second +group. When using this form of the `GROUPS` subcommand, missing values +in the independent variable are excluded on a listwise basis, regardless +of whether `/MISSING=LISTWISE` was specified. + +### Example + +A researcher wishes to know whether within a population, adult males are +taller than adult females. The samples are drawn from the population +under investigation and recorded in the file `physiology.sav`. + +As previously noted (see "Identifying incorrect data"), one sample +in the dataset contains a height value which is clearly incorrect. So +this is excluded from the analysis using the `SELECT IF` command. + +``` +get file='physiology.sav'. + +select if (height >= 200). + +t-test /variables = height + /groups = sex(0,1). +``` + +The null hypothesis is that both males and females are on average of +equal height. + + +From the output, shown below, one can clearly see that the _sample_ +mean height is greater for males than for females. However, in order +to see if this is a significant result, one must consult the T-Test +table.
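The t statistics and degrees of freedom in the T-Test table for this example can be reproduced from the Group Statistics summary alone (group sizes, means, and standard deviations), for both the "equal variances assumed" and the "equal variances not assumed" rows. The Python sketch below uses the standard pooled-variance and Welch formulas; the function name `two_sample_t` is hypothetical, and the code is not taken from PSPP.

```python
import math


def two_sample_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t, df) for the pooled and the Welch two-sample t-tests."""
    diff = mean1 - mean2
    # Equal variances assumed: pool the two sample variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Equal variances not assumed: Welch standard error and
    # Welch-Satterthwaite degrees of freedom.
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se_welch = math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return {
        "equal": (diff / se_pooled, n1 + n2 - 2),
        "unequal": (diff / se_welch, df_welch),
    }


# Summary values copied from the Group Statistics table in this example.
result = two_sample_t(22, 1796.49, 49.71, 17, 1610.77, 25.43)
```

Rounded to two decimal places, `result` agrees with the `t` and `df` columns of the Independent Samples Test table in this example.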
+ +The T-Test table contains two rows; one for use if the variance of +the samples in each group may be safely assumed to be equal, and the +second row if the variances in each group may not be safely assumed to +be equal. + +In this case however, both rows show a 2-tailed significance less +than 0.001 and one must therefore reject the null hypothesis and +conclude that within the population the mean height of males and of +females are unequal. + +``` + Group Statistics +┌────────────────────────────┬──┬───────┬──────────────┬─────────┐ +│ Group │ N│ Mean │Std. Deviation│S.E. Mean│ +├────────────────────────────┼──┼───────┼──────────────┼─────────┤ +│Height in millimeters Male │22│1796.49│ 49.71│ 10.60│ +│ Female│17│1610.77│ 25.43│ 6.17│ +└────────────────────────────┴──┴───────┴──────────────┴─────────┘ + + Independent Samples Test +┌─────────────────────┬──────────┬────────────────────────────────────────── +│ │ Levene's │ +│ │ Test for │ +│ │ Equality │ +│ │ of │ +│ │ Variances│ T─Test for Equality of Means +│ ├────┬─────┼─────┬─────┬───────┬──────────┬──────────┐ +│ │ │ │ │ │ │ │ │ +│ │ │ │ │ │ │ │ │ +│ │ │ │ │ │ │ │ │ +│ │ │ │ │ │ │ │ │ +│ │ │ │ │ │ Sig. │ │ │ +│ │ │ │ │ │ (2─ │ Mean │Std. 
Error│ +│ │ F │ Sig.│ t │ df │tailed)│Difference│Difference│ +├─────────────────────┼────┼─────┼─────┼─────┼───────┼──────────┼──────────┤ +│Height in Equal │ .97│ .331│14.02│37.00│ .000│ 185.72│ 13.24│ +│millimeters variances│ │ │ │ │ │ │ │ +│ assumed │ │ │ │ │ │ │ │ +│ Equal │ │ │15.15│32.71│ .000│ 185.72│ 12.26│ +│ variances│ │ │ │ │ │ │ │ +│ not │ │ │ │ │ │ │ │ +│ assumed │ │ │ │ │ │ │ │ +└─────────────────────┴────┴─────┴─────┴─────┴───────┴──────────┴──────────┘ + +┌─────────────────────┬─────────────┐ +│ │ │ +│ │ │ +│ │ │ +│ │ │ +│ │ │ +│ ├─────────────┤ +│ │ 95% │ +│ │ Confidence │ +│ │ Interval of │ +│ │ the │ +│ │ Difference │ +│ ├──────┬──────┤ +│ │ Lower│ Upper│ +├─────────────────────┼──────┼──────┤ +│Height in Equal │158.88│212.55│ +│millimeters variances│ │ │ +│ assumed │ │ │ +│ Equal │160.76│210.67│ +│ variances│ │ │ +│ not │ │ │ +│ assumed │ │ │ +└─────────────────────┴──────┴──────┘ +``` + +## Paired Samples Mode + +The `PAIRS` subcommand introduces Paired Samples mode. Use this mode +when repeated measures have been taken from the same samples. If the +`WITH` keyword is omitted, then tables for all combinations of variables +given in the `PAIRS` subcommand are generated. If the `WITH` keyword is +given, and the `(PAIRED)` keyword is also given, then the number of +variables preceding `WITH` must be the same as the number following it. +In this case, tables for each respective pair of variables are +generated. In the event that the `WITH` keyword is given, but the +`(PAIRED)` keyword is omitted, then tables for each combination of +variable preceding `WITH` against variable following `WITH` are +generated. +
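The three pairing rules for `PAIRS` described above can be summarised in a short Python sketch. It only illustrates which variable pairs are compared; the function name `t_test_pairs` is hypothetical, and reading "all combinations" as every unordered pair of distinct variables is an assumption.

```python
from itertools import combinations, product


def t_test_pairs(before, after=None, paired=False):
    """List the variable pairs that a PAIRS specification would compare."""
    if after is None:
        # No WITH: every unordered pair of the listed variables (assumed).
        return list(combinations(before, 2))
    if paired:
        # WITH ... (PAIRED): first with first, second with second, and so on.
        if len(before) != len(after):
            raise ValueError("(PAIRED) needs the same number of variables on each side")
        return list(zip(before, after))
    # WITH but no (PAIRED): each variable before WITH against each one after.
    return list(product(before, after))
```

For instance, `t_test_pairs(["a", "b"], ["x", "y"], paired=True)` yields two pairs, while omitting `paired=True` yields four.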