- [AGGREGATE](commands/data/aggregate.md)
- [AUTORECODE](commands/data/autorecode.md)
- [COMPUTE](commands/data/compute.md)
+ - [FLIP](commands/data/flip.md)
+ - [IF](commands/data/if.md)
+ - [RECODE](commands/data/recode.md)
+ - [SORT CASES](commands/data/sort-cases.md)
# Developer Documentation
the [`LAG`](../../language/expressions/functions/miscellaneous.md)
function may not be used.
-## Examples
+## Example
The dataset `physiology.sav` contains the height and weight of
persons. For some purposes, neither height nor weight alone is of
--- /dev/null
+# FLIP
+
+```
+FLIP /VARIABLES=VAR_LIST /NEWNAMES=VAR_NAME.
+```
+
+`FLIP` transposes rows and columns in the active dataset. It causes
+cases to be swapped with variables, and vice versa.
+
+All variables in the transposed active dataset are numeric. String
+variables take on the system-missing value in the transposed file.
+
+No subcommands are required. If specified, the `VARIABLES`
+subcommand selects variables to be transformed into cases, and variables
+not specified are discarded. If the `VARIABLES` subcommand is omitted,
+all variables are selected for transposition.
+
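As a rough illustration of the transposition described above, the following Python sketch models a dataset as a list of rows. The function name `flip`, its parameters, and the use of `None` for the system-missing value are assumptions made for this example; it does not model `NEWNAMES`, and the order of the output when `VARIABLES` is given is likewise an assumption.

```python
def flip(var_names, cases, variables=None):
    """Transpose cases (rows) and variables (columns).

    var_names: names of the columns, in dictionary order.
    cases: list of rows, each a list with one value per variable.
    variables: optional subset to keep, like the VARIABLES subcommand.
    Returns one (old_variable_name, values) pair per new case.
    """
    # Assumption: with VARIABLES, output follows the order given there.
    selected = list(var_names) if variables is None else list(variables)
    flipped = []
    for name in selected:
        col = var_names.index(name)
        # Strings cannot be carried over: they become system-missing (None).
        values = [row[col] if isinstance(row[col], (int, float)) else None
                  for row in cases]
        flipped.append((name, values))
    return flipped


names = ["heading", "v1", "v2"]
cases = [["date-of-birth", 1970, 1989],
         ["sex", 1, 0],
         ["score", 10, 10]]
kept = flip(names, cases, variables=["v1", "v2"])  # "heading" is discarded
everything = flip(names, cases)                    # all variables selected
```

Each old variable becomes one new case, and each old case becomes one column of values in the result.
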
+The variable specified by `NEWNAMES`, which must be a string
+variable, is used to give names to the variables created by `FLIP`.
+Only the first 8 characters of the variable are used. If `NEWNAMES`
+is not specified, then the default is a variable named `CASE_LBL`, if it
+exists. If it does not, then the variables created by `FLIP` are named
+`VAR000` through `VAR999`, then `VAR1000`, `VAR1001`, and so on.
+
+When a `NEWNAMES` variable is available, the names must be
+canonicalized before becoming variable names. Invalid characters are
+replaced by letter `V` in the first position, or by `_` in subsequent
+positions. If the name thus generated is not unique, then numeric
+extensions are added, starting with 1, until a unique name is found or
+there are no remaining possibilities. If the latter occurs then the
+`FLIP` operation aborts.
+
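The naming rules above can be sketched in Python as follows. This is only an approximation: the sets of valid characters, the case-insensitive uniqueness check, the way the numeric extension is appended, and the `limit` on attempts are all assumptions, and length limits are not modeled. The helper names `default_name`, `canonicalize`, and `unique_name` are hypothetical.

```python
import string

# Assumption: a simplified set of characters allowed in variable names.
FIRST_OK = set(string.ascii_letters + "@#$")
REST_OK = FIRST_OK | set(string.digits + "._")


def default_name(i):
    """Default name for the i-th new variable, counting from 0."""
    return f"VAR{i:03d}"


def canonicalize(raw):
    """Replace invalid characters: 'V' in first position, '_' elsewhere."""
    return "".join(
        ch if ch in (FIRST_OK if i == 0 else REST_OK)
        else ("V" if i == 0 else "_")
        for i, ch in enumerate(raw.strip())
    )


def unique_name(name, taken, limit=1000):
    """Append 1, 2, ... until the name is unused; give up after `limit`."""
    candidates = [name] + [f"{name}{n}" for n in range(1, limit + 1)]
    for candidate in candidates:
        if candidate.lower() not in taken:
            taken.add(candidate.lower())
            return candidate
    # Mirrors the documented abort when no possibilities remain.
    raise ValueError(f"no unique name available for {name!r}")


taken = set()
names = [unique_name(canonicalize(s), taken)
         for s in ["date-of-birth", "sex", "sex", "1st"]]
```

Here the second `sex` collides with the first and receives the extension `1`, while `1st` has its invalid leading digit replaced by `V`.
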
+The resultant dictionary contains a `CASE_LBL` variable, a string
+variable of width 8, which stores the names of the variables in the
+dictionary before the transposition. Variable names longer than 8
+characters are truncated. If `FLIP` is called again on this dataset,
+the `CASE_LBL` variable can be passed to the `NEWNAMES` subcommand to
+recreate the original variable names.
+
+`FLIP` honors `N OF CASES`. It ignores `TEMPORARY`, so that
+"temporary" transformations become permanent.
+
+## Example
+
+In the syntax below, data has been entered using [`DATA
+LIST`](../../commands/data-io/data-list.md) such that the first
+variable in the dataset is a string variable containing a description
+of the other data for the case. This is not a convenient
+arrangement for statistical analysis, and ideally the data would have
+been arranged differently from the start. However, data is often
+provided by a third party, and you have no control over its form.
+Fortunately, we can use `FLIP` to exchange the variables and cases in
+the active dataset.
+
+```
+data list notable list /heading (a16) v1 v2 v3 v4 v5 v6.
+begin data.
+date-of-birth 1970 1989 2001 1966 1976 1982
+sex 1 0 0 1 0 1
+score 10 10 9 3 8 9
+end data.
+
+echo 'Before FLIP:'.
+display variables.
+list.
+
+flip /variables = all /newnames = heading.
+
+echo 'After FLIP:'.
+display variables.
+list.
+```
+
+As you can see in the results below, before the `FLIP` command has run
+there are seven variables (six containing data and one for the
+heading) and three cases. Afterwards there are four variables (one
+per case, plus the CASE_LBL variable) and six cases. You can delete
+the CASE_LBL variable (see [DELETE
+VARIABLES](../../commands/variables/delete-variables.md)) if you don't
+need it.
+
+```
+Before FLIP:
+
+ Variables
+┌───────┬────────┬────────────┬────────────┐
+│Name │Position│Print Format│Write Format│
+├───────┼────────┼────────────┼────────────┤
+│heading│ 1│A16 │A16 │
+│v1 │ 2│F8.2 │F8.2 │
+│v2 │ 3│F8.2 │F8.2 │
+│v3 │ 4│F8.2 │F8.2 │
+│v4 │ 5│F8.2 │F8.2 │
+│v5 │ 6│F8.2 │F8.2 │
+│v6 │ 7│F8.2 │F8.2 │
+└───────┴────────┴────────────┴────────────┘
+
+ Data List
+┌─────────────┬───────┬───────┬───────┬───────┬───────┬───────┐
+│ heading │ v1 │ v2 │ v3 │ v4 │ v5 │ v6 │
+├─────────────┼───────┼───────┼───────┼───────┼───────┼───────┤
+│date-of-birth│1970.00│1989.00│2001.00│1966.00│1976.00│1982.00│
+│sex │ 1.00│ .00│ .00│ 1.00│ .00│ 1.00│
+│score │ 10.00│ 10.00│ 9.00│ 3.00│ 8.00│ 9.00│
+└─────────────┴───────┴───────┴───────┴───────┴───────┴───────┘
+
+After FLIP:
+
+ Variables
+┌─────────────┬────────┬────────────┬────────────┐
+│Name │Position│Print Format│Write Format│
+├─────────────┼────────┼────────────┼────────────┤
+│CASE_LBL │ 1│A8 │A8 │
+│date_of_birth│ 2│F8.2 │F8.2 │
+│sex │ 3│F8.2 │F8.2 │
+│score │ 4│F8.2 │F8.2 │
+└─────────────┴────────┴────────────┴────────────┘
+
+ Data List
+┌────────┬─────────────┬────┬─────┐
+│CASE_LBL│date_of_birth│ sex│score│
+├────────┼─────────────┼────┼─────┤
+│v1 │ 1970.00│1.00│10.00│
+│v2 │ 1989.00│ .00│10.00│
+│v3 │ 2001.00│ .00│ 9.00│
+│v4 │ 1966.00│1.00│ 3.00│
+│v5 │ 1976.00│ .00│ 8.00│
+│v6 │ 1982.00│1.00│ 9.00│
+└────────┴─────────────┴────┴─────┘
+```
--- /dev/null
+# IF
+
+```
+ IF CONDITION VARIABLE=EXPRESSION.
+or
+ IF CONDITION vector(INDEX)=EXPRESSION.
+```
+
+The `IF` transformation evaluates a test expression and, if it is
+true, assigns the value of a target expression to a target variable.
+
+Following the `IF` keyword, specify a boolean-valued test
+[expression](../../language/expressions/index.md). The test expression
+is evaluated for each case:
+
+- If it is true, then the target expression is evaluated and assigned
+ to the specified variable.
+
+- If it is false or missing, nothing is done.
+
+Numeric and string variables may be assigned. When a string
+expression's width differs from the target variable's width, the
+string result is truncated or padded with spaces on the right as
+necessary. The expression and variable types must match.
+
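The assignment rules above can be modeled with a short Python sketch. The function `apply_if`, its parameters, and the use of `None` for a missing test result are assumptions made for illustration only.

```python
def apply_if(case, condition, target, expression, width=None):
    """Assign expression(case) to case[target] when condition(case) is true.

    A condition result of None stands in for "missing".
    width: set it for string targets to pad or truncate the result.
    """
    if condition(case) is not True:
        return case  # false or missing: nothing is done
    value = expression(case)
    if width is not None:
        value = value[:width].ljust(width)  # truncate or pad on the right
    case[target] = value
    return case


def is_adult(c):
    # A missing input gives a missing (None) test result.
    return None if c["age"] is None else c["age"] >= 18


def label(c):
    return "adult"


adult = apply_if({"age": 30, "grp": ""}, is_adult, "grp", label, width=8)
minor = apply_if({"age": 12, "grp": ""}, is_adult, "grp", label, width=8)
unknown = apply_if({"age": None, "grp": ""}, is_adult, "grp", label, width=8)
short = apply_if({"age": 30, "grp": ""}, is_adult, "grp", label, width=3)
```

Only the first and last cases are modified; the second fails the test and the third has a missing test result, so both keep their original `grp` value.
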
+The target variable may be specified as an element of a
+[vector](../../commands/variables/vector.md). In this case, a vector
+index expression must be specified in parentheses following the vector
+name. The index expression must evaluate to a numeric value that,
+after rounding down to the nearest integer, is a valid index for the
+named vector.
+
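A minimal Python sketch of the index handling described above, assuming that vector indices start at 1. The function name `vector_target` and the choice to raise `IndexError` for an invalid index are illustrative, not a description of the actual error handling.

```python
import math


def vector_target(vector, index):
    """Return the variable name selected by a (possibly fractional) index."""
    position = math.floor(index)  # round down to the nearest integer
    if not 1 <= position <= len(vector):
        raise IndexError(f"{index} is not a valid index for this vector")
    return vector[position - 1]  # assumption: indices are 1-based


v = ["v1", "v2", "v3"]
second = vector_target(v, 2.9)  # 2.9 rounds down to 2
first = vector_target(v, 1)
```
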
+Using `IF` to assign to a variable specified on
+[`LEAVE`](../../commands/variables/leave.md) resets the variable's
+left state. Therefore, use `LEAVE` after `IF`, not before.
+
+When `IF` follows `TEMPORARY`, the
+[`LAG`](../../language/expressions/functions/miscellaneous.md) function may not
+be used.
+
--- /dev/null
+# RECODE
+
+The `RECODE` command is used to transform existing values into other,
+user-specified values. The general form is:
+
+```
+RECODE SRC_VARS
+ (SRC_VALUE SRC_VALUE ... = DEST_VALUE)
+ (SRC_VALUE SRC_VALUE ... = DEST_VALUE)
+ (SRC_VALUE SRC_VALUE ... = DEST_VALUE) ...
+ [INTO DEST_VARS].
+```
+
+Following the `RECODE` keyword itself comes `SRC_VARS`, a list of
+variables whose values are to be transformed. These variables must
+all be string variables or all be numeric variables.
+
+After the list of source variables, there should be one or more
+"mappings". Each mapping is enclosed in parentheses, and contains the
+source values and a destination value separated by a single `=`. The
+source values are used to specify the values in the dataset which need
+to change, and the destination value specifies the new value to which
+they should be changed. Each SRC_VALUE may take one of the following
+forms:
+
+* `NUMBER` (numeric source variables only)
+ Matches a number.
+
+* `STRING` (string source variables only)
+ Matches a string enclosed in single or double quotes.
+
+* `NUM1 THRU NUM2` (numeric source variables only)
+ Matches all values in the range between `NUM1` and `NUM2`, including
+ both endpoints of the range. `NUM1` should be less than `NUM2`.
+ Open-ended ranges may be specified using `LO` or `LOWEST` for `NUM1`
+ or `HI` or `HIGHEST` for `NUM2`.
+
+* `MISSING`
+ Matches system-missing and user-missing values.
+
+* `SYSMIS` (numeric source variables only)
+ Matches system-missing values.
+
+* `ELSE`
+ Matches any values that are not matched by any other `SRC_VALUE`.
+ This should appear only as the last mapping in the command.
+
+After the source values comes an `=` and then the `DEST_VALUE`,
+which may take any of the following forms:
+
+* `NUMBER` (numeric destination variables only)
+ A literal numeric value to which the source values should be
+ changed.
+
+* `STRING` (string destination variables only)
+ A literal string value (enclosed in quotation marks) to which the
+ source values should be changed. This implies the destination
+ variable must be a string variable.
+
+* `SYSMIS` (numeric destination variables only)
+ The keyword `SYSMIS` changes the value to the system missing value.
+ This implies the destination variable must be numeric.
+
+* `COPY`
+ The special keyword `COPY` means that the source value should not be
+ modified, but copied directly to the destination value. This is
+ meaningful only if `INTO DEST_VARS` is specified.
+
+Mappings are considered from left to right. Therefore, if a value is
+matched by a `SRC_VALUE` from more than one mapping, the first
+(leftmost) mapping which matches is considered. Any subsequent
+matches are ignored.
+
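The left-to-right matching rule can be sketched in Python as below. The names `recode`, `thru`, `equals`, and `otherwise` are hypothetical helpers, `None` stands in for the system-missing value, and returning unmatched values unchanged is an assumption about in-place recoding.

```python
COPY = object()   # stands in for the COPY keyword
SYSMIS = None     # stands in for the system-missing value
HIGHEST = float("inf")


def thru(low, high):
    """Matcher for `low THRU high`, including both endpoints."""
    return lambda v: v is not None and low <= v <= high


def equals(*values):
    """Matcher for one or more literal source values."""
    return lambda v: v in values


def otherwise(v):
    """Matcher for ELSE."""
    return True


def recode(value, mappings):
    """Apply the first (leftmost) mapping whose source matches."""
    for matches, dest in mappings:
        if matches(value):
            return value if dest is COPY else dest
    return value  # assumption: unmatched values are left unchanged


mappings = [(equals(0), 99), (thru(1, 10), COPY),
            (thru(1000, HIGHEST), SYSMIS), (otherwise, 999)]
results = [recode(v, mappings) for v in [0, 5, 1500, 50]]

# 5 is matched by both mappings; only the leftmost one is applied.
overlap = recode(5, [(thru(1, 10), 1), (equals(5), 2)])
```
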
+The clause `INTO DEST_VARS` is optional. The behaviour of the command
+is slightly different depending on whether it appears or not:
+
+* Without `INTO DEST_VARS`, values are recoded "in place". This
+ means that the recoded values are written back to the source variables
+ from which the original values came. In this case, the DEST_VALUE
+ for every mapping must imply a value which has the same type as the
+ SRC_VALUE. For example, if the source value is a string value, it is
+ not permissible for DEST_VALUE to be `SYSMIS` or another form which
+ implies a numeric result. It is also not permissible for DEST_VALUE
+ to be longer than the width of the source variable.
+
+ The following example recodes two numeric variables `x` and `y` in
+ place. 0 becomes 99, the values 1 to 10 inclusive are unchanged,
+ values 1000 and higher are recoded to the system-missing value, and
+ all other values are changed to 999:
+
+ ```
+ RECODE x y
+ (0 = 99)
+ (1 THRU 10 = COPY)
+ (1000 THRU HIGHEST = SYSMIS)
+ (ELSE = 999).
+ ```
+
+* With `INTO DEST_VARS`, recoded values are written into the variables
+ specified in `DEST_VARS`, which must therefore contain a list of
+ valid variable names. The number of variables in `DEST_VARS` must
+ be the same as the number of variables in `SRC_VARS` and the
+ respective order of the variables in `DEST_VARS` corresponds to the
+ order of `SRC_VARS`. That is to say, the recoded value whose
+ original value came from the Nth variable in `SRC_VARS` is placed
+ into the Nth variable in `DEST_VARS`. The source variables are
+ unchanged. If any mapping implies a string as its destination
+ value, then the respective destination variable must already exist,
+ or have been declared using `STRING` or another transformation.
+ Numeric variables, however, are automatically created if they don't
+ already exist.
+
+ The following example deals with two source variables, `a` and `b`
+ which contain string values. Hence there are two destination
+ variables `v1` and `v2`. Any cases where `a` or `b` contain the
+ values `apple`, `pear` or `pomegranate` result in `v1` or `v2` being
+ filled with the string `fruit` whilst cases with `tomato`, `lettuce`
+ or `carrot` result in `vegetable`. Other values produce the result
+ `unknown`:
+
+ ```
+ STRING v1 (A20).
+ STRING v2 (A20).
+
+ RECODE a b
+ ("apple" "pear" "pomegranate" = "fruit")
+ ("tomato" "lettuce" "carrot" = "vegetable")
+ (ELSE = "unknown")
+ INTO v1 v2.
+ ```
+
+There is one special mapping, not mentioned above. If the source
+variable is a string variable then a mapping may be specified as
+`(CONVERT)`. This mapping, if it appears, must be the last mapping
+given, and the `INTO DEST_VARS` clause must also be given and must not
+refer to a string variable. `CONVERT` causes a number specified as a
+string to be converted to a numeric value. For example, it converts
+the string `"3"` into the numeric value 3 (note that it does not
+convert `three` into 3). If the string cannot be parsed as a number,
+then the system-missing value is assigned instead. In the following
+example, cases where the value of `x` (a string variable) is the empty
+string, are recoded to 999 and all others are converted to the numeric
+equivalent of the input value. The results are placed into the
+numeric variable `y`:
+
+```
+RECODE x ("" = 999) (CONVERT) INTO y.
+```
+
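The effect of `(CONVERT)` can be approximated in Python as shown below. This is only a sketch: Python's `float` accepts some strings (such as `inf`) that the real parser may treat differently, and the function name `recode_convert` and the use of `None` for the system-missing value are assumptions.

```python
def recode_convert(value, mappings):
    """Apply explicit mappings first, then fall back to numeric conversion."""
    for sources, dest in mappings:
        if value in sources:
            return dest
    try:
        return float(value)   # "3" becomes 3.0
    except ValueError:
        return None           # unparsable strings become system-missing


mappings = [([""], 999)]
converted = [recode_convert(s, mappings) for s in ["", "3", "three"]]
```
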
+It is possible to specify multiple recodings on a single command.
+Introduce additional recodings with a slash (`/`) to separate them from
+the previous recodings:
+
+```
+RECODE
+ a (2 = 22) (ELSE = 99)
+ /b (1 = 3) INTO z.
+```
+
+Here we have two recodings. The first affects the source variable `a`
+and recodes in-place the value 2 into 22 and all other values to 99.
+The second recoding copies the values of `b` into the variable `z`,
+changing any instances of 1 into 3.
+
--- /dev/null
+# SORT CASES
+
+```
+SORT CASES BY VAR_LIST[({D|A})] [ VAR_LIST[({D|A})] ] ...
+```
+
+`SORT CASES` sorts the active dataset by the values of one or more
+variables.
+
+Specify `BY` and a list of variables to sort by. By default,
+variables are sorted in ascending order. To override sort order,
+specify `(D)` or `(DOWN)` after a list of variables to get descending
+order, or `(A)` or `(UP)` for ascending order. These apply to all the
+listed variables up until the preceding `(A)`, `(D)`, `(UP)` or
+`(DOWN)`.
+
+`SORT CASES` performs a stable sort, meaning that records with equal
+values of the sort variables have the same relative order before and
+after sorting. Thus, re-sorting an already sorted file does not
+affect the ordering of cases.
+
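To illustrate what a stable sort with mixed directions means, here is a small Python sketch. It relies on Python's built-in sort being stable and is not a description of the algorithm `SORT CASES` uses internally; the function `sort_cases` and the tiny dataset are made up for this example.

```python
def sort_cases(cases, keys):
    """Sort by several variables; keys is a list of (variable, 'A' or 'D')."""
    result = list(cases)
    # Sort by the least significant key first. Because each pass is
    # stable, earlier passes survive as tie-breakers for later ones.
    for var, direction in reversed(keys):
        result.sort(key=lambda c: c[var], reverse=(direction == "D"))
    return result


cases = [
    {"id": 1, "sex": 0, "temperature": 33.61},
    {"id": 2, "sex": 1, "temperature": 35.15},
    {"id": 3, "sex": 0, "temperature": 32.59},
    {"id": 4, "sex": 1, "temperature": 34.56},
    {"id": 5, "sex": 1, "temperature": 34.56},
]
keys = [("sex", "D"), ("temperature", "A")]
once = sort_cases(cases, keys)
twice = sort_cases(once, keys)
order = [c["id"] for c in once]
```

Cases 4 and 5 have equal sort keys, so they keep their original relative order, and sorting the already sorted list a second time changes nothing.
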
+`SORT CASES` is a procedure. It causes the data to be read.
+
+`SORT CASES` attempts to sort the entire active dataset in main
+memory. If workspace is exhausted, it falls back to a merge sort
+algorithm which creates numerous temporary files.
+
+`SORT CASES` may not be specified following `TEMPORARY`.
+
+## Example
+
+In the syntax below, the data from the file `physiology.sav` is sorted
+by two variables: sex in descending order and temperature in
+ascending order.
+
+```
+get file='physiology.sav'.
+sort cases by sex (D) temperature(A).
+list.
+```
+
+In the output below, you can see that all the cases with a sex of
+`1` (female) appear before those with a sex of `0` (male). This is
+because they have been sorted in descending order. Within each sex,
+the data is sorted on the temperature variable, this time in ascending
+order.
+
+```
+ Data List
+┌───┬──────┬──────┬───────────┐
+│sex│height│weight│temperature│
+├───┼──────┼──────┼───────────┤
+│ 1│ 1606│ 56.1│ 34.56│
+│ 1│ 179│ 56.3│ 35.15│
+│ 1│ 1609│ 55.4│ 35.46│
+│ 1│ 1606│ 56.0│ 36.06│
+│ 1│ 1607│ 56.3│ 36.26│
+│ 1│ 1604│ 56.0│ 36.57│
+│ 1│ 1604│ 56.6│ 36.81│
+│ 1│ 1606│ 56.3│ 36.88│
+│ 1│ 1604│ 57.8│ 37.32│
+│ 1│ 1598│ 55.6│ 37.37│
+│ 1│ 1607│ 55.9│ 37.84│
+│ 1│ 1605│ 54.5│ 37.86│
+│ 1│ 1603│ 56.1│ 38.80│
+│ 1│ 1604│ 58.1│ 38.85│
+│ 1│ 1605│ 57.7│ 38.98│
+│ 1│ 1709│ 55.6│ 39.45│
+│ 1│ 1604│ -55.6│ 39.72│
+│ 1│ 1601│ 55.9│ 39.90│
+│ 0│ 1799│ 90.3│ 32.59│
+│ 0│ 1799│ 89.0│ 33.61│
+│ 0│ 1799│ 90.6│ 34.04│
+│ 0│ 1801│ 90.5│ 34.42│
+│ 0│ 1802│ 87.7│ 35.03│
+│ 0│ 1793│ 90.1│ 35.11│
+│ 0│ 1801│ 92.1│ 35.98│
+│ 0│ 1800│ 89.5│ 36.10│
+│ 0│ 1645│ 92.1│ 36.68│
+│ 0│ 1698│ 90.2│ 36.94│
+│ 0│ 1800│ 89.6│ 37.02│
+│ 0│ 1800│ 88.9│ 37.03│
+│ 0│ 1801│ 88.9│ 37.12│
+│ 0│ 1799│ 90.4│ 37.33│
+│ 0│ 1903│ 91.5│ 37.52│
+│ 0│ 1799│ 90.9│ 37.53│
+│ 0│ 1800│ 91.0│ 37.60│
+│ 0│ 1799│ 90.4│ 37.68│
+│ 0│ 1801│ 91.7│ 38.98│
+│ 0│ 1801│ 90.9│ 39.03│
+│ 0│ 1799│ 89.3│ 39.77│
+│ 0│ 1884│ 88.6│ 39.97│
+└───┴──────┴──────┴───────────┘
+```
+
+`SORT CASES` affects only the active dataset. It does not have any
+effect upon the `physiology.sav` file itself. For that, you would
+have to rewrite the file using the `SAVE` command.