format](../../language/datasets/formats/index.md). The default is
initially `F8.2`.
-* `EPOCH`
+* <a name="epoch">`EPOCH`</a>
Specifies the range of years used when a 2-digit year is read from a
data file or used in a [date construction
expression](../../language/expressions/functions/time-and-date.md#constructing-dates).
`AUTOMATIC` (the default) is specified, then the epoch begins 69
years before the current date.
-* `RIB`
+* <a name="rib">`RIB`</a>
PSPP extension to set the byte ordering (endianness) used for
reading data in [`IB` or `PIB`
format](../../language/datasets/formats/binary-and-hex.md#ib-and-pib-formats). In
* `UNDEFINED`
Currently not used.
-* `FUZZBITS`
+* <a name="fuzzbits">`FUZZBITS`</a>
The maximum number of bits of errors in the least-significant places
to accept for rounding up a value that is almost halfway between two
possibilities for rounding with the
format](../../language/datasets/formats/index.md) to be specified. The
default is `F8.2`.
-* <a name+"leadzero">`LEADZERO`</a>
+* <a name="leadzero">`LEADZERO`</a>
Controls whether numbers with magnitude less than one are displayed
with a zero before the decimal point. For example, with `SET
LEADZERO=OFF`, which is the default, one-half is shown as 0.5, and
otherwise, it formats it in standard notation. The default is
0.0001. Set a value of 0 to disable scientific notation.
-* `WIB`
+* <a name="wib">`WIB`</a>
PSPP extension to set the byte ordering (endianness) used for
writing data in [`IB` or `PIB`
format](../../language/datasets/formats/binary-and-hex.md#ib-and-pib-formats).
lowercase), or by a letter followed by a sign. A single space may
follow the letter or the sign or both.
- On fixed-format `DATA LIST` (*note DATA LIST FIXED::) and in a few
-other contexts, decimals are implied when the field does not contain a
-decimal point. In `F6.5` format, for example, the field `314159` is taken
-as the value 3.14159 with implied decimals. Decimals are never implied
-if an explicit decimal point is present or if scientific notation is
-used.
+ On [fixed-format `DATA
+LIST`](../../../commands/data-io/data-list.md#data-list-fixed) and in
+a few other contexts, decimals are implied when the field does not
+contain a decimal point. In `F6.5` format, for example, the field
+`314159` is taken as the value 3.14159 with implied decimals.
+Decimals are never implied if an explicit decimal point is present or
+if scientific notation is used.
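
The implied-decimals rule can be illustrated with a small sketch. The helper name `read_implied_decimals` is hypothetical, and the check for scientific notation is simplified to looking for an `e`/`E` exponent letter; it is not PSPP's actual parser.

```python
from decimal import Decimal


def read_implied_decimals(field: str, d: int) -> Decimal:
    """Parse a numeric field, applying d implied decimals when allowed.

    Hypothetical helper: only '.' and an 'e'/'E' exponent are recognized
    as reasons to skip implied decimals.
    """
    text = field.strip()
    # An explicit decimal point or scientific notation disables implied decimals.
    if "." in text or "e" in text.lower():
        return Decimal(text)
    # Otherwise shift the decimal point d places to the left.
    return Decimal(text).scaleb(-d)


print(read_implied_decimals("314159", 5))  # 3.14159
print(read_implied_decimals("3.5", 5))     # 3.5
```

Using `Decimal` rather than `float` keeps the shifted value exact, which makes the effect of the rule easier to see.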
`E` and `F` formats accept the basic syntax already described. The other
formats allow some additional variations:
not fit at all without it. Scientific notation with `$` or `%` is
preferred to ordinary decimal notation without it.
-- Except in scientific notation, a decimal point is included only
- when it is followed by a digit. If the integer part of the number
- being output is 0, and a decimal point is included, then PSPP
- ordinarily drops the zero before the decimal point. However, in
- `F`, `COMMA`, or `DOT` formats, PSPP keeps the zero if `SET
- LEADZERO` is set to `ON` (*note SET LEADZERO::).
+- Except in scientific notation, a decimal point is included only when
+ it is followed by a digit. If the integer part of the number being
+ output is 0, and a decimal point is included, then PSPP ordinarily
+ drops the zero before the decimal point. However, in `F`, `COMMA`,
+ or `DOT` formats, PSPP keeps the zero if [`SET
+ LEADZERO`](../../../commands/utilities/set.md#leadzero) is set to
+ `ON`.
In scientific notation, the number always includes a decimal point,
even if it is not followed by a digit.
These are integer binary formats. `IB` reads and writes 2's
complement binary integers, and `PIB` reads and writes unsigned binary
integers. The byte ordering is by default the host machine's, but
-`SET RIB` may be used to select a specific byte ordering for reading
-(*note SET RIB::) and `SET WIB`, similarly, for writing (*note SET
-WIB::).
+[`SET RIB`](../../../commands/utilities/set.md#rib) may be used to
+select a specific byte ordering for reading and [`SET
+WIB`](../../../commands/utilities/set.md#wib), similarly, for writing.
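
As a rough illustration of the difference between the two formats and of the effect of byte ordering, the following sketch decodes the same bytes both ways. The function names are hypothetical, and implied decimal places are ignored.

```python
def read_ib(data: bytes, byteorder: str) -> int:
    """Decode a 2's complement binary integer (IB-style field)."""
    return int.from_bytes(data, byteorder=byteorder, signed=True)


def read_pib(data: bytes, byteorder: str) -> int:
    """Decode an unsigned binary integer (PIB-style field)."""
    return int.from_bytes(data, byteorder=byteorder, signed=False)


field = b"\xff\xfe"
print(read_ib(field, "big"))      # -2
print(read_pib(field, "big"))     # 65534
print(read_ib(field, "little"))   # -257
```

The `byteorder` argument plays the role that the byte ordering selected for reading plays in the text above.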
The maximum field width is 8. Decimal places may range from 0 up to
the number of decimal digits in the largest value representable in the
## `RB` Format
-This is a binary format for real numbers. By default it reads and
-writes the host machine's floating-point format, but `SET RRB` may be
-used to select an alternate floating-point format for reading (*note
-SET RRB::) and `SET WRB`, similarly, for writing (*note SET WRB::).
+This is a binary format for real numbers. It reads and writes the
+host machine's floating-point format. The byte ordering is by default
+the host machine's, but [`SET
+RIB`](../../../commands/utilities/set.md#rib) may be used to select a
+specific byte ordering for reading and [`SET
+WIB`](../../../commands/utilities/set.md#wib), similarly, for writing.
The field width should be 4, for 32-bit floating-point numbers, or 8,
for 64-bit floating-point numbers. Other field widths do not produce
hexadecimal formats, the default output format is an easier-to-read
decimal format.
- Every variable has two output formats, called its "print format" and
-"write format". Print formats are used in most output contexts; write
-formats are used only by `WRITE` (*note WRITE::). Newly created
-variables have identical print and write formats, and `FORMATS`, the
-most commonly used command for changing formats (*note FORMATS::), sets
-both of them to the same value as well. Thus, most of the time, the
-distinction between print and write formats is unimportant.
+ Every variable has two output formats, called its "print format"
+and "write format". Print formats are used in most output contexts;
+only the [`WRITE`](../../../commands/data-io/write.md) command uses
+write formats. Newly created variables have identical print and write
+formats, and [`FORMATS`](../../../commands/variables/formats.md), the
+most commonly used command for changing formats, sets both of them to
+the same value as well. This means that the distinction between print
+and write formats is usually unimportant.
Input and output formats are specified to PSPP with a "format
specification" of the form `TypeW` or `TypeW.D`, where `Type` is one
input field other than spaces, the digit characters above, and `.`
causes the field to be read as system-missing.
- The decimal point character for input and output is always `.`, even
-if the decimal point character is a comma (*note SET DECIMAL::).
+ The decimal point character for input and output is always `.`,
+even if the decimal point character is a comma (see [`SET
+DECIMAL`](../../../commands/utilities/set.md#decimal)).
Nonzero, negative values output in `Z` format are marked as
negative even when no nonzero digits are output. For example, -0.2 is
In input, both of these formats, plus Roman numerals, are accepted.
* `yyyy`
- Year. In output, `DATETIME` and `YMDHMS` always produce 4-digit years;
- other formats can produce a 2- or 4-digit year. The century
- assumed for 2-digit years depends on the `EPOCH` setting (*note SET
- EPOCH::). In output, a year outside the epoch causes the whole
- field to be filled with asterisks (`*`).
+ Year. In output, `DATETIME` and `YMDHMS` always produce 4-digit
+ years; other formats can produce a 2- or 4-digit year. The century
+ assumed for 2-digit years depends on the
+ [`EPOCH`](../../../commands/utilities/set.md#epoch) setting. In
+ output, a year outside the epoch causes the whole field to be filled
+ with asterisks (`*`).
* `jjj`
Day of year (Julian day), from 1 to 366. This is exactly three
* `SS.ss`
Seconds within minute, from 0 to 59. The integer part is output as
exactly two digits. On output, seconds and fractional seconds may
- or may not be included, depending on field width and decimal
- places. On input, seconds and fractional seconds are optional.
- The `DECIMAL` setting controls the character accepted and displayed
- as the decimal point (*note SET DECIMAL::).
+ or may not be included, depending on field width and decimal places.
+ On input, seconds and fractional seconds are optional. The
+ `DECIMAL` setting controls the character accepted and displayed as
+ the decimal point (see [`SET
+ DECIMAL`](../../../commands/utilities/set.md#decimal)).
For output, the date and time formats use the delimiters indicated in
the table. For input, date components may be separated by spaces or by
"dictionary", and one or more "cases", each of which has one value for
each variable.
- At any given time PSPP has exactly one distinguished dataset, called
-the "active dataset". Most PSPP commands work only with the active
-dataset. In addition to the active dataset, PSPP also supports any
-number of additional open datasets. The `DATASET` commands can choose a
-new active dataset from among those that are open, as well as create and
-destroy datasets (*note DATASET::).
+ At any given time PSPP has exactly one distinguished dataset,
+called the "active dataset". Most PSPP commands work only with the
+active dataset. In addition to the active dataset, PSPP also supports
+any number of additional open datasets. The [`DATASET`
+commands](../../commands/data-io/dataset.md) can choose a new active
+dataset from among those that are open, as well as create and destroy
+datasets.
system-missing value or to blanks, depending on type.
However, sometimes it's useful to have a variable that keeps its
-value between cases. You can do this with `LEAVE` (*note LEAVE::), or
-you can use a "scratch variable". Scratch variables are variables whose
-names begin with an octothorpe (`#`).
+value between cases. You can do this with
+[`LEAVE`](../../commands/variables/leave.md), or you can use a
+"scratch variable". Scratch variables are variables whose names begin
+with an octothorpe (`#`).
Scratch variables have the same properties as variables left with
-`LEAVE`: they retain their values between cases, and for the first case
-they are initialized to 0 or blanks. They have the additional property
-that they are deleted before the execution of any procedure. For this
-reason, scratch variables can't be used for analysis. To use a scratch
-variable in an analysis, use `COMPUTE` (*note COMPUTE::) to copy its
-value into an ordinary variable, then use that ordinary variable in the
-analysis.
+`LEAVE`: they retain their values between cases, and for the first
+case they are initialized to 0 or blanks. They have the additional
+property that they are deleted before the execution of any procedure.
+For this reason, scratch variables can't be used for analysis. To use
+a scratch variable in an analysis, use
+[`COMPUTE`](../../commands/data/compute.md) to copy its value into an
+ordinary variable, then use that ordinary variable in the analysis.
Halves are rounded away from zero, as are values that fall short of
halves by less than `FUZZBITS` of errors in the least-significant
bits of X. If `FUZZBITS` is not specified then the default is taken
- from `SET FUZZBITS` (*note SET FUZZBITS::), which is 6 unless
- overridden.
+ from [`SET FUZZBITS`](../../../commands/utilities/set.md#fuzzbits),
+ which is 6 unless overridden.
* `SQRT(X)`
Takes the square root of `X`. If `X` is negative, the result is
`X`. Values that fall short of a multiple of `MULT` by less than
`FUZZBITS` of errors in the least-significant bits of `X` are
rounded away from zero. If `FUZZBITS` is not specified then the
- default is taken from `SET FUZZBITS` (*note SET FUZZBITS::), which
- is 6 unless overridden.
+ default is taken from [`SET
+ FUZZBITS`](../../../commands/utilities/set.md#fuzzbits), which is 6
+ unless overridden.
They take a set of numeric arguments or a set of string arguments, and
produce Boolean results.
- String comparisons are performed according to the rules given in
-*note Relational Operators::. User-missing string values are treated as
-valid values.
+ String comparisons are performed according to the rules given for
+[Relational Operators](../operators.md#relational-operators).
+User-missing string values are treated as valid values.
* `ANY(VALUE, SET [, SET]...)`
Returns true if `VALUE` is equal to any of the `SET` values, and false
Statistical functions compute descriptive statistics on a list of
values. Some statistics can be computed on numeric or string values;
other can only be computed on numeric values. Their results have the
-same type as their arguments. The current case's weighting factor
-(*note WEIGHT::) has no effect on statistical functions.
+same type as their arguments. The current case's
+[weight](../../../commands/selection/weight.md) has no effect on
+statistical functions.
These functions' argument lists may include entire ranges of
variables using the `VAR1 TO VAR2` syntax.
* `YEAR`
Refers to a year, 1582 or greater. Years between 0 and 99 are
- treated according to the epoch set on SET EPOCH, by default
- beginning 69 years before the current date (*note SET EPOCH::).
+ treated according to the epoch set on [`SET
+ EPOCH`](../../../commands/utilities/set.md#epoch), by default
+ beginning 69 years before the current date.
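
A minimal sketch of how a 2-digit year might be resolved against an epoch follows. It assumes the epoch is a 100-year window beginning at the epoch's first year; the function names are hypothetical and do not come from PSPP.

```python
from datetime import date


def automatic_epoch_start(today: date) -> int:
    """First year of the epoch under the default: 69 years before today."""
    return today.year - 69


def resolve_two_digit_year(yy: int, epoch_start: int) -> int:
    """Map a 2-digit year onto the assumed 100-year window at epoch_start."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a year between 0 and 99")
    century = epoch_start - epoch_start % 100
    year = century + yy
    # Years that would fall before the window belong to the next century.
    return year if year >= epoch_start else year + 100


start = automatic_epoch_start(date(2025, 6, 1))   # 1956
print(resolve_two_digit_year(56, start))          # 1956
print(resolve_two_digit_year(55, start))          # 2055
```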
If these functions' arguments are out-of-range, they are correctly
normalized before conversion to date format. Non-integers are rounded
These names (synonyms) refer to the file that contains instructions
that tell PSPP what to do. The syntax file's name is specified on
the PSPP command line. Syntax files can also be read with
- `INCLUDE` (*note INCLUDE::).
+ [`INCLUDE`](../../commands/utilities/include.md) or
+ [`INSERT`](../../commands/utilities/insert.md).
* data file
Data files contain raw data in text or binary format. Data can
Always set to 2 and 0, respectively.
These fields could be used as a signature for the file format, but
- the `product` field in record 0 seems more likely to be unique
- (*note Record 0 Main Header Record::).
+ the `product` field in [record 0](#record-0-main-header-record)
+ seems more likely to be unique.
* `struct { ... } records[15];`
Each of the elements in this array identifies a record in the
* 0: no compression
Data is arranged as a series of 8-byte elements, one per variable
- instance variable in the variable record (*note Record 1 Variables
- Record::). Numeric values are given in `flt64` format; string
- values are literal characters string, padded on the right with
- spaces when necessary to fill out 8-byte units.
+ instance variable in the [variable
+ record](#record-1-variables-record). Numeric values are given in
+ `flt64` format; string values are literal character strings, padded
+ on the right with spaces when necessary to fill out 8-byte units.
* 1: bytecode compression
The first 8 bytes of the data record is divided into a series of
stylized names, where N is a number for the dimension starting from 0:
* `dimensionNcategories`
- The dimension's leaf categories (*note SPV Light Member
- Categories::).
+ The dimension's leaf [categories](light-detail.md#categories).
* `dimensionNgroup0`
Present only if the dimension's categories are grouped, this
* `source`
Always set to `tableData`, the `source-name` in the corresponding
- `tableData.bin` member (*note SPV Legacy Member Metadata::).
+ `tableData.bin` member (see
+ [Metadata](legacy-detail-binary.md#metadata)).
* `sourceName`
The name of a variable within the source, corresponding to the
- `variable-name` in the `tableData.bin` member (*note SPV Legacy
- Member Numeric Data::).
+ `variable-name` in the `tableData.bin` member (see [Numeric
+ Data](legacy-detail-binary.md#numeric-data)).
* `label`
The variable label, if any.
* `cellStyle`
`style`
- Each of these is the `id` of a `style` element (*note SPV Detail
- style Element::). The former is the default style for individual
- cells, the latter for the entire table.
+ Each of these is the `id` of a [`style`
+ element](#the-style-element). The former is the default style for
+ individual cells, the latter for the entire table.
## The `location` Element
Always observed as `0pt`.
Each `facetLevel` contains an `axis`, which in turn may contain a
-`label` for the `facetLevel` (*note SPV Detail label Element::) and does
-contain a `majorTicks` element.
+[`label`](#the-label-element) for the `facetLevel` and does contain a
+`majorTicks` element.
* `labelAngle`
Normally 0. The value -90 causes inner column or outer row labels
meaning the SPSS print format for a variable.
The details of this element vary depending on the schema version, as
-declared in the root `visualization` element's `version` attribute
-(*note SPV Detail visualization Element::). A reader can interpret the
-content without knowing the schema version.
+declared in the root [`visualization`
+element](#the-visualization-element)'s `version` attribute. A reader
+can interpret the content without knowing the schema version.
The `setFormat` element itself has the following attributes.
=> affix*
```
-This element appears only in schema version 2.5 and earlier (*note
-SPV Detail visualization Element::).
+This element appears only in [schema
+version](#the-visualization-element) 2.5 and earlier.
Data to be formatted in date formats is stored as strings in legacy
data, in the format `yyyy-mm-ddTHH:MM:SS.SSS` and must be parsed and
The `interval` element and its descendants determine the basic
formatting and labeling for the table's cells. These basic styles are
-overridden by more specific styles set using `setCellProperties` (*note
-SPV Detail setCellProperties Element::).
+overridden by more specific styles set using
+[`setCellProperties`](#the-setcellproperties-element).
The `style` attribute of `interval` itself may be ignored.
=> EMPTY
```
-The `name` attribute appears only in standalone `.stt` files (*note
-SPSS TableLook STT Format::).
+The `name` attribute appears only in [standalone `.stt`
+files](../tablelook.md#the-stt-format).
`current-layer` is the displayed layer. Suppose there are \\(d\\)
layers, numbered 1 through \\(d\\) in the order given in the
-Dimensions (*note SPV Light Member Dimensions::), and that the
-displayed value of dimension \\(i\\) is \\\(d_i, 0 \le x_i < n_i\\),
-where \\(n_i\\) is the number of categories in dimension \\(i\\).
-Then `current-layer` is the \\(k\\) calculated by the following algorithm:
+[Dimensions](#dimensions), and that the displayed value of dimension
+\\(i\\) is \\(x_i\\), \\(0 \le x_i < n_i\\), where \\(n_i\\) is the
+number of categories in dimension \\(i\\). Then `current-layer` is the
+\\(k\\) calculated by the following algorithm:
> let \\(k = 0\\).
> for each \\(i\\) from \\(d\\) downto 1:
part of the data, but in practice `X2` specifies a style for a cell
only if that cell is empty (and thus does not appear in the data at
all). Each StyleMap specifies the index of a blank cell, calculated
-the same was as in the Cells (*note SPV Light Member Cells::), along
-with a 0-based index into the accompanying StylePair array.
+the same way as in the [Cells](#cells), along with a 0-based index
+into the accompanying StylePair array.
A writer may safely omit the optional `i0 i0` inside the
`count(...)`.
`small` is a small real number. In the corpus, it overwhelmingly
takes the value 0.0001, with zero occasionally seen. Nonzero numbers
-with format 40 (*note SPV Light Member Value::) whose magnitudes are
-smaller than displayed in scientific notation. (Thus, a `small` of zero
-prevents scientific notation from being chosen.)
+with format 40 (see [Value](#value)) whose magnitudes are smaller than
+`small` are displayed in scientific notation. (Thus, a `small` of zero
+prevents scientific notation from being chosen.)
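
The rule for `small` can be written as a one-line predicate. This is only a sketch of the stated condition; the function name is hypothetical.

```python
def uses_scientific_notation(x: float, small: float) -> bool:
    """True if a format-40 value would be shown in scientific notation.

    Implements only the condition stated in the text: the value is
    nonzero and its magnitude is less than `small`.
    """
    return x != 0 and abs(x) < small


print(uses_scientific_notation(0.00005, 0.0001))  # True
print(uses_scientific_notation(0.5, 0.0001))      # False
print(uses_scientific_notation(0.00005, 0.0))     # False
```

With `small` equal to zero, no magnitude can be smaller than it, so the predicate is never true.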
`dataset` is the name of the dataset analyzed to produce the output,
e.g. `DataSet1`, and `datafile` the name of the file it was read from,
* `01`
The numeric value `x`, intended to be presented to the user
formatted according to `format`, which is about the same as the
- format described for system files (*note System File Output
- Formats::). The exception is that format 40 is not MTIME but
- instead approximately a synonym for F format with a different rule
- for whether a value is shown in scientific notation: a value in
- format 40 is shown in scientific notation if and only if it is
- nonzero and its magnitude is less than [`small`](#formats).
+ [format described for system files](../system-file.md#format-types).
+ The exception is that format 40 is not `MTIME` but instead
+ approximately a synonym for `F` format with a different rule for
+ whether a value is shown in scientific notation: a value in format
+ 40 is shown in scientific notation if and only if it is nonzero and
+ its magnitude is less than [`small`](#formats).
Most commonly, `format` has width 40 (the maximum).
An `x` with the maximum negative double value `-DBL_MAX` represents
- the system-missing value `SYSMIS`. (`HIGHEST` and `LOWEST` have not been
- observed.) See *note System File Format::, for more about these
- special values.
+ the system-missing value `SYSMIS`. (`HIGHEST` and `LOWEST` have not
+ been observed.) See [System File
+ Format](../system-file.md#introduction) for more about these special
+ values.
* `02`
Similar to `01`, with the additional information that `x` is a
- When a "light" format is used, only `dataPath` is present, and it
names a `.bin` member of the Zip file that has `light` in its name,
- e.g. `0000000001437_lightTableData.bin` (*note SPV Light Detail
- Member Format::).
+ e.g. `0000000001437_lightTableData.bin`. See [Light Detail Member
+ Format](light-detail.md) for light format details.
- When the legacy format is used, both are present. In this case,
`dataPath` names a Zip member with a legacy binary format that
- contains relevant data (*note SPV Legacy Detail Member Binary
- Format::), and `path` names a Zip member that uses an XML format
- (*note SPV Legacy Detail Member XML Format::).
+ contains relevant data (see [Legacy Detail Member Binary
+ Format](legacy-detail-binary.md)), and `path` names a Zip member
+ that uses an XML format (see [Legacy Detail Member XML
+ Format](legacy-detail-xml.md)).
Graphs normally follow the legacy approach described above. The
corpus contains one example of a graph with `path` but not `dataPath`.
* 0: no compression
Data is arranged as a series of 8-byte elements. Each element
- corresponds to the variable declared in the respective variable
- record (*note Variable Record::). Numeric values are given in
- `flt64` format; string values are literal characters string, padded
- on the right when necessary to fill out 8-byte units.
+ corresponds to the variable declared in the respective [variable
+ record](#variable-record). Numeric values are given in `flt64`
+ format; string values are literal character strings, padded on the
+ right when necessary to fill out 8-byte units.
* 1: bytecode compression
## The `.stt` Format
-The `.stt` file format is an XML file that contains a subset of the SPV
-structure member format (*note SPV Structure Member Format::). Its root
-element is a `tableProperties` element (*note SPV Detail Legacy
-Properties::).
+The `.stt` file format is an XML file that contains a subset of the
+SPV structure member format. Its root element is a [`tableProperties`
+element](spv/legacy-detail-xml.md#legacy-properties).
## The `.tlo` Format
A `.tlo` file has a custom binary format. This section describes it
-using the syntax used previously for SPV binary members (*note SPV Light
-Detail Member Format::). There is one new convention: TLO files express
-colors as `int32` values in which the low 8 bits are the red component,
-the next 8 bits are green, and next 8 bits are blue, and the high bits
-are zeros.
+using the [binary format
+conventions](spv/light-detail.md#binary-format-conventions) used for
+SPV binary members. There is one new convention: TLO files express
+colors as `int32` values in which the low 8 bits are the red
+component, the next 8 bits are green, the next 8 bits are blue, and
+the high bits are zeros.
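
The color convention can be sketched as a pair of helpers that unpack and pack the three components. The function names are hypothetical; only the bit layout comes from the description above.

```python
from typing import Tuple


def decode_tlo_color(value: int) -> Tuple[int, int, int]:
    """Split a TLO int32 color into (red, green, blue) components."""
    red = value & 0xFF             # low 8 bits
    green = (value >> 8) & 0xFF    # next 8 bits
    blue = (value >> 16) & 0xFF    # next 8 bits
    return red, green, blue


def encode_tlo_color(red: int, green: int, blue: int) -> int:
    """Pack components into a TLO int32 color, leaving the high bits zero."""
    return (red & 0xFF) | ((green & 0xFF) << 8) | ((blue & 0xFF) << 16)


print(decode_tlo_color(0x00FF8040))   # (64, 128, 255)
```

Red sits in the low byte here, so a reader that expects the more common `0xRRGGBB` layout would swap the red and blue components.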
TLO files support various features that SPV files do not. PSPP
implements the SPV feature set, so it mostly ignores the added TLO
i54 i18
```
- In `PTTableLook`, `version` is 00 or 02. The only difference is that
-version 00 lacks `V2Styles` (*note V2Styles in SPSS TLO Files::) and that
-version 02 includes it. Both TLO versions are seen in the wild.
+ In `PTTableLook`, `version` is 00 or 02. The only difference is
+that version 00 lacks [`V2Styles`](#v2styles) and that version 02
+includes it. Both TLO versions are seen in the wild.
`flags` is a bit-mapped field. Its bits have the following meanings:
```
These sections hold the styling and coloring for each of the 8 areas
-in a pivot table. They are conceptually similar to the area style
-information in SPV light members (*note SPV Light Member Areas::).
+in a pivot table. They are conceptually similar to the
+[Areas](spv/light-detail.md#areas) style information in SPV light
+members.
The styling and coloring for the title area is split between
`PVCellStyle` and `PVTextStyle`: the former holds `title-color`, the