floating points, and 1-byte characters. In this document these will
simply be referred to as @code{int32}, @code{flt64}, and @code{char},
the names that are used in the PSPP source code. Every field of type
-@code{int32} or @code{flt64} is aligned on a 32-bit boundary.
+@code{int32} or @code{flt64} is aligned on a 32-bit boundary relative to
+the start of the record.
The endianness of data in PSPP system files is not specified. System
files output on a computer of a particular endianness will have the
* Document Record::
* Machine int32 Info Record::
* Machine flt64 Info Record::
-* Auxilliary Variable Parameter Record::
+* Auxiliary Variable Parameter Record::
* Long Variable Names Record::
+* Very Long String Length Record::
* Miscellaneous Informational Records::
* Dictionary Termination Record::
* Data Record::
char rec_type[4];
char prod_name[60];
int32 layout_code;
- int32 case_size;
+ int32 nominal_case_size;
int32 compressed;
int32 weight_index;
int32 ncases;
Always set to 2. PSPP reads this value to determine the
file's endianness.
-@item int32 case_size;
+@item int32 nominal_case_size;
Number of data elements per case. This is the number of variables,
except that long string variables add extra data elements (one for every
-8 characters after the first 8).
-When reading system files, PSPP will use this value unless it is set
-to -1, in which case it will determine the number of data elements by
-context. When writing system files PSPP always uses this value.
+8 characters after the first 8). However, string variables do not
+contribute to this value beyond the first 255 bytes. Further, system
+files written by some systems set this value to -1. In general, it is
+unsafe for systems reading system files to rely upon this value.
@item int32 compressed;
Set to 1 if the data in the file is compressed, 0 otherwise.
field is arbitrarily set to @samp{00:00:00}.
@item char file_label[64];
-Set the the file label declared by the user, if any. Padded on the
-right with spaces.
+Set to the file label declared by the user, if any (@pxref{FILE LABEL}).
+Padded on the right with spaces.
@item char padding[3];
Ignored padding bytes to make the structure a multiple of 32 bits in
Immediately following the header must come the variable records. There
must be one variable record for every variable and every 8 characters in
-a long string beyond the first 8; i.e., there must be exactly as many
-variable records as the value specified for @code{case_size} in the file
-header record.
+a long string beyond the first 8.
@example
struct sysfile_variable
/* The following field is present only
if n_missing_values is not 0. */
- flt64 missing_values[/* variable length*/];
+ flt64 missing_values[/* variable length */];
@};
@end example
Number of variables to which the associated value labels from the value
label record are to be applied.
-@item int32 vars[/* variable length];
+@item int32 vars[/* variable length */];
A list of variables to which to apply the value labels. There are
-@code{count} elements.
+@code{count} elements. Each element identifies a variable record, where
+the first element is numbered 1 and long string variables are considered
+to occupy multiple indexes.
@end table
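
The indexing rule for @code{vars} can be sketched in C as follows. This is
only an illustration, not code from PSPP: the function name and the
@code{widths} array are hypothetical, and the sketch assumes that a reader
has already recorded one width per variable record, using 0 for a numeric
variable, a positive value for the first record of a string variable, and
-1 for a record that continues a long string variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PSPP.  WIDTHS holds one entry per
   variable record, in file order.  Assumed encoding: 0 for a numeric
   variable, a positive width for the first record of a string variable,
   and -1 for a record that continues a long string variable.

   INDEX is a 1-based element taken from `vars'.  Returns the 0-based
   number of the variable it names, or -1 if INDEX is out of range or
   names a continuation record. */
static int
var_record_to_variable (const int *widths, size_t n_records, int index)
{
  size_t i;
  int var = -1;

  assert (widths != NULL);

  if (index < 1 || (size_t) index > n_records)
    return -1;                  /* Out of range. */
  if (widths[index - 1] < 0)
    return -1;                  /* Middle of a long string variable. */

  /* Count the variables that start at or before record INDEX. */
  for (i = 0; i < (size_t) index; i++)
    if (widths[i] >= 0)
      var++;
  return var;
}
```

With records for a numeric variable, a 20-byte string variable (which takes
three records), and another numeric variable, index 5 names the third
variable, while indexes 3 and 4 name no variable at all.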
@node Document Record, Machine int32 Info Record, Value Label Variable Record, Data File Format
indicates 8-bit ASCII, 4 indicates DEC Kanji.
@end table
-@node Machine flt64 Info Record, Auxilliary Variable Parameter Record, Machine int32 Info Record, Data File Format
+@node Machine flt64 Info Record, Auxiliary Variable Parameter Record, Machine int32 Info Record, Data File Format
@section Machine @code{flt64} Info Record
There must be no more than one machine @code{flt64} info record per
The value used for LOWEST in missing values.
@end table
-@node Auxilliary Variable Parameter Record, Long Variable Names Record, Machine flt64 Info Record, Data File Format
-@section Auxilliary Variable Parameter Record
+@node Auxiliary Variable Parameter Record, Long Variable Names Record, Machine flt64 Info Record, Data File Format
+@section Auxiliary Variable Parameter Record
-There must be no more than one auxilliary variable parameter record per
+There must be no more than one auxiliary variable parameter record per
system file. This record must follow the variable
records and precede the dictionary termination record.
The size @code{int32}. Always set to 4.
@item int32 count;
-The total number of bytes in @code{aux_params} divided by 3.
+The total number of records in @code{aux_params}, multiplied by 3.
@item struct aux_params aux_params[];
An array of @code{struct aux_params}. The order of the elements corresponds
-to the order of the variables in the Variable Records. The @code{struct aux_params} type is defined as follows:
+to the order of the variables in the Variable Records. No element
+corresponds to variable records that continue long string variables.
+The @code{struct aux_params} type is defined as follows:
@example
struct aux_params
@item int32 measure
The measurement type of the variable:
@table @asis
-@item 0
-Nominal Scale
@item 1
-Ordinal Scale
+Nominal Scale
@item 2
+Ordinal Scale
+@item 3
Continuous Scale
@end table
-@node Long Variable Names Record, Miscellaneous Informational Records, Auxilliary Variable Parameter Record, Data File Format
+@node Long Variable Names Record, Very Long String Length Record, Auxiliary Variable Parameter Record, Data File Format
@section Long Variable Names Record
There must be no more than one long variable names record per
@item int32 count;
The total number of bytes in @code{var_name_pairs}.
-@item char var_name_pairs[/* variable length];
+@item char var_name_pairs[/* variable length */];
A list of @var{key}--@var{value} tuples, where @var{key} is the name
of a variable, and @var{value} is its long variable name.
The @var{key} field is at most 8 bytes long and must match the
-name of a variable which appears in the variable record @xref{Variable Record}.
+name of a variable which appears in the variable record (@pxref{Variable
+Record}).
The @var{value} field is at most 64 bytes long.
The @var{key} and @var{value} fields are separated by a @samp{=} byte.
Each tuple is separated by a byte whose value is 09. There is no
The total length is @code{count} bytes.
@end table
+@node Very Long String Length Record, Miscellaneous Informational Records, Long Variable Names Record, Data File Format
+@comment node-name, next, previous, up
+@section Very Long String Length Record
+
+
+There must be no more than one very long string length record per
+system file. This record must follow the variable records and precede the
+dictionary termination record.
+
+@example
+struct sysfile_very_long_string_lengths
+ @{
+ /* Header. */
+ int32 rec_type;
+ int32 subtype;
+ int32 size;
+ int32 count;
+
+ /* Data. */
+ char string_lengths[/* variable length */];
+ @};
+@end example
+
+@table @code
+@item int32 rec_type;
+Record type. Always set to 7.
+
+@item int32 subtype;
+Record subtype. Always set to 14.
+
+@item int32 size;
+The size of each element in the @code{string_lengths} member. Always set to 1.
+
+@item int32 count;
+The total number of bytes in @code{string_lengths}.
+
+@item char string_lengths[/* variable length */];
+A list of @var{key}--@var{value} tuples, where @var{key} is the name
+of a variable, and @var{value} is its length.
+The @var{key} field is at most 8 bytes long and must match the
+name of a variable which appears in the variable record (@pxref{Variable
+Record}).
+The @var{value} field is exactly 5 bytes long. It is a zero-padded,
+ASCII-encoded string that gives the length of the variable.
+The @var{key} and @var{value} fields are separated by a @samp{=} byte.
+Tuples are delimited by a two-byte sequence @{00, 09@}.
+After the last tuple, there may be a single byte 00, or @{00, 09@}.
+The total length is @code{count} bytes.
+@end table
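
A reader might walk the @code{string_lengths} data one tuple at a time, as
in the following C sketch. It is illustrative only and is not taken from
the PSPP source: the function name and its interface are hypothetical, and
it assumes that the 5-byte @var{value} holds decimal digits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PSPP.  Parses one KEY=VALUE tuple from
   the COUNT bytes at DATA, starting at offset *OFS.  On success, stores
   the null-terminated key in KEY and the length in *LENGTH, advances *OFS
   past the tuple and its delimiter, and returns 1.  Returns 0 at the end
   of the data or on malformed input. */
static int
parse_string_length (const char *data, size_t count, size_t *ofs,
                     char key[9], int *length)
{
  size_t p;
  size_t key_len = 0;
  int value = 0;
  int i;

  assert (data != NULL && ofs != NULL && key != NULL && length != NULL);
  p = *ofs;

  /* Key: at most 8 bytes, terminated by `='. */
  while (p < count && data[p] != '=' && key_len < 8)
    key[key_len++] = data[p++];
  key[key_len] = '\0';
  if (key_len == 0 || p >= count || data[p] != '=')
    return 0;
  p++;

  /* Value: exactly 5 zero-padded ASCII digits (assumed to be decimal). */
  if (count - p < 5)
    return 0;
  for (i = 0; i < 5; i++, p++)
    {
      if (data[p] < '0' || data[p] > '9')
        return 0;
      value = value * 10 + (data[p] - '0');
    }

  /* Delimiter: 00 09 between tuples; the last tuple may be followed by
     a lone 00, by 00 09, or by nothing. */
  if (p < count && data[p] == 0)
    p++;
  if (p < count && data[p] == 9)
    p++;

  *ofs = p;
  *length = value;
  return 1;
}
```

Calling the function repeatedly with the same @code{ofs} until it returns 0
visits every tuple. A caller should then check that @code{ofs} equals
@code{count}, since a smaller value indicates malformed data.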
+
+
-@node Miscellaneous Informational Records, Dictionary Termination Record, Long Variable Names Record, Data File Format
+@node Miscellaneous Informational Records, Dictionary Termination Record, Very Long String Length Record, Data File Format
@section Miscellaneous Informational Records
Miscellaneous informational records must follow the variable records and
precede the dictionary termination record.
-Miscellaneous informational records are ignored by PSPP when reading
-system files. They are not written by PSPP when writing system files.
+Some specific types of miscellaneous informational records are
+documented here, but others are known to exist. PSPP ignores unknown
+miscellaneous informational records when reading system files.
@example
struct sysfile_misc_info
compressed. Regardless, the data is arranged in a series of 8-byte
elements.
-When data is not compressed, Every case is composed of @code{case_size}
-of these 8-byte elements, where @code{case_size} comes from the file
-header record (@pxref{File Header Record}). Each element corresponds to
+When data is not compressed,
+each element corresponds to
the variable declared in the respective variable record (@pxref{Variable
Record}). Numeric values are given in @code{flt64} format; string
values are literal character strings, padded on the right when