-@node Portable File Format, Data File Format, Configuration, Top
+@node Portable File Format
@appendix Portable File Format
These days, most computers use the same internal data formats for
* Case Weight Variable Record::
* Variable Records::
* Value Label Records::
+* Portable File Document Record::
* Portable File Data::
@end menu
-@node Portable File Characters, Portable File Structure, Portable File Format, Portable File Format
+@node Portable File Characters
@section Portable File Characters
-Portable files are arranged as a series of lines of exactly 80
+Portable files are arranged as a series of lines of 80
characters each. Each line is terminated by a carriage-return,
-line-feed sequence ``new-lines''). New-lines are only used to avoid
+line-feed sequence (``new-lines''). New-lines are only used to avoid
line length limits imposed by some OSes; they are not meaningful.
+Most lines in portable files are exactly 80 characters long. The only
+exception is a line that ends in one or more spaces, in which case the
+spaces may optionally be omitted. Thus, a portable file reader must
+act as though a line shorter than 80 characters is padded to that
+length with spaces.
+
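The padding rule above can be sketched in a few lines of Python. This is only an illustration, not PSPP's reader: the function name `unwrap_portable_lines` is hypothetical, the input is assumed to be already decoded to a Python string, and lines longer than 80 characters are passed through without validation.

```python
LINE_WIDTH = 80  # nominal line length stated in the text


def unwrap_portable_lines(raw):
    """Join the lines of a portable file into one character stream.

    New-lines carry no meaning, so they are dropped. A line shorter
    than 80 characters is treated as if its omitted trailing spaces
    were still present.
    """
    # Ignore the new-line that terminates the last line, if any, so
    # that it does not produce a spurious empty line.
    if raw.endswith("\r\n"):
        raw = raw[:-2]
    return "".join(line.ljust(LINE_WIDTH) for line in raw.split("\r\n"))
```

Fields are then read from the returned stream rather than from individual lines, since a field may span a line boundary.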
The file must be terminated with a @samp{Z} character. In addition, if
the final line in the file does not have exactly 80 characters, then it
is padded on the right with @samp{Z} characters. (The file contents may
because they are not an important part of understanding the file
contents.
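On the writing side, the termination and padding rules for @samp{Z} might look like the following sketch. The name `wrap_portable_stream` is hypothetical, and the sketch assumes the stream has already been translated to the portable character set and contains no new-lines.

```python
LINE_WIDTH = 80


def wrap_portable_stream(stream):
    """Terminate a field stream with 'Z' and break it into 80-character lines."""
    # The file must be terminated with a Z character.
    stream += "Z"
    # If the final line would be short, pad it on the right with Z.
    remainder = len(stream) % LINE_WIDTH
    if remainder:
        stream += "Z" * (LINE_WIDTH - remainder)
    lines = [stream[i:i + LINE_WIDTH] for i in range(0, len(stream), LINE_WIDTH)]
    # Each line ends with a carriage-return, line-feed sequence.
    return "".join(line + "\r\n" for line in lines)
```

Note that a stream of exactly 80 characters yields a second line made up entirely of @samp{Z} characters, because the terminator itself starts a new line.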
-@node Portable File Structure, Portable File Header, Portable File Characters, Portable File Format
+@node Portable File Structure
@section Portable File Structure
Every portable file consists of the following records, in sequence:
@item
Product identification.
+@item
+Author identification (optional).
+
@item
Subproduct identification (optional).
@item
Value labels (optional).
+@item
+Documents (optional).
+
@item
Data.
@end itemize
String fields take the form of an integer field having value @var{n},
followed by exactly @var{n} characters, which are the string content.
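As a rough illustration, a string field could be read as shown below. The integer syntax is not given in this excerpt; the sketch assumes the usual portable-file convention of optional leading spaces, an optional minus sign, base-30 digits (0 through 9, then A through T), and a terminating slash. The function names are hypothetical, and `stream` is assumed to be the file contents with new-lines already removed.

```python
BASE30_DIGITS = "0123456789ABCDEFGHIJKLMNOPQRST"


def read_integer(stream, pos):
    """Read an integer field at pos; return (value, position after it).

    Assumed syntax: optional spaces, optional '-', base-30 digits, '/'.
    """
    while stream[pos] == " ":
        pos += 1
    negative = stream[pos] == "-"
    if negative:
        pos += 1
    end = stream.index("/", pos)  # raises ValueError if no terminator
    digits = stream[pos:end]
    if not digits:
        raise ValueError("integer field has no digits")
    value = 0
    for ch in digits:
        # str.index raises ValueError for a character that is not a digit.
        value = value * 30 + BASE30_DIGITS.index(ch)
    return (-value if negative else value), end + 1


def read_string(stream, pos):
    """Read a string field: an integer n, then exactly n characters."""
    length, pos = read_integer(stream, pos)
    if length < 0 or pos + length > len(stream):
        raise ValueError("string field runs past the end of the stream")
    return stream[pos:pos + length], pos + length
```

For example, `read_string("5/HELLO...", 0)` yields the five characters after the slash and the position of the next field.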
-@node Portable File Header, Version and Date Info Record, Portable File Structure, Portable File Format
+@node Portable File Header
@section Portable File Header
Every portable file begins with a 464-byte header, consisting of a
in the portable file's character set, which can be used to verify that
the file is indeed a portable file.
-@node Version and Date Info Record, Identification Records, Portable File Header, Portable File Format
+@node Version and Date Info Record
@section Version and Date Info Record
This record does not have a tag code. It has the following structure:
HHMMSS.
@end itemize
-@node Identification Records, Variable Count Record, Version and Date Info Record, Portable File Format
+@node Identification Records
@section Identification Records
The product identification record has tag code @samp{1}. It consists of
a single string field giving the name of the product that wrote the
portable file.
-The subproduct identification record has tag code @samp{3}. It
-consists of a single string field giving additional information on the
-product that wrote the portable file.
+The author identification record has tag code @samp{2}. It is
+optional. If present, it consists of a single string field giving the
+name of the person who caused the portable file to be written.
+
+The subproduct identification record has tag code @samp{3}. It is
+optional. If present, it consists of a single string field giving
+additional information on the product that wrote the portable file.
-@node Variable Count Record, Case Weight Variable Record, Identification Records, Portable File Format
+@node Variable Count Record
@section Variable Count Record
The variable count record has tag code @samp{4}. It consists of two
dictionary. The purpose of the second is unknown; it contains the value
161 in all portable files examined so far.
-@node Case Weight Variable Record, Variable Records, Variable Count Record, Portable File Format
+@node Case Weight Variable Record
@section Case Weight Variable Record
The case weight variable record is optional. If it is present, it
cases are unweighted. It has tag code @samp{6}. It consists of a
single string field that names the weighting variable.
-@node Variable Records, Value Label Records, Case Weight Variable Record, Portable File Format
+@node Variable Records
@section Variable Records
Each variable record represents a single variable. Variable records
@item
Name (string). 1--8 characters long. Must be in all capitals.
+A few portable files that contain duplicate variable names have been
+spotted in the wild. PSPP handles these by renaming the duplicates
+with numeric extensions: @code{@var{var}_1}, @code{@var{var}_2}, and
+so on.
+
@item
Print format. This is a set of three integer fields:
Number of decimal places. 1--40.
@end itemize
+A few portable files with invalid format types or formats that are not
+of the appropriate width for their variables have been spotted in the
+wild. PSPP assigns a default F or A format to a variable with an
+invalid format.
+
@item
Write format. Same structure as the print format described above.
@end itemize
variable label record, which has tag code @samp{C}. A variable label
record has one field, the variable label itself (string).
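The renaming of duplicate variable names mentioned above can be sketched as follows. This is not PSPP's code; `rename_duplicates` is a hypothetical helper, and it does not try to keep the renamed variables within the 8-character limit.

```python
def rename_duplicates(names):
    """Return names with repeats renamed VAR_1, VAR_2, and so on."""
    seen = set()
    result = []
    for name in names:
        candidate = name
        suffix = 0
        # Keep trying numeric extensions until the name is unused.
        while candidate in seen:
            suffix += 1
            candidate = "%s_%d" % (name, suffix)
        seen.add(candidate)
        result.append(candidate)
    return result
```

The first occurrence of a name is kept unchanged, so only later duplicates receive an extension.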
-@node Value Label Records, Portable File Data, Variable Records, Portable File Format
+@node Value Label Records
@section Value Label Records
Value label records have tag code @samp{D}. They have the following
@item
List of variables (strings). The variable count specifies the number in
the list. Variables are specified by their names. All variables must
-be of the same type (numeric or string).
+be of the same type (numeric or string), but string variables do not
+necessarily have the same width.
@item
Label count (integer).
appropriate to the variables, followed by a label (string).
@end itemize
-@node Portable File Data, , Value Label Records, Portable File Format
+A few portable files that specify duplicate value labels, that is, two
+different labels for a single value of a single variable, have been
+spotted in the wild. PSPP uses the last value label specified in
+these cases.
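A reader can get the "last label wins" behavior simply by letting later entries overwrite earlier ones. The sketch below assumes the record has already been parsed into a list of variable names and a list of (value, label) pairs; `apply_value_labels` and the shape of `table` are illustrative, not PSPP's data structures.

```python
def apply_value_labels(table, variables, pairs):
    """Attach (value, label) pairs to each listed variable.

    table maps a variable name to a dict from value to label. A later
    label for the same value replaces an earlier one.
    """
    for name in variables:
        labels = table.setdefault(name, dict())
        for value, label in pairs:
            labels[value] = label  # duplicate value: last label wins
    return table
```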
+
+@node Portable File Document Record
+@section Document Record
+
+One document record may optionally follow the value label record. The
+document record consists of tag code @samp{E}, followed by the number
+of document lines as an integer, followed by that number of strings,
+each of which represents one document line. Document lines must be 80
+bytes long or shorter.
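The layout of the document record can be illustrated with a small helper that lists its fields in order and enforces the length limit. The name `document_record_fields` and the `encoding` parameter are assumptions made for the sketch; encoding each field into the portable character set is left out.

```python
MAX_DOCUMENT_LINE_BYTES = 80  # limit stated in the text


def document_record_fields(lines, encoding="ascii"):
    """Return the document record as a flat list of fields.

    The result is the tag code 'E', the number of document lines,
    then one string per document line.
    """
    for number, line in enumerate(lines, start=1):
        size = len(line.encode(encoding))
        if size > MAX_DOCUMENT_LINE_BYTES:
            raise ValueError(
                "document line %d is %d bytes long; the limit is %d"
                % (number, size, MAX_DOCUMENT_LINE_BYTES)
            )
    return ["E", len(lines)] + list(lines)
```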
+
+@node Portable File Data
@section Portable File Data
The data record has tag code @samp{F}. There is only one tag for all