floating-point format in use, as well as the endianness of IEEE 754
floating-point numbers, and translates as needed. However, only IEEE
754 numbers with the same endianness as integer data in the same file
-has actually been observed in system files, and it is likely that
+have actually been observed in system files, and it is likely that
other formats are obsolete or were never used.
-The PSPP system-missing value is represented by the largest possible
-negative number in the floating point format (@code{-DBL_MAX}). Two
-other values are important for use as missing values: @code{HIGHEST},
-represented by the largest possible positive number (@code{DBL_MAX}),
-and @code{LOWEST}, represented by the second-largest negative number
-(in IEEE 754 format, @code{0xffeffffffffffffe}).
+System files use a few floating point values for special purposes:
-System files are divided into records, each of which begins with a
-4-byte record type, usually regarded as an @code{int32}.
+@table @asis
+@item SYSMIS
+The system-missing value is represented by the largest possible
+negative number in the floating point format (@code{-DBL_MAX}).
+
+@item HIGHEST
+HIGHEST is used as the high end of a missing value range with an
+unbounded maximum. It is represented by the largest possible positive
+number (@code{DBL_MAX}).
+
+@item LOWEST
+LOWEST is used as the low end of a missing value range with an
+unbounded minimum. It was originally represented by the
+second-largest negative number (in IEEE 754 format,
+@code{0xffeffffffffffffe}). System files written by SPSS 21 and later
+instead use the largest negative number (@code{-DBL_MAX}), the same
+value as SYSMIS. This does not lead to ambiguity because LOWEST
+appears in system files only in missing value ranges, which never
+contain SYSMIS.
+@end table
+
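As a rough illustration of the three special values, the following Python sketch rebuilds them from their IEEE 754 bit patterns. Only the LOWEST pattern is quoted in the text; the other two patterns are the standard encodings of @code{-DBL_MAX} and @code{DBL_MAX}, and the helper names are hypothetical.

```python
import struct
import sys

# Bit patterns as 64-bit integers.  Only LOWEST_BITS is quoted in the text;
# the other two are the usual IEEE 754 encodings of -DBL_MAX and DBL_MAX.
SYSMIS_BITS = 0xFFEFFFFFFFFFFFFF
HIGHEST_BITS = 0x7FEFFFFFFFFFFFFF
LOWEST_BITS = 0xFFEFFFFFFFFFFFFE


def bits_to_double(bits):
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


SYSMIS = bits_to_double(SYSMIS_BITS)
HIGHEST = bits_to_double(HIGHEST_BITS)
LOWEST = bits_to_double(LOWEST_BITS)


def is_lowest(value):
    """Within a missing value range, accept either encoding of LOWEST."""
    return value == LOWEST or value == SYSMIS
```

A reader would presumably call @code{is_lowest} only on the low end of a missing value range, where SYSMIS cannot legitimately appear.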
+System files may use most character encodings based on an 8-bit unit.
+UTF-16 and UTF-32, based on wider units, appear to be unacceptable.
+@code{rec_type} in the file header record is sufficient to distinguish
+between ASCII and EBCDIC based encodings. The best way to determine
+the specific encoding in use is to consult the character encoding
+record (@pxref{Character Encoding Record}), if present, and failing
+that the @code{character_code} in the machine integer info record
+(@pxref{Machine Integer Info Record}). The same encoding should be
+used for the dictionary and the data in the file, although it is
+possible to artificially synthesize files that use different encodings
+(@pxref{Character Encoding Record}).
+
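The first step above, telling ASCII-based from EBCDIC-based files by @code{rec_type}, might be sketched in Python as follows. The byte values are the ones given for the file header record; the function name and return convention are assumptions of this sketch.

```python
ASCII_MAGICS = (b"$FL2", b"$FL3")         # 24 46 4c 32 and 24 46 4c 33
EBCDIC_MAGIC = bytes.fromhex("5bc6d3f2")  # $FL2 in an EBCDIC-based encoding


def classify_rec_type(rec_type):
    """Return (family, is_zlib) for the 4-byte rec_type field.

    Raises ValueError if the bytes match none of the known values.
    """
    if rec_type in ASCII_MAGICS:
        return "ascii", rec_type == b"$FL3"
    if rec_type == EBCDIC_MAGIC:
        return "ebcdic", False
    raise ValueError("not a recognized system file header")
```

This only narrows the encoding down to a family; the specific encoding still has to come from the records named above.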
+@menu
+* System File Record Structure::
+* File Header Record::
+* Variable Record::
+* Value Labels Records::
+* Document Record::
+* Machine Integer Info Record::
+* Machine Floating-Point Info Record::
+* Multiple Response Sets Records::
+* Extra Product Info Record::
+* Variable Display Parameter Record::
+* Long Variable Names Record::
+* Very Long String Record::
+* Character Encoding Record::
+* Long String Value Labels Record::
+* Long String Missing Values Record::
+* Data File and Variable Attributes Records::
+* Extended Number of Cases Record::
+* Other Informational Records::
+* Dictionary Termination Record::
+* Data Record::
+@end menu
+
+@node System File Record Structure
+@section System File Record Structure
+
+System files are divided into records with the following format:
+
+@example
+int32 type;
+char data[];
+@end example
+
+This header does not identify the length of the @code{data} or any
+information about what it contains, so the system file reader must
+understand the format of @code{data} based on @code{type}. However,
+records with type 7, called @dfn{extension records}, have a stricter
+format:
+
+@example
+int32 type;
+int32 subtype;
+int32 size;
+int32 count;
+char data[size * count];
+@end example
+
+@table @code
+@item int32 type;
+Record type. Always set to 7.
+
+@item int32 subtype;
+Record subtype. This value identifies a particular kind of extension
+record.
+
+@item int32 size;
+The size of each piece of data that follows the header, in bytes.
+Known extension records use 1, 4, or 8, for @code{char}, @code{int32},
+and @code{flt64} format data, respectively.
+
+@item int32 count;
+The number of pieces of data that follow the header.
+
+@item char data[size * count];
+Data, whose format and interpretation depend on the subtype.
+@end table
+
+An extension record contains exactly @code{size * count} bytes of
+data, which allows a reader that does not understand an extension
+record to skip it. Extension records provide only nonessential
+information, so this allows for files written by newer software to
+preserve backward compatibility with older or less capable readers.
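The skipping behavior described above could look roughly like this in Python. The sketch assumes that the 4-byte @code{type} field (7) has already been consumed and that integers are little-endian; both are assumptions, as is the function name.

```python
import io
import struct


def read_extension_record(stream, byte_order="<"):
    """Read one extension record whose type field (7) was already consumed.

    Returns (subtype, size, count, data).  A reader that does not know
    the subtype can simply discard data and continue.
    """
    header = stream.read(12)
    if len(header) != 12:
        raise EOFError("truncated extension record header")
    subtype, size, count = struct.unpack(byte_order + "3i", header)
    if size < 0 or count < 0:
        raise ValueError("negative size or count")
    data = stream.read(size * count)
    if len(data) != size * count:
        raise EOFError("truncated extension record data")
    return subtype, size, count, data


# Usage: an unknown subtype (99 is made up) is read and then ignored.
demo = io.BytesIO(struct.pack("<3i", 99, 4, 2) + b"ABCDEFGH" + b"next")
subtype, size, count, data = read_extension_record(demo)
```

After the call, the stream is positioned at the first byte of whatever follows the record.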
-The records must appear in the following order:
+Records in a system file must appear in the following order:
@itemize @bullet
@item
Document record, if present.
@item
-Any records not explicitly included in this list, in any order.
+Extension (type 7) records, in ascending numerical order of their
+subtypes.
@item
Dictionary termination record.
Data record.
@end itemize
-Each type of record is described separately below.
+We advise authors of programs that read system files to tolerate
+format variations. Various kinds of misformatting and corruption have
+been observed in system files written by SPSS and other software
+alike. In particular, because extension records provide nonessential
+information, it is generally better to ignore an extension record
+entirely than to refuse to read a system file.
-@menu
-* File Header Record::
-* Variable Record::
-* Value Labels Records::
-* Document Record::
-* Machine Integer Info Record::
-* Machine Floating-Point Info Record::
-* Multiple Response Sets Records::
-* Variable Display Parameter Record::
-* Long Variable Names Record::
-* Very Long String Record::
-* Character Encoding Record::
-* Long String Value Labels Record::
-* Data File and Variable Attributes Records::
-* Extended Number of Cases Record::
-* Miscellaneous Informational Records::
-* Dictionary Termination Record::
-* Data Record::
-@end menu
+The following sections describe the known kinds of records.
@node File Header Record
@section File Header Record
-The file header is always the first record in the file. It has the
-following format:
+A system file begins with the file header, with the following format:
@example
char rec_type[4];
char prod_name[60];
int32 layout_code;
int32 nominal_case_size;
-int32 compressed;
+int32 compression;
int32 weight_index;
int32 ncases;
flt64 bias;
@table @code
@item char rec_type[4];
-Record type code, set to @samp{$FL2}.
+Record type code, either @samp{$FL2} for system files with
+uncompressed data or data compressed with simple bytecode compression,
+or @samp{$FL3} for system files with ZLIB compressed data.
+
+This is truly a character field that uses the same character encoding
+as other strings. Thus, in a file with an ASCII-based character encoding
+this field contains @code{24 46 4c 32} or @code{24 46 4c 33}, and in a
+file with an EBCDIC-based encoding this field contains @code{5b c6 d3
+f2}. (No EBCDIC-based ZLIB-compressed files have been observed.)
@item char prod_name[60];
Product identification string. This always begins with the characters
files written by some systems set this value to -1. In general, it is
unsafe for systems reading system files to rely upon this value.
-@item int32 compressed;
-Set to 1 if the data in the file is compressed, 0 otherwise.
+@item int32 compression;
+Set to 0 if the data in the file is not compressed, 1 if the data is
+compressed with simple bytecode compression, 2 if the data is ZLIB
+compressed. This field has value 2 if and only if @code{rec_type} is
+@samp{$FL3}.
@item int32 weight_index;
If one of the variables in the data set is used as a weighting
File label declared by the user, if any (@pxref{FILE LABEL,,,pspp,
PSPP Users Guide}). Padded on the right with spaces.
+A product that identifies itself as @code{VOXCO INTERVIEWER 4.3} uses
+CR-only line ends in this field, rather than the more usual LF-only or
+CR LF line ends.
+
@item char padding[3];
Ignored padding bytes to make the structure a multiple of 32 bits in
length. Set to zeros.
by a number of narrower string variables. @xref{Very Long String
Record}, for details.
+A system file should contain at least one variable and thus at least
+one variable record, but system files have been observed in the wild
+without any variables (thus, no data either).
+
@example
int32 rec_type;
int32 type;
-2; if the variable has a range for missing variables plus a single
discrete value, set to -3.
+A long string variable always has the value 0 here. A separate record
+indicates missing values for long string variables (@pxref{Long String
+Missing Values Record}).
+
@item int32 print;
Print format for this variable. See below.
(@samp{#}), dollar signs (@samp{$}), underscores (@samp{_}), or full
stops (@samp{.}). The variable name is padded on the right with spaces.
+The @samp{name} fields should be unique within a system file. System
+files written by SPSS that contain very long string variables with
+similar names sometimes contain duplicate names that are later
+eliminated by resolving the very long string names (@pxref{Very Long
+String Record}). PSPP handles duplicates by assigning them new,
+unique names.
+
@item int32 label_len;
This field is present only if @code{has_var_label} is set to 1. It is
-set to the length, in characters, of the variable label, which must be a
-number between 0 and 120.
+set to the length, in characters, of the variable label. The
+documented maximum length varies from 120 to 255 based on SPSS
+version, but some files have been seen with longer labels. PSPP
+accepts labels of any length.
@item char label[];
This field is present only if @code{has_var_label} is set to 1. It has
element denotes the additional discrete missing value.
@end table
+@anchor{System File Output Formats}
The @code{print} and @code{write} members of sysfile_variable are output
formats coded into @code{int32} types. The least-significant byte
of the @code{int32} represents the number of decimal places, and the
@end multitable
@end quotation
+A few system files have been observed in the wild with invalid
+@code{write} fields, in particular with value 0. Readers should
+probably treat invalid @code{print} or @code{write} fields as some
+default format.
+
@node Value Labels Records
@section Value Labels Records
following value label variables record (see below) is read.
@item char label_len;
-The label's length, in bytes.
+The label's length, in bytes. The documented maximum length varies
+from 60 to 120 based on SPSS version. PSPP supports value labels up
+to 255 bytes long.
@item char label[];
@code{label_len} bytes of the actual label, followed by up to 7 bytes
IBM 370 sets this to 2, and DEC VAX E to 3.
@item int32 compression_code;
-Compression code. Always set to 1.
+Compression code. Always set to 1, regardless of whether or how the
+file is compressed.
@item int32 endianness;
Machine endianness. 1 indicates big-endian, 2 indicates little-endian.
@item int32 character_code;
-@anchor{character-code}
-Character code. 1 indicates EBCDIC, 2 indicates 7-bit ASCII, 3
-indicates 8-bit ASCII, 4 indicates DEC Kanji.
-Windows code page numbers are also valid.
-
-Experience has shown that in many files, this field is ignored or incorrect.
-For a more reliable indication of the file's character encoding
-see @ref{Character Encoding Record}.
+@anchor{character-code} Character code. The following values have
+been actually observed in system files:
+
+@table @asis
+@item 1
+EBCDIC.
+
+@item 2
+7-bit ASCII.
+
+@item 1250
+The @code{windows-1250} code page for Central European and Eastern
+European languages.
+
+@item 1252
+The @code{windows-1252} code page for Western European languages.
+
+@item 28591
+ISO 8859-1.
+
+@item 65001
+UTF-8.
+@end table
+
+The following additional values are known to be defined:
+
+@table @asis
+@item 3
+8-bit ``ASCII''.
+
+@item 4
+DEC Kanji.
+@end table
+
+Other Windows code page numbers are known to be generally valid.
+
+Old versions of SPSS for Unix and Windows always wrote value 2 in this
+field, regardless of the encoding in use. Newer versions also write
+the character encoding as a string (see @ref{Character Encoding
+Record}).
@end table
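One possible way to turn @code{character_code} into a usable codec is sketched below in Python. The mapping to Python codec names is an assumption of this sketch, as is the decision to return @code{None} for EBCDIC (the text does not say which EBCDIC code page applies) and for the defined-but-unobserved values 3 and 4.

```python
import codecs

# character_code values observed in system files, paired with Python codec
# names.  The pairing is this sketch's assumption, not part of the format.
KNOWN_CODES = (
    (2, "ascii"),
    (1250, "cp1250"),
    (1252, "cp1252"),
    (28591, "iso8859-1"),
    (65001, "utf-8"),
)


def codec_for_character_code(code):
    """Return a Python codec name for character_code, or None if unsure."""
    for known, name in KNOWN_CODES:
        if code == known:
            return name
    # Other Windows code pages: try Python's "cpNNNN" naming convention.
    try:
        return codecs.lookup("cp%d" % code).name
    except LookupError:
        return None
```

Because old versions of SPSS wrote 2 regardless of the real encoding, a result of @code{"ascii"} is best treated as a weak hint.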
@node Machine Floating-Point Info Record
Number of pieces of data in the data part. Always set to 3.
@item flt64 sysmis;
-The system missing value.
-
-@item flt64 highest;
-The value used for HIGHEST in missing values.
-
-@item flt64 lowest;
-The value used for LOWEST in missing values.
+@itemx flt64 highest;
+@itemx flt64 lowest;
+The system missing value, the value used for HIGHEST in missing
+values, and the value used for LOWEST in missing values, respectively.
+@xref{System File Format}, for more information.
+
+The SPSSWriter library in PHP, which identifies itself as @code{FOM
+SPSS 1.0.0} in the file header record @code{prod_name} field, writes
+unexpected values to these fields, but it uses the same values
+consistently throughout the rest of the file.
@end table
@node Multiple Response Sets Records
The total number of bytes in @code{mrsets}.
@item char mrsets[];
-A series of multiple response sets, each of which consists of the
-following:
+Zero or more line feeds (byte 0x0a), followed by a series of multiple
+response sets, each of which consists of the following:
@itemize @bullet
@item
-The set's name (an identifier that begins with @samp{$}).
+The set's name (an identifier that begins with @samp{$}), in mixed
+upper and lower case.
@item
An equals sign (@samp{=}).
A space.
@item
-The names of the variables in the set, each separated from the
-previous by a single space.
+The short names of the variables in the set, converted to lowercase,
+each separated from the previous by a single space.
+
+Even though a multiple response set must have at least two variables,
+some system files contain multiple response sets with no variables or
+one variable. The source and meaning of these multiple response sets are
+unknown. (Perhaps they arise from creating a multiple response set
+then deleting all the variables that it contains?)
@item
-A line feed (byte 0x0a).
+One line feed (byte 0x0a). Sometimes multiple line feeds, even
+hundreds, are present.
@end itemize
@end table
$e=E 11 6 choice 0 n o p
@end example
+@node Extra Product Info Record
+@section Extra Product Info Record
+
+This optional record appears to contain a text string that describes
+the program that wrote the file and the source of the data. (This is
+redundant with the file label and product info found in the file
+header record.)
+
+@example
+/* @r{Header.} */
+int32 rec_type;
+int32 subtype;
+int32 size;
+int32 count;
+
+/* @r{Exactly @code{count} bytes of data.} */
+char info[];
+@end example
+
+@table @code
+@item int32 rec_type;
+Record type. Always set to 7.
+
+@item int32 subtype;
+Record subtype. Always set to 10.
+
+@item int32 size;
+The size of each element in the @code{info} member. Always set to 1.
+
+@item int32 count;
+The total number of bytes in @code{info}.
+
+@item char info[];
+A text string. A product that identifies itself as @code{VOXCO
+INTERVIEWER 4.3} uses CR-only line ends in this field, rather than the
+more usual LF-only or CR LF line ends.
+@end table
+
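A reader that wants uniform line ends in @code{info} could normalize all three conventions mentioned above. This is a minimal Python sketch; the function name is hypothetical.

```python
def normalize_line_ends(text):
    """Convert CR LF and CR-only line ends to LF."""
    # Replace CR LF first so that its CR is not converted a second time.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```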
@node Variable Display Parameter Record
@section Variable Display Parameter Record
Continuous Scale
@end table
-SPSS 14 sometimes writes a @code{measure} of 0. PSPP interprets this
-as nominal scale.
+SPSS sometimes writes a @code{measure} of 0. PSPP interprets this as
+nominal scale.
@item int32 width;
The width of the display column for the variable in characters.
The total number of bytes in @code{encoding}.
@item char encoding[];
-The name of the character encoding. Normally this will be an official IANA characterset name or alias.
+The name of the character encoding. Normally this will be an official
+IANA character set name or alias.
See @url{http://www.iana.org/assignments/character-sets}.
+Character set names are not case-sensitive, but SPSS appears to write
+them in all-uppercase.
@end table
-This record is not present in files generated by older software.
-See also @ref{character-code}.
+This record is not present in files generated by older software. See
+also the @code{character_code} field in the machine integer info
+record (@pxref{character-code}).
+
+When the character encoding record and the machine integer info record
+are both present, all system files observed in practice indicate the
+same character encoding, e.g.@: 1252 as @code{character_code} and
+@code{windows-1252} as @code{encoding}, 65001 and @code{UTF-8}, etc.
+
+If, for testing purposes, a file is crafted with different
+@code{character_code} and @code{encoding}, it seems that
+@code{character_code} controls the encoding for all strings in the
+system file before the dictionary termination record, including
+strings in data (e.g.@: string missing values), and @code{encoding}
+controls the encoding for strings following the dictionary termination
+record.
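For ordinary (non-synthetic) files, the preference order between the two records might be sketched in Python as follows. The function and parameter names are hypothetical, and the fallback codec is assumed to have been derived from @code{character_code} by the caller.

```python
import codecs


def choose_encoding(encoding_record, character_code_codec):
    """Prefer the character encoding record; fall back to character_code.

    encoding_record: raw bytes of the record's encoding field, or None
    if the record is absent.
    character_code_codec: codec name the caller derived from
    character_code, or None.
    """
    if encoding_record is not None:
        name = encoding_record.decode("ascii", "ignore").strip()
        try:
            # Codec lookup is case-insensitive, so all-uppercase names work.
            return codecs.lookup(name).name
        except (LookupError, ValueError):
            pass  # Unknown or malformed name: fall through.
    return character_code_codec
```

The sketch deliberately ignores the crafted-file case in which the two records disagree, since no such file has been observed in practice.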
@node Long String Value Labels Record
@section Long String Value Labels Record
@end table
@end table
+@node Long String Missing Values Record
+@section Long String Missing Values Record
+
+This record, if present, specifies missing values for long string
+variables.
+
+@example
+/* @r{Header.} */
+int32 rec_type;
+int32 subtype;
+int32 size;
+int32 count;
+
+/* @r{Repeated up to exactly @code{count} bytes.} */
+int32 var_name_len;
+char var_name[];
+char n_missing_values;
+long_string_missing_value values[];
+@end example
+
+@table @code
+@item int32 rec_type;
+Record type. Always set to 7.
+
+@item int32 subtype;
+Record subtype. Always set to 22.
+
+@item int32 size;
+Always set to 1.
+
+@item int32 count;
+The number of bytes following the header until the next header.
+
+@item int32 var_name_len;
+@itemx char var_name[];
+The number of bytes in the name of the long string variable that has
+missing values, plus the variable name itself, which consists of
+exactly @code{var_name_len} bytes. The variable name is not padded to
+any particular boundary, nor is it null-terminated.
+
+@item char n_missing_values;
+The number of missing values, either 1, 2, or 3. (This is, unusually,
+a single byte instead of a 32-bit number.)
+
+@item long_string_missing_value values[];
+The missing values themselves. This array contains exactly
+@code{n_missing_values} elements, each of which has the following
+substructure:
+
+@example
+int32 value_len;
+char value[];
+@end example
+
+@table @code
+@item int32 value_len;
+The length of the missing value string, in bytes. This value should
+be 8, because long string variables are at least 8 bytes wide (by
+definition), only the first 8 bytes of a long string variable's
+missing values are allowed to be non-spaces, and any spaces within the
+first 8 bytes are included in the missing value here.
+
+@item char value[];
+The missing value string, exactly @code{value_len} bytes, without
+any padding or null terminator.
+@end table
+@end table
+
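The layout above can be walked with a short loop. The Python sketch below parses the @code{count} bytes that follow the record header, assuming little-endian integers; the function name and the returned structure are assumptions, and names and values are left as raw bytes.

```python
import struct


def parse_long_string_missing_values(data, byte_order="<"):
    """Parse the data of a subtype 22 record.

    Returns a list of (var_name, [value, ...]) tuples of raw bytes.
    """
    def take(pos, length):
        # Slice with a bounds check so truncated input is reported.
        if length < 0 or pos + length > len(data):
            raise ValueError("truncated long string missing values record")
        return data[pos:pos + length], pos + length

    result = []
    pos = 0
    while pos < len(data):
        raw, pos = take(pos, 4)
        (name_len,) = struct.unpack(byte_order + "i", raw)
        var_name, pos = take(pos, name_len)
        raw, pos = take(pos, 1)
        n_missing = raw[0]  # A single byte, not an int32.
        values = []
        for _ in range(n_missing):
            raw, pos = take(pos, 4)
            (value_len,) = struct.unpack(byte_order + "i", raw)
            value, pos = take(pos, value_len)
            values.append(value)
        result.append((var_name, values))
    return result
```

The sketch does not enforce that @code{n_missing_values} is 1, 2, or 3 or that @code{value_len} is 8, in keeping with the advice to tolerate format variations.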
@node Data File and Variable Attributes Records
@section Data File and Variable Attributes Records
In record type 18, this field contains a sequence of one or more
variable attribute sets. If more than one variable attribute set is
present, each one after the first is delimited from the previous by
-@code{/}. Each variable attribute set consists of a variable name,
+@code{/}. Each variable attribute set consists of a long
+variable name,
followed by @code{:}, followed by an attribute set with the same
syntax as on record type 17.
will contain a variable attribute record with the following contents:
@example
-00000000 07 00 00 00 12 00 00 00 01 00 00 00 22 00 00 00 |............"...|
-00000010 64 75 6d 6d 79 3a 66 72 65 64 28 27 32 33 27 0a |dummy:fred('23'.|
-00000020 27 33 34 27 0a 29 62 65 72 74 28 27 31 32 33 27 |'34'.)bert('123'|
-00000030 0a 29 |.) |
+0000 07 00 00 00 12 00 00 00 01 00 00 00 22 00 00 00 |............"...|
+0010 64 75 6d 6d 79 3a 66 72 65 64 28 27 32 33 27 0a |dummy:fred('23'.|
+0020 27 33 34 27 0a 29 62 65 72 74 28 27 31 32 33 27 |'34'.)bert('123'|
+0030 0a 29 |.) |
@end example
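To connect the hex dump to the extension record header, the following Python sketch decodes its first 16 bytes and extracts the attribute text. The bytes are copied from the dump; little-endian integers are assumed because the dump shows @code{07 00 00 00} for record type 7.

```python
import struct

# The 50 bytes shown in the hex dump, one string per dump row.
RECORD = bytes.fromhex(
    "07000000120000000100000022000000"
    "64756d6d793a6672656428273233270a"
    "273334270a2962657274282731323327"
    "0a29"
)

# Extension record header: type, subtype, size, count.
rec_type, subtype, size, count = struct.unpack_from("<4i", RECORD, 0)

# Exactly size * count bytes of data follow the header.
data = RECORD[16:16 + size * count]
text = data.decode("ascii")

# The variable name precedes the first colon.
var_name = text.split(":", 1)[0]
```

Here @code{count} is 0x22, that is 34, which matches the 34 bytes of attribute text after the header.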
+@menu
+* Variable Roles::
+@end menu
+
+@node Variable Roles
+@subsection Variable Roles
+
+A variable's role is represented as an attribute named @code{$@@Role}.
+This attribute has a single element whose values and their meanings
+are:
+
+@table @code
+@item 0
+Input. This, the default, is the most common role.
+@item 1
+Output.
+@item 2
+Both.
+@item 3
+None.
+@item 4
+Partition.
+@item 5
+Split.
+@end table
+
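The role values could be modeled as an enumeration. In this Python sketch, the class and function names are hypothetical, and falling back to Input for unrecognized values is an assumption based on Input being the default role.

```python
from enum import IntEnum


class Role(IntEnum):
    INPUT = 0
    OUTPUT = 1
    BOTH = 2
    NONE = 3
    PARTITION = 4
    SPLIT = 5


def parse_role(attribute_value):
    """Map the text of the role attribute's single element to a Role.

    Unparseable or out-of-range values fall back to Role.INPUT.
    """
    try:
        return Role(int(attribute_value))
    except ValueError:
        return Role.INPUT
```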
@node Extended Number of Cases Record
@section Extended Number of Cases Record
not been observed in the wild.
@end table
-@node Miscellaneous Informational Records
-@section Miscellaneous Informational Records
+@node Other Informational Records
+@section Other Informational Records
-Some specific types of miscellaneous informational records are
+Many specific types of extension records are
documented here, but others are known to exist. PSPP ignores unknown
-miscellaneous informational records when reading system files.
-
-@example
-/* @r{Header.} */
-int32 rec_type;
-int32 subtype;
-int32 size;
-int32 count;
+extension records when reading system files.
-/* @r{Exactly @code{size * count} bytes of data.} */
-char data[];
-@end example
+The following extension record subtypes have also been observed, with
+the following believed meanings:
-@table @code
-@item int32 rec_type;
-Record type. Always set to 7.
-
-@item int32 subtype;
-Record subtype. May take any value. According to Aapi
-H@"am@"al@"ainen, value 5 indicates a set of grouped variables and 6
-indicates date info (probably related to USE).
+@table @asis
+@item 5
+A set of grouped variables (according to Aapi H@"am@"al@"ainen).
-@item int32 size;
-Size of each piece of data in the data part. Should have the value 1,
-4, or 8, for @code{char}, @code{int32}, and @code{flt64} format data,
-respectively.
+@item 6
+Date info, probably related to USE (according to Aapi H@"am@"al@"ainen).
-@item int32 count;
-Number of pieces of data in the data part.
+@item 12
+A UUID in the format described in RFC 4122. Only two examples have
+been observed, both written by SPSS 13, and in each case the UUID
+contained both upper and lower case.
-@item char data[];
-Arbitrary data. There must be @code{size} times @code{count} bytes of
-data.
+@item 24
+XML that describes how data in the file should be displayed on-screen.
@end table
@node Dictionary Termination Record
@node Data Record
@section Data Record
-Data records must follow all other records in the system file. There must
-be at least one data record in every system file.
-
-The format of data records varies depending on whether the data is
-compressed. Regardless, the data is arranged in a series of 8-byte
-elements.
+The data record must follow all other records in the system file.
+Every system file must have a data record that specifies data for at
+least one case. The format of the data record varies depending on the
+value of @code{compression} in the file header record:
-When data is not compressed,
-each element corresponds to
+@table @asis
+@item 0: no compression
+Data is arranged as a series of 8-byte elements.
+Each element corresponds to
the variable declared in the respective variable record (@pxref{Variable
Record}). Numeric values are given in @code{flt64} format; string
values are literal characters string, padded on the right when
necessary to fill out 8-byte units.
-Compressed data is arranged in the following manner: the first 8 bytes
-in the data section is divided into a series of 1-byte command
+@item 1: bytecode compression
+The first 8 bytes
+of the data record are divided into a series of 1-byte command
codes. These codes have meanings as described below:
@table @asis
The system-missing value.
@end table
-When the end of the an 8-byte group of command bytes is reached, any
-blocks of non-compressible values indicated by code 253 are skipped,
-and the next element of command bytes is read and interpreted, until
-the end of the file or a code with value 252 is reached.
+The end of the 8-byte group of bytecodes is followed by any 8-byte
+blocks of non-compressible values indicated by code 253. After that
+follows another 8-byte group of bytecodes, then those bytecodes'
+non-compressible values. The pattern repeats to the end of the file
+or a code with value 252.
+
+@item 2: ZLIB compression
+The data record consists of the following, in order:
+
+@itemize @bullet
+@item
+ZLIB data header, 24 bytes long.
+
+@item
+One or more variable-length blocks of ZLIB compressed data.
+
+@item
+ZLIB data trailer, with a 24-byte fixed header plus an additional 24
+bytes for each preceding ZLIB compressed data block.
+@end itemize
+
+The ZLIB data header has the following format:
+
+@example
+int64 zheader_ofs;
+int64 ztrailer_ofs;
+int64 ztrailer_len;
+@end example
+
+@table @code
+@item int64 zheader_ofs;
+The offset, in bytes, of the beginning of this structure within the
+system file.
+
+@item int64 ztrailer_ofs;
+The offset, in bytes, of the first byte of the ZLIB data trailer.
+
+@item int64 ztrailer_len;
+The number of bytes in the ZLIB data trailer. This and the previous
+field sum to the size of the system file in bytes.
+@end table
+
+The data header is followed by @code{(ztrailer_len - 24) / 24} ZLIB
+compressed data blocks. Each ZLIB compressed data block begins with a
+ZLIB header as specified in RFC@tie{}1950, e.g.@: hex bytes @code{78
+01} (the only header yet observed in practice). Each block
+decompresses to a fixed number of bytes (in practice only
+@code{0x3ff000}-byte blocks have been observed), except that the last
+block of data may be shorter. The last ZLIB compressed data block
+ends just before offset @code{ztrailer_ofs}.
+
+The result of ZLIB decompression is bytecode compressed data as
+described above for compression format 1.
+
+The ZLIB data trailer begins with the following 24-byte fixed header:
+
+@example
+int64 int_bias;
+int64 zero;
+int32 block_size;
+int32 n_blocks;
+@end example
+
+@table @code
+@item int64 int_bias;
+The compression bias as a negative integer, e.g.@: if @code{bias} in
+the file header record is 100.0, then @code{int_bias} is @minus{}100
+(this is the only value yet observed in practice).
+
+@item int64 zero;
+Always observed to be zero.
+
+@item int32 block_size;
+The number of bytes in each ZLIB compressed data block, except
+possibly the last, following decompression. Only @code{0x3ff000} has
+been observed so far.
+
+@item int32 n_blocks;
+The number of ZLIB compressed data blocks, always exactly
+@code{(ztrailer_len - 24) / 24}.
+@end table
+
+The fixed header is followed by @code{n_blocks} 24-byte ZLIB data
+block descriptors, each of which describes the compressed data block
+corresponding to its offset. Each block descriptor has the following
+format:
+
+@example
+int64 uncompressed_ofs;
+int64 compressed_ofs;
+int32 uncompressed_size;
+int32 compressed_size;
+@end example
+
+@table @code
+@item int64 uncompressed_ofs;
+The offset, in bytes, that this block of data would have in a similar
+system file that uses compression format 1. This is
+@code{zheader_ofs} in the first block descriptor, and in each
+succeeding block descriptor it is the sum of the previous descriptor's
+@code{uncompressed_ofs} and @code{uncompressed_size}.
+
+@item int64 compressed_ofs;
+The offset, in bytes, of the actual beginning of this compressed data
+block. This is @code{zheader_ofs + 24} in the first block descriptor,
+and in each succeeding block descriptor it is the sum of the previous
+descriptor's @code{compressed_ofs} and @code{compressed_size}. The
+final block descriptor's @code{compressed_ofs} and
+@code{compressed_size} sum to @code{ztrailer_ofs}.
+
+@item int32 uncompressed_size;
+The number of bytes in this data block, after decompression. This is
+@code{block_size} in every data block except the last, which may be
+smaller.
+
+@item int32 compressed_size;
+The number of bytes in this data block, as stored compressed in this
+system file.
+@end table
+@end table
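The interleaving of bytecode groups and non-compressible values in compression format 1 can be illustrated with a small Python generator. It is only a sketch: it acts on codes 252 and 253 as described above and passes every other code through uninterpreted, and its name is hypothetical.

```python
def split_bytecode_groups(data):
    """Yield (code, raw) pairs from bytecode-compressed data.

    raw is the 8-byte non-compressible value for code 253, else None.
    Iteration stops at code 252 or at the end of the data.
    """
    pos = 0
    while pos + 8 <= len(data):
        codes = data[pos:pos + 8]  # One 8-byte group of bytecodes.
        pos += 8
        for code in codes:
            if code == 252:
                return
            if code == 253:
                # Values for code 253 follow the group, in order.
                raw = data[pos:pos + 8]
                if len(raw) != 8:
                    raise ValueError("truncated non-compressible value")
                pos += 8
                yield code, raw
            else:
                yield code, None
```

Reading each non-compressible value as soon as its code 253 is seen works because those values follow the group in the same order as their codes.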
+
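As a rough sketch of reading compression format 2, the following Python function follows the ZLIB data header to the trailer, checks a few of the invariants stated above, and decompresses each block through its descriptor. Little-endian integers, a file held entirely in memory, and all names are assumptions of this sketch.

```python
import struct
import zlib


def read_zlib_data(file_bytes, zheader_ofs, byte_order="<"):
    """Decompress the ZLIB data record that starts at zheader_ofs.

    Returns (int_bias, data), where data is the bytecode-compressed
    stream that still needs compression format 1 decoding.
    """
    ofs, ztrailer_ofs, ztrailer_len = struct.unpack_from(
        byte_order + "3q", file_bytes, zheader_ofs)
    if ofs != zheader_ofs:
        raise ValueError("zheader_ofs does not match its own position")
    int_bias, zero, block_size, n_blocks = struct.unpack_from(
        byte_order + "2q2i", file_bytes, ztrailer_ofs)
    # The trailer is a 24-byte fixed header plus 24 bytes per block.
    if ztrailer_len != 24 + 24 * n_blocks:
        raise ValueError("n_blocks is inconsistent with ztrailer_len")
    blocks = []
    for i in range(n_blocks):
        desc_ofs = ztrailer_ofs + 24 + 24 * i
        unc_ofs, comp_ofs, unc_size, comp_size = struct.unpack_from(
            byte_order + "2q2i", file_bytes, desc_ofs)
        block = zlib.decompress(file_bytes[comp_ofs:comp_ofs + comp_size])
        if len(block) != unc_size:
            raise ValueError("block %d has unexpected size" % i)
        blocks.append(block)
    return int_bias, b"".join(blocks)
```

A tolerant reader might downgrade some of these checks to warnings, given the advice to tolerate format variations.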
@setfilename ignored