From 9c0b0dce29dcc30268cf2a06fb88648ec9028a88 Mon Sep 17 00:00:00 2001
From: Ben Pfaff <blp@cs.stanford.edu>
Date: Fri, 9 May 2025 08:49:08 -0700
Subject: [PATCH] document SPSS/PC+ and portable files

---
 rust/doc/src/SUMMARY.md                     |   2 +
 rust/doc/src/pc+.md                         | 346 ++++++++++++++++++++
 rust/doc/src/pc+/index.md                   |   1 +
 rust/doc/src/pc.md                          |   1 +
 rust/doc/src/portable.md                    | 340 +++++++++++++++++++
 rust/doc/src/system-file/variable-record.md |   4 +-
 6 files changed, 693 insertions(+), 1 deletion(-)
 create mode 100644 rust/doc/src/pc+.md
 create mode 100644 rust/doc/src/pc+/index.md
 create mode 100644 rust/doc/src/pc.md
 create mode 100644 rust/doc/src/portable.md

diff --git a/rust/doc/src/SUMMARY.md b/rust/doc/src/SUMMARY.md
index f5ae0d5e4b..39dadbae09 100644
--- a/rust/doc/src/SUMMARY.md
+++ b/rust/doc/src/SUMMARY.md
@@ -195,3 +195,5 @@
 - [Encrypted File Wrappers](encrypted-wrapper/index.md)
   - [Common Wrapper Format](encrypted-wrapper/common-wrapper-format.md)
   - [Password Encoding](encrypted-wrapper/password-encoding.md)
+- [SPSS Portable File Format](portable.md)
+- [SPSS/PC+ System File Format](pc+.md)
\ No newline at end of file
diff --git a/rust/doc/src/pc+.md b/rust/doc/src/pc+.md
new file mode 100644
index 0000000000..261e978856
--- /dev/null
+++ b/rust/doc/src/pc+.md
@@ -0,0 +1,346 @@
+# SPSS/PC+ System File Format
+
+SPSS/PC+, first released in 1984, was a simplified version of SPSS for
+IBM PC and compatible computers.  It used a data file format related to
+the one described in the previous chapter, but simplified and
+incompatible.  The SPSS/PC+ software became obsolete in the 1990s, so
+files in this format are rarely encountered today.  Nevertheless, for
+completeness, and because it is not very difficult, it seems worthwhile
+to support at least reading these files.  This chapter documents this
+format, based on examination of a corpus of about 60 files from a
+variety of sources.
+
+System files use four data types: 8-bit characters, 16-bit unsigned
+integers, 32-bit unsigned integers, and 64-bit floating-point numbers,
+called here `char`, `uint16`, `uint32`, and `flt64`, respectively.  Data
+is not necessarily aligned on a word or double-word boundary.
+
+SPSS/PC+ ran only on IBM PC and compatible computers.  Therefore,
+values in these files are always in little-endian byte order.
+Floating-point numbers are always in IEEE 754 format.
+
+SPSS/PC+ system files represent the system-missing value as
+-1.66e308, or `f5 1e 26 02 8a 8c ed ff` expressed as hexadecimal.  (This
+is an unusual choice: it is close to, but not equal to, the most
+negative finite 64-bit IEEE 754 value, which is about -1.8e308.)
+
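+As a quick check of the byte pattern above, the following Python sketch
+decodes it as a little-endian `flt64`.  The name `SYSMIS_BYTES` is
+illustrative only; it does not come from any SPSS/PC+ source.
+
+```python
+import struct
+import sys
+
+# Byte pattern quoted above for the SPSS/PC+ system-missing value.
+SYSMIS_BYTES = bytes.fromhex("f51e26028a8cedff")
+
+# "<d" decodes a little-endian IEEE 754 double.
+(sysmis,) = struct.unpack("<d", SYSMIS_BYTES)
+
+# Close to, but not equal to, the most negative finite double.
+print(sysmis, -sys.float_info.max)
+```
+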
+Text in SPSS/PC+ system files is encoded in ASCII-based 8-bit MS-DOS
+codepages.  The files in the corpus used for investigating the format
+were all ASCII-only.
+
+An SPSS/PC+ system file begins with the following 256-byte directory:
+
+```
+uint32              two;
+uint32              zero;
+struct {
+    uint32          ofs;
+    uint32          len;
+} records[15];
+char                filename[128];
+```
+
+* `uint32 two;`  
+  `uint32 zero;`  
+  Always set to 2 and 0, respectively.
+
+  These fields could be used as a signature for the file format, but
+  the `product` field in record 0 seems more likely to be unique (see
+  [Record 0: Main Header Record](#record-0-main-header-record)).
+
+* `struct { ... } records[15];`  
+  Each of the elements in this array identifies a record in the
+  system file.  The `ofs` is a byte offset, from the beginning of the
+  file, that identifies the start of the record.  `len` specifies the
+  length of the record, in bytes.  Many records are optional or not
+  used.  If a record is not present, `ofs` and `len` for that record
+  are both zero.
+
+* `char filename[128];`  
+  In most files in the corpus, this field is entirely filled with
+  spaces.  In one file, it contains a file name, followed by a null
+  byte, followed by spaces to fill the remainder of the field.  The
+  meaning is unknown.
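+
+The directory layout above can be read with a few lines of Python.  This
+is a minimal sketch, not a reference reader; the function name
+`parse_directory` and the decision to reject files whose first two
+fields are not 2 and 0 are assumptions made for illustration.
+
+```python
+import struct
+
+DIRECTORY_SIZE = 256   # 8 + 15 * 8 + 128 bytes
+N_RECORDS = 15
+
+def parse_directory(data: bytes):
+    """Return ([(ofs, len), ...], raw 128-byte filename field)."""
+    if len(data) < DIRECTORY_SIZE:
+        raise ValueError("file too short for an SPSS/PC+ directory")
+    two, zero = struct.unpack_from("<II", data, 0)
+    if (two, zero) != (2, 0):
+        raise ValueError("unexpected values in `two` and `zero`")
+    # Each entry is a little-endian (ofs, len) pair; (0, 0) means absent.
+    records = [struct.unpack_from("<II", data, 8 + 8 * i)
+               for i in range(N_RECORDS)]
+    return records, data[128:256]
+```
+
+A reader would then locate record `i` at `data[ofs:ofs + len]` for each
+entry whose `ofs` and `len` are nonzero.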
+
+The following sections describe the contents of each record,
+identified by the index into the `records` array.
+
+<!-- toc -->
+
+## Record 0: Main Header Record
+
+All files in the corpus have this record at offset 0x100 with length
+0xb0 (but readers should find this record, like the others, via the
+`records` table in the directory).  Its format is:
+
+```
+uint16              one0;
+char                product[62];
+flt64               sysmis;
+uint32              zero0;
+uint32              zero1;
+uint16              one1;
+uint16              compressed;
+uint16              nominal_case_size;
+uint16              n_cases0;
+uint16              weight_index;
+uint16              zero2;
+uint16              n_cases1;
+uint16              zero3;
+char                creation_date[8];
+char                creation_time[8];
+char                file_label[64];
+```
+
+* `uint16 one0;`  
+  `uint16 one1;`  
+  Always set to 1.
+
+* `uint32 zero0;`  
+  `uint32 zero1;`  
+  `uint16 zero2;`  
+  `uint16 zero3;`  
+  Always set to 0.
+
+  It seems likely that one of these variables is set to 1 if
+  weighting is enabled, but none of the files in the corpus is
+  weighted.
+
+* `char product[62];`  
+  Name of the program that created the file.  Only the following
+  unique values have been observed, in each case padded on the right
+  with spaces:
+
+  ```
+  DESPSS/PC+ System File Written by Data Entry II
+  PCSPSS SYSTEM FILE.  IBM PC DOS, SPSS/PC+
+  PCSPSS SYSTEM FILE.  IBM PC DOS, SPSS/PC+ V3.0
+  PCSPSS SYSTEM FILE.  IBM PC DOS, SPSS for Windows
+  ```
+
+  Thus, it is reasonable to use the presence of the string `SPSS` at
+  offset 0x104 as a simple test for an SPSS/PC+ data file.
+
+* `flt64 sysmis;`  
+  The system-missing value, as described previously.
+
+* `uint16 compressed;`  
+  Set to 0 if the data in the file is not compressed, 1 if the data
+  is compressed with simple bytecode compression.
+
+* `uint16 nominal_case_size;`  
+  Number of data elements per case.  This is the number of variables,
+  except that long string variables add extra data elements (one for
+  every 8 bytes after the first 8).  String variables in SPSS/PC+
+  system files are limited to 255 bytes.
+
+* `uint16 n_cases0;`  
+  `uint16 n_cases1;`  
+  The number of cases in the data record.  Both values are the same.
+  Some files in the corpus contain data for the number of cases noted
+  here, followed by garbage that somewhat resembles data.
+
+* `uint16 weight_index;`  
+  0, if the file is unweighted, otherwise a 1-based index into the
+  data record of the weighting variable, e.g. 4 for the first
+  variable after the 3 system-defined variables.
+
+* `char creation_date[8];`  
+  The date that the file was created, in `mm/dd/yy` format.
+  Single-digit days and months are not prefixed by zeros.  The string
+  is padded with spaces on the right or left or both, e.g. `_2/4/93_`,
+  `10/5/87_`, and `_1/11/88` (with `_` standing in for a space) are
+  all actual examples from the corpus.
+
+* `char creation_time[8];`  
+  The time that the file was created, in `HH:MM:SS` format.
+  Single-digit hours are padded on the left with a space.  Minutes and
+  seconds are always written as two digits.
+
+* `char file_label[64];`  
+  [File label](commands/utilities/file-label.md) declared by the user,
+  if any.  Padded on the right with spaces.
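+
+The fields listed above add up to exactly 0xb0 bytes, which matches the
+record length seen in the corpus.  The Python sketch below unpacks them
+with `struct`; the name `parse_main_header`, the returned dictionary
+keys, and the choice to decode text as ASCII are assumptions for
+illustration.
+
+```python
+import struct
+
+# Field order follows the listing above; "<" means little-endian with no
+# alignment padding.
+MAIN_HEADER = struct.Struct("<H62sdIIHHHHHHHH8s8s64s")
+
+def parse_main_header(record: bytes) -> dict:
+    (one0, product, sysmis, zero0, zero1, one1, compressed,
+     nominal_case_size, n_cases0, weight_index, zero2, n_cases1, zero3,
+     creation_date, creation_time,
+     file_label) = MAIN_HEADER.unpack_from(record)
+    return {
+        "product": product.decode("ascii", "replace").rstrip(),
+        "sysmis": sysmis,
+        "compressed": compressed,
+        "nominal_case_size": nominal_case_size,
+        "n_cases": n_cases0,
+        "weight_index": weight_index,
+        # Dates and times may be padded on either side.
+        "creation_date": creation_date.decode("ascii", "replace").strip(),
+        "creation_time": creation_time.decode("ascii", "replace").strip(),
+        "file_label": file_label.decode("ascii", "replace").rstrip(),
+    }
+```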
+
+## Record 1: Variables Record
+
+The variables record most commonly starts at offset 0x1b0, but it can be
+placed elsewhere.  The record contains instances of the following
+32-byte structure:
+
+```
+uint32              value_label_start;
+uint32              value_label_end;
+uint32              var_label_ofs;
+uint32              format;
+char                name[8];
+union {
+    flt64           f;
+    char            s[8];
+} missing;
+```
+
+The number of instances is the `nominal_case_size` specified in the
+main header record.  There is one instance for each numeric variable
+and each string variable with width 8 bytes or less.  String variables
+wider than 8 bytes have one instance for each 8 bytes, rounding up.
+The first instance for a long string specifies the variable's correct
+dictionary information.  Subsequent instances for a long string are
+generally filled with all-zero bytes, although the `missing` field
+contains the numeric system-missing value, and some writers also fill
+in `var_label_ofs`, `format`, and `name`, sometimes filling the latter
+with the numeric system-missing value rather than a text string.
+Regardless of the values used, readers should ignore the contents of
+these additional instances for long strings.
+
+* `uint32 value_label_start;`  
+  `uint32 value_label_end;`  
+  For a variable with value labels, these specify offsets into the
+  label record of the start and end of this variable's value
+  labels, respectively.  See the [labels
+  record](#record-2-labels-record) for more information.
+
+  For a variable without any value labels, these are both zero.
+
+  A long string variable may not have value labels.
+
+* `uint32 var_label_ofs;`  
+  For a variable with a variable label, this specifies an offset into
+  the label record.  See the [labels record](#record-2-labels-record)
+  for more information.
+
+  For a variable without a variable label, this is zero.
+
+* `uint32 format;`  
+  The variable's output format, in the same format used in system
+  files.  See [Format
+  Types](system-file/variable-record.md#format-types) for details.
+  SPSS/PC+ system files only use format types 5 (F, for numeric
+  variables) and 1 (A, for string variables).
+
+* `char name[8];`  
+  The variable's name, padded on the right with spaces.
+
+* `union { ... } missing;`  
+  A user-missing value.  For numeric variables, `missing.f` is the
+  variable's user-missing value.  For string variables, `missing.s`
+  is a string missing value.  A variable without a user-missing value
+  is indicated with `missing.f` set to the system-missing value, even
+  for string variables (!).  A long string variable may not have a
+  missing value.
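+
+The following Python sketch walks the variables record and skips the
+extra instances that belong to long strings.  It assumes the `format`
+field is packed as in system files (least-significant byte for decimal
+places, then width, then format type); `parse_variables` and the
+dictionary keys are hypothetical names.
+
+```python
+import struct
+
+VARIABLE = struct.Struct("<IIII8s8s")   # one 32-byte instance
+SYSMIS_BYTES = bytes.fromhex("f51e26028a8cedff")
+
+def parse_variables(record: bytes, nominal_case_size: int) -> list:
+    variables = []
+    i = 0
+    while i < nominal_case_size:
+        (vl_start, vl_end, var_label_ofs, fmt,
+         name, missing) = VARIABLE.unpack_from(record, i * VARIABLE.size)
+        fmt_type = (fmt >> 16) & 0xFF   # 5 = F (numeric), 1 = A (string)
+        width = (fmt >> 8) & 0xFF
+        is_string = fmt_type == 1
+        variables.append({
+            "name": name.decode("ascii", "replace").rstrip(),
+            "is_string": is_string,
+            "width": width,
+            "decimals": fmt & 0xFF,
+            "value_labels": (vl_start, vl_end),
+            "var_label_ofs": var_label_ofs,
+            # "No missing value" is the sysmis bit pattern for all types.
+            "missing": None if missing == SYSMIS_BYTES else missing,
+        })
+        # A long string has one instance per 8 bytes, rounding up; the
+        # extra instances are ignored.
+        i += max(1, (width + 7) // 8) if is_string else 1
+    return variables
+```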
+
+In addition to the user-defined variables, every SPSS/PC+ system file
+contains, as its first three variables, the following system-defined
+variables, in the following order.  The system-defined variables have
+no variable label, value labels, or missing values.
+
+* `$CASENUM`  
+  A numeric variable with format `F8.0`.  Most of the time this is a
+  sequence number, starting with 1 for the first case and counting up
+  for each subsequent case.  Some files skip over values, which
+  probably reflects cases that were deleted.
+
+* `$DATE`  
+  A string variable with format `A8`.  Same format (including varying
+  padding) as the `creation_date` field in the [main header
+  record](#record-0-main-header-record).  The actual date can differ
+  from `creation_date` and from record to record.  This may reflect
+  when individual cases were added or updated.
+
+* `$WEIGHT`  
+  A numeric variable with format `F8.2`.  This represents the case's
+  weight; SPSS/PC+ files do not have a user-defined weighting
+  variable.  If weighting has not been enabled, every case has value
+  1.0.
+
+## Record 2: Labels Record
+
+The labels record holds value labels and variable labels.  Unlike the
+other records, it is not meant to be read directly and sequentially.
+Instead, this record must be interpreted one piece at a time, by
+following pointers from the variables record.
+
+The `value_label_start`, `value_label_end`, and `var_label_ofs`
+fields in a variable record are all offsets relative to the beginning of
+the labels record, with an additional 7-byte offset.  That is, if the
+labels record starts at byte offset `labels_ofs` and a variable has a
+given `var_label_ofs`, then the variable label begins at byte offset
+`labels_ofs` + `var_label_ofs` + 7 in the file.
+
+A variable label, starting at the offset indicated by
+`var_label_ofs`, consists of a one-byte length followed by the specified
+number of bytes of the variable label string, like this:
+
+```
+uint8               length;
+char                s[length];
+```
+
+A set of value labels, extending from `value_label_start` to
+`value_label_end` (exclusive), consists of a numeric or string value
+followed by a string in the format just described.  String values are
+padded on the right with spaces to fill the 8-byte field, like this:
+
+```
+union {
+    flt64           f;
+    char            s[8];
+} value;
+uint8               length;
+char                s[length];
+```
+
+The labels record begins with a pair of `uint32` values.  The first of
+these is always 3.  The second is between 8 and 16 less than the number
+of bytes in the record.  Neither value is important for interpreting the
+file.
+
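+The pointer arithmetic above can be sketched in Python as follows.  Here
+`labels` is assumed to hold the raw bytes of the labels record, so the
+7-byte adjustment is applied to indexes within it.  The function names
+are hypothetical, and decoding text as ASCII is an assumption.
+
+```python
+import struct
+
+LABEL_BIAS = 7   # extra offset added to every pointer into the record
+
+def read_string(labels: bytes, pos: int):
+    """Read a length-prefixed string; return (bytes, next position)."""
+    length = labels[pos]
+    end = pos + 1 + length
+    return labels[pos + 1:end], end
+
+def variable_label(labels: bytes, var_label_ofs: int) -> str:
+    text, _ = read_string(labels, var_label_ofs + LABEL_BIAS)
+    return text.decode("ascii", "replace")
+
+def value_labels(labels: bytes, start: int, end: int, is_string: bool):
+    """Return [(value, label), ...] for one variable."""
+    pos, stop = start + LABEL_BIAS, end + LABEL_BIAS
+    result = []
+    while pos < stop:
+        raw = labels[pos:pos + 8]
+        if is_string:
+            value = raw.decode("ascii", "replace").rstrip()
+        else:
+            (value,) = struct.unpack("<d", raw)
+        text, pos = read_string(labels, pos + 8)
+        result.append((value, text.decode("ascii", "replace")))
+    return result
+```
+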
+## Record 3: Data Record
+
+The format of the data record varies depending on the value of
+`compressed` in the main header record:
+
+* 0: no compression  
+  Data is arranged as a series of 8-byte elements, one per variable
+  instance in the [variables record](#record-1-variables-record).
+  Numeric values are given in `flt64` format; string values are
+  literal character strings, padded on the right with spaces when
+  necessary to fill out 8-byte units.
+
+* 1: bytecode compression  
+  The first 8 bytes of the data record are divided into a series of
+  1-byte command codes.  These codes have meanings as described
+  below:
+
+  - 0  
+    The system-missing value.
+
+  - 1  
+    A numeric or string value that is not compressible.  The value
+    is stored in the 8 bytes following the current block of
+    command bytes.  If this value appears twice in a block of
+    command bytes, then it indicates the second group of 8 bytes
+    following the command bytes, and so on.
+
+  - 2 through 255  
+    A number with value CODE - 100, where CODE is the value of the
+    compression code.  For example, code 105 indicates a numeric
+    variable of value 5.
+
+  The end of the 8-byte group of bytecodes is followed by any 8-byte
+  blocks of non-compressible values indicated by code 1.  After that
+  follows another 8-byte group of bytecodes, then those bytecodes'
+  non-compressible values.  The pattern repeats until the number of
+  cases specified by the main header record has been read.
+
+  The corpus does not contain any files with command codes 2 through
+  95, so it is possible that some of these codes are used for special
+  purposes.
+
+Cases of data often, but not always, fill the entire data record.
+Readers should stop reading after the number of cases specified in the
+main header record.  Otherwise, readers may try to interpret garbage
+following the data as additional cases.
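+
+The bytecode scheme described above can be expanded with a short loop.
+The sketch below is illustrative only: `decompress` is a hypothetical
+name, and `n_elements` is assumed to be the number of cases multiplied
+by `nominal_case_size`.  Code 1 yields the raw 8 bytes, since only the
+caller knows whether the element belongs to a numeric or string
+variable.
+
+```python
+import struct
+
+SYSMIS = struct.unpack("<d", bytes.fromhex("f51e26028a8cedff"))[0]
+
+def decompress(data: bytes, n_elements: int) -> list:
+    """Expand compressed data into a list of n_elements values."""
+    out = []
+    pos = 0
+    while len(out) < n_elements:
+        codes = data[pos:pos + 8]       # next group of command codes
+        if not codes:
+            raise ValueError("data record ended early")
+        pos += 8
+        for code in codes:
+            if len(out) == n_elements:
+                break                   # ignore codes past the last case
+            if code == 0:
+                out.append(SYSMIS)
+            elif code == 1:
+                # Literal values follow the code group, in order.
+                out.append(data[pos:pos + 8])
+                pos += 8
+            else:
+                out.append(float(code - 100))
+    return out
+```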
+
+## Records 4 and 5: Data Entry
+
+Records 4 and 5 appear to be related to SPSS/PC+ Data Entry.
+
diff --git a/rust/doc/src/pc+/index.md b/rust/doc/src/pc+/index.md
new file mode 100644
index 0000000000..c7f15adba3
--- /dev/null
+++ b/rust/doc/src/pc+/index.md
@@ -0,0 +1 @@
+# SPSS/PC+ System File Format
diff --git a/rust/doc/src/pc.md b/rust/doc/src/pc.md
new file mode 100644
index 0000000000..c7f15adba3
--- /dev/null
+++ b/rust/doc/src/pc.md
@@ -0,0 +1 @@
+# SPSS/PC+ System File Format
diff --git a/rust/doc/src/portable.md b/rust/doc/src/portable.md
new file mode 100644
index 0000000000..e038d730ee
--- /dev/null
+++ b/rust/doc/src/portable.md
@@ -0,0 +1,340 @@
+# Portable File Format
+
+These days, most computers use the same internal data formats for
+integer and floating-point data, if one ignores little differences like
+big- versus little-endian byte ordering.  However, occasionally it is
+necessary to exchange data between systems with incompatible data
+formats.  This is what portable files are designed to do.
+
+The portable file format is mostly obsolete.  [System
+files](system-file/index.md) are a better alternative.
+
+> This information is gleaned from examination of ASCII-formatted
+portable files only, so some of it may be incorrect for portable files
+formatted in EBCDIC or other character sets.
+
+<!-- toc -->
+
+## Portable File Characters
+
+Portable files are arranged as a series of lines of 80 characters each.
+Each line is terminated by a carriage-return, line-feed sequence
+("new-lines").  New-lines are only used to avoid line length limits
+imposed by some OSes; they are not meaningful.
+
+Most lines in portable files are exactly 80 characters long.  The
+only exception is a line that ends in one or more spaces, in which
+case the spaces may optionally be omitted.  Thus, a portable file
+reader must act as though a line shorter than 80 characters is padded
+to that length with spaces.
+
+The file must be terminated with a `Z` character.  In addition, if
+the final line in the file does not have exactly 80 characters, then it
+is padded on the right with `Z` characters.  (The file contents may be
+in any character set; the file contains a description of its own
+character set, as explained in the next section.  Therefore, the `Z`
+character is not necessarily an ASCII `Z`.)
+
+For the rest of the description of the portable file format,
+new-lines and the trailing `Z`s will be ignored, as if they did not
+exist, because they are not an important part of understanding the file
+contents.
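+
+A reader can undo the line wrapping before parsing anything else.  The
+Python sketch below assumes an ASCII-encoded file, so that a space is
+byte 0x20; `unwrap` is a hypothetical name.  It leaves the trailing `Z`
+characters in place for the data reader to treat as the end-of-file
+marker.
+
+```python
+LINE_WIDTH = 80
+
+def unwrap(raw: bytes, space: bytes = b" ") -> bytes:
+    """Join CR-LF terminated lines, padding short ones to 80 bytes."""
+    lines = raw.split(b"\r\n")
+    if lines and lines[-1] == b"":
+        lines.pop()                     # the final new-line ends the file
+    return b"".join(line.ljust(LINE_WIDTH, space) for line in lines)
+```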
+
+## Portable File Structure
+
+Every portable file consists of the following records, in sequence:
+
+- File header.
+
+- Version and date info.
+
+- Product identification.
+
+- Author identification (optional).
+
+- Subproduct identification (optional).
+
+- Variable count.
+
+- Precision.
+
+- Case weight variable (optional).
+
+- Variables.  Each variable record may optionally be followed by a
+  missing value record and a variable label record.
+
+- Value labels (optional).
+
+- Documents (optional).
+
+- Data.
+
+Most records are identified by a single-character tag code.  The file
+header and version info record do not have a tag.
+
+Other than these single-character codes, there are three types of
+fields in a portable file: floating-point, integer, and string.
+Floating-point fields have the following format:
+
+- Zero or more leading spaces.
+
+- Optional asterisk (`*`), which indicates a missing value.  The
+  asterisk must be followed by a single character, generally a period
+  (`.`), but it appears that other characters may also be possible.
+  This completes the specification of a missing value.
+
+- Optional minus sign (`-`) to indicate a negative number.
+
+- A whole number, consisting of one or more base-30 digits: `0`
+  through `9` plus capital letters `A` through `T`.
+
+- Optional fraction, consisting of a radix point (`.`) followed by
+  one or more base-30 digits.
+
+- Optional exponent, consisting of a plus or minus sign (`+` or `-`)
+  followed by one or more base-30 digits.
+
+- A forward slash (`/`).
+
+Integer fields take a form identical to floating-point fields, but
+they may not contain a fraction.
+
+String fields take the form of an integer field having value N,
+followed by exactly N characters, which are the string content.
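+
+The field grammar above can be parsed as sketched below in Python.  The
+text does not say how the exponent is scaled; this sketch assumes it is
+a power of 30.  It also assumes the input has already been translated to
+ASCII.  The names `parse_float` and `parse_string` are hypothetical, and
+truncated input simply raises `IndexError`.
+
+```python
+DIGITS = "0123456789ABCDEFGHIJKLMNOPQRST"   # base-30 digits
+
+def _digits(text, pos):
+    """Parse base-30 digits; return (value, digit count, next pos)."""
+    start, value = pos, 0
+    while pos < len(text) and text[pos] in DIGITS:
+        value = value * 30 + DIGITS.index(text[pos])
+        pos += 1
+    if pos == start:
+        raise ValueError("expected a base-30 digit")
+    return value, pos - start, pos
+
+def parse_float(text, pos=0):
+    """Return (value, next pos); value is None for a missing value."""
+    while text[pos] == " ":
+        pos += 1
+    if text[pos] == "*":
+        return None, pos + 2            # asterisk plus one character
+    negative = text[pos] == "-"
+    if negative:
+        pos += 1
+    whole, _, pos = _digits(text, pos)
+    value = float(whole)
+    if text[pos] == ".":
+        frac, n, pos = _digits(text, pos + 1)
+        value += frac / 30.0 ** n
+    if text[pos] in "+-":
+        sign = -1 if text[pos] == "-" else 1
+        exp, _, pos = _digits(text, pos + 1)
+        value *= 30.0 ** (sign * exp)   # assumption: exponent is base 30
+    if text[pos] != "/":
+        raise ValueError("expected '/' at end of field")
+    return (-value if negative else value), pos + 1
+
+def parse_string(text, pos=0):
+    """Return (string, next pos) for a string field."""
+    n, pos = parse_float(text, pos)
+    if n is None or n != int(n) or n < 0:
+        raise ValueError("bad string length")
+    return text[pos:pos + int(n)], pos + int(n)
+```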
+
+## Portable File Header
+
+Every portable file begins with a 464-byte header, consisting of a
+200-byte collection of vanity splash strings, followed by a 256-byte
+character set translation table, followed by an 8-byte tag string.
+
+The 200-byte segment is divided into five 40-byte sections, each of
+which represents the string `CHARSET SPSS PORT FILE` in a different
+character set encoding, where `CHARSET` is the name of the character set
+used in the file, e.g. `ASCII` or `EBCDIC`.  Each string is padded on
+the right with spaces in its respective character set.
+
+It appears that these strings exist only to inform those who might
+view the file on a screen, and that they are not parsed by SPSS
+products.  Thus, they can be safely ignored.  For those interested, the
+strings are supposed to be in the following character sets, in the
+specified order: EBCDIC, 7-bit ASCII, CDC 6-bit ASCII, 6-bit ASCII,
+Honeywell 6-bit ASCII.
+
+The 256-byte segment describes a mapping from the character set used
+in the portable file to an arbitrary character set having characters at
+the following positions:
+
+* 0-60: Control characters.  Not important enough to describe in full here.
+
+* 61-63: Reserved.
+
+* 64-73: Digits `0` through `9`.
+
+* 74-99: Capital letters `A` through `Z`.
+
+* 100-125: Lowercase letters `a` through `z`.
+
+* 126: Space.
+
+* 127-130: Symbols `.<(+`
+
+* 131: Solid vertical pipe.
+
+* 132-142: Symbols `&[]!$*);^-/`
+
+* 143: Broken vertical pipe.
+
+* 144-150: Symbols ``,%_>?`:``
+
+* 151: British pound symbol.
+
+* 152-155: Symbols `@'="`.
+
+* 156: Less than or equal symbol.
+
+* 157: Empty box.
+
+* 158: Plus or minus.
+
+* 159: Filled box.
+
+* 160: Degree symbol.
+
+* 161: Dagger.
+
+* 162: Symbol `~`.
+
+* 163: En dash.
+
+* 164: Lower left corner box draw.
+
+* 165: Upper left corner box draw.
+
+* 166: Greater than or equal symbol.
+
+* 167-176: Superscript `0` through `9`.
+
+* 177: Lower right corner box draw.
+
+* 178: Upper right corner box draw.
+
+* 179: Not equal symbol.
+
+* 180: Em dash.
+
+* 181: Superscript `(`.
+
+* 182: Superscript `)`.
+
+* 183: Horizontal dagger (?).
+
+* 184-186: Symbols `{}\`.
+
+* 187: Cents symbol.
+
+* 188: Centered dot, or bullet.
+
+* 189-255: Reserved.
+
+Symbols that are not defined in a particular character set are set to
+the same value as symbol 64; i.e., to `0`.
+
+The 8-byte tag string consists of the exact characters `SPSSPORT` in
+the portable file's character set, which can be used to verify that the
+file is indeed a portable file.
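+
+One way to perform that check without assuming the file's character set
+is to look up each letter of `SPSSPORT` in the translation table.  The
+Python sketch below assumes that entry N of the table holds the file's
+own byte for the character at position N in the list above;
+`check_header` is a hypothetical name.
+
+```python
+HEADER_SIZE = 464   # 200 splash bytes + 256 table bytes + 8 tag bytes
+
+def check_header(header: bytes) -> bytes:
+    """Verify the tag string and return the translation table."""
+    if len(header) < HEADER_SIZE:
+        raise ValueError("file too short for a portable file header")
+    table = header[200:456]
+    tag = header[456:464]
+    # Capital letters A through Z occupy positions 74 through 99.
+    expected = bytes(table[74 + ord(c) - ord("A")] for c in "SPSSPORT")
+    if tag != expected:
+        raise ValueError("not a portable file")
+    return table
+```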
+
+## Version and Date Info Record
+
+This record does not have a tag code.  It has the following structure:
+
+- A single character identifying the file format version.  The letter
+  A represents version 0, and so on.
+
+- An 8-character string field giving the file creation date in the
+  format YYYYMMDD.
+
+- A 6-character string field giving the file creation time in the
+  format HHMMSS.
+
+## Identification Records
+
+The product identification record has tag code `1`.  It consists of a
+single string field giving the name of the product that wrote the
+portable file.
+
+The author identification record has tag code `2`.  It is optional.
+If present, it consists of a single string field giving the name of the
+person who caused the portable file to be written.
+
+The subproduct identification record has tag code `3`.  It is
+optional.  If present, it consists of a single string field giving
+additional information on the product that wrote the portable file.
+
+## Variable Count Record
+
+The variable count record has tag code `4`.  It consists of a single
+integer field giving the number of variables in the file dictionary.
+
+## Precision Record
+
+The precision record has tag code `5`.  It consists of a single integer
+field specifying the maximum number of base-30 digits used in data in
+the file.
+
+## Case Weight Variable Record
+
+The case weight variable record is optional.  If it is present, it
+indicates the variable used for weighting cases; if it is absent, cases
+are unweighted.  It has tag code `6`.  It consists of a single string
+field that names the weighting variable.
+
+## Variable Records
+
+Each variable record represents a single variable.  Variable records
+have tag code `7`.  They have the following structure:
+
+- Width (integer).  This is 0 for a numeric variable, and a number
+  between 1 and 255 for a string variable.
+
+- Name (string).  1-8 characters long.  Must be in all capitals.
+
+  A few portable files that contain duplicate variable names have
+  been spotted in the wild.  PSPP handles these by renaming the
+  duplicates with numeric extensions: `VAR_1`, `VAR_2`, and so on.
+
+- Print format.  This is a set of three integer fields:
+
+  - [Format type](system-file/variable-record.md#format-types) encoded
+    the same as in system files.
+
+  - Format width.  1-40.
+
+  - Number of decimal places.  0-40.
+
+  A few portable files with invalid format types or formats that are
+  not of the appropriate width for their variables have been spotted
+  in the wild.  PSPP assigns a default F or A format to a variable
+  with an invalid format.
+
+- Write format.  Same structure as the print format described above.
+
+Each variable record can optionally be followed by a missing value
+record, which has tag code `8`.  A missing value record has one field,
+the missing value itself (a floating-point or string, as appropriate).
+Up to three of these missing value records can be used.
+
+There is also a record for missing value ranges, which has tag code
+`B`.  It is followed by two fields representing the range, which are
+floating-point or string as appropriate.  If a missing value range is
+present, it may be followed by a single missing value record.
+
+Tag codes `9` and `A` represent `LO THRU X` and `X THRU HI` ranges,
+respectively.  Each is followed by a single field representing X.  If
+one of the ranges is present, it may be followed by a single missing
+value record.
+
+In addition, each variable record can optionally be followed by a
+variable label record, which has tag code `C`.  A variable label record
+has one field, the variable label itself (string).
+
+## Value Label Records
+
+Value label records have tag code `D`.  They have the following format:
+
+- Variable count (integer).
+
+- List of variables (strings).  The variable count specifies the
+  number in the list.  Variables are specified by their names.  All
+  variables must be of the same type (numeric or string), but string
+  variables do not necessarily have the same width.
+
+- Label count (integer).
+
+- List of (value, label) tuples.  The label count specifies the
+  number of tuples.  Each tuple consists of a value, which is numeric
+  or string as appropriate to the variables, followed by a label
+  (string).
+
+A few portable files that specify duplicate value labels, that is,
+two different labels for a single value of a single variable, have been
+spotted in the wild.  PSPP uses the last value label specified in these
+cases.
+
+## Document Record
+
+One document record may optionally follow the value label record.  The
+document record consists of tag code `E`, followed by the number of
+document lines as an integer, followed by that number of strings, each
+of which represents one document line.  Document lines must be 80 bytes
+long or shorter.
+
+## Portable File Data
+
+The data record has tag code `F`.  There is only one tag for all the
+data; thus, all the data must follow the dictionary.  The data is
+terminated by the end-of-file marker `Z`, which is not valid as the
+beginning of a data element.
+
+Data elements are output in the same order as the variable records
+describing them.  String variables are output as string fields, and
+numeric variables are output as floating-point fields.
+
diff --git a/rust/doc/src/system-file/variable-record.md b/rust/doc/src/system-file/variable-record.md
index 433243fde1..603cff8309 100644
--- a/rust/doc/src/system-file/variable-record.md
+++ b/rust/doc/src/system-file/variable-record.md
@@ -127,7 +127,9 @@ without any variables (thus, no data either).
   maximum value in the range.  When a range plus a value are present,
   the third element denotes the additional discrete missing value.
 
-The `print` and `write` members of sysfile_variable are output
+## Format Types
+
+The `print` and `write` members of `sysfile_variable` are output
 formats coded into `int32` types.  The least-significant byte of the
 `int32` represents the number of decimal places, and the next two bytes
 in order of increasing significance represent field width and format
-- 
2.30.2