and have no semantic significance.
@item 00, 01, @dots{}, ff.
-Bytes with fixed values are written in hexadecimal:
+A byte with a fixed value, written as a pair of hexadecimal digits.
@item i0, i1, @dots{}, i9, i10, i11, @dots{}
-32-bit integers with fixed values are written in decimal, prefixed by
+@itemx b0, b1, @dots{}, b9, b10, b11, @dots{}
+A 32-bit integer with a fixed value, written in decimal and prefixed by
-@samp{i}.
+@samp{i} for little-endian byte order or @samp{b} for big-endian.
@item byte
-An arbitrary byte.
+A byte.
@item bool
A byte with value 0 or 1.
@item int16
-An arbitrary 16-bit integer.
+@itemx be16
+A 16-bit integer in little-endian or big-endian byte order,
+respectively.
@item int
-An arbitrary 32-bit integer.
+@itemx be32
+A 32-bit integer in little-endian or big-endian byte order,
+respectively.
+
+@item int64
+@itemx be64
+A 64-bit integer in little-endian or big-endian byte order,
+respectively.
@item double
-An arbitrary 64-bit IEEE floating-point number.
+A 64-bit IEEE floating-point number.
@item float
-An arbitrary 32-bit IEEE floating-point number.
+A 32-bit IEEE floating-point number.
@item string
-A 32-bit integer followed by the specified number of bytes of
-character data. (The encoding is indicated by the Formats
-nonterminal.)
+@itemx bestring
+A 32-bit integer, in little-endian or big-endian byte order,
+respectively, followed by the specified number of bytes of character
+data. (The encoding is indicated by the Formats nonterminal.)
@item @var{x}?
@var{x} is optional, e.g.@: 00? is an optional zero byte.
In a version 3 @file{.bin} member, @var{x}; in version 1, nothing.
@end table
-All integer and floating-point values in this format use little-endian
-byte order.
+Little-endian byte order is far more common in this format, but a few
+pieces of the format use big-endian byte order.
A ``light'' detail member @file{.bin} consists of a number of sections
concatenated together, terminated by a byte 01:
@cartouche
@format
-LightMember @result{} Header Title Caption Footnotes Fonts Formats Dimensions Data 01
+LightMember @result{}
+ Header Title
+ Caption Footnotes
+ Fonts Borders PrintSettings TableSettings Formats
+ Dimensions Data
+ 01
@end format
@end cartouche
* PSV Light Member Caption::
* SPV Light Member Footnotes::
* SPV Light Member Fonts::
+* SPV Light Member Borders::
+* SPV Light Member Print Settings::
+* SPV Light Member Table Settings::
* SPV Light Member Formats::
* SPV Light Member Dimensions::
* SPV Light Member Categories::
@node SPV Light Member Header
@subsection Header
-An SPV file begins with an 39-byte header:
+An SPV light member begins with a 39-byte header:
@cartouche
@format
Header @result{}
01 00
(i1 @math{|} i3)[@t{version}]
- 01 (00 @math{|} 01) byte*21 00 00
- int[@t{table-id}] byte*4
+ 01 bool*4 int
+ int[@t{min-column-width}] int[@t{max-column-width}]
+ int[@t{min-row-height}] int[@t{max-row-height}]
+ int64[@t{table-id}]
@end format
@end cartouche
@code{table-id} is a binary version of the @code{tableId} attribute in
the structure member that refers to the detail member. For example,
-if @code{tableId} is @code{-4154297861994971133}, then @code{table-id}
-would be 0xdca00003.
+if @code{tableId} is @code{-4122591256483201023}, then @code{table-id}
+would be 0xc6c99d183b300001.
The meaning of the other variable parts of the header is not known.
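
The example above can be checked with a short Python sketch. It assumes that @code{table-id} is stored as a signed 64-bit integer in little-endian byte order, as the @code{int64} notation suggests; the function names are illustrative and not part of any existing tool.

```python
import struct

def table_id_from_bytes(raw: bytes) -> int:
    """Decode the 8-byte table-id field (assumed little-endian, signed)."""
    (value,) = struct.unpack("<q", raw)
    return value

def table_id_to_hex(table_id: int) -> str:
    """Render a signed tableId as the unsigned 64-bit hexadecimal value."""
    return "0x%016x" % (table_id & 0xFFFFFFFFFFFFFFFF)

# Bytes in on-disk order for the value 0xc6c99d183b300001.
raw = bytes.fromhex("0100303b189dc9c6")
print(table_id_from_bytes(raw))                # -4122591256483201023
print(table_id_to_hex(-4122591256483201023))   # 0xc6c99d183b300001
```

Masking with 2^64 - 1 maps the negative decimal attribute onto the two's-complement bit pattern found in the header.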
should be the same color. When @code{alternate} is 01, @code{altfg}
and @code{altbg} specify the colors for the alternate rows.
-The meaning of the remaining data is unknown. It seems likely to
-include font sizes, attributes such as bold or italic, and margins.
-
-The table below lists the values observed in the corpus. When a cell
-contains a single value, then 99@math{+}% of the corpus contains that value.
-When a cell contains a pair of values, then the first value is seen in
-about two-thirds of the corpus and the second value in about the
-remaining one-third. In fonts that include multiple pairs, values are
-correlated, that is, for font 3, f5 = 24, f6 = 24, f7 = 2 appears
-about two-thirds of the time, as does the combination of f4 = 0, f6 =
-10 for font 7.
-
-@multitable {font} {40} {f2} {64173} {0/1} {24/11} {10/11} {2/3} {f8}
-@headitem font @tab f1 @tab f2 @tab f3 @tab f4 @tab f5 @tab f6 @tab f7 @tab f8
-@item 1 @tab 40 @tab 1 @tab 0 @tab 0 @tab 8 @tab 10/11 @tab 1 @tab 8
-@item 2 @tab 40 @tab 0 @tab 2 @tab 1 @tab 8 @tab 10/11 @tab 1 @tab 1
-@item 3 @tab 40 @tab 0 @tab 2 @tab 1 @tab 24/11 @tab 24/ 8 @tab 2/3 @tab 4
-@item 4 @tab 40 @tab 0 @tab 2 @tab 3 @tab 8 @tab 10/11 @tab 1 @tab 1
-@item 5 @tab 40 @tab 0 @tab 0 @tab 1 @tab 8 @tab 10/11 @tab 1 @tab 4
-@item 6 @tab 40 @tab 0 @tab 2 @tab 1 @tab 8 @tab 10/11 @tab 1 @tab 4
-@item 7 @tab 40 @tab 0 @tab 64173 @tab 0/1 @tab 8 @tab 10/11 @tab 1 @tab 1
-@item 8 @tab 40 @tab 0 @tab 2 @tab 3 @tab 8 @tab 10/11 @tab 1 @tab 4
-@end multitable
-
-@node SPV Light Member Formats
-@subsection Formats
+@node SPV Light Member Borders
+@subsection Borders
@cartouche
@format
-Formats @result{}
- Borders
- PrintSettings
- TableSettings
- int[@t{n4}] int*[@t{n4}]
- string[@t{encoding}]
- (i0 @math{|} i-1) (00 @math{|} 01) 00 (00 @math{|} 01)
- int
- byte[@t{decimal}] byte[@t{grouping}]
- int[@t{n-ccs}] string*[@t{n-ccs}]
- v1(i0)
- v3(count(count(X5) count(X6)))
-
Borders @result{}
- int[@t{endian}]
- int[@t[n-borders}] Border*[@t{n-borders}]
+ b1[@t{endian}]
+ be32[@t{n-borders}] Border*[@t{n-borders}]
bool[@t{show-grid-lines}]
00 00 00
Border @result{}
- int[@t{border-type}]
- int[@t{stroke-type}]
- int[@t{color}]
-
-PrintSettings @result{}
- int[@t{endian}]
- bool[@t{all-layers}]
- bool[@t{new-layers}]
- bool[@t{fit-width}]
- bool[@t{fit-length}]
- bool[@t{top-continuation}]
- bool[@t{bottom-continuation}]
- int[@t{n-orphan-lines}]
- string[@t{continuation-string}]
-
-TableSettings @result{}
- int[@t{endian}]
- int
- int[@t{current-layer}]
- bool[@t{skip-empty}]
- bool[@t{show-dimension-in-corner}]
- bool[@t{use-alphabetic-markers}]
- bool[@t{footnote-marker-position}]
- v3(
- byte
- int[@t{n}] byte*[@t{n}]
- string
- string[@t{table-look}]
- 00...
- )
-
-X5 @result{} byte*33 int[@t{n}] int*[@t{n}]
-X6 @result{}
- 01 00 (03 @math{|} 04) 00 00 00
- string[@t{command}] string[@t{subcommand}]
- string[@t{language}] string[@t{charset}] string[@t{locale}]
- (00 @math{|} 01) 00 (00 @math{|} 01) (00 @math{|} 01)
- int
- byte[@t{decimal}] byte[@t{grouping}]
- byte*8 01
- (string[@t{dataset}] string[@t{data file}] i0 int i0)?
- int[@t{n-ccs}] string*[@t{n-ccs}]
- 2e (00 @math{|} 01) (i2000000 i0)?
+ be32[@t{border-type}]
+ be32[@t{stroke-type}]
+ be32[@t{color}]
@end format
@end cartouche
-The Borders reflect how borders between regions are drawn. If
-@code{endian} is 1, then values inside Borders, including
-@code{endian} itself, are big-endian, otherwise they are
-little-endian. In practice, they seem to always be big-endian, even
-though the rest of the file is little-endian. @code{n-borders} seems
-to always be 19. @code{show-grid-lines} is 1 to draw grid lines,
-otherwise 0.
+The Borders reflect how borders between regions are drawn.
-Each Border describes one kind of border. Each @code{border-type}
-appears once in order, and they correspond to the following borders:
+The fixed value of @code{endian} can be used to validate the
+endianness.
+
+@code{show-grid-lines} is 1 to draw grid lines, otherwise 0.
+
+Each Border describes one kind of border. @code{n-borders} seems to
+always be 19. Each @code{border-type} appears once in order, and they
+correspond to the following borders:
@table @asis
@item 0
red, 8--15 are green, 0--7 are blue. An alpha of 255 indicates an
opaque color, therefore opaque black is 0xff000000.
-The PrintSettings reflect settings for printing. Like Borders, they
-have independent endianness. The @code{continuation-string} is
-usually empty but it may contain a text string such as ``(cont.)''.
-
-The TableSettings reflect display settings. Like Borders, they
-have independent endianness. @code{current-layer} is the displayed
-layer. @code{use-alphabetic-markers} is 1 to show markers as letters
-(e.g. @samp{a}, @samp{b}, @samp{c}, @dots{}), otherwise they are shown
-as numbers starting from 1. When @code{footnote-marker-position} is
-1, footnote markers are shown as superscripts, otherwise as
-subscripts. @code{table-look} is the name of a SPSS ``TableLook''
-table style, such as ``Default'' or ``Academic''; it is often empty.
+@node SPV Light Member Print Settings
+@subsection Print Settings
+
+@cartouche
+@format
+PrintSettings @result{}
+ b1[@t{endian}]
+ bool[@t{layers}]
+ bool[@t{paginate-layers}]
+ bool[@t{fit-width}]
+ bool[@t{fit-length}]
+ bool[@t{top-continuation}]
+ bool[@t{bottom-continuation}]
+ be32[@t{n-orphan-lines}]
+ bestring[@t{continuation-string}]
+@end format
+@end cartouche
+
+The PrintSettings reflect settings for printing. The fixed value of
+@code{endian} can be used to validate the endianness.
+
+@code{layers} is 1 to print all layers, 0 to print only the visible
+layers.
+
+@code{paginate-layers} is 1 to print each layer at the start of a new
+page, 0 otherwise.
+
+@code{fit-width} and @code{fit-length} control whether the table is
+shrunk to fit within a page's width or length, respectively.
+
+@code{n-orphan-lines} is the minimum number of rows or columns to put
+in one part of a table that is broken across pages.
+
+If @code{top-continuation} is 1, then @code{continuation-string} is
+printed at the top of a page when a table is broken across pages for
+printing; similarly for @code{bottom-continuation} and the bottom of a
+page. Usually, @code{continuation-string} is empty.
+
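As a rough illustration of the grammar above, the following Python sketch reads a PrintSettings section and uses the fixed @code{endian} value as a sanity check. The class and function names are hypothetical, and the sketch assumes the section starts at the given offset; it is not taken from any existing reader.

```python
import struct
from typing import NamedTuple, Tuple

class PrintSettings(NamedTuple):
    layers: bool
    paginate_layers: bool
    fit_width: bool
    fit_length: bool
    top_continuation: bool
    bottom_continuation: bool
    n_orphan_lines: int
    continuation_string: bytes  # still in the member's encoding

def parse_print_settings(buf: bytes, pos: int = 0) -> Tuple[PrintSettings, int]:
    """Return (settings, offset just past the section)."""
    # b1[endian]: a big-endian 32-bit 1, that is, bytes 00 00 00 01.
    (endian,) = struct.unpack_from(">I", buf, pos)
    if endian != 1:
        raise ValueError("unexpected endian marker %#x" % endian)
    pos += 4
    # Six bool fields, one byte each, valued 0 or 1.
    flags = struct.unpack_from("6B", buf, pos)
    if any(flag > 1 for flag in flags):
        raise ValueError("bool field is not 0 or 1")
    pos += 6
    # be32[n-orphan-lines], then a bestring: be32 length plus that many bytes.
    n_orphan, length = struct.unpack_from(">II", buf, pos)
    pos += 8
    text = buf[pos:pos + length]
    if len(text) != length:
        raise ValueError("truncated continuation-string")
    pos += length
    return PrintSettings(*map(bool, flags), n_orphan, text), pos

sample = (b"\x00\x00\x00\x01" + bytes([1, 0, 1, 0, 0, 1])
          + struct.pack(">I", 5) + struct.pack(">I", 7) + b"(cont.)")
settings, end = parse_print_settings(sample)
print(settings.n_orphan_lines, settings.continuation_string, end)
```

A little-endian 1 in the first field decodes as 0x01000000 under this reading, so a reader that has lost its place is likely to fail the check immediately.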
+@node SPV Light Member Table Settings
+@subsection Table Settings
+
+@cartouche
+@format
+TableSettings @result{}
+ b1[@t{endian}]
+ be32
+ be32[@t{current-layer}]
+ bool[@t{omit-empty}]
+ bool[@t{show-row-labels-in-corner}]
+ bool[@t{show-alphabetic-markers}]
+ bool[@t{footnote-marker-position}]
+ v3(
+ byte
+ be32[@t{n}] byte*[@t{n}]
+ bestring
+ bestring[@t{table-look}]
+ 00...
+ )
+@end format
+@end cartouche
+
+The TableSettings reflect display settings. The fixed value of
+@code{endian} can be used to validate the endianness.
+
+@code{current-layer} is the displayed layer.
+
+If @code{omit-empty} is 1, empty rows or columns (ones with nothing in
+any cell) are hidden; otherwise, they are shown.
+
+If @code{show-row-labels-in-corner} is 1, then row labels are shown in
+the upper left corner; otherwise, they are shown nested.
+
+If @code{show-alphabetic-markers} is 1, markers are shown as letters
+(e.g.@: @samp{a}, @samp{b}, @samp{c}, @dots{}); otherwise, they are
+shown as numbers starting from 1.
+
+When @code{footnote-marker-position} is 1, footnote markers are shown
+as superscripts, otherwise as subscripts.
+
+@code{table-look} is the name of an SPSS ``TableLook'' table style,
+such as ``Default'' or ``Academic''; it is often empty.
+
TableSettings ends with an arbitrary number of null bytes.
+@node SPV Light Member Formats
+@subsection Formats
+
+@cartouche
+@format
+Formats @result{}
+ int[@t{n4}] int*[@t{n4}]
+ string[@t{encoding}]
+ (i0 @math{|} i-1) (00 @math{|} 01) 00 (00 @math{|} 01)
+ int
+ byte[@t{decimal}] byte[@t{grouping}]
+ int[@t{n-ccs}] string*[@t{n-ccs}]
+ v1(i0)
+ v3(count(count(X5) count(X6)))
+
+X5 @result{} byte*33 int[@t{n}] int*[@t{n}]
+X6 @result{}
+ 01 00 (03 @math{|} 04) 00 00 00
+ string[@t{command}] string[@t{subcommand}]
+ string[@t{language}] string[@t{charset}] string[@t{locale}]
+ (00 @math{|} 01) 00 (00 @math{|} 01) (00 @math{|} 01)
+ int[@t{epoch}]
+ byte[@t{decimal}] byte[@t{grouping}]
+ (2d 43 1c eb e2 36 1a 3f @math{|} 00*8) 01
+ (string[@t{dataset}] string[@t{datafile}] i0 int[@t{date}] i0)?
+ int[@t{n-ccs}] string*[@t{n-ccs}]
+ 2e (00 @math{|} 01) (i2000000 i0)?
+@end format
+@end cartouche
+
Observed values of @code{n4} vary from 0 to 17. Out of 7,060 examples
in the corpus, it is nonzero only 36 times.
rest of the character strings in the member use this encoding. The
encoding string is itself encoded in US-ASCII.
+@code{epoch} is the year that starts the epoch. A 2-digit year is
+interpreted as belonging to the 100 years beginning at the epoch. The
+default epoch year is 69 years prior to the current year; thus, in
+2017 this field by default contains 1948. In the corpus, @code{epoch}
+ranges from 1943 to 1948, although some members contain -1 instead.
+
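The 2-digit year rule for @code{epoch} can be sketched in Python as follows. The function name is illustrative, and because the meaning of an @code{epoch} of -1 is not stated here, the sketch simply rejects negative values.

```python
def expand_two_digit_year(yy: int, epoch: int) -> int:
    """Map a 2-digit year into the 100 years beginning at `epoch`."""
    if not 0 <= yy <= 99:
        raise ValueError("yy must be in 0..99")
    if epoch < 0:
        # The meaning of -1 is not documented, so it is not handled.
        raise ValueError("unsupported epoch %d" % epoch)
    # Start in the century that contains the epoch year...
    year = epoch - epoch % 100 + yy
    # ...and move to the next century if that falls before the epoch.
    if year < epoch:
        year += 100
    return year

print(expand_two_digit_year(17, 1948))  # 2017
print(expand_two_digit_year(48, 1948))  # 1948
print(expand_two_digit_year(47, 1948))  # 2047
```

With an epoch of 1948, the representable range is 1948 through 2047, which is why 47 maps forward rather than back.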
@code{decimal} is the decimal point character. The observed values
are @samp{.} and @samp{,}.
@samp{'} (apostrophe), @samp{ } (space), and zero (presumably
indicating that digits should not be grouped).
+@code{dataset} is the name of the dataset analyzed to produce the
+output, e.g.@: @code{DataSet1}, and @code{datafile} the name of the
+file it was read from, e.g.@: @file{C:\Users\foo\bar.sav}. The latter
+is sometimes the empty string.
+
+@code{date} is a date, as seconds since the Unix epoch, i.e.@: since
+January 1, 1970. Pivot tables within an SPV file often have dates a
+few minutes apart, so this is probably a creation date for the tables
+rather than for the file.
+
+Sometimes @code{dataset}, @code{datafile}, and @code{date} are present
+and other times they are absent. The reader can distinguish by
+assuming that they are present and then checking whether the
+presumptive @code{dataset} contains a null byte (a valid string never
+will).
+
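The presence test described above might look like the following Python sketch. It assumes the offset points just past the 01 byte that precedes the optional group, and that @code{string} uses a little-endian 32-bit length; the function name is hypothetical. How a zero-length presumptive string should be treated is not stated above, so the sketch applies the null-byte rule literally.

```python
import struct

def dataset_group_present(buf: bytes, pos: int) -> bool:
    """Guess whether dataset/datafile/date follow at `pos`.

    Reads a presumptive `string` (little-endian 32-bit length, then that
    many bytes) and reports whether it is free of null bytes.
    """
    if pos + 4 > len(buf):
        return False
    (length,) = struct.unpack_from("<I", buf, pos)
    text = buf[pos + 4:pos + 4 + length]
    if len(text) != length:
        return False  # the length runs past the end of the data
    return b"\x00" not in text

# Present: the group starts with the string "DataSet1".
present = struct.pack("<I", 8) + b"DataSet1"
# Absent: what follows is n-ccs = 5 and then the first format string,
# so the presumptive string body begins with that string's length bytes.
absent = struct.pack("<I", 5) + struct.pack("<I", 7) + b"-,$,,,-"
print(dataset_group_present(present, 0))  # True
print(dataset_group_present(absent, 0))   # False
```

In the absent case the five bytes taken as string data are 07 00 00 00 2d, which contain nulls, so the guess is rejected.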
@code{n-ccs} is observed as either 0 or 5. When it is 5, the
following strings are CCA through CCE format strings. @xref{Custom
Currency Formats,,, pspp, PSPP}. Most commonly these are all