@defvr {Optional} @code{creation-date-time}
The date and time at which the SPV file was written, in a
-locale-specific format, e.g. @code{Friday, May 16, 2014 6:47:37 PM
+locale-specific format, e.g.@: @code{Friday, May 16, 2014 6:47:37 PM
PDT} or @code{lunedì 17 marzo 2014 3.15.48 CET} or even @code{Friday,
December 5, 2014 5:00:19 o'clock PM EST}.
@end defvr
and have no semantic significance.
@item 00, 01, @dots{}, ff.
-Bytes with fixed values are written in hexadecimal:
+A byte with a fixed value, written as a pair of hexadecimal digits.
@item i0, i1, @dots{}, i9, i10, i11, @dots{}
-32-bit integers with fixed values are written in decimal, prefixed by
+@itemx b0, b1, @dots{}, b9, b10, b11, @dots{}
+A 32-bit integer in little-endian or big-endian byte order,
+respectively, with a fixed value, written in decimal, prefixed by
-@samp{i}.
+@samp{i} or @samp{b}.
@item byte
-An arbitrary byte.
+A byte.
+
+@item bool
+A byte with value 0 or 1.
+
+@item int16
+@itemx be16
+A 16-bit integer in little-endian or big-endian byte order,
+respectively.
@item int
-An arbitrary 32-bit integer.
+@itemx be32
+A 32-bit integer in little-endian or big-endian byte order,
+respectively.
+
+@item int64
+@itemx be64
+A 64-bit integer in little-endian or big-endian byte order,
+respectively.
@item double
-An arbitrary 64-bit IEEE floating-point number.
+A 64-bit IEEE floating-point number.
+
+@item float
+A 32-bit IEEE floating-point number.
@item string
-A 32-bit integer followed by the specified number of bytes of
-character data. (The encoding is indicated by the Formats
-nonterminal.)
+@itemx bestring
+A 32-bit integer, in little-endian or big-endian byte order,
+respectively, followed by the specified number of bytes of character
+data. (The encoding is indicated by the Formats nonterminal.)
@item @var{x}?
@var{x} is optional, e.g.@: 00? is an optional zero byte.
In a version 3 @file{.bin} member, @var{x}; in version 1, nothing.
@end table
-All integer and floating-point values in this format use little-endian
-byte order.
+Little-endian byte order is far more common in this format, but a few
+pieces of the format use big-endian byte order.
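As a rough illustration of the notation above, the following Python sketch reads the @code{int}/@code{be32} and @code{string}/@code{bestring} primitives. The helper names @code{read_int} and @code{read_string} are hypothetical, integers are assumed to be signed, and the sample bytes are made up for the example.

```python
import io
import struct

def read_int(f, big_endian=False):
    """Read a 32-bit integer: "int" (little-endian) or "be32" (big-endian).

    Treating the value as signed is an assumption of this sketch.
    """
    return struct.unpack(">i" if big_endian else "<i", f.read(4))[0]

def read_string(f, big_endian=False, encoding="ascii"):
    """Read "string" or "bestring": a 32-bit length, then that many bytes."""
    n = read_int(f, big_endian)
    return f.read(n).decode(encoding)

# Made-up sample: a little-endian "string" followed by a "bestring".
buf = io.BytesIO(b"\x03\x00\x00\x00abc" + b"\x00\x00\x00\x02hi")
little = read_string(buf)
big = read_string(buf, big_endian=True)
```

In a real member, the @code{encoding} argument would come from the Formats nonterminal rather than being fixed to ASCII.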
A ``light'' detail member @file{.bin} consists of a number of sections
concatenated together, terminated by a byte 01:
@cartouche
@format
-LightMember @result{} Header Title Caption Footnotes Fonts Formats Dimensions Data 01
+LightMember @result{}
+ Header Title
+ Caption Footnotes
+ Fonts Borders PrintSettings TableSettings Formats
+ Dimensions Data
+ 01
@end format
@end cartouche
@menu
* SPV Light Member Header::
* SPV Light Member Title::
-* PSV Light Member Caption::
+* SPV Light Member Caption::
* SPV Light Member Footnotes::
* SPV Light Member Fonts::
+* SPV Light Member Borders::
+* SPV Light Member Print Settings::
+* SPV Light Member Table Settings::
* SPV Light Member Formats::
* SPV Light Member Dimensions::
* SPV Light Member Categories::
@node SPV Light Member Header
@subsection Header
-An SPV file begins with an 39-byte header:
+An SPV light member begins with a 39-byte header:
@cartouche
@format
Header @result{}
01 00
(i1 @math{|} i3)[@t{version}]
- 01 (00 @math{|} 01) byte*21 00 00
- int[@t{table-id}] byte*4
+ 01 bool*4 int
+ int[@t{min-column-width}] int[@t{max-column-width}]
+ int[@t{min-row-width}] int[@t{max-row-width}]
+ int64[@t{table-id}]
@end format
@end cartouche
@code{table-id} is a binary version of the @code{tableId} attribute in
the structure member that refers to the detail member. For example,
-if @code{tableId} is @code{-4154297861994971133}, then @code{table-id}
-would be 0xdca00003.
+if @code{tableId} is @code{-4122591256483201023}, then @code{table-id}
+would be 0xc6c99d183b300001.
+
+@code{min-column-width} is the minimum width that a column will be
+assigned automatically. @code{max-column-width} is the maximum width
+that a column will be assigned to accommodate a long column label.
+@code{min-row-width} and @code{max-row-width} are a similar range for
+the width of row labels. All of these measurements are in 1/96 inch
+units.
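The header layout can be sketched with Python's @code{struct} module. This is only an illustration under stated assumptions: the names @code{HEADER} and @code{parse_header} are hypothetical, integers are assumed to be signed, and the width values in the sample are invented. The @code{table-id} in the sample is the one from the example above.

```python
import struct

# 01 00, version, 01, bool*4, int, four widths, int64 table-id.
# "<" selects little-endian with no padding, so the size is 39 bytes.
HEADER = struct.Struct("<2s i B 4B i 4i q")

def parse_header(data):
    """Parse the 39-byte header of a light member (hypothetical helper)."""
    (magic, version, one, b1, b2, b3, b4, unknown,
     min_col, max_col, min_row, max_row, table_id) = HEADER.unpack(
        data[:HEADER.size])
    if magic != b"\x01\x00" or one != 1 or version not in (1, 3):
        raise ValueError("not a light member header")
    return {
        "version": version,
        # Widths are stored in units of 1/96 inch.
        "min_column_width_in": min_col / 96,
        "max_column_width_in": max_col / 96,
        "min_row_width_in": min_row / 96,
        "max_row_width_in": max_row / 96,
        "table_id": table_id,
        # The same 64 bits viewed as an unsigned hexadecimal number.
        "table_id_hex": hex(table_id & 0xFFFFFFFFFFFFFFFF),
    }

# Invented sample: version 3, widths 36/72/36/120, tableId from the text.
sample = HEADER.pack(b"\x01\x00", 3, 1, 0, 0, 0, 0, 0,
                     36, 72, 36, 120, -4122591256483201023)
hdr = parse_header(sample)
```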
The meaning of the other variable parts of the header is not known.
Title @result{}
Value[@t{title1}] 01?
Value[@t{c}] 01? 31
- Value[@t{title2}] 01? 00? 58
+ Value[@t{title2}] 01?
@end format
@end cartouche
well formatted. For example, for a frequency table, @code{title1} and
@code{title2} name the variable and @code{c} is simply ``Frequencies''.
-@node PSV Light Member Caption
+@node SPV Light Member Caption
@subsection Caption
@cartouche
@format
-Caption @result{} 58 @math{|} 31 Value[@t{caption}]
+Caption @result{} Caption1 Caption2
+Caption1 @result{} 31 Value @math{|} 58
+Caption2 @result{} 31 Value @math{|} 58
@end format
@end cartouche
-The @code{caption}, if presented, is shown below the table.
+The Caption, if present, is shown below the table. Caption2 is
+normally present. Caption1 is only rarely nonempty; it might reflect
+user editing of the caption.
@node SPV Light Member Footnotes
@subsection Footnotes
@format
Fonts @result{} 00 Font*8
Font @result{}
- byte[@t{index}] 31 string[@t{typeface}] 00 00
- (10 @math{|} 20 @math{|} 40 @math{|} 50 @math{|} 70 @math{|} 80)[@t{f1}] 41
- (i0 @math{|} i1 @math{|} i2)[@t{f2}] 00
- (i0 @math{|} i2 @math{|} i64173)[@t{f3}]
- (i0 @math{|} i1 @math{|} i2 @math{|} i3)[@t{f4}]
- string[@t{fgcolor}] string[@t{bgcolor}] i0 i0 00
- v3(int[@t{f5}] int[@t{f6}] int[@t{f7}] int[@t{f8}]))
+ byte[@t{index}] 31
+ string[@t{typeface}] float[@t{size}] int[@t{style}] bool[@t{underline}]
+ int[@t{halign}] int[@t{valign}]
+ string[@t{fgcolor}] string[@t{bgcolor}]
+ byte[@t{alternate}] string[@t{altfg}] string[@t{altbg}]
+ v3(int[@t{left-margin}] int[@t{right-margin}] int[@t{top-margin}] int[@t{bottom-margin}])
@end format
@end cartouche
Each Font represents the font style for a different element, in the
-following order: title, caption, footnote, row labels, column labels,
-corner labels, data, and layers.
+following order: title, caption, footer, corner, column
+labels, row labels, data, and layers.
-@code{index} is the 1-based index of the Font, i.e. 1 for the first
+@code{index} is the 1-based index of the Font, i.e.@: 1 for the first
Font, through 8 for the final Font.
is @code{SansSerif} in over 99% of instances and @code{Times New
Roman} in the rest.
+@code{size} is the size of the font, in points. The most common size
+in the corpus is 12 points.
+
+@code{style} is a bit mask. Bit 0 (with value 1) is set for bold, bit
+1 (with value 2) is set for italic.
+
+@code{underline} is 1 if the font is underlined, 0 otherwise.
+
+@code{halign} specifies horizontal alignment: 0 for center, 2 for
+left, 4 for right, 61453 for decimal, 64173 for mixed. Mixed
+alignment varies according to type: string data is left-justified,
+numbers and most other formats are right-justified.
+
+@code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
+for bottom.
+
@code{fgcolor} and @code{bgcolor} are the foreground color and
background color, respectively. In the corpus, these are always
@code{#000000} and @code{#ffffff}, respectively.
-The meaning of the remaining data is unknown. It seems likely to
-include font sizes, horizontal and vertical alignment, attributes such
-as bold or italic, and margins.
-
-The table below lists the values observed in the corpus. When a cell
-contains a single value, then 99@math{+}% of the corpus contains that value.
-When a cell contains a pair of values, then the first value is seen in
-about two-thirds of the corpus and the second value in about the
-remaining one-third. In fonts that include multiple pairs, values are
-correlated, that is, for font 3, f5 = 24, f6 = 24, f7 = 2 appears
-about two-thirds of the time, as does the combination of f4 = 0, f6 =
-10 for font 7.
-
-@multitable {font} {40} {f2} {64173} {0/1} {24/11} {10/11} {2/3} {f8}
-@headitem font @tab f1 @tab f2 @tab f3 @tab f4 @tab f5 @tab f6 @tab f7 @tab f8
-@item 1 @tab 40 @tab 1 @tab 0 @tab 0 @tab 8 @tab 10/11 @tab 1 @tab 8
-@item 2 @tab 40 @tab 0 @tab 2 @tab 1 @tab 8 @tab 10/11 @tab 1 @tab 1
-@item 3 @tab 40 @tab 0 @tab 2 @tab 1 @tab 24/11 @tab 24/ 8 @tab 2/3 @tab 4
-@item 4 @tab 40 @tab 0 @tab 2 @tab 3 @tab 8 @tab 10/11 @tab 1 @tab 1
-@item 5 @tab 40 @tab 0 @tab 0 @tab 1 @tab 8 @tab 10/11 @tab 1 @tab 4
-@item 6 @tab 40 @tab 0 @tab 2 @tab 1 @tab 8 @tab 10/11 @tab 1 @tab 4
-@item 7 @tab 40 @tab 0 @tab 64173 @tab 0/1 @tab 8 @tab 10/11 @tab 1 @tab 1
-@item 8 @tab 40 @tab 0 @tab 2 @tab 3 @tab 8 @tab 10/11 @tab 1 @tab 4
-@end multitable
+@code{alternate} is 01 if rows should alternate colors, 00 if all rows
+should be the same color. When @code{alternate} is 01, @code{altfg}
+and @code{altbg} specify the colors for the alternate rows.
+
+@code{left-margin}, @code{right-margin}, @code{top-margin}, and
+@code{bottom-margin} are measured in multiples of 1/96 inch.
+
+@node SPV Light Member Borders
+@subsection Borders
+
+@cartouche
+@format
+Borders @result{}
+ b1[@t{endian}]
+ be32[@t{n-borders}] Border*[@t{n-borders}]
+ bool[@t{show-grid-lines}]
+ 00 00 00
+
+Border @result{}
+ be32[@t{border-type}]
+ be32[@t{stroke-type}]
+ be32[@t{color}]
+@end format
+@end cartouche
+
+The Borders reflect how borders between regions are drawn.
+
+The fixed value of @code{endian} can be used to validate the
+endianness.
+
+@code{show-grid-lines} is 1 to draw grid lines, otherwise 0.
+
+Each Border describes one kind of border. @code{n-borders} seems to
+always be 19.  Each @code{border-type} appears once (although in an
+unpredictable order) and corresponds to one of the following borders:
+
+@table @asis
+@item 0
+Title.
+@item 1@dots{}4
+Left, top, right, and bottom outer frame.
+@item 5@dots{}8
+Left, top, right, and bottom inner frame.
+@item 9, 10
+Left and top of data area.
+@item 11, 12
+Horizontal and vertical dimension rows.
+@item 13, 14
+Horizontal and vertical dimension columns.
+@item 15, 16
+Horizontal and vertical category rows.
+@item 17, 18
+Horizontal and vertical category columns.
+@end table
+
+@code{stroke-type} describes how a border is drawn, as one of:
+
+@table @asis
+@item 0
+No line.
+@item 1
+Solid line.
+@item 2
+Dashed line.
+@item 3
+Thick line.
+@item 4
+Thin line.
+@item 5
+Double line.
+@end table
+
+@code{color} is an RGB color with an alpha channel.  Bits 24--31 are
+alpha, bits 16--23 are red, 8--15 are green, 0--7 are blue.  An alpha
+of 255 indicates an opaque color, so opaque black is 0xff000000.
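The bit layout of @code{color} can be unpacked with shifts and masks. The following Python sketch assumes the value has already been read as an unsigned 32-bit integer; the function name @code{decode_color} is hypothetical.

```python
def decode_color(color):
    """Split a 32-bit color into (alpha, red, green, blue) components."""
    alpha = (color >> 24) & 0xFF   # bits 24-31
    red = (color >> 16) & 0xFF     # bits 16-23
    green = (color >> 8) & 0xFF    # bits 8-15
    blue = color & 0xFF            # bits 0-7
    return alpha, red, green, blue

opaque_black = decode_color(0xFF000000)
```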
+
+@node SPV Light Member Print Settings
+@subsection Print Settings
+
+@cartouche
+@format
+PrintSettings @result{}
+ b1[@t{endian}]
+ bool[@t{all-layers}]
+ bool[@t{paginate-layers}]
+ bool[@t{fit-width}]
+ bool[@t{fit-length}]
+ bool[@t{top-continuation}]
+ bool[@t{bottom-continuation}]
+ be32[@t{n-orphan-lines}]
+ bestring[@t{continuation-string}]
+@end format
+@end cartouche
+
+The PrintSettings reflect settings for printing. The fixed value of
+@code{endian} can be used to validate the endianness.
+
+@code{all-layers} is 1 to print all layers, 0 to print only the
+visible layers.
+
+@code{paginate-layers} is 1 to print each layer at the start of a new
+page, 0 otherwise.  (This setting is honored only if @code{all-layers} is
+1, since otherwise only one layer is printed.)
+
+@code{fit-width} and @code{fit-length} control whether the table is
+shrunk to fit within a page's width or length, respectively.
+
+@code{n-orphan-lines} is the minimum number of rows or columns to put
+in one part of a table that is broken across pages.
+
+If @code{top-continuation} is 1, then @code{continuation-string} is
+printed at the top of a page when a table is broken across pages for
+printing; similarly for @code{bottom-continuation} and the bottom of a
+page. Usually, @code{continuation-string} is empty.
+
+@node SPV Light Member Table Settings
+@subsection Table Settings
+
+@cartouche
+@format
+TableSettings @result{}
+ be32[@t{endian}]
+ be32
+ be32[@t{current-layer}]
+ bool[@t{omit-empty}]
+ bool[@t{show-row-labels-in-corner}]
+ bool[@t{show-alphabetic-markers}]
+ bool[@t{footnote-marker-position}]
+ v3(
+ byte
+ be32[@t{n}] byte*[@t{n}]
+ bestring[@t{notes}]
+ bestring[@t{table-look}]
+ 00...
+ )
+@end format
+@end cartouche
+
+The TableSettings reflect display settings. The fixed value of
+@code{endian} can be used to validate the endianness.
+
+@code{current-layer} is the displayed layer.
+
+If @code{omit-empty} is 1, empty rows or columns (ones with nothing in
+any cell) are hidden; otherwise, they are shown.
+
+If @code{show-row-labels-in-corner} is 1, then row labels are shown in
+the upper left corner; otherwise, they are shown nested.
+
+If @code{show-alphabetic-markers} is 1, markers are shown as letters
+(e.g.@: @samp{a}, @samp{b}, @samp{c}, @dots{}); otherwise, they are
+shown as numbers starting from 1.
+
+When @code{footnote-marker-position} is 1, footnote markers are shown
+as superscripts, otherwise as subscripts.
+
+@code{notes} is a text string that contains user-specified notes. It
+is displayed when the user hovers the cursor over the table, like
+``alt text'' on a webpage. It is not printed. It is usually empty.
+
+@code{table-look} is the name of an SPSS ``TableLook'' table style,
+such as ``Default'' or ``Academic''; it is often empty.
+
+TableSettings ends with an arbitrary number of null bytes.
@node SPV Light Member Formats
@subsection Formats
@cartouche
@format
Formats @result{}
- int[@t{n1}] byte*[@t{n1}]
- int[@t{n2}] byte*[@t{n2}]
- int[@t{n3}] byte*[@t{n3}]
- int[@t{n4}] int*[@t{n4}]
+ int[@t{nwidths}] int*[@t{nwidths}]
string[@t{encoding}]
- (i0 @math{|} i-1) (00 @math{|} 01) 00 (00 @math{|} 01)
- int
+ int (00 @math{|} 01) 00 (00 @math{|} 01)
+ int[@t{epoch}]
byte[@t{decimal}] byte[@t{grouping}]
- int[@t{n-ccs}] string*[@t{n-ccs}]
+ CustomCurrency
v1(i0)
v3(count(count(X5) count(X6)))
+CustomCurrency @result{} int[@t{n-ccs}] string*[@t{n-ccs}]
+
X5 @result{} byte*33 int[@t{n}] int*[@t{n}]
X6 @result{}
01 00 (03 @math{|} 04) 00 00 00
string[@t{command}] string[@t{subcommand}]
string[@t{language}] string[@t{charset}] string[@t{locale}]
- (00 @math{|} 01) 00 (00 @math{|} 01) (00 @math{|} 01)
- int
+ (00 @math{|} 01) 00 bool bool
+ int[@t{epoch}]
byte[@t{decimal}] byte[@t{grouping}]
- byte*8 01
- (string[@t{dataset}] string[@t{datafile}] i0 int i0)?
- int[@t{n-ccs}] string*[@t{n-ccs}]
- 2e (00 @math{|} 01) (i2000000 i0)?
+ double[@t{small}] 01
+ (string[@t{dataset}] string[@t{datafile}] i0 int[@t{date}] i0)?
+ CustomCurrency
+ byte[@t{missing}] bool (i2000000 i0)?
@end format
@end cartouche
-In every example in the corpus, @code{n1} is 240. The meaning of the
-bytes that follow it is unknown.
-
-In every example in the corpus, @code{n2} is 18 and the bytes that
-follow it are @code{00 00 00 01 00 00 00 00 00 00 00 00 00 02 00 00 00
-00}. The meaning of these bytes is unknown.
-
-In every example in the corpus for version 1, @code{n3} is 16 and the
-bytes that follow it are @code{00 00 00 01 00 00 00 01 00 00 00 00 01
-01 01 01}. In version 3, observed @code{n3} varies from 117 to 150,
-and its bytes include a 1-byte count at offset 0x34. When the count
-is nonzero, a text string of that length at offset 0x35 is the name of
-a ``TableLook'', e.g. ``Default'' or ``Academic''.
-
-Observed values of @code{n4} vary from 0 to 17. Out of 7,060 examples
-in the corpus, it is nonzero only 36 times.
+If @code{nwidths} is nonzero, then the accompanying integers are
+column widths as manually adjusted by the user. (Row heights are
+computed automatically based on the widths.)
@code{encoding} is a character encoding, usually a Windows code page
such as @code{en_US.windows-1252} or @code{it_IT.windows-1252}. The
rest of the character strings in the member use this encoding. The
encoding string is itself encoded in US-ASCII.
+@code{epoch} is the year that starts the epoch. A 2-digit year is
+interpreted as belonging to the 100 years beginning at the epoch. The
+default epoch year is 69 years prior to the current year; thus, in
+2017 this field by default contains 1948.  In the corpus, @code{epoch}
+ranges from 1943 to 1948, except that some members contain -1.
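The epoch rule can be written out as a small Python sketch. The function names are hypothetical, and the sketch does not attempt to interpret an @code{epoch} of -1, whose meaning is not stated here.

```python
def default_epoch(current_year):
    """Default epoch: 69 years prior to the current year."""
    return current_year - 69

def expand_two_digit_year(yy, epoch):
    """Map a 2-digit year into the 100 years beginning at `epoch`."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year")
    # The offset from the epoch always falls in the range 0..99.
    return epoch + (yy - epoch) % 100

epoch_2017 = default_epoch(2017)
```

With an epoch of 1948, the 2-digit years 48 through 99 map to 1948 through 1999 and 00 through 47 map to 2000 through 2047.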
+
@code{decimal} is the decimal point character. The observed values
are @samp{.} and @samp{,}.
@samp{'} (apostrophe), @samp{ } (space), and zero (presumably
indicating that digits should not be grouped).
+@code{dataset} is the name of the dataset analyzed to produce the
+output, e.g.@: @code{DataSet1}, and @code{datafile} the name of the
+file it was read from, e.g.@: @file{C:\Users\foo\bar.sav}. The latter
+is sometimes the empty string.
+
+@code{date} is a date, as seconds since the epoch, i.e.@: since
+January 1, 1970.  Pivot tables within an SPV file often have dates a
+few minutes apart, so this is probably a creation date for the tables
+rather than for the file.
+
+Sometimes @code{dataset}, @code{datafile}, and @code{date} are present
+and other times they are absent. The reader can distinguish by
+assuming that they are present and then checking whether the
+presumptive @code{dataset} contains a null byte (a valid string never
+will).
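The null-byte heuristic described above might be sketched in Python as follows. Everything here is illustrative: the helper names are hypothetical, the sample bytes are invented, and the sketch applies only the null-byte check from the text, so it cannot tell an empty @code{dataset} apart from an @code{n-ccs} of 0.

```python
import io
import struct

def enc(raw):
    """Encode bytes as a little-endian "string" (used for the samples)."""
    return struct.pack("<i", len(raw)) + raw

def try_read_string(f):
    """Read a "string" as raw bytes, or return None if it cannot be read."""
    head = f.read(4)
    if len(head) < 4:
        return None
    (n,) = struct.unpack("<i", head)
    if n < 0:
        return None
    data = f.read(n)
    return data if len(data) == n else None

def read_dataset_group(f):
    """Return (dataset, datafile, date), or None if the group is absent.

    On None, the stream position is restored so parsing can continue
    with CustomCurrency.
    """
    start = f.tell()
    dataset = try_read_string(f)
    if dataset is None or b"\x00" in dataset:
        f.seek(start)
        return None
    datafile = try_read_string(f)
    tail = f.read(12)  # i0 int[date] i0
    if datafile is None or len(tail) < 12:
        f.seek(start)
        return None
    _zero1, date, _zero2 = struct.unpack("<iii", tail)
    return dataset, datafile, date

# Invented samples: one with the group present, one starting at n-ccs = 5.
present = io.BytesIO(enc(b"DataSet1") + enc(b"") +
                     struct.pack("<iii", 0, 1400000000, 0))
absent = io.BytesIO(struct.pack("<i", 5) + enc(b"-,,,") * 5)
found = read_dataset_group(present)
missing = read_dataset_group(absent)
```

In the second sample the presumptive @code{dataset} is the five bytes that follow the count, which include the null bytes of the next string's length, so the group is reported as absent.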
+
@code{n-ccs} is observed as either 0 or 5. When it is 5, the
following strings are CCA through CCE format strings. @xref{Custom
Currency Formats,,, pspp, PSPP}. Most commonly these are all
ValueMod @result{}
31 i0 (i0 @math{|} i1 string[@t{subscript}])
v1(00 (i1 @math{|} i2) 00 00 int 00 00)
- v3(count(FormatString Style ValueModUnknown))
- @math{|} 31 i1 int[@t{footnote-number}] Format
- @math{|} 31 i2 (00 @math{|} 01 @math{|} 02) 00 (i1 @math{|} i2 @math{|} i3) Format
- @math{|} 31 i3 00 00 01 00 i2 Format
+ v3(count(FormatString
+ (31 Style @math{|} 58)
+ (31 Style2 @math{|} 58)))
+ @math{|} 31 int[@t{n-refs}] int16*[@t{n-refs}] Format
@math{|} 58
-Style @result{} 58 @math{|} 31 01? 00? 00? 00? 01 string[@t{fgcolor}] string[@t{bgcolor}] string[@t{typeface}] byte
+
Format @result{} 00 00 count(FormatString Style 58)
-FormatString @result{} count((i0 (58 @math{|} 31 string))?)
-ValueModUnknown @result{} 58 @math{|} 31 i0 i0 i0 i0 01 00 (01 @math{|} 02 @math{|} 08) 00 08 00 0a 00)
+FormatString @result{} count((count((i0 58)?) (58 @math{|} 31 string))?)
+
+Style @result{}
+ bool[@t{bold}] bool[@t{italic}] bool[@t{underline}] bool[@t{show}]
+ string[@t{fgcolor}] string[@t{bgcolor}]
+ string[@t{typeface}] byte[@t{size}]
+
+Style2 @result{}
+ int[@t{halign}] int[@t{valign}] double[@t{offset}]
+ int16[@t{left-margin}] int16[@t{right-margin}]
+ int16[@t{top-margin}] int16[@t{bottom-margin}]
@end format
@end cartouche
-The @code{footnote-number}, if present, specifies a footnote that the
-Value references. The footnote's marker is shown appended to the main
-text of the Value, as a superscript.
+A ValueMod that begins with ``31 i0'' specifies a string to append to
+the main text of the Value, as a subscript. The subscript text is a
+brief indicator, e.g.@: @samp{a} or @samp{a,b}, with its meaning
+indicated by the table caption. In this usage, subscripts are similar
+to footnotes. One apparent difference is that a Value can only
+reference one footnote but a subscript can list more than one letter.
-The @code{subscript}, if present, specifies a string to append to the
-main text of the Value, as a subscript. The subscript text is a brief
-indicator, e.g.@: @samp{a} or @samp{a,b}, with its meaning indicated
-by the table caption. In this usage, subscripts are similar to
-footnotes; one apparent difference is that a Value can only reference
-one footnote but a subscript can list more than one letter.
+A ValueMod that begins with 31 followed by a nonzero ``int'' specifies
+a footnote or footnotes that the Value references. Footnote markers
+are shown appended to the main text of the Value, as superscripts.
The Format, if present, is a format string for substitutions using the
syntax explained previously. It appears to be an English-language
version of the localized format string in the Value in which the
Format is nested.
-The Style, if present, changes the style for this individual Value.
+Style and Style2, if present, change the style for this individual
+Value. @code{bold}, @code{italic}, and @code{underline} control the
+particular style. @code{fgcolor} and @code{bgcolor} are strings, such
+as @code{#ffffff}. The @code{size} is a font size in units of 1/96
+inch.
+
+@code{halign} is 0 for center, 2 for left, 4 for right, 6 for decimal,
+0xffffffad for mixed. For decimal alignment, @code{offset} is the
+decimal point's offset from the right side of the cell, in units of
+1/72 inch.
+
+@code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
+for bottom.
+
+@code{left-margin}, @code{right-margin}, @code{top-margin}, and
+@code{bottom-margin} are in units of 1/72 inch.
@node SPV Legacy Detail Member Binary Format
@section Legacy Detail Member Binary Format