1 @c PSPP - a program for statistical analysis.
2 @c Copyright (C) 2019 Free Software Foundation, Inc.
3 @c Permission is granted to copy, distribute and/or modify this document
4 @c under the terms of the GNU Free Documentation License, Version 1.3
5 @c or any later version published by the Free Software Foundation;
6 @c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
7 @c A copy of the license is included in the section entitled "GNU
8 @c Free Documentation License".
11 @node SPSS Viewer File Format
12 @appendix SPSS Viewer File Format
14 SPSS Viewer or @file{.spv} files, here called SPV files, are written
15 by SPSS 16 and later to represent the contents of its output editor.
This appendix documents the format, based on examination of a corpus of
17 about 8,000 files from a variety of sources. This description is
18 detailed enough to both read and write SPV files.
20 SPSS 15 and earlier versions instead use @file{.spo} files, which have
21 a completely different output format based on the Microsoft Compound
22 Document Format. This format is not documented here.
24 An SPV file is a Zip archive that can be read with @command{zipinfo}
25 and @command{unzip} and similar programs. The final member in the Zip
26 archive is the @dfn{manifest}, a file named
27 @file{META-INF/MANIFEST.MF}. This structure makes SPV files resemble
28 Java ``JAR'' files (and ODF files), but whereas a JAR manifest
29 contains a sequence of colon-delimited key/value pairs, an SPV
30 manifest contains the string @samp{allowPivoting=true}, without a
31 new-line. PSPP uses this string to identify an SPV file; it is
32 invariant across the corpus.@footnote{SPV files always begin with the
33 7-byte sequence 50 4b 03 04 14 00 08, but this is not a useful magic
34 number because most Zip archives start the same way.}
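
The identification rule above can be sketched in a few lines of Python
using the standard @code{zipfile} module.  This is only an illustration,
not PSPP's implementation: the function name @code{looks_like_spv} is
hypothetical, and requiring an exact match on the manifest contents is
an assumption drawn from the observation that the string is invariant
across the corpus.

```python
import zipfile

MANIFEST_NAME = "META-INF/MANIFEST.MF"
MANIFEST_MAGIC = b"allowPivoting=true"   # no trailing new-line

def looks_like_spv(source):
    """Return True if `source` (a path or a binary file object) is a Zip
    archive whose manifest holds exactly the SPV marker string."""
    try:
        with zipfile.ZipFile(source) as archive:
            # archive.read() raises KeyError if the member is missing.
            return archive.read(MANIFEST_NAME) == MANIFEST_MAGIC
    except (zipfile.BadZipFile, KeyError):
        return False
```

A more lenient reader could instead test whether the manifest merely
starts with the marker string.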
36 The rest of the members in an SPV file's Zip archive fall into two
37 categories: @dfn{structure} and @dfn{detail} members. Structure
38 member names begin with @file{outputViewer@var{nnnnnnnnnn}}, where
39 each @var{n} is a decimal digit, and end with @file{.xml}, and often
40 include the string @file{_heading} in between. Each of these members
41 represents some kind of output item (a table, a heading, a block of
42 text, etc.) or a group of them. The member whose output goes at the
43 beginning of the document is numbered 0, the next member in the output
44 is numbered 1, and so on.
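
As an illustration of the naming rule just described, the following
Python sketch picks the structure members out of a list of Zip member
names and puts them in document order.  The regular expression and the
function name @code{structure_members} are assumptions made for this
example; they encode only what the paragraph above states.

```python
import re

# Assumed pattern: "outputViewer", ten decimal digits, anything, ".xml".
STRUCTURE_RE = re.compile(r"outputViewer(\d{10}).*\.xml")

def structure_members(names):
    """Return the structure member names found in `names`, ordered by
    their embedded number (0 first)."""
    numbered = []
    for name in names:
        match = STRUCTURE_RE.fullmatch(name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]
```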
46 Structure members contain XML. This XML is sometimes self-contained,
47 but it often references detail members in the Zip archive, which are
51 @item @file{@var{prefix}_table.xml} and @file{@var{prefix}_tableData.bin}
52 @itemx @file{@var{prefix}_lightTableData.bin}
53 The structure of a table plus its data. Older SPV files pair a
54 @file{@var{prefix}_table.xml} file that describes the table's
55 structure with a binary @file{@var{prefix}_tableData.bin} file that
56 gives its data. Newer SPV files (the majority of those in the corpus)
57 instead include a single @file{@var{prefix}_lightTableData.bin} file
58 that incorporates both into a single binary format.
60 @item @file{@var{prefix}_warning.xml} and @file{@var{prefix}_warningData.bin}
61 @itemx @file{@var{prefix}_lightWarningData.bin}
62 Same format used for tables, with a different name.
64 @item @file{@var{prefix}_notes.xml} and @file{@var{prefix}_notesData.bin}
65 @itemx @file{@var{prefix}_lightNotesData.bin}
66 Same format used for tables, with a different name.
68 @item @file{@var{prefix}_chartData.bin} and @file{@var{prefix}_chart.xml}
69 The structure of a chart plus its data. Charts do not have a
72 @item @file{@var{prefix}_pmml.scf}
73 @itemx @file{@var{prefix}_stats.scf}
74 @item @file{@var{prefix}_model.xml}
75 Not yet investigated. The corpus contains few examples.
78 The @file{@var{prefix}} in the names of the detail members is
79 typically an 11-digit decimal number that increases for each item,
80 tending to skip values. Older SPV files use different naming
conventions. Structure members refer to detail members by name, so
the exact names do not matter to readers as long as they are unique.
84 SPSS tolerates corrupted Zip archives that Zip reader libraries tend
85 to reject. These can be fixed up with @command{zip -FF}.
88 * SPV Structure Member Format::
89 * SPV Light Detail Member Format::
90 * SPV Legacy Detail Member Binary Format::
91 * SPV Legacy Detail Member XML Format::
94 @node SPV Structure Member Format
95 @section Structure Member Format
97 A structure member lays out the high-level structure for a group of
output items such as headings, tables, and charts. Structure members
99 do not include the details of tables and charts but instead refer to
100 them by their member names.
102 Structure members' XML files claim conformance with a collection of
103 XML Schemas. These schemas are distributed, under a nonfree license,
104 with SPSS binaries. Fortunately, the schemas are not necessary to
105 understand the structure members. The schemas can even
106 be deceptive because they document elements and attributes that are
107 not in the corpus and do not document elements and attributes that are
108 commonly found in the corpus.
110 Structure members use a different XML namespace for each schema, but
111 these namespaces are not entirely consistent. In some SPV files, for
112 example, the @code{viewer-tree} schema is associated with namespace
113 @indicateurl{http://xml.spss.com/spss/viewer-tree} and in others with
114 @indicateurl{http://xml.spss.com/spss/viewer/viewer-tree} (note the
115 additional @file{viewer/}). Under either name, the schema URIs are
116 not resolvable to obtain the schemas themselves.
118 One may ignore all of the above in interpreting a structure member.
119 The actual XML has a simple and straightforward form that does not
120 require a reader to take schemas or namespaces into account. A
structure member's root is a @code{heading} element, which contains
122 @code{heading} or @code{container} elements (or a mix), forming a
123 tree. In turn, @code{container} holds a @code{label} and one more
124 child, usually @code{text} or @code{table}.
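
The tree shape described above can be walked without consulting any
schema.  The following Python sketch, using the standard
@code{xml.etree.ElementTree} module, is one possible approach rather
than PSPP's own parser.  The helper names @code{local_name} and
@code{outline} are hypothetical, and discarding the namespace part of
each tag is a simplification based on the advice above that namespaces
may be ignored.

```python
import xml.etree.ElementTree as ET

def local_name(element):
    """Drop the namespace prefix that ElementTree puts in front of a tag."""
    return element.tag.rsplit("}", 1)[-1]

def outline(element, depth=0):
    """Yield (depth, kind, label) for each heading and container,
    in document order."""
    kind = local_name(element)
    if kind not in ("heading", "container"):
        return
    label = next((child.text or "" for child in element
                  if local_name(child) == "label"), "")
    yield depth, kind, label
    for child in element:
        yield from outline(child, depth + 1)
```

Calling @code{list(outline(ET.fromstring(xml_text)))} on the text of a
structure member gives a flat outline of its headings and containers.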
126 The following sections document the elements found in structure
127 members in a context-free grammar-like fashion. Consider the
128 following example, which specifies the attributes and content for the
129 @code{container} element:
133 :visibility=(visible | hidden)
134 :page-break-before=(always)?
135 :text-align=(left | center)?
137 => label (table | container_text | graph | model | object | image | tree)
140 Each attribute specification begins with @samp{:} followed by the
141 attribute's name. If the attribute's value has an easily specified
form, then @samp{=} and its description follow the name. Finally, if
143 the attribute is optional, the specification ends with @samp{?}. The
144 following value specifications are defined:
147 @item (@var{a} | @var{b} | @dots{})
148 One of the listed literal strings. If only one string is listed, it
149 is the only acceptable value. If @code{OTHER} is listed, then any
150 string not explicitly listed is also accepted.
153 Either @code{true} or @code{false}.
156 A floating-point number followed by a unit, e.g.@: @code{10pt}. Units
157 in the corpus include @code{in} (inch), @code{pt} (points, 72/inch),
158 @code{px} (``device-independent pixels'', 96/inch), and @code{cm}. If
159 the unit is omitted then points should be assumed. The number and
160 unit may be separated by white space.
162 The corpus also includes localized names for units. A reader must
163 understand these to properly interpret the dimension:
167 @code{인치}, @code{pol.}, @code{cala}, @code{cali}
177 A floating-point number.
183 A color in one of the forms @code{#@var{rr}@var{gg}@var{bb}} or
184 @code{@var{rr}@var{gg}@var{bb}}, or the string @code{transparent}, or
185 one of the standard Web color names.
188 @item ref @var{element}
189 @itemx ref(@var{elem1} | @var{elem2} | @dots{})
190 The name from the @code{id} attribute in some element. If one or more
191 elements are named, the name must refer to one of those elements,
192 otherwise any element is acceptable.
195 All elements have an optional @code{id} attribute. If present, its
196 value must be unique. In practice many elements are assigned
197 @code{id} attributes that are never referenced.
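
To make the @code{dimension} value specification above concrete, here
is a small Python sketch that converts a dimension string to points.
It handles only the four unit names spelled out above and treats a
missing unit as points.  Localized unit names are deliberately left out,
so a real reader would need to extend the table.  The names
@code{POINTS_PER_UNIT} and @code{dimension_to_points} are hypothetical.

```python
import re

# Points per unit: pt is 72/inch and px is 96/inch, as stated above.
POINTS_PER_UNIT = {
    "": 1.0,            # unit omitted: assume points
    "pt": 1.0,
    "in": 72.0,
    "px": 72.0 / 96.0,
    "cm": 72.0 / 2.54,
}

# A number, optional white space, then an optional unit.
DIMENSION_RE = re.compile(r"\s*([-+]?(?:\d+\.?\d*|\.\d+))\s*(\S*)\s*")

def dimension_to_points(text):
    """Convert a dimension such as '10pt' or '0.25 in' to points."""
    match = DIMENSION_RE.fullmatch(text)
    if match is None or match.group(2) not in POINTS_PER_UNIT:
        raise ValueError("unsupported dimension: %r" % text)
    return float(match.group(1)) * POINTS_PER_UNIT[match.group(2)]
```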
199 The content specification for an element supports the following
206 @item @var{a} @var{b}
207 @var{a} followed by @var{b}.
209 @item @var{a} | @var{b} | @var{c}
210 One of @var{a} or @var{b} or @var{c}.
213 Zero or one instances of @var{a}.
216 Zero or more instances of @var{a}.
219 One or more instances of @var{a}.
221 @item (@var{subexpression})
222 Grouping for a subexpression.
231 Element and attribute names are sometimes suffixed by another name in
232 square brackets to distinguish different uses of the same name. For
233 example, structure XML has two @code{text} elements, one inside
234 @code{container}, the other inside @code{pageParagraph}. The former
235 is defined as @code{text[container_text]} and referenced as
236 @code{container_text}, the latter defined as
237 @code{text[pageParagraph_text]} and referenced as
238 @code{pageParagraph_text}.
240 This language is used in the PSPP source code for parsing structure
241 and detail XML members. Refer to
242 @file{src/output/spv/structure-xml.grammar} and
243 @file{src/output/spv/detail-xml.grammar} for the full grammars.
245 The following example shows the contents of a typical structure member
246 for a @cmd{DESCRIPTIVES} procedure. A real structure member is not
247 indented. This example also omits most attributes, all XML namespace
248 information, and the CSS from the embedded HTML:
251 <?xml version="1.0" encoding="utf-8"?>
253 <label>Output</label>
254 <heading commandName="Descriptives">
255 <label>Descriptives</label>
258 <text commandName="Descriptives" type="title">
260 <![CDATA[<head><style type="text/css">...</style></head><BR>Descriptives]]>
264 <container visibility="hidden">
266 <table commandName="Descriptives" subType="Notes" type="note">
268 <dataPath>00000000001_lightNotesData.bin</dataPath>
273 <label>Descriptive Statistics</label>
274 <table commandName="Descriptives" subType="Descriptive Statistics"
277 <dataPath>00000000002_lightTableData.bin</dataPath>
286 * SPV Structure heading Element::
287 * SPV Structure label Element::
288 * SPV Structure container Element::
289 * SPV Structure text Element (Inside @code{container})::
290 * SPV Structure html Element::
291 * SPV Structure table Element::
292 * SPV Structure graph Element::
293 * SPV Structure model Element::
294 * SPV Structure tree Element::
295 * SPV Structure Path Elements::
296 * SPV Structure pageSetup Element::
297 * SPV Structure @code{text} Element (Inside @code{pageParagraph})::
300 @node SPV Structure heading Element
301 @subsection The @code{heading} Element
304 heading[root_heading]
310 => label pageSetup? (container | heading)*
315 :visibility[heading_visibility]=(collapsed)?
318 => label (container | heading)*
321 The root of a structure member is a @code{heading}, which represents a
322 section of output beginning with a title (the @code{label}) and
323 ordinarily followed by content containers or further nested
324 (sub)-sections of output. Unlike heading elements in HTML and other
325 common document formats, which precede the content that they head,
326 @code{heading} contains the elements that appear below the heading.
The document root heading, and only it, may contain a @code{pageSetup} element.
331 The following attributes have been observed on both document root and
332 nested @code{heading} elements.
334 @defvr {Attribute} creator-version
335 The version of the software that created this SPV file. A string of
336 the form @code{xxyyzzww} represents software version xx.yy.zz.ww,
337 e.g.@: @code{21000001} is version 21.0.0.1. Trailing pairs of zeros
338 are sometimes omitted, so that @code{21}, @code{210000}, and
339 @code{21000000} are all version 21.0.0.0 (and the corpus contains all
340 three of those forms).
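
A reader might decode this attribute along the following lines.  This
Python sketch pads omitted trailing pairs with zeros, as described
above.  Rejecting strings with an odd number of digits is an assumption
of this example, and the function name @code{parse_creator_version} is
hypothetical.

```python
def parse_creator_version(text):
    """Turn a creator-version string such as '21000001' into a tuple of
    four integers, e.g. (21, 0, 0, 1)."""
    if (not text.isascii() or not text.isdigit()
            or len(text) > 8 or len(text) % 2):
        raise ValueError("unexpected creator-version: %r" % text)
    padded = text.ljust(8, "0")      # restore omitted trailing zero pairs
    return tuple(int(padded[i:i + 2]) for i in range(0, 8, 2))
```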
344 The following attributes have been observed on document root
345 @code{heading} elements only:
347 @defvr {Attribute} @code{creator}
348 The directory in the file system of the software that created this SPV
352 @defvr {Attribute} @code{creation-date-time}
353 The date and time at which the SPV file was written, in a
354 locale-specific format, e.g.@: @code{Friday, May 16, 2014 6:47:37 PM
355 PDT} or @code{lunedì 17 marzo 2014 3.15.48 CET} or even @code{Friday,
356 December 5, 2014 5:00:19 o'clock PM EST}.
359 @defvr {Attribute} @code{lockReader}
360 Whether a reader should be allowed to edit the output. The possible
361 values are @code{true} and @code{false}. The value @code{false} is by
365 @defvr {Attribute} @code{schemaLocation}
366 This is actually an XML Namespace attribute. A reader may ignore it.
370 The following attributes have been observed only on nested
371 @code{heading} elements:
373 @defvr {Attribute} @code{commandName}
374 A locale-invariant identifier for the command that produced the
375 output, e.g.@: @code{Frequencies}, @code{T-Test}, @code{Non Par Corr}.
378 @defvr {Attribute} @code{visibility}
379 To what degree the output represented by the element is visible.
382 @defvr {Attribute} @code{locale}
383 The locale used for output, in Windows format, which is similar to the
384 format used in Unix with the underscore replaced by a hyphen, e.g.@:
@code{en-US}, @code{en-GB}, @code{el-GR}, @code{sr-Cyrl-RS}.
388 @defvr {Attribute} @code{olang}
389 The output language, e.g.@: @code{en}, @code{it}, @code{es},
390 @code{de}, @code{pt-BR}.
393 @node SPV Structure label Element
394 @subsection The @code{label} Element
400 Every @code{heading} and @code{container} holds a @code{label} as its
401 first child. The root @code{heading} in a structure member always
402 contains the string ``Output'' (localized). Otherwise, the text in
403 @code{label} describes what it labels, often by naming the statistical
404 procedure that was executed, e.g.@: ``Frequencies'' or ``T-Test''.
405 Labels are often very generic, especially within a @code{container},
406 e.g.@: ``Title'' or ``Warnings'' or ``Notes''. Label text is
407 localized according to the output language, e.g.@: in Italian a
408 frequency table procedure is labeled ``Frequenze''.
410 The corpus contains a few examples of empty labels, ones that contain
413 @node SPV Structure container Element
414 @subsection The @code{container} Element
418 :visibility=(visible | hidden)
419 :page-break-before=(always)?
420 :text-align=(left | center)?
422 => label (table | container_text | graph | model | object | image | tree)
425 A @code{container} serves to contain and label a @code{table},
426 @code{text}, or other kind of item.
428 This element has the following attributes.
430 @defvr {Attribute} @code{visibility}
431 Whether the container's content is displayed. ``Notes'' tables are
432 often hidden; other data is usually
435 @defvr {Attribute} @code{text-align}
436 Alignment of text within the container. Observed with nested
437 @code{table} and @code{text} elements.
440 @defvr {Attribute} @code{width}
441 The width of the container, e.g.@: @code{1097px}.
444 @node SPV Structure text Element (Inside @code{container})
445 @subsection The @code{text} Element (Inside @code{container})
449 :type[text_type]=(title | log | text | page-title)
455 This @code{text} element is nested inside a @code{container}. There
456 is a different @code{text} element that is nested inside a
457 @code{pageParagraph}.
459 This element has the following attributes.
461 @defvr {Attribute} @code{type}
462 The semantics of the text.
465 @defvr {Attribute} @code{commandName}
466 As on the @code{heading} element. For output not specific to a
467 command, this is simply @code{log}. The corpus contains one example
468 of where @code{commandName} is present but set to the empty string.
471 @defvr {Attribute} @code{creator-version}
472 As on the @code{heading} element.
475 @node SPV Structure html Element
476 @subsection The @code{html} Element
479 html :lang=(en) => TEXT
482 The element contains an HTML document as text (or, in practice, as
483 CDATA). In some cases, the document starts with @code{<html>} and
484 ends with @code{</html>}; in others the @code{html} element is
485 implied. Generally the HTML includes a @code{head} element with a CSS
486 stylesheet. The HTML body often begins with @code{<BR>}.
488 The HTML document uses only the following elements:
492 Sometimes, the document is enclosed with
493 @code{<html>}@dots{}@code{</html>}.
496 The HTML body often begins with @code{<BR>} and may contain it as well.
504 The attributes @code{face}, @code{color}, and @code{size} are
505 observed. The value of @code{color} takes one of the forms
506 @code{#@var{rr}@var{gg}@var{bb}} or @code{rgb (@var{r}, @var{g},
507 @var{b})}. The value of @code{size} is a number between 1 and 7,
511 The CSS in the corpus is simple. To understand it, a parser only
512 needs to be able to skip white space, @code{<!--}, and @code{-->}, and
513 parse style only for @code{p} elements. Only @code{font-weight},
514 @code{font-style}, @code{font-decoration}, @code{font-family}, and
515 @code{font-size} matter.
517 This element has the following attributes.
519 @defvr {Attribute} @code{lang}
520 This always contains @code{en} in the corpus.
523 @node SPV Structure table Element
524 @subsection The @code{table} Element
533 :displayFiltering=bool?
535 :orphanTolerance=int?
540 :type[table_type]=(table | note | warning)
541 => tableProperties? tableStructure
543 tableStructure => path? dataPath csvPath?
546 This element has the following attributes.
548 @defvr {Attribute} @code{commandName}
549 As on the @code{heading} element.
552 @defvr {Attribute} @code{type}
553 One of @code{table}, @code{note}, or @code{warning}.
556 @defvr {Attribute} @code{subType}
557 The locale-invariant command ID for the particular kind of output that
558 this table represents in the procedure. This can be the same as
559 @code{commandName} e.g.@: @code{Frequencies}, or different, e.g.@:
560 @code{Case Processing Summary}. Generic subtypes @code{Notes} and
561 @code{Warnings} are often used.
564 @defvr {Attribute} @code{tableId}
565 A number that uniquely identifies the table within the SPV file,
566 typically a large negative number such as @code{-4147135649387905023}.
569 @defvr {Attribute} @code{creator-version}
570 As on the @code{heading} element. In the corpus, this is only present
571 for version 21 and up and always includes all 8 digits.
574 @xref{SPV Detail Legacy Properties}, for details on the
575 @code{tableProperties} element.
577 @node SPV Structure graph Element
578 @subsection The @code{graph} Element
593 => dataPath? path csvPath?
596 This element represents a graph. The @code{dataPath} and @code{path}
597 elements name the Zip members that give the details of the graph.
598 Normally, both elements are present; there is only one counterexample
601 @code{csvPath} only appears in one SPV file in the corpus, for two
602 graphs. In these two cases, @code{dataPath}, @code{path}, and
@code{csvPath} all appear. These @code{csvPath} elements name Zip members with
604 names of the form @file{@var{number}_csv.bin}, where @var{number} is a
605 many-digit number and the same as the @code{csvFileIds}. The named
606 Zip members are CSV text files (despite the @file{.bin} extension).
The CSV files are encoded in UTF-8 and begin with a U+FEFF byte-order mark.
610 @node SPV Structure model Element
611 @subsection The @code{model} Element
623 => ViZml? dataPath? path | pmmlContainerPath statsContainerPath
625 pmmlContainerPath => TEXT
627 statsContainerPath => TEXT
629 ViZml :viewName? => TEXT
632 This element represents a model. The @code{dataPath} and @code{path}
633 elements name the Zip members that give the details of the model.
634 Normally, both elements are present; there is only one counterexample
637 The details are unexplored. The @code{ViZml} element contains base-64
encoded text that decodes to a binary format with some embedded text
strings, and @code{path} names a Zip member that contains XML.
640 Alternatively, @code{pmmlContainerPath} and @code{statsContainerPath}
641 name Zip members with @file{.scf} extension.
643 @node SPV Structure tree Element
644 @subsection The @code{tree} Element
655 This element represents a tree. The @code{dataPath} and @code{path}
656 elements name the Zip members that give the details of the tree.
657 The details are unexplored.
659 @node SPV Structure Path Elements
660 @subsection Path Elements
These elements contain the names of the Zip members that hold details
671 for a container. For tables:
675 When a ``light'' format is used, only @code{dataPath} is present, and
676 it names a @file{.bin} member of the Zip file that has @code{light} in
677 its name, e.g.@: @code{0000000001437_lightTableData.bin} (@pxref{SPV
678 Light Detail Member Format}).
681 When the legacy format is used, both are present. In this case,
682 @code{dataPath} names a Zip member with a legacy binary format that
683 contains relevant data (@pxref{SPV Legacy Detail Member Binary
684 Format}), and @code{path} names a Zip member that uses an XML format
685 (@pxref{SPV Legacy Detail Member XML Format}).
688 Graphs normally follow the legacy approach described above. The
689 corpus contains one example of a graph with @code{path} but not
690 @code{dataPath}. The reason is unexplored.
Models use @code{path} but not @code{dataPath}. @xref{SPV Structure
model Element}, for more information.
695 These elements have no attributes.
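
For tables, the rule above amounts to checking which path elements are
present.  The following Python sketch shows one way to do that for a
parsed @code{tableStructure} element.  The function name
@code{table_detail_members} and the shape of its return value are
hypothetical, and tags are compared with any namespace prefix removed.

```python
import xml.etree.ElementTree as ET

def local_name(element):
    """Drop the namespace prefix that ElementTree puts in front of a tag."""
    return element.tag.rsplit("}", 1)[-1]

def table_detail_members(table_structure):
    """Return (format, data_member, xml_member) for a tableStructure
    element, where format is 'light' or 'legacy'."""
    paths = {local_name(child): (child.text or "").strip()
             for child in table_structure}
    data_member = paths.get("dataPath")
    xml_member = paths.get("path")
    if data_member is None:
        raise ValueError("tableStructure has no dataPath")
    if xml_member is None:
        return ("light", data_member, None)    # single .bin member
    return ("legacy", data_member, xml_member)  # .bin plus .xml
```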
697 @node SPV Structure pageSetup Element
698 @subsection The @code{pageSetup} Element
702 :initial-page-number=int?
703 :chart-size=(as-is | full-height | half-height | quarter-height | OTHER)?
704 :margin-left=dimension?
705 :margin-right=dimension?
706 :margin-top=dimension?
707 :margin-bottom=dimension?
708 :paper-height=dimension?
709 :paper-width=dimension?
710 :reference-orientation?
711 :space-after=dimension?
712 => pageHeader pageFooter
714 pageHeader => pageParagraph?
716 pageFooter => pageParagraph?
718 pageParagraph => pageParagraph_text
721 The @code{pageSetup} element has the following attributes.
723 @defvr {Attribute} @code{initial-page-number}
724 The page number to put on the first page of printed output. Usually
728 @defvr {Attribute} @code{chart-size}
One of the listed, self-explanatory chart sizes, such as
@code{quarter-height}, or a localization (!) of one of these (e.g.@:
731 @code{dimensione attuale}, @code{Wie vorgegeben}).
734 @defvr {Attribute} @code{margin-left}
735 @defvrx {Attribute} @code{margin-right}
736 @defvrx {Attribute} @code{margin-top}
737 @defvrx {Attribute} @code{margin-bottom}
738 Margin sizes, e.g.@: @code{0.25in}.
741 @defvr {Attribute} @code{paper-height}
742 @defvrx {Attribute} @code{paper-width}
746 @defvr {Attribute} @code{reference-orientation}
747 Indicates the orientation of the output page. Either @code{0deg}
(portrait) or @code{90deg} (landscape).
751 @defvr {Attribute} @code{space-after}
752 The amount of space between printed objects, typically @code{12pt}.
755 @node SPV Structure @code{text} Element (Inside @code{pageParagraph})
756 @subsection The @code{text} Element (Inside @code{pageParagraph})
759 text[pageParagraph_text] :type=(title | text) => TEXT
762 This @code{text} element is nested inside a @code{pageParagraph}. There
is a different @code{text} element that is nested inside a @code{container}.
766 The element is either empty, or contains CDATA that holds almost-XHTML
767 text: in the corpus, either an @code{html} or @code{p} element. It is
768 @emph{almost}-XHTML because the @code{html} element designates the
770 @indicateurl{http://xml.spss.com/spss/viewer/viewer-tree} instead of
771 an XHTML namespace, and because the CDATA can contain substitution
772 variables. The following variables are supported:
777 The current date or time in the preferred format for the locale.
783 First-, second-, third-, or fourth-level heading.
789 Name of the output file.
795 @code{&[Page]} for the page number and @code{&[PageTitle]} for the
798 Typical contents (indented for clarity):
801 <html xmlns="http://xml.spss.com/spss/viewer/viewer-tree">
804 <p style="text-align:right; margin-top: 0">Page &[Page]</p>
809 This element has the following attributes.
811 @defvr {Attribute} @code{type}
815 @node SPV Light Detail Member Format
816 @section Light Detail Member Format
818 This section describes the format of ``light'' detail @file{.bin}
819 members. These members have a binary format which we describe here in
820 terms of a context-free grammar using the following conventions:
823 @item NonTerminal @result{} @dots{}
824 Nonterminals have CamelCaps names, and @result{} indicates a
825 production. The right-hand side of a production is often broken
826 across multiple lines. Break points are chosen for aesthetics only
827 and have no semantic significance.
829 @item 00, 01, @dots{}, ff.
A byte with a fixed value, written as a pair of hexadecimal digits.
832 @item i0, i1, @dots{}, i9, i10, i11, @dots{}
833 @itemx ib0, ib1, @dots{}, ib9, ib10, ib11, @dots{}
834 A 32-bit integer in little-endian or big-endian byte order,
835 respectively, with a fixed value, written in decimal. Prefixed by
836 @samp{i} for little-endian or @samp{ib} for big-endian.
842 A byte with value 0 or 1.
846 A 16-bit unsigned integer in little-endian or big-endian byte order,
851 A 32-bit unsigned integer in little-endian or big-endian byte order,
856 A 64-bit unsigned integer in little-endian or big-endian byte order,
860 A 64-bit IEEE floating-point number.
863 A 32-bit IEEE floating-point number.
867 A 32-bit unsigned integer, in little-endian or big-endian byte order,
868 respectively, followed by the specified number of bytes of character
869 data. (The encoding is indicated by the Formats nonterminal.)
872 @var{x} is optional, e.g.@: 00? is an optional zero byte.
874 @item @var{x}*@var{n}
875 @var{x} is repeated @var{n} times, e.g.@: byte*10 for ten arbitrary bytes.
877 @item @var{x}[@var{name}]
878 Gives @var{x} the specified @var{name}. Names are used in textual
879 explanations. They are also used, also bracketed, to indicate counts,
880 e.g.@: @code{int32[n] byte*[n]} for a 32-bit integer followed by the
881 specified number of arbitrary bytes.
883 @item @var{a} @math{|} @var{b}
884 Either @var{a} or @var{b}.
887 Parentheses are used for grouping to make precedence clear, especially
888 in the presence of @math{|}, e.g.@: in 00 (01 @math{|} 02 @math{|} 03)
892 @itemx becount(@var{x})
893 A 32-bit unsigned integer, in little-endian or big-endian byte order,
894 respectively, that indicates the number of bytes in @var{x}, followed
898 In a version 1 @file{.bin} member, @var{x}; in version 3, nothing.
899 (The @file{.bin} header indicates the version.)
902 In a version 3 @file{.bin} member, @var{x}; in version 1, nothing.
905 PSPP uses this grammar to parse light detail members. See
906 @file{src/output/spv/light-binary.grammar} in the PSPP source tree for
909 Little-endian byte order is far more common in this format, but a few
910 pieces of the format use big-endian byte order.
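
As a rough illustration of the primitive types above, the following
Python sketch reads an @code{int32} and a length-prefixed string from a
byte buffer using the standard @code{struct} module.  The function
names are hypothetical, and the default UTF-8 encoding is an assumption
for the example only, since the real encoding is indicated by the
Formats nonterminal.

```python
import struct

def read_int32(data, offset, big_endian=False):
    """Read a 32-bit unsigned integer; return (value, next_offset)."""
    fmt = ">I" if big_endian else "<I"
    (value,) = struct.unpack_from(fmt, data, offset)
    return value, offset + 4

def read_string(data, offset, big_endian=False, encoding="utf-8"):
    """Read a 32-bit length followed by that many bytes of text;
    return (text, next_offset)."""
    length, offset = read_int32(data, offset, big_endian)
    end = offset + length
    if end > len(data):
        raise ValueError("string runs past the end of the data")
    return data[offset:end].decode(encoding), end
```

Returning the next offset from each helper lets a caller chain reads in
the order that a production lists its parts.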
912 Light detail members express linear units in two ways: points (pt), at
913 72/inch, and ``device-independent pixels'' (px), at 96/inch. To
914 convert from pt to px, multiply by 1.33 and round up. To convert
915 from px to pt, divide by 1.33 and round down.
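
The two conversions can be written directly from the rule above.  This
Python sketch uses the factor 1.33 exactly as stated, rather than the
exact ratio 96/72.  The function names are hypothetical.

```python
import math

def pt_to_px(pt):
    """Convert points to device-independent pixels, rounding up."""
    return math.ceil(pt * 1.33)

def px_to_pt(px):
    """Convert device-independent pixels to points, rounding down."""
    return math.floor(px / 1.33)
```

With these definitions 9 pt becomes 12 px and 12 px converts back to
9 pt, so the two roundings are chosen to undo each other in common
cases.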
917 A ``light'' detail member @file{.bin} consists of a number of sections
918 concatenated together, terminated by an optional byte 01:
922 Header Titles Footnotes
923 Areas Borders PrintSettings TableSettings Formats
924 Dimensions Axes Cells
928 The following sections go into more detail.
931 * SPV Light Member Header::
932 * SPV Light Member Titles::
933 * SPV Light Member Footnotes::
934 * SPV Light Member Areas::
935 * SPV Light Member Borders::
936 * SPV Light Member Print Settings::
937 * SPV Light Member Table Settings::
938 * SPV Light Member Formats::
939 * SPV Light Member Dimensions::
940 * SPV Light Member Categories::
941 * SPV Light Member Axes::
942 * SPV Light Member Cells::
943 * SPV Light Member Value::
944 * SPV Light Member ValueMod::
947 @node SPV Light Member Header
950 An SPV light member begins with a 39-byte header:
955 (i1 @math{|} i3)[version]
958 bool[rotate-inner-column-labels]
959 bool[rotate-outer-row-labels]
962 int32[min-col-width] int32[max-col-width]
963 int32[min-row-width] int32[max-row-width]
967 @code{version} is a version number that affects the interpretation of
968 some of the other data in the member. We will refer to ``version 1''
969 and ``version 3'' later on and use v1(@dots{}) and v3(@dots{}) for
970 version-specific formatting (as described previously).
972 If @code{rotate-inner-column-labels} is 1, then column labels closest
973 to the data are rotated to be vertical; otherwise, they are shown
976 If @code{rotate-outer-row-labels} is 1, then row labels farthest from
977 the data are rotated to be vertical; otherwise, they are shown in the
980 @code{table-id} is a binary version of the @code{tableId} attribute in
981 the structure member that refers to the detail member. For example,
982 if @code{tableId} is @code{-4122591256483201023}, then @code{table-id}
983 would be 0xc6c99d183b300001.
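
The correspondence in the example above is the usual 64-bit two's
complement mapping between a signed decimal number and its unsigned
bit pattern.  The following Python sketch converts in both directions.
The function names are hypothetical.

```python
def table_id_to_unsigned(table_id):
    """Map a tableId attribute value (signed decimal, as text or int)
    to the unsigned 64-bit pattern stored in the header."""
    return int(table_id) & 0xFFFFFFFFFFFFFFFF

def table_id_to_signed(raw):
    """Map the unsigned 64-bit header value back to the signed number
    used by the tableId attribute."""
    return raw - (1 << 64) if raw & (1 << 63) else raw
```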
985 @code{min-col-width} is the minimum width that a column will be
986 assigned automatically. @code{max-col-width} is the maximum width
987 that a column will be assigned to accommodate a long column label.
988 @code{min-row-width} and @code{max-row-width} are a similar range for
989 the width of row labels. All of these measurements are in 1/96 inch
990 units (called a ``device independent pixel'' unit in Windows).
992 The meaning of the other variable parts of the header is not known. A
993 writer may safely use version 3, true for @code{x0}, false for
994 @code{x1}, true for @code{x2}, and 0x15 for @code{x3}.
996 @node SPV Light Member Titles
1002 Value[subtype] 01? 31
1003 Value[user-title] 01?
1004 (31 Value[corner-text] @math{|} 58)
1005 (31 Value[caption] @math{|} 58)
1008 The Titles follow the Header and specify the table's title, caption,
1011 The @code{user-title} is shown above the title and reflects any user
1012 editing of the title text or style. The @code{title} is the title
1013 originally generated by the procedure. Both of these are appropriate
1014 for presentation and localized to the user's language. For example,
1015 for a frequency table, @code{title} and @code{user-title} normally
name the variable and @code{subtype} is simply ``Frequencies''.
1018 @code{subtype} is the same as the @code{subType} attribute in the
1019 @code{table} structure XML element that referred to this member.
1020 @xref{SPV Structure table Element}, for details.
1022 The @code{corner-text}, if present, is shown in the upper-left corner
1023 of the table, above the row headings and to the left of the column
1024 headings. It is usually absent. Corner text prevents row dimension
1025 labels from being displayed above the dimension's group and category
1026 labels (see @code{show-row-labels-in-corner}).
1028 The @code{caption}, if present, is shown below the table.
1029 @code{caption} reflects user editing of the caption.
1031 @node SPV Light Member Footnotes
1032 @subsection Footnotes
1035 Footnotes => int32[n-footnotes] Footnote*[n-footnotes]
1036 Footnote => Value[text] (58 @math{|} 31 Value[marker]) int32[show]
1039 Each footnote has @code{text} and an optional custom @code{marker}
1042 @code{show} is a 32-bit signed integer. It is positive to show the
1043 footnote or negative to hide it. Its magnitude is often 1, and in
1044 other cases tends to be the number of references to the footnote.
1046 @node SPV Light Member Areas
1053 string[typeface] float[size] int32[style] bool[underline]
1054 int32[halign] int32[valign]
1055 string[fg-color] string[bg-color]
1056 bool[alternate] string[alt-fg-color] string[alt-bg-color]
1057 v3(int32[left-margin] int32[right-margin] int32[top-margin] int32[bottom-margin])
1060 Each Area represents the style for a different area of the table, in
1061 the following order: title, caption, footer, corner, column labels,
1062 row labels, data, and layers.
1064 @code{index} is the 1-based index of the Area, i.e. 1 for the first
1065 Area, through 8 for the final Area.
1067 @code{typeface} is the string name of the font used in the area. In
1068 the corpus, this is @code{SansSerif} in over 99% of instances and
1069 @code{Times New Roman} in the rest.
1071 @code{size} is the size of the font, in px (@pxref{SPV Light Detail
1072 Member Format}). The most common size in the corpus is 12 px. Even
1073 though @code{size} has a floating-point type, in the corpus its values
1074 are always integers.
1076 @code{style} is a bit mask. Bit 0 (with value 1) is set for bold, and bit
1077 1 (with value 2) is set for italic.
1079 @code{underline} is 1 if the font is underlined, 0 otherwise.
1081 @code{halign} specifies horizontal alignment: 0 for center, 2 for
1082 left, 4 for right, 61453 for decimal, 64173 for mixed. Mixed
1083 alignment varies according to type: string data is left-justified,
1084 numbers and most other formats are right-justified.
1086 @code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
1089 @code{fg-color} and @code{bg-color} are the foreground color and
1090 background color, respectively. In the corpus, these are always
1091 @code{#000000} and @code{#ffffff}, respectively.
1093 @code{alternate} is 1 if rows should alternate colors, 0 if all rows
1094 should be the same color. When @code{alternate} is 1,
1095 @code{alt-fg-color} and @code{alt-bg-color} specify the colors for the
1096 alternate rows; otherwise they are empty strings.
1098 @code{left-margin}, @code{right-margin}, @code{top-margin}, and
1099 @code{bottom-margin} are measured in px.
1101 @node SPV Light Member Borders
1108 be32[n-borders] Border*[n-borders]
1109 bool[show-grid-lines]
1118 The Borders reflect how borders between regions are drawn.
1120 The fixed value of @code{endian} can be used to validate the
1123 @code{show-grid-lines} is 1 to draw grid lines, otherwise 0.
1125 Each Border describes one kind of border. @code{n-borders} seems to
1126 always be 19. Each @code{border-type} appears once (although in an
1127 unpredictable order) and corresponds to the following borders:
1133 Left, top, right, and bottom outer frame.
1135 Left, top, right, and bottom inner frame.
1137 Left and top of data area.
1139 Horizontal and vertical dimension rows.
1141 Horizontal and vertical dimension columns.
1143 Horizontal and vertical category rows.
1145 Horizontal and vertical category columns.
1148 @code{stroke-type} describes how a border is drawn, as one of:
1165 @code{color} is an RGB color with an alpha channel. Bits 24--31 are alpha, bits 16--23 are
1166 red, 8--15 are green, 0--7 are blue. An alpha of 255 indicates an
1167 opaque color, therefore opaque black is 0xff000000.
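The bit layout above can be illustrated with a minimal sketch that splits a
32-bit @code{color} into its components. The function names
@code{split_color} and @code{css_color} are hypothetical and are not part of
PSPP; the sketch masks its input so that a value read as a signed integer is
handled as well.

```python
def split_color(color):
    """Split a 32-bit color into (alpha, red, green, blue) components."""
    color &= 0xFFFFFFFF           # tolerate values read as signed int32
    alpha = (color >> 24) & 0xFF  # bits 24-31
    red = (color >> 16) & 0xFF    # bits 16-23
    green = (color >> 8) & 0xFF   # bits 8-15
    blue = color & 0xFF           # bits 0-7
    return alpha, red, green, blue


def css_color(color):
    """Format the red, green, and blue components as #rrggbb (alpha dropped)."""
    alpha, red, green, blue = split_color(color)
    return "#%02x%02x%02x" % (red, green, blue)
```

For opaque black, @code{split_color(0xff000000)} yields an alpha of 255 and
zero for the three color components.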
1169 @node SPV Light Member Print Settings
1170 @subsection Print Settings
1177 bool[paginate-layers]
1180 bool[top-continuation]
1181 bool[bottom-continuation]
1182 be32[n-orphan-lines]
1183 bestring[continuation-string])
1186 The PrintSettings reflect settings for printing. The fixed value of
1187 @code{endian} can be used to validate the endianness.
1189 @code{all-layers} is 1 to print all layers, 0 to print only the
1192 @code{paginate-layers} is 1 to print each layer at the start of a new
1193 page, 0 otherwise. (This setting is honored only if @code{all-layers} is
1194 1, since otherwise only one layer is printed.)
1196 @code{fit-width} and @code{fit-length} control whether the table is
1197 shrunk to fit within a page's width or length, respectively.
1199 @code{n-orphan-lines} is the minimum number of rows or columns to put
1200 in one part of a table that is broken across pages.
1202 If @code{top-continuation} is 1, then @code{continuation-string} is
1203 printed at the top of a page when a table is broken across pages for
1204 printing; similarly for @code{bottom-continuation} and the bottom of a
1205 page. Usually, @code{continuation-string} is empty.
1207 @node SPV Light Member Table Settings
1208 @subsection Table Settings
1218 bool[show-row-labels-in-corner]
1219 bool[show-alphabetic-markers]
1220 bool[footnote-marker-superscripts]
1223 Breakpoints[row-breaks] Breakpoints[column-breaks]
1224 Keeps[row-keeps] Keeps[column-keeps]
1225 PointKeeps[row-point-keeps] PointKeeps[column-point-keeps]
1228 bestring[table-look]
1231 Breakpoints => be32[n-breaks] be32*[n-breaks]
1233 Keeps => be32[n-keeps] Keep*[n-keeps]
1234 Keep => be32[offset] be32[n]
1236 PointKeeps => be32[n-point-keeps] PointKeep*[n-point-keeps]
1237 PointKeep => be32[offset] be32 be32
1240 The TableSettings reflect display settings. The fixed value of
1241 @code{endian} can be used to validate the endianness.
1243 @code{current-layer} is the displayed layer. The interpretation when
1244 there is more than one layer dimension is not yet known.
1246 If @code{omit-empty} is 1, empty rows or columns (ones with nothing in
1247 any cell) are hidden; otherwise, they are shown.
1249 If @code{show-row-labels-in-corner} is 1, then row labels are shown in
1250 the upper left corner; otherwise, they are shown nested.
1252 If @code{show-alphabetic-markers} is 1, markers are shown as letters
1253 (e.g.@: @samp{a}, @samp{b}, @samp{c}, @dots{}); otherwise, they are
1254 shown as numbers starting from 1.
1256 When @code{footnote-marker-superscripts} is 1, footnote markers are shown
1257 as superscripts, otherwise as subscripts.
1259 The Breakpoints are rows or columns after which there is a page break;
1260 for example, a row break of 1 requests a page break after the second
1261 row. Usually no breakpoints are specified, indicating that page
1262 breaks should be selected automatically.
1264 The Keeps are ranges of rows or columns to be kept together without a
1265 page break; for example, a row Keep with @code{offset} 1 and @code{n}
1266 10 requests that the 10 rows starting with the second row be kept
1267 together. Usually no Keeps are specified.
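As an illustration of the Breakpoints and Keeps grammar, the following sketch
decodes both from a byte buffer. It assumes that @code{be32} values can be
read as unsigned big-endian integers, which the description above does not
state explicitly; the function names are hypothetical.

```python
import struct


def parse_breakpoints(buf, offset):
    """Read Breakpoints: be32[n-breaks] followed by n-breaks be32 values."""
    (n_breaks,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    breaks = list(struct.unpack_from(">%dI" % n_breaks, buf, offset))
    offset += 4 * n_breaks
    return breaks, offset


def parse_keeps(buf, offset):
    """Read Keeps: be32[n-keeps] followed by (offset, n) pairs."""
    (n_keeps,) = struct.unpack_from(">I", buf, offset)
    offset += 4
    keeps = []
    for _ in range(n_keeps):
        start, n = struct.unpack_from(">II", buf, offset)
        offset += 8
        keeps.append((start, n))  # keep rows start .. start + n - 1 together
    return keeps, offset
```

Each function returns the decoded list together with the offset just past the
structure, so that a caller can parse the row and column variants one after
the other.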
1269 The PointKeeps seem to be generated automatically based on
1270 user-specified Keeps. They seem to indicate a conversion from rows
1271 or columns to pixel or point offsets.
1273 @code{notes} is a text string that contains user-specified notes. It
1274 is displayed when the user hovers the cursor over the table, like
1275 ``alt text'' on a webpage. It is not printed. It is usually empty.
1277 @code{table-look} is the name of an SPSS ``TableLook'' table style,
1278 such as ``Default'' or ``Academic''; it is often empty.
1280 TableSettings ends with an arbitrary number of null bytes. A writer
1281 may safely write 82 null bytes.
1283 A writer may safely use 4 for @code{x5} and 0 for @code{x6}.
1285 @node SPV Light Member Formats
1290 int32[n-widths] int32*[n-widths]
1292 int32[current-layer]
1298 v3(count(X1 count(X2)) count(X3)))
1299 Y0 => int32[epoch] byte[decimal] byte[grouping]
1300 CustomCurrency => int32[n-ccs] string*[n-ccs]
1303 If @code{n-widths} is nonzero, then the accompanying integers are
1304 column widths as manually adjusted by the user.
1306 @code{locale} is a locale including an encoding, such as
1307 @code{en_US.windows-1252} or @code{it_IT.windows-1252}. The rest of
1308 the character strings in the member use this encoding. The encoding
1309 string is itself encoded in US-ASCII.
1311 @code{epoch} is the year that starts the epoch. A 2-digit year is
1312 interpreted as belonging to the 100 years beginning at the epoch. The
1313 default epoch year is 69 years prior to the current year; thus, in
1314 2017 this field by default contains 1948. In the corpus, @code{epoch}
1315 ranges from 1943 to 1948, although some members contain -1.
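The epoch rule can be made concrete with a small sketch. It maps a 2-digit
year into the 100 years that begin at @code{epoch}. The function name is
hypothetical, and the sketch does not attempt to interpret an @code{epoch} of
-1, whose meaning is not described here.

```python
def expand_two_digit_year(yy, epoch):
    """Map a 2-digit year into the 100 years starting at epoch."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year")
    if epoch < 0:
        raise ValueError("epoch of -1 is not handled by this sketch")
    year = epoch - epoch % 100 + yy  # same century as the epoch
    if year < epoch:
        year += 100                  # falls before the epoch, so use the next century
    return year
```

With an epoch of 1948, the 2-digit year 48 maps to 1948 and 47 maps to 2047.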
1317 @code{decimal} is the decimal point character. The observed values
1318 are @samp{.} and @samp{,}.
1320 @code{grouping} is the grouping character. Usually, it is @samp{,} if
1321 @code{decimal} is @samp{.}, and vice versa. Other observed values are
1322 @samp{'} (apostrophe), @samp{ } (space), and zero (presumably
1323 indicating that digits should not be grouped).
1325 @code{n-ccs} is observed as either 0 or 5. When it is 5, the
1326 following strings are CCA through CCE format strings. @xref{Custom
1327 Currency Formats,,, pspp, PSPP}. Most commonly these are all
1328 @code{-,,,} but other strings occur.
1332 X0 only appears, optionally, in version 1 members.
1337 string[command] string[command-local]
1338 string[language] string[charset] string[locale]
1341 Y2 => CustomCurrency byte[missing] bool[x17]
1344 @code{command} describes the statistical procedure that generated the
1345 output, in English. It is not necessarily the literal syntax name of
1346 the procedure: for example, NPAR TESTS becomes ``Nonparametric
1347 Tests.'' @code{command-local} is the procedure's name, translated
1348 into the output language; it is often empty and, when it is not,
1349 sometimes the same as @code{command}.
1351 @code{dataset} is the name of the dataset analyzed to produce the
1352 output, e.g.@: @code{DataSet1}, and @code{datafile} the name of the
1353 file it was read from, e.g.@: @file{C:\Users\foo\bar.sav}. The latter
1354 is sometimes the empty string.
1356 @code{missing} is the character used to indicate that a cell contains
1357 a missing value. It is always observed as @samp{.}.
1359 X0 repeats @code{decimal}, @code{grouping}, CustomCurrency, and
1360 @code{missing} already included in Formats.
1362 A writer may safely use false for @code{x17}.
1366 X1 only appears in version 3 members.
1370 bool byte[x15] bool[x16]
1372 byte[show-variables]
1374 int32[x18] int32[x19]
1380 @code{lang} may indicate the language in use. Some values seem to be
1381 0: @t{en}, 1: @t{de}, 2: @t{es}, 3: @t{it}, 5: @t{ko}, 6: @t{pl}, 8:
1382 @t{zh-tw}, 10: @t{pt_BR}, 11: @t{fr}. The @code{locale} in Formats
1383 and the @code{language}, @code{charset}, and @code{locale} in X0 are
1384 more likely to be useful in practice.
1386 @code{show-variables} determines how variables are displayed by
1387 default. A value of 1 means to display variable names, 2 to display
1388 variable labels when available, 3 to display both (name followed by
1389 label, separated by a space). The most common value is 0, which
1390 probably means to use a global default.
1392 @code{show-values} is a similar setting for values. A value of 1
1393 means to display the value, 2 to display the value label when
1394 available, 3 to display both. Again, the most common value is 0,
1395 which probably means to use a global default.
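One way a reader might combine a per-value @code{show} byte with the
table-wide @code{show-values} default is sketched below. The helper names are
hypothetical. Two choices are assumptions rather than documented behavior:
treating an unresolved 0 as ``show the value'', and separating value and
label with a single space when both are shown.

```python
def resolve_show(value_show, table_default):
    """Return the effective show setting: the value's own, else the table's."""
    return value_show if value_show != 0 else table_default


def render_value(value_text, value_label, show):
    """Render a value according to an effective show setting (0 through 3)."""
    if show == 2 and value_label:
        return value_label                      # label only, when available
    if show == 3 and value_label:
        return value_text + " " + value_label   # assumed separator: one space
    return value_text                           # 1, unresolved 0, or no label
```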
1397 @code{show-caption} is true to show the caption, false to hide it.
1399 A writer may safely use false for @code{x14}, 1 for @code{x15}, false
1400 for @code{x16}, -1 for @code{x18} and @code{x19}, and false for
1405 X2 only appears in version 3 members.
1409 int32[n-row-heights] int32*[n-row-heights]
1410 int32[n-style-map] StyleMap*[n-style-map]
1411 int32[n-styles] StylePair*[n-styles]
1413 StyleMap => int64[cell-index] int16[style-index]
1416 If present, @code{n-row-heights} and the accompanying integers are row
1417 heights as manually adjusted by the user.
1419 The rest of X2 specifies styles for data cells. At first glance this
1420 is odd, because each data cell can have its own style embedded as part
1421 of the data, but in practice X2 specifies a style for a cell only if
1422 that cell is empty (and thus does not appear in the data at all).
1423 Each StyleMap specifies the index of a blank cell, calculated the same
1424 way as in the Cells (@pxref{SPV Light Member Cells}), along with a
1425 0-based index into the accompanying StylePair array.
1427 A writer may safely omit the optional @code{i0 i0} inside the
1428 @code{count(@dots{})}.
1432 X3 only appears in version 3 members.
1436 01 00 byte[x21] 00 00 00
1439 (string[dataset] string[datafile] i0 int32[date] i0)?
1444 @code{date} is a date, as seconds since the epoch, i.e.@: since
1445 January 1, 1970. Pivot tables within an SPV file often have dates a
1446 few minutes apart, so this is probably a creation date for the table
1447 rather than for the file.
1449 X3 repeats @code{decimal}, @code{grouping}, CustomCurrency, and
1450 @code{missing} already included in Formats. @code{command},
1451 @code{command-local}, @code{language}, @code{charset}, and
1452 @code{locale} have the same meaning as in X0.
1454 @code{small} is a small real number, e.g.@: .001. Numbers smaller
1455 than this in absolute value are displayed in scientific notation.
1457 Sometimes @code{dataset}, @code{datafile}, and @code{date} are present
1458 and other times they are absent. The reader can distinguish by
1459 assuming that they are present and then checking whether the
1460 presumptive @code{dataset} contains a null byte (a valid string never
1463 @code{x22} is usually 0 or 2000000.
1465 A writer may safely use 4 for @code{x21} and omit @code{x22} and the
1466 other optional bytes at the end.
1468 @node SPV Light Member Dimensions
1469 @subsection Dimensions
1471 A pivot table presents multidimensional data. A Dimension identifies
1472 the categories associated with each dimension.
1475 Dimensions => int32[n-dims] Dimension*[n-dims]
1477 Value[name] DimProperties
1478 int32[n-categories] Category*[n-categories]
1483 bool[hide-dim-label]
1484 bool[hide-all-labels]
1488 @code{name} is the name of the dimension, e.g.@: @code{Variables},
1489 @code{Statistics}, or a variable name.
1491 The meanings of @code{x1} and @code{x3} are unknown. @code{x1} is
1492 usually 0 but many other values have been observed. A writer may
1493 safely use 0 for @code{x1} and 2 for @code{x3}.
1495 @code{x2} is 0, 1, or 2. For a pivot table with @var{L} layer
1496 dimensions, @var{R} row dimensions, and @var{C} column dimensions,
1497 @code{x2} is 2 for the first @var{L} dimensions, 0 for the next
1498 @var{R} dimensions, and 1 for the remaining @var{C} dimensions. This
1499 does not mean that the layer dimensions must be presented first,
1500 followed by the row dimensions, followed by the column dimensions---on
1501 the contrary, they are frequently in a different order---but @code{x2}
1502 must follow this pattern to prevent the pivot table from being
1505 If @code{hide-dim-label} is 00, the pivot table displays a label for
1506 the dimension itself. Because usually the group and category labels
1507 are enough explanation, it is usually 01.
1509 If @code{hide-all-labels} is 01, the pivot table omits all labels for
1510 the dimension, including group and category labels. It is usually 00.
1511 When @code{hide-all-labels} is 01, @code{hide-dim-label} is ignored.
1513 @code{dim-index} is usually the 0-based index of the dimension, e.g.@:
1514 0 for the first dimension, 1 for the second, and so on. Sometimes it
1515 is -1. There is no visible difference.
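The pattern required of @code{x2} across the Dimension records can be
sketched as follows. The function name is hypothetical; it returns the
@code{x2} value that a writer would place in each successive Dimension, given
the counts of layer, row, and column dimensions.

```python
def expected_x2(n_layers, n_rows, n_columns):
    """Return the x2 value for each Dimension record, in file order."""
    # 2 for the first L records, 0 for the next R, 1 for the remaining C.
    return [2] * n_layers + [0] * n_rows + [1] * n_columns
```

For a table with one layer, two row, and one column dimension this gives the
sequence 2, 0, 0, 1, regardless of which dimension actually appears on which
axis.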
1517 @node SPV Light Member Categories
1518 @subsection Categories
1520 Categories are arranged in a tree. Only the leaf nodes in the tree
1521 are really categories; the others just serve as grouping constructs.
1524 Category => Value[name] (Leaf @math{|} Group)
1525 Leaf => 00 00 00 i2 int32[leaf-index] i0
1527 bool[merge] 00 01 int32[x23]
1528 i-1 int32[n-subcategories] Category*[n-subcategories]
1531 @code{name} is the name of the category (or group).
1533 A Leaf represents a leaf category. The Leaf's @code{leaf-index} is a
1534 nonnegative integer unique within the Dimension and less than
1535 @code{n-categories} in the Dimension. If the user does not sort or
1536 rearrange the categories, then @code{leaf-index} starts at 0 for the
1537 first Leaf in the dimension and increments by 1 with each successive
1538 Leaf. If the user does sort or rearrange the categories, then the
1539 order of categories in the file reflects that change and
1540 @code{leaf-index} reflects the original order.
1542 Occasionally a dimension has no leaf categories at all. A table that
1543 contains such a dimension necessarily has no data at all.
1545 A Group is a group of nested categories. Usually a Group contains at
1546 least one Category, so that @code{n-subcategories} is positive, but a
1547 few Groups with @code{n-subcategories} 0 have been observed.
1549 If a Group's @code{merge} is 00, the most common value, then the group
1550 is really a distinct group that should be represented as such in the
1551 visual representation and user interface. If @code{merge} is 01, the
1552 categories in this group should be shown and treated as if they were
1553 direct children of the group's containing group (or if it has no
1554 parent group, then direct children of the dimension), and this group's
1555 name is irrelevant and should not be displayed. (Merged groups can be
1558 (For writing an SPV file, there is no need to use the @code{merge}
1559 feature unless it is convenient.)
1561 A Group's @code{x23} appears to be i2 when all of the categories
1562 within a group are leaf categories that directly represent data values
1563 for a variable (e.g.@: in a frequency table or crosstabulation, a group
1564 of values in a variable being tabulated) and i0 otherwise. A writer
1565 may safely write a constant 0 in this field.
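The category tree and the effect of @code{merge} can be illustrated with a
small in-memory model. This is only a sketch: the @code{Category} class and
both helper functions are hypothetical, and the model ignores fields such as
@code{x23}.

```python
class Category:
    """A Leaf (leaf_index set) or a Group (leaf_index None, with children)."""

    def __init__(self, name, leaf_index=None, merge=False, children=None):
        self.name = name
        self.leaf_index = leaf_index
        self.merge = merge
        self.children = list(children or [])

    def is_leaf(self):
        return self.leaf_index is not None


def visible_children(categories):
    """Splice the children of merged groups into their parent's child list."""
    out = []
    for cat in categories:
        if not cat.is_leaf() and cat.merge:
            out.extend(visible_children(cat.children))  # group name is dropped
        else:
            out.append(cat)
    return out


def leaves(categories):
    """Collect the leaf categories in file order, ignoring grouping."""
    out = []
    for cat in categories:
        if cat.is_leaf():
            out.append(cat)
        else:
            out.extend(leaves(cat.children))
    return out
```

A display routine would call @code{visible_children} at each level of the
tree, whereas code that only needs the data layout can use @code{leaves}.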
1567 @node SPV Light Member Axes
1570 After the dimensions comes the assignment of each dimension to one of the
1571 axes: layers, rows, and columns.
1575 int32[n-layers] int32[n-rows] int32[n-columns]
1576 int32*[n-layers] int32*[n-rows] int32*[n-columns]
1579 The values of @code{n-layers}, @code{n-rows}, and @code{n-columns}
1580 each specify the number of dimensions displayed in layers, rows, and
1581 columns, respectively. Any of them may be zero. Their values sum to
1582 @code{n-dims} from Dimensions (@pxref{SPV Light Member
1585 The following @code{n-dims} integers, in three groups, are a
1586 permutation of the 0-based dimension numbers. The first
1587 @code{n-layers} integers specify each of the dimensions represented by
1588 layers, the next @code{n-rows} integers specify the dimensions
1589 represented by rows, and the final @code{n-columns} integers specify
1590 the dimensions represented by columns. When there is more than one
1591 dimension of a given kind, the inner dimensions are given first.
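As a sketch of how a reader might use the Axes, the following function splits
the already decoded dimension numbers into layer, row, and column groups and
checks that they form a permutation. The function name is hypothetical.

```python
def split_axes(n_layers, n_rows, n_columns, dims):
    """Split the dimension numbers into (layers, rows, columns) lists."""
    total = n_layers + n_rows + n_columns
    dims = list(dims)
    if len(dims) != total or sorted(dims) != list(range(total)):
        raise ValueError("axes are not a permutation of the dimension numbers")
    layers = dims[:n_layers]
    rows = dims[n_layers:n_layers + n_rows]
    columns = dims[n_layers + n_rows:]
    return layers, rows, columns
```

Within each returned list the inner dimensions come first, matching the order
in the file.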
1593 @node SPV Light Member Cells
1596 The final part of an SPV light member contains the actual data.
1599 Cells => int32[n-cells] Cell*[n-cells]
1600 Cell => int64[index] v1(00?) Value
1603 A Cell consists of an @code{index} and a Value. Suppose there are
1604 @math{d} dimensions, numbered 1 through @math{d} in the order given in
1605 the Dimensions previously, and that dimension @math{i} has @math{n_i}
1606 categories. Consider the cell at coordinates @math{x_i}, @math{1 \le
1607 i \le d}, and note that @math{0 \le x_i < n_i}. Then the index is
1608 calculated by the following algorithm:
1612 for each @math{i} from 1 to @math{d}:
1613 @i{index} = (@math{n_i \times} @i{index}) @math{+} @math{x_i}
1616 For example, suppose there are 3 dimensions with 3, 4, and 5
1617 categories, respectively. The cell at coordinates (1, 2, 3) has
1618 index @math{5 \times (4 \times (3 \times 0 + 1) + 2) + 3 = 33}.
1619 Within a given dimension, the index is the @code{leaf-index} in a Leaf.
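The index calculation, and its inverse, can be sketched as follows. The
function names are hypothetical; @code{sizes} holds the number of categories
in each dimension, in the order of the Dimensions, and @code{coords} holds
the corresponding @code{leaf-index} values.

```python
def cell_index(coords, sizes):
    """Compute a Cell index from per-dimension coordinates."""
    if len(coords) != len(sizes):
        raise ValueError("one coordinate is needed per dimension")
    index = 0
    for x, n in zip(coords, sizes):
        if not 0 <= x < n:
            raise ValueError("coordinate out of range")
        index = n * index + x
    return index


def cell_coords(index, sizes):
    """Recover per-dimension coordinates from a Cell index."""
    coords = []
    for n in reversed(sizes):  # the last dimension varies fastest
        coords.append(index % n)
        index //= n
    coords.reverse()
    return coords
```

For the example above, @code{cell_index((1, 2, 3), (3, 4, 5))} evaluates to
33, and @code{cell_coords(33, (3, 4, 5))} recovers the coordinates.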
1621 @node SPV Light Member Value
1624 Value is used throughout the SPV light member format. It boils down
1625 to a number or a string.
1628 Value => 00? 00? 00? 00? RawValue
1630 01 ValueMod int32[format] double[x]
1631 @math{|} 02 ValueMod int32[format] double[x]
1632 string[var-name] string[value-label] byte[show]
1633 @math{|} 03 string[local] ValueMod string[id] string[c] bool[fixed]
1634 @math{|} 04 ValueMod int32[format] string[value-label] string[var-name]
1635 byte[show] string[s]
1636 @math{|} 05 ValueMod string[var-name] string[var-label] byte[show]
1637 @math{|} ValueMod string[template] int32[n-args] Argument*[n-args]
1640 @math{|} int32[x] i0 Value*[x] /* x > 0 */
1643 There are several possible encodings, which one can distinguish by the
1644 first nonzero byte in the encoding.
1648 The numeric value @code{x}, intended to be presented to the user
1649 formatted according to @code{format}, which is in the format described
1650 for system files, except that format 40 is a synonym for F format
1651 instead of MTIME. @xref{System File Output Formats}, for details.
1652 Most commonly, @code{format} has width 40 (the maximum).
1654 An @code{x} with the maximum negative double value @code{-DBL_MAX}
1655 represents the system-missing value SYSMIS. (HIGHEST and LOWEST have
1656 not been observed.) @xref{System File Format}, for more about these
1660 Similar to @code{01}, with the additional information that @code{x} is
1661 a value of variable @code{var-name} and has value label
1662 @code{value-label}. Both @code{var-name} and @code{value-label} can
1663 be the empty string, the latter very commonly.
1665 @code{show} determines whether to show the numeric value or the value
1666 label. A value of 1 means to show the value, 2 to show the label, 3
1667 to show both, and 0 means to use the default specified in
1668 @code{show-values} (@pxref{SPV Light Member Formats}).
1671 A text string, in two forms: @code{c} is in English, and sometimes
1672 abbreviated or obscure, and @code{local} is localized to the user's
1673 locale. In an English-language locale, the two strings are often the
1674 same, and in the cases where they differ, @code{local} is more
1675 appropriate for a user interface, e.g.@: @code{c} of ``Not a PxP table
1676 for MCN...'' versus @code{local} of ``Computed only for a PxP table,
1677 where P must be greater than 1.''
1679 @code{c} and @code{local} are always either both empty or both
1682 @code{id} is a brief identifying string whose form seems to resemble a
1683 programming language identifier, e.g.@: @code{cumulative_percent} or
1684 @code{factor_14}. It is not unique.
1686 @code{fixed} is 00 for text taken from user input, such as syntax
1687 fragments, expressions, file names, and data set names, and 01 for fixed
1688 text strings such as names of procedures or statistics. In the former
1689 case, @code{id} is always the empty string; in the latter case,
1690 @code{id} is still sometimes empty.
1693 The string value @code{s}, intended to be presented to the user
1694 formatted according to @code{format}. The format for a string is not
1695 too interesting, and the corpus contains many clearly invalid formats
1696 like A16.39 or A255.127 or A134.1, so readers should probably ignore
1697 the format entirely.
1699 @code{s} is a value of variable @code{var-name} and has value label
1700 @code{value-label}. @code{var-name} is never empty but
1701 @code{value-label} is commonly empty.
1703 @code{show} has the same meaning as in the encoding for 02.
1706 Variable @code{var-name}, which is rarely observed as empty in the
1707 corpus, with variable label @code{var-label}, which is often empty.
1709 @code{show} determines whether to show the variable name or the
1710 variable label. A value of 1 means to show the name, 2 to show the
1711 label, 3 to show both, and 0 means to use the default specified in
1712 @code{show-variables} (@pxref{SPV Light Member Formats}).
1715 When the first byte of a RawValue is not one of the above, the
1716 RawValue starts with a ValueMod, whose syntax is described in the next
1717 section. (A ValueMod always begins with byte 31 or 58.)
1719 This case is a template string, analogous to @code{printf}, followed
1720 by one or more Arguments, each of which has one or more values. The
1721 template string is copied directly into the output except for the
1722 following special syntax:
1729 Each of these expands to the character following @samp{\\}, to escape
1730 characters that have special meaning in template strings. These are
1731 effective inside and outside the @code{[@dots{}]} syntax forms
1735 Expands to a new-line, inside or outside the @code{[@dots{}]} forms
1739 Expands to a formatted version of argument @var{i}, which must have
1740 only a single value. For example, @code{^1} expands to the first
1741 argument's @code{value}.
1743 @item [:@var{a}:]@var{i}
1744 Expands @var{a} for each of the values in @var{i}. @var{a}
1745 should contain one or more @code{^@var{j}} conversions, which are
1746 drawn from the values for argument @var{i} in order. Some examples
1751 All of the values for the first argument, concatenated.
1754 Expands to the values for the first argument, each followed by
1758 Expands to @code{@var{x} = @var{y}} where @var{x} is the second
1759 argument's first value and @var{y} is its second value. (This would
1760 be used only if the argument has two values. If there were more
1761 values, the second and third values would be directly concatenated,
1762 which would look funny.)
1765 @item [@var{a}:@var{b}:]@var{i}
1766 This extends the previous form so that the first values are expanded
1767 using @var{a} and later values are expanded using @var{b}. For an
1768 unknown reason, within @var{a} the @code{^@var{j}} conversions are
1769 instead written as @code{%@var{j}}. Some examples from the corpus:
1773 Expands to all of the values for the first argument, separated by
1776 @item [%1 = %2:, ^1 = ^2:]1
1777 Given appropriate values for the first argument, expands to @code{X =
1781 Given appropriate values, expands to @code{1, 2, 3}.
1785 The template string is localized to the user's locale.
1788 A writer may safely omit all of the optional 00 bytes at the beginning
1789 of a Value, except that it should write a single 00 byte before a
1792 @node SPV Light Member ValueMod
1793 @subsection ValueMod
1795 A ValueMod can specify special modifications to a Value.
1801 int32[n-refs] int16*[n-refs]
1802 int32[n-subscripts] string*[n-subscripts]
1803 v1(00 (i1 | i2) 00? 00? int32 00? 00?)
1804 v3(count(TemplateString StylePair))
1806 TemplateString => count((count((i0 (58 @math{|} 31 55))?) (58 @math{|} 31 string[id]))?)
1813 bool[bold] bool[italic] bool[underline] bool[show]
1814 string[fg-color] string[bg-color]
1815 string[typeface] byte[size]
1818 int32[halign] int32[valign] double[decimal-offset]
1819 int16[left-margin] int16[right-margin]
1820 int16[top-margin] int16[bottom-margin]
1823 A ValueMod that begins with ``31'' specifies special modifications to
1826 Each of the @code{n-refs} integers is a reference to a Footnote
1827 (@pxref{SPV Light Member Footnotes}) by 0-based index. Footnote
1828 markers are shown appended to the main text of the Value, as
1831 The @code{subscripts}, if present, are strings to append to the main
1832 text of the Value, as subscripts. Each subscript text is a brief
1833 indicator, e.g.@: @samp{a} or @samp{b}, with its meaning indicated by
1834 the table caption. When multiple subscripts are present, they are
1835 displayed separated by commas.
1837 The @code{id} inside the TemplateString, if present, is a template
1838 string for substitutions using the syntax explained previously. It
1839 appears to be an English-language version of the localized template
1840 string in the Value in which the TemplateString is nested. A writer may
1841 safely omit the optional fixed data in TemplateString.
1843 FontStyle and CellStyle, if present, change the style for this
1844 individual Value. In FontStyle, @code{bold}, @code{italic}, and
1845 @code{underline} control the particular style. @code{show} is
1846 ordinarily 1; if it is 0, then the cell data is not shown.
1847 @code{fg-color} and @code{bg-color} are strings in the format
1848 @code{#rrggbb}, e.g.@: @code{#ff0000} for red or @code{#ffffff} for
1849 white. The empty string is occasionally observed also. The
1850 @code{size} is a font size in units of 1/128 inch.
1852 In CellStyle, @code{halign} is 0 for center, 2 for left, 4 for right,
1853 6 for decimal, 0xffffffad for mixed. For decimal alignment,
1854 @code{decimal-offset} is the decimal point's offset from the right
1855 side of the cell, in pt (@pxref{SPV Light Detail Member Format}).
1856 @code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
1857 for bottom. @code{left-margin}, @code{right-margin},
1858 @code{top-margin}, and @code{bottom-margin} are in pt.
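The CellStyle @code{halign} codes and the FontStyle @code{size} unit can be
decoded as in the following sketch. The names are hypothetical. The
conversion to points assumes the usual definition of a point as 1/72 inch,
and the lookup masks its argument so that 0xffffffad is recognized even when
@code{halign} was read as a signed integer.

```python
CELL_HALIGN = dict([
    (0, "center"),
    (2, "left"),
    (4, "right"),
    (6, "decimal"),
    (0xFFFFFFAD, "mixed"),
])


def cell_halign_name(halign):
    """Name a CellStyle halign code; unrecognized codes map to 'unknown'."""
    return CELL_HALIGN.get(halign & 0xFFFFFFFF, "unknown")


def font_size_in_points(size):
    """Convert a FontStyle size in units of 1/128 inch to points."""
    return size * 72.0 / 128.0
```

These codes apply to CellStyle only; the Areas use different values for
decimal and mixed alignment.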
1860 @node SPV Legacy Detail Member Binary Format
1861 @section Legacy Detail Member Binary Format
1863 Whereas the light binary format represents everything about a given
1864 pivot table, the legacy binary format conceptually consists of a
1865 number of named sources, each of which consists of a number of named
1866 variables, each of which is a 1-dimensional array of numbers or
1867 strings or a mix. Thus, the legacy binary member format is quite
1870 This section uses the same context-free grammar notation as in the
1871 previous section, with the following additions:
1875 In a version 0xaf legacy member, @var{x}; in other versions, nothing.
1876 (The legacy member header indicates the version; see below.)
1879 In a version 0xb0 legacy member, @var{x}; in other versions, nothing.
1882 A legacy detail member @file{.bin} has the following overall format:
1886 00 byte[version] int16[n-sources] int32[member-size]
1887 Metadata*[n-sources]
1892 @code{version} is a version number that affects the interpretation of
1893 some of the other data in the member. Versions 0xaf and 0xb0 are
1894 known. We will refer to ``version 0xaf'' and ``version 0xb0'' members
1897 A legacy member consists of @code{n-sources} data sources, each of
1898 which has Metadata and Data.
1900 @code{member-size} is the size of the legacy binary member, in bytes.
1902 The Data and Strings above are commented out because the Metadata has
1903 some oddities that mean that the Data sometimes seems to start at
1904 an unexpected place. The following section goes into detail.
1907 * SPV Legacy Member Metadata::
1908 * SPV Legacy Member Numeric Data::
1909 * SPV Legacy Member String Data::
1912 @node SPV Legacy Member Metadata
1913 @subsection Metadata
1917 int32[n-values] int32[n-variables] int32[data-offset]
1918 vAF(byte*28[source-name])
1919 vB0(byte*64[source-name] int32[x])
1922 A data source has @code{n-variables} variables, each with
1923 @code{n-values} data values.
1925 @code{source-name} is a 28- or 64-byte string padded on the right with
1926 0-bytes. The names that appear in the corpus are very generic:
1927 usually @code{tableData} for pivot table data or @code{source0} for
1930 A given Metadata's @code{data-offset} is the offset, in bytes, from
1931 the beginning of the member to the start of the corresponding Data.
1932 This allows programs to skip to the beginning of the data for a
1933 particular source. In every case in the corpus, the Data follow the
1934 Metadata in the same order, but it is important to use
1935 @code{data-offset} instead of reading sequentially through the file
1936 because of the exception described below.
1938 One SPV file in the corpus has legacy binary members with version 0xb0
1939 but a 28-byte @code{source-name} field (and only a single source). In
1940 practice, this means that the 64-byte @code{source-name} used in
1941 version 0xb0 has a lot of 0-bytes in the middle followed by the
1942 @code{variable-name} of the following Data. As long as a reader
1943 treats the first 0-byte in the @code{source-name} as terminating the
1944 string, it can properly interpret these members.
1946 The meaning of @code{x} in version 0xb0 is unknown.
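A reader for one Metadata record might look like the following sketch. It
assumes little-endian @code{int32} fields and an ASCII @code{source-name},
neither of which is stated in this section, and the function name is
hypothetical. The name is cut at its first 0-byte, which also copes with the
anomalous version 0xb0 members described above.

```python
import struct


def parse_metadata(buf, offset, version):
    """Parse one Metadata record; return (fields, offset past the record)."""
    if version == 0xAF:
        name_len = 28
    elif version == 0xB0:
        name_len = 64
    else:
        raise ValueError("unknown legacy member version")
    n_values, n_variables, data_offset = struct.unpack_from("<iii", buf, offset)
    offset += 12
    raw_name = buf[offset:offset + name_len]
    offset += name_len
    if version == 0xB0:
        offset += 4  # skip int32[x], whose meaning is unknown
    # Treat the first 0-byte as the end of the name.
    name = raw_name.split(b"\0", 1)[0].decode("ascii", "replace")
    return (n_values, n_variables, data_offset, name), offset
```

A caller would then seek to @code{data-offset} to find the Data for the
source, instead of relying on the offset returned here.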
1948 @node SPV Legacy Member Numeric Data
1949 @subsection Numeric Data
1952 Data => Variable*[n-variables]
1953 Variable => byte*288[variable-name] double*[n-values]
1956 Data follow the Metadata in the legacy binary format, with sources in
1957 the same order (but readers should use the @code{data-offset} in
1958 Metadata records, rather than reading sequentially). Each Variable
1959 begins with a @code{variable-name} that generally indicates its role
1960 in the pivot table, e.g.@: ``cell'', ``cellFormat'',
1961 ``dimension0categories'', ``dimension0group0'', followed by the
1962 numeric data, one double per datum. A double with the maximum
1963 negative double @code{-DBL_MAX} represents the system-missing value
1966 @node SPV Legacy Member String Data
1967 @subsection String Data
1970 Strings => SourceMaps[maps] Labels
1972 SourceMaps => int32[n-maps] SourceMap*[n-maps]
1974 SourceMap => string[source-name] int32[n-variables] VariableMap*[n-variables]
1975 VariableMap => string[variable-name] int32[n-data] DatumMap*[n-data]
1976 DatumMap => int32[value-idx] int32[label-idx]
1978 Labels => int32[n-labels] Label*[n-labels]
1979 Label => int32[frequency] string[label]
Each variable may include a mix of numeric and string data values. If
a legacy binary member contains any string data, Strings is present;
otherwise, the member ends just after the last Data element.
1986 The string data overlays the numeric data. When a variable includes
1987 any string data, its Variable represents the string values with a
1988 SYSMIS or NaN placeholder. (Not all such values need be
1991 Each SourceMap provides a mapping between SYSMIS or NaN values in source
1992 @code{source-name} and the string data that they represent.
1993 @code{n-variables} is the number of variables in the source that
1994 include string data. More precisely, it is the 1-based index of the
1995 last variable in the source that includes any string data; thus, it
1996 would be 4 if there are 5 variables and only the fourth one includes
1999 A VariableMap repeats its variable's name, but variables are always
2000 present in the same order as the source, starting from the first
2001 variable, without skipping any even if they have no string values.
2002 Each VariableMap contains DatumMap nonterminals, each of which maps
2003 from a 0-based index within its variable's data to a 0-based label
2004 index, e.g.@: pair @code{value-idx} = 2, @code{label-idx} = 3, means
2005 that the third data value (which must be SYSMIS or NaN) is to be
2006 replaced by the string of the fourth Label.
2008 The labels themselves follow the pairs. The valuable part of each
2009 label is the string @code{label}. Each label also includes a
2010 @code{frequency} that reports the number of DatumMaps that reference
2011 it (although this is not useful).
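The overlay of string data on numeric data can be sketched as follows.
This is only an illustration under stated assumptions: the function
name @code{overlay_strings} is hypothetical, and it expects one
variable's doubles, its DatumMap pairs, and the label strings to have
been parsed already into plain Python lists. It rejects a DatumMap
that points at anything other than a SYSMIS or NaN placeholder.

```python
import math
import sys

# SYSMIS is the most negative finite double, -DBL_MAX.
SYSMIS = -sys.float_info.max


def overlay_strings(values, datum_maps, labels):
    """Return a copy of `values` with string labels substituted.

    values     -- list of floats for one variable
    datum_maps -- list of (value_idx, label_idx) pairs, both 0-based
    labels     -- list of label strings, in file order
    """
    result = list(values)
    for value_idx, label_idx in datum_maps:
        placeholder = values[value_idx]
        if not (placeholder == SYSMIS or math.isnan(placeholder)):
            raise ValueError("value %d is not a string placeholder" % value_idx)
        result[value_idx] = labels[label_idx]
    return result


# The pair value-idx = 2, label-idx = 3 replaces the third data value
# with the string of the fourth label.
data = overlay_strings([1.0, 2.0, SYSMIS, 4.0],
                       [(2, 3)],
                       ["a", "b", "c", "d"])
```

Placeholders that no DatumMap references are left untouched by this
sketch.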
2013 @node SPV Legacy Detail Member XML Format
2014 @section Legacy Detail Member XML Format
The detail XML format is not what one would design from scratch for
describing pivot tables. This is because it is a special case
of a much more general format (``visualization XML'' or ``VizML'')
that can describe a wide range of visualizations. Most of this
generality is overkill for tables, so pivot tables end up using an
odd subset of a general-purpose format.
2023 An XML Schema for VizML is available, distributed with SPSS binaries,
2024 under a nonfree license. It contains documentation that is
2025 occasionally helpful.
2027 This section describes the detail XML format using the same notation
2028 already used for the structure XML format (@pxref{SPV Structure Member
2029 Format}). See @file{src/output/spv/detail-xml.grammar} in the PSPP
2030 source tree for the full grammar that it uses for parsing.
2032 The important elements of the detail XML format are:
2036 Variables. @xref{SPV Detail Variable Elements}.
2039 Assignment of variables to axes. A variable can appear as columns, or
2040 rows, or layers. The @code{faceting} element and its sub-elements
2041 describe this assignment.
2044 Styles and other annotations.
2047 This description is not detailed enough to write legacy tables.
2048 Instead, write tables in the light binary format.
2051 * SPV Detail visualization Element::
2052 * SPV Detail Variable Elements::
2053 * SPV Detail extension Element::
2054 * SPV Detail graph Element::
2055 * SPV Detail location Element::
2056 * SPV Detail faceting Element::
2057 * SPV Detail facetLayout Element::
2058 * SPV Detail label Element::
2059 * SPV Detail setCellProperties Element::
2060 * SPV Detail setFormat Element::
2061 * SPV Detail interval Element::
2062 * SPV Detail style Element::
2063 * SPV Detail labelFrame Element::
2064 * SPV Detail Legacy Properties::
2067 @node SPV Detail visualization Element
2068 @subsection The @code{visualization} Element
2076 :style[style_ref]=ref style
2080 => visualization_extension?
2082 (sourceVariable | derivedVariable)+
2091 extension[visualization_extension]
2094 :minWidthSet=(true)?
2095 :maxWidthSet=(true)?
2098 userSource :missing=(listwise | pairwise)? => EMPTY
2100 categoricalDomain => variableReference simpleSort
2102 simpleSort :method[sort_method]=(custom) => categoryOrder
2104 container :style=ref style => container_extension? location+ labelFrame*
2106 extension[container_extension] :combinedFootnotes=(true) => EMPTY
The @code{visualization} element is the root of a detail XML member. It
has the following attributes:
2117 @defvr {Attribute} creator
2118 The version of the software that created this SPV file, as a string of
2119 the form @code{xxyyzz}, which represents software version xx.yy.zz,
2120 e.g.@: @code{160001} is version 16.0.1. The corpus includes major
2121 versions 16 through 19.
2124 @defvr {Attribute} date
The date on which the file was created, as a string of the form
2129 @defvr {Attribute} lang
2130 The locale used for output, in Windows format, which is similar to the
2131 format used in Unix with the underscore replaced by a hyphen, e.g.@:
@code{en-US}, @code{en-GB}, @code{el-GR}, @code{sr-Cyrl-RS}.
2135 @defvr {Attribute} name
2136 The title of the pivot table, localized to the output language.
2139 @defvr {Attribute} style
2140 The base style for the pivot table. In every example in the corpus,
2141 the @code{style} element has no attributes other than @code{id}.
2144 @defvr {Attribute} type
2145 A floating-point number. The meaning is unknown.
2148 @defvr {Attribute} version
2149 The visualization schema version number. In the corpus, the value is
2150 one of 2.4, 2.5, 2.7, and 2.8.
2153 The @code{userSource} element has no visible effect.
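Of the @code{visualization} attributes described above, @code{creator}
is the one that most benefits from decoding. The following Python
sketch splits its @code{xxyyzz} form into a version triple. The
function name @code{parse_creator} is hypothetical, and rejecting
strings that are not exactly six ASCII digits is a choice made for
this sketch rather than documented behavior.

```python
def parse_creator(creator):
    """Split a creator string of the form xxyyzz into (xx, yy, zz)."""
    if len(creator) != 6 or not (creator.isascii() and creator.isdigit()):
        raise ValueError("creator is not of the form xxyyzz: %r" % creator)
    return int(creator[0:2]), int(creator[2:4]), int(creator[4:6])


# "160001" is version 16.0.1.
major, minor, patch = parse_creator("160001")
```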
2155 The @code{extension} element as a child of @code{visualization} has
2156 the following attributes.
2158 @defvr {Attribute} numRows
2159 An integer that presumably defines the number of rows in the displayed
2163 @defvr {Attribute} showGridline
2164 Always set to @code{false} in the corpus.
2167 @defvr {Attribute} minWidthSet
2168 @defvrx {Attribute} maxWidthSet
2169 Always set to @code{true} in the corpus.
2172 The @code{extension} element as a child of @code{container} has the
2175 @defvr {Attribute} combinedFootnotes
2179 The @code{categoricalDomain} and @code{simpleSort} elements have no
2182 The @code{layerController} element has no visible effect.
2184 @node SPV Detail Variable Elements
2185 @subsection Variable Elements
2187 A ``variable'' in detail XML is a 1-dimensional array of data. Each
2188 element of the array may, independently, have string or numeric
2189 content. All of the variables in a given detail XML member either
2190 have the same number of elements or have zero elements.
2192 Two different elements define variables and their content:
2195 @item sourceVariable
2196 These variables' data comes from the associated @code{tableData.bin}
2199 @item derivedVariable
2200 These variables are defined in terms of a mapping function from a
2201 source variable, or they are empty.
2204 A variable named @code{cell} always exists. This variable holds the
2205 data displayed in the table.
2207 Variables in detail XML roughly correspond to the dimensions in a
2208 light detail member. Each dimension has the following variables with
2209 stylized names, where @var{n} is a number for the dimension starting
2213 @item dimension@var{n}categories
2214 The dimension's leaf categories (@pxref{SPV Light Member Categories}).
2216 @item dimension@var{n}group0
2217 Present only if the dimension's categories are grouped, this variable
2218 holds the group labels for the categories. Grouping is inferred
2219 through adjacent identical labels. Categories that are not part of a
2220 group have empty-string data in this variable.
2222 @item dimension@var{n}group1
2223 Present only if the first-level groups are further grouped, this
2224 variable holds the labels for the second-level groups. There can be
2225 additional variables with further levels of grouping.
2227 @item dimension@var{n}
2231 Determining the data for a (non-empty) variable is a multi-step
2236 Draw initial data from its source, for a @code{sourceVariable}, or
2237 from another named variable, for a @code{derivedVariable}.
2240 Apply mappings from @code{valueMapEntry} elements within the
2241 @code{derivedVariable} element, if any.
2244 Apply mappings from @code{relabel} elements within a @code{format} or
2245 @code{stringFormat} element in the @code{sourceVariable} or
2246 @code{derivedVariable} element, if any.
2249 If the variable is a @code{sourceVariable} with a @code{labelVariable}
2250 attribute, and there were no mappings to apply in previous steps, then
2251 replace each element of the variable by the corresponding value in the
2255 A single variable's data can be modified in two of the steps, if both
2256 @code{valueMapEntry} and @code{relabel} are used. The following
2257 example from the corpus maps several integers to 2, then maps 2 in
2258 turn to the string ``Input'':
2261 <derivedVariable categorical="true" dependsOn="dimension0categories"
2262 id="dimension0group0map" value="map(dimension0group0)">
2264 <relabel from="2" to="Input"/>
2265 <relabel from="10" to="Missing Value Handling"/>
2266 <relabel from="14" to="Resources"/>
2267 <relabel from="0" to=""/>
2268 <relabel from="1" to=""/>
2269 <relabel from="13" to=""/>
2271 <valueMapEntry from="2;3;5;6;7;8;9" to="2"/>
2272 <valueMapEntry from="10;11" to="10"/>
2273 <valueMapEntry from="14;15" to="14"/>
2274 <valueMapEntry from="0" to="0"/>
2275 <valueMapEntry from="1" to="1"/>
2276 <valueMapEntry from="13" to="13"/>
2281 * SPV Detail sourceVariable Element::
2282 * SPV Detail derivedVariable Element::
2283 * SPV Detail valueMapEntry Element::
2286 @node SPV Detail sourceVariable Element
2287 @subsubsection The @code{sourceVariable} Element
2294 :domain=ref categoricalDomain?
2296 :dependsOn=ref sourceVariable?
2298 :labelVariable=ref sourceVariable?
2299 => variable_extension* (format | stringFormat)?
2302 This element defines a variable whose data comes from the
2303 @file{tableData.bin} member that corresponds to this @file{.xml}.
2305 This element has the following attributes.
2307 @defvr {Attribute} id
2308 An @code{id} is always present because this element exists to be
2309 referenced from other elements.
2312 @defvr {Attribute} categorical
2313 Always set to @code{true}.
2316 @defvr {Attribute} source
2317 Always set to @code{tableData}, the @code{source-name} in the
2318 corresponding @file{tableData.bin} member (@pxref{SPV Legacy Member
2322 @defvr {Attribute} sourceName
2323 The name of a variable within the source, corresponding to the
2324 @code{variable-name} in the @file{tableData.bin} member (@pxref{SPV
2325 Legacy Member Numeric Data}).
2328 @defvr {Attribute} label
2329 The variable label, if any.
2332 @defvr {Attribute} labelVariable
2333 The @code{variable-name} of a variable whose string values correspond
2334 one-to-one with the values of this variable and are suitable for use
2338 @defvr {Attribute} dependsOn
2339 This attribute doesn't affect the display of a table.
2342 @node SPV Detail derivedVariable Element
2343 @subsubsection The @code{derivedVariable} Element
2350 :dependsOn=ref sourceVariable?
2351 => variable_extension* (format | stringFormat)? valueMapEntry*
2354 Like @code{sourceVariable}, this element defines a variable whose
2355 values can be used elsewhere in the visualization. Instead of being
2356 read from a data source, the variable's data are defined by a
2357 mathematical expression.
2359 This element has the following attributes.
2361 @defvr {Attribute} id
2362 An @code{id} is always present because this element exists to be
2363 referenced from other elements.
2366 @defvr {Attribute} categorical
2367 Always set to @code{true}.
2370 @defvr {Attribute} value
2371 An expression that defines the variable's value. In theory this could
2372 be an arbitrary expression in terms of constants, functions, and other
2373 variables, e.g.@: @math{(@var{var1} + @var{var2}) / 2}. In practice,
2374 the corpus contains only the following forms of expressions:
2378 @itemx constant(@var{variable})
2379 All zeros. The reason why a variable is sometimes named is unknown.
2380 Sometimes the ``variable name'' has spaces in it.
2382 @item map(@var{variable})
2383 Transforms the values in the named @var{variable} using the
2384 @code{valueMapEntry}s contained within the element.
2388 @defvr {Attribute} dependsOn
2389 This attribute doesn't affect the display of a table.
2392 @node SPV Detail valueMapEntry Element
2393 @subsubsection The @code{valueMapEntry} Element
2396 valueMapEntry :from :to => EMPTY
2399 A @code{valueMapEntry} element defines a mapping from one or more
2400 values of a source expression to a target value. (In the corpus, the
2401 source expression is always just the name of a variable.) Each target
2402 value requires a separate @code{valueMapEntry}. If multiple source
2403 values map to the same target value, they can be combined or separate.
2405 In the corpus, all of the source and target values are integers.
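To make the interaction between @code{valueMapEntry} and
@code{relabel} concrete, the following Python sketch applies both
mappings, in that order, to an abridged version of the corpus example
shown earlier (@pxref{SPV Detail Variable Elements}). The function
names are hypothetical. The sketch assumes that the @code{relabel}
elements sit inside a @code{format} or @code{stringFormat} child, that
values are compared as the decimal strings of integers, and that a
value without a matching entry passes through unchanged. The last
assumption is not something the corpus confirms.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """\
<derivedVariable categorical="true" dependsOn="dimension0categories"
                 id="dimension0group0map" value="map(dimension0group0)">
  <stringFormat>
    <relabel from="2" to="Input"/>
    <relabel from="10" to="Missing Value Handling"/>
    <relabel from="0" to=""/>
  </stringFormat>
  <valueMapEntry from="2;3;5;6;7;8;9" to="2"/>
  <valueMapEntry from="10;11" to="10"/>
  <valueMapEntry from="0" to="0"/>
</derivedVariable>
"""


def build_value_map(element):
    """Map each semicolon-separated source value to its target value."""
    mapping = {}
    for entry in element.findall("valueMapEntry"):
        for source in entry.get("from").split(";"):
            mapping[source] = entry.get("to")
    return mapping


def build_relabels(element):
    """Collect relabel mappings from format and stringFormat children."""
    relabels = {}
    for fmt in element.findall("format") + element.findall("stringFormat"):
        for relabel in fmt.findall("relabel"):
            relabels[relabel.get("from")] = relabel.get("to")
    return relabels


def derive(element, source_values):
    """Apply valueMapEntry mappings first, then relabel mappings."""
    value_map = build_value_map(element)
    relabels = build_relabels(element)
    result = []
    for value in source_values:
        key = str(value)
        key = value_map.get(key, key)           # valueMapEntry step
        result.append(relabels.get(key, key))   # relabel step
    return result


root = ET.fromstring(EXAMPLE)
derived = derive(root, [3, 11, 0, 42])
# derived == ['Input', 'Missing Value Handling', '', '42']
```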
2407 @code{valueMapEntry} has the following attributes.
2409 @defvr {Attribute} from
2410 A source value, or multiple source values separated by semicolons,
2411 e.g.@: @code{0} or @code{13;14;15;16}.
2414 @defvr {Attribute} to
2415 The target value, e.g.@: @code{0}.
2418 @node SPV Detail extension Element
2419 @subsection The @code{extension} Element
2421 This is a general-purpose ``extension'' element. Readers that don't
2422 understand a given extension should be able to safely ignore it. The
2423 attributes on this element, and their meanings, vary based on the
2424 context. Each known usage is described separately below. The current
2425 extensions use attributes exclusively, without any nested elements.
2427 @subsubheading @code{container} Parent Element
2430 extension[container_extension] :combinedFootnotes=(true) => EMPTY
2433 With @code{container} as its parent element, @code{extension} has the
2434 following attributes.
2436 @defvr {Attribute} combinedFootnotes
2437 Always set to @code{true} in the corpus.
2440 @subsubheading @code{sourceVariable} and @code{derivedVariable} Parent Element
2443 extension[variable_extension] :from :helpId => EMPTY
2446 With @code{sourceVariable} or @code{derivedVariable} as its parent
2447 element, @code{extension} has the following attributes. A given
2448 parent element often contains several @code{extension} elements that
2449 specify the meaning of the source data's variables or sources, e.g.@:
2452 <extension from="0" helpId="corrected_model"/>
2453 <extension from="3" helpId="error"/>
2454 <extension from="4" helpId="total_9"/>
2455 <extension from="5" helpId="corrected_total"/>
2458 More commonly they are less helpful, e.g.@:
2461 <extension from="0" helpId="notes"/>
2462 <extension from="1" helpId="notes"/>
2463 <extension from="2" helpId="notes"/>
2464 <extension from="5" helpId="notes"/>
2465 <extension from="6" helpId="notes"/>
2466 <extension from="7" helpId="notes"/>
2467 <extension from="8" helpId="notes"/>
2468 <extension from="12" helpId="notes"/>
2469 <extension from="13" helpId="no_help"/>
2470 <extension from="14" helpId="notes"/>
2473 @defvr {Attribute} from
2474 An integer or a name like ``dimension0''.
2477 @defvr {Attribute} helpId
2481 @node SPV Detail graph Element
2482 @subsection The @code{graph} Element
2486 :cellStyle=ref style
2488 => location+ coordinates faceting facetLayout interval
2490 coordinates => EMPTY
2493 @code{graph} has the following attributes.
2495 @defvr {Attribute} cellStyle
2496 @defvrx {Attribute} style
2497 Each of these is the @code{id} of a @code{style} element (@pxref{SPV
2498 Detail style Element}). The former is the default style for
2499 individual cells, the latter for the entire table.
2502 @node SPV Detail location Element
2503 @subsection The @code{location} Element
2507 :part=(height | width | top | bottom | left | right)
2508 :method=(sizeToContent | attach | fixed | same)
2511 :target=ref (labelFrame | graph | container)?
2516 Each instance of this element specifies where some part of the table
2517 frame is located. All the examples in the corpus have four instances
2518 of this element, one for each of the parts @code{height},
2519 @code{width}, @code{left}, and @code{top}. Some examples in the
2520 corpus add a fifth for part @code{bottom}, even though it is not clear
2521 how all of @code{top}, @code{bottom}, and @code{height} can be honored
2522 at the same time. In any case, @code{location} seems to have little
2523 importance in representing tables; a reader can safely ignore it.
2525 @defvr {Attribute} part
2526 The part of the table being located.
2529 @defvr {Attribute} method
2530 How the location is determined:
2534 Based on the natural size of the table. Observed only for
2535 parts @code{height} and @code{width}.
2538 Based on the location specified in @code{target}. Observed only for
2539 parts @code{top} and @code{bottom}.
2542 Using the value in @code{value}. Observed only for parts @code{top},
2543 @code{bottom}, and @code{left}.
2546 Same as the specified @code{target}. Observed only for part
2551 @defvr {Attribute} min
2552 Minimum size. Only observed with value @code{100pt}. Only observed
2553 for part @code{width}.
2556 @defvr {Dependent} target
2557 Required when @code{method} is @code{attach} or @code{same}, not
2558 observed otherwise. This identifies an element to attach to.
2559 Observed with the ID of @code{title}, @code{footnote}, @code{graph},
2563 @defvr {Dependent} value
2564 Required when @code{method} is @code{fixed}, not observed otherwise.
2565 Observed values are @code{0%}, @code{0px}, @code{1px}, and @code{3px}
2566 on parts @code{top} and @code{left}, and @code{100%} on part
2570 @node SPV Detail faceting Element
2571 @subsection The @code{faceting} Element
2574 faceting => layer[layers1]* cross layer[layers2]*
2576 cross => (unity | nest) (unity | nest)
2580 nest => variableReference[vars]+
2582 variableReference :ref=ref (sourceVariable | derivedVariable) => EMPTY
2585 :variable=ref (sourceVariable | derivedVariable)
2588 :method[layer_method]=(nest)?
2593 The @code{faceting} element describes the row, column, and layer
2594 structure of the table. Its @code{cross} child determines the row and
2595 column structure, and each @code{layer} child (if any) represents a
2596 layer. Layers may appear before or after @code{cross}.
2598 The @code{cross} element describes the row and column structure of the
2599 table. It has exactly two children, the first of which describes the
2600 table's columns and the second the table's rows. Each child is a
2601 @code{nest} element if the table has any dimensions along the axis in
2602 question, otherwise a @code{unity} element.
A @code{nest} element contains one or more dimensions listed from
innermost to outermost, each represented by @code{variableReference}
child elements. Each variable in a dimension is listed in order.
2607 @xref{SPV Detail Variable Elements}, for information on the variables
2608 that comprise a dimension.
2610 A @code{nest} can contain a single dimension, e.g.:
2614 <variableReference ref="dimension0categories"/>
2615 <variableReference ref="dimension0group0"/>
2616 <variableReference ref="dimension0"/>
2621 A @code{nest} can contain multiple dimensions, e.g.:
2625 <variableReference ref="dimension1categories"/>
2626 <variableReference ref="dimension1group0"/>
2627 <variableReference ref="dimension1"/>
2628 <variableReference ref="dimension0categories"/>
2629 <variableReference ref="dimension0"/>
2633 A @code{nest} may have no dimensions, in which case it still has one
2634 @code{variableReference} child, which references a
2635 @code{derivedVariable} whose @code{value} attribute is
2636 @code{constant(0)}. In the corpus, such a @code{derivedVariable} has
2637 @code{row} or @code{column}, respectively, as its @code{id}. This is
2638 equivalent to using a @code{unity} element in place of @code{nest}.
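One way to recover the dimensions from a @code{nest} is to group its
@code{variableReference} children by the dimension number embedded in
their stylized names. The Python sketch below does this. The function
name @code{nest_dimensions} is hypothetical, and the sketch assumes
that every variable of a dimension follows the
@code{dimension@var{n}...} naming pattern. It skips any other
reference, such as the @code{row} or @code{column} placeholder, so a
@code{nest} without dimensions yields an empty list.

```python
import re
import xml.etree.ElementTree as ET

# Matches dimensionN, dimensionNcategories, and dimensionNgroupM.
_DIMENSION_VAR = re.compile(r"dimension(\d+)(categories|group\d+)?$")


def nest_dimensions(nest):
    """Return [(dimension_number, [refs...]), ...], innermost first."""
    dimensions = []
    for child in nest.findall("variableReference"):
        ref = child.get("ref")
        match = _DIMENSION_VAR.match(ref)
        if match is None:
            continue  # not a stylized dimension variable
        number = int(match.group(1))
        # A change in dimension number starts a new dimension.
        if not dimensions or dimensions[-1][0] != number:
            dimensions.append((number, []))
        dimensions[-1][1].append(ref)
    return dimensions


nest = ET.fromstring("""\
<nest>
  <variableReference ref="dimension1categories"/>
  <variableReference ref="dimension1group0"/>
  <variableReference ref="dimension1"/>
  <variableReference ref="dimension0categories"/>
  <variableReference ref="dimension0"/>
</nest>
""")
dims = nest_dimensions(nest)
```

For the two-dimension example above, @code{dims} lists dimension 1
first, with three variables, and then dimension 0, with two.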
2640 A @code{variableReference} element refers to a variable through its
2641 @code{ref} attribute.
2643 Each @code{layer} element represents a dimension, e.g.:
2646 <layer value="0" variable="dimension0categories" visible="true"/>
2647 <layer value="dimension0" variable="dimension0" visible="false"/>
2651 @code{layer} has the following attributes.
2653 @defvr {Attribute} variable
2654 Refers to a @code{sourceVariable} or @code{derivedVariable} element.
2657 @defvr {Attribute} value
2658 The value to select. For a category variable, this is always
2659 @code{0}; for a data variable, it is the same as the @code{variable}
2663 @defvr {Attribute} visible
2664 Whether the layer is visible. Generally, category layers are visible
2665 and data layers are not, but sometimes this attribute is omitted.
2668 @defvr {Attribute} method
2669 When present, this is always @code{nest}.
2672 @node SPV Detail facetLayout Element
2673 @subsection The @code{facetLayout} Element
2676 facetLayout => tableLayout setCellProperties[scp1]*
2677 facetLevel+ setCellProperties[scp2]*
2680 :verticalTitlesInCorner=bool
2682 :fitCells=(ticks both)?
2686 The @code{facetLayout} element and its descendants control styling for
Its @code{tableLayout} child has the following attributes.
2691 @defvr {Attribute} verticalTitlesInCorner
2692 If true, in the absence of corner text, row headings will be displayed
2696 @defvr {Attribute} style
2697 Refers to a @code{style} element.
2700 @defvr {Attribute} fitCells
2704 @subsubheading The @code{facetLevel} Element
2707 facetLevel :level=int :gap=dimension? => axis
2709 axis :style=ref style => label? majorTicks
2715 :tickFrameStyle=ref style
2716 :labelFrequency=int?
2726 Each @code{facetLevel} describes a @code{variableReference} or
2727 @code{layer}, and a table has one @code{facetLevel} element for
2728 each such element. For example, an SPV detail member that contains
2729 four @code{variableReference} elements and two @code{layer} elements
2730 will contain six @code{facetLevel} elements.
2732 In the corpus, @code{facetLevel} elements and the elements that they
2733 describe are always in the same order. The correspondence may also be
2734 observed in two other ways. First, one may use the @code{level}
2735 attribute, described below. Second, in the corpus, a
2736 @code{facetLevel} always has an @code{id} that is the same as the
2737 @code{id} of the element it describes with @code{_facetLevel}
2738 appended. One should not formally rely on this, of course, but it is
2739 usefully indicative.
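The correspondence through @code{level} can be sketched in Python as
follows. The function name @code{facet_level_target} is hypothetical.
The sketch assumes that the 1-based numbering counts all of the
@code{variableReference} elements first and then the @code{layer}
elements, which is what the description of @code{level} suggests but
which has not been checked against every file.

```python
def facet_level_target(level, variable_references, layers):
    """Return the element that a facetLevel with this `level` describes.

    variable_references -- variableReference elements, in document order
    layers              -- layer elements, in document order
    """
    elements = list(variable_references) + list(layers)
    if not 1 <= level <= len(elements):
        raise IndexError("level %d is out of range" % level)
    return elements[level - 1]  # level is 1-based


refs = ["dimension1categories", "dimension1",
        "dimension0categories", "dimension0"]
layer_vars = ["dimension2categories", "dimension2"]
first_layer = facet_level_target(5, refs, layer_vars)
```

With four @code{variableReference} elements, level 5 selects the first
@code{layer}.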
2741 @defvr {Attribute} level
A 1-based index into the @code{variableReference} and @code{layer}
elements, e.g.@: a @code{facetLevel} with a @code{level} of 1
describes the first @code{variableReference} in the SPV detail member,
and in a member with four @code{variableReference} elements, a
@code{facetLevel} with a @code{level} of 5 describes the first
@code{layer} in the member.
2750 @defvr {Attribute} gap
2751 Always observed as @code{0pt}.
2754 Each @code{facetLevel} contains an @code{axis}, which in turn may
2755 contain a @code{label} for the @code{facetLevel} (@pxref{SPV Detail
2756 label Element}) and does contain a @code{majorTicks} element.
2758 @defvr {Attribute} labelAngle
2759 Normally 0. The value -90 causes inner column or outer row labels to
2760 be rotated vertically.
2763 @defvr {Attribute} style
2764 @defvrx {Attribute} tickFrameStyle
2765 Each refers to a @code{style} element. @code{style} is the style of
2766 the tick labels, @code{tickFrameStyle} the style for the frames around
2770 @node SPV Detail label Element
2771 @subsection The @code{label} Element
2776 :textFrameStyle=ref style?
2777 :purpose=(title | subTitle | subSubTitle | layer | footnote)?
2778 => text+ | descriptionGroup
2781 :target=ref faceting
2783 => (description | text)+
2785 description :name=(variable | value) => EMPTY
2789 :definesReference=int?
2790 :position=(subscript | superscript)?
2795 This element represents a label on some aspect of the table.
2797 @defvr {Attribute} style
2798 @defvrx {Attribute} textFrameStyle
2799 Each of these refers to a @code{style} element. @code{style} is the
2800 style of the label text, @code{textFrameStyle} the style for the frame
2804 @defvr {Attribute} purpose
2805 The kind of entity being labeled.
2808 A @code{descriptionGroup} concatenates one or more elements to form a
2809 label. Each element can be a @code{text} element, which contains
2810 literal text, or a @code{description} element that substitutes a value
2813 @defvr {Attribute} target
2814 The @code{id} of an element being described. In the corpus, this is
2815 always @code{faceting}.
2818 @defvr {Attribute} separator
2819 A string to separate the description of multiple groups, if the
2820 @code{target} has more than one. In the corpus, this is always a
2824 Typical contents for a @code{descriptionGroup} are a value by itself:
2826 <description name="value"/>
2828 @noindent or a variable and its value, separated by a colon:
2830 <description name="variable"/><text>:</text><description name="value"/>
2833 A @code{description} is like a macro that expands to some property of
2834 the target of its parent @code{descriptionGroup}. The @code{name}
2835 attribute specifies the property.
2837 @node SPV Detail setCellProperties Element
2838 @subsection The @code{setCellProperties} Element
2842 :applyToConverse=bool?
2843 => (setStyle | setFrameStyle | setFormat | setMetaData)* union[union_]?
2846 The @code{setCellProperties} element sets style properties of cells or
2847 row or column labels.
2849 Interpreting @code{setCellProperties} requires answering two
2850 questions: which cells or labels to style, and what styles to use.
2852 @subsubheading Which Cells?
2857 intersect => where+ | intersectWhere | alternating | EMPTY
2860 :variable=ref (sourceVariable | derivedVariable)
2865 :variable=ref (sourceVariable | derivedVariable)
2866 :variable2=ref (sourceVariable | derivedVariable)
2869 alternating => EMPTY
2872 When @code{union} is present with @code{intersect} children, each of
2873 those children specifies a group of cells that should be styled, and
2874 the total group is all those cells taken together. When @code{union}
2875 is absent, every cell is styled. One attribute on
2876 @code{setCellProperties} affects the choice of cells:
2878 @defvr {Attribute} applyToConverse
2879 If true, this inverts the meaning of the cell selection: the selected
2880 cells are the ones @emph{not} designated. This is confusing, given
2881 the additional restrictions of @code{union}, but in the corpus
2882 @code{applyToConverse} is never present along with @code{union}.
An @code{intersect} specifies restrictions on the cells to be matched.
Each @code{where} child specifies which values of a given variable to
include. The attributes of @code{where} are:
2889 @defvr {Attribute} variable
2890 Refers to a variable, e.g.@: @code{dimension0categories}. Only
2891 ``categories'' variables make sense here, but other variables, e.g.@:
2892 @code{dimension0group0map}, are sometimes seen. The reader may ignore
2896 @defvr {Attribute} include
2897 A value, or multiple values separated by semicolons,
2898 e.g.@: @code{0} or @code{13;14;15;16}.
2901 PSPP ignores @code{setCellProperties} when @code{intersectWhere} is
2904 @subsubheading What Styles?
2908 :target=ref (labeling | graph | interval | majorTicks)
2912 setMetaData :target=ref graph :key :value => EMPTY
2915 :target=ref (majorTicks | labeling)
2917 => format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat
2921 :target=ref majorTicks
2925 The @code{set*} children of @code{setCellProperties} determine the
2928 When @code{setCellProperties} contains a @code{setFormat} whose
2929 @code{target} references a @code{labeling} element, or if it contains
2930 a @code{setStyle} that references a @code{labeling} or @code{interval}
2931 element, the @code{setCellProperties} sets the style for table cells.
2932 The format from the @code{setFormat}, if present, replaces the cells'
2933 format. The style from the @code{setStyle} that references
2934 @code{labeling}, if present, replaces the label's font and cell
2935 styles, except that the background color is taken instead from the
2936 @code{interval}'s style, if present.
2938 When @code{setCellProperties} contains a @code{setFormat} whose
2939 @code{target} references a @code{majorTicks} element, or if it
2940 contains a @code{setStyle} whose @code{target} references a
2941 @code{majorTicks}, or if it contains a @code{setFrameStyle} element,
2942 the @code{setCellProperties} sets the style for row or column labels.
2943 In this case, the @code{setCellProperties} always contains a single
2944 @code{where} element whose @code{variable} designates the variable
2945 whose labels are to be styled. The format from the @code{setFormat},
2946 if present, replaces the labels' format. The style from the
2947 @code{setStyle} that references @code{majorTicks}, if present,
2948 replaces the labels' font and cell styles, except that the background
2949 color is taken instead from the @code{setFrameStyle}'s style, if
2952 When @code{setCellProperties} contains a @code{setStyle} whose
2953 @code{target} references a @code{graph} element, and one that
2954 references a @code{labeling} element, and the @code{union} element
2955 contains @code{alternating}, the @code{setCellProperties} sets the
2956 alternate foreground and background colors for the data area. The
2957 foreground color is taken from the style referenced by the
2958 @code{setStyle} that targets the @code{graph}, the background color
2959 from the @code{setStyle} for @code{labeling}.
2961 A reader may ignore a @code{setCellProperties} that only contains
2962 @code{setMetaData}, as well as @code{setMetaData} within other
2963 @code{setCellProperties}.
2965 A reader may ignore a @code{setCellProperties} whose only @code{set*}
2966 child is a @code{setStyle} that targets the @code{graph} element.
2968 @subsubheading The @code{setStyle} Element
2972 :target=ref (labeling | graph | interval | majorTicks)
2977 This element associates a style with the target.
2979 @defvr {Attribute} target
2980 The @code{id} of an element whose style is to be set.
2983 @defvr {Attribute} style
2984 The @code{id} of a @code{style} element that identifies the style to
2988 @node SPV Detail setFormat Element
2989 @subsection The @code{setFormat} Element
2993 :target=ref (majorTicks | labeling)
2995 => format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat
2998 This element sets the format of the target, ``format'' in this case
2999 meaning the SPSS print format for a variable.
3001 The details of this element vary depending on the schema version, as
3002 declared in the root @code{visualization} element's @code{version}
3003 attribute (@pxref{SPV Detail visualization Element}). A reader can
3004 interpret the content without knowing the schema version.
3006 The @code{setFormat} element itself has the following attributes.
3008 @defvr {Attribute} target
Refers to an element whose format is to be set.
3012 @defvr {Attribute} reset
3013 If this is @code{true}, this format replaces the target's previous
format. If it is @code{false}, it modifies the previous format.
3018 * SPV Detail numberFormat Element::
3019 * SPV Detail stringFormat Element::
3020 * SPV Detail dateTimeFormat Element::
3021 * SPV Detail elapsedTimeFormat Element::
3022 * SPV Detail format Element::
3023 * SPV Detail affix Element::
3026 @node SPV Detail numberFormat Element
3027 @subsubsection The @code{numberFormat} Element
3031 :minimumIntegerDigits=int?
3032 :maximumFractionDigits=int?
3033 :minimumFractionDigits=int?
3035 :scientific=(onlyForSmall | whenNeeded | true | false)?
3042 Specifies a format for displaying a number. The available options are
3043 a superset of those available from PSPP print formats. PSPP chooses a
3044 print format type for a @code{numberFormat} as follows:
3048 If @code{scientific} is @code{true}, uses @code{E} format.
3051 If @code{prefix} is @code{$}, uses @code{DOLLAR} format.
3054 If @code{suffix} is @code{%}, uses @code{PCT} format.
3057 If @code{useGrouping} is @code{true}, uses @code{COMMA} format.
3060 Otherwise, uses @code{F} format.
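
The selection rules above can be sketched as a short function. This is a minimal illustration, not PSPP's actual code: the function name and the convention of passing attribute values as strings (or @code{None} when absent) are assumptions made for the example.

```python
def number_format_type(scientific=None, prefix=None, suffix=None,
                       use_grouping=None):
    """Pick a print format type name for a numberFormat element.

    Arguments are the raw attribute strings, or None when the attribute
    is absent (a hypothetical calling convention for this sketch).
    """
    # The rules are tried in the order the text lists them.
    if scientific == "true":
        return "E"
    if prefix == "$":
        return "DOLLAR"
    if suffix == "%":
        return "PCT"
    if use_grouping == "true":
        return "COMMA"
    return "F"
```

Because the rules are ordered, an element with both @code{scientific="true"} and @code{prefix="$"} maps to @code{E}, not @code{DOLLAR}.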
3063 For translating to a print format, PSPP uses
3064 @code{maximumFractionDigits} as the number of decimals, unless that
3065 attribute is missing or out of the range [0,15], in which case it uses
3068 @defvr {Attribute} minimumIntegerDigits
3069 Minimum number of digits to display before the decimal point. Always
3070 observed as @code{0}.
3073 @defvr {Attribute} maximumFractionDigits
3074 @defvrx {Attribute} minimumFractionDigits
3075 Maximum or minimum, respectively, number of digits to display after
3076 the decimal point. The observed values of each attribute range from 0
3080 @defvr {Attribute} useGrouping
3081 Whether to use the grouping character to group digits in large
3085 @defvr {Attribute} scientific
3086 This attribute controls when and whether the number is formatted in
3087 scientific notation. It takes the following values:
3091 Use scientific notation only when the number's magnitude is smaller
3092 than the value of the @code{small} attribute.
3095 Use scientific notation when the number will not otherwise fit in the
3099 Always use scientific notation. Not observed in the corpus.
3102 Never use scientific notation. A number that won't otherwise fit will
3103 be replaced by an error indication (see the @code{errorCharacter}
3104 attribute). Not observed in the corpus.
3108 @defvr {Attribute} small
3109 Only present when the @code{scientific} attribute is
3110 @code{onlyForSmall}, this is a numeric magnitude below which the
3111 number will be formatted in scientific notation. The values @code{0}
3112 and @code{0.0001} have been observed. The value @code{0} seems like a
3113 pathological choice, since no real number has a magnitude less than 0;
3114 perhaps in practice such a choice is equivalent to setting
3115 @code{scientific} to @code{false}.
3118 @defvr {Attribute} prefix
3119 @defvrx {Attribute} suffix
3120 Specifies a prefix or a suffix to apply to the formatted number. Only
3121 @code{suffix} has been observed, with value @samp{%}.
3124 @node SPV Detail stringFormat Element
3125 @subsubsection The @code{stringFormat} Element
3128 stringFormat => relabel* affix*
3130 relabel :from=real :to => EMPTY
3133 The @code{stringFormat} element specifies how to display a string. By
3134 default, a string is displayed verbatim, but @code{relabel} can change
3137 The @code{relabel} element appears as a child of @code{stringFormat}
3138 (and of @code{format}, when it is used to format strings). It
3139 specifies how to display a given value. It is used to implement value
3140 labels and to display the system-missing value in a human-readable
3141 way. It has the following attributes:
3143 @defvr {Attribute} from
3144 The value to map. In the corpus this is an integer or the
3145 system-missing value @code{-1.797693134862316E300}.
3148 @defvr {Attribute} to
3149 The string to display in place of the value of @code{from}. In the
3150 corpus this is a wide variety of value labels; the system-missing
3151 value is mapped to @samp{.}.
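
As a rough sketch of how a reader might apply @code{relabel} children, the following builds a lookup table from a @code{stringFormat} fragment. The fragment is namespace-free and the helper names are hypothetical; real detail XML may need namespace handling that is not shown here.

```python
import xml.etree.ElementTree as ET

# The system-missing value as it is written in the corpus.
SYSMIS = -1.797693134862316e300


def relabel_map(string_format_xml):
    """Return {from value: to string} for the relabel children."""
    root = ET.fromstring(string_format_xml)
    return {float(r.get("from")): r.get("to")
            for r in root.findall("relabel")}


def display(value, mapping):
    """Show a numeric value through the mapping, else verbatim."""
    return mapping.get(float(value), str(value))
```

For example, with @code{<relabel from="1" to="Male"/>} in the fragment, @code{display(1, mapping)} yields @code{Male}, and a value with no @code{relabel} is shown unchanged.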
3154 @node SPV Detail dateTimeFormat Element
3155 @subsubsection The @code{dateTimeFormat} Element
3159 :baseFormat[dt_base_format]=(date | time | dateTime)
3161 :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
3163 :yearAbbreviation=bool?
3168 :monthFormat=(long | short | number | paddedNumber)?
3172 :showDayOfWeek=bool?
3173 :dayOfWeekAbbreviation=bool?
3175 :dayOfMonthPadding=bool?
3177 :minutePadding=bool?
3178 :secondPadding=bool?
3184 :dayType=(month | year)?
3185 :hourFormat=(AMPM | AS_24 | AS_12)?
3189 This element appears only in schema version 2.5 and earlier
3190 (@pxref{SPV Detail visualization Element}).
3192 Data to be formatted in date formats is stored as strings in legacy
3193 data, in the format @code{yyyy-mm-ddTHH:MM:SS.SSS}, and must be parsed
3194 and reformatted by the reader.
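
A minimal sketch of the parsing step, using Python's standard @code{datetime} module. The function name is hypothetical, and the sketch assumes the fractional-seconds part is always present, as in the format shown above.

```python
from datetime import datetime


def parse_legacy_datetime(text):
    """Parse a yyyy-mm-ddTHH:MM:SS.SSS string from legacy data."""
    # %f accepts one to six fractional digits, so .SSS is covered.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")
```

The resulting @code{datetime} object can then be reformatted according to the attributes of the @code{dateTimeFormat} element.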
3196 The following attribute is required.
3198 @defvr {Attribute} baseFormat
3199 Specifies whether a date and time are both to be displayed, or just
3203 Many of the attributes' meanings are obvious. The following seem to
3204 be worth documenting.
3206 @defvr {Attribute} separatorChars
3207 Exactly four characters. In order, these are used for: decimal point,
3208 grouping, date separator, time separator. Always @samp{.,-:}.
3211 @defvr {Attribute} mdyOrder
3212 Within a date, the order of the days, months, and years.
3213 @code{dayMonthYear} is the only observed value, but one would expect
3214 @code{monthDayYear} and @code{yearMonthDay} to be reasonable as
3218 @defvr {Attribute} showYear
3219 @defvrx {Attribute} yearAbbreviation
3220 Whether to include the year and, if so, whether the year should be
3221 shown abbreviated, that is, with only 2 digits. Each is @code{true}
3222 or @code{false}; only values of @code{true} and @code{false},
3223 respectively, have been observed.
3226 @defvr {Attribute} showMonth
3227 @defvrx {Attribute} monthFormat
3228 Whether to include the month (@code{true} or @code{false}) and, if so,
3229 how to format it. @code{monthFormat} is one of the following:
3233 The full name of the month, e.g.@: in an English locale,
3237 The abbreviated name of the month, e.g.@: in an English locale,
3241 The number representing the month, e.g.@: 9 for September.
3244 A two-digit number representing the month, e.g.@: 09 for September.
3247 Only values of @code{true} and @code{short}, respectively, have been
3251 @defvr {Attribute} dayType
3252 This attribute is always @code{month} in the corpus, specifying that
3253 the day of the month is to be displayed; a value of @code{year} is
3254 supposed to indicate that the day of the year, where 1 is January 1,
3255 is to be displayed instead.
3258 @defvr {Attribute} hourFormat
3259 @code{hourFormat}, if present, is one of:
3263 The time is displayed with an @code{am} or @code{pm} suffix, e.g.@:
3267 The time is displayed in a 24-hour format, e.g.@: @code{22:15}.
3269 This is the only value observed in the corpus.
3272 The time is displayed in a 12-hour format, without distinguishing
3273 morning or evening, e.g.@: @code{10:15}.
3276 @code{hourFormat} is sometimes present for @code{elapsedTime} formats,
3277 which is confusing since a time duration does not have a concept of AM
3278 or PM. This might indicate a bug in the code that generated the XML
3279 in the corpus, or it might indicate that @code{elapsedTime} is
3280 sometimes used to format a time of day.
3283 For a @code{baseFormat} of @code{date}, PSPP chooses a print format
3284 type based on the following rules:
3288 If @code{showQuarter} is true: @code{QYR}.
3291 Otherwise, if @code{showWeek} is true: @code{WKYR}.
3294 Otherwise, if @code{mdyOrder} is @code{dayMonthYear}:
3298 If @code{monthFormat} is @code{number} or @code{paddedNumber}: @code{EDATE}.
3301 Otherwise: @code{DATE}.
3305 Otherwise, if @code{mdyOrder} is @code{yearMonthDay}: @code{SDATE}.
3308 Otherwise, @code{ADATE}.
3311 For a @code{baseFormat} of @code{dateTime}, PSPP uses @code{YMDHMS} if
3312 @code{mdyOrder} is @code{yearMonthDay} and @code{DATETIME} otherwise.
3313 For a @code{baseFormat} of @code{time}, PSPP uses @code{DTIME} if
3314 @code{showDay} is true, otherwise @code{TIME} if @code{showHour} is
3315 true, otherwise @code{MTIME}.
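
The type-selection rules for all three base formats can be summarized in code. This is only a sketch of the rules as stated above; the function and parameter names are made up for the example, and boolean attributes are assumed to have been converted to Python booleans already.

```python
def date_time_format_type(base_format, mdy_order=None, month_format=None,
                          show_quarter=False, show_week=False,
                          show_day=False, show_hour=False):
    """Choose a print format type name for a dateTimeFormat element."""
    if base_format == "date":
        if show_quarter:
            return "QYR"
        if show_week:
            return "WKYR"
        if mdy_order == "dayMonthYear":
            if month_format in ("number", "paddedNumber"):
                return "EDATE"
            return "DATE"
        if mdy_order == "yearMonthDay":
            return "SDATE"
        return "ADATE"
    if base_format == "dateTime":
        return "YMDHMS" if mdy_order == "yearMonthDay" else "DATETIME"
    if base_format == "time":
        if show_day:
            return "DTIME"
        return "TIME" if show_hour else "MTIME"
    raise ValueError("unknown baseFormat: %r" % (base_format,))
```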
3317 For a @code{baseFormat} of @code{date}, the chosen width is the
3318 minimum for the format type, adding 2 if @code{yearAbbreviation} is
3319 false or omitted. For other base formats, the chosen width is the
3320 minimum for its type, plus 3 if @code{showSecond} is true, plus 4 more
3321 if @code{showMillis} is also true. Decimals are 0 by default, or 3
3322 if @code{showMillis} is true.
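
The width and decimals computation might look like the sketch below. The minimum width for the chosen format type is passed in by the caller rather than looked up, and the names are hypothetical. The sketch reads the paragraph above as applying decimals only to the non-date base formats, and only when seconds are shown; that reading is an assumption.

```python
def date_time_width_and_decimals(base_format, min_width,
                                 year_abbreviation=None,
                                 show_second=False, show_millis=False):
    """Return (width, decimals) for a dateTimeFormat element.

    min_width is the minimum width of the chosen print format type,
    supplied by the caller.  year_abbreviation is None when omitted.
    """
    width = min_width
    decimals = 0
    if base_format == "date":
        # A full four-digit year needs two more columns.
        if not year_abbreviation:
            width += 2
    elif show_second:
        width += 3                 # ":SS"
        if show_millis:
            width += 4             # ".SSS"
            decimals = 3
    return width, decimals
```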
3324 @node SPV Detail elapsedTimeFormat Element
3325 @subsubsection The @code{elapsedTimeFormat} Element
3329 :baseFormat[dt_base_format]=(date | time | dateTime)
3332 :minutePadding=bool?
3333 :secondPadding=bool?
3343 This element specifies the way to display a time duration.
3345 Data to be formatted in elapsed time formats is stored as strings in
3346 legacy data, in the format @code{H:MM:SS.SSS}, with additional hour
3347 digits as needed for long durations, and must be parsed and
3348 reformatted by the reader.
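
A reader could convert such a string to a number of seconds as in the following sketch. The function name is hypothetical, and the sketch assumes non-negative durations with exactly three colon-separated fields.

```python
def parse_elapsed_time(text):
    """Convert an H:MM:SS.SSS string to a duration in seconds."""
    hours, minutes, seconds = text.split(":")
    # The hours field may have any number of digits for long durations.
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```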
3350 The following attribute is required.
3352 @defvr {Attribute} baseFormat
3353 Specifies whether a day and a time are both to be displayed, or just
3357 The remaining attributes specify exactly how to display the elapsed
3360 For @code{baseFormat} of @code{time}, PSPP converts this element to
3361 print format type @code{DTIME}; otherwise, if @code{showHour} is true,
3362 to @code{TIME}; otherwise, to @code{MTIME}. The chosen width is the
3363 minimum for the chosen type, adding 3 if @code{showSecond} is true,
3364 adding 4 more if @code{showMillis} is also true. Decimals are 0 by
3365 default, or 3 if @code{showMillis} is true.
3367 @node SPV Detail format Element
3368 @subsubsection The @code{format} Element
3372 :baseFormat[f_base_format]=(date | time | dateTime | elapsedTime)?
3375 :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
3380 :yearAbbreviation=bool?
3382 :monthFormat=(long | short | number | paddedNumber)?
3384 :dayOfMonthPadding=bool?
3388 :showDayOfWeek=bool?
3389 :dayOfWeekAbbreviation=bool?
3391 :minutePadding=bool?
3392 :secondPadding=bool?
3398 :dayType=(month | year)?
3399 :hourFormat=(AMPM | AS_24 | AS_12)?
3400 :minimumIntegerDigits=int?
3401 :maximumFractionDigits=int?
3402 :minimumFractionDigits=int?
3404 :scientific=(onlyForSmall | whenNeeded | true | false)?
3408 :tryStringsAsNumbers=bool?
3409 :negativesOutside=bool?
3413 This element is the union of all of the more-specific format elements.
3414 It is interpreted in the same way as one of those format elements,
3415 using @code{baseFormat} to determine which kind of format to use.
3417 There are a few attributes not present in the more specific formats:
3419 @defvr {Attribute} tryStringsAsNumbers
3420 When this is @code{true}, it is supposed to indicate that string
3421 values should be parsed as numbers and then displayed according to
3422 numeric formatting rules. However, in the corpus it is always
3426 @defvr {Attribute} negativesOutside
3427 If true, the negative sign should be shown before the prefix; if
3428 false, it should be shown after.
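
To illustrate the effect of @code{negativesOutside}, here is a small sketch that attaches a prefix to an already formatted number. The function name and the idea of operating on formatted text are assumptions for the example, not a description of any real implementation.

```python
def apply_prefix(number_text, prefix, negatives_outside):
    """Attach prefix to number_text, placing any minus sign as requested."""
    if not number_text.startswith("-"):
        return prefix + number_text
    digits = number_text[1:]
    if negatives_outside:
        return "-" + prefix + digits   # sign before the prefix
    return prefix + "-" + digits       # sign after the prefix
```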
3431 @node SPV Detail affix Element
3432 @subsubsection The @code{affix} Element
3436 :definesReference=int
3437 :position=(subscript | superscript)
3443 This defines a suffix (or, theoretically, a prefix) for a formatted
3444 value. It is used to insert a reference to a footnote. It has the
3445 following attributes:
3447 @defvr {Attribute} definesReference
3448 This specifies the footnote number as a natural number: 1 for the
3449 first footnote, 2 for the second, and so on.
3452 @defvr {Attribute} position
3453 Position for the footnote label. Always @code{superscript}.
3456 @defvr {Attribute} suffix
3457 Whether the affix is a suffix (@code{true}) or a prefix
3458 (@code{false}). Always @code{true}.
3461 @defvr {Attribute} value
3462 The text of the suffix or prefix. Typically a letter, e.g.@: @code{a}
3463 for footnote 1, @code{b} for footnote 2, @enddots{} The corpus
3464 contains other values: @code{*}, @code{**}, and a few that begin with
3465 at least one comma: @code{,b}, @code{,c}, @code{,,b}, and @code{,,c}.
3468 @node SPV Detail interval Element
3469 @subsection The @code{interval} Element
3472 interval :style=ref style => labeling footnotes?
3476 :variable=ref (sourceVariable | derivedVariable)
3477 => (formatting | format | footnotes)*
3479 formatting :variable=ref (sourceVariable | derivedVariable) => formatMapping*
3481 formatMapping :from=int => format?
3485 :variable=ref (sourceVariable | derivedVariable)
3488 footnoteMapping :definesReference=int :from=int :to => EMPTY
3491 The @code{interval} element and its descendants determine the basic
3492 formatting and labeling for the table's cells. These basic styles are
3493 overridden by more specific styles set using @code{setCellProperties}
3494 (@pxref{SPV Detail setCellProperties Element}).
3496 The @code{style} attribute of @code{interval} itself may be ignored.
3498 The @code{labeling} element may have a single @code{formatting} child.
3499 If present, its @code{variable} attribute refers to a variable whose
3500 values are format specifiers as numbers, e.g.@: value 0x050802 for F8.2.
3501 However, the numbers are not actually interpreted that way. Instead,
3502 each number actually present in the variable's data is mapped by a
3503 @code{formatMapping} child of @code{formatting} to a @code{format}
3504 that specifies how to display it.
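
For reference, the numeric encoding in the example above packs one field per byte. The sketch below splits such a number into its three fields. The function name is hypothetical, and, as noted above, a reader should look up each value through @code{formatMapping} rather than decoding it this way.

```python
def split_format_number(value):
    """Split a packed format number into (type, width, decimals)."""
    value = int(value)
    # One byte per field: 0x050802 -> type 5, width 8, decimals 2.
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
```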
3506 The @code{labeling} element may also have a @code{footnotes} child
3507 element. The @code{variable} attribute of this element refers to a
3508 variable whose values are comma-delimited strings that list the
3509 1-based indexes of footnote references. (Cells without any footnote
3510 references are numeric 0 instead of strings.)
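
A sketch of decoding one value of that variable into a list of footnote indexes follows. The function name is hypothetical; numeric values are treated as "no footnotes", following the note about numeric 0 above.

```python
def footnote_indexes(cell_value):
    """Return the 1-based footnote indexes referenced by a cell."""
    # Cells without footnote references hold numeric 0, not a string.
    if isinstance(cell_value, (int, float)):
        return []
    return [int(part) for part in cell_value.split(",") if part.strip()]
```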
3512 Each @code{footnoteMapping} child of the @code{footnotes} element
3513 gives, in its @code{to} attribute, the marker text for the footnote
3514 whose 1-based index is given by its @code{definesReference} attribute.
3517 @node SPV Detail style Element
3518 @subsection The @code{style} Element
3525 :border-bottom=(solid | thick | thin | double | none)?
3526 :border-top=(solid | thick | thin | double | none)?
3527 :border-left=(solid | thick | thin | double | none)?
3528 :border-right=(solid | thick | thin | double | none)?
3529 :border-bottom-color?
3532 :border-right-color?
3535 :font-weight=(regular | bold)?
3536 :font-style=(regular | italic)?
3537 :font-underline=(none | underline)?
3538 :margin-bottom=dimension?
3539 :margin-left=dimension?
3540 :margin-right=dimension?
3541 :margin-top=dimension?
3542 :textAlignment=(left | right | center | decimal | mixed)?
3543 :labelLocationHorizontal=(positive | negative | center)?
3544 :labelLocationVertical=(positive | negative | center)?
3545 :decimal-offset=dimension?
3552 A @code{style} element has an effect only when it is referenced by
3553 another element to set some aspect of the table's style. Most of the
3554 attributes are self-explanatory. The rest are described below.
3556 @defvr {Attribute} {color}
3557 In some cases, the text color; in others, the background color.
3560 @defvr {Attribute} {color2}
3564 @defvr {Attribute} {labelAngle}
3565 Normally 0. The value -90 causes inner column or outer row labels to
3566 be rotated vertically.
3569 @defvr {Attribute} {labelLocationHorizontal}
3573 @defvr {Attribute} {labelLocationVertical}
3574 The value @code{positive} corresponds to vertically aligning text to
3575 the top of a cell, @code{negative} to the bottom, @code{center} to the
3579 @node SPV Detail labelFrame Element
3580 @subsection The @code{labelFrame} Element
3583 labelFrame :style=ref style => location+ label? paragraph?
3585 paragraph :hangingIndent=dimension? => EMPTY
3588 A @code{labelFrame} element specifies content and style for some
3589 aspect of a table. Only @code{labelFrame} elements that have a
3590 @code{label} child are important. The @code{purpose} attribute in the
3591 @code{label} determines what the @code{labelFrame} affects:
3595 The table's title and its style.
3598 The table's caption and its style.
3601 The table's footnotes and the style for the footer area.
3604 The style for the layer area.
3610 The @code{style} attribute references the style to use for the area.
3612 The @code{label}, if present, specifies the text to put into the title
3613 or caption or footnotes. For footnotes, the label has two @code{text}
3614 children for every footnote, each of which has a @code{usesReference}
3615 attribute identifying the 1-based index of a footnote. The first,
3616 third, fifth, @dots{} @code{text} child specifies the content for a
3617 footnote; the second, fourth, sixth, @dots{} child specifies the
3618 marker. Content tends to end in a new-line, which the reader may wish
3619 to trim; similarly, markers tend to end in @samp{.}.
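
The pairing of @code{text} children described above can be sketched as follows. The input is assumed to be a list of (@code{usesReference}, text) pairs in document order, already extracted from the XML; that representation and the function name are assumptions for the example.

```python
def pair_footnotes(texts):
    """Map footnote index -> (marker, content).

    texts holds (usesReference, text) pairs in document order:
    content, marker, content, marker, ...
    """
    footnotes = {}
    for (ref, content), (_, marker) in zip(texts[0::2], texts[1::2]):
        # Trim the trailing new-line from content and "." from markers.
        if content.endswith("\n"):
            content = content[:-1]
        if marker.endswith("."):
            marker = marker[:-1]
        footnotes[int(ref)] = (marker, content)
    return footnotes
```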
3621 The @code{paragraph}, if present, may be ignored, since it is always
3624 @node SPV Detail Legacy Properties
3625 @subsection Legacy Properties
3627 The detail XML format has features for styling most of the aspects of
3628 a table. It also inherits defaults for many aspects from structure
3629 XML, which has the following @code{tableProperties} element:
3633 => generalProperties footnoteProperties cellFormatProperties borderProperties printingProperties
3636 :hideEmptyRows=bool?
3637 :maximumColumnWidth=dimension?
3638 :maximumRowWidth=dimension?
3639 :minimumColumnWidth=dimension?
3640 :minimumRowWidth=dimension?
3641 :rowDimensionLabels=(inCorner | nested)?
3645 :markerPosition=(superscript | subscript)?
3646 :numberFormat=(alphabetic | numeric)?
3649 cellFormatProperties => cell_style+
3652 :alternatingColor=color?
3653 :alternatingTextColor=color?
3661 :font-style=(regular | italic)?
3662 :font-weight=(regular | bold)?
3663 :labelLocationVertical=(positive | negative | center)?
3664 :margin-bottom=dimension?
3665 :margin-left=dimension?
3666 :margin-right=dimension?
3667 :margin-top=dimension?
3668 :textAlignment=(left | right | center | decimal | mixed)?
3669 :decimal-offset=dimension?
3672 borderProperties => border_style+
3675 :borderStyleType=(none | solid | dashed | thick | thin | double)?
3680 :printAllLayers=bool?
3681 :rescaleLongTableToFitPage=bool?
3682 :rescaleWideTableToFitPage=bool?
3683 :windowOrphanLines=int?
3685 :continuationTextAtBottom=bool?
3686 :continuationTextAtTop=bool?
3687 :printEachLayerOnSeparatePage=bool?