1 @c PSPP - a program for statistical analysis.
2 @c Copyright (C) 2019 Free Software Foundation, Inc.
3 @c Permission is granted to copy, distribute and/or modify this document
4 @c under the terms of the GNU Free Documentation License, Version 1.3
5 @c or any later version published by the Free Software Foundation;
6 @c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
7 @c A copy of the license is included in the section entitled "GNU
8 @c Free Documentation License".
11 @node SPSS Viewer File Format
12 @appendix SPSS Viewer File Format
14 SPSS Viewer or @file{.spv} files, here called SPV files, are written
15 by SPSS 16 and later to represent the contents of its output editor.
16 This chapter documents the format, based on examination of a corpus of
17 about 8,000 files from a variety of sources. This description is
18 detailed enough to both read and write SPV files.
20 SPSS 15 and earlier versions instead use @file{.spo} files, which have
21 a completely different output format based on the Microsoft Compound
22 Document Format. This format is not documented here.
24 An SPV file is a Zip archive that can be read with @command{zipinfo}
25 and @command{unzip} and similar programs. The final member in the Zip
26 archive is the @dfn{manifest}, a file named
27 @file{META-INF/MANIFEST.MF}. This structure makes SPV files resemble
28 Java ``JAR'' files (and ODF files), but whereas a JAR manifest
29 contains a sequence of colon-delimited key/value pairs, an SPV
30 manifest contains the string @samp{allowPivoting=true}, without a
31 new-line. PSPP uses this string to identify an SPV file; it is
32 invariant across the corpus.@footnote{SPV files always begin with the
33 7-byte sequence 50 4b 03 04 14 00 08, but this is not a useful magic
34 number because most Zip archives start the same way.}@footnote{SPSS
35 writes @file{META-INF/MANIFEST.MF} to every SPV file, but it does not
36 read it or even require it to exist, so using different contents,
37 e.g.@: @samp{allowingPivot=false}, has no effect.}
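
The manifest check described above can be sketched as follows.  This is only an illustration, not PSPP's actual implementation: the function names @code{looks_like_spv} and @code{make_archive} are hypothetical, and the sketch compares a prefix of the manifest so that a stray trailing new-line would be tolerated, which is an assumption rather than something observed in the corpus.

```python
import io
import zipfile

MANIFEST_NAME = "META-INF/MANIFEST.MF"
MANIFEST_MAGIC = b"allowPivoting=true"


def looks_like_spv(file):
    """Return True if `file` (a path or a binary file object) is a Zip
    archive whose manifest begins with the SPV marker string."""
    try:
        with zipfile.ZipFile(file) as archive:
            try:
                manifest = archive.read(MANIFEST_NAME)
            except KeyError:
                # The archive has no manifest member at all.
                return False
    except zipfile.BadZipFile:
        # Not a Zip archive (or too damaged for the zipfile module).
        return False
    return manifest.startswith(MANIFEST_MAGIC)


def make_archive(members):
    """Build an in-memory Zip archive from (name, bytes) pairs (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in members:
            archive.writestr(name, data)
    buffer.seek(0)
    return buffer
```

Because SPSS itself does not read the manifest, a reader built this way is stricter than SPSS about what it accepts.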
39 The rest of the members in an SPV file's Zip archive fall into two
40 categories: @dfn{structure} and @dfn{detail} members. Structure
41 member names begin with @file{outputViewer@var{nnnnnnnnnn}}, where
42 each @var{n} is a decimal digit, and end with @file{.xml}, and often
43 include the string @file{_heading} in between. Each of these members
44 represents some kind of output item (a table, a heading, a block of
45 text, etc.) or a group of them. The member whose output goes at the
46 beginning of the document is numbered 0, the next member in the output
47 is numbered 1, and so on.
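
As a rough sketch of the naming rule above, a reader could pick out the structure members and put them in document order like this.  The helper name @code{structure_members} is hypothetical, and accepting arbitrary text between the digits and @file{.xml} is an assumption made to cover the optional @file{_heading}.

```python
import re

# "outputViewer", exactly ten decimal digits, optional middle text such as
# "_heading", then ".xml".
STRUCTURE_NAME = re.compile("outputViewer(" + "[0-9]" * 10 + ").*\\.xml")


def structure_members(names):
    """Return the structure member names found in `names`, in output order."""
    found = []
    for name in names:
        match = STRUCTURE_NAME.fullmatch(name)
        if match:
            # Sort on the numeric value of the ten-digit field.
            found.append((int(match.group(1)), name))
    return [name for _, name in sorted(found)]
```

The input would typically be the list of member names reported by a Zip library.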
49 Structure members contain XML. This XML is sometimes self-contained,
50 but it often references detail members in the Zip archive, which are
54 @item @file{@var{prefix}_table.xml} and @file{@var{prefix}_tableData.bin}
55 @itemx @file{@var{prefix}_lightTableData.bin}
56 The structure of a table plus its data. Older SPV files pair a
57 @file{@var{prefix}_table.xml} file that describes the table's
58 structure with a binary @file{@var{prefix}_tableData.bin} file that
59 gives its data. Newer SPV files (the majority of those in the corpus)
60 instead include a single @file{@var{prefix}_lightTableData.bin} file
61 that incorporates both into a single binary format.
63 @item @file{@var{prefix}_warning.xml} and @file{@var{prefix}_warningData.bin}
64 @itemx @file{@var{prefix}_lightWarningData.bin}
65 Same format used for tables, with a different name.
67 @item @file{@var{prefix}_notes.xml} and @file{@var{prefix}_notesData.bin}
68 @itemx @file{@var{prefix}_lightNotesData.bin}
69 Same format used for tables, with a different name.
71 @item @file{@var{prefix}_chartData.bin} and @file{@var{prefix}_chart.xml}
72 The structure of a chart plus its data. Charts do not have a
75 @item @file{@var{prefix}_pmml.scf}
76 @itemx @file{@var{prefix}_stats.scf}
77 @item @file{@var{prefix}_model.xml}
78 Not yet investigated. The corpus contains few examples.
81 The @file{@var{prefix}} in the names of the detail members is
82 typically an 11-digit decimal number that increases for each item,
83 tending to skip values. Older SPV files use different naming
84 conventions. Structure members refer to detail members by name, and so
85 their exact names do not matter to readers as long as they are unique.
87 SPSS tolerates corrupted Zip archives that Zip reader libraries tend
88 to reject. These can be fixed up with @command{zip -FF}.
91 * SPV Structure Member Format::
92 * SPV Light Detail Member Format::
93 * SPV Legacy Detail Member Binary Format::
94 * SPV Legacy Detail Member XML Format::
97 @node SPV Structure Member Format
98 @section Structure Member Format
100 A structure member lays out the high-level structure for a group of
101 output items such as headings, tables, and charts. Structure members
102 do not include the details of tables and charts but instead refer to
103 them by their member names.
105 Structure members' XML files claim conformance with a collection of
106 XML Schemas. These schemas are distributed, under a nonfree license,
107 with SPSS binaries. Fortunately, the schemas are not necessary to
108 understand the structure members. The schemas can even
109 be deceptive because they document elements and attributes that are
110 not in the corpus and do not document elements and attributes that are
111 commonly found in the corpus.
113 Structure members use a different XML namespace for each schema, but
114 these namespaces are not entirely consistent. In some SPV files, for
115 example, the @code{viewer-tree} schema is associated with namespace
116 @indicateurl{http://xml.spss.com/spss/viewer-tree} and in others with
117 @indicateurl{http://xml.spss.com/spss/viewer/viewer-tree} (note the
118 additional @file{viewer/}). Under either name, the schema URIs are
119 not resolvable to obtain the schemas themselves.
121 One may ignore all of the above in interpreting a structure member.
122 The actual XML has a simple and straightforward form that does not
123 require a reader to take schemas or namespaces into account. A
124 structure member's root is a @code{heading} element, which contains
125 @code{heading} or @code{container} elements (or a mix), forming a
126 tree. In turn, @code{container} holds a @code{label} and one more
127 child, usually @code{text} or @code{table}.
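
To illustrate how little machinery this takes, the following sketch walks the @code{heading}/@code{container} tree while ignoring namespaces.  It is not taken from PSPP; the names @code{local_name}, @code{outline}, and @code{SAMPLE} are hypothetical, and the sample document is a made-up, abbreviated structure member.

```python
import xml.etree.ElementTree as ET


def local_name(element):
    """Drop the namespace prefix that ElementTree puts in front of a tag."""
    return element.tag.rsplit("}", 1)[-1]


def outline(heading, depth=0):
    """Yield (depth, kind, label) for each nested heading and container."""
    for child in heading:
        kind = local_name(child)
        if kind not in ("heading", "container"):
            continue
        label = ""
        for grandchild in child:
            if local_name(grandchild) == "label":
                label = grandchild.text or ""
                break
        yield depth, kind, label
        if kind == "heading":
            # Only headings nest further headings and containers.
            yield from outline(child, depth + 1)


# Hypothetical, abbreviated structure member used only for demonstration.
SAMPLE = """<heading xmlns="http://xml.spss.com/spss/viewer/viewer-tree">
<label>Output</label>
<heading commandName="Descriptives">
<label>Descriptives</label>
<container visibility="visible"><label>Title</label></container>
<container visibility="hidden"><label>Notes</label></container>
</heading>
</heading>"""

items = list(outline(ET.fromstring(SAMPLE)))
```

Matching on local names alone sidesteps the inconsistent namespace URIs mentioned earlier.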
129 The following sections document the elements found in structure
130 members in a context-free grammar-like fashion. Consider the
131 following example, which specifies the attributes and content for the
132 @code{container} element:
136 :visibility=(visible | hidden)
137 :page-break-before=(always)?
138 :text-align=(left | center)?
140 => label (table | container_text | graph | model | object | image | tree)
143 Each attribute specification begins with @samp{:} followed by the
144 attribute's name. If the attribute's value has an easily specified
145 form, then @samp{=} and its description follow the name. Finally, if
146 the attribute is optional, the specification ends with @samp{?}. The
147 following value specifications are defined:
150 @item (@var{a} | @var{b} | @dots{})
151 One of the listed literal strings. If only one string is listed, it
152 is the only acceptable value. If @code{OTHER} is listed, then any
153 string not explicitly listed is also accepted.
156 Either @code{true} or @code{false}.
159 A floating-point number followed by a unit, e.g.@: @code{10pt}. Units
160 in the corpus include @code{in} (inch), @code{pt} (points, 72/inch),
161 @code{px} (``device-independent pixels'', 96/inch), and @code{cm}. If
162 the unit is omitted then points should be assumed. The number and
163 unit may be separated by white space.
165 The corpus also includes localized names for units. A reader must
166 understand these to properly interpret the dimension:
170 @code{인치}, @code{pol.}, @code{cala}, @code{cali}
180 A floating-point number.
186 A color in one of the forms @code{#@var{rr}@var{gg}@var{bb}} or
187 @code{@var{rr}@var{gg}@var{bb}}, or the string @code{transparent}, or
188 one of the standard Web color names.
191 @item ref @var{element}
192 @itemx ref(@var{elem1} | @var{elem2} | @dots{})
193 The name from the @code{id} attribute in some element. If one or more
194 elements are named, the name must refer to one of those elements,
195 otherwise any element is acceptable.
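
As an illustration of the dimension syntax described earlier in this list, the sketch below converts a dimension string to points.  It handles only the four unit names @code{in}, @code{pt}, @code{px}, and @code{cm}; localized unit names are deliberately left out, so a real reader would need more than this.  The names @code{POINTS_PER_UNIT} and @code{dimension_to_points} are hypothetical.

```python
import re

# Points per unit.  px is a "device-independent pixel" at 96 per inch.
POINTS_PER_UNIT = dict([
    ("in", 72.0),
    ("pt", 1.0),
    ("px", 72.0 / 96.0),
    ("cm", 72.0 / 2.54),
    ("", 1.0),  # unit omitted: assume points
])

# A decimal number, optional white space, then an optional unit name.
DIMENSION = re.compile(
    r"\s*([-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+))\s*([^\s0-9]*)\s*")


def dimension_to_points(text):
    """Convert a dimension such as "10pt" or "0.25 in" to points."""
    match = DIMENSION.fullmatch(text)
    if not match:
        raise ValueError("not a dimension: %r" % text)
    number, unit = match.groups()
    if unit not in POINTS_PER_UNIT:
        raise ValueError("unknown unit: %r" % unit)
    return float(number) * POINTS_PER_UNIT[unit]
```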
198 All elements have an optional @code{id} attribute. If present, its
199 value must be unique. In practice many elements are assigned
200 @code{id} attributes that are never referenced.
202 The content specification for an element supports the following
209 @item @var{a} @var{b}
210 @var{a} followed by @var{b}.
212 @item @var{a} | @var{b} | @var{c}
213 One of @var{a} or @var{b} or @var{c}.
216 Zero or one instances of @var{a}.
219 Zero or more instances of @var{a}.
222 One or more instances of @var{a}.
224 @item (@var{subexpression})
225 Grouping for a subexpression.
234 Element and attribute names are sometimes suffixed by another name in
235 square brackets to distinguish different uses of the same name. For
236 example, structure XML has two @code{text} elements, one inside
237 @code{container}, the other inside @code{pageParagraph}. The former
238 is defined as @code{text[container_text]} and referenced as
239 @code{container_text}, the latter defined as
240 @code{text[pageParagraph_text]} and referenced as
241 @code{pageParagraph_text}.
243 This language is used in the PSPP source code for parsing structure
244 and detail XML members. Refer to
245 @file{src/output/spv/structure-xml.grammar} and
246 @file{src/output/spv/detail-xml.grammar} for the full grammars.
248 The following example shows the contents of a typical structure member
249 for a @cmd{DESCRIPTIVES} procedure. A real structure member is not
250 indented. This example also omits most attributes, all XML namespace
251 information, and the CSS from the embedded HTML:
254 <?xml version="1.0" encoding="utf-8"?>
256 <label>Output</label>
257 <heading commandName="Descriptives">
258 <label>Descriptives</label>
261 <text commandName="Descriptives" type="title">
263 <![CDATA[<head><style type="text/css">...</style></head><BR>Descriptives]]>
267 <container visibility="hidden">
269 <table commandName="Descriptives" subType="Notes" type="note">
271 <dataPath>00000000001_lightNotesData.bin</dataPath>
276 <label>Descriptive Statistics</label>
277 <table commandName="Descriptives" subType="Descriptive Statistics"
280 <dataPath>00000000002_lightTableData.bin</dataPath>
289 * SPV Structure heading Element::
290 * SPV Structure label Element::
291 * SPV Structure container Element::
292 * SPV Structure text Element (Inside @code{container})::
293 * SPV Structure html Element::
294 * SPV Structure table Element::
295 * SPV Structure graph Element::
296 * SPV Structure model Element::
297 * SPV Structure tree Element::
298 * SPV Structure Path Elements::
299 * SPV Structure pageSetup Element::
300 * SPV Structure @code{text} Element (Inside @code{pageParagraph})::
303 @node SPV Structure heading Element
304 @subsection The @code{heading} Element
307 heading[root_heading]
313 => label pageSetup? (container | heading)*
318 :visibility[heading_visibility]=(collapsed)?
321 => label (container | heading)*
324 The root of a structure member is a @code{heading}, which represents a
325 section of output beginning with a @code{label} and
326 ordinarily followed by content containers or further nested
327 (sub)-sections of output. Unlike heading elements in HTML and other
328 common document formats, which precede the content that they head,
329 @code{heading} contains the elements that appear below the heading.
331 The document root heading, only, may contain a @code{pageSetup}
334 The following attributes have been observed on both document root and
335 nested @code{heading} elements.
337 @defvr {Attribute} creator-version
338 The version of the software that created this SPV file. A string of
339 the form @code{xxyyzzww} represents software version xx.yy.zz.ww,
340 e.g.@: @code{21000001} is version 21.0.0.1. Trailing pairs of zeros
341 are sometimes omitted, so that @code{21}, @code{210000}, and
342 @code{21000000} are all version 21.0.0.0 (and the corpus contains all
343 three of those forms).
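
A minimal sketch of decoding this attribute, assuming that omitted trailing pairs always stand for zeros as described above (the function name @code{parse_creator_version} is hypothetical):

```python
def parse_creator_version(text):
    """Turn "xxyyzzww" into (xx, yy, zz, ww), padding omitted pairs with 0."""
    if (not text.isascii() or not text.isdigit()
            or len(text) % 2 or len(text) > 8):
        raise ValueError("unexpected creator-version: %r" % text)
    padded = text.ljust(8, "0")
    return tuple(int(padded[i:i + 2]) for i in range(0, 8, 2))
```

With this, @code{21}, @code{210000}, and @code{21000000} all compare equal after parsing.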
347 The following attributes have been observed on document root
348 @code{heading} elements only:
350 @defvr {Attribute} @code{creator}
351 The directory in the file system of the software that created this SPV
355 @defvr {Attribute} @code{creation-date-time}
356 The date and time at which the SPV file was written, in a
357 locale-specific format, e.g.@: @code{Friday, May 16, 2014 6:47:37 PM
358 PDT} or @code{lunedì 17 marzo 2014 3.15.48 CET} or even @code{Friday,
359 December 5, 2014 5:00:19 o'clock PM EST}.
362 @defvr {Attribute} @code{lockReader}
363 Whether a reader should be allowed to edit the output. The possible
364 values are @code{true} and @code{false}. The value @code{false} is by
368 @defvr {Attribute} @code{schemaLocation}
369 This is actually an XML Namespace attribute. A reader may ignore it.
373 The following attributes have been observed only on nested
374 @code{heading} elements:
376 @defvr {Attribute} @code{commandName}
377 A locale-invariant identifier for the command that produced the
378 output, e.g.@: @code{Frequencies}, @code{T-Test}, @code{Non Par Corr}.
381 @defvr {Attribute} @code{visibility}
382 To what degree the output represented by the element is visible.
385 @defvr {Attribute} @code{locale}
386 The locale used for output, in Windows format, which is similar to the
387 format used in Unix with the underscore replaced by a hyphen, e.g.@:
388 @code{en-US}, @code{en-GB}, @code{el-GR}, @code{sr-Cyrl-RS}.
391 @defvr {Attribute} @code{olang}
392 The output language, e.g.@: @code{en}, @code{it}, @code{es},
393 @code{de}, @code{pt-BR}.
396 @node SPV Structure label Element
397 @subsection The @code{label} Element
403 Every @code{heading} and @code{container} holds a @code{label} as its
404 first child. The label text is what appears in the outline pane of
405 the GUI's viewer window. PSPP also puts it into the outline of PDF
406 output. The label text doesn't appear in the output itself.
408 The text in @code{label} describes what it labels, often by naming the
409 statistical procedure that was executed, e.g.@: ``Frequencies'' or
410 ``T-Test''. The root @code{heading} in a structure member is normally
411 ``Output''. Labels are often very generic, especially within a
412 @code{container}, e.g.@: ``Title'' or ``Warnings'' or ``Notes''.
413 Label text is localized according to the output language, e.g.@: in
414 Italian a frequency table procedure is labeled ``Frequenze''.
416 The user can edit labels to be anything they want. The corpus
417 contains a few examples of empty labels, ones that contain no text,
418 probably as a result of user editing.
420 @node SPV Structure container Element
421 @subsection The @code{container} Element
425 :visibility=(visible | hidden)
426 :page-break-before=(always)?
427 :text-align=(left | center)?
429 => label (table | container_text | graph | model | object | image | tree)
432 A @code{container} serves to contain and label a @code{table},
433 @code{text}, or other kind of item.
435 This element has the following attributes.
437 @defvr {Attribute} @code{visibility}
438 Whether the container's content is displayed. ``Notes'' tables are
439 often hidden; other data is usually
442 @defvr {Attribute} @code{text-align}
443 Alignment of text within the container. Observed with nested
444 @code{table} and @code{text} elements.
447 @defvr {Attribute} @code{width}
448 The width of the container, e.g.@: @code{1097px}.
451 @node SPV Structure text Element (Inside @code{container})
452 @subsection The @code{text} Element (Inside @code{container})
456 :type[text_type]=(title | log | text | page-title)
462 This @code{text} element is nested inside a @code{container}. There
463 is a different @code{text} element that is nested inside a
464 @code{pageParagraph}.
466 This element has the following attributes.
468 @defvr {Attribute} @code{type}
469 The semantics of the text.
472 @defvr {Attribute} @code{commandName}
473 As on the @code{heading} element. For output not specific to a
474 command, this is simply @code{log}. The corpus contains one example
475 where @code{commandName} is present but set to the empty string.
478 @defvr {Attribute} @code{creator-version}
479 As on the @code{heading} element.
482 @node SPV Structure html Element
483 @subsection The @code{html} Element
486 html :lang=(en) => TEXT
489 The element contains an HTML document as text (or, in practice, as
490 CDATA). In some cases, the document starts with @code{<html>} and
491 ends with @code{</html>}; in others the @code{html} element is
492 implied. Generally the HTML includes a @code{head} element with a CSS
493 stylesheet. The HTML body often begins with @code{<BR>}.
495 The HTML document uses only the following elements:
499 Sometimes, the document is enclosed with
500 @code{<html>}@dots{}@code{</html>}.
503 The HTML body often begins with @code{<BR>} and may contain it as well.
511 The attributes @code{face}, @code{color}, and @code{size} are
512 observed. The value of @code{color} takes one of the forms
513 @code{#@var{rr}@var{gg}@var{bb}} or @code{rgb (@var{r}, @var{g},
514 @var{b})}. The value of @code{size} is a number between 1 and 7,
518 The CSS in the corpus is simple. To understand it, a parser only
519 needs to be able to skip white space, @code{<!--}, and @code{-->}, and
520 parse style only for @code{p} elements. Only the following properties
525 In the form @code{@var{rr}@var{gg}@var{bb}}, e.g.@: @code{000000}, with
529 Either @code{bold} or @code{normal}.
532 Either @code{italic} or @code{normal}.
534 @item text-decoration
535 Either @code{underline} or @code{normal}.
538 A font name, commonly @code{Monospaced} or @code{SansSerif}.
541 Values claim to be in points, e.g.@: @code{14pt}, but the values are
542 actually in ``device-independent pixels'' (px), at 96/inch.
545 This element has the following attributes.
547 @defvr {Attribute} @code{lang}
548 This always contains @code{en} in the corpus.
551 @node SPV Structure table Element
552 @subsection The @code{table} Element
561 :displayFiltering=bool?
563 :orphanTolerance=int?
568 :type[table_type]=(table | note | warning)
569 => tableProperties? tableStructure
571 tableStructure => path? dataPath csvPath?
574 This element has the following attributes.
576 @defvr {Attribute} @code{commandName}
577 As on the @code{heading} element.
580 @defvr {Attribute} @code{type}
581 One of @code{table}, @code{note}, or @code{warning}.
584 @defvr {Attribute} @code{subType}
585 The locale-invariant command ID for the particular kind of output that
586 this table represents in the procedure. This can be the same as
587 @code{commandName} e.g.@: @code{Frequencies}, or different, e.g.@:
588 @code{Case Processing Summary}. Generic subtypes @code{Notes} and
589 @code{Warnings} are often used.
592 @defvr {Attribute} @code{tableId}
593 A number that uniquely identifies the table within the SPV file,
594 typically a large negative number such as @code{-4147135649387905023}.
597 @defvr {Attribute} @code{creator-version}
598 As on the @code{heading} element. In the corpus, this is only present
599 for version 21 and up and always includes all 8 digits.
602 @xref{SPV Detail Legacy Properties}, for details on the
603 @code{tableProperties} element.
605 @node SPV Structure graph Element
606 @subsection The @code{graph} Element
621 => dataPath? path csvPath?
624 This element represents a graph. The @code{dataPath} and @code{path}
625 elements name the Zip members that give the details of the graph.
626 Normally, both elements are present; there is only one counterexample
629 @code{csvPath} only appears in one SPV file in the corpus, for two
630 graphs. In these two cases, @code{dataPath}, @code{path}, and
631 @code{csvPath} all appear. These @code{csvPath} name Zip members with
632 names of the form @file{@var{number}_csv.bin}, where @var{number} is a
633 many-digit number and the same as the @code{csvFileIds}. The named
634 Zip members are CSV text files (despite the @file{.bin} extension).
635 The CSV files are encoded in UTF-8 and begin with a U+FEFF byte-order
638 @node SPV Structure model Element
639 @subsection The @code{model} Element
651 => ViZml? dataPath? path | pmmlContainerPath statsContainerPath
653 pmmlContainerPath => TEXT
655 statsContainerPath => TEXT
657 ViZml :viewName? => TEXT
660 This element represents a model. The @code{dataPath} and @code{path}
661 elements name the Zip members that give the details of the model.
662 Normally, both elements are present; there is only one counterexample
665 The details are unexplored. The @code{ViZml} element contains base-64
666 encoded text that decodes to a binary format with some embedded text
667 strings, and @code{path} names a Zip member that contains XML.
668 Alternatively, @code{pmmlContainerPath} and @code{statsContainerPath}
669 name Zip members with @file{.scf} extension.
671 @node SPV Structure tree Element
672 @subsection The @code{tree} Element
683 This element represents a tree. The @code{dataPath} and @code{path}
684 elements name the Zip members that give the details of the tree.
685 The details are unexplored.
687 @node SPV Structure Path Elements
688 @subsection Path Elements
698 These elements contain the names of the Zip members that hold details
699 for a container. For tables:
703 When a ``light'' format is used, only @code{dataPath} is present, and
704 it names a @file{.bin} member of the Zip file that has @code{light} in
705 its name, e.g.@: @code{0000000001437_lightTableData.bin} (@pxref{SPV
706 Light Detail Member Format}).
709 When the legacy format is used, both are present. In this case,
710 @code{dataPath} names a Zip member with a legacy binary format that
711 contains relevant data (@pxref{SPV Legacy Detail Member Binary
712 Format}), and @code{path} names a Zip member that uses an XML format
713 (@pxref{SPV Legacy Detail Member XML Format}).
716 Graphs normally follow the legacy approach described above. The
717 corpus contains one example of a graph with @code{path} but not
718 @code{dataPath}. The reason is unexplored.
720 Models use @code{path} but not @code{dataPath}. @xref{SPV Structure
721 model Element}, for more information.
723 These elements have no attributes.
725 @node SPV Structure pageSetup Element
726 @subsection The @code{pageSetup} Element
730 :initial-page-number=int?
731 :chart-size=(as-is | full-height | half-height | quarter-height | OTHER)?
732 :margin-left=dimension?
733 :margin-right=dimension?
734 :margin-top=dimension?
735 :margin-bottom=dimension?
736 :paper-height=dimension?
737 :paper-width=dimension?
738 :reference-orientation?
739 :space-after=dimension?
740 => pageHeader pageFooter
742 pageHeader => pageParagraph?
744 pageFooter => pageParagraph?
746 pageParagraph => pageParagraph_text
749 The @code{pageSetup} element has the following attributes.
751 @defvr {Attribute} @code{initial-page-number}
752 The page number to put on the first page of printed output. Usually
756 @defvr {Attribute} @code{chart-size}
757 One of the listed, self-explanatory chart sizes, e.g.@:
758 @code{quarter-height}, or a localization (!) of one of these (e.g.@:
759 @code{dimensione attuale}, @code{Wie vorgegeben}).
762 @defvr {Attribute} @code{margin-left}
763 @defvrx {Attribute} @code{margin-right}
764 @defvrx {Attribute} @code{margin-top}
765 @defvrx {Attribute} @code{margin-bottom}
766 Margin sizes, e.g.@: @code{0.25in}.
769 @defvr {Attribute} @code{paper-height}
770 @defvrx {Attribute} @code{paper-width}
774 @defvr {Attribute} @code{reference-orientation}
775 Indicates the orientation of the output page. Either @code{0deg}
776 (portrait) or @code{90deg} (landscape),
779 @defvr {Attribute} @code{space-after}
780 The amount of space between printed objects, typically @code{12pt}.
783 @node SPV Structure @code{text} Element (Inside @code{pageParagraph})
784 @subsection The @code{text} Element (Inside @code{pageParagraph})
787 text[pageParagraph_text] :type=(title | text) => TEXT
790 This @code{text} element is nested inside a @code{pageParagraph}. There
791 is a different @code{text} element that is nested inside a
794 The element is either empty, or contains CDATA that holds almost-XHTML
795 text: in the corpus, either an @code{html} or @code{p} element. It is
796 @emph{almost}-XHTML because the @code{html} element designates the
798 @indicateurl{http://xml.spss.com/spss/viewer/viewer-tree} instead of
799 an XHTML namespace, and because the CDATA can contain substitution
800 variables. The following variables are supported:
805 The current date or time in the preferred format for the locale.
811 First-, second-, third-, or fourth-level heading.
817 Name of the output file.
823 @code{&[Page]} for the page number and @code{&[PageTitle]} for the
826 Typical contents (indented for clarity):
829 <html xmlns="http://xml.spss.com/spss/viewer/viewer-tree">
832 <p style="text-align:right; margin-top: 0">Page &[Page]</p>
837 This element has the following attributes.
839 @defvr {Attribute} @code{type}
843 @node SPV Light Detail Member Format
844 @section Light Detail Member Format
846 This section describes the format of ``light'' detail @file{.bin}
847 members. These members have a binary format which we describe here in
848 terms of a context-free grammar using the following conventions:
851 @item NonTerminal @result{} @dots{}
852 Nonterminals have CamelCaps names, and @result{} indicates a
853 production. The right-hand side of a production is often broken
854 across multiple lines. Break points are chosen for aesthetics only
855 and have no semantic significance.
857 @item 00, 01, @dots{}, ff.
858 A byte with a fixed value, written as a pair of hexadecimal digits.
860 @item i0, i1, @dots{}, i9, i10, i11, @dots{}
861 @itemx ib0, ib1, @dots{}, ib9, ib10, ib11, @dots{}
862 A 32-bit integer in little-endian or big-endian byte order,
863 respectively, with a fixed value, written in decimal. Prefixed by
864 @samp{i} for little-endian or @samp{ib} for big-endian.
870 A byte with value 0 or 1.
874 A 16-bit unsigned integer in little-endian or big-endian byte order,
879 A 32-bit unsigned integer in little-endian or big-endian byte order,
884 A 64-bit unsigned integer in little-endian or big-endian byte order,
888 A 64-bit IEEE floating-point number.
891 A 32-bit IEEE floating-point number.
895 A 32-bit unsigned integer, in little-endian or big-endian byte order,
896 respectively, followed by the specified number of bytes of UTF-8
897 encoded character data.
900 @var{x} is optional, e.g.@: 00? is an optional zero byte.
902 @item @var{x}*@var{n}
903 @var{x} is repeated @var{n} times, e.g.@: byte*10 for ten arbitrary bytes.
905 @item @var{x}[@var{name}]
906 Gives @var{x} the specified @var{name}. Names are used in textual
907 explanations. They are also used, also bracketed, to indicate counts,
908 e.g.@: @code{int32[n] byte*[n]} for a 32-bit integer followed by the
909 specified number of arbitrary bytes.
911 @item @var{a} @math{|} @var{b}
912 Either @var{a} or @var{b}.
915 Parentheses are used for grouping to make precedence clear, especially
916 in the presence of @math{|}, e.g.@: in 00 (01 @math{|} 02 @math{|} 03)
920 @itemx becount(@var{x})
921 A 32-bit unsigned integer, in little-endian or big-endian byte order,
922 respectively, that indicates the number of bytes in @var{x}, followed
926 In a version 1 @file{.bin} member, @var{x}; in version 3, nothing.
927 (The @file{.bin} header indicates the version.)
930 In a version 3 @file{.bin} member, @var{x}; in version 1, nothing.
933 PSPP uses this grammar to parse light detail members. See
934 @file{src/output/spv/light-binary.grammar} in the PSPP source tree for
937 Little-endian byte order is far more common in this format, but a few
938 pieces of the format use big-endian byte order.
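
As a small illustration of these conventions, the sketch below decodes one length-prefixed UTF-8 string in either byte order.  It is not PSPP's parser; the function name @code{read_string} and its @code{big_endian} parameter are hypothetical.

```python
import struct


def read_string(data, offset=0, big_endian=False):
    """Decode a 32-bit length followed by that many bytes of UTF-8.

    Returns the decoded text and the offset of the byte just past it."""
    (length,) = struct.unpack_from(">I" if big_endian else "<I", data, offset)
    start = offset + 4
    end = start + length
    if end > len(data):
        raise ValueError("string runs past the end of the data")
    return data[start:end].decode("utf-8"), end
```

Returning the next offset lets a caller chain reads through consecutive fields.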
940 Light detail members express linear units in two ways: points (pt), at
941 72/inch, and ``device-independent pixels'' (px), at 96/inch. To
942 convert from pt to px, multiply by 1.33 and round up. To convert
943 from px to pt, divide by 1.33 and round down.
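
The two conversions can be written directly from that rule.  The factor 1.33 approximates 96/72; the function names below are hypothetical.

```python
import math


def pt_to_px(pt):
    """Points to device-independent pixels: multiply by 1.33, round up."""
    return math.ceil(pt * 1.33)


def px_to_pt(px):
    """Device-independent pixels to points: divide by 1.33, round down."""
    return math.floor(px / 1.33)
```

Rounding in opposite directions lets common sizes survive a round trip, e.g.@: 9 pt becomes 12 px and 12 px becomes 9 pt again.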
945 A ``light'' detail member @file{.bin} consists of a number of sections
946 concatenated together, terminated by an optional byte 01:
950 Header Titles Footnotes
951 Areas Borders PrintSettings TableSettings Formats
952 Dimensions Axes Cells
956 The following sections go into more detail.
959 * SPV Light Member Header::
960 * SPV Light Member Titles::
961 * SPV Light Member Footnotes::
962 * SPV Light Member Areas::
963 * SPV Light Member Borders::
964 * SPV Light Member Print Settings::
965 * SPV Light Member Table Settings::
966 * SPV Light Member Formats::
967 * SPV Light Member Dimensions::
968 * SPV Light Member Categories::
969 * SPV Light Member Axes::
970 * SPV Light Member Cells::
971 * SPV Light Member Value::
972 * SPV Light Member ValueMod::
975 @node SPV Light Member Header
978 An SPV light member begins with a 39-byte header:
983 (i1 @math{|} i3)[version]
986 bool[rotate-inner-column-labels]
987 bool[rotate-outer-row-labels]
990 int32[min-col-width] int32[max-col-width]
991 int32[min-row-width] int32[max-row-width]
995 @code{version} is a version number that affects the interpretation of
996 some of the other data in the member. We will refer to ``version 1''
997 and ``version 3'' later on and use v1(@dots{}) and v3(@dots{}) for
998 version-specific formatting (as described previously).
1000 If @code{rotate-inner-column-labels} is 1, then column labels closest
1001 to the data are rotated to be vertical; otherwise, they are shown
1004 If @code{rotate-outer-row-labels} is 1, then row labels farthest from
1005 the data are rotated to be vertical; otherwise, they are shown in the
1008 @code{table-id} is a binary version of the @code{tableId} attribute in
1009 the structure member that refers to the detail member. For example,
1010 if @code{tableId} is @code{-4122591256483201023}, then @code{table-id}
1011 would be 0xc6c99d183b300001.
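
The correspondence is ordinary 64-bit two's complement, as the sketch below shows.  The function names are hypothetical, and reading the field as little-endian is an assumption based on that byte order being the common one in this format.

```python
def table_id_to_hex(table_id):
    """Show a signed tableId as its 64-bit two's-complement bit pattern."""
    return "0x%016x" % (table_id & 0xFFFFFFFFFFFFFFFF)


def table_id_from_bytes(raw):
    """Decode an 8-byte field as a signed 64-bit integer.

    Little-endian byte order is assumed here."""
    return int.from_bytes(raw, "little", signed=True)
```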
1013 @code{min-col-width} is the minimum width that a column will be
1014 assigned automatically. @code{max-col-width} is the maximum width
1015 that a column will be assigned to accommodate a long column label.
1016 @code{min-row-width} and @code{max-row-width} are a similar range for
1017 the width of row labels. All of these measurements are in 1/96 inch
1018 units (called a ``device independent pixel'' unit in Windows).
1020 The meaning of the other variable parts of the header is not known. A
1021 writer may safely use version 3, true for @code{x0}, false for
1022 @code{x1}, true for @code{x2}, and 0x15 for @code{x3}.
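
The following is a sketch of reading the header with Python's @code{struct} module.  The field order, the leading bytes 01 00, and the little-endian byte order are all assumptions of this sketch rather than statements taken from the text above; they were chosen so that the fields named in this section add up to 39 bytes.  The names @code{HEADER}, @code{Header}, and @code{parse_header} are hypothetical.

```python
import struct
from collections import namedtuple

# ASSUMED layout (39 bytes, little-endian): bytes 01 00, int32 version,
# five one-byte booleans, int32 x3, four int32 widths, int64 table-id.
HEADER = struct.Struct("<2sI5BI4iq")

Header = namedtuple("Header", [
    "version", "x0", "x1",
    "rotate_inner_column_labels", "rotate_outer_row_labels",
    "x2", "x3",
    "min_col_width", "max_col_width",
    "min_row_width", "max_row_width",
    "table_id",
])


def parse_header(data):
    """Unpack the fixed-size header at the start of a light detail member."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    fields = HEADER.unpack_from(data)
    magic, version = fields[0], fields[1]
    if magic != b"\x01\x00" or version not in (1, 3):
        raise ValueError("not a light detail member header")
    return Header(version, *fields[2:])


# A header using the values suggested above for a writer; the widths and
# table-id are arbitrary demonstration values.
sample = HEADER.pack(b"\x01\x00", 3, 1, 0, 0, 0, 1, 0x15,
                     36, 72, 36, 120, -4122591256483201023)
header = parse_header(sample)
```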
1024 @node SPV Light Member Titles
1030 Value[subtype] 01? 31
1031 Value[user-title] 01?
1032 (31 Value[corner-text] @math{|} 58)
1033 (31 Value[caption] @math{|} 58)
1036 The Titles follow the Header and specify the table's title, caption,
1039 The @code{user-title} is shown above the title and reflects any user
1040 editing of the title text or style. The @code{title} is the title
1041 originally generated by the procedure. Both of these are appropriate
1042 for presentation and localized to the user's language. For example,
1043 for a frequency table, @code{title} and @code{user-title} normally
1044 name the variable and @code{subtype} is simply ``Frequencies''.
1046 @code{subtype} is the same as the @code{subType} attribute in the
1047 @code{table} structure XML element that referred to this member.
1048 @xref{SPV Structure table Element}, for details.
1050 The @code{corner-text}, if present, is shown in the upper-left corner
1051 of the table, above the row headings and to the left of the column
1052 headings. It is usually absent. Corner text prevents row dimension
1053 labels from being displayed above the dimension's group and category
1054 labels (see @code{show-row-labels-in-corner}).
1056 The @code{caption}, if present, is shown below the table.
1057 @code{caption} reflects user editing of the caption.
1059 @node SPV Light Member Footnotes
1060 @subsection Footnotes
1063 Footnotes => int32[n-footnotes] Footnote*[n-footnotes]
1064 Footnote => Value[text] (58 @math{|} 31 Value[marker]) int32[show]
1067 Each footnote has @code{text} and an optional custom @code{marker}
1070 The syntax for Value would allow footnotes (and their markers) to
1071 reference other footnotes, but in practice this doesn't work.
1073 @code{show} is a 32-bit signed integer. It is positive to show the
1074 footnote or negative to hide it. Its magnitude is often 1, and in
1075 other cases tends to be the number of references to the footnote.
1077 @node SPV Light Member Areas
1084 string[typeface] float[size] int32[style] bool[underline]
1085 int32[halign] int32[valign]
1086 string[fg-color] string[bg-color]
1087 bool[alternate] string[alt-fg-color] string[alt-bg-color]
1088 v3(int32[left-margin] int32[right-margin] int32[top-margin] int32[bottom-margin])
1091 Each Area represents the style for a different area of the table, in
1092 the following order: title, caption, footer, corner, column labels,
1093 row labels, data, and layers.
1095 @code{index} is the 1-based index of the Area, i.e. 1 for the first
1096 Area, through 8 for the final Area.
1098 @code{typeface} is the string name of the font used in the area. In
1099 the corpus, this is @code{SansSerif} in over 99% of instances and
1100 @code{Times New Roman} in the rest.
@code{size} is the size of the font, in px (@pxref{SPV Light Detail
Member Format}).  The most common size in the corpus is 12 px.  Even
1104 though @code{size} has a floating-point type, in the corpus its values
1105 are always integers.
1107 @code{style} is a bit mask. Bit 0 (with value 1) is set for bold, bit
1108 1 (with value 2) is set for italic.
1110 @code{underline} is 1 if the font is underlined, 0 otherwise.
1112 @code{halign} specifies horizontal alignment: 0 for center, 2 for
1113 left, 4 for right, 61453 for decimal, 64173 for mixed. Mixed
1114 alignment varies according to type: string data is left-justified,
1115 numbers and most other formats are right-justified.
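
The @code{style} bit mask and the @code{halign} codes described above
could be decoded along the following lines.  This is only a sketch in
Python: the helper names are hypothetical, and codes not listed above
are reported as @code{None} rather than guessed at.

```python
# Alignment codes as listed above for Areas.
HALIGN_NAMES = {
    0: "center",
    2: "left",
    4: "right",
    61453: "decimal",
    64173: "mixed",
}


def decode_font_style(style):
    """Split the style bit mask into (bold, italic) booleans."""
    return bool(style & 1), bool(style & 2)


def halign_name(halign):
    """Map a halign code to a name, or None for an unlisted code."""
    return HALIGN_NAMES.get(halign)
```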
1117 @code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
1120 @code{fg-color} and @code{bg-color} are the foreground color and
1121 background color, respectively. In the corpus, these are always
1122 @code{#000000} and @code{#ffffff}, respectively.
1124 @code{alternate} is 1 if rows should alternate colors, 0 if all rows
1125 should be the same color. When @code{alternate} is 1,
1126 @code{alt-fg-color} and @code{alt-bg-color} specify the colors for the
1127 alternate rows; otherwise they are empty strings.
1129 @code{left-margin}, @code{right-margin}, @code{top-margin}, and
1130 @code{bottom-margin} are measured in px.
1132 @node SPV Light Member Borders
1139 be32[n-borders] Border*[n-borders]
1140 bool[show-grid-lines]
1149 The Borders reflect how borders between regions are drawn.
1151 The fixed value of @code{endian} can be used to validate the
1154 @code{show-grid-lines} is 1 to draw grid lines, otherwise 0.
Each Border describes one kind of border.  @code{n-borders} seems to
always be 19.  Each @code{border-type} appears once (although in an
unpredictable order) and corresponds to one of the following borders:
1164 Left, top, right, and bottom outer frame.
1166 Left, top, right, and bottom inner frame.
1168 Left and top of data area.
1170 Horizontal and vertical dimension rows.
1172 Horizontal and vertical dimension columns.
1174 Horizontal and vertical category rows.
1176 Horizontal and vertical category columns.
1179 @code{stroke-type} describes how a border is drawn, as one of:
1196 @code{color} is an RGB color. Bits 24--31 are alpha, bits 16--23 are
1197 red, 8--15 are green, 0--7 are blue. An alpha of 255 indicates an
1198 opaque color, therefore opaque black is 0xff000000.
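
The bit layout of @code{color} can be illustrated with a short Python
sketch.  The function name is an assumption made for this example; the
shifts follow the bit ranges given above.

```python
def split_color(color):
    """Split a 32-bit color into (alpha, red, green, blue) bytes."""
    alpha = (color >> 24) & 0xFF
    red = (color >> 16) & 0xFF
    green = (color >> 8) & 0xFF
    blue = color & 0xFF
    return alpha, red, green, blue


print(split_color(0xFF000000))  # (255, 0, 0, 0): opaque black
```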
1200 @node SPV Light Member Print Settings
1201 @subsection Print Settings
1208 bool[paginate-layers]
1211 bool[top-continuation]
1212 bool[bottom-continuation]
1213 be32[n-orphan-lines]
1214 bestring[continuation-string])
1217 The PrintSettings reflect settings for printing. The fixed value of
1218 @code{endian} can be used to validate the endianness.
1220 @code{all-layers} is 1 to print all layers, 0 to print only the
@code{paginate-layers} is 1 to print each layer at the start of a new
page, 0 otherwise.  (This setting is honored only if @code{all-layers}
is 1, since otherwise only one layer is printed.)
1227 @code{fit-width} and @code{fit-length} control whether the table is
1228 shrunk to fit within a page's width or length, respectively.
1230 @code{n-orphan-lines} is the minimum number of rows or columns to put
1231 in one part of a table that is broken across pages.
1233 If @code{top-continuation} is 1, then @code{continuation-string} is
1234 printed at the top of a page when a table is broken across pages for
1235 printing; similarly for @code{bottom-continuation} and the bottom of a
1236 page. Usually, @code{continuation-string} is empty.
1238 @node SPV Light Member Table Settings
1239 @subsection Table Settings
1249 bool[show-row-labels-in-corner]
1250 bool[show-alphabetic-markers]
1251 bool[footnote-marker-superscripts]
1254 Breakpoints[row-breaks] Breakpoints[column-breaks]
1255 Keeps[row-keeps] Keeps[column-keeps]
1256 PointKeeps[row-point-keeps] PointKeeps[column-point-keeps]
1259 bestring[table-look]
1262 Breakpoints => be32[n-breaks] be32*[n-breaks]
1264 Keeps => be32[n-keeps] Keep*[n-keeps]
1265 Keep => be32[offset] be32[n]
1267 PointKeeps => be32[n-point-keeps] PointKeep*[n-point-keeps]
1268 PointKeep => be32[offset] be32 be32
1271 The TableSettings reflect display settings. The fixed value of
1272 @code{endian} can be used to validate the endianness.
1274 @code{current-layer} is the displayed layer. The interpretation when
1275 there is more than one layer dimension is not yet known.
1277 If @code{omit-empty} is 1, empty rows or columns (ones with nothing in
1278 any cell) are hidden; otherwise, they are shown.
1280 If @code{show-row-labels-in-corner} is 1, then row labels are shown in
1281 the upper left corner; otherwise, they are shown nested.
1283 If @code{show-alphabetic-markers} is 1, markers are shown as letters
1284 (e.g.@: @samp{a}, @samp{b}, @samp{c}, @dots{}); otherwise, they are
1285 shown as numbers starting from 1.
1287 When @code{footnote-marker-superscripts} is 1, footnote markers are shown
1288 as superscripts, otherwise as subscripts.
1290 The Breakpoints are rows or columns after which there is a page break;
1291 for example, a row break of 1 requests a page break after the second
1292 row. Usually no breakpoints are specified, indicating that page
1293 breaks should be selected automatically.
1295 The Keeps are ranges of rows or columns to be kept together without a
1296 page break; for example, a row Keep with @code{offset} 1 and @code{n}
1297 10 requests that the 10 rows starting with the second row be kept
1298 together. Usually no Keeps are specified.
The PointKeeps seem to be generated automatically based on
user-specified Keeps.  They seem to indicate a conversion from rows
or columns to pixel or point offsets.
1304 @code{notes} is a text string that contains user-specified notes. It
1305 is displayed when the user hovers the cursor over the table, like text
1306 in the @code{title} attribute in HTML. It is not printed. It is
@code{table-look} is the name of an SPSS ``TableLook'' table style,
such as ``Default'' or ``Academic''; it is often empty.
1312 TableSettings ends with an arbitrary number of null bytes. A writer
1313 may safely write 82 null bytes.
1315 A writer may safely use 4 for @code{x5} and 0 for @code{x6}.
1317 @node SPV Light Member Formats
1322 int32[n-widths] int32*[n-widths]
1324 int32[current-layer]
1330 v3(count(X1 count(X2)) count(X3)))
1331 Y0 => int32[epoch] byte[decimal] byte[grouping]
1332 CustomCurrency => int32[n-ccs] string*[n-ccs]
1335 If @code{n-widths} is nonzero, then the accompanying integers are
1336 column widths as manually adjusted by the user.
1338 @code{locale} is a locale including an encoding, such as
1339 @code{en_US.windows-1252} or @code{it_IT.windows-1252}. The encoding
1340 string (like other strings in the member) is encoded in UTF-8.
1342 @code{epoch} is the year that starts the epoch. A 2-digit year is
1343 interpreted as belonging to the 100 years beginning at the epoch. The
1344 default epoch year is 69 years prior to the current year; thus, in
1345 2017 this field by default contains 1948. In the corpus, @code{epoch}
1346 ranges from 1943 to 1948, plus some contain -1.
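
The epoch rule can be sketched in Python as follows.  Both function
names are hypothetical.  The sketch does not interpret an @code{epoch}
of -1, whose meaning is not documented here; a caller would need to
substitute some default first.

```python
def default_epoch(current_year):
    """Default epoch: 69 years before the current year."""
    return current_year - 69


def expand_two_digit_year(yy, epoch):
    """Place a 2-digit year in the 100 years starting at epoch."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year")
    # Start in the epoch's century, then move forward one century if
    # that would fall before the epoch.
    year = epoch - epoch % 100 + yy
    if year < epoch:
        year += 100
    return year
```

With an epoch of 1948, the 2-digit years 48 through 99 map to 1948
through 1999, and 00 through 47 map to 2000 through 2047.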
1348 @code{decimal} is the decimal point character. The observed values
1349 are @samp{.} and @samp{,}.
1351 @code{grouping} is the grouping character. Usually, it is @samp{,} if
1352 @code{decimal} is @samp{.}, and vice versa. Other observed values are
1353 @samp{'} (apostrophe), @samp{ } (space), and zero (presumably
1354 indicating that digits should not be grouped).
1356 @code{n-ccs} is observed as either 0 or 5. When it is 5, the
1357 following strings are CCA through CCE format strings. @xref{Custom
1358 Currency Formats,,, pspp, PSPP}. Most commonly these are all
1359 @code{-,,,} but other strings occur.
1363 X0 only appears, optionally, in version 1 members.
1368 string[command] string[command-local]
1369 string[language] string[charset] string[locale]
1372 Y2 => CustomCurrency byte[missing] bool[x17]
1375 @code{command} describes the statistical procedure that generated the
1376 output, in English. It is not necessarily the literal syntax name of
1377 the procedure: for example, NPAR TESTS becomes ``Nonparametric
1378 Tests.'' @code{command-local} is the procedure's name, translated
1379 into the output language; it is often empty and, when it is not,
1380 sometimes the same as @code{command}.
1382 @code{dataset} is the name of the dataset analyzed to produce the
1383 output, e.g.@: @code{DataSet1}, and @code{datafile} the name of the
1384 file it was read from, e.g.@: @file{C:\Users\foo\bar.sav}. The latter
1385 is sometimes the empty string.
1387 @code{missing} is the character used to indicate that a cell contains
1388 a missing value. It is always observed as @samp{.}.
1390 X0 repeats @code{decimal}, @code{grouping}, CustomCurrency, and
1391 @code{missing} already included in Formats.
1393 A writer may safely use false for @code{x17}.
1397 X1 only appears in version 3 members.
1405 byte[show-variables]
1407 int32[x18] int32[x19]
1413 @code{lang} may indicate the language in use. Some values seem to be
1414 0: @t{en}, 1: @t{de}, 2: @t{es}, 3: @t{it}, 5: @t{ko}, 6: @t{pl}, 8:
1415 @t{zh-tw}, 10: @t{pt_BR}, 11: @t{fr}. The @code{locale} in Formats
1416 and the @code{language}, @code{charset}, and @code{locale} in X0 are
1417 more likely to be useful in practice.
1419 @code{show-variables} determines how variables are displayed by
1420 default. A value of 1 means to display variable names, 2 to display
1421 variable labels when available, 3 to display both (name followed by
1422 label, separated by a space). The most common value is 0, which
1423 probably means to use a global default.
1425 @code{show-values} is a similar setting for values. A value of 1
1426 means to display the value, 2 to display the value label when
1427 available, 3 to display both. Again, the most common value is 0,
1428 which probably means to use a global default.
@code{show-title} is 1 to show the title, 10 to hide it.
1432 @code{show-caption} is true to show the caption, false to hide it.
1434 A writer may safely use false for @code{x14}, false
1435 for @code{x16}, -1 for @code{x18} and @code{x19}, and false for
1440 X2 only appears in version 3 members.
1444 int32[n-row-heights] int32*[n-row-heights]
1445 int32[n-style-map] StyleMap*[n-style-map]
1446 int32[n-styles] StylePair*[n-styles]
1448 StyleMap => int64[cell-index] int16[style-index]
1451 If present, @code{n-row-heights} and the accompanying integers are row
1452 heights as manually adjusted by the user.
1454 The rest of X2 specifies styles for data cells. At first glance this
1455 is odd, because each data cell can have its own style embedded as part
1456 of the data, but in practice X2 specifies a style for a cell only if
1457 that cell is empty (and thus does not appear in the data at all).
Each StyleMap specifies the index of a blank cell, calculated the same
way as in the Cells (@pxref{SPV Light Member Cells}), along with a
0-based index into the accompanying StylePair array.
1462 A writer may safely omit the optional @code{i0 i0} inside the
1463 @code{count(@dots{})}.
1467 X3 only appears in version 3 members.
1471 01 00 byte[x21] 00 00 00
1474 (string[dataset] string[datafile] i0 int32[date] i0)?
1479 @code{date} is a date, as seconds since the epoch, i.e.@: since
1480 January 1, 1970. Pivot tables within an SPV file often have dates a
1481 few minutes apart, so this is probably a creation date for the table
1482 rather than for the file.
1484 X3 repeats @code{decimal}, @code{grouping}, CustomCurrency, and
1485 @code{missing} already included in Formats. @code{command},
1486 @code{command-local}, @code{language}, @code{charset}, and
1487 @code{locale} have the same meaning as in X0.
@code{small} is a small real number.  In the corpus, it overwhelmingly
takes the value 0.0001, with zero occasionally seen.  Nonzero numbers
with format 40 (@pxref{SPV Light Member Value}) whose magnitudes are
smaller than @code{small} are displayed in scientific notation.  (Thus,
a @code{small} of zero prevents scientific notation from being chosen.)
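
The rule for format 40 can be written as a one-line predicate.  This
Python sketch uses a hypothetical function name and covers only the
decision of whether scientific notation is chosen, not the formatting
itself.

```python
def uses_scientific_notation(x, small):
    """Format-40 rule: nonzero and magnitude below small."""
    return x != 0 and abs(x) < small
```

Because no magnitude is less than zero, a @code{small} of 0 makes the
predicate false for every value.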
1495 Sometimes @code{dataset}, @code{datafile}, and @code{date} are present
1496 and other times they are absent. The reader can distinguish by
1497 assuming that they are present and then checking whether the
1498 presumptive @code{dataset} contains a null byte (a valid string never
1501 @code{x22} is usually 0 or 2000000.
1503 A writer may safely use 4 for @code{x21} and omit @code{x22} and the
1504 other optional bytes at the end.
1506 @node SPV Light Member Dimensions
1507 @subsection Dimensions
1509 A pivot table presents multidimensional data. A Dimension identifies
1510 the categories associated with each dimension.
1513 Dimensions => int32[n-dims] Dimension*[n-dims]
1515 Value[name] DimProperties
1516 int32[n-categories] Category*[n-categories]
1521 bool[hide-dim-label]
1522 bool[hide-all-labels]
1526 @code{name} is the name of the dimension, e.g.@: @code{Variables},
1527 @code{Statistics}, or a variable name.
1529 The meanings of @code{x1} and @code{x3} are unknown. @code{x1} is
1530 usually 0 but many other values have been observed. A writer may
1531 safely use 0 for @code{x1} and 2 for @code{x3}.
1533 @code{x2} is 0, 1, or 2. For a pivot table with @var{L} layer
1534 dimensions, @var{R} row dimensions, and @var{C} column dimensions,
1535 @code{x2} is 2 for the first @var{L} dimensions, 0 for the next
1536 @var{R} dimensions, and 1 for the remaining @var{C} dimensions. This
1537 does not mean that the layer dimensions must be presented first,
1538 followed by the row dimensions, followed by the column dimensions---on
1539 the contrary, they are frequently in a different order---but @code{x2}
1540 must follow this pattern to prevent the pivot table from being
1543 If @code{hide-dim-label} is 00, the pivot table displays a label for
1544 the dimension itself. Because usually the group and category labels
1545 are enough explanation, it is usually 01.
If @code{hide-all-labels} is 01, the pivot table omits all labels for
the dimension, including group and category labels.  It is usually 00.
When @code{hide-all-labels} is 01, @code{hide-dim-label} is ignored.
1551 @code{dim-index} is usually the 0-based index of the dimension, e.g.@:
1552 0 for the first dimension, 1 for the second, and so on. Sometimes it
1553 is -1. There is no visible difference.
1555 @node SPV Light Member Categories
1556 @subsection Categories
1558 Categories are arranged in a tree. Only the leaf nodes in the tree
1559 are really categories; the others just serve as grouping constructs.
1562 Category => Value[name] (Leaf @math{|} Group)
1563 Leaf => 00 00 00 i2 int32[leaf-index] i0
1565 bool[merge] 00 01 int32[x23]
1566 i-1 int32[n-subcategories] Category*[n-subcategories]
1569 @code{name} is the name of the category (or group).
1571 A Leaf represents a leaf category. The Leaf's @code{leaf-index} is a
1572 nonnegative integer unique within the Dimension and less than
1573 @code{n-categories} in the Dimension. If the user does not sort or
1574 rearrange the categories, then @code{leaf-index} starts at 0 for the
1575 first Leaf in the dimension and increments by 1 with each successive
Leaf.  If the user does sort or rearrange the categories, then the
1577 order of categories in the file reflects that change and
1578 @code{leaf-index} reflects the original order.
1580 Occasionally a dimension has no leaf categories at all. A table that
1581 contains such a dimension necessarily has no data at all.
A Group is a group of nested categories.  Usually a Group contains at
least one Category, so that @code{n-subcategories} is positive, but a
few Groups with @code{n-subcategories} 0 have been observed.
1587 If a Group's @code{merge} is 00, the most common value, then the group
1588 is really a distinct group that should be represented as such in the
1589 visual representation and user interface. If @code{merge} is 01, the
1590 categories in this group should be shown and treated as if they were
1591 direct children of the group's containing group (or if it has no
1592 parent group, then direct children of the dimension), and this group's
1593 name is irrelevant and should not be displayed. (Merged groups can be
1596 (For writing an SPV file, there is no need to use the @code{merge}
1597 feature unless it is convenient.)
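
A reader might handle @code{merge} by splicing merged groups into
their parents after parsing.  The Python sketch below assumes a
hypothetical in-memory @code{Category} tuple (it is not a structure
defined by the file format) in which a Leaf has @code{children} set to
@code{None}.

```python
from collections import namedtuple

# Hypothetical parsed form: a Leaf has children None; a Group has a
# list of child categories and a merge flag.
Category = namedtuple("Category", "name merge children")


def visible_children(categories):
    """Return categories with merged groups replaced by their children."""
    result = []
    for cat in categories:
        if cat.children is None:
            result.append(cat)  # Leaf: keep as is.
        elif cat.merge:
            # Merged group: its children become children of the parent,
            # and the group's own name is dropped.
            result.extend(visible_children(cat.children))
        else:
            kept = visible_children(cat.children)
            result.append(Category(cat.name, False, kept))
    return result
```

Calling @code{visible_children} on a dimension's top-level categories
yields the tree as it should be shown to the user.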
1599 A Group's @code{x23} appears to be i2 when all of the categories
1600 within a group are leaf categories that directly represent data values
1601 for a variable (e.g.@: in a frequency table or crosstabulation, a group
1602 of values in a variable being tabulated) and i0 otherwise. A writer
1603 may safely write a constant 0 in this field.
1605 @node SPV Light Member Axes
After the dimensions comes the assignment of each dimension to one of
the axes: layers, rows, and columns.
1613 int32[n-layers] int32[n-rows] int32[n-columns]
1614 int32*[n-layers] int32*[n-rows] int32*[n-columns]
The values of @code{n-layers}, @code{n-rows}, and @code{n-columns}
specify the number of dimensions displayed in layers, rows, and
columns, respectively.  Any of them may be zero.  Their values sum to
1620 @code{n-dimensions} from Dimensions (@pxref{SPV Light Member
1623 The following @code{n-dimensions} integers, in three groups, are a
1624 permutation of the 0-based dimension numbers. The first
1625 @code{n-layers} integers specify each of the dimensions represented by
1626 layers, the next @code{n-rows} integers specify the dimensions
1627 represented by rows, and the final @code{n-columns} integers specify
1628 the dimensions represented by columns. When there is more than one
1629 dimension of a given kind, the inner dimensions are given first.
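
Splitting the permutation into its three groups might look like the
following Python sketch.  The function name and the validation step
are assumptions made for illustration; @code{dim_numbers} stands for
the integers that follow the three counts.

```python
def split_axes(n_layers, n_rows, n_columns, dim_numbers):
    """Split the permutation into (layers, rows, columns) lists."""
    n_dims = n_layers + n_rows + n_columns
    if sorted(dim_numbers) != list(range(n_dims)):
        raise ValueError("not a permutation of the dimension numbers")
    layers = dim_numbers[:n_layers]
    rows = dim_numbers[n_layers:n_layers + n_rows]
    columns = dim_numbers[n_layers + n_rows:]
    return layers, rows, columns
```

Within each returned list, the inner dimensions come first, as
described above.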
1631 @node SPV Light Member Cells
1634 The final part of an SPV light member contains the actual data.
1637 Cells => int32[n-cells] Cell*[n-cells]
1638 Cell => int64[index] v1(00?) Value
1641 A Cell consists of an @code{index} and a Value. Suppose there are
@math{d} dimensions, numbered 1 through @math{d} in the order given in
the Dimensions previously, and that dimension @math{i} has @math{n_i}
1644 categories. Consider the cell at coordinates @math{x_i}, @math{1 \le
1645 i \le d}, and note that @math{0 \le x_i < n_i}. Then the index is
1646 calculated by the following algorithm:
1650 for each @math{i} from 1 to @math{d}:
1651 @i{index} = (@math{n_i \times} @i{index}) @math{+} @math{x_i}
1654 For example, suppose there are 3 dimensions with 3, 4, and 5
1655 categories, respectively. The cell at coordinates (1, 2, 3) has
1656 index @math{5 \times (4 \times (3 \times 0 + 1) + 2) + 3 = 33}.
Within a given dimension, the coordinate @math{x_i} is the
@code{leaf-index} of the corresponding Leaf.
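
The index calculation, together with its inverse (which a reader needs
in order to place each Cell), can be sketched in Python as follows.
The function names are illustrative only; @code{sizes} holds the
category counts @math{n_i} in the order of the Dimensions.

```python
def cell_index(coords, sizes):
    """Combine per-dimension coordinates into a Cell index."""
    index = 0
    for x, n in zip(coords, sizes):
        if not 0 <= x < n:
            raise ValueError("coordinate out of range")
        index = n * index + x
    return index


def cell_coords(index, sizes):
    """Invert cell_index, recovering the per-dimension coordinates."""
    coords = []
    for n in reversed(sizes):
        index, x = divmod(index, n)
        coords.append(x)
    return tuple(reversed(coords))


print(cell_index((1, 2, 3), (3, 4, 5)))  # 33
```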
1659 @node SPV Light Member Value
1662 Value is used throughout the SPV light member format. It boils down
1663 to a number or a string.
1666 Value => 00? 00? 00? 00? RawValue
1668 01 ValueMod int32[format] double[x]
1669 @math{|} 02 ValueMod int32[format] double[x]
1670 string[var-name] string[value-label] byte[show]
1671 @math{|} 03 string[local] ValueMod string[id] string[c] bool[fixed]
1672 @math{|} 04 ValueMod int32[format] string[value-label] string[var-name]
1673 byte[show] string[s]
1674 @math{|} 05 ValueMod string[var-name] string[var-label] byte[show]
1675 @math{|} ValueMod string[template] int32[n-args] Argument*[n-args]
1678 @math{|} int32[x] i0 Value*[x] /* x > 0 */
1681 There are several possible encodings, which one can distinguish by the
1682 first nonzero byte in the encoding.
1686 The numeric value @code{x}, intended to be presented to the user
1687 formatted according to @code{format}, which is about the same as the
1688 format described for system files (@pxref{System File Output
1689 Formats}). The exception is that format 40 is not MTIME but instead
1690 approximately a synonym for F format with a different rule for whether
1691 a value is shown in scientific notation: a value in format 40 is shown
1692 in scientific notation if and only if it is nonzero and its magnitude
1693 is less than @code{small} (@pxref{SPV Light Member Formats}).
1695 Most commonly, @code{format} has width 40 (the maximum).
1697 An @code{x} with the maximum negative double value @code{-DBL_MAX}
1698 represents the system-missing value SYSMIS. (HIGHEST and LOWEST have
1699 not been observed.) @xref{System File Format}, for more about these
1703 Similar to @code{01}, with the additional information that @code{x} is
1704 a value of variable @code{var-name} and has value label
1705 @code{value-label}. Both @code{var-name} and @code{value-label} can
1706 be the empty string, the latter very commonly.
1708 @code{show} determines whether to show the numeric value or the value
1709 label. A value of 1 means to show the value, 2 to show the label, 3
1710 to show both, and 0 means to use the default specified in
1711 @code{show-values} (@pxref{SPV Light Member Formats}).
1714 A text string, in two forms: @code{c} is in English, and sometimes
1715 abbreviated or obscure, and @code{local} is localized to the user's
1716 locale. In an English-language locale, the two strings are often the
1717 same, and in the cases where they differ, @code{local} is more
1718 appropriate for a user interface, e.g.@: @code{c} of ``Not a PxP table
1719 for MCN...'' versus @code{local} of ``Computed only for a PxP table,
1720 where P must be greater than 1.''
1722 @code{c} and @code{local} are always either both empty or both
1725 @code{id} is a brief identifying string whose form seems to resemble a
1726 programming language identifier, e.g.@: @code{cumulative_percent} or
1727 @code{factor_14}. It is not unique.
@code{fixed} is 00 for text taken from user input, such as syntax
fragments, expressions, file names, and data set names, and 01 for fixed
1731 text strings such as names of procedures or statistics. In the former
1732 case, @code{id} is always the empty string; in the latter case,
1733 @code{id} is still sometimes empty.
1736 The string value @code{s}, intended to be presented to the user
1737 formatted according to @code{format}. The format for a string is not
1738 too interesting, and the corpus contains many clearly invalid formats
1739 like A16.39 or A255.127 or A134.1, so readers should probably ignore
1740 the format entirely.
1742 @code{s} is a value of variable @code{var-name} and has value label
1743 @code{value-label}. @code{var-name} is never empty but
1744 @code{value-label} is commonly empty.
1746 @code{show} has the same meaning as in the encoding for 02.
1749 Variable @code{var-name}, which is rarely observed as empty in the
1750 corpus, with variable label @code{var-label}, which is often empty.
1752 @code{show} determines whether to show the variable name or the
1753 variable label. A value of 1 means to show the name, 2 to show the
1754 label, 3 to show both, and 0 means to use the default specified in
1755 @code{show-variables} (@pxref{SPV Light Member Formats}).
1758 When the first byte of a RawValue is not one of the above, the
1759 RawValue starts with a ValueMod, whose syntax is described in the next
1760 section. (A ValueMod always begins with byte 31 or 58.)
1762 This case is a template string, analogous to @code{printf}, followed
1763 by one or more Arguments, each of which has one or more values. The
1764 template string is copied directly into the output except for the
1765 following special syntax,
1772 Each of these expands to the character following @samp{\\}, to escape
1773 characters that have special meaning in template strings. These are
1774 effective inside and outside the @code{[@dots{}]} syntax forms
1778 Expands to a new-line, inside or outside the @code{[@dots{}]} forms
1782 Expands to a formatted version of argument @var{i}, which must have
1783 only a single value. For example, @code{^1} expands to the first
1784 argument's @code{value}.
1786 @item [:@var{a}:]@var{i}
1787 Expands @var{a} for each of the values in @var{i}. @var{a}
1788 should contain one or more @code{^@var{j}} conversions, which are
1789 drawn from the values for argument @var{i} in order. Some examples
1794 All of the values for the first argument, concatenated.
1797 Expands to the values for the first argument, each followed by
1801 Expands to @code{@var{x} = @var{y}} where @var{x} is the second
1802 argument's first value and @var{y} is its second value. (This would
1803 be used only if the argument has two values. If there were more
1804 values, the second and third values would be directly concatenated,
1805 which would look funny.)
1808 @item [@var{a}:@var{b}:]@var{i}
1809 This extends the previous form so that the first values are expanded
1810 using @var{a} and later values are expanded using @var{b}. For an
1811 unknown reason, within @var{a} the @code{^@var{j}} conversions are
1812 instead written as @code{%@var{j}}. Some examples from the corpus:
1816 Expands to all of the values for the first argument, separated by
1819 @item [%1 = %2:, ^1 = ^2:]1
1820 Given appropriate values for the first argument, expands to @code{X =
1824 Given appropriate values, expands to @code{1, 2, 3}.
1828 The template string is localized to the user's locale.
1831 A writer may safely omit all of the optional 00 bytes at the beginning
1832 of a Value, except that it should write a single 00 byte before a
1835 @node SPV Light Member ValueMod
1836 @subsection ValueMod
1838 A ValueMod can specify special modifications to a Value.
1844 int32[n-refs] int16*[n-refs]
1845 int32[n-subscripts] string*[n-subscripts]
1846 v1(00 (i1 | i2) 00? 00? int32 00? 00?)
1847 v3(count(TemplateString StylePair))
1849 TemplateString => count((count((i0 (58 @math{|} 31 55))?) (58 @math{|} 31 string[id]))?)
1856 bool[bold] bool[italic] bool[underline] bool[show]
1857 string[fg-color] string[bg-color]
1858 string[typeface] byte[size]
1861 int32[halign] int32[valign] double[decimal-offset]
1862 int16[left-margin] int16[right-margin]
1863 int16[top-margin] int16[bottom-margin]
1866 A ValueMod that begins with ``31'' specifies special modifications to
1869 Each of the @code{n-refs} integers is a reference to a Footnote
1870 (@pxref{SPV Light Member Footnotes}) by 0-based index. Footnote
1871 markers are shown appended to the main text of the Value, as
1874 The @code{subscripts}, if present, are strings to append to the main
1875 text of the Value, as subscripts. Each subscript text is a brief
1876 indicator, e.g.@: @samp{a} or @samp{b}, with its meaning indicated by
1877 the table caption. When multiple subscripts are present, they are
1878 displayed separated by commas.
1880 The @code{id} inside the TemplateString, if present, is a template
1881 string for substitutions using the syntax explained previously. It
1882 appears to be an English-language version of the localized template
string in the Value in which the TemplateString is nested.  A writer may
1884 safely omit the optional fixed data in TemplateString.
1886 FontStyle and CellStyle, if present, change the style for this
1887 individual Value. In FontStyle, @code{bold}, @code{italic}, and
1888 @code{underline} control the particular style. @code{show} is
1889 ordinarily 1; if it is 0, then the cell data is not shown.
1890 @code{fg-color} and @code{bg-color} are strings in the format
1891 @code{#rrggbb}, e.g.@: @code{#ff0000} for red or @code{#ffffff} for
1892 white. The empty string is occasionally observed also. The
1893 @code{size} is a font size in units of 1/128 inch.
1895 In CellStyle, @code{halign} is 0 for center, 2 for left, 4 for right,
1896 6 for decimal, 0xffffffad for mixed. For decimal alignment,
1897 @code{decimal-offset} is the decimal point's offset from the right
1898 side of the cell, in pt (@pxref{SPV Light Detail Member Format}).
1899 @code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
1900 for bottom. @code{left-margin}, @code{right-margin},
1901 @code{top-margin}, and @code{bottom-margin} are in pt.
1903 @node SPV Legacy Detail Member Binary Format
1904 @section Legacy Detail Member Binary Format
1906 Whereas the light binary format represents everything about a given
1907 pivot table, the legacy binary format conceptually consists of a
1908 number of named sources, each of which consists of a number of named
1909 variables, each of which is a 1-dimensional array of numbers or
1910 strings or a mix. Thus, the legacy binary member format is quite
1913 This section uses the same context-free grammar notation as in the
1914 previous section, with the following additions:
1918 In a version 0xaf legacy member, @var{x}; in other versions, nothing.
1919 (The legacy member header indicates the version; see below.)
1922 In a version 0xb0 legacy member, @var{x}; in other versions, nothing.
1925 A legacy detail member @file{.bin} has the following overall format:
1929 00 byte[version] int16[n-sources] int32[member-size]
1930 Metadata*[n-sources]
1935 @code{version} is a version number that affects the interpretation of
1936 some of the other data in the member. Versions 0xaf and 0xb0 are
1937 known. We will refer to ``version 0xaf'' and ``version 0xb0'' members
1940 A legacy member consists of @code{n-sources} data sources, each of
1941 which has Metadata and Data.
1943 @code{member-size} is the size of the legacy binary member, in bytes.
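
Reading the fixed part of the header might be sketched in Python as
below.  The sketch assumes that @code{int16} and @code{int32} are
little-endian, as in the light member notation, and the function name
is hypothetical.

```python
import struct


def parse_legacy_header(data):
    """Parse the fixed 8-byte header of a legacy detail member."""
    # Assumption: little-endian fields, hence the "<" prefix.
    zero, version, n_sources, member_size = struct.unpack_from("<BBhi", data, 0)
    if zero != 0:
        raise ValueError("legacy member must start with a 00 byte")
    if version not in (0xAF, 0xB0):
        raise ValueError("unknown legacy member version")
    return version, n_sources, member_size
```

A reader could additionally compare @code{member_size} against the
actual length of the member as a consistency check.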
1945 The Data and Strings above are commented out because the Metadata has
1946 some oddities that mean that the Data sometimes seems to start at
1947 an unexpected place. The following section goes into detail.
1950 * SPV Legacy Member Metadata::
1951 * SPV Legacy Member Numeric Data::
1952 * SPV Legacy Member String Data::
1955 @node SPV Legacy Member Metadata
1956 @subsection Metadata
1960 int32[n-values] int32[n-variables] int32[data-offset]
1961 vAF(byte*28[source-name])
1962 vB0(byte*64[source-name] int32[x])
1965 A data source has @code{n-variables} variables, each with
1966 @code{n-values} data values.
1968 @code{source-name} is a 28- or 64-byte string padded on the right with
1969 0-bytes. The names that appear in the corpus are very generic:
1970 usually @code{tableData} for pivot table data or @code{source0} for
1973 A given Metadata's @code{data-offset} is the offset, in bytes, from
1974 the beginning of the member to the start of the corresponding Data.
1975 This allows programs to skip to the beginning of the data for a
1976 particular source. In every case in the corpus, the Data follow the
1977 Metadata in the same order, but it is important to use
1978 @code{data-offset} instead of reading sequentially through the file
1979 because of the exception described below.
1981 One SPV file in the corpus has legacy binary members with version 0xb0
1982 but a 28-byte @code{source-name} field (and only a single source). In
1983 practice, this means that the 64-byte @code{source-name} used in
1984 version 0xb0 has a lot of 0-bytes in the middle followed by the
1985 @code{variable-name} of the following Data. As long as a reader
1986 treats the first 0-byte in the @code{source-name} as terminating the
1987 string, it can properly interpret these members.
1989 The meaning of @code{x} in version 0xb0 is unknown.
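
The following Python sketch shows one way to read a single Metadata
record, including the rule that the first 0-byte terminates
@code{source-name}. It assumes little-endian integers and decodes the
name as UTF-8; both are assumptions of this example, as is the
function name @code{parse_metadata}.

```python
import struct

def parse_metadata(buf, offset, version):
    """Parse one Metadata record; return (fields, offset past the record)."""
    # Assumption: little-endian int32 fields.
    n_values, n_variables, data_offset = struct.unpack_from("<3i", buf, offset)
    offset += 12
    name_len = 28 if version == 0xAF else 64
    raw_name = buf[offset:offset + name_len]
    offset += name_len
    if version == 0xB0:
        offset += 4  # int32[x], whose meaning is unknown
    # Treat the first 0-byte as the end of the name.  This also copes with
    # the odd 0xb0 members that really have a 28-byte name field.
    source_name = raw_name.split(b"\0", 1)[0].decode("utf-8", "replace")
    metadata = {
        "n_values": n_values,
        "n_variables": n_variables,
        "data_offset": data_offset,
        "source_name": source_name,
    }
    return metadata, offset

# Hypothetical 0xb0 record whose name field has stray bytes after the zeros.
record = (struct.pack("<3i", 5, 2, 88)
          + (b"tableData" + b"\0" * 19 + b"cell").ljust(64, b"\0")
          + struct.pack("<i", 0))
metadata, next_offset = parse_metadata(record, 0, 0xB0)
print(metadata["source_name"], metadata["data_offset"], next_offset)
```

A reader built this way would then seek to @code{data_offset} to find
the Data, rather than continuing from the returned offset.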
1991 @node SPV Legacy Member Numeric Data
1992 @subsection Numeric Data
1995 Data => Variable*[n-variables]
1996 Variable => byte*288[variable-name] double*[n-values]
1999 Data follow the Metadata in the legacy binary format, with sources in
2000 the same order (but readers should use the @code{data-offset} in
2001 Metadata records, rather than reading sequentially). Each Variable
2002 begins with a @code{variable-name} that generally indicates its role
2003 in the pivot table, e.g.@: ``cell'', ``cellFormat'',
2004 ``dimension0categories'', ``dimension0group0'', followed by the
2005 numeric data, one double per datum. The most negative finite double,
2006 @code{-DBL_MAX}, represents the system-missing value SYSMIS.
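
The Variable layout can be read with a sketch like the following, in
Python. It assumes little-endian doubles and assumes that
@code{variable-name} is padded with 0-bytes in the same way as
@code{source-name}; the function name and the choice of @code{None}
for missing values are inventions of this example.

```python
import struct
import sys

SYSMIS = -sys.float_info.max  # the same value as -DBL_MAX in C

def parse_variable(buf, offset, n_values):
    """Parse one Variable: byte*288[variable-name] double*[n-values]."""
    raw_name = buf[offset:offset + 288]
    # Assumption: the name ends at the first 0-byte.
    name = raw_name.split(b"\0", 1)[0].decode("utf-8", "replace")
    offset += 288
    # Assumption: doubles are little-endian.
    raw_values = struct.unpack_from("<%dd" % n_values, buf, offset)
    offset += 8 * n_values
    # Report the system-missing value as None.
    values = [None if v == SYSMIS else v for v in raw_values]
    return name, values, offset

# Hypothetical variable with three values, the second system-missing.
data = b"cell".ljust(288, b"\0") + struct.pack("<3d", 1.5, SYSMIS, 2.0)
print(parse_variable(data, 0, 3))
```

Calling @code{parse_variable} @code{n-variables} times, starting at
the source's @code{data-offset}, would read a whole Data element.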
2009 @node SPV Legacy Member String Data
2010 @subsection String Data
2013 Strings => SourceMaps[maps] Labels
2015 SourceMaps => int32[n-maps] SourceMap*[n-maps]
2017 SourceMap => string[source-name] int32[n-variables] VariableMap*[n-variables]
2018 VariableMap => string[variable-name] int32[n-data] DatumMap*[n-data]
2019 DatumMap => int32[value-idx] int32[label-idx]
2021 Labels => int32[n-labels] Label*[n-labels]
2022 Label => int32[frequency] string[label]
2025 Each variable may include a mix of numeric and string data values. If
2026 a legacy binary member contains any string data, Strings is present;
2027 otherwise, the member ends just after the last Data element.
2029 The string data overlays the numeric data. When a variable includes
2030 any string data, its Variable represents the string values with a
2031 SYSMIS or NaN placeholder. (Not all such values need be
2034 Each SourceMap provides a mapping between SYSMIS or NaN values in source
2035 @code{source-name} and the string data that they represent.
2036 @code{n-variables} is the number of variables in the source that
2037 include string data. More precisely, it is the 1-based index of the
2038 last variable in the source that includes any string data; thus, it
2039 would be 4 if there are 5 variables and only the fourth one includes
2042 A VariableMap repeats its variable's name, but variables are always
2043 present in the same order as the source, starting from the first
2044 variable, without skipping any even if they have no string values.
2045 Each VariableMap contains DatumMap nonterminals, each of which maps
2046 from a 0-based index within its variable's data to a 0-based label
2047 index, e.g.@: pair @code{value-idx} = 2, @code{label-idx} = 3, means
2048 that the third data value (which must be SYSMIS or NaN) is to be
2049 replaced by the string of the fourth Label.
2051 The labels themselves follow the pairs. The valuable part of each
2052 label is the string @code{label}. Each label also includes a
2053 @code{frequency} that reports the number of DatumMaps that reference
2054 it (although this is not useful).
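
A reader might parse Strings and overlay the labels on the numeric
data along the following lines. This Python sketch assumes that
@code{string} is a little-endian int32 length followed by that many
bytes, as in the notation of the previous section, and that the bytes
are UTF-8. All of the function names are made up for this example.

```python
import struct

def read_int32(buf, ofs):
    return struct.unpack_from("<i", buf, ofs)[0], ofs + 4

def read_string(buf, ofs):
    # Assumption: string => int32[length] byte*[length], in UTF-8.
    n, ofs = read_int32(buf, ofs)
    return buf[ofs:ofs + n].decode("utf-8", "replace"), ofs + n

def parse_strings(buf, ofs):
    """Return ({source: {variable: [(value_idx, label_idx)]}}, [label])."""
    n_maps, ofs = read_int32(buf, ofs)
    maps = {}
    for _ in range(n_maps):
        source_name, ofs = read_string(buf, ofs)
        n_variables, ofs = read_int32(buf, ofs)
        variables = {}
        for _ in range(n_variables):
            variable_name, ofs = read_string(buf, ofs)
            n_data, ofs = read_int32(buf, ofs)
            pairs = []
            for _ in range(n_data):
                pairs.append(struct.unpack_from("<2i", buf, ofs))
                ofs += 8
            variables[variable_name] = pairs
        maps[source_name] = variables
    n_labels, ofs = read_int32(buf, ofs)
    labels = []
    for _ in range(n_labels):
        _frequency, ofs = read_int32(buf, ofs)  # not needed by a reader
        label, ofs = read_string(buf, ofs)
        labels.append(label)
    return maps, labels

def overlay(values, pairs, labels):
    """Replace placeholder values by the labels that the DatumMaps name."""
    out = list(values)
    for value_idx, label_idx in pairs:
        out[value_idx] = labels[label_idx]
    return out

# value-idx 2, label-idx 3: the third value becomes the fourth label.
print(overlay([1.0, 2.0, None], [(2, 3)], ["a", "b", "c", "d"]))
```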
2056 @node SPV Legacy Detail Member XML Format
2057 @section Legacy Detail Member XML Format
2059 The detail XML format is not what one would design from scratch
2060 for describing pivot tables. This is because it is a special case
2061 of a much more general format (``visualization XML'' or ``VizML'')
2062 that can describe a wide range of visualizations. Most of this
2063 generality is overkill for tables, so tables end up using an odd
2064 subset of a general-purpose format.
2066 An XML Schema for VizML is available, distributed with SPSS binaries,
2067 under a nonfree license. It contains documentation that is
2068 occasionally helpful.
2070 This section describes the detail XML format using the same notation
2071 already used for the structure XML format (@pxref{SPV Structure Member
2072 Format}). See @file{src/output/spv/detail-xml.grammar} in the PSPP
2073 source tree for the full grammar that it uses for parsing.
2075 The important elements of the detail XML format are:
2079 Variables. @xref{SPV Detail Variable Elements}.
2082 Assignment of variables to axes. A variable can appear as columns, or
2083 rows, or layers. The @code{faceting} element and its sub-elements
2084 describe this assignment.
2087 Styles and other annotations.
2090 This description is not detailed enough to write legacy tables.
2091 Instead, write tables in the light binary format.
2094 * SPV Detail visualization Element::
2095 * SPV Detail Variable Elements::
2096 * SPV Detail extension Element::
2097 * SPV Detail graph Element::
2098 * SPV Detail location Element::
2099 * SPV Detail faceting Element::
2100 * SPV Detail facetLayout Element::
2101 * SPV Detail label Element::
2102 * SPV Detail setCellProperties Element::
2103 * SPV Detail setFormat Element::
2104 * SPV Detail interval Element::
2105 * SPV Detail style Element::
2106 * SPV Detail labelFrame Element::
2107 * SPV Detail Legacy Properties::
2110 @node SPV Detail visualization Element
2111 @subsection The @code{visualization} Element
2119 :style[style_ref]=ref style
2123 => visualization_extension?
2125 (sourceVariable | derivedVariable)+
2134 extension[visualization_extension]
2137 :minWidthSet=(true)?
2138 :maxWidthSet=(true)?
2141 userSource :missing=(listwise | pairwise)? => EMPTY
2143 categoricalDomain => variableReference simpleSort
2145 simpleSort :method[sort_method]=(custom) => categoryOrder
2147 container :style=ref style => container_extension? location+ labelFrame*
2149 extension[container_extension] :combinedFootnotes=(true) => EMPTY
2157 The @code{visualization} element is the root of a detail XML member. It
2158 has the following attributes:
2160 @defvr {Attribute} creator
2161 The version of the software that created this SPV file, as a string of
2162 the form @code{xxyyzz}, which represents software version xx.yy.zz,
2163 e.g.@: @code{160001} is version 16.0.1. The corpus includes major
2164 versions 16 through 19.
2167 @defvr {Attribute} date
2168 The date on which the file was created, as a string of the form
2172 @defvr {Attribute} lang
2173 The locale used for output, in Windows format, which is similar to the
2174 format used in Unix with the underscore replaced by a hyphen, e.g.@:
2175 @code{en-US}, @code{en-GB}, @code{el-GR}, @code{sr-Cyrl-RS}.
2178 @defvr {Attribute} name
2179 The title of the pivot table, localized to the output language.
2182 @defvr {Attribute} style
2183 The base style for the pivot table. In every example in the corpus,
2184 the @code{style} element has no attributes other than @code{id}.
2187 @defvr {Attribute} type
2188 A floating-point number. The meaning is unknown.
2191 @defvr {Attribute} version
2192 The visualization schema version number. In the corpus, the value is
2193 one of 2.4, 2.5, 2.7, and 2.8.
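
As a small illustration of the @code{creator} encoding described
above, the following Python sketch splits a six-digit string into its
three parts. The function name is made up for this example, and the
sketch assumes that the value always has exactly six digits.

```python
def parse_creator(creator):
    """Split a creator string of the form xxyyzz into (xx, yy, zz)."""
    # Assumption: exactly six decimal digits, as in the form xxyyzz.
    if len(creator) != 6 or not creator.isdigit():
        raise ValueError("expected six digits, got %r" % creator)
    return int(creator[0:2]), int(creator[2:4]), int(creator[4:6])

print(parse_creator("160001"))  # version 16.0.1
```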
2196 The @code{userSource} element has no visible effect.
2198 The @code{extension} element as a child of @code{visualization} has
2199 the following attributes.
2201 @defvr {Attribute} numRows
2202 An integer that presumably defines the number of rows in the displayed
2206 @defvr {Attribute} showGridline
2207 Always set to @code{false} in the corpus.
2210 @defvr {Attribute} minWidthSet
2211 @defvrx {Attribute} maxWidthSet
2212 Always set to @code{true} in the corpus.
2215 The @code{extension} element as a child of @code{container} has the
2218 @defvr {Attribute} combinedFootnotes
2222 The @code{categoricalDomain} and @code{simpleSort} elements have no
2225 The @code{layerController} element has no visible effect.
2227 @node SPV Detail Variable Elements
2228 @subsection Variable Elements
2230 A ``variable'' in detail XML is a 1-dimensional array of data. Each
2231 element of the array may, independently, have string or numeric
2232 content. All of the variables in a given detail XML member either
2233 have the same number of elements or have zero elements.
2235 Two different elements define variables and their content:
2238 @item sourceVariable
2239 These variables' data comes from the associated @code{tableData.bin}
2242 @item derivedVariable
2243 These variables are defined in terms of a mapping function from a
2244 source variable, or they are empty.
2247 A variable named @code{cell} always exists. This variable holds the
2248 data displayed in the table.
2250 Variables in detail XML roughly correspond to the dimensions in a
2251 light detail member. Each dimension has the following variables with
2252 stylized names, where @var{n} is a number for the dimension starting
2256 @item dimension@var{n}categories
2257 The dimension's leaf categories (@pxref{SPV Light Member Categories}).
2259 @item dimension@var{n}group0
2260 Present only if the dimension's categories are grouped, this variable
2261 holds the group labels for the categories. Grouping is inferred
2262 through adjacent identical labels. Categories that are not part of a
2263 group have empty-string data in this variable.
2265 @item dimension@var{n}group1
2266 Present only if the first-level groups are further grouped, this
2267 variable holds the labels for the second-level groups. There can be
2268 additional variables with further levels of grouping.
2270 @item dimension@var{n}
2274 Determining the data for a (non-empty) variable is a multi-step
2279 Draw initial data from its source, for a @code{sourceVariable}, or
2280 from another named variable, for a @code{derivedVariable}.
2283 Apply mappings from @code{valueMapEntry} elements within the
2284 @code{derivedVariable} element, if any.
2287 Apply mappings from @code{relabel} elements within a @code{format} or
2288 @code{stringFormat} element in the @code{sourceVariable} or
2289 @code{derivedVariable} element, if any.
2292 If the variable is a @code{sourceVariable} with a @code{labelVariable}
2293 attribute, and there were no mappings to apply in previous steps, then
2294 replace each element of the variable by the corresponding value in the
2298 A single variable's data can be modified in two of the steps, if both
2299 @code{valueMapEntry} and @code{relabel} are used. The following
2300 example from the corpus maps several integers to 2, then maps 2 in
2301 turn to the string ``Input'':
2304 <derivedVariable categorical="true" dependsOn="dimension0categories"
2305 id="dimension0group0map" value="map(dimension0group0)">
2307 <relabel from="2" to="Input"/>
2308 <relabel from="10" to="Missing Value Handling"/>
2309 <relabel from="14" to="Resources"/>
2310 <relabel from="0" to=""/>
2311 <relabel from="1" to=""/>
2312 <relabel from="13" to=""/>
2314 <valueMapEntry from="2;3;5;6;7;8;9" to="2"/>
2315 <valueMapEntry from="10;11" to="10"/>
2316 <valueMapEntry from="14;15" to="14"/>
2317 <valueMapEntry from="0" to="0"/>
2318 <valueMapEntry from="1" to="1"/>
2319 <valueMapEntry from="13" to="13"/>
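
The two mapping steps can be sketched in Python as follows, using the
values from the example above. The sketch keeps values as strings and
assumes that a value with no matching entry passes through unchanged,
which the corpus does not confirm. The function names are made up for
this example.

```python
def apply_value_map(values, entries):
    """Apply valueMapEntry-style (from, to) pairs to a list of values."""
    mapping = {}
    for from_attr, to in entries:
        for source in from_attr.split(";"):
            mapping[source] = to
    # Assumption: values without an entry are left as they are.
    return [mapping.get(v, v) for v in values]

def apply_relabels(values, relabels):
    """Apply relabel-style (from, to) pairs to a list of values."""
    mapping = dict(relabels)
    return [mapping.get(v, v) for v in values]

value_map = [("2;3;5;6;7;8;9", "2"), ("10;11", "10"), ("14;15", "14"),
             ("0", "0"), ("1", "1"), ("13", "13")]
relabels = [("2", "Input"), ("10", "Missing Value Handling"),
            ("14", "Resources"), ("0", ""), ("1", ""), ("13", "")]

data = ["3", "11", "15", "0"]           # hypothetical initial data
mapped = apply_value_map(data, value_map)
labeled = apply_relabels(mapped, relabels)
print(mapped)
print(labeled)
```

The order matters: applying the @code{relabel} mappings first would
leave values such as 3 or 11 without a label.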
2324 * SPV Detail sourceVariable Element::
2325 * SPV Detail derivedVariable Element::
2326 * SPV Detail valueMapEntry Element::
2329 @node SPV Detail sourceVariable Element
2330 @subsubsection The @code{sourceVariable} Element
2337 :domain=ref categoricalDomain?
2339 :dependsOn=ref sourceVariable?
2341 :labelVariable=ref sourceVariable?
2342 => variable_extension* (format | stringFormat)?
2345 This element defines a variable whose data comes from the
2346 @file{tableData.bin} member that corresponds to this @file{.xml}.
2348 This element has the following attributes.
2350 @defvr {Attribute} id
2351 An @code{id} is always present because this element exists to be
2352 referenced from other elements.
2355 @defvr {Attribute} categorical
2356 Always set to @code{true}.
2359 @defvr {Attribute} source
2360 Always set to @code{tableData}, the @code{source-name} in the
2361 corresponding @file{tableData.bin} member (@pxref{SPV Legacy Member
2365 @defvr {Attribute} sourceName
2366 The name of a variable within the source, corresponding to the
2367 @code{variable-name} in the @file{tableData.bin} member (@pxref{SPV
2368 Legacy Member Numeric Data}).
2371 @defvr {Attribute} label
2372 The variable label, if any.
2375 @defvr {Attribute} labelVariable
2376 The @code{variable-name} of a variable whose string values correspond
2377 one-to-one with the values of this variable and are suitable for use
2381 @defvr {Attribute} dependsOn
2382 This attribute doesn't affect the display of a table.
2385 @node SPV Detail derivedVariable Element
2386 @subsubsection The @code{derivedVariable} Element
2393 :dependsOn=ref sourceVariable?
2394 => variable_extension* (format | stringFormat)? valueMapEntry*
2397 Like @code{sourceVariable}, this element defines a variable whose
2398 values can be used elsewhere in the visualization. Instead of being
2399 read from a data source, the variable's data are defined by a
2400 mathematical expression.
2402 This element has the following attributes.
2404 @defvr {Attribute} id
2405 An @code{id} is always present because this element exists to be
2406 referenced from other elements.
2409 @defvr {Attribute} categorical
2410 Always set to @code{true}.
2413 @defvr {Attribute} value
2414 An expression that defines the variable's value. In theory this could
2415 be an arbitrary expression in terms of constants, functions, and other
2416 variables, e.g.@: @math{(@var{var1} + @var{var2}) / 2}. In practice,
2417 the corpus contains only the following forms of expressions:
2421 @itemx constant(@var{variable})
2422 All zeros. The reason why a variable is sometimes named is unknown.
2423 Sometimes the ``variable name'' has spaces in it.
2425 @item map(@var{variable})
2426 Transforms the values in the named @var{variable} using the
2427 @code{valueMapEntry}s contained within the element.
2431 @defvr {Attribute} dependsOn
2432 This attribute doesn't affect the display of a table.
2435 @node SPV Detail valueMapEntry Element
2436 @subsubsection The @code{valueMapEntry} Element
2439 valueMapEntry :from :to => EMPTY
2442 A @code{valueMapEntry} element defines a mapping from one or more
2443 values of a source expression to a target value. (In the corpus, the
2444 source expression is always just the name of a variable.) Each target
2445 value requires a separate @code{valueMapEntry}. If multiple source
2446 values map to the same target value, they can be combined into one entry or listed separately.
2448 In the corpus, all of the source and target values are integers.
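
For illustration, the following Python sketch collects the
@code{valueMapEntry} children of an element into a dictionary with
integer keys and values. It uses the standard
@code{xml.etree.ElementTree} module and, as a simplification, assumes
that the elements carry no XML namespace. The function name
@code{value_map_from_xml} and the sample input are made up for this
example.

```python
import xml.etree.ElementTree as ET

def value_map_from_xml(derived_variable):
    """Build {source value: target value} from valueMapEntry children."""
    mapping = {}
    for entry in derived_variable.iter("valueMapEntry"):
        target = int(entry.get("to"))
        # "from" holds one value or several separated by semicolons.
        for source in entry.get("from").split(";"):
            mapping[int(source)] = target
    return mapping

# Hypothetical, abbreviated input; real members contain more attributes.
xml_text = """
<derivedVariable id="dimension0group0map" value="map(dimension0group0)">
  <valueMapEntry from="10;11" to="10"/>
  <valueMapEntry from="14;15" to="14"/>
  <valueMapEntry from="0" to="0"/>
</derivedVariable>
"""
mapping = value_map_from_xml(ET.fromstring(xml_text))
print(mapping)
```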
2450 @code{valueMapEntry} has the following attributes.
2452 @defvr {Attribute} from
2453 A source value, or multiple source values separated by semicolons,
2454 e.g.@: @code{0} or @code{13;14;15;16}.
2457 @defvr {Attribute} to
2458 The target value, e.g.@: @code{0}.
2461 @node SPV Detail extension Element
2462 @subsection The @code{extension} Element
2464 This is a general-purpose ``extension'' element. Readers that don't
2465 understand a given extension should be able to safely ignore it. The
2466 attributes on this element, and their meanings, vary based on the
2467 context. Each known usage is described separately below. The current
2468 extensions use attributes exclusively, without any nested elements.
2470 @subsubheading @code{container} Parent Element
2473 extension[container_extension] :combinedFootnotes=(true) => EMPTY
2476 With @code{container} as its parent element, @code{extension} has the
2477 following attributes.
2479 @defvr {Attribute} combinedFootnotes
2480 Always set to @code{true} in the corpus.
2483 @subsubheading @code{sourceVariable} and @code{derivedVariable} Parent Element
2486 extension[variable_extension] :from :helpId => EMPTY
2489 With @code{sourceVariable} or @code{derivedVariable} as its parent
2490 element, @code{extension} has the following attributes. A given
2491 parent element often contains several @code{extension} elements that
2492 specify the meaning of the source data's variables or sources, e.g.@:
2495 <extension from="0" helpId="corrected_model"/>
2496 <extension from="3" helpId="error"/>
2497 <extension from="4" helpId="total_9"/>
2498 <extension from="5" helpId="corrected_total"/>
2501 More commonly they are less helpful, e.g.@:
2504 <extension from="0" helpId="notes"/>
2505 <extension from="1" helpId="notes"/>
2506 <extension from="2" helpId="notes"/>
2507 <extension from="5" helpId="notes"/>
2508 <extension from="6" helpId="notes"/>
2509 <extension from="7" helpId="notes"/>
2510 <extension from="8" helpId="notes"/>
2511 <extension from="12" helpId="notes"/>
2512 <extension from="13" helpId="no_help"/>
2513 <extension from="14" helpId="notes"/>
2516 @defvr {Attribute} from
2517 An integer or a name like ``dimension0''.
2520 @defvr {Attribute} helpId
2524 @node SPV Detail graph Element
2525 @subsection The @code{graph} Element
2529 :cellStyle=ref style
2531 => location+ coordinates faceting facetLayout interval
2533 coordinates => EMPTY
2536 @code{graph} has the following attributes.
2538 @defvr {Attribute} cellStyle
2539 @defvrx {Attribute} style
2540 Each of these is the @code{id} of a @code{style} element (@pxref{SPV
2541 Detail style Element}). The former is the default style for
2542 individual cells, the latter for the entire table.
2545 @node SPV Detail location Element
2546 @subsection The @code{location} Element
2550 :part=(height | width | top | bottom | left | right)
2551 :method=(sizeToContent | attach | fixed | same)
2554 :target=ref (labelFrame | graph | container)?
2559 Each instance of this element specifies where some part of the table
2560 frame is located. All the examples in the corpus have four instances
2561 of this element, one for each of the parts @code{height},
2562 @code{width}, @code{left}, and @code{top}. Some examples in the
2563 corpus add a fifth for part @code{bottom}, even though it is not clear
2564 how all of @code{top}, @code{bottom}, and @code{height} can be honored
2565 at the same time. In any case, @code{location} seems to have little
2566 importance in representing tables; a reader can safely ignore it.
2568 @defvr {Attribute} part
2569 The part of the table being located.
2572 @defvr {Attribute} method
2573 How the location is determined:
2577 Based on the natural size of the table. Observed only for
2578 parts @code{height} and @code{width}.
2581 Based on the location specified in @code{target}. Observed only for
2582 parts @code{top} and @code{bottom}.
2585 Using the value in @code{value}. Observed only for parts @code{top},
2586 @code{bottom}, and @code{left}.
2589 Same as the specified @code{target}. Observed only for part
2594 @defvr {Attribute} min
2595 Minimum size. Observed only with value @code{100pt} and only
2596 for part @code{width}.
2599 @defvr {Dependent} target
2600 Required when @code{method} is @code{attach} or @code{same}, not
2601 observed otherwise. This identifies an element to attach to.
2602 Observed with the ID of @code{title}, @code{footnote}, @code{graph},
2606 @defvr {Dependent} value
2607 Required when @code{method} is @code{fixed}, not observed otherwise.
2608 Observed values are @code{0%}, @code{0px}, @code{1px}, and @code{3px}
2609 on parts @code{top} and @code{left}, and @code{100%} on part
2613 @node SPV Detail faceting Element
2614 @subsection The @code{faceting} Element
2617 faceting => layer[layers1]* cross layer[layers2]*
2619 cross => (unity | nest) (unity | nest)
2623 nest => variableReference[vars]+
2625 variableReference :ref=ref (sourceVariable | derivedVariable) => EMPTY
2628 :variable=ref (sourceVariable | derivedVariable)
2631 :method[layer_method]=(nest)?
2636 The @code{faceting} element describes the row, column, and layer
2637 structure of the table. Its @code{cross} child determines the row and
2638 column structure, and each @code{layer} child (if any) represents a
2639 layer. Layers may appear before or after @code{cross}.
2641 The @code{cross} element describes the row and column structure of the
2642 table. It has exactly two children, the first of which describes the
2643 table's columns and the second the table's rows. Each child is a
2644 @code{nest} element if the table has any dimensions along the axis in
2645 question, otherwise a @code{unity} element.
2647 A @code{nest} element contains one or more dimensions listed from
2648 innermost to outermost, each represented by @code{variableReference}
2649 child elements. Each variable in a dimension is listed in order.
2650 @xref{SPV Detail Variable Elements}, for information on the variables
2651 that comprise a dimension.
2653 A @code{nest} can contain a single dimension, e.g.:
2657 <variableReference ref="dimension0categories"/>
2658 <variableReference ref="dimension0group0"/>
2659 <variableReference ref="dimension0"/>
2664 A @code{nest} can contain multiple dimensions, e.g.:
2668 <variableReference ref="dimension1categories"/>
2669 <variableReference ref="dimension1group0"/>
2670 <variableReference ref="dimension1"/>
2671 <variableReference ref="dimension0categories"/>
2672 <variableReference ref="dimension0"/>
2676 A @code{nest} may have no dimensions, in which case it still has one
2677 @code{variableReference} child, which references a
2678 @code{derivedVariable} whose @code{value} attribute is
2679 @code{constant(0)}. In the corpus, such a @code{derivedVariable} has
2680 @code{row} or @code{column}, respectively, as its @code{id}. This is
2681 equivalent to using a @code{unity} element in place of @code{nest}.
2683 A @code{variableReference} element refers to a variable through its
2684 @code{ref} attribute.
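
One way to split the children of a @code{nest} into dimensions is to
rely on the stylized variable names, as in the Python sketch below.
This is only a heuristic for illustration: it assumes that every
variable of dimension @var{n} has a name starting with
@code{dimension@var{n}} and that the elements carry no XML namespace.
The function name is made up for this example.

```python
import re
import xml.etree.ElementTree as ET

def dimensions_in_nest(nest):
    """Return the dimensions of a nest, innermost first, as lists of refs."""
    dims = []
    current = None
    for ref in nest.iter("variableReference"):
        name = ref.get("ref")
        # Assumption: names look like dimension<n>..., e.g. dimension1group0.
        m = re.match(r"dimension(\d+)", name)
        index = int(m.group(1)) if m else None
        if not dims or index != current:
            dims.append([])  # a new dimension starts here
            current = index
        dims[-1].append(name)
    return dims

nest = ET.fromstring("""
<nest>
  <variableReference ref="dimension1categories"/>
  <variableReference ref="dimension1group0"/>
  <variableReference ref="dimension1"/>
  <variableReference ref="dimension0categories"/>
  <variableReference ref="dimension0"/>
</nest>
""")
print(dimensions_in_nest(nest))
```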
2686 Each @code{layer} element represents a dimension, e.g.:
2689 <layer value="0" variable="dimension0categories" visible="true"/>
2690 <layer value="dimension0" variable="dimension0" visible="false"/>
2694 @code{layer} has the following attributes.
2696 @defvr {Attribute} variable
2697 Refers to a @code{sourceVariable} or @code{derivedVariable} element.
2700 @defvr {Attribute} value
2701 The value to select. For a category variable, this is always
2702 @code{0}; for a data variable, it is the same as the @code{variable}
2706 @defvr {Attribute} visible
2707 Whether the layer is visible. Generally, category layers are visible
2708 and data layers are not, but sometimes this attribute is omitted.
2711 @defvr {Attribute} method
2712 When present, this is always @code{nest}.
2715 @node SPV Detail facetLayout Element
2716 @subsection The @code{facetLayout} Element
2719 facetLayout => tableLayout setCellProperties[scp1]*
2720 facetLevel+ setCellProperties[scp2]*
2723 :verticalTitlesInCorner=bool
2725 :fitCells=(ticks both)?
2729 The @code{facetLayout} element and its descendants control styling for
2732 Its @code{tableLayout} child has the following attributes:
2734 @defvr {Attribute} verticalTitlesInCorner
2735 If true, in the absence of corner text, row headings will be displayed
2739 @defvr {Attribute} style
2740 Refers to a @code{style} element.
2743 @defvr {Attribute} fitCells
2747 @subsubheading The @code{facetLevel} Element
2750 facetLevel :level=int :gap=dimension? => axis
2752 axis :style=ref style => label? majorTicks
2758 :tickFrameStyle=ref style
2759 :labelFrequency=int?
2769 Each @code{facetLevel} describes a @code{variableReference} or
2770 @code{layer}, and a table has one @code{facetLevel} element for
2771 each such element. For example, an SPV detail member that contains
2772 four @code{variableReference} elements and two @code{layer} elements
2773 will contain six @code{facetLevel} elements.
2775 In the corpus, @code{facetLevel} elements and the elements that they
2776 describe are always in the same order. The correspondence may also be
2777 observed in two other ways. First, one may use the @code{level}
2778 attribute, described below. Second, in the corpus, a
2779 @code{facetLevel} always has an @code{id} that is the same as the
2780 @code{id} of the element it describes with @code{_facetLevel}
2781 appended. One should not formally rely on this, of course, but it is
2782 usefully indicative.
2784 @defvr {Attribute} level
2785 A 1-based index into the @code{variableReference} and @code{layer}
2786 elements, e.g.@: a @code{facetLevel} with a @code{level} of 1
2787 describes the first @code{variableReference} in the SPV detail member,
2788 and in a member with four @code{variableReference} elements, a
2789 @code{facetLevel} with a @code{level} of 5 describes the first
2790 @code{layer} in the member.
2793 @defvr {Attribute} gap
2794 Always observed as @code{0pt}.
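
The @code{level} numbering described above can be resolved with a
sketch like the following, in Python. It assumes that the
@code{variableReference} elements are counted first and the
@code{layer} elements after them, each kind in document order, which
matches the example given for @code{level} but is otherwise an
assumption. The function name is made up for this example.

```python
def element_for_level(level, variable_references, layers):
    """Map a 1-based level to the element that a facetLevel describes."""
    # Assumption: variableReference elements come first, then layers.
    elements = list(variable_references) + list(layers)
    if not 1 <= level <= len(elements):
        raise IndexError("level %d out of range 1..%d" % (level, len(elements)))
    return elements[level - 1]

# Hypothetical member with four variableReferences and two layers.
refs = ["ref1", "ref2", "ref3", "ref4"]
layers = ["layer1", "layer2"]
print(element_for_level(1, refs, layers))
print(element_for_level(5, refs, layers))
```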
2797 Each @code{facetLevel} contains an @code{axis}, which in turn may
2798 contain a @code{label} for the @code{facetLevel} (@pxref{SPV Detail
2799 label Element}) and does contain a @code{majorTicks} element.
2801 @defvr {Attribute} labelAngle
2802 Normally 0. The value -90 causes inner column or outer row labels to
2803 be rotated vertically.
2806 @defvr {Attribute} style
2807 @defvrx {Attribute} tickFrameStyle
2808 Each refers to a @code{style} element. @code{style} is the style of
2809 the tick labels, @code{tickFrameStyle} the style for the frames around
2813 @node SPV Detail label Element
2814 @subsection The @code{label} Element
2819 :textFrameStyle=ref style?
2820 :purpose=(title | subTitle | subSubTitle | layer | footnote)?
2821 => text+ | descriptionGroup
2824 :target=ref faceting
2826 => (description | text)+
2828 description :name=(variable | value) => EMPTY
2832 :definesReference=int?
2833 :position=(subscript | superscript)?
2838 This element represents a label on some aspect of the table.
2840 @defvr {Attribute} style
2841 @defvrx {Attribute} textFrameStyle
2842 Each of these refers to a @code{style} element. @code{style} is the
2843 style of the label text, @code{textFrameStyle} the style for the frame
2847 @defvr {Attribute} purpose
2848 The kind of entity being labeled.
2851 A @code{descriptionGroup} concatenates one or more elements to form a
2852 label. Each element can be a @code{text} element, which contains
2853 literal text, or a @code{description} element that substitutes a value
2856 @defvr {Attribute} target
2857 The @code{id} of an element being described. In the corpus, this is
2858 always @code{faceting}.
2861 @defvr {Attribute} separator
2862 A string to separate the description of multiple groups, if the
2863 @code{target} has more than one. In the corpus, this is always a
2867 Typical contents for a @code{descriptionGroup} are a value by itself:
2869 <description name="value"/>
2871 @noindent or a variable and its value, separated by a colon:
2873 <description name="variable"/><text>:</text><description name="value"/>
2876 A @code{description} is like a macro that expands to some property of
2877 the target of its parent @code{descriptionGroup}. The @code{name}
2878 attribute specifies the property.
2880 @node SPV Detail setCellProperties Element
2881 @subsection The @code{setCellProperties} Element
2885 :applyToConverse=bool?
2886 => (setStyle | setFrameStyle | setFormat | setMetaData)* union[union_]?
2889 The @code{setCellProperties} element sets style properties of cells or
2890 row or column labels.
2892 Interpreting @code{setCellProperties} requires answering two
2893 questions: which cells or labels to style, and what styles to use.
2895 @subsubheading Which Cells?
2900 intersect => where+ | intersectWhere | alternating | EMPTY
2903 :variable=ref (sourceVariable | derivedVariable)
2908 :variable=ref (sourceVariable | derivedVariable)
2909 :variable2=ref (sourceVariable | derivedVariable)
2912 alternating => EMPTY
2915 When @code{union} is present with @code{intersect} children, each of
2916 those children specifies a group of cells that should be styled, and
2917 the total group is all those cells taken together. When @code{union}
2918 is absent, every cell is styled. One attribute on
2919 @code{setCellProperties} affects the choice of cells:
2921 @defvr {Attribute} applyToConverse
2922 If true, this inverts the meaning of the cell selection: the selected
2923 cells are the ones @emph{not} designated. This is confusing, given
2924 the additional restrictions of @code{union}, but in the corpus
2925 @code{applyToConverse} is never present along with @code{union}.
2928 An @code{intersect} specifies restrictions on the cells to be matched.
2929 Each @code{where} child specifies which values of a given variable to
2930 include. The attributes of @code{where} are:
2932 @defvr {Attribute} variable
2933 Refers to a variable, e.g.@: @code{dimension0categories}. Only
2934 ``categories'' variables make sense here, but other variables, e.g.@:
2935 @code{dimension0group0map}, are sometimes seen. The reader may ignore
2939 @defvr {Attribute} include
2940 A value, or multiple values separated by semicolons,
2941 e.g.@: @code{0} or @code{13;14;15;16}.
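
The selection rule can be sketched in Python as follows: a cell is
styled if it satisfies every @code{where} of at least one
@code{intersect}, or if there is no @code{union} at all. The sketch
models only @code{where} children and leaves out
@code{applyToConverse}, @code{intersectWhere}, and @code{alternating}.
The data representation and the function names are assumptions of this
example.

```python
def parse_include(include):
    """Split an include attribute such as "13;14;15;16" into a set."""
    return set(include.split(";"))

def cell_selected(cell, union):
    """Decide whether a cell is styled.

    cell maps variable names to the cell's values (as strings).
    union is None when the union element is absent; otherwise it is a
    list of intersects, each a list of (variable, include set) pairs.
    """
    if union is None:
        return True  # without union, every cell is styled
    return any(
        all(cell.get(variable) in include for variable, include in intersect)
        for intersect in union
    )

# Hypothetical union with two intersect children, one where each.
union = [
    [("dimension0categories", parse_include("0;1"))],
    [("dimension1categories", parse_include("3"))],
]
print(cell_selected({"dimension0categories": "1", "dimension1categories": "2"}, union))
print(cell_selected({"dimension0categories": "2", "dimension1categories": "2"}, union))
```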
2944 PSPP ignores @code{setCellProperties} when @code{intersectWhere} is
2947 @subsubheading What Styles?
2951 :target=ref (labeling | graph | interval | majorTicks)
2955 setMetaData :target=ref graph :key :value => EMPTY
2958 :target=ref (majorTicks | labeling)
2960 => format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat
2964 :target=ref majorTicks
2968 The @code{set*} children of @code{setCellProperties} determine the
2971 When @code{setCellProperties} contains a @code{setFormat} whose
2972 @code{target} references a @code{labeling} element, or if it contains
2973 a @code{setStyle} that references a @code{labeling} or @code{interval}
2974 element, the @code{setCellProperties} sets the style for table cells.
2975 The format from the @code{setFormat}, if present, replaces the cells'
2976 format. The style from the @code{setStyle} that references
2977 @code{labeling}, if present, replaces the label's font and cell
2978 styles, except that the background color is taken instead from the
2979 @code{interval}'s style, if present.
2981 When @code{setCellProperties} contains a @code{setFormat} whose
2982 @code{target} references a @code{majorTicks} element, or if it
2983 contains a @code{setStyle} whose @code{target} references a
2984 @code{majorTicks}, or if it contains a @code{setFrameStyle} element,
2985 the @code{setCellProperties} sets the style for row or column labels.
2986 In this case, the @code{setCellProperties} always contains a single
2987 @code{where} element whose @code{variable} designates the variable
2988 whose labels are to be styled. The format from the @code{setFormat},
2989 if present, replaces the labels' format. The style from the
2990 @code{setStyle} that references @code{majorTicks}, if present,
2991 replaces the labels' font and cell styles, except that the background
2992 color is taken instead from the @code{setFrameStyle}'s style, if
2995 When @code{setCellProperties} contains a @code{setStyle} whose
2996 @code{target} references a @code{graph} element, and one that
2997 references a @code{labeling} element, and the @code{union} element
2998 contains @code{alternating}, the @code{setCellProperties} sets the
2999 alternate foreground and background colors for the data area. The
3000 foreground color is taken from the style referenced by the
3001 @code{setStyle} that targets the @code{graph}, the background color
3002 from the @code{setStyle} for @code{labeling}.
3004 A reader may ignore a @code{setCellProperties} that only contains
3005 @code{setMetaData}, as well as @code{setMetaData} within other
3006 @code{setCellProperties}.
3008 A reader may ignore a @code{setCellProperties} whose only @code{set*}
3009 child is a @code{setStyle} that targets the @code{graph} element.
3011 @subsubheading The @code{setStyle} Element
3015 :target=ref (labeling | graph | interval | majorTicks)
3020 This element associates a style with the target.
3022 @defvr {Attribute} target
3023 The @code{id} of an element whose style is to be set.
3026 @defvr {Attribute} style
3027 The @code{id} of a @code{style} element that identifies the style to
3031 @node SPV Detail setFormat Element
3032 @subsection The @code{setFormat} Element
3036 :target=ref (majorTicks | labeling)
3038 => format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat
3041 This element sets the format of the target, ``format'' in this case
3042 meaning the SPSS print format for a variable.
3044 The details of this element vary depending on the schema version, as
3045 declared in the root @code{visualization} element's @code{version}
3046 attribute (@pxref{SPV Detail visualization Element}). A reader can
3047 interpret the content without knowing the schema version.
3049 The @code{setFormat} element itself has the following attributes.
3051 @defvr {Attribute} target
Refers to an element whose format is to be set.
3055 @defvr {Attribute} reset
3056 If this is @code{true}, this format replaces the target's previous
format.  If it is @code{false}, this format modifies the previous format.
3061 * SPV Detail numberFormat Element::
3062 * SPV Detail stringFormat Element::
3063 * SPV Detail dateTimeFormat Element::
3064 * SPV Detail elapsedTimeFormat Element::
3065 * SPV Detail format Element::
3066 * SPV Detail affix Element::
3069 @node SPV Detail numberFormat Element
3070 @subsubsection The @code{numberFormat} Element
3074 :minimumIntegerDigits=int?
3075 :maximumFractionDigits=int?
3076 :minimumFractionDigits=int?
3078 :scientific=(onlyForSmall | whenNeeded | true | false)?
3085 Specifies a format for displaying a number. The available options are
3086 a superset of those available from PSPP print formats. PSPP chooses a
3087 print format type for a @code{numberFormat} as follows:
3091 If @code{scientific} is @code{true}, uses @code{E} format.
3094 If @code{prefix} is @code{$}, uses @code{DOLLAR} format.
3097 If @code{suffix} is @code{%}, uses @code{PCT} format.
3100 If @code{useGrouping} is @code{true}, uses @code{COMMA} format.
3103 Otherwise, uses @code{F} format.
3106 For translating to a print format, PSPP uses
3107 @code{maximumFractionDigits} as the number of decimals, unless that
3108 attribute is missing or out of the range [0,15], in which case it uses
3111 @defvr {Attribute} minimumIntegerDigits
3112 Minimum number of digits to display before the decimal point. Always
3113 observed as @code{0}.
3116 @defvr {Attribute} maximumFractionDigits
3117 @defvrx {Attribute} minimumFractionDigits
3118 Maximum or minimum, respectively, number of digits to display after
3119 the decimal point. The observed values of each attribute range from 0
3123 @defvr {Attribute} useGrouping
Whether to use the grouping character to group digits in large numbers.
3128 @defvr {Attribute} scientific
3129 This attribute controls when and whether the number is formatted in
3130 scientific notation. It takes the following values:
3134 Use scientific notation only when the number's magnitude is smaller
3135 than the value of the @code{small} attribute.
3138 Use scientific notation when the number will not otherwise fit in the
3142 Always use scientific notation. Not observed in the corpus.
3145 Never use scientific notation. A number that won't otherwise fit will
3146 be replaced by an error indication (see the @code{errorCharacter}
3147 attribute). Not observed in the corpus.
3151 @defvr {Attribute} small
3152 Only present when the @code{scientific} attribute is
3153 @code{onlyForSmall}, this is a numeric magnitude below which the
3154 number will be formatted in scientific notation. The values @code{0}
3155 and @code{0.0001} have been observed. The value @code{0} seems like a
3156 pathological choice, since no real number has a magnitude less than 0;
3157 perhaps in practice such a choice is equivalent to setting
3158 @code{scientific} to @code{false}.
3161 @defvr {Attribute} prefix
3162 @defvrx {Attribute} suffix
3163 Specifies a prefix or a suffix to apply to the formatted number. Only
3164 @code{suffix} has been observed, with value @samp{%}.
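The print format selection described for @code{numberFormat} can be sketched roughly as follows.  This is only an illustration, not PSPP's actual code: it assumes the rules are tried in the order listed, that the attributes arrive as a dictionary of strings, and that the caller supplies the fallback number of decimals.  The function names are hypothetical.

```python
def number_format_type(attrs):
    """Pick a print format type name from numberFormat attributes.

    attrs is assumed to be a dict mapping attribute names to strings.
    """
    if attrs.get("scientific") == "true":
        return "E"
    if attrs.get("prefix") == "$":
        return "DOLLAR"
    if attrs.get("suffix") == "%":
        return "PCT"
    if attrs.get("useGrouping") == "true":
        return "COMMA"
    return "F"


def number_format_decimals(attrs, fallback):
    """Use maximumFractionDigits if present and within [0, 15].

    Otherwise return the caller-supplied fallback (an assumption of this
    sketch: the default is passed in rather than hard-coded).
    """
    try:
        decimals = int(attrs.get("maximumFractionDigits"))
    except (TypeError, ValueError):
        return fallback
    return decimals if 0 <= decimals <= 15 else fallback
```

For instance, attributes with both @code{suffix} set to @samp{%} and @code{useGrouping} set to @code{true} select @code{PCT}, because the suffix rule is tried first.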
3167 @node SPV Detail stringFormat Element
3168 @subsubsection The @code{stringFormat} Element
3171 stringFormat => relabel* affix*
3173 relabel :from=real :to => EMPTY
3176 The @code{stringFormat} element specifies how to display a string. By
3177 default, a string is displayed verbatim, but @code{relabel} can change
3180 The @code{relabel} element appears as a child of @code{stringFormat}
3181 (and of @code{format}, when it is used to format strings). It
3182 specifies how to display a given value. It is used to implement value
3183 labels and to display the system-missing value in a human-readable
3184 way. It has the following attributes:
3186 @defvr {Attribute} from
3187 The value to map. In the corpus this is an integer or the
3188 system-missing value @code{-1.797693134862316E300}.
3191 @defvr {Attribute} to
3192 The string to display in place of the value of @code{from}. In the
3193 corpus this is a wide variety of value labels; the system-missing
3194 value is mapped to @samp{.}.
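As a rough illustration of how a reader might apply @code{relabel}, the following sketch builds a lookup table from a @code{stringFormat} element and uses it to display values.  It assumes the XML has been parsed with Python's @code{xml.etree.ElementTree} and, for brevity, that element names carry no namespace.  The helper names and the sample labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# System-missing value as it appears in relabel "from" attributes.
SYSMIS = -1.797693134862316e300


def relabel_map(string_format):
    """Build {numeric value: display string} from relabel children."""
    return {
        float(r.get("from")): r.get("to", "")
        for r in string_format.findall("relabel")
    }


def display(value, mapping):
    """Return the relabeled text, or the value verbatim if unmapped."""
    try:
        key = float(value)
    except (TypeError, ValueError):
        return str(value)  # not numeric: display verbatim
    return mapping.get(key, str(value))


# Hypothetical sample input, not taken from the corpus.
sample = ET.fromstring(
    '<stringFormat>'
    '<relabel from="1" to="Male"/>'
    '<relabel from="2" to="Female"/>'
    '<relabel from="-1.797693134862316E300" to="."/>'
    '</stringFormat>'
)
labels = relabel_map(sample)
```

With this table, @code{display(1, labels)} yields the value label and @code{display(SYSMIS, labels)} yields @samp{.}, while an unmapped value falls through unchanged.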
3197 @node SPV Detail dateTimeFormat Element
3198 @subsubsection The @code{dateTimeFormat} Element
3202 :baseFormat[dt_base_format]=(date | time | dateTime)
3204 :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
3206 :yearAbbreviation=bool?
3211 :monthFormat=(long | short | number | paddedNumber)?
3215 :showDayOfWeek=bool?
3216 :dayOfWeekAbbreviation=bool?
3218 :dayOfMonthPadding=bool?
3220 :minutePadding=bool?
3221 :secondPadding=bool?
3227 :dayType=(month | year)?
3228 :hourFormat=(AMPM | AS_24 | AS_12)?
3232 This element appears only in schema version 2.5 and earlier
3233 (@pxref{SPV Detail visualization Element}).
3235 Data to be formatted in date formats is stored as strings in legacy
3236 data, in the format @code{yyyy-mm-ddTHH:MM:SS.SSS} and must be parsed
3237 and reformatted by the reader.
3239 The following attribute is required.
3241 @defvr {Attribute} baseFormat
3242 Specifies whether a date and time are both to be displayed, or just
3246 Many of the attributes' meanings are obvious. The following seem to
3247 be worth documenting.
3249 @defvr {Attribute} separatorChars
3250 Exactly four characters. In order, these are used for: decimal point,
3251 grouping, date separator, time separator. Always @samp{.,-:}.
3254 @defvr {Attribute} mdyOrder
3255 Within a date, the order of the days, months, and years.
@code{dayMonthYear} is the only observed value, but one would expect
@code{monthDayYear} and @code{yearMonthDay} to be reasonable as well.
3261 @defvr {Attribute} showYear
3262 @defvrx {Attribute} yearAbbreviation
3263 Whether to include the year and, if so, whether the year should be
3264 shown abbreviated, that is, with only 2 digits. Each is @code{true}
3265 or @code{false}; only values of @code{true} and @code{false},
3266 respectively, have been observed.
3269 @defvr {Attribute} showMonth
3270 @defvrx {Attribute} monthFormat
3271 Whether to include the month (@code{true} or @code{false}) and, if so,
3272 how to format it. @code{monthFormat} is one of the following:
3276 The full name of the month, e.g.@: in an English locale,
3280 The abbreviated name of the month, e.g.@: in an English locale,
3284 The number representing the month, e.g.@: 9 for September.
3287 A two-digit number representing the month, e.g.@: 09 for September.
Only values of @code{true} and @code{short}, respectively, have been observed.
3294 @defvr {Attribute} dayType
3295 This attribute is always @code{month} in the corpus, specifying that
3296 the day of the month is to be displayed; a value of @code{year} is
3297 supposed to indicate that the day of the year, where 1 is January 1,
3298 is to be displayed instead.
3301 @defvr {Attribute} hourFormat
3302 @code{hourFormat}, if present, is one of:
3306 The time is displayed with an @code{am} or @code{pm} suffix, e.g.@:
3310 The time is displayed in a 24-hour format, e.g.@: @code{22:15}.
3312 This is the only value observed in the corpus.
3315 The time is displayed in a 12-hour format, without distinguishing
morning or evening, e.g.@: @code{10:15}.
3319 @code{hourFormat} is sometimes present for @code{elapsedTime} formats,
3320 which is confusing since a time duration does not have a concept of AM
3321 or PM. This might indicate a bug in the code that generated the XML
3322 in the corpus, or it might indicate that @code{elapsedTime} is
3323 sometimes used to format a time of day.
3326 For a @code{baseFormat} of @code{date}, PSPP chooses a print format
3327 type based on the following rules:
3331 If @code{showQuarter} is true: @code{QYR}.
3334 Otherwise, if @code{showWeek} is true: @code{WKYR}.
3337 Otherwise, if @code{mdyOrder} is @code{dayMonthYear}:
3341 If @code{monthFormat} is @code{number} or @code{paddedNumber}: @code{EDATE}.
3344 Otherwise: @code{DATE}.
3348 Otherwise, if @code{mdyOrder} is @code{yearMonthDay}: @code{SDATE}.
3351 Otherwise, @code{ADATE}.
3354 For a @code{baseFormat} of @code{dateTime}, PSPP uses @code{YMDHMS} if
3355 @code{mdyOrder} is @code{yearMonthDay} and @code{DATETIME} otherwise.
3356 For a @code{baseFormat} of @code{time}, PSPP uses @code{DTIME} if
3357 @code{showDay} is true, otherwise @code{TIME} if @code{showHour} is
3358 true, otherwise @code{MTIME}.
3360 For a @code{baseFormat} of @code{date}, the chosen width is the
3361 minimum for the format type, adding 2 if @code{yearAbbreviation} is
3362 false or omitted. For other base formats, the chosen width is the
3363 minimum for its type, plus 3 if @code{showSecond} is true, plus 4 more
3364 if @code{showMillis} is also true. Decimals are 0 by default, or 3
3365 if @code{showMillis} is true.
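The selection rules above for @code{dateTimeFormat} can be summarized in a short sketch.  It is an illustration under stated assumptions rather than PSPP's implementation: attributes are taken to be a dictionary of strings with boolean values spelled @code{true} or @code{false}, the function returns the amount to add to the type's minimum width rather than the width itself (the minimum widths are not listed here), and decimals are computed only for the non-date base formats.  The function name is hypothetical.

```python
def date_time_print_format(attrs):
    """Return (type, extra_width, decimals) for dateTimeFormat attributes.

    extra_width is to be added to the minimum width of the chosen type,
    which the caller must look up separately.
    """
    def is_true(name):
        return attrs.get(name) == "true"

    base = attrs["baseFormat"]
    order = attrs.get("mdyOrder")

    if base == "date":
        if is_true("showQuarter"):
            fmt = "QYR"
        elif is_true("showWeek"):
            fmt = "WKYR"
        elif order == "dayMonthYear":
            numeric = attrs.get("monthFormat") in ("number", "paddedNumber")
            fmt = "EDATE" if numeric else "DATE"
        elif order == "yearMonthDay":
            fmt = "SDATE"
        else:
            fmt = "ADATE"
        # Two extra columns unless the year is abbreviated.
        extra = 0 if is_true("yearAbbreviation") else 2
        return fmt, extra, 0

    if base == "dateTime":
        fmt = "YMDHMS" if order == "yearMonthDay" else "DATETIME"
    elif is_true("showDay"):
        fmt = "DTIME"
    elif is_true("showHour"):
        fmt = "TIME"
    else:
        fmt = "MTIME"

    extra = 0
    if is_true("showSecond"):
        extra += 3
        if is_true("showMillis"):
            extra += 4
    # Assumption: decimals apply only to these non-date formats.
    decimals = 3 if is_true("showMillis") else 0
    return fmt, extra, decimals
```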
3367 @node SPV Detail elapsedTimeFormat Element
3368 @subsubsection The @code{elapsedTimeFormat} Element
3372 :baseFormat[dt_base_format]=(date | time | dateTime)
3375 :minutePadding=bool?
3376 :secondPadding=bool?
3386 This element specifies the way to display a time duration.
3388 Data to be formatted in elapsed time formats is stored as strings in
3389 legacy data, in the format @code{H:MM:SS.SSS}, with additional hour
3390 digits as needed for long durations, and must be parsed and
3391 reformatted by the reader.
3393 The following attribute is required.
3395 @defvr {Attribute} baseFormat
3396 Specifies whether a day and a time are both to be displayed, or just
3400 The remaining attributes specify exactly how to display the elapsed
3403 For @code{baseFormat} of @code{time}, PSPP converts this element to
3404 print format type @code{DTIME}; otherwise, if @code{showHour} is true,
3405 to @code{TIME}; otherwise, to @code{MTIME}. The chosen width is the
3406 minimum for the chosen type, adding 3 if @code{showSecond} is true,
3407 adding 4 more if @code{showMillis} is also true. Decimals are 0 by
3408 default, or 3 if @code{showMillis} is true.
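The conversion just described for @code{elapsedTimeFormat} can be sketched in the same style.  This follows the wording above literally and makes the same assumptions as before: string-valued attributes in a dictionary, and a result that gives the amount to add to the type's minimum width.  The function name is hypothetical.

```python
def elapsed_time_print_format(attrs):
    """Return (type, extra_width, decimals) for elapsedTimeFormat attributes."""
    def is_true(name):
        return attrs.get(name) == "true"

    if attrs.get("baseFormat") == "time":
        fmt = "DTIME"
    elif is_true("showHour"):
        fmt = "TIME"
    else:
        fmt = "MTIME"

    extra = 0
    if is_true("showSecond"):
        extra += 3          # room for the seconds
        if is_true("showMillis"):
            extra += 4      # room for the fractional seconds
    decimals = 3 if is_true("showMillis") else 0
    return fmt, extra, decimals
```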
3410 @node SPV Detail format Element
3411 @subsubsection The @code{format} Element
3415 :baseFormat[f_base_format]=(date | time | dateTime | elapsedTime)?
3418 :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
3423 :yearAbbreviation=bool?
3425 :monthFormat=(long | short | number | paddedNumber)?
3427 :dayOfMonthPadding=bool?
3431 :showDayOfWeek=bool?
3432 :dayOfWeekAbbreviation=bool?
3434 :minutePadding=bool?
3435 :secondPadding=bool?
3441 :dayType=(month | year)?
3442 :hourFormat=(AMPM | AS_24 | AS_12)?
3443 :minimumIntegerDigits=int?
3444 :maximumFractionDigits=int?
3445 :minimumFractionDigits=int?
3447 :scientific=(onlyForSmall | whenNeeded | true | false)?
3451 :tryStringsAsNumbers=bool?
3452 :negativesOutside=bool?
3456 This element is the union of all of the more-specific format elements.
3457 It is interpreted in the same way as one of those format elements,
3458 using @code{baseFormat} to determine which kind of format to use.
3460 There are a few attributes not present in the more specific formats:
3462 @defvr {Attribute} tryStringsAsNumbers
3463 When this is @code{true}, it is supposed to indicate that string
3464 values should be parsed as numbers and then displayed according to
3465 numeric formatting rules. However, in the corpus it is always
3469 @defvr {Attribute} negativesOutside
3470 If true, the negative sign should be shown before the prefix; if
3471 false, it should be shown after.
3474 @node SPV Detail affix Element
3475 @subsubsection The @code{affix} Element
3479 :definesReference=int
3480 :position=(subscript | superscript)
3486 This defines a suffix (or, theoretically, a prefix) for a formatted
3487 value. It is used to insert a reference to a footnote. It has the
3488 following attributes:
3490 @defvr {Attribute} definesReference
3491 This specifies the footnote number as a natural number: 1 for the
3492 first footnote, 2 for the second, and so on.
3495 @defvr {Attribute} position
3496 Position for the footnote label. Always @code{superscript}.
3499 @defvr {Attribute} suffix
3500 Whether the affix is a suffix (@code{true}) or a prefix
3501 (@code{false}). Always @code{true}.
3504 @defvr {Attribute} value
3505 The text of the suffix or prefix. Typically a letter, e.g.@: @code{a}
3506 for footnote 1, @code{b} for footnote 2, @enddots{} The corpus
3507 contains other values: @code{*}, @code{**}, and a few that begin with
3508 at least one comma: @code{,b}, @code{,c}, @code{,,b}, and @code{,,c}.
3511 @node SPV Detail interval Element
3512 @subsection The @code{interval} Element
3515 interval :style=ref style => labeling footnotes?
3519 :variable=ref (sourceVariable | derivedVariable)
3520 => (formatting | format | footnotes)*
3522 formatting :variable=ref (sourceVariable | derivedVariable) => formatMapping*
3524 formatMapping :from=int => format?
3528 :variable=ref (sourceVariable | derivedVariable)
3531 footnoteMapping :definesReference=int :from=int :to => EMPTY
3534 The @code{interval} element and its descendants determine the basic
3535 formatting and labeling for the table's cells. These basic styles are
3536 overridden by more specific styles set using @code{setCellProperties}
3537 (@pxref{SPV Detail setCellProperties Element}).
3539 The @code{style} attribute of @code{interval} itself may be ignored.
3541 The @code{labeling} element may have a single @code{formatting} child.
3542 If present, its @code{variable} attribute refers to a variable whose
values are format specifiers as numbers, e.g.@: value 0x050802 for F8.2.
3544 However, the numbers are not actually interpreted that way. Instead,
3545 each number actually present in the variable's data is mapped by a
3546 @code{formatMapping} child of @code{formatting} to a @code{format}
3547 that specifies how to display it.
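To make the two views of these numbers concrete, the sketch below first splits a packed specifier into its parts and then builds the lookup that a reader would actually use.  The byte layout (type code, width, decimals, one byte each) is inferred from the F8.2 example above.  The sketch also assumes that @code{from} is written in decimal and that the XML is parsed with Python's @code{xml.etree.ElementTree} without namespaces.  The function names and sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def split_format_spec(value):
    """Split a packed format specifier into (type code, width, decimals)."""
    value = int(value)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def format_mappings(formatting):
    """Map each formatMapping's integer `from` to its format child (or None)."""
    return {
        int(m.get("from")): m.find("format")
        for m in formatting.findall("formatMapping")
    }


# Hypothetical sample: 329730 is 0x050802 written in decimal.
sample = ET.fromstring(
    '<formatting variable="v">'
    '<formatMapping from="329730">'
    '<format maximumFractionDigits="2"/>'
    '</formatMapping>'
    '</formatting>'
)
mappings = format_mappings(sample)
```

A reader would look up each value of the referenced variable in @code{mappings} and apply the resulting @code{format}, rather than decoding the number itself.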
3549 The @code{labeling} element may also have a @code{footnotes} child
3550 element. The @code{variable} attribute of this element refers to a
3551 variable whose values are comma-delimited strings that list the
3552 1-based indexes of footnote references. (Cells without any footnote
3553 references are numeric 0 instead of strings.)
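A minimal sketch of decoding one value of such a footnotes variable follows.  It assumes the value has already been read as either a Python string or a number.  The function name is hypothetical.

```python
def footnote_indexes(value):
    """Return the 1-based footnote indexes referenced by one cell's value."""
    if not isinstance(value, str):
        # Numeric 0 marks a cell without any footnote references.
        return []
    return [int(part) for part in value.split(",") if part.strip()]
```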
3555 Each @code{footnoteMapping} child of the @code{footnotes} element
3556 defines the footnote marker to be its @code{to} attribute text for the
footnote whose 1-based index is given in its @code{definesReference} attribute.
3560 @node SPV Detail style Element
3561 @subsection The @code{style} Element
3568 :border-bottom=(solid | thick | thin | double | none)?
3569 :border-top=(solid | thick | thin | double | none)?
3570 :border-left=(solid | thick | thin | double | none)?
3571 :border-right=(solid | thick | thin | double | none)?
3572 :border-bottom-color?
3575 :border-right-color?
3578 :font-weight=(regular | bold)?
3579 :font-style=(regular | italic)?
3580 :font-underline=(none | underline)?
3581 :margin-bottom=dimension?
3582 :margin-left=dimension?
3583 :margin-right=dimension?
3584 :margin-top=dimension?
3585 :textAlignment=(left | right | center | decimal | mixed)?
3586 :labelLocationHorizontal=(positive | negative | center)?
3587 :labelLocationVertical=(positive | negative | center)?
3588 :decimal-offset=dimension?
3595 A @code{style} element has an effect only when it is referenced by
3596 another element to set some aspect of the table's style. Most of the
3597 attributes are self-explanatory. The rest are described below.
3599 @defvr {Attribute} {color}
3600 In some cases, the text color; in others, the background color.
3603 @defvr {Attribute} {color2}
3607 @defvr {Attribute} {labelAngle}
3608 Normally 0. The value -90 causes inner column or outer row labels to
3609 be rotated vertically.
3612 @defvr {Attribute} {labelLocationHorizontal}
3616 @defvr {Attribute} {labelLocationVertical}
3617 The value @code{positive} corresponds to vertically aligning text to
3618 the top of a cell, @code{negative} to the bottom, @code{center} to the
3622 @node SPV Detail labelFrame Element
3623 @subsection The @code{labelFrame} Element
3626 labelFrame :style=ref style => location+ label? paragraph?
3628 paragraph :hangingIndent=dimension? => EMPTY
3631 A @code{labelFrame} element specifies content and style for some
3632 aspect of a table. Only @code{labelFrame} elements that have a
3633 @code{label} child are important. The @code{purpose} attribute in the
3634 @code{label} determines what the @code{labelFrame} affects:
3638 The table's title and its style.
3641 The table's caption and its style.
3644 The table's footnotes and the style for the footer area.
3647 The style for the layer area.
3653 The @code{style} attribute references the style to use for the area.
3655 The @code{label}, if present, specifies the text to put into the title
3656 or caption or footnotes. For footnotes, the label has two @code{text}
3657 children for every footnote, each of which has a @code{usesReference}
3658 attribute identifying the 1-based index of a footnote. The first,
3659 third, fifth, @dots{} @code{text} child specifies the content for a
3660 footnote; the second, fourth, sixth, @dots{} child specifies the
3661 marker. Content tends to end in a new-line, which the reader may wish
3662 to trim; similarly, markers tend to end in @samp{.}.
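The pairing of content and marker @code{text} children might be handled along these lines.  This is a sketch only: it assumes the caller has already extracted each @code{text} child as a pair of its @code{usesReference} attribute and its text, in document order.  The function name and the sample strings are hypothetical.

```python
def pair_footnotes(texts):
    """Pair up a footnotes label's text children.

    texts is a list of (usesReference, text) tuples in document order.
    Returns {1-based footnote index: (marker, content)}.
    """
    footnotes = {}
    # Even positions hold content, odd positions hold the matching marker.
    for (ref, content), (_, marker) in zip(texts[0::2], texts[1::2]):
        content = content.rstrip("\n")   # drop the trailing new-line
        if marker.endswith("."):
            marker = marker[:-1]         # drop the trailing period
        footnotes[int(ref)] = (marker, content)
    return footnotes


# Hypothetical sample input.
example = pair_footnotes([
    ("1", "Not significant.\n"), ("1", "a."),
    ("2", "Estimated.\n"), ("2", "b."),
])
```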
3664 The @code{paragraph}, if present, may be ignored, since it is always
3667 @node SPV Detail Legacy Properties
3668 @subsection Legacy Properties
3670 The detail XML format has features for styling most of the aspects of
3671 a table. It also inherits defaults for many aspects from structure
3672 XML, which has the following @code{tableProperties} element:
3677 => generalProperties footnoteProperties cellFormatProperties borderProperties printingProperties
3680 :hideEmptyRows=bool?
3681 :maximumColumnWidth=dimension?
3682 :maximumRowWidth=dimension?
3683 :minimumColumnWidth=dimension?
3684 :minimumRowWidth=dimension?
3685 :rowDimensionLabels=(inCorner | nested)?
3689 :markerPosition=(superscript | subscript)?
3690 :numberFormat=(alphabetic | numeric)?
3693 cellFormatProperties => cell_style+
3696 :alternatingColor=color?
3697 :alternatingTextColor=color?
3705 :font-style=(regular | italic)?
3706 :font-weight=(regular | bold)?
3707 :font-underline=(none | underline)?
3708 :labelLocationVertical=(positive | negative | center)?
3709 :margin-bottom=dimension?
3710 :margin-left=dimension?
3711 :margin-right=dimension?
3712 :margin-top=dimension?
3713 :textAlignment=(left | right | center | decimal | mixed)?
3714 :decimal-offset=dimension?
3717 borderProperties => border_style+
3720 :borderStyleType=(none | solid | dashed | thick | thin | double)?
3725 :printAllLayers=bool?
3726 :rescaleLongTableToFitPage=bool?
3727 :rescaleWideTableToFitPage=bool?
3728 :windowOrphanLines=int?
3730 :continuationTextAtBottom=bool?
3731 :continuationTextAtTop=bool?
3732 :printEachLayerOnSeparatePage=bool?
3736 The @code{name} attribute appears only in standalone @file{.stt} files
3737 (@pxref{SPSS TableLook STT Format}).