@node SPV Structure Member Format
@section Structure Member Format
+A structure member lays out the high-level structure for a group of
+output items such as headings, tables, and charts. Structure members
+do not include the details of tables and charts but instead refer to
+them by their member names.
+
Structure members' XML files claim conformance with a collection of
XML Schemas. These schemas are distributed, under a nonfree license,
with SPSS binaries. Fortunately, the schemas are not necessary to
-understand the structure members. To a degree, the schemas can even
+understand the structure members. The schemas can even
be deceptive because they document elements and attributes that are
not in the corpus and do not document elements and attributes that are
-commonly found there.
+commonly found in the corpus.
Structure members use a different XML namespace for each schema, but
these namespaces are not entirely consistent. In some SPV files, for
@center @image{dev/spv-structure, 5in}
@end iftex
+The following example shows the contents of a typical structure member
+for a @cmd{DESCRIPTIVES} procedure. A real structure member would not
+have indentation. This example also omits most attributes, all XML
+namespace information, and the CSS from the embedded HTML:
+
+@example
+<?xml version="1.0" encoding="utf-8"?>
+<heading>
+ <label>Output</label>
+ <heading commandName="Descriptives">
+ <label>Descriptives</label>
+ <container>
+ <label>Title</label>
+ <text commandName="Descriptives" type="title">
+ <html lang="en">
+<![CDATA[<head><style type="text/css">...</style></head><BR>Descriptives]]>
+ </html>
+ </text>
+ </container>
+ <container visibility="hidden">
+ <label>Notes</label>
+ <table commandName="Descriptives" subType="Notes" type="note">
+ <tableStructure>
+ <dataPath>00000000001_lightNotesData.bin</dataPath>
+ </tableStructure>
+ </table>
+ </container>
+ <container>
+ <label>Descriptive Statistics</label>
+ <table commandName="Descriptives" subType="Descriptive Statistics"
+ type="table">
+ <tableStructure>
+ <dataPath>00000000002_lightTableData.bin</dataPath>
+ </tableStructure>
+ </table>
+ </container>
+ </heading>
+</heading>
+@end example
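
As a rough illustration of how such an outline might be consumed, the
following Python sketch walks a cut-down copy of the example and lists,
for each container that holds a table, its label, whether it is hidden,
and the member name given in @code{dataPath}. It uses only the standard
@code{xml.etree.ElementTree} module. The name @code{list_tables} is
hypothetical, and a real structure member would also need XML namespace
handling, which the example omits.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the example above (no namespaces, few attributes).
EXAMPLE = """\
<heading>
  <label>Output</label>
  <heading commandName="Descriptives">
    <label>Descriptives</label>
    <container visibility="hidden">
      <label>Notes</label>
      <table commandName="Descriptives" subType="Notes" type="note">
        <tableStructure>
          <dataPath>00000000001_lightNotesData.bin</dataPath>
        </tableStructure>
      </table>
    </container>
    <container>
      <label>Descriptive Statistics</label>
      <table commandName="Descriptives" subType="Descriptive Statistics"
             type="table">
        <tableStructure>
          <dataPath>00000000002_lightTableData.bin</dataPath>
        </tableStructure>
      </table>
    </container>
  </heading>
</heading>
"""


def list_tables(xml_text):
    """Return (label, hidden, data_path) for each container holding a table."""
    root = ET.fromstring(xml_text)
    result = []
    for container in root.iter("container"):
        path = container.findtext("table/tableStructure/dataPath")
        if path is None:
            continue  # e.g. a container that holds text rather than a table
        label = container.findtext("label")
        hidden = container.get("visibility") == "hidden"
        result.append((label, hidden, path))
    return result
```

Calling @code{list_tables(EXAMPLE)} yields one tuple per table, in
document order, with the hidden Notes table first.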
+
@menu
* SPV Structure heading Element::
* SPV Structure label Element::
Contents: CDATA
The CDATA contains an HTML document. In some cases, the document
-starts with @code{<html>} and ends with @code{</html}; in others the
+starts with @code{<html>} and ends with @code{</html>}; in others the
@code{html} element is implied. Generally the HTML includes a
@code{head} element with a CSS stylesheet. The HTML body often begins
with @code{<BR>}. The actual content ranges from trivial to simple:
@node SPV Structure @code{text} Element (Inside @code{pageParagraph})
@subsection The @code{text} Element (Inside @code{pageParagraph})
-Parent: @code{pageParagraph}
+Parent: @code{pageParagraph} @*
Contents: CDATA?
This @code{text} element is nested inside a @code{pageParagraph}. There
text: in the corpus, either an @code{html} or @code{p} element. It is
@emph{almost}-XHTML because the @code{html} element designates the
default namespace as
-@code{http://xml.spss.com/spss/viewer/viewer-tree} instead of an XHTML
+@indicateurl{http://xml.spss.com/spss/viewer/viewer-tree} instead of an XHTML
namespace, and because the CDATA can contain substitution variables:
@code{&[Page]} for the page number and @code{&[PageTitle]} for the
page title.
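
A reader that renders this text has to expand the substitution
variables itself. The following Python sketch shows one possible way to
do that on the CDATA string. The function name
@code{substitute_page_vars} and its parameters are hypothetical, and
any other @code{&[...]} sequence is left untouched because only these
two variables are described here.

```python
import re

# Matches exactly the two substitution variables described above.
_PAGE_VARS = re.compile(r"&\[(Page|PageTitle)\]")


def substitute_page_vars(cdata, page, page_title):
    """Expand &[Page] and &[PageTitle] in a single pass over the text."""
    values = {"Page": str(page), "PageTitle": page_title}
    # A single regex pass avoids re-expanding text that came from page_title.
    return _PAGE_VARS.sub(lambda m: values[m.group(1)], cdata)
```

Using one pass rather than two chained @code{str.replace} calls means
that a page title that happens to contain @code{&[Page]} is not expanded
a second time.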
@cartouche
@format
Dimensions @result{} int[@t{n-dims}] Dimension*[@t{n-dims}]
-Dimension @result{} Value[@t{name}] DimUnknown int[@t{n-categories}] Category*[@t{n-categories}]
-DimUnknown @result{}
+Dimension @result{} Value[@t{name}] DimProperties int[@t{n-categories}] Category*[@t{n-categories}]
+DimProperties @result{}
byte[@t{d1}]
(00 @math{|} 01 @math{|} 02)[@t{d2}]
(i0 @math{|} i2)[@t{d3}]
- bool[@t{d4}]
- bool[@t{d5}]
- 01
- int[@t{d6}]
+ bool[@t{show-dim-label}]
+ bool[@t{hide-all-labels}]
+ 01 int[@t{dim-index}]
@end format
@end cartouche
@code{name} is the name of the dimension, e.g. @code{Variables},
@code{Statistics}, or a variable name.
+The meanings of @code{d1}, @code{d2}, and @code{d3} are unknown.
@code{d1} is usually 0 but many other values have been observed.
-@code{d3} is 2 over 99% of the time.
+If @code{show-dim-label} is 01, the pivot table displays a label for
+the dimension itself. It is usually 00, because the group and category
+labels normally provide enough explanation.
-@code{d5} is 0 over 99% of the time.
+If @code{hide-all-labels} is 01, the pivot table omits all labels for
+the dimension, including group and category labels. It is usually 00.
+When @code{hide-all-labels} is 01, @code{show-dim-label} is ignored.
-@code{d6} is either -1 or the 0-based index of the dimension, e.g.@: 0
-for the first dimension, 1 for the second, and so on. The latter is
-the case 98% of the time in the corpus.
+@code{dim-index} is usually the 0-based index of the dimension, e.g.@:
+0 for the first dimension, 1 for the second, and so on. Sometimes it
+is -1. There is no visible difference.
@node SPV Light Member Categories
@subsection Categories
@cartouche
@format
Category @result{} Value[@t{name}] (Leaf @math{|} Group)
-Leaf @result{} 00 00 00 i2 int[@t{index}] i0
+Leaf @result{} 00 00 00 i2 int[@t{cat-index}] i0
Group @result{}
bool[@t{merge}] 00 01 (i0 @math{|} i2)[@t{data}]
i-1 int[@t{n-subcategories}] Category*[@t{n-subcategories}]
@code{name} is the name of the category (or group).
-A Leaf represents a leaf category. The Leaf's @code{index} is a
+A Leaf represents a leaf category. The Leaf's @code{cat-index} is a
nonnegative integer less than @code{n-categories} in the Dimension in
-which the Category is nested (directly or indirectly).
+which the Category is nested (directly or indirectly). The
+@code{cat-index} values record the original order of the categories;
+if the user sorted or rearranged the categories, then the order of
+categories in the file reflects that without changing the
+@code{cat-index} values.
-A Group represents a Group of nested categories. Usually a Group
-contains at least one Category, so that @code{n-subcategories} is
-positive, but a few Groups with @code{n-subcategories} 0 has been
-observed.
+A Group is a group of nested categories. Usually a Group contains at
+least one Category, so that @code{n-subcategories} is positive, but a
+few Groups with @code{n-subcategories} 0 have been observed.
If a Group's @code{merge} is 00, the most common value, then the group
is really a distinct group that should be represented as such in the
Data @result{}
int[@t{layers}] int[@t{rows}] int[@t{columns}] int*[@t{n-dimensions}]
int[@t{n-data}] Datum*[@t{n-data}]
-Datum @result{} int64[@t{index}] v3(00?) Value
+Datum @result{} int64[@t{index}] v1(00?) Value
@end format
@end cartouche
-The values of @code{layers}, @code{rows}, and @code{columns} each
-specifies the number of dimensions displayed in layers, rows, and
+The values of @code{n-layers}, @code{n-rows}, and @code{n-columns}
+each specify the number of dimensions displayed in layers, rows, and
columns, respectively. Any of them may be zero. Their values sum to
@code{n-dimensions} from Dimensions (@pxref{SPV Light Member
Dimensions}).
The @code{n-dimensions} integers are a permutation of the 0-based
-dimension numbers. The first @code{layers} integers specify each of
-the dimensions represented by layers, the next @code{rows} integers
+dimension numbers. The first @code{n-layers} integers specify each of
+the dimensions represented by layers, the next @code{n-rows} integers
specify the dimensions represented by rows, and the final
-@code{columns} integers specify the dimensions represented by columns.
-When there is more than one dimension of a given kind, the inner
-dimensions are given first.
+@code{n-columns} integers specify the dimensions represented by
+columns. When there is more than one dimension of a given kind, the
+inner dimensions are given first.
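
The split described above can be sketched in Python as follows. This is
only an illustration: the function name @code{split_axes} and its
argument names are hypothetical, and the list of integers is assumed to
have been read from the file already.

```python
def split_axes(n_layers, n_rows, n_columns, dims):
    """Split the dimension permutation into (layers, rows, columns).

    Within each returned list, inner dimensions come first, matching
    the order in which the file gives them.
    """
    if n_layers + n_rows + n_columns != len(dims):
        raise ValueError("axis counts do not sum to the number of dimensions")
    if sorted(dims) != list(range(len(dims))):
        raise ValueError("not a permutation of the 0-based dimension numbers")
    layers = dims[:n_layers]
    rows = dims[n_layers:n_layers + n_rows]
    columns = dims[n_layers + n_rows:]
    return layers, rows, columns
```

For instance, with no layer dimensions, two row dimensions, and one
column dimension, the first two integers name the row dimensions and
the last one names the column dimension.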
The format of a Datum varies slightly from version 1 to version 3: in
version 1 it allows for an extra optional 00 byte.
For example, suppose there are 3 dimensions with 3, 4, and 5
categories, respectively. The datum at coordinates (1, 2, 3) has
index @math{5 \times (4 \times (3 \times 0 + 1) + 2) + 3 = 33}.
+Within a given dimension, the index is the @code{cat-index} in a Leaf.
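
The arithmetic in this example generalizes to any number of dimensions.
The following Python sketch is one way to write it, together with the
inverse mapping. The function names @code{datum_index} and
@code{datum_coords} are hypothetical, and each coordinate is assumed to
be the @code{cat-index} of a Leaf in the corresponding dimension.

```python
def datum_index(coords, n_categories):
    """Combine per-dimension category indexes into a single Datum index."""
    if len(coords) != len(n_categories):
        raise ValueError("one coordinate is needed per dimension")
    index = 0
    for coord, n in zip(coords, n_categories):
        if not 0 <= coord < n:
            raise ValueError("coordinate out of range for its dimension")
        # Shift the running index by this dimension's size, then add.
        index = index * n + coord
    return index


def datum_coords(index, n_categories):
    """Invert datum_index: recover the per-dimension category indexes."""
    coords = []
    for n in reversed(n_categories):
        index, coord = divmod(index, n)
        coords.append(coord)
    coords.reverse()
    return coords
```

With category counts 3, 4, and 5, @code{datum_index([1, 2, 3], [3, 4, 5])}
reproduces the value 33 computed above, and
@code{datum_coords(33, [3, 4, 5])} returns the original coordinates.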
@node SPV Light Member Value
@subsection Value
* SPV Detail coordinates Element::
* SPV Detail faceting Element::
* SPV Detail facetLayout Element::
+* SPV Detail style Element::
@end menu
@node SPV Detail visualization Element
@defvr {Required} style
The @code{id} of a @code{style} element (@pxref{SPV Detail style
-element}). This is the base style for the entire pivot table. In
+Element}). This is the base style for the entire pivot table. In
every example in the corpus, the value is @code{visualizationStyle}
and the corresponding @code{style} element has no attributes other
than @code{id}.
@defvr {Required} cellStyle
@defvrx {Required} style
Each of these is the @code{id} of a @code{style} element (@pxref{SPV
-Detail style element}). The former is the default style for
+Detail style Element}). The former is the default style for
individual cells, the latter for the entire table.
@end defvr
@defvrx {Optional} useGrouping
The syntax and meaning of these attributes is the same as on the
@code{format} element for a numeric format. @pxref{SPV Detail format
-element}.
+Element}.
@end defvr
@node SPV Detail stringFormat Element
e.g.@: @code{0} or @code{13;14;15;16}.
@end defvr
-@subsubheading The @code{intersectWhere}
+@subsubheading The @code{intersectWhere} Element
Parent: @code{intersect} @*
Contents: empty
the corpus they always take the values @code{dimension2categories} and
@code{dimension0categories}, respectively.
@end defvr
+
+@node SPV Detail style Element
+@subsection The @code{style} Element
+
+TBD.