Add support for reading and writing SPV files.

author Ben Pfaff <blp@cs.stanford.edu>

Sun, 30 Dec 2018 05:15:43 +0000 (21:15 -0800)

committer Ben Pfaff <blp@cs.stanford.edu>

Sun, 10 Feb 2019 00:02:28 +0000 (16:02 -0800)
diff --git a/doc/dev/spv-file-format.texi b/doc/dev/spv-file-format.texi

index 379f6988f7238675fbe959ef871e3949c8633fca..bf6356691ebbb5187ba702063411f348dcca5d42 100644 (file)
--- a/doc/dev/spv-file-format.texi
+++ b/doc/dev/spv-file-format.texi
@@ -4,23 +4,22 @@
  SPSS Viewer or @file{.spv} files, here called SPV files, are written
  by SPSS 16 and later to represent the contents of its output editor.
  This chapter documents the format, based on examination of a corpus of
-about 500 files from a variety of sources.  This description is
-detailed enough to read SPV files, but probably not enough to write
-them.
+about 3,000 files from a variety of sources.  This description is
+detailed enough to both read and write SPV files.
  
-SPSS 15 and earlier versions use a completely different output format
-based on the Microsoft Compound Document Format.  This format is not
-documented here.
+SPSS 15 and earlier versions instead use @file{.spo} files, which have
+a completely different output format based on the Microsoft Compound
+Document Format.  This format is not documented here.
  
  An SPV file is a Zip archive that can be read with @command{zipinfo}
  and @command{unzip} and similar programs.  The final member in the Zip
-archive is a file named @file{META-INF/MANIFEST.MF}.  This structure
-makes SPV files resemble Java ``JAR'' files (and ODF files), but
-whereas a JAR manifest contains a sequence of colon-delimited
-key/value pairs, an SPV manifest contains the string
-@samp{allowPivoting=true}, without a new-line.  (This string may be
-the best way to identify an SPV file; it is invariant across the
-corpus.)
+archive is the @dfn{manifest}, a file named
+@file{META-INF/MANIFEST.MF}.  This structure makes SPV files resemble
+Java ``JAR'' files (and ODF files), but whereas a JAR manifest
+contains a sequence of colon-delimited key/value pairs, an SPV
+manifest contains the string @samp{allowPivoting=true}, without a
+new-line.  PSPP uses this string to identify an SPV file; it is
+invariant across the corpus.
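The identification rule described above can be sketched in Python as follows. This is only an illustration, not PSPP's implementation: the function name `looks_like_spv` is hypothetical, and treating an exact match of the manifest's contents as the test is an assumption drawn from the wording "without a new-line".

```python
import io
import zipfile

MANIFEST_NAME = "META-INF/MANIFEST.MF"
MANIFEST_MAGIC = b"allowPivoting=true"  # no trailing new-line, per the text


def looks_like_spv(source):
    """Return True if `source` (a path or a binary file object) is a Zip
    archive whose manifest holds exactly the SPV marker string.

    Hypothetical helper; the exact-match comparison is an assumption.
    """
    try:
        with zipfile.ZipFile(source) as archive:
            # ZipFile.read() raises KeyError if the member is missing.
            return archive.read(MANIFEST_NAME) == MANIFEST_MAGIC
    except (zipfile.BadZipFile, KeyError, OSError):
        return False
```

A JAR or ODF file also has a `META-INF/MANIFEST.MF` member, which is why the sketch compares the manifest's contents rather than only checking that the member exists.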
  
  The rest of the members in an SPV file's Zip archive fall into two
  categories: @dfn{structure} and @dfn{detail} members.  Structure
@@ -70,6 +69,9 @@ tending to skip values.  Older SPV files use different naming
  conventions.  Structure members refer to detail members by name, and so
  their exact names do not matter to readers as long as they are unique.
  
+SPSS tolerates corrupted Zip archives that Zip reader libraries tend
+to reject.  These can be fixed up with @command{zip -FF}.
+
  @menu
  * SPV Structure Member Format::
  * SPV Light Detail Member Format::
@@ -109,101 +111,113 @@ structure member's root is @code{heading} element, which contains
  tree.  In turn, @code{container} holds a @code{label} and one more
  child, usually @code{text} or @code{table}.
  
-The following diagram shows the hierarchy within an SPV structure
-member more precisely.  Names represent elements and <text> and
-<cdata> represent plain text and CDATA, respectively.  Edges point
-from parent to child.  Unlabeled edges indicate that the child appears
-exactly once; edges labeled with *, zero or more times; edges labeled
-with ?, zero or one times.  Where possible, child elements are shown
-in the order they actually appear within a parent element.
+The following sections document the elements found in structure
+members in a context-free grammar-like fashion.  Consider the
+following example, which specifies the attributes and content for the
+@code{container} element:
  
  @example
-@group
-heading
-  +--?--> pageSetup
-  |         +--> pageHeader +--> pageParagraph --> text --> <cdata>
-  |         +--> pageFooter +--> pageParagraph --> text --> <cdata>
-  +-----> label --?--> <text>
-  +--*--> heading
-  +--*--> container
-             +-----> label --?--> <text>
-             +--?--> text ---> html --> <cdata>
-             +--?--> table
-             |         +--?-- tableProperties
-             |         |        +--> generalProperties
-             |         |        +--> footnoteProperties
-             |         |        +--> cellFormatProperties
-             |         |        |      +--> caption -------> style
-             |         |        |      +--> footnotes -----> style
-             |         |        |      +--> rowLabelse ----> style
-             |         |        |      +--> columnLabels --> style
-             |         |        |      +--> data ----------> style
-             |         |        |      +--> layers --------> style
-             |         |        |      +--> title ---------> style
-             |         |        |      +--> cornerLabels --> style
-             |         |        +--> borderProperties
-             |         |        |      +--> topInnerFrame
-             |         |        |      +--> rightInnerFrame
-             |         |        |      +--> horizontalDimensionBorderColumns
-             |         |        |      +--> horizontalDimensionBorderRows
-             |         |        |      +--> horizontalCategoryBorderColumns
-             |         |        |      +--> leftInnerFrame
-             |         |        |      +--> verticalDimensionBorderRows
-             |         |        |      +--> titleLayerSeparator
-             |         |        |      +--> verticalCategoryBorderRows
-             |         |        |      +--> topOuterFrame
-             |         |        |      +--> bottomInnerFrame
-             |         |        |      +--> leftOuterFrame
-             |         |        |      +--> dataAreaTop
-             |         |        |      +--> verticalDimensionBorderColumns
-             |         |        |      +--> dataAreaLeft
-             |         |        |      +--> horizontalCategoryBorderRows
-             |         |        |      +--> bottomOuterFrame
-             |         |        |      +--> rightOuterFrame
-             |         |        |      +--> verticalCategoryBorderColumns
-             |         |        +--> printingProperties
-             |         +----- tableStructure
-             |                  +--?--> path ------> <text>
-             |                  +-----> dataPath --> <text>
-             +--?--> graph
-             |         +--?--> dataPath --> <text>
-             |         +-----> path ------> <text>
-             +--?--> model
-                       +--?--> ViZml --> <text>
-                       +--?--> path ---> <text>
-                       +--?--> pmmlContainerPath ---> <text>
-                       +--?--> statsContainerPath --> <text>
-@end group
-@end example
-
-The elements found in structure members are documented below.  For
-each element, we note the possible parent elements and the element's
-contents.  The contents are specified as pseudo-regular expressions
-with the following conventions:
+container
+   :visibility=(visible | hidden)
+   :page-break-before=(always)?
+   :text-align=(left | center)?
+   :width=dimension
+=> label (table | container_text | graph | model | object | image)
+@end example
  
-@table @asis
-@item text
-XML text content.
+Each attribute specification begins with @samp{:} followed by the
+attribute's name.  If the attribute's value has an easily specified
+form, then @samp{=} and its description follows the name.  Finally, if
+the attribute is optional, the specification ends with @samp{?}.  The
+following value specifications are defined:
  
-@item CDATA
-XML CDATA content.
+@table @code
+@item (@var{a} | @var{b} | @dots{})
+One of the listed literal strings.  If only one string is listed, it
+is the only acceptable value.  If @code{OTHER} is listed, then any
+string not explicitly listed is also accepted.
  
-@item @code{element}
-The named element.
+@item bool
+Either @code{true} or @code{false}.
  
-@item (@dots{})
-Grouping multiple elements.
+@item dimension
+A floating-point number followed by a unit, e.g.@: @code{10pt}.  Units
+in the corpus include @code{in} (inch), @code{pt} (points, 72/inch),
+@code{px} (``device-independent pixels'', 96/inch), and @code{cm}.
+The corpus also contains localized names for units: @code{인치} for
+inch, @code{пт} for points, and @code{см} for centimeters.  If the
+unit is omitted then points should be assumed.  The number and unit
+may be separated by white space.
  
-@item [@var{x}]
-An optional @var{x}.
+@item real
+A floating-point number.
  
-@item @var{a} @math{|} @var{b}
-A choice between @var{a} and @var{b}.
+@item int
+An integer.
+
+@item color
+A color in one of the forms @code{#@var{rr}@var{gg}@var{bb}} or
+@code{@var{rr}@var{gg}@var{bb}}, or the string @code{transparent}, or
+one of the standard Web color names.
+
+@item ref
+@itemx ref @var{element}
+@itemx ref(@var{elem1} | @var{elem2} | @dots{})
+The name from the @code{id} attribute in some element.  If one or more
+elements are named, the name must refer to one of those elements,
+otherwise any element is acceptable.
+@end table
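As an illustration of the `dimension` value described in the table above, here is a minimal Python sketch that converts a dimension string to points. The name `parse_dimension` and the lookup table are hypothetical and not taken from PSPP. The conversion factors follow the text (72 points and 96 px per inch), with 2.54 cm per inch as the standard value. Locale-specific number formats, such as a decimal comma, are not handled.

```python
import re

# Points per unit.  The localized spellings are the ones the text
# reports from the corpus.
POINTS_PER_UNIT = {
    "in": 72.0,
    "pt": 1.0,
    "px": 72.0 / 96.0,
    "cm": 72.0 / 2.54,
    "인치": 72.0,        # inch
    "пт": 1.0,          # points
    "см": 72.0 / 2.54,  # centimeters
}

# A floating-point number, optional white space, then an optional unit.
_DIMENSION = re.compile(
    r"\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)\s*(\S*)\s*$"
)


def parse_dimension(text):
    """Return the length in points described by `text` (hypothetical helper)."""
    match = _DIMENSION.match(text)
    if match is None:
        raise ValueError(f"not a dimension: {text!r}")
    number, unit = match.groups()
    if unit == "":
        factor = 1.0  # a missing unit means points
    elif unit in POINTS_PER_UNIT:
        factor = POINTS_PER_UNIT[unit]
    else:
        raise ValueError(f"unknown unit in dimension: {text!r}")
    return float(number) * factor
```

For example, `parse_dimension("0.25in")` yields 18.0 points and `parse_dimension("1097px")` yields 822.75 points.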
+
+All elements have an optional @code{id} attribute.  If present, its
+value must be unique.  In practice many elements are assigned
+@code{id} attributes that are never referenced.
+
+The content specification for an element supports the following
+syntax:
+
+@table @code
+@item @var{element}
+An element.
+
+@item @var{a} @var{b}
+@var{a} followed by @var{b}.
+
+@item @var{a} | @var{b} | @var{c}
+One of @var{a} or @var{b} or @var{c}.
+
+@item @var{a}?
+Zero or one instances of @var{a}.
  
-@item @var{x}*
-Zero or more @var{x}.
+@item @var{a}*
+Zero or more instances of @var{a}.
+
+@item @var{a}+
+One or more instances of @var{a}.
+
+@item (@var{subexpression})
+Grouping for a subexpression.
+
+@item EMPTY
+No content.
+
+@item TEXT
+Text and CDATA.
  @end table
  
+Element and attribute names are sometimes suffixed by another name in
+square brackets to distinguish different uses of the same name.  For
+example, structure XML has two @code{text} elements, one inside
+@code{container}, the other inside @code{pageParagraph}.  The former
+is defined as @code{text[container_text]} and referenced as
+@code{container_text}, the latter defined as
+@code{text[pageParagraph_text]} and referenced as
+@code{pageParagraph_text}.
+
+This language is used in the PSPP source code for parsing structure
+and detail XML members.  Refer to
+@file{src/output/spv/structure-xml.grammar} and
+@file{src/output/spv/detail-xml.grammar} for the full grammars.
+
  The following example shows the contents of a typical structure member
  for a @cmd{DESCRIPTIVES} procedure.  A real structure member is not
  indented.  This example also omits most attributes, all XML namespace
@@ -251,21 +265,33 @@ information, and the CSS from the embedded HTML:
  * SPV Structure text Element (Inside @code{container})::
  * SPV Structure html Element::
  * SPV Structure table Element::
-* SPV Structure tableStructure Element::
  * SPV Structure graph Element::
  * SPV Structure model Element::
  * SPV Structure dataPath and path Elements::
  * SPV Structure pageSetup Element::
-* SPV Structure pageHeader and pageFooter Elements::
-* SPV Structure pageParagraph Element::
  * SPV Structure @code{text} Element (Inside @code{pageParagraph})::
  @end menu
  
  @node SPV Structure heading Element
  @subsection The @code{heading} Element
  
-Parent: Document root or @code{heading} @*
-Contents: @code{pageSetup}? @code{label} (@code{container} @math{|} @code{heading})*
+@example
+heading[root_heading]
+   :creator-version?
+   :creator?
+   :creation-date-time?
+   :lockReader=bool?
+   :schemaLocation?
+=> label pageSetup? (container | heading)*
+
+heading
+   :creator-version?
+   :commandName?
+   :visibility[heading_visibility]=(collapsed)?
+   :locale?
+   :olang?
+=> label (container | heading)*
+@end example
  
  The root of a structure member is a @code{heading}, which represents a
  section of output beginning with a title (the @code{label}) and
@@ -274,13 +300,13 @@ ordinarily followed by content containers or further nested
  common document formats, which precede the content that they head,
  @code{heading} contains the elements that appear below the heading.
  
-The document root heading, only, may also contain a @code{pageSetup}
+The document root heading, only, may contain a @code{pageSetup}
  element.
  
  The following attributes have been observed on both document root and
  nested @code{heading} elements.
  
-@defvr {Optional} creator-version
+@defvr {Attribute} creator-version
  The version of the software that created this SPV file.  A string of
  the form @code{xxyyzzww} represents software version xx.yy.zz.ww,
  e.g.@: @code{21000001} is version 21.0.0.1.  Trailing pairs of zeros
@@ -293,25 +319,25 @@ three of those forms).
  The following attributes have been observed on document root
  @code{heading} elements only:
  
-@defvr {Optional} @code{creator}
+@defvr {Attribute} @code{creator}
  The directory in the file system of the software that created this SPV
  file.
  @end defvr
  
-@defvr {Optional} @code{creation-date-time}
+@defvr {Attribute} @code{creation-date-time}
  The date and time at which the SPV file was written, in a
  locale-specific format, e.g.@: @code{Friday, May 16, 2014 6:47:37 PM
  PDT} or @code{lunedì 17 marzo 2014 3.15.48 CET} or even @code{Friday,
  December 5, 2014 5:00:19 o'clock PM EST}.
  @end defvr
  
-@defvr {Optional} @code{lockReader}
+@defvr {Attribute} @code{lockReader}
  Whether a reader should be allowed to edit the output.  The possible
  values are @code{true} and @code{false}, but the corpus only contains
  @code{false}.
  @end defvr
  
-@defvr {Optional} @code{schemaLocation}
+@defvr {Attribute} @code{schemaLocation}
  This is actually an XML Namespace attribute.  A reader may ignore it.
  @end defvr
  
@@ -319,23 +345,22 @@ This is actually an XML Namespace attribute.  A reader may ignore it.
  The following attributes have been observed only on nested
  @code{heading} elements:
  
-@defvr {Required} @code{commandName}
-The locale-invariant name of the command that produced the output,
-e.g.@: @code{Frequencies}, @code{T-Test}, @code{Non Par Corr}.
+@defvr {Attribute} @code{commandName}
+A locale-invariant identifier for the command that produced the
+output, e.g.@: @code{Frequencies}, @code{T-Test}, @code{Non Par Corr}.
  @end defvr
  
-@defvr {Optional} @code{visibility}
-To what degree the output represented by the element is visible.  The
-only observed value is @code{collapsed}.
+@defvr {Attribute} @code{visibility}
+To what degree the output represented by the element is visible.
  @end defvr
  
-@defvr {Optional} @code{locale}
+@defvr {Attribute} @code{locale}
  The locale used for output, in Windows format, which is similar to the
  format used in Unix with the underscore replaced by a hyphen, e.g.@:
  @code{en-US}, @code{en-GB}, @code{el-GR}, @code{sr-Cyrl-RS}.
  @end defvr
  
-@defvr {Optional} @code{olang}
+@defvr {Attribute} @code{olang}
  The output language, e.g.@: @code{en}, @code{it}, @code{es},
  @code{de}, @code{pt-BR}.
  @end defvr
@@ -343,56 +368,64 @@ The output language, e.g.@: @code{en}, @code{it}, @code{es},
  @node SPV Structure label Element
  @subsection The @code{label} Element
  
-Parent: @code{heading} or @code{container} @*
-Contents: text
+@example
+label => TEXT
+@end example
  
  Every @code{heading} and @code{container} holds a @code{label} as its
  first child.  The root @code{heading} in a structure member always
-contains the string ``Output''.  Otherwise, the text in @code{label}
-describes what it labels, often by naming the statistical procedure
-that was executed, e.g.@: ``Frequencies'' or ``T-Test''.  Labels are
-often very generic, especially within a @code{container}, e.g.@:
-``Title'' or ``Warnings'' or ``Notes''.  Label text is localized
-according to the output language, e.g.@: in Italian a frequency table
-procedure is labeled ``Frequenze''.
+contains the string ``Output'' (localized).  Otherwise, the text in
+@code{label} describes what it labels, often by naming the statistical
+procedure that was executed, e.g.@: ``Frequencies'' or ``T-Test''.
+Labels are often very generic, especially within a @code{container},
+e.g.@: ``Title'' or ``Warnings'' or ``Notes''.  Label text is
+localized according to the output language, e.g.@: in Italian a
+frequency table procedure is labeled ``Frequenze''.
  
  The corpus contains a few examples of empty labels, ones that contain
  no text.
  
-This element has no attributes.
-
  @node SPV Structure container Element
  @subsection The @code{container} Element
  
-Parent: @code{heading} @*
-Contents: @code{label} (@code{table} @math{|} @code{text} @math{|} @code{graph} @math{|} @code{model})
+@example
+container
+   :visibility=(visible | hidden)
+   :page-break-before=(always)?
+   :text-align=(left | center)?
+   :width=dimension
+=> label (table | container_text | graph | model | object | image)
+@end example
  
-A @code{container} serves to label a @code{table} or a @code{text}
-item.
+A @code{container} serves to contain and label a @code{table},
+@code{text}, or other kind of item.
  
  This element has the following attributes.
  
-@defvr {Required} @code{visibility}
-Either @code{visible} or @code{hidden}, this indicates whether the
-container's content is displayed.
+@defvr {Attribute} @code{visibility}
+Whether the container's content is displayed.  ``Notes'' tables are
+often hidden; other data is usually visible.
  @end defvr
  
-@defvr {Optional} @code{text-align}
-Presumably indicates the alignment of text within the container.  The
-only observed value is @code{left}.  Observed with nested @code{table}
-and @code{text} elements.
+@defvr {Attribute} @code{text-align}
+Alignment of text within the container.  Observed with nested
+@code{table} and @code{text} elements.
  @end defvr
  
-@defvr {Optional} @code{width}
-The width of the container in the form @code{@var{n}px}, e.g.@:
-@code{1097px}.
+@defvr {Attribute} @code{width}
+The width of the container, e.g.@: @code{1097px}.
  @end defvr
  
  @node SPV Structure text Element (Inside @code{container})
  @subsection The @code{text} Element (Inside @code{container})
  
-Parent: @code{container} @*
-Contents: @code{html}
+@example
+text[container_text]
+  :type[text_type]=(title | log | text | page-title)
+  :commandName?
+  :creator-version?
+=> html
+@end example
  
  This @code{text} element is nested inside a @code{container}.  There
  is a different @code{text} element that is nested inside a
@@ -400,86 +433,138 @@ is a different @code{text} element that is nested inside a
  
  This element has the following attributes.
  
-@defvr {Required} @code{type}
-One of @code{title}, @code{log}, or @code{text}.
+@defvr {Attribute} @code{type}
+The semantics of the text.
  @end defvr
  
-@defvr {Optional} @code{commandName}
+@defvr {Attribute} @code{commandName}
  As on the @code{heading} element.  For output not specific to a
  command, this is simply @code{log}.  The corpus contains one example
  of where @code{commandName} is present but set to the empty string.
  @end defvr
  
-@defvr {Optional} @code{creator-version}
+@defvr {Attribute} @code{creator-version}
  As on the @code{heading} element.
  @end defvr
  
  @node SPV Structure html Element
  @subsection The @code{html} Element
  
-Parent: @code{text} @*
-Contents: CDATA
+@example
+html :lang=(en) => TEXT
+@end example
+
+The element contains an HTML document as text (or, in practice, as
+CDATA).  In some cases, the document starts with @code{<html>} and
+ends with @code{</html>}; in others the @code{html} element is
+implied.  Generally the HTML includes a @code{head} element with a CSS
+stylesheet.  The HTML body often begins with @code{<BR>}.
+
+The HTML document uses only the following elements:
+
+@table @code
+@item html
+Sometimes, the document is enclosed with
+@code{<html>}@dots{}@code{</html>}.
+
+@item br
+The HTML body often begins with @code{<BR>} and may contain it as well.
+
+@item b
+@itemx i
+@itemx u
+Styling.
+
+@item font
+The attributes @code{face}, @code{color}, and @code{size} are
+observed.  The value of @code{color} takes one of the forms
+@code{#@var{rr}@var{gg}@var{bb}} or @code{rgb (@var{r}, @var{g},
+@var{b})}.  The value of @code{size} is a number between 1 and 7,
+inclusive.
+@end table
  
-The CDATA contains an HTML document.  In some cases, the document
-starts with @code{<html>} and ends with @code{</html>}; in others the
-@code{html} element is implied.  Generally the HTML includes a
-@code{head} element with a CSS stylesheet.  The HTML body often begins
-with @code{<BR>}.  The actual content ranges from trivial to simple:
-just discarding the CSS and tags yields readable results.
+The CSS in the corpus is simple.  To understand it, a parser only
+needs to be able to skip white space, @code{<!--}, and @code{-->}, and
+parse style only for @code{p} elements.  Only @code{font-weight},
+@code{font-style}, @code{font-decoration}, @code{font-family}, and
+@code{font-size} matter.
  
  This element has the following attributes.
  
-@defvr {Required} @code{lang}
+@defvr {Attribute} @code{lang}
  This always contains @code{en} in the corpus.
  @end defvr
  
  @node SPV Structure table Element
  @subsection The @code{table} Element
  
-Parent: @code{container} @*
-Contents: @code{tableStructure}
+@example
+table
+   :VDPId?
+   :ViZmlSource?
+   :activePageId=int?
+   :commandName
+   :creator-version?
+   :displayFiltering=bool?
+   :maxNumCells=int?
+   :orphanTolerance=int?
+   :rowBreakNumber=int?
+   :subType
+   :tableId
+   :tableLookId?
+   :type[table_type]=(table | note | warning)
+=> tableProperties? tableStructure
+
+tableStructure => path? dataPath
+@end example
  
  This element has the following attributes.
  
-@defvr {Required} @code{commandName}
+@defvr {Attribute} @code{commandName}
  As on the @code{heading} element.
  @end defvr
  
-@defvr {Required} @code{type}
+@defvr {Attribute} @code{type}
  One of @code{table}, @code{note}, or @code{warning}.
  @end defvr
  
-@defvr {Required} @code{subType}
-The locale-invariant name for the particular kind of output that this
-table represents in the procedure.  This can be the same as
+@defvr {Attribute} @code{subType}
+The locale-invariant command ID for the particular kind of output that
+this table represents in the procedure.  This can be the same as
  @code{commandName} e.g.@: @code{Frequencies}, or different, e.g.@:
  @code{Case Processing Summary}.  Generic subtypes @code{Notes} and
  @code{Warnings} are often used.
  @end defvr
  
-@defvr {Required} @code{tableId}
+@defvr {Attribute} @code{tableId}
  A number that uniquely identifies the table within the SPV file,
  typically a large negative number such as @code{-4147135649387905023}.
  @end defvr
  
-@defvr {Optional} @code{creator-version}
+@defvr {Attribute} @code{creator-version}
  As on the @code{heading} element.  In the corpus, this is only present
  for version 21 and up and always includes all 8 digits.
  @end defvr
  
-@node SPV Structure tableStructure Element
-@subsection The @code{tableStructure} Element
-
-Parent: @code{table} @*
-Contents: @code{dataPath}
-
-This element has no attributes.
+@xref{SPV Detail Legacy Properties}, for details on the
+@code{tableProperties} element.
  
  @node SPV Structure graph Element
  @subsection The @code{graph} Element
  
-Parent: @code{container} @*
-Contents: @code{dataPath}? @code{path}
+@example
+graph
+   :VDPId?
+   :ViZmlSource?
+   :commandName?
+   :creator-version?
+   :dataMapId?
+   :dataMapURI?
+   :editor?
+   :refMapId?
+   :refMapURI?
+=> dataPath? path
+@end example
  
  This element represents a graph.  The @code{dataPath} and @code{path}
  elements name the Zip members that give the details of the graph.
@@ -489,8 +574,24 @@ in the corpus.
  @node SPV Structure model Element
  @subsection The @code{model} Element
  
-Parent: @code{container} @*
-Contents: (@code{ViZml}? @code{path}) @math{|} (@code{pmmlContainerPath} @code{statsContainerPath})
+@example
+model
+   :PMMLContainerId
+   :PMMLId
+   :StatXMLContainerId
+   :VDPId
+   :auxiliaryViewName
+   :commandName
+   :creator-version
+   :mainViewName
+=> ViZml? path | pmmlContainerPath statsContainerPath
+
+pmmlContainerPath => TEXT
+
+statsContainerPath => TEXT
+
+ViZml :viewName? => TEXT
+@end example
  
  This element represents a model.  The @code{dataPath} and @code{path}
  elements name the Zip members that give the details of the model.
@@ -506,8 +607,11 @@ name Zip members with @file{.scf} extension.
  @node SPV Structure dataPath and path Elements
  @subsection The @code{dataPath} and @code{path} Elements
  
-Parent: @code{tableStructure} or @code{graph} or @code{model} @*
-Contents: text
+@example
+dataPath => TEXT
+
+path => TEXT
+@end example
  
  These elements contain the names of the Zip members that hold details
  for a container.  For tables:
@@ -539,65 +643,67 @@ These elements have no attributes.
  @node SPV Structure pageSetup Element
  @subsection The @code{pageSetup} Element
  
-Parent: @code{heading} @*
-Contents: @code{pageHeader} @code{pageFooter}
+@example
+pageSetup
+   :initial-page-number=int?
+   :chart-size=(as-is | full-height | half-height | quarter-height | OTHER)?
+   :margin-left=dimension?
+   :margin-right=dimension?
+   :margin-top=dimension?
+   :margin-bottom=dimension?
+   :paper-height=dimension?
+   :paper-width=dimension?
+   :reference-orientation?
+   :space-after=dimension?
+=> pageHeader pageFooter
+
+pageHeader => pageParagraph?
+
+pageFooter => pageParagraph?
+
+pageParagraph => pageParagraph_text
+@end example
  
-This element has the following attributes.
+The @code{pageSetup} element has the following attributes.
  
-@defvr {Required} @code{initial-page-number}
-Always @code{1}.
+@defvr {Attribute} @code{initial-page-number}
+The page number to put on the first page of printed output.  Usually
+@code{1}.
  @end defvr
  
-@defvr {Optional} @code{chart-size}
-Always @code{as-is} or a localization (!) of it (e.g.@: @code{dimensione
-attuale}, @code{Wie vorgegeben}).
+@defvr {Attribute} @code{chart-size}
+One of the listed, self-explanatory chart sizes or a localization (!)
+of one of these (e.g.@: @code{dimensione attuale}, @code{Wie
+vorgegeben}).
  @end defvr
  
-@defvr {Optional} @code{margin-left}
-@defvrx {Optional} @code{margin-right}
-@defvrx {Optional} @code{margin-top}
-@defvrx {Optional} @code{margin-bottom}
-Margin sizes in the form @code{@var{size}in}, e.g.@: @code{0.25in}.
+@defvr {Attribute} @code{margin-left}
+@defvrx {Attribute} @code{margin-right}
+@defvrx {Attribute} @code{margin-top}
+@defvrx {Attribute} @code{margin-bottom}
+Margin sizes, e.g.@: @code{0.25in}.
  @end defvr
  
-@defvr {Optional} @code{paper-height}
-@defvrx {Optional} @code{paper-width}
-Paper sizes in the form @code{@var{size}in}, e.g.@: @code{8.5in} by
-@code{11in} for letter paper or @code{8.267in} by @code{11.692in} for
-A4 paper.
+@defvr {Attribute} @code{paper-height}
+@defvrx {Attribute} @code{paper-width}
+Paper sizes.
  @end defvr
  
-@defvr {Optional} @code{reference-orientation}
-Always @code{0deg}.
+@defvr {Attribute} @code{reference-orientation}
+Indicates the orientation of the output page.  Either @code{0deg}
+(portrait) or @code{90deg} (landscape).
  @end defvr
  
-@defvr {Optional} @code{space-after}
-Always @code{12pt}.
+@defvr {Attribute} @code{space-after}
+The amount of space between printed objects, typically @code{12pt}.
  @end defvr
  
-@node SPV Structure pageHeader and pageFooter Elements
-@subsection The @code{pageHeader} and @code{pageFooter} Elements
-
-Parent: @code{pageSetup} @*
-Contents: @code{pageParagraph}*
-
-This element has no attributes.
-
-@node SPV Structure pageParagraph Element
-@subsection The @code{pageParagraph} Element
-
-Parent: @code{pageHeader} or @code{pageFooter} @*
-Contents: @code{text}
-
-Text to go at the top or bottom of a page, respectively.
-
-This element has no attributes.
-
  @node SPV Structure @code{text} Element (Inside @code{pageParagraph})
  @subsection The @code{text} Element (Inside @code{pageParagraph})
  
-Parent: @code{pageParagraph} @*
-Contents: CDATA?
+@example
+text[pageParagraph_text] :type=(title | text) => TEXT
+@end example
  
  This @code{text} element is nested inside a @code{pageParagraph}.  There
  is a different @code{text} element that is nested inside a
@@ -607,8 +713,31 @@ The element is either empty, or contains CDATA that holds almost-XHTML
  text: in the corpus, either an @code{html} or @code{p} element.  It is
  @emph{almost}-XHTML because the @code{html} element designates the
  default namespace as
-@indicateurl{http://xml.spss.com/spss/viewer/viewer-tree} instead of an XHTML
-namespace, and because the CDATA can contain substitution variables:
+@indicateurl{http://xml.spss.com/spss/viewer/viewer-tree} instead of
+an XHTML namespace, and because the CDATA can contain substitution
+variables.  The following variables are supported:
+
+@table @code
+@item &[Date]
+@itemx &[Time]
+The current date or time in the preferred format for the locale.
+
+@item &[Head1]
+@itemx &[Head2]
+@itemx &[Head3]
+@itemx &[Head4]
+First-, second-, third-, or fourth-level heading.
+
+@item &[PageTitle]
+The page title.
+
+@item &[Filename]
+Name of the output file.
+
+@item &[Page]
+The page number.
+@end table
+
  @code{&[Page]} for the page number and @code{&[PageTitle]} for the
  page title.
  
@@ -625,7 +754,7 @@ Typical contents (indented for clarity):
  
  This element has the following attributes.
  
-@defvr {Required} @code{type}
+@defvr {Attribute} @code{type}
  Always @code{text}.
  @end defvr
  
@@ -647,10 +776,10 @@ and have no semantic significance.
  A byte with a fixed value, written as a pair of hexadecimal digits.
  
  @item i0, i1, @dots{}, i9, i10, i11, @dots{}
-@itemx b0, b1, @dots{}, b9, b10, b11, @dots{}
+@itemx ib0, ib1, @dots{}, ib9, ib10, ib11, @dots{}
  A 32-bit integer in little-endian or big-endian byte order,
-respectively, with a fixed value, written in decimal, prefixed by
-@samp{i}.
+respectively, with a fixed value, written in decimal.  Prefixed by
+@samp{i} for little-endian or @samp{ib} for big-endian.
  
  @item byte
  A byte.
@@ -663,7 +792,7 @@ A byte with value 0 or 1.
  A 16-bit integer in little-endian or big-endian byte order,
  respectively.
  
-@item int
+@item int32
  @itemx be32
  A 32-bit integer in little-endian or big-endian byte order,
  respectively.
@@ -689,12 +818,12 @@ data.  (The encoding is indicated by the Formats nonterminal.)
  @var{x} is optional, e.g.@: 00? is an optional zero byte.
  
  @item @var{x}*@var{n}
-@var{x} is repeated @var{n} times, e.g. byte*10 for ten arbitrary bytes.
+@var{x} is repeated @var{n} times, e.g.@: byte*10 for ten arbitrary bytes.
  
  @item @var{x}[@var{name}]
  Gives @var{x} the specified @var{name}.  Names are used in textual
  explanations.  They are also used, also bracketed, to indicate counts,
-e.g.@: int[@t{n}] byte*[@t{n}] for a 32-bit integer followed by the
+e.g.@: @code{int32[n] byte*[n]} for a 32-bit integer followed by the
  specified number of arbitrary bytes.
  
  @item @var{a} @math{|} @var{b}
@@ -706,8 +835,9 @@ in the presence of @math{|}, e.g.@: in 00 (01 @math{|} 02 @math{|} 03)
  00.
  
  @item count(@var{x})
-A 32-bit integer that indicates the number of bytes in @var{x},
-followed by @var{x} itself.
+@itemx becount(@var{x})
+A 32-bit integer, in little-endian or big-endian byte order, respectively,
+that indicates the number of bytes in @var{x}, followed by @var{x} itself.
  
  @item v1(@var{x})
  In a version 1 @file{.bin} member, @var{x}; in version 3, nothing.
@@ -717,38 +847,44 @@ In a version 1 @file{.bin} member, @var{x}; in version 3, nothing.
  In a version 3 @file{.bin} member, @var{x}; in version 1, nothing.
  @end table
  
+PSPP uses this grammar to parse light detail members.  See
+@file{src/output/spv/light-binary.grammar} in the PSPP source tree for
+the full grammar.
+
  Little-endian byte order is far more common in this format, but a few
  pieces of the format use big-endian byte order.
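
The count(@var{x}) and becount(@var{x}) notation from the grammar table can be illustrated with a minimal Python sketch. The helper names `count`, `becount`, and `read_count` are hypothetical, and treating the 32-bit length as unsigned is an assumption that the text does not state.

```python
import struct


def count(payload: bytes) -> bytes:
    """Prefix payload with its length as a 32-bit little-endian integer."""
    return struct.pack("<I", len(payload)) + payload


def becount(payload: bytes) -> bytes:
    """Prefix payload with its length as a 32-bit big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def read_count(buf: bytes, offset: int = 0, big_endian: bool = False):
    """Read one length-prefixed item; return (payload, offset after it)."""
    fmt = ">I" if big_endian else "<I"
    (n,) = struct.unpack_from(fmt, buf, offset)
    start = offset + 4
    end = start + n
    if end > len(buf):
        raise ValueError("count exceeds the available data")
    return buf[start:end], end
```

Reading the length first lets a parser skip a section whose contents it does not understand.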
  
+Light detail members express linear units in two ways: points (pt), at
+72/inch, and ``device-independent pixels'' (px), at 96/inch.  To
+convert from pt to px, multiply by 1.33 and round up.  To convert
+from px to pt, divide by 1.33 and round down.
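
As a rough illustration of the conversion rule just stated, the following Python sketch applies the factor 1.33 with the rounding directions given above. The function names are hypothetical, and the use of binary floating point is an implementation choice, not something the format prescribes.

```python
import math


def pt_to_px(pt: float) -> int:
    """Convert points (72/inch) to px (96/inch): multiply by 1.33, round up."""
    return math.ceil(pt * 1.33)


def px_to_pt(px: float) -> int:
    """Convert px (96/inch) to points (72/inch): divide by 1.33, round down."""
    return math.floor(px / 1.33)
```

With these rounding directions, converting 12 pt to px and back yields 12 pt again.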
+
  A ``light'' detail member @file{.bin} consists of a number of sections
-concatenated together, terminated by a byte 01:
-
-@cartouche
-@format
-LightMember @result{}
-    Header Title
-    Caption Footnotes
-    Fonts Borders PrintSettings TableSettings Formats
-    Dimensions Data
-    01
-@end format
-@end cartouche
+concatenated together, terminated by an optional byte 01:
+
+@example
+LightMember =>
+    Header Titles Footnotes
+    Areas Borders PrintSettings TableSettings Formats
+    Dimensions Axes Cells
+    01?
+@end example
  
  The following sections go into more detail.
  
  @menu
  * SPV Light Member Header::
-* SPV Light Member Title::
-* SPV Light Member Caption::
+* SPV Light Member Titles::
  * SPV Light Member Footnotes::
-* SPV Light Member Fonts::
+* SPV Light Member Areas::
  * SPV Light Member Borders::
  * SPV Light Member Print Settings::
  * SPV Light Member Table Settings::
  * SPV Light Member Formats::
  * SPV Light Member Dimensions::
  * SPV Light Member Categories::
-* SPV Light Member Data::
+* SPV Light Member Axes::
+* SPV Light Member Cells::
  * SPV Light Member Value::
  * SPV Light Member ValueMod::
  @end menu
@@ -758,32 +894,26 @@ The following sections go into more detail.
  
  An SPV light member begins with a 39-byte header:
  
-@cartouche
-@format
-Header @result{}
+@example
+Header =>
      01 00
-    (i1 @math{|} i3)[@t{version}]
-    bool
-    bool[@t{show-numeric-markers}]
-    bool[@t{rotate-inner-column-labels}]
-    bool[@t{rotate-outer-row-labels}]
-    bool
-    int
-    int[@t{min-column-width}] int[@t{max-column-width}]
-    int[@t{min-row-width}] int[@t{max-row-width}]
-    int64[@t{table-id}]
-@end format
-@end cartouche
+    (i1 @math{|} i3)[version]
+    bool[x0]
+    bool[x1]
+    bool[rotate-inner-column-labels]
+    bool[rotate-outer-row-labels]
+    bool[x2]
+    int32[x3]
+    int32[min-col-width] int32[max-col-width]
+    int32[min-row-width] int32[max-row-width]
+    int64[table-id]
+@end example
  
  @code{version} is a version number that affects the interpretation of
  some of the other data in the member.  We will refer to ``version 1''
  and ``version 3'' later on and use v1(@dots{}) and v3(@dots{}) for
  version-specific formatting (as described previously).
  
-If @code{show-numeric-markers} is 1, footnote markers are shown as
-numbers, starting from 1; otherwise, they are shown as letters,
-starting from @samp{a}.
-
  If @code{rotate-inner-column-labels} is 1, then column labels closest
  to the data are rotated to be vertical; otherwise, they are shown
  in the normal way.
@@ -797,94 +927,92 @@ the structure member that refers to the detail member.  For example,
  if @code{tableId} is @code{-4122591256483201023}, then @code{table-id}
  would be 0xc6c99d183b300001.
  
-@code{min-column-width} is the minimum width that a column will be
-assigned automatically.  @code{max-column-width} is the maximum width
+@code{min-col-width} is the minimum width that a column will be
+assigned automatically.  @code{max-col-width} is the maximum width
  that a column will be assigned to accommodate a long column label.
  @code{min-row-width} and @code{max-row-width} are a similar range for
  the width of row labels.  All of these measurements are in 1/96 inch
-units.
-
-The meaning of the other variable parts of the header is not known.
-
-@node SPV Light Member Title
-@subsection Title
-
-@cartouche
-@format
-Title @result{}
-    Value[@t{title1}] 01?
-    Value[@t{c}] 01? 31
-    Value[@t{title2}] 01?
-@end format
-@end cartouche
-
-The Title, which follows the Header, specifies the pivot table's title
-twice, as @code{title1} and @code{title2}.  In the corpus, they are
-always the same.
-
-Whereas the Value in @code{title1} and in @code{title2} are
-appropriate for presentation, and localized to the user's language,
-@code{c} is in English, sometimes less specific, and sometimes less
-well formatted.  For example, for a frequency table, @code{title1} and
-@code{title2} name the variable and @code{c} is simply ``Frequencies''.
-
-@node SPV Light Member Caption
-@subsection Caption
-
-@cartouche
-@format
-Caption @result{} Caption1 Caption2
-Caption1 @result{} 31 Value @math{|} 58
-Caption2 @result{} 31 Value @math{|} 58
-@end format
-@end cartouche
-
-The Caption, if present, is shown below the table.  Caption2 is
-normally present.  Caption1 is only rarely nonempty; it might reflect
-user editing of the caption.
+units (called a ``device-independent pixel'' unit in Windows).
+
+The meaning of the other variable parts of the header is not known.  A
+writer may safely use version 3, true for @code{x0}, false for
+@code{x1}, true for @code{x2}, and 0x15 for @code{x3}.
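
The Header layout can be sketched with Python's `struct` module, using the values described above as safe for a writer. The function names and the example widths in the usage note are hypothetical. The sketch also assumes that `table-id` is stored little-endian like the other header integers.

```python
import struct

# 01 00, int32 version, five bools, int32 x3, four int32 widths, int64 table-id.
HEADER = struct.Struct("<2si5?i4iq")


def pack_header(table_id, min_col_width, max_col_width,
                min_row_width, max_row_width,
                rotate_inner_column_labels=False,
                rotate_outer_row_labels=False):
    """Build a version 3 Header using the values described as safe."""
    return HEADER.pack(
        b"\x01\x00",
        3,                            # version
        True,                         # x0
        False,                        # x1
        rotate_inner_column_labels,
        rotate_outer_row_labels,
        True,                         # x2
        0x15,                         # x3
        min_col_width, max_col_width,
        min_row_width, max_row_width,
        table_id,
    )


def table_id_as_unsigned(table_id: int) -> int:
    """Reinterpret a signed 64-bit tableId as the unsigned value in the file."""
    (value,) = struct.unpack("<Q", struct.pack("<q", table_id))
    return value
```

For instance, `pack_header(-4122591256483201023, 36, 72, 36, 120)` produces 39 bytes, matching the stated header size.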
+
+@node SPV Light Member Titles
+@subsection Titles
+
+@example
+Titles =>
+    Value[title] 01?
+    Value[subtype] 01? 31
+    Value[user-title] 01?
+    (31 Value[corner-text] @math{|} 58)
+    (31 Value[caption] @math{|} 58)
+@end example
+
+The Titles follow the Header and specify the table's title, caption,
+and corner text.
+
+The @code{user-title} is shown above the title and reflects any user
+editing of the title text or style.  The @code{title} is the title
+originally generated by the procedure.  Both of these are appropriate
+for presentation and localized to the user's language.  For example,
+for a frequency table, @code{title} and @code{user-title} normally
+name the variable and @code{subtype} is simply ``Frequencies''.
+
+@code{subtype} is the same as the @code{subType} attribute in the
+@code{table} structure XML element that referred to this member.
+@xref{SPV Structure table Element}, for details.
+
+The @code{corner-text}, if present, is shown in the upper-left corner
+of the table, above the row headings and to the left of the column
+headings.  It is usually absent.  Corner text prevents row dimension
+labels from being displayed above the dimension's group and category
+labels (see @code{show-row-labels-in-corner}).
+
+The @code{caption}, if present, is shown below the table.
+@code{caption} reflects user editing of the caption.
  
  @node SPV Light Member Footnotes
  @subsection Footnotes
  
-@cartouche
-@format
-Footnotes @result{} int[@t{n}] Footnote*[@t{n}]
-Footnote @result{} Value[@t{text}] (58 @math{|} 31 Value[@t{marker}]) byte*4
-@end format
-@end cartouche
+@example
+Footnotes => int32[n-footnotes] Footnote*[n-footnotes]
+Footnote => Value[text] (58 @math{|} 31 Value[marker]) byte*4
+@end example
  
-Each footnote has @code{text} and an optional customer @code{marker}
+Each footnote has @code{text} and an optional custom @code{marker}
  (such as @samp{*}).
  
-@node SPV Light Member Fonts
-@subsection Fonts
+@node SPV Light Member Areas
+@subsection Areas
  
-@cartouche
-@format
-Fonts @result{} 00 Font*8
-Font @result{}
-    byte[@t{index}] 31
-    string[@t{typeface}] float[@t{size}] int[@t{style}] bool[@t{underline}]
-    int[@t{halign}] int[@t{valign}]
-    string[@t{fgcolor}] string[@t{bgcolor}]
-    byte[@t{alternate}] string[@t{altfg}] string[@t{altbg}]
-    v3(int[@t{left-margin}] int[@t{right-margin}] int[@t{top-margin}] int[@t{bottom-margin}])
-@end format
-@end cartouche
+@example
+Areas => 00? Area*8
+Area =>
+    byte[index] 31
+    string[typeface] float[size] int32[style] bool[underline]
+    int32[halign] int32[valign]
+    string[fg-color] string[bg-color]
+    bool[alternate] string[alt-fg-color] string[alt-bg-color]
+    v3(int32[left-margin] int32[right-margin] int32[top-margin] int32[bottom-margin])
+@end example
  
-Each Font represents the font style for a different element, in the
-following order: title, caption, footer, corner, column
-labels, row labels, data, and layers.
+Each Area represents the style for a different area of the table, in
+the following order: title, caption, footer, corner, column labels,
+row labels, data, and layers.
  
-@code{index} is the 1-based index of the Font, i.e. 1 for the first
-Font, through 8 for the final Font.
+@code{index} is the 1-based index of the Area, i.e.@: 1 for the first
+Area, through 8 for the final Area.
  
-@code{typeface} is the string name of the font.  In the corpus, this
-is @code{SansSerif} in over 99% of instances and @code{Times New
-Roman} in the rest.
+@code{typeface} is the string name of the font used in the area.  In
+the corpus, this is @code{SansSerif} in over 99% of instances and
+@code{Times New Roman} in the rest.
  
-@code{size} is the size of the font, in points.  The most common size
-in the corpus is 12 points.
+@code{size} is the size of the font, in px (@pxref{SPV Light Detail
+Member Format}).  The most common size in the corpus is 12 px.  Even
+though @code{size} has a floating-point type, in the corpus its values
+are always integers.
  
  @code{style} is a bit mask.  Bit 0 (with value 1) is set for bold, bit
  1 (with value 2) is set for italic.
@@ -899,34 +1027,34 @@ numbers and most other formats are right-justified.
  @code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
  for bottom.
  
-@code{fgcolor} and @code{bgcolor} are the foreground color and
+@code{fg-color} and @code{bg-color} are the foreground color and
  background color, respectively.  In the corpus, these are always
  @code{#000000} and @code{#ffffff}, respectively.
  
-@code{alternate} is 01 if rows should alternate colors, 00 if all rows
-should be the same color.  When @code{alternate} is 01, @code{altfg}
-and @code{altbg} specify the colors for the alternate rows.
+@code{alternate} is 1 if rows should alternate colors, 0 if all rows
+should be the same color.  When @code{alternate} is 1,
+@code{alt-fg-color} and @code{alt-bg-color} specify the colors for the
+alternate rows; otherwise they are empty strings.
  
  @code{left-margin}, @code{right-margin}, @code{top-margin}, and
-@code{bottom-margin} are measured in multiples of 1/96 inch.
+@code{bottom-margin} are measured in px.
  
  @node SPV Light Member Borders
  @subsection Borders
  
-@cartouche
-@format
-Borders @result{}
-    b1[@t{endian}]
-    be32[@t{n-borders}] Border*[@t{n-borders}]
-    bool[@t{show-grid-lines}]
-    00 00 00
-
-Border @result{}
-    be32[@t{border-type}]
-    be32[@t{stroke-type}]
-    be32[@t{color}]
-@end format
-@end cartouche
+@example
+Borders =>
+    count(
+        ib1[endian]
+        be32[n-borders] Border*[n-borders]
+        bool[show-grid-lines]
+        00 00 00)
+
+Border =>
+    be32[border-type]
+    be32[stroke-type]
+    be32[color]
+@end example
  
  The Borders reflect how borders between regions are drawn.
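
A minimal Python sketch of serializing the Borders grammar above follows. The function name is hypothetical, and the `border-type` and `stroke-type` numbers in the usage note are placeholders rather than values taken from this section.

```python
import struct


def pack_borders(borders, show_grid_lines):
    """Serialize Borders: a little-endian count() around big-endian fields.

    `borders` is a sequence of (border_type, stroke_type, color) tuples.
    """
    body = struct.pack(">i", 1)                 # ib1[endian]
    body += struct.pack(">I", len(borders))     # be32[n-borders]
    for border_type, stroke_type, color in borders:
        body += struct.pack(">III", border_type, stroke_type, color)
    body += struct.pack("?", show_grid_lines)   # bool[show-grid-lines]
    body += b"\x00\x00\x00"                     # fixed trailing bytes
    return struct.pack("<I", len(body)) + body  # count(...)
```

Calling `pack_borders([(0, 1, 0xff000000)], False)` yields a 4-byte length followed by 24 bytes of content.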
  
@@ -982,20 +1110,19 @@ opaque color, therefore opaque black is 0xff000000.
  @node SPV Light Member Print Settings
  @subsection Print Settings
  
-@cartouche
-@format
-PrintSettings @result{}
-    b1[@t{endian}]
-    bool[@t{all-layers}]
-    bool[@t{paginate-layers}]
-    bool[@t{fit-width}]
-    bool[@t{fit-length}]
-    bool[@t{top-continuation}]
-    bool[@t{bottom-continuation}]
-    be32[@t{n-orphan-lines}]
-    bestring[@t{continuation-string}]
-@end format
-@end cartouche
+@example
+PrintSettings =>
+    count(
+        ib1[endian]
+        bool[all-layers]
+        bool[paginate-layers]
+        bool[fit-width]
+        bool[fit-length]
+        bool[top-continuation]
+        bool[bottom-continuation]
+        be32[n-orphan-lines]
+        bestring[continuation-string])
+@end example
  
  The PrintSettings reflect settings for printing.  The fixed value of
  @code{endian} can be used to validate the endianness.
@@ -1021,43 +1148,41 @@ page.  Usually, @code{continuation-string} is empty.
  @node SPV Light Member Table Settings
  @subsection Table Settings
  
-@cartouche
-@format
-TableSettings @result{}
-    be32[@t{endian}]
-    be32
-    be32[@t{current-layer}]
-    bool[@t{omit-empty}]
-    bool[@t{show-row-labels-in-corner}]
-    bool[@t{show-alphabetic-markers}]
-    bool[@t{footnote-marker-position}]
-    v3(
-      byte
-      count(
-        Breakpoints[@t{row-breaks}] Breakpoints[@t{column-breaks}]
-        Keeps[@t{row-keeps}] Keeps[@t{column-keeps}]
-        PointKeeps[@t{row-keeps}] PointKeeps[@t{column-keeps}]
-      )
-      bestring[@t{notes}]
-      bestring[@t{table-look}]
-      00...
-    )
-
-Breakpoints @result{} be32[@t{n-breaks}] be32*[@t{n-breaks}]
-
-Keeps @result{} be32[@t{n-keeps}] Keep*@t{n-keeps}
-Keep @result{} be32[@t{offset}] be[@t{n}]
-
-PointKeeps @result{} be32[@t{n-point-keeps}] PointKeep*@t{n-point-keeps}
-PointKeep @result{} be32[@t{offset}] be32 be32
-
-@end format
-@end cartouche
+@example
+TableSettings =>
+    count(
+      v3(
+        ib1[endian]
+        be32[x5]
+        be32[current-layer]
+        bool[omit-empty]
+        bool[show-row-labels-in-corner]
+        bool[show-alphabetic-markers]
+        bool[footnote-marker-superscripts]
+        byte[x6]
+        becount(
+          Breakpoints[row-breaks] Breakpoints[column-breaks]
+          Keeps[row-keeps] Keeps[column-keeps]
+          PointKeeps[row-point-keeps] PointKeeps[column-point-keeps]
+        )
+        bestring[notes]
+        bestring[table-look]
+        00...))
+
+Breakpoints => be32[n-breaks] be32*[n-breaks]
+
+Keeps => be32[n-keeps] Keep*[n-keeps]
+Keep => be32[offset] be32[n]
+
+PointKeeps => be32[n-point-keeps] PointKeep*[n-point-keeps]
+PointKeep => be32[offset] be32 be32
+@end example
  
  The TableSettings reflect display settings.  The fixed value of
  @code{endian} can be used to validate the endianness.
  
-@code{current-layer} is the displayed layer.
+@code{current-layer} is the displayed layer.  The interpretation when
+there is more than one layer dimension is not yet known.
  
  If @code{omit-empty} is 1, empty rows or columns (ones with nothing in
  any cell) are hidden; otherwise, they are shown.
@@ -1066,10 +1191,10 @@ If @code{show-row-labels-in-corner} is 1, then row labels are shown in
  the upper left corner; otherwise, they are shown nested.
  
  If @code{show-alphabetic-markers} is 1, markers are shown as letters
-(e.g. @samp{a}, @samp{b}, @samp{c}, @dots{}); otherwise, they are
+(e.g.@: @samp{a}, @samp{b}, @samp{c}, @dots{}); otherwise, they are
  shown as numbers starting from 1.
  
-When @code{footnote-marker-position} is 1, footnote markers are shown
+When @code{footnote-marker-superscripts} is 1, footnote markers are shown
  as superscripts, otherwise as subscripts.
  
  The Breakpoints are rows or columns after which there is a page break;
@@ -1093,74 +1218,36 @@ is displayed when the user hovers the cursor over the table, like
  @code{table-look} is the name of a SPSS ``TableLook'' table style,
  such as ``Default'' or ``Academic''; it is often empty.
  
-TableSettings ends with an arbitrary number of null bytes.
+TableSettings ends with an arbitrary number of null bytes.  A writer
+may safely write 82 null bytes.
+
+A writer may safely use 4 for @code{x5} and 0 for @code{x6}.
  
  @node SPV Light Member Formats
  @subsection Formats
  
-@cartouche
-@format
-Formats @result{}
-    int[@t{n-widths}] int*[@t{n-widths}]
-    string[@t{encoding}]
-    int[@t{current-layer}]
-    bool[@t{digit-grouping}] bool[@t{leading-zero}] bool
-    int[@t{epoch}]
-    byte[@t{decimal}] byte[@t{grouping}]
+@example
+Formats =>
+    int32[n-widths] int32*[n-widths]
+    string[locale]
+    int32[current-layer]
+    bool bool bool
+    Y0
      CustomCurrency
      count(
        v1(X0?)
-      v3(count(X1 count(X2)) count(X3))
-
-X0 @result{}
-    byte*14
-    string[@t{command}] string[@t{command-local}]
-    string[@t{language}] string[@t{charset}] string[@t{locale}]
-    bool 00 bool bool
-    int[@t{epoch}]
-    byte[@t{decimal}] byte[@t{grouping}]
-    CustomCurrency
-    byte[@t{missing}] bool
-
-X1 @result{}
-    byte*2
-    byte[@t{lang}]
-    byte[@t{variable-mode}]
-    byte[@t{value-mode}]
-    int*2
-    00*17
-    bool
-    01
-X2 @result{}
-    int[@t{n-heights}] int*[@t{n-heights}]
-    int[@t{n-style-map}] BlankMap*[@t{n-style-map}]
-    int[@t{n-styles}] StylePair*[@t{n-styles}]
-    count((i0 i0)?)
-StyleMap @result{} int64[@t{cell-index}] int16[@t{style-index}]
-X3 @result{}
-    01 00 (03 @math{|} 04) 00 00 00
-    string[@t{command}] string[@t{command-local}]
-    string[@t{language}] string[@t{charset}] string[@t{locale}]
-    bool 00 bool bool
-    int[@t{epoch}]
-    byte[@t{decimal}] byte[@t{grouping}]
-    double[@t{small}] 01
-    (string[@t{dataset}] string[@t{datafile}] i0 int[@t{date}] i0)?
-    CustomCurrency
-    byte[@t{missing}] bool (i2000000 i0)?
-
-CustomCurrency @result{} int[@t{n-ccs}] string*[@t{n-ccs}]
-@end format
-@end cartouche
+      v3(count(X1 count(X2)) count(X3)))
+Y0 => int32[epoch] byte[decimal] byte[grouping]
+CustomCurrency => int32[n-ccs] string*[n-ccs]
+@end example
  
  If @code{n-widths} is nonzero, then the accompanying integers are
-column widths as manually adjusted by the user.  (Row heights are
-computed automatically based on the widths.)
+column widths as manually adjusted by the user.
  
-@code{encoding} is a character encoding, usually a Windows code page
-such as @code{en_US.windows-1252} or @code{it_IT.windows-1252}.  The
-rest of the character strings in the member use this encoding.  The
-encoding string is itself encoded in US-ASCII.
+@code{locale} is a locale including an encoding, such as
+@code{en_US.windows-1252} or @code{it_IT.windows-1252}.  The rest of
+the character strings in the member use this encoding.  The locale
+string is itself encoded in US-ASCII.
  
  @code{epoch} is the year that starts the epoch.  A 2-digit year is
  interpreted as belonging to the 100 years beginning at the epoch.  The
@@ -1176,6 +1263,25 @@ are @samp{.} and @samp{,}.
  @samp{'} (apostrophe), @samp{ } (space), and zero (presumably
  indicating that digits should not be grouped).
  
+@code{n-ccs} is observed as either 0 or 5.  When it is 5, the
+following strings are CCA through CCE format strings.  @xref{Custom
+Currency Formats,,, pspp, PSPP}.  Most commonly these are all
+@code{-,,,} but other strings occur.
+
+@subsubheading X0
+
+X0 only appears, optionally, in version 1 members.
+
+@example
+X0 => byte*14 Y1 Y2
+Y1 =>
+    string[command] string[command-local]
+    string[language] string[charset] string[locale]
+    bool bool bool bool
+    Y0
+Y2 => CustomCurrency byte[missing] bool[x16]
+@end example
+
  @code{command} describes the statistical procedure that generated the
  output, in English.  It is not necessarily the literal syntax name of
  the procedure: for example, NPAR TESTS becomes ``Nonparametric
@@ -1188,24 +1294,112 @@ output, e.g.@: @code{DataSet1}, and @code{datafile} the name of the
  file it was read from, e.g.@: @file{C:\Users\foo\bar.sav}.  The latter
  is sometimes the empty string.
  
+@code{missing} is the character used to indicate that a cell contains
+a missing value.  It is always observed as @samp{.}.
+
+X0 repeats @code{decimal}, @code{grouping}, CustomCurrency, and
+@code{missing} already included in Formats.
+
+A writer may safely use false for @code{x16}.
+
+@subsubheading X1
+
+X1 only appears in version 3 members.
+
+@example
+X1 =>
+    00 byte[x14] bool[x15]
+    byte[lang]
+    byte[show-variables]
+    byte[show-values]
+    int32[x17] int32[x18]
+    00*17
+    bool[x19]
+    01
+@end example
+
+@code{lang} may indicate the language in use.  Some values seem to be
+0: @t{en}, 1: @t{de}, 2: @t{es}, 3: @t{it}, 5: @t{ko}, 6: @t{pl}, 8:
+@t{zh-tw}, 10: @t{pt_BR}, 11: @t{fr}.  The @code{locale} in Formats
+and the @code{language}, @code{charset}, and @code{locale} in X0 are
+more likely to be useful in practice.
+
+@code{show-variables} determines how variables are displayed by
+default.  A value of 1 means to display variable names, 2 to display
+variable labels when available, 3 to display both (name followed by
+label, separated by a space).  The most common value is 0, which
+probably means to use a global default.
+
+@code{show-values} is a similar setting for values.  A value of 1
+means to display the value, 2 to display the value label when
+available, 3 to display both.  Again, the most common value is 0,
+which probably means to use a global default.
+
+A writer may safely use 1 for @code{x14}, false for @code{x15}, -1 for
+@code{x17} and @code{x18}, and false for @code{x19}.
+
+@subsubheading X2
+
+X2 only appears in version 3 members.
+
+@example
+X2 =>
+    int32[n-row-heights] int32*[n-row-heights]
+    int32[n-style-map] StyleMap*[n-style-map]
+    int32[n-styles] StylePair*[n-styles]
+    count((i0 i0)?)
+StyleMap => int64[cell-index] int16[style-index]
+@end example
+
+If present, @code{n-row-heights} and the accompanying integers are row
+heights as manually adjusted by the user.
+
+The rest of X2 specifies styles for data cells.  At first glance this
+is odd, because each data cell can have its own style embedded as part
+of the data, but in practice X2 specifies a style for a cell only if
+that cell is empty (and thus does not appear in the data at all).
+Each StyleMap specifies the index of a blank cell, calculated the same
+way as in the Cells (@pxref{SPV Light Member Cells}), along with a
+0-based index into the accompanying StylePair array.
+
+A writer may safely omit the optional @code{i0 i0} inside the
+@code{count(@dots{})}.
+
+@subsubheading X3
+
+X3 only appears in version 3 members.
+
+@example
+X3 =>
+    01 00 byte[x20] 00 00 00
+    Y1
+    double[small] 01
+    (string[dataset] string[datafile] i0 int32[date] i0)?
+    Y2
+    (int32 i0)?
+@end example
+
  @code{date} is a date, as seconds since the epoch, i.e.@: since
-January 1, 1970.  Pivot tables within an SPV files often have dates a
-few minutes apart, so this is probably a creation date for the tables
+January 1, 1970.  Pivot tables within an SPV file often have dates a
+few minutes apart, so this is probably a creation date for the table
  rather than for the file.
  
+X3 repeats @code{decimal}, @code{grouping}, CustomCurrency, and
+@code{missing} already included in Formats.  @code{command},
+@code{command-local}, @code{language}, @code{charset}, and
+@code{locale} have the same meaning as in X0.
+
+@code{small} is a small real number, e.g.@: .001.  Numbers smaller
+than this in absolute value are displayed in scientific notation.
+
  Sometimes @code{dataset}, @code{datafile}, and @code{date} are present
  and other times they are absent.  The reader can distinguish by
  assuming that they are present and then checking whether the
  presumptive @code{dataset} contains a null byte (a valid string never
  will).
  
-@code{n-ccs} is observed as either 0 or 5.  When it is 5, the
-following strings are CCA through CCE format strings.  @xref{Custom
-Currency Formats,,, pspp, PSPP}.  Most commonly these are all
-@code{-,,,} but other strings occur.
-
-@code{missing} is the character used to indicate that a cell contains
-a missing value.  It is always observed as @samp{.}.
+A writer may safely use 4 for @code{x20} and omit the optional bytes
+at the end.
  
  @node SPV Light Member Dimensions
  @subsection Dimensions
@@ -1213,29 +1407,40 @@ a missing value.  It is always observed as @samp{.}.
  A pivot table presents multidimensional data.  A Dimension identifies
  the categories associated with each dimension.
  
-@cartouche
-@format
-Dimensions @result{} int[@t{n-dims}] Dimension*[@t{n-dims}]
-Dimension @result{} Value[@t{name}] DimProperties int[@t{n-categories}] Category*[@t{n-categories}]
-DimProperties @result{}
-    byte[@t{d1}]
-    (00 @math{|} 01 @math{|} 02)[@t{d2}]
-    (i0 @math{|} i2)[@t{d3}]
-    bool[@t{show-dim-label}]
-    bool[@t{hide-all-labels}]
-    01 int[@t{dim-index}]
-@end format
-@end cartouche
-
-@code{name} is the name of the dimension, e.g. @code{Variables},
-@code{Statistics}, or a variable name.
+@example
+Dimensions => int32[n-dims] Dimension*[n-dims]
+Dimension =>
+    Value[name] DimProperties
+    int32[n-categories] Category*[n-categories]
+DimProperties =>
+    byte[x1]
+    byte[x2]
+    int32[x3]
+    bool[hide-dim-label]
+    bool[hide-all-labels]
+    01 int32[dim-index]
+@end example
  
-The meanings of @code{d1}, @code{d2}, and @code{d3} are unknown.
-@code{d1} is usually 0 but many other values have been observed.
+@code{name} is the name of the dimension, e.g.@: @code{Variables},
+@code{Statistics}, or a variable name.
  
-If @code{show-dim-label} is 01, the pivot table displays a label for
+The meanings of @code{x1} and @code{x3} are unknown.  @code{x1} is
+usually 0 but many other values have been observed.  A writer may
+safely use 0 for @code{x1} and 2 for @code{x3}.
+
+@code{x2} is 0, 1, or 2.  For a pivot table with @var{L} layer
+dimensions, @var{R} row dimensions, and @var{C} column dimensions,
+@code{x2} is 2 for the first @var{L} dimensions, 0 for the next
+@var{R} dimensions, and 1 for the remaining @var{C} dimensions.  This
+does not mean that the layer dimensions must be presented first,
+followed by the row dimensions, followed by the column dimensions---on
+the contrary, they are frequently in a different order---but @code{x2}
+must follow this pattern to prevent the pivot table from being
+misinterpreted.
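
The `x2` pattern described above can be written as a short Python helper. The function name is hypothetical. The result lists the `x2` value for each Dimension in file order.

```python
def dimension_x2_values(n_layers: int, n_rows: int, n_columns: int):
    """Return x2 for each Dimension in file order: 2s, then 0s, then 1s."""
    return [2] * n_layers + [0] * n_rows + [1] * n_columns
```

The pattern depends only on the three counts, not on which dimension is actually assigned to which axis.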
+
+If @code{hide-dim-label} is 00, the pivot table displays a label for
  the dimension itself.  Because usually the group and category labels
-are enough explanation, it is usually 00.
+are enough explanation, it is usually 01.
  
  If @code{hide-all-labels} is 01, the pivot table omits all labels for
  the dimension, including group and category labels.  It is usually 00.
@@ -1251,25 +1456,27 @@ is -1.  There is no visible difference.
  Categories are arranged in a tree.  Only the leaf nodes in the tree
  are really categories; the others just serve as grouping constructs.
  
-@cartouche
-@format
-Category @result{} Value[@t{name}] (Leaf @math{|} Group)
-Leaf @result{} 00 00 00 i2 int[@t{cat-index}] i0
-Group @result{}
-    bool[@t{merge}] 00 01 (i0 @math{|} i2)[@t{data}]
-    i-1 int[@t{n-subcategories}] Category*[@t{n-subcategories}]
-@end format
-@end cartouche
+@example
+Category => Value[name] (Leaf @math{|} Group)
+Leaf => 00 00 00 i2 int32[leaf-index] i0
+Group =>
+    bool[merge] 00 01 int32[x22]
+    i-1 int32[n-subcategories] Category*[n-subcategories]
+@end example
  
  @code{name} is the name of the category (or group).
  
-A Leaf represents a leaf category.  The Leaf's @code{cat-index} is a
-nonnegative integer less than @code{n-categories} in the Dimension in
-which the Category is nested (directly or indirectly).  These
-categories represent the original order in which the categories were
-sorted; if the user sorted or rearranged the categories, then the
-order of categories in the file reflects that without changing the
-@code{cat-index} values.
+A Leaf represents a leaf category.  The Leaf's @code{leaf-index} is a
+nonnegative integer unique within the Dimension and less than
+@code{n-categories} in the Dimension.  If the user does not sort or
+rearrange the categories, then @code{leaf-index} starts at 0 for the
+first Leaf in the dimension and increments by 1 with each successive
+Leaf.  If the user does sort or rearrange the categories, then the
+order of categories in the file reflects that change and
+@code{leaf-index} reflects the original order.
+
+Occasionally a dimension has no leaf categories at all.  A table that
+contains such a dimension necessarily has no data at all.
  
  A Group is a group of nested categories.  Usually a Group contains at
  least one Category, so that @code{n-subcategories} is positive, but a
@@ -1284,24 +1491,26 @@ parent group, then direct children of the dimension), and this group's
  name is irrelevant and should not be displayed.  (Merged groups can be
  nested!)
  
-A Group's @code{data} appears to be i2 when all of the categories
+(For writing an SPV file, there is no need to use the @code{merge}
+feature unless it is convenient.)
+
+A Group's @code{x22} appears to be i2 when all of the categories
  within a group are leaf categories that directly represent data values
-for a variable (e.g. in a frequency table or crosstabulation, a group
-of values in a variable being tabulated) and i0 otherwise.
+for a variable (e.g.@: in a frequency table or crosstabulation, a group
+of values in a variable being tabulated) and i0 otherwise.  A writer
+may safely write a constant 0 in this field.
  
-@node SPV Light Member Data
-@subsection Data
+@node SPV Light Member Axes
+@subsection Axes
  
-The final part of an SPV light member contains the actual data.
+After the dimensions comes the assignment of each dimension to one of
+the axes: layers, rows, and columns.
  
-@cartouche
-@format
-Data @result{}
-    int[@t{layers}] int[@t{rows}] int[@t{columns}] int*[@t{n-dimensions}]
-    int[@t{n-data}] Datum*[@t{n-data}]
-Datum @result{} int64[@t{index}] v1(00?) Value
-@end format
-@end cartouche
+@example
+Axes =>
+    int32[n-layers] int32[n-rows] int32[n-columns]
+    int32*[n-layers] int32*[n-rows] int32*[n-columns]
+@end example
  
  The values of @code{n-layers}, @code{n-rows}, and @code{n-columns}
  each specifies the number of dimensions displayed in layers, rows, and
@@ -1309,33 +1518,41 @@ columns, respectively.  Any of them may be zero.  Their values sum to
  @code{n-dimensions} from Dimensions (@pxref{SPV Light Member
  Dimensions}).
  
-The @code{n-dimensions} integers are a permutation of the 0-based
-dimension numbers.  The first @code{n-layers} integers specify each of
-the dimensions represented by layers, the next @code{n-rows} integers
-specify the dimensions represented by rows, and the final
-@code{n-columns} integers specify the dimensions represented by
-columns.  When there is more than one dimension of a given kind, the
-inner dimensions are given first.
+The following @code{n-dimensions} integers, in three groups, are a
+permutation of the 0-based dimension numbers.  The first
+@code{n-layers} integers specify each of the dimensions represented by
+layers, the next @code{n-rows} integers specify the dimensions
+represented by rows, and the final @code{n-columns} integers specify
+the dimensions represented by columns.  When there is more than one
+dimension of a given kind, the inner dimensions are given first.
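
A minimal Python sketch of writing the Axes section follows. The function name is hypothetical. The caller passes the 0-based dimension numbers for each axis, inner dimensions first, and the sketch checks that together they form a permutation.

```python
import struct


def pack_axes(layers, rows, columns):
    """Serialize Axes from three lists of 0-based dimension numbers."""
    dims = list(layers) + list(rows) + list(columns)
    # Every dimension must be assigned to exactly one axis.
    if sorted(dims) != list(range(len(dims))):
        raise ValueError("axes must be a permutation of the dimension numbers")
    counts = struct.pack("<3i", len(layers), len(rows), len(columns))
    return counts + struct.pack("<%di" % len(dims), *dims)
```

For a table with three dimensions, `pack_axes([2], [0], [1])` puts dimension 2 in layers, dimension 0 in rows, and dimension 1 in columns.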
+
+@node SPV Light Member Cells
+@subsection Cells
+
+The final part of an SPV light member contains the actual data.
  
-The format of a Datum varies slightly from version 1 to version 3: in
-version 1 it allows for an extra optional 00 byte.
+@example
+Cells => int32[n-cells] Cell*[n-cells]
+Cell => int64[index] v1(00?) Value
+@end example
  
-A Datum consists of an @code{index} and a Value.  Suppose there are
-@math{d} dimensions and dimension @math{i}, @math{0 \le i < d}, has
-@math{n_i} categories.  Consider the datum at coordinates @math{x_i},
-@math{0 \le i < d}, and note that @math{0 \le x_i < n_i}.  Then the
-index is calculated by the following algorithm:
+A Cell consists of an @code{index} and a Value.  Suppose there are
+@math{d} dimensions, numbered 1 through @math{d} in the order given in
+the Dimensions previously, and that dimension @math{i} has @math{n_i}
+categories.  Consider the cell at coordinates @math{x_i}, @math{1 \le
+i \le d}, and note that @math{0 \le x_i < n_i}.  Then the index is
+calculated by the following algorithm:
  
  @display
  let @i{index} = 0
-for each @math{i} from 0 to @math{d - 1}:
+for each @math{i} from 1 to @math{d}:
      @i{index} = (@math{n_i \times} @i{index}) @math{+} @math{x_i}
  @end display
  
  For example, suppose there are 3 dimensions with 3, 4, and 5
-categories, respectively.  The datum at coordinates (1, 2, 3) has
+categories, respectively.  The cell at coordinates (1, 2, 3) has
  index @math{5 \times (4 \times (3 \times 0 + 1) + 2) + 3 = 33}.
-Within a given dimension, the index is the @code{cat-index} in a Leaf.
+Within a given dimension, the index is the @code{leaf-index} in a Leaf.
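
The index calculation above can be sketched in Python as follows.  This is an illustration only, not code from PSPP: the function names @code{cell_index} and @code{cell_coords} are hypothetical, and the inverse function is an addition that simply undoes the same arithmetic.

```python
def cell_index(coords, n_categories):
    """Fold per-dimension coordinates into a single cell index.

    coords[i] is x_i and n_categories[i] is n_i, both listed in the
    order that the dimensions appear in Dimensions.
    """
    index = 0
    for x, n in zip(coords, n_categories):
        if not 0 <= x < n:
            raise ValueError(f"coordinate {x} out of range for {n} categories")
        index = n * index + x
    return index


def cell_coords(index, n_categories):
    """Inverse of cell_index: recover the coordinates from an index."""
    coords = []
    # The last dimension varies fastest, so peel it off first.
    for n in reversed(n_categories):
        coords.append(index % n)
        index //= n
    return tuple(reversed(coords))


print(cell_index((1, 2, 3), (3, 4, 5)))
print(cell_coords(33, (3, 4, 5)))
```

For the example in the text, with 3, 4, and 5 categories, coordinates (1, 2, 3) map to index 33 and back again.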
  
  @node SPV Light Member Value
  @subsection Value
@@ -1343,23 +1560,21 @@ Within a given dimension, the index is the @code{cat-index} in a Leaf.
  Value is used throughout the SPV light member format.  It boils down
  to a number or a string.
  
-@cartouche
-@format
-Value @result{} 00? 00? 00? 00? RawValue
-RawValue @result{}
-    01 ValueMod int[@t{format}] double[@t{x}]
-  @math{|} 02 ValueMod int[@t{format}] double[@t{x}]
-    string[@t{varname}] string[@t{vallab}] (01 @math{|} 02 @math{|} 03)
-  @math{|} 03 string[@t{local}] ValueMod string[@t{id}] string[@t{c}] bool[@t{type}]
-  @math{|} 04 ValueMod int[@t{format}] string[@t{vallab}] string[@t{varname}]
-    (01 @math{|} 02 @math{|} 03) string[@t{s}]
-  @math{|} 05 ValueMod string[@t{varname}] string[@t{varlabel}] (01 @math{|} 02 @math{|} 03)
-  @math{|} ValueMod string[@t{format}] int[@t{n-args}] Argument*[@t{n-args}]
-Argument @result{}
+@example
+Value => 00? 00? 00? 00? RawValue
+RawValue =>
+    01 ValueMod int32[format] double[x]
+  @math{|} 02 ValueMod int32[format] double[x]
+    string[var-name] string[value-label] byte[show]
+  @math{|} 03 string[local] ValueMod string[id] string[c] bool[fixed]
+  @math{|} 04 ValueMod int32[format] string[value-label] string[var-name]
+    byte[show] string[s]
+  @math{|} 05 ValueMod string[var-name] string[var-label] byte[show]
+  @math{|} ValueMod string[template] int32[n-args] Argument*[n-args]
+Argument =>
      i0 Value
-  @math{|} int[@t{x}] i0 Value*[@t{x}@math{+}1]      /* @t{x} @math{>} 0 */
-@end format
-@end cartouche
+  @math{|} int32[x] i0 Value*[x]      /* x > 0 */
+@end example
  
  There are several possible encodings, which one can distinguish by the
  first nonzero byte in the encoding.
@@ -1368,7 +1583,8 @@ first nonzero byte in the encoding.
  @item 01
  The numeric value @code{x}, intended to be presented to the user
  formatted according to @code{format}, which is in the format described
-for system files.  @xref{System File Output Formats}, for details.
+for system files, except that format 40 is a synonym for F format
+instead of MTIME.  @xref{System File Output Formats}, for details.
  Most commonly, @code{format} has width 40 (the maximum).
  
  An @code{x} with the maximum negative double value @code{-DBL_MAX}
@@ -1378,12 +1594,14 @@ special values.
  
  @item 02
  Similar to @code{01}, with the additional information that @code{x} is
-a value of variable @code{varname} and has value label @code{vallab}.
-Both @code{varname} and @code{vallab} can be the empty string, the
-latter very commonly.
+a value of variable @code{var-name} and has value label
+@code{value-label}.  Both @code{var-name} and @code{value-label} can
+be the empty string, the latter very commonly.
  
-The meaning of the final byte is unknown.  Possibly it is connected to
-whether the value or the label should be displayed.
+@code{show} determines whether to show the numeric value or the value
+label.  A value of 1 means to show the value, 2 to show the label, 3
+to show both, and 0 means to use the default specified in
+@code{show-values} (@pxref{SPV Light Member Formats}).
  
  @item 03
  A text string, in two forms: @code{c} is in English, and sometimes
@@ -1401,7 +1619,7 @@ nonempty.
  programming language identifier, e.g.@: @code{cumulative_percent} or
  @code{factor_14}.  It is not unique.
  
-@code{type} is 00 for text taken from user input, such as syntax
+@code{fixed} is 00 for text taken from user input, such as syntax
  fragments, expressions, file names, data set names, and 01 for fixed
  text strings such as names of procedures or statistics.  In the former
  case, @code{id} is always the empty string; in the latter case,
@@ -1414,22 +1632,30 @@ too interesting, and the corpus contains many clearly invalid formats
  like A16.39 or A255.127 or A134.1, so readers should probably ignore
  the format entirely.
  
-@code{s} is a value of variable @code{varname} and has value label
-@code{vallab}.  @code{varname} is never empty but @code{vallab} is
-commonly empty.
+@code{s} is a value of variable @code{var-name} and has value label
+@code{value-label}.  @code{var-name} is never empty but
+@code{value-label} is commonly empty.
  
-The meaning of the final byte is unknown.
+@code{show} has the same meaning as in the encoding for 02.
  
  @item 05
-Variable @code{varname}, which is rarely observed as empty in the
-corpus, with variable label @code{varlabel}, which is often empty.
+Variable @code{var-name}, which is rarely observed as empty in the
+corpus, with variable label @code{var-label}, which is often empty.
  
-The meaning of the final byte is unknown.
+@code{show} determines whether to show the variable name or the
+variable label.  A value of 1 means to show the name, 2 to show the
+label, 3 to show both, and 0 means to use the default specified in
+@code{show-variables} (@pxref{SPV Light Member Formats}).
  
-@item 31 or 58
-(These bytes begin a ValueMod.)  A format string, analogous to
-@code{printf}, followed by one or more Arguments, each of which has
-one or more values.  The format string uses the following syntax:
+@item otherwise
+When the first byte of a RawValue is not one of the above, the
+RawValue starts with a ValueMod, whose syntax is described in the next
+section.  (A ValueMod always begins with byte 31 or 58.)
+
+This case is a template string, analogous to @code{printf}, followed
+by one or more Arguments, each of which has one or more values.  The
+template string is copied directly into the output except for the
+following special syntax:
  
  @table @code
  @item \%
@@ -1437,7 +1663,7 @@ one or more values.  The format string uses the following syntax:
  @itemx \[
  @itemx \]
  Each of these expands to the character following @samp{\\}, to escape
-characters that have special meaning in format strings.  These are
+characters that have special meaning in template strings.  These are
  effective inside and outside the @code{[@dots{}]}  syntax forms
  described below.
  
@@ -1492,74 +1718,79 @@ Given appropriate values, expands to @code{1, 2, 3}.
  @end table
  @end table
  
-The format string is localized to the user's locale.
+The template string is localized to the user's locale.
  @end table
  
+A writer may safely omit all of the optional 00 bytes at the beginning
+of a Value, except that it should write a single 00 byte before a
+templated Value.
+
  @node SPV Light Member ValueMod
  @subsection ValueMod
  
  A ValueMod can specify special modifications to a Value.
  
-@cartouche
-@format
-ValueMod @result{}
-    31 i0 (i0 @math{|} i1 string[@t{subscript}])
-    v1(00 (i1 @math{|} i2) 00 00 int 00 00)
-    v3(count(FormatString StylePair))
-  @math{|} 31 int[@t{n-refs}] int16*[@t{n-refs}] Format
-  @math{|} 58
-
-Format @result{} 00 00 count(FormatString Style 58)
-FormatString @result{} count((count((i0 58)?) (58 @math{|} 31 string))?)
-
-StylePair @result{}
-    (31 Style | 58)
-    (31 Style2 | 58)
-
-Style @result{}
-    bool[@t{bold}] bool[@t{italic}] bool[@t{underline}] bool[@t{show}]
-    string[@t{fgcolor}] string[@t{bgcolor}]
-    string[@t{typeface}] byte[@t{size}]
-
-Style2 @result{}
-    int[@t{halign}] int[@t{valign}] double[@t{offset}]
-    int16[@t{left-margin}] int16[@t{right-margin}]
-    int16[@t{top-margin}] int16[@t{bottom-margin}]
-@end format
-@end cartouche
-
-A ValueMod that begins with ``31 i0'' specifies a string to append to
-the main text of the Value, as a subscript.  The subscript text is a
-brief indicator, e.g.@: @samp{a} or @samp{a,b}, with its meaning
-indicated by the table caption.  In this usage, subscripts are similar
-to footnotes.  One apparent difference is that a Value can only
-reference one footnote but a subscript can list more than one letter.
-
-A ValueMod that begins with 31 followed by a nonzero ``int'' specifies
-a footnote or footnotes that the Value references.  Footnote markers
-are shown appended to the main text of the Value, as superscripts.
-
-The Format, if present, is a format string for substitutions using the
-syntax explained previously.  It appears to be an English-language
-version of the localized format string in the Value in which the
-Format is nested.
-
-Style and Style2, if present, change the style for this individual
-Value.  @code{bold}, @code{italic}, and @code{underline} control the
-particular style.  @code{fgcolor} and @code{bgcolor} are strings, such
-as @code{#ffffff}.  The @code{size} is a font size in units of 1/96
-inch.
-
-@code{halign} is 0 for center, 2 for left, 4 for right, 6 for decimal,
-0xffffffad for mixed.  For decimal alignment, @code{offset} is the
-decimal point's offset from the right side of the cell, in units of
-1/72 inch.
+@example
+ValueMod =>
+    58
+  @math{|} 31
+    int32[n-refs] int16*[n-refs]
+    (i0 | i1 string[subscript])
+    v1(00 (i1 | i2) 00? 00? int32 00? 00?)
+    v3(count(TemplateString StylePair))
+
+TemplateString => count((count((i0 58)?) (58 @math{|} 31 string[id]))?)
+
+StylePair =>
+    (31 FontStyle | 58)
+    (31 CellStyle | 58)
+
+FontStyle =>
+    bool[bold] bool[italic] bool[underline] bool[show]
+    string[fg-color] string[bg-color]
+    string[typeface] byte[size]
+
+CellStyle =>
+    int32[halign] int32[valign] double[decimal-offset]
+    int16[left-margin] int16[right-margin]
+    int16[top-margin] int16[bottom-margin]
+@end example
  
+A ValueMod that begins with ``31'' specifies special modifications to
+a Value.
+
+Each of the @code{n-refs} integers is a reference to a Footnote
+(@pxref{SPV Light Member Footnotes}) by 0-based index.  Footnote
+markers are shown appended to the main text of the Value, as
+superscripts.
+
+The @code{subscript}, if present, is a string to append to the main
+text of the Value, as a subscript.  The subscript text is a brief
+indicator, e.g.@: @samp{a} or @samp{a,b}, with its meaning indicated
+by the table caption.
+
+The @code{id} inside the TemplateString, if present, is a template
+string for substitutions using the syntax explained previously.  It
+appears to be an English-language version of the localized template
+string in the Value in which the TemplateString is nested.  A writer may
+safely omit the optional fixed data in TemplateString.
+
+FontStyle and CellStyle, if present, change the style for this
+individual Value.  In FontStyle, @code{bold}, @code{italic}, and
+@code{underline} control the particular style.  @code{show} is
+ordinarily 1; if it is 0, then the cell data is not shown.
+@code{fg-color} and @code{bg-color} are strings in the format
+@code{#rrggbb}, e.g.@: @code{#ff0000} for red or @code{#ffffff} for
+white.  The empty string is occasionally observed also.  The
+@code{size} is a font size in units of 1/128 inch.
+
+In CellStyle, @code{halign} is 0 for center, 2 for left, 4 for right,
+6 for decimal, 0xffffffad for mixed.  For decimal alignment,
+@code{decimal-offset} is the decimal point's offset from the right
+side of the cell, in pt (@pxref{SPV Light Detail Member Format}).
  @code{valign} specifies vertical alignment: 0 for center, 1 for top, 3
-for bottom.
-
-@code{left-margin}, @code{right-margin}, @code{top-margin}, and
-@code{bottom-margin} are in units of 1/72 inch.
+for bottom.  @code{left-margin}, @code{right-margin},
+@code{top-margin}, and @code{bottom-margin} are in pt.
  
  @node SPV Legacy Detail Member Binary Format
  @section Legacy Detail Member Binary Format
@@ -1585,13 +1816,13 @@ In a version 0xb0 legacy member, @var{x}; in other versions, nothing.
  
  A legacy detail member @file{.bin} has the following overall format:
  
-@cartouche
-@format
-LegacyBinary @result{}
-    00 byte[@t{version}] int16[@t{n-sources}] int[@t{member-size}]
-    Metadata*[@t{n-sources}] Data*[@t{n-sources}]
-@end format
-@end cartouche
+@example
+LegacyBinary =>
+    00 byte[version] int16[n-sources] int32[member-size]
+    Metadata*[n-sources]
+    #Data*[n-sources]
+    #Strings?
+@end example
  
  @code{version} is a version number that affects the interpretation of
  some of the other data in the member.  Versions 0xaf and 0xb0 are
@@ -1603,105 +1834,120 @@ which has Metadata and Data.
  
  @code{member-size} is the size of the legacy binary member, in bytes.
  
-The following sections go into more detail.
+The Data and Strings above are commented out because the Metadata has
+some oddities that mean that the Data sometimes seems to start at
+an unexpected place.  The following section goes into detail.
  
  @menu
  * SPV Legacy Member Metadata::
-* SPV Legacy Member Data::
+* SPV Legacy Member Numeric Data::
+* SPV Legacy Member String Data::
  @end menu
  
  @node SPV Legacy Member Metadata
  @subsection Metadata
  
-@cartouche
-@format
-Metadata @result{}
-    int[@t{n-data}] int[@t{n-variables}] int[@t{offset}]
-    vAF(byte*32[@t{source-name}])
-    vB0(byte*64[@t{source-name}] int[@t{x}])
-@end format
-@end cartouche
+@example
+Metadata =>
+    int32[n-values] int32[n-variables] int32[data-offset]
+    vAF(byte*28[source-name])
+    vB0(byte*64[source-name] int32[x])
+@end example
  
  A data source has @code{n-variables} variables, each with
-@code{n-data} data values.
+@code{n-values} data values.
  
-@code{source-name} is a 32- or 64-byte string padded on the right with
-zero bytes.  The names that appear in the corpus are very generic:
+@code{source-name} is a 28- or 64-byte string padded on the right with
+0-bytes.  The names that appear in the corpus are very generic:
  usually @code{tableData} for pivot table data or @code{source0} for
  chart data.
  
-A given Metadata's @code{offset} is the offset, in bytes, from the
-beginning of the member to the start of the corresponding Data.  This
-allows programs to skip to the beginning of the data for a particular
-source; it is also important to determine whether a source includes
-any string data (@pxref{SPV Legacy Member Data}).
+A given Metadata's @code{data-offset} is the offset, in bytes, from
+the beginning of the member to the start of the corresponding Data.
+This allows programs to skip to the beginning of the data for a
+particular source.  In every case in the corpus, the Data follow the
+Metadata in the same order, but it is important to use
+@code{data-offset} instead of reading sequentially through the file
+because of the exception described below.
+
+One SPV file in the corpus has legacy binary members with version 0xb0
+but a 28-byte @code{source-name} field (and only a single source).  In
+practice, this means that the 64-byte @code{source-name} used in
+version 0xb0 has a lot of 0-bytes in the middle followed by the
+@code{variable-name} of the following Data.  As long as a reader
+treats the first 0-byte in the @code{source-name} as terminating the
+string, it can properly interpret these members.
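
A minimal sketch of the reading rule just described, assuming the reader has already sliced the fixed-size @code{source-name} field out of the Metadata.  The function name @code{decode_source_name} is hypothetical, and decoding the name as ASCII is an assumption based on the generic names seen in the corpus.

```python
def decode_source_name(field: bytes) -> str:
    """Return the source name stored in a fixed-size, 0-padded field.

    Everything from the first 0-byte onward is ignored, so a 64-byte
    field that really holds a 28-byte name followed by the start of the
    next Data's variable-name still decodes correctly.
    """
    name, _, _ = field.partition(b"\0")
    # ASCII is an assumption; undecodable bytes are replaced, not fatal.
    return name.decode("ascii", errors="replace")


# A 64-byte field laid out like the odd 0xb0 member described above:
# a 28-byte source-name followed by the next variable-name.
field = b"tableData".ljust(28, b"\0") + b"cell".ljust(36, b"\0")
print(decode_source_name(field))
```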
  
  The meaning of @code{x} in version 0xb0 is unknown.
  
-@node SPV Legacy Member Data
-@subsection Data
+@node SPV Legacy Member Numeric Data
+@subsection Numeric Data
  
-@cartouche
-@format
-Data @result{} NumericData*[@t{n-variables}] StringData?
-NumericData @result{} byte*288[@t{variable-name}] double*[@t{n-data}]
-@end format
-@end cartouche
+@example
+Data => Variable*[n-variables]
+Variable => byte*288[variable-name] double*[n-values]
+@end example
  
  Data follow the Metadata in the legacy binary format, with sources in
-the same order.  Each NumericSeries begins with a @code{variable-name}
-that generally indicates its role in the pivot table, e.g.@: ``cell'',
-``cellFormat'', ``dimension0categories'', ``dimension0group0'',
-followed by the numeric data, one double per datum.  A double with the
-maximum negative double @code{-DBL_MAX} represents the system-missing
-value SYSMIS.
-
-@cartouche
-@format
-StringData @result{} i1 string[@t{source-name}] Pairs Labels
-
-Pairs @result{} int[@t{n-string-vars}] PairSeries*[@t{n-string-vars}]
-PairVar @result{} string[@t{pair-var-name}] int[@t{n-pairs}] Pair*[@t{n-pairs}]
-Pair @result{} int[@t{i}] int[@t{j}]
-
-Labels @result{} int[@t{n-labels}] Label*[@t{n-labels}]
-Label @result{} int[@t{frequency}] int[@t{s}]
-@end format
-@end cartouche
-
-A source may include a mix of numeric and string data values.  When a
-source includes any string data, the data values that are strings are
-set to SYSMIS in the NumericData, and StringData follows the
-NumericData.  A source that contains no string data omits the
-StringData.  To reliably determine whether a source includes
-StringData, the reader should check whether the offset following the
-NumericData is the offset of the next source, as indicated by its
-Metadata (or the end of the member, in the case of the last source).
-
-StringData repeats the name of the source (from Metadata).
-
-The string data overlays the numeric data.  @code{n-string-vars} is
-the number of variables in the source that include string data.  More
-precisely, it is the 1-based index of the last variable in the source
-that includes any string data; thus, it would be 4 if there are 5
-variables and only the fourth one includes string data.
-
-Each PairVar consists a sequence of 0 or more Pair nonterminals, each
-of which maps from a 0-based index within variable @code{i} to a
-0-based label index @code{j}, e.g.@: pair @code{i} = 2, @code{j} = 3,
-means that the third data value (with value SYSMIS) is to be replaced
-by the string of the fourth Label.
+the same order (but readers should use the @code{data-offset} in
+Metadata records, rather than reading sequentially).  Each Variable
+begins with a @code{variable-name} that generally indicates its role
+in the pivot table, e.g.@: ``cell'', ``cellFormat'',
+``dimension0categories'', ``dimension0group0'', followed by the
+numeric data, one double per datum.  A double with the maximum
+negative double @code{-DBL_MAX} represents the system-missing value
+SYSMIS.
+
+@node SPV Legacy Member String Data
+@subsection String Data
+
+@example
+Strings => SourceMaps[maps] Labels
+
+SourceMaps => int32[n-maps] SourceMap*[n-maps]
+
+SourceMap => string[source-name] int32[n-variables] VariableMap*[n-variables]
+VariableMap => string[variable-name] int32[n-data] DatumMap*[n-data]
+DatumMap => int32[value-idx] int32[label-idx]
+
+Labels => int32[n-labels] Label*[n-labels]
+Label => int32[frequency] string[label]
+@end example
+
+Each variable may include a mix of numeric and string data values.  If
+a legacy binary member contains any string data, Strings is present;
+otherwise, it ends just after the last Data element.
+
+The string data overlays the numeric data.  When a variable includes
+any string data, its Variable represents the string values with a
+SYSMIS or NaN placeholder.  (Not all such values need be
+placeholders.)
+
+Each SourceMap provides a mapping between SYSMIS or NaN values in source
+@code{source-name} and the string data that they represent.
+@code{n-variables} is the number of variables in the source that
+include string data.  More precisely, it is the 1-based index of the
+last variable in the source that includes any string data; thus, it
+would be 4 if there are 5 variables and only the fourth one includes
+string data.
+
+A VariableMap repeats its variable's name, but variables are always
+present in the same order as the source, starting from the first
+variable, without skipping any even if they have no string values.
+Each VariableMap contains DatumMap nonterminals, each of which maps
+from a 0-based index within its variable's data to a 0-based label
+index, e.g.@: pair @code{value-idx} = 2, @code{label-idx} = 3, means
+that the third data value (which must be SYSMIS or NaN) is to be
+replaced by the string of the fourth Label.
  
  The labels themselves follow the pairs.  The valuable part of each
-label is the string @code{s}.  Each label also includes a
-@code{frequency} that reports the number of pairs that reference it
-(although this is not useful).
+label is the string @code{label}.  Each label also includes a
+@code{frequency} that reports the number of DatumMaps that reference
+it (although this is not useful).
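
The overlay of string data on numeric data can be sketched as follows.  This is an illustration under stated assumptions, not PSPP's implementation: the function name @code{apply_datum_maps} is hypothetical, a VariableMap is modeled as a list of @code{(value_idx, label_idx)} tuples, and Labels as a plain list of strings.

```python
import math
import sys

# SYSMIS is the maximum negative double, -DBL_MAX.
SYSMIS = -sys.float_info.max


def apply_datum_maps(values, datum_maps, labels):
    """Replace placeholder values in one variable with label strings.

    values      -- the doubles read from the Variable
    datum_maps  -- (value_idx, label_idx) pairs from its VariableMap
    labels      -- label strings from Labels, indexed from 0
    """
    result = list(values)
    for value_idx, label_idx in datum_maps:
        placeholder = values[value_idx]
        if not (placeholder == SYSMIS or math.isnan(placeholder)):
            raise ValueError(f"value {value_idx} is not SYSMIS or NaN")
        result[value_idx] = labels[label_idx]
    return result


values = [1.0, 2.0, SYSMIS, float("nan")]
labels = ["a", "b", "c", "d"]
# value-idx = 2, label-idx = 3: the third value becomes the fourth label.
print(apply_datum_maps(values, [(2, 3)], labels))
```

Placeholders that no DatumMap references are left untouched, matching the note that not all such values need be placeholders for strings.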
  
  @node SPV Legacy Detail Member XML Format
  @section Legacy Detail Member XML Format
  
-This format is still under investigation.
-
  The design of the detail XML format is not what one would end up with
  for describing pivot tables.  This is because it is a special case
  of a much more general format (``visualization XML'' or ``VizML'')
@@ -1709,207 +1955,379 @@ that can describe a wide range of visualizations.  Most of this
  generality is overkill for tables, and so we end up with a funny
  subset of a general-purpose format.
  
+An XML Schema for VizML is available, distributed with SPSS binaries,
+under a nonfree license.  It contains documentation that is
+occasionally helpful.
+
+See @file{src/output/spv/detail-xml.grammar} in the PSPP source tree
+for the full grammar that PSPP uses to parse this format.
+
  The important elements of the detail XML format are:
  
  @itemize @bullet
  @item
-Variables.  Variables in detail XML roughly correspond to the
-dimensions in a light detail member.  There is one variable for each
-dimension, plus one variable for each level of labeling along an axis.
-
-The bulk of variables are defined with @code{sourceVariable} elements.
-The data for these variables comes from the associated
-@code{tableData.bin} member.  Some variables are defined, with
-@code{derivedVariable} elements, as a constant or in terms of a
-mapping function from a source variable.
+Variables.  @xref{SPV Detail Variable Elements}.
  
  @item
  Assignment of variables to axes.  A variable can appear as columns, or
  rows, or layers.  The @code{faceting} element and its sub-elements
  describe this assignment.
+
+@item
+Styles and other annotations.
  @end itemize
  
-All elements have an optional @code{id} attribute.  In practice many
-elements are assigned @code{id} attributes that are never referenced.
+This description is not detailed enough to write legacy tables.
+Instead, write tables in the light binary format.
  
  @menu
  * SPV Detail visualization Element::
-* SPV Detail userSource Element::
-* SPV Detail sourceVariable Element::
-* SPV Detail derivedVariable Element::
+* SPV Detail Variable Elements::
  * SPV Detail extension Element::
  * SPV Detail graph Element::
  * SPV Detail location Element::
-* SPV Detail coordinates Element::
  * SPV Detail faceting Element::
  * SPV Detail facetLayout Element::
+* SPV Detail label Element::
+* SPV Detail setCellProperties Element::
+* SPV Detail setFormat Element::
+* SPV Detail interval Element::
  * SPV Detail style Element::
+* SPV Detail labelFrame Element::
+* SPV Detail Legacy Properties::
  @end menu
  
  @node SPV Detail visualization Element
  @subsection The @code{visualization} Element
  
-@format
-Parent: Document root
-Contents:
-     extension?
-     userSource
-     (sourceVariable @math{|} derivedVariable)@math{+}
-     graph
-     labelFrame@math{+}
-     container?
-     style@math{+}
-     layerController?
-@end format
+@example
+visualization
+   :creator
+   :date
+   :lang
+   :name
+   :style[style_ref]=ref style
+   :type
+   :version
+   :schemaLocation?
+=> visualization_extension?
+   userSource
+   (sourceVariable | derivedVariable)+
+   categoricalDomain?
+   graph
+   labelFrame[lf1]*
+   container?
+   labelFrame[lf2]*
+   style+
+   layerController?
+
+extension[visualization_extension]
+   :numRows=int?
+   :showGridline=bool?
+   :minWidthSet=(true)?
+   :maxWidthSet=(true)?
+=> EMPTY
+
+userSource :missing=(listwise | pairwise)? => EMPTY
+
+categoricalDomain => variableReference simpleSort
+
+simpleSort :method[sort_method]=(custom) => categoryOrder
+
+container :style=ref style => container_extension? location+ labelFrame*
+
+extension[container_extension] :combinedFootnotes=(true) => EMPTY
+
+layerController
+   :source=(tableData)
+   :target=ref label?
+=> EMPTY
+@end example
  
-This element has the following attributes.
+The @code{visualization} element is the root of a detail XML member.  It
+has the following attributes:
  
-@defvr {Required} creator
+@defvr {Attribute} creator
  The version of the software that created this SPV file, as a string of
  the form @code{xxyyzz}, which represents software version xx.yy.zz,
  e.g.@: @code{160001} is version 16.0.1.  The corpus includes major
  versions 16 through 19.
  @end defvr
  
-@defvr {Required} date
+@defvr {Attribute} date
  The date on which the file was created, as a string of the form
  @code{YYYY-MM-DD}.
  @end defvr
  
-@defvr {Required} lang
+@defvr {Attribute} lang
  The locale used for output, in Windows format, which is similar to the
  format used in Unix with the underscore replaced by a hyphen, e.g.@:
  @code{en-US}, @code{en-GB}, @code{el-GR}, @code{sr-Cyrl-RS}.
  @end defvr
  
-@defvr {Required} name
+@defvr {Attribute} name
  The title of the pivot table, localized to the output language.
  @end defvr
  
-@defvr {Required} style
-The @code{id} of a @code{style} element (@pxref{SPV Detail style
-Element}).  This is the base style for the entire pivot table.  In
-every example in the corpus, the value is @code{visualizationStyle}
-and the corresponding @code{style} element has no attributes other
-than @code{id}.
+@defvr {Attribute} style
+The base style for the pivot table.  In every example in the corpus,
+the @code{style} element has no attributes other than @code{id}.
  @end defvr
  
-@defvr {Required} type
+@defvr {Attribute} type
  A floating-point number.  The meaning is unknown.
  @end defvr
  
-@defvr {Required} version
+@defvr {Attribute} version
  The visualization schema version number.  In the corpus, the value is
  one of 2.4, 2.5, 2.7, and 2.8.
  @end defvr
  
-@node SPV Detail userSource Element
-@subsection The @code{userSource} Element
+The @code{userSource} element has no visible effect.
  
-Parent: @code{visualization} @*
-Contents:
-
-This element has the following attributes.
+The @code{extension} element as a child of @code{visualization} has
+the following attributes.
  
-@defvr {Optional} missing
-Always @code{listwise}.
+@defvr {Attribute} numRows
+An integer that presumably defines the number of rows in the displayed
+pivot table.
  @end defvr
  
-@node SPV Detail sourceVariable Element
-@subsection The @code{sourceVariable} Element
-
-Parent: @code{visualization} @*
-Contents: @code{extension}* (@code{format} @math{|} @code{stringFormat})?
-
-This element defines a variable whose values can be used elsewhere in
-the visualization.  It ties this element's @code{id} to a variable
-from the @file{tableData.bin} member that corresponds to this
-@file{.xml}.
-
-This element has the following attributes.
-
-@defvr {Required} categorical
-Always set to @code{true}.
+@defvr {Attribute} showGridline
+Always set to @code{false} in the corpus.
  @end defvr
  
-@defvr {Required} source
-Always set to @code{tableData}, the @code{source-name} in the
-corresponding @file{tableData.bin} member (@pxref{SPV Legacy Member
-Metadata}).
+@defvr {Attribute} minWidthSet
+@defvrx {Attribute} maxWidthSet
+Always set to @code{true} in the corpus.
  @end defvr
  
-@defvr {Required} sourceName
-The name of a variable within the source, the @code{variable-name} in
-the corresponding @file{tableData.bin} member (@pxref{SPV Legacy
-Member Data}).
-@end defvr
+The @code{extension} element as a child of @code{container} has the
+following attribute:
  
-@defvr {Optional} dependsOn
-The @code{variable-name} of a variable linked to this one, so that a
-viewer can work with them together.  For a group variable, this is the
-name of the corresponding categorical variable.
+@defvr {Attribute} combinedFootnotes
+Meaning unknown.
  @end defvr
  
-@defvr {Optional} label
-The variable label, if any
-@end defvr
+The @code{categoricalDomain} and @code{simpleSort} elements have no
+visible effect.
  
-@defvr {Optional} labelVariable
-The @code{variable-name} of a variable whose string values correspond
-one-to-one with the values of this variable and are suitable for use
-as value labels.
-@end defvr
+The @code{layerController} element has no visible effect.
  
-@node SPV Detail derivedVariable Element
-@subsection The @code{derivedVariable} Element
+@node SPV Detail Variable Elements
+@subsection Variable Elements
  
-Parent: @code{visualization} @*
-Contents: @code{extension}* (@code{format} @math{|} @code{stringFormat} @code{valueMapEntry}*)
+A ``variable'' in detail XML is a 1-dimensional array of data.  Each
+element of the array may, independently, have string or numeric
+content.  All of the variables in a given detail XML member either
+have the same number of elements or have zero elements.
  
-Like @code{sourceVariable}, this element defines a variable whose
-values can be used elsewhere in the visualization.  Instead of being
-read from a data source, the variable's data are defined by a
-mathematical expression.
+Two different elements define variables and their content:
  
-This element has the following attributes.
+@table @code
+@item sourceVariable
+These variables' data comes from the associated @code{tableData.bin}
+member.
  
-@defvr {Required} categorical
-Always set to @code{true}.
-@end defvr
+@item derivedVariable
+These variables are defined in terms of a mapping function from a
+source variable, or they are empty.
+@end table
  
-@defvr {Required} value
-An expression that defines the variable's value.  In theory this could
-be an arbitrary expression in terms of constants, functions, and other
-variables, e.g.@: @math{(@var{var1} + @var{var2}) / 2}.  In practice,
-the corpus contains only the following forms of expressions:
+A variable named @code{cell} always exists.  This variable holds the
+data displayed in the table.
+
+Variables in detail XML roughly correspond to the dimensions in a
+light detail member.  Each dimension has the following variables with
+stylized names, where @var{n} is a number for the dimension starting
+from 0:
  
  @table @code
-@item constant(@var{number})
-@itemx constant(@var{variable})
-A constant.  The meaning when a variable is named is unknown.
-Sometimes the ``variable name'' has spaces in it.
+@item dimension@var{n}categories
+The dimension's leaf categories (@pxref{SPV Light Member Categories}).
+
+@item dimension@var{n}group0
+Present only if the dimension's categories are grouped, this variable
+holds the group labels for the categories.  Grouping is inferred
+through adjacent identical labels.  Categories that are not part of a
+group have empty-string data in this variable.
+
+@item dimension@var{n}group1
+Present only if the first-level groups are further grouped, this
+variable holds the labels for the second-level groups.  There can be
+additional variables with further levels of grouping.
+
+@item dimension@var{n}
+An empty variable.
+@end table
  
-@item map(@var{variable})
+Determining the data for a (non-empty) variable is a multi-step
+process:
+
+@enumerate
+@item
+Draw initial data from its source, for a @code{sourceVariable}, or
+from another named variable, for a @code{derivedVariable}.
+
+@item
+Apply mappings from @code{valueMapEntry} elements within the
+@code{derivedVariable} element, if any.
+
+@item
+Apply mappings from @code{relabel} elements within a @code{format} or
+@code{stringFormat} element in the @code{sourceVariable} or
+@code{derivedVariable} element, if any.
+
+@item
+If the variable is a @code{sourceVariable} with a @code{labelVariable}
+attribute, and there were no mappings to apply in previous steps, then
+replace each element of the variable by the corresponding value in the
+label variable.
+@end enumerate
+
+A single variable's data can be modified in two of the steps, if both
+@code{valueMapEntry} and @code{relabel} are used.  The following
+example from the corpus maps several integers to 2, then maps 2 in
+turn to the string ``Input'':
+
+@example
+<derivedVariable categorical="true" dependsOn="dimension0categories"
+                 id="dimension0group0map" value="map(dimension0group0)">
+  <stringFormat>
+    <relabel from="2" to="Input"/>
+    <relabel from="10" to="Missing Value Handling"/>
+    <relabel from="14" to="Resources"/>
+    <relabel from="0" to=""/>
+    <relabel from="1" to=""/>
+    <relabel from="13" to=""/>
+  </stringFormat>
+  <valueMapEntry from="2;3;5;6;7;8;9" to="2"/>
+  <valueMapEntry from="10;11" to="10"/>
+  <valueMapEntry from="14;15" to="14"/>
+  <valueMapEntry from="0" to="0"/>
+  <valueMapEntry from="1" to="1"/>
+  <valueMapEntry from="13" to="13"/>
+</derivedVariable>
+@end example
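
The two mapping steps in this example can be sketched in Python. This is a minimal sketch under stated assumptions, not PSPP's implementation: the helper names `load_mappings` and `apply_mappings` are hypothetical, values are compared as strings for simplicity, and values with no matching entry are assumed to pass through unchanged.

```python
import xml.etree.ElementTree as ET

XML = """
<derivedVariable categorical="true" dependsOn="dimension0categories"
                 id="dimension0group0map" value="map(dimension0group0)">
  <stringFormat>
    <relabel from="2" to="Input"/>
    <relabel from="10" to="Missing Value Handling"/>
    <relabel from="14" to="Resources"/>
    <relabel from="0" to=""/>
    <relabel from="1" to=""/>
    <relabel from="13" to=""/>
  </stringFormat>
  <valueMapEntry from="2;3;5;6;7;8;9" to="2"/>
  <valueMapEntry from="10;11" to="10"/>
  <valueMapEntry from="14;15" to="14"/>
  <valueMapEntry from="0" to="0"/>
  <valueMapEntry from="1" to="1"/>
  <valueMapEntry from="13" to="13"/>
</derivedVariable>
"""


def load_mappings(elem):
    """Collect the valueMapEntry and relabel mappings of a variable element."""
    value_map = {}
    for entry in elem.findall("valueMapEntry"):
        # "from" may hold several source values separated by semicolons.
        for src in entry.get("from").split(";"):
            value_map[src] = entry.get("to")
    relabels = {}
    for tag in ("format", "stringFormat"):
        for fmt in elem.findall(tag):
            for relabel in fmt.findall("relabel"):
                relabels[relabel.get("from")] = relabel.get("to")
    return value_map, relabels


def apply_mappings(values, value_map, relabels):
    """Apply valueMapEntry mappings first, then relabel mappings."""
    mapped = [value_map.get(v, v) for v in values]
    return [relabels.get(v, v) for v in mapped]


value_map, relabels = load_mappings(ET.fromstring(XML))
print(apply_mappings(["3", "11", "0", "15"], value_map, relabels))
```

Here `3` first maps to `2` and then to ``Input'', matching the description above.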
+
+@menu
+* SPV Detail sourceVariable Element::
+* SPV Detail derivedVariable Element::
+* SPV Detail valueMapEntry Element::
+@end menu
+
+@node SPV Detail sourceVariable Element
+@subsubsection The @code{sourceVariable} Element
+
+@example
+sourceVariable
+   :id
+   :categorical=(true)
+   :source
+   :domain=ref categoricalDomain?
+   :sourceName
+   :dependsOn=ref sourceVariable?
+   :label?
+   :labelVariable=ref sourceVariable?
+=> variable_extension* (format | stringFormat)?
+@end example
+
+This element defines a variable whose data comes from the
+@file{tableData.bin} member that corresponds to this @file{.xml}.
+
+This element has the following attributes.
+
+@defvr {Attribute} id
+An @code{id} is always present because this element exists to be
+referenced from other elements.
+@end defvr
+
+@defvr {Attribute} categorical
+Always set to @code{true}.
+@end defvr
+
+@defvr {Attribute} source
+Always set to @code{tableData}, the @code{source-name} in the
+corresponding @file{tableData.bin} member (@pxref{SPV Legacy Member
+Metadata}).
+@end defvr
+
+@defvr {Attribute} sourceName
+The name of a variable within the source, corresponding to the
+@code{variable-name} in the @file{tableData.bin} member (@pxref{SPV
+Legacy Member Numeric Data}).
+@end defvr
+
+@defvr {Attribute} label
+The variable label, if any.
+@end defvr
+
+@defvr {Attribute} labelVariable
+The @code{variable-name} of a variable whose string values correspond
+one-to-one with the values of this variable and are suitable for use
+as value labels.
+@end defvr
+
+@defvr {Attribute} dependsOn
+This attribute doesn't affect the display of a table.
+@end defvr
+
+@node SPV Detail derivedVariable Element
+@subsubsection The @code{derivedVariable} Element
+
+@example
+derivedVariable
+   :id
+   :categorical=(true)
+   :value
+   :dependsOn=ref sourceVariable?
+=> variable_extension* (format | stringFormat)? valueMapEntry*
+@end example
+
+Like @code{sourceVariable}, this element defines a variable whose
+values can be used elsewhere in the visualization.  Instead of being
+read from a data source, the variable's data are defined by a
+mathematical expression.
+
+This element has the following attributes.
+
+@defvr {Attribute} id
+An @code{id} is always present because this element exists to be
+referenced from other elements.
+@end defvr
+
+@defvr {Attribute} categorical
+Always set to @code{true}.
+@end defvr
+
+@defvr {Attribute} value
+An expression that defines the variable's value.  In theory this could
+be an arbitrary expression in terms of constants, functions, and other
+variables, e.g.@: @math{(@var{var1} + @var{var2}) / 2}.  In practice,
+the corpus contains only the following forms of expressions:
+
+@table @code
+@item constant(0)
+@itemx constant(@var{variable})
+All zeros.  The reason why a variable is sometimes named is unknown.
+Sometimes the ``variable name'' has spaces in it.
+
+@item map(@var{variable})
  Transforms the values in the named @var{variable} using the
  @code{valueMapEntry}s contained within the element.
  @end table
  @end defvr
  
-@defvr {Optional} dependsOn
-The @code{variable-name} of a variable linked to this one, so that a
-viewer can work with them together.  For a group variable, this is the
-name of the corresponding categorical variable.
+@defvr {Attribute} dependsOn
+This attribute doesn't affect the display of a table.
  @end defvr
  
-@menu
-* SPV Detail valueMapEntry Element::
-@end menu
-
  @node SPV Detail valueMapEntry Element
  @subsubsection The @code{valueMapEntry} Element
  
-Parent: @code{derivedVariable} @*
-Contents: empty
+@example
+valueMapEntry :from :to => EMPTY
+@end example
  
  A @code{valueMapEntry} element defines a mapping from one or more
  values of a source expression to a target value.  (In the corpus, the
@@ -1917,15 +2335,17 @@ source expression is always just the name of a variable.)  Each target
  value requires a separate @code{valueMapEntry}.  If multiple source
  values map to the same target value, they can be combined or separate.
  
+In the corpus, all of the source and target values are integers.
+
  @code{valueMapEntry} has the following attributes.
  
-@defvr {Required} from
+@defvr {Attribute} from
  A source value, or multiple source values separated by semicolons,
  e.g.@: @code{0} or @code{13;14;15;16}.
  @end defvr
  
-@defvr {Required} to
-The target value.
+@defvr {Attribute} to
+The target value, e.g.@: @code{0}.
  @end defvr
  
  @node SPV Detail extension Element
@@ -1937,36 +2357,25 @@ attributes on this element, and their meanings, vary based on the
  context.  Each known usage is described separately below.  The current
  extensions use attributes exclusively, without any nested elements.
  
-@subsubheading @code{visualization} Parent Element
-
-With @code{visualization} as its parent element, @code{extension} has
-the following attributes.
-
-@defvr {Optional} numRows
-An integer that presumably defines the number of rows in the displayed
-pivot table.
-@end defvr
-
-@defvr {Optional} showGridline
-Always set to @code{false} in the corpus.
-@end defvr
-
-@defvr {Optional} minWidthSet
-@defvrx {Optional} maxWidthSet
-Always set to @code{true} in the corpus.
-@end defvr
-
  @subsubheading @code{container} Parent Element
  
+@example
+extension[container_extension] :combinedFootnotes=(true) => EMPTY
+@end example
+
  With @code{container} as its parent element, @code{extension} has the
  following attributes.
  
-@defvr {Required} combinedFootnotes
+@defvr {Attribute} combinedFootnotes
  Always set to @code{true} in the corpus.
  @end defvr
  
  @subsubheading @code{sourceVariable} and @code{derivedVariable} Parent Element
  
+@example
+extension[variable_extension] :from :helpId => EMPTY
+@end example
+
  With @code{sourceVariable} or @code{derivedVariable} as its parent
  element, @code{extension} has the following attributes.  A given
  parent element often contains several @code{extension} elements that
@@ -1979,24 +2388,45 @@ specify the meaning of the source data's variables or sources, e.g.@:
  <extension from="5" helpId="corrected_total"/>
  @end example
  
-@defvr {Required} from
+More commonly they are less helpful, e.g.@:
+
+@example
+<extension from="0" helpId="notes"/>
+<extension from="1" helpId="notes"/>
+<extension from="2" helpId="notes"/>
+<extension from="5" helpId="notes"/>
+<extension from="6" helpId="notes"/>
+<extension from="7" helpId="notes"/>
+<extension from="8" helpId="notes"/>
+<extension from="12" helpId="notes"/>
+<extension from="13" helpId="no_help"/>
+<extension from="14" helpId="notes"/>
+@end example
+
+@defvr {Attribute} from
  An integer or a name like ``dimension0''.
  @end defvr
  
-@defvr {Required} helpId
+@defvr {Attribute} helpId
  An identifier.
  @end defvr
  
  @node SPV Detail graph Element
  @subsection The @code{graph} Element
  
-Parent: @code{visualization} @*
-Contents: @code{location}@math{+} @code{coordinates} @code{faceting} @code{facetLayout} @code{interval}
+@example
+graph
+   :cellStyle=ref style
+   :style=ref style
+=> location+ coordinates faceting facetLayout interval
+
+coordinates => EMPTY
+@end example
  
  @code{graph} has the following attributes.
  
-@defvr {Required} cellStyle
-@defvrx {Required} style
+@defvr {Attribute} cellStyle
+@defvrx {Attribute} style
  Each of these is the @code{id} of a @code{style} element (@pxref{SPV
  Detail style Element}).  The former is the default style for
  individual cells, the latter for the entire table.
@@ -2005,25 +2435,31 @@ individual cells, the latter for the entire table.
  @node SPV Detail location Element
  @subsection The @code{location} Element
  
-Parent: @code{graph} @*
-Contents: empty
+@example
+location
+   :part=(height | width | top | bottom | left | right)
+   :method=(sizeToContent | attach | fixed | same)
+   :min=dimension?
+   :max=dimension?
+   :target=ref (labelFrame | graph | container)?
+   :value?
+=> EMPTY
+@end example
  
  Each instance of this element specifies where some part of the table
  frame is located.  All the examples in the corpus have four instances
  of this element, one for each of the parts @code{height},
  @code{width}, @code{left}, and @code{top}.  Some examples in the
  corpus add a fifth for part @code{bottom}, even though it is not clear
-how all of @code{top}, @code{bottom}, and @code{heigth} can be honored
+how all of @code{top}, @code{bottom}, and @code{height} can be honored
  at the same time.  In any case, @code{location} seems to have little
  importance in representing tables; a reader can safely ignore it.
  
-@defvr {Required} part
-One of @code{height}, @code{width}, @code{top}, @code{bottom}, or
-@code{left}.  Presumably @code{right} is acceptable as well but the
-corpus contains no examples.
+@defvr {Attribute} part
+The part of the table being located.
  @end defvr
  
-@defvr {Required} method
+@defvr {Attribute} method
  How the location is determined:
  
  @table @code
@@ -2045,14 +2481,14 @@ Same as the specified @code{target}.  Observed only for part
  @end table
  @end defvr
  
-@defvr {Optional} min
+@defvr {Attribute} min
  Minimum size.  Only observed with value @code{100pt}.  Only observed
  for part @code{width}.
  @end defvr
  
  @defvr {Dependent} target
  Required when @code{method} is @code{attach} or @code{same}, not
-observed otherwise.  This is the ID of an element to attach to.
+observed otherwise.  This identifies an element to attach to.
  Observed with the ID of @code{title}, @code{footnote}, @code{graph},
  and other elements.
  @end defvr
@@ -2064,98 +2500,58 @@ on parts @code{top} and @code{left}, and @code{100%} on part
  @code{bottom}.
  @end defvr
  
-@node SPV Detail coordinates Element
-@subsection The @code{coordinates} Element
-
-Parent: @code{graph} @*
-Contents: empty
-
-This element is always present and always empty, with no attributes
-(except @code{id}).
-
  @node SPV Detail faceting Element
  @subsection The @code{faceting} Element
  
-Parent: @code{graph} @*
-Contents: @code{cross} @code{layer}*
-
-The @code{faceting} element describes the row, column, and layer
-structure of the table.  Its @code{cross} child determines the row and
-column structure, and each @code{layer} child (if any) represents a
-layer.
-
-@code{faceting} has no attributes (other than @code{id}).
-
-@subsubheading The @code{cross} Element
-
-Parent: @code{faceting} @*
-Contents: @code{nest} @code{nest}
-
-The @code{cross} element describes the row and column structure of the
-table.  It has exactly two @code{nest} children, the first of which
-describes the table's rows and the second the table's columns.
+@example
+faceting => layer[layers1]* cross layer[layers2]*
  
-@code{cross} has no attributes (other than @code{id}).
+cross => (unity | nest) (unity | nest)
  
-@subsubheading The @code{nest} Element
+unity => EMPTY
  
-Parent: @code{cross} @*
-Contents: @code{variableReference}@math{+}
+nest => variableReference[vars]+
  
-A given @code{nest} usually consists of one or more dimensions, each
-of which is represented by @code{variableReference} child elements.
-Minimally, a dimension has two @code{variableReference} children, one
-for the categories, one for the data, e.g.:
+variableReference :ref=ref (sourceVariable | derivedVariable) => EMPTY
  
-@example
-<nest>
-  <variableReference ref="dimension0categories"/>
-  <variableReference ref="dimension0"/>
-</nest>
+layer
+   :variable=ref (sourceVariable | derivedVariable)
+   :value
+   :visible=bool?
+   :method[layer_method]=(nest)?
+   :titleVisible=bool?
+=> EMPTY
  @end example
  
-@noindent
-Groups of categories introduce additional variable references, e.g.@:
+The @code{faceting} element describes the row, column, and layer
+structure of the table.  Its @code{cross} child determines the row and
+column structure, and each @code{layer} child (if any) represents a
+layer.  Layers may appear before or after @code{cross}.
  
-@example
-<nest>
-  <variableReference ref="dimension0categories"/>
-  <variableReference ref="dimension0group0"/>
-  <variableReference ref="dimension0"/>
-</nest>
-@end example
+The @code{cross} element describes the row and column structure of the
+table.  It has exactly two children, the first of which describes the
+table's columns and the second the table's rows.  Each child is a
+@code{nest} element if the table has any dimensions along the axis in
+question, otherwise a @code{unity} element.
  
-@noindent
-Grouping can be hierarchical, e.g.@:
+A @code{nest} element consists of one or more dimensions listed from
+innermost to outermost, each represented by @code{variableReference}
+child elements.  Each variable in a dimension is listed in order.
+@xref{SPV Detail Variable Elements}, for information on the variables
+that comprise a dimension.
  
-@example
-<nest>
-  <variableReference ref="dimension0categories"/>
-  <variableReference ref="dimension0group1"/>
-  <variableReference ref="dimension0group0"/>
-  <variableReference ref="dimension0"/>
-</nest>
-@end example
-
-@noindent
-XXX what are group maps?
+A @code{nest} can contain a single dimension, e.g.:
  
  @example
-<nest id="nest_1973">
-  <variableReference ref="dimension1categories"/>
-  <variableReference ref="dimension1group1map"/>
-  <variableReference ref="dimension1group0map"/>
-  <variableReference ref="dimension1"/>
-</nest>
  <nest>
    <variableReference ref="dimension0categories"/>
-  <variableReference ref="dimension0group0map"/>
+  <variableReference ref="dimension0group0"/>
    <variableReference ref="dimension0"/>
  </nest>
  @end example
  
  @noindent
-A @code{nest} can contain multiple dimensions:
+A @code{nest} can contain multiple dimensions, e.g.:
  
  @example
  <nest>
@@ -2167,34 +2563,17 @@ A @code{nest} can contain multiple dimensions:
  </nest>
  @end example
  
-One @code{nest} within a given @code{cross} may have no dimensions, in
-which case it still has one @code{variableReference} child, which
-references a @code{derivedVariable} whose @code{value} attribute is
+A @code{nest} may have no dimensions, in which case it still has one
+@code{variableReference} child, which references a
+@code{derivedVariable} whose @code{value} attribute is
  @code{constant(0)}.  In the corpus, such a @code{derivedVariable} has
-@code{row} or @code{column}, respectively, as its @code{id}.
-
-@code{nest} has no attributes (other than @code{id}).
-
-@subsubheading The @code{variableReference} Element
+@code{row} or @code{column}, respectively, as its @code{id}.  This is
+equivalent to using a @code{unity} element in place of @code{nest}.
  
-Parent: @code{nest} @*
-Contents: empty
+A @code{variableReference} element refers to a variable through its
+@code{ref} attribute.
  
-@code{variableReference} has one attribute.
-
-@defvr {Required} ref
-The @code{id} of a @code{sourceVariable} or @code{derivedVariable}
-element.
-@end defvr
-
-@subsubheading The @code{layer} Element
-
-Parent: @code{faceting} @*
-Contents: empty
-
-Each layer is represented by a pair of @code{layer} elements.  The
-first of this pair is for a category variable, the second for the data
-variable, e.g.:
+Each @code{layer} element represents a dimension, e.g.:
  
  @example
  <layer value="0" variable="dimension0categories" visible="true"/>
@@ -2204,53 +2583,78 @@ variable, e.g.:
  @noindent
  @code{layer} has the following attributes.
  
-@defvr {Required} variable
-The @code{id} of a @code{sourceVariable} or @code{derivedVariable}
-element.
+@defvr {Attribute} variable
+Refers to a @code{sourceVariable} or @code{derivedVariable} element.
  @end defvr
  
-@defvr {Required} value
+@defvr {Attribute} value
  The value to select.  For a category variable, this is always
  @code{0}; for a data variable, it is the same as the @code{variable}
  attribute.
  @end defvr
  
-@defvr {Optional} visible
+@defvr {Attribute} visible
  Whether the layer is visible.  Generally, category layers are visible
  and data layers are not, but sometimes this attribute is omitted.
  @end defvr
  
-@defvr {Optional} method
+@defvr {Attribute} method
  When present, this is always @code{nest}.
  @end defvr
  
  @node SPV Detail facetLayout Element
  @subsection The @code{facetLayout} Element
  
-Parent: @code{graph} @*
-Contents: @code{tableLayout} @code{facetLevel}@math{+} @code{setCellProperties}*
-
-@subsubheading The @code{tableLayout} Element
+@example
+facetLayout => tableLayout setCellProperties[scp1]*
+               facetLevel+ setCellProperties[scp2]*
+
+tableLayout
+   :verticalTitlesInCorner=bool
+   :style=ref style?
+   :fitCells=(ticks | both)?
+=> EMPTY
+@end example
+
+The @code{facetLayout} element and its descendants control styling for
+the table.
  
-Parent: @code{facetLayout} @*
-Contents: empty
+Its @code{tableLayout} child has the following attributes.
  
-@defvr {Required} verticalTitlesInCorner
-Always set to @code{true}.
+@defvr {Attribute} verticalTitlesInCorner
+If true, in the absence of corner text, row headings will be displayed
+in the corner.
  @end defvr
  
-@defvr {Optional} style
-The @code{id} of a @code{style} element.
+@defvr {Attribute} style
+Refers to a @code{style} element.
  @end defvr
  
-@defvr {Optional} fitCells
-Always set to @code{ticks}.
+@defvr {Attribute} fitCells
+Meaning unknown.
  @end defvr
  
  @subsubheading The @code{facetLevel} Element
  
-Parent: @code{facetLayout} @*
-Contents: @code{axis}
+@example
+facetLevel :level=int :gap=dimension? => axis
+
+axis :style=ref style => label? majorTicks
+
+majorTicks
+   :labelAngle=int
+   :length=dimension
+   :style=ref style
+   :tickFrameStyle=ref style
+   :labelFrequency=int?
+   :stagger=bool?
+=> gridline?
+
+gridline
+   :style=ref style
+   :zOrder=int
+=> EMPTY
+@end example
  
  Each @code{facetLevel} describes a @code{variableReference} or
  @code{layer}, and a table has one @code{facetLevel} element for
@@ -2267,7 +2671,7 @@ attribute, described below.  Second, in the corpus, a
  appended.  One should not formally rely on this, of course, but it is
  usefully indicative.
  
-@defvr {Required} level
+@defvr {Attribute} level
  A 1-based index into the @code{variableReference} and @code{layer}
  elements, e.g.@: a @code{facetLayout} with a @code{level} of 1
  describes the first @code{variableReference} in the SPV detail member,
@@ -2276,47 +2680,64 @@ and in a member with four @code{variableReference} elements, a
  @code{layer} in the member.
  @end defvr
  
-@defvr {Required} gap
+@defvr {Attribute} gap
  Always observed as @code{0pt}.
  @end defvr
  
-@subsubheading The @code{axis} Element
+Each @code{facetLevel} contains an @code{axis}, which in turn may
+contain a @code{label} for the @code{facetLevel} (@pxref{SPV Detail
+label Element}) and always contains a @code{majorTicks} element.
  
-Parent: @code{facetLevel} @*
-Contents: @code{label}? @code{majorTicks}
+@defvr {Attribute} labelAngle
+Normally 0.  The value -90 causes inner column or outer row labels to
+be rotated vertically.
+@end defvr
  
  @defvr {Attribute} style
-The @code{id} of a @code{style} element.
+@defvrx {Attribute} tickFrameStyle
+Each refers to a @code{style} element.  @code{style} is the style of
+the tick labels, @code{tickFrameStyle} the style for the frames around
+the labels.
  @end defvr
  
-@subsubheading The @code{label} Element
-
-Parent: @code{axis} or @code{labelFrame} @*
-Contents: @code{text}@math{+} @math{|} @code{descriptionGroup}
+@node SPV Detail label Element
+@subsection The @code{label} Element
  
-This element represents a label on some aspect of the table.  For example,
-the table's title is a @code{label}.
+@example
+label
+   :style=ref style
+   :textFrameStyle=ref style?
+   :purpose=(title | subTitle | subSubTitle | layer | footnote)?
+=> text+ | descriptionGroup
+
+descriptionGroup
+   :target=ref faceting
+   :separator?
+=> (description | text)+
+
+description :name=(variable | value) => EMPTY
+
+text
+   :usesReference=int?
+   :definesReference=int?
+   :position=(subscript | superscript)?
+   :style=ref style
+=> TEXT
+@end example
  
-The contents of the label can be one or more @code{text} elements or a
-@code{descriptionGroup}.
+This element represents a label on some aspect of the table.
  
  @defvr {Attribute} style
-@defvrx {Optional} textFrameStyle
-Each of these is the @code{id} of a @code{style} element.
-@code{style} is the style of the label text, @code{textFrameStyle} the
-style for the frame around the label.
+@defvrx {Attribute} textFrameStyle
+Each of these refers to a @code{style} element.  @code{style} is the
+style of the label text, @code{textFrameStyle} the style for the frame
+around the label.
  @end defvr
  
-@defvr {Optional} purpose
-The kind of entity being labeled, one of @code{title},
-@code{subTitle}, @code{layer}, or @code{footnote}.
+@defvr {Attribute} purpose
+The kind of entity being labeled.
  @end defvr
  
-@subsubheading The @code{descriptionGroup} Element
-
-Parent: @code{label} @*
-Contents: (@code{description} @math{|} @code{text})@math{+}
-
  A @code{descriptionGroup} concatenates one or more elements to form a
  label.  Each element can be a @code{text} element, which contains
  literal text, or a @code{description} element that substitutes a value
@@ -2342,241 +2763,378 @@ Typical contents for a @code{descriptionGroup} are a value by itself:
  <description name="variable"/><text>:</text><description name="value"/>
  @end example
  
-@subsubheading The @code{description} Element
-
-Parent: @code{descriptionGroup} @*
-Contents: empty
-
  A @code{description} is like a macro that expands to some property of
-the target of its parent @code{descriptionGroup}.
+the target of its parent @code{descriptionGroup}.  The @code{name}
+attribute specifies the property.
  
-@defvr {Attribute} name
-The name of the property.  Only @code{variable} and @code{value}
-appear in the corpus.
-@end defvr
+@node SPV Detail setCellProperties Element
+@subsection The @code{setCellProperties} Element
  
-@subsubheading The @code{majorTicks} Element
-
-Parent: @code{axis} @*
-Contents: @code{gridline}?
-
-@defvr {Attribute} labelAngle
-@defvrx {Attribute} length
-Both always defined to @code{0}.
-@end defvr
+@example
+setCellProperties
+   :applyToConverse=bool?
+=> (setStyle | setFrameStyle | setFormat | setMetaData)* union[union_]?
+@end example
  
-@defvr {Attribute} style
-@defvrx {Attribute} tickFrameStyle
-Each of these is the @code{id} of a @code{style} element.
-@code{style} is the style of the tick labels, @code{tickFrameStyle}
-the style for the frames around the labels.
-@end defvr
+The @code{setCellProperties} element sets style properties of cells or
+row or column labels.
  
-@subsubheading The @code{gridline} Element
+Interpreting @code{setCellProperties} requires answering two
+questions: which cells or labels to style, and what styles to use.
  
-Parent: @code{majorTicks} @*
-Contents: empty
+@subsubheading Which Cells?
  
-Represents ``gridlines,'' which for a table represents the lines
-between the rows or columns of a table (XXX?).
+@example
+union => intersect+
  
-@defvr {Attribute} style
-The style for the gridline.
-@end defvr
+intersect => where+ | intersectWhere | alternating | EMPTY
  
-@defvr {Attribute} zOrder
-Observed as a number between 28 and 31.  Does not seem to be
-important.
-@end defvr
+where
+   :variable=ref (sourceVariable | derivedVariable)
+   :include
+=> EMPTY
  
-@subsubheading The @code{setCellProperties} Element
+intersectWhere
+   :variable=ref (sourceVariable | derivedVariable)
+   :variable2=ref (sourceVariable | derivedVariable)
+=> EMPTY
  
-Parent: @code{facetLayout} @*
-Contents: @code{setMetaData} @code{setStyle}* @code{setFormat}@math{+} @code{union}?
+alternating => EMPTY
+@end example
  
-This element sets style properties of cells designated by the
-@code{target} attribute of its child elements, as further restricted
-by the optional @code{union} element if present.  The @code{target}
-values often used, e.g.@: @code{graph} or @code{labeling}, actually
-affect every cell, so the @code{union} element is a useful
-restriction.
+When @code{union} is present with @code{intersect} children, each of
+those children specifies a group of cells that should be styled, and
+the total group is all those cells taken together.  When @code{union}
+is absent, every cell is styled.  One attribute on
+@code{setCellProperties} affects the choice of cells:
  
-@defvr {Optional} applyToConverse
-If present, always @code{true}.  This appears to invert the meaning of
-the @code{target} of sub-elements: the selected cells are the ones
-@emph{not} designated by @code{target}.  This is confusing, given the
-additional restrictions of @code{union}, but in the corpus
+@defvr {Attribute} applyToConverse
+If true, this inverts the meaning of the cell selection: the selected
+cells are the ones @emph{not} designated.  This is confusing, given
+the additional restrictions of @code{union}, but in the corpus
  @code{applyToConverse} is never present along with @code{union}.
  @end defvr
  
-@subsubheading The @code{setMetaData} Element
+An @code{intersect} specifies restrictions on the cells to be matched.
+Each @code{where} child specifies which values of a given variable to
+include.  The attributes of @code{where} are:
  
-Parent: @code{setCellProperties} @*
-Contents: empty
-
-This element is not known to have any visible effect.
+@defvr {Attribute} variable
+Refers to a variable, e.g.@: @code{dimension0categories}.  Only
+``categories'' variables make sense here, but other variables, e.g.@:
+@code{dimension0group0map}, are sometimes seen.  The reader may ignore
+these.
+@end defvr
  
-@defvr {Required} target
-The @code{id} of an element whose metadata is to be set.  In the
-corpus, this is always @code{graph}, the @code{id} used for the
-@code{graph} element.
+@defvr {Attribute} include
+A value, or multiple values separated by semicolons,
+e.g.@: @code{0} or @code{13;14;15;16}.
  @end defvr
  
-@defvr {Required} key
-@defvrx {Required} value
-A key-value pair to set for the target.
+PSPP ignores @code{setCellProperties} when @code{intersectWhere} is
+present.
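
The selection rule described above (a union of intersections of @code{where} restrictions) can be sketched as follows. This is a hedged illustration rather than PSPP's code: the function names and the data layout are hypothetical, values are compared as strings, an empty @code{intersect} is assumed to match every cell, and @code{intersectWhere}, @code{alternating}, and @code{applyToConverse} are not modeled.

```python
def parse_include(include):
    """Split an include attribute such as "13;14;15;16" into a set of values."""
    return set(include.split(";"))


def cell_matches(cell, union):
    """Report whether a cell is selected for styling.

    `cell` maps variable names to the cell's value for that variable.
    `union` is None when the union element is absent; otherwise it is a
    list of intersects, each a list of (variable, include) pairs taken
    from the intersect's where children.
    """
    if union is None:
        return True  # Without a union, every cell is styled.
    return any(
        all(cell.get(var) in parse_include(inc) for var, inc in intersect)
        for intersect in union
    )


# Hypothetical selection: (dimension0 in {0, 1} and dimension1 == 2),
# or dimension0 == 5.
union = [
    [("dimension0categories", "0;1"), ("dimension1categories", "2")],
    [("dimension0categories", "5")],
]
print(cell_matches({"dimension0categories": "1",
                    "dimension1categories": "2"}, union))
```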
  
-In the corpus, @code{key} is @code{cellPropId} or, rarely,
-@code{diagProps}, and @code{value} is always the @code{id} of the
-parent @code{setCellProperties}.
-@end defvr
+@subsubheading What Styles?
+
+@example
+setStyle
+   :target=ref (labeling | graph | interval | majorTicks)
+   :style=ref style
+=> EMPTY
+
+setMetaData :target=ref graph :key :value => EMPTY
+
+setFormat
+   :target=ref (majorTicks | labeling)
+   :reset=bool?
+=> format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat
+
+setFrameStyle
+   :style=ref style
+   :target=ref majorTicks
+=> EMPTY
+@end example
+
+The @code{set*} children of @code{setCellProperties} determine the
+styles to set.
+
+When @code{setCellProperties} contains a @code{setFormat} whose
+@code{target} references a @code{labeling} element, or if it contains
+a @code{setStyle} that references a @code{labeling} or @code{interval}
+element, the @code{setCellProperties} sets the style for table cells.
+The format from the @code{setFormat}, if present, replaces the cells'
+format.  The style from the @code{setStyle} that references
+@code{labeling}, if present, replaces the label's font and cell
+styles, except that the background color is taken instead from the
+@code{interval}'s style, if present.
+
+When @code{setCellProperties} contains a @code{setFormat} whose
+@code{target} references a @code{majorTicks} element, or if it
+contains a @code{setStyle} whose @code{target} references a
+@code{majorTicks}, or if it contains a @code{setFrameStyle} element,
+the @code{setCellProperties} sets the style for row or column labels.
+In this case, the @code{setCellProperties} always contains a single
+@code{where} element whose @code{variable} designates the variable
+whose labels are to be styled.  The format from the @code{setFormat},
+if present, replaces the labels' format.  The style from the
+@code{setStyle} that references @code{majorTicks}, if present,
+replaces the labels' font and cell styles, except that the background
+color is taken instead from the @code{setFrameStyle}'s style, if
+present.
+
+When @code{setCellProperties} contains a @code{setStyle} whose
+@code{target} references a @code{graph} element, and one that
+references a @code{labeling} element, and the @code{union} element
+contains @code{alternating}, the @code{setCellProperties} sets the
+alternate foreground and background colors for the data area.  The
+foreground color is taken from the style referenced by the
+@code{setStyle} that targets the @code{graph}, the background color
+from the @code{setStyle} for @code{labeling}.
+
+A reader may ignore a @code{setCellProperties} that only contains
+@code{setMetaData}, as well as @code{setMetaData} within other
+@code{setCellProperties}.
+
+A reader may ignore a @code{setCellProperties} whose only @code{set*}
+child is a @code{setStyle} that targets the @code{graph} element.
  
  @subsubheading The @code{setStyle} Element
  
-Parent: @code{setCellProperties} @*
-Contents: empty
+@example
+setStyle
+   :target=ref (labeling | graph | interval | majorTicks)
+   :style=ref style
+=> EMPTY
+@end example
  
  This element associates a style with the target.
  
-@defvr {Required} target
-The @code{id} of an element whose style is to be set.  In the corpus,
-this is always the @code{id} of an @code{interval}, @code{labeling},
-or, rarely, @code{graph} element.
+@defvr {Attribute} target
+The @code{id} of an element whose style is to be set.
  @end defvr
  
-@defvr {Required} style
+@defvr {Attribute} style
  The @code{id} of a @code{style} element that identifies the style to
  set on the target.
  @end defvr
  
-@subsubheading The @code{setFormat} Element
+@node SPV Detail setFormat Element
+@subsection The @code{setFormat} Element
  
-@format
-Parent: @code{setCellProperties}
-Contents:
-    @code{format}
-  @math{|} @code{numberFormat}
-  @math{|} @code{stringFormat}@math{+}
-  @math{|} @code{dateTimeFormat}
-@end format
+@example
+setFormat
+   :target=ref (majorTicks | labeling)
+   :reset=bool?
+=> format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat
+@end example
  
  This element sets the format of the target, ``format'' in this case
  meaning the SPSS print format for a variable.
  
  The details of this element vary depending on the schema version, as
  declared in the root @code{visualization} element's @code{version}
-attribute (@pxref{SPV Detail visualization Element}).  In version 2.5
-and earlier, @code{setFormat} contains one of a number of child
-elements that correspond to the different varieties of print formats.
-In version 2.7 and later, @code{setFormat} instead always contains a
-@code{format} element.
-
-XXX reinvestigate the above claim about versions: it appears to be
-incorrect.
+attribute (@pxref{SPV Detail visualization Element}).  A reader can
+interpret the content without knowing the schema version.
  
  The @code{setFormat} element itself has the following attributes.
  
-@defvr {Required} target
-The @code{id} of an element whose style is to be set.  In the corpus,
-this is always the @code{id} of an @code{majorTicks} or
-@code{labeling} element.
+@defvr {Attribute} target
+Refers to an element whose format is to be set.
  @end defvr
  
-@defvr {Optional} reset
-If this is @code{true}, this format overrides the target's previous
-format.  If it is @code{false}, the adds to the previous format.  In
-the corpus this is always @code{true}.  The default behavior is
-unknown.
+@defvr {Attribute} reset
+If this is @code{true}, this format replaces the target's previous
+format.  If it is @code{false}, this format modifies the previous format.
  @end defvr
  
  @menu
-* SPV Detail format Element::
  * SPV Detail numberFormat Element::
  * SPV Detail stringFormat Element::
  * SPV Detail dateTimeFormat Element::
+* SPV Detail elapsedTimeFormat Element::
+* SPV Detail format Element::
  * SPV Detail affix Element::
-* SPV Detail relabel Element::
-* SPV Detail union Element::
  @end menu
  
-@node SPV Detail format Element
-@subsubsection The @code{format} Element
+@node SPV Detail numberFormat Element
+@subsubsection The @code{numberFormat} Element
  
-Parent: @code{sourceVariable}, @code{derivedVariable}, @code{formatMapping}, @code{labeling}, @code{formatMapping}, @code{setFormat} @*
-Contents: (@code{affix}@math{+} @math{|} @code{relabel}@math{+})?
+@example
+numberFormat
+   :minimumIntegerDigits=int?
+   :maximumFractionDigits=int?
+   :minimumFractionDigits=int?
+   :useGrouping=bool?
+   :scientific=(onlyForSmall | whenNeeded | true | false)?
+   :small=real?
+   :prefix?
+   :suffix?
+=> affix*
+@end example
  
-This element appears only in schema version 2.7 (@pxref{SPV Detail
-visualization Element}).
+Specifies a format for displaying a number.  The available options are
+a superset of those available from PSPP print formats.  PSPP chooses a
+print format type for a @code{numberFormat} as follows:
  
-This element determines a format, equivalent to an SPSS print format.
+@enumerate
+@item
+If @code{scientific} is @code{true}, uses @code{E} format.
  
-@subsubheading Attributes for All Formats
+@item
+If @code{prefix} is @code{$}, uses @code{DOLLAR} format.
  
-These attributes apply to all kinds of formats.  The most important of
-these attributes determines the high-level kind of formatting in use:
+@item
+If @code{suffix} is @code{%}, uses @code{PCT} format.
+
+@item
+If @code{useGrouping} is @code{true}, uses @code{COMMA} format.
  
-@defvr {Optional} baseFormat
-Either @code{dateTime} or @code{elapsedTime}.  When this attribute is
-omitted, this element is a numeric or string format.
+@item
+Otherwise, uses @code{F} format.
+@end enumerate
+
+For translating to a print format, PSPP uses
+@code{maximumFractionDigits} as the number of decimals, unless that
+attribute is missing or out of the range [0,15], in which case it uses
+2 decimals.
+
+@defvr {Attribute} minimumIntegerDigits
+Minimum number of digits to display before the decimal point.  Always
+observed as @code{0}.
  @end defvr
  
-@noindent
-Whether, in the corpus, other attributes are always present (``yes''),
-never present (``no''), or sometimes present (``opt'') depends on
-@code{baseFormat}:
-
-@multitable {maximumFractionDigits} {@code{dateTime}} {@code{elapsedTime}} {number} {string}
-@headitem Attribute @tab @code{dateTime} @tab @code{elapsedTime} @tab number @tab string
-@item errorCharacter        @tab yes @tab yes @tab yes @tab opt
-@item @w{ }
-@item separatorChars        @tab yes @tab  no @tab  no @tab no
-@item @w{ }
-@item mdyOrder              @tab yes @tab  no @tab  no @tab no
-@item @w{ }
-@item showYear              @tab yes @tab  no @tab  no @tab no
-@item yearAbbreviation      @tab yes @tab  no @tab  no @tab no
-@item @w{ }
-@item showMonth             @tab yes @tab  no @tab  no @tab no
-@item monthFormat           @tab yes @tab  no @tab  no @tab no
-@item @w{ }
-@item showDay               @tab yes @tab opt @tab  no @tab no
-@item dayPadding            @tab yes @tab opt @tab  no @tab no
-@item dayOfMonthPadding     @tab yes @tab  no @tab  no @tab no
-@item dayType               @tab yes @tab  no @tab  no @tab no
-@item @w{ }
-@item showHour              @tab yes @tab opt @tab  no @tab no
-@item hourFormat            @tab yes @tab opt @tab  no @tab no
-@item hourPadding           @tab yes @tab yes @tab  no @tab no
-@item @w{ }
-@item showMinute            @tab yes @tab yes @tab  no @tab no
-@item minutePadding         @tab yes @tab yes @tab  no @tab no
-@item @w{ }
-@item showSecond            @tab yes @tab yes @tab  no @tab no
-@item secondPadding         @tab  no @tab yes @tab  no @tab no
-@item @w{ }
-@item showMillis            @tab  no @tab yes @tab  no @tab no
-@item @w{ }
-@item minimumIntegerDigits  @tab  no @tab  no @tab yes @tab no
-@item maximumFractionDigits @tab  no @tab yes @tab yes @tab no
-@item minimumFractionDigits @tab  no @tab yes @tab yes @tab no
-@item useGrouping           @tab  no @tab opt @tab yes @tab no
-@item scientific            @tab  no @tab  no @tab yes @tab no
-@item small                 @tab  no @tab  no @tab opt @tab no
-@item suffix                @tab  no @tab  no @tab opt @tab no
-@item @w{ }
-@item tryStringsAsNumbers   @tab  no @tab  no @tab  no @tab yes
-@item @w{ }
-@end multitable
-
-@defvr {Attribute} errorCharacter
-A character that replaces the formatted value when it cannot otherwise
-be represented in the given format.  Always @samp{*}.
-@end defvr
-
-@subsubheading Date and Time Attributes
-
-These attributes are used with @code{dateTime} and @code{elapsedTime}
-formats or both.
+@defvr {Attribute} maximumFractionDigits
+@defvrx {Attribute} minimumFractionDigits
+Maximum or minimum, respectively, number of digits to display after
+the decimal point.  The observed values of each attribute range from 0
+to 9.
+@end defvr
+
+@defvr {Attribute} useGrouping
+Whether to use the grouping character to group digits in large
+numbers.
+@end defvr
+
+@defvr {Attribute} scientific
+This attribute controls when and whether the number is formatted in
+scientific notation.  It takes the following values:
+
+@table @code
+@item onlyForSmall
+Use scientific notation only when the number's magnitude is smaller
+than the value of the @code{small} attribute.
+
+@item whenNeeded
+Use scientific notation when the number will not otherwise fit in the
+available space.
+
+@item true
+Always use scientific notation.  Not observed in the corpus.
+
+@item false
+Never use scientific notation.  A number that won't otherwise fit will
+be replaced by an error indication (see the @code{errorCharacter}
+attribute).  Not observed in the corpus.
+@end table
+@end defvr
+
+@defvr {Attribute} small
+Only present when the @code{scientific} attribute is
+@code{onlyForSmall}, this is a numeric magnitude below which the
+number will be formatted in scientific notation.  The values @code{0}
+and @code{0.0001} have been observed.  The value @code{0} seems like a
+pathological choice, since no real number has a magnitude less than 0;
+perhaps in practice such a choice is equivalent to setting
+@code{scientific} to @code{false}.
+@end defvr
+
+@defvr {Attribute} prefix
+@defvrx {Attribute} suffix
+Specifies a prefix or a suffix to apply to the formatted number.  Only
+@code{suffix} has been observed, with value @samp{%}.
+@end defvr
+
+@node SPV Detail stringFormat Element
+@subsubsection The @code{stringFormat} Element
+
+@example
+stringFormat => relabel* affix*
+
+relabel :from=real :to => EMPTY
+@end example
+
+The @code{stringFormat} element specifies how to display a string.  By
+default, a string is displayed verbatim, but @code{relabel} can change
+it.
+
+The @code{relabel} element appears as a child of @code{stringFormat}
+(and of @code{format}, when it is used to format strings).  It
+specifies how to display a given value.  It is used to implement value
+labels and to display the system-missing value in a human-readable
+way.  It has the following attributes:
+
+@defvr {Attribute} from
+The value to map.  In the corpus this is an integer or the
+system-missing value @code{-1.797693134862316E300}.
+@end defvr
+
+@defvr {Attribute} to
+The string to display in place of the value of @code{from}.  In the
+corpus this is a wide variety of value labels; the system-missing
+value is mapped to @samp{.}.
+@end defvr
+
+@node SPV Detail dateTimeFormat Element
+@subsubsection The @code{dateTimeFormat} Element
+
+@example
+dateTimeFormat
+   :baseFormat[dt_base_format]=(date | time | dateTime)
+   :separatorChars?
+   :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
+   :showYear=bool?
+   :yearAbbreviation=bool?
+   :showQuarter=bool?
+   :quarterPrefix?
+   :quarterSuffix?
+   :showMonth=bool?
+   :monthFormat=(long | short | number | paddedNumber)?
+   :showWeek=bool?
+   :weekPadding=bool?
+   :weekSuffix?
+   :showDayOfWeek=bool?
+   :dayOfWeekAbbreviation=bool?
+   :dayPadding=bool?
+   :dayOfMonthPadding=bool?
+   :hourPadding=bool?
+   :minutePadding=bool?
+   :secondPadding=bool?
+   :showDay=bool?
+   :showHour=bool?
+   :showMinute=bool?
+   :showSecond=bool?
+   :showMillis=bool?
+   :dayType=(month | year)?
+   :hourFormat=(AMPM | AS_24 | AS_12)?
+=> affix*
+@end example
+
+This element appears only in schema version 2.5 and earlier
+(@pxref{SPV Detail visualization Element}).
+
+Data to be formatted in date formats is stored as strings in legacy
+data, in the format @code{yyyy-mm-ddTHH:MM:SS.SSS}, and must be parsed
+and reformatted by the reader.
+
+The following attribute is required.
+
+@defvr {Attribute} baseFormat
+Specifies whether a date and time are both to be displayed, or just
+one of them.
+@end defvr
+
+Many of the attributes' meanings are obvious.  The following seem to
+be worth documenting.
  
  @defvr {Attribute} separatorChars
  Exactly four characters.  In order, these are used for: decimal point,
@@ -2623,27 +3181,6 @@ Only values of @code{true} and @code{short}, respectively, have been
  observed.
  @end defvr
  
-@defvr {Attribute} dayPadding
-@defvrx {Attribute} dayOfMonthPadding
-@defvrx {Attribute} hourPadding
-@defvrx {Attribute} minutePadding
-@defvrx {Attribute} secondPadding
-These attributes presumably control whether each field in the output
-is padded with spaces to its maximum width, but the details are not
-understood.  The only observed value for any of these attributes is
-@code{true}.
-@end defvr
-
-@defvr {Attribute} showDay
-@defvrx {Attribute} showHour
-@defvrx {Attribute} showMinute
-@defvrx {Attribute} showSecond
-@defvrx {Attribute} showMillis
-These attributes presumably control whether each field is displayed
-in the output, but the details are not understood.  The only
-observed value for any of these attributes is @code{true}.
-@end defvr
-
  @defvr {Attribute} dayType
  This attribute is always @code{month} in the corpus, specifying that
  the day of the month is to be displayed; a value of @code{year} is
@@ -2676,189 +3213,165 @@ in the corpus, or it might indicate that @code{elapsedTime} is
  sometimes used to format a time of day.
  @end defvr
  
-@subsubheading Numeric Attributes
-
-These attributes are used for formats when @code{baseFormat} is
-@code{number}.  Attributes @code{maximumFractionDigits}, and
-@code{minimumFractionDigits}, and @code{useGrouping} are also used
-when @code{baseFormat} is @code{elapsedTime}.
+For a @code{baseFormat} of @code{date}, PSPP chooses a print format
+type based on the following rules:
  
-@defvr {Attribute} minimumIntegerDigits
-Minimum number of digits to display before the decimal point.  Always
-observed as @code{0}.
-@end defvr
-
-@defvr {Attribute} maximumFractionDigits
-@defvrx {Attribute} maximumFractionDigits
-Maximum or minimum, respectively, number of digits to display after
-the decimal point.  The observed values of each attribute range from 0
-to 9.
-@end defvr
-
-@defvr {Attribute} useGrouping
-Whether to use the grouping character to group digits in large
-numbers.  It would make sense for the grouping character to come from
-the @code{separatorChars} attribute, but that attribute is only
-present when @code{baseFormat} is @code{dateTime} or
-@code{elapsedTime}, in the corpus at least.  Perhaps that is because
-this attribute has only been observed as @code{false}.
-@end defvr
-
-@defvr {Attribute} scientific
-This attribute controls when and whether the number is formatted in
-scientific notation.  It takes the following values:
+@enumerate
+@item
+If @code{showQuarter} is true: @code{QYR}.
  
-@table @code
-@item onlyForSmall
-Use scientific notation only when the number's magnitude is smaller
-than the value of the @code{small} attribute.
+@item
+Otherwise, if @code{showWeek} is true: @code{WKYR}.
  
-@item whenNeeded
-Use scientific notation when the number will not otherwise fit in the
-available space.
+@item
+Otherwise, if @code{mdyOrder} is @code{dayMonthYear}:
  
-@item true
-Always use scientific notation.  Not observed in the corpus.
+@enumerate a
+@item
+If @code{monthFormat} is @code{number} or @code{paddedNumber}: @code{EDATE}.
  
-@item false
-Never use scientific notation.  A number that won't otherwise fit will
-be replaced by an error indication (see the @code{errorCharacter}
-attribute).  Not observed in the corpus.
-@end table
-@end defvr
+@item
+Otherwise: @code{DATE}.
+@end enumerate
  
-@defvr {Optional} small
-Only present when the @code{scientific} attribute is
-@code{onlyForSmall}, this is a numeric magnitude below which the
-number will be formatted in scientific notation.  The values @code{0}
-and @code{0.0001} have been observed.  The value @code{0} seems like a
-pathological choice, since no real number has a magnitude less than 0;
-perhaps in practice such a choice is equivalent to setting
-@code{scientific} to @code{false}.
-@end defvr
+@item
+Otherwise, if @code{mdyOrder} is @code{yearMonthDay}: @code{SDATE}.
  
-@defvr {Optional} prefix
-@defvrx {Optional} suffix
-Specifies a prefix or a suffix to apply to the formatted number.  Only
-@code{suffix} has been observed, with value @samp{%}.
-@end defvr
+@item
+Otherwise, @code{ADATE}.
+@end enumerate
  
-@subsubheading String Attributes
+For a @code{baseFormat} of @code{dateTime}, PSPP uses @code{YMDHMS} if
+@code{mdyOrder} is @code{yearMonthDay} and @code{DATETIME} otherwise.
+For a @code{baseFormat} of @code{time}, PSPP uses @code{DTIME} if
+@code{showDay} is true, otherwise @code{TIME} if @code{showHour} is
+true, otherwise @code{MTIME}.
  
-These attributes are used for formats when @code{baseFormat} is
-@code{string}.
+For a @code{baseFormat} of @code{date}, the chosen width is the
+minimum for the format type, adding 2 if @code{yearAbbreviation} is
+false or omitted.  For other base formats, the chosen width is the
+minimum for its type, plus 3 if @code{showSecond} is true, plus 4 more
+if @code{showMillis} is also true.  Decimals are 0 by default, or 3
+if @code{showMillis} is true.
  
-@defvr {Attribute} tryStringsAsNumbers
-When this is @code{true}, it is supposed to indicate that string
-values should be parsed as numbers and then displayed according to
-numeric formatting rules.  However, in the corpus it is always
-@code{false}.
-@end defvr
+@node SPV Detail elapsedTimeFormat Element
+@subsubsection The @code{elapsedTimeFormat} Element
  
-@node SPV Detail numberFormat Element
-@subsubsection The @code{numberFormat} Element
+@example
+elapsedTimeFormat
+   :baseFormat[dt_base_format]=(date | time | dateTime)
+   :dayPadding=bool?
+   :hourPadding=bool?
+   :minutePadding=bool?
+   :secondPadding=bool?
+   :showYear=bool?
+   :showDay=bool?
+   :showHour=bool?
+   :showMinute=bool?
+   :showSecond=bool?
+   :showMillis=bool?
+=> affix*
+@end example
  
-Parent: @code{setFormat} @*
-Contents: @code{affix}@math{+}
+This element specifies the way to display a time duration.
  
-This element appears only in schema version 2.5 and earlier
-(@pxref{SPV Detail visualization Element}).  Possibly this element
-could also contain @code{relabel} elements in a more diverse corpus.
+Data to be formatted in elapsed time formats is stored as strings in
+legacy data, in the format @code{H:MM:SS.SSS}, with additional hour
+digits as needed for long durations, and must be parsed and
+reformatted by the reader.
  
-This element has the following attributes.
+The following attribute is required.
  
-@defvr {Attribute} maximumFractionDigits
-@defvrx {Attribute} minimumFractionDigits
-@defvrx {Attribute} minimumIntegerDigits
-@defvrx {Optional} scientific
-@defvrx {Optional} small
-@defvrx {Optional} suffix
-@defvrx {Optional} useGroupging
-The syntax and meaning of these attributes is the same as on the
-@code{format} element for a numeric format.  @pxref{SPV Detail format
-Element}.
+@defvr {Attribute} baseFormat
+Specifies whether a day and a time are both to be displayed, or just
+one of them.
  @end defvr
  
-@node SPV Detail stringFormat Element
-@subsubsection The @code{stringFormat} Element
-
-Parent: @code{setFormat} @*
-Contents: (@code{affix}@math{+} @math{|} @code{relabel}@math{+})?
-
-This element appears only in schema version 2.5 and earlier
-(@pxref{SPV Detail visualization Element}).
-
-This element has no attributes.
-
-@node SPV Detail dateTimeFormat Element
-@subsubsection The @code{dateTimeFormat} Element
+The remaining attributes specify exactly how to display the elapsed
+time.
  
-Parent: @code{setFormat} @*
-Contents: empty
+For @code{baseFormat} of @code{time}, PSPP converts this element to
+print format type @code{DTIME}; otherwise, if @code{showHour} is true,
+to @code{TIME}; otherwise, to @code{MTIME}.  The chosen width is the
+minimum for the chosen type, adding 3 if @code{showSecond} is true,
+adding 4 more if @code{showMillis} is also true.  Decimals are 0 by
+default, or 3 if @code{showMillis} is true.
  
-This element appears only in schema version 2.5 and earlier
-(@pxref{SPV Detail visualization Element}).  Possibly this element
-could also contain @code{affix} and @code{relabel} elements in a more
-diverse corpus.
+@node SPV Detail format Element
+@subsubsection The @code{format} Element
  
-The following attribute is required.
+@example
+format
+   :baseFormat[f_base_format]=(date | time | dateTime | elapsedTime)?
+   :errorCharacter?
+   :separatorChars?
+   :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
+   :showYear=bool?
+   :showQuarter=bool?
+   :quarterPrefix?
+   :quarterSuffix?
+   :yearAbbreviation=bool?
+   :showMonth=bool?
+   :monthFormat=(long | short | number | paddedNumber)?
+   :dayPadding=bool?
+   :dayOfMonthPadding=bool?
+   :showWeek=bool?
+   :weekPadding=bool?
+   :weekSuffix?
+   :showDayOfWeek=bool?
+   :dayOfWeekAbbreviation=bool?
+   :hourPadding=bool?
+   :minutePadding=bool?
+   :secondPadding=bool?
+   :showDay=bool?
+   :showHour=bool?
+   :showMinute=bool?
+   :showSecond=bool?
+   :showMillis=bool?
+   :dayType=(month | year)?
+   :hourFormat=(AMPM | AS_24 | AS_12)?
+   :minimumIntegerDigits=int?
+   :maximumFractionDigits=int?
+   :minimumFractionDigits=int?
+   :useGrouping=bool?
+   :scientific=(onlyForSmall | whenNeeded | true | false)?
+   :small=real?
+   :prefix?
+   :suffix?
+   :tryStringsAsNumbers=bool?
+   :negativesOutside=bool?
+=> relabel* affix*
+@end example
  
-@defvr {Attribute} baseFormat
-Either @code{dateTime} or @code{time}.
-@end defvr
+This element is the union of all of the more-specific format elements.
+It is interpreted in the same way as one of those format elements,
+using @code{baseFormat} to determine which kind of format to use.
  
-When @code{baseFormat} is @code{dateTime}, the following attributes
-are available.
+There are a few attributes not present in the more specific formats:
  
-@defvr {Attribute} dayOfMonthPadding
-@defvrx {Attribute} dayPadding
-@defvrx {Attribute} dayType
-@defvrx {Attribute} hourFormat
-@defvrx {Attribute} hourPadding
-@defvrx {Attribute} mdyOrder
-@defvrx {Attribute} minutePadding
-@defvrx {Attribute} monthFormat
-@defvrx {Attribute} separatorChars
-@defvrx {Attribute} showDay
-@defvrx {Attribute} showHour
-@defvrx {Attribute} showMinute
-@defvrx {Attribute} showMonth
-@defvrx {Attribute} showSecond
-@defvrx {Attribute} showYear
-@defvrx {Attribute} yearAbbreviation
-The syntax and meaning of these attributes is the same as on the
-@code{format} element when that element's @code{baseFormat} is
-@code{dateTime}.  @pxref{SPV Detail format Element}.
+@defvr {Attribute} tryStringsAsNumbers
+When this is @code{true}, it is supposed to indicate that string
+values should be parsed as numbers and then displayed according to
+numeric formatting rules.  However, in the corpus it is always
+@code{false}.
  @end defvr
  
-When @code{baseFormat} is @code{time}, the following attributes are
-available.
-
-@defvr {Attribute} hourFormat
-@defvrx {Attribute} hourPadding
-@defvrx {Attribute} minutePadding
-@defvrx {Attribute} monthFormat
-@defvrx {Attribute} separatorChars
-@defvrx {Attribute} showDay
-@defvrx {Attribute} showHour
-@defvrx {Attribute} showMinute
-@defvrx {Attribute} showMonth
-@defvrx {Attribute} showSecond
-@defvrx {Attribute} showYear
-@defvrx {Attribute} yearAbbreviation
-The syntax and meaning of these attributes is the same as on the
-@code{format} element when that element's @code{baseFormat} is
-@code{elapsedTime}.  @pxref{SPV Detail format Element}.
+@defvr {Attribute} negativesOutside
+If true, the negative sign should be shown before the prefix; if
+false, it should be shown after.
  @end defvr
  
  @node SPV Detail affix Element
  @subsubsection The @code{affix} Element
  
-Parent: @code{format} or @code{numberFormat} or @code{stringFormat} @*
-Contents: empty
-
-Possibly this element could have @code{dateTimeFormat} as a parent in
-a more diverse corpus.
+@example
+affix
+   :definesReference=int
+   :position=(subscript | superscript)
+   :suffix=bool
+   :value
+=> EMPTY
+@end example
  
  This defines a suffix (or, theoretically, a prefix) for a formatted
  value.  It is used to insert a reference to a footnote.  It has the
@@ -2885,86 +3398,225 @@ contains other values: @code{*}, @code{**}, and a few that begin with
  at least one comma: @code{,b}, @code{,c}, @code{,,b}, and @code{,,c}.
  @end defvr
  
-@node SPV Detail relabel Element
-@subsubsection The @code{relabel} Element
+@node SPV Detail interval Element
+@subsection The @code{interval} Element
  
-Parent: @code{format} or @code{stringFormat} @*
-Contents: empty
+@example
+interval :style=ref style => labeling footnotes?
  
-Possibly this element could have @code{numberFormat} or
-@code{dateTimeFormat} as a parent in a more diverse corpus.
+labeling
+   :style=ref style?
+   :variable=ref (sourceVariable | derivedVariable)
+=> (formatting | format | footnotes)*
  
-This specifies how to display a given value.  It is used to implement
-value labels and to display the system-missing value in a
-human-readable way.  It has the following attributes:
+formatting :variable=ref (sourceVariable | derivedVariable) => formatMapping*
  
-@defvr {Attribute} from
-The value to map.  In the corpus this is an integer or the
-system-missing value @code{-1.797693134862316E300}.
-@end defvr
+formatMapping :from=int => format?
  
-@defvr {Attribute} to
-The string to display in place of the value of @code{from}.  In the
-corpus this is a wide variety of value labels; the system-missing
-value is mapped to @samp{.}.
+footnotes
+   :superscript=bool?
+   :variable=ref (sourceVariable | derivedVariable)
+=> footnoteMapping*
+
+footnoteMapping :definesReference=int :from=int :to => EMPTY
+@end example
+
+The @code{interval} element and its descendants determine the basic
+formatting and labeling for the table's cells.  These basic styles are
+overridden by more specific styles set using @code{setCellProperties}
+(@pxref{SPV Detail setCellProperties Element}).
+
+The @code{style} attribute of @code{interval} itself may be ignored.
+
+The @code{labeling} element may have a single @code{formatting} child.
+If present, its @code{variable} attribute refers to a variable whose
+values are format specifiers as numbers, e.g.@: value 0x050802 for F8.2.
+However, the numbers are not actually interpreted that way.  Instead,
+each number actually present in the variable's data is mapped by a
+@code{formatMapping} child of @code{formatting} to a @code{format}
+that specifies how to display it.
+
+The @code{labeling} element may also have a @code{footnotes} child
+element.  The @code{variable} attribute of this element refers to a
+variable whose values are comma-delimited strings that list the
+1-based indexes of footnote references.  (Cells without any footnote
+references are numeric 0 instead of strings.)
+
+Each @code{footnoteMapping} child of the @code{footnotes} element
+defines the footnote marker to be its @code{to} attribute text for the
+footnote whose 1-based index is given in its @code{definesReference}
+attribute.
+
+@node SPV Detail style Element
+@subsection The @code{style} Element
+
+@example
+style
+   :color=color?
+   :color2=color?
+   :labelAngle=real?
+   :border-bottom=(solid | thick | thin | double | none)?
+   :border-top=(solid | thick | thin | double | none)?
+   :border-left=(solid | thick | thin | double | none)?
+   :border-right=(solid | thick | thin | double | none)?
+   :border-bottom-color?
+   :border-top-color?
+   :border-left-color?
+   :border-right-color?
+   :font-family?
+   :font-size?
+   :font-weight=(regular | bold)?
+   :font-style=(regular | italic)?
+   :font-underline=(none | underline)?
+   :margin-bottom=dimension?
+   :margin-left=dimension?
+   :margin-right=dimension?
+   :margin-top=dimension?
+   :textAlignment=(left | right | center | decimal | mixed)?
+   :labelLocationHorizontal=(positive | negative | center)?
+   :labelLocationVertical=(positive | negative | center)?
+   :decimal-offset=dimension?
+   :size?
+   :width?
+   :visible=bool?
+=> EMPTY
+@end example
+
+A @code{style} element has an effect only when it is referenced by
+another element to set some aspect of the table's style.  Most of the
+attributes are self-explanatory.  The rest are described below.
+
+@defvr {Attribute} {color}
+In some cases, the text color; in others, the background color.
  @end defvr
  
-@node SPV Detail union Element
-@subsubsection The @code{union} Element
+@defvr {Attribute} {color2}
+Not used.
+@end defvr
  
-Parent: @code{setCellProperties} @*
-Contents: @code{intersect}@math{+}
+@defvr {Attribute} {labelAngle}
+Normally 0.  The value -90 causes inner column or outer row labels to
+be rotated vertically.
+@end defvr
  
-This element represents a set of cells, computed as the union of the
-sets represented by each of its children.
+@defvr {Attribute} {labelLocationHorizontal}
+Not used.
+@end defvr
  
-@subsubheading The @code{intersect} Element
+@defvr {Attribute} {labelLocationVertical}
+The value @code{positive} corresponds to vertically aligning text to
+the top of a cell, @code{negative} to the bottom, @code{center} to the
+middle.
+@end defvr
  
-Parent: @code{union} @*
-Contents: @code{where}@math{+} @math{|} @code{intersectWhere}?
+@node SPV Detail labelFrame Element
+@subsection The @code{labelFrame} Element
  
-This element represents a set of cells, computed as the intersection
-of the sets represented by each of its children.
+@example
+labelFrame :style=ref style => location+ label? paragraph?
  
-Of the two possible children, in the corpus @code{where} is far more
-common, appearing thousands of times, whereas @code{intersectWhere}
-only appears 4 times.
+paragraph :hangingIndent=dimension? => EMPTY
+@end example
  
-Most @code{intersect} elements have two or more children.
+A @code{labelFrame} element specifies content and style for some
+aspect of a table.  Only @code{labelFrame} elements that have a
+@code{label} child are important.  The @code{purpose} attribute in the
+@code{label} determines what the @code{labelFrame} affects:
  
-@subsubheading The @code{where} Element
+@table @code
+@item title
+The table's title and its style.
  
-Parent: @code{intersect} @*
-Contents: empty
+@item subTitle
+The table's caption and its style.
  
-This element represents the set of cells in which the value of a
-specified variable falls within a specified set.
+@item footnote
+The table's footnotes and the style for the footer area.
  
-@defvr {Attribute} variable
-The @code{id} of a variable, e.g.@: @code{dimension0categories} or
-@code{dimension0group0map}.
-@end defvr
+@item layer
+The style for the layer area.
  
-@defvr {Attribute} include
-A value, or multiple values separated by semicolons,
-e.g.@: @code{0} or @code{13;14;15;16}.
-@end defvr
+@item subSubTitle
+Ignored.
+@end table
  
-@subsubheading The @code{intersectWhere} Element
+The @code{style} attribute references the style to use for the area.
  
-Parent: @code{intersect} @*
-Contents: empty
+The @code{label}, if present, specifies the text to put into the title
+or caption or footnotes.  For footnotes, the label has two @code{text}
+children for every footnote, each of which has a @code{usesReference}
+attribute identifying the 1-based index of a footnote.  The first,
+third, fifth, @dots{} @code{text} child specifies the content for a
+footnote; the second, fourth, sixth, @dots{} child specifies the
+marker.  Content tends to end in a new-line, which the reader may wish
+to trim; similarly, markers tend to end in @samp{.}.
  
-The meaning of this element is unknown.
+The @code{paragraph}, if present, may be ignored, since it is always
+empty.
  
-@defvr {Attribute} variable
-@defvrx {Attribute} variable2
-The meaning of these attributes is unknown.  In the four examples in
-the corpus they always take the values @code{dimension2categories} and
-@code{dimension0categories}, respectively.
-@end defvr
+@node SPV Detail Legacy Properties
+@subsection Legacy Properties
  
-@node SPV Detail style Element
-@subsection The @code{style} Element
+The detail XML format has features for styling most of the aspects of
+a table.  It also inherits defaults for many aspects from structure
+XML, which has the following @code{tableProperties} element:
  
-TBD.
+@example
+tableProperties
+=> generalProperties footnoteProperties cellFormatProperties borderProperties printingProperties
+
+generalProperties
+   :hideEmptyRows=bool?
+   :maximumColumnWidth=dimension?
+   :maximumRowWidth=dimension?
+   :minimumColumnWidth=dimension?
+   :minimumRowWidth=dimension?
+   :rowDimensionLabels=(inCorner | nested)?
+=> EMPTY
+
+footnoteProperties
+   :markerPosition=(superscript | subscript)?
+   :numberFormat=(alphabetic | numeric)?
+=> EMPTY
+
+cellFormatProperties => cell_style+
+
+any[cell_style]
+   :alternatingColor=color?
+   :alternatingTextColor=color?
+=> style
+
+style
+   :color=color?
+   :color2=color?
+   :font-family?
+   :font-size?
+   :font-style=(regular | italic)?
+   :font-weight=(regular | bold)?
+   :labelLocationVertical=(positive | negative | center)?
+   :margin-bottom=dimension?
+   :margin-left=dimension?
+   :margin-right=dimension?
+   :margin-top=dimension?
+   :textAlignment=(left | right | center | decimal | mixed)?
+   :decimal-offset=dimension?
+=> EMPTY
+
+borderProperties => border_style+
+
+any[border_style]
+   :borderStyleType=(none | solid | dashed | thick | thin | double)?
+   :color=color?
+=> EMPTY
+
+printingProperties
+   :printAllLayers=bool?
+   :rescaleLongTableToFitPage=bool?
+   :rescaleWideTableToFitPage=bool?
+   :windowOrphanLines=int?
+   :continuationText?
+   :continuationTextAtBottom=bool?
+   :continuationTextAtTop=bool?
+   :printEachLayerOnSeparatePage=bool?
+=> EMPTY
+@end example
diff --git a/src/language/dictionary/sys-file-info.c b/src/language/dictionary/sys-file-info.c
index 80fa903b066ca704f05d5e1572c2b42270707760..1fd80281c421bc42feffdf89102eefdf147cf2e6 100644
--- a/src/language/dictionary/sys-file-info.c
+++ b/src/language/dictionary/sys-file-info.c
@@ -694,7 +694,7 @@ display_attributes (const struct attrset *dict_attrset,
                       var_get_attributes (vars[i]), flags);
  
    if (pivot_table_is_empty (table))
-    pivot_table_destroy (table);
+    pivot_table_unref (table);
    else
      pivot_table_submit (table);
  }
diff --git a/src/language/stats/crosstabs.q b/src/language/stats/crosstabs.q
index 0665b75e6f74bf1cc3d9355ff3477ff6546b5532..4c51b0b8edd739d6d4c06d68759e7376943924d5 100644
--- a/src/language/stats/crosstabs.q
+++ b/src/language/stats/crosstabs.q
@@ -1097,7 +1097,7 @@ output_crosstabulation (struct crosstabs_proc *proc, struct crosstabulation *xt)
        if (!pivot_table_is_empty (risk))
          pivot_table_submit (risk);
        else
-        pivot_table_destroy (risk);
+        pivot_table_unref (risk);
      }
  
    if (direct)
diff --git a/src/libpspp/zip-writer.c b/src/libpspp/zip-writer.c
index fd36fc523a029b61e0dee5122ef330bbb8c4a4ed..0959551abefd00720cd47f1e768910bcfa9bd340 100644
--- a/src/libpspp/zip-writer.c
+++ b/src/libpspp/zip-writer.c
@@ -29,6 +29,7 @@
  #include "gl/xalloc.h"
  
  #include "libpspp/message.h"
+#include "libpspp/temp-file.h"
  
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
@@ -188,6 +189,33 @@ zip_writer_add (struct zip_writer *zw, FILE *file, const char *member_name)
    member->name = xstrdup (member_name);
  }
  
+/* Adds a member named MEMBER_NAME whose contents are the null-terminated
+   string CONTENT. */
+void
+zip_writer_add_string (struct zip_writer *zw, const char *member_name,
+                       const char *content)
+{
+  zip_writer_add_memory (zw, member_name, content, strlen (content));
+}
+
+/* Adds a member named MEMBER_NAME whose contents are the SIZE bytes at
+   CONTENT. */
+void
+zip_writer_add_memory (struct zip_writer *zw, const char *member_name,
+                       const void *content, size_t size)
+{
+  FILE *fp = create_temp_file ();
+  if (fp == NULL)
+    {
+      msg_error (errno, _("error creating temporary file"));
+      zw->ok = false;
+      return;
+    }
+  fwrite (content, size, 1, fp);
+  zip_writer_add (zw, fp, member_name);
+  close_temp_file (fp);
+}
+
  /* Finalizes the contents of ZW and closes it.  Returns true if successful,
     false if a write error occurred while finalizing the file or at any earlier
     time. */
diff --git a/src/libpspp/zip-writer.h b/src/libpspp/zip-writer.h
index aff3f994f364f0ea242559fd6001abe3c731e70e..1b5b359fb9725d7dd55e5a43f1924c72636d7548 100644
--- a/src/libpspp/zip-writer.h
+++ b/src/libpspp/zip-writer.h
@@ -22,6 +22,10 @@
  
  struct zip_writer *zip_writer_create (const char *file_name);
  void zip_writer_add (struct zip_writer *, FILE *, const char *member_name);
+void zip_writer_add_string (struct zip_writer *, const char *member_name,
+                            const char *content);
+void zip_writer_add_memory (struct zip_writer *, const char *member_name,
+                            const void *content, size_t size);
  bool zip_writer_close (struct zip_writer *);
  
  #endif /* libpspp/zip-writer.h */
diff --git a/src/output/ascii.c b/src/output/ascii.c
index 2584faffaacf22777388bb4a9170a5c9b3cf037e..a78a904de1325f3b91c775fb0d96d236e4c5ed51 100644
--- a/src/output/ascii.c
+++ b/src/output/ascii.c
@@ -416,14 +416,16 @@ ascii_output_lines (struct ascii_driver *a, size_t n_lines)
  {
    for (size_t y = 0; y < n_lines; y++)
      {
-      struct u8_line *line = &a->lines[y];
+      if (y < a->allocated_lines)
+        {
+          struct u8_line *line = &a->lines[y];
  
-      while (ds_chomp_byte (&line->s, ' '))
-        continue;
-      fwrite (ds_data (&line->s), 1, ds_length (&line->s), a->file);
+          while (ds_chomp_byte (&line->s, ' '))
+            continue;
+          fwrite (ds_data (&line->s), 1, ds_length (&line->s), a->file);
+          u8_line_clear (&a->lines[y]);
+        }
        putc ('\n', a->file);
-
-      u8_line_clear (&a->lines[y]);
      }
  }
  
diff --git a/src/output/automake.mk b/src/output/automake.mk
index c1ff60370c9171d65d72da729ce738a6eac28405..968c7697333e257ef330afa158af94c1c4f6b39e 100644
--- a/src/output/automake.mk
+++ b/src/output/automake.mk
@@ -72,6 +72,7 @@ src_output_liboutput_la_SOURCES = \
         src/output/pivot-table.h \
         src/output/render.c \
         src/output/render.h \
+       src/output/spv-driver.c \
         src/output/tab.c \
         src/output/tab.h \
         src/output/table-item.c \
@@ -97,7 +98,10 @@ src_output_liboutput_la_SOURCES += \
         src/output/charts/spreadlevel-cairo.c \
         src/output/charts/scatterplot-cairo.c
  endif
+nodist_src_output_liboutput_la_SOURCES =
  
  EXTRA_DIST += \
         src/output/README \
         src/output/mk-class-boilerplate
+
+include src/output/spv/automake.mk
diff --git a/src/output/cairo.c b/src/output/cairo.c
index 101735418142f9e63b41bb29fbe2caa451461982..11f24d0404925c9f4451caa0316186a5bbccbbf4 100644
--- a/src/output/cairo.c
+++ b/src/output/cairo.c
@@ -478,7 +478,7 @@ parse_color (struct output_driver *d, struct string_map *options,
  {
    char *string = parse_string (opt (d, options, key, default_value));
    if (!parse_color__ (string, color) && !parse_color__ (default_value, color))
-    *color = CELL_COLOR_BLACK;
+    *color = (struct cell_color) CELL_COLOR_BLACK;
    free (string);
  }
  
@@ -1549,7 +1549,8 @@ xr_layout_cell_text (struct xr_driver *xr, const struct table_cell *cell,
            pango_layout_set_text (font->layout, marker, strlen (marker));
  
            PangoAttrList *attrs = pango_attr_list_new ();
-          pango_attr_list_insert (attrs, pango_attr_rise_new (7000));
+          pango_attr_list_insert (attrs, pango_attr_scale_new (PANGO_SCALE_SMALL));
+          pango_attr_list_insert (attrs, pango_attr_rise_new (3000));
            pango_layout_set_attributes (font->layout, attrs);
            pango_attr_list_unref (attrs);
  
@@ -1596,7 +1597,8 @@ xr_layout_cell_text (struct xr_driver *xr, const struct table_cell *cell,
        if (font_style->underline)
          pango_attr_list_insert (attrs, pango_attr_underline_new (
                                 PANGO_UNDERLINE_SINGLE));
-      add_attr_with_start (attrs, pango_attr_rise_new (7000), initial_length);
+      add_attr_with_start (attrs, pango_attr_scale_new (PANGO_SCALE_SMALL), initial_length);
+      add_attr_with_start (attrs, pango_attr_rise_new (3000), initial_length);
        add_attr_with_start (
          attrs, pango_attr_font_desc_new (font->desc), initial_length);
        pango_layout_set_attributes (font->layout, attrs);
diff --git a/src/output/csv.c b/src/output/csv.c
index 65566c4e3a624a2ba5b02589438b019e52624e0c..9c70a4515899dcda1570cecbdd8a297e00698ee8 100644
--- a/src/output/csv.c
+++ b/src/output/csv.c
@@ -263,13 +263,12 @@ csv_submit (struct output_driver *driver,
            fputs ("\nFootnotes:\n", csv->file);
  
            for (size_t i = 0; i < n_footnotes; i++)
-            if (f[i])
-              {
-                csv_output_field (csv, f[i]->marker);
-                fputs (csv->separator, csv->file);
-                csv_output_field (csv, f[i]->content);
-                putc ('\n', csv->file);
-              }
+            {
+              csv_output_field (csv, f[i]->marker);
+              fputs (csv->separator, csv->file);
+              csv_output_field (csv, f[i]->content);
+              putc ('\n', csv->file);
+            }
  
            free (f);
          }
diff --git a/src/output/driver.c b/src/output/driver.c
index d37ec8b7d79229eff8e224120ab8c49a4e7f414f..b18b822f22ca4b1f99d0e66736b8f345e5185d2f 100644
--- a/src/output/driver.c
+++ b/src/output/driver.c
@@ -196,7 +196,11 @@ defer_text (struct output_engine *e, struct output_item *item)
    if (!is_text_item (item))
      return false;
  
-  enum text_item_type type = text_item_get_type (to_text_item (item));
+  struct text_item *text_item = to_text_item (item);
+  if (text_item->markup)        /* XXX */
+    return false;
+
+  enum text_item_type type = text_item_get_type (text_item);
    if (type != TEXT_ITEM_SYNTAX && type != TEXT_ITEM_LOG)
      return false;
  
@@ -208,7 +212,7 @@ defer_text (struct output_engine *e, struct output_item *item)
    if (!ds_is_empty (&e->deferred_text))
      ds_put_byte (&e->deferred_text, '\n');
  
-  const char *text = text_item_get_text (to_text_item (item));
+  const char *text = text_item_get_text (text_item);
    ds_put_cstr (&e->deferred_text, text);
    output_item_unref (item);
  
@@ -250,7 +254,7 @@ output_submit (struct output_item *item)
        if (idx >= 1 && idx <= 4)
          {
            char *key = xasprintf ("Head%zu", idx);
-          string_map_find_and_delete (&e->heading_vars, key);
+          free (string_map_find_and_delete (&e->heading_vars, key));
            free (key);
          }
      }
@@ -424,6 +428,7 @@ extern const struct output_driver_factory list_driver_factory;
  extern const struct output_driver_factory html_driver_factory;
  extern const struct output_driver_factory csv_driver_factory;
  extern const struct output_driver_factory odt_driver_factory;
+extern const struct output_driver_factory spv_driver_factory;
  #ifdef HAVE_CAIRO
  extern const struct output_driver_factory pdf_driver_factory;
  extern const struct output_driver_factory ps_driver_factory;
@@ -437,6 +442,7 @@ static const struct output_driver_factory *factories[] =
      &html_driver_factory,
      &csv_driver_factory,
      &odt_driver_factory,
+    &spv_driver_factory,
  #ifdef HAVE_CAIRO
      &pdf_driver_factory,
      &ps_driver_factory,
diff --git a/src/output/html.c b/src/output/html.c
index 1027dea501037e5e74655ecb0a4d76b7c2357280..307f6c223d83bb3fea6040da6e7f04d8d30ada13 100644
--- a/src/output/html.c
+++ b/src/output/html.c
@@ -419,6 +419,22 @@ html_put_table_item_text (struct html_driver *html,
    html_put_footnote_markers (html, text->footnotes, text->n_footnotes);
  }
  
+static void
+html_put_table_item_layers (struct html_driver *html,
+                            const struct table_item_layers *layers)
+{
+  for (size_t i = 0; i < layers->n_layers; i++)
+    {
+      if (i)
+        fputs ("<BR>\n", html->file);
+
+      const struct table_item_layer *layer = &layers->layers[i];
+      escape_string (html->file, layer->content, strlen (layer->content),
+                     " ", "<BR>");
+      html_put_footnote_markers (html, layer->footnotes, layer->n_footnotes);
+    }
+}
+
  static void
  html_output_table (struct html_driver *html, const struct table_item *item)
  {
@@ -438,16 +454,15 @@ html_output_table (struct html_driver *html, const struct table_item *item)
    size_t n_footnotes = table_collect_footnotes (item, &f);
  
    for (size_t i = 0; i < n_footnotes; i++)
-    if (f[i])
-      {
-        put_tfoot (html, t, &tfoot);
-        fputs ("<SUP>", html->file);
-        escape_string (html->file, f[i]->marker, strlen (f[i]->marker),
-                       " ", "<BR>");
-        fputs ("</SUP> ", html->file);
-        escape_string (html->file, f[i]->content, strlen (f[i]->content),
-                       " ", "<BR>");
-      }
+    {
+      put_tfoot (html, t, &tfoot);
+      fputs ("<SUP>", html->file);
+      escape_string (html->file, f[i]->marker, strlen (f[i]->marker),
+                     " ", "<BR>");
+      fputs ("</SUP> ", html->file);
+      escape_string (html->file, f[i]->content, strlen (f[i]->content),
+                     " ", "<BR>");
+    }
    free (f);
    if (tfoot)
      fputs ("</TD></TR></TFOOT>\n", html->file);
@@ -455,7 +470,7 @@ html_output_table (struct html_driver *html, const struct table_item *item)
    fputs ("<TBODY VALIGN=\"TOP\">\n", html->file);
  
    const struct table_item_text *title = table_item_get_title (item);
-  const struct table_item_text *layers = table_item_get_layers (item);
+  const struct table_item_layers *layers = table_item_get_layers (item);
    if (title || layers)
      {
        fputs ("  <CAPTION>", html->file);
@@ -464,7 +479,7 @@ html_output_table (struct html_driver *html, const struct table_item *item)
        if (title && layers)
          fputs ("<BR>\n", html->file);
        if (layers)
-        html_put_table_item_text (html, layers);
+        html_put_table_item_layers (html, layers);
        fputs ("</CAPTION>\n", html->file);
      }
  
diff --git a/src/output/odt.c b/src/output/odt.c
index 90a76186a7d81f7cdedd7f8510388aa30a0306ae..104e57a9332c373502794a9550ddda5bc1b09f4d 100644
--- a/src/output/odt.c
+++ b/src/output/odt.c
@@ -80,26 +80,6 @@ odt_driver_cast (struct output_driver *driver)
    return UP_CAST (driver, struct odt_driver, driver);
  }
  
-/* Create the "mimetype" file needed by ODF */
-static bool
-create_mimetype (struct zip_writer *zip)
-{
-  FILE *fp;
-
-  fp = create_temp_file ();
-  if (fp == NULL)
-    {
-      msg_error (errno, _("error creating temporary file"));
-      return false;
-    }
-
-  fprintf (fp, "application/vnd.oasis.opendocument.text");
-  zip_writer_add (zip, fp, "mimetype");
-  close_temp_file (fp);
-
-  return true;
-}
-
  /* Creates a new temporary file and stores it in *FILE, then creates an XML
     writer for it and stores it in *W. */
  static void
@@ -306,11 +286,8 @@ odt_create (struct file_handle *fh, enum settings_output_devices device_type,
    odt->handle = fh;
    odt->file_name = xstrdup (file_name);
  
-  if (!create_mimetype (zip))
-    {
-      output_driver_destroy (d);
-      return NULL;
-    }
+  zip_writer_add_string (zip, "mimetype",
+                         "application/vnd.oasis.opendocument.text");
  
    /* Create the manifest */
    create_writer (&odt->manifest_file, &odt->manifest_wtr);
@@ -461,6 +438,26 @@ write_table_item_text (struct odt_driver *odt,
    xmlTextWriterEndElement (odt->content_wtr);
  }
  
+static void
+write_table_item_layers (struct odt_driver *odt,
+                         const struct table_item_layers *layers)
+{
+  if (!layers)
+    return;
+
+  for (size_t i = 0; i < layers->n_layers; i++)
+    {
+      const struct table_item_layer *layer = &layers->layers[i];
+      xmlTextWriterStartElement (odt->content_wtr, _xml("text:h"));
+      xmlTextWriterWriteFormatAttribute (odt->content_wtr,
+                                         _xml("text:outline-level"), "%d", 2);
+      xmlTextWriterWriteString (odt->content_wtr, _xml (layer->content) );
+      for (size_t j = 0; j < layer->n_footnotes; j++)
+        write_footnote (odt, layer->footnotes[j]);
+      xmlTextWriterEndElement (odt->content_wtr);
+    }
+}
+
  static void
  write_table (struct odt_driver *odt, const struct table_item *item)
  {
@@ -469,7 +466,7 @@ write_table (struct odt_driver *odt, const struct table_item *item)
  
    /* Write a heading for the table */
    write_table_item_text (odt, table_item_get_title (item));
-  write_table_item_text (odt, table_item_get_layers (item));
+  write_table_item_layers (odt, table_item_get_layers (item));
  
    /* Start table */
    xmlTextWriterStartElement (odt->content_wtr, _xml("table:table"));
diff --git a/src/output/pivot-output.c b/src/output/pivot-output.c
index 08db3a0781f375fb9022c739eeeb23bba8aa7091..2ff2dc3817aea2cae35d82c75a7063985626dc77 100644
--- a/src/output/pivot-output.c
+++ b/src/output/pivot-output.c
@@ -299,24 +299,6 @@ pivot_table_submit_layer (const struct pivot_table *pt,
    const size_t *pindexes[PIVOT_N_AXES]
      = { [PIVOT_AXIS_LAYER] = layer_indexes };
  
-  struct string layer_label = DS_EMPTY_INITIALIZER;
-  const struct pivot_axis *layer_axis = &pt->axes[PIVOT_AXIS_LAYER];
-  for (size_t i = 0; i < layer_axis->n_dimensions; i++)
-    {
-      const struct pivot_dimension *d = layer_axis->dimensions[i];
-      if (d->n_leaves)
-        {
-          if (!ds_is_empty (&layer_label))
-            ds_put_byte (&layer_label, '\n');
-          pivot_value_format (d->root->name, pt->show_values,
-                              pt->show_variables, &layer_label);
-          ds_put_cstr (&layer_label, ": ");
-          pivot_value_format (d->data_leaves[layer_indexes[i]]->name,
-                              pt->show_values, pt->show_variables,
-                              &layer_label);
-        }
-    }
-
    size_t body[TABLE_N_AXES];
    size_t *column_enumeration = pivot_table_enumerate_axis (
      pt, PIVOT_AXIS_COLUMN, layer_indexes, pt->omit_empty, &body[H]);
@@ -462,16 +444,40 @@ pivot_table_submit_layer (const struct pivot_table *pt,
        table_item_text_destroy (title);
      }
  
-  if (!ds_is_empty (&layer_label))
+  const struct pivot_axis *layer_axis = &pt->axes[PIVOT_AXIS_LAYER];
+  struct table_item_layers *layers = NULL;
+  for (size_t i = 0; i < layer_axis->n_dimensions; i++)
+    {
+      const struct pivot_dimension *d = layer_axis->dimensions[i];
+      if (d->n_leaves)
+        {
+          if (!layers)
+            {
+              layers = xzalloc (sizeof *layers);
+              layers->style = area_style_override (
+                NULL, &pt->areas[PIVOT_AREA_LAYERS], NULL, NULL);
+              layers->layers = xnmalloc (layer_axis->n_dimensions,
+                                         sizeof *layers->layers);
+            }
+
+          const struct pivot_value *name
+            = d->data_leaves[layer_indexes[i]]->name;
+          struct table_item_layer *layer = &layers->layers[layers->n_layers++];
+          struct string s = DS_EMPTY_INITIALIZER;
+          pivot_value_format_body (name, pt->show_values, pt->show_variables,
+                                   &s);
+          layer->content = ds_steal_cstr (&s);
+          layer->n_footnotes = name->n_footnotes;
+          layer->footnotes = xnmalloc (layer->n_footnotes,
+                                       sizeof *layer->footnotes);
+          for (size_t j = 0; j < name->n_footnotes; j++)
+            layer->footnotes[j] = footnotes[name->footnotes[j]->idx];
+        }
+    }
+  if (layers)
      {
-      struct table_item_text *layers = table_item_text_create (
-        ds_cstr (&layer_label));
-      layers->style = area_style_override (NULL, &pt->areas[PIVOT_AREA_LAYERS],
-                                           NULL, NULL);
        table_item_set_layers (ti, layers);
-      table_item_text_destroy (layers);
-
-      ds_destroy (&layer_label);
+      table_item_layers_destroy (layers);
      }
  
    if (pt->caption)
@@ -484,6 +490,8 @@ pivot_table_submit_layer (const struct pivot_table *pt,
      }
  
    free (footnotes);
+  ti->pt = pivot_table_ref (pt);
+
    table_item_submit (ti);
  }
  
diff --git a/src/output/pivot-table.c b/src/output/pivot-table.c
index 96821170eff043db82a47e88004d19f2ba84747c..30371cae368ad32884f51e675fab8b588e41a07e 100644
--- a/src/output/pivot-table.c
+++ b/src/output/pivot-table.c
@@ -61,6 +61,70 @@ pivot_area_to_string (enum pivot_area area)
      }
  }
  
+void
+pivot_area_get_default_style (enum pivot_area area, struct area_style *style)
+{
+#define STYLE(BOLD, H, V, L, R, T, B) {                         \
+    .cell_style = {                                             \
+      .halign = TABLE_HALIGN_##H,                               \
+      .valign = TABLE_VALIGN_##V,                               \
+      .margin = { [TABLE_HORZ][0] = L, [TABLE_HORZ][1] = R,     \
+                  [TABLE_VERT][0] = T, [TABLE_VERT][1] = B },   \
+    },                                                          \
+    .font_style = {                                             \
+      .bold = BOLD,                                             \
+      .fg = { [0] = CELL_COLOR_BLACK, [1] = CELL_COLOR_BLACK},  \
+      .bg = { [0] = CELL_COLOR_WHITE, [1] = CELL_COLOR_WHITE},  \
+      .size = 9,                                                \
+    },                                                          \
+  }
+  static const struct area_style default_area_styles[PIVOT_N_AREAS] = {
+    [PIVOT_AREA_TITLE]         = STYLE( true, CENTER, CENTER,  8,11,1,8),
+    [PIVOT_AREA_CAPTION]       = STYLE(false, LEFT,   TOP,     8,11,1,1),
+    [PIVOT_AREA_FOOTER]        = STYLE(false, LEFT,   TOP,    11, 8,2,3),
+    [PIVOT_AREA_CORNER]        = STYLE(false, LEFT,   BOTTOM,  8,11,1,1),
+    [PIVOT_AREA_COLUMN_LABELS] = STYLE(false, CENTER, BOTTOM,  8,11,1,3),
+    [PIVOT_AREA_ROW_LABELS]    = STYLE(false, LEFT,   TOP,     8,11,1,3),
+    [PIVOT_AREA_DATA]          = STYLE(false, MIXED,  TOP,     8,11,1,1),
+    [PIVOT_AREA_LAYERS]        = STYLE(false, LEFT,   BOTTOM,  8,11,1,3),
+    };
+#undef STYLE
+
+  *style = default_area_styles[area];
+  style->font_style.typeface = xstrdup ("SansSerif");
+}
+
+void
+pivot_border_get_default_style (enum pivot_border border,
+                                struct table_border_style *style)
+{
+  static const enum table_stroke default_strokes[PIVOT_N_BORDERS] = {
+    [PIVOT_BORDER_TITLE]        = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_OUTER_LEFT]   = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_OUTER_TOP]    = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_OUTER_RIGHT]  = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_OUTER_BOTTOM] = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_INNER_LEFT]   = TABLE_STROKE_THICK,
+    [PIVOT_BORDER_INNER_TOP]    = TABLE_STROKE_THICK,
+    [PIVOT_BORDER_INNER_RIGHT]  = TABLE_STROKE_THICK,
+    [PIVOT_BORDER_INNER_BOTTOM] = TABLE_STROKE_THICK,
+    [PIVOT_BORDER_DATA_LEFT]    = TABLE_STROKE_THICK,
+    [PIVOT_BORDER_DATA_TOP]     = TABLE_STROKE_THICK,
+    [PIVOT_BORDER_DIM_ROW_HORZ] = TABLE_STROKE_SOLID,
+    [PIVOT_BORDER_DIM_ROW_VERT] = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_DIM_COL_HORZ] = TABLE_STROKE_SOLID,
+    [PIVOT_BORDER_DIM_COL_VERT] = TABLE_STROKE_SOLID,
+    [PIVOT_BORDER_CAT_ROW_HORZ] = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_CAT_ROW_VERT] = TABLE_STROKE_NONE,
+    [PIVOT_BORDER_CAT_COL_HORZ] = TABLE_STROKE_SOLID,
+    [PIVOT_BORDER_CAT_COL_VERT] = TABLE_STROKE_SOLID,
+  };
+  *style = (struct table_border_style) {
+      .stroke = default_strokes[border],
+      .color = CELL_COLOR_BLACK,
+  };
+}
+
  /* Returns the name of BORDER. */
  const char *
  pivot_border_to_string (enum pivot_border border)
@@ -279,7 +343,7 @@ pivot_dimension_create__ (struct pivot_table *table,
      axis->dimensions, (axis->n_dimensions + 1) * sizeof *axis->dimensions);
    axis->dimensions[axis->n_dimensions++] = d;
  
-  /* XXX extent and label_depth need to be calculated later. */
+  /* axis->extent and axis->label_depth will be calculated later. */
  
    return d;
  }
@@ -580,14 +644,6 @@ pivot_result_class_change (const char *s_, const struct fmt_spec *format)
    return rc != NULL;
  }
  \f
-/* One piece of data within a pivot table. */
-struct pivot_cell
-  {
-    struct hmap_node hmap_node; /* In struct pivot_table's 'cells' hmap. */
-    struct pivot_value *value;
-    unsigned int idx[];         /* One index per table dimension. */
-  };
-
  /* Pivot tables. */
  
  /* Creates and returns a new pivot table with the given TITLE.  TITLE should be
@@ -630,36 +686,17 @@ struct pivot_table *
  pivot_table_create__ (struct pivot_value *title)
  {
    struct pivot_table *table = xzalloc (sizeof *table);
+  table->ref_cnt = 1;
    table->weight_format = (struct fmt_spec) { FMT_F, 40, 0 };
    table->title = title;
  
-  /* Set default area styles. */
-#define STYLE(BOLD, H, V, L, R, T, B) {                         \
-    .cell_style = {                                             \
-      .halign = TABLE_HALIGN_##H,                               \
-      .valign = TABLE_VALIGN_##V,                               \
-      .margin = { [TABLE_HORZ][0] = L, [TABLE_HORZ][1] = R,     \
-                  [TABLE_VERT][0] = T, [TABLE_VERT][1] = B },   \
-    },                                                          \
-      .font_style = {                                           \
-      .bold = BOLD,                                             \
-      .fg = { [0] = CELL_COLOR_BLACK, [1] = CELL_COLOR_BLACK},  \
-      .bg = { [0] = CELL_COLOR_WHITE, [1] = CELL_COLOR_WHITE},  \
-    },                                                          \
-         }
-  static const struct area_style default_area_styles[PIVOT_N_AREAS] = {
-    [PIVOT_AREA_TITLE]         = STYLE( true, CENTER, CENTER,  8,11,1,8),
-    [PIVOT_AREA_CAPTION]       = STYLE(false, LEFT,   TOP,     8,11,1,1),
-    [PIVOT_AREA_FOOTER]        = STYLE(false, LEFT,   TOP,    11, 8,2,3),
-    [PIVOT_AREA_CORNER]        = STYLE(false, LEFT,   BOTTOM,  8,11,1,1),
-    [PIVOT_AREA_COLUMN_LABELS] = STYLE(false, CENTER, BOTTOM,  8,11,1,3),
-    [PIVOT_AREA_ROW_LABELS]    = STYLE(false, LEFT,   TOP,     8,11,1,3),
-    [PIVOT_AREA_DATA]          = STYLE(false, MIXED,  TOP,     8,11,1,1),
-    [PIVOT_AREA_LAYERS]        = STYLE(false, LEFT,   BOTTOM,  8,11,1,3),
-    };
-#undef STYLE
+  table->sizing[TABLE_HORZ].range[0] = 50;
+  table->sizing[TABLE_HORZ].range[1] = 72;
+  table->sizing[TABLE_VERT].range[0] = 36;
+  table->sizing[TABLE_VERT].range[1] = 120;
+
    for (size_t i = 0; i < PIVOT_N_AREAS; i++)
-    table->areas[i] = default_area_styles[i];
+    pivot_area_get_default_style (i, &table->areas[i]);
  
    /* Set default border styles. */
    static const enum table_stroke default_strokes[PIVOT_N_BORDERS] = {
@@ -713,12 +750,27 @@ pivot_table_create_for_text (struct pivot_value *title,
    return table;
  }
  
-/* Destroys TABLE and frees everything it points to. */
+/* Increases TABLE's reference count, indicating that it has an additional
+   owner.  A pivot table that is shared among multiple owners must not be
+   modified. */
+struct pivot_table *
+pivot_table_ref (const struct pivot_table *table_)
+{
+  struct pivot_table *table = CONST_CAST (struct pivot_table *, table_);
+  table->ref_cnt++;
+  return table;
+}
+
+/* Decreases TABLE's reference count, indicating that it has one fewer owner.
+   If TABLE no longer has any owners, it is freed. */
  void
-pivot_table_destroy (struct pivot_table *table)
+pivot_table_unref (struct pivot_table *table)
  {
    if (!table)
      return;
+  assert (table->ref_cnt > 0);
+  if (--table->ref_cnt)
+    return;
  
    free (table->current_layer);
    free (table->table_look);
@@ -771,6 +823,14 @@ pivot_table_destroy (struct pivot_table *table)
    free (table);
  }
  
+/* Returns true if TABLE has more than one owner.  A pivot table that is shared
+   among multiple owners must not be modified. */
+bool
+pivot_table_is_shared (const struct pivot_table *table)
+{
+  return table->ref_cnt > 1;
+}
+
  /* Sets the format used for PIVOT_RC_COUNT cells to the one used for variable
     WV, which should be the weight variable for the dictionary whose data or
     statistics are being put into TABLE.
@@ -957,9 +1017,9 @@ pivot_make_default_footnote_marker (int idx, bool show_numeric_markers)
    return pivot_value_new_user_text (text, -1);
  }
  
-/* Creates or modifies a footnote in TABLE with 0-based number IDX.  If MARKER
-   is nonnull, sets the footnote's marker; if CONTENT is nonnull, sets the
-   footnote's content. */
+/* Creates or modifies a footnote in TABLE with 0-based number IDX (and creates
+   all lower indexes as a side effect).  If MARKER is nonnull, sets the
+   footnote's marker; if CONTENT is nonnull, sets the footnote's content. */
  struct pivot_footnote *
  pivot_table_create_footnote__ (struct pivot_table *table, size_t idx,
                                 struct pivot_value *marker,
@@ -1344,6 +1404,52 @@ compose_headings (const struct pivot_axis *axis,
    return headings;
  }
  
+static void
+free_headings (const struct pivot_axis *axis, char ***headings)
+{
+  for (size_t i = 0; i < axis->label_depth; i++)
+    {
+      for (size_t j = 0; j < axis->extent; j++)
+        free (headings[i][j]);
+      free (headings[i]);
+    }
+  free (headings);
+}
+
+static void
+pivot_table_sizing_dump (const char *name, const struct pivot_table_sizing *s,
+                         int indentation)
+{
+  indent (indentation);
+  printf ("%ss: min=%d, max=%d\n", name, s->range[0], s->range[1]);
+  if (s->n_widths)
+    {
+      indent (indentation + 1);
+      printf ("%s widths:", name);
+      for (size_t i = 0; i < s->n_widths; i++)
+        printf (" %d", s->widths[i]);
+      printf ("\n");
+    }
+  if (s->n_breaks)
+    {
+      indent (indentation + 1);
+      printf ("break after %ss:", name);
+      for (size_t i = 0; i < s->n_breaks; i++)
+        printf (" %zu", s->breaks[i]);
+      printf ("\n");
+    }
+  if (s->n_keeps)
+    {
+      indent (indentation + 1);
+      printf ("keep %ss together:", name);
+      for (size_t i = 0; i < s->n_keeps; i++)
+        printf (" [%zu,%zu]",
+                s->keeps[i].ofs,
+                s->keeps[i].ofs + s->keeps[i].n - 1);
+      printf ("\n");
+    }
+}
+
  void
  pivot_table_dump (const struct pivot_table *table, int indentation)
  {
@@ -1363,9 +1469,17 @@ pivot_table_dump (const struct pivot_table *table, int indentation)
    if (table->date)
      {
        indent (indentation);
-      printf ("date: %s", ctime (&table->date)); /* XXX thread unsafe */
+      char buf[26];
+      printf ("date: %s", ctime_r (&table->date, buf));
      }
  
+  indent (indentation);
+  printf ("sizing:\n");
+  pivot_table_sizing_dump ("column", &table->sizing[TABLE_HORZ],
+                           indentation + 1);
+  pivot_table_sizing_dump ("row", &table->sizing[TABLE_VERT],
+                           indentation + 1);
+
    indent (indentation);
    printf ("areas:\n");
    for (enum pivot_area area = 0; area < PIVOT_N_AREAS; area++)
@@ -1422,7 +1536,7 @@ pivot_table_dump (const struct pivot_table *table, int indentation)
            fputs (" =", stdout);
  
            struct pivot_value **names = xnmalloc (layer_axis->label_depth,
-                                               sizeof *names);
+                                                 sizeof *names);
            size_t n_names = 0;
            for (const struct pivot_category *c
                   = d->presentation_leaves[layer_indexes[i]];
@@ -1462,6 +1576,7 @@ pivot_table_dump (const struct pivot_table *table, int indentation)
              }
            putchar ('\n');
          }
+      free_headings (&table->axes[PIVOT_AXIS_COLUMN], column_headings);
  
        indent (indentation + 1);
        printf ("-----------------------------------------------\n");
@@ -1510,6 +1625,7 @@ pivot_table_dump (const struct pivot_table *table, int indentation)
  
        free (column_enumeration);
        free (row_enumeration);
+      free_headings (&table->axes[PIVOT_AXIS_ROW], row_headings);
      }
  
    pivot_table_dump_value (table->caption, "caption", indentation);
@@ -1528,6 +1644,7 @@ pivot_table_dump (const struct pivot_table *table, int indentation)
        putchar ('\n');
      }
  
+  free (dindexes);
    settings_set_decimal_char (old_decimal);
  }
  \f
@@ -1735,7 +1852,7 @@ pivot_value_format_body (const struct pivot_value *value,
        break;
  
      case PIVOT_VALUE_TEMPLATE:
-      pivot_format_template (out, value->template.s, value->template.args,
+      pivot_format_template (out, value->template.local, value->template.args,
                               value->template.n_args, show_values,
                               show_variables);
        break;
@@ -1794,6 +1911,7 @@ pivot_value_destroy (struct pivot_value *value)
        /* Do not free the elements of footnotes because VALUE does not own
           them. */
        free (value->footnotes);
+      free (value->subscript);
  
        switch (value->type)
          {
@@ -1823,7 +1941,9 @@ pivot_value_destroy (struct pivot_value *value)
            break;
  
          case PIVOT_VALUE_TEMPLATE:
-          free (value->template.s);
+          free (value->template.local);
+          if (value->template.id != value->template.local)
+            free (value->template.id);
            for (size_t i = 0; i < value->template.n_args; i++)
              pivot_argument_uninit (&value->template.args[i]);
            free (value->template.args);
@@ -1837,15 +1957,16 @@ pivot_value_destroy (struct pivot_value *value)
     DEFAULT_STYLE for the parts of the style that VALUE doesn't override. */
  void
  pivot_value_get_style (struct pivot_value *value,
-                       const struct area_style *default_style,
+                       const struct font_style *base_font_style,
+                       const struct cell_style *base_cell_style,
                         struct area_style *area)
  {
    font_style_copy (&area->font_style, (value->font_style
                                         ? value->font_style
-                                       : &default_style->font_style));
-  area->cell_style = (value->cell_style
-                      ? *value->cell_style
-                      : default_style->cell_style);
+                                       : base_font_style));
+  area->cell_style = *(value->cell_style
+                       ? value->cell_style
+                       : base_cell_style);
  }
  
  /* Copies AREA into VALUE's style. */
@@ -2058,6 +2179,12 @@ void
  pivot_value_add_footnote (struct pivot_value *v,
                            struct pivot_footnote *footnote)
  {
+  /* Some legacy tables include numerous duplicate footnotes.  Suppress
+     them. */
+  for (size_t i = 0; i < v->n_footnotes; i++)
+    if (v->footnotes[i] == footnote)
+      return;
+
    v->footnotes = xrealloc (v->footnotes,
                             (v->n_footnotes + 1) * sizeof *v->footnotes);
    v->footnotes[v->n_footnotes++] = footnote;
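Note on the pivot_value_add_footnote hunk above: the duplicate check compares footnote pointers, not footnote contents, so it only suppresses repeated references to the same footnote object. A minimal, self-contained sketch of that idea follows. The types `struct footnote` and `struct value` and the function `value_add_footnote` are hypothetical stand-ins, not PSPP's own definitions, and plain `realloc` replaces `xrealloc`. The boolean return value is added only so the behavior is observable; the function in the patch returns void.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pivot_footnote. */
struct footnote
  {
    int idx;
  };

/* Hypothetical stand-in for the footnote-related part of struct pivot_value. */
struct value
  {
    struct footnote **footnotes;   /* Borrowed pointers; V does not own them. */
    size_t n_footnotes;
  };

/* Appends FOOTNOTE to V's footnote list unless V already refers to the very
   same footnote object.  Returns true if FOOTNOTE was appended, false if it
   was a duplicate.  (Assumption: returning a flag is for illustration only.) */
static bool
value_add_footnote (struct value *v, struct footnote *footnote)
{
  /* Linear scan by pointer identity.  Footnote lists are short, so this is
     cheap in practice. */
  for (size_t i = 0; i < v->n_footnotes; i++)
    if (v->footnotes[i] == footnote)
      return false;

  struct footnote **p = realloc (v->footnotes,
                                 (v->n_footnotes + 1) * sizeof *p);
  if (!p)
    abort ();
  v->footnotes = p;
  v->footnotes[v->n_footnotes++] = footnote;
  return true;
}
```

Two distinct footnote objects with identical text would both be kept under this scheme; only repeated references to one object are dropped.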
diff --git a/src/output/pivot-table.h b/src/output/pivot-table.h
index 6104cb13133b04a0de008c9a99756e84e7e90d4a..f26378d703dde2b594e1755b2ff72df33b0d475c 100644
--- a/src/output/pivot-table.h
+++ b/src/output/pivot-table.h
@@ -100,6 +100,7 @@ enum pivot_area
    };
  
  const char *pivot_area_to_string (enum pivot_area);
+void pivot_area_get_default_style (enum pivot_area, struct area_style *);
  
  /* Table borders for styling purposes. */
  enum pivot_border
@@ -138,6 +139,8 @@ enum pivot_border
    };
  
  const char *pivot_border_to_string (enum pivot_border);
+void pivot_border_get_default_style (enum pivot_border,
+                                     struct table_border_style *);
  
  /* Sizing for rows or columns of a rendered table.  The comments below talk
     about columns and their widths but they apply equally to rows and their
@@ -365,6 +368,11 @@ bool pivot_result_class_change (const char *, const struct fmt_spec *);
  /* A pivot table.  See the top of this file for more information. */
  struct pivot_table
    {
+    /* Reference count.  A pivot_table may be shared between multiple owners,
+       indicated by a reference count greater than 1.  When this is the case,
+       the pivot_table must not be modified. */
+    int ref_cnt;
+
      /* Display settings. */
      bool rotate_inner_column_labels;
      bool rotate_outer_row_labels;
@@ -441,7 +449,10 @@ struct pivot_table *pivot_table_create (const char *title);
  struct pivot_table *pivot_table_create__ (struct pivot_value *title);
  struct pivot_table *pivot_table_create_for_text (struct pivot_value *title,
                                                   struct pivot_value *content);
-void pivot_table_destroy (struct pivot_table *);
+
+struct pivot_table *pivot_table_ref (const struct pivot_table *);
+void pivot_table_unref (struct pivot_table *);
+bool pivot_table_is_shared (const struct pivot_table *);
  
  /* Format of PIVOT_RC_COUNT cells. */
  void pivot_table_set_weight_var (struct pivot_table *,
@@ -633,7 +644,8 @@ struct pivot_value
          /* PIVOT_VALUE_TEMPLATE. */
          struct
            {
-            char *s;
+            char *local;              /* Localized. */
+            char *id;                 /* Identifier. */
              struct pivot_argument *args;
              size_t n_args;
            }
@@ -687,7 +699,8 @@ void pivot_value_destroy (struct pivot_value *);
  
  /* Styling. */
  void pivot_value_get_style (struct pivot_value *,
-                            const struct area_style *default_style,
+                            const struct font_style *base_font_style,
+                            const struct cell_style *base_cell_style,
                              struct area_style *);
  void pivot_value_set_style (struct pivot_value *, const struct area_style *);
  
@@ -699,5 +712,13 @@ struct pivot_argument
    };
  
  void pivot_argument_uninit (struct pivot_argument *);
+\f
+/* One piece of data within a pivot table. */
+struct pivot_cell
+  {
+    struct hmap_node hmap_node; /* In struct pivot_table's 'cells' hmap. */
+    struct pivot_value *value;
+    unsigned int idx[];         /* One index per table dimension. */
+  };
  
  #endif /* output/pivot-table.h */
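Note on the ref_cnt member and the pivot_table_ref / pivot_table_unref / pivot_table_is_shared declarations above: the header shows only the interface, so the following is a minimal sketch of how such a reference-counting trio is commonly written, not the patch's actual implementation. `struct table` and the `table_*` functions are hypothetical names. As an illustrative deviation, `table_unref` returns whether the object was freed, while the declared `pivot_table_unref` returns void.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical miniature of a reference-counted object. */
struct table
  {
    int ref_cnt;                /* Number of owners; always at least 1. */
  };

/* Creates a table with a single owner. */
static struct table *
table_create (void)
{
  struct table *t = calloc (1, sizeof *t);
  if (!t)
    abort ();
  t->ref_cnt = 1;
  return t;
}

/* Adds an owner.  The parameter is const so that callers holding a const
   pointer can still take a reference; only the count changes. */
static struct table *
table_ref (const struct table *t_)
{
  struct table *t = (struct table *) t_;
  t->ref_cnt++;
  return t;
}

/* Drops one owner and frees T when the last owner is gone.  Returns true if
   T was freed (assumption: the flag exists only for illustration). */
static bool
table_unref (struct table *t)
{
  if (!t)
    return false;
  assert (t->ref_cnt > 0);
  if (--t->ref_cnt > 0)
    return false;
  free (t);
  return true;
}

/* Returns true if T has more than one owner, in which case it must be
   treated as read-only. */
static bool
table_is_shared (const struct table *t)
{
  return t->ref_cnt > 1;
}
```

A caller that wants to modify a table would first check `table_is_shared` and, if it returns true, work on a private copy instead.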
diff --git a/src/output/render.c b/src/output/render.c
index 0fad329ee47aaf03849bd4adf85a3df7e32e2e08..eada45198612b05f9df72a0876807d8ba5ec5339 100644
--- a/src/output/render.c
+++ b/src/output/render.c
@@ -1514,18 +1514,14 @@ add_footnote_page (struct render_pager *p, const struct table_item *item)
    if (!n_footnotes)
      return;
  
-  struct tab_table *t = tab_create (2, n_footnotes);
-
+  struct tab_table *t = tab_create (1, n_footnotes);
    for (size_t i = 0; i < n_footnotes; i++)
      if (f[i])
        {
-        tab_text_format (t, 0, i, TAB_LEFT, "%s.", f[i]->marker);
-        tab_text (t, 1, i, TAB_LEFT, f[i]->content);
+        tab_text_format (t, 0, i, TAB_LEFT, "%s. %s",
+                         f[i]->marker, f[i]->content);
          if (f[i]->style)
-          {
-            tab_add_style (t, 0, i, f[i]->style);
-            tab_add_style (t, 1, i, f[i]->style);
-          }
+          tab_add_style (t, 0, i, f[i]->style);
        }
    render_pager_add_table (p, &t->table, 0);
  
@@ -1548,6 +1544,26 @@ add_text_page (struct render_pager *p, const struct table_item_text *t,
    render_pager_add_table (p, &tab->table, min_width);
  }
  
+static void
+add_layers_page (struct render_pager *p,
+                 const struct table_item_layers *layers, int min_width)
+{
+  if (!layers)
+    return;
+
+  struct tab_table *tab = tab_create (1, layers->n_layers);
+  for (size_t i = 0; i < layers->n_layers; i++)
+    {
+      const struct table_item_layer *layer = &layers->layers[i];
+      tab_text (tab, 0, i, 0, layer->content);
+      for (size_t j = 0; j < layer->n_footnotes; j++)
+        tab_add_footnote (tab, 0, i, layer->footnotes[j]);
+    }
+  if (layers->style)
+    tab->styles[0] = area_style_clone (tab->container, layers->style);
+  render_pager_add_table (p, &tab->table, min_width);
+}
+
  /* Creates and returns a new render_pager for rendering TABLE_ITEM on the
     device with the given PARAMS. */
  struct render_pager *
@@ -1571,7 +1587,7 @@ render_pager_create (const struct render_params *params,
    add_text_page (p, table_item_get_title (table_item), title_width);
  
    /* Layers. */
-  add_text_page (p, table_item_get_layers (table_item), title_width);
+  add_layers_page (p, table_item_get_layers (table_item), title_width);
  
    /* Body. */
    render_pager_add_table (p, table_ref (table_item_get_table (table_item)), 0);
diff --git a/src/output/spv-driver.c b/src/output/spv-driver.c
new file mode 100644
index 0000000..411b89b
--- /dev/null
+++ b/src/output/spv-driver.c
@@ -0,0 +1,126 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/driver-provider.h"
+
+#include <stdlib.h>
+
+#include "data/file-handle-def.h"
+#include "libpspp/cast.h"
+#include "output/group-item.h"
+#include "output/page-setup-item.h"
+#include "output/table-item.h"
+#include "output/text-item.h"
+#include "output/spv/spv-writer.h"
+
+#include "gl/xalloc.h"
+
+#include "gettext.h"
+#define _(msgid) gettext (msgid)
+
+struct spv_driver
+  {
+    struct output_driver driver;
+    struct spv_writer *writer;
+    struct file_handle *handle;
+  };
+
+static const struct output_driver_class spv_driver_class;
+
+static struct spv_driver *
+spv_driver_cast (struct output_driver *driver)
+{
+  assert (driver->class == &spv_driver_class);
+  return UP_CAST (driver, struct spv_driver, driver);
+}
+
+static struct output_driver *
+spv_create (struct file_handle *fh, enum settings_output_devices device_type,
+             struct string_map *o UNUSED)
+{
+  struct output_driver *d;
+  struct spv_driver *spv;
+
+  spv = xzalloc (sizeof *spv);
+  d = &spv->driver;
+  spv->handle = fh;
+  output_driver_init (&spv->driver, &spv_driver_class, fh_get_file_name (fh),
+                      device_type);
+
+  char *error = spv_writer_open (fh_get_file_name (fh), &spv->writer);
+  if (spv->writer == NULL)
+    {
+      msg (ME, "%s", error);
+      goto error;
+    }
+
+  return d;
+
+ error:
+  fh_unref (fh);
+  output_driver_destroy (d);
+  return NULL;
+}
+
+static void
+spv_destroy (struct output_driver *driver)
+{
+  struct spv_driver *spv = spv_driver_cast (driver);
+
+  char *error = spv_writer_close (spv->writer);
+  if (error)
+    msg (ME, "%s", error);
+  fh_unref (spv->handle);
+  free (spv);
+}
+
+static void
+spv_submit (struct output_driver *driver,
+             const struct output_item *output_item)
+{
+  struct spv_driver *spv = spv_driver_cast (driver);
+
+  if (is_group_open_item (output_item))
+    spv_writer_open_heading (spv->writer,
+                             to_group_open_item (output_item)->command_name,
+                             to_group_open_item (output_item)->command_name);
+  else if (is_group_close_item (output_item))
+    spv_writer_close_heading (spv->writer);
+  else if (is_table_item (output_item))
+    {
+      const struct table_item *table_item = to_table_item (output_item);
+      if (table_item->pt)
+        spv_writer_put_table (spv->writer, table_item->pt);
+    }
+  else if (is_text_item (output_item))
+    spv_writer_put_text (spv->writer, to_text_item (output_item));
+  else if (is_page_setup_item (output_item))
+    spv_writer_set_page_setup (spv->writer,
+                               to_page_setup_item (output_item)->page_setup);
+}
+
+struct output_driver_factory spv_driver_factory =
+  { "spv", "pspp.spv", spv_create };
+
+static const struct output_driver_class spv_driver_class =
+  {
+    "spv",
+    spv_destroy,
+    spv_submit,
+    NULL,
+  };
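Note on spv_driver_cast in the new file above: it checks the driver's class pointer and then uses UP_CAST to recover the enclosing struct spv_driver from a pointer to its embedded struct output_driver. The sketch below illustrates that "checked downcast from an embedded base" pattern in isolation. Everything in it is hypothetical: `struct driver`, `struct file_driver`, and the `CONTAINER_OF` macro are illustrative names, and the macro is an assumption about how such a cast is typically built with `offsetof`, not a copy of the definition in libpspp/cast.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base "class" descriptor and base object. */
struct driver_class
  {
    const char *name;
  };

struct driver
  {
    const struct driver_class *cls;
  };

/* Hypothetical derived object.  The base is deliberately not the first
   member, to show that the cast does not rely on a zero offset. */
struct file_driver
  {
    int fd;
    struct driver driver;
  };

static const struct driver_class file_driver_class = { "file" };

/* Given PTR, a pointer to member MEMBER inside an object of type TYPE,
   yields a pointer to the enclosing object. */
#define CONTAINER_OF(PTR, TYPE, MEMBER) \
  ((TYPE *) ((char *) (PTR) - offsetof (TYPE, MEMBER)))

/* Converts a base pointer to the derived type, asserting first that the
   object really belongs to the expected class. */
static struct file_driver *
file_driver_cast (struct driver *d)
{
  assert (d->cls == &file_driver_class);
  return CONTAINER_OF (d, struct file_driver, driver);
}
```

The class check is what makes the cast safe: without it, passing a driver of a different class would silently produce a pointer into unrelated memory.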
diff --git a/src/output/spv/automake.mk b/src/output/spv/automake.mk
new file mode 100644
index 0000000..cda806b
--- /dev/null
+++ b/src/output/spv/automake.mk
@@ -0,0 +1,104 @@
+# PSPP - a program for statistical analysis.
+# Copyright (C) 2017 Free Software Foundation, Inc.
+# 
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+# 
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+# 
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+# 
+## Process this file with automake to produce Makefile.in  -*- makefile -*-
+
+src_output_liboutput_la_SOURCES += \
+       src/output/spv/spv-css-parser.c \
+       src/output/spv/spv-css-parser.h \
+       src/output/spv/spv-dump.c \
+       src/output/spv/spv-legacy-data.c \
+       src/output/spv/spv-legacy-data.h \
+       src/output/spv/spv-legacy-decoder.c \
+       src/output/spv/spv-legacy-decoder.h \
+       src/output/spv/spv-light-decoder.c \
+       src/output/spv/spv-light-decoder.h \
+       src/output/spv/spv-output.c \
+       src/output/spv/spv-output.h \
+       src/output/spv/spv-select.c \
+       src/output/spv/spv-select.h \
+       src/output/spv/spv-writer.c \
+       src/output/spv/spv-writer.h \
+       src/output/spv/spv.c \
+       src/output/spv/spv.h \
+       src/output/spv/spvbin-helpers.c \
+       src/output/spv/spvbin-helpers.h \
+       src/output/spv/spvxml-helpers.c \
+       src/output/spv/spvxml-helpers.h
+
+light_binary_in = \
+       src/output/spv/binary-parser-generator \
+       src/output/spv/light-binary.grammar
+light_binary_out = \
+       src/output/spv/light-binary-parser.c \
+       src/output/spv/light-binary-parser.h
+src/output/spv/light-binary-parser.c: $(light_binary_in)
+       $(AM_V_GEN)python $^ code spvlb '"output/spv/light-binary-parser.h"' > $@.tmp
+       $(AM_V_at)mv $@.tmp $@
+src/output/spv/light-binary-parser.h: $(light_binary_in)
+       $(AM_V_GEN)python $^ header spvlb > $@.tmp && mv $@.tmp $@
+nodist_src_output_liboutput_la_SOURCES += $(light_binary_out)
+BUILT_SOURCES += $(light_binary_out)
+CLEANFILES += $(light_binary_out)
+EXTRA_DIST += $(light_binary_in)
+
+old_binary_in = \
+       src/output/spv/binary-parser-generator \
+       src/output/spv/old-binary.grammar
+old_binary_out = \
+       src/output/spv/old-binary-parser.c \
+       src/output/spv/old-binary-parser.h
+src/output/spv/old-binary-parser.c: $(old_binary_in)
+       $(AM_V_GEN)python $^ code spvob '"output/spv/old-binary-parser.h"' > $@.tmp
+       $(AM_V_at)mv $@.tmp $@
+src/output/spv/old-binary-parser.h: $(old_binary_in)
+       $(AM_V_GEN)python $^ header spvob > $@.tmp && mv $@.tmp $@
+nodist_src_output_liboutput_la_SOURCES += $(old_binary_out)
+BUILT_SOURCES += $(old_binary_out)
+CLEANFILES += $(old_binary_out)
+EXTRA_DIST += $(old_binary_in)
+
+detail_xml_in = \
+       src/output/spv/xml-parser-generator \
+       src/output/spv/detail-xml.grammar
+detail_xml_out = \
+       src/output/spv/detail-xml-parser.c \
+       src/output/spv/detail-xml-parser.h
+src/output/spv/detail-xml-parser.c: $(detail_xml_in)
+       $(AM_V_GEN)python $^ code spvdx '"output/spv/detail-xml-parser.h"' > $@.tmp
+       $(AM_V_at)mv $@.tmp $@
+src/output/spv/detail-xml-parser.h: $(detail_xml_in)
+       $(AM_V_GEN)python $^ header spvdx > $@.tmp && mv $@.tmp $@
+nodist_src_output_liboutput_la_SOURCES += $(detail_xml_out)
+BUILT_SOURCES += $(detail_xml_out)
+CLEANFILES += $(detail_xml_out)
+EXTRA_DIST += $(detail_xml_in)
+
+structure_xml_in = \
+       src/output/spv/xml-parser-generator \
+       src/output/spv/structure-xml.grammar
+structure_xml_out = \
+       src/output/spv/structure-xml-parser.c \
+       src/output/spv/structure-xml-parser.h
+src/output/spv/structure-xml-parser.c: $(structure_xml_in)
+       $(AM_V_GEN)python $^ code spvsx '"output/spv/structure-xml-parser.h"' > $@.tmp
+       $(AM_V_at)mv $@.tmp $@
+src/output/spv/structure-xml-parser.h: $(structure_xml_in)
+       $(AM_V_GEN)python $^ header spvsx > $@.tmp && mv $@.tmp $@
+nodist_src_output_liboutput_la_SOURCES += $(structure_xml_out)
+BUILT_SOURCES += $(structure_xml_out)
+CLEANFILES += $(structure_xml_out)
+EXTRA_DIST += $(structure_xml_in)
diff --git a/src/output/spv/binary-parser-generator b/src/output/spv/binary-parser-generator
new file mode 100644
index 0000000..dd2f56c
--- /dev/null
+++ b/src/output/spv/binary-parser-generator
@@ -0,0 +1,861 @@
+#! /usr/bin/python
+
+import getopt
+import os
+import struct
+import sys
+
+n_errors = 0
+
+def error(msg):
+    global n_errors
+    sys.stderr.write("%s:%d: %s\n" % (file_name, line_number, msg))
+    n_errors += 1
+
+
+def fatal(msg):
+    error(msg)
+    sys.exit(1)
+
+
+def get_line():
+    global line
+    global line_number
+    line = input_file.readline()
+    line_number += 1
+
+
+def is_num(s):
+    return s.isdigit() or (len(s) > 1 and s[0] == '-' and s[1:].isdigit())
+
+
+xdigits = "0123456789abcdefABCDEF"
+def is_xdigits(s):
+    for c in s:
+        if c not in xdigits:
+            return False
+    return True
+
+
+def expect(type):
+    if token[0] != type:
+        fatal("syntax error expecting %s" % type)
+
+
+def match(type):
+    if token[0] == type:
+        get_token()
+        return True
+    else:
+        return False
+
+
+def must_match(type):
+    expect(type)
+    get_token()
+
+
+def get_token():
+    global token
+    global line
+    prev = token
+    if line == "":
+        if token == ('eof', ):
+            fatal("unexpected end of input")
+        get_line()
+        if not line:
+            token = ('eof', )
+            return
+        elif line == '\n':
+            token = (';', )
+            return
+        elif not line[0].isspace():
+            token = (';', )
+            return
+
+    line = line.lstrip()
+    if line == "":
+        get_token()
+    elif line[0] == '#':
+        line = ''
+        get_token()
+    elif line[0] in '[]()?|*':
+        token = (line[0], )
+        line = line[1:]
+    elif line.startswith('=>'):
+        token = (line[:2], )
+        line = line[2:]
+    elif line.startswith('...'):
+        token = (line[:3], )
+        line = line[3:]
+    elif line[0].isalnum() or line[0] == '-':
+        n = 1
+        while n < len(line) and (line[n].isalnum() or line[n] == '-'):
+            n += 1
+        s = line[:n]
+        line = line[n:]
+
+        if prev[0] == '*' and is_num(s):
+            token = ('number', int(s, 10))
+        elif len(s) == 2 and is_xdigits(s):
+            token = ('bytes', struct.pack('B', int(s, 16)))
+        elif s[0] == 'i' and is_num(s[1:]):
+            token = ('bytes', struct.pack('<i', int(s[1:])))
+        elif s[:2] == 'ib' and is_num(s[2:]):
+            token = ('bytes', struct.pack('>i', int(s[2:])))
+        elif s[0].isupper():
+            token = ('nonterminal', s)
+        elif s in ('bool', 'int16', 'int32', 'int64', 'be16', 'be32', 'be64',
+                   'string', 'bestring', 'byte', 'float', 'double',
+                   'count', 'becount', 'v1', 'v3', 'vAF', 'vB0',
+                   'case', 'else'):
+            token = (s, )
+        else:
+            token = ('id', s)
+    else:
+        fatal("unknown character %c" % line[0])
+
+
+def usage():
+    argv0 = os.path.basename(sys.argv[0])
+    print('''\
+%(argv0)s, parser generator for SPV binary members
+usage: %(argv0)s GRAMMAR header
+       %(argv0)s GRAMMAR code HEADER_NAME
+  where GRAMMAR contains grammar definitions\
+''' % {"argv0": argv0})
+    sys.exit(0)
+
+
+class Item(object):
+    def __init__(self, type_, name, n, content):
+        self.type_ = type_
+        self.name = name
+        self.n = n
+        self.content = content
+    def __repr__(self):
+        if self.type_ == 'constant':
+            return ' '.join(['%02x' % ord(x) for x in self.content])
+        elif self.content:
+            return "%s(%s)" % (self.type_, self.content)
+        else:
+            return self.type_
+
+def parse_item():
+    t = token
+    name = None
+    if t[0] == 'bytes':
+        type_ = 'constant'
+        content = t[1]
+        get_token()
+    elif t[0] in ('bool', 'byte',
+                  'int16', 'int32', 'int64',
+                  'be16', 'be32', 'be64',
+                  'string', 'bestring',
+                  'float', 'double',
+                  'nonterminal', '...'):
+        type_ = 'variable'
+        content = t
+        get_token()
+        if t[0] == 'nonterminal':
+            name = name_to_id(content[1])
+    elif t[0] in ('v1', 'v3', 'vAF', 'vB0', 'count', 'becount'):
+        type_ = t[0]
+        get_token()
+        must_match('(')
+        content = parse_choice()
+        must_match(')')
+    elif match('case'):
+        return parse_case()
+    elif match('('):
+        type_ = '()'
+        content = parse_choice()
+        must_match(')')
+    else:
+        print token
+        fatal('syntax error expecting item')
+
+    n = 1
+    optional = False
+    if match('*'):
+        if token[0] == 'number':
+            n = token[1]
+            get_token()
+        elif match('['):
+            expect('id')
+            n = token[1]
+            get_token()
+            must_match(']')
+            if n.startswith('n-'):
+                name = n[2:]
+        else:
+            fatal('expecting quantity')
+    elif match('?'):
+        optional = True
+
+    if match('['):
+        expect('id')
+        if type_ == 'constant' and not optional:
+            fatal("%s: cannot name a constant" % token[1])
+
+        name = token[1]
+        get_token()
+        must_match(']')
+
+    if type_ == 'constant':
+        content *= n
+        n = 1
+
+    item = Item(type_, name, n, content)
+    if optional:
+        item = Item('|', None, 1, [[item], []])
+    return item
+
+
+def parse_concatenation():
+    items = []
+    while token[0] not in (')', ';', '|', 'eof'):
+        item = parse_item()
+        if (item.type_ == 'constant'
+            and items
+            and items[-1].type_ == 'constant'):
+            items[-1].content += item.content
+        else:
+            items.append(item)
+    return items
+
+
+def parse_choice():
+    sub = parse_concatenation()
+    if token[0] != '|':
+        return sub
+
+    choices = [sub]
+    while match('|'):
+        choices.append(parse_concatenation())
+
+    return [Item('|', None, 1, choices)]
+
+
+def parse_case():
+    must_match('(')
+    choices = {}
+    while True:
+        choice = None
+        if match('else'):
+            choice = 'else'
+
+        items = parse_concatenation()
+        if choice is None:
+            if (not items
+                or items[0].type_ != 'constant'
+                or len(items[0].content) != 1):
+                fatal("choice must begin with xx (or 'else')")
+            choice = '%02x' % ord(items[0].content)
+
+        if choice in choices:
+            fatal("duplicate choice %s" % choice)
+        choices[choice] = items
+
+        if match(')'):
+            break
+        must_match('|')
+
+    case_name = None
+    if match('['):
+        expect('id')
+        case_name = token[1]
+        get_token()
+        must_match(']')
+
+    return Item('case', case_name, 1,
+                { '%s_%s' % (case_name, k) : v for k, v in choices.items() })
+
+
+def parse_production():
+    expect('nonterminal')
+    name = token[1]
+    get_token()
+    must_match('=>')
+    return name, parse_choice()
+
+
+def print_members(p, indent):
+    for item in p:
+        if item.type_ == 'variable' and item.name:
+            if item.content[0] == 'nonterminal':
+                typename = 'struct %s%s' % (prefix,
+                                            name_to_id(item.content[1]))
+                n_stars = 1
+            else:
+                c_types = {'bool': ('bool', 0),
+                           'byte': ('uint8_t', 0),
+                           'int16': ('uint16_t', 0),
+                           'int32': ('uint32_t', 0),
+                           'int64': ('uint64_t', 0),
+                           'be16': ('uint16_t', 0),
+                           'be32': ('uint32_t', 0),
+                           'be64': ('uint64_t', 0),
+                           'string': ('char', 1),
+                           'bestring': ('char', 1),
+                           'float': ('double', 0),
+                           'double': ('double', 0),
+                           '...': ('uint8_t', 1)}
+                typename, n_stars = c_types[item.content[0]]
+
+            array_suffix = ''
+            if item.n:
+                if isinstance(item.n, int):
+                    if item.n > 1:
+                        array_suffix = '[%d]' % item.n
+                else:
+                    n_stars += 1
+            
+            print "%s%s %s%s%s;" % (indent, typename, '*' * n_stars,
+                                    name_to_id(item.name),
+                                    array_suffix)
+        elif item.type_ in ('v1', 'v3', 'vAF', 'vB0',
+                            'count', 'becount', '()'):
+            print_members(item.content, indent)
+        elif item.type_ == '|':
+            for choice in item.content:
+                print_members(choice, indent)
+        elif item.type_ == 'case':
+            print "%sint %s;" % (indent, item.name)
+            print "%sunion {" % indent
+            for name, choice in sorted(item.content.items()):
+                print "%s    struct {" % indent
+                print_members(choice, indent + ' ' * 8)
+                print "%s    } %s;" % (indent, name)
+            print "%s};" % indent
+        elif item.type_ == 'constant':
+            if item.name:
+                print "%sbool %s;" % (indent, item.name)
+        elif item.type_ not in ("constant", "variable"):
+            fatal("unhandled type %s" % item.type_)
+
+
+def bytes_to_hex(s):
+    return ''.join(['"'] + ["\\x%02x" % ord(x) for x in s] + ['"'])
+
+
+class Parser_Context(object):
+    def __init__(self):
+        self.suffixes = {}
+        self.bail = 'error'
+        self.need_error_handler = False
+    def gen_name(self, prefix):
+        n = self.suffixes.get(prefix, 0) + 1
+        self.suffixes[prefix] = n
+        return '%s%d' % (prefix, n) if n > 1 else prefix
+    def save_pos(self, indent):
+        pos = self.gen_name('pos')
+        print "%sstruct spvbin_position %s = spvbin_position_save (input);" % (indent, pos)
+        return pos
+    def save_error(self, indent):
+        error = self.gen_name('save_n_errors')
+        print "%ssize_t %s = input->n_errors;" % (indent, error)
+        return error
+    def parse_limit(self, endian, indent):
+        limit = self.gen_name('saved_limit')
+        print """\
+%sstruct spvbin_limit %s;
+%sif (!spvbin_limit_parse%s (&%s, input))
+%s    goto %s;""" % (
+    indent, limit,
+    indent, '_be' if endian == 'big' else '', limit,
+    indent, self.bail)
+        return limit
+        
+
+def print_parser_items(name, production, indent, accessor, ctx):
+    for item_idx in range(len(production)):
+        if item_idx > 0:
+            print
+
+        item = production[item_idx]
+        if item.type_ == 'constant':
+            print """%sif (!spvbin_match_bytes (input, %s, %d))
+%s    goto %s;""" % (
+                indent, bytes_to_hex(item.content), len(item.content),
+                indent, ctx.bail)
+            ctx.need_error_handler = True
+            if item.name:
+                print "%sp->%s = true;" % (indent, item.name)
+        elif item.type_ == 'variable':
+            if item.content[0] == 'nonterminal':
+                func = '%sparse_%s' % (prefix, name_to_id(item.content[1]))
+            else:
+                func = 'spvbin_parse_%s' % item.content[0]
+
+            if item.name:
+                dst = "&p->%s%s" % (accessor, name_to_id(item.name))
+            else:
+                dst = "NULL"
+            if item.n == 1:
+                print """%sif (!%s (input, %s))
+%s    goto %s;""" % (indent, func, dst,
+                     indent, ctx.bail)
+
+                if item.content[0] != 'nonterminal' and item.name == 'version':
+                    print "%sinput->version = p->%s%s;" % (
+                        indent, accessor, name_to_id(item.name))
+            else:
+                if isinstance(item.n, int):
+                    count = item.n
+                else:
+                    count = 'p->%s%s' % (accessor, name_to_id(item.n))
+
+                i_name = ctx.gen_name('i')
+                if item.name:
+                    if not isinstance(item.n, int):
+                        print "%sp->%s%s = xcalloc (%s, sizeof *p->%s%s);" % (
+                            indent,
+                            accessor, name_to_id(item.name), count,
+                            accessor, name_to_id(item.name))
+                    dst += '[%s]' % i_name
+                print "%sfor (int %s = 0; %s < %s; %s++)" % (
+                    indent, i_name, i_name, count, i_name)
+                print """%s    if (!%s (input, %s))
+%s        goto %s;""" % (indent, func, dst,
+                     indent, ctx.bail)
+
+            ctx.need_error_handler = True
+        elif item.type_ == '()':
+            if item.n != 1:
+                # Not yet implemented
+                raise AssertionError
+
+            print_parser_items(name, item.content, indent, accessor, ctx)
+        elif item.type_ in  ('v1', 'v3', 'vAF', 'vB0'):
+            if item.n != 1:
+                # Not yet implemented
+                raise AssertionError
+
+            print "%sif (input->version == 0x%s) {" % (indent, item.type_[1:])
+            print_parser_items(name, item.content, indent + '    ', accessor, ctx)
+            print "%s}" % indent
+        elif item.type_ in ('count', 'becount'):
+            if item.n != 1:
+                # Not yet implemented
+                raise AssertionError
+
+            pos = ctx.save_pos(indent)
+            endian = 'big' if item.type_ == 'becount' else 'little'
+            limit = ctx.parse_limit(endian, indent)
+
+            save_bail = ctx.bail
+            ctx.bail = ctx.gen_name('backtrack')
+
+            print "%sdo {" % indent
+            indent += '    '
+            if (item.content
+                and item.content[-1].type_ == 'variable'
+                and item.content[-1].content[0] == '...'):
+                content = item.content[:-1]
+                ellipsis = True
+            else:
+                content = item.content
+                ellipsis = False
+            print_parser_items(name, content, indent, accessor, ctx)
+
+            if ellipsis:
+                print "%sinput->ofs = input->size;" % indent
+            else:
+                print """%sif (!spvbin_input_at_end (input))
+%s    goto %s;""" % (indent,
+                     indent, ctx.bail)
+            print '%sspvbin_limit_pop (&%s, input);' % (indent, limit)
+            print '%sbreak;' % indent
+            print
+            print '%s%s:' % (indent[4:], ctx.bail)
+            # In theory, we should emit code to clear whatever we're
+            # backtracking from.  In practice, it's not important to
+            # do that.
+            print "%sspvbin_position_restore (&%s, input);" % (indent, pos)
+            print '%sspvbin_limit_pop (&%s, input);' % (indent, limit)
+            print '%sgoto %s;' % (indent, save_bail)
+            indent = indent[4:]
+            print "%s} while (0);" % indent
+
+            ctx.bail = save_bail
+        elif item.type_ == '|':
+            save_bail = ctx.bail
+
+            print "%sdo {" % indent
+            indent += '    '
+            pos = ctx.save_pos(indent)
+            error = ctx.save_error(indent)
+            i = 0
+            for choice in item.content:
+                if i:
+                    print "%sspvbin_position_restore (&%s, input);" % (indent, pos)
+                    print "%sinput->n_errors = %s;" % (indent, error)
+                i += 1
+
+                if i != len(item.content):
+                    ctx.bail = ctx.gen_name('backtrack')
+                else:
+                    ctx.bail = save_bail
+                print_parser_items(name, choice, indent, accessor, ctx)
+                print "%sbreak;" % indent
+                if i != len(item.content):
+                    print
+                    print '%s%s:' % (indent[4:], ctx.bail)
+                    # In theory, we should emit code to clear whatever we're
+                    # backtracking from.  In practice, it's not important to
+                    # do that.
+            indent = indent[4:]
+            print "%s} while (0);" % indent
+        elif item.type_ == 'case':
+            i = 0
+            for choice_name, choice in sorted(item.content.items()):
+                if choice_name.endswith('else'):
+                    print "%s} else {" % indent
+                    print "%s    p->%s%s = -1;" % (indent, accessor, item.name)
+                    print
+                else:
+                    print "%s%sif (spvbin_match_byte (input, 0x%s)) {" % (
+                        indent, '} else ' if i else '', choice_name[-2:])
+                    print "%s    p->%s%s = 0x%s;" % (
+                        indent, accessor, item.name, choice_name[-2:])
+                    print
+                    choice = choice[1:]
+                
+                print_parser_items(name, choice, indent + '    ',
+                                   accessor + choice_name + '.', ctx)
+                i += 1
+            print "%s}" % indent
+        else:
+            # Not implemented
+            raise AssertionError
+
+
+def print_parser(name, production, indent):
+    print '''
+bool
+%(prefix)sparse_%(name)s (struct spvbin_input *input, struct %(prefix)s%(name)s **p_)
+{
+    *p_ = NULL;
+    struct %(prefix)s%(name)s *p = xzalloc (sizeof *p);
+    p->start = input->ofs;
+''' % {'prefix': prefix,
+       'name': name_to_id(name)}
+
+    ctx = Parser_Context()
+    print_parser_items(name, production, indent, '', ctx)
+
+    print '''
+    p->len = input->ofs - p->start;
+    *p_ = p;
+    return true;'''
+
+    if ctx.need_error_handler:
+        print """
+error:
+    spvbin_error (input, "%s", p->start);
+    %sfree_%s (p);
+    return false;""" % (name, prefix, name_to_id(name))
+
+    print "}"
+
+def print_free_items(name, production, indent, accessor, ctx):
+    for item in production:
+        if item.type_ == 'constant':
+            pass
+        elif item.type_ == 'variable':
+            if not item.name:
+                continue
+
+            if item.content[0] == 'nonterminal':
+                free_func = '%sfree_%s' % (prefix, name_to_id(item.content[1]))
+            elif item.content[0] in ('string', 'bestring', '...'):
+                free_func = 'free'
+            else:
+                free_func = None
+
+            dst = "p->%s%s" % (accessor, name_to_id(item.name))
+
+            if item.n == 1:
+                if free_func:
+                    print "%s%s (%s);" % (indent, free_func, dst)
+            else:
+                if isinstance(item.n, int):
+                    count = item.n
+                else:
+                    count = 'p->%s%s' % (accessor, name_to_id(item.n))
+
+                i_name = ctx.gen_name('i')
+                if free_func:
+                    print "%sfor (int %s = 0; %s < %s; %s++)" % (
+                        indent, i_name, i_name, count, i_name)
+                    print "%s    %s (%s[%s]);" % (
+                        indent, free_func, dst, i_name)
+                if not isinstance(item.n, int):
+                    print "%sfree (p->%s%s);" % (
+                        indent, accessor, name_to_id(item.name))
+        elif item.type_ in ('()', 'v1', 'v3', 'vAF', 'vB0',
+                            'count', 'becount'):
+            if item.n != 1:
+                # Not yet implemented
+                raise AssertionError
+
+            print_free_items(name, item.content, indent, accessor, ctx)
+        elif item.type_ == '|':
+            for choice in item.content:
+                print_free_items(name, choice, indent, accessor, ctx)
+        elif item.type_ == 'case':
+            i = 0
+            for choice_name, choice in sorted(item.content.items()):
+                if choice_name.endswith('else'):
+                    value_name = '-1'
+                else:
+                    value_name = '0x%s' % choice_name[-2:]
+
+                print '%s%sif (p->%s%s == %s) {' % (
+                    indent, '} else ' if i else '', accessor, item.name,
+                    value_name)
+                
+                print_free_items(name, choice, indent + '    ',
+                                 accessor + choice_name + '.', ctx)
+                i += 1
+            print "%s}" % indent
+        else:
+            # Not implemented
+            raise AssertionError
+
+def print_free(name, production, indent):
+    print '''
+void
+%(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *p)
+{
+    if (p == NULL)
+        return;
+''' % {'prefix': prefix,
+       'name': name_to_id(name)}
+
+    print_free_items(name, production, indent, '', Parser_Context())
+
+    print "    free (p);"
+    print "}"
+
+def print_print_items(name, production, indent, accessor, ctx):
+    for item_idx in range(len(production)):
+        if item_idx > 0:
+            print
+
+        item = production[item_idx]
+        if item.type_ == 'constant':
+            if item.name:
+                print '%sspvbin_print_presence ("%s", indent + 1, p->%s);' % (
+                    indent, item.name, item.name)
+        elif item.type_ == 'variable':
+            if not item.name:
+                continue
+
+            if item.content[0] == 'nonterminal':
+                func = '%sprint_%s' % (prefix, name_to_id(item.content[1]))
+            else:
+                c_types = {'bool': 'bool',
+                           'byte': 'byte',
+                           'int16': 'int16',
+                           'int32': 'int32',
+                           'int64': 'int64',
+                           'be16': 'int16',
+                           'be32': 'int32',
+                           'be64': 'int64',
+                           'string': 'string',
+                           'bestring': 'string',
+                           'float': 'double',
+                           'double': 'double',
+                           '...': ('uint8_t', 1)}
+                func = 'spvbin_print_%s' % c_types[item.content[0]]
+
+            dst = "p->%s%s" % (accessor, name_to_id(item.name))
+            if item.n == 1:
+                print '%s%s ("%s", indent + 1, %s);' % (indent, func,
+                                                      item.name, dst)
+            else:
+                if isinstance(item.n, int):
+                    count = item.n
+                else:
+                    count = 'p->%s%s' % (accessor, name_to_id(item.n))
+
+                i_name = ctx.gen_name('i')
+                elem_name = ctx.gen_name('elem_name')
+                dst += '[%s]' % i_name
+                print """\
+%(indent)sfor (int %(index)s = 0; %(index)s < %(count)s; %(index)s++) {
+%(indent)s    char *%(elem_name)s = xasprintf ("%(item.name)s[%%d]", %(index)s);
+%(indent)s    %(func)s (%(elem_name)s, indent + 1, %(dst)s);
+%(indent)s    free (%(elem_name)s);
+%(indent)s}""" % {'indent': indent,
+                  'index': i_name,
+                  'count': count,
+                  'elem_name' : elem_name,
+                  'item.name': item.name,
+                  'func': func,
+                  'dst': dst}
+        elif item.type_ == '()':
+            if item.n != 1:
+                # Not yet implemented
+                raise AssertionError
+
+            print_print_items(name, item.content, indent, accessor, ctx)
+        elif item.type_ in  ('v1', 'v3', 'vAF', 'vB0'):
+            if item.n != 1:
+                # Not yet implemented
+                raise AssertionError
+
+            print_print_items(name, item.content, indent, accessor, ctx)
+        elif item.type_ in ('count', 'becount'):
+            if item.n != 1:
+                # Not yet implemented
+                raise AssertionError
+
+            indent += '    '
+            if (item.content
+                and item.content[-1].type_ == 'variable'
+                and item.content[-1].content[0] == '...'):
+                content = item.content[:-1]
+            else:
+                content = item.content
+            print_print_items(name, content, indent, accessor, ctx)
+        elif item.type_ == '|':
+            for choice in item.content:
+                print_print_items(name, choice, indent, accessor, ctx)
+        elif item.type_ == 'case':
+            i = 0
+            print """\
+%sspvbin_print_case ("%s", indent + 1, p->%s%s);""" % (
+    indent, item.name, accessor, name_to_id(item.name))
+            for choice_name, choice in sorted(item.content.items()):
+                if choice_name.endswith('else'):
+                    value_name = '-1'
+                else:
+                    value_name = '0x%s' % choice_name[-2:]
+
+                print '%s%sif (p->%s%s == %s) {' % (
+                    indent, '} else ' if i else '', accessor, item.name,
+                    value_name)
+                
+                print_print_items(name, choice, indent + '    ',
+                                  accessor + choice_name + '.', ctx)
+                i += 1
+            print "%s}" % indent
+        else:
+            # Not implemented
+            raise AssertionError
+
+
+def print_print(name, production, indent):
+    print '''
+void
+%(prefix)sprint_%(name)s (const char *title, int indent, const struct %(prefix)s%(name)s *p)
+{
+    spvbin_print_header (title, p ? p->start : -1, p ? p->len : -1, indent);
+    if (p == NULL) {
+        printf ("none\\n");
+        return;
+    }
+    putchar ('\\n');
+''' % {'prefix': prefix,
+       'rawname': name,
+       'name': name_to_id(name)}
+
+    ctx = Parser_Context()
+    print_print_items(name, production, indent, '', ctx)
+
+    print "}"
+
+def name_to_id(s):
+    return s[0].lower() + ''.join(['_%c' % x.lower() if x.isupper() else x
+                                   for x in s[1:]]).replace('-', '_')
+    
+
+if __name__ == "__main__":
+    argv0 = sys.argv[0]
+    try:
+        options, args = getopt.gnu_getopt(sys.argv[1:], 'h', ['help'])
+    except getopt.GetoptError as e:
+        sys.stderr.write("%s: %s\n" % (argv0, e.msg))
+        sys.exit(1)
+
+    for key, value in options:
+        if key in ['-h', '--help']:
+            usage()
+        else:
+            sys.exit(0)
+
+    if len(args) < 3:
+        sys.stderr.write("%s: bad usage (use --help for help)\n" % argv0)
+        sys.exit(1)
+
+    global file_name
+    file_name, output_type, prefix = args[:3]
+    input_file = open(file_name)
+
+    prefix = '%s_' % prefix
+
+    global line
+    global line_number
+    line = ""
+    line_number = 0
+
+    productions = {}
+
+    global token
+    token = ('start', )
+    get_token()
+    while True:
+        while match(';'):
+            pass
+        if token[0] == 'eof':
+            break
+
+        name, production = parse_production()
+        if name in productions:
+            fatal("%s: duplicate production" % name)
+        productions[name] = production
+
+    print '/* Generated automatically -- do not modify!    -*- buffer-read-only: t -*- */'
+    if output_type == 'code' and len(args) == 4:
+        header_name = args[3]
+
+        print """\
+#include <config.h>
+#include %s
+#include <stdio.h>
+#include <stdlib.h>
+#include "libpspp/str.h"
+#include "gl/xalloc.h"\
+""" % header_name
+        for name, production in productions.items():
+            print_parser(name, production, ' ' * 4)
+            print_free(name, production, ' ' * 4)
+            print_print(name, production, ' ' * 4)
+    elif output_type == 'header' and len(args) == 3:
+        print """\
+#ifndef %(PREFIX)sPARSER_H
+#define %(PREFIX)sPARSER_H
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include "output/spv/spvbin-helpers.h"\
+""" % {'PREFIX': prefix.upper()}
+        for name, production in productions.items():
+            print '\nstruct %s%s {' % (prefix, name_to_id(name))
+            print "    size_t start, len;"
+            print_members(production, ' ' * 4)
+            print '''};
+bool %(prefix)sparse_%(name)s (struct spvbin_input *, struct %(prefix)s%(name)s **);
+void %(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *);
+void %(prefix)sprint_%(name)s (const char *title, int indent, const struct %(prefix)s%(name)s *);\
+''' % {'prefix': prefix,
+       'name': name_to_id(name)}
+        print """\
+
+#endif /* %(PREFIX)sPARSER_H */""" % {'PREFIX': prefix.upper()}
+    else:
+        sys.stderr.write("%s: bad usage (use --help for help)\n" % argv0)
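The generator above derives every C identifier it emits from grammar names through `name_to_id`, then prepends the command-line prefix plus an underscore. The following is a minimal Python 3 sketch of that mapping (the generator itself is Python 2). The helper `generated_names` and the prefix `spvlb` are hypothetical and used only for illustration.

```python
def name_to_id(s):
    """Turn a grammar name such as 'PrintSettings' or 'n-footnotes'
    into a lower-case C identifier, mirroring the generator's rule."""
    rest = ''.join('_' + c.lower() if c.isupper() else c for c in s[1:])
    return s[0].lower() + rest.replace('-', '_')


def generated_names(prefix, production):
    """Hypothetical helper: list the parse/free/print function names
    that the generator would emit for one production."""
    ident = name_to_id(production)
    return ['%s_%s_%s' % (prefix, verb, ident)
            for verb in ('parse', 'free', 'print')]


if __name__ == '__main__':
    for name in ('PrintSettings', 'n-footnotes', 'X0'):
        print(name, '->', name_to_id(name))
    print(generated_names('spvlb', 'PrintSettings'))
```

Interior capitals become an underscore plus the lower-case letter, and hyphens become underscores, so member names such as `rotate-inner-column-labels` are usable as C struct fields.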
diff --git a/src/output/spv/detail-xml.grammar b/src/output/spv/detail-xml.grammar

new file mode 100644 (file)

index 0000000..756033e
--- /dev/null
+++ b/src/output/spv/detail-xml.grammar
@@ -0,0 +1,346 @@
+visualization
+   :creator
+   :date
+   :lang
+   :name
+   :style[style_ref]=ref style
+   :type
+   :version
+   :schemaLocation?
+=> visualization_extension?
+   userSource
+   (sourceVariable | derivedVariable)+
+   categoricalDomain?
+   graph
+   labelFrame[lf1]*
+   container?
+   labelFrame[lf2]*
+   style+
+   layerController?
+
+extension[visualization_extension]
+   :numRows=int?
+   :showGridline=bool?
+   :minWidthSet=(true)?
+   :maxWidthSet=(true)?
+=> EMPTY
+
+userSource :missing=(listwise | pairwise)? => EMPTY   # Related to omit_empty?
+
+categoricalDomain => variableReference simpleSort
+
+simpleSort :method[sort_method]=(custom) => categoryOrder
+
+sourceVariable
+   :id
+   :categorical=(true)
+   :source
+   :domain=ref categoricalDomain?
+   :sourceName
+   :dependsOn=ref sourceVariable?
+   :label?
+   :labelVariable=ref sourceVariable?
+=> variable_extension* (format | stringFormat)?
+
+derivedVariable
+   :id
+   :categorical=(true)
+   :value
+   :dependsOn=ref sourceVariable?
+=> variable_extension* (format | stringFormat)? valueMapEntry*
+
+extension[variable_extension] :from :helpId => EMPTY
+
+valueMapEntry :from :to => EMPTY
+
+categoryOrder => TEXT
+
+graph
+   :cellStyle=ref style
+   :style=ref style
+=> location+ coordinates faceting facetLayout interval
+
+location
+   :part=(height | width | top | bottom | left | right)
+   :method=(sizeToContent | attach | fixed | same)
+   :min=dimension?
+   :max=dimension?
+   :target=ref (labelFrame | graph | container)?
+   :value?
+=> EMPTY
+
+coordinates => EMPTY
+
+faceting => layer[layers1]* cross layer[layers2]*
+
+cross => (unity | nest) (unity | nest)
+
+nest => variableReference[vars]+
+
+unity => EMPTY
+
+variableReference :ref=ref (sourceVariable | derivedVariable) => EMPTY
+
+layer
+   :variable=ref (sourceVariable | derivedVariable)
+   :value
+   :visible=bool?
+   :method[layer_method]=(nest)?
+   :titleVisible=bool?
+=> EMPTY
+
+facetLayout => tableLayout setCellProperties[scp1]*
+               facetLevel+ setCellProperties[scp2]*
+
+tableLayout
+   :verticalTitlesInCorner=bool
+   :style=ref style?
+   :fitCells=(ticks | both)?
+=> EMPTY
+
+facetLevel :level=int :gap=dimension? => axis
+
+axis :style=ref style => label? majorTicks
+
+label
+   :style=ref style
+   :textFrameStyle=ref style?
+   :purpose=(title | subTitle | subSubTitle | layer | footnote)?
+=> text+ | descriptionGroup
+
+descriptionGroup
+   :target=ref faceting
+   :separator?
+=> (description | text)+
+
+description :name=(variable | value) => EMPTY
+
+majorTicks
+   :labelAngle=int
+   :length=dimension
+   :style=ref style
+   :tickFrameStyle=ref style
+   :labelFrequency=int?
+   :stagger=bool?
+=> gridline?
+
+gridline
+   :style=ref style
+   :zOrder=int
+=> EMPTY
+
+setCellProperties
+   :applyToConverse=bool?
+=> (setStyle | setFrameStyle | setFormat | setMetaData)* union[union_]?
+
+setStyle
+   :target=ref (labeling | graph | interval | majorTicks)
+   :style=ref style
+=> EMPTY
+
+setMetaData
+   :target=ref graph
+   :key
+   :value
+=> EMPTY
+
+setFormat
+   :target=ref (majorTicks | labeling)
+   :reset=bool?
+=> format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat
+
+setFrameStyle
+   :style=ref style
+   :target=ref majorTicks
+=> EMPTY
+
+format
+   :baseFormat[f_base_format]=(date | time | dateTime | elapsedTime)?
+   :errorCharacter?
+   :separatorChars?
+   :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
+   :showYear=bool?
+   :showQuarter=bool?
+   :quarterPrefix?
+   :quarterSuffix?
+   :yearAbbreviation=bool?
+   :showMonth=bool?
+   :monthFormat=(long | short | number | paddedNumber)?
+   :dayPadding=bool?
+   :dayOfMonthPadding=bool?
+   :showWeek=bool?
+   :weekPadding=bool?
+   :weekSuffix?
+   :showDayOfWeek=bool?
+   :dayOfWeekAbbreviation=bool?
+   :hourPadding=bool?
+   :minutePadding=bool?
+   :secondPadding=bool?
+   :showDay=bool?
+   :showHour=bool?
+   :showMinute=bool?
+   :showSecond=bool?
+   :showMillis=bool?
+   :dayType=(month | year)?
+   :hourFormat=(AMPM | AS_24 | AS_12)?
+   :minimumIntegerDigits=int?
+   :maximumFractionDigits=int?
+   :minimumFractionDigits=int?
+   :useGrouping=bool?
+   :scientific=(onlyForSmall | whenNeeded | true | false)?
+   :small=real?
+   :prefix?
+   :suffix?
+   :tryStringsAsNumbers=bool?
+   :negativesOutside=bool?
+=> relabel* affix*
+
+numberFormat
+   :minimumIntegerDigits=int?
+   :maximumFractionDigits=int?
+   :minimumFractionDigits=int?
+   :useGrouping=bool?
+   :scientific=(onlyForSmall | whenNeeded | true | false)?
+   :small=real?
+   :prefix?
+   :suffix?
+=> affix*
+
+stringFormat => relabel* affix*
+
+dateTimeFormat
+   :baseFormat[dt_base_format]=(date | time | dateTime)
+   :separatorChars?
+   :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)?
+   :showYear=bool?
+   :yearAbbreviation=bool?
+   :showQuarter=bool?
+   :quarterPrefix?
+   :quarterSuffix?
+   :showMonth=bool?
+   :monthFormat=(long | short | number | paddedNumber)?
+   :showWeek=bool?
+   :weekPadding=bool?
+   :weekSuffix?
+   :showDayOfWeek=bool?
+   :dayOfWeekAbbreviation=bool?
+   :dayPadding=bool?
+   :dayOfMonthPadding=bool?
+   :hourPadding=bool?
+   :minutePadding=bool?
+   :secondPadding=bool?
+   :showDay=bool?
+   :showHour=bool?
+   :showMinute=bool?
+   :showSecond=bool?
+   :showMillis=bool?
+   :dayType=(month | year)?
+   :hourFormat=(AMPM | AS_24 | AS_12)?
+=> affix*
+
+elapsedTimeFormat
+   :baseFormat[dt_base_format]=(date | time | dateTime)
+   :dayPadding=bool?
+   :hourPadding=bool?
+   :minutePadding=bool?
+   :secondPadding=bool?
+   :showYear=bool?
+   :showDay=bool?
+   :showHour=bool?
+   :showMinute=bool?
+   :showSecond=bool?
+   :showMillis=bool?
+=> affix*
+
+affix
+   :definesReference=int
+   :position=(subscript | superscript)
+   :suffix=bool
+   :value
+=> EMPTY
+
+relabel :from=real :to => EMPTY
+
+union => intersect+
+
+intersect => where+ | intersectWhere | alternating | EMPTY
+
+where
+   :variable=ref (sourceVariable | derivedVariable)
+   :include
+=> EMPTY
+
+intersectWhere
+   :variable=ref (sourceVariable | derivedVariable)
+   :variable2=ref (sourceVariable | derivedVariable)
+=> EMPTY
+
+alternating => EMPTY
+
+text
+   :usesReference=int?
+   :definesReference=int?
+   :position=(subscript | superscript)?
+   :style=ref style
+=> TEXT
+
+interval :style=ref style => labeling footnotes?
+
+labeling
+   :style=ref style?
+   :variable=ref (sourceVariable | derivedVariable)
+=> (formatting | format | footnotes)*
+
+formatting :variable=ref (sourceVariable | derivedVariable) => formatMapping*
+
+formatMapping :from=int => format?
+
+footnotes
+   :superscript=bool?
+   :variable=ref (sourceVariable | derivedVariable)
+=> footnoteMapping*
+
+footnoteMapping :definesReference=int :from=int :to => EMPTY
+
+style
+   :color=color?
+   :color2=color?
+   :labelAngle=real?
+   :border-bottom=(solid | thick | thin | double | none)?
+   :border-top=(solid | thick | thin | double | none)?
+   :border-left=(solid | thick | thin | double | none)?
+   :border-right=(solid | thick | thin | double | none)?
+   :border-bottom-color?
+   :border-top-color?
+   :border-left-color?
+   :border-right-color?
+   :font-family?
+   :font-size?
+   :font-weight=(regular | bold)?
+   :font-style=(regular | italic)?
+   :font-underline=(none | underline)?
+   :margin-bottom=dimension?
+   :margin-left=dimension?
+   :margin-right=dimension?
+   :margin-top=dimension?
+   :textAlignment=(left | right | center | decimal | mixed)?
+   :labelLocationHorizontal=(positive | negative | center)?
+   :labelLocationVertical=(positive | negative | center)?
+   :decimal-offset=dimension?
+   :size?
+   :width?
+   :visible=bool?
+=> EMPTY
+
+layerController
+   :source=(tableData)
+   :target=ref label?
+=> EMPTY
+
+container :style=ref style => container_extension? location+ labelFrame*
+
+extension[container_extension] :combinedFootnotes=(true) => EMPTY
+
+labelFrame :style=ref style => location+ label? paragraph?
+
+paragraph :hangingIndent=dimension? => EMPTY
diff --git a/src/output/spv/light-binary.grammar b/src/output/spv/light-binary.grammar

new file mode 100644 (file)

index 0000000..b48f3d8
--- /dev/null
+++ b/src/output/spv/light-binary.grammar
@@ -0,0 +1,201 @@
+Table =>
+   Header Titles Footnotes
+   Areas Borders PrintSettings[ps] TableSettings[ts] Formats
+   Dimensions Axes Cells
+   01?
+
+Header =>
+   01 00
+   int32[version]
+   bool[x0]
+   bool[x1]
+   bool[rotate-inner-column-labels]
+   bool[rotate-outer-row-labels]
+   bool[x2]
+   int32[x3]
+   int32[min-col-width] int32[max-col-width]
+   int32[min-row-height] int32[max-row-height]
+   int64[table-id]
+
+Titles =>
+   Value[title] 01?
+   Value[subtype] 01? 31
+   Value[user-title] 01?
+   (31 Value[corner-text] | 58)
+   (31 Value[caption] | 58)
+
+Footnotes => int32[n-footnotes] Footnote*[n-footnotes]
+Footnote => Value[text] (58 | 31 Value[marker]) int32[x4]
+
+Areas => 00? Area*8[areas]
+Area =>
+   byte[index] 31
+   string[typeface] float[size] int32[style] bool[underline]
+   int32[halign] int32[valign]
+   string[fg-color] string[bg-color]
+   bool[alternate] string[alt-fg-color] string[alt-bg-color]
+   v3(int32[left-margin] int32[right-margin] int32[top-margin] int32[bottom-margin])
+
+Borders =>
+   count(
+       ib1
+       be32[n-borders] Border*[n-borders]
+       bool[show-grid-lines]
+       00 00 00)
+
+Border =>
+   be32[border-type]
+   be32[stroke-type]
+   be32[color]
+
+PrintSettings =>
+   count(
+       ib1
+       bool[all-layers]
+       bool[paginate-layers]
+       bool[fit-width]
+       bool[fit-length]
+       bool[top-continuation]
+       bool[bottom-continuation]
+       be32[n-orphan-lines]
+       bestring[continuation-string])
+
+TableSettings =>
+   count(
+     v3(
+       ib1
+       be32[x5]
+       be32[current-layer]
+       bool[omit-empty]
+       bool[show-row-labels-in-corner]
+       bool[show-alphabetic-markers]
+       bool[footnote-marker-superscripts]
+       byte[x6]
+       becount(
+        Breakpoints[row-breaks] Breakpoints[col-breaks]
+        Keeps[row-keeps] Keeps[col-keeps]
+        PointKeeps[row-point-keeps] PointKeeps[col-point-keeps]
+       )
+       bestring[notes]
+       bestring[table-look]
+       )...)
+
+Breakpoints => be32[n-breaks] be32*[n-breaks]
+
+Keeps => be32[n-keeps] Keep*[n-keeps]
+Keep => be32[offset] be32[n]
+
+PointKeeps => be32[n-point-keeps] PointKeep*[n-point-keeps]
+PointKeep => be32[offset] be32 be32
+
+Formats =>
+   int32[n-widths] int32*[n-widths]
+   string[locale]
+   int32[current-layer]
+   bool[x7] bool[x8] bool[x9]
+   Y0
+   CustomCurrency
+   count(
+     v1(X0?)
+     v3(count(X1 count(X2)) count(X3)))
+Y0 => int32[epoch] byte[decimal] byte[grouping]
+CustomCurrency => int32[n-ccs] string*[n-ccs]
+
+X0 => byte*14 Y1 Y2
+Y1 =>
+   string[command] string[command-local]
+   string[language] string[charset] string[locale]
+   bool[x10] bool[x11] bool[x12] bool[x13]
+   Y0
+Y2 => CustomCurrency byte[missing] bool[x16]
+
+X1 =>
+   00 byte[x14] bool[x15]
+   byte[lang]
+   byte[show-variables]
+   byte[show-values]
+   int32[x17] int32[x18]
+   00*17
+   bool[x19]
+   01
+
+X2 =>
+   int32[n-row-heights] int32*[n-row-heights]
+   int32[n-style-map] StyleMap*[n-style-map]
+   int32[n-styles] StylePair*[n-styles]
+   count((i0 i0)?)
+StyleMap => int64[cell-index] int16[style-index]
+
+X3 =>
+   01 00 byte[x20] 00 00 00
+   Y1
+   double[small] 01
+   (string[dataset] string[datafile] i0 int32[date] i0)?
+   Y2
+   (int32[x21] i0)?
+
+Dimensions => int32[n-dims] Dimension*[n-dims]
+Dimension =>
+    Value[name] DimProperties[props]
+    int32[n-categories] Category*[n-categories]
+DimProperties =>
+   byte[x1]
+   byte[x2]
+   int32[x3]
+   bool[hide-dim-label]
+   bool[hide-all-labels]
+   01 int32[dim-index]
+
+Category => Value[name] (Leaf | Group)
+Leaf => 00 00 00 i2 int32[leaf-index] i0
+Group =>
+   bool[merge] 00 01 int32[x22]
+   i-1 int32[n-subcategories] Category*[n-subcategories]
+
+Axes =>
+   int32[n-layers] int32[n-rows] int32[n-columns]
+   int32*[n-layers] int32*[n-rows] int32*[n-columns]
+
+Cells => int32[n-cells] Cell*[n-cells]
+Cell => int64[index] v1(00?) Value
+
+Value =>
+  00? 00? 00? 00?
+  case(
+      01 ValueMod int32[format] double[x]
+    | 02 ValueMod int32[format] double[x]
+      string[var-name] string[value-label] byte[show]
+    | 03 string[local] ValueMod string[id] string[c] bool[fixed]
+    | 04 ValueMod int32[format] string[value-label] string[var-name]
+      byte[show] string[s]
+    | 05 ValueMod string[var-name] string[var-label] byte[show]
+    | 06 string[local] ValueMod string[id] string[c]
+    | else ValueMod string[template] int32[n-args] Argument*[n-args]
+  )[type]
+Argument =>
+    i0 Value[value]
+  | int32[n-values] i0 Value*[n-values]
+
+ValueMod =>
+    58
+  | 31
+    int32[n-refs] int16*[n-refs]
+    (i0 | i1 string[subscript])
+    v1(00 (i1 | i2) 00? 00? int32 00? 00?)
+    v3(count(TemplateString StylePair))
+
+TemplateString => count((count((i0 58)?) (58 | 31 string[id]))?)
+
+StylePair =>
+    (31 FontStyle | 58)
+    (31 CellStyle | 58)
+
+FontStyle =>
+    bool[bold] bool[italic] bool[underline] bool[show]
+    string[fg-color] string[bg-color]
+    string[typeface] byte[size]
+
+CellStyle =>
+    int32[halign] int32[valign] double[decimal-offset]
+    int16[left-margin] int16[right-margin]
+    int16[top-margin] int16[bottom-margin]
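The `Header` production in the grammar above describes a fixed-size record, so it can be decoded in a single step. The following Python 3 sketch assumes that `int32` and `int64` are little-endian (the separate `be32` type suggests this, but the grammar does not state it) and that `bool` occupies one byte. `HEADER`, `FIELDS`, and `parse_header` are illustrative names, not part of the patch.

```python
import struct

# Header => 01 00 int32 bool*5 int32 int32*4 int64
# '<' selects little-endian with no padding (an assumption, see above).
HEADER = struct.Struct("<2s i 5? i 4i q")

FIELDS = ("version", "x0", "x1",
          "rotate_inner_column_labels", "rotate_outer_row_labels",
          "x2", "x3",
          "min_col_width", "max_col_width",
          "min_row_height", "max_row_height",
          "table_id")


def parse_header(data, ofs=0):
    """Decode one Header at data[ofs:].

    Returns (fields, new_offset); raises ValueError if the input is
    too short or the leading constant bytes 01 00 do not match.
    Note: struct's '?' treats any nonzero byte as True, which is more
    lenient than a parser that accepts only 00 and 01.
    """
    if len(data) - ofs < HEADER.size:
        raise ValueError("truncated header")
    magic, *values = HEADER.unpack_from(data, ofs)
    if magic != b"\x01\x00":
        raise ValueError("header does not start with 01 00")
    return dict(zip(FIELDS, values)), ofs + HEADER.size


if __name__ == "__main__":
    sample = HEADER.pack(b"\x01\x00", 3, False, True, True, False, True,
                         21, 36, 72, 36, 120, 42)
    fields, end = parse_header(sample)
    print(end, fields["version"], fields["table_id"])
```

Variable-length productions such as `Titles` or `Value` cannot be handled this way, because their size depends on the data. That is why the generated parser walks the input with an explicit offset and backtracks on a mismatch.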
diff --git a/src/output/spv/old-binary.grammar b/src/output/spv/old-binary.grammar

new file mode 100644 (file)

index 0000000..0e7f0d1
--- /dev/null
+++ b/src/output/spv/old-binary.grammar
@@ -0,0 +1,23 @@
+LegacyBinary =>
+    00 byte[version] int16[n-sources] int32[member-size]
+    Metadata*[n-sources][metadata]
+    #Data*[n-sources][data]
+    #Strings?
+
+Metadata =>
+    int32[n-values] int32[n-variables] int32[data-offset]
+    byte*28[source-name]
+    vB0(byte*36[ext-source-name] int32[x])
+
+#Data => Variable*[n-variables]
+#Variable => byte*288[variable-name] double*[n-values]
+
+Strings => SourceMaps[maps] Labels
+
+SourceMaps => int32[n-maps] SourceMap*[n-maps]
+SourceMap => string[source-name] int32[n-variables] VariableMap*[n-variables]
+VariableMap => string[variable-name] int32[n-data] DatumMap*[n-data]
+DatumMap => int32[value-idx] int32[label-idx]
+
+Labels => int32[n-labels] Label*[n-labels]
+Label => int32[frequency] string[label]
diff --git a/src/output/spv/spv-css-parser.c b/src/output/spv/spv-css-parser.c

new file mode 100644 (file)

index 0000000..c3a7118
--- /dev/null
+++ b/src/output/spv/spv-css-parser.c
@@ -0,0 +1,175 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "spv-css-parser.h"
+
+#include <stdlib.h>
+#include <string.h>
+
+#include "libpspp/str.h"
+#include "output/pivot-table.h"
+#include "spv.h"
+
+#include "gl/c-ctype.h"
+#include "gl/xalloc.h"
+#include "gl/xmemdup0.h"
+
+enum css_token_type
+  {
+    T_EOF,
+    T_ID,
+    T_LCURLY,
+    T_RCURLY,
+    T_COLON,
+    T_SEMICOLON,
+    T_ERROR
+  };
+
+struct css_token
+  {
+    enum css_token_type type;
+    char *s;
+  };
+
+static char *
+css_skip_spaces (char *p)
+{
+  for (;;)
+    {
+      if (c_isspace (*p))
+        p++;
+      else if (!strncmp (p, "<!--", 4))
+        p += 4;
+      else if (!strncmp (p, "-->", 3))
+        p += 3;
+      else
+        return p;
+    }
+}
+
+static bool
+css_is_separator (unsigned char c)
+{
+  return c_isspace (c) || strchr ("{}:;", c);
+}
+
+static void
+css_token_get (char **p_, struct css_token *token)
+{
+  char *p = *p_;
+
+  free (token->s);
+  token->s = NULL;
+
+  p = css_skip_spaces (p);
+  if (*p == '\0')
+    token->type = T_EOF;
+  else if (*p == '{')
+    {
+      token->type = T_LCURLY;
+      p++;
+    }
+  else if (*p == '}')
+    {
+      token->type = T_RCURLY;
+      p++;
+    }
+  else if (*p == ':')
+    {
+      token->type = T_COLON;
+      p++;
+    }
+  else if (*p == ';')
+    {
+      token->type = T_SEMICOLON;
+      p++;
+    }
+  else
+    {
+      token->type = T_ID;
+      char *start = p;
+      while (!css_is_separator (*p))
+        p++;
+      token->s = xmemdup0 (start, p - start);
+    }
+  *p_ = p;
+}
+
+static void
+css_decode_key_value (const char *key, const char *value,
+                      struct font_style *font)
+{
+  if (!strcmp (key, "font-weight"))
+    font->bold = !strcmp (value, "bold");
+  else if (!strcmp (key, "font-style"))
+    font->italic = !strcmp (value, "italic");
+  else if (!strcmp (key, "font-decoration"))
+    font->underline = !strcmp (value, "underline");
+  else if (!strcmp (key, "font-family"))
+    {
+      free (font->typeface);
+      font->typeface = xstrdup (value);
+    }
+  else if (!strcmp (key, "font-size"))
+    font->size = atoi (value);
+
+  /* fg_color, bg_color */
+
+}
+
+char *
+spv_parse_css_style (char *style, struct font_style *font)
+{
+  *font = (struct font_style) FONT_STYLE_INITIALIZER;
+
+  char *p = style;
+  struct css_token token = { .s = NULL };
+  css_token_get (&p, &token);
+  while (token.type != T_EOF)
+    {
+      if (token.type != T_ID || !strcmp (token.s, "p"))
+        {
+          css_token_get (&p, &token);
+          continue;
+        }
+
+      char *key = token.s;
+      token.s = NULL;
+      css_token_get (&p, &token);
+
+      if (token.type == T_COLON)
+        {
+          struct string value = DS_EMPTY_INITIALIZER;
+          for (;;)
+            {
+              css_token_get (&p, &token);
+              if (token.type != T_ID)
+                break;
+              if (!ds_is_empty (&value))
+                ds_put_byte (&value, ' ');
+              ds_put_cstr (&value, token.s);
+            }
+
+          css_decode_key_value (key, ds_cstr (&value), font);
+
+          ds_destroy (&value);
+        }
+      free (key);
+    }
+  return NULL;
+}
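The control flow of `spv_parse_css_style` above is easier to follow in a compact form. The following Python 3 sketch mirrors the tokenizer and the key/value loop of the C code as closely as practical. It is an illustration, not a replacement: it returns only the properties that were actually set instead of filling a `struct font_style`, and the names `tokenize`, `atoi`, and `parse_css_style` are chosen here for the sketch.

```python
import re

SPACE = " \t\n\v\f\r"
PUNCT = "{}:;"


def tokenize(s):
    """Yield (type, text) pairs; type is 'id' or one of '{', '}', ':', ';'."""
    i = 0
    while True:
        # Skip white space and the HTML comment delimiters, as
        # css_skip_spaces does.
        while i < len(s):
            if s[i] in SPACE:
                i += 1
            elif s.startswith("<!--", i):
                i += 4
            elif s.startswith("-->", i):
                i += 3
            else:
                break
        if i == len(s):
            return
        if s[i] in PUNCT:
            yield (s[i], None)
            i += 1
        else:
            start = i
            while i < len(s) and s[i] not in SPACE + PUNCT:
                i += 1
            yield ("id", s[start:i])


def atoi(text):
    """Leading-integer conversion in the spirit of C's atoi()."""
    m = re.match(r"\s*([+-]?[0-9]+)", text)
    return int(m.group(1)) if m else 0


def parse_css_style(style):
    toks = list(tokenize(style)) + [("eof", None)]
    font, pos = {}, 0
    while toks[pos][0] != "eof":
        kind, text = toks[pos]
        if kind != "id" or text == "p":
            pos += 1                      # ignore selectors and punctuation
            continue
        key, pos = text, pos + 1
        if toks[pos][0] != ":":
            continue                      # a bare identifier: drop it
        pos += 1
        words = []
        while toks[pos][0] == "id":       # value = IDs joined by one space
            words.append(toks[pos][1])
            pos += 1
        value = " ".join(words)
        if key == "font-weight":
            font["bold"] = value == "bold"
        elif key == "font-style":
            font["italic"] = value == "italic"
        elif key == "font-decoration":
            font["underline"] = value == "underline"
        elif key == "font-family":
            font["typeface"] = value
        elif key == "font-size":
            font["size"] = atoi(value)
    return font


if __name__ == "__main__":
    print(parse_css_style("<!-- p { font-family: Sans Serif; "
                          "font-size: 10pt; font-weight: bold } -->"))
```

As in the C code, a multi-word value such as `Sans Serif` is rebuilt by joining consecutive identifier tokens with single spaces, and unknown keys are silently ignored.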
diff --git a/src/output/spv/spv-css-parser.h b/src/output/spv/spv-css-parser.h

new file mode 100644 (file)

index 0000000..44f7142
--- /dev/null
+++ b/src/output/spv/spv-css-parser.h
@@ -0,0 +1,24 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_CSS_PARSER_H
+#define OUTPUT_SPV_CSS_PARSER_H 1
+
+struct font_style;
+
+char *spv_parse_css_style (char *style, struct font_style *font);
+
+#endif /* output/spv/spv-css-parser.h */
diff --git a/src/output/spv/spv-dump.c b/src/output/spv/spv-dump.c
new file mode 100644
index 0000000..2e9841b
--- /dev/null
+++ b/src/output/spv/spv-dump.c
@@ -0,0 +1,83 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2017, 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spv.h"
+
+#include <inttypes.h>
+#include <stdlib.h>
+
+#include "data/settings.h"
+#include "output/pivot-table.h"
+
+#include "gl/xalloc.h"
+
+static void
+indent (int indentation)
+{
+  for (int i = 0; i < indentation * 2; i++)
+    putchar (' ');
+}
+
+void
+spv_item_dump (const struct spv_item *item, int indentation)
+{
+  indent (indentation);
+  if (item->label)
+    printf ("\"%s\" ", item->label);
+  if (!item->visible)
+    printf ("(hidden) ");
+
+  switch (item->type)
+    {
+    case SPV_ITEM_HEADING:
+      printf ("heading\n");
+      for (size_t i = 0; i < item->n_children; i++)
+        spv_item_dump (item->children[i], indentation + 1);
+      break;
+
+    case SPV_ITEM_TEXT:
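+      {
+        /* pivot_value_to_string() returns a malloc()'d string that the
+           caller must free. */
+        char *s = pivot_value_to_string (item->text,
+                                         SETTINGS_VALUE_SHOW_DEFAULT,
+                                         SETTINGS_VALUE_SHOW_DEFAULT);
+        printf ("text \"%s\"\n", s);
+        free (s);
+      }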
+      break;
+
+    case SPV_ITEM_TABLE:
+      if (item->table)
+        pivot_table_dump (item->table, indentation + 1);
+      else
+        {
+          printf ("unloaded table in %s", item->bin_member);
+          if (item->xml_member)
+            printf (" and %s", item->xml_member);
+          putchar ('\n');
+        }
+      break;
+
+    case SPV_ITEM_GRAPH:
+      printf ("graph\n");
+      break;
+
+    case SPV_ITEM_MODEL:
+      printf ("model\n");
+      break;
+
+    case SPV_ITEM_OBJECT:
+      printf ("object type=\"%s\" uri=\"%s\"\n", item->object_type, item->uri);
+      break;
+    }
+}
diff --git a/src/output/spv/spv-legacy-data.c b/src/output/spv/spv-legacy-data.c
new file mode 100644
index 0000000..3b732ef
--- /dev/null
+++ b/src/output/spv/spv-legacy-data.c
@@ -0,0 +1,389 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spv-legacy-data.h"
+
+#include <inttypes.h>
+#include <math.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "libpspp/cast.h"
+#include "libpspp/float-format.h"
+#include "data/val-type.h"
+#include "output/spv/old-binary-parser.h"
+
+#include "gl/minmax.h"
+#include "gl/xalloc.h"
+#include "gl/xmemdup0.h"
+#include "gl/xsize.h"
+#include "gl/xvasprintf.h"
+
+void
+spv_data_uninit (struct spv_data *data)
+{
+  if (!data)
+    return;
+
+  for (size_t i = 0; i < data->n_sources; i++)
+    spv_data_source_uninit (&data->sources[i]);
+  free (data->sources);
+}
+
+void
+spv_data_dump (const struct spv_data *data, FILE *stream)
+{
+  for (size_t i = 0; i < data->n_sources; i++)
+    {
+      if (i > 0)
+        putc ('\n', stream);
+      spv_data_source_dump (&data->sources[i], stream);
+    }
+}
+
+struct spv_data_source *
+spv_data_find_source (const struct spv_data *data, const char *source_name)
+{
+  for (size_t i = 0; i < data->n_sources; i++)
+    if (!strcmp (data->sources[i].source_name, source_name))
+      return &data->sources[i];
+
+  return NULL;
+}
+
+struct spv_data_variable *
+spv_data_find_variable (const struct spv_data *data,
+                        const char *source_name,
+                        const char *variable_name)
+{
+  struct spv_data_source *source = spv_data_find_source (data, source_name);
+  return source ? spv_data_source_find_variable (source, variable_name) : NULL;
+}
+
+void
+spv_data_source_uninit (struct spv_data_source *source)
+{
+  if (!source)
+    return;
+
+  for (size_t i = 0; i < source->n_vars; i++)
+    spv_data_variable_uninit (&source->vars[i]);
+  free (source->vars);
+  free (source->source_name);
+}
+
+void
+spv_data_source_dump (const struct spv_data_source *source, FILE *stream)
+{
+  fprintf (stream, "source \"%s\" (%zu values):\n",
+           source->source_name, source->n_values);
+  for (size_t i = 0; i < source->n_vars; i++)
+    spv_data_variable_dump (&source->vars[i], stream);
+}
+
+struct spv_data_variable *
+spv_data_source_find_variable (const struct spv_data_source *source,
+                               const char *variable_name)
+{
+  for (size_t i = 0; i < source->n_vars; i++)
+    if (!strcmp (source->vars[i].var_name, variable_name))
+      return &source->vars[i];
+  return NULL;
+}
+
+void
+spv_data_variable_uninit (struct spv_data_variable *var)
+{
+  if (!var)
+    return;
+
+  free (var->var_name);
+  for (size_t i = 0; i < var->n_values; i++)
+    spv_data_value_uninit (&var->values[i]);
+  free (var->values);
+}
+
+void
+spv_data_variable_dump (const struct spv_data_variable *var, FILE *stream)
+{
+  fprintf (stream, "variable \"%s\":", var->var_name);
+  for (size_t i = 0; i < var->n_values; i++)
+    {
+      if (i)
+        putc (',', stream);
+      putc (' ', stream);
+      spv_data_value_dump (&var->values[i], stream);
+    }
+  putc ('\n', stream);
+}
+
+void
+spv_data_value_uninit (struct spv_data_value *value)
+{
+  if (value && value->width >= 0)
+    free (value->s);
+}
+
+bool
+spv_data_value_equal (const struct spv_data_value *a,
+                      const struct spv_data_value *b)
+{
+  return (a->width == b->width
+          && a->index == b->index
+          && (a->width < 0
+              ? a->d == b->d
+              : !strcmp (a->s, b->s)));
+}
+
+struct spv_data_value *
+spv_data_values_clone (const struct spv_data_value *src, size_t n)
+{
+  struct spv_data_value *dst = xmemdup (src, n * sizeof *src);
+  for (size_t i = 0; i < n; i++)
+    if (dst[i].width >= 0)
+      dst[i].s = xstrdup (dst[i].s);
+  return dst;
+}
+
+void
+spv_data_value_dump (const struct spv_data_value *value, FILE *stream)
+{
+  if (value->index != SYSMIS)
+    fprintf (stream, "%.*ge-", DBL_DIG + 1, value->index);
+  if (value->width >= 0)
+    fprintf (stream, "\"%s\"", value->s);
+  else if (value->d == SYSMIS)
+    putc ('.', stream);
+  else
+    fprintf (stream, "%.*g", DBL_DIG + 1, value->d);
+}
+\f
+static char *
+decode_fixed_string (const uint8_t *buf_, size_t size)
+{
+  const char *buf = CHAR_CAST (char *, buf_);
+  return xmemdup0 (buf, strnlen (buf, size));
+}
+
+static char *
+decode_var_name (const struct spvob_metadata *md)
+{
+  int n0 = strnlen ((char *) md->source_name, sizeof md->source_name);
+  int n1 = (n0 < sizeof md->source_name ? 0
+            : strnlen ((char *) md->ext_source_name,
+                       sizeof md->ext_source_name));
+  return xasprintf ("%.*s%.*s",
+                    n0, (char *) md->source_name,
+                    n1, (char *) md->ext_source_name);
+}
+
+static char * WARN_UNUSED_RESULT
+decode_data (const uint8_t *in, size_t size, size_t data_offset,
+             struct spv_data_source *source, size_t *end_offsetp)
+{
+  size_t var_size = xsum (288, xtimes (source->n_values, 8));
+  size_t source_size = xtimes (source->n_vars, var_size);
+  size_t end_offset = xsum (data_offset, source_size);
+  if (size_overflow_p (end_offset))
+    return xasprintf ("Data source \"%s\" exceeds supported %zu-byte size.",
+                      source->source_name, SIZE_MAX - 1);
+  if (end_offset > size)
+    return xasprintf ("%zu-byte data source \"%s\" starting at offset %#zx "
+                      "runs past end of %zu-byte ZIP member.",
+                      source_size, source->source_name, data_offset,
+                      size);
+
+  in += data_offset;
+  for (size_t i = 0; i < source->n_vars; i++)
+    {
+      struct spv_data_variable *var = &source->vars[i];
+      var->var_name = decode_fixed_string (in, 288);
+      in += 288;
+
+      var->values = xnmalloc (source->n_values, sizeof *var->values);
+      var->n_values = source->n_values;
+      for (size_t j = 0; j < source->n_values; j++)
+        {
+          var->values[j].index = SYSMIS;
+          var->values[j].width = -1;
+          var->values[j].d = float_get_double (FLOAT_IEEE_DOUBLE_LE, in);
+          in += 8;
+        }
+    }
+
+  *end_offsetp = end_offset;
+  return NULL;
+}
+
+static char * WARN_UNUSED_RESULT
+decode_variable_map (const char *source_name,
+                     const struct spvob_variable_map *in,
+                     const struct spvob_labels *labels,
+                     struct spv_data_variable *out)
+{
+  if (strcmp (in->variable_name, out->var_name))
+    return xasprintf ("Source \"%s\" variable \"%s\" mapping is associated "
+                      "with wrong variable \"%s\".",
+                      source_name, out->var_name, in->variable_name);
+
+  for (size_t i = 0; i < in->n_data; i++)
+    {
+      const struct spvob_datum_map *map = in->data[i];
+
+      if (map->value_idx >= out->n_values)
+        return xasprintf ("Source \"%s\" variable \"%s\" mapping %zu "
+                          "attempts to set 0-based value %"PRIu32" "
+                          "but source has only %zu values.",
+                          source_name, out->var_name, i,
+                          map->value_idx, out->n_values);
+      struct spv_data_value *value = &out->values[map->value_idx];
+      if (value->width >= 0)
+        return xasprintf ("Source \"%s\" variable \"%s\" mapping %zu "
+                          "attempts to change string value %"PRIu32".",
+                          source_name, out->var_name, i,
+                          map->value_idx);
+      else if (value->d != SYSMIS && !isnan (value->d))
+        return xasprintf ("Source \"%s\" variable \"%s\" mapping %zu "
+                          "attempts to change non-missing value %"PRIu32".",
+                          source_name, out->var_name, i,
+                          map->value_idx);
+
+      if (map->label_idx >= labels->n_labels)
+        return xasprintf ("Source \"%s\" variable \"%s\" mapping %zu "
+                          "attempts to set value %"PRIu32" to 0-based label "
+                          "%"PRIu32" but only %"PRIu32" labels are present.",
+                          source_name, out->var_name, i,
+                          map->value_idx, map->label_idx, labels->n_labels);
+      const struct spvob_label *label = labels->labels[map->label_idx];
+
+      value->width = strlen (label->label);
+      value->s = xmemdup0 (label->label, value->width);
+    }
+
+  return NULL;
+}
+
+static char * WARN_UNUSED_RESULT
+decode_source_map (const struct spvob_source_map *in,
+                   const struct spvob_labels *labels,
+                   struct spv_data_source *out)
+{
+  if (in->n_variables > out->n_vars)
+    return xasprintf ("source map for \"%s\" has %"PRIu32" variables but "
+                      "source has only %zu",
+                      out->source_name, in->n_variables, out->n_vars);
+
+  for (size_t i = 0; i < in->n_variables; i++)
+    {
+      char *error = decode_variable_map (out->source_name, in->variables[i],
+                                         labels, &out->vars[i]);
+      if (error)
+        return error;
+    }
+
+  return NULL;
+}
+
+static char * WARN_UNUSED_RESULT
+decode_strings (const struct spvob_strings *in, struct spv_data *out)
+{
+  for (size_t i = 0; i < in->maps->n_maps; i++)
+    {
+      const struct spvob_source_map *sm = in->maps->maps[i];
+      const char *name = sm->source_name;
+      struct spv_data_source *source = spv_data_find_source (out, name);
+      if (!source)
+        return xasprintf ("cannot decode source map for unknown source \"%s\"",
+                          name);
+
+      char *error = decode_source_map (sm, in->labels, source);
+      if (error)
+        return error;
+    }
+
+  return NULL;
+}
+
+char * WARN_UNUSED_RESULT
+spv_legacy_data_decode (const uint8_t *in, size_t size, struct spv_data *out)
+{
+  char *error = NULL;
+  memset (out, 0, sizeof *out);
+
+  struct spvbin_input input;
+  spvbin_input_init (&input, in, size);
+
+  struct spvob_legacy_binary *lb;
+  bool ok = spvob_parse_legacy_binary (&input, &lb);
+  if (!ok)
+    {
+      error = spvbin_input_to_error (&input, NULL);
+      goto error;
+    }
+
+  out->sources = xcalloc (lb->n_sources, sizeof *out->sources);
+  out->n_sources = lb->n_sources;
+
+  for (size_t i = 0; i < lb->n_sources; i++)
+    {
+      const struct spvob_metadata *md = lb->metadata[i];
+      struct spv_data_source *source = &out->sources[i];
+
+      source->source_name = decode_var_name (md);
+      source->n_vars = md->n_variables;
+      source->n_values = md->n_values;
+      source->vars = xcalloc (md->n_variables, sizeof *source->vars);
+
+      size_t end;
+      error = decode_data (in, size, md->data_offset, source, &end);
+      if (error)
+        goto error;
+
+      input.ofs = MAX (input.ofs, end);
+    }
+
+  if (input.ofs < input.size)
+    {
+      struct spvob_strings *strings;
+      bool ok = spvob_parse_strings (&input, &strings);
+      if (!ok)
+        error = spvbin_input_to_error (&input, NULL);
+      else
+        {
+          if (input.ofs != input.size)
+            error = xasprintf ("expected end of file at offset %#zx",
+                               input.ofs);
+          else
+            error = decode_strings (strings, out);
+          spvob_free_strings (strings);
+        }
+
+      if (error)
+        goto error;
+    }
+
+  spvob_free_legacy_binary (lb);
+
+  return NULL;
+
+error:
+  spv_data_uninit (out);
+  memset (out, 0, sizeof *out);
+  spvob_free_legacy_binary (lb);
+  return error;
+}
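
The layout arithmetic in decode_data() above (a 288-byte name followed by one 8-byte little-endian double per value, repeated for each variable) can be sketched without gnulib. This is only an illustration under stated assumptions: `demo_source_size` and `demo_read_le_double` are hypothetical names standing in for the xsum()/xtimes() and float_get_double() calls in the diff, and the host is assumed to use IEEE 754 binary64 for `double`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum
  {
    DEMO_VAR_NAME_SIZE = 288,   /* Fixed-size name field per variable. */
    DEMO_VALUE_SIZE = 8         /* One IEEE 754 double per value. */
  };

/* Stores in *SIZE the number of bytes occupied by N_VARS variables with
   N_VALUES values each.  Returns false, leaving *SIZE unchanged, if the
   result would overflow size_t. */
static bool
demo_source_size (size_t n_vars, size_t n_values, size_t *size)
{
  assert (size != NULL);
  if (n_values > (SIZE_MAX - DEMO_VAR_NAME_SIZE) / DEMO_VALUE_SIZE)
    return false;
  size_t var_size = DEMO_VAR_NAME_SIZE + n_values * DEMO_VALUE_SIZE;
  if (n_vars && var_size > SIZE_MAX / n_vars)
    return false;
  *size = n_vars * var_size;
  return true;
}

/* Reads a little-endian double from the 8 bytes at P, independent of the
   host's byte order. */
static double
demo_read_le_double (const uint8_t *p)
{
  uint64_t bits = 0;
  for (int i = 7; i >= 0; i--)
    bits = (bits << 8) | p[i];

  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}
```

A caller would add the computed size to the data offset (checking that sum for overflow as well) and compare the result against the ZIP member size before reading, which is the same bounds check that decode_data() performs before touching the input.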
diff --git a/src/output/spv/spv-legacy-data.h b/src/output/spv/spv-legacy-data.h

new file mode 100644 (file)

index 0000000..1e8fa2d
--- /dev/null
+++ b/src/output/spv/spv-legacy-data.h
@@ -0,0 +1,89 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_LEGACY_DATA_H
+#define OUTPUT_SPV_LEGACY_DATA_H 1
+
+/* SPSS Viewer (SPV) legacy binary data decoder.
+
+   Used by spv.h, not useful directly. */
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include "libpspp/compiler.h"
+
+struct spv_data
+  {
+    struct spv_data_source *sources;
+    size_t n_sources;
+  };
+
+void spv_data_uninit (struct spv_data *);
+void spv_data_dump (const struct spv_data *, FILE *);
+
+struct spv_data_source *spv_data_find_source (const struct spv_data *,
+                                              const char *source_name);
+struct spv_data_variable *spv_data_find_variable (const struct spv_data *,
+                                                  const char *source_name,
+                                                  const char *variable_name);
+
+struct spv_data_source
+  {
+    char *source_name;
+    struct spv_data_variable *vars;
+    size_t n_vars, n_values;
+  };
+
+void spv_data_source_uninit (struct spv_data_source *);
+void spv_data_source_dump (const struct spv_data_source *, FILE *);
+
+struct spv_data_variable *spv_data_source_find_variable (
+  const struct spv_data_source *, const char *variable_name);
+
+struct spv_data_variable
+  {
+    char *var_name;
+    struct spv_data_value *values;
+    size_t n_values;
+  };
+
+void spv_data_variable_uninit (struct spv_data_variable *);
+void spv_data_variable_dump (const struct spv_data_variable *, FILE *);
+
+struct spv_data_value
+  {
+    double index;
+    int width;
+    union
+      {
+        double d;
+        char *s;
+      };
+  };
+
+void spv_data_value_uninit (struct spv_data_value *);
+bool spv_data_value_equal (const struct spv_data_value *,
+                           const struct spv_data_value *);
+struct spv_data_value *spv_data_values_clone (const struct spv_data_value *,
+                                              size_t n);
+
+char *spv_legacy_data_decode (const uint8_t *in, size_t size,
+                              struct spv_data *out) WARN_UNUSED_RESULT;
+void spv_data_value_dump (const struct spv_data_value *, FILE *);
+
+#endif /* output/spv/spv-legacy-data.h */
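
The header above uses `width` as a tag: a negative width means the value is the number `d`, and a nonnegative width means it is the string `s`, which the value owns. The following minimal sketch shows that convention in isolation. `struct demo_value` and the `demo_value_*` functions are hypothetical stand-ins, the `index` member is omitted, and the anonymous union requires C11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct spv_data_value, without `index'.
   WIDTH < 0: numeric value in D.  WIDTH >= 0: owned string in S. */
struct demo_value
  {
    int width;
    union
      {
        double d;
        char *s;
      };
  };

/* Compares the tag first, then whichever union member is active. */
static bool
demo_value_equal (const struct demo_value *a, const struct demo_value *b)
{
  if (a->width != b->width)
    return false;
  return a->width < 0 ? a->d == b->d : !strcmp (a->s, b->s);
}

/* Returns a deep copy of SRC, duplicating the string if there is one. */
static struct demo_value
demo_value_clone (const struct demo_value *src)
{
  struct demo_value dst = *src;
  if (src->width >= 0)
    {
      assert (src->s != NULL);
      dst.s = malloc (strlen (src->s) + 1);
      if (!dst.s)
        abort ();
      strcpy (dst.s, src->s);
    }
  return dst;
}

/* Frees the string owned by VALUE, if any. */
static void
demo_value_uninit (struct demo_value *value)
{
  if (value->width >= 0)
    free (value->s);
}
```

Because the string is owned, a shallow copy made with memcpy() alone would lead to a double free. That is presumably why spv_data_values_clone() in the diff follows xmemdup() with an xstrdup() for each string value.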
diff --git a/src/output/spv/spv-legacy-decoder.c b/src/output/spv/spv-legacy-decoder.c
new file mode 100644
index 0000000..7dcef69
--- /dev/null
+++ b/src/output/spv/spv-legacy-decoder.c
@@ -0,0 +1,2215 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2017, 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spv-legacy-decoder.h"
+
+#include <errno.h>
+#include <inttypes.h>
+#include <math.h>
+#include <limits.h>
+#include <stdlib.h>
+
+#include "data/data-out.h"
+#include "data/calendar.h"
+#include "data/format.h"
+#include "data/value.h"
+#include "libpspp/assertion.h"
+#include "libpspp/hash-functions.h"
+#include "libpspp/hmap.h"
+#include "libpspp/message.h"
+#include "output/pivot-table.h"
+#include "output/spv/detail-xml-parser.h"
+#include "output/spv/spv-legacy-data.h"
+#include "output/spv/spv.h"
+#include "output/spv/structure-xml-parser.h"
+
+#include "gl/c-strtod.h"
+#include "gl/xalloc.h"
+#include "gl/xmemdup0.h"
+
+#include <libxml/tree.h>
+
+#include "gettext.h"
+#define N_(msgid) msgid
+#define _(msgid) gettext (msgid)
+
+struct spv_legacy_properties
+  {
+    /* General properties. */
+    bool omit_empty;
+    int width_ranges[TABLE_N_AXES][2];      /* In 1/96" units. */
+    bool row_labels_in_corner;
+
+    /* Footnote display settings. */
+    bool show_numeric_markers;
+    bool footnote_marker_superscripts;
+
+    /* Styles. */
+    struct area_style areas[PIVOT_N_AREAS];
+    struct table_border_style borders[PIVOT_N_BORDERS];
+
+    /* Print settings. */
+    bool print_all_layers;
+    bool paginate_layers;
+    bool shrink_to_width;
+    bool shrink_to_length;
+    bool top_continuation, bottom_continuation;
+    char *continuation;
+    size_t n_orphan_lines;
+  };
+
+struct spv_series
+  {
+    struct hmap_node hmap_node; /* By name. */
+    char *name;
+    char *label;
+    struct fmt_spec format;
+
+    struct spv_series *label_series;
+    bool is_label_series;
+
+    const struct spvxml_node *xml;
+
+    struct spv_data_value *values;
+    size_t n_values;
+    struct hmap map;            /* Contains "struct spv_mapping". */
+    bool remapped;
+
+    struct pivot_dimension *dimension;
+
+    struct pivot_category **index_to_category;
+    size_t n_index;
+
+    struct spvdx_affix **affixes;
+    size_t n_affixes;
+  };
+
+static void spv_map_destroy (struct hmap *);
+
+static struct spv_series *
+spv_series_first (struct hmap *series_map)
+{
+  struct spv_series *series;
+  HMAP_FOR_EACH (series, struct spv_series, hmap_node, series_map)
+    return series;
+  return NULL;
+}
+
+static struct spv_series *
+spv_series_find (const struct hmap *series_map, const char *name)
+{
+  struct spv_series *series;
+  HMAP_FOR_EACH_WITH_HASH (series, struct spv_series, hmap_node,
+                           hash_string (name, 0), series_map)
+    if (!strcmp (name, series->name))
+      return series;
+  return NULL;
+}
+
+static struct spv_series *
+spv_series_from_ref (const struct hmap *series_map,
+                     const struct spvxml_node *ref)
+{
+  const struct spvxml_node *node
+    = (spvdx_is_source_variable (ref)
+       ? &spvdx_cast_source_variable (ref)->node_
+       : &spvdx_cast_derived_variable (ref)->node_);
+  struct spv_series *series = spv_series_find (series_map, node->id);
+  if (!series)
+    printf ("missing series %s\n", node->id);
+  return series;
+}
+
+static void UNUSED
+spv_series_dump (const struct spv_series *series)
+{
+  printf ("series \"%s\"", series->name);
+  if (series->label)
+    printf (" (label \"%s\")", series->label);
+  printf (", %zu values:", series->n_values);
+  for (size_t i = 0; i < series->n_values; i++)
+    {
+      putchar (' ');
+      spv_data_value_dump (&series->values[i], stdout);
+    }
+  putchar ('\n');
+}
+
+static void
+spv_series_destroy (struct hmap *series_map)
+{
+  struct spv_series *series, *next_series;
+  HMAP_FOR_EACH_SAFE (series, next_series, struct spv_series, hmap_node,
+                      series_map)
+    {
+      free (series->name);
+      free (series->label);
+
+      for (size_t i = 0; i < series->n_values; i++)
+        spv_data_value_uninit (&series->values[i]);
+      free (series->values);
+
+      spv_map_destroy (&series->map);
+
+      free (series->index_to_category);
+
+      hmap_delete (series_map, &series->hmap_node);
+      free (series);
+    }
+  hmap_destroy (series_map);
+}
+
+struct spv_mapping
+  {
+    struct hmap_node hmap_node;
+    double from;
+    struct spv_data_value to;
+  };
+
+static struct spv_mapping *
+spv_map_search (const struct hmap *map, double from)
+{
+  struct spv_mapping *mapping;
+  HMAP_FOR_EACH_WITH_HASH (mapping, struct spv_mapping, hmap_node,
+                           hash_double (from, 0), map)
+    if (mapping->from == from)
+      return mapping;
+  return NULL;
+}
+
+static const struct spv_data_value *
+spv_map_lookup (const struct hmap *map, const struct spv_data_value *in)
+{
+  if (in->width >= 0)
+    return in;
+
+  const struct spv_mapping *m = spv_map_search (map, in->d);
+  return m ? &m->to : in;
+}
+
+static bool
+parse_real (const char *s, double *real)
+{
+  int save_errno = errno;
+  errno = 0;
+  char *end;
+  *real = c_strtod (s, &end);
+  bool ok = !errno && end > s && !*end;
+  errno = save_errno;
+
+  return ok;
+}
+
+static char * WARN_UNUSED_RESULT
+spv_map_insert (struct hmap *map, double from, const char *to,
+                bool try_strings_as_numbers, const struct fmt_spec *format)
+{
+  struct spv_mapping *mapping = spv_map_search (map, from);
+
+  if (mapping)
+    return xasprintf ("Duplicate relabeling for from=\"%.*g\"",
+                      DBL_DIG + 1, from);
+  mapping = xmalloc (sizeof *mapping);
+  mapping->from = from;
+
+  if ((try_strings_as_numbers || (format && fmt_is_numeric (format->type)))
+      && parse_real (to, &mapping->to.d))
+    {
+      if (try_strings_as_numbers)
+        mapping->to.width = -1;
+      else
+        {
+          union value v = { .f = mapping->to.d };
+          mapping->to.s = data_out_stretchy (&v, NULL, format, NULL);
+          mapping->to.width = strlen (mapping->to.s);
+        }
+    }
+  else
+    {
+      mapping->to.width = strlen (to);
+      mapping->to.s = xstrdup (to);
+    }
+  hmap_insert (map, &mapping->hmap_node, hash_double (from, 0));
+  return NULL;
+}
+
+static void
+spv_map_destroy (struct hmap *map)
+{
+  struct spv_mapping *mapping, *next;
+  HMAP_FOR_EACH_SAFE (mapping, next, struct spv_mapping, hmap_node, map)
+    {
+      spv_data_value_uninit (&mapping->to);
+      hmap_delete (map, &mapping->hmap_node);
+      free (mapping);
+    }
+  hmap_destroy (map);
+}
+
+static char * WARN_UNUSED_RESULT
+spv_series_parse_relabels (struct hmap *map,
+                           struct spvdx_relabel **relabels, size_t n_relabels,
+                           bool try_strings_as_numbers,
+                           const struct fmt_spec *format)
+{
+  for (size_t i = 0; i < n_relabels; i++)
+    {
+      const struct spvdx_relabel *relabel = relabels[i];
+      char *error = spv_map_insert (map, relabel->from, relabel->to,
+                                    try_strings_as_numbers, format);
+      if (error)
+        return error;
+    }
+  return NULL;
+}
+
+static char * WARN_UNUSED_RESULT
+spv_series_parse_value_map_entry (struct hmap *map,
+                                  const struct spvdx_value_map_entry *vme)
+{
+  for (const char *p = vme->from; ; p++)
+    {
+      int save_errno = errno;
+      errno = 0;
+      char *end;
+      double from = c_strtod (p, &end);
+      bool ok = !errno && end > p && strchr (";", *end);
+      errno = save_errno;
+      if (!ok)
+        return xasprintf ("Syntax error in valueMapEntry from=\"%s\".",
+                          vme->from);
+
+      char *error = spv_map_insert (map, from, vme->to, true,
+                                    &(struct fmt_spec) { FMT_A, 40, 0 });
+      if (error)
+        return error;
+
+      p = end;
+      if (*p == '\0')
+        return NULL;
+      assert (*p == ';');
+    }
+}
+
+static struct fmt_spec
+decode_date_time_format (const struct spvdx_date_time_format *dtf)
+{
+  if (dtf->dt_base_format == SPVDX_DT_BASE_FORMAT_DATE)
+    {
+      enum fmt_type type
+        = (dtf->show_quarter > 0 ? FMT_QYR
+           : dtf->show_week > 0 ? FMT_WKYR
+           : dtf->mdy_order == SPVDX_MDY_ORDER_DAY_MONTH_YEAR
+           ? (dtf->month_format == SPVDX_MONTH_FORMAT_NUMBER
+              || dtf->month_format == SPVDX_MONTH_FORMAT_PADDED_NUMBER
+              ? FMT_EDATE : FMT_DATE)
+           : dtf->mdy_order == SPVDX_MDY_ORDER_YEAR_MONTH_DAY ? FMT_SDATE
+           : FMT_ADATE);
+
+      int w = fmt_min_output_width (type);
+      if (dtf->year_abbreviation <= 0)
+        w += 2;
+      return (struct fmt_spec) { .type = type, .w = w };
+    }
+  else
+    {
+      enum fmt_type type
+        = (dtf->dt_base_format == SPVDX_DT_BASE_FORMAT_DATE_TIME
+           ? (dtf->mdy_order == SPVDX_MDY_ORDER_YEAR_MONTH_DAY
+              ? FMT_YMDHMS
+              : FMT_DATETIME)
+           : (dtf->show_day > 0 ? FMT_DTIME
+              : dtf->show_hour > 0 ? FMT_TIME
+              : FMT_MTIME));
+      int w = fmt_min_output_width (type);
+      int d = 0;
+      if (dtf->show_second > 0)
+        {
+          w += 3;
+          if (dtf->show_millis > 0)
+            {
+              d = 3;
+              w += d + 1;
+            }
+        }
+      return (struct fmt_spec) { .type = type, .w = w, .d = d };
+    }
+}
+
+static struct fmt_spec
+decode_elapsed_time_format (const struct spvdx_elapsed_time_format *etf)
+{
+  enum fmt_type type
+    = (etf->dt_base_format != SPVDX_DT_BASE_FORMAT_TIME ? FMT_DTIME
+       : etf->show_hour > 0 ? FMT_TIME
+       : FMT_MTIME);
+  int w = fmt_min_output_width (type);
+  int d = 0;
+  if (etf->show_second > 0)
+    {
+      w += 3;
+      if (etf->show_millis > 0)
+        {
+          d = 3;
+          w += d + 1;
+        }
+    }
+  return (struct fmt_spec) { .type = type, .w = w, .d = d };
+}
+
+static struct fmt_spec
+decode_number_format (const struct spvdx_number_format *nf)
+{
+  enum fmt_type type = (nf->scientific == SPVDX_SCIENTIFIC_TRUE ? FMT_E
+                        : nf->prefix && !strcmp (nf->prefix, "$") ? FMT_DOLLAR
+                        : nf->suffix && !strcmp (nf->suffix, "%") ? FMT_PCT
+                        : nf->use_grouping ? FMT_COMMA
+                        : FMT_F);
+
+  int d = nf->maximum_fraction_digits;
+  if (d < 0 || d > 15)
+    d = 2;
+
+  struct fmt_spec f = (struct fmt_spec) { type, 40, d };
+  fmt_fix_output (&f);
+  return f;
+}
+
+/* Returns an *approximation* of IN as a fmt_spec.
+
+   Not for use with string formats, which don't have any options anyway. */
+static struct fmt_spec
+decode_format (const struct spvdx_format *in)
+{
+  if (in->f_base_format == SPVDX_F_BASE_FORMAT_DATE ||
+      in->f_base_format == SPVDX_F_BASE_FORMAT_TIME ||
+      in->f_base_format == SPVDX_F_BASE_FORMAT_DATE_TIME)
+    {
+      struct spvdx_date_time_format dtf = {
+        .dt_base_format = (in->f_base_format == SPVDX_F_BASE_FORMAT_DATE
+                           ? SPVDX_DT_BASE_FORMAT_DATE
+                           : in->f_base_format == SPVDX_F_BASE_FORMAT_TIME
+                           ? SPVDX_DT_BASE_FORMAT_TIME
+                           : SPVDX_DT_BASE_FORMAT_DATE_TIME),
+        .separator_chars = in->separator_chars,
+        .mdy_order = in->mdy_order,
+        .show_year = in->show_year,
+        .year_abbreviation = in->year_abbreviation,
+        .show_quarter = in->show_quarter,
+        .quarter_prefix = in->quarter_prefix,
+        .quarter_suffix = in->quarter_suffix,
+        .show_month = in->show_month,
+        .month_format = in->month_format,
+        .show_week = in->show_week,
+        .week_padding = in->week_padding,
+        .week_suffix = in->week_suffix,
+        .show_day_of_week = in->show_day_of_week,
+        .day_of_week_abbreviation = in->day_of_week_abbreviation,
+        .day_padding = in->day_padding,
+        .day_of_month_padding = in->day_of_month_padding,
+        .hour_padding = in->hour_padding,
+        .minute_padding = in->minute_padding,
+        .second_padding = in->second_padding,
+        .show_day = in->show_day,
+        .show_hour = in->show_hour,
+        .show_minute = in->show_minute,
+        .show_second = in->show_second,
+        .show_millis = in->show_millis,
+        .day_type = in->day_type,
+        .hour_format = in->hour_format,
+      };
+      return decode_date_time_format (&dtf);
+    }
+  else if (in->f_base_format == SPVDX_F_BASE_FORMAT_ELAPSED_TIME)
+    {
+      struct spvdx_elapsed_time_format etf = {
+        .dt_base_format = (in->f_base_format == SPVDX_F_BASE_FORMAT_DATE
+                           ? SPVDX_DT_BASE_FORMAT_DATE
+                           : in->f_base_format == SPVDX_F_BASE_FORMAT_TIME
+                           ? SPVDX_DT_BASE_FORMAT_TIME
+                           : SPVDX_DT_BASE_FORMAT_DATE_TIME),
+        .day_padding = in->day_padding,
+        .minute_padding = in->minute_padding,
+        .second_padding = in->second_padding,
+        .show_year = in->show_year,
+        .show_day = in->show_day,
+        .show_hour = in->show_hour,
+        .show_minute = in->show_minute,
+        .show_second = in->show_second,
+        .show_millis = in->show_millis,
+      };
+      return decode_elapsed_time_format (&etf);
+    }
+  else
+    {
+      assert (!in->f_base_format);
+      struct spvdx_number_format nf = {
+        .minimum_integer_digits = in->minimum_integer_digits,
+        .maximum_fraction_digits = in->maximum_fraction_digits,
+        .minimum_fraction_digits = in->minimum_fraction_digits,
+        .use_grouping = in->use_grouping,
+        .scientific = in->scientific,
+        .small = in->small,
+        .prefix = in->prefix,
+        .suffix = in->suffix,
+      };
+      return decode_number_format (&nf);
+    }
+}
+
+static void
+spv_series_execute_mapping (struct spv_series *series)
+{
+  if (!hmap_is_empty (&series->map))
+    {
+      series->remapped = true;
+      for (size_t i = 0; i < series->n_values; i++)
+        {
+          struct spv_data_value *value = &series->values[i];
+          if (value->width >= 0)
+            continue;
+
+          const struct spv_mapping *mapping = spv_map_search (&series->map,
+                                                              value->d);
+          if (mapping)
+            {
+              value->index = value->d;
+              assert (value->index == floor (value->index));
+              value->width = mapping->to.width;
+              if (value->width >= 0)
+                value->s = xmemdup0 (mapping->to.s, mapping->to.width);
+              else
+                value->d = mapping->to.d;
+            }
+        }
+    }
+}
+
+static char * WARN_UNUSED_RESULT
+spv_series_remap_formats (struct spv_series *series,
+                          struct spvxml_node **seq, size_t n_seq)
+{
+  spv_map_destroy (&series->map);
+  hmap_init (&series->map);
+  for (size_t i = 0; i < n_seq; i++)
+    {
+      struct spvxml_node *node = seq[i];
+      if (spvdx_is_format (node))
+        {
+          struct spvdx_format *f = spvdx_cast_format (node);
+          series->format = decode_format (f);
+          char *error = spv_series_parse_relabels (
+            &series->map, f->relabel, f->n_relabel,
+            f->try_strings_as_numbers > 0, &series->format);
+          if (error)
+            return error;
+
+          series->affixes = f->affix;
+          series->n_affixes = f->n_affix;
+        }
+      else if (spvdx_is_string_format (node))
+        {
+          struct spvdx_string_format *sf = spvdx_cast_string_format (node);
+          char *error = spv_series_parse_relabels (&series->map,
+                                                   sf->relabel, sf->n_relabel,
+                                                   false, NULL);
+          if (error)
+            return error;
+
+          series->affixes = sf->affix;
+          series->n_affixes = sf->n_affix;
+        }
+      else
+        NOT_REACHED ();
+    }
+  spv_series_execute_mapping (series);
+  return NULL;
+}
+
+static char * WARN_UNUSED_RESULT
+spv_series_remap_vmes (struct spv_series *series,
+                       struct spvdx_value_map_entry **vmes,
+                       size_t n_vmes)
+{
+  spv_map_destroy (&series->map);
+  hmap_init (&series->map);
+  for (size_t i = 0; i < n_vmes; i++)
+    {
+      char *error = spv_series_parse_value_map_entry (&series->map, vmes[i]);
+      if (error)
+        return error;
+    }
+  spv_series_execute_mapping (series);
+  return NULL;
+}
+
+static void
+decode_footnotes (struct pivot_table *table, const struct spvdx_footnotes *f)
+{
+  if (f->n_footnote_mapping > 0)
+    pivot_table_create_footnote__ (table, f->n_footnote_mapping - 1,
+                                   NULL, NULL);
+  for (size_t i = 0; i < f->n_footnote_mapping; i++)
+    {
+      const struct spvdx_footnote_mapping *fm = f->footnote_mapping[i];
+      pivot_table_create_footnote__ (table, fm->defines_reference - 1,
+                                     pivot_value_new_user_text (fm->to, -1),
+                                     NULL);
+    }
+}
+
+static struct cell_color
+optional_color (int color, struct cell_color default_color)
+{
+  return (color >= 0
+          ? (struct cell_color) CELL_COLOR (color >> 16, color >> 8, color)
+          : default_color);
+}
+
+static int
+optional_length (const char *s, int default_length)
+{
+  /* There is usually a "pt" suffix.  We ignore it. */
+  int length;
+  return s && sscanf (s, "%d", &length) == 1 ? length : default_length;
+}
+
+static int
+optional_px (double inches, int default_px)
+{
+  return inches != DBL_MAX ? inches * 96.0 : default_px;
+}
+
+static int
+optional_pt (double inches, int default_pt)
+{
+  return inches != DBL_MAX ? inches * 72.0 + .5 : default_pt;
+}
+
+static void
+decode_spvdx_style_incremental (const struct spvdx_style *in,
+                                const struct spvdx_style *bg,
+                                struct area_style *out)
+{
+  if (in && in->font_weight)
+    out->font_style.bold = in->font_weight == SPVDX_FONT_WEIGHT_BOLD;
+  if (in && in->font_style)
+    out->font_style.italic = in->font_style == SPVDX_FONT_STYLE_ITALIC;
+  if (in && in->font_underline)
+    out->font_style.underline = in->font_underline == SPVDX_FONT_UNDERLINE_UNDERLINE;
+  if (in && in->color >= 0)
+    {
+      out->font_style.fg[0] = optional_color (
+        in->color, (struct cell_color) CELL_COLOR_BLACK);
+      out->font_style.fg[1] = out->font_style.fg[0];
+    }
+  if (bg && bg->color >= 0)
+    {
+      out->font_style.bg[0] = optional_color (
+        bg->color, (struct cell_color) CELL_COLOR_WHITE);
+      out->font_style.bg[1] = out->font_style.bg[0];
+    }
+  if (in && in->font_family)
+    {
+      free (out->font_style.typeface);
+      out->font_style.typeface = xstrdup (in->font_family);
+    }
+  if (in && in->font_size)
+    {
+      int size = optional_length (in->font_size, 0);
+      if (size)
+        out->font_style.size = size;
+    }
+  if (in && in->text_alignment)
+    out->cell_style.halign
+      = (in->text_alignment == SPVDX_TEXT_ALIGNMENT_LEFT
+         ? TABLE_HALIGN_LEFT
+         : in->text_alignment == SPVDX_TEXT_ALIGNMENT_RIGHT
+         ? TABLE_HALIGN_RIGHT
+         : in->text_alignment == SPVDX_TEXT_ALIGNMENT_CENTER
+         ? TABLE_HALIGN_CENTER
+         : in->text_alignment == SPVDX_TEXT_ALIGNMENT_DECIMAL
+         ? TABLE_HALIGN_DECIMAL
+         : TABLE_HALIGN_MIXED);
+  if (in && in->label_location_vertical)
+    out->cell_style.valign =
+      (in->label_location_vertical == SPVDX_LABEL_LOCATION_VERTICAL_NEGATIVE
+       ? TABLE_VALIGN_BOTTOM
+       : in->label_location_vertical == SPVDX_LABEL_LOCATION_VERTICAL_POSITIVE
+       ? TABLE_VALIGN_TOP
+       : TABLE_VALIGN_CENTER);
+  if (in && in->decimal_offset != DBL_MAX)
+    out->cell_style.decimal_offset = optional_px (in->decimal_offset, 0);
+#if 0
+  if (in && in->margin_left != DBL_MAX)
+    out->cell_style.margin[TABLE_HORZ][0] = optional_pt (in->margin_left, 8);
+  if (in && in->margin_right != DBL_MAX)
+    out->cell_style.margin[TABLE_HORZ][1] = optional_pt (in->margin_right, 11);
+  if (in && in->margin_top != DBL_MAX)
+    out->cell_style.margin[TABLE_VERT][0] = optional_pt (in->margin_top, 1);
+  if (in && in->margin_bottom != DBL_MAX)
+    out->cell_style.margin[TABLE_VERT][1] = optional_pt (in->margin_bottom, 1);
+#endif
+}
+
+static void
+decode_spvdx_style (const struct spvdx_style *in,
+                    const struct spvdx_style *bg,
+                    struct area_style *out)
+{
+  *out = (struct area_style) AREA_STYLE_INITIALIZER;
+  decode_spvdx_style_incremental (in, bg, out);
+}
+
+static void
+add_footnote (struct pivot_value *v, int idx, struct pivot_table *table)
+{
+  if (idx < 1 || idx > table->n_footnotes)
+    return;
+
+  pivot_value_add_footnote (v, table->footnotes[idx - 1]);
+}
+
+static char * WARN_UNUSED_RESULT
+decode_label_frame (struct pivot_table *table,
+                    const struct spvdx_label_frame *lf)
+{
+  if (!lf->label)
+    return NULL;
+
+  struct pivot_value **target;
+  struct area_style *area;
+  if (lf->label->purpose == SPVDX_PURPOSE_TITLE)
+    {
+      target = &table->title;
+      area = &table->areas[PIVOT_AREA_TITLE];
+    }
+  else if (lf->label->purpose == SPVDX_PURPOSE_SUB_TITLE)
+    {
+      target = &table->caption;
+      area = &table->areas[PIVOT_AREA_CAPTION];
+    }
+  else if (lf->label->purpose == SPVDX_PURPOSE_FOOTNOTE)
+    {
+      if (lf->label->n_text > 0
+          && lf->label->text[0]->uses_reference != INT_MIN)
+        {
+          target = NULL;
+          area = &table->areas[PIVOT_AREA_FOOTER];
+        }
+      else
+        return NULL;
+    }
+  else if (lf->label->purpose == SPVDX_PURPOSE_LAYER)
+    {
+      target = NULL;
+      area = &table->areas[PIVOT_AREA_LAYERS];
+    }
+  else
+    return NULL;
+
+  area_style_uninit (area);
+  decode_spvdx_style (lf->label->style, lf->label->text_frame_style, area);
+
+  if (target)
+    {
+      struct pivot_value *value = xzalloc (sizeof *value);
+      value->type = PIVOT_VALUE_TEXT;
+      for (size_t i = 0; i < lf->label->n_text; i++)
+        {
+          const struct spvdx_text *in = lf->label->text[i];
+          if (in->defines_reference != INT_MIN)
+            add_footnote (value, in->defines_reference, table);
+          else if (!value->text.local)
+            value->text.local = xstrdup (in->text);
+          else
+            {
+              char *new = xasprintf ("%s%s", value->text.local, in->text);
+              free (value->text.local);
+              value->text.local = new;
+            }
+        }
+      pivot_value_destroy (*target);
+      *target = value;
+    }
+  else
+    for (size_t i = 0; i < lf->label->n_text; i++)
+      {
+        const struct spvdx_text *in = lf->label->text[i];
+        if (in->uses_reference == INT_MIN)
+          continue;
+        if (i % 2)
+          {
+            size_t length = strlen (in->text);
+            if (length && in->text[length - 1] == '\n')
+              length--;
+
+            pivot_table_create_footnote__ (
+              table, in->uses_reference - 1, NULL,
+              pivot_value_new_user_text (in->text, length));
+          }
+        else
+          {
+            size_t length = strlen (in->text);
+            if (length && in->text[length - 1] == '.')
+              length--;
+
+            pivot_table_create_footnote__ (
+              table, in->uses_reference - 1,
+              pivot_value_new_user_text (in->text, length), NULL);
+          }
+      }
+  return NULL;
+}
+
+/* Special return value for decode_spvdx_{source,derived}_variable(). */
+static char BAD_REFERENCE;
+
+static char * WARN_UNUSED_RESULT
+decode_spvdx_source_variable (const struct spvxml_node *node,
+                              struct spv_data *data,
+                              struct hmap *series_map)
+{
+  const struct spvdx_source_variable *sv = spvdx_cast_source_variable (node);
+
+  struct spv_series *label_series = NULL;
+  if (sv->label_variable)
+    {
+      label_series = spv_series_find (series_map,
+                                      sv->label_variable->node_.id);
+      if (!label_series)
+        return &BAD_REFERENCE;
+
+      label_series->is_label_series = true;
+    }
+
+  const struct spv_data_variable *var = spv_data_find_variable (
+    data, sv->source, sv->source_name);
+  if (!var)
+    return xasprintf ("sourceVariable %s references nonexistent "
+                      "source %s variable %s.",
+                      sv->node_.id, sv->source, sv->source_name);
+
+  struct spv_series *s = xzalloc (sizeof *s);
+  s->name = xstrdup (node->id);
+  s->xml = node;
+  s->label = sv->label ? xstrdup (sv->label) : NULL;
+  s->label_series = label_series;
+  s->values = spv_data_values_clone (var->values, var->n_values);
+  s->n_values = var->n_values;
+  s->format = F_8_0;
+  hmap_init (&s->map);
+  hmap_insert (series_map, &s->hmap_node, hash_string (s->name, 0));
+
+  char *error = spv_series_remap_formats (s, sv->seq, sv->n_seq);
+  if (error)
+    return error;
+
+  if (label_series && !s->remapped)
+    {
+      for (size_t i = 0; i < s->n_values; i++)
+        if (s->values[i].width < 0)
+          {
+            char *dest;
+            if (label_series->values[i].width < 0)
+              {
+                union value v = { .f = label_series->values[i].d };
+                dest = data_out_stretchy (&v, "UTF-8", &s->format, NULL);
+              }
+            else
+              dest = label_series->values[i].s;
+            char *error = spv_map_insert (&s->map, s->values[i].d,
+                                          dest, false, NULL);
+            free (error);   /* Duplicates are OK. */
+            if (label_series->values[i].width < 0)
+              free (dest);
+          }
+    }
+
+  return NULL;
+}
+
+static char * WARN_UNUSED_RESULT
+decode_spvdx_derived_variable (const struct spvxml_node *node,
+                               struct hmap *series_map)
+{
+  const struct spvdx_derived_variable *dv = spvdx_cast_derived_variable (node);
+
+  struct spv_data_value *values;
+  size_t n_values;
+
+  struct substring value = ss_cstr (dv->value);
+  if (ss_equals (value, ss_cstr ("constant(0)")))
+    {
+      struct spv_series *existing_series = spv_series_first (series_map);
+      if (!existing_series)
+        return &BAD_REFERENCE;
+
+      n_values = existing_series->n_values;
+      values = xcalloc (n_values, sizeof *values);
+      for (size_t i = 0; i < n_values; i++)
+        values[i].width = -1;
+    }
+  else if (ss_starts_with (value, ss_cstr ("constant(")))
+    {
+      values = NULL;
+      n_values = 0;
+    }
+  else if (ss_starts_with (value, ss_cstr ("map("))
+           && ss_ends_with (value, ss_cstr (")")))
+    {
+      char *dependency_name = ss_xstrdup (ss_substr (value, 4,
+                                                     value.length - 5));
+      struct spv_series *dependency
+        = spv_series_find (series_map, dependency_name);
+      free (dependency_name);
+      if (!dependency)
+        return &BAD_REFERENCE;
+
+      values = spv_data_values_clone (dependency->values,
+                                      dependency->n_values);
+      n_values = dependency->n_values;
+    }
+  else
+    return xasprintf ("Derived variable %s has unknown value \"%s\"",
+                      node->id, dv->value);
+
+  struct spv_series *s = xzalloc (sizeof *s);
+  s->format = F_8_0;
+  s->name = xstrdup (node->id);
+  s->values = values;
+  s->n_values = n_values;
+  hmap_init (&s->map);
+  hmap_insert (series_map, &s->hmap_node, hash_string (s->name, 0));
+
+  char *error = spv_series_remap_vmes (s, dv->value_map_entry,
+                                       dv->n_value_map_entry);
+  if (error)
+    return error;
+
+  error = spv_series_remap_formats (s, dv->seq, dv->n_seq);
+  if (error)
+    return error;
+
+  if (n_values > 0)
+    {
+      for (size_t i = 0; i < n_values; i++)
+        if (values[i].width != 0)
+          goto nonempty;
+      for (size_t i = 0; i < n_values; i++)
+        spv_data_value_uninit (&s->values[i]);
+      free (s->values);
+
+      s->values = NULL;
+      s->n_values = 0;
+
+    nonempty:;
+    }
+  return NULL;
+}
+
+struct format_mapping
+  {
+    struct hmap_node hmap_node;
+    uint32_t from;
+    struct fmt_spec to;
+  };
+
+static const struct format_mapping *
+format_map_find (const struct hmap *format_map, uint32_t u32_format)
+{
+  if (format_map)
+    {
+      const struct format_mapping *fm;
+      HMAP_FOR_EACH_IN_BUCKET (fm, struct format_mapping, hmap_node,
+                               hash_int (u32_format, 0), format_map)
+        if (fm->from == u32_format)
+          return fm;
+    }
+
+  return NULL;
+}
+
+static struct fmt_spec
+spv_format_from_data_value (const struct spv_data_value *data,
+                            const struct hmap *format_map)
+{
+  if (!data)
+    return fmt_for_output (FMT_F, 40, 2);
+
+  uint32_t u32_format = data->width < 0 ? data->d : atoi (data->s);
+  const struct format_mapping *fm = format_map_find (format_map, u32_format);
+  return fm ? fm->to : spv_decode_fmt_spec (u32_format);
+}
+
+static struct pivot_value *
+pivot_value_from_data_value (const struct spv_data_value *data,
+                             const struct spv_data_value *format,
+                             const struct hmap *format_map)
+{
+  struct pivot_value *v = xzalloc (sizeof *v);
+  struct fmt_spec f = spv_format_from_data_value (format, format_map);
+  if (data->width >= 0)
+    {
+      if (format && fmt_get_category (f.type) == FMT_CAT_DATE)
+        {
+          int year, month, day, hour, minute, second, msec, len = -1;
+          if (sscanf (data->s, "%4d-%2d-%2dT%2d:%2d:%2d.%3d%n",
+                      &year, &month, &day, &hour, &minute, &second,
+                      &msec, &len) == 7
+              && len == 23
+              && data->s[len] == '\0')
+            {
+              double date = calendar_gregorian_to_offset (year, month, day,
+                                                          NULL);
+              if (date != SYSMIS)
+                {
+                  v->type = PIVOT_VALUE_NUMERIC;
+                  v->numeric.x = (date * 60. * 60. * 24.
+                                  + hour * 60. * 60.
+                                  + minute * 60.
+                                  + second
+                                  + msec / 1000.0);
+                  v->numeric.format = f;
+                  return v;
+                }
+            }
+        }
+      else if (format && fmt_get_category (f.type) == FMT_CAT_TIME)
+        {
+          int hour, minute, second, msec, len = -1;
+          if (sscanf (data->s, "%d:%2d:%2d.%3d%n",
+                      &hour, &minute, &second, &msec, &len) == 4
+              && len > 0
+              && data->s[len] == '\0')
+            {
+              v->type = PIVOT_VALUE_NUMERIC;
+              v->numeric.x = (hour * 60. * 60.
+                              + minute * 60.
+                              + second
+                              + msec / 1000.0);
+              v->numeric.format = f;
+              return v;
+            }
+        }
+      v->type = PIVOT_VALUE_STRING;
+      v->string.s = xstrdup (data->s);
+    }
+  else
+    {
+      v->type = PIVOT_VALUE_NUMERIC;
+      v->numeric.x = data->d;
+      v->numeric.format = f;
+    }
+  return v;
+}
+
+static void
+add_parents (struct pivot_category *cat, struct pivot_category *parent,
+             size_t group_index)
+{
+  cat->parent = parent;
+  cat->group_index = group_index;
+  if (pivot_category_is_group (cat))
+    for (size_t i = 0; i < cat->n_subs; i++)
+      add_parents (cat->subs[i], cat, i);
+}
+
+static const struct spvdx_facet_level *
+find_facet_level (const struct spvdx_visualization *v, int facet_level)
+{
+  const struct spvdx_facet_layout *layout = v->graph->facet_layout;
+  for (size_t i = 0; i < layout->n_facet_level; i++)
+    {
+      const struct spvdx_facet_level *fl = layout->facet_level[i];
+      if (facet_level == fl->level)
+        return fl;
+    }
+  return NULL;
+}
+
+static bool
+should_show_label (const struct spvdx_facet_level *fl)
+{
+  return fl && fl->axis->label && fl->axis->label->style->visible != 0;
+}
+
+static size_t
+max_category (const struct spv_series *s)
+{
+  double max_cat = -DBL_MAX;
+  for (size_t i = 0; i < s->n_values; i++)
+    {
+      const struct spv_data_value *dv = &s->values[i];
+      double d = dv->width < 0 ? dv->d : dv->index;
+      if (d > max_cat)
+        max_cat = d;
+    }
+  assert (max_cat >= 0 && max_cat < SIZE_MAX - 1);
+
+  return max_cat;
+}
+
+static void
+add_affixes (struct pivot_table *table, struct pivot_value *value,
+             struct spvdx_affix **affixes, size_t n_affixes)
+{
+  for (size_t i = 0; i < n_affixes; i++)
+    add_footnote (value, affixes[i]->defines_reference, table);
+}
+
+static struct pivot_dimension *
+add_dimension (struct spv_series **series, size_t n,
+               enum pivot_axis_type axis_type,
+               const struct spvdx_visualization *v, struct pivot_table *table,
+               struct spv_series **dim_seriesp, size_t *n_dim_seriesp,
+               int base_facet_level)
+{
+  const struct spvdx_facet_level *fl
+    = find_facet_level (v, base_facet_level + n);
+  if (fl)
+    {
+      struct area_style *area = (axis_type == PIVOT_AXIS_COLUMN
+                                 ? &table->areas[PIVOT_AREA_COLUMN_LABELS]
+                                 : axis_type == PIVOT_AXIS_ROW
+                                 ? &table->areas[PIVOT_AREA_ROW_LABELS]
+                                 : NULL);
+      if (area && fl->axis->label)
+        {
+          area_style_uninit (area);
+          decode_spvdx_style (fl->axis->label->style,
+                              fl->axis->label->text_frame_style, area);
+        }
+    }
+
+  if (axis_type == PIVOT_AXIS_ROW)
+    {
+      const struct spvdx_facet_level *fl2
+        = find_facet_level (v, base_facet_level + (n - 1));
+      if (fl2)
+        decode_spvdx_style_incremental (
+          fl2->axis->major_ticks->style,
+          fl2->axis->major_ticks->tick_frame_style,
+          &table->areas[PIVOT_AREA_ROW_LABELS]);
+    }
+
+  const struct spvdx_facet_level *fl3 = find_facet_level (v, base_facet_level);
+  if (fl3 && fl3->axis->major_ticks->label_angle == -90)
+    {
+      if (axis_type == PIVOT_AXIS_COLUMN)
+        table->rotate_inner_column_labels = true;
+      else
+        table->rotate_outer_row_labels = true;
+    }
+
+  /* Find the first row for each category. */
+  size_t max_cat = max_category (series[0]);
+  size_t *cat_rows = xnmalloc (max_cat + 1, sizeof *cat_rows);
+  for (size_t k = 0; k <= max_cat; k++)
+    cat_rows[k] = SIZE_MAX;
+  for (size_t k = 0; k < series[0]->n_values; k++)
+    {
+      const struct spv_data_value *dv = &series[0]->values[k];
+      double d = dv->width < 0 ? dv->d : dv->index;
+      if (d >= 0 && d < SIZE_MAX - 1)
+        {
+          size_t row = d;
+          if (cat_rows[row] == SIZE_MAX)
+            cat_rows[row] = k;
+        }
+    }
+
+  /* Drop missing categories and count what's left. */
+  size_t n_cats = 0;
+  for (size_t k = 0; k <= max_cat; k++)
+    if (cat_rows[k] != SIZE_MAX)
+      cat_rows[n_cats++] = cat_rows[k];
+  assert (n_cats > 0);
+
+  /* Make the categories. */
+  struct pivot_dimension *d = xzalloc (sizeof *d);
+  table->dimensions[table->n_dimensions++] = d;
+
+  series[0]->n_index = max_cat + 1;
+  series[0]->index_to_category = xcalloc (
+    max_cat + 1, sizeof *series[0]->index_to_category);
+  struct pivot_category **cats = xnmalloc (n_cats, sizeof *cats);
+  for (size_t k = 0; k < n_cats; k++)
+    {
+      struct spv_data_value *dv = &series[0]->values[cat_rows[k]];
+      int dv_num = dv->width < 0 ? dv->d : dv->index;
+      struct pivot_category *cat = xzalloc (sizeof *cat);
+      cat->name = pivot_value_from_data_value (
+        spv_map_lookup (&series[0]->map, dv), NULL, NULL);
+      cat->parent = NULL;
+      cat->dimension = d;
+      cat->data_index = k;
+      cat->presentation_index = cat_rows[k];
+      cats[k] = cat;
+      series[0]->index_to_category[dv_num] = cat;
+
+      add_affixes (table, cat->name, series[0]->affixes, series[0]->n_affixes);
+    }
+  free (cat_rows);
+
+  struct pivot_axis *axis = &table->axes[axis_type];
+  d->axis_type = axis_type;
+  d->level = axis->n_dimensions;
+  d->top_index = table->n_dimensions - 1;
+  d->root = xzalloc (sizeof *d->root);
+  *d->root = (struct pivot_category) {
+    .name = pivot_value_new_user_text (
+      series[0]->label ? series[0]->label : "", -1),
+    .dimension = d,
+    .show_label = should_show_label (fl),
+    .data_index = SIZE_MAX,
+    .presentation_index = SIZE_MAX,
+  };
+  d->data_leaves = xmemdup (cats, n_cats * sizeof *cats);
+  d->presentation_leaves = xmemdup (cats, n_cats * sizeof *cats);
+  d->n_leaves = d->allocated_leaves = n_cats;
+
+  /* Now group them, in one pass per grouping variable, innermost first. */
+  for (size_t j = 1; j < n; j++)
+    {
+      struct pivot_category **new_cats = xnmalloc (n_cats, sizeof *new_cats);
+      size_t n_new_cats = 0;
+
+      /* Allocate a category index. */
+      size_t max_cat = max_category (series[j]);
+      series[j]->n_index = max_cat + 1;
+      series[j]->index_to_category = xcalloc (
+        max_cat + 1, sizeof *series[j]->index_to_category);
+      for (size_t cat1 = 0; cat1 < n_cats; )
+        {
+          /* Find a sequence of categories cat1...cat2 (exclusive), that all
+             have the same value in series 'j'.  (This might be only a single
+             category; we will drop unnamed 1-category groups later.) */
+          size_t row1 = cats[cat1]->presentation_index;
+          const struct spv_data_value *dv1 = &series[j]->values[row1];
+          size_t cat2;
+          for (cat2 = cat1 + 1; cat2 < n_cats; cat2++)
+            {
+              size_t row2 = cats[cat2]->presentation_index;
+              const struct spv_data_value *dv2 = &series[j]->values[row2];
+              if (!spv_data_value_equal (dv1, dv2))
+                break;
+            }
+          size_t n_subs = cat2 - cat1;
+
+          struct pivot_category *new_cat;
+          const struct spv_data_value *name
+            = spv_map_lookup (&series[j]->map, dv1);
+          if (n_subs == 1 && name->width == 0)
+            {
+              /* The existing category stands on its own. */
+              new_cat = cats[cat1++];
+            }
+          else
+            {
+              /* Create a new group with cat1...cat2 as subcategories. */
+              new_cat = xzalloc (sizeof *new_cat);
+              *new_cat = (struct pivot_category) {
+                .name = pivot_value_from_data_value (name, NULL, NULL),
+                .dimension = d,
+                .subs = xnmalloc (n_subs, sizeof *new_cat->subs),
+                .n_subs = n_subs,
+                .show_label = true,
+                .data_index = SIZE_MAX,
+                .presentation_index = row1,
+              };
+              for (size_t k = 0; k < n_subs; k++)
+                new_cat->subs[k] = cats[cat1++];
+
+              int dv1_num = dv1->width < 0 ? dv1->d : dv1->index;
+              series[j]->index_to_category[dv1_num] = new_cat;
+            }
+
+          add_affixes (table, new_cat->name,
+                       series[j]->affixes, series[j]->n_affixes);
+
+          /* Append the new group to the list of new groups. */
+          new_cats[n_new_cats++] = new_cat;
+        }
+
+      free (cats);
+      cats = new_cats;
+      n_cats = n_new_cats;
+    }
+
+  /* Add parent pointers.  (Unnamed 1-category groups were dropped above.) */
+  for (size_t j = 0; j < n_cats; j++)
+    add_parents (cats[j], d->root, j);
+
+  d->root->subs = cats;
+  d->root->n_subs = n_cats;
+
+  dim_seriesp[(*n_dim_seriesp)++] = series[0];
+  series[0]->dimension = d;
+
+  axis->dimensions = xnrealloc (axis->dimensions, axis->n_dimensions + 1,
+                               sizeof *axis->dimensions);
+  axis->dimensions[axis->n_dimensions++] = d;
+  axis->extent *= d->n_leaves;
+
+  return d;
+}
+
+static void
+add_dimensions (struct hmap *series_map, const struct spvdx_nest *nest,
+                enum pivot_axis_type axis_type,
+                const struct spvdx_visualization *v, struct pivot_table *table,
+                struct spv_series **dim_seriesp, size_t *n_dim_seriesp,
+                int level_ofs)
+{
+  struct pivot_axis *axis = &table->axes[axis_type];
+  if (!axis->extent)
+    axis->extent = 1;
+
+  if (!nest)
+    return;
+
+  struct spv_series **series = xnmalloc (nest->n_vars, sizeof *series);
+  for (size_t i = 0; i < nest->n_vars; )
+    {
+      size_t n;
+      for (n = 0; i + n < nest->n_vars; n++)
+        {
+          series[n] = spv_series_from_ref (series_map, nest->vars[i + n]->ref);
+          if (!series[n] || !series[n]->n_values)
+            break;
+        }
+
+      if (n > 0)
+        add_dimension (series, n, axis_type, v, table,
+                       dim_seriesp, n_dim_seriesp, level_ofs + i);
+      i += n + 1;
+    }
+  free (series);
+}
+
+static void
+add_layers (struct hmap *series_map,
+            struct spvdx_layer **layers, size_t n_layers,
+            const struct spvdx_visualization *v, struct pivot_table *table,
+            struct spv_series **dim_seriesp, size_t *n_dim_seriesp,
+            int level_ofs)
+{
+  struct pivot_axis *axis = &table->axes[PIVOT_AXIS_LAYER];
+  if (!axis->extent)
+    axis->extent = 1;
+
+  if (!n_layers)
+    return;
+
+  struct spv_series **series = xnmalloc (n_layers, sizeof *series);
+  for (size_t i = 0; i < n_layers; )
+    {
+      size_t n;
+      for (n = 0; i + n < n_layers; n++)
+        {
+          series[n] = spv_series_from_ref (series_map,
+                                           layers[i + n]->variable);
+          if (!series[n] || !series[n]->n_values)
+            break;
+        }
+
+      if (n > 0)
+        {
+          struct pivot_dimension *d = add_dimension (
+            series, n, PIVOT_AXIS_LAYER, v, table,
+            dim_seriesp, n_dim_seriesp, level_ofs + i);
+
+          int index = atoi (layers[i]->value);
+          assert (index < d->n_leaves);
+          table->current_layer = xrealloc (
+            table->current_layer,
+            axis->n_dimensions * sizeof *table->current_layer);
+          table->current_layer[axis->n_dimensions - 1] = index;
+        }
+      i += n + 1;
+    }
+  free (series);
+}
+
+static int
+optional_int (int x, int default_value)
+{
+  return x != INT_MIN ? x : default_value;
+}
+
+static enum pivot_area
+pivot_area_from_name (const char *name)
+{
+  static const char *area_names[PIVOT_N_AREAS] = {
+    [PIVOT_AREA_TITLE] = "title",
+    [PIVOT_AREA_CAPTION] = "caption",
+    [PIVOT_AREA_FOOTER] = "footnotes",
+    [PIVOT_AREA_CORNER] = "cornerLabels",
+    [PIVOT_AREA_COLUMN_LABELS] = "columnLabels",
+    [PIVOT_AREA_ROW_LABELS] = "rowLabels",
+    [PIVOT_AREA_DATA] = "data",
+    [PIVOT_AREA_LAYERS] = "layers",
+  };
+
+  enum pivot_area area;
+  for (area = 0; area < PIVOT_N_AREAS; area++)
+    if (!strcmp (name, area_names[area]))
+      break;
+  return area;
+}
+
+static enum pivot_border
+pivot_border_from_name (const char *name)
+{
+  static const char *border_names[PIVOT_N_BORDERS] = {
+    [PIVOT_BORDER_TITLE] = "titleLayerSeparator",
+    [PIVOT_BORDER_OUTER_LEFT] = "leftOuterFrame",
+    [PIVOT_BORDER_OUTER_TOP] = "topOuterFrame",
+    [PIVOT_BORDER_OUTER_RIGHT] = "rightOuterFrame",
+    [PIVOT_BORDER_OUTER_BOTTOM] = "bottomOuterFrame",
+    [PIVOT_BORDER_INNER_LEFT] = "leftInnerFrame",
+    [PIVOT_BORDER_INNER_TOP] = "topInnerFrame",
+    [PIVOT_BORDER_INNER_RIGHT] = "rightInnerFrame",
+    [PIVOT_BORDER_INNER_BOTTOM] = "bottomInnerFrame",
+    [PIVOT_BORDER_DATA_LEFT] = "dataAreaLeft",
+    [PIVOT_BORDER_DATA_TOP] = "dataAreaTop",
+    [PIVOT_BORDER_DIM_ROW_HORZ] = "horizontalDimensionBorderRows",
+    [PIVOT_BORDER_DIM_ROW_VERT] = "verticalDimensionBorderRows",
+    [PIVOT_BORDER_DIM_COL_HORZ] = "horizontalDimensionBorderColumns",
+    [PIVOT_BORDER_DIM_COL_VERT] = "verticalDimensionBorderColumns",
+    [PIVOT_BORDER_CAT_ROW_HORZ] = "horizontalCategoryBorderRows",
+    [PIVOT_BORDER_CAT_ROW_VERT] = "verticalCategoryBorderRows",
+    [PIVOT_BORDER_CAT_COL_HORZ] = "horizontalCategoryBorderColumns",
+    [PIVOT_BORDER_CAT_COL_VERT] = "verticalCategoryBorderColumns",
+  };
+
+  enum pivot_border border;
+  for (border = 0; border < PIVOT_N_BORDERS; border++)
+    if (!strcmp (name, border_names[border]))
+      break;
+  return border;
+}
+
+static struct pivot_category *
+find_category (struct spv_series *series, int index)
+{
+  return (index >= 0 && index < series->n_index
+          ? series->index_to_category[index]
+          : NULL);
+}
+
+static bool
+int_in_array (int value, const int *array, size_t n)
+{
+  for (size_t i = 0; i < n; i++)
+    if (array[i] == value)
+      return true;
+
+  return false;
+}
+
+static void
+apply_styles_to_value (struct pivot_table *table,
+                       struct pivot_value *value,
+                       const struct spvdx_set_format *sf,
+                       const struct area_style *base_area_style,
+                       const struct spvdx_style *fg,
+                       const struct spvdx_style *bg)
+{
+  if (sf)
+    {
+      if (sf->reset > 0)
+        {
+          free (value->footnotes);
+          value->footnotes = NULL;
+          value->n_footnotes = 0;
+        }
+
+      struct fmt_spec format = { .w = 0 };
+      if (sf->format)
+        {
+          format = decode_format (sf->format);
+          add_affixes (table, value, sf->format->affix, sf->format->n_affix);
+        }
+      else if (sf->number_format)
+        {
+          format = decode_number_format (sf->number_format);
+          add_affixes (table, value, sf->number_format->affix,
+                       sf->number_format->n_affix);
+        }
+      else if (sf->n_string_format)
+        {
+          for (size_t i = 0; i < sf->n_string_format; i++)
+            add_affixes (table, value, sf->string_format[i]->affix,
+                         sf->string_format[i]->n_affix);
+        }
+      else if (sf->date_time_format)
+        {
+          format = decode_date_time_format (sf->date_time_format);
+          add_affixes (table, value, sf->date_time_format->affix,
+                       sf->date_time_format->n_affix);
+        }
+      else if (sf->elapsed_time_format)
+        {
+          format = decode_elapsed_time_format (sf->elapsed_time_format);
+          add_affixes (table, value, sf->elapsed_time_format->affix,
+                       sf->elapsed_time_format->n_affix);
+        }
+
+      if (format.w)
+        {
+          if (value->type == PIVOT_VALUE_NUMERIC)
+            value->numeric.format = format;
+
+          /* Possibly we should try to apply date and time formats too,
+             but none seem to occur in practice so far. */
+        }
+    }
+  if (fg || bg)
+    {
+      struct area_style area;
+      pivot_value_get_style (
+        value,
+        value->font_style ? value->font_style : &base_area_style->font_style,
+        value->cell_style ? value->cell_style : &base_area_style->cell_style,
+        &area);
+      decode_spvdx_style_incremental (fg, bg, &area);
+      pivot_value_set_style (value, &area);
+      area_style_uninit (&area);
+    }
+}
+
+static void
+decode_set_cell_properties__ (struct pivot_table *table,
+                              struct hmap *series_map,
+                              const struct spvdx_intersect *intersect,
+                              const struct spvdx_style *interval,
+                              const struct spvdx_style *graph,
+                              const struct spvdx_style *labeling,
+                              const struct spvdx_style *frame,
+                              const struct spvdx_style *major_ticks,
+                              const struct spvdx_set_format *set_format)
+{
+  if (graph && labeling && intersect->alternating
+      && !interval && !major_ticks && !frame && !set_format)
+    {
+      /* Sets alt_fg_color and alt_bg_color. */
+      struct area_style area;
+      decode_spvdx_style (labeling, graph, &area);
+      table->areas[PIVOT_AREA_DATA].font_style.fg[1]
+        = area.font_style.fg[0];
+      table->areas[PIVOT_AREA_DATA].font_style.bg[1]
+        = area.font_style.bg[0];
+      area_style_uninit (&area);
+    }
+  else if (graph
+           && !labeling && !interval && !major_ticks && !frame && !set_format)
+    {
+      /* 'graph->width' likely just sets the width of the table as a
+         whole.  */
+    }
+  else if (!graph && !labeling && !interval && !frame && !set_format
+           && !major_ticks)
+    {
+      /* No-op.  (Presumably there's a setMetaData we don't care about.) */
+    }
+  else if (((set_format && spvdx_is_major_ticks (set_format->target))
+            || major_ticks || frame)
+           && intersect->n_where == 1)
+    {
+      /* Formatting for individual row or column labels. */
+      const struct spvdx_where *w = intersect->where[0];
+      struct spv_series *s = spv_series_find (series_map, w->variable->id);
+      assert (s);
+
+      const char *p = w->include;
+
+      while (*p)
+        {
+          char *tail;
+          int include = strtol (p, &tail, 10);
+          if (tail == p)
+            break;
+
+          struct pivot_category *c = find_category (s, include);
+          if (c)
+            {
+              const struct area_style *base_area_style
+                = (c->dimension->axis_type == PIVOT_AXIS_ROW
+                   ? &table->areas[PIVOT_AREA_ROW_LABELS]
+                   : &table->areas[PIVOT_AREA_COLUMN_LABELS]);
+              apply_styles_to_value (table, c->name, set_format,
+                                     base_area_style, major_ticks, frame);
+            }
+
+          p = tail;
+          if (*p == ';')
+            p++;
+        }
+    }
+  else if ((set_format && spvdx_is_labeling (set_format->target))
+           || labeling || interval)
+    {
+      /* Formatting for individual cells or groups of them with some dimensions
+         in common. */
+      int **indexes = xcalloc (table->n_dimensions, sizeof *indexes);
+      size_t *n = xcalloc (table->n_dimensions, sizeof *n);
+      size_t *allocated = xcalloc (table->n_dimensions, sizeof *allocated);
+
+      for (size_t i = 0; i < intersect->n_where; i++)
+        {
+          const struct spvdx_where *w = intersect->where[i];
+          struct spv_series *s = spv_series_find (series_map, w->variable->id);
+          assert (s);
+          if (!s->dimension)
+            {
+              /* Group indexes may be included even though they are redundant.
+                 Ignore them. */
+              continue;
+            }
+
+          size_t j = s->dimension->top_index;
+
+          const char *p = w->include;
+          while (*p)
+            {
+              char *tail;
+              int include = strtol (p, &tail, 10);
+              if (tail == p)
+                break;
+
+              struct pivot_category *c = find_category (s, include);
+              if (c)
+                {
+                  if (n[j] >= allocated[j])
+                    indexes[j] = x2nrealloc (indexes[j], &allocated[j],
+                                             sizeof *indexes[j]);
+                  indexes[j][n[j]++] = c->data_index;
+                }
+
+              p = tail;
+              if (*p == ';')
+                p++;
+            }
+        }
+
+#if 0
+      printf ("match:");
+      for (size_t i = 0; i < table->n_dimensions; i++)
+        {
+          if (n[i])
+            {
+              printf (" %zu=(", i);
+              for (size_t j = 0; j < n[i]; j++)
+                {
+                  if (j)
+                    putchar (',');
+                  printf ("%d", indexes[i][j]);
+                }
+              putchar (')');
+            }
+        }
+      printf ("\n");
+#endif
+
+      /* XXX This is inefficient in the common case where all of the dimensions
+         are matched.  We should use a heuristic where if all of the dimensions
+         are matched and the product of n[*] is less than
+         hmap_count(&table->cells) then iterate through all the possibilities
+         rather than all the cells.  Or even only do it if there is just one
+         possibility. */
+
+      struct pivot_cell *cell;
+      HMAP_FOR_EACH (cell, struct pivot_cell, hmap_node, &table->cells)
+        {
+          for (size_t i = 0; i < table->n_dimensions; i++)
+            {
+              if (n[i] && !int_in_array (cell->idx[i], indexes[i], n[i]))
+                goto skip;
+            }
+          apply_styles_to_value (table, cell->value, set_format,
+                                 &table->areas[PIVOT_AREA_DATA],
+                                 labeling, interval);
+
+        skip: ;
+        }
+
+      for (size_t i = 0; i < table->n_dimensions; i++)
+        free (indexes[i]);
+      free (indexes);
+      free (n);
+      free (allocated);
+    }
+  else
+    NOT_REACHED ();
+}
+
+static void
+decode_set_cell_properties (struct pivot_table *table, struct hmap *series_map,
+                            struct spvdx_set_cell_properties **scps,
+                            size_t n_scps)
+{
+  for (size_t i = 0; i < n_scps; i++)
+    {
+      const struct spvdx_set_cell_properties *scp = scps[i];
+      const struct spvdx_style *interval = NULL;
+      const struct spvdx_style *graph = NULL;
+      const struct spvdx_style *labeling = NULL;
+      const struct spvdx_style *frame = NULL;
+      const struct spvdx_style *major_ticks = NULL;
+      const struct spvdx_set_format *set_format = NULL;
+      for (size_t j = 0; j < scp->n_seq; j++)
+        {
+          const struct spvxml_node *node = scp->seq[j];
+          if (spvdx_is_set_style (node))
+            {
+              const struct spvdx_set_style *set_style
+                = spvdx_cast_set_style (node);
+              if (spvdx_is_graph (set_style->target))
+                graph = set_style->style;
+              else if (spvdx_is_labeling (set_style->target))
+                labeling = set_style->style;
+              else if (spvdx_is_interval (set_style->target))
+                interval = set_style->style;
+              else if (spvdx_is_major_ticks (set_style->target))
+                major_ticks = set_style->style;
+              else
+                NOT_REACHED ();
+            }
+          else if (spvdx_is_set_frame_style (node))
+            frame = spvdx_cast_set_frame_style (node)->style;
+          else if (spvdx_is_set_format (node))
+            set_format = spvdx_cast_set_format (node);
+          else
+            assert (spvdx_is_set_meta_data (node));
+        }
+
+      if (scp->union_ && scp->apply_to_converse <= 0)
+        {
+          for (size_t j = 0; j < scp->union_->n_intersect; j++)
+            decode_set_cell_properties__ (
+              table, series_map, scp->union_->intersect[j],
+              interval, graph, labeling, frame, major_ticks, set_format);
+        }
+      else if (!scp->union_ && scp->apply_to_converse > 0)
+        {
+          if ((set_format && spvdx_is_labeling (set_format->target))
+              || labeling || interval)
+            {
+              struct pivot_cell *cell;
+              HMAP_FOR_EACH (cell, struct pivot_cell, hmap_node, &table->cells)
+                apply_styles_to_value (table, cell->value, set_format,
+                                       &table->areas[PIVOT_AREA_DATA],
+                                       NULL, NULL);
+            }
+        }
+      else if (!scp->union_ && scp->apply_to_converse <= 0)
+        {
+          /* Appears to be used to set the font for something--but what? */
+        }
+      else
+        NOT_REACHED ();
+    }
+}
+
+char * WARN_UNUSED_RESULT
+decode_spvsx_legacy_properties (const struct spvsx_table_properties *in,
+                                struct spv_legacy_properties **outp)
+{
+  struct spv_legacy_properties *out = xzalloc (sizeof *out);
+  char *error;
+
+  if (!in)
+    {
+      error = xstrdup ("Legacy table lacks tableProperties");
+      goto error;
+    }
+
+  const struct spvsx_general_properties *g = in->general_properties;
+  out->omit_empty = g->hide_empty_rows != 0;
+  out->width_ranges[TABLE_HORZ][0] = optional_pt (g->minimum_column_width, -1);
+  out->width_ranges[TABLE_HORZ][1] = optional_pt (g->maximum_column_width, -1);
+  out->width_ranges[TABLE_VERT][0] = optional_pt (g->minimum_row_width, -1);
+  out->width_ranges[TABLE_VERT][1] = optional_pt (g->maximum_row_width, -1);
+  out->row_labels_in_corner
+    = g->row_dimension_labels != SPVSX_ROW_DIMENSION_LABELS_NESTED;
+
+  const struct spvsx_footnote_properties *f = in->footnote_properties;
+  out->footnote_marker_superscripts
+    = (f->marker_position != SPVSX_MARKER_POSITION_SUBSCRIPT);
+  out->show_numeric_markers
+    = (f->number_format == SPVSX_NUMBER_FORMAT_NUMERIC);
+
+  for (int i = 0; i < PIVOT_N_AREAS; i++)
+    pivot_area_get_default_style (i, &out->areas[i]);
+
+  const struct spvsx_cell_format_properties *cfp = in->cell_format_properties;
+  for (size_t i = 0; i < cfp->n_cell_style; i++)
+    {
+      const struct spvsx_cell_style *c = cfp->cell_style[i];
+      const char *name = CHAR_CAST (const char *, c->node_.raw->name);
+      enum pivot_area area = pivot_area_from_name (name);
+      if (area == PIVOT_N_AREAS)
+        {
+          error = xasprintf ("unknown area \"%s\" in cellFormatProperties",
+                             name);
+          goto error;
+        }
+
+      struct area_style *a = &out->areas[area];
+      const struct spvsx_style *s = c->style;
+      if (s->font_weight)
+        a->font_style.bold = s->font_weight == SPVSX_FONT_WEIGHT_BOLD;
+      if (s->font_style)
+        a->font_style.italic = s->font_style == SPVSX_FONT_STYLE_ITALIC;
+      a->font_style.underline = false;
+      if (s->color >= 0)
+        a->font_style.fg[0] = optional_color (
+          s->color, (struct cell_color) CELL_COLOR_BLACK);
+      if (c->alternating_text_color >= 0 || s->color >= 0)
+        a->font_style.fg[1] = optional_color (c->alternating_text_color,
+                                              a->font_style.fg[0]);
+      if (s->color2 >= 0)
+        a->font_style.bg[0] = optional_color (
+          s->color2, (struct cell_color) CELL_COLOR_WHITE);
+      if (c->alternating_color >= 0 || s->color2 >= 0)
+        a->font_style.bg[1] = optional_color (c->alternating_color,
+                                              a->font_style.bg[0]);
+      if (s->font_family)
+        {
+          free (a->font_style.typeface);
+          a->font_style.typeface = xstrdup (s->font_family);
+        }
+
+      if (s->font_size)
+        a->font_style.size = optional_length (s->font_size, 0);
+
+      if (s->text_alignment)
+        a->cell_style.halign
+          = (s->text_alignment == SPVSX_TEXT_ALIGNMENT_LEFT
+             ? TABLE_HALIGN_LEFT
+             : s->text_alignment == SPVSX_TEXT_ALIGNMENT_RIGHT
+             ? TABLE_HALIGN_RIGHT
+             : s->text_alignment == SPVSX_TEXT_ALIGNMENT_CENTER
+             ? TABLE_HALIGN_CENTER
+             : s->text_alignment == SPVSX_TEXT_ALIGNMENT_DECIMAL
+             ? TABLE_HALIGN_DECIMAL
+             : TABLE_HALIGN_MIXED);
+      if (s->label_location_vertical)
+        a->cell_style.valign
+          = (s->label_location_vertical == SPVSX_LABEL_LOCATION_VERTICAL_NEGATIVE
+             ? TABLE_VALIGN_BOTTOM
+             : s->label_location_vertical == SPVSX_LABEL_LOCATION_VERTICAL_POSITIVE
+             ? TABLE_VALIGN_TOP
+             : TABLE_VALIGN_CENTER);
+
+      if (s->decimal_offset != DBL_MAX)
+        a->cell_style.decimal_offset = optional_px (s->decimal_offset, 0);
+
+      if (s->margin_left != DBL_MAX)
+        a->cell_style.margin[TABLE_HORZ][0] = optional_px (s->margin_left, 8);
+      if (s->margin_right != DBL_MAX)
+        a->cell_style.margin[TABLE_HORZ][1] = optional_px (s->margin_right,
+                                                           11);
+      if (s->margin_top != DBL_MAX)
+        a->cell_style.margin[TABLE_VERT][0] = optional_px (s->margin_top, 1);
+      if (s->margin_bottom != DBL_MAX)
+        a->cell_style.margin[TABLE_VERT][1] = optional_px (s->margin_bottom,
+                                                           1);
+    }
+
+  for (int i = 0; i < PIVOT_N_BORDERS; i++)
+    pivot_border_get_default_style (i, &out->borders[i]);
+
+  const struct spvsx_border_properties *bp = in->border_properties;
+  for (size_t i = 0; i < bp->n_border_style; i++)
+    {
+      const struct spvsx_border_style *bin = bp->border_style[i];
+      const char *name = CHAR_CAST (const char *, bin->node_.raw->name);
+      enum pivot_border border = pivot_border_from_name (name);
+      if (border == PIVOT_N_BORDERS)
+        {
+          error = xasprintf ("unknown border \"%s\" in borderProperties",
+                             name);
+          goto error;
+        }
+
+      struct table_border_style *bout = &out->borders[border];
+      bout->stroke
+        = (bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_NONE
+           ? TABLE_STROKE_NONE
+           : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_DASHED
+           ? TABLE_STROKE_DASHED
+           : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_THICK
+           ? TABLE_STROKE_THICK
+           : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_THIN
+           ? TABLE_STROKE_THIN
+           : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_DOUBLE
+           ? TABLE_STROKE_DOUBLE
+           : TABLE_STROKE_SOLID);
+      bout->color = optional_color (bin->color,
+                                    (struct cell_color) CELL_COLOR_BLACK);
+    }
+
+  const struct spvsx_printing_properties *pp = in->printing_properties;
+  out->print_all_layers = pp->print_all_layers > 0;
+  out->paginate_layers = pp->print_each_layer_on_separate_page > 0;
+  out->shrink_to_width = pp->rescale_wide_table_to_fit_page > 0;
+  out->shrink_to_length = pp->rescale_long_table_to_fit_page > 0;
+  out->top_continuation = pp->continuation_text_at_top > 0;
+  out->bottom_continuation = pp->continuation_text_at_bottom > 0;
+  out->continuation = xstrdup (pp->continuation_text
+                               ? pp->continuation_text : "(cont.)");
+  out->n_orphan_lines = optional_int (pp->window_orphan_lines, 2);
+
+  *outp = out;
+  return NULL;
+
+error:
+  spv_legacy_properties_destroy (out);
+  *outp = NULL;
+  return error;
+}
+
+void
+spv_legacy_properties_destroy (struct spv_legacy_properties *props)
+{
+  if (props)
+    {
+      for (size_t i = 0; i < PIVOT_N_AREAS; i++)
+        area_style_uninit (&props->areas[i]);
+      free (props->continuation);
+      free (props);
+    }
+}
+
+static struct spv_series *
+parse_formatting (const struct spvdx_visualization *v,
+                  const struct hmap *series_map, struct hmap *format_map)
+{
+  const struct spvdx_labeling *labeling = v->graph->interval->labeling;
+  struct spv_series *cell_format = NULL;
+  for (size_t i = 0; i < labeling->n_seq; i++)
+    {
+      const struct spvdx_formatting *f
+        = spvdx_cast_formatting (labeling->seq[i]);
+      if (!f)
+        continue;
+
+      cell_format = spv_series_from_ref (series_map, f->variable);
+      for (size_t j = 0; j < f->n_format_mapping; j++)
+        {
+          const struct spvdx_format_mapping *fm = f->format_mapping[j];
+
+          if (fm->format)
+            {
+              struct format_mapping *out = xmalloc (sizeof *out);
+              out->from = fm->from;
+              out->to = decode_format (fm->format);
+              hmap_insert (format_map, &out->hmap_node,
+                           hash_int (out->from, 0));
+            }
+        }
+    }
+
+  return cell_format;
+}
+
+static void
+format_map_destroy (struct hmap *format_map)
+{
+  struct format_mapping *fm, *next;
+  HMAP_FOR_EACH_SAFE (fm, next, struct format_mapping, hmap_node, format_map)
+    {
+      hmap_delete (format_map, &fm->hmap_node);
+      free (fm);
+    }
+  hmap_destroy (format_map);
+}
+
+char * WARN_UNUSED_RESULT
+decode_spvdx_table (const struct spvdx_visualization *v,
+                    const struct spv_legacy_properties *props,
+                    struct spv_data *data, struct pivot_table **outp)
+{
+  struct pivot_table *table = pivot_table_create__ (NULL);
+
+  struct hmap series_map = HMAP_INITIALIZER (series_map);
+  struct hmap format_map = HMAP_INITIALIZER (format_map);
+  struct spv_series **dim_series = NULL;
+  char *error = NULL;
+
+  /* First get the legacy properties. */
+  table->omit_empty = props->omit_empty;
+  for (enum table_axis axis = 0; axis < TABLE_N_AXES; axis++)
+    for (int i = 0; i < 2; i++)
+      if (props->width_ranges[axis][i] > 0)
+        table->sizing[axis].range[i] = props->width_ranges[axis][i];
+  table->row_labels_in_corner = props->row_labels_in_corner;
+
+  table->footnote_marker_superscripts = props->footnote_marker_superscripts;
+  table->show_numeric_markers = props->show_numeric_markers;
+
+  for (size_t i = 0; i < PIVOT_N_AREAS; i++)
+    {
+      area_style_uninit (&table->areas[i]);
+      area_style_copy (&table->areas[i], &props->areas[i]);
+    }
+  for (size_t i = 0; i < PIVOT_N_BORDERS; i++)
+    table->borders[i] = props->borders[i];
+
+  table->print_all_layers = props->print_all_layers;
+  table->paginate_layers = props->paginate_layers;
+  table->shrink_to_fit[TABLE_HORZ] = props->shrink_to_width;
+  table->shrink_to_fit[TABLE_VERT] = props->shrink_to_length;
+  table->top_continuation = props->top_continuation;
+  table->bottom_continuation = props->bottom_continuation;
+  table->continuation = xstrdup (props->continuation);
+  table->n_orphan_lines = props->n_orphan_lines;
+
+  struct spvdx_visualization_extension *ve = v->visualization_extension;
+  table->show_grid_lines = ve && ve->show_gridline;
+
+  /* Sizing from the legacy properties can get overridden. */
+  if (v->graph->cell_style->width)
+    {
+      int min_width, max_width, n = 0;
+      if (sscanf (v->graph->cell_style->width, "%*d%%;%dpt;%dpt%n",
+                  &min_width, &max_width, &n) == 2
+          && v->graph->cell_style->width[n] == '\0')
+        {
+          table->sizing[TABLE_HORZ].range[0] = min_width;
+          table->sizing[TABLE_HORZ].range[1] = max_width;
+        }
+    }
+
+  /* Footnotes.
+
+     Any pivot_value might refer to footnotes, so it's important to process the
+     footnotes early to ensure that those references can be resolved.  There is
+     a possible problem that a footnote might itself reference an
+     as-yet-unprocessed footnote, but that's OK because footnote references
+     don't actually look at the footnote contents but only resolve a pointer to
+     where the footnote will go later.
+
+     Before we really start, create all the footnotes we'll fill in.  This is
+     because sometimes footnotes refer to themselves or to each other and we
+     don't want to reject those references. */
+  if (v->container)
+    for (size_t i = 0; i < v->container->n_label_frame; i++)
+      {
+        const struct spvdx_label_frame *lf = v->container->label_frame[i];
+        if (lf->label
+            && lf->label->purpose == SPVDX_PURPOSE_FOOTNOTE
+            && lf->label->n_text > 0
+            && lf->label->text[0]->uses_reference > 0)
+          {
+            pivot_table_create_footnote__ (
+              table, lf->label->text[0]->uses_reference - 1,
+              NULL, NULL);
+          }
+      }
+
+  if (v->graph->interval->footnotes)
+    decode_footnotes (table, v->graph->interval->footnotes);
+
+  struct spv_series *footnotes = NULL;
+  for (size_t i = 0; i < v->graph->interval->labeling->n_seq; i++)
+    {
+      const struct spvxml_node *node = v->graph->interval->labeling->seq[i];
+      if (spvdx_is_footnotes (node))
+        {
+          const struct spvdx_footnotes *f = spvdx_cast_footnotes (node);
+          footnotes = spv_series_from_ref (&series_map, f->variable);
+          decode_footnotes (table, f);
+        }
+    }
+  for (size_t i = 0; i < v->n_lf1; i++)
+    {
+      error = decode_label_frame (table, v->lf1[i]);
+      if (error)
+        goto exit;
+    }
+  for (size_t i = 0; i < v->n_lf2; i++)
+    {
+      error = decode_label_frame (table, v->lf2[i]);
+      if (error)
+        goto exit;
+    }
+  if (v->container)
+    for (size_t i = 0; i < v->container->n_label_frame; i++)
+      {
+        error = decode_label_frame (table, v->container->label_frame[i]);
+        if (error)
+          goto exit;
+      }
+  if (v->graph->interval->labeling->style)
+    {
+      area_style_uninit (&table->areas[PIVOT_AREA_DATA]);
+      decode_spvdx_style (v->graph->interval->labeling->style,
+                          v->graph->cell_style,
+                          &table->areas[PIVOT_AREA_DATA]);
+    }
+
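+  /* Decode all of the sourceVariable and derivedVariable elements.  A
+     variable may refer to one that has not been decoded yet, so retry the
+     remaining ones until a whole pass makes no progress. */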
+  struct spvxml_node **nodes = xmemdup (v->seq, v->n_seq * sizeof *v->seq);
+  size_t n_nodes = v->n_seq;
+  while (n_nodes > 0)
+    {
+      bool progress = false;
+      for (size_t i = 0; i < n_nodes; )
+        {
+          error = (spvdx_is_source_variable (nodes[i])
+                   ? decode_spvdx_source_variable (nodes[i], data, &series_map)
+                   : decode_spvdx_derived_variable (nodes[i], &series_map));
+          if (!error)
+            {
+              nodes[i] = nodes[--n_nodes];
+              progress = true;
+            }
+          else if (error == &BAD_REFERENCE)
+            i++;
+          else
+            {
+              free (nodes);
+              goto exit;
+            }
+        }
+
+      if (!progress)
+        {
+          error = xasprintf ("Table has %zu variables with circular or "
+                             "unresolved references, including variable %s.",
+                             n_nodes, nodes[0]->id);
+          free (nodes);
+          goto exit;
+        }
+    }
+  free (nodes);
+
+  const struct spvdx_cross *cross = v->graph->faceting->cross;
+
+  assert (cross->n_seq == 1);
+  const struct spvdx_nest *columns = spvdx_cast_nest (cross->seq[0]);
+  size_t max_columns = columns ? columns->n_vars : 0;
+
+  assert (cross->n_seq2 == 1);
+  const struct spvdx_nest *rows = spvdx_cast_nest (cross->seq2[0]);
+  size_t max_rows = rows ? rows->n_vars : 0;
+
+  size_t max_layers = (v->graph->faceting->n_layers1
+                       + v->graph->faceting->n_layers2);
+
+  size_t max_dims = max_columns + max_rows + max_layers;
+  table->dimensions = xnmalloc (max_dims, sizeof *table->dimensions);
+  dim_series = xnmalloc (max_dims, sizeof *dim_series);
+  size_t n_dim_series = 0;
+  add_dimensions (&series_map, columns, PIVOT_AXIS_COLUMN, v, table,
+                  dim_series, &n_dim_series, 1);
+  add_dimensions (&series_map, rows, PIVOT_AXIS_ROW, v, table,
+                  dim_series, &n_dim_series, max_columns + 1);
+  add_layers (&series_map,
+              v->graph->faceting->layers1, v->graph->faceting->n_layers1,
+              v, table, dim_series, &n_dim_series, max_rows + max_columns + 1);
+  add_layers (&series_map,
+              v->graph->faceting->layers2, v->graph->faceting->n_layers2,
+              v, table, dim_series, &n_dim_series,
+              max_rows + max_columns + v->graph->faceting->n_layers1 + 1);
+
+  struct spv_series *cell = spv_series_find (&series_map, "cell");
+  if (!cell)
+    {
+      error = xstrdup (_("Table lacks cell data."));
+      goto exit;
+    }
+
+  struct spv_series *cell_format = parse_formatting (v, &series_map,
+                                                     &format_map);
+
+  assert (table->n_dimensions == n_dim_series);
+  size_t *dim_indexes = xnmalloc (table->n_dimensions, sizeof *dim_indexes);
+  for (size_t i = 0; i < cell->n_values; i++)
+    {
+      for (size_t j = 0; j < table->n_dimensions; j++)
+        {
+          const struct spv_data_value *value = &dim_series[j]->values[i];
+          const struct pivot_category *cat = find_category (
+            dim_series[j], value->width < 0 ? value->d : value->index);
+          if (!cat)
+            goto skip;
+          dim_indexes[j] = cat->data_index;
+        }
+
+      struct pivot_value *value = pivot_value_from_data_value (
+        &cell->values[i], cell_format ? &cell_format->values[i] : NULL,
+        &format_map);
+      if (footnotes)
+        {
+          const struct spv_data_value *d = &footnotes->values[i];
+          if (d->width >= 0)
+            {
+              const char *p = d->s;
+              while (*p)
+                {
+                  char *tail;
+                  int idx = strtol (p, &tail, 10);
+                  if (tail == p)
+                    break;
+                  add_footnote (value, idx, table);
+                  p = tail;
+                  if (*p == ',')
+                    p++;
+                }
+            }
+        }
+
+      if (value->type == PIVOT_VALUE_NUMERIC
+          && value->numeric.x == SYSMIS
+          && !value->n_footnotes)
+        {
+          /* Apparently, system-missing values are just empty cells? */
+          pivot_value_destroy (value);
+        }
+      else
+        pivot_table_put (table, dim_indexes, table->n_dimensions, value);
+    skip:;
+    }
+  free (dim_indexes);
+
+  decode_set_cell_properties (table, &series_map, v->graph->facet_layout->scp1,
+                              v->graph->facet_layout->n_scp1);
+  decode_set_cell_properties (table, &series_map, v->graph->facet_layout->scp2,
+                              v->graph->facet_layout->n_scp2);
+
+  pivot_table_assign_label_depth (table);
+
+  format_map_destroy (&format_map);
+
+exit:
+  free (dim_series);
+  spv_series_destroy (&series_map);
+  if (error)
+    {
+      pivot_table_unref (table);
+      *outp = NULL;
+    }
+  else
+    *outp = table;
+  return error;
+}
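Note on the variable-decoding loop in decode_spvdx_table() above: it keeps retrying the variables whose references could not yet be resolved and gives up only when a whole pass makes no progress. The following is a minimal standalone sketch of that retry-until-no-progress idea, not the patch's code. The names struct item, is_resolved and resolve_all are hypothetical, and the sketch marks items with a flag instead of compacting the array as the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a variable that may refer to another one. */
struct item
  {
    int id;
    int depends_on;             /* ID this item needs, or -1 for none. */
    bool resolved;
  };

/* Returns true if the item with the given ID has been resolved. */
static bool
is_resolved (const struct item *items, size_t n, int id)
{
  for (size_t i = 0; i < n; i++)
    if (items[i].id == id)
      return items[i].resolved;
  return false;
}

/* Resolves items in whatever order their dependencies allow.  Returns the
   number of items left unresolved: 0 on success, nonzero if the remainder
   have circular or missing references. */
static size_t
resolve_all (struct item *items, size_t n)
{
  assert (items != NULL || n == 0);

  size_t n_left = n;
  while (n_left > 0)
    {
      bool progress = false;
      for (size_t i = 0; i < n; i++)
        {
          struct item *it = &items[i];
          if (!it->resolved
              && (it->depends_on < 0
                  || is_resolved (items, n, it->depends_on)))
            {
              it->resolved = true;
              n_left--;
              progress = true;
            }
        }

      /* A full pass that resolves nothing means no later pass can help. */
      if (!progress)
        break;
    }
  return n_left;
}
```

A caller would treat a nonzero return as the "circular or unresolved references" case. Any diagnostic that names a leftover item has to be built while the array is still allocated.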
diff --git a/src/output/spv/spv-legacy-decoder.h b/src/output/spv/spv-legacy-decoder.h
new file mode 100644
index 0000000..b5a7e30
--- /dev/null
+++ b/src/output/spv/spv-legacy-decoder.h
@@ -0,0 +1,45 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_LEGACY_DECODER_H
+#define OUTPUT_SPV_LEGACY_DECODER_H 1
+
+/* SPSS Viewer (SPV) legacy binary decoder.
+
+   Used by spv.h, not useful directly. */
+
+#include "libpspp/compiler.h"
+
+struct pivot_table;
+struct spvdx_visualization;
+struct spvsx_table_properties;
+struct spv_data;
+
+struct spv_legacy_properties;
+
+void spv_legacy_properties_destroy (struct spv_legacy_properties *);
+
+char *decode_spvsx_legacy_properties (const struct spvsx_table_properties *,
+                                      struct spv_legacy_properties **)
+  WARN_UNUSED_RESULT;
+
+char *decode_spvdx_table (const struct spvdx_visualization *,
+                          const struct spv_legacy_properties *,
+                          struct spv_data *,
+                          struct pivot_table **outp)
+  WARN_UNUSED_RESULT;
+
+#endif /* output/spv/spv-legacy-decoder.h */
diff --git a/src/output/spv/spv-light-decoder.c b/src/output/spv/spv-light-decoder.c

new file mode 100644

index 0000000..f8b07d0
--- /dev/null
+++ b/src/output/spv/spv-light-decoder.c
@@ -0,0 +1,856 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2017, 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spv-light-decoder.h"
+
+#include <inttypes.h>
+#include <limits.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "libpspp/i18n.h"
+#include "libpspp/message.h"
+#include "output/pivot-table.h"
+#include "output/spv/light-binary-parser.h"
+#include "output/spv/spv.h"
+
+#include "gl/xalloc.h"
+#include "gl/xsize.h"
+
+static char *
+to_utf8 (const char *s, const char *encoding)
+{
+  return recode_string ("UTF-8", encoding, s, strlen (s));
+}
+
+static char *
+to_utf8_if_nonempty (const char *s, const char *encoding)
+{
+  return s && s[0] ? to_utf8 (s, encoding) : NULL;
+}
+
+static void
+convert_widths (const uint32_t *in, uint32_t n, int **out, size_t *n_out)
+{
+  if (n)
+    {
+      *n_out = n;
+      *out = xnmalloc (n, sizeof **out);
+      for (size_t i = 0; i < n; i++)
+        (*out)[i] = in[i];
+    }
+}
+
+static void
+convert_breakpoints (const struct spvlb_breakpoints *in,
+                     size_t **out, size_t *n_out)
+{
+  if (in && in->n_breaks)
+    {
+      *n_out = in->n_breaks;
+      *out = xnmalloc (in->n_breaks, sizeof **out);
+      for (size_t i = 0; i < in->n_breaks; i++)
+        (*out)[i] = in->breaks[i];
+    }
+}
+
+static void
+convert_keeps (const struct spvlb_keeps *in,
+               struct pivot_keep **out, size_t *n_out)
+{
+  if (in && in->n_keeps)
+    {
+      *n_out = in->n_keeps;
+      *out = xnmalloc (*n_out, sizeof **out);
+      for (size_t i = 0; i < *n_out; i++)
+        {
+          (*out)[i].ofs = in->keeps[i]->offset;
+          (*out)[i].n = in->keeps[i]->n;
+        }
+    }
+}
+
+static struct cell_color
+decode_spvlb_color_string (const char *s, uint8_t def)
+{
+  unsigned int r, g, b;
+  if (sscanf (s, "#%2x%2x%2x", &r, &g, &b) != 3)
+    {
+      if (*s)
+        {
+          fprintf (stderr, "bad color %s\n", s);
+          exit (1);
+        }
+      r = g = b = def;
+    }
+  return (struct cell_color) CELL_COLOR (r, g, b);
+}
+
+static struct cell_color
+decode_spvlb_color_u32 (uint32_t x)
+{
+  return (struct cell_color) { x >> 24, x >> 16, x >> 8, x };
+}
+
+static struct font_style *
+decode_spvlb_font_style (const struct spvlb_font_style *in,
+                         const char *encoding)
+{
+  if (!in)
+    return NULL;
+
+  struct font_style *out = xmalloc (sizeof *out);
+  *out = (struct font_style) {
+    .bold = in->bold,
+    .italic = in->italic,
+    .underline = in->underline,
+    .fg[0] = decode_spvlb_color_string (in->fg_color, 0x00),
+    .bg[0] = decode_spvlb_color_string (in->bg_color, 0xff),
+    .typeface = to_utf8 (in->typeface, encoding),
+    .size = in->size / 1.33,
+  };
+  out->fg[1] = out->fg[0];
+  out->bg[1] = out->bg[0];
+  return out;
+}
+
+static enum table_halign
+decode_spvlb_halign (uint32_t in)
+{
+  switch (in)
+    {
+    case 0:
+      return TABLE_HALIGN_CENTER;
+
+    case 2:
+      return TABLE_HALIGN_LEFT;
+
+    case 4:
+      return TABLE_HALIGN_RIGHT;
+
+    case 6:
+    case 61453:
+      return TABLE_HALIGN_DECIMAL;
+
+    case 0xffffffad:
+    case 64173:
+      return TABLE_HALIGN_MIXED;
+
+    default:
+      fprintf (stderr, "bad cell style halign %"PRIu32"\n", in);
+      exit (1);
+    }
+}
+
+static enum table_valign
+decode_spvlb_valign (uint32_t in)
+{
+  switch (in)
+    {
+    case 0:
+      return TABLE_VALIGN_CENTER;
+
+    case 1:
+      return TABLE_VALIGN_TOP;
+
+    case 3:
+      return TABLE_VALIGN_BOTTOM;
+
+    default:
+      fprintf (stderr, "bad cell style valign %"PRIu32"\n", in);
+      exit (1);
+    }
+}
+
+static struct cell_style *
+decode_spvlb_cell_style (const struct spvlb_cell_style *in)
+{
+  if (!in)
+    return NULL;
+
+  struct cell_style *out = xzalloc (sizeof *out);
+
+  out->halign = decode_spvlb_halign (in->halign);
+  out->valign = decode_spvlb_valign (in->valign);
+  out->decimal_offset = in->decimal_offset;
+  out->margin[TABLE_HORZ][0] = in->left_margin;
+  out->margin[TABLE_HORZ][1] = in->right_margin;
+  out->margin[TABLE_VERT][0] = in->top_margin;
+  out->margin[TABLE_VERT][1] = in->bottom_margin;
+
+  return out;
+}
+
+static struct pivot_value *decode_spvlb_value (const struct pivot_table *,
+                                             const struct spvlb_value *,
+                                             const char *encoding);
+
+static void
+decode_spvlb_argument (const struct pivot_table *table,
+                       const struct spvlb_argument *in,
+                       struct pivot_argument *out,
+                       const char *encoding)
+{
+  if (in->value)
+    {
+      out->n = 1;
+      out->values = xmalloc (sizeof *out->values);
+      out->values[0] = decode_spvlb_value (table, in->value, encoding);
+    }
+  else
+    {
+      out->n = in->n_values;
+      out->values = xnmalloc (out->n, sizeof *out->values);
+      for (size_t i = 0; i < out->n; i++)
+        out->values[i] = decode_spvlb_value (table, in->values[i], encoding);
+    }
+}
+
+static enum settings_value_show
+decode_spvlb_value_show (uint8_t in)
+{
+  switch (in)
+    {
+    case 0: return SETTINGS_VALUE_SHOW_DEFAULT;
+    case 1: return SETTINGS_VALUE_SHOW_VALUE;
+    case 2: return SETTINGS_VALUE_SHOW_LABEL;
+    case 3: return SETTINGS_VALUE_SHOW_BOTH;
+    default:
+      fprintf (stderr, "bad value show %"PRIu8"\n", in);
+      exit (1);
+    }
+}
+
+static struct pivot_value *
+decode_spvlb_value (const struct pivot_table *table,
+                    const struct spvlb_value *in,
+                    const char *encoding)
+{
+  struct pivot_value *out = xzalloc (sizeof *out);
+  const struct spvlb_value_mod *vm;
+
+  switch (in->type)
+    {
+    case 1:
+      vm = in->type_01.value_mod;
+      out->type = PIVOT_VALUE_NUMERIC;
+      out->numeric.x = in->type_01.x;
+      out->numeric.format = spv_decode_fmt_spec (in->type_01.format);
+      break;
+
+    case 2:
+      vm = in->type_02.value_mod;
+      out->type = PIVOT_VALUE_NUMERIC;
+      out->numeric.x = in->type_02.x;
+      out->numeric.format = spv_decode_fmt_spec (in->type_02.format);
+      out->numeric.var_name = to_utf8_if_nonempty (in->type_02.var_name,
+                                                   encoding);
+      out->numeric.value_label = to_utf8_if_nonempty (in->type_02.value_label,
+                                                      encoding);
+      out->numeric.show = decode_spvlb_value_show (in->type_02.show);
+      break;
+
+    case 3:
+      vm = in->type_03.value_mod;
+      out->type = PIVOT_VALUE_TEXT;
+      out->text.local = to_utf8 (in->type_03.local, encoding);
+      out->text.c = to_utf8 (in->type_03.c, encoding);
+      out->text.id = to_utf8 (in->type_03.id, encoding);
+      out->text.user_provided = !in->type_03.fixed;
+      break;
+
+    case 4:
+      vm = in->type_04.value_mod;
+      out->type = PIVOT_VALUE_STRING;
+      out->string.s = to_utf8 (in->type_04.s, encoding);
+      out->string.var_name = to_utf8 (in->type_04.var_name, encoding);
+      out->string.value_label = to_utf8_if_nonempty (in->type_04.value_label,
+                                                     encoding);
+      out->string.show = decode_spvlb_value_show (in->type_04.show);
+      break;
+
+    case 5:
+      vm = in->type_05.value_mod;
+      out->type = PIVOT_VALUE_VARIABLE;
+      out->variable.var_name = to_utf8 (in->type_05.var_name, encoding);
+      out->variable.var_label = to_utf8_if_nonempty (in->type_05.var_label,
+                                                     encoding);
+      out->variable.show = decode_spvlb_value_show (in->type_05.show);
+      break;
+
+    case 6:
+      vm = in->type_06.value_mod;
+      out->type = PIVOT_VALUE_TEXT;
+      out->text.local = to_utf8 (in->type_06.local, encoding);
+      out->text.c = to_utf8 (in->type_06.c, encoding);
+      out->text.id = to_utf8 (in->type_06.id, encoding);
+      out->text.user_provided = false;
+      break;
+
+    case -1:
+      vm = in->type_else.value_mod;
+      out->type = PIVOT_VALUE_TEMPLATE;
+      out->template.local = to_utf8 (in->type_else.template, encoding);
+      out->template.id = out->template.local;
+      out->template.n_args = in->type_else.n_args;
+      out->template.args = xnmalloc (in->type_else.n_args,
+                                     sizeof *out->template.args);
+      for (size_t i = 0; i < out->template.n_args; i++)
+        decode_spvlb_argument (table, in->type_else.args[i],
+                               &out->template.args[i], encoding);
+      break;
+
+    default:
+      assert (0);
+    }
+
+  if (vm)
+    {
+      if (vm->subscript)
+        out->subscript = to_utf8 (vm->subscript, encoding);
+
+      if (vm->n_refs)
+        {
+          out->footnotes = xnmalloc (vm->n_refs, sizeof *out->footnotes);
+          for (size_t i = 0; i < vm->n_refs; i++)
+            {
+              uint16_t idx = vm->refs[i];
+              if (idx < table->n_footnotes)
+                out->footnotes[out->n_footnotes++] = table->footnotes[idx];
+              else
+                {
+                  fprintf (stderr, "bad footnote index: %"PRIu16" >= %zu\n",
+                           idx, table->n_footnotes);
+                  exit (1);
+                }
+            }
+        }
+
+      if (vm->style_pair)
+        {
+          out->font_style = decode_spvlb_font_style (
+            vm->style_pair->font_style, encoding);
+          out->cell_style = decode_spvlb_cell_style (
+            vm->style_pair->cell_style);
+        }
+
+      if (vm->template_string
+          && vm->template_string->id
+          && vm->template_string->id[0]
+          && out->type == PIVOT_VALUE_TEMPLATE)
+        out->template.id = to_utf8 (vm->template_string->id, encoding);
+    }
+
+  return out;
+}
+
+static void
+decode_spvlb_area (const struct spvlb_area *in, struct area_style *out,
+                   const char *encoding)
+{
+  out->font_style.bold = (in->style & 1) != 0;
+  out->font_style.italic = (in->style & 2) != 0;
+  out->font_style.underline = in->underline;
+  out->font_style.fg[0] = decode_spvlb_color_string (in->fg_color, 0x00);
+  out->font_style.bg[0] = decode_spvlb_color_string (in->bg_color, 0xff);
+  out->font_style.typeface = to_utf8 (in->typeface, encoding);
+  out->font_style.size = in->size / 1.33;
+  out->font_style.fg[1] = (in->alternate
+                           ? decode_spvlb_color_string (in->alt_fg_color, 0x00)
+                           : out->font_style.fg[0]);
+  out->font_style.bg[1] = (in->alternate
+                           ? decode_spvlb_color_string (in->alt_bg_color, 0xff)
+                           : out->font_style.bg[0]);
+  assert (in->halign != 61453);
+  out->cell_style.halign = decode_spvlb_halign (in->halign);
+  out->cell_style.valign = decode_spvlb_valign (in->valign);
+
+  /* TABLE_HALIGN_DECIMAL doesn't seem to be a real halign for areas, which is
+     good because there's no way to indicate the decimal offset.  Just in
+     case: */
+  if (out->cell_style.halign == TABLE_HALIGN_DECIMAL)
+    out->cell_style.halign = TABLE_HALIGN_MIXED;
+
+  out->cell_style.margin[TABLE_HORZ][0] = in->left_margin;
+  out->cell_style.margin[TABLE_HORZ][1] = in->right_margin;
+  out->cell_style.margin[TABLE_VERT][0] = in->top_margin;
+  out->cell_style.margin[TABLE_VERT][1] = in->bottom_margin;
+}
+
+static void decode_spvlb_group (const struct pivot_table *,
+                                struct spvlb_category **,
+                                size_t n_categories,
+                                bool show_label,
+                                struct pivot_category *parent,
+                                struct pivot_dimension *,
+                                const char *encoding);
+
+static void
+decode_spvlb_categories (const struct pivot_table *table,
+                         struct spvlb_category **categories,
+                         size_t n_categories,
+                         struct pivot_category *parent,
+                         struct pivot_dimension *dimension,
+                         const char *encoding)
+{
+  for (size_t i = 0; i < n_categories; i++)
+    {
+      const struct spvlb_category *in = categories[i];
+      if (in->group && in->group->merge)
+        {
+          decode_spvlb_categories (table, in->group->subcategories,
+                                   in->group->n_subcategories,
+                                   parent, dimension, encoding);
+          continue;
+        }
+
+      struct pivot_category *out = xzalloc (sizeof *out);
+      out->name = decode_spvlb_value (table, in->name, encoding);
+      out->parent = parent;
+      out->dimension = dimension;
+      if (in->group)
+        {
+          decode_spvlb_group (table, in->group->subcategories,
+                              in->group->n_subcategories,
+                              true, out, dimension, encoding);
+          out->data_index = SIZE_MAX;
+          out->presentation_index = SIZE_MAX;
+        }
+      else
+        {
+          out->data_index = in->leaf->leaf_index;
+          out->presentation_index = dimension->n_leaves;
+          dimension->n_leaves++;
+        }
+
+      if (parent->n_subs >= parent->allocated_subs)
+        parent->subs = x2nrealloc (parent->subs, &parent->allocated_subs,
+                                   sizeof *parent->subs);
+      parent->subs[parent->n_subs++] = out;
+    }
+}
+
+static void
+decode_spvlb_group (const struct pivot_table *table,
+                    struct spvlb_category **categories,
+                    size_t n_categories, bool show_label,
+                    struct pivot_category *category,
+                    struct pivot_dimension *dimension,
+                    const char *encoding)
+{
+  category->subs = xcalloc (n_categories, sizeof *category->subs);
+  category->n_subs = 0;
+  category->allocated_subs = n_categories;
+  category->show_label = show_label;
+
+  decode_spvlb_categories (table, categories, n_categories, category,
+                           dimension, encoding);
+}
+
+static void
+fill_leaves (struct pivot_category *category,
+             struct pivot_dimension *dimension)
+{
+  if (pivot_category_is_group (category))
+    {
+      for (size_t i = 0; i < category->n_subs; i++)
+        fill_leaves (category->subs[i], dimension);
+    }
+  else
+    {
+      if (category->data_index >= dimension->n_leaves)
+        {
+          fprintf (stderr, "leaf_index %zu >= n_leaves %zu\n",
+                   category->data_index, dimension->n_leaves);
+          exit (1);
+        }
+      if (dimension->data_leaves[category->data_index])
+        {
+          fprintf (stderr, "two leaves with data_index %zu\n",
+                   category->data_index);
+          exit (1);
+        }
+      dimension->data_leaves[category->data_index] = category;
+      dimension->presentation_leaves[category->presentation_index] = category;
+    }
+}
+
+static struct pivot_dimension *
+decode_spvlb_dimension (const struct pivot_table *table,
+                        const struct spvlb_dimension *in,
+                        size_t idx, const char *encoding)
+{
+  /* Convert most of the dimension. */
+  struct pivot_dimension *out = xzalloc (sizeof *out);
+  out->level = UINT_MAX;
+  out->top_index = idx;
+  out->hide_all_labels = in->props->hide_all_labels;
+
+  out->root = xzalloc (sizeof *out->root);
+  *out->root = (struct pivot_category) {
+    .name = decode_spvlb_value (table, in->name, encoding),
+    .dimension = out,
+    .data_index = SIZE_MAX,
+    .presentation_index = SIZE_MAX,
+  };
+  decode_spvlb_group (table, in->categories, in->n_categories,
+                      !in->props->hide_dim_label, out->root, out, encoding);
+
+  /* Allocate and fill the array of leaves now that we know how many there
+     are. */
+  out->data_leaves = xcalloc (out->n_leaves, sizeof *out->data_leaves);
+  out->presentation_leaves = xcalloc (out->n_leaves,
+                                      sizeof *out->presentation_leaves);
+  out->allocated_leaves = out->n_leaves;
+  fill_leaves (out->root, out);
+  for (size_t i = 0; i < out->n_leaves; i++)
+    {
+      assert (out->data_leaves[i] != NULL);
+      assert (out->presentation_leaves[i] != NULL);
+    }
+
+  return out;
+}
+
+static enum table_stroke
+decode_spvlb_stroke (uint32_t stroke_type)
+{
+  switch (stroke_type)
+    {
+    case 0: return TABLE_STROKE_NONE;
+    case 1: return TABLE_STROKE_SOLID;
+    case 2: return TABLE_STROKE_DASHED;
+    case 3: return TABLE_STROKE_THICK;
+    case 4: return TABLE_STROKE_THIN;
+    case 5: return TABLE_STROKE_DOUBLE;
+
+    default:
+      fprintf (stderr, "bad stroke %"PRIu32"\n", stroke_type);
+      exit (1);
+    }
+}
+
+static void
+decode_spvlb_border (const struct spvlb_border *in, struct pivot_table *table)
+
+{
+  if (in->border_type >= PIVOT_N_BORDERS)
+    {
+      fprintf (stderr, "bad border type %"PRIu32"\n", in->border_type);
+      exit (1);
+    }
+
+  struct table_border_style *out = &table->borders[in->border_type];
+  out->stroke = decode_spvlb_stroke (in->stroke_type);
+  out->color = decode_spvlb_color_u32 (in->color);
+}
+
+static void
+decode_spvlb_axis (const uint32_t *dimension_indexes, size_t n_dimensions,
+                   enum pivot_axis_type axis_type, struct pivot_table *table)
+{
+  struct pivot_axis *axis = &table->axes[axis_type];
+  axis->dimensions = xnmalloc (n_dimensions, sizeof *axis->dimensions);
+  axis->n_dimensions = n_dimensions;
+  axis->extent = 1;
+  for (size_t i = 0; i < n_dimensions; i++)
+    {
+      uint32_t idx = dimension_indexes[i];
+      if (idx >= table->n_dimensions)
+        {
+          fprintf (stderr, "bad dimension index %"PRIu32" >= %zu\n",
+                   idx, table->n_dimensions);
+          exit (1);
+        }
+
+      struct pivot_dimension *d = table->dimensions[idx];
+      if (d->level != UINT_MAX)
+        {
+          fprintf (stderr, "duplicate dimension %"PRIu32"\n", idx);
+          exit (1);
+        }
+
+      axis->dimensions[i] = d;
+      d->axis_type = axis_type;
+      d->level = i;
+
+      axis->extent *= d->n_leaves;
+    }
+}
+
+static void
+decode_data_index (uint64_t in, const struct pivot_table *table,
+                   size_t *out)
+{
+  uint64_t remainder = in;
+  for (size_t i = table->n_dimensions - 1; i > 0; i--)
+    {
+      const struct pivot_dimension *d = table->dimensions[i];
+      if (d->n_leaves)
+        {
+          out[i] = remainder % d->n_leaves;
+          remainder /= d->n_leaves;
+        }
+      else
+        out[i] = 0;
+    }
+  if (remainder >= table->dimensions[0]->n_leaves)
+    {
+      fprintf (stderr, "out of range cell data index %"PRIu64"\n", in);
+      exit (1);
+    }
+  out[0] = remainder;
+}
+
+static void
+decode_spvlb_cells (struct spvlb_cell **in, size_t n_in,
+                    struct pivot_table *table, const char *encoding)
+{
+  if (!table->n_dimensions)
+    return;
+
+  size_t *dindexes = xnmalloc (table->n_dimensions, sizeof *dindexes);
+  for (size_t i = 0; i < n_in; i++)
+    {
+      decode_data_index (in[i]->index, table, dindexes);
+      struct pivot_value *value = decode_spvlb_value (table, in[i]->value,
+                                                      encoding);
+      pivot_table_put (table, dindexes, table->n_dimensions, value);
+    }
+  free (dindexes);
+}
+
+static void
+decode_spvlb_footnote (const struct spvlb_footnote *in, const char *encoding,
+                       size_t idx, struct pivot_table *table)
+{
+  struct pivot_value *content = decode_spvlb_value (table, in->text, encoding);
+  struct pivot_value *marker = NULL;
+  if (in->marker)
+    {
+      marker = decode_spvlb_value (table, in->marker, encoding);
+      if (marker->type == PIVOT_VALUE_TEXT)
+        marker->text.user_provided = false;
+    }
+  pivot_table_create_footnote__ (table, idx, marker, content);
+}
+
+static void
+decode_current_layer (uint64_t current_layer, struct pivot_table *table)
+{
+  const struct pivot_axis *axis = &table->axes[PIVOT_AXIS_LAYER];
+  table->current_layer = xnmalloc (axis->n_dimensions,
+                                   sizeof *table->current_layer);
+
+  for (size_t i = 0; i < axis->n_dimensions; i++)
+    {
+      const struct pivot_dimension *d = axis->dimensions[i];
+      if (d->n_leaves)
+        {
+          table->current_layer[i] = current_layer % d->n_leaves;
+          current_layer /= d->n_leaves;
+        }
+      else
+        table->current_layer[i] = 0;
+    }
+  if (current_layer > 0)
+    {
+      fprintf (stderr, "out of range layer data index %"PRIu64"\n", current_layer);
+      exit (1);
+    }
+}
+
+char *
+decode_spvlb_table (const struct spvlb_table *in, struct pivot_table **outp)
+{
+  if (in->header->version != 1 && in->header->version != 3)
+    return xasprintf ("unknown version %"PRIu32" (expected 1 or 3)",
+                      in->header->version);
+
+  struct pivot_table *out = xzalloc (sizeof *out);
+  out->ref_cnt = 1;
+  hmap_init (&out->cells);
+
+  const struct spvlb_y1 *y1 = (in->formats->x0 ? in->formats->x0->y1
+                               : in->formats->x3 ? in->formats->x3->y1
+                               : NULL);
+  const char *encoding;
+  if (y1)
+    encoding = y1->charset;
+  else
+    {
+      const char *dot = strchr (in->formats->locale, '.');
+      encoding = dot ? dot + 1 : "windows-1252";
+    }
+
+  /* Display settings. */
+  out->show_numeric_markers = !in->ts->show_alphabetic_markers;
+  out->rotate_inner_column_labels = in->header->rotate_inner_column_labels;
+  out->rotate_outer_row_labels = in->header->rotate_outer_row_labels;
+  out->row_labels_in_corner = in->ts->show_row_labels_in_corner;
+  out->show_grid_lines = in->borders->show_grid_lines;
+  out->footnote_marker_superscripts = in->ts->footnote_marker_superscripts;
+  out->omit_empty = in->ts->omit_empty;
+
+  const struct spvlb_x1 *x1 = in->formats->x1;
+  if (x1)
+    {
+      out->show_values = decode_spvlb_value_show (x1->show_values);
+      out->show_variables = decode_spvlb_value_show (x1->show_variables);
+    }
+
+  /* Column and row display settings. */
+  out->sizing[TABLE_VERT].range[0] = in->header->min_row_height;
+  out->sizing[TABLE_VERT].range[1] = in->header->max_row_height;
+  out->sizing[TABLE_HORZ].range[0] = in->header->min_col_width;
+  out->sizing[TABLE_HORZ].range[1] = in->header->max_col_width;
+
+  convert_widths (in->formats->widths, in->formats->n_widths,
+                  &out->sizing[TABLE_HORZ].widths,
+                  &out->sizing[TABLE_HORZ].n_widths);
+
+  const struct spvlb_x2 *x2 = in->formats->x2;
+  if (x2)
+    convert_widths (x2->row_heights, x2->n_row_heights,
+                    &out->sizing[TABLE_VERT].widths,
+                    &out->sizing[TABLE_VERT].n_widths);
+
+  convert_breakpoints (in->ts->row_breaks,
+                       &out->sizing[TABLE_VERT].breaks,
+                       &out->sizing[TABLE_VERT].n_breaks);
+  convert_breakpoints (in->ts->col_breaks,
+                       &out->sizing[TABLE_HORZ].breaks,
+                       &out->sizing[TABLE_HORZ].n_breaks);
+
+  convert_keeps (in->ts->row_keeps,
+                 &out->sizing[TABLE_VERT].keeps,
+                 &out->sizing[TABLE_VERT].n_keeps);
+  convert_keeps (in->ts->col_keeps,
+                 &out->sizing[TABLE_HORZ].keeps,
+                 &out->sizing[TABLE_HORZ].n_keeps);
+
+  out->notes = to_utf8_if_nonempty (in->ts->notes, encoding);
+  out->table_look = to_utf8_if_nonempty (in->ts->table_look, encoding);
+
+  /* Print settings. */
+  out->print_all_layers = in->ps->all_layers;
+  out->paginate_layers = in->ps->paginate_layers;
+  out->shrink_to_fit[TABLE_HORZ] = in->ps->fit_width;
+  out->shrink_to_fit[TABLE_VERT] = in->ps->fit_length;
+  out->top_continuation = in->ps->top_continuation;
+  out->bottom_continuation = in->ps->bottom_continuation;
+  out->continuation = xstrdup (in->ps->continuation_string);
+  out->n_orphan_lines = in->ps->n_orphan_lines;
+
+  /* Format settings. */
+  out->epoch = in->formats->y0->epoch;
+  out->decimal = in->formats->y0->decimal;
+  out->grouping = in->formats->y0->grouping;
+  const struct spvlb_custom_currency *cc = in->formats->custom_currency;
+  for (int i = 0; i < 5; i++)
+    if (cc && i < cc->n_ccs)
+      out->ccs[i] = xstrdup (cc->ccs[i]);
+  out->small = in->formats->x3 ? in->formats->x3->small : 0;
+
+  /* Command information. */
+  if (y1)
+    {
+      out->command_local = to_utf8 (y1->command_local, encoding);
+      out->command_c = to_utf8 (y1->command, encoding);
+      out->language = xstrdup (y1->language);
+      /* charset? */
+      out->locale = xstrdup (y1->locale);
+    }
+
+  /* Source information. */
+  const struct spvlb_x3 *x3 = in->formats->x3;
+  if (x3)
+    {
+      if (x3->dataset && x3->dataset[0] && x3->dataset[0] != 4)
+        out->dataset = to_utf8 (x3->dataset, encoding);
+      out->datafile = to_utf8_if_nonempty (x3->datafile, encoding);
+      out->date = x3->date;
+    }
+
+  /* Footnotes.
+
+     Any pivot_value might refer to footnotes, so it's important to process the
+     footnotes early to ensure that those references can be resolved.  There is
+     a possible problem that a footnote might itself reference an
+     as-yet-unprocessed footnote, but that's OK because footnote references
+     don't actually look at the footnote contents but only resolve a pointer to
+     where the footnote will go later.
+
+     Before we really start, create all the footnotes we'll fill in.  This is
+     because sometimes footnotes refer to themselves or to each other and we
+     don't want to reject those references. */
+  const struct spvlb_footnotes *fn = in->footnotes;
+  if (fn->n_footnotes > 0)
+    {
+      pivot_table_create_footnote__ (out, fn->n_footnotes - 1, NULL, NULL);
+      for (size_t i = 0; i < fn->n_footnotes; i++)
+        decode_spvlb_footnote (in->footnotes->footnotes[i], encoding, i, out);
+    }
+
+  /* Title and caption. */
+  out->title = decode_spvlb_value (out, in->titles->user_title, encoding);
+  out->subtype = decode_spvlb_value (out, in->titles->subtype, encoding);
+  if (in->titles->corner_text)
+    out->corner_text = decode_spvlb_value (out, in->titles->corner_text,
+                                           encoding);
+  if (in->titles->caption)
+    out->caption = decode_spvlb_value (out, in->titles->caption, encoding);
+
+  /* Styles. */
+  for (size_t i = 0; i < PIVOT_N_AREAS; i++)
+    decode_spvlb_area (in->areas->areas[i], &out->areas[i], encoding);
+  for (size_t i = 0; i < PIVOT_N_BORDERS; i++)
+    decode_spvlb_border (in->borders->borders[i], out);
+
+  /* Dimensions. */
+  out->n_dimensions = in->dimensions->n_dims;
+  out->dimensions = xcalloc (out->n_dimensions, sizeof *out->dimensions);
+  for (size_t i = 0; i < out->n_dimensions; i++)
+    out->dimensions[i] = decode_spvlb_dimension (out, in->dimensions->dims[i],
+                                                 i, encoding);
+
+  /* Axes. */
+  size_t a = in->axes->n_layers;
+  size_t b = in->axes->n_rows;
+  size_t c = in->axes->n_columns;
+  if (size_overflow_p (xsum3 (a, b, c)) || a + b + c != out->n_dimensions)
+    {
+      fprintf (stderr, "wrong number of dimensions\n");
+      exit (1);
+    }
+  decode_spvlb_axis (in->axes->layers, in->axes->n_layers,
+                     PIVOT_AXIS_LAYER, out);
+  decode_spvlb_axis (in->axes->rows, in->axes->n_rows, PIVOT_AXIS_ROW, out);
+  decode_spvlb_axis (in->axes->columns, in->axes->n_columns,
+                     PIVOT_AXIS_COLUMN, out);
+
+  pivot_table_assign_label_depth (out);
+
+  decode_current_layer (in->ts->current_layer, out);
+
+  /* Data. */
+  decode_spvlb_cells (in->cells->cells, in->cells->n_cells, out, encoding);
+
+  *outp = out;
+  return NULL;
+}
diff --git a/src/output/spv/spv-light-decoder.h b/src/output/spv/spv-light-decoder.h

new file mode 100644

index 0000000..158707b
--- /dev/null
+++ b/src/output/spv/spv-light-decoder.h
@@ -0,0 +1,33 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_LIGHT_DECODER_H
+#define OUTPUT_SPV_LIGHT_DECODER_H 1
+
+/* SPSS Viewer (SPV) light binary decoder.
+
+   Used by spv.h, not useful directly. */
+
+#include "libpspp/compiler.h"
+
+struct pivot_table;
+struct spvlb_table;
+
+char *decode_spvlb_table (const struct spvlb_table *,
+                          struct pivot_table **outp)
+  WARN_UNUSED_RESULT;
+
+#endif /* output/spv/spv-light-decoder.h */
diff --git a/src/output/spv/spv-output.c b/src/output/spv/spv-output.c

new file mode 100644

index 0000000..f83ff45
--- /dev/null
+++ b/src/output/spv/spv-output.c
@@ -0,0 +1,51 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spv-output.h"
+
+#include "output/pivot-table.h"
+#include "output/spv/spv.h"
+#include "output/text-item.h"
+
+#include "gl/xalloc.h"
+
+void
+spv_text_submit (const struct spv_item *in)
+{
+  enum spv_item_class class = spv_item_get_class (in);
+  enum text_item_type type
+    = (class == SPV_CLASS_HEADINGS ? TEXT_ITEM_TITLE
+       : class == SPV_CLASS_PAGETITLE ? TEXT_ITEM_PAGE_TITLE
+       : TEXT_ITEM_LOG);
+  const struct pivot_value *value = spv_item_get_text (in);
+  char *text = pivot_value_to_string (value, SETTINGS_VALUE_SHOW_DEFAULT,
+                                      SETTINGS_VALUE_SHOW_DEFAULT);
+  struct text_item *item = text_item_create_nocopy (type, text);
+  const struct font_style *font = value->font_style;
+  if (font)
+    {
+      item->bold = font->bold;
+      item->italic = font->italic;
+      item->underline = font->underline;
+      item->markup = font->markup;
+      if (font->typeface)
+        item->typeface = xstrdup (font->typeface);
+      item->size = font->size;
+    }
+  text_item_submit (item);
+}
diff --git a/src/output/spv/spv-output.h b/src/output/spv/spv-output.h
new file mode 100644
index 0000000..b100c0c
--- /dev/null
+++ b/src/output/spv/spv-output.h
@@ -0,0 +1,26 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_OUTPUT_H
+#define OUTPUT_SPV_OUTPUT_H 1
+
+/* Interface between SPVs and the PSPP output engine. */
+
+struct spv_item;
+
+void spv_text_submit (const struct spv_item *);
+
+#endif /* output/spv/spv-output.h */
diff --git a/src/output/spv/spv-select.c b/src/output/spv/spv-select.c
new file mode 100644
index 0000000..b3beda8
--- /dev/null
+++ b/src/output/spv/spv-select.c
@@ -0,0 +1,217 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "spv-select.h"
+
+#include <string.h>
+
+#include "libpspp/assertion.h"
+#include "output/spv/spv.h"
+
+#include "gl/c-strcase.h"
+#include "gl/xalloc.h"
+
+static bool
+is_descendant (const struct spv_item *ancestor,
+               const struct spv_item *descendant)
+{
+  for (; descendant; descendant = descendant->parent)
+    if (descendant == ancestor)
+      return true;
+  return false;
+}
+
+static struct spv_item *
+find_command_item (struct spv_item *item)
+{
+  /* A command item itself does not have a command item. */
+  if (!item->parent || !item->parent->parent)
+    return NULL;
+
+  do
+    {
+      item = item->parent;
+    }
+  while (item->parent && item->parent->parent);
+  return item;
+}
+
+void
+spv_select (const struct spv_reader *spv, const struct spv_criteria *c,
+            struct spv_item ***itemsp, size_t *n_itemsp)
+{
+  size_t n_items = 0;
+  size_t allocated_items = 0;
+  struct spv_item **items = NULL;
+
+  struct spv_item **nth_command = xcalloc (c->n_commands, sizeof *nth_command);
+  const struct spv_item *root = spv_get_root (spv);
+  for (size_t i = 0; i < c->n_commands; i++)
+    {
+      const struct spv_command_match *cm = &c->commands[i];
+      if (cm->instance < 0)
+        {
+          for (size_t j = root->n_children; j--; )
+            {
+              struct spv_item *item = root->children[j];
+              if (item->command_id
+                  && (!cm->name || !c_strcasecmp (item->command_id, cm->name)))
+                {
+                  nth_command[i] = item;
+                  break;
+                }
+            }
+        }
+      else if (cm->instance > 0)
+        {
+          size_t n = 0;
+          for (size_t j = 0; j < root->n_children; j++)
+            {
+              struct spv_item *item = root->children[j];
+              if (item->command_id
+                  && (!cm->name || !c_strcasecmp (item->command_id, cm->name))
+                  && ++n == cm->instance)
+                {
+                  nth_command[i] = item;
+                  break;
+                }
+            }
+        }
+    }
+
+  struct spv_item *item;
+  struct spv_item *command_item = NULL;
+  int instance_within_command = 0;
+  bool included_as_last_instance = false;
+  SPV_ITEM_FOR_EACH_SKIP_ROOT (item, spv_get_root (spv))
+    {
+      if (!((1u << spv_item_get_class (item)) & c->classes))
+        continue;
+
+      if (!c->include_hidden && !spv_item_is_visible (item))
+        continue;
+
+      if (c->error)
+        {
+          spv_item_load (item);
+          if (!item->error)
+            continue;
+        }
+
+      if (c->commands)
+        {
+          const char *id = spv_item_get_command_id (item);
+          if (!id)
+            continue;
+
+          for (size_t i = 0; i < c->n_commands; i++)
+            {
+              const struct spv_command_match *cm = &c->commands[i];
+              if ((!cm->name || !c_strcasecmp (cm->name, id))
+                  && (!cm->instance
+                      || (nth_command[i]
+                          && is_descendant (nth_command[i], item))))
+                goto ok;
+            }
+          continue;
+        ok:;
+        }
+
+      if (!string_set_is_empty (&c->subtypes))
+        {
+          const char *subtype = spv_item_get_subtype (item);
+          if (!subtype || !string_set_contains (&c->subtypes, subtype))
+            continue;
+        }
+
+      if (c->n_labels)
+        {
+          const char *label = spv_item_get_label (item);
+          if (!label)
+            continue;
+
+          size_t label_len = strlen (label);
+          bool match = false;
+          for (size_t i = 0; !match && i < c->n_labels; i++)
+            {
+              const char *arg = c->labels[i].arg;
+              size_t arg_len = strlen (arg);
+              switch (c->labels[i].op)
+                {
+                case SPV_LABEL_MATCH_EQUALS:
+                  match = !strcmp (label, arg);
+                  break;
+                case SPV_LABEL_MATCH_CONTAINS:
+                  match = strstr (label, arg);
+                  break;
+                case SPV_LABEL_MATCH_STARTS:
+                  match = !strncmp (label, arg, arg_len);
+                  break;
+                case SPV_LABEL_MATCH_ENDS:
+                  match = (label_len >= arg_len
+                           && !memcmp (label + (label_len - arg_len), arg,
+                                       arg_len));
+                  break;
+                default:
+                  NOT_REACHED ();
+                }
+            }
+          if (!match)
+            continue;
+        }
+
+      if (c->n_instances)
+        {
+          struct spv_item *new_command_item = find_command_item (item);
+          if (new_command_item != command_item)
+            {
+              command_item = new_command_item;
+              instance_within_command = 0;
+              included_as_last_instance = false;
+            }
+          if (!command_item)
+            continue;
+          instance_within_command++;
+
+          bool include_last = false;
+          for (size_t i = 0; i < c->n_instances; i++)
+            if (instance_within_command == c->instances[i])
+              goto ok2;
+            else if (c->instances[i] == -1)
+              include_last = true;
+
+          if (!include_last)
+            continue;
+          if (included_as_last_instance)
+            n_items--;
+          else
+            included_as_last_instance = true;
+
+        ok2:;
+        }
+
+      if (n_items >= allocated_items)
+        items = x2nrealloc (items, &allocated_items, sizeof *items);
+      items[n_items++] = item;
+    }
+
+  free (nth_command);
+
+  *itemsp = items;
+  *n_itemsp = n_items;
+}
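
Review note on the label criteria in spv_select() above: the four match operators can be exercised in isolation. The following is a minimal standalone sketch, not part of the patch. The names `label_op` and `label_matches` are hypothetical stand-ins for `enum spv_label_match_op` and the inline `switch`, and the sketch assumes plain byte-wise, case-sensitive comparison as in the code above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for enum spv_label_match_op. */
enum label_op
  {
    LABEL_EQUALS,
    LABEL_CONTAINS,
    LABEL_STARTS,
    LABEL_ENDS,
  };

/* Returns true if LABEL satisfies operator OP with argument ARG.
   Comparison is byte-wise and case-sensitive. */
static bool
label_matches (enum label_op op, const char *label, const char *arg)
{
  assert (label != NULL && arg != NULL);

  size_t label_len = strlen (label);
  size_t arg_len = strlen (arg);
  switch (op)
    {
    case LABEL_EQUALS:
      return !strcmp (label, arg);
    case LABEL_CONTAINS:
      return strstr (label, arg) != NULL;
    case LABEL_STARTS:
      return !strncmp (label, arg, arg_len);
    case LABEL_ENDS:
      /* Compare only the last ARG_LEN bytes, and only if LABEL is long
         enough to have that many. */
      return (label_len >= arg_len
              && !memcmp (label + (label_len - arg_len), arg, arg_len));
    }
  return false;
}
```

Under these assumptions an empty argument matches every label for the "contains", "starts", and "ends" operators, which is also how the loop in spv_select() appears to behave.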
diff --git a/src/output/spv/spv-select.h b/src/output/spv/spv-select.h
new file mode 100644
index 0000000..f0e42c5
--- /dev/null
+++ b/src/output/spv/spv-select.h
@@ -0,0 +1,75 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_SELECT_H
+#define OUTPUT_SPV_SELECT_H 1
+
+#include "libpspp/string-set.h"
+
+struct spv_item;
+struct spv_reader;
+
+struct spv_criteria
+  {
+    bool include_hidden;
+
+#define SPV_ALL_CLASSES ((1u << SPV_N_CLASSES) - 1)
+    unsigned int classes;
+
+    struct spv_command_match *commands;
+    size_t n_commands;
+
+    struct string_set subtypes;
+
+    struct spv_label_match *labels;
+    size_t n_labels;
+
+    int *instances;
+    size_t n_instances;
+
+    bool error;
+  };
+
+#define SPV_CRITERIA_INITIALIZER(CRITERIA)                      \
+  {                                                             \
+    .classes = SPV_ALL_CLASSES,                                 \
+    .subtypes = STRING_SET_INITIALIZER (CRITERIA.subtypes),     \
+  }
+
+struct spv_command_match
+  {
+    char *name;
+    int instance;
+  };
+
+enum spv_label_match_op
+  {
+    SPV_LABEL_MATCH_EQUALS,
+    SPV_LABEL_MATCH_CONTAINS,
+    SPV_LABEL_MATCH_STARTS,
+    SPV_LABEL_MATCH_ENDS,
+  };
+
+struct spv_label_match
+  {
+    enum spv_label_match_op op;
+    char *arg;
+  };
+
+void spv_select (const struct spv_reader *, const struct spv_criteria *,
+                 struct spv_item ***items, size_t *n_items);
+
+#endif /* output/spv/spv-select.h */
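
Review note on the `instances` field of struct spv_criteria: in spv_select() each entry is a 1-based position within a command, and -1 stands for the last item of each command. The standalone sketch below is not part of the patch and shows one way to implement that selection in a single pass. The function `select_instances` and its array-of-group-ids input are hypothetical. It also drops a tentative "last" entry when an exact match follows it, which is an assumption about the intended behavior rather than a transcription of spv_select().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Selects elements of GROUPS[0..N) by 1-based position within each run of
   equal group ids.  INSTANCES lists the wanted positions; -1 means "the
   last element of each run".  Stores the indexes of the selected elements
   in OUT, which must have room for N entries, and returns their count. */
static size_t
select_instances (const int *groups, size_t n,
                  const int *instances, size_t n_instances,
                  size_t *out)
{
  assert (out != NULL);

  size_t n_out = 0;
  int position = 0;             /* 1-based position within current run. */
  bool tentative_last = false;  /* Is OUT's top entry a provisional "last"? */
  for (size_t i = 0; i < n; i++)
    {
      if (i == 0 || groups[i] != groups[i - 1])
        {
          /* New run: any provisional entry really was the last one. */
          position = 0;
          tentative_last = false;
        }
      position++;

      bool exact = false;
      bool want_last = false;
      for (size_t j = 0; j < n_instances; j++)
        if (instances[j] == position)
          exact = true;
        else if (instances[j] == -1)
          want_last = true;

      if (!exact && !want_last)
        continue;

      /* The provisional entry is no longer the last of its run. */
      if (tentative_last)
        n_out--;
      tentative_last = !exact;
      out[n_out++] = i;
    }
  return n_out;
}
```

For example, with group ids {7, 7, 7, 9, 9} and instances {1, -1}, the sketch selects indexes 0 and 2 from the first run and 3 and 4 from the second.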
diff --git a/src/output/spv/spv-writer.c b/src/output/spv/spv-writer.c
new file mode 100644
index 0000000..0701a52
--- /dev/null
+++ b/src/output/spv/spv-writer.c
@@ -0,0 +1,1019 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spv-writer.h"
+
+#include <inttypes.h>
+#include <libxml/xmlwriter.h>
+#include <math.h>
+#include <stdlib.h>
+#include <time.h>
+
+#include "libpspp/array.h"
+#include "libpspp/assertion.h"
+#include "libpspp/cast.h"
+#include "libpspp/float-format.h"
+#include "libpspp/integer-format.h"
+#include "libpspp/temp-file.h"
+#include "libpspp/version.h"
+#include "libpspp/zip-writer.h"
+#include "output/page-setup-item.h"
+#include "output/pivot-table.h"
+#include "output/text-item.h"
+
+#include "gl/xalloc.h"
+#include "gl/xvasprintf.h"
+
+#include "gettext.h"
+#define _(msgid) gettext (msgid)
+#define N_(msgid) (msgid)
+
+struct spv_writer
+  {
+    struct zip_writer *zw;
+
+    FILE *heading;
+    int heading_depth;
+    xmlTextWriter *xml;
+
+    int n_tables;
+
+    int n_headings;
+    struct page_setup *page_setup;
+    bool need_page_break;
+  };
+
+char * WARN_UNUSED_RESULT
+spv_writer_open (const char *filename, struct spv_writer **writerp)
+{
+  *writerp = NULL;
+
+  struct zip_writer *zw = zip_writer_create (filename);
+  if (!zw)
+    return xasprintf (_("%s: create failed"), filename);
+
+  struct spv_writer *w = xmalloc (sizeof *w);
+  *w = (struct spv_writer) { .zw = zw };
+  *writerp = w;
+  return NULL;
+}
+
+char * WARN_UNUSED_RESULT
+spv_writer_close (struct spv_writer *w)
+{
+  if (!w)
+    return NULL;
+
+  while (w->heading_depth)
+    spv_writer_close_heading (w);
+
+  zip_writer_add_string (w->zw, "META-INF/MANIFEST.MF", "allowPivoting=true");
+
+  char *error = NULL;
+  if (!zip_writer_close (w->zw))
+    error = xstrdup (_("I/O error writing SPV file"));
+
+  page_setup_destroy (w->page_setup);
+  free (w);
+  return error;
+}
+
+void
+spv_writer_set_page_setup (struct spv_writer *w,
+                           const struct page_setup *page_setup)
+{
+  page_setup_destroy (w->page_setup);
+  w->page_setup = page_setup_clone (page_setup);
+}
+
+static void
+write_attr (struct spv_writer *w, const char *name, const char *value)
+{
+  xmlTextWriterWriteAttribute (w->xml,
+                               CHAR_CAST (xmlChar *, name),
+                               CHAR_CAST (xmlChar *, value));
+}
+
+static void PRINTF_FORMAT (3, 4)
+write_attr_format (struct spv_writer *w, const char *name,
+                   const char *format, ...)
+{
+  va_list args;
+  va_start (args, format);
+  char *value = xvasprintf (format, args);
+  va_end (args);
+
+  write_attr (w, name, value);
+  free (value);
+}
+
+static void
+start_elem (struct spv_writer *w, const char *name)
+{
+  xmlTextWriterStartElement (w->xml, CHAR_CAST (xmlChar *, name));
+}
+
+static void
+end_elem (struct spv_writer *w)
+{
+  xmlTextWriterEndElement (w->xml);
+}
+
+static void
+write_text (struct spv_writer *w, const char *text)
+{
+  xmlTextWriterWriteString (w->xml, CHAR_CAST (xmlChar *, text));
+}
+
+static void
+write_page_heading (struct spv_writer *w, const struct page_heading *h,
+                    const char *name)
+{
+  start_elem (w, name);
+  if (h->n)
+    {
+      start_elem (w, "pageParagraph");
+      for (size_t i = 0; i < h->n; i++)
+        {
+          start_elem (w, "text");
+          write_attr (w, "type", "title");
+          write_text (w, h->paragraphs[i].markup); /* XXX */
+          end_elem (w);
+        }
+      end_elem (w);
+    }
+  end_elem (w);
+}
+
+static void
+write_page_setup (struct spv_writer *w, const struct page_setup *ps)
+{
+  start_elem (w, "pageSetup");
+  write_attr_format (w, "initial-page-number", "%d", ps->initial_page_number);
+  write_attr (w, "chart-size",
+              (ps->chart_size == PAGE_CHART_AS_IS ? "as-is"
+               : ps->chart_size == PAGE_CHART_FULL_HEIGHT ? "full-height"
+               : ps->chart_size == PAGE_CHART_HALF_HEIGHT ? "half-height"
+               : "quarter-height"));
+  write_attr_format (w, "margin-left", "%.2fin", ps->margins[TABLE_HORZ][0]);
+  write_attr_format (w, "margin-right", "%.2fin", ps->margins[TABLE_HORZ][1]);
+  write_attr_format (w, "margin-top", "%.2fin", ps->margins[TABLE_VERT][0]);
+  write_attr_format (w, "margin-bottom", "%.2fin", ps->margins[TABLE_VERT][1]);
+  write_attr_format (w, "paper-height", "%.2fin", ps->paper[TABLE_VERT]);
+  write_attr_format (w, "paper-width", "%.2fin", ps->paper[TABLE_HORZ]);
+  write_attr (w, "reference-orientation",
+              ps->orientation == PAGE_PORTRAIT ? "portrait" : "landscape");
+  write_attr_format (w, "space-after", "%.1fpt", ps->object_spacing * 72.0);
+  write_page_heading (w, &ps->headings[0], "pageHeader");
+  write_page_heading (w, &ps->headings[1], "pageFooter");
+  end_elem (w);
+}
+
+static bool
+spv_writer_open_file (struct spv_writer *w)
+{
+  w->heading = create_temp_file ();
+  if (!w->heading)
+    return false;
+
+  w->xml = xmlNewTextWriter (xmlOutputBufferCreateFile (w->heading, NULL));
+  xmlTextWriterStartDocument (w->xml, NULL, "UTF-8", NULL);
+  start_elem (w, "heading");
+
+  time_t t = time (NULL);
+  struct tm *tm = gmtime (&t);
+  char *tm_s = asctime (tm);
+  write_attr (w, "creation-date-time", tm_s);
+
+  write_attr (w, "creator", version);
+
+  write_attr (w, "creator-version", "21");
+
+  write_attr (w, "xmlns", "http://xml.spss.com/spss/viewer/viewer-tree");
+  write_attr (w, "xmlns:vps", "http://xml.spss.com/spss/viewer/viewer-pagesetup");
+  write_attr (w, "xmlns:vtx", "http://xml.spss.com/spss/viewer/viewer-text");
+  write_attr (w, "xmlns:vtb", "http://xml.spss.com/spss/viewer/viewer-table");
+
+  start_elem (w, "label");
+  write_text (w, _("Output"));
+  end_elem (w);
+
+  if (w->page_setup)
+    {
+      write_page_setup (w, w->page_setup);
+
+      page_setup_destroy (w->page_setup);
+      w->page_setup = NULL;
+    }
+
+  return true;
+}
+
+void
+spv_writer_open_heading (struct spv_writer *w, const char *command_id,
+                         const char *label)
+{
+  if (!w->heading)
+    {
+      if (!spv_writer_open_file (w))
+        return;
+    }
+
+  w->heading_depth++;
+  start_elem (w, "heading");
+  write_attr (w, "commandName", command_id);
+  /* XXX locale */
+  /* XXX olang */
+
+  start_elem (w, "label");
+  write_text (w, label);
+  end_elem (w);
+}
+
+static void
+spv_writer_close_file (struct spv_writer *w, const char *infix)
+{
+  if (!w->heading)
+    return;
+
+  end_elem (w);
+  xmlTextWriterEndDocument (w->xml);
+  xmlFreeTextWriter (w->xml);
+
+  char *member_name = xasprintf ("outputViewer%010d%s.xml",
+                                 w->n_headings++, infix);
+  zip_writer_add (w->zw, w->heading, member_name);
+  free (member_name);
+
+  w->heading = NULL;
+}
+
+void
+spv_writer_close_heading (struct spv_writer *w)
+{
+  const char *infix = "";
+  if (w->heading_depth)
+    {
+      infix = "_heading";
+      end_elem (w);
+      w->heading_depth--;
+    }
+
+  if (!w->heading_depth)
+    spv_writer_close_file (w, infix);
+}
+
+static void
+start_container (struct spv_writer *w)
+{
+  start_elem (w, "container");
+  write_attr (w, "visibility", "visible");
+  if (w->need_page_break)
+    {
+      write_attr (w, "page-break-before", "always");
+      w->need_page_break = false;
+    }
+}
+
+void
+spv_writer_put_text (struct spv_writer *w, const struct text_item *text)
+{
+  if (text->type == TEXT_ITEM_EJECT_PAGE)
+    w->need_page_break = true;
+
+  bool initial_depth = w->heading_depth;
+  if (!initial_depth && !spv_writer_open_file (w))
+    return;
+
+  start_container (w);
+
+  start_elem (w, "label");
+  write_text (w, (text->type == TEXT_ITEM_TITLE ? "Title"
+                  : text->type == TEXT_ITEM_PAGE_TITLE ? "Page Title"
+                  : "Log"));
+  end_elem (w);
+
+  start_elem (w, "vtx:text");
+  write_attr (w, "type", (text->type == TEXT_ITEM_TITLE ? "title"
+                          : text->type == TEXT_ITEM_PAGE_TITLE ? "page-title"
+                          : "log"));
+
+  start_elem (w, "html");
+  write_text (w, text->text);   /* XXX */
+  end_elem (w); /* html */
+  end_elem (w); /* vtx:text */
+  end_elem (w); /* container */
+
+  if (!initial_depth)
+    spv_writer_close_file (w, "");
+}
+
+#define H TABLE_HORZ
+#define V TABLE_VERT
+
+struct buf
+  {
+    uint8_t *data;
+    size_t len;
+    size_t allocated;
+  };
+
+static uint8_t *
+put_uninit (struct buf *b, size_t n)
+{
+  while (b->allocated - b->len < n)
+    b->data = x2nrealloc (b->data, &b->allocated, sizeof *b->data);
+  uint8_t *p = &b->data[b->len];
+  b->len += n;
+  return p;
+}
+
+static void
+put_byte (struct buf *b, uint8_t byte)
+{
+  *put_uninit (b, 1) = byte;
+}
+
+static void
+put_bool (struct buf *b, bool boolean)
+{
+  put_byte (b, boolean);
+}
+
+static void
+put_bytes (struct buf *b, const char *bytes, size_t n)
+{
+  memcpy (put_uninit (b, n), bytes, n);
+}
+
+static void
+put_u16 (struct buf *b, uint16_t x)
+{
+  put_uint16 (native_to_le16 (x), put_uninit (b, sizeof x));
+}
+
+static void
+put_u32 (struct buf *b, uint32_t x)
+{
+  put_uint32 (native_to_le32 (x), put_uninit (b, sizeof x));
+}
+
+static void
+put_u64 (struct buf *b, uint64_t x)
+{
+  put_uint64 (native_to_le64 (x), put_uninit (b, sizeof x));
+}
+
+static void
+put_be32 (struct buf *b, uint32_t x)
+{
+  put_uint32 (native_to_be32 (x), put_uninit (b, sizeof x));
+}
+
+static void
+put_double (struct buf *b, double x)
+{
+  float_convert (FLOAT_NATIVE_DOUBLE, &x,
+                 FLOAT_IEEE_DOUBLE_LE, put_uninit (b, 8));
+}
+
+static void
+put_float (struct buf *b, float x)
+{
+  float_convert (FLOAT_NATIVE_FLOAT, &x,
+                 FLOAT_IEEE_SINGLE_LE, put_uninit (b, 4));
+}
+
+static void
+put_string (struct buf *b, const char *s_)
+{
+  const char *s = s_ ? s_ : "";
+  size_t len = strlen (s);
+  put_u32 (b, len);
+  memcpy (put_uninit (b, len), s, len);
+}
+
+static void
+put_bestring (struct buf *b, const char *s_)
+{
+  const char *s = s_ ? s_ : "";
+  size_t len = strlen (s);
+  put_be32 (b, len);
+  memcpy (put_uninit (b, len), s, len);
+}
+
+static size_t
+start_count (struct buf *b)
+{
+  put_u32 (b, 0);
+  return b->len;
+}
+
+static void
+end_count_u32 (struct buf *b, size_t start)
+{
+  put_uint32 (native_to_le32 (b->len - start), &b->data[start - 4]);
+}
+
+static void
+end_count_be32 (struct buf *b, size_t start)
+{
+  put_uint32 (native_to_be32 (b->len - start), &b->data[start - 4]);
+}
+
+static void
+put_color (struct buf *buf, const struct cell_color *color)
+{
+  char *s = xasprintf ("#%02"PRIx8"%02"PRIx8"%02"PRIx8,
+                       color->r, color->g, color->b);
+  put_string (buf, s);
+  free (s);
+}
+
+static void
+put_font_style (struct buf *buf, const struct font_style *font_style)
+{
+  put_bool (buf, font_style->bold);
+  put_bool (buf, font_style->italic);
+  put_bool (buf, font_style->underline);
+  put_bool (buf, 1);
+  put_color (buf, &font_style->fg[0]);
+  put_color (buf, &font_style->bg[0]);
+  put_string (buf, font_style->typeface ? font_style->typeface : "SansSerif");
+  put_byte (buf, ceil (font_style->size * 1.33));
+}
+
+static void
+put_halign (struct buf *buf, enum table_halign halign,
+            uint32_t mixed, uint32_t decimal)
+{
+  put_u32 (buf, (halign == TABLE_HALIGN_RIGHT ? 4
+                 : halign == TABLE_HALIGN_LEFT ? 2
+                 : halign == TABLE_HALIGN_CENTER ? 0
+                 : halign == TABLE_HALIGN_MIXED ? mixed
+                 : decimal));
+}
+
+static void
+put_valign (struct buf *buf, enum table_valign valign)
+{
+  put_u32 (buf, (valign == TABLE_VALIGN_TOP ? 1
+                 : valign == TABLE_VALIGN_CENTER ? 0
+                 : 3));
+}
+
+static void
+put_cell_style (struct buf *buf, const struct cell_style *cell_style)
+{
+  put_halign (buf, cell_style->halign, 0xffffffad, 6);
+  put_valign (buf, cell_style->valign);
+  put_double (buf, cell_style->decimal_offset);
+  put_u16 (buf, cell_style->margin[H][0]);
+  put_u16 (buf, cell_style->margin[H][1]);
+  put_u16 (buf, cell_style->margin[V][0]);
+  put_u16 (buf, cell_style->margin[V][1]);
+}
+
+static void UNUSED
+put_style_pair (struct buf *buf, const struct font_style *font_style,
+                const struct cell_style *cell_style)
+{
+  if (font_style)
+    {
+      put_byte (buf, 0x31);
+      put_font_style (buf, font_style);
+    }
+  else
+    put_byte (buf, 0x58);
+
+  if (cell_style)
+    {
+      put_byte (buf, 0x31);
+      put_cell_style (buf, cell_style);
+    }
+  else
+    put_byte (buf, 0x58);
+}
+
+static void
+put_value_mod (struct buf *buf, const struct pivot_value *value,
+               const char *template)
+{
+  if (value->n_footnotes || value->subscript
+      || template || value->font_style || value->cell_style)
+    {
+      put_byte (buf, 0x31);
+
+      /* Footnotes. */
+      put_u32 (buf, value->n_footnotes);
+      for (size_t i = 0; i < value->n_footnotes; i++)
+        put_u16 (buf, value->footnotes[i]->idx);
+
+      if (value->subscript)
+        {
+          put_u32 (buf, 1);
+          put_string (buf, value->subscript);
+        }
+      else
+        put_u32 (buf, 0);
+
+      /* Template and style. */
+      uint32_t v3_start = start_count (buf);
+      uint32_t template_string_start = start_count (buf);
+      if (template)
+        {
+          uint32_t inner_start = start_count (buf);
+          end_count_u32 (buf, inner_start);
+
+          put_byte (buf, 0x31);
+          put_string (buf, template);
+        }
+      end_count_u32 (buf, template_string_start);
+      put_style_pair (buf, value->font_style, value->cell_style);
+      end_count_u32 (buf, v3_start);
+    }
+  else
+    put_byte (buf, 0x58);
+}
+
+static void
+put_format (struct buf *buf, const struct fmt_spec *f)
+{
+  put_u32 (buf, (fmt_to_io (f->type) << 16) | (f->w << 8) | f->d);
+}
+
+static int
+show_values_to_spvlb (enum settings_value_show show)
+{
+  return (show == SETTINGS_VALUE_SHOW_DEFAULT ? 0
+          : show == SETTINGS_VALUE_SHOW_VALUE ? 1
+          : show == SETTINGS_VALUE_SHOW_LABEL ? 2
+          : 3);
+}
+
+static void
+put_show_values (struct buf *buf, enum settings_value_show show)
+{
+  put_byte (buf, show_values_to_spvlb (show));
+}
+
+static void
+put_value (struct buf *buf, const struct pivot_value *value)
+{
+  switch (value->type)
+    {
+    case PIVOT_VALUE_NUMERIC:
+      if (value->numeric.var_name || value->numeric.value_label)
+        {
+          put_byte (buf, 2);
+          put_value_mod (buf, value, NULL);
+          put_format (buf, &value->numeric.format);
+          put_double (buf, value->numeric.x);
+          put_string (buf, value->numeric.var_name);
+          put_string (buf, value->numeric.value_label);
+          put_show_values (buf, value->numeric.show);
+        }
+      else
+        {
+          put_byte (buf, 1);
+          put_value_mod (buf, value, NULL);
+          put_format (buf, &value->numeric.format);
+          put_double (buf, value->numeric.x);
+        }
+      break;
+
+    case PIVOT_VALUE_STRING:
+      put_byte (buf, 4);
+      put_value_mod (buf, value, NULL);
+      put_format (buf,
+                  &(struct fmt_spec) { FMT_A, strlen (value->string.s), 0 });
+      put_string (buf, value->string.value_label);
+      put_string (buf, value->string.var_name);
+      put_show_values (buf, value->string.show);
+      put_string (buf, value->string.s);
+      break;
+
+    case PIVOT_VALUE_VARIABLE:
+      put_byte (buf, 5);
+      put_value_mod (buf, value, NULL);
+      put_string (buf, value->variable.var_name);
+      put_string (buf, value->variable.var_label);
+      put_show_values (buf, value->variable.show);
+      break;
+
+    case PIVOT_VALUE_TEXT:
+      put_byte (buf, 3);
+      put_string (buf, value->text.local);
+      put_value_mod (buf, value, NULL);
+      put_string (buf, value->text.id);
+      put_string (buf, value->text.c);
+      put_byte (buf, 1);        /* XXX user-provided */
+      break;
+
+    case PIVOT_VALUE_TEMPLATE:
+      put_byte (buf, 0);
+      put_value_mod (buf, value, value->template.id);
+      put_string (buf, value->template.local);
+      put_u32 (buf, value->template.n_args);
+      for (size_t i = 0; i < value->template.n_args; i++)
+        {
+          const struct pivot_argument *arg = &value->template.args[i];
+          assert (arg->n >= 1);
+          if (arg->n > 1)
+            {
+              put_u32 (buf, arg->n);
+              put_u32 (buf, 0);
+              for (size_t j = 0; j < arg->n; j++)
+                {
+                  if (j > 0)
+                    put_bytes (buf, "\0\0\0\0", 4);
+                  put_value (buf, arg->values[j]);
+                }
+            }
+          else
+            {
+              put_u32 (buf, 0);
+              put_value (buf, arg->values[0]);
+            }
+        }
+      break;
+
+    default:
+      NOT_REACHED ();
+    }
+}
+
+static void
+put_optional_value (struct buf *buf, const struct pivot_value *value)
+{
+  if (value)
+    {
+      put_byte (buf, 0x31);
+      put_value (buf, value);
+    }
+  else
+    put_byte (buf, 0x58);
+}
+
+static void
+put_category (struct buf *buf, const struct pivot_category *c)
+{
+  put_value (buf, c->name);
+  if (pivot_category_is_leaf (c))
+    {
+      put_bytes (buf, "\0\0\0", 3);
+      put_u32 (buf, 2);
+      put_u32 (buf, c->data_index);
+      put_u32 (buf, 0);
+    }
+  else
+    {
+      put_bytes (buf, "\0\0\1", 3);
+      put_u32 (buf, 0);
+      put_u32 (buf, -1);
+      put_u32 (buf, c->n_subs);
+      for (size_t i = 0; i < c->n_subs; i++)
+        put_category (buf, c->subs[i]);
+    }
+}
+
+static void
+put_y0 (struct buf *buf, const struct pivot_table *table)
+{
+  put_u32 (buf, table->epoch);
+  put_byte (buf, table->decimal);
+  put_byte (buf, table->grouping);
+}
+
+static void
+put_custom_currency (struct buf *buf, const struct pivot_table *table)
+{
+  put_u32 (buf, 5);
+  for (int i = 0; i < 5; i++)
+    put_string (buf, table->ccs[i]);
+}
+
+static void
+put_x1 (struct buf *buf, const struct pivot_table *table)
+{
+  put_bytes (buf, "\0\1\0", 3);
+  put_byte (buf, 0);
+  put_show_values (buf, table->show_variables);
+  put_show_values (buf, table->show_values);
+  put_u32 (buf, -1);
+  put_u32 (buf, -1);
+  for (int i = 0; i < 17; i++)
+    put_byte (buf, 0);
+  put_bool (buf, false);
+  put_byte (buf, 1);
+}
+
+static void
+put_x2 (struct buf *buf)
+{
+  put_u32 (buf, 0);             /* n-row-heights */
+  put_u32 (buf, 0);             /* n-style-map */
+  put_u32 (buf, 0);             /* n-styles */
+  put_u32 (buf, 0);
+}
+
+static void
+put_x3 (struct buf *buf, const struct pivot_table *table)
+{
+  put_bytes (buf, "\1\0\4\0\0\0", 6);
+  put_string (buf, table->command_c);
+  put_string (buf, table->command_local);
+  put_string (buf, table->language);
+  put_string (buf, "UTF-8");    /* XXX */
+  put_string (buf, table->locale);
+  put_bytes (buf, "\0\0\1\1", 4);
+  put_y0 (buf, table);
+  put_double (buf, table->small);
+  put_byte (buf, 1);
+  put_string (buf, table->dataset);
+  put_string (buf, table->datafile);
+  put_u32 (buf, 0);
+  put_u32 (buf, table->date);
+  put_u32 (buf, 0);
+
+  /* Y2. */
+  put_custom_currency (buf, table);
+  put_byte (buf, '.');
+  put_bool (buf, 0);
+}
+
+static void
+put_light_table (struct buf *buf, uint64_t table_id,
+                 const struct pivot_table *table)
+{
+  /* Header. */
+  put_bytes (buf, "\1\0", 2);
+  put_u32 (buf, 3);
+  put_bool (buf, true);
+  put_bool (buf, false);
+  put_bool (buf, table->rotate_inner_column_labels);
+  put_bool (buf, table->rotate_outer_row_labels);
+  put_bool (buf, true);
+  put_u32 (buf, 0x15);
+  put_u32 (buf, table->sizing[H].range[0]);
+  put_u32 (buf, table->sizing[H].range[1]);
+  put_u32 (buf, table->sizing[V].range[0]);
+  put_u32 (buf, table->sizing[V].range[1]);
+  put_u64 (buf, table_id);
+
+  /* Titles. */
+  put_value (buf, table->title);
+  put_value (buf, table->subtype);
+  put_optional_value (buf, table->title);
+  put_optional_value (buf, table->corner_text);
+  put_optional_value (buf, table->caption);
+
+  /* Footnotes. */
+  put_u32 (buf, table->n_footnotes);
+  for (size_t i = 0; i < table->n_footnotes; i++)
+    {
+      put_value (buf, table->footnotes[i]->content);
+      put_optional_value (buf, table->footnotes[i]->marker);
+      put_u32 (buf, 0);
+    }
+
+  /* Areas. */
+  for (size_t i = 0; i < PIVOT_N_AREAS; i++)
+    {
+      const struct area_style *a = &table->areas[i];
+      put_byte (buf, i + 1);
+      put_byte (buf, 0x31);
+      put_string (buf, (a->font_style.typeface
+                        ? a->font_style.typeface
+                        : "SansSerif"));
+      put_float (buf, ceil (a->font_style.size * 1.33));
+      put_u32 (buf, ((a->font_style.bold ? 1 : 0)
+                     | (a->font_style.italic ? 2 : 0)));
+      put_bool (buf, a->font_style.underline);
+      put_halign (buf, a->cell_style.halign, 64173, 61453);
+      put_valign (buf, a->cell_style.valign);
+
+      put_color (buf, &a->font_style.fg[0]);
+      put_color (buf, &a->font_style.bg[0]);
+
+      bool alt
+        = (!cell_color_equal (&a->font_style.fg[0], &a->font_style.fg[1])
+           || !cell_color_equal (&a->font_style.bg[0], &a->font_style.bg[1]));
+      put_bool (buf, alt);
+      if (alt)
+        {
+          put_color (buf, &a->font_style.fg[1]);
+          put_color (buf, &a->font_style.bg[1]);
+        }
+      else
+        {
+          put_string (buf, "");
+          put_string (buf, "");
+        }
+
+      put_u32 (buf, a->cell_style.margin[H][0]);
+      put_u32 (buf, a->cell_style.margin[H][1]);
+      put_u32 (buf, a->cell_style.margin[V][0]);
+      put_u32 (buf, a->cell_style.margin[V][1]);
+    }
+
+  /* Borders. */
+  uint32_t borders_start = start_count (buf);
+  put_be32 (buf, 1);
+  put_be32 (buf, PIVOT_N_BORDERS);
+  for (size_t i = 0; i < PIVOT_N_BORDERS; i++)
+    {
+      const struct table_border_style *b = &table->borders[i];
+      put_be32 (buf, i);
+      put_be32 (buf, (b->stroke == TABLE_STROKE_NONE ? 0
+                      : b->stroke == TABLE_STROKE_SOLID ? 1
+                      : b->stroke == TABLE_STROKE_DASHED ? 2
+                      : b->stroke == TABLE_STROKE_THICK ? 3
+                      : b->stroke == TABLE_STROKE_THIN ? 4
+                      : 5));
+      put_be32 (buf, ((b->color.alpha << 24)
+                      | (b->color.r << 16)
+                      | (b->color.g << 8)
+                      | b->color.b));
+    }
+  put_bool (buf, table->show_grid_lines);
+  put_bytes (buf, "\0\0\0", 3);
+  end_count_u32 (buf, borders_start);
+
+  /* Print Settings. */
+  uint32_t ps_start = start_count (buf);
+  put_be32 (buf, 1);
+  put_bool (buf, table->print_all_layers);
+  put_bool (buf, table->paginate_layers);
+  put_bool (buf, table->shrink_to_fit[H]);
+  put_bool (buf, table->shrink_to_fit[V]);
+  put_bool (buf, table->top_continuation);
+  put_bool (buf, table->bottom_continuation);
+  put_be32 (buf, table->n_orphan_lines);
+  put_bestring (buf, table->continuation);
+  end_count_u32 (buf, ps_start);
+
+  /* Table Settings. */
+  uint32_t ts_start = start_count (buf);
+  put_be32 (buf, 1);
+  put_be32 (buf, 4);
+  put_be32 (buf, 0);            /* XXX current_layer */
+  put_bool (buf, table->omit_empty);
+  put_bool (buf, table->row_labels_in_corner);
+  put_bool (buf, !table->show_numeric_markers);
+  put_bool (buf, table->footnote_marker_superscripts);
+  put_byte (buf, 0);
+  uint32_t keep_start = start_count (buf);
+  put_be32 (buf, 0);            /* n-row-breaks */
+  put_be32 (buf, 0);            /* n-column-breaks */
+  put_be32 (buf, 0);            /* n-row-keeps */
+  put_be32 (buf, 0);            /* n-column-keeps */
+  put_be32 (buf, 0);            /* n-row-point-keeps */
+  put_be32 (buf, 0);            /* n-column-point-keeps */
+  end_count_be32 (buf, keep_start);
+  put_bestring (buf, table->notes);
+  put_bestring (buf, table->table_look);
+  for (size_t i = 0; i < 82; i++)
+    put_byte (buf, 0);
+  end_count_u32 (buf, ts_start);
+
+  /* Formats. */
+  put_u32 (buf, 0);             /* n-widths */
+  put_string (buf, "en_US.ISO_8859-1:1987"); /* XXX */
+  put_u32 (buf, 0);                /* XXX current-layer */
+  put_bool (buf, 0);
+  put_bool (buf, 0);
+  put_bool (buf, 1);
+  put_y0 (buf, table);
+  put_custom_currency (buf, table);
+  uint32_t formats_start = start_count (buf);
+  uint32_t x1_start = start_count (buf);
+  put_x1 (buf, table);
+  uint32_t x2_start = start_count (buf);
+  put_x2 (buf);
+  end_count_u32 (buf, x2_start);
+  end_count_u32 (buf, x1_start);
+  uint32_t x3_start = start_count (buf);
+  put_x3 (buf, table);
+  end_count_u32 (buf, x3_start);
+  end_count_u32 (buf, formats_start);
+
+  /* Dimensions. */
+  put_u32 (buf, table->n_dimensions);
+  int *x2 = xnmalloc (table->n_dimensions, sizeof *x2);
+  for (size_t i = 0; i < table->axes[PIVOT_AXIS_LAYER].n_dimensions; i++)
+    x2[i] = 2;
+  for (size_t i = 0; i < table->axes[PIVOT_AXIS_ROW].n_dimensions; i++)
+    x2[i + table->axes[PIVOT_AXIS_LAYER].n_dimensions] = 0;
+  for (size_t i = 0; i < table->axes[PIVOT_AXIS_COLUMN].n_dimensions; i++)
+    x2[i
+       + table->axes[PIVOT_AXIS_LAYER].n_dimensions
+       + table->axes[PIVOT_AXIS_ROW].n_dimensions] = 1;
+  for (size_t i = 0; i < table->n_dimensions; i++)
+    {
+      const struct pivot_dimension *d = table->dimensions[i];
+      put_value (buf, d->root->name);
+      put_byte (buf, 0);
+      put_byte (buf, x2[i]);
+      put_u32 (buf, 2);
+      put_bool (buf, !d->root->show_label);
+      put_bool (buf, d->hide_all_labels);
+      put_bool (buf, 1);
+      put_u32 (buf, i);
+
+      put_u32 (buf, d->root->n_subs);
+      for (size_t j = 0; j < d->root->n_subs; j++)
+        put_category (buf, d->root->subs[j]);
+    }
+  free (x2);
+
+  /* Axes. */
+  put_u32 (buf, table->axes[PIVOT_AXIS_LAYER].n_dimensions);
+  put_u32 (buf, table->axes[PIVOT_AXIS_ROW].n_dimensions);
+  put_u32 (buf, table->axes[PIVOT_AXIS_COLUMN].n_dimensions);
+  for (size_t i = 0; i < table->axes[PIVOT_AXIS_LAYER].n_dimensions; i++)
+    put_u32 (buf, table->axes[PIVOT_AXIS_LAYER].dimensions[i]->top_index);
+  for (size_t i = 0; i < table->axes[PIVOT_AXIS_ROW].n_dimensions; i++)
+    put_u32 (buf, table->axes[PIVOT_AXIS_ROW].dimensions[i]->top_index);
+  for (size_t i = 0; i < table->axes[PIVOT_AXIS_COLUMN].n_dimensions; i++)
+    put_u32 (buf, table->axes[PIVOT_AXIS_COLUMN].dimensions[i]->top_index);
+
+  /* Cells. */
+  put_u32 (buf, hmap_count (&table->cells));
+  const struct pivot_cell *cell;
+  HMAP_FOR_EACH (cell, struct pivot_cell, hmap_node, &table->cells)
+    {
+      uint64_t index = 0;
+      for (size_t j = 0; j < table->n_dimensions; j++)
+        index = (table->dimensions[j]->n_leaves * index) + cell->idx[j];
+      put_u64 (buf, index);
+
+      put_value (buf, cell->value);
+    }
+}
+
+void
+spv_writer_put_table (struct spv_writer *w, const struct pivot_table *table)
+{
+  struct pivot_table *table_rw = CONST_CAST (struct pivot_table *, table);
+  if (!table_rw->subtype)
+    table_rw->subtype = pivot_value_new_user_text ("unknown", -1);
+
+  int table_id = ++w->n_tables;
+
+  bool initial_depth = w->heading_depth;
+  if (!initial_depth)
+    spv_writer_open_file (w);
+
+  start_container (w);
+
+  char *title = pivot_value_to_string (table->title,
+                                         SETTINGS_VALUE_SHOW_DEFAULT,
+                                         SETTINGS_VALUE_SHOW_DEFAULT);
+
+  start_elem (w, "label");
+  write_text (w, title);
+  end_elem (w);
+
+  start_elem (w, "vtb:table");
+  write_attr (w, "commandName", table->command_c);
+  write_attr (w, "type", "table"); /* XXX */
+  write_attr (w, "subType", title);
+  write_attr_format (w, "tableId", "%d", table_id);
+
+  free (title);
+
+  start_elem (w, "vtb:tableStructure");
+  start_elem (w, "vtb:dataPath");
+  char *data_path = xasprintf ("%010d_lightTableData.bin", table_id);
+  write_text (w, data_path);
+  end_elem (w); /* vtb:dataPath */
+  end_elem (w); /* vtb:tableStructure */
+  end_elem (w); /* vtb:table */
+  end_elem (w); /* container */
+
+  if (!initial_depth)
+    spv_writer_close_file (w, "");
+
+  struct buf buf = { NULL, 0, 0 };
+  put_light_table (&buf, table_id, table);
+  zip_writer_add_memory (w->zw, data_path, buf.data, buf.len);
+  free (buf.data);
+
+  free (data_path);
+}
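
The cell loop at the end of put_light_table() folds each cell's per-dimension leaf indexes into a single 64-bit index, using each dimension's n_leaves as the radix for its position. The standalone sketch below illustrates that arithmetic only. linear_cell_index() and its array parameters are hypothetical names chosen for illustration and are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: folds the per-dimension leaf
   indexes IDX[0..N-1] into one linear index, treating N_LEAVES[J] as the
   radix of position J, in the same way as the loop in put_light_table(). */
static uint64_t
linear_cell_index (const size_t *n_leaves, const size_t *idx, size_t n)
{
  uint64_t index = 0;
  for (size_t j = 0; j < n; j++)
    {
      /* Each leaf index must be in range for its dimension. */
      assert (idx[j] < n_leaves[j]);
      index = n_leaves[j] * index + idx[j];
    }
  return index;
}
```

For dimensions with 2, 3, and 4 leaves, the cell at leaf indexes (1, 2, 3) gets index ((1 * 3) + 2) * 4 + 3 = 23, the last of the 24 possible cells.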
diff --git a/src/output/spv/spv-writer.h b/src/output/spv/spv-writer.h
new file mode 100644
index 0000000..076ce13
--- /dev/null
+++ b/src/output/spv/spv-writer.h
@@ -0,0 +1,42 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_WRITER_H
+#define OUTPUT_SPV_WRITER_H 1
+
+struct page_setup;
+struct table_item_text;
+struct pivot_table;
+struct spv_writer;
+struct text_item;
+
+#include "libpspp/compiler.h"
+
+char *spv_writer_open (const char *filename, struct spv_writer **)
+  WARN_UNUSED_RESULT;
+char *spv_writer_close (struct spv_writer *) WARN_UNUSED_RESULT;
+
+void spv_writer_set_page_setup (struct spv_writer *,
+                                const struct page_setup *);
+
+void spv_writer_open_heading (struct spv_writer *, const char *command_id,
+                              const char *label);
+void spv_writer_close_heading (struct spv_writer *);
+
+void spv_writer_put_text (struct spv_writer *, const struct text_item *);
+void spv_writer_put_table (struct spv_writer *, const struct pivot_table *);
+
+#endif /* output/spv/spv-writer.h */
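
spv_writer_open() and spv_writer_close() return char * and are marked WARN_UNUSED_RESULT. Assuming they follow the convention that spv.c documents for its own functions (NULL on success, otherwise an error message that the caller frees with free()), a caller might handle the result as in the sketch below. demo_check_filename() and demo_report() are hypothetical stand-ins, not PSPP functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a function such as spv_writer_open(): returns
   NULL on success, otherwise a malloc()'d error message that the caller
   must free(). */
static char *
demo_check_filename (const char *filename)
{
  size_t len = strlen (filename);
  if (len >= 4 && !strcmp (filename + len - 4, ".spv"))
    return NULL;

  /* Measure the message first, then format it into an exact-size buffer. */
  int n = snprintf (NULL, 0, "%s: expected a .spv file name", filename);
  if (n < 0)
    abort ();
  char *error = malloc (n + 1);
  if (!error)
    abort ();
  snprintf (error, n + 1, "%s: expected a .spv file name", filename);
  return error;
}

/* Typical caller: report and free the message, then signal failure.
   Returns 1 on success, 0 on failure. */
static int
demo_report (const char *filename)
{
  char *error = demo_check_filename (filename);
  if (error)
    {
      fprintf (stderr, "%s\n", error);
      free (error);
      return 0;
    }
  return 1;
}
```

Marking such functions WARN_UNUSED_RESULT lets the compiler flag callers that drop the returned string, which would both hide the failure and leak the message.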
diff --git a/src/output/spv/spv.c b/src/output/spv/spv.c
new file mode 100644
index 0000000..3f5a743
--- /dev/null
+++ b/src/output/spv/spv.c
@@ -0,0 +1,1181 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2017, 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spv.h"
+
+#include <assert.h>
+#include <inttypes.h>
+#include <libxml/HTMLparser.h>
+#include <libxml/xmlreader.h>
+#include <stdarg.h>
+#include <stdlib.h>
+
+#include "libpspp/assertion.h"
+#include "libpspp/cast.h"
+#include "libpspp/hash-functions.h"
+#include "libpspp/message.h"
+#include "libpspp/str.h"
+#include "libpspp/zip-reader.h"
+#include "output/page-setup-item.h"
+#include "output/pivot-table.h"
+#include "output/spv/detail-xml-parser.h"
+#include "output/spv/light-binary-parser.h"
+#include "output/spv/spv-css-parser.h"
+#include "output/spv/spv-legacy-data.h"
+#include "output/spv/spv-legacy-decoder.h"
+#include "output/spv/spv-light-decoder.h"
+#include "output/spv/structure-xml-parser.h"
+
+#include "gl/c-ctype.h"
+#include "gl/intprops.h"
+#include "gl/minmax.h"
+#include "gl/xalloc.h"
+#include "gl/xvasprintf.h"
+#include "gl/xsize.h"
+
+#include "gettext.h"
+#define _(msgid) gettext (msgid)
+#define N_(msgid) (msgid)
+
+struct spv_reader
+  {
+    struct string zip_errs;
+    struct zip_reader *zip;
+    struct spv_item *root;
+    struct page_setup *page_setup;
+  };
+
+const struct page_setup *
+spv_get_page_setup (const struct spv_reader *spv)
+{
+  return spv->page_setup;
+}
+
+const char *
+spv_item_type_to_string (enum spv_item_type type)
+{
+  switch (type)
+    {
+    case SPV_ITEM_HEADING: return "heading";
+    case SPV_ITEM_TEXT: return "text";
+    case SPV_ITEM_TABLE: return "table";
+    case SPV_ITEM_GRAPH: return "graph";
+    case SPV_ITEM_MODEL: return "model";
+    case SPV_ITEM_OBJECT: return "object";
+    default: return "**error**";
+    }
+}
+
+const char *
+spv_item_class_to_string (enum spv_item_class class)
+{
+  switch (class)
+    {
+#define SPV_CLASS(ENUM, NAME) case SPV_CLASS_##ENUM: return NAME;
+      SPV_CLASSES
+#undef SPV_CLASS
+    default: return NULL;
+    }
+}
+
+enum spv_item_class
+spv_item_class_from_string (const char *name)
+{
+#define SPV_CLASS(ENUM, NAME) \
+  if (!strcmp (name, NAME)) return SPV_CLASS_##ENUM;
+  SPV_CLASSES
+#undef SPV_CLASS
+
+  return SPV_N_CLASSES;
+}
+
+enum spv_item_type
+spv_item_get_type (const struct spv_item *item)
+{
+  return item->type;
+}
+
+enum spv_item_class
+spv_item_get_class (const struct spv_item *item)
+{
+  const char *label = spv_item_get_label (item);
+  if (!label)
+    label = "";
+
+  switch (item->type)
+    {
+    case SPV_ITEM_HEADING:
+      return SPV_CLASS_OUTLINEHEADERS;
+
+    case SPV_ITEM_TEXT:
+      return (!strcmp (label, "Title") ? SPV_CLASS_HEADINGS
+              : !strcmp (label, "Log") ? SPV_CLASS_LOGS
+              : !strcmp (label, "Page Title") ? SPV_CLASS_PAGETITLE
+              : SPV_CLASS_TEXTS);
+
+    case SPV_ITEM_TABLE:
+      return (!strcmp (label, "Warnings") ? SPV_CLASS_WARNINGS
+              : !strcmp (label, "Notes") ? SPV_CLASS_NOTES
+              : SPV_CLASS_TABLES);
+
+    case SPV_ITEM_GRAPH:
+      return SPV_CLASS_CHARTS;
+
+    case SPV_ITEM_MODEL:
+      return SPV_CLASS_MODELS;
+
+    case SPV_ITEM_OBJECT:
+      return SPV_CLASS_OTHER;
+
+    default:
+      return SPV_CLASS_UNKNOWN;
+    }
+}
+
+const char *
+spv_item_get_label (const struct spv_item *item)
+{
+  return item->label;
+}
+
+bool
+spv_item_is_heading (const struct spv_item *item)
+{
+  return item->type == SPV_ITEM_HEADING;
+}
+
+size_t
+spv_item_get_n_children (const struct spv_item *item)
+{
+  return item->n_children;
+}
+
+struct spv_item *
+spv_item_get_child (const struct spv_item *item, size_t idx)
+{
+  assert (idx < item->n_children);
+  return item->children[idx];
+}
+
+bool
+spv_item_is_table (const struct spv_item *item)
+{
+  return item->type == SPV_ITEM_TABLE;
+}
+
+bool
+spv_item_is_text (const struct spv_item *item)
+{
+  return item->type == SPV_ITEM_TEXT;
+}
+
+const struct pivot_value *
+spv_item_get_text (const struct spv_item *item)
+{
+  assert (spv_item_is_text (item));
+  return item->text;
+}
+
+struct spv_item *
+spv_item_next (const struct spv_item *item)
+{
+  if (item->n_children)
+    return item->children[0];
+
+  while (item->parent)
+    {
+      size_t idx = item->parent_idx + 1;
+      item = item->parent;
+      if (idx < item->n_children)
+        return item->children[idx];
+    }
+
+  return NULL;
+}
+
+const struct spv_item *
+spv_item_get_parent (const struct spv_item *item)
+{
+  return item->parent;
+}
+
+size_t
+spv_item_get_level (const struct spv_item *item)
+{
+  size_t level = 0;
+  for (; item->parent; item = item->parent)
+    level++;
+  return level;
+}
+
+const char *
+spv_item_get_command_id (const struct spv_item *item)
+{
+  return item->command_id;
+}
+
+const char *
+spv_item_get_subtype (const struct spv_item *item)
+{
+  return item->subtype;
+}
+
+bool
+spv_item_is_visible (const struct spv_item *item)
+{
+  return item->visible;
+}
+
+static void
+spv_item_destroy (struct spv_item *item)
+{
+  if (item)
+    {
+      free (item->structure_member);
+
+      free (item->label);
+      free (item->command_id);
+
+      for (size_t i = 0; i < item->n_children; i++)
+        spv_item_destroy (item->children[i]);
+      free (item->children);
+
+      pivot_table_unref (item->table);
+      spv_legacy_properties_destroy (item->legacy_properties);
+      free (item->bin_member);
+      free (item->xml_member);
+      free (item->subtype);
+
+      pivot_value_destroy (item->text);
+
+      free (item->object_type);
+      free (item->uri);
+
+      free (item);
+    }
+}
+
+static void
+spv_heading_add_child (struct spv_item *parent, struct spv_item *child)
+{
+  assert (parent->type == SPV_ITEM_HEADING);
+  assert (!child->parent);
+
+  child->parent = parent;
+  child->parent_idx = parent->n_children;
+
+  if (parent->n_children >= parent->allocated_children)
+    parent->children = x2nrealloc (parent->children,
+                                   &parent->allocated_children,
+                                   sizeof *parent->children);
+  parent->children[parent->n_children++] = child;
+}
+
+static xmlNode *
+find_xml_child_element (xmlNode *parent, const char *child_name)
+{
+  for (xmlNode *node = parent->children; node; node = node->next)
+    if (node->type == XML_ELEMENT_NODE
+        && node->name
+        && !strcmp (CHAR_CAST (char *, node->name), child_name))
+      return node;
+
+  return NULL;
+}
+
+static char *
+get_xml_attr (const xmlNode *node, const char *name)
+{
+  return CHAR_CAST (char *, xmlGetProp (node, CHAR_CAST (xmlChar *, name)));
+}
+
+static void
+put_xml_attr (const char *name, const char *value, struct string *dst)
+{
+  if (!value)
+    return;
+
+  ds_put_format (dst, " %s=\"", name);
+  for (const char *p = value; *p; p++)
+    {
+      switch (*p)
+        {
+        case '\n':
+          ds_put_cstr (dst, "&#10;");
+          break;
+        case '&':
+          ds_put_cstr (dst, "&amp;");
+          break;
+        case '<':
+          ds_put_cstr (dst, "&lt;");
+          break;
+        case '>':
+          ds_put_cstr (dst, "&gt;");
+          break;
+        case '"':
+          ds_put_cstr (dst, "&quot;");
+          break;
+        default:
+          ds_put_byte (dst, *p);
+          break;
+        }
+    }
+  ds_put_byte (dst, '"');
+}
+
+static void
+extract_html_text (const xmlNode *node, int base_font_size, struct string *s)
+{
+  if (node->type == XML_ELEMENT_NODE)
+    {
+      const char *name = CHAR_CAST (char *, node->name);
+      if (!strcmp (name, "br"))
+        ds_put_byte (s, '\n');
+      else if (strcmp (name, "style"))
+        {
+          const char *tag = NULL;
+          if (strchr ("biu", name[0]) && name[1] == '\0')
+            {
+              tag = name;
+              ds_put_format (s, "<%s>", tag);
+            }
+          else if (!strcmp (name, "font"))
+            {
+              tag = "span";
+              ds_put_format (s, "<%s", tag);
+
+              char *face = get_xml_attr (node, "face");
+              put_xml_attr ("face", face, s);
+              free (face);
+
+              char *color = get_xml_attr (node, "color");
+              if (color)
+                {
+                  if (color[0] == '#')
+                    put_xml_attr ("color", color, s);
+                  else
+                    {
+                      uint8_t r, g, b;
+                      if (sscanf (color, "rgb (%"SCNu8", %"SCNu8", %"SCNu8" )",
+                                  &r, &g, &b) == 3)
+                        {
+                          char color2[8];
+                          snprintf (color2, sizeof color2,
+                                    "#%02"PRIx8"%02"PRIx8"%02"PRIx8,
+                                    r, g, b);
+                          put_xml_attr ("color", color2, s);
+                        }
+                    }
+                }
+              free (color);
+
+              char *size_s = get_xml_attr (node, "size");
+              int html_size = size_s ? atoi (size_s) : 0;
+              free (size_s);
+              if (html_size >= 1 && html_size <= 7)
+                {
+                  static const double scale[7] = {
+                    .444, .556, .667, .778, 1.0, 1.33, 2.0
+                  };
+                  double size = base_font_size * scale[html_size - 1];
+
+                  char size2[INT_BUFSIZE_BOUND (int)];
+                  snprintf (size2, sizeof size2, "%.0f", size * 1024.);
+                  put_xml_attr ("size", size2, s);
+                }
+
+              ds_put_cstr (s, ">");
+            }
+          for (const xmlNode *child = node->children; child;
+               child = child->next)
+            extract_html_text (child, base_font_size, s);
+          if (tag)
+            ds_put_format (s, "</%s>", tag);
+        }
+    }
+  else if (node->type == XML_TEXT_NODE)
+    {
+      /* U+00A0 NONBREAKING SPACE is really, really common in SPV text and it
+         makes it impossible to break syntax across lines.  Translate it into a
+         regular space.  (Note that U+00A0 is C2 A0 in UTF-8.)
+
+         Do the same for U+2007 FIGURE SPACE (E2 80 87 in UTF-8), which also
+         crops up occasionally. */
+      ds_extend (s, ds_length (s) + xmlStrlen (node->content));
+      for (const uint8_t *p = node->content; *p; )
+        {
+          int c;
+          if (p[0] == 0xc2 && p[1] == 0xa0)
+            {
+              c = ' ';
+              p += 2;
+            }
+          else if (p[0] == 0xe2 && p[1] == 0x80 && p[2] == 0x87)
+            {
+              c = ' ';
+              p += 3;
+            }
+          else
+            c = *p++;
+
+          if (c_isspace (c))
+            {
+              int last = ds_last (s);
+              if (last != EOF && !c_isspace (last))
+                ds_put_byte (s, c);
+            }
+          else if (c == '<')
+            ds_put_cstr (s, "&lt;");
+          else if (c == '>')
+            ds_put_cstr (s, "&gt;");
+          else if (c == '&')
+            ds_put_cstr (s, "&amp;");
+          else
+            ds_put_byte (s, c);
+        }
+    }
+}
+
+static xmlDoc *
+parse_embedded_html (const xmlNode *node)
+{
+  /* Extract HTML from XML node. */
+  char *html_s = CHAR_CAST (char *, xmlNodeGetContent (node));
+  if (!html_s)
+    xalloc_die ();
+
+  xmlDoc *html_doc = htmlReadMemory (
+    html_s, strlen (html_s),
+    NULL, "UTF-8", (HTML_PARSE_RECOVER | HTML_PARSE_NOERROR
+                    | HTML_PARSE_NOWARNING | HTML_PARSE_NOBLANKS
+                    | HTML_PARSE_NONET));
+  free (html_s);
+
+  return html_doc;
+}
+
+/* Given NODE, which should contain HTML content, returns the text within that
+   content as an allocated string and sets up *FONT_STYLE from any embedded
+   CSS.  The caller must eventually free the returned string (with free()). */
+static char *
+decode_embedded_html (const xmlNode *node, struct font_style *font_style)
+{
+  struct string markup = DS_EMPTY_INITIALIZER;
+  *font_style = (struct font_style) FONT_STYLE_INITIALIZER;
+  font_style->size = 10;
+
+  xmlDoc *html_doc = parse_embedded_html (node);
+  if (html_doc)
+    {
+      xmlNode *root = xmlDocGetRootElement (html_doc);
+      xmlNode *head = root ? find_xml_child_element (root, "head") : NULL;
+      xmlNode *style = head ? find_xml_child_element (head, "style") : NULL;
+      if (style)
+        {
+          uint8_t *style_s = xmlNodeGetContent (style);
+          spv_parse_css_style (CHAR_CAST (char *, style_s), font_style);
+          xmlFree (style_s);
+        }
+
+      if (root)
+        extract_html_text (root, font_style->size, &markup);
+      xmlFreeDoc (html_doc);
+    }
+
+  font_style->markup = true;
+  return ds_steal_cstr (&markup);
+}
+
+static char *
+xstrdup_if_nonempty (const char *s)
+{
+  return s && s[0] ? xstrdup (s) : NULL;
+}
+
+static void
+decode_container_text (const struct spvsx_container_text *ct,
+                       struct spv_item *item)
+{
+  item->type = SPV_ITEM_TEXT;
+  item->command_id = xstrdup_if_nonempty (ct->command_name);
+
+  item->text = xzalloc (sizeof *item->text);
+  item->text->type = PIVOT_VALUE_TEXT;
+  item->text->font_style = xmalloc (sizeof *item->text->font_style);
+  item->text->text.local = decode_embedded_html (ct->html->node_.raw,
+                                                 item->text->font_style);
+}
+
+static void
+decode_page_p (const xmlNode *in, struct page_paragraph *out)
+{
+  char *style = get_xml_attr (in, "style");
+  out->halign = (style && strstr (style, "center") ? TABLE_HALIGN_CENTER
+                 : style && strstr (style, "right") ? TABLE_HALIGN_RIGHT
+                 : TABLE_HALIGN_LEFT);
+  free (style);
+
+  struct font_style font_style;
+  out->markup = decode_embedded_html (in, &font_style);
+  font_style_uninit (&font_style);
+}
+
+static void
+decode_page_paragraph (const struct spvsx_page_paragraph *page_paragraph,
+                       struct page_heading *ph)
+{
+  memset (ph, 0, sizeof *ph);
+
+  const struct spvsx_page_paragraph_text *page_paragraph_text
+    = page_paragraph->page_paragraph_text;
+  if (!page_paragraph_text)
+    return;
+
+  xmlDoc *html_doc = parse_embedded_html (page_paragraph_text->node_.raw);
+  if (!html_doc)
+    return;
+
+  xmlNode *root = xmlDocGetRootElement (html_doc);
+  xmlNode *body = root ? find_xml_child_element (root, "body") : NULL;
+  if (body)
+    for (const xmlNode *node = body->children; node; node = node->next)
+      if (node->type == XML_ELEMENT_NODE
+          && !strcmp (CHAR_CAST (const char *, node->name), "p"))
+        {
+          ph->paragraphs = xrealloc (ph->paragraphs,
+                                     (ph->n + 1) * sizeof *ph->paragraphs);
+          decode_page_p (node, &ph->paragraphs[ph->n++]);
+        }
+  xmlFreeDoc (html_doc);
+}
+
+void
+spv_item_load (const struct spv_item *item)
+{
+  if (spv_item_is_table (item))
+    spv_item_get_table (item);
+}
+
+bool
+spv_item_is_light_table (const struct spv_item *item)
+{
+  return item->type == SPV_ITEM_TABLE && !item->xml_member;
+}
+
+char * WARN_UNUSED_RESULT
+spv_item_get_raw_light_table (const struct spv_item *item,
+                              void **data, size_t *size)
+{
+  return zip_member_read_all (item->spv->zip, item->bin_member, data, size);
+}
+
+char * WARN_UNUSED_RESULT
+spv_item_get_light_table (const struct spv_item *item,
+                          struct spvlb_table **tablep)
+{
+  *tablep = NULL;
+
+  if (!spv_item_is_light_table (item))
+    return xstrdup ("not a light binary table object");
+
+  void *data;
+  size_t size;
+  char *error = spv_item_get_raw_light_table (item, &data, &size);
+  if (error)
+    return error;
+
+  struct spvbin_input input;
+  spvbin_input_init (&input, data, size);
+
+#if 0
+  struct spvlb_header *header;
+  if (!spvlb_parse_header (&input, &header))
+    return xstrdup("bad header");
+  spvlb_print_header ("file", 0, header);
+
+  struct spvlb_titles *titles;
+  if (!spvlb_parse_titles (&input, &titles))
+    return xstrdup("bad titles");
+  spvlb_print_titles ("file", 0, titles);
+
+  struct spvlb_footnotes *footnotes;
+  if (!spvlb_parse_footnotes (&input, &footnotes))
+    return xstrdup("bad footnotes");
+  spvlb_print_footnotes ("file", 0, footnotes);
+
+  struct spvlb_areas *areas;
+  if (!spvlb_parse_areas (&input, &areas))
+    return xstrdup("bad areas");
+  spvlb_print_areas ("file", 0, areas);
+
+  struct spvlb_borders *borders;
+  if (!spvlb_parse_borders (&input, &borders))
+    return xstrdup("bad borders");
+  spvlb_print_borders ("file", 0, borders);
+
+  struct spvlb_print_settings *print_settings;
+  if (!spvlb_parse_print_settings (&input, &print_settings))
+    return xstrdup("bad print_settings");
+  spvlb_print_print_settings ("file", 0, print_settings);
+
+  struct spvlb_table_settings *table_settings;
+  if (!spvlb_parse_table_settings (&input, &table_settings))
+    return xstrdup("bad table_settings");
+  spvlb_print_table_settings ("file", 0, table_settings);
+
+  input.ofs = 0;
+#endif
+  struct spvlb_table *table;
+  error = (!spvlb_parse_table (&input, &table)
+           ? spvbin_input_to_error (&input, item->bin_member)
+           : input.ofs != input.size
+           ? xasprintf ("%s: expected end of file at offset %#zx",
+                        item->bin_member, input.ofs)
+           : NULL);
+  free (data);
+  if (!error)
+    *tablep = table;
+  return error;
+}
+
+static char *
+pivot_table_open_light (struct spv_item *item)
+{
+  assert (spv_item_is_light_table (item));
+
+  struct spvlb_table *raw_table;
+  char *error = spv_item_get_light_table (item, &raw_table);
+  if (!error)
+    error = decode_spvlb_table (raw_table, &item->table);
+  spvlb_free_table (raw_table);
+
+  return error;
+}
+
+bool
+spv_item_is_legacy_table (const struct spv_item *item)
+{
+  return item->type == SPV_ITEM_TABLE && item->xml_member;
+}
+
+char * WARN_UNUSED_RESULT
+spv_item_get_raw_legacy_data (const struct spv_item *item,
+                              void **data, size_t *size)
+{
+  if (!spv_item_is_legacy_table (item))
+    return xstrdup ("not a legacy table object");
+
+  return zip_member_read_all (item->spv->zip, item->bin_member, data, size);
+}
+
+char * WARN_UNUSED_RESULT
+spv_item_get_legacy_data (const struct spv_item *item, struct spv_data *data)
+{
+  void *raw;
+  size_t size;
+  char *error = spv_item_get_raw_legacy_data (item, &raw, &size);
+  if (!error)
+    {
+      error = spv_legacy_data_decode (raw, size, data);
+      free (raw);
+    }
+
+  return error;
+}
+
+static char * WARN_UNUSED_RESULT
+spv_read_xml_member (struct spv_reader *spv, const char *member_name,
+                     bool keep_blanks, const char *root_element_name,
+                     xmlDoc **docp)
+{
+  *docp = NULL;
+
+  struct zip_member *zm = zip_member_open (spv->zip, member_name);
+  if (!zm)
+    return ds_steal_cstr (&spv->zip_errs);
+
+  xmlParserCtxt *parser;
+  xmlKeepBlanksDefault (keep_blanks);
+  parser = xmlCreatePushParserCtxt (NULL, NULL, NULL, 0, NULL);
+  if (!parser)
+    {
+      zip_member_finish (zm);
+      return xasprintf (_("%s: Failed to create XML parser"), member_name);
+    }
+
+  int retval;
+  char buf[4096];
+  while ((retval = zip_member_read (zm, buf, sizeof buf)) > 0)
+    xmlParseChunk (parser, buf, retval, false);
+  xmlParseChunk (parser, NULL, 0, true);
+
+  xmlDoc *doc = parser->myDoc;
+  bool well_formed = parser->wellFormed;
+  xmlFreeParserCtxt (parser);
+
+  if (retval < 0)
+    {
+      char *error = ds_steal_cstr (&spv->zip_errs);
+      zip_member_finish (zm);
+      xmlFreeDoc (doc);
+      return error;
+    }
+  zip_member_finish (zm);
+
+  if (!well_formed)
+    {
+      xmlFreeDoc (doc);
+      return xasprintf (_("%s: document is not well-formed"), member_name);
+    }
+
+  const xmlNode *root_node = xmlDocGetRootElement (doc);
+  assert (root_node->type == XML_ELEMENT_NODE);
+  if (strcmp (CHAR_CAST (char *, root_node->name), root_element_name))
+    {
+      xmlFreeDoc (doc);
+      return xasprintf (_("%s: root node is \"%s\" but \"%s\" was expected"),
+                        member_name, CHAR_CAST (char *, root_node->name),
+                        root_element_name);
+    }
+
+  *docp = doc;
+  return NULL;
+}
+
+char * WARN_UNUSED_RESULT
+spv_item_get_legacy_table (const struct spv_item *item, xmlDoc **docp)
+{
+  assert (spv_item_is_legacy_table (item));
+
+  return spv_read_xml_member (item->spv, item->xml_member, false,
+                              "visualization", docp);
+}
+
+char * WARN_UNUSED_RESULT
+spv_item_get_structure (const struct spv_item *item, struct _xmlDoc **docp)
+{
+  return spv_read_xml_member (item->spv, item->structure_member, false,
+                              "heading", docp);
+}
+
+static char * WARN_UNUSED_RESULT
+pivot_table_open_legacy (struct spv_item *item)
+{
+  assert (spv_item_is_legacy_table (item));
+
+  struct spv_data data;
+  char *error = spv_item_get_legacy_data (item, &data);
+  if (error)
+    {
+      char *s = xasprintf ("%s: %s", item->bin_member, error);
+      free (error);
+      return s;
+    }
+
+  xmlDoc *doc;
+  error = spv_read_xml_member (item->spv, item->xml_member, false,
+                               "visualization", &doc);
+  if (error)
+    {
+      spv_data_uninit (&data);
+      return error;
+    }
+
+  struct spvxml_context ctx = SPVXML_CONTEXT_INIT (ctx);
+  struct spvdx_visualization *v;
+  spvdx_parse_visualization (&ctx, xmlDocGetRootElement (doc), &v);
+  error = spvxml_context_finish (&ctx, &v->node_);
+
+  if (!error)
+    error = decode_spvdx_table (v, item->legacy_properties, &data,
+                                &item->table);
+
+  if (error)
+    {
+      char *s = xasprintf ("%s: %s", item->xml_member, error);
+      free (error);
+      error = s;
+    }
+
+  spv_data_uninit (&data);
+  spvdx_free_visualization (v);
+  if (doc)
+    xmlFreeDoc (doc);
+
+  return error;
+}
+
+struct pivot_table *
+spv_item_get_table (const struct spv_item *item_)
+{
+  struct spv_item *item = CONST_CAST (struct spv_item *, item_);
+
+  assert (spv_item_is_table (item));
+  if (!item->table)
+    {
+      char *error = (item->xml_member
+                     ? pivot_table_open_legacy (item)
+                     : pivot_table_open_light (item));
+      if (error)
+        {
+          item->error = true;
+          msg (ME, "%s", error);
+          item->table = pivot_table_create_for_text (
+            pivot_value_new_text (N_("Error")),
+            pivot_value_new_user_text (error, -1));
+          free (error);
+        }
+    }
+
+  return item->table;
+}
+
+/* Constructs a new spv_item from container C and adds it as a child of
+   PARENT.  Returns NULL if successful, otherwise an error message for the
+   caller to use and free (with free()).
+
+   STRUCTURE_MEMBER is the name of the Zip member that contains C. */
+static char * WARN_UNUSED_RESULT
+spv_decode_container (const struct spvsx_container *c,
+                      const char *structure_member,
+                      struct spv_item *parent)
+{
+  struct spv_item *item = xzalloc (sizeof *item);
+  item->spv = parent->spv;
+  item->label = xstrdup (c->label->text);
+  item->visible = c->visibility == SPVSX_VISIBILITY_VISIBLE;
+  item->structure_member = xstrdup (structure_member);
+
+  assert (c->n_seq == 1);
+  struct spvxml_node *content = c->seq[0];
+  if (spvsx_is_container_text (content))
+    decode_container_text (spvsx_cast_container_text (content), item);
+  else if (spvsx_is_table (content))
+    {
+      item->type = SPV_ITEM_TABLE;
+
+      struct spvsx_table *table = spvsx_cast_table (content);
+      const struct spvsx_table_structure *ts = table->table_structure;
+      item->bin_member = xstrdup (ts->data_path->text);
+      item->command_id = xstrdup_if_nonempty (table->command_name);
+      if (ts->path)
+        {
+          item->xml_member = xstrdup (ts->path->text);
+          char *error = decode_spvsx_legacy_properties (
+            table->table_properties, &item->legacy_properties);
+          if (error)
+            {
+              spv_item_destroy (item);
+              return error;
+            }
+        }
+    }
+  else if (spvsx_is_graph (content))
+    {
+      struct spvsx_graph *graph = spvsx_cast_graph (content);
+      item->type = SPV_ITEM_GRAPH;
+      item->command_id = xstrdup_if_nonempty (graph->command_name);
+      /* XXX */
+    }
+  else if (spvsx_is_model (content))
+    {
+      struct spvsx_model *model = spvsx_cast_model (content);
+      item->type = SPV_ITEM_MODEL;
+      item->command_id = xstrdup_if_nonempty (model->command_name);
+      /* XXX */
+    }
+  else if (spvsx_is_object (content))
+    {
+      struct spvsx_object *object = spvsx_cast_object (content);
+      item->type = SPV_ITEM_OBJECT;
+      item->object_type = xstrdup (object->type);
+      item->uri = xstrdup (object->uri);
+    }
+  else if (spvsx_is_image (content))
+    {
+      struct spvsx_image *image = spvsx_cast_image (content);
+      item->type = SPV_ITEM_OBJECT;
+      item->object_type = xstrdup ("image");
+      item->uri = xstrdup (image->data_path->text);
+    }
+  else
+    NOT_REACHED ();
+
+  spv_heading_add_child (parent, item);
+  return NULL;
+}
+
+static char * WARN_UNUSED_RESULT
+spv_decode_children (struct spv_reader *spv, const char *structure_member,
+                     struct spvxml_node **seq, size_t n_seq,
+                     struct spv_item *parent)
+{
+  for (size_t i = 0; i < n_seq; i++)
+    {
+      const struct spvxml_node *node = seq[i];
+
+      char *error;
+      if (spvsx_is_container (node))
+        {
+          const struct spvsx_container *container
+            = spvsx_cast_container (node);
+          error = spv_decode_container (container, structure_member, parent);
+        }
+      else if (spvsx_is_heading (node))
+        {
+          const struct spvsx_heading *subheading = spvsx_cast_heading (node);
+          struct spv_item *subitem = xzalloc (sizeof *subitem);
+          subitem->structure_member = xstrdup (structure_member);
+          subitem->spv = parent->spv;
+          subitem->type = SPV_ITEM_HEADING;
+          subitem->label = xstrdup (subheading->label->text);
+          if (subheading->command_name)
+            subitem->command_id = xstrdup (subheading->command_name);
+          subitem->visible = !subheading->heading_visibility_present;
+          spv_heading_add_child (parent, subitem);
+
+          error = spv_decode_children (spv, structure_member,
+                                       subheading->seq, subheading->n_seq,
+                                       subitem);
+        }
+      else
+        NOT_REACHED ();
+
+      if (error)
+        return error;
+    }
+
+  return NULL;
+}
+
+static struct page_setup *
+decode_page_setup (const struct spvsx_page_setup *in, const char *file_name)
+{
+  struct page_setup *out = xmalloc (sizeof *out);
+  *out = (struct page_setup) PAGE_SETUP_INITIALIZER;
+
+  out->initial_page_number = in->initial_page_number;
+
+  if (in->paper_width != DBL_MAX)
+    out->paper[TABLE_HORZ] = in->paper_width;
+  if (in->paper_height != DBL_MAX)
+    out->paper[TABLE_VERT] = in->paper_height;
+
+  if (in->margin_left != DBL_MAX)
+    out->margins[TABLE_HORZ][0] = in->margin_left;
+  if (in->margin_right != DBL_MAX)
+    out->margins[TABLE_HORZ][1] = in->margin_right;
+  if (in->margin_top != DBL_MAX)
+    out->margins[TABLE_VERT][0] = in->margin_top;
+  if (in->margin_bottom != DBL_MAX)
+    out->margins[TABLE_VERT][1] = in->margin_bottom;
+
+  if (in->space_after != DBL_MAX)
+    out->object_spacing = in->space_after;
+
+  if (in->chart_size)
+    out->chart_size = (in->chart_size == SPVSX_CHART_SIZE_FULL_HEIGHT
+                       ? PAGE_CHART_FULL_HEIGHT
+                       : in->chart_size == SPVSX_CHART_SIZE_HALF_HEIGHT
+                       ? PAGE_CHART_HALF_HEIGHT
+                       : in->chart_size == SPVSX_CHART_SIZE_QUARTER_HEIGHT
+                       ? PAGE_CHART_QUARTER_HEIGHT
+                       : PAGE_CHART_AS_IS);
+
+  decode_page_paragraph (in->page_header->page_paragraph, &out->headings[0]);
+  decode_page_paragraph (in->page_footer->page_paragraph, &out->headings[1]);
+
+  out->file_name = xstrdup (file_name);
+
+  return out;
+}
+
+static char * WARN_UNUSED_RESULT
+spv_heading_read (struct spv_reader *spv,
+                  const char *file_name, const char *member_name)
+{
+  xmlDoc *doc;
+  char *error = spv_read_xml_member (spv, member_name, true, "heading", &doc);
+  if (error)
+    return error;
+
+  struct spvxml_context ctx = SPVXML_CONTEXT_INIT (ctx);
+  struct spvsx_root_heading *root;
+  spvsx_parse_root_heading (&ctx, xmlDocGetRootElement (doc), &root);
+  error = spvxml_context_finish (&ctx, &root->node_);
+
+  if (!error && root->page_setup)
+    spv->page_setup = decode_page_setup (root->page_setup, file_name);
+
+  if (!error)
+    error = spv_decode_children (spv, member_name, root->seq, root->n_seq,
+                                 spv->root);
+
+  if (error)
+    {
+      char *s = xasprintf ("%s: %s", member_name, error);
+      free (error);
+      error = s;
+    }
+
+  spvsx_free_root_heading (root);
+  xmlFreeDoc (doc);
+
+  return error;
+}
+
+struct spv_item *
+spv_get_root (const struct spv_reader *spv)
+{
+  return spv->root;
+}
+
+static int
+spv_detect__ (struct zip_reader *zip, char **errorp)
+{
+  *errorp = NULL;
+
+  const char *member = "META-INF/MANIFEST.MF";
+  if (!zip_reader_contains_member (zip, member))
+    return 0;
+
+  void *data;
+  size_t size;
+  *errorp = zip_member_read_all (zip, "META-INF/MANIFEST.MF",
+                                 &data, &size);
+  if (*errorp)
+    return -1;
+
+  const char *magic = "allowPivoting=true";
+  bool is_spv = size == strlen (magic) && !memcmp (magic, data, size);
+  free (data);
+
+  return is_spv;
+}
+
+/* Returns NULL if FILENAME is an SPV file, otherwise an error string that the
+   caller must eventually free(). */
+char * WARN_UNUSED_RESULT
+spv_detect (const char *filename)
+{
+  struct string zip_error;
+  struct zip_reader *zip = zip_reader_create (filename, &zip_error);
+  if (!zip)
+    return ds_steal_cstr (&zip_error);
+
+  char *error;
+  if (spv_detect__ (zip, &error) <= 0 && !error)
+    error = xasprintf ("%s: not an SPV file", filename);
+  zip_reader_destroy (zip);
+  ds_destroy (&zip_error);
+  return error;
+}
+
+
+char * WARN_UNUSED_RESULT
+spv_open (const char *filename, struct spv_reader **spvp)
+{
+  *spvp = NULL;
+
+  struct spv_reader *spv = xzalloc (sizeof *spv);
+  ds_init_empty (&spv->zip_errs);
+  spv->zip = zip_reader_create (filename, &spv->zip_errs);
+  if (!spv->zip)
+    {
+      char *error = ds_steal_cstr (&spv->zip_errs);
+      spv_close (spv);
+      return error;
+    }
+
+  char *error;
+  int detect = spv_detect__ (spv->zip, &error);
+  if (detect <= 0)
+    {
+      spv_close (spv);
+      return error ? error : xasprintf ("%s: not an SPV file", filename);
+    }
+
+  spv->root = xzalloc (sizeof *spv->root);
+  spv->root->spv = spv;
+  spv->root->type = SPV_ITEM_HEADING;
+  for (size_t i = 0; ; i++)
+    {
+      const char *member_name = zip_reader_get_member_name (spv->zip, i);
+      if (!member_name)
+        break;
+
+      struct substring member_name_ss = ss_cstr (member_name);
+      if (ss_starts_with (member_name_ss, ss_cstr ("outputViewer"))
+          && ss_ends_with (member_name_ss, ss_cstr (".xml")))
+        {
+          char *error = spv_heading_read (spv, filename, member_name);
+          if (error)
+            {
+              spv_close (spv);
+              return error;
+            }
+        }
+    }
+
+  *spvp = spv;
+  return NULL;
+}
+
+void
+spv_close (struct spv_reader *spv)
+{
+  if (spv)
+    {
+      ds_destroy (&spv->zip_errs);
+      zip_reader_destroy (spv->zip);
+      spv_item_destroy (spv->root);
+      page_setup_destroy (spv->page_setup);
+      free (spv);
+    }
+}
+
+struct fmt_spec
+spv_decode_fmt_spec (uint32_t u32)
+{
+  if (!u32
+      || (u32 == 0x10000 || u32 == 1 /* both used as string formats */))
+    return fmt_for_output (FMT_F, 40, 2);
+
+  uint8_t raw_type = u32 >> 16;
+  uint8_t w = u32 >> 8;
+  uint8_t d = u32;
+
+  msg_disable ();
+  struct fmt_spec spec = { .type = FMT_F, .w = w, .d = d };
+  bool ok = raw_type >= 40 || fmt_from_io (raw_type, &spec.type);
+  if (ok)
+    {
+      fmt_fix_output (&spec);
+      ok = fmt_check_width_compat (&spec, 0);
+    }
+  msg_enable ();
+
+  if (!ok)
+    {
+      fprintf (stderr, "bad format %#"PRIx32"\n", u32); /* XXX */
+      spec = fmt_for_output (FMT_F, 40, 2);
+      exit (1);
+    }
+
+  return spec;
+}
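
The bit layout that spv_decode_fmt_spec() unpacks can be shown in isolation. The following is a minimal sketch, assuming only what the function above shows: the format type in bits 16 to 23, the width in bits 8 to 15, and the number of decimals in bits 0 to 7. The names `raw_fmt` and `unpack_raw_fmt` are hypothetical, and no PSPP functions are called, so none of the validation done by the real function is reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the three raw fields; not a PSPP type. */
struct raw_fmt
  {
    uint8_t type;               /* Bits 16 to 23 of the encoded value. */
    uint8_t w;                  /* Bits 8 to 15: field width. */
    uint8_t d;                  /* Bits 0 to 7: number of decimals. */
  };

/* Splits U32 into its three byte-sized fields.  Bits 24 to 31 are ignored,
   which matches the truncating uint8_t assignments in the code above. */
static struct raw_fmt
unpack_raw_fmt (uint32_t u32)
{
  struct raw_fmt f = {
    .type = (u32 >> 16) & 0xff,
    .w = (u32 >> 8) & 0xff,
    .d = u32 & 0xff,
  };
  return f;
}
```

For instance, under this layout the value 0x050802 would unpack to type 5, width 8, and 2 decimals.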
diff --git a/src/output/spv/spv.h b/src/output/spv/spv.h
new file mode 100644
index 0000000..fe43000
--- /dev/null
+++ b/src/output/spv/spv.h
@@ -0,0 +1,196 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef OUTPUT_SPV_H
+#define OUTPUT_SPV_H 1
+
+/* SPSS Viewer (SPV) file reader.
+
+   An SPV file, represented as struct spv_reader, contains a number of
+   top-level headings, each of which recursively contains other headings and
+   tables.  Here, we model a heading, text, table, or other element as an
+   "item", and a an SPV file as a single root item that contains each of the
+   top-level headings as a child item.
+ */
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+#include "libpspp/compiler.h"
+
+struct pivot_table;
+struct spv_data;
+struct spv_reader;
+struct spvlb_table;
+struct _xmlDoc;
+
+/* SPV files. */
+
+char *spv_open (const char *filename, struct spv_reader **) WARN_UNUSED_RESULT;
+void spv_close (struct spv_reader *);
+
+char *spv_detect (const char *filename) WARN_UNUSED_RESULT;
+
+const char *spv_get_errors (const struct spv_reader *);
+void spv_clear_errors (struct spv_reader *);
+
+struct spv_item *spv_get_root (const struct spv_reader *);
+void spv_item_dump (const struct spv_item *, int indentation);
+
+const struct page_setup *spv_get_page_setup (const struct spv_reader *);
+
+/* Items.
+
+   An spv_item represents one of the elements that can occur in an SPV file.
+   Items form a tree because "heading" items can have an arbitrary number of
+   child items, which in turn may also be headings.  The root item, that is,
+   the item returned by spv_get_root(), is always a heading. */
+
+enum spv_item_type
+  {
+    SPV_ITEM_HEADING,
+    SPV_ITEM_TEXT,
+    SPV_ITEM_TABLE,
+    SPV_ITEM_GRAPH,
+    SPV_ITEM_MODEL,
+    SPV_ITEM_OBJECT,
+  };
+
+const char *spv_item_type_to_string (enum spv_item_type);
+
+#define SPV_CLASSES                                \
+    SPV_CLASS(CHARTS, "charts")                    \
+    SPV_CLASS(HEADINGS, "headings")                \
+    SPV_CLASS(LOGS, "logs")                        \
+    SPV_CLASS(MODELS, "models")                    \
+    SPV_CLASS(TABLES, "tables")                    \
+    SPV_CLASS(TEXTS, "texts")                      \
+    SPV_CLASS(TREES, "trees")                      \
+    SPV_CLASS(WARNINGS, "warnings")                \
+    SPV_CLASS(OUTLINEHEADERS, "outlineheaders")    \
+    SPV_CLASS(PAGETITLE, "pagetitle")              \
+    SPV_CLASS(NOTES, "notes")                      \
+    SPV_CLASS(UNKNOWN, "unknown")                  \
+    SPV_CLASS(OTHER, "other")
+enum spv_item_class
+  {
+#define SPV_CLASS(ENUM, NAME) SPV_CLASS_##ENUM,
+    SPV_CLASSES
+#undef SPV_CLASS
+  };
+enum
+  {
+#define SPV_CLASS(ENUM, NAME) +1
+    SPV_N_CLASSES = SPV_CLASSES
+#undef SPV_CLASS
+  };
+
+const char *spv_item_class_to_string (enum spv_item_class);
+enum spv_item_class spv_item_class_from_string (const char *);
+
+struct spv_item
+  {
+    struct spv_reader *spv;
+    struct spv_item *parent;
+    size_t parent_idx;
+
+    bool error;
+
+    char *structure_member;
+
+    enum spv_item_type type;
+    char *label;
+    char *command_id;           /* Unique command identifier. */
+
+    /* Whether the item is visible.
+       For SPV_ITEM_HEADING, false indicates that the item is collapsed.
+       For SPV_ITEM_TABLE, false indicates that the item is not shown. */
+    bool visible;
+
+    /* SPV_ITEM_HEADING only. */
+    struct spv_item **children;
+    size_t n_children, allocated_children;
+
+    /* SPV_ITEM_TABLE only. */
+    struct pivot_table *table;    /* NULL if not yet loaded. */
+    struct spv_legacy_properties *legacy_properties;
+    char *bin_member;
+    char *xml_member;
+    char *subtype;
+
+    /* SPV_ITEM_TEXT only.  */
+    struct pivot_value *text;
+
+    /* SPV_ITEM_OBJECT only. */
+    char *object_type;
+    char *uri;
+  };
+
+void spv_item_load (const struct spv_item *);
+
+enum spv_item_type spv_item_get_type (const struct spv_item *);
+enum spv_item_class spv_item_get_class (const struct spv_item *);
+
+const char *spv_item_get_label (const struct spv_item *);
+
+bool spv_item_is_heading (const struct spv_item *);
+size_t spv_item_get_n_children (const struct spv_item *);
+struct spv_item *spv_item_get_child (const struct spv_item *, size_t idx);
+
+bool spv_item_is_table (const struct spv_item *);
+struct pivot_table *spv_item_get_table (const struct spv_item *);
+
+bool spv_item_is_text (const struct spv_item *);
+const struct pivot_value *spv_item_get_text (const struct spv_item *);
+
+bool spv_item_is_visible (const struct spv_item *);
+
+#define SPV_ITEM_FOR_EACH(ITER, ROOT) \
+  for ((ITER) = (ROOT); (ITER) != NULL; (ITER) = spv_item_next(ITER))
+#define SPV_ITEM_FOR_EACH_SKIP_ROOT(ITER, ROOT) \
+  for ((ITER) = (ROOT); ((ITER) = spv_item_next(ITER)) != NULL; )
+struct spv_item *spv_item_next (const struct spv_item *);
+
+const struct spv_item *spv_item_get_parent (const struct spv_item *);
+size_t spv_item_get_level (const struct spv_item *);
+
+const char *spv_item_get_member_name (const struct spv_item *);
+const char *spv_item_get_command_id (const struct spv_item *);
+const char *spv_item_get_subtype (const struct spv_item *);
+
+char *spv_item_get_structure (const struct spv_item *, struct _xmlDoc **)
+  WARN_UNUSED_RESULT;
+
+bool spv_item_is_light_table (const struct spv_item *);
+char *spv_item_get_light_table (const struct spv_item *,
+                                struct spvlb_table **)
+  WARN_UNUSED_RESULT;
+char *spv_item_get_raw_light_table (const struct spv_item *,
+                                    void **data, size_t *size)
+  WARN_UNUSED_RESULT;
+
+bool spv_item_is_legacy_table (const struct spv_item *);
+char *spv_item_get_raw_legacy_data (const struct spv_item *item,
+                                    void **data, size_t *size)
+  WARN_UNUSED_RESULT;
+char *spv_item_get_legacy_data (const struct spv_item *, struct spv_data *)
+  WARN_UNUSED_RESULT;
+char *spv_item_get_legacy_table (const struct spv_item *, struct _xmlDoc **)
+  WARN_UNUSED_RESULT;
+
+struct fmt_spec spv_decode_fmt_spec (uint32_t u32);
+
+#endif /* output/spv/spv.h */
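
The SPV_CLASSES list in the header above is what is often called an "X-macro": one list is expanded several times with different definitions of SPV_CLASS, once to produce the enumerators and once to count them. The sketch below shows the same technique on a hypothetical `COLORS` list; `COLOR`, `enum color`, `N_COLORS`, and `color_to_string` are illustrative names and are not part of PSPP. The third expansion, which builds a name lookup, is only a guess at how a function such as spv_item_class_to_string() could be written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The single source of truth: one entry per enumerator. */
#define COLORS                  \
    COLOR(RED, "red")           \
    COLOR(GREEN, "green")       \
    COLOR(BLUE, "blue")

/* First expansion: one enumerator per entry. */
enum color
  {
#define COLOR(ENUM, NAME) COLOR_##ENUM,
    COLORS
#undef COLOR
  };

/* Second expansion: each entry becomes "+1", so the sum is the count. */
enum
  {
#define COLOR(ENUM, NAME) +1
    N_COLORS = COLORS
#undef COLOR
  };

/* Third expansion: one "case" per entry. */
static const char *
color_to_string (enum color c)
{
  switch (c)
    {
#define COLOR(ENUM, NAME) case COLOR_##ENUM: return NAME;
      COLORS
#undef COLOR
    }
  return NULL;
}
```

Adding an entry to the list updates the enumeration, the count, and the lookup together, which is the main reason to prefer this over three hand-maintained lists.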
diff --git a/src/output/spv/spvbin-helpers.c b/src/output/spv/spvbin-helpers.c
new file mode 100644
index 0000000..e405310
--- /dev/null
+++ b/src/output/spv/spvbin-helpers.c
@@ -0,0 +1,358 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spvbin-helpers.h"
+
+#include <inttypes.h>
+#include <string.h>
+
+#include "libpspp/float-format.h"
+#include "libpspp/integer-format.h"
+#include "libpspp/str.h"
+
+#include "gl/xmemdup0.h"
+
+void
+spvbin_input_init (struct spvbin_input *input, const void *data, size_t size)
+{
+  *input = (struct spvbin_input) { .data = data, .size = size };
+}
+
+bool
+spvbin_input_at_end (const struct spvbin_input *input)
+{
+  return input->ofs >= input->size;
+}
+
+char *
+spvbin_input_to_error (const struct spvbin_input *input, const char *name)
+{
+  struct string s = DS_EMPTY_INITIALIZER;
+  if (name)
+    ds_put_format (&s, "%s: ", name);
+  ds_put_cstr (&s, "parse error decoding ");
+  for (size_t i = input->n_errors; i-- > 0; )
+    if (i < SPVBIN_MAX_ERRORS)
+      ds_put_format (&s, "/%s@%#zx", input->errors[i].name,
+                     input->errors[i].start);
+  ds_put_format (&s, " near %#zx", input->error_ofs);
+  return ds_steal_cstr (&s);
+}
+
+\f
+bool
+spvbin_match_bytes (struct spvbin_input *input, const void *bytes, size_t n)
+{
+  if (input->size - input->ofs < n
+      || memcmp (&input->data[input->ofs], bytes, n))
+    return false;
+
+  input->ofs += n;
+  return true;
+}
+
+bool
+spvbin_match_byte (struct spvbin_input *input, uint8_t byte)
+{
+  return spvbin_match_bytes (input, &byte, 1);
+}
+
+bool
+spvbin_parse_bool (struct spvbin_input *input, bool *p)
+{
+  if (input->ofs >= input->size || input->data[input->ofs] > 1)
+    return false;
+  if (p)
+    *p = input->data[input->ofs];
+  input->ofs++;
+  return true;
+}
+
+static const void *
+spvbin_parse__ (struct spvbin_input *input, size_t n)
+{
+  if (input->size - input->ofs < n)
+    return NULL;
+
+  const void *src = &input->data[input->ofs];
+  input->ofs += n;
+  return src;
+}
+
+bool
+spvbin_parse_byte (struct spvbin_input *input, uint8_t *p)
+{
+  const void *src = spvbin_parse__ (input, sizeof *p);
+  if (src && p)
+    *p = *(const uint8_t *) src;
+  return src != NULL;
+}
+
+bool
+spvbin_parse_int16 (struct spvbin_input *input, uint16_t *p)
+{
+  const void *src = spvbin_parse__ (input, sizeof *p);
+  if (src && p)
+    *p = le_to_native16 (get_uint16 (src));
+  return src != NULL;
+}
+
+bool
+spvbin_parse_int32 (struct spvbin_input *input, uint32_t *p)
+{
+  const void *src = spvbin_parse__ (input, sizeof *p);
+  if (src && p)
+    *p = le_to_native32 (get_uint32 (src));
+  return src != NULL;
+}
+
+bool
+spvbin_parse_int64 (struct spvbin_input *input, uint64_t *p)
+{
+  const void *src = spvbin_parse__ (input, sizeof *p);
+  if (src && p)
+    *p = le_to_native64 (get_uint64 (src));
+  return src != NULL;
+}
+
+bool
+spvbin_parse_be16 (struct spvbin_input *input, uint16_t *p)
+{
+  const void *src = spvbin_parse__ (input, sizeof *p);
+  if (src && p)
+    *p = be_to_native16 (get_uint16 (src));
+  return src != NULL;
+}
+
+bool
+spvbin_parse_be32 (struct spvbin_input *input, uint32_t *p)
+{
+  const void *src = spvbin_parse__ (input, sizeof *p);
+  if (src && p)
+    *p = be_to_native32 (get_uint32 (src));
+  return src != NULL;
+}
+
+bool
+spvbin_parse_be64 (struct spvbin_input *input, uint64_t *p)
+{
+  const void *src = spvbin_parse__ (input, sizeof *p);
+  if (src && p)
+    *p = be_to_native64 (get_uint64 (src));
+  return src != NULL;
+}
+
+bool
+spvbin_parse_double (struct spvbin_input *input, double *p)
+{
+  const void *src = spvbin_parse__ (input, 8);
+  if (src && p)
+    *p = float_get_double (FLOAT_IEEE_DOUBLE_LE, src);
+  return src != NULL;
+}
+
+bool
+spvbin_parse_float (struct spvbin_input *input, double *p)
+{
+  const void *src = spvbin_parse__ (input, 4);
+  if (src && p)
+    *p = float_get_double (FLOAT_IEEE_SINGLE_LE, src);
+  return src != NULL;
+}
+
+static bool
+spvbin_parse_string__ (struct spvbin_input *input,
+                       uint32_t (*raw_to_native32) (uint32_t),
+                       char **p)
+{
+  if (p)
+    *p = NULL;
+  uint32_t length;
+  if (input->size - input->ofs < sizeof length)
+    return false;
+
+  const uint8_t *src = &input->data[input->ofs];
+  length = raw_to_native32 (get_uint32 (src));
+  if (input->size - input->ofs - sizeof length < length)
+    return false;
+
+  if (p)
+    *p = xmemdup0 (src + sizeof length, length);
+  input->ofs += sizeof length + length;
+  return true;
+}
+
+bool
+spvbin_parse_string (struct spvbin_input *input, char **p)
+{
+  return spvbin_parse_string__ (input, le_to_native32, p);
+}
+
+bool
+spvbin_parse_bestring (struct spvbin_input *input, char **p)
+{
+  return spvbin_parse_string__ (input, be_to_native32, p);
+}
+
+void
+spvbin_error (struct spvbin_input *input, const char *name, size_t start)
+{
+  if (!input->n_errors)
+    input->error_ofs = input->ofs;
+
+  /* We keep track of the error depth regardless of whether we can store all of
+     them.  The parser needs this to accurately save and restore error
+     state. */
+  if (input->n_errors < SPVBIN_MAX_ERRORS)
+    {
+      input->errors[input->n_errors].name = name;
+      input->errors[input->n_errors].start = start;
+    }
+  input->n_errors++;
+}
+\f
+void
+spvbin_print_header (const char *title, size_t start UNUSED, size_t len UNUSED, int indent)
+{
+  for (int i = 0; i < indent * 4; i++)
+    putchar (' ');
+  fputs (title, stdout);
+#if 0
+  if (start != SIZE_MAX)
+    printf (" (0x%zx, %zu)", start, len);
+#endif
+  fputs (": ", stdout);
+}
+
+void
+spvbin_print_presence (const char *title, int indent, bool present)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  puts (present ? "present" : "absent");
+}
+
+void
+spvbin_print_bool (const char *title, int indent, bool x)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  printf ("%s\n", x ? "true" : "false");
+}
+
+void
+spvbin_print_byte (const char *title, int indent, uint8_t x)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  printf ("%"PRIu8"\n", x);
+}
+
+void
+spvbin_print_int16 (const char *title, int indent, uint16_t x)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  printf ("%"PRIu16"\n", x);
+}
+
+void
+spvbin_print_int32 (const char *title, int indent, uint32_t x)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  printf ("%"PRIu32"\n", x);
+}
+
+void
+spvbin_print_int64 (const char *title, int indent, uint64_t x)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  printf ("%"PRIu64"\n", x);
+}
+
+void
+spvbin_print_double (const char *title, int indent, double x)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  printf ("%g\n", x);
+}
+
+void
+spvbin_print_string (const char *title, int indent, const char *s)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  if (s)
+    printf ("\"%s\"\n", s);
+  else
+    printf ("none\n");
+}
+
+void
+spvbin_print_case (const char *title, int indent, int x)
+{
+  spvbin_print_header (title, -1, -1, indent);
+  printf ("%d\n", x);
+}
+\f
+struct spvbin_position
+spvbin_position_save (const struct spvbin_input *input)
+{
+  struct spvbin_position pos = { input->ofs };
+  return pos;
+}
+
+void
+spvbin_position_restore (struct spvbin_position *pos,
+                         struct spvbin_input *input)
+{
+  input->ofs = pos->ofs;
+}
+\f
+static bool
+spvbin_limit_parse__ (struct spvbin_limit *limit, struct spvbin_input *input,
+                      uint32_t (*raw_to_native32) (uint32_t))
+{
+  limit->size = input->size;
+
+  uint32_t count;
+  if (input->size - input->ofs < sizeof count)
+    return false;
+
+  const uint8_t *src = &input->data[input->ofs];
+  count = raw_to_native32 (get_uint32 (src));
+  if (input->size - input->ofs - sizeof count < count)
+    return false;
+
+  input->ofs += sizeof count;
+  input->size = input->ofs + count;
+  return true;
+}
+
+bool
+spvbin_limit_parse (struct spvbin_limit *limit, struct spvbin_input *input)
+{
+  return spvbin_limit_parse__ (limit, input, le_to_native32);
+}
+
+bool
+spvbin_limit_parse_be (struct spvbin_limit *limit, struct spvbin_input *input)
+{
+  return spvbin_limit_parse__ (limit, input, be_to_native32);
+}
+
+void
+spvbin_limit_pop (struct spvbin_limit *limit, struct spvbin_input *input)
+{
+  input->size = limit->size;
+}
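
The bounds checks in spvbin_parse_string__() are written so that no subtraction can wrap around: the remaining byte count is compared against the 4-byte prefix first, and only then against the declared length. A minimal standalone sketch of that pattern follows, assuming a little-endian 32-bit length prefix. The names `struct input` and `parse_le_string` are hypothetical, and plain malloc() stands in for the xmemdup0() helper used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down version of struct spvbin_input. */
struct input
  {
    const uint8_t *data;
    size_t ofs;
    size_t size;
  };

/* Parses a 32-bit little-endian length followed by that many bytes.  On
   success, advances IN past the string and, if P is nonnull, stores a
   null-terminated copy in *P for the caller to free().  On failure, leaves
   IN unchanged and stores NULL in *P. */
static bool
parse_le_string (struct input *in, char **p)
{
  assert (in->ofs <= in->size);
  if (p)
    *p = NULL;

  /* Is there room for the length prefix? */
  if (in->size - in->ofs < 4)
    return false;

  const uint8_t *src = &in->data[in->ofs];
  uint32_t length = ((uint32_t) src[0]
                     | ((uint32_t) src[1] << 8)
                     | ((uint32_t) src[2] << 16)
                     | ((uint32_t) src[3] << 24));

  /* Is there room for the payload?  The subtraction cannot wrap because
     at least 4 bytes remain. */
  if (in->size - in->ofs - 4 < length)
    return false;

  if (p)
    {
      char *s = malloc ((size_t) length + 1);
      if (!s)
        return false;
      memcpy (s, src + 4, length);
      s[length] = '\0';
      *p = s;
    }
  in->ofs += 4 + (size_t) length;
  return true;
}
```

Passing NULL for P skips over the string without copying it, mirroring how the other spvbin_parse_*() functions treat a null output pointer.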
diff --git a/src/output/spv/spvbin-helpers.h b/src/output/spv/spvbin-helpers.h
new file mode 100644
index 0000000..1cb8e34
--- /dev/null
+++ b/src/output/spv/spvbin-helpers.h
@@ -0,0 +1,94 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef SPVBIN_HELPERS_H
+#define SPVBIN_HELPERS_H 1
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+
+struct spvbin_input
+  {
+    const uint8_t *data;
+    size_t ofs;
+    size_t size;
+    int version;
+
+#define SPVBIN_MAX_ERRORS 16
+    struct
+      {
+        const char *name;
+        size_t start;
+      }
+    errors[SPVBIN_MAX_ERRORS];
+    size_t n_errors;
+    size_t error_ofs;
+  };
+
+void spvbin_input_init (struct spvbin_input *, const void *, size_t);
+bool spvbin_input_at_end (const struct spvbin_input *);
+
+char *spvbin_input_to_error (const struct spvbin_input *, const char *name);
+
+bool spvbin_match_bytes (struct spvbin_input *, const void *, size_t);
+bool spvbin_match_byte (struct spvbin_input *, uint8_t);
+
+bool spvbin_parse_bool (struct spvbin_input *, bool *);
+bool spvbin_parse_byte (struct spvbin_input *, uint8_t *);
+bool spvbin_parse_int16 (struct spvbin_input *, uint16_t *);
+bool spvbin_parse_int32 (struct spvbin_input *, uint32_t *);
+bool spvbin_parse_int64 (struct spvbin_input *, uint64_t *);
+bool spvbin_parse_be16 (struct spvbin_input *, uint16_t *);
+bool spvbin_parse_be32 (struct spvbin_input *, uint32_t *);
+bool spvbin_parse_be64 (struct spvbin_input *, uint64_t *);
+bool spvbin_parse_double (struct spvbin_input *, double *);
+bool spvbin_parse_float (struct spvbin_input *, double *);
+bool spvbin_parse_string (struct spvbin_input *, char **);
+bool spvbin_parse_bestring (struct spvbin_input *, char **);
+
+void spvbin_error (struct spvbin_input *, const char *name, size_t start);
+
+void spvbin_print_header (const char *title, size_t start, size_t len,
+                          int indent);
+void spvbin_print_presence (const char *title, int indent, bool);
+void spvbin_print_bool (const char *title, int indent, bool);
+void spvbin_print_byte (const char *title, int indent, uint8_t);
+void spvbin_print_int16 (const char *title, int indent, uint16_t);
+void spvbin_print_int32 (const char *title, int indent, uint32_t);
+void spvbin_print_int64 (const char *title, int indent, uint64_t);
+void spvbin_print_double (const char *title, int indent, double);
+void spvbin_print_string (const char *title, int indent, const char *);
+void spvbin_print_case (const char *title, int indent, int);
+
+struct spvbin_position
+  {
+    size_t ofs;
+  };
+
+struct spvbin_position spvbin_position_save (const struct spvbin_input *);
+void spvbin_position_restore (struct spvbin_position *, struct spvbin_input *);
+
+struct spvbin_limit
+  {
+    size_t size;
+  };
+
+bool spvbin_limit_parse (struct spvbin_limit *, struct spvbin_input *);
+bool spvbin_limit_parse_be (struct spvbin_limit *, struct spvbin_input *);
+void spvbin_limit_pop (struct spvbin_limit *, struct spvbin_input *);
+
+#endif /* output/spv/spvbin-helpers.h */
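
The spvbin_limit functions declared above temporarily shrink the visible size of the input to a count-prefixed region and later restore it. The following sketch illustrates that save-and-restore idea on its own, assuming a little-endian 32-bit byte count. `region_input`, `region_limit`, `region_push`, and `region_pop` are hypothetical names, not the PSPP API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down input and saved-limit types. */
struct region_input
  {
    const uint8_t *data;
    size_t ofs;
    size_t size;
  };

struct region_limit
  {
    size_t size;                /* Size of the input before narrowing. */
  };

/* Reads a 32-bit little-endian byte count at the current offset and narrows
   IN so that it appears to end after that many bytes.  Saves the previous
   size in LIMIT.  Returns false, leaving IN unchanged, if the count does not
   fit in the remaining input. */
static bool
region_push (struct region_limit *limit, struct region_input *in)
{
  assert (in->ofs <= in->size);
  limit->size = in->size;

  if (in->size - in->ofs < 4)
    return false;

  const uint8_t *src = &in->data[in->ofs];
  uint32_t count = ((uint32_t) src[0]
                    | ((uint32_t) src[1] << 8)
                    | ((uint32_t) src[2] << 16)
                    | ((uint32_t) src[3] << 24));
  if (in->size - in->ofs - 4 < count)
    return false;

  in->ofs += 4;
  in->size = in->ofs + count;
  return true;
}

/* Restores the size that region_push() saved in LIMIT. */
static void
region_pop (const struct region_limit *limit, struct region_input *in)
{
  in->size = limit->size;
}
```

While the region is pushed, any bounds check against `size` stops at the end of the region, so a nested parser cannot read past the structure it was given.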
diff --git a/src/output/spv/spvxml-helpers.c b/src/output/spv/spvxml-helpers.c
new file mode 100644
index 0000000..f294316
--- /dev/null
+++ b/src/output/spv/spvxml-helpers.c
@@ -0,0 +1,873 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "output/spv/spvxml-helpers.h"
+
+#include <errno.h>
+#include <float.h>
+#include <string.h>
+
+#include "libpspp/cast.h"
+#include "libpspp/compiler.h"
+#include "libpspp/hash-functions.h"
+#include "libpspp/str.h"
+
+#include "gl/xvasprintf.h"
+
+char * WARN_UNUSED_RESULT
+spvxml_context_finish (struct spvxml_context *ctx, struct spvxml_node *root)
+{
+  if (!ctx->error)
+    root->class_->spvxml_node_collect_ids (ctx, root);
+  if (!ctx->error)
+    root->class_->spvxml_node_resolve_refs (ctx, root);
+
+  hmap_destroy (&ctx->id_map);
+
+  return ctx->error;
+}
+
+void
+spvxml_node_context_uninit (struct spvxml_node_context *nctx)
+{
+  for (struct spvxml_attribute *a = nctx->attrs;
+       a < &nctx->attrs[nctx->n_attrs]; a++)
+    free (a->value);
+}
+
+static const char *
+xml_element_type_to_string (xmlElementType type)
+{
+  switch (type)
+    {
+    case XML_ELEMENT_NODE: return "element";
+    case XML_ATTRIBUTE_NODE: return "attribute";
+    case XML_TEXT_NODE: return "text";
+    case XML_CDATA_SECTION_NODE: return "CDATA section";
+    case XML_ENTITY_REF_NODE: return "entity reference";
+    case XML_ENTITY_NODE: return "entity";
+    case XML_PI_NODE: return "PI";
+    case XML_COMMENT_NODE: return "comment";
+    case XML_DOCUMENT_NODE: return "document";
+    case XML_DOCUMENT_TYPE_NODE: return "document type";
+    case XML_DOCUMENT_FRAG_NODE: return "document fragment";
+    case XML_NOTATION_NODE: return "notation";
+    case XML_HTML_DOCUMENT_NODE: return "HTML document";
+    case XML_DTD_NODE: return "DTD";
+    case XML_ELEMENT_DECL: return "element declaration";
+    case XML_ATTRIBUTE_DECL: return "attribute declaration";
+    case XML_ENTITY_DECL: return "entity declaration";
+    case XML_NAMESPACE_DECL: return "namespace declaration";
+    case XML_XINCLUDE_START: return "XInclude start";
+    case XML_XINCLUDE_END: return "XInclude end";
+    case XML_DOCB_DOCUMENT_NODE: return "docb document";
+    default: return "<error>";
+    }
+}
+
+static void
+spvxml_format_node_path (const xmlNode *node, struct string *s)
+{
+  enum { MAX_STACK = 32 };
+  const xmlNode *stack[MAX_STACK];
+  size_t n = 0;
+
+  while (node != NULL && node->type != XML_DOCUMENT_NODE && n < MAX_STACK)
+    {
+      stack[n++] = node;
+      node = node->parent;
+    }
+
+  while (n > 0)
+    {
+      node = stack[--n];
+      ds_put_byte (s, '/');
+      if (node->name)
+        ds_put_cstr (s, CHAR_CAST (char *, node->name));
+      if (node->type == XML_ELEMENT_NODE)
+        {
+          if (node->parent)
+            {
+              size_t total = 1;
+              size_t index = 1;
+              for (const xmlNode *sibling = node->parent->children;
+                   sibling; sibling = sibling->next)
+                {
+                  if (sibling == node)
+                    index = total;
+                  else if (sibling->type == XML_ELEMENT_NODE
+                           && !strcmp (CHAR_CAST (char *, sibling->name),
+                                       CHAR_CAST (char *, node->name)))
+                    total++;
+                }
+              if (total > 1)
+                ds_put_format (s, "[%zu]", index);
+            }
+        }
+      else
+        ds_put_format (s, "(%s)", xml_element_type_to_string (node->type));
+    }
+}
+
+static struct spvxml_node *
+spvxml_node_find (struct spvxml_context *ctx, const char *name,
+                  unsigned int hash)
+{
+  struct spvxml_node *node;
+  HMAP_FOR_EACH_WITH_HASH (node, struct spvxml_node, id_node, hash,
+                           &ctx->id_map)
+    if (!strcmp (node->id, name))
+      return node;
+
+  return NULL;
+}
+
+void
+spvxml_node_collect_id (struct spvxml_context *ctx, struct spvxml_node *node)
+{
+  if (!node->id)
+    return;
+
+  unsigned int hash = hash_string (node->id, 0);
+  struct spvxml_node *other = spvxml_node_find (ctx, node->id, hash);
+  if (other)
+    {
+      if (!ctx->error)
+        {
+          struct string node_path = DS_EMPTY_INITIALIZER;
+          spvxml_format_node_path (node->raw, &node_path);
+
+          struct string other_path = DS_EMPTY_INITIALIZER;
+          spvxml_format_node_path (other->raw, &other_path);
+
+          ctx->error = xasprintf ("Nodes %s and %s both have ID \"%s\".",
+                                  ds_cstr (&node_path),
+                                  ds_cstr (&other_path), node->id);
+
+          ds_destroy (&node_path);
+          ds_destroy (&other_path);
+        }
+
+      return;
+    }
+
+  hmap_insert (&ctx->id_map, &node->id_node, hash);
+}
+
+struct spvxml_node *
+spvxml_node_resolve_ref (struct spvxml_context *ctx,
+                         const xmlNode *src, const char *attr_name,
+                         const struct spvxml_node_class *const *classes,
+                         size_t n)
+{
+  char *dst_id = CHAR_CAST (
+    char *, xmlGetProp (CONST_CAST (xmlNode *, src),
+                        CHAR_CAST (xmlChar *, attr_name)));
+  if (!dst_id)
+    return NULL;
+
+  struct spvxml_node *dst = spvxml_node_find (ctx, dst_id,
+                                              hash_string (dst_id, 0));
+  if (!dst)
+    {
+      struct string node_path = DS_EMPTY_INITIALIZER;
+      spvxml_format_node_path (src, &node_path);
+
+      if (!ctx->error)
+        ctx->error = xasprintf (
+          "%s: Attribute %s has unknown target ID \"%s\".",
+          ds_cstr (&node_path), attr_name, dst_id);
+      ds_destroy (&node_path);
+      free (dst_id);
+      return NULL;
+    }
+
+  if (!n)
+    {
+      free (dst_id);
+      return dst;
+    }
+  for (size_t i = 0; i < n; i++)
+    if (classes[i] == dst->class_)
+      {
+        free (dst_id);
+        return dst;
+      }
+
+  if (!ctx->error)
+    {
+      struct string s = DS_EMPTY_INITIALIZER;
+      spvxml_format_node_path (src, &s);
+
+      ds_put_format (&s, ": Attribute \"%s\" should refer to a \"%s\"",
+                     attr_name, classes[0]->name);
+      if (n == 2)
+        ds_put_format (&s, " or \"%s\"", classes[1]->name);
+      else if (n > 2)
+        {
+          for (size_t i = 1; i < n - 1; i++)
+            ds_put_format (&s, ", \"%s\"", classes[i]->name);
+          ds_put_format (&s, ", or \"%s\"", classes[n - 1]->name);
+        }
+      ds_put_format (&s, " element, but its target ID \"%s\" "
+                     "actually refers to a \"%s\" element.",
+                     dst_id, dst->class_->name);
+
+      ctx->error = ds_steal_cstr (&s);
+    }
+
+  free (dst_id);
+  return NULL;
+}
+
+void PRINTF_FORMAT (2, 3)
+spvxml_attr_error (struct spvxml_node_context *nctx, const char *format, ...)
+{
+  if (nctx->up->error)
+    return;
+
+  struct string s = DS_EMPTY_INITIALIZER;
+  ds_put_cstr (&s, "error parsing attributes of ");
+  spvxml_format_node_path (nctx->parent, &s);
+
+  va_list args;
+  va_start (args, format);
+  ds_put_cstr (&s, ": ");
+  ds_put_vformat (&s, format, args);
+  va_end (args);
+
+  nctx->up->error = ds_steal_cstr (&s);
+}
+
+/* xmlGetPropNodeValueInternal() is from tree.c in libxml2 2.9.4+dfsg1, which
+   is covered by the following copyright and license:
+
+   Except where otherwise noted in the source code (e.g. the files hash.c,
+   list.c and the trio files, which are covered by a similar licence but with
+   different Copyright notices) all the files are:
+
+   Copyright (C) 1998-2012 Daniel Veillard.  All Rights Reserved.
+
+   Permission is hereby granted, free of charge, to any person obtaining a copy
+   of this software and associated documentation files (the "Software"), to
+   deal in the Software without restriction, including without limitation the
+   rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+   sell copies of the Software, and to permit persons to whom the Software is
+   furnished to do so, subject to the following conditions:
+
+   The above copyright notice and this permission notice shall be included in
+   all copies or substantial portions of the Software.
+
+   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+   IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+   FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+   THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+   LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+   FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+   IN THE SOFTWARE.
+*/
+static xmlChar*
+xmlGetPropNodeValueInternal(const xmlAttr *prop)
+{
+    if (prop == NULL)
+        return(NULL);
+    if (prop->type == XML_ATTRIBUTE_NODE) {
+        /*
+        * Note that we return at least the empty string.
+        *   TODO: Do we really always want that?
+        */
+        if (prop->children != NULL) {
+            if ((prop->children->next == NULL) &&
+                ((prop->children->type == XML_TEXT_NODE) ||
+                (prop->children->type == XML_CDATA_SECTION_NODE)))
+            {
+                /*
+                * Optimization for the common case: only 1 text node.
+                */
+                return(xmlStrdup(prop->children->content));
+            } else {
+                xmlChar *ret;
+
+                ret = xmlNodeListGetString(prop->doc, prop->children, 1);
+                if (ret != NULL)
+                    return(ret);
+            }
+        }
+        return(xmlStrdup((xmlChar *)""));
+    } else if (prop->type == XML_ATTRIBUTE_DECL) {
+        return(xmlStrdup(((xmlAttributePtr)prop)->defaultValue));
+    }
+    return(NULL);
+}
+
+static struct spvxml_attribute *
+find_attribute (struct spvxml_node_context *nctx, const char *name)
+{
+  /* XXX This is linear search but we could use binary search. */
+  for (struct spvxml_attribute *a = nctx->attrs;
+       a < &nctx->attrs[nctx->n_attrs]; a++)
+    if (!strcmp (a->name, name))
+      return a;
+
+  return NULL;
+}
+
+static void
+format_attribute (struct string *s, const xmlAttr *attr)
+{
+  const char *name = CHAR_CAST (char *, attr->name);
+  char *value = CHAR_CAST (char *, xmlGetPropNodeValueInternal (attr));
+  ds_put_format (s, "%s=\"%s\"", name, value);
+  free (value);
+}
+
+void
+spvxml_parse_attributes (struct spvxml_node_context *nctx)
+{
+  for (const xmlAttr *node = nctx->parent->properties; node; node = node->next)
+    {
+      const char *node_name = CHAR_CAST (char *, node->name);
+      struct spvxml_attribute *a = find_attribute (nctx, node_name);
+      if (!a)
+        {
+          if (!strcmp (node_name, "id"))
+            continue;
+
+          struct string unexpected = DS_EMPTY_INITIALIZER;
+          format_attribute (&unexpected, node);
+          int n = 1;
+
+          for (node = node->next; node; node = node->next)
+            {
+              node_name = CHAR_CAST (char *, node->name);
+              if (!find_attribute (nctx, node_name)
+                  && strcmp (node_name, "id"))
+                {
+                  ds_put_byte (&unexpected, ' ');
+                  format_attribute (&unexpected, node);
+                  n++;
+                }
+            }
+
+          spvxml_attr_error (nctx, "Node has unexpected attribute%s: %s",
+                             n > 1 ? "s" : "", ds_cstr (&unexpected));
+          ds_destroy (&unexpected);
+          return;
+        }
+      if (a->value)
+        {
+          spvxml_attr_error (nctx, "Duplicate attribute \"%s\".", a->name);
+          return;
+        }
+      a->value = CHAR_CAST (char *, xmlGetPropNodeValueInternal (node));
+    }
+
+  for (struct spvxml_attribute *a = nctx->attrs;
+       a < &nctx->attrs[nctx->n_attrs]; a++)
+    if (a->required && !a->value)
+      {
+        spvxml_attr_error (nctx, "Missing required attribute \"%s\".",
+                           a->name);
+        return;
+      }
+}
+
+int
+spvxml_attr_parse_enum (struct spvxml_node_context *nctx,
+                        const struct spvxml_attribute *a,
+                        const struct spvxml_enum enums[])
+{
+  if (!a->value)
+    return 0;
+
+  for (const struct spvxml_enum *e = enums; e->name; e++)
+    if (!strcmp (a->value, e->name))
+      return e->value;
+
+  for (const struct spvxml_enum *e = enums; e->name; e++)
+    if (!strcmp (e->name, "OTHER"))
+      return e->value;
+
+  spvxml_attr_error (nctx, "Attribute %s has unexpected value \"%s\".",
+                     a->name, a->value);
+  return 0;
+}
+
+int
+spvxml_attr_parse_bool (struct spvxml_node_context *nctx,
+                        const struct spvxml_attribute *a)
+{
+  static const struct spvxml_enum bool_enums[] = {
+    { "true", 1 },
+    { "false", 0 },
+    { NULL, 0 },
+  };
+
+  return !a->value ? -1 : spvxml_attr_parse_enum (nctx, a, bool_enums);
+}
+
+bool
+spvxml_attr_parse_fixed (struct spvxml_node_context *nctx,
+                         const struct spvxml_attribute *a,
+                         const char *attr_value)
+{
+  const struct spvxml_enum fixed_enums[] = {
+    { attr_value, true },
+    { NULL, 0 },
+  };
+
+  return spvxml_attr_parse_enum (nctx, a, fixed_enums);
+}
+
+int
+spvxml_attr_parse_int (struct spvxml_node_context *nctx,
+                       const struct spvxml_attribute *a)
+{
+  if (!a->value)
+    return INT_MIN;
+
+  char *tail = NULL;
+  int save_errno = errno;
+  errno = 0;
+  long int integer = strtol (a->value, &tail, 10);
+  if (errno || *tail || integer <= INT_MIN || integer > INT_MAX)
+    {
+      spvxml_attr_error (nctx, "Attribute %s has unexpected value "
+                         "\"%s\" expecting small integer.", a->name, a->value);
+      integer = INT_MIN;
+    }
+  errno = save_errno;
+
+  return integer;
+}
+
+static int
+lookup_color_name (const char *s)
+{
+  struct color
+    {
+      struct hmap_node hmap_node;
+      const char *name;
+      int code;
+    };
+
+  static struct color colors[] =
+    {
+      { .name = "aliceblue", .code = 0xf0f8ff },
+      { .name = "antiquewhite", .code = 0xfaebd7 },
+      { .name = "aqua", .code = 0x00ffff },
+      { .name = "aquamarine", .code = 0x7fffd4 },
+      { .name = "azure", .code = 0xf0ffff },
+      { .name = "beige", .code = 0xf5f5dc },
+      { .name = "bisque", .code = 0xffe4c4 },
+      { .name = "black", .code = 0x000000 },
+      { .name = "blanchedalmond", .code = 0xffebcd },
+      { .name = "blue", .code = 0x0000ff },
+      { .name = "blueviolet", .code = 0x8a2be2 },
+      { .name = "brown", .code = 0xa52a2a },
+      { .name = "burlywood", .code = 0xdeb887 },
+      { .name = "cadetblue", .code = 0x5f9ea0 },
+      { .name = "chartreuse", .code = 0x7fff00 },
+      { .name = "chocolate", .code = 0xd2691e },
+      { .name = "coral", .code = 0xff7f50 },
+      { .name = "cornflowerblue", .code = 0x6495ed },
+      { .name = "cornsilk", .code = 0xfff8dc },
+      { .name = "crimson", .code = 0xdc143c },
+      { .name = "cyan", .code = 0x00ffff },
+      { .name = "darkblue", .code = 0x00008b },
+      { .name = "darkcyan", .code = 0x008b8b },
+      { .name = "darkgoldenrod", .code = 0xb8860b },
+      { .name = "darkgray", .code = 0xa9a9a9 },
+      { .name = "darkgreen", .code = 0x006400 },
+      { .name = "darkgrey", .code = 0xa9a9a9 },
+      { .name = "darkkhaki", .code = 0xbdb76b },
+      { .name = "darkmagenta", .code = 0x8b008b },
+      { .name = "darkolivegreen", .code = 0x556b2f },
+      { .name = "darkorange", .code = 0xff8c00 },
+      { .name = "darkorchid", .code = 0x9932cc },
+      { .name = "darkred", .code = 0x8b0000 },
+      { .name = "darksalmon", .code = 0xe9967a },
+      { .name = "darkseagreen", .code = 0x8fbc8f },
+      { .name = "darkslateblue", .code = 0x483d8b },
+      { .name = "darkslategray", .code = 0x2f4f4f },
+      { .name = "darkslategrey", .code = 0x2f4f4f },
+      { .name = "darkturquoise", .code = 0x00ced1 },
+      { .name = "darkviolet", .code = 0x9400d3 },
+      { .name = "deeppink", .code = 0xff1493 },
+      { .name = "deepskyblue", .code = 0x00bfff },
+      { .name = "dimgray", .code = 0x696969 },
+      { .name = "dimgrey", .code = 0x696969 },
+      { .name = "dodgerblue", .code = 0x1e90ff },
+      { .name = "firebrick", .code = 0xb22222 },
+      { .name = "floralwhite", .code = 0xfffaf0 },
+      { .name = "forestgreen", .code = 0x228b22 },
+      { .name = "fuchsia", .code = 0xff00ff },
+      { .name = "gainsboro", .code = 0xdcdcdc },
+      { .name = "ghostwhite", .code = 0xf8f8ff },
+      { .name = "gold", .code = 0xffd700 },
+      { .name = "goldenrod", .code = 0xdaa520 },
+      { .name = "gray", .code = 0x808080 },
+      { .name = "green", .code = 0x008000 },
+      { .name = "greenyellow", .code = 0xadff2f },
+      { .name = "grey", .code = 0x808080 },
+      { .name = "honeydew", .code = 0xf0fff0 },
+      { .name = "hotpink", .code = 0xff69b4 },
+      { .name = "indianred", .code = 0xcd5c5c },
+      { .name = "indigo", .code = 0x4b0082 },
+      { .name = "ivory", .code = 0xfffff0 },
+      { .name = "khaki", .code = 0xf0e68c },
+      { .name = "lavender", .code = 0xe6e6fa },
+      { .name = "lavenderblush", .code = 0xfff0f5 },
+      { .name = "lawngreen", .code = 0x7cfc00 },
+      { .name = "lemonchiffon", .code = 0xfffacd },
+      { .name = "lightblue", .code = 0xadd8e6 },
+      { .name = "lightcoral", .code = 0xf08080 },
+      { .name = "lightcyan", .code = 0xe0ffff },
+      { .name = "lightgoldenrodyellow", .code = 0xfafad2 },
+      { .name = "lightgray", .code = 0xd3d3d3 },
+      { .name = "lightgreen", .code = 0x90ee90 },
+      { .name = "lightgrey", .code = 0xd3d3d3 },
+      { .name = "lightpink", .code = 0xffb6c1 },
+      { .name = "lightsalmon", .code = 0xffa07a },
+      { .name = "lightseagreen", .code = 0x20b2aa },
+      { .name = "lightskyblue", .code = 0x87cefa },
+      { .name = "lightslategray", .code = 0x778899 },
+      { .name = "lightslategrey", .code = 0x778899 },
+      { .name = "lightsteelblue", .code = 0xb0c4de },
+      { .name = "lightyellow", .code = 0xffffe0 },
+      { .name = "lime", .code = 0x00ff00 },
+      { .name = "limegreen", .code = 0x32cd32 },
+      { .name = "linen", .code = 0xfaf0e6 },
+      { .name = "magenta", .code = 0xff00ff },
+      { .name = "maroon", .code = 0x800000 },
+      { .name = "mediumaquamarine", .code = 0x66cdaa },
+      { .name = "mediumblue", .code = 0x0000cd },
+      { .name = "mediumorchid", .code = 0xba55d3 },
+      { .name = "mediumpurple", .code = 0x9370db },
+      { .name = "mediumseagreen", .code = 0x3cb371 },
+      { .name = "mediumslateblue", .code = 0x7b68ee },
+      { .name = "mediumspringgreen", .code = 0x00fa9a },
+      { .name = "mediumturquoise", .code = 0x48d1cc },
+      { .name = "mediumvioletred", .code = 0xc71585 },
+      { .name = "midnightblue", .code = 0x191970 },
+      { .name = "mintcream", .code = 0xf5fffa },
+      { .name = "mistyrose", .code = 0xffe4e1 },
+      { .name = "moccasin", .code = 0xffe4b5 },
+      { .name = "navajowhite", .code = 0xffdead },
+      { .name = "navy", .code = 0x000080 },
+      { .name = "oldlace", .code = 0xfdf5e6 },
+      { .name = "olive", .code = 0x808000 },
+      { .name = "olivedrab", .code = 0x6b8e23 },
+      { .name = "orange", .code = 0xffa500 },
+      { .name = "orangered", .code = 0xff4500 },
+      { .name = "orchid", .code = 0xda70d6 },
+      { .name = "palegoldenrod", .code = 0xeee8aa },
+      { .name = "palegreen", .code = 0x98fb98 },
+      { .name = "paleturquoise", .code = 0xafeeee },
+      { .name = "palevioletred", .code = 0xdb7093 },
+      { .name = "papayawhip", .code = 0xffefd5 },
+      { .name = "peachpuff", .code = 0xffdab9 },
+      { .name = "peru", .code = 0xcd853f },
+      { .name = "pink", .code = 0xffc0cb },
+      { .name = "plum", .code = 0xdda0dd },
+      { .name = "powderblue", .code = 0xb0e0e6 },
+      { .name = "purple", .code = 0x800080 },
+      { .name = "red", .code = 0xff0000 },
+      { .name = "rosybrown", .code = 0xbc8f8f },
+      { .name = "royalblue", .code = 0x4169e1 },
+      { .name = "saddlebrown", .code = 0x8b4513 },
+      { .name = "salmon", .code = 0xfa8072 },
+      { .name = "sandybrown", .code = 0xf4a460 },
+      { .name = "seagreen", .code = 0x2e8b57 },
+      { .name = "seashell", .code = 0xfff5ee },
+      { .name = "sienna", .code = 0xa0522d },
+      { .name = "silver", .code = 0xc0c0c0 },
+      { .name = "skyblue", .code = 0x87ceeb },
+      { .name = "slateblue", .code = 0x6a5acd },
+      { .name = "slategray", .code = 0x708090 },
+      { .name = "slategrey", .code = 0x708090 },
+      { .name = "snow", .code = 0xfffafa },
+      { .name = "springgreen", .code = 0x00ff7f },
+      { .name = "steelblue", .code = 0x4682b4 },
+      { .name = "tan", .code = 0xd2b48c },
+      { .name = "teal", .code = 0x008080 },
+      { .name = "thistle", .code = 0xd8bfd8 },
+      { .name = "tomato", .code = 0xff6347 },
+      { .name = "turquoise", .code = 0x40e0d0 },
+      { .name = "violet", .code = 0xee82ee },
+      { .name = "wheat", .code = 0xf5deb3 },
+      { .name = "white", .code = 0xffffff },
+      { .name = "whitesmoke", .code = 0xf5f5f5 },
+      { .name = "yellow", .code = 0xffff00 },
+      { .name = "yellowgreen", .code = 0x9acd32 },
+    };
+
+  static struct hmap color_table = HMAP_INITIALIZER (color_table);
+
+  if (hmap_is_empty (&color_table))
+    for (size_t i = 0; i < sizeof colors / sizeof *colors; i++)
+      hmap_insert (&color_table, &colors[i].hmap_node,
+                   hash_string (colors[i].name, 0));
+
+  const struct color *color;
+  HMAP_FOR_EACH_WITH_HASH (color, struct color, hmap_node,
+                           hash_string (s, 0), &color_table)
+    if (!strcmp (color->name, s))
+      return color->code;
+  return -1;
+}
+
+int
+spvxml_attr_parse_color (struct spvxml_node_context *nctx,
+                         const struct spvxml_attribute *a)
+{
+  if (!a->value || !strcmp (a->value, "transparent"))
+    return -1;
+
+  unsigned int r, g, b;
+  if (sscanf (a->value, "#%2x%2x%2x", &r, &g, &b) == 3
+      || sscanf (a->value, "%2x%2x%2x", &r, &g, &b) == 3)
+    return (r << 16) | (g << 8) | b;
+
+  int code = lookup_color_name (a->value);
+  if (code >= 0)
+    return code;
+
+  spvxml_attr_error (nctx, "Attribute %s has unexpected value "
+                     "\"%s\" expecting #rrggbb or rrggbb or web color name.",
+                     a->name, a->value);
+  return 0;
+}
+
+static bool
+try_strtod (char *s, char **tail, double *real)
+{
+  char *comma = strchr (s, ',');
+  if (comma)
+    *comma = '.';
+
+  int save_errno = errno;
+  errno = 0;
+  *tail = NULL;
+  *real = strtod (s, tail);
+  bool ok = errno == 0;
+  errno = save_errno;
+
+  if (!ok)
+    *real = DBL_MAX;
+  return ok;
+}
+
+double
+spvxml_attr_parse_real (struct spvxml_node_context *nctx,
+                        const struct spvxml_attribute *a)
+{
+  if (!a->value)
+    return DBL_MAX;
+
+  char *tail;
+  double real;
+  if (!try_strtod (a->value, &tail, &real) || *tail)
+    spvxml_attr_error (nctx, "Attribute %s has unexpected value "
+                       "\"%s\" expecting real number.", a->name, a->value);
+
+  return real;
+}
+
+double
+spvxml_attr_parse_dimension (struct spvxml_node_context *nctx,
+                             const struct spvxml_attribute *a)
+{
+  if (!a->value)
+    return DBL_MAX;
+
+  char *tail;
+  double real;
+  if (!try_strtod (a->value, &tail, &real))
+    goto error;
+
+  tail += strspn (tail, " \t\r\n");
+
+  struct unit
+    {
+      const char *name;
+      double divisor;
+    };
+  static const struct unit units[] = {
+    /* Inches. */
+    { "in", 1.0 },
+    { "인치", 1.0 },
+
+    /* Device-independent pixels. */
+    { "px", 96.0 },
+
+    /* Points. */
+    { "pt", 72.0 },
+    { "пт", 72.0 },
+    { "", 72.0 },
+
+    /* Centimeters. */
+    { "cm", 2.54 },
+    { "см", 2.54 },
+  };
+
+  for (size_t i = 0; i < sizeof units / sizeof *units; i++)
+    if (!strcmp (units[i].name, tail))
+      return real / units[i].divisor;
+  goto error;
+
+error:
+  spvxml_attr_error (nctx, "Attribute %s has unexpected value "
+                     "\"%s\" expecting dimension.", a->name, a->value);
+  return DBL_MAX;
+}
+
+struct spvxml_node *
+spvxml_attr_parse_ref (struct spvxml_node_context *nctx UNUSED,
+                       const struct spvxml_attribute *a UNUSED)
+{
+  return NULL;
+}
+\f
+void PRINTF_FORMAT (3, 4)
+spvxml_content_error (struct spvxml_node_context *nctx, const xmlNode *node,
+                      const char *format, ...)
+{
+  if (nctx->up->error)
+    return;
+
+  struct string s = DS_EMPTY_INITIALIZER;
+
+  ds_put_cstr (&s, "error parsing content of ");
+  spvxml_format_node_path (nctx->parent, &s);
+
+  if (node)
+    {
+      ds_put_format (&s, " at %s", xml_element_type_to_string (node->type));
+      if (node->name)
+        ds_put_format (&s, " \"%s\"", node->name);
+    }
+  else
+    ds_put_format (&s, " at end of content");
+
+  va_list args;
+  va_start (args, format);
+  ds_put_cstr (&s, ": ");
+  ds_put_vformat (&s, format, args);
+  va_end (args);
+
+  //puts (ds_cstr (&s));
+
+  nctx->up->error = ds_steal_cstr (&s);
+}
+
+bool
+spvxml_content_parse_element (struct spvxml_node_context *nctx,
+                              xmlNode **nodep,
+                              const char *elem_name, xmlNode **outp)
+{
+  xmlNode *node = *nodep;
+  while (node)
+    {
+      if (node->type == XML_ELEMENT_NODE
+          && (!strcmp (CHAR_CAST (char *, node->name), elem_name)
+              || !strcmp (elem_name, "any")))
+        {
+          *outp = node;
+          *nodep = node->next;
+          return true;
+        }
+      else if (node->type != XML_COMMENT_NODE)
+        break;
+
+      node = node->next;
+    }
+
+  spvxml_content_error (nctx, node, "\"%s\" element expected.", elem_name);
+  *outp = NULL;
+  return false;
+}
+
+bool
+spvxml_content_parse_text (struct spvxml_node_context *nctx UNUSED,
+                           xmlNode **nodep, char **textp)
+{
+  struct string text = DS_EMPTY_INITIALIZER;
+
+  xmlNode *node = *nodep;
+  while (node)
+    {
+      if (node->type == XML_TEXT_NODE || node->type == XML_CDATA_SECTION_NODE)
+        {
+          char *segment = CHAR_CAST (char *, xmlNodeGetContent (node));
+          if (!text.ss.string)
+            {
+              text.ss = ss_cstr (segment);
+              text.capacity = text.ss.length;
+            }
+          else
+            {
+              ds_put_cstr (&text, segment);
+              free (segment);
+            }
+        }
+      else if (node->type != XML_COMMENT_NODE)
+        break;
+
+      node = node->next;
+    }
+  *nodep = node;
+
+  *textp = ds_steal_cstr (&text);
+
+  return true;
+}
+
+bool
+spvxml_content_parse_end (struct spvxml_node_context *nctx, xmlNode *node)
+{
+  for (;;)
+    {
+      if (!node)
+        return true;
+      else if (node->type != XML_COMMENT_NODE)
+        break;
+
+      node = node->next;
+    }
+
+  struct string s = DS_EMPTY_INITIALIZER;
+  const xmlNode *extra = node;
+  for (int i = 0; i < 4 && extra; i++, extra = extra->next)
+    {
+      if (i)
+        ds_put_cstr (&s, ", ");
+      ds_put_cstr (&s, xml_element_type_to_string (extra->type));
+      if (extra->name)
+        ds_put_format (&s, " \"%s\"", extra->name);
+    }
+  if (extra)
+    ds_put_format (&s, ", ...");
+
+  spvxml_content_error (nctx, node, "Extra content found expecting end: %s",
+                        ds_cstr (&s));
+  ds_destroy (&s);
+
+  return false;
+}
+
diff --git a/src/output/spv/spvxml-helpers.h b/src/output/spv/spvxml-helpers.h
new file mode 100644
index 0000000..16f7335
--- /dev/null
+++ b/src/output/spv/spvxml-helpers.h
@@ -0,0 +1,126 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef SPVXML_HELPERS_H
+#define SPVXML_HELPERS_H 1
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <libxml/xmlreader.h>
+#include "libpspp/compiler.h"
+#include "libpspp/hmap.h"
+
+struct spvxml_node;
+
+struct spvxml_context
+  {
+    struct hmap id_map;
+
+    char *error;
+    bool hard_error;
+  };
+
+#define SPVXML_CONTEXT_INIT(CONTEXT) \
+  { HMAP_INITIALIZER ((CONTEXT).id_map), NULL, false }
+
+char *spvxml_context_finish (struct spvxml_context *, struct spvxml_node *root)
+  WARN_UNUSED_RESULT;
+
+struct spvxml_node_context
+  {
+    struct spvxml_context *up;
+    const xmlNode *parent;
+
+    struct spvxml_attribute *attrs;
+    size_t n_attrs;
+  };
+
+void spvxml_node_context_uninit (struct spvxml_node_context *);
+
+struct spvxml_node_class
+  {
+    const char *name;
+    void (*spvxml_node_free) (struct spvxml_node *);
+    void (*spvxml_node_collect_ids) (struct spvxml_context *,
+                                     struct spvxml_node *);
+    void (*spvxml_node_resolve_refs) (struct spvxml_context *,
+                                      struct spvxml_node *);
+  };
+
+struct spvxml_node
+  {
+    struct hmap_node id_node;
+    char *id;
+
+    const struct spvxml_node_class *class_;
+    const xmlNode *raw;
+  };
+
+void spvxml_node_collect_id (struct spvxml_context *, struct spvxml_node *);
+struct spvxml_node *spvxml_node_resolve_ref (
+  struct spvxml_context *, const xmlNode *, const char *attr_name,
+  const struct spvxml_node_class *const *, size_t n);
+
+/* Attribute parsing. */
+struct spvxml_attribute
+  {
+    const char *name;
+    bool required;
+    char *value;
+  };
+
+void spvxml_parse_attributes (struct spvxml_node_context *);
+void spvxml_attr_error (struct spvxml_node_context *, const char *format, ...)
+  PRINTF_FORMAT (2, 3);
+
+struct spvxml_enum
+  {
+    const char *name;
+    int value;
+  };
+
+int spvxml_attr_parse_enum (struct spvxml_node_context *,
+                            const struct spvxml_attribute *,
+                            const struct spvxml_enum[]);
+int spvxml_attr_parse_bool (struct spvxml_node_context *,
+                            const struct spvxml_attribute *);
+bool spvxml_attr_parse_fixed (struct spvxml_node_context *,
+                             const struct spvxml_attribute *,
+                             const char *attr_value);
+int spvxml_attr_parse_int (struct spvxml_node_context *,
+                           const struct spvxml_attribute *);
+int spvxml_attr_parse_color (struct spvxml_node_context *,
+                             const struct spvxml_attribute *);
+double spvxml_attr_parse_real (struct spvxml_node_context *,
+                               const struct spvxml_attribute *);
+double spvxml_attr_parse_dimension (struct spvxml_node_context *,
+                                    const struct spvxml_attribute *);
+struct spvxml_node *spvxml_attr_parse_ref (struct spvxml_node_context *,
+                                           const struct spvxml_attribute *);
+\f
+/* Content parsing. */
+
+void spvxml_content_error (struct spvxml_node_context *, const xmlNode *,
+                           const char *format, ...)
+  PRINTF_FORMAT (3, 4);
+bool spvxml_content_parse_element (struct spvxml_node_context *, xmlNode **,
+                                   const char *elem_name, xmlNode **);
+bool spvxml_content_parse_text (struct spvxml_node_context *, xmlNode **,
+                                char **textp);
+void spvxml_content_parse_etc (xmlNode **);
+bool spvxml_content_parse_end (struct spvxml_node_context *, xmlNode *);
+
+#endif /* output/spv/spvxml-helpers.h */
diff --git a/src/output/spv/structure-xml.grammar b/src/output/spv/structure-xml.grammar
new file mode 100644
index 0000000..0ca2d05
--- /dev/null
+++ b/src/output/spv/structure-xml.grammar
@@ -0,0 +1,166 @@
+heading[root_heading]
+   :creator-version?
+   :creator?
+   :creation-date-time?
+   :lockReader=bool?
+   :schemaLocation?
+=> label pageSetup? (container | heading)*
+
+heading
+   :creator-version?
+   :commandName?
+   :visibility[heading_visibility]=(collapsed)?
+   :locale?
+   :olang?
+=> label (container | heading)*
+
+label => TEXT
+
+container
+   :visibility=(visible | hidden)
+   :page-break-before=(always)?
+   :text-align=(left | center)?
+   :width=dimension
+=> label (table | container_text | graph | model | object | image)
+
+text[container_text]
+  :type[text_type]=(title | log | text | page-title)
+  :commandName?
+  :creator-version?
+=> html
+
+html :lang=(en) => TEXT
+
+table
+   :VDPId?
+   :ViZmlSource?
+   :activePageId=int?
+   :commandName
+   :creator-version?
+   :displayFiltering=bool?
+   :maxNumCells=int?
+   :orphanTolerance=int?
+   :rowBreakNumber=int?
+   :subType
+   :tableId
+   :tableLookId?
+   :type[table_type]=(table | note | warning)
+=> tableProperties? tableStructure
+
+tableProperties
+=> generalProperties footnoteProperties cellFormatProperties borderProperties printingProperties
+
+generalProperties
+   :hideEmptyRows=bool?
+   :maximumColumnWidth=dimension?
+   :maximumRowWidth=dimension?
+   :minimumColumnWidth=dimension?
+   :minimumRowWidth=dimension?
+   :rowDimensionLabels=(inCorner | nested)?
+=> EMPTY
+
+footnoteProperties
+   :markerPosition=(superscript | subscript)?
+   :numberFormat=(alphabetic | numeric)?
+=> EMPTY
+
+cellFormatProperties => cell_style+
+
+any[cell_style]
+   :alternatingColor=color?
+   :alternatingTextColor=color?
+=> style
+
+style
+   :color=color?
+   :color2=color?
+   :font-family?
+   :font-size?
+   :font-style=(regular | italic)?
+   :font-weight=(regular | bold)?
+   :labelLocationVertical=(positive | negative | center)?
+   :margin-bottom=dimension?
+   :margin-left=dimension?
+   :margin-right=dimension?
+   :margin-top=dimension?
+   :textAlignment=(left | right | center | decimal | mixed)?
+   :decimal-offset=dimension?
+=> EMPTY
+
+borderProperties => border_style+
+
+any[border_style]
+   :borderStyleType=(none | solid | dashed | thick | thin | double)?
+   :color=color?
+=> EMPTY
+
+printingProperties
+   :printAllLayers=bool?
+   :rescaleLongTableToFitPage=bool?
+   :rescaleWideTableToFitPage=bool?
+   :windowOrphanLines=int?
+   :continuationText?
+   :continuationTextAtBottom=bool?
+   :continuationTextAtTop=bool?
+   :printEachLayerOnSeparatePage=bool?
+=> EMPTY
+
+tableStructure => path? dataPath
+
+graph
+   :VDPId?
+   :ViZmlSource?
+   :commandName?
+   :creator-version?
+   :dataMapId?
+   :dataMapURI?
+   :editor?
+   :refMapId?
+   :refMapURI?
+=> dataPath? path
+
+model
+   :PMMLContainerId
+   :PMMLId
+   :StatXMLContainerId
+   :VDPId
+   :auxiliaryViewName
+   :commandName
+   :creator-version
+   :mainViewName
+=> ViZml? path | pmmlContainerPath statsContainerPath
+
+pmmlContainerPath => TEXT
+
+statsContainerPath => TEXT
+
+ViZml :viewName? => TEXT
+
+dataPath => TEXT
+
+path => TEXT
+
+pageSetup
+   :initial-page-number=int?
+   :chart-size=(as-is | full-height | half-height | quarter-height | OTHER)?
+   :margin-left=dimension?
+   :margin-right=dimension?
+   :margin-top=dimension?
+   :margin-bottom=dimension?
+   :paper-height=dimension?
+   :paper-width=dimension?
+   :reference-orientation?
+   :space-after=dimension?
+=> pageHeader pageFooter
+
+pageHeader => pageParagraph?
+
+pageFooter => pageParagraph?
+
+pageParagraph => pageParagraph_text
+
+text[pageParagraph_text] :type=(title | text) => TEXT
+
+object :type :uri => EMPTY
+
+image :VDPId :commandName => dataPath
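
The grammar above is read by the generator script that follows. As a rough illustration only (not part of the commit), the sketch below tokenizes a single production line with the same rules as the script's get_token() and converts names to C identifiers the way its name_to_id() does. The helper name `tokenize` is hypothetical; the sketch ignores the script's comment stripping, line counting, and the `;` token it emits for blank lines.

```python
def is_idchar(c):
    # Same identifier characters that the generator accepts.
    return c.isalnum() or c in '-_'


def tokenize(text):
    """Split one grammar line into tokens (hypothetical helper).

    Tokens are tuples: ('id', NAME) for identifiers, and one-element
    tuples such as ('=>',) or (':',) for punctuation.
    """
    tokens = []
    text = text.lstrip()
    while text:
        if text.startswith('=>'):
            tokens.append(('=>',))
            text = text[2:]
        elif text[0] in '[]()?|*+=:':
            tokens.append((text[0],))
            text = text[1:]
        elif is_idchar(text[0]):
            n = 1
            while n < len(text) and is_idchar(text[n]):
                n += 1
            tokens.append(('id', text[:n]))
            text = text[n:]
        else:
            raise ValueError("unknown character %r" % text[0])
        text = text.lstrip()
    return tokens


def name_to_id(s):
    # Body copied from the generator: lowercase the first character,
    # turn each later uppercase letter X into "_x", and map "-" to "_".
    return s[0].lower() + ''.join(['_%c' % x.lower() if x.isupper() else x
                                   for x in s[1:]]).replace('-', '_')


if __name__ == '__main__':
    print(tokenize('html :lang=(en) => TEXT'))
    print(name_to_id('lockReader'))       # -> lock_reader
    print(name_to_id('creator-version'))  # -> creator_version
```

Note that runs of capitals are split letter by letter, so an attribute such as `VDPId` becomes the C member name `v_d_p_id`.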
diff --git a/src/output/spv/xml-parser-generator b/src/output/spv/xml-parser-generator
new file mode 100644
index 0000000..38f0afd
--- /dev/null
+++ b/src/output/spv/xml-parser-generator
@@ -0,0 +1,1076 @@
+#! /usr/bin/python
+
+import getopt
+import os
+import re
+import struct
+import sys
+
+n_errors = 0
+
+def error(msg):
+    global n_errors
+    sys.stderr.write("%s:%d: %s\n" % (file_name, line_number, msg))
+    n_errors += 1
+
+
+def fatal(msg):
+    error(msg)
+    sys.exit(1)
+
+
+def get_line():
+    global line
+    global line_number
+    line = input_file.readline()
+    line = re.sub('#.*', '\n', line)
+    line_number += 1
+
+
+def expect(type):
+    if token[0] != type:
+        fatal("syntax error expecting %s" % type)
+
+
+def match(type):
+    if token[0] == type:
+        get_token()
+        return True
+    else:
+        return False
+
+
+def must_match(type):
+    expect(type)
+    get_token()
+
+
+def match_id(id_):
+    if token == ('id', id_):
+        get_token()
+        return True
+    else:
+        return False
+
+
+def is_idchar(c):
+    return c.isalnum() or c in '-_'
+
+
+def get_token():
+    global token
+    global line
+    prev = token
+    while True:
+        if line == "":
+            if token == ('eof', ):
+                fatal("unexpected end of input")
+            get_line()
+            if not line:
+                token = ('eof', )
+                break
+            elif line == '\n':
+                token = (';', )
+                break
+
+        line = line.lstrip()
+        if line == "":
+            continue
+
+        if line.startswith('=>'):
+            token = (line[:2],)
+            line = line[2:]
+        elif line[0] in '[]()?|*+=:':
+            token = (line[0],)
+            line = line[1:]
+        elif is_idchar(line[0]):
+            n = 1
+            while n < len(line) and is_idchar(line[n]):
+                n += 1
+            s = line[:n]
+            token = ('id', s)
+            line = line[n:]
+        else:
+            fatal("unknown character %c" % line[0])
+        break
+
+
+def usage():
+    argv0 = os.path.basename(sys.argv[0])
+    print('''\
+%(argv0)s, parser generator for SPV XML members
+usage: %(argv0)s GRAMMAR header PREFIX
+       %(argv0)s GRAMMAR code PREFIX HEADER_NAME
+  where GRAMMAR contains grammar definitions\
+''' % {"argv0": argv0})
+    sys.exit(0)
+
+
+def parse_term():
+    if match('('):
+        sub = parse_alternation()
+        must_match(')')
+        return sub
+    else:
+        member_name, nonterminal_name = parse_name()
+        if member_name.isupper():
+            fatal('%s: unknown terminal' % member_name)
+        else:
+            return {'type': 'nonterminal',
+                    'nonterminal_name': nonterminal_name,
+                    'member_name': member_name}
+
+
+def parse_quantified():
+    item = parse_term()
+    if token[0] in ['*', '+', '?']:
+        item = {'type': token[0], 'item': item}
+        get_token()
+    return item
+
+
+def parse_sequence():
+    if match_id('EMPTY'):
+        return {'type': 'empty'}
+    items = []
+    while True:
+        sub = parse_quantified()
+        if sub['type'] == 'sequence':
+            items.extend(sub['items'])
+        else:
+            items.append(sub)
+        if token[0] in ('|', ';', ')', 'eof'):
+            break
+    return {'type': 'sequence', 'items': items} if len(items) > 1 else items[0]
+
+
+def parse_alternation():
+    items = [parse_sequence()]
+    while match('|'):
+        items.append(parse_sequence())
+    if len(items) > 1:
+        return {'type': '|', 'items': items}
+    else:
+        return items[0]
+
+
+def parse_name():
+    # The name used in XML for the attribute or element always comes
+    # first.
+    expect('id')
+    xml_name = token[1]
+    get_token()
+
+    # If a different name is needed to disambiguate when the same name
+    # is used in different contexts in XML, it comes later, in
+    # brackets.
+    if match('['):
+        expect('id')
+        unique_name = token[1]
+        get_token()
+        must_match(']')
+    else:
+        unique_name = xml_name
+
+    return unique_name, xml_name
+
+
+enums = {}
+def parse_production():
+    unique_name, xml_name = parse_name()
+
+    attr_xml_names = set()
+    attributes = {}
+    while match(':'):
+        attr_unique_name, attr_xml_name = parse_name()
+        if match('='):
+            if match('('):
+                attr_value = set()
+                while not match(')'):
+                    expect('id')
+                    attr_value.add(token[1])
+                    get_token()
+                    match('|')
+
+                global enums
+                if attr_unique_name not in enums:
+                    enums[attr_unique_name] = attr_value
+                elif enums[attr_unique_name] != attr_value:
+                    sys.stderr.write('%s: different enums with same name\n'
+                                     % attr_unique_name)
+                    sys.exit(1)
+            elif match_id('bool'):
+                attr_value = set(('true', 'false'))
+            elif match_id('dimension'):
+                attr_value = 'dimension'
+            elif match_id('real'):
+                attr_value = 'real'
+            elif match_id('int'):
+                attr_value = 'int'
+            elif match_id('color'):
+                attr_value = 'color'
+            elif match_id('ref'):
+                if token[0] == 'id':
+                    ref_type = token[1]
+                    attr_value = ('ref', ref_type)
+                    get_token()
+                elif match('('):
+                    ref_types = set()
+                    while not match(')'):
+                        expect('id')
+                        ref_types.add(token[1])
+                        get_token()
+                        match('|')
+                    attr_value = ('ref', ref_types)
+                else:
+                    attr_value = ('ref', None)
+            else:
+                fatal("unknown attribute value type")
+        else:
+            attr_value = 'string'
+        attr_required = not match('?')
+
+        if attr_xml_name == 'id':
+            if attr_value != 'string':
+                fatal("id attribute must have string type")
+            attr_value = 'id'
+
+        if attr_unique_name in attributes:
+            fatal("production %s has two attributes %s" % (unique_name,
+                                                           attr_unique_name))
+        if attr_xml_name in attr_xml_names:
+            fatal("production %s has two attributes %s" % (unique_name,
+                                                           attr_xml_name))
+        attr_xml_names.add(attr_xml_name)
+        attributes[attr_unique_name] = (attr_xml_name,
+                                        attr_value, attr_required)
+    if 'id' not in attributes:
+        attributes["id"] = ('id', 'id', False)
+
+    must_match('=>')
+
+    if match_id('TEXT'):
+        rhs = {'type': 'text'}
+    elif match_id('ETC'):
+        rhs = {'type': 'etc'}
+    else:
+        rhs = parse_alternation()
+
+    n = 0
+    for a in rhs['items'] if rhs['type'] == '|' else (rhs,):
+        for term in a['items'] if a['type'] == 'sequence' else (a,):
+            if term['type'] == 'empty':
+                pass
+            elif term['type'] == 'nonterminal':
+                pass
+            elif term['type'] == '?' and term['item']['type'] == 'nonterminal':
+                pass
+            elif (term['type'] in ('*', '+')
+                  and term['item']['type'] == 'nonterminal'):
+                pass
+            else:
+                n += 1
+                term['seq_name'] = 'seq' if n == 1 else 'seq%d' % n
+
+    return unique_name, xml_name, attributes, rhs
+
+
+used_enums = set()
+def print_members(attributes, rhs, indent):
+    attrs = []
+    new_enums = set()
+    for unique_name, (xml_name, value, required) in attributes.items():
+        c_name = name_to_id(unique_name)
+        if type(value) is set:
+            if len(value) <= 1:
+                if not required:
+                    attrs += [('bool %s_present;' % c_name,
+                               'True if attribute present')]
+            elif value == set(('true', 'false')):
+                if required:
+                    attrs += [('bool %s;' % c_name, None)]
+                else:
+                    attrs += [('int %s;' % c_name,
+                               '-1 if not present, otherwise 0 or 1')]
+            else:
+                attrs += [('enum %s%s %s;' % (prefix, c_name, c_name),
+                           'Always nonzero' if required else
+                           'Zero if not present')]
+
+                global used_enums
+                if unique_name not in used_enums:
+                    new_enums.add(unique_name)
+        elif value == 'dimension' or value == 'real':
+            attrs += [('double %s;' % c_name,
+                       'In inches.  ' + ('Always present' if required else
+                                         'DBL_MAX if not present'))]
+        elif value == 'int':
+            attrs += [('int %s;' % c_name,
+                       'Always present' if required
+                       else 'INT_MIN if not present')]
+        elif value == 'color':
+            attrs += [('int %s;' % c_name,
+                       'Always present' if required
+                       else '-1 if not present')]
+        elif value == 'string':
+            attrs += [('char *%s;' % c_name,
+                       'Always nonnull' if required else 'Possibly null')]
+        elif value[0] == 'ref':
+            struct_name = ('spvxml_node'
+                           if value[1] is None or type(value[1]) is set
+                           else '%s%s' % (prefix, name_to_id(value[1])))
+            attrs += [('struct %s *%s;' % (struct_name, c_name),
+                       'Always nonnull' if required else 'Possibly null')]
+        elif value == 'id':
+            pass
+        else:
+            assert False
+
+    for enum_name in new_enums:
+        used_enums.add(enum_name)
+        c_name = name_to_id(enum_name)
+        print('\nenum %s%s {' % (prefix, c_name))
+        i = 0
+        for value in sorted(enums[enum_name]):
+            print('    %s%s_%s%s,' % (prefix.upper(),
+                                      c_name.upper(),
+                                      name_to_id(value).upper(),
+                                      ' = 1' if i == 0 else ''))
+            i += 1
+        print('};')
+        print('const char *%s%s_to_string (enum %s%s);' % (
+            prefix, c_name, prefix, c_name))
+
+    print('\nstruct %s%s {' % (prefix, name_to_id(name)))
+    print('%sstruct spvxml_node node_;' % indent)
+
+    if attrs:
+        print('\n%s/* Attributes. */' % indent)
+        for decl, comment in attrs:
+            line = '%s%s' % (indent, decl)
+            if comment:
+                n_spaces = max(35 - len(line), 1)
+                line += '%s/* %s. */' % (' ' * n_spaces, comment)
+            print(line)
+
+    if rhs['type'] == 'etc' or rhs['type'] == 'empty':
+        return
+
+    print('\n%s/* Content. */' % indent)
+    if rhs['type'] == 'text':
+        print('%schar *text; /* Always nonnull. */' % indent)
+        return
+
+    for a in rhs['items'] if rhs['type'] == '|' else (rhs,):
+        for term in a['items'] if a['type'] == 'sequence' else (a,):
+            if term['type'] == 'empty':
+                pass
+            elif term['type'] == 'nonterminal':
+                nt_name = name_to_id(term['nonterminal_name'])
+                member_name = name_to_id(term['member_name'])
+                print('%sstruct %s%s *%s; /* Always nonnull. */' % (
+                    indent, prefix, nt_name, member_name))
+            elif term['type'] == '?' and term['item']['type'] == 'nonterminal':
+                nt_name = name_to_id(term['item']['nonterminal_name'])
+                member_name = name_to_id(term['item']['member_name'])
+                print('%sstruct %s%s *%s; /* Possibly null. */' % (
+                    indent, prefix, nt_name, member_name))
+            elif (term['type'] in ('*', '+')
+                  and term['item']['type'] == 'nonterminal'):
+                nt_name = name_to_id(term['item']['nonterminal_name'])
+                member_name = name_to_id(term['item']['member_name'])
+                print('%sstruct %s%s **%s;' % (indent, prefix,
+                                               nt_name, member_name))
+                print('%ssize_t n_%s;' % (indent, member_name))
+            else:
+                seq_name = term['seq_name']
+                print('%sstruct spvxml_node **%s;' % (indent, seq_name))
+                print('%ssize_t n_%s;' % (indent, seq_name))
+
+
+def bytes_to_hex(s):
+    return ''.join(['"'] + ["\\x%02x" % ord(x) for x in s] + ['"'])
+
+
+class Parser_Context(object):
+    def __init__(self, function_name, productions):
+        self.suffixes = {}
+        self.bail = 'error'
+        self.need_error_handler = False
+        self.parsers = {}
+        self.parser_index = 0
+        self.productions = productions
+
+        self.function_name = function_name
+        self.functions = []
+    def gen_name(self, prefix):
+        n = self.suffixes.get(prefix, 0) + 1
+        self.suffixes[prefix] = n
+        return '%s%d' % (prefix, n) if n > 1 else prefix
+    def new_function(self, type_name):
+        f = Function('%s_%d' % (self.function_name, len(self.functions) + 1),
+                     type_name)
+        self.functions += [f]
+        return f
+
+
+def print_attribute_decls(name, attributes):
+    if attributes:
+        print('    enum {')
+        for unique_name, (xml_name, value, required) in sorted(attributes.items()):
+            c_name = name_to_id(unique_name)
+            print('        ATTR_%s,' % c_name.upper())
+        print('    };')
+    print('    struct spvxml_attribute attrs[] = {')
+    for unique_name, (xml_name, value, required) in sorted(attributes.items()):
+        c_name = name_to_id(unique_name)
+        print('        [ATTR_%s] = { "%s", %s, NULL },'
+              % (c_name.upper(), xml_name, 'true' if required else 'false'))
+    print('    };')
+    print('    enum { N_ATTRS = sizeof attrs / sizeof *attrs };')
+
+
+def print_parser_for_attributes(name, attributes):
+    print('    /* Parse attributes. */')
+    print('    spvxml_parse_attributes (&nctx);')
+
+    if not attributes:
+        return
+
+    for unique_name, (xml_name, value, required) in sorted(attributes.items()):
+        c_name = name_to_id(unique_name)
+        params = '&nctx, &attrs[ATTR_%s]' % c_name.upper()
+        if type(value) is set:
+            if len(value) <= 1:
+                if required:
+                    print('    spvxml_attr_parse_fixed (%s, "%s");'
+                          % (params, tuple(value)[0]))
+                else:
+                    print('    p->%s_present = spvxml_attr_parse_fixed (\n'
+                          '        %s, "%s");'
+                          % (c_name, params, tuple(value)[0]))
+            elif value == set(('true', 'false')):
+                print('    p->%s = spvxml_attr_parse_bool (%s);'
+                      % (c_name, params))
+            else:
+                map_name = '%s%s_map' % (prefix, c_name)
+                print('    p->%s = spvxml_attr_parse_enum (\n'
+                      '        %s, %s);'
+                      % (c_name, params, map_name))
+        elif value in ('real', 'dimension', 'int', 'color'):
+            print('    p->%s = spvxml_attr_parse_%s (%s);'
+                  % (c_name, value, params))
+        elif value == 'string':
+            print('    p->%s = attrs[ATTR_%s].value;\n'
+                  '    attrs[ATTR_%s].value = NULL;'
+                  % (c_name, c_name.upper(),
+                     c_name.upper()))
+        elif value == 'id':
+            print('    p->node_.id = attrs[ATTR_%s].value;\n'
+                  '    attrs[ATTR_%s].value = NULL;'
+                  % (c_name.upper(), c_name.upper()))
+        elif value[0] == 'ref':
+            pass
+        else:
+            assert False
+    print('''\
+    if (ctx->error) {
+        spvxml_node_context_uninit (&nctx);
+        ctx->hard_error = true;
+        %sfree_%s (p);
+        return false;
+    }'''
+          % (prefix, name_to_id(name)))
+
+class Function(object):
+    def __init__(self, function_name, type_name):
+        self.function_name = function_name
+        self.type_name = type_name
+        self.suffixes = {}
+        self.code = []
+    def gen_name(self, prefix):
+        n = self.suffixes.get(prefix, 0) + 1
+        self.suffixes[prefix] = n
+        return '%s%d' % (prefix, n) if n > 1 else prefix
+    def print_(self):
+        print('''
+static bool
+%s (struct spvxml_node_context *nctx, xmlNode **input, struct %s *p)
+{'''
+              % (self.function_name, self.type_name))
+        while self.code and self.code[0] == '':
+            self.code = self.code[1:]
+        for line in self.code:
+            print('    %s' % line if line else '')
+        print('    return true;')
+        print('}')
+
+STATE_START = 0
+STATE_ALTERNATION = 1
+STATE_SEQUENCE = 2
+STATE_REPETITION = 3
+STATE_OPTIONAL = 4
+STATE_GENERAL = 5
+
+def generate_content_parser(nonterminal, rhs, function, ctx, state, seq_name):
+    seq_name = seq_name if seq_name else rhs.get('seq_name')
+    ctx.parser_index += 1
+
+    if rhs['type'] == 'etc':
+        function.code += ['spvxml_content_parse_etc (input);']
+    elif rhs['type'] == 'text':
+        function.code += ['if (!spvxml_content_parse_text (nctx, input, &p->text))',
+                          '    return false;']
+    elif rhs['type'] == '|':
+        for i in range(len(rhs['items'])):
+            choice = rhs['items'][i]
+            subfunc = ctx.new_function(function.type_name)
+            generate_content_parser(nonterminal, choice, subfunc, ctx,
+                                    STATE_ALTERNATION
+                                    if state == STATE_START
+                                    else STATE_GENERAL, seq_name)
+            function.code += ['%(start)s!%(tryfunc)s (nctx, input, p, %(subfunc)s)%(end)s'
+                               % {'start': 'if (' if i == 0 else '    && ',
+                                  'subfunc': subfunc.function_name,
+                                  'tryfunc': '%stry_parse_%s'
+                                  % (prefix, name_to_id(nonterminal)),
+                                  'end': ')' if i == len(rhs['items']) - 1 else ''}]
+        function.code += ['  {',
+                          '    spvxml_content_error (nctx, *input, "Syntax error.");',
+                          '    return false;',
+                          '  }']
+    elif rhs['type'] == 'sequence':
+        for element in rhs['items']:
+            generate_content_parser(nonterminal, element, function, ctx,
+                                    STATE_SEQUENCE
+                                    if state in (STATE_START,
+                                                 STATE_ALTERNATION)
+                                    else STATE_GENERAL, seq_name)
+    elif rhs['type'] == 'empty':
+        function.code += ['(void) nctx;']
+        function.code += ['(void) input;']
+        function.code += ['(void) p;']
+    elif rhs['type'] in ('*', '+', '?'):
+        subfunc = ctx.new_function(function.type_name)
+        generate_content_parser(nonterminal, rhs['item'], subfunc, ctx,
+                                (STATE_OPTIONAL
+                                 if rhs['type'] == '?'
+                                 else STATE_REPETITION)
+                                if state in (STATE_START,
+                                             STATE_ALTERNATION,
+                                             STATE_SEQUENCE)
+                                else STATE_GENERAL, seq_name)
+        next_name = function.gen_name('next')
+        args = {'subfunc': subfunc.function_name,
+                'tryfunc': '%stry_parse_%s' % (prefix,
+                                               name_to_id(nonterminal))}
+        if rhs['type'] == '?':
+            function.code += [
+                '%(tryfunc)s (nctx, input, p, %(subfunc)s);' % args]
+        else:
+            if rhs['type'] == '+':
+                function.code += ['if (!%(subfunc)s (nctx, input, p))' % args,
+                                  '    return false;']
+            function.code += [
+                'while (%(tryfunc)s (nctx, input, p, %(subfunc)s))' % args,
+                '    continue;']
+    elif rhs['type'] == 'nonterminal':
+        node_name = function.gen_name('node')
+        function.code += [
+            '',
+            'xmlNode *%s;' % node_name,
+            'if (!spvxml_content_parse_element (nctx, input, "%s", &%s))'
+            % (ctx.productions[rhs['nonterminal_name']][0], node_name),
+            '    return false;']
+        if state in (STATE_START,
+                     STATE_ALTERNATION,
+                     STATE_SEQUENCE,
+                     STATE_OPTIONAL):
+            target = '&p->%s' % name_to_id(rhs['member_name'])
+        else:
+            assert state in (STATE_REPETITION, STATE_GENERAL)
+            member = name_to_id(rhs['member_name']) if state == STATE_REPETITION else seq_name
+            function.code += ['struct %s%s *%s;' % (
+                prefix, name_to_id(rhs['nonterminal_name']), member)]
+            target = '&%s' % member
+        function.code += [
+            'if (!%sparse_%s (nctx->up, %s, %s))'
+            % (prefix, name_to_id(rhs['nonterminal_name']), node_name, target),
+            '    return false;']
+        if state in (STATE_REPETITION, STATE_GENERAL):
+            function.code += [
+                'p->%s = xrealloc (p->%s, sizeof *p->%s * (p->n_%s + 1));'
+                % (member, member, member, member),
+                'p->%s[p->n_%s++] = %s;' % (member, member,
+                                            '&%s->node_' % member
+                                            if state == STATE_GENERAL
+                                            else member)]
+    else:
+        assert False
+
+def print_parser(name, production, productions, indent):
+    xml_name, attributes, rhs = production
+
+    print('''
+static bool UNUSED
+%(prefix)stry_parse_%(name)s (
+    struct spvxml_node_context *nctx, xmlNode **input,
+    struct %(prefix)s%(name)s *p,
+    bool (*sub) (struct spvxml_node_context *,
+                 xmlNode **,
+                 struct %(prefix)s%(name)s *))
+{
+    xmlNode *next = *input;
+    bool ok = sub (nctx, &next, p);
+    if (ok)
+        *input = next;
+    else if (!nctx->up->hard_error) {
+        free (nctx->up->error);
+        nctx->up->error = NULL;
+    }
+    return ok;
+}'''
+          % {'prefix': prefix,
+             'name': name_to_id(name)})
+
+    ctx = Parser_Context('%sparse_%s' % (prefix, name_to_id(name)),
+                         productions)
+    if rhs['type'] not in ('empty', 'etc'):
+        function = ctx.new_function('%s%s' % (prefix, name_to_id(name)))
+        generate_content_parser(name, rhs, function, ctx, STATE_START, None)
+        for f in reversed(ctx.functions):
+            f.print_()
+
+    print('''
+bool
+%(prefix)sparse_%(name)s (
+    struct spvxml_context *ctx, xmlNode *input,
+    struct %(prefix)s%(name)s **p_)
+{'''
+          % {'prefix': prefix,
+             'name': name_to_id(name)})
+
+    print_attribute_decls(name, attributes)
+
+    print('    struct spvxml_node_context nctx = {')
+    print('        .up = ctx,')
+    print('        .parent = input,')
+    print('        .attrs = attrs,')
+    print('        .n_attrs = N_ATTRS,')
+    print('    };')
+    print('')
+    print('    *p_ = NULL;')
+    print('    struct %(prefix)s%(name)s *p = xzalloc (sizeof *p);'
+          % {'prefix': prefix,
+             'name': name_to_id(name)})
+    print('    p->node_.raw = input;')
+    print('    p->node_.class_ = &%(prefix)s%(name)s_class;'
+          % {'prefix': prefix,
+             'name': name_to_id(name)})
+    print('')
+
+    print_parser_for_attributes(name, attributes)
+
+    if rhs['type'] == 'empty':
+        print('''
+    /* Parse content. */
+    if (!spvxml_content_parse_end (&nctx, input->children)) {
+        ctx->hard_error = true;
+        spvxml_node_context_uninit (&nctx);
+        %sfree_%s (p);
+        return false;
+    }'''
+              % (prefix, name_to_id(name)))
+    elif rhs['type'] == 'etc':
+        print('''
+    /* Ignore content. */
+''')
+    else:
+        print('''
+    /* Parse content. */
+    input = input->children;
+    if (!%s (&nctx, &input, p)
+        || !spvxml_content_parse_end (&nctx, input)) {
+        ctx->hard_error = true;
+        spvxml_node_context_uninit (&nctx);
+        %sfree_%s (p);
+        return false;
+    }'''
+              % (function.function_name,
+                 prefix, name_to_id(name)))
+
+    print('''
+    spvxml_node_context_uninit (&nctx);
+    *p_ = p;
+    return true;''')
+
+    print('}')
+
+
+def print_free_members(attributes, rhs, indent):
+    for unique_name, (xml_name, value, required) in attributes.items():
+        c_name = name_to_id(unique_name)
+        if (type(value) is set
+            or value in ('dimension', 'real', 'int', 'color', 'id')
+            or value[0] == 'ref'):
+            pass
+        elif value == 'string':
+            print('    free (p->%s);' % c_name)
+        else:
+            assert False
+
+    if rhs['type'] in ('etc', 'empty'):
+        pass
+    elif rhs['type'] == 'text':
+        print('    free (p->text);')
+    else:
+        n = 0
+        for a in rhs['items'] if rhs['type'] == '|' else (rhs,):
+            for term in a['items'] if a['type'] == 'sequence' else (a,):
+                if term['type'] == 'empty':
+                    pass
+                elif (term['type'] == 'nonterminal'
+                      or (term['type'] == '?'
+                          and term['item']['type'] == 'nonterminal')):
+                    if term['type'] == '?':
+                        term = term['item']
+                    nt_name = name_to_id(term['nonterminal_name'])
+                    member_name = name_to_id(term['member_name'])
+                    print('    %sfree_%s (p->%s);' % (prefix, nt_name,
+                                                      member_name))
+                elif (term['type'] in ('*', '+')
+                      and term['item']['type'] == 'nonterminal'):
+                    nt_name = name_to_id(term['item']['nonterminal_name'])
+                    member_name = name_to_id(term['item']['member_name'])
+                    print('''\
+    for (size_t i = 0; i < p->n_%s; i++)
+        %sfree_%s (p->%s[i]);
+    free (p->%s);'''
+                          % (member_name,
+                             prefix, nt_name, member_name,
+                             member_name))
+                else:
+                    n += 1
+                    seq_name = 'seq' if n == 1 else 'seq%d' % n
+                    print('''\
+    for (size_t i = 0; i < p->n_%s; i++)
+        p->%s[i]->class_->spvxml_node_free (p->%s[i]);
+    free (p->%s);'''
+                          % (seq_name,
+                             seq_name, seq_name,
+                             seq_name))
+    print('    free (p->node_.id);')
+    print('    free (p);')
+
+
+def print_free(name, production, indent):
+    xml_name, attributes, rhs = production
+
+    print('''
+void
+%(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *p)
+{
+    if (!p)
+        return;
+''' % {'prefix': prefix,
+       'name': name_to_id(name)})
+
+    print_free_members(attributes, rhs, ' ' * 4)
+
+    print('}')
+
+def name_to_id(s):
+    return s[0].lower() + ''.join(['_%c' % x.lower() if x.isupper() else x
+                                   for x in s[1:]]).replace('-', '_')
+
+
+def print_recurse_members(attributes, rhs, function):
+    if rhs['type'] == 'etc' or rhs['type'] == 'empty':
+        pass
+    elif rhs['type'] == 'text':
+        pass
+    else:
+        n = 0
+        for a in rhs['items'] if rhs['type'] == '|' else (rhs,):
+            for term in a['items'] if a['type'] == 'sequence' else (a,):
+                if term['type'] == 'empty':
+                    pass
+                elif (term['type'] == 'nonterminal'
+                      or (term['type'] == '?'
+                          and term['item']['type'] == 'nonterminal')):
+                    if term['type'] == '?':
+                        term = term['item']
+                    nt_name = name_to_id(term['nonterminal_name'])
+                    member_name = name_to_id(term['member_name'])
+                    print('    %s%s_%s (ctx, p->%s);'
+                          % (prefix, function, nt_name, member_name))
+                elif (term['type'] in ('*', '+')
+                      and term['item']['type'] == 'nonterminal'):
+                    nt_name = name_to_id(term['item']['nonterminal_name'])
+                    member_name = name_to_id(term['item']['member_name'])
+                    print('''\
+    for (size_t i = 0; i < p->n_%s; i++)
+        %s%s_%s (ctx, p->%s[i]);'''
+                          % (member_name,
+                             prefix, function, nt_name, member_name))
+                else:
+                    n += 1
+                    seq_name = 'seq' if n == 1 else 'seq%d' % n
+                    print('''\
+    for (size_t i = 0; i < p->n_%s; i++)
+        p->%s[i]->class_->spvxml_node_%s (ctx, p->%s[i]);'''
+                          % (seq_name,
+                             seq_name, function, seq_name))
+
+
+def print_collect_ids(name, production):
+    xml_name, attributes, rhs = production
+
+    print('''
+void
+%(prefix)scollect_ids_%(name)s (struct spvxml_context *ctx, struct %(prefix)s%(name)s *p)
+{
+    if (!p)
+        return;
+
+    spvxml_node_collect_id (ctx, &p->node_);
+''' % {'prefix': prefix,
+       'name': name_to_id(name)})
+
+    print_recurse_members(attributes, rhs, 'collect_ids')
+
+    print('}')
+
+
+def print_resolve_refs(name, production):
+    xml_name, attributes, rhs = production
+
+    print('''
+bool
+%(prefix)sis_%(name)s (const struct spvxml_node *node)
+{
+    return node->class_ == &%(prefix)s%(name)s_class;
+}
+
+struct %(prefix)s%(name)s *
+%(prefix)scast_%(name)s (const struct spvxml_node *node)
+{
+    return (node && %(prefix)sis_%(name)s (node)
+            ? UP_CAST (node, struct %(prefix)s%(name)s, node_)
+            : NULL);
+}
+
+void
+%(prefix)sresolve_refs_%(name)s (struct spvxml_context *ctx UNUSED, struct %(prefix)s%(name)s *p UNUSED)
+{
+    if (!p)
+        return;
+''' % {'prefix': prefix,
+       'name': name_to_id(name)})
+
+    i = 0
+    for unique_name, (xml_name, value, required) in sorted(attributes.items()):
+        c_name = name_to_id(unique_name)
+        if type(value) is set or value[0] != 'ref':
+            continue
+
+        if value[1] is None:
+            print('    p->%s = spvxml_node_resolve_ref (ctx, p->node_.raw, \"%s\", NULL, 0);'
+                  % (c_name, xml_name))
+        else:
+            i += 1
+            name = 'classes'
+            if i > 1:
+                name += '%d' % i
+            if type(value[1]) is set:
+                print('    static const struct spvxml_node_class *const %s[] = {' % name)
+                for ref_type in value[1]:
+                    print('        &%(prefix)s%(ref_type)s_class,'
+                         % {'prefix': prefix,
+                            'ref_type': name_to_id(ref_type)})
+                print('    };')
+                print('    const size_t n_%s = sizeof %s / sizeof *%s;'
+                      % (name, name, name))
+                print('    p->%(member)s = spvxml_node_resolve_ref (ctx, p->node_.raw, \"%(attr)s\", %(name)s, n_%(name)s);'
+                      % {"member": c_name,
+                         "attr": xml_name,
+                         'prefix': prefix,
+                         'name': name
+                         })
+            else:
+                print('    static const struct spvxml_node_class *const %s' % name)
+                print('        = &%(prefix)s%(ref_type)s_class;'
+                      % {'prefix': prefix,
+                         'ref_type': name_to_id(value[1])})
+                print('    p->%(member)s = %(prefix)scast_%(ref_type)s (spvxml_node_resolve_ref (ctx, p->node_.raw, \"%(attr)s\", &%(name)s, 1));'
+                      % {"member": c_name,
+                         "attr": xml_name,
+                         'prefix': prefix,
+                         'name': name,
+                         'ref_type': name_to_id(value[1])})
+
+
+    print_recurse_members(attributes, rhs, 'resolve_refs')
+
+    print('}')
+
+
+def name_to_id(s):
+    return s[0].lower() + ''.join(['_%c' % x.lower() if x.isupper() else x
+                                   for x in s[1:]]).replace('-', '_')
+
+
+if __name__ == "__main__":
+    argv0 = sys.argv[0]
+    try:
+        options, args = getopt.gnu_getopt(sys.argv[1:], 'h', ['help'])
+    except getopt.GetoptError as e:
+        sys.stderr.write("%s: %s\n" % (argv0, e.msg))
+        sys.exit(1)
+
+    for key, value in options:
+        if key in ['-h', '--help']:
+            usage()
+        else:
+            sys.exit(0)
+
+    if len(args) < 3:
+        sys.stderr.write("%s: bad usage (use --help for help)\n" % argv0)
+        sys.exit(1)
+
+    global file_name
+    global prefix
+    file_name, output_type, prefix = args[:3]
+    input_file = open(file_name)
+
+    prefix = '%s_' % prefix
+
+    global line
+    global line_number
+    line = ""
+    line_number = 0
+
+    productions = {}
+
+    global token
+    token = ('start', )
+    get_token()
+    while True:
+        while match(';'):
+            pass
+        if token[0] == 'eof':
+            break
+
+        name, xml_name, attributes, rhs = parse_production()
+        if name in productions:
+            fatal("%s: duplicate production" % name)
+        productions[name] = (xml_name, attributes, rhs)
+
+    print('/* Generated automatically -- do not modify!    -*- buffer-read-only: t -*- */')
+    if output_type == 'code' and len(args) == 4:
+        header_name = args[3]
+
+        print("""\
+#include <config.h>
+#include %s
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include "libpspp/cast.h"
+#include "libpspp/str.h"
+#include "gl/xalloc.h"
+
+""" % header_name)
+        for enum_name, values in sorted(enums.items()):
+            if len(values) <= 1:
+                continue
+
+            c_name = name_to_id(enum_name)
+            print('\nstatic const struct spvxml_enum %s%s_map[] = {'
+                  % (prefix, c_name))
+            for value in sorted(values):
+                print('    { "%s", %s%s_%s },' % (value, prefix.upper(),
+                                                 c_name.upper(),
+                                                 name_to_id(value).upper()))
+            print('    { NULL, 0 },')
+            print('};')
+            print('\nconst char *')
+            print('%s%s_to_string (enum %s%s %s)'
+                  % (prefix, c_name, prefix, c_name, c_name))
+            print('{')
+            print('    switch (%s) {' % c_name)
+            for value in sorted(values):
+                print('    case %s%s_%s: return "%s";'
+                       % (prefix.upper(), c_name.upper(),
+                          name_to_id(value).upper(), value))
+            print('    default: return NULL;')
+            print('    }')
+            print('}')
+
+        for name, (xml_name, attributes, rhs) in sorted(productions.items()):
+            print('static void %(prefix)scollect_ids_%(name)s (struct spvxml_context *, struct %(prefix)s%(name)s *);\n'
+                  'static void %(prefix)sresolve_refs_%(name)s (struct spvxml_context *ctx UNUSED, struct %(prefix)s%(name)s *p UNUSED);\n'
+                  % {'prefix': prefix,
+                     'name': name_to_id(name)})
+        for name, production in sorted(productions.items()):
+            print_parser(name, production, productions, ' ' * 4)
+            print_free(name, production, ' ' * 4)
+            print_collect_ids(name, production)
+            print_resolve_refs(name, production)
+            print('''
+static void
+%(prefix)sdo_free_%(name)s (struct spvxml_node *node)
+{
+    %(prefix)sfree_%(name)s (UP_CAST (node, struct %(prefix)s%(name)s, node_));
+}
+
+static void
+%(prefix)sdo_collect_ids_%(name)s (struct spvxml_context *ctx, struct spvxml_node *node)
+{
+    %(prefix)scollect_ids_%(name)s (ctx, UP_CAST (node, struct %(prefix)s%(name)s, node_));
+}
+
+static void
+%(prefix)sdo_resolve_refs_%(name)s (struct spvxml_context *ctx, struct spvxml_node *node)
+{
+    %(prefix)sresolve_refs_%(name)s (ctx, UP_CAST (node, struct %(prefix)s%(name)s, node_));
+}
+
+struct spvxml_node_class %(prefix)s%(name)s_class = {
+    "%(class)s",
+    %(prefix)sdo_free_%(name)s,
+    %(prefix)sdo_collect_ids_%(name)s,
+    %(prefix)sdo_resolve_refs_%(name)s,
+};
+'''
+            % {'prefix': prefix,
+               'name': name_to_id(name),
+               'class': (name if name == production[0]
+                         else '%s (%s)' % (name, production[0]))})
+    elif output_type == 'header' and len(args) == 3:
+        print("""\
+#ifndef %(PREFIX)sPARSER_H
+#define %(PREFIX)sPARSER_H
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include "output/spv/spvxml-helpers.h"\
+""" % {'PREFIX': prefix.upper()})
+        for name, (xml_name, attributes, rhs) in sorted(productions.items()):
+            print_members(attributes, rhs, ' ' * 4)
+            print('''};
+
+extern struct spvxml_node_class %(prefix)s%(name)s_class;
+
+bool %(prefix)sparse_%(name)s (struct spvxml_context *, xmlNode *input, struct %(prefix)s%(name)s **);
+void %(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *);
+bool %(prefix)sis_%(name)s (const struct spvxml_node *);
+struct %(prefix)s%(name)s *%(prefix)scast_%(name)s (const struct spvxml_node *);'''
+                  % {'prefix': prefix,
+                     'name': name_to_id(name)})
+        print("""\
+
+#endif /* %(PREFIX)sPARSER_H */""" % {'PREFIX': prefix.upper()})
+    else:
+        sys.exit("%s: bad usage (use --help for help)" % argv0)
diff --git a/src/output/table-item.c b/src/output/table-item.c

index e9b854040a823982c36ac96690f7594c9a3a90b0..a1595479bbf5bde005fc9734085b471e8f5c0cff 100644 (file)
--- a/src/output/table-item.c
+++ b/src/output/table-item.c
@@ -24,6 +24,7 @@
  #include "libpspp/cast.h"
  #include "output/driver.h"
  #include "output/output-item-provider.h"
+#include "output/pivot-table.h"
  #include "output/table-item.h"
  
  #include "gl/xalloc.h"
@@ -68,6 +69,56 @@ table_item_text_destroy (struct table_item_text *text)
      }
  }
  
+void
+table_item_layer_copy (struct table_item_layer *dst,
+                       const struct table_item_layer *src)
+{
+  dst->content = xstrdup (src->content);
+  dst->footnotes = xmemdup (src->footnotes,
+                            src->n_footnotes * sizeof *src->footnotes);
+  dst->n_footnotes = src->n_footnotes;
+}
+
+void
+table_item_layer_uninit (struct table_item_layer *layer)
+{
+  if (layer)
+    {
+      free (layer->content);
+      free (layer->footnotes);
+    }
+}
+
+struct table_item_layers *
+table_item_layers_clone (const struct table_item_layers *old)
+{
+  if (!old)
+    return NULL;
+
+  struct table_item_layers *new = xmalloc (sizeof *new);
+  *new = (struct table_item_layers) {
+    .layers = xnmalloc (old->n_layers, sizeof *new->layers),
+    .n_layers = old->n_layers,
+    .style = area_style_clone (NULL, old->style),
+  };
+  for (size_t i = 0; i < new->n_layers; i++)
+    table_item_layer_copy (&new->layers[i], &old->layers[i]);
+  return new;
+}
+
+void
+table_item_layers_destroy (struct table_item_layers *layers)
+{
+  if (layers)
+    {
+      for (size_t i = 0; i < layers->n_layers; i++)
+        table_item_layer_uninit (&layers->layers[i]);
+      free (layers->layers);
+      area_style_free (layers->style);
+      free (layers);
+    }
+}
+
  /* Initializes ITEM as a table item for rendering TABLE.  The new table item
     initially has the specified TITLE and CAPTION, which may each be NULL.  The
     caller retains ownership of TITLE and CAPTION. */
@@ -80,6 +131,7 @@ table_item_create (struct table *table, const char *title, const char *caption)
    item->title = table_item_text_create (title);
    item->layers = NULL;
    item->caption = table_item_text_create (caption);
+  item->pt = NULL;
    return item;
  }
  
@@ -114,7 +166,7 @@ table_item_set_title (struct table_item *item,
  
  /* Returns ITEM's layers, which will be a null pointer if no layers have been
     set. */
-const struct table_item_text *
+const struct table_item_layers *
  table_item_get_layers (const struct table_item *item)
  {
    return item->layers;
@@ -127,11 +179,11 @@ table_item_get_layers (const struct table_item *item)
     This function may only be used on a table_item that is unshared. */
  void
  table_item_set_layers (struct table_item *item,
-                      const struct table_item_text *layers)
+                       const struct table_item_layers *layers)
  {
    assert (!table_item_is_shared (item));
-  table_item_text_destroy (item->layers);
-  item->layers = table_item_text_clone (layers);
+  table_item_layers_destroy (item->layers);
+  item->layers = table_item_layers_clone (layers);
  }
  
  /* Returns ITEM's caption, which is a null pointer if no caption has been
@@ -169,8 +221,9 @@ table_item_destroy (struct output_item *output_item)
  {
    struct table_item *item = to_table_item (output_item);
    table_item_text_destroy (item->title);
-  table_item_text_destroy (item->layers);
    table_item_text_destroy (item->caption);
+  table_item_layers_destroy (item->layers);
+  pivot_table_unref (item->pt);
    table_unref (item->table);
    free (item);
  }
diff --git a/src/output/table.c b/src/output/table.c

index 18b822ba70c5c7c4405e3f28f54c7d2c752e3daa..09d45af7c5721107e8b6221febfa812439617de2 100644 (file)
--- a/src/output/table.c
+++ b/src/output/table.c
@@ -289,13 +289,27 @@ table_collect_footnotes (const struct table_item *item,
      footnotes = add_footnotes (title->footnotes, title->n_footnotes,
                                 footnotes, &allocated, &n);
  
+  const struct table_item_layers *layers = table_item_get_layers (item);
+  if (layers)
+    {
+      for (size_t i = 0; i < layers->n_layers; i++)
+        footnotes = add_footnotes (layers->layers[i].footnotes,
+                                   layers->layers[i].n_footnotes,
+                                   footnotes, &allocated, &n);
+    }
+
    const struct table_item_text *caption = table_item_get_caption (item);
    if (caption)
      footnotes = add_footnotes (caption->footnotes, caption->n_footnotes,
                                 footnotes, &allocated, &n);
  
+  size_t n_nonnull = 0;
+  for (size_t i = 0; i < n; i++)
+    if (footnotes[i])
+      footnotes[n_nonnull++] = footnotes[i];
+
    *footnotesp = footnotes;
-  return n;
+  return n_nonnull;
  }
  \f
  const char *
diff --git a/src/ui/gui/psppire-output-window.c b/src/ui/gui/psppire-output-window.c

index ad1f03db5fb947c766616dccaf934a477eec9a76..de57f17f9dab9af6e73cd88e15c02aa9836b7727 100644 (file)
--- a/src/ui/gui/psppire-output-window.c
+++ b/src/ui/gui/psppire-output-window.c
@@ -253,6 +253,7 @@ struct file_types
  enum
    {
      FT_AUTO = 0,
+    FT_SPV,
      FT_PDF,
      FT_HTML,
      FT_ODT,
@@ -267,6 +268,7 @@ enum
  
  struct file_types ft[n_FT] = {
    {N_("Infer file type from extension"),  NULL},
+  {N_("SPSS Viewer (*.spv)"),             ".spv"},
    {N_("PDF (*.pdf)"),                     ".pdf"},
    {N_("HTML (*.html)"),                   ".html"},
    {N_("OpenDocument (*.odt)"),            ".odt"},
@@ -453,6 +455,9 @@ psppire_output_window_export (PsppireOutputWindow *window)
  
        switch (file_type)
         {
+        case FT_SPV:
+          export_output (window, &options, "spv");
+          break;
         case FT_PDF:
            export_output (window, &options, "pdf");
           break;
diff --git a/src/ui/gui/psppire-window.c b/src/ui/gui/psppire-window.c

index 8b2e3d5dfd7ef231e27b79f13c2229f591055e52..fa4b875433b99ed20642587119c0e4768d7164b9 100644 (file)
--- a/src/ui/gui/psppire-window.c
+++ b/src/ui/gui/psppire-window.c
@@ -33,6 +33,11 @@
  #include "data/file-handle-def.h"
  #include "data/dataset.h"
  #include "libpspp/version.h"
+#include "output/group-item.h"
+#include "output/pivot-table.h"
+#include "output/spv/spv.h"
+#include "output/spv/spv-output.h"
+#include "output/spv/spv-select.h"
  
  #include "helper.h"
  #include "psppire-data-window.h"
@@ -657,9 +662,12 @@ psppire_window_file_chooser_dialog (PsppireWindow *toplevel)
    gtk_file_filter_set_name (filter, _("Data and Syntax Files"));
    gtk_file_filter_add_mime_type (filter, "application/x-spss-sav");
    gtk_file_filter_add_mime_type (filter, "application/x-spss-por");
+  gtk_file_filter_add_mime_type (filter, "application/x-spss-spv");
    gtk_file_filter_add_pattern (filter, "*.zsav");
    gtk_file_filter_add_pattern (filter, "*.sps");
    gtk_file_filter_add_pattern (filter, "*.SPS");
+  gtk_file_filter_add_pattern (filter, "*.spv");
+  gtk_file_filter_add_pattern (filter, "*.SPV");
    gtk_file_chooser_add_filter (GTK_FILE_CHOOSER (dialog), filter);
  
    filter = gtk_file_filter_new ();
@@ -679,6 +687,12 @@ psppire_window_file_chooser_dialog (PsppireWindow *toplevel)
    gtk_file_filter_add_pattern (filter, "*.SPS");
    gtk_file_chooser_add_filter (GTK_FILE_CHOOSER (dialog), filter);
  
+  filter = gtk_file_filter_new ();
+  gtk_file_filter_set_name (filter, _("Output Files (*.spv)"));
+  gtk_file_filter_add_pattern (filter, "*.spv");
+  gtk_file_filter_add_pattern (filter, "*.SPV");
+  gtk_file_chooser_add_filter (GTK_FILE_CHOOSER (dialog), filter);
+
    filter = gtk_file_filter_new ();
    gtk_file_filter_set_name (filter, _("All Files"));
    gtk_file_filter_add_pattern (filter, "*");
@@ -712,6 +726,113 @@ psppire_window_file_chooser_dialog (PsppireWindow *toplevel)
    return dialog;
  }
  
+struct item_path
+  {
+    const struct spv_item **nodes;
+    size_t n;
+
+#define N_STUB 10
+    const struct spv_item *stub[N_STUB];
+  };
+
+static void
+swap_nodes (const struct spv_item **a, const struct spv_item **b)
+{
+  const struct spv_item *tmp = *a;
+  *a = *b;
+  *b = tmp;
+}
+
+static void
+get_path (const struct spv_item *item, struct item_path *path)
+{
+  size_t allocated = N_STUB;
+  path->nodes = path->stub;
+  path->n = 0;
+
+  while (item)
+    {
+      if (path->n >= allocated)
+        {
+          if (path->nodes == path->stub)
+            path->nodes = xmemdup (path->stub, sizeof path->stub);
+          path->nodes = x2nrealloc (path->nodes, &allocated,
+                                    sizeof *path->nodes);
+        }
+      path->nodes[path->n++] = item;
+      item = item->parent;
+    }
+
+  for (size_t i = 0; i < path->n / 2; i++)
+    swap_nodes (&path->nodes[i], &path->nodes[path->n - i - 1]);
+}
+
+static void
+free_path (struct item_path *path)
+{
+  if (path && path->nodes != path->stub)
+    free (path->nodes);
+}
+
+static void
+dump_heading_transition (const struct spv_item *old,
+                         const struct spv_item *new)
+{
+  if (old == new)
+    return;
+
+  struct item_path old_path, new_path;
+  get_path (old, &old_path);
+  get_path (new, &new_path);
+
+  size_t common = 0;
+  for (; common < old_path.n && common < new_path.n; common++)
+    if (old_path.nodes[common] != new_path.nodes[common])
+      break;
+
+  for (size_t i = common; i < old_path.n; i++)
+    group_close_item_submit (group_close_item_create ());
+  for (size_t i = common; i < new_path.n; i++)
+    group_open_item_submit (group_open_item_create (
+                              new_path.nodes[i]->command_id));
+
+  free_path (&old_path);
+  free_path (&new_path);
+}
+
+void
+read_spv_file (const char *filename)
+{
+  struct spv_reader *spv;
+  char *error = spv_open (filename, &spv);
+  if (error)
+    {
+      fprintf (stderr, "%s\n", error); /* XXX */
+      free (error);
+      return;
+    }
+
+  struct spv_criteria criteria = SPV_CRITERIA_INITIALIZER (criteria);
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  struct spv_item *prev_heading = spv_get_root (spv);
+  for (size_t i = 0; i < n_items; i++)
+    {
+      struct spv_item *heading
+        = items[i]->type == SPV_ITEM_HEADING ? items[i] : items[i]->parent;
+      dump_heading_transition (prev_heading, heading);
+      if (items[i]->type == SPV_ITEM_TEXT)
+        spv_text_submit (items[i]);
+      else if (items[i]->type == SPV_ITEM_TABLE)
+        pivot_table_submit (pivot_table_ref (spv_item_get_table (items[i])));
+      prev_heading = heading;
+    }
+  dump_heading_transition (prev_heading, spv_get_root (spv));
+  free (items);
+  spv_close (spv);
+}
+
  /* Callback for the file_open action.
     Prompts for a filename and opens it */
  void
@@ -740,7 +861,16 @@ psppire_window_open (PsppireWindow *de)
         if (retval == 1)
            open_data_window (de, name, encoding, NULL);
         else if (retval == 0)
-         open_syntax_window (name, encoding);
+          {
+            char *error = spv_detect (name);
+            if (!error)
+              read_spv_file (name);
+            else
+              {
+                free (error);
+                open_syntax_window (name, encoding);
+              }
+          }
  
          g_free (encoding);
         fh_unref (fh);
diff --git a/src/ui/gui/psppire-window.h b/src/ui/gui/psppire-window.h

index bc1b0f0b280770b87ae1414b7da4dc10b63c2e97..c65823a298e8c8063864675effab322bf78ecd3d 100644 (file)
--- a/src/ui/gui/psppire-window.h
+++ b/src/ui/gui/psppire-window.h
@@ -120,6 +120,8 @@ GtkWidget *psppire_window_file_chooser_dialog (PsppireWindow *toplevel);
  void add_most_recent (const char *file_name, const char *mime_type,
                        const char *encoding);
  
+void read_spv_file (const char *filename);
+
  G_END_DECLS
  
  #endif /* __PSPPIRE_WINDOW_H__ */
diff --git a/src/ui/gui/psppire.c b/src/ui/gui/psppire.c

index 1f903beda993c75616491e08f11f7100408ac2aa..2d5c185e018edb41ac603daec7d60f883b1fc702 100644 (file)
--- a/src/ui/gui/psppire.c
+++ b/src/ui/gui/psppire.c
@@ -39,6 +39,7 @@
  #include "output/driver.h"
  #include "output/journal.h"
  #include "output/message-item.h"
+#include "output/spv/spv.h"
  
  #include "ui/gui/dict-display.h"
  #include "ui/gui/executor.h"
@@ -192,8 +193,15 @@ psppire_preload_file (const gchar *file)
      w = open_data_window (NULL, filename, NULL, NULL);
    else if (retval == 0)
      {
-      create_data_window ();
-      w = open_syntax_window (filename, NULL);
+      char *error = spv_detect (filename);
+      if (!error)
+        read_spv_file (filename);
+      else
+        {
+          free (error);
+          create_data_window ();
+          open_syntax_window (filename, NULL);
+        }
      }
  
    fh_unref (fh);
diff --git a/tests/output/render.at b/tests/output/render.at

index 9f0140d9fd770340c1c0966b0d8813b27a2c1fff..0edcd9127fe9d51d4461d83cd34797a2613f59dc 100644 (file)
--- a/tests/output/render.at
+++ b/tests/output/render.at
@@ -249,7 +249,7 @@ AT_CHECK([render-test --csv input], [0],
  +----------+------+
  a. Approximation.
  b. This is a very long footnote that will have to wrap from one line to the
-   next.  Let's see if the rendering engine does it acceptably.
+next.  Let's see if the rendering engine does it acceptably.
  c. One
  d. Two
  e. Three
diff --git a/utilities/automake.mk b/utilities/automake.mk

index cbcf0ae2bb3350b3d2cb7dc60d1eabec40af6b54..028d26e01f51d2126004788d1a663305ddd18107 100644 (file)
--- a/utilities/automake.mk
+++ b/utilities/automake.mk
@@ -32,3 +32,13 @@ utilities_pspp_convert_LDFLAGS = $(PSPP_LDFLAGS) $(PG_LDFLAGS)
  if RELOCATABLE_VIA_LD
  utilities_pspp_convert_LDFLAGS += `$(RELOCATABLE_LDFLAGS) $(bindir)`
  endif
+
+bin_PROGRAMS += utilities/pspp-output
+utilities_pspp_output_SOURCES = utilities/pspp-output.c
+utilities_pspp_output_CPPFLAGS = \
+       $(LIBXML2_CFLAGS) $(AM_CPPFLAGS) -DINSTALLDIR=\"$(bindir)\"
+utilities_pspp_output_LDADD = \
+       src/libpspp.la \
+       src/libpspp-core.la \
+       $(CAIRO_LIBS)
+utilities_pspp_output_LDFLAGS = $(PSPP_LDFLAGS) $(LIBXML2_LIBS)
diff --git a/utilities/pspp-output.c b/utilities/pspp-output.c

new file mode 100644 (file)

index 0000000..009ae53
--- /dev/null
+++ b/utilities/pspp-output.c
@@ -0,0 +1,979 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2017, 2018 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include <getopt.h>
+#include <limits.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include "data/file-handle-def.h"
+#include "data/settings.h"
+#include "libpspp/i18n.h"
+#include "libpspp/message.h"
+#include "libpspp/string-map.h"
+#include "libpspp/string-set.h"
+#include "output/driver.h"
+#include "output/group-item.h"
+#include "output/page-setup-item.h"
+#include "output/pivot-table.h"
+#include "output/spv/light-binary-parser.h"
+#include "output/spv/spv-legacy-data.h"
+#include "output/spv/spv-output.h"
+#include "output/spv/spv-select.h"
+#include "output/spv/spv.h"
+#include "output/table-item.h"
+#include "output/text-item.h"
+
+#include "gl/c-ctype.h"
+#include "gl/error.h"
+#include "gl/progname.h"
+#include "gl/version-etc.h"
+#include "gl/xalloc.h"
+
+#include <libxml/tree.h>
+#include <libxml/xpath.h>
+#include <libxml/xpathInternals.h>
+
+#include "gettext.h"
+#define _(msgid) gettext (msgid)
+
+/* -O key=value: Output driver options. */
+static struct string_map output_options
+    = STRING_MAP_INITIALIZER (output_options);
+
+/* --member-name: Include .zip member name in "dir" output. */
+static bool show_member_name;
+
+/* --show-hidden, --select, --commands, ...: Selection criteria. */
+static struct spv_criteria criteria = SPV_CRITERIA_INITIALIZER (criteria);
+
+/* --sort: Sort members under dump-light-table, to make comparisons easier. */
+static bool sort;
+
+/* --raw: Dump raw binary data in dump-light-table. */
+static bool raw;
+
+/* Number of warnings issued. */
+static size_t n_warnings;
+
+static void usage (void);
+static void parse_options (int argc, char **argv);
+
+static void
+dump_item (const struct spv_item *item)
+{
+  switch (spv_item_get_type (item))
+    {
+    case SPV_ITEM_HEADING:
+      break;
+
+    case SPV_ITEM_TEXT:
+      spv_text_submit (item);
+      break;
+
+    case SPV_ITEM_TABLE:
+      pivot_table_submit (pivot_table_ref (spv_item_get_table (item)));
+      break;
+
+    case SPV_ITEM_GRAPH:
+      break;
+
+    case SPV_ITEM_MODEL:
+      break;
+
+    case SPV_ITEM_OBJECT:
+      break;
+
+    default:
+      abort ();
+    }
+}
+
+static void
+print_item_directory (const struct spv_item *item)
+{
+  for (int i = 1; i < spv_item_get_level (item); i++)
+    printf ("    ");
+
+  printf ("-");
+  const char *label = spv_item_get_label (item);
+  if (label)
+    printf (" %s", label);
+
+  enum spv_item_type type = spv_item_get_type (item);
+  printf (" %s", spv_item_type_to_string (type));
+  if (type == SPV_ITEM_TABLE)
+    {
+      const struct pivot_table *table = spv_item_get_table (item);
+      char *title = pivot_value_to_string (table->title,
+                                           SETTINGS_VALUE_SHOW_DEFAULT,
+                                           SETTINGS_VALUE_SHOW_DEFAULT);
+      if (!label || strcmp (title, label))
+        printf (" \"%s\"", title);
+      free (title);
+    }
+
+  const char *command_id = spv_item_get_command_id (item);
+  if (command_id)
+    printf (" \"%s\"", command_id);
+
+  if (!spv_item_is_visible (item))
+    printf (" (hidden)");
+  if (show_member_name && (item->xml_member || item->bin_member))
+    {
+      if (item->xml_member && item->bin_member)
+        printf (" in %s and %s", item->xml_member, item->bin_member);
+      else if (item->xml_member)
+        printf (" in %s", item->xml_member);
+      else if (item->bin_member)
+        printf (" in %s", item->bin_member);
+    }
+  putchar ('\n');
+}
+
+static void
+run_detect (int argc UNUSED, char **argv)
+{
+  char *err = spv_detect (argv[1]);
+  if (err)
+    error (1, 0, "%s", err);
+}
+
+static void
+run_directory (int argc UNUSED, char **argv)
+{
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  for (size_t i = 0; i < n_items; i++)
+    print_item_directory (items[i]);
+  free (items);
+
+  spv_close (spv);
+}
+
+struct item_path
+  {
+    const struct spv_item **nodes;
+    size_t n;
+
+#define N_STUB 10
+    const struct spv_item *stub[N_STUB];
+  };
+
+static void
+swap_nodes (const struct spv_item **a, const struct spv_item **b)
+{
+  const struct spv_item *tmp = *a;
+  *a = *b;
+  *b = tmp;
+}
+
+static void
+get_path (const struct spv_item *item, struct item_path *path)
+{
+  size_t allocated = N_STUB;
+  path->nodes = path->stub;
+  path->n = 0;
+
+  while (item)
+    {
+      if (path->n >= allocated)
+        {
+          if (path->nodes == path->stub)
+            path->nodes = xmemdup (path->stub, sizeof path->stub);
+          path->nodes = x2nrealloc (path->nodes, &allocated,
+                                    sizeof *path->nodes);
+        }
+      path->nodes[path->n++] = item;
+      item = item->parent;
+    }
+
+  for (size_t i = 0; i < path->n / 2; i++)
+    swap_nodes (&path->nodes[i], &path->nodes[path->n - i - 1]);
+}
+
+static void
+free_path (struct item_path *path)
+{
+  if (path && path->nodes != path->stub)
+    free (path->nodes);
+}
+
+static void
+dump_heading_transition (const struct spv_item *old,
+                         const struct spv_item *new)
+{
+  if (old == new)
+    return;
+
+  struct item_path old_path, new_path;
+  get_path (old, &old_path);
+  get_path (new, &new_path);
+
+  size_t common = 0;
+  for (; common < old_path.n && common < new_path.n; common++)
+    if (old_path.nodes[common] != new_path.nodes[common])
+      break;
+
+  for (size_t i = common; i < old_path.n; i++)
+    group_close_item_submit (group_close_item_create ());
+  for (size_t i = common; i < new_path.n; i++)
+    group_open_item_submit (group_open_item_create (
+                              new_path.nodes[i]->command_id));
+
+  free_path (&old_path);
+  free_path (&new_path);
+}
+
+static void
+run_convert (int argc UNUSED, char **argv)
+{
+  output_engine_push ();
+  output_set_filename (argv[1]);
+  string_map_insert (&output_options, "output-file", argv[2]);
+  struct output_driver *driver = output_driver_create (&output_options);
+  if (!driver)
+    exit (EXIT_FAILURE);
+  output_driver_register (driver);
+
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  const struct page_setup *ps = spv_get_page_setup (spv);
+  if (ps)
+    page_setup_item_submit (page_setup_item_create (ps));
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  struct spv_item *prev_heading = spv_get_root (spv);
+  for (size_t i = 0; i < n_items; i++)
+    {
+      struct spv_item *heading
+        = items[i]->type == SPV_ITEM_HEADING ? items[i] : items[i]->parent;
+      dump_heading_transition (prev_heading, heading);
+      dump_item (items[i]);
+      prev_heading = heading;
+    }
+  dump_heading_transition (prev_heading, spv_get_root (spv));
+  free (items);
+
+  spv_close (spv);
+
+  output_engine_pop ();
+  fh_done ();
+}
+
+static void
+run_dump (int argc UNUSED, char **argv)
+{
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  for (size_t i = 0; i < n_items; i++)
+    if (items[i]->type == SPV_ITEM_TABLE)
+      {
+        pivot_table_dump (spv_item_get_table (items[i]), 0);
+        putchar ('\n');
+      }
+  free (items);
+
+  spv_close (spv);
+}
+
+static int
+compare_borders (const void *a_, const void *b_)
+{
+  const struct spvlb_border *const *ap = a_;
+  const struct spvlb_border *const *bp = b_;
+  uint32_t a = (*ap)->border_type;
+  uint32_t b = (*bp)->border_type;
+
+  return a < b ? -1 : a > b;
+}
+
+static int
+compare_cells (const void *a_, const void *b_)
+{
+  const struct spvlb_cell *const *ap = a_;
+  const struct spvlb_cell *const *bp = b_;
+  uint64_t a = (*ap)->index;
+  uint64_t b = (*bp)->index;
+
+  return a < b ? -1 : a > b;
+}
+
+static void
+run_dump_light_table (int argc UNUSED, char **argv)
+{
+  if (raw && isatty (STDOUT_FILENO))
+    error (1, 0, "not writing binary data to tty");
+
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  for (size_t i = 0; i < n_items; i++)
+    {
+      if (!spv_item_is_light_table (items[i]))
+        continue;
+
+      char *error;
+      if (raw)
+        {
+          void *data;
+          size_t size;
+          error = spv_item_get_raw_light_table (items[i], &data, &size);
+          if (!error)
+            {
+              fwrite (data, size, 1, stdout);
+              free (data);
+            }
+        }
+      else
+        {
+          struct spvlb_table *table;
+          error = spv_item_get_light_table (items[i], &table);
+          if (!error)
+            {
+              if (sort)
+                {
+                  qsort (table->borders->borders, table->borders->n_borders,
+                         sizeof *table->borders->borders, compare_borders);
+                  qsort (table->cells->cells, table->cells->n_cells,
+                         sizeof *table->cells->cells, compare_cells);
+                }
+              spvlb_print_table (items[i]->bin_member, 0, table);
+              spvlb_free_table (table);
+            }
+        }
+      if (error)
+        {
+          msg (ME, "%s", error);
+          free (error);
+        }
+    }
+
+  free (items);
+
+  spv_close (spv);
+}
+
+static void
+run_dump_legacy_data (int argc UNUSED, char **argv)
+{
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  for (size_t i = 0; i < n_items; i++)
+    if (spv_item_is_legacy_table (items[i]))
+      {
+        struct spv_data data;
+        char *error;
+        if (raw)
+          {
+            void *raw_data;
+            size_t size;
+            error = spv_item_get_raw_legacy_data (items[i], &raw_data, &size);
+            if (!error)
+              {
+                fwrite (raw_data, size, 1, stdout);
+                free (raw_data);
+              }
+          }
+        else
+          {
+            error = spv_item_get_legacy_data (items[i], &data);
+            if (!error)
+              {
+                printf ("%s:\n", items[i]->bin_member);
+                spv_data_dump (&data, stdout);
+                spv_data_uninit (&data);
+                printf ("\n");
+              }
+          }
+
+        if (error)
+          {
+            msg (ME, "%s", error);
+            free (error);
+          }
+      }
+  free (items);
+
+  spv_close (spv);
+}
+
+/* This is really bogus.
+
+   XPath doesn't have any notion of a default XML namespace, but all of the
+   elements in the documents we're interested in have a namespace.  Thus, we'd
+   need to require the XPath expressions to have a namespace on every single
+   element: vis:sourceVariable, vis:graph, and so on.  That's a pain.  So,
+   instead, we remove the default namespace from every place it occurs.  XPath
+   does support the null namespace, so this allows sourceVariable, graph,
+   etc. to work.
+
+   See http://plasmasturm.org/log/259/ and
+   https://mail.gnome.org/archives/xml/2003-April/msg00144.html for more
+   information. */
+static void
+remove_default_xml_namespace (xmlNode *node)
+{
+  if (node->ns && !node->ns->prefix)
+    node->ns = NULL;
+
+  for (xmlNode *child = node->children; child; child = child->next)
+    remove_default_xml_namespace (child);
+}
+
+static void
+register_ns (xmlXPathContext *ctx, const char *prefix, const char *uri)
+{
+  xmlXPathRegisterNs (ctx, CHAR_CAST (xmlChar *, prefix),
+                      CHAR_CAST (xmlChar *, uri));
+}
+
+static xmlXPathContext *
+create_xpath_context (xmlDoc *doc)
+{
+  xmlXPathContext *ctx = xmlXPathNewContext (doc);
+  register_ns (ctx, "vgr", "http://xml.spss.com/spss/viewer/viewer-graph");
+  register_ns (ctx, "vizml", "http://xml.spss.com/visualization");
+  register_ns (ctx, "vmd", "http://xml.spss.com/spss/viewer/viewer-model");
+  register_ns (ctx, "vps", "http://xml.spss.com/spss/viewer/viewer-pagesetup");
+  register_ns (ctx, "vst", "http://xml.spss.com/spss/viewer/viewer-style");
+  register_ns (ctx, "vtb", "http://xml.spss.com/spss/viewer/viewer-table");
+  register_ns (ctx, "vtl", "http://xml.spss.com/spss/viewer/table-looks");
+  register_ns (ctx, "vtt", "http://xml.spss.com/spss/viewer/viewer-treemodel");
+  register_ns (ctx, "vtx", "http://xml.spss.com/spss/viewer/viewer-text");
+  register_ns (ctx, "xsi", "http://www.w3.org/2001/XMLSchema-instance");
+  return ctx;
+}
+
+static void
+dump_xml (int argc, char **argv, const char *member_name,
+          char *error_s, xmlDoc *doc)
+{
+  if (!error_s)
+    {
+      if (argc == 2)
+        {
+          printf ("<!-- %s -->\n", member_name);
+          xmlElemDump (stdout, NULL, xmlDocGetRootElement (doc));
+          putchar ('\n');
+        }
+      else
+        {
+          bool any_results = false;
+
+          remove_default_xml_namespace (xmlDocGetRootElement (doc));
+          for (int i = 2; i < argc; i++)
+            {
+              xmlXPathContext *xpath_ctx = create_xpath_context (doc);
+              xmlXPathSetContextNode (xmlDocGetRootElement (doc),
+                                      xpath_ctx);
+              xmlXPathObject *xpath_obj = xmlXPathEvalExpression (
+                CHAR_CAST (xmlChar *, argv[i]), xpath_ctx);
+              if (!xpath_obj)
+                error (1, 0, _("%s: invalid XPath expression"), argv[i]);
+
+              const xmlNodeSet *nodes = xpath_obj->nodesetval;
+              if (nodes && nodes->nodeNr > 0)
+                {
+                  if (!any_results)
+                    {
+                      printf ("<!-- %s -->\n", member_name);
+                      any_results = true;
+                    }
+                  for (int j = 0; j < nodes->nodeNr; j++)
+                    {
+                      xmlElemDump (stdout, doc, nodes->nodeTab[j]);
+                      putchar ('\n');
+                    }
+                }
+
+              xmlXPathFreeObject (xpath_obj);
+              xmlXPathFreeContext (xpath_ctx);
+            }
+          if (any_results)
+            putchar ('\n');
+        }
+      xmlFreeDoc (doc);
+    }
+  else
+    {
+      printf ("<!-- %s -->\n", member_name);
+      msg (ME, "%s", error_s);
+      free (error_s);
+    }
+}
+
+static void
+run_dump_legacy_table (int argc, char **argv)
+{
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  for (size_t i = 0; i < n_items; i++)
+    if (spv_item_is_legacy_table (items[i]))
+      {
+        xmlDoc *doc;
+        char *error_s = spv_item_get_legacy_table (items[i], &doc);
+        dump_xml (argc, argv, items[i]->xml_member, error_s, doc);
+      }
+  free (items);
+
+  spv_close (spv);
+}
+
+static void
+run_dump_structure (int argc, char **argv)
+{
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  const char *last_structure_member = NULL;
+  for (size_t i = 0; i < n_items; i++)
+    if (!last_structure_member || strcmp (items[i]->structure_member,
+                                          last_structure_member))
+      {
+        last_structure_member = items[i]->structure_member;
+
+        xmlDoc *doc;
+        char *error_s = spv_item_get_structure (items[i], &doc);
+        dump_xml (argc, argv, items[i]->structure_member, error_s, doc);
+      }
+  free (items);
+
+  spv_close (spv);
+}
+
+static void
+run_is_legacy (int argc UNUSED, char **argv)
+{
+  struct spv_reader *spv;
+  char *err = spv_open (argv[1], &spv);
+  if (err)
+    error (1, 0, "%s", err);
+
+  bool is_legacy = false;
+
+  struct spv_item **items;
+  size_t n_items;
+  spv_select (spv, &criteria, &items, &n_items);
+  for (size_t i = 0; i < n_items; i++)
+    if (spv_item_is_legacy_table (items[i]))
+      {
+        is_legacy = true;
+        break;
+      }
+  free (items);
+
+  spv_close (spv);
+
+  exit (is_legacy ? EXIT_SUCCESS : EXIT_FAILURE);
+}
+
+struct command
+  {
+    const char *name;
+    int min_args, max_args;
+    void (*run) (int argc, char **argv);
+  };
+
+static const struct command commands[] =
+  {
+    { "detect", 1, 1, run_detect },
+    { "dir", 1, 1, run_directory },
+    { "convert", 2, 2, run_convert },
+
+    /* Undocumented commands. */
+    { "dump", 1, 1, run_dump },
+    { "dump-light-table", 1, 1, run_dump_light_table },
+    { "dump-legacy-data", 1, 1, run_dump_legacy_data },
+    { "dump-legacy-table", 1, INT_MAX, run_dump_legacy_table },
+    { "dump-structure", 1, INT_MAX, run_dump_structure },
+    { "is-legacy", 1, 1, run_is_legacy },
+  };
+static const int n_commands = sizeof commands / sizeof *commands;
+
+static const struct command *
+find_command (const char *name)
+{
+  for (size_t i = 0; i < n_commands; i++)
+    {
+      const struct command *c = &commands[i];
+      if (!strcmp (name, c->name))
+        return c;
+    }
+  return NULL;
+}
+
+static void
+emit_msg (const struct msg *m, void *aux UNUSED)
+{
+  if (m->severity == MSG_S_ERROR || m->severity == MSG_S_WARNING)
+    n_warnings++;
+
+  char *s = msg_to_string (m);
+  fprintf (stderr, "%s\n", s);
+  free (s);
+}
+
+int
+main (int argc, char **argv)
+{
+  set_program_name (argv[0]);
+  msg_set_handler (emit_msg, NULL);
+  settings_init ();
+  i18n_init ();
+
+  parse_options (argc, argv);
+
+  argc -= optind;
+  argv += optind;
+
+  if (argc < 1)
+    error (1, 0, _("missing command name (use --help for help)"));
+
+  const struct command *c = find_command (argv[0]);
+  if (!c)
+    error (1, 0, _("unknown command \"%s\" (use --help for help)"), argv[0]);
+
+  int n_args = argc - 1;
+  if (n_args < c->min_args || n_args > c->max_args)
+    {
+      if (c->min_args == c->max_args)
+        error (1, 0, _("\"%s\" command takes exactly %d argument%s"),
+               c->name, c->min_args, c->min_args == 1 ? "" : "s");
+      else if (c->max_args == INT_MAX)
+        error (1, 0, _("\"%s\" command requires at least %d argument%s"),
+               c->name, c->min_args, c->min_args == 1 ? "" : "s");
+      else
+        error (1, 0, _("\"%s\" command requires between %d and %d arguments"),
+               c->name, c->min_args, c->max_args);
+    }
+
+  c->run (argc, argv);
+
+  i18n_done ();
+
+  return n_warnings ? EXIT_FAILURE : EXIT_SUCCESS;
+}
+
+static void
+parse_select (char *arg, bool invert)
+{
+  unsigned classes = 0;
+  for (char *token = strtok (arg, ","); token; token = strtok (NULL, ","))
+    {
+      if (!strcmp (token, "all"))
+        classes = SPV_ALL_CLASSES;
+      else if (!strcmp (token, "help"))
+        {
+          puts (_("The following object classes are supported:"));
+          for (int class = 0; class < SPV_N_CLASSES; class++)
+            printf ("- %s\n", spv_item_class_to_string (class));
+          exit (0);
+        }
+      else
+        {
+          int class = spv_item_class_from_string (token);
+          if (class == SPV_N_CLASSES)
+            error (1, 0, _("%s: unknown object class (use --select=help "
+                           "for help)"), token);
+          classes |= 1u << class;
+        }
+    }
+
+  criteria.classes = invert ? classes ^ SPV_ALL_CLASSES : classes;
+}
+
+static void
+parse_commands (char *arg)
+{
+  size_t allocated_commands = criteria.n_commands;
+
+  for (char *token = strtok (arg, ","); token; token = strtok (NULL, ","))
+    {
+      char *save_ptr = NULL;
+      char *name = strtok_r (token, "()", &save_ptr);
+      char *number = strtok_r (NULL, "()", &save_ptr);
+
+      if (criteria.n_commands >= allocated_commands)
+        criteria.commands = x2nrealloc (criteria.commands, &allocated_commands,
+                                        sizeof *criteria.commands);
+
+      struct spv_command_match *cm = &criteria.commands[criteria.n_commands++];
+      if (!strcmp (name, "last"))
+        {
+          cm->name = NULL;
+          cm->instance = -1;
+        }
+      else if (c_isdigit (name[0]))
+        {
+          cm->name = NULL;
+          cm->instance = atoi (name);
+        }
+      else
+        {
+          cm->name = name;
+          cm->instance = (!number ? 0
+                          : !strcmp (number, "last") ? -1
+                          : atoi (number));
+        }
+    }
+}
+
+static void
+parse_subtypes (char *arg)
+{
+  for (char *token = strtok (arg, ","); token; token = strtok (NULL, ","))
+    string_set_insert (&criteria.subtypes, token);
+}
+
+static void
+parse_labels (char *arg, enum spv_label_match_op op)
+{
+  size_t allocated_labels = criteria.n_labels;
+
+  for (char *token = strtok (arg, ","); token; token = strtok (NULL, ","))
+    {
+      if (criteria.n_labels >= allocated_labels)
+        criteria.labels = x2nrealloc (criteria.labels, &allocated_labels,
+                                      sizeof *criteria.labels);
+
+      struct spv_label_match *lm = &criteria.labels[criteria.n_labels++];
+      lm->op = op;
+      lm->arg = token;
+    }
+}
+
+static void
+parse_instances (char *arg)
+{
+  size_t allocated_instances = criteria.n_instances;
+
+  for (char *token = strtok (arg, ","); token; token = strtok (NULL, ","))
+    {
+      if (criteria.n_instances >= allocated_instances)
+        criteria.instances = x2nrealloc (criteria.instances,
+                                         &allocated_instances,
+                                         sizeof *criteria.instances);
+
+      criteria.instances[criteria.n_instances++]
+        = (!strcmp (token, "last") ? -1 : atoi (token));
+    }
+}
+
+static void
+parse_options (int argc, char *argv[])
+{
+  for (;;)
+    {
+      enum
+        {
+          OPT_MEMBER_NAME = UCHAR_MAX + 1,
+          OPT_SHOW_HIDDEN,
+          OPT_SELECT,
+          OPT_SELECT_EXCEPT,
+          OPT_COMMANDS,
+          OPT_SUBTYPES,
+          OPT_LABELS,
+          OPT_LABELS_CONTAINING,
+          OPT_LABELS_STARTING,
+          OPT_LABELS_ENDING,
+          OPT_INSTANCES,
+          OPT_ERRORS,
+          OPT_SORT,
+          OPT_RAW,
+        };
+      static const struct option long_options[] =
+        {
+          { "member-name", no_argument, NULL, OPT_MEMBER_NAME },
+          { "show-hidden", no_argument, NULL, OPT_SHOW_HIDDEN },
+          { "select", required_argument, NULL, OPT_SELECT },
+          { "select-except", required_argument, NULL, OPT_SELECT_EXCEPT },
+          { "commands", required_argument, NULL, OPT_COMMANDS },
+          { "subtypes", required_argument, NULL, OPT_SUBTYPES },
+          { "labels", required_argument, NULL, OPT_LABELS },
+          { "labels-containing", required_argument, NULL,
+            OPT_LABELS_CONTAINING },
+          { "labels-starting", required_argument, NULL, OPT_LABELS_STARTING },
+          { "labels-ending", required_argument, NULL, OPT_LABELS_ENDING },
+          { "instances", required_argument, NULL, OPT_INSTANCES },
+          { "errors", no_argument, NULL, OPT_ERRORS },
+          { "sort", no_argument, NULL, OPT_SORT },
+          { "raw", no_argument, NULL, OPT_RAW },
+          { "help", no_argument, NULL, 'h' },
+          { "version", no_argument, NULL, 'v' },
+          { NULL, 0, NULL, 0 },
+        };
+
+      int c;
+
+      c = getopt_long (argc, argv, "O:hv", long_options, NULL);
+      if (c == -1)
+        break;
+
+      switch (c)
+        {
+        case 'O':
+          output_driver_parse_option (optarg, &output_options);
+          break;
+
+        case OPT_MEMBER_NAME:
+          show_member_name = true;
+          break;
+
+        case OPT_SHOW_HIDDEN:
+          criteria.include_hidden = true;
+          break;
+
+        case OPT_SELECT:
+          parse_select (optarg, false);
+          break;
+
+        case OPT_SELECT_EXCEPT:
+          parse_select (optarg, true);
+          break;
+
+        case OPT_COMMANDS:
+          parse_commands (optarg);
+          break;
+
+        case OPT_SUBTYPES:
+          parse_subtypes (optarg);
+          break;
+
+        case OPT_LABELS:
+          parse_labels (optarg, SPV_LABEL_MATCH_EQUALS);
+          break;
+
+        case OPT_LABELS_CONTAINING:
+          parse_labels (optarg, SPV_LABEL_MATCH_CONTAINS);
+          break;
+
+        case OPT_LABELS_STARTING:
+          parse_labels (optarg, SPV_LABEL_MATCH_STARTS);
+          break;
+
+        case OPT_LABELS_ENDING:
+          parse_labels (optarg, SPV_LABEL_MATCH_ENDS);
+          break;
+
+        case OPT_INSTANCES:
+          parse_instances (optarg);
+          break;
+
+        case OPT_ERRORS:
+          criteria.error = true;
+          break;
+
+        case OPT_SORT:
+          sort = true;
+          break;
+
+        case OPT_RAW:
+          raw = true;
+          break;
+
+        case 'v':
+          version_etc (stdout, "pspp-output", PACKAGE_NAME, PACKAGE_VERSION,
+                       "Ben Pfaff", "John Darrington", NULL_SENTINEL);
+          exit (EXIT_SUCCESS);
+
+        case 'h':
+          usage ();
+          exit (EXIT_SUCCESS);
+
+        default:
+          exit (EXIT_FAILURE);
+        }
+    }
+}
+
+static void
+usage (void)
+{
+  struct string s = DS_EMPTY_INITIALIZER;
+  struct string_set formats = STRING_SET_INITIALIZER(formats);
+  output_get_supported_formats (&formats);
+  const char *format;
+  const struct string_set_node *node;
+  STRING_SET_FOR_EACH (format, node, &formats)
+    {
+      if (!ds_is_empty (&s))
+        ds_put_byte (&s, ' ');
+      ds_put_cstr (&s, format);
+    }
+  string_set_destroy (&formats);
+
+  printf ("\
+%s, a utility for working with SPSS output (.spv) files.\n\
+Usage: %s [OPTION]... COMMAND ARG...\n\
+\n\
+The following commands are available:\n\
+  detect INPUT           Detect whether INPUT is an SPV file.\n\
+  dir INPUT              List tables and other items in INPUT.\n\
+  convert INPUT OUTPUT   Convert .spv INPUT to OUTPUT.\n\
+\n\
+The desired format of OUTPUT is by default inferred from its extension:\n\
+%s\n\
+\n\
+Options:\n\
+  -O format=FORMAT    override format for output\n\
+  -O OPTION=VALUE     set output option\n\
+  --help              display this help and exit\n\
+  --version           output version information and exit\n",
+          program_name, program_name, ds_cstr (&s));
+  ds_destroy (&s);
+}
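
Note on dump_heading_transition() above: it finds the longest common prefix of the two root-to-heading paths, closes every group below that prefix on the old path, and opens every group below it on the new path. The following is a minimal standalone sketch of that idea, not the code from this commit. struct node, MAX_DEPTH, path_from_root() and heading_transition() are hypothetical names introduced here for illustration, a fixed-size array stands in for the stub-plus-heap path buffer, and printed events stand in for the group_open_item/group_close_item submissions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct spv_item: only the fields needed here. */
struct node
  {
    const char *name;
    const struct node *parent;
  };

/* Assumed upper bound on nesting depth, for this sketch only. */
#define MAX_DEPTH 16

/* Stores the path from the root down to NODE in PATH and returns its
   length.  A null NODE yields an empty path. */
static size_t
path_from_root (const struct node *node, const struct node *path[MAX_DEPTH])
{
  size_t n = 0;
  for (; node; node = node->parent)
    {
      assert (n < MAX_DEPTH);
      path[n++] = node;
    }

  /* The loop collected nodes leaf-to-root, so reverse in place. */
  for (size_t i = 0; i < n / 2; i++)
    {
      const struct node *tmp = path[i];
      path[i] = path[n - i - 1];
      path[n - i - 1] = tmp;
    }
  return n;
}

/* Prints the "close" and "open" events needed to move from heading OLD to
   heading NEW, and stores the number of each in *N_CLOSED and *N_OPENED. */
static void
heading_transition (const struct node *old, const struct node *new,
                    size_t *n_closed, size_t *n_opened)
{
  const struct node *old_path[MAX_DEPTH], *new_path[MAX_DEPTH];
  size_t old_n = path_from_root (old, old_path);
  size_t new_n = path_from_root (new, new_path);

  /* Length of the shared prefix of the two paths. */
  size_t common = 0;
  while (common < old_n && common < new_n
         && old_path[common] == new_path[common])
    common++;

  *n_closed = old_n - common;
  *n_opened = new_n - common;
  for (size_t i = common; i < old_n; i++)
    puts ("close");
  for (size_t i = common; i < new_n; i++)
    printf ("open %s\n", new_path[i]->name);
}
```

For example, with a tree root -> a -> b and root -> c, moving from b to c shares only the root, so the sketch closes two groups (b and a) and opens one (c).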
doc/dev/spv-file-format.texi
src/language/dictionary/sys-file-info.c
src/language/stats/crosstabs.q
src/libpspp/zip-writer.c
src/libpspp/zip-writer.h
src/output/ascii.c
src/output/automake.mk
src/output/cairo.c
src/output/csv.c
src/output/driver.c
src/output/html.c
src/output/odt.c
src/output/pivot-output.c
src/output/pivot-table.c
src/output/pivot-table.h
src/output/render.c
src/output/spv-driver.c	[new file with mode: 0644]
src/output/spv/automake.mk	[new file with mode: 0644]
src/output/spv/binary-parser-generator	[new file with mode: 0644]
src/output/spv/detail-xml.grammar	[new file with mode: 0644]
src/output/spv/light-binary.grammar	[new file with mode: 0644]
src/output/spv/old-binary.grammar	[new file with mode: 0644]
src/output/spv/spv-css-parser.c	[new file with mode: 0644]
src/output/spv/spv-css-parser.h	[new file with mode: 0644]
src/output/spv/spv-dump.c	[new file with mode: 0644]
src/output/spv/spv-legacy-data.c	[new file with mode: 0644]
src/output/spv/spv-legacy-data.h	[new file with mode: 0644]
src/output/spv/spv-legacy-decoder.c	[new file with mode: 0644]
src/output/spv/spv-legacy-decoder.h	[new file with mode: 0644]
src/output/spv/spv-light-decoder.c	[new file with mode: 0644]
src/output/spv/spv-light-decoder.h	[new file with mode: 0644]
src/output/spv/spv-output.c	[new file with mode: 0644]
src/output/spv/spv-output.h	[new file with mode: 0644]
src/output/spv/spv-select.c	[new file with mode: 0644]
src/output/spv/spv-select.h	[new file with mode: 0644]
src/output/spv/spv-writer.c	[new file with mode: 0644]
src/output/spv/spv-writer.h	[new file with mode: 0644]
src/output/spv/spv.c	[new file with mode: 0644]
src/output/spv/spv.h	[new file with mode: 0644]
src/output/spv/spvbin-helpers.c	[new file with mode: 0644]
src/output/spv/spvbin-helpers.h	[new file with mode: 0644]
src/output/spv/spvxml-helpers.c	[new file with mode: 0644]
src/output/spv/spvxml-helpers.h	[new file with mode: 0644]
src/output/spv/structure-xml.grammar	[new file with mode: 0644]
src/output/spv/xml-parser-generator	[new file with mode: 0644]
src/output/table-item.c
src/output/table.c
src/ui/gui/psppire-output-window.c
src/ui/gui/psppire-window.c
src/ui/gui/psppire-window.h
src/ui/gui/psppire.c
tests/output/render.at
utilities/automake.mk
utilities/pspp-output.c	[new file with mode: 0644]