From: Ben Pfaff Date: Sat, 19 Oct 2019 05:30:43 +0000 (+0000) Subject: Add support for reading and writing SPV files. X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;ds=sidebyside;h=50f6ea7d66d03895020891215fb4f55bbf061003;p=pspp Add support for reading and writing SPV files. --- diff --git a/NEWS b/NEWS index 020bc6d148..9aeaca95ea 100644 --- a/NEWS +++ b/NEWS @@ -6,6 +6,18 @@ Please send PSPP bug reports to bug-gnu-pspp@gnu.org. Changes from 1.2.0 to 1.3.0: + * PSPP now supports the SPSS viewer (.spv) format that SPSS 16 and later + use to save the contents of its output editor: + + - PSPP and PSPPIRE can write output to .spv files. + + - The new utility pspp-output can convert .spv files to other formats. + + - The pspp-convert utility can now decrypt encrypted .spv files. The + encrypted viewer file format is unacceptably insecure, so to + discourage its use PSPP and PSPPIRE do not directly read or write + this format. + * A bug where the Data|Select Cases|Random Sample menu would generate invalid syntax has been fixed. @@ -14,15 +26,8 @@ Changes from 1.2.0 to 1.3.0: * Plain text output is no longer divided into pages, since it is now rarely printed on paper. - * pspp-convert: - - - New support to decrypt encrypted viewer (SPV) files. The - encrypted viewer file format is unacceptably insecure, so to - discourage its use PSPP and PSPPIRE do not directly read or write - this format. - - - New "-a", "-l", "--password-list" options to search for an - encrypted file's password. + * The pspp-convert utility has new "-a", "-l", "--password-list" + options to search for an encrypted file's password. * Improvements to SAVE DATA COLLECTION support for MDD files. 
diff --git a/doc/automake.mk b/doc/automake.mk index 951c6dbaa9..3daa2cc0be 100644 --- a/doc/automake.mk +++ b/doc/automake.mk @@ -35,6 +35,7 @@ doc_pspp_TEXINFOS = doc/version.texi \ doc/language.texi \ doc/license.texi \ doc/pspp-convert.texi \ + doc/pspp-output.texi \ doc/pspp-dump-sav.texi \ doc/ni.texi \ doc/not-implemented.texi \ diff --git a/doc/pspp-output.texi b/doc/pspp-output.texi new file mode 100644 index 0000000000..38c72d3aa7 --- /dev/null +++ b/doc/pspp-output.texi @@ -0,0 +1,191 @@ +@c PSPP - a program for statistical analysis. +@c Copyright (C) 2019 Free Software Foundation, Inc. +@c Permission is granted to copy, distribute and/or modify this document +@c under the terms of the GNU Free Documentation License, Version 1.3 +@c or any later version published by the Free Software Foundation; +@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. +@c A copy of the license is included in the section entitled "GNU +@c Free Documentation License". +@c +@node Invoking pspp-output +@chapter Invoking @command{pspp-output} +@cindex Invocation +@cindex @command{pspp-output} + +@command{pspp-output} is a command-line utility accompanying @pspp{}. +It supports multiple operations on SPSS viewer or @file{.spv} files, +here called SPV files. SPSS 16 and later write SPV files to +represent the contents of its output editor. + +SPSS 15 and earlier versions instead use @file{.spo} files. +@command{pspp-output} does not support this format. + +@command{pspp-output} may be invoked in the following ways: + +@display +@t{pspp-output} @t{detect} @var{file} + +@t{pspp-output} [@var{options}] @t{dir} @var{file} + +@t{pspp-output} [@var{options}] @t{convert} @var{source} @var{destination} + +@t{pspp-output -@w{-}help} + +@t{pspp-output -@w{-}version} +@end display + +Each of these forms is documented separately below. +@command{pspp-output} also has several undocumented command forms that +developers may find useful for debugging. 
+ +@node The pspp-output detect Command +@section The @code{detect} Command + +@display +@t{pspp-output} @t{detect} @var{file} +@end display + +When @var{file} is an SPV file, @command{pspp-output} exits +successfully without outputting anything. When @var{file} is not an +SPV file or some other error occurs, @command{pspp-output} prints an +error message and exits with a failure indication. + +@node The pspp-output dir Command +@section The @code{dir} Command + +@display +@t{pspp-output} [@var{options}] @t{dir} @var{file} +@end display + +Prints on stdout a table of contents for SPV file @var{file}. By +default, this table lists every object in the file, except for hidden +objects. @xref{Input Selection Options}, for information on the +options available to select a subset of objects. + +The following additional option for @command{dir} is intended mainly +for use by PSPP developers: + +@table @option +@item --member-names +Also show the names of the Zip members associated with each object. +@end table + +@node The pspp-output convert Command +@section The @code{convert} Command + +@display +@t{pspp-output} [@var{options}] @t{convert} @var{source} @var{destination} +@end display + +Reads SPV file @var{source} and converts it to another format, writing +the output to @var{destination}. + +By default, the intended format for @var{destination} is inferred +based on its extension, in the same way that the @command{pspp} +program does for its output files. @xref{Invoking PSPP}, for details. + +@xref{Input Selection Options}, for information on the options +available to select a subset of objects to include in the output. The +following additional options are accepted: + +@table @option +@item -O format=@var{format} +Overrides the format inferred from the output file's extension. Use +@option{--help} to list the available formats. @xref{Invoking PSPP}, +for details of the available output formats. 
+ +@item -O @var{option}=@var{value} +Sets an option for the output file format. @xref{Invoking PSPP}, for +details of the available output options. + +@item -F +@itemx --force +By default, if the source is corrupt or otherwise cannot be processed, +the destination is not written. With @option{-F} or @option{--force}, +the destination is written as best it can, even with errors. +@end table + +@node Input Selection Options +@section Input Selection Options + +The @command{dir} and @command{convert} commands, by default, operate +on all of the objects in the source SPV file, except for objects that +are not visible in the output viewer window. The user may specify +these options to select a subset of the input objects. When multiple +options are used, only objects that satisfy all of them are selected: + +@table @option +@item --select=@r{[}^@r{]}@var{class}@dots{} +Include only objects of the given @var{class}; with leading @samp{^}, +include only objects not in the class. Use commas to separate +multiple classes. The supported classes are: + +@quotation +@code{charts headings logs models tables texts trees warnings +outlineheaders pagetitle notes unknown other} +@end quotation + +Use @option{--select=help} to print this list of classes. + +@item --commands=@r{[}^@r{]}@var{command}@dots{} +@itemx --subtypes=@r{[}^@r{]}@var{subtype}@dots{} +@itemx --labels=@r{[}^@r{]}@var{label}@dots{} +Include only objects with the specified @var{command}, @var{subtype}, +or @var{label}. With a leading @samp{^}, include only the objects +that do not match. Multiple values may be specified separated by +commas. An asterisk at the end of a value acts as a wildcard. + +The @option{--commands} option matches command identifiers, case +insensitively. All of the objects produced by a single command use +the same, unique command identifier. Command identifiers are always +in English regardless of the language used for output. They often +differ from the command name in PSPP syntax. 
Use the +@command{pspp-output} program's @command{dir} command to print command +identifiers in particular output. + +The @option{--subtypes} option matches particular tables within a +command, case insensitively. Subtypes are not necessarily unique: two +commands that produce similar output tables may use the same subtype. +Subtypes are always in English and @command{dir} will print them. + +The @option{--labels} option matches the labels in table output (that +is, the table titles). Labels are affected by the output language, +variable names and labels, split file settings, and other factors. + +@item --instances=@var{instance}@dots{} +Include the specified @var{instance} of an object that matches the +other criteria within a single command. The @var{instance} may be a +number (1 for the first instance, 2 for the second, and so on) or +@code{last} for the last instance. + +@item --show-hidden +Include hidden output objects in the output. By default, they are +excluded. + +@item --or +Separates two sets of selection options. Objects selected by either +set of options are included in the output. +@end table + +The following additional input selection options are intended mainly +for use by PSPP developers: + +@table @option +@item --errors +Include only objects that cause an error when read. With the +@command{convert} command, this is most useful in conjunction with the +@option{--force} option. + +@item --members=@var{member}@dots{} +Include only the objects that include a listed Zip file @var{member}. +More than one name may be included, comma-separated. The members in +an SPV file may be listed with the @command{dir} command by adding the +@option{--member-names} option or with the @command{zipinfo} program +included with many operating systems. Error messages that +@command{pspp-output} prints when it reads SPV files also often +include member names. 
+ +@item --member-names +Displays the name of the Zip member or members associated with each +object just above the object itself. +@end table diff --git a/doc/pspp.texi b/doc/pspp.texi index 7da31a4e83..39663483d0 100644 --- a/doc/pspp.texi +++ b/doc/pspp.texi @@ -122,6 +122,7 @@ in the production of this manual. * Utilities:: Other commands. * Invoking pspp-convert:: Utility for converting among file formats. +* Invoking pspp-output:: Utility for working with viewer (SPV) files. * Invoking pspp-dump-sav:: Utility for examining raw .sav files. * Not Implemented:: What's not here yet * Bugs:: Known problems; submitting bug reports. @@ -151,6 +152,7 @@ in the production of this manual. @include utilities.texi @include pspp-convert.texi +@include pspp-output.texi @include pspp-dump-sav.texi @include not-implemented.texi @include bugs.texi diff --git a/src/output/automake.mk b/src/output/automake.mk index 960f43bfc2..dd9f3fc9cd 100644 --- a/src/output/automake.mk +++ b/src/output/automake.mk @@ -72,6 +72,7 @@ src_output_liboutput_la_SOURCES = \ src/output/pivot-table.h \ src/output/render.c \ src/output/render.h \ + src/output/spv-driver.c \ src/output/table-item.c \ src/output/table-item.h \ src/output/table-provider.h \ @@ -95,7 +96,10 @@ src_output_liboutput_la_SOURCES += \ src/output/charts/spreadlevel-cairo.c \ src/output/charts/scatterplot-cairo.c endif +nodist_src_output_liboutput_la_SOURCES = EXTRA_DIST += \ src/output/README \ src/output/mk-class-boilerplate + +include src/output/spv/automake.mk diff --git a/src/output/driver.c b/src/output/driver.c index 8e856222dc..7782e91f0c 100644 --- a/src/output/driver.c +++ b/src/output/driver.c @@ -435,6 +435,7 @@ extern const struct output_driver_factory list_driver_factory; extern const struct output_driver_factory html_driver_factory; extern const struct output_driver_factory csv_driver_factory; extern const struct output_driver_factory odt_driver_factory; +extern const struct output_driver_factory 
spv_driver_factory; #ifdef HAVE_CAIRO extern const struct output_driver_factory pdf_driver_factory; extern const struct output_driver_factory ps_driver_factory; @@ -448,6 +449,7 @@ static const struct output_driver_factory *factories[] = &html_driver_factory, &csv_driver_factory, &odt_driver_factory, + &spv_driver_factory, #ifdef HAVE_CAIRO &pdf_driver_factory, &ps_driver_factory, diff --git a/src/output/pivot-output.c b/src/output/pivot-output.c index e068a50fc1..e624f78dcb 100644 --- a/src/output/pivot-output.c +++ b/src/output/pivot-output.c @@ -509,6 +509,8 @@ pivot_table_submit_layer (const struct pivot_table *pt, } free (footnotes); + ti->pt = pivot_table_ref (pt); + table_item_submit (ti); } diff --git a/src/output/pivot-table.c b/src/output/pivot-table.c index 3878b888e5..031656227b 100644 --- a/src/output/pivot-table.c +++ b/src/output/pivot-table.c @@ -704,10 +704,8 @@ pivot_table_create__ (struct pivot_value *title, const char *subtype) table->show_caption = true; table->weight_format = (struct fmt_spec) { FMT_F, 40, 0 }; table->title = title; - table->subtype = pivot_value_new_text (subtype); - - const char *command_id = output_get_command_name (); - table->command_c = command_id ? xstrdup (command_id) : NULL; + table->subtype = subtype ? pivot_value_new_text (subtype) : NULL; + table->command_c = output_get_command_name (); table->sizing[TABLE_HORZ].range[0] = 50; table->sizing[TABLE_HORZ].range[1] = 72; @@ -1985,15 +1983,16 @@ pivot_value_destroy (struct pivot_value *value) DEFAULT_STYLE for the parts of the style that VALUE doesn't override. */ void pivot_value_get_style (struct pivot_value *value, - const struct area_style *default_style, + const struct font_style *base_font_style, + const struct cell_style *base_cell_style, struct area_style *area) { font_style_copy (NULL, &area->font_style, (value->font_style ? value->font_style - : &default_style->font_style)); - area->cell_style = (value->cell_style - ? 
*value->cell_style - : default_style->cell_style); + : base_font_style)); + area->cell_style = *(value->cell_style + ? value->cell_style + : base_cell_style); } /* Copies AREA into VALUE's style. */ @@ -2228,6 +2227,12 @@ void pivot_value_add_footnote (struct pivot_value *v, const struct pivot_footnote *footnote) { + /* Some legacy tables include numerous duplicate footnotes. Suppress + them. */ + for (size_t i = 0; i < v->n_footnotes; i++) + if (v->footnotes[i] == footnote) + return; + v->footnotes = xrealloc (v->footnotes, (v->n_footnotes + 1) * sizeof *v->footnotes); v->footnotes[v->n_footnotes++] = footnote; diff --git a/src/output/pivot-table.h b/src/output/pivot-table.h index ae8a68e844..e7f0afad85 100644 --- a/src/output/pivot-table.h +++ b/src/output/pivot-table.h @@ -712,7 +712,8 @@ void pivot_value_destroy (struct pivot_value *); /* Styling. */ void pivot_value_get_style (struct pivot_value *, - const struct area_style *default_style, + const struct font_style *base_font_style, + const struct cell_style *base_cell_style, struct area_style *); void pivot_value_set_style (struct pivot_value *, const struct area_style *); diff --git a/src/output/spv-driver.c b/src/output/spv-driver.c new file mode 100644 index 0000000000..935b2fb9f7 --- /dev/null +++ b/src/output/spv-driver.c @@ -0,0 +1,127 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2019 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include <config.h> + +#include "output/driver-provider.h" + +#include <stdlib.h> + +#include "data/file-handle-def.h" +#include "libpspp/cast.h" +#include "output/group-item.h" +#include "output/page-setup-item.h" +#include "output/table-item.h" +#include "output/text-item.h" +#include "output/spv/spv-writer.h" + +#include "gl/xalloc.h" + +#include "gettext.h" +#define _(msgid) gettext (msgid) + +struct spv_driver + { + struct output_driver driver; + struct spv_writer *writer; + struct file_handle *handle; + }; + +static const struct output_driver_class spv_driver_class; + +static struct spv_driver * +spv_driver_cast (struct output_driver *driver) +{ + assert (driver->class == &spv_driver_class); + return UP_CAST (driver, struct spv_driver, driver); +} + +static struct output_driver * +spv_create (struct file_handle *fh, enum settings_output_devices device_type, + struct string_map *o UNUSED) +{ + struct output_driver *d; + struct spv_driver *spv; + + spv = xzalloc (sizeof *spv); + d = &spv->driver; + spv->handle = fh; + output_driver_init (&spv->driver, &spv_driver_class, fh_get_file_name (fh), + device_type); + + char *error = spv_writer_open (fh_get_file_name (fh), &spv->writer); + if (spv->writer == NULL) + { + msg (ME, "%s", error); + goto error; + } + + return d; + + error: + fh_unref (fh); + output_driver_destroy (d); + return NULL; +} + +static void +spv_destroy (struct output_driver *driver) +{ + struct spv_driver *spv = spv_driver_cast (driver); + + char *error = spv_writer_close (spv->writer); + if (error) + msg (ME, "%s", error); + fh_unref (spv->handle); + free (spv); +} + +static void +spv_submit (struct output_driver *driver, + const struct output_item *output_item) +{ + struct spv_driver *spv = spv_driver_cast (driver); + + if (is_group_open_item (output_item)) + spv_writer_open_heading (spv->writer, + 
to_group_open_item (output_item)->command_name, + 
to_group_open_item (output_item)->command_name); + else if (is_group_close_item (output_item)) + spv_writer_close_heading (spv->writer); + else if (is_table_item (output_item)) + { + const struct table_item *table_item = to_table_item (output_item); + if (table_item->pt) + spv_writer_put_table (spv->writer, table_item->pt); + } + else if (is_text_item (output_item)) + spv_writer_put_text (spv->writer, to_text_item (output_item), + output_get_command_name ()); + else if (is_page_setup_item (output_item)) + spv_writer_set_page_setup (spv->writer, + to_page_setup_item (output_item)->page_setup); +} + +struct output_driver_factory spv_driver_factory = + { "spv", "pspp.spv", spv_create }; + +static const struct output_driver_class spv_driver_class = + { + "spv", + spv_destroy, + spv_submit, + NULL, + }; diff --git a/src/output/spv/automake.mk b/src/output/spv/automake.mk new file mode 100644 index 0000000000..cda806b4a7 --- /dev/null +++ b/src/output/spv/automake.mk @@ -0,0 +1,104 @@ +# PSPP - a program for statistical analysis. +# Copyright (C) 2017 Free Software Foundation, Inc. +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. 
+# +## Process this file with automake to produce Makefile.in -*- makefile -*- + +src_output_liboutput_la_SOURCES += \ + src/output/spv/spv-css-parser.c \ + src/output/spv/spv-css-parser.h \ + src/output/spv/spv-dump.c \ + src/output/spv/spv-legacy-data.c \ + src/output/spv/spv-legacy-data.h \ + src/output/spv/spv-legacy-decoder.c \ + src/output/spv/spv-legacy-decoder.h \ + src/output/spv/spv-light-decoder.c \ + src/output/spv/spv-light-decoder.h \ + src/output/spv/spv-output.c \ + src/output/spv/spv-output.h \ + src/output/spv/spv-select.c \ + src/output/spv/spv-select.h \ + src/output/spv/spv-writer.c \ + src/output/spv/spv-writer.h \ + src/output/spv/spv.c \ + src/output/spv/spv.h \ + src/output/spv/spvbin-helpers.c \ + src/output/spv/spvbin-helpers.h \ + src/output/spv/spvxml-helpers.c \ + src/output/spv/spvxml-helpers.h + +light_binary_in = \ + src/output/spv/binary-parser-generator \ + src/output/spv/light-binary.grammar +light_binary_out = \ + src/output/spv/light-binary-parser.c \ + src/output/spv/light-binary-parser.h +src/output/spv/light-binary-parser.c: $(light_binary_in) + $(AM_V_GEN)python $^ code spvlb '"output/spv/light-binary-parser.h"' > $@.tmp + $(AM_V_at)mv $@.tmp $@ +src/output/spv/light-binary-parser.h: $(light_binary_in) + $(AM_V_GEN)python $^ header spvlb > $@.tmp && mv $@.tmp $@ +nodist_src_output_liboutput_la_SOURCES += $(light_binary_out) +BUILT_SOURCES += $(light_binary_out) +CLEANFILES += $(light_binary_out) +EXTRA_DIST += $(light_binary_in) + +old_binary_in = \ + src/output/spv/binary-parser-generator \ + src/output/spv/old-binary.grammar +old_binary_out = \ + src/output/spv/old-binary-parser.c \ + src/output/spv/old-binary-parser.h +src/output/spv/old-binary-parser.c: $(old_binary_in) + $(AM_V_GEN)python $^ code spvob '"output/spv/old-binary-parser.h"' > $@.tmp + $(AM_V_at)mv $@.tmp $@ +src/output/spv/old-binary-parser.h: $(old_binary_in) + $(AM_V_GEN)python $^ header spvob > $@.tmp && mv $@.tmp $@ 
+nodist_src_output_liboutput_la_SOURCES += $(old_binary_out) +BUILT_SOURCES += $(old_binary_out) +CLEANFILES += $(old_binary_out) +EXTRA_DIST += $(old_binary_in) + +detail_xml_in = \ + src/output/spv/xml-parser-generator \ + src/output/spv/detail-xml.grammar +detail_xml_out = \ + src/output/spv/detail-xml-parser.c \ + src/output/spv/detail-xml-parser.h +src/output/spv/detail-xml-parser.c: $(detail_xml_in) + $(AM_V_GEN)python $^ code spvdx '"output/spv/detail-xml-parser.h"' > $@.tmp + $(AM_V_at)mv $@.tmp $@ +src/output/spv/detail-xml-parser.h: $(detail_xml_in) + $(AM_V_GEN)python $^ header spvdx > $@.tmp && mv $@.tmp $@ +nodist_src_output_liboutput_la_SOURCES += $(detail_xml_out) +BUILT_SOURCES += $(detail_xml_out) +CLEANFILES += $(detail_xml_out) +EXTRA_DIST += $(detail_xml_in) + +structure_xml_in = \ + src/output/spv/xml-parser-generator \ + src/output/spv/structure-xml.grammar +structure_xml_out = \ + src/output/spv/structure-xml-parser.c \ + src/output/spv/structure-xml-parser.h +src/output/spv/structure-xml-parser.c: $(structure_xml_in) + $(AM_V_GEN)python $^ code spvsx '"output/spv/structure-xml-parser.h"' > $@.tmp + $(AM_V_at)mv $@.tmp $@ +src/output/spv/structure-xml-parser.h: $(structure_xml_in) + $(AM_V_GEN)python $^ header spvsx > $@.tmp && mv $@.tmp $@ +nodist_src_output_liboutput_la_SOURCES += $(structure_xml_out) +BUILT_SOURCES += $(structure_xml_out) +CLEANFILES += $(structure_xml_out) +EXTRA_DIST += $(structure_xml_in) diff --git a/src/output/spv/binary-parser-generator b/src/output/spv/binary-parser-generator new file mode 100644 index 0000000000..6824084b1a --- /dev/null +++ b/src/output/spv/binary-parser-generator @@ -0,0 +1,877 @@ +#! /usr/bin/python + +# PSPP - a program for statistical analysis. +# Copyright (C) 2017, 2018, 2019 Free Software Foundation, Inc. 
+# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +import getopt +import os +import struct +import sys + +n_errors = 0 + +def error(msg): + global n_errors + sys.stderr.write("%s:%d: %s\n" % (file_name, line_number, msg)) + n_errors += 1 + + +def fatal(msg): + error(msg) + sys.exit(1) + + +def get_line(): + global line + global line_number + line = input_file.readline() + line_number += 1 + + +def is_num(s): + return s.isdigit() or (s[0] == '-' and s[1].isdigit()) + + +xdigits = "0123456789abcdefABCDEF" +def is_xdigits(s): + for c in s: + if c not in xdigits: + return False + return True + + +def expect(type): + if token[0] != type: + fatal("syntax error expecting %s" % type) + + +def match(type): + if token[0] == type: + get_token() + return True + else: + return False + + +def must_match(type): + expect(type) + get_token() + + +def get_token(): + global token + global line + prev = token + if line == "": + if token == ('eof', ): + fatal("unexpected end of input") + get_line() + if not line: + token = ('eof', ) + return + elif line == '\n': + token = (';', ) + return + elif not line[0].isspace(): + token = (';', ) + return + + line = line.lstrip() + if line == "": + get_token() + elif line[0] == '#': + line = '' + get_token() + elif line[0] in '[]()?|*': + token = (line[0], ) + line = line[1:] + elif line.startswith('=>'): + token = (line[:2], ) + line = line[2:] + elif 
line.startswith('...'): + token = 
(line[:3], ) + line = line[3:] + elif line[0].isalnum() or line[0] == '-': + n = 1 + while n < len(line) and (line[n].isalnum() or line[n] == '-'): + n += 1 + s = line[:n] + line = line[n:] + + if prev[0] == '*' and is_num(s): + token = ('number', int(s, 10)) + elif len(s) == 2 and is_xdigits(s): + token = ('bytes', struct.pack('B', int(s, 16))) + elif s[0] == 'i' and is_num(s[1:]): + token = ('bytes', struct.pack('<i', int(s[1:]))) + elif s[:2] == 'ib' and is_num(s[2:]): + token = ('bytes', struct.pack('>i', int(s[2:]))) + elif s[0].isupper(): + token = ('nonterminal', s) + elif s in ('bool', 'int16', 'int32', 'int64', 'be16', 'be32', 'be64', + 'string', 'bestring', 'byte', 'float', 'double', + 'count', 'becount', 'v1', 'v3', 'vAF', 'vB0', + 'case', 'else'): + token = (s, ) + else: + token = ('id', s) + else: + fatal("unknown character %c" % line[0]) + + +def usage(): + argv0 = os.path.basename(sys.argv[0]) + print('''\ +%(argv0)s, parser generator for SPV binary members +usage: %(argv0)s GRAMMAR header + %(argv0)s GRAMMAR code HEADER_NAME + where GRAMMAR contains grammar definitions\ +''' % {"argv0": argv0}) + sys.exit(0) + + +class Item(object): + def __init__(self, type_, name, n, content): + self.type_ = type_ + self.name = name + self.n = n + self.content = content + def __repr__(self): + if self.type_ == 'constant': + return ' '.join(['%02x' % ord(x) for x in self.content]) + elif self.content: + return "%s(%s)" % (self.type_, self.content) + else: + return self.type_ + +def parse_item(): + t = token + name = None + if t[0] == 'bytes': + type_ = 'constant' + content = t[1] + get_token() + elif t[0] in ('bool', 'byte', + 'int16', 'int32', 'int64', + 'be16', 'be32', 'be64', + 'string', 'bestring', + 'float', 'double', + 'nonterminal', '...'): + type_ = 'variable' + content = t + get_token() + if t[0] == 'nonterminal': + name = name_to_id(content[1]) + elif t[0] in ('v1', 'v3', 'vAF', 'vB0', 'count', 'becount'): + type_ = t[0] + get_token() + must_match('(') + 
content = parse_choice() + must_match(')') + elif match('case'): + return parse_case() + 
elif match('('): + type_ = '()' + content = parse_choice() + must_match(')') + else: + print token + fatal('syntax error expecting item') + + n = 1 + optional = False + if match('*'): + if token[0] == 'number': + n = token[1] + get_token() + elif match('['): + expect('id') + n = token[1] + get_token() + must_match(']') + if n.startswith('n-'): + name = n[2:] + else: + fatal('expecting quantity') + elif match('?'): + optional = True + + if match('['): + expect('id') + if type_ == 'constant' and not optional: + fatal("%s: cannot name a constant" % token[1]) + + name = token[1] + get_token() + must_match(']') + + if type_ == 'constant': + content *= n + n = 1 + + item = Item(type_, name, n, content) + if optional: + item = Item('|', None, 1, [[item], []]) + return item + + +def parse_concatenation(): + items = [] + while token[0] not in (')', ';', '|', 'eof'): + item = parse_item() + if (item.type_ == 'constant' + and items + and items[-1].type_ == 'constant'): + items[-1].content += item.content + else: + items.append(item) + return items + + +def parse_choice(): + sub = parse_concatenation() + if token[0] != '|': + return sub + + choices = [sub] + while match('|'): + choices.append(parse_concatenation()) + + return [Item('|', None, 1, choices)] + + +def parse_case(): + must_match('(') + choices = {} + while True: + choice = None + if match('else'): + choice = 'else' + + items = parse_concatenation() + if choice is None: + if (not items + or items[0].type_ != 'constant' + or len(items[0].content) != 1): + fatal("choice must begin with xx (or 'else')") + choice = '%02x' % ord(items[0].content) + + if choice in choices: + fatal("duplicate choice %s" % choice) + choices[choice] = items + + if match(')'): + break + must_match('|') + + case_name = None + if match('['): + expect('id') + case_name = token[1] + get_token() + must_match(']') + + return Item('case', case_name, 1, + { '%s_%s' % (case_name, k) : v for k, v in choices.items() }) + + +def parse_production(): + 
expect('nonterminal') + name = token[1] + get_token() + must_match('=>') + return name, parse_choice() + + +def print_members(p, indent): + for item in p: + if item.type_ == 'variable' and item.name: + if item.content[0] == 'nonterminal': + typename = 'struct %s%s' % (prefix, + name_to_id(item.content[1])) + n_stars = 1 + else: + c_types = {'bool': ('bool', 0), + 'byte': ('uint8_t', 0), + 'int16': ('uint16_t', 0), + 'int32': ('uint32_t', 0), + 'int64': ('uint64_t', 0), + 'be16': ('uint16_t', 0), + 'be32': ('uint32_t', 0), + 'be64': ('uint64_t', 0), + 'string': ('char', 1), + 'bestring': ('char', 1), + 'float': ('double', 0), + 'double': ('double', 0), + '...': ('uint8_t', 1)} + typename, n_stars = c_types[item.content[0]] + + array_suffix = '' + if item.n: + if isinstance(item.n, int): + if item.n > 1: + array_suffix = '[%d]' % item.n + else: + n_stars += 1 + + print "%s%s %s%s%s;" % (indent, typename, '*' * n_stars, + name_to_id(item.name), + array_suffix) + elif item.type_ in ('v1', 'v3', 'vAF', 'vB0', + 'count', 'becount', '()'): + print_members(item.content, indent) + elif item.type_ == '|': + for choice in item.content: + print_members(choice, indent) + elif item.type_ == 'case': + print "%sint %s;" % (indent, item.name) + print "%sunion {" % indent + for name, choice in sorted(item.content.items()): + print "%s struct {" % indent + print_members(choice, indent + ' ' * 8) + print "%s } %s;" % (indent, name) + print "%s};" % indent + elif item.type_ == 'constant': + if item.name: + print "%sbool %s;" % (indent, item.name) + elif item.type_ not in ("constant", "variable"): + fatal("unhandled type %s" % item.type_) + + +def bytes_to_hex(s): + return ''.join(['"'] + ["\\x%02x" % ord(x) for x in s] + ['"']) + + +class Parser_Context(object): + def __init__(self): + self.suffixes = {} + self.bail = 'error' + self.need_error_handler = False + def gen_name(self, prefix): + n = self.suffixes.get(prefix, 0) + 1 + self.suffixes[prefix] = n + return '%s%d' % (prefix, n) 
if n > 1 else prefix + def save_pos(self, indent): + pos = self.gen_name('pos') + print "%sstruct spvbin_position %s = spvbin_position_save (input);" % (indent, pos) + return pos + def save_error(self, indent): + error = self.gen_name('save_n_errors') + print "%ssize_t %s = input->n_errors;" % (indent, error) + return error + def parse_limit(self, endian, indent): + limit = self.gen_name('saved_limit') + print """\ +%sstruct spvbin_limit %s; +%sif (!spvbin_limit_parse%s (&%s, input)) +%s goto %s;""" % ( + indent, limit, + indent, '_be' if endian == 'big' else '', limit, + indent, self.bail) + return limit + + +def print_parser_items(name, production, indent, accessor, ctx): + for item_idx in range(len(production)): + if item_idx > 0: + print + + item = production[item_idx] + if item.type_ == 'constant': + print """%sif (!spvbin_match_bytes (input, %s, %d)) +%s goto %s;""" % ( + indent, bytes_to_hex(item.content), len(item.content), + indent, ctx.bail) + ctx.need_error_handler = True + if item.name: + print "%sp->%s = true;" % (indent, item.name) + elif item.type_ == 'variable': + if item.content[0] == 'nonterminal': + func = '%sparse_%s' % (prefix, name_to_id(item.content[1])) + else: + func = 'spvbin_parse_%s' % item.content[0] + + if item.name: + dst = "&p->%s%s" % (accessor, name_to_id(item.name)) + else: + dst = "NULL" + if item.n == 1: + print """%sif (!%s (input, %s)) +%s goto %s;""" % (indent, func, dst, + indent, ctx.bail) + + if item.content[0] != 'nonterminal' and item.name == 'version': + print "%sinput->version = p->%s%s;" % ( + indent, accessor, name_to_id(item.name)) + else: + if isinstance(item.n, int): + count = item.n + else: + count = 'p->%s%s' % (accessor, name_to_id(item.n)) + + i_name = ctx.gen_name('i') + if item.name: + if not isinstance(item.n, int): + print "%sp->%s%s = xcalloc (%s, sizeof *p->%s%s);" % ( + indent, + accessor, name_to_id(item.name), count, + accessor, name_to_id(item.name)) + dst += '[%s]' % i_name + print "%sfor (int %s = 
0; %s < %s; %s++)" % ( + indent, i_name, i_name, count, i_name) + print """%s if (!%s (input, %s)) +%s goto %s;""" % (indent, func, dst, + indent, ctx.bail) + + ctx.need_error_handler = True + elif item.type_ == '()': + if item.n != 1: + # Not yet implemented + raise AssertionError + + print_parser_items(name, item.content, indent, accessor, ctx) + elif item.type_ in ('v1', 'v3', 'vAF', 'vB0'): + if item.n != 1: + # Not yet implemented + raise AssertionError + + print "%sif (input->version == 0x%s) {" % (indent, item.type_[1:]) + print_parser_items(name, item.content, indent + ' ', accessor, ctx) + print "%s}" % indent + elif item.type_ in ('count', 'becount'): + if item.n != 1: + # Not yet implemented + raise AssertionError + + pos = ctx.save_pos(indent) + endian = 'big' if item.type_ == 'becount' else 'little' + limit = ctx.parse_limit(endian, indent) + + save_bail = ctx.bail + ctx.bail = ctx.gen_name('backtrack') + + print "%sdo {" % indent + indent += ' ' + if (item.content + and item.content[-1].type_ == 'variable' + and item.content[-1].content[0] == '...'): + content = item.content[:-1] + ellipsis = True + else: + content = item.content + ellipsis = False + print_parser_items(name, content, indent, accessor, ctx) + + if ellipsis: + print "%sinput->ofs = input->size;" % indent + else: + print """%sif (!spvbin_input_at_end (input)) +%s goto %s;""" % (indent, + indent, ctx.bail) + print '%sspvbin_limit_pop (&%s, input);' % (indent, limit) + print '%sbreak;' % indent + print + print '%s%s:' % (indent[4:], ctx.bail) + # In theory, we should emit code to clear whatever we're + # backtracking from. In practice, it's not important to + # do that. 
+ print "%sspvbin_position_restore (&%s, input);" % (indent, pos) + print '%sspvbin_limit_pop (&%s, input);' % (indent, limit) + print '%sgoto %s;' % (indent, save_bail) + indent = indent[4:] + print "%s} while (0);" % indent + + ctx.bail = save_bail + elif item.type_ == '|': + save_bail = ctx.bail + + print "%sdo {" % indent + indent += ' ' + pos = ctx.save_pos(indent) + error = ctx.save_error(indent) + i = 0 + for choice in item.content: + if i: + print "%sspvbin_position_restore (&%s, input);" % (indent, pos) + print "%sinput->n_errors = %s;" % (indent, error) + i += 1 + + if i != len(item.content): + ctx.bail = ctx.gen_name('backtrack') + else: + ctx.bail = save_bail + print_parser_items(name, choice, indent, accessor, ctx) + print "%sbreak;" % indent + if i != len(item.content): + print + print '%s%s:' % (indent[4:], ctx.bail) + # In theory, we should emit code to clear whatever we're + # backtracking from. In practice, it's not important to + # do that. + indent = indent[4:] + print "%s} while (0);" % indent + elif item.type_ == 'case': + i = 0 + for choice_name, choice in sorted(item.content.items()): + if choice_name.endswith('else'): + print "%s} else {" % indent + print "%s p->%s%s = -1;" % (indent, accessor, item.name) + print + else: + print "%s%sif (spvbin_match_byte (input, 0x%s)) {" % ( + indent, '} else ' if i else '', choice_name[-2:]) + print "%s p->%s%s = 0x%s;" % ( + indent, accessor, item.name, choice_name[-2:]) + print + choice = choice[1:] + + print_parser_items(name, choice, indent + ' ', + accessor + choice_name + '.', ctx) + i += 1 + print "%s}" % indent + else: + # Not implemented + raise AssertionError + + +def print_parser(name, production, indent): + print ''' +bool +%(prefix)sparse_%(name)s (struct spvbin_input *input, struct %(prefix)s%(name)s **p_) +{ + *p_ = NULL; + struct %(prefix)s%(name)s *p = xzalloc (sizeof *p); + p->start = input->ofs; +''' % {'prefix': prefix, + 'name': name_to_id(name)} + + ctx = Parser_Context() + 
print_parser_items(name, production, indent, '', ctx) + + print ''' + p->len = input->ofs - p->start; + *p_ = p; + return true;''' + + if ctx.need_error_handler: + print """ +error: + spvbin_error (input, "%s", p->start); + %sfree_%s (p); + return false;""" % (name, prefix, name_to_id(name)) + + print "}" + +def print_free_items(name, production, indent, accessor, ctx): + for item in production: + if item.type_ == 'constant': + pass + elif item.type_ == 'variable': + if not item.name: + continue + + if item.content[0] == 'nonterminal': + free_func = '%sfree_%s' % (prefix, name_to_id(item.content[1])) + elif item.content[0] in ('string', 'bestring', '...'): + free_func = 'free' + else: + free_func = None + + dst = "p->%s%s" % (accessor, name_to_id(item.name)) + + if item.n == 1: + if free_func: + print "%s%s (%s);" % (indent, free_func, dst) + else: + if isinstance(item.n, int): + count = item.n + else: + count = 'p->%s%s' % (accessor, name_to_id(item.n)) + + i_name = ctx.gen_name('i') + if free_func: + print "%sfor (int %s = 0; %s < %s; %s++)" % ( + indent, i_name, i_name, count, i_name) + print "%s %s (%s[%s]);" % ( + indent, free_func, dst, i_name) + if not isinstance(item.n, int): + print "%sfree (p->%s%s);" % ( + indent, accessor, name_to_id(item.name)) + elif item.type_ in ('()', 'v1', 'v3', 'vAF', 'vB0', + 'count', 'becount'): + if item.n != 1: + # Not yet implemented + raise AssertionError + + print_free_items(name, item.content, indent, accessor, ctx) + elif item.type_ == '|': + for choice in item.content: + print_free_items(name, choice, indent, accessor, ctx) + elif item.type_ == 'case': + i = 0 + for choice_name, choice in sorted(item.content.items()): + if choice_name.endswith('else'): + value_name = '-1' + else: + value_name = '0x%s' % choice_name[-2:] + + print '%s%sif (p->%s%s == %s) {' % ( + indent, '} else ' if i else '', accessor, item.name, + value_name) + + print_free_items(name, choice, indent + ' ', + accessor + choice_name + '.', ctx) + i += 
1 + print "%s}" % indent + else: + # Not implemented + raise AssertionError + +def print_free(name, production, indent): + print ''' +void +%(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *p) +{ + if (p == NULL) + return; +''' % {'prefix': prefix, + 'name': name_to_id(name)} + + print_free_items(name, production, indent, '', Parser_Context()) + + print " free (p);" + print "}" + +def print_print_items(name, production, indent, accessor, ctx): + for item_idx in range(len(production)): + if item_idx > 0: + print + + item = production[item_idx] + if item.type_ == 'constant': + if item.name: + print '%sspvbin_print_presence ("%s", indent + 1, p->%s);' % ( + indent, item.name, item.name) + elif item.type_ == 'variable': + if not item.name: + continue + + if item.content[0] == 'nonterminal': + func = '%sprint_%s' % (prefix, name_to_id(item.content[1])) + else: + c_types = {'bool': 'bool', + 'byte': 'byte', + 'int16': 'int16', + 'int32': 'int32', + 'int64': 'int64', + 'be16': 'int16', + 'be32': 'int32', + 'be64': 'int64', + 'string': 'string', + 'bestring': 'string', + 'float': 'double', + 'double': 'double', + '...': ('uint8_t', 1)} + func = 'spvbin_print_%s' % c_types[item.content[0]] + + dst = "p->%s%s" % (accessor, name_to_id(item.name)) + if item.n == 1: + print '%s%s ("%s", indent + 1, %s);' % (indent, func, + item.name, dst) + else: + if isinstance(item.n, int): + count = item.n + else: + count = 'p->%s%s' % (accessor, name_to_id(item.n)) + + i_name = ctx.gen_name('i') + elem_name = ctx.gen_name('elem_name') + dst += '[%s]' % i_name + print """\ +%(indent)sfor (int %(index)s = 0; %(index)s < %(count)s; %(index)s++) { +%(indent)s char *%(elem_name)s = xasprintf ("%(item.name)s[%%d]", %(index)s); +%(indent)s %(func)s (%(elem_name)s, indent + 1, %(dst)s); +%(indent)s free (%(elem_name)s); +%(indent)s}""" % {'indent': indent, + 'index': i_name, + 'count': count, + 'elem_name' : elem_name, + 'item.name': item.name, + 'func': func, + 'dst': dst} + elif item.type_ == 
'()': + if item.n != 1: + # Not yet implemented + raise AssertionError + + print_print_items(name, item.content, indent, accessor, ctx) + elif item.type_ in ('v1', 'v3', 'vAF', 'vB0'): + if item.n != 1: + # Not yet implemented + raise AssertionError + + print_print_items(name, item.content, indent, accessor, ctx) + elif item.type_ in ('count', 'becount'): + if item.n != 1: + # Not yet implemented + raise AssertionError + + indent += ' ' + if (item.content + and item.content[-1].type_ == 'variable' + and item.content[-1].content[0] == '...'): + content = item.content[:-1] + else: + content = item.content + print_print_items(name, content, indent, accessor, ctx) + elif item.type_ == '|': + for choice in item.content: + print_print_items(name, choice, indent, accessor, ctx) + elif item.type_ == 'case': + i = 0 + print """\ +%sspvbin_print_case ("%s", indent + 1, p->%s%s);""" % ( + indent, item.name, accessor, name_to_id(item.name)) + for choice_name, choice in sorted(item.content.items()): + if choice_name.endswith('else'): + value_name = '-1' + else: + value_name = '0x%s' % choice_name[-2:] + + print '%s%sif (p->%s%s == %s) {' % ( + indent, '} else ' if i else '', accessor, item.name, + value_name) + + print_print_items(name, choice, indent + ' ', + accessor + choice_name + '.', ctx) + i += 1 + print "%s}" % indent + else: + # Not implemented + raise AssertionError + + +def print_print(name, production, indent): + print ''' +void +%(prefix)sprint_%(name)s (const char *title, int indent, const struct %(prefix)s%(name)s *p) +{ + spvbin_print_header (title, p ? p->start : -1, p ? 
p->len : -1, indent); + if (p == NULL) { + printf ("none\\n"); + return; + } + putchar ('\\n'); +''' % {'prefix': prefix, + 'rawname': name, + 'name': name_to_id(name)} + + ctx = Parser_Context() + print_print_items(name, production, indent, '', ctx) + + print "}" + +def name_to_id(s): + return s[0].lower() + ''.join(['_%c' % x.lower() if x.isupper() else x + for x in s[1:]]).replace('-', '_') + + +if __name__ == "__main__": + argv0 = sys.argv[0] + try: + options, args = getopt.gnu_getopt(sys.argv[1:], 'h', ['help']) + except getopt.GetoptError as e: + sys.stderr.write("%s: %s\n" % (argv0, e.msg)) + sys.exit(1) + + for key, value in options: + if key in ['-h', '--help']: + usage() + else: + sys.exit(0) + + if len(args) < 3: + sys.stderr.write("%s: bad usage (use --help for help)\n" % argv0) + sys.exit(1) + + global file_name + file_name, output_type, prefix = args[:3] + input_file = open(file_name) + + prefix = '%s_' % prefix + + global line + global line_number + line = "" + line_number = 0 + + productions = {} + + global token + token = ('start', ) + get_token() + while True: + while match(';'): + pass + if token[0] == 'eof': + break + + name, production = parse_production() + if name in productions: + fatal("%s: duplicate production" % name) + productions[name] = production + + print '/* Generated automatically -- do not modify! 
-*- buffer-read-only: t -*- */' + if output_type == 'code' and len(args) == 4: + header_name = args[3] + + print """\ +#include +#include %s +#include +#include +#include "libpspp/str.h" +#include "gl/xalloc.h"\ +""" % header_name + for name, production in productions.items(): + print_parser(name, production, ' ' * 4) + print_free(name, production, ' ' * 4) + print_print(name, production, ' ' * 4) + elif output_type == 'header' and len(args) == 3: + print """\ +#ifndef %(PREFIX)sPARSER_H +#define %(PREFIX)sPARSER_H + +#include +#include +#include +#include "output/spv/spvbin-helpers.h"\ +""" % {'PREFIX': prefix.upper()} + for name, production in productions.items(): + print '\nstruct %s%s {' % (prefix, name_to_id(name)) + print " size_t start, len;" + print_members(production, ' ' * 4) + print '''}; +bool %(prefix)sparse_%(name)s (struct spvbin_input *, struct %(prefix)s%(name)s **); +void %(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *); +void %(prefix)sprint_%(name)s (const char *title, int indent, const struct %(prefix)s%(name)s *);\ +''' % {'prefix': prefix, + 'name': name_to_id(name)} + print """\ + +#endif /* %(PREFIX)sPARSER_H */""" % {'PREFIX': prefix.upper()} + else: + sys.stderr.write("%s: bad usage (use --help for help)" % argv0) diff --git a/src/output/spv/detail-xml.grammar b/src/output/spv/detail-xml.grammar new file mode 100644 index 0000000000..37bbabcf25 --- /dev/null +++ b/src/output/spv/detail-xml.grammar @@ -0,0 +1,362 @@ +# PSPP - a program for statistical analysis. +# Copyright (C) 2017, 2018, 2019 Free Software Foundation, Inc. +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. 
+# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +visualization + :creator + :date + :lang + :name + :style[style_ref]=ref style + :type + :version + :schemaLocation? +=> visualization_extension? + userSource + (sourceVariable | derivedVariable)+ + categoricalDomain? + graph + labelFrame[lf1]* + container? + labelFrame[lf2]* + style+ + layerController? + +extension[visualization_extension] + :numRows=int? + :showGridline=bool? + :minWidthSet=(true)? + :maxWidthSet=(true)? +=> EMPTY + +userSource :missing=(listwise | pairwise)? => EMPTY # Related to omit_empty? + +categoricalDomain => variableReference simpleSort + +simpleSort :method[sort_method]=(custom) => categoryOrder + +sourceVariable + :id + :categorical=(true) + :source + :domain=ref categoricalDomain? + :sourceName + :dependsOn=ref sourceVariable? + :label? + :labelVariable=ref sourceVariable? +=> variable_extension* (format | stringFormat)? + +derivedVariable + :id + :categorical=(true) + :value + :dependsOn=ref sourceVariable? +=> variable_extension* (format | stringFormat)? valueMapEntry* + +extension[variable_extension] :from :helpId => EMPTY + +valueMapEntry :from :to => EMPTY + +categoryOrder => TEXT + +graph + :cellStyle=ref style + :style=ref style +=> location+ coordinates faceting facetLayout interval + +location + :part=(height | width | top | bottom | left | right) + :method=(sizeToContent | attach | fixed | same) + :min=dimension? + :max=dimension? + :target=ref (labelFrame | graph | container)? + :value? 
+=> EMPTY + +coordinates => EMPTY + +faceting => layer[layers1]* cross layer[layers2]* + +cross => (unity | nest) (unity | nest) + +nest => variableReference[vars]+ + +unity => EMPTY + +variableReference :ref=ref (sourceVariable | derivedVariable) => EMPTY + +layer + :variable=ref (sourceVariable | derivedVariable) + :value + :visible=bool? + :method[layer_method]=(nest)? + :titleVisible=bool? +=> EMPTY + +facetLayout => tableLayout setCellProperties[scp1]* + facetLevel+ setCellProperties[scp2]* + +tableLayout + :verticalTitlesInCorner=bool + :style=ref style? + :fitCells=(ticks both)? +=> EMPTY + +facetLevel :level=int :gap=dimension? => axis + +axis :style=ref style => label? majorTicks + +label + :style=ref style + :textFrameStyle=ref style? + :purpose=(title | subTitle | subSubTitle | layer | footnote)? +=> text+ | descriptionGroup + +descriptionGroup + :target=ref faceting + :separator? +=> (description | text)+ + +description :name=(variable | value) => EMPTY + +majorTicks + :labelAngle=int + :length=dimension + :style=ref style + :tickFrameStyle=ref style + :labelFrequency=int? + :stagger=bool? +=> gridline? + +gridline + :style=ref style + :zOrder=int +=> EMPTY + +setCellProperties + :applyToConverse=bool? +=> (setStyle | setFrameStyle | setFormat | setMetaData)* union[union_]? + +setStyle + :target=ref (labeling | graph | interval | majorTicks) + :style=ref style +=> EMPTY + +setMetaData + :target=ref graph + :key + :value +=> EMPTY + +setFormat + :target=ref (majorTicks | labeling) + :reset=bool? +=> format | numberFormat | stringFormat+ | dateTimeFormat | elapsedTimeFormat + +setFrameStyle + :style=ref style + :target=ref majorTicks +=> EMPTY + +format + :baseFormat[f_base_format]=(date | time | dateTime | elapsedTime)? + :errorCharacter? + :separatorChars? + :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)? + :showYear=bool? + :showQuarter=bool? + :quarterPrefix? + :quarterSuffix? + :yearAbbreviation=bool? + :showMonth=bool? 
+ :monthFormat=(long | short | number | paddedNumber)? + :dayPadding=bool? + :dayOfMonthPadding=bool? + :showWeek=bool? + :weekPadding=bool? + :weekSuffix? + :showDayOfWeek=bool? + :dayOfWeekAbbreviation=bool? + :hourPadding=bool? + :minutePadding=bool? + :secondPadding=bool? + :showDay=bool? + :showHour=bool? + :showMinute=bool? + :showSecond=bool? + :showMillis=bool? + :dayType=(month | year)? + :hourFormat=(AMPM | AS_24 | AS_12)? + :minimumIntegerDigits=int? + :maximumFractionDigits=int? + :minimumFractionDigits=int? + :useGrouping=bool? + :scientific=(onlyForSmall | whenNeeded | true | false)? + :small=real? + :prefix? + :suffix? + :tryStringsAsNumbers=bool? + :negativesOutside=bool? +=> relabel* affix* + +numberFormat + :minimumIntegerDigits=int? + :maximumFractionDigits=int? + :minimumFractionDigits=int? + :useGrouping=bool? + :scientific=(onlyForSmall | whenNeeded | true | false)? + :small=real? + :prefix? + :suffix? +=> affix* + +stringFormat => relabel* affix* + +dateTimeFormat + :baseFormat[dt_base_format]=(date | time | dateTime) + :separatorChars? + :mdyOrder=(dayMonthYear | monthDayYear | yearMonthDay)? + :showYear=bool? + :yearAbbreviation=bool? + :showQuarter=bool? + :quarterPrefix? + :quarterSuffix? + :showMonth=bool? + :monthFormat=(long | short | number | paddedNumber)? + :showWeek=bool? + :weekPadding=bool? + :weekSuffix? + :showDayOfWeek=bool? + :dayOfWeekAbbreviation=bool? + :dayPadding=bool? + :dayOfMonthPadding=bool? + :hourPadding=bool? + :minutePadding=bool? + :secondPadding=bool? + :showDay=bool? + :showHour=bool? + :showMinute=bool? + :showSecond=bool? + :showMillis=bool? + :dayType=(month | year)? + :hourFormat=(AMPM | AS_24 | AS_12)? +=> affix* + +elapsedTimeFormat + :baseFormat[dt_base_format]=(date | time | dateTime) + :dayPadding=bool? + :hourPadding=bool? + :minutePadding=bool? + :secondPadding=bool? + :showYear=bool? + :showDay=bool? + :showHour=bool? + :showMinute=bool? + :showSecond=bool? + :showMillis=bool? 
+=> affix* + +affix + :definesReference=int + :position=(subscript | superscript) + :suffix=bool + :value +=> EMPTY + +relabel :from=real :to => EMPTY + +union => intersect+ + +intersect => where+ | intersectWhere | alternating | EMPTY + +where + :variable=ref (sourceVariable | derivedVariable) + :include +=> EMPTY + +intersectWhere + :variable=ref (sourceVariable | derivedVariable) + :variable2=ref (sourceVariable | derivedVariable) +=> EMPTY + +alternating => EMPTY + +text + :usesReference=int? + :definesReference=int? + :position=(subscript | superscript)? + :style=ref style +=> TEXT + +interval :style=ref style => labeling footnotes? + +labeling + :style=ref style? + :variable=ref (sourceVariable | derivedVariable) +=> (formatting | format | footnotes)* + +formatting :variable=ref (sourceVariable | derivedVariable) => formatMapping* + +formatMapping :from=int => format? + +footnotes + :superscript=bool? + :variable=ref (sourceVariable | derivedVariable) +=> footnoteMapping* + +footnoteMapping :definesReference=int :from=int :to => EMPTY + +style + :color=color? + :color2=color? + :labelAngle=real? + :border-bottom=(solid | thick | thin | double | none)? + :border-top=(solid | thick | thin | double | none)? + :border-left=(solid | thick | thin | double | none)? + :border-right=(solid | thick | thin | double | none)? + :border-bottom-color? + :border-top-color? + :border-left-color? + :border-right-color? + :font-family? + :font-size? + :font-weight=(regular | bold)? + :font-style=(regular | italic)? + :font-underline=(none | underline)? + :margin-bottom=dimension? + :margin-left=dimension? + :margin-right=dimension? + :margin-top=dimension? + :textAlignment=(left | right | center | decimal | mixed)? + :labelLocationHorizontal=(positive | negative | center)? + :labelLocationVertical=(positive | negative | center)? + :decimal-offset=dimension? + :size? + :width? + :visible=bool? +=> EMPTY + +layerController + :source=(tableData) + :target=ref label? 
+=> EMPTY + +container :style=ref style => container_extension? location+ labelFrame* + +extension[container_extension] :combinedFootnotes=(true) => EMPTY + +labelFrame :style=ref style => location+ label? paragraph? + +paragraph :hangingIndent=dimension? => EMPTY diff --git a/src/output/spv/light-binary.grammar b/src/output/spv/light-binary.grammar new file mode 100644 index 0000000000..cf6e6c5fc3 --- /dev/null +++ b/src/output/spv/light-binary.grammar @@ -0,0 +1,217 @@ +# PSPP - a program for statistical analysis. +# Copyright (C) 2017, 2018, 2019 Free Software Foundation, Inc. +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +Table => + Header Titles Footnotes + Areas Borders PrintSettings[ps] TableSettings[ts] Formats + Dimensions Axes Cells + 01? + +Header => + 01 00 + int32[version] + bool[x0] + bool[x1] + bool[rotate-inner-column-labels] + bool[rotate-outer-row-labels] + bool[x2] + int32[x3] + int32[min-col-width] int32[max-col-width] + int32[min-row-height] int32[max-row-height] + int64[table-id] + +Titles => + Value[title] 01? + Value[subtype] 01? 31 + Value[user-title] 01? + (31 Value[corner-text] | 58) + (31 Value[caption] | 58) + +Footnotes => int32[n-footnotes] Footnote*[n-footnotes] +Footnote => Value[text] (58 | 31 Value[marker]) int32[show] + +Areas => 00? 
Area*8[areas] +Area => + byte[index] 31 + string[typeface] float[size] int32[style] bool[underline] + int32[halign] int32[valign] + string[fg-color] string[bg-color] + bool[alternate] string[alt-fg-color] string[alt-bg-color] + v3(int32[left-margin] int32[right-margin] int32[top-margin] int32[bottom-margin]) + +Borders => + count( + ib1 + be32[n-borders] Border*[n-borders] + bool[show-grid-lines] + 00 00 00) + +Border => + be32[border-type] + be32[stroke-type] + be32[color] + +PrintSettings => + count( + ib1 + bool[all-layers] + bool[paginate-layers] + bool[fit-width] + bool[fit-length] + bool[top-continuation] + bool[bottom-continuation] + be32[n-orphan-lines] + bestring[continuation-string]) + +TableSettings => + count( + v3( + ib1 + be32[x5] + be32[current-layer] + bool[omit-empty] + bool[show-row-labels-in-corner] + bool[show-alphabetic-markers] + bool[footnote-marker-superscripts] + byte[x6] + becount( + Breakpoints[row-breaks] Breakpoints[col-breaks] + Keeps[row-keeps] Keeps[col-keeps] + PointKeeps[row-point-keeps] PointKeeps[col-point-keeps] + ) + bestring[notes] + bestring[table-look] + )...) + +Breakpoints => be32[n-breaks] be32*[n-breaks] + +Keeps => be32[n-keeps] Keep*[n-keeps] +Keep => be32[offset] be32[n] + +PointKeeps => be32[n-point-keeps] PointKeep*[n-point-keeps] +PointKeep => be32[offset] be32 be32 + +Formats => + int32[n-widths] int32*[n-widths] + string[locale] + int32[current-layer] + bool[x7] bool[x8] bool[x9] + Y0 + CustomCurrency + count( + v1(X0?) 
+ v3(count(X1 count(X2)) count(X3))) +Y0 => int32[epoch] byte[decimal] byte[grouping] +CustomCurrency => int32[n-ccs] string*[n-ccs] + +X0 => byte*14 Y1 Y2 +Y1 => + string[command] string[command-local] + string[language] string[charset] string[locale] + bool[x10] bool[x11] bool[x12] bool[x13] + Y0 +Y2 => CustomCurrency byte[missing] bool[x17] + +X1 => + bool[x14] byte[x15] bool[x16] + byte[lang] + byte[show-variables] + byte[show-values] + int32[x18] int32[x19] + 00*17 + bool[x20] + bool[show-caption] + +X2 => + int32[n-row-heights] int32*[n-row-heights] + int32[n-style-map] StyleMap*[n-style-map] + int32[n-styles] StylePair*[n-styles] + count((i0 i0)?) +StyleMap => int64[cell-index] int16[style-index] + +X3 => + 01 00 byte[x21] 00 00 00 + Y1 + double[small] 01 + (string[dataset] string[datafile] i0 int32[date] i0)? + Y2 + (int32[x22] i0)? + +Dimensions => int32[n-dims] Dimension*[n-dims] +Dimension => + Value[name] DimProperties[props] + int32[n-categories] Category*[n-categories] +DimProperties => + byte[x1] + byte[x2] + int32[x3] + bool[hide-dim-label] + bool[hide-all-labels] + 01 int32[dim-index] + +Category => Value[name] (Leaf | Group) +Leaf => 00 00 00 i2 int32[leaf-index] i0 +Group => + bool[merge] 00 01 int32[x23] + i-1 int32[n-subcategories] Category*[n-subcategories] + +Axes => + int32[n-layers] int32[n-rows] int32[n-columns] + int32*[n-layers] int32*[n-rows] int32*[n-columns] + +Cells => int32[n-cells] Cell*[n-cells] +Cell => int64[index] v1(00?) Value + +Value => + 00? 00? 00? 00? 
+ case( + 01 ValueMod int32[format] double[x] + | 02 ValueMod int32[format] double[x] + string[var-name] string[value-label] byte[show] + | 03 string[local] ValueMod string[id] string[c] bool[fixed] + | 04 ValueMod int32[format] string[value-label] string[var-name] + byte[show] string[s] + | 05 ValueMod string[var-name] string[var-label] byte[show] + | 06 string[local] ValueMod string[id] string[c] + | else ValueMod string[template] int32[n-args] Argument*[n-args] + )[type] +Argument => + i0 Value[value] + | int32[n-values] i0 Value*[n-values] + +ValueMod => + 58 + | 31 + int32[n-refs] int16*[n-refs] + int32[n-subscripts] string*[n-subscripts] + v1(00 (i1 | i2) 00? 00? int32 00? 00?) + v3(count(TemplateString StylePair)) + +TemplateString => count((count((i0 (58 | 31 55))?) (58 | 31 string[id]))?) + +StylePair => + (31 FontStyle | 58) + (31 CellStyle | 58) + +FontStyle => + bool[bold] bool[italic] bool[underline] bool[show] + string[fg-color] string[bg-color] + string[typeface] byte[size] + +CellStyle => + int32[halign] int32[valign] double[decimal-offset] + int16[left-margin] int16[right-margin] + int16[top-margin] int16[bottom-margin] diff --git a/src/output/spv/old-binary.grammar b/src/output/spv/old-binary.grammar new file mode 100644 index 0000000000..12f4bbc284 --- /dev/null +++ b/src/output/spv/old-binary.grammar @@ -0,0 +1,39 @@ +# PSPP - a program for statistical analysis. +# Copyright (C) 2017, 2018, 2019 Free Software Foundation, Inc. +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. 
+# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . + +LegacyBinary => + 00 byte[version] int16[n-sources] int32[member-size] + Metadata*[n-sources][metadata] + #Data*[n-sources][data] + #Strings? + +Metadata => + int32[n-values] int32[n-variables] int32[data-offset] + byte*28[source-name] + vB0(byte*36[ext-source-name] int32[x]) + +#Data => Variable*[n-variables] +#Variable => byte*288[variable-name] double*[n-values] + +Strings => SourceMaps[maps] Labels + +SourceMaps => int32[n-maps] SourceMap*[n-maps] +SourceMap => string[source-name] int32[n-variables] VariableMap*[n-variables] +VariableMap => string[variable-name] int32[n-data] DatumMap*[n-data] +DatumMap => int32[value-idx] int32[label-idx] + +Labels => int32[n-labels] Label*[n-labels] +Label => int32[frequency] string[label] diff --git a/src/output/spv/spv-css-parser.c b/src/output/spv/spv-css-parser.c new file mode 100644 index 0000000000..c3a7118ccc --- /dev/null +++ b/src/output/spv/spv-css-parser.c @@ -0,0 +1,175 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
*/ + +#include <config.h> + +#include "spv-css-parser.h" + +#include +#include + +#include "libpspp/str.h" +#include "output/pivot-table.h" +#include "spv.h" + +#include "gl/c-ctype.h" +#include "gl/xalloc.h" +#include "gl/xmemdup0.h" + +enum css_token_type + { + T_EOF, + T_ID, + T_LCURLY, + T_RCURLY, + T_COLON, + T_SEMICOLON, + T_ERROR + }; + +struct css_token + { + enum css_token_type type; + char *s; + }; + +static char * +css_skip_spaces (char *p) +{ + for (;;) + { + if (c_isspace (*p)) + p++; + else if (!strncmp (p, "<!--", 4)) + p += 4; + else if (!strncmp (p, "-->", 3)) + p += 3; + else + return p; + } +} + +static bool +css_is_separator (unsigned char c) +{ + return c_isspace (c) || strchr ("{}:;", c); +} + +static void +css_token_get (char **p_, struct css_token *token) +{ + char *p = *p_; + + free (token->s); + token->s = NULL; + + p = css_skip_spaces (p); + if (*p == '\0') + token->type = T_EOF; + else if (*p == '{') + { + token->type = T_LCURLY; + p++; + } + else if (*p == '}') + { + token->type = T_RCURLY; + p++; + } + else if (*p == ':') + { + token->type = T_COLON; + p++; + } + else if (*p == ';') + { + token->type = T_SEMICOLON; + p++; + } + else + { + token->type = T_ID; + char *start = p; + while (!css_is_separator (*p)) + p++; + token->s = xmemdup0 (start, p - start); + } + *p_ = p; +} + +static void +css_decode_key_value (const char *key, const char *value, + struct font_style *font) +{ + if (!strcmp (key, "font-weight")) + font->bold = !strcmp (value, "bold"); + else if (!strcmp (key, "font-style")) + font->italic = !strcmp (value, "italic"); + else if (!strcmp (key, "font-decoration")) + font->underline = !strcmp (value, "underline"); + else if (!strcmp (key, "font-family")) + { + free (font->typeface); + font->typeface = xstrdup (value); + } + else if (!strcmp (key, "font-size")) + font->size = atoi (value); + + /* fg_color, bg_color */ + +} + +char * +spv_parse_css_style (char *style, struct font_style *font) +{ + *font = (struct font_style) 
FONT_STYLE_INITIALIZER; + + char *p = style; + struct
css_token token = { .s = NULL }; + css_token_get (&p, &token); + while (token.type != T_EOF) + { + if (token.type != T_ID || !strcmp (token.s, "p")) + { + css_token_get (&p, &token); + continue; + } + + char *key = token.s; + token.s = NULL; + css_token_get (&p, &token); + + if (token.type == T_COLON) + { + struct string value = DS_EMPTY_INITIALIZER; + for (;;) + { + css_token_get (&p, &token); + if (token.type != T_ID) + break; + if (!ds_is_empty (&value)) + ds_put_byte (&value, ' '); + ds_put_cstr (&value, token.s); + } + + css_decode_key_value (key, ds_cstr (&value), font); + + ds_destroy (&value); + } + free (key); + } + return NULL; +} diff --git a/src/output/spv/spv-css-parser.h b/src/output/spv/spv-css-parser.h new file mode 100644 index 0000000000..44f7142f14 --- /dev/null +++ b/src/output/spv/spv-css-parser.h @@ -0,0 +1,24 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifndef OUTPUT_SPV_CSS_PARSER_H +#define OUTPUT_SPV_CSS_PARSER_H 1 + +struct font_style; + +char *spv_parse_css_style (char *style, struct font_style *font); + +#endif /* output/spv/spv-css-parser.h */ diff --git a/src/output/spv/spv-dump.c b/src/output/spv/spv-dump.c new file mode 100644 index 0000000000..13ea6de57c --- /dev/null +++ b/src/output/spv/spv-dump.c @@ -0,0 +1,87 @@ +/* PSPP - a program for statistical analysis. 
+ Copyright (C) 2017, 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#include + +#include "output/spv/spv.h" + +#include +#include + +#include "data/settings.h" +#include "output/pivot-table.h" + +#include "gl/xalloc.h" + +static void +indent (int indentation) +{ + for (int i = 0; i < indentation * 2; i++) + putchar (' '); +} + +void +spv_item_dump (const struct spv_item *item, int indentation) +{ + indent (indentation); + if (item->label) + printf ("\"%s\" ", item->label); + if (!item->visible) + printf ("(hidden) "); + + switch (item->type) + { + case SPV_ITEM_HEADING: + printf ("heading\n"); + for (size_t i = 0; i < item->n_children; i++) + spv_item_dump (item->children[i], indentation + 1); + break; + + case SPV_ITEM_TEXT: + printf ("text \"%s\"\n", + pivot_value_to_string (item->text, SETTINGS_VALUE_SHOW_DEFAULT, + SETTINGS_VALUE_SHOW_DEFAULT)); + break; + + case SPV_ITEM_TABLE: + if (item->table) + pivot_table_dump (item->table, indentation + 1); + else + { + printf ("unloaded table in %s", item->bin_member); + if (item->xml_member) + printf (" and %s", item->xml_member); + putchar ('\n'); + } + break; + + case SPV_ITEM_GRAPH: + printf ("graph\n"); + break; + + case SPV_ITEM_MODEL: + printf ("model\n"); + break; + + case SPV_ITEM_OBJECT: + printf ("object type=\"%s\" uri=\"%s\"\n", item->object_type, item->uri); + break; + + case SPV_ITEM_TREE: + printf ("tree\n"); + 
break; + } +} diff --git a/src/output/spv/spv-legacy-data.c b/src/output/spv/spv-legacy-data.c new file mode 100644 index 0000000000..da9293ce48 --- /dev/null +++ b/src/output/spv/spv-legacy-data.c @@ -0,0 +1,400 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#include + +#include "output/spv/spv-legacy-data.h" + +#include +#include +#include +#include + +#include "libpspp/cast.h" +#include "libpspp/float-format.h" +#include "data/val-type.h" +#include "output/spv/old-binary-parser.h" + +#include "gl/minmax.h" +#include "gl/xalloc.h" +#include "gl/xmemdup0.h" +#include "gl/xsize.h" +#include "gl/xvasprintf.h" + +void +spv_data_uninit (struct spv_data *data) +{ + if (!data) + return; + + for (size_t i = 0; i < data->n_sources; i++) + spv_data_source_uninit (&data->sources[i]); + free (data->sources); +} + +void +spv_data_dump (const struct spv_data *data, FILE *stream) +{ + for (size_t i = 0; i < data->n_sources; i++) + { + if (i > 0) + putc ('\n', stream); + spv_data_source_dump (&data->sources[i], stream); + } +} + +struct spv_data_source * +spv_data_find_source (const struct spv_data *data, const char *source_name) +{ + for (size_t i = 0; i < data->n_sources; i++) + if (!strcmp (data->sources[i].source_name, source_name)) + return &data->sources[i]; + + return NULL; +} + +struct spv_data_variable * 
+spv_data_find_variable (const struct spv_data *data, + const char *source_name, + const char *variable_name) +{ + struct spv_data_source *source = spv_data_find_source (data, source_name); + return source ? spv_data_source_find_variable (source, variable_name) : NULL; +} + +void +spv_data_source_uninit (struct spv_data_source *source) +{ + if (!source) + return; + + for (size_t i = 0; i < source->n_vars; i++) + spv_data_variable_uninit (&source->vars[i]); + free (source->vars); + free (source->source_name); +} + +void +spv_data_source_dump (const struct spv_data_source *source, FILE *stream) +{ + fprintf (stream, "source \"%s\" (%zu values):\n", + source->source_name, source->n_values); + for (size_t i = 0; i < source->n_vars; i++) + spv_data_variable_dump (&source->vars[i], stream); +} + +struct spv_data_variable * +spv_data_source_find_variable (const struct spv_data_source *source, + const char *variable_name) +{ + for (size_t i = 0; i < source->n_vars; i++) + if (!strcmp (source->vars[i].var_name, variable_name)) + return &source->vars[i]; + return NULL; +} + +void +spv_data_variable_uninit (struct spv_data_variable *var) +{ + if (!var) + return; + + free (var->var_name); + for (size_t i = 0; i < var->n_values; i++) + spv_data_value_uninit (&var->values[i]); + free (var->values); +} + +void +spv_data_variable_dump (const struct spv_data_variable *var, FILE *stream) +{ + fprintf (stream, "variable \"%s\":", var->var_name); + for (size_t i = 0; i < var->n_values; i++) + { + if (i) + putc (',', stream); + putc (' ', stream); + spv_data_value_dump (&var->values[i], stream); + } + putc ('\n', stream); +} + +void +spv_data_value_uninit (struct spv_data_value *value) +{ + if (value && value->width >= 0) + free (value->s); +} + +bool +spv_data_value_equal (const struct spv_data_value *a, + const struct spv_data_value *b) +{ + return (a->width == b->width + && a->index == b->index + && (a->width < 0 + ? 
a->d == b->d + : !strcmp (a->s, b->s))); +} + +struct spv_data_value * +spv_data_values_clone (const struct spv_data_value *src, size_t n) +{ + struct spv_data_value *dst = xmemdup (src, n * sizeof *src); + for (size_t i = 0; i < n; i++) + if (dst[i].width >= 0) + dst[i].s = xstrdup (dst[i].s); + return dst; +} + +void +spv_data_value_dump (const struct spv_data_value *value, FILE *stream) +{ + if (value->index != SYSMIS) + fprintf (stream, "%.*ge-", DBL_DIG + 1, value->index); + if (value->width >= 0) + fprintf (stream, "\"%s\"", value->s); + else if (value->d == SYSMIS) + putc ('.', stream); + else + fprintf (stream, "%.*g", DBL_DIG + 1, value->d); +} + +static char * +decode_fixed_string (const uint8_t *buf_, size_t size) +{ + const char *buf = CHAR_CAST (char *, buf_); + return xmemdup0 (buf, strnlen (buf, size)); +} + +static char * +decode_var_name (const struct spvob_metadata *md) +{ + int n0 = strnlen ((char *) md->source_name, sizeof md->source_name); + int n1 = (n0 < sizeof md->source_name ? 
0 + : strnlen ((char *) md->ext_source_name, + sizeof md->ext_source_name)); + return xasprintf ("%.*s%.*s", + n0, (char *) md->source_name, + n1, (char *) md->ext_source_name); +} + +static char * WARN_UNUSED_RESULT +decode_data (const uint8_t *in, size_t size, size_t data_offset, + struct spv_data_source *source, size_t *end_offsetp) +{ + size_t var_size = xsum (288, xtimes (source->n_values, 8)); + size_t source_size = xtimes (source->n_vars, var_size); + size_t end_offset = xsum (data_offset, source_size); + if (size_overflow_p (end_offset)) + return xasprintf ("Data source \"%s\" exceeds supported %zu-byte size.", + source->source_name, SIZE_MAX - 1); + if (end_offset > size) + return xasprintf ("%zu-byte data source \"%s\" starting at offset %#zx " + "runs past end of %zu-byte ZIP member.", + source_size, source->source_name, data_offset, + size); + + in += data_offset; + for (size_t i = 0; i < source->n_vars; i++) + { + struct spv_data_variable *var = &source->vars[i]; + var->var_name = decode_fixed_string (in, 288); + in += 288; + + var->values = xnmalloc (source->n_values, sizeof *var->values); + var->n_values = source->n_values; + for (size_t j = 0; j < source->n_values; j++) + { + var->values[j].index = SYSMIS; + var->values[j].width = -1; + var->values[j].d = float_get_double (FLOAT_IEEE_DOUBLE_LE, in); + in += 8; + } + } + + *end_offsetp = end_offset; + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_variable_map (const char *source_name, + const struct spvob_variable_map *in, + const struct spvob_labels *labels, + struct spv_data_variable *out) +{ + if (strcmp (in->variable_name, out->var_name)) + return xasprintf ("Source \"%s\" variable \"%s\" mapping is associated " + "with wrong variable \"%s\".", + source_name, out->var_name, in->variable_name); + + for (size_t i = 0; i < in->n_data; i++) + { + const struct spvob_datum_map *map = in->data[i]; + + if (map->value_idx >= out->n_values) + return xasprintf ("Source \"%s\" variable \"%s\" 
mapping %zu " + "attempts to set 0-based value %"PRIu32" " + "but source has only %zu values.", + source_name, out->var_name, i, + map->value_idx, out->n_values); + struct spv_data_value *value = &out->values[map->value_idx]; + + if (map->label_idx >= labels->n_labels) + return xasprintf ("Source \"%s\" variable \"%s\" mapping %zu " + "attempts to set value %"PRIu32" to 0-based label " + "%"PRIu32" but only %"PRIu32" labels are present.", + source_name, out->var_name, i, + map->value_idx, map->label_idx, labels->n_labels); + const struct spvob_label *label = labels->labels[map->label_idx]; + + if (value->width >= 0) + return xasprintf ("Source \"%s\" variable \"%s\" mapping %zu " + "attempts to change string value %"PRIu32".", + source_name, out->var_name, i, + map->value_idx); +#if 0 + else if (value->d != SYSMIS && !isnan (value->d)) + { +#if 1 + return NULL; +#else + return xasprintf ("Source \"%s\" variable \"%s\" mapping %zu " + "attempts to change non-missing value %"PRIu32" " + "into \"%s\".", + source_name, out->var_name, i, + map->value_idx, + label->label); +#endif + } +#endif + + value->width = strlen (label->label); + value->s = xmemdup0 (label->label, value->width); + } + + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_source_map (const struct spvob_source_map *in, + const struct spvob_labels *labels, + struct spv_data_source *out) +{ + if (in->n_variables > out->n_vars) + return xasprintf ("source map for \"%s\" has %"PRIu32" variables but " + "source has only %zu", + out->source_name, in->n_variables, out->n_vars); + + for (size_t i = 0; i < in->n_variables; i++) + { + char *error = decode_variable_map (out->source_name, in->variables[i], + labels, &out->vars[i]); + if (error) + return error; + } + + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_strings (const struct spvob_strings *in, struct spv_data *out) +{ + for (size_t i = 0; i < in->maps->n_maps; i++) + { + const struct spvob_source_map *sm = in->maps->maps[i]; + 
const char *name = sm->source_name; + struct spv_data_source *source = spv_data_find_source (out, name); + if (!source) + return xasprintf ("cannot decode source map for unknown source \"%s\"", + name); + + char *error = decode_source_map (sm, in->labels, source); + if (error) + return error; + } + + return NULL; +} + +char * WARN_UNUSED_RESULT +spv_legacy_data_decode (const uint8_t *in, size_t size, struct spv_data *out) +{ + char *error = NULL; + memset (out, 0, sizeof *out); + + struct spvbin_input input; + spvbin_input_init (&input, in, size); + + struct spvob_legacy_binary *lb; + bool ok = spvob_parse_legacy_binary (&input, &lb); + if (!ok) + { + error = spvbin_input_to_error (&input, NULL); + goto error; + } + + out->sources = xcalloc (lb->n_sources, sizeof *out->sources); + out->n_sources = lb->n_sources; + + for (size_t i = 0; i < lb->n_sources; i++) + { + const struct spvob_metadata *md = lb->metadata[i]; + struct spv_data_source *source = &out->sources[i]; + + source->source_name = decode_var_name (md); + source->n_vars = md->n_variables; + source->n_values = md->n_values; + source->vars = xcalloc (md->n_variables, sizeof *source->vars); + + size_t end; + error = decode_data (in, size, md->data_offset, source, &end); + if (error) + goto error; + + input.ofs = MAX (input.ofs, end); + } + + if (input.ofs < input.size) + { + struct spvob_strings *strings; + bool ok = spvob_parse_strings (&input, &strings); + if (!ok) + error = spvbin_input_to_error (&input, NULL); + else + { + if (input.ofs != input.size) + error = xasprintf ("expected end of file at offset #%zx", + input.ofs); + else + error = decode_strings (strings, out); + spvob_free_strings (strings); + } + + if (error) + goto error; + } + + spvob_free_legacy_binary (lb); + + return NULL; + +error: + spv_data_uninit (out); + memset (out, 0, sizeof *out); + spvob_free_legacy_binary (lb); + return error; +} diff --git a/src/output/spv/spv-legacy-data.h b/src/output/spv/spv-legacy-data.h new file mode 
100644 index 0000000000..1e8fa2d151 --- /dev/null +++ b/src/output/spv/spv-legacy-data.h @@ -0,0 +1,89 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifndef OUTPUT_SPV_LEGACY_DATA_H +#define OUTPUT_SPV_LEGACY_DATA_H 1 + +/* SPSS Viewer (SPV) legacy binary data decoder. + + Used by spv.h, not useful directly. */ + +#include +#include +#include +#include +#include "libpspp/compiler.h" + +struct spv_data + { + struct spv_data_source *sources; + size_t n_sources; + }; + +void spv_data_uninit (struct spv_data *); +void spv_data_dump (const struct spv_data *, FILE *); + +struct spv_data_source *spv_data_find_source (const struct spv_data *, + const char *source_name); +struct spv_data_variable *spv_data_find_variable (const struct spv_data *, + const char *source_name, + const char *variable_name); + +struct spv_data_source + { + char *source_name; + struct spv_data_variable *vars; + size_t n_vars, n_values; + }; + +void spv_data_source_uninit (struct spv_data_source *); +void spv_data_source_dump (const struct spv_data_source *, FILE *); + +struct spv_data_variable *spv_data_source_find_variable ( + const struct spv_data_source *, const char *variable_name); + +struct spv_data_variable + { + char *var_name; + struct spv_data_value *values; + size_t n_values; + }; + +void spv_data_variable_uninit (struct 
spv_data_variable *); +void spv_data_variable_dump (const struct spv_data_variable *, FILE *); + +struct spv_data_value + { + double index; + int width; + union + { + double d; + char *s; + }; + }; + +void spv_data_value_uninit (struct spv_data_value *); +bool spv_data_value_equal (const struct spv_data_value *, + const struct spv_data_value *); +struct spv_data_value *spv_data_values_clone (const struct spv_data_value *, + size_t n); + +char *spv_legacy_data_decode (const uint8_t *in, size_t size, + struct spv_data *out) WARN_UNUSED_RESULT; +void spv_data_value_dump (const struct spv_data_value *, FILE *); + +#endif /* output/spv/spv-legacy-data.h */ diff --git a/src/output/spv/spv-legacy-decoder.c b/src/output/spv/spv-legacy-decoder.c new file mode 100644 index 0000000000..9740a37eee --- /dev/null +++ b/src/output/spv/spv-legacy-decoder.c @@ -0,0 +1,2303 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2017, 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
*/ + +#include + +#include "output/spv/spv-legacy-decoder.h" + +#include +#include +#include +#include +#include + +#include "data/data-out.h" +#include "data/calendar.h" +#include "data/format.h" +#include "data/value.h" +#include "libpspp/assertion.h" +#include "libpspp/hash-functions.h" +#include "libpspp/hmap.h" +#include "libpspp/message.h" +#include "output/pivot-table.h" +#include "output/spv/detail-xml-parser.h" +#include "output/spv/spv-legacy-data.h" +#include "output/spv/spv.h" +#include "output/spv/structure-xml-parser.h" + +#include "gl/c-strtod.h" +#include "gl/xalloc.h" +#include "gl/xmemdup0.h" + +#include + +#include "gettext.h" +#define N_(msgid) msgid +#define _(msgid) gettext (msgid) + +struct spv_legacy_properties + { + /* General properties. */ + bool omit_empty; + int width_ranges[TABLE_N_AXES][2]; /* In 1/96" units. */ + bool row_labels_in_corner; + + /* Footnote display settings. */ + bool show_numeric_markers; + bool footnote_marker_superscripts; + + /* Styles. */ + struct area_style areas[PIVOT_N_AREAS]; + struct table_border_style borders[PIVOT_N_BORDERS]; + + /* Print settings. */ + bool print_all_layers; + bool paginate_layers; + bool shrink_to_width; + bool shrink_to_length; + bool top_continuation, bottom_continuation; + char *continuation; + size_t n_orphan_lines; + }; + +struct spv_series + { + struct hmap_node hmap_node; /* By name. */ + char *name; + char *label; + struct fmt_spec format; + + struct spv_series *label_series; + bool is_label_series; + + const struct spvxml_node *xml; + + struct spv_data_value *values; + size_t n_values; + struct hmap map; /* Contains "struct spv_mapping". 
*/ + bool remapped; + + struct pivot_dimension *dimension; + + struct pivot_category **index_to_category; + size_t n_index; + + struct spvdx_affix **affixes; + size_t n_affixes; + }; + +static void spv_map_destroy (struct hmap *); + +static struct spv_series * +spv_series_first (struct hmap *series_map) +{ + struct spv_series *series; + HMAP_FOR_EACH (series, struct spv_series, hmap_node, series_map) + return series; + return NULL; +} + +static struct spv_series * +spv_series_find (const struct hmap *series_map, const char *name) +{ + struct spv_series *series; + HMAP_FOR_EACH_WITH_HASH (series, struct spv_series, hmap_node, + hash_string (name, 0), series_map) + if (!strcmp (name, series->name)) + return series; + return NULL; +} + +static struct spv_series * +spv_series_from_ref (const struct hmap *series_map, + const struct spvxml_node *ref) +{ + const struct spvxml_node *node + = (spvdx_is_source_variable (ref) + ? &spvdx_cast_source_variable (ref)->node_ + : &spvdx_cast_derived_variable (ref)->node_); + return spv_series_find (series_map, node->id); +} + +static void UNUSED +spv_series_dump (const struct spv_series *series) +{ + printf ("series \"%s\"", series->name); + if (series->label) + printf (" (label \"%s\")", series->label); + printf (", %zu values:", series->n_values); + for (size_t i = 0; i < series->n_values; i++) + { + putchar (' '); + spv_data_value_dump (&series->values[i], stdout); + } + putchar ('\n'); +} + +static void +spv_series_destroy (struct hmap *series_map) +{ + struct spv_series *series, *next_series; + HMAP_FOR_EACH_SAFE (series, next_series, struct spv_series, hmap_node, + series_map) + { + free (series->name); + free (series->label); + + for (size_t i = 0; i < series->n_values; i++) + spv_data_value_uninit (&series->values[i]); + free (series->values); + + spv_map_destroy (&series->map); + + free (series->index_to_category); + + hmap_delete (series_map, &series->hmap_node); + free (series); + } + hmap_destroy (series_map); +} + 
+struct spv_mapping + { + struct hmap_node hmap_node; + double from; + struct spv_data_value to; + }; + +static struct spv_mapping * +spv_map_search (const struct hmap *map, double from) +{ + struct spv_mapping *mapping; + HMAP_FOR_EACH_WITH_HASH (mapping, struct spv_mapping, hmap_node, + hash_double (from, 0), map) + if (mapping->from == from) + return mapping; + return NULL; +} + +static const struct spv_data_value * +spv_map_lookup (const struct hmap *map, const struct spv_data_value *in) +{ + if (in->width >= 0) + return in; + + const struct spv_mapping *m = spv_map_search (map, in->d); + return m ? &m->to : in; +} + +static bool +parse_real (const char *s, double *real) +{ + int save_errno = errno; + errno = 0; + char *end; + *real = c_strtod (s, &end); + bool ok = !errno && end > s && !*end; + errno = save_errno; + + return ok; +} + +static char * WARN_UNUSED_RESULT +spv_map_insert (struct hmap *map, double from, const char *to, + bool try_strings_as_numbers, const struct fmt_spec *format) +{ + struct spv_mapping *mapping = xmalloc (sizeof *mapping); + mapping->from = from; + + if ((try_strings_as_numbers || (format && fmt_is_numeric (format->type))) + && parse_real (to, &mapping->to.d)) + { + if (try_strings_as_numbers) + mapping->to.width = -1; + else + { + union value v = { .f = mapping->to.d }; + mapping->to.s = data_out_stretchy (&v, NULL, format, NULL); + mapping->to.width = strlen (mapping->to.s); + } + } + else + { + mapping->to.width = strlen (to); + mapping->to.s = xstrdup (to); + } + + struct spv_mapping *old_mapping = spv_map_search (map, from); + if (old_mapping) + { + bool same = spv_data_value_equal (&old_mapping->to, &mapping->to); + spv_data_value_uninit (&mapping->to); + free (mapping); + return (same ? 
NULL + : xasprintf ("Duplicate relabeling differs for from=\"%.*g\"", + DBL_DIG + 1, from)); + } + + hmap_insert (map, &mapping->hmap_node, hash_double (from, 0)); + return NULL; +} + +static void +spv_map_destroy (struct hmap *map) +{ + struct spv_mapping *mapping, *next; + HMAP_FOR_EACH_SAFE (mapping, next, struct spv_mapping, hmap_node, map) + { + spv_data_value_uninit (&mapping->to); + hmap_delete (map, &mapping->hmap_node); + free (mapping); + } + hmap_destroy (map); +} + +static char * WARN_UNUSED_RESULT +spv_series_parse_relabels (struct hmap *map, + struct spvdx_relabel **relabels, size_t n_relabels, + bool try_strings_as_numbers, + const struct fmt_spec *format) +{ + for (size_t i = 0; i < n_relabels; i++) + { + const struct spvdx_relabel *relabel = relabels[i]; + char *error = spv_map_insert (map, relabel->from, relabel->to, + try_strings_as_numbers, format); + if (error) + return error; + } + return NULL; +} + +static char * WARN_UNUSED_RESULT +spv_series_parse_value_map_entry (struct hmap *map, + const struct spvdx_value_map_entry *vme) +{ + for (const char *p = vme->from; ; p++) + { + int save_errno = errno; + errno = 0; + char *end; + double from = c_strtod (p, &end); + bool ok = !errno && end > p && strchr (";", *end); + errno = save_errno; + if (!ok) + return xasprintf ("Syntax error in valueMapEntry from=\"%s\".", + vme->from); + + char *error = spv_map_insert (map, from, vme->to, true, + &(struct fmt_spec) { FMT_A, 40, 0 }); + if (error) + return error; + + p = end; + if (*p == '\0') + return NULL; + assert (*p == ';'); + } +} + +static struct fmt_spec +decode_date_time_format (const struct spvdx_date_time_format *dtf) +{ + if (dtf->dt_base_format == SPVDX_DT_BASE_FORMAT_DATE) + { + enum fmt_type type + = (dtf->show_quarter > 0 ? FMT_QYR + : dtf->show_week > 0 ? FMT_WKYR + : dtf->mdy_order == SPVDX_MDY_ORDER_DAY_MONTH_YEAR + ? (dtf->month_format == SPVDX_MONTH_FORMAT_NUMBER + || dtf->month_format == SPVDX_MONTH_FORMAT_PADDED_NUMBER + ? 
FMT_EDATE : FMT_DATE) + : dtf->mdy_order == SPVDX_MDY_ORDER_YEAR_MONTH_DAY ? FMT_SDATE + : FMT_ADATE); + + int w = fmt_min_output_width (type); + if (dtf->year_abbreviation <= 0) + w += 2; + return (struct fmt_spec) { .type = type, .w = w }; + } + else + { + enum fmt_type type + = (dtf->dt_base_format == SPVDX_DT_BASE_FORMAT_DATE_TIME + ? (dtf->mdy_order == SPVDX_MDY_ORDER_YEAR_MONTH_DAY + ? FMT_YMDHMS + : FMT_DATETIME) + : (dtf->show_day > 0 ? FMT_DTIME + : dtf->show_hour > 0 ? FMT_TIME + : FMT_MTIME)); + int w = fmt_min_output_width (type); + int d = 0; + if (dtf->show_second > 0) + { + w += 3; + if (dtf->show_millis > 0) + { + d = 3; + w += d + 1; + } + } + return (struct fmt_spec) { .type = type, .w = w, .d = d }; + } +} + +static struct fmt_spec +decode_elapsed_time_format (const struct spvdx_elapsed_time_format *etf) +{ + enum fmt_type type + = (etf->dt_base_format != SPVDX_DT_BASE_FORMAT_TIME ? FMT_DTIME + : etf->show_hour > 0 ? FMT_TIME + : FMT_MTIME); + int w = fmt_min_output_width (type); + int d = 0; + if (etf->show_second > 0) + { + w += 3; + if (etf->show_millis > 0) + { + d = 3; + w += d + 1; + } + } + return (struct fmt_spec) { .type = type, .w = w, .d = d }; +} + +static struct fmt_spec +decode_number_format (const struct spvdx_number_format *nf) +{ + enum fmt_type type = (nf->scientific == SPVDX_SCIENTIFIC_TRUE ? FMT_E + : nf->prefix && !strcmp (nf->prefix, "$") ? FMT_DOLLAR + : nf->suffix && !strcmp (nf->suffix, "%") ? FMT_PCT + : nf->use_grouping ? FMT_COMMA + : FMT_F); + + int d = nf->maximum_fraction_digits; + if (d < 0 || d > 15) + d = 2; + + struct fmt_spec f = (struct fmt_spec) { type, 40, d }; + fmt_fix_output (&f); + return f; +} + +/* Returns an *approximation* of IN as a fmt_spec. + + Not for use with string formats, which don't have any options anyway. 
*/ +static struct fmt_spec +decode_format (const struct spvdx_format *in) +{ + if (in->f_base_format == SPVDX_F_BASE_FORMAT_DATE || + in->f_base_format == SPVDX_F_BASE_FORMAT_TIME || + in->f_base_format == SPVDX_F_BASE_FORMAT_DATE_TIME) + { + struct spvdx_date_time_format dtf = { + .dt_base_format = (in->f_base_format == SPVDX_F_BASE_FORMAT_DATE + ? SPVDX_DT_BASE_FORMAT_DATE + : in->f_base_format == SPVDX_F_BASE_FORMAT_TIME + ? SPVDX_DT_BASE_FORMAT_TIME + : SPVDX_DT_BASE_FORMAT_DATE_TIME), + .separator_chars = in->separator_chars, + .mdy_order = in->mdy_order, + .show_year = in->show_year, + .year_abbreviation = in->year_abbreviation, + .show_quarter = in->show_quarter, + .quarter_prefix = in->quarter_prefix, + .quarter_suffix = in->quarter_suffix, + .show_month = in->show_month, + .month_format = in->month_format, + .show_week = in->show_week, + .week_padding = in->week_padding, + .week_suffix = in->week_suffix, + .show_day_of_week = in->show_day_of_week, + .day_of_week_abbreviation = in->day_of_week_abbreviation, + .day_padding = in->day_padding, + .day_of_month_padding = in->day_of_month_padding, + .hour_padding = in->hour_padding, + .minute_padding = in->minute_padding, + .second_padding = in->second_padding, + .show_day = in->show_day, + .show_hour = in->show_hour, + .show_minute = in->show_minute, + .show_second = in->show_second, + .show_millis = in->show_millis, + .day_type = in->day_type, + .hour_format = in->hour_format, + }; + return decode_date_time_format (&dtf); + } + else if (in->f_base_format == SPVDX_F_BASE_FORMAT_ELAPSED_TIME) + { + struct spvdx_elapsed_time_format etf = { + .dt_base_format = (in->f_base_format == SPVDX_F_BASE_FORMAT_DATE + ? SPVDX_DT_BASE_FORMAT_DATE + : in->f_base_format == SPVDX_F_BASE_FORMAT_TIME + ? 
SPVDX_DT_BASE_FORMAT_TIME + : SPVDX_DT_BASE_FORMAT_DATE_TIME), + .day_padding = in->day_padding, + .minute_padding = in->minute_padding, + .second_padding = in->second_padding, + .show_year = in->show_year, + .show_day = in->show_day, + .show_hour = in->show_hour, + .show_minute = in->show_minute, + .show_second = in->show_second, + .show_millis = in->show_millis, + }; + return decode_elapsed_time_format (&etf); + } + else + { + assert (!in->f_base_format); + struct spvdx_number_format nf = { + .minimum_integer_digits = in->minimum_integer_digits, + .maximum_fraction_digits = in->maximum_fraction_digits, + .minimum_fraction_digits = in->minimum_fraction_digits, + .use_grouping = in->use_grouping, + .scientific = in->scientific, + .small = in->small, + .prefix = in->prefix, + .suffix = in->suffix, + }; + return decode_number_format (&nf); + } +} + +static void +spv_series_execute_mapping (struct spv_series *series) +{ + if (!hmap_is_empty (&series->map)) + { + series->remapped = true; + for (size_t i = 0; i < series->n_values; i++) + { + struct spv_data_value *value = &series->values[i]; + if (value->width >= 0) + continue; + + const struct spv_mapping *mapping = spv_map_search (&series->map, + value->d); + if (mapping) + { + value->index = value->d; + assert (value->index == floor (value->index)); + value->width = mapping->to.width; + if (value->width >= 0) + value->s = xmemdup0 (mapping->to.s, mapping->to.width); + else + value->d = mapping->to.d; + } + } + } +} + +static char * WARN_UNUSED_RESULT +spv_series_remap_formats (struct spv_series *series, + struct spvxml_node **seq, size_t n_seq) +{ + spv_map_destroy (&series->map); + hmap_init (&series->map); + for (size_t i = 0; i < n_seq; i++) + { + struct spvxml_node *node = seq[i]; + if (spvdx_is_format (node)) + { + struct spvdx_format *f = spvdx_cast_format (node); + series->format = decode_format (f); + char *error = spv_series_parse_relabels ( + &series->map, f->relabel, f->n_relabel, + 
f->try_strings_as_numbers > 0, &series->format); + if (error) + return error; + + series->affixes = f->affix; + series->n_affixes = f->n_affix; + } + else if (spvdx_is_string_format (node)) + { + struct spvdx_string_format *sf = spvdx_cast_string_format (node); + char *error = spv_series_parse_relabels (&series->map, + sf->relabel, sf->n_relabel, + false, NULL); + if (error) + return error; + + series->affixes = sf->affix; + series->n_affixes = sf->n_affix; + } + else + NOT_REACHED (); + } + spv_series_execute_mapping (series); + return NULL; +} + +static char * WARN_UNUSED_RESULT +spv_series_remap_vmes (struct spv_series *series, + struct spvdx_value_map_entry **vmes, + size_t n_vmes) +{ + spv_map_destroy (&series->map); + hmap_init (&series->map); + for (size_t i = 0; i < n_vmes; i++) + { + char *error = spv_series_parse_value_map_entry (&series->map, vmes[i]); + if (error) + return error; + } + spv_series_execute_mapping (series); + return NULL; +} + +static void +decode_footnotes (struct pivot_table *table, const struct spvdx_footnotes *f) +{ + if (f->n_footnote_mapping > 0) + pivot_table_create_footnote__ (table, f->n_footnote_mapping - 1, + NULL, NULL); + for (size_t i = 0; i < f->n_footnote_mapping; i++) + { + const struct spvdx_footnote_mapping *fm = f->footnote_mapping[i]; + pivot_table_create_footnote__ (table, fm->defines_reference - 1, + pivot_value_new_user_text (fm->to, -1), + NULL); + } +} + +static struct cell_color +optional_color (int color, struct cell_color default_color) +{ + return (color >= 0 + ? (struct cell_color) CELL_COLOR (color >> 16, color >> 8, color) + : default_color); +} + +static int +optional_length (const char *s, int default_length) +{ + /* There is usually a "pt" suffix. We ignore it. */ + int length; + return s && sscanf (s, "%d", &length) == 1 ? length : default_length; +} + +static int +optional_px (double inches, int default_px) +{ + return inches != DBL_MAX ? 
inches * 96.0 : default_px; +} + +static int +optional_pt (double inches, int default_pt) +{ + return inches != DBL_MAX ? inches * 72.0 + .5 : default_pt; +} + +static void +decode_spvdx_style_incremental (const struct spvdx_style *in, + const struct spvdx_style *bg, + struct area_style *out) +{ + if (in && in->font_weight) + out->font_style.bold = in->font_weight == SPVDX_FONT_WEIGHT_BOLD; + if (in && in->font_style) + out->font_style.italic = in->font_style == SPVDX_FONT_STYLE_ITALIC; + if (in && in->font_underline) + out->font_style.underline = in->font_underline == SPVDX_FONT_UNDERLINE_UNDERLINE; + if (in && in->color >= 0) + { + out->font_style.fg[0] = optional_color ( + in->color, (struct cell_color) CELL_COLOR_BLACK); + out->font_style.fg[1] = out->font_style.fg[0]; + } + if (bg && bg->color >= 0) + { + out->font_style.bg[0] = optional_color ( + bg->color, (struct cell_color) CELL_COLOR_WHITE); + out->font_style.bg[1] = out->font_style.bg[0]; + } + if (in && in->font_family) + { + free (out->font_style.typeface); + out->font_style.typeface = xstrdup (in->font_family); + } + if (in && in->font_size) + { + int size = optional_length (in->font_size, 0); + if (size) + out->font_style.size = size; + } + if (in && in->text_alignment) + out->cell_style.halign + = (in->text_alignment == SPVDX_TEXT_ALIGNMENT_LEFT + ? TABLE_HALIGN_LEFT + : in->text_alignment == SPVDX_TEXT_ALIGNMENT_RIGHT + ? TABLE_HALIGN_RIGHT + : in->text_alignment == SPVDX_TEXT_ALIGNMENT_CENTER + ? TABLE_HALIGN_CENTER + : in->text_alignment == SPVDX_TEXT_ALIGNMENT_DECIMAL + ? TABLE_HALIGN_DECIMAL + : TABLE_HALIGN_MIXED); + if (in && in->label_location_vertical) + out->cell_style.valign = + (in->label_location_vertical == SPVDX_LABEL_LOCATION_VERTICAL_NEGATIVE + ? TABLE_VALIGN_BOTTOM + : in->label_location_vertical == SPVDX_LABEL_LOCATION_VERTICAL_POSITIVE + ? 
TABLE_VALIGN_TOP + : TABLE_VALIGN_CENTER); + if (in && in->decimal_offset != DBL_MAX) + out->cell_style.decimal_offset = optional_px (in->decimal_offset, 0); +#if 0 + if (in && in->margin_left != DBL_MAX) + out->cell_style.margin[TABLE_HORZ][0] = optional_pt (in->margin_left, 8); + if (in && in->margin_right != DBL_MAX) + out->cell_style.margin[TABLE_HORZ][1] = optional_pt (in->margin_right, 11); + if (in && in->margin_top != DBL_MAX) + out->cell_style.margin[TABLE_VERT][0] = optional_pt (in->margin_top, 1); + if (in && in->margin_bottom != DBL_MAX) + out->cell_style.margin[TABLE_VERT][1] = optional_pt (in->margin_bottom, 1); +#endif +} + +static void +decode_spvdx_style (const struct spvdx_style *in, + const struct spvdx_style *bg, + struct area_style *out) +{ + *out = (struct area_style) AREA_STYLE_INITIALIZER; + decode_spvdx_style_incremental (in, bg, out); +} + +static void +add_footnote (struct pivot_value *v, int idx, struct pivot_table *table) +{ + if (idx < 1 || idx > table->n_footnotes) + return; + + pivot_value_add_footnote (v, table->footnotes[idx - 1]); +} + +static char * WARN_UNUSED_RESULT +decode_label_frame (struct pivot_table *table, + const struct spvdx_label_frame *lf) +{ + if (!lf->label) + return NULL; + + struct pivot_value **target; + struct area_style *area; + if (lf->label->purpose == SPVDX_PURPOSE_TITLE) + { + target = &table->title; + area = &table->areas[PIVOT_AREA_TITLE]; + } + else if (lf->label->purpose == SPVDX_PURPOSE_SUB_TITLE) + { + target = &table->caption; + area = &table->areas[PIVOT_AREA_CAPTION]; + } + else if (lf->label->purpose == SPVDX_PURPOSE_FOOTNOTE) + { + if (lf->label->n_text > 0 + && lf->label->text[0]->uses_reference != INT_MIN) + { + target = NULL; + area = &table->areas[PIVOT_AREA_FOOTER]; + } + else + return NULL; + } + else if (lf->label->purpose == SPVDX_PURPOSE_LAYER) + { + target = NULL; + area = &table->areas[PIVOT_AREA_LAYERS]; + } + else + return NULL; + + area_style_uninit (area); + decode_spvdx_style 
(lf->label->style, lf->label->text_frame_style, area); + + if (target) + { + struct pivot_value *value = xzalloc (sizeof *value); + value->type = PIVOT_VALUE_TEXT; + for (size_t i = 0; i < lf->label->n_text; i++) + { + const struct spvdx_text *in = lf->label->text[i]; + if (in->defines_reference != INT_MIN) + add_footnote (value, in->defines_reference, table); + else if (!value->text.local) + value->text.local = xstrdup (in->text); + else + { + char *new = xasprintf ("%s%s", value->text.local, in->text); + free (value->text.local); + value->text.local = new; + } + } + pivot_value_destroy (*target); + *target = value; + } + else + for (size_t i = 0; i < lf->label->n_text; i++) + { + const struct spvdx_text *in = lf->label->text[i]; + if (in->uses_reference == INT_MIN) + continue; + if (i % 2) + { + size_t length = strlen (in->text); + if (length && in->text[length - 1] == '\n') + length--; + + pivot_table_create_footnote__ ( + table, in->uses_reference - 1, NULL, + pivot_value_new_user_text (in->text, length)); + } + else + { + size_t length = strlen (in->text); + if (length && in->text[length - 1] == '.') + length--; + + pivot_table_create_footnote__ ( + table, in->uses_reference - 1, + pivot_value_new_user_text (in->text, length), NULL); + } + } + return NULL; +} + +/* Special return value for decode_spvdx_source_variable() and + decode_spvdx_derived_variable(): the variable refers to a series that + has not been decoded yet. 
*/ +static char BAD_REFERENCE; + +static char * WARN_UNUSED_RESULT +decode_spvdx_source_variable (const struct spvxml_node *node, + struct spv_data *data, + struct hmap *series_map) +{ + const struct spvdx_source_variable *sv = spvdx_cast_source_variable (node); + + struct spv_series *label_series = NULL; + if (sv->label_variable) + { + label_series = spv_series_find (series_map, + sv->label_variable->node_.id); + if (!label_series) + return &BAD_REFERENCE; + + label_series->is_label_series = true; + } + + const struct spv_data_variable *var = spv_data_find_variable ( + data, sv->source, sv->source_name); + if (!var) + return xasprintf ("sourceVariable %s references nonexistent " + "source %s variable %s.", + sv->node_.id, sv->source, sv->source_name); + + struct spv_series *s = xzalloc (sizeof *s); + s->name = xstrdup (node->id); + s->xml = node; + s->label = sv->label ? xstrdup (sv->label) : NULL; + s->label_series = label_series; + s->values = spv_data_values_clone (var->values, var->n_values); + s->n_values = var->n_values; + s->format = F_8_0; + hmap_init (&s->map); + hmap_insert (series_map, &s->hmap_node, hash_string (s->name, 0)); + + char *error = spv_series_remap_formats (s, sv->seq, sv->n_seq); + if (error) + return error; + + if (label_series && !s->remapped) + { + for (size_t i = 0; i < s->n_values; i++) + if (s->values[i].width < 0) + { + char *dest; + if (label_series->values[i].width < 0) + { + union value v = { .f = label_series->values[i].d }; + dest = data_out_stretchy (&v, "UTF-8", &s->format, NULL); + } + else + dest = label_series->values[i].s; + char *error = spv_map_insert (&s->map, s->values[i].d, + dest, false, NULL); + free (error); /* Duplicates are OK. 
*/ + if (label_series->values[i].width < 0) + free (dest); + } + } + + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvdx_derived_variable (const struct spvxml_node *node, + struct hmap *series_map) +{ + const struct spvdx_derived_variable *dv = spvdx_cast_derived_variable (node); + + struct spv_data_value *values; + size_t n_values; + + struct substring value = ss_cstr (dv->value); + if (ss_equals (value, ss_cstr ("constant(0)"))) + { + struct spv_series *existing_series = spv_series_first (series_map); + if (!existing_series) + return &BAD_REFERENCE; + + n_values = existing_series->n_values; + values = xcalloc (n_values, sizeof *values); + for (size_t i = 0; i < n_values; i++) + values[i].width = -1; + } + else if (ss_starts_with (value, ss_cstr ("constant("))) + { + values = NULL; + n_values = 0; + } + else if (ss_starts_with (value, ss_cstr ("map(")) + && ss_ends_with (value, ss_cstr (")"))) + { + char *dependency_name = ss_xstrdup (ss_substr (value, 4, + value.length - 5)); + struct spv_series *dependency + = spv_series_find (series_map, dependency_name); + free (dependency_name); + if (!dependency) + return &BAD_REFERENCE; + + values = spv_data_values_clone (dependency->values, + dependency->n_values); + n_values = dependency->n_values; + } + else + return xasprintf ("Derived variable %s has unknown value \"%s\"", + node->id, dv->value); + + struct spv_series *s = xzalloc (sizeof *s); + s->format = F_8_0; + s->name = xstrdup (node->id); + s->values = values; + s->n_values = n_values; + hmap_init (&s->map); + hmap_insert (series_map, &s->hmap_node, hash_string (s->name, 0)); + + char *error = spv_series_remap_vmes (s, dv->value_map_entry, + dv->n_value_map_entry); + if (error) + return error; + + error = spv_series_remap_formats (s, dv->seq, dv->n_seq); + if (error) + return error; + + if (n_values > 0) + { + for (size_t i = 0; i < n_values; i++) + if (values[i].width != 0) + goto nonempty; + for (size_t i = 0; i < n_values; i++) + 
spv_data_value_uninit (&s->values[i]); + free (s->values); + + s->values = NULL; + s->n_values = 0; + + nonempty:; + } + return NULL; +} + +struct format_mapping + { + struct hmap_node hmap_node; + uint32_t from; + struct fmt_spec to; + }; + +static const struct format_mapping * +format_map_find (const struct hmap *format_map, uint32_t u32_format) +{ + if (format_map) + { + const struct format_mapping *fm; + HMAP_FOR_EACH_IN_BUCKET (fm, struct format_mapping, hmap_node, + hash_int (u32_format, 0), format_map) + if (fm->from == u32_format) + return fm; + } + + return NULL; +} + +static char * WARN_UNUSED_RESULT +spv_format_from_data_value (const struct spv_data_value *data, + const struct hmap *format_map, + struct fmt_spec *out) +{ + if (!data) + { + *out = fmt_for_output (FMT_F, 40, 2); + return NULL; + } + + uint32_t u32_format = data->width < 0 ? data->d : atoi (data->s); + const struct format_mapping *fm = format_map_find (format_map, u32_format); + if (fm) + { + *out = fm->to; + return NULL; + } + return spv_decode_fmt_spec (u32_format, out); +} + +static char * WARN_UNUSED_RESULT +pivot_value_from_data_value (const struct spv_data_value *data, + const struct spv_data_value *format, + const struct hmap *format_map, + struct pivot_value **vp) +{ + *vp = NULL; + + struct fmt_spec f; + char *error = spv_format_from_data_value (format, format_map, &f); + if (error) + return error; + + struct pivot_value *v = xzalloc (sizeof *v); + if (data->width >= 0) + { + if (format && fmt_get_category (f.type) == FMT_CAT_DATE) + { + int year, month, day, hour, minute, second, msec, len = -1; + if (sscanf (data->s, "%4d-%2d-%2dT%2d:%2d:%2d.%3d%n", + &year, &month, &day, &hour, &minute, &second, + &msec, &len) == 7 + && len == 23 + && data->s[len] == '\0') + { + double date = calendar_gregorian_to_offset (year, month, day, + NULL); + if (date != SYSMIS) + { + v->type = PIVOT_VALUE_NUMERIC; + v->numeric.x = (date * 60. * 60. * 24. + + hour * 60. * 60. + + minute * 60. 
+ + second + + msec / 1000.0); + v->numeric.format = f; + *vp = v; + return NULL; + } + } + } + else if (format && fmt_get_category (f.type) == FMT_CAT_TIME) + { + int hour, minute, second, msec, len = -1; + if (sscanf (data->s, "%d:%2d:%2d.%3d%n", + &hour, &minute, &second, &msec, &len) == 4 + && len > 0 + && data->s[len] == '\0') + { + v->type = PIVOT_VALUE_NUMERIC; + v->numeric.x = (hour * 60. * 60. + + minute * 60. + + second + + msec / 1000.0); + v->numeric.format = f; + *vp = v; + return NULL; + } + } + v->type = PIVOT_VALUE_STRING; + v->string.s = xstrdup (data->s); + } + else + { + v->type = PIVOT_VALUE_NUMERIC; + v->numeric.x = data->d; + v->numeric.format = f; + } + *vp = v; + return NULL; +} + +static void +add_parents (struct pivot_category *cat, struct pivot_category *parent, + size_t group_index) +{ + cat->parent = parent; + cat->group_index = group_index; + if (pivot_category_is_group (cat)) + for (size_t i = 0; i < cat->n_subs; i++) + add_parents (cat->subs[i], cat, i); +} + +static const struct spvdx_facet_level * +find_facet_level (const struct spvdx_visualization *v, int facet_level) +{ + const struct spvdx_facet_layout *layout = v->graph->facet_layout; + for (size_t i = 0; i < layout->n_facet_level; i++) + { + const struct spvdx_facet_level *fl = layout->facet_level[i]; + if (facet_level == fl->level) + return fl; + } + return NULL; +} + +static bool +should_show_label (const struct spvdx_facet_level *fl) +{ + return fl && fl->axis->label && fl->axis->label->style->visible != 0; +} + +static size_t +max_category (const struct spv_series *s) +{ + double max_cat = -DBL_MAX; + for (size_t i = 0; i < s->n_values; i++) + { + const struct spv_data_value *dv = &s->values[i]; + double d = dv->width < 0 ? 
dv->d : dv->index; + if (d > max_cat) + max_cat = d; + } + assert (max_cat >= 0 && max_cat < SIZE_MAX - 1); + + return max_cat; +} + +static void +add_affixes (struct pivot_table *table, struct pivot_value *value, + struct spvdx_affix **affixes, size_t n_affixes) +{ + for (size_t i = 0; i < n_affixes; i++) + add_footnote (value, affixes[i]->defines_reference, table); +} + +static char * WARN_UNUSED_RESULT +add_dimension (struct spv_series **series, size_t n, + enum pivot_axis_type axis_type, + const struct spvdx_visualization *v, struct pivot_table *table, + struct spv_series **dim_seriesp, size_t *n_dim_seriesp, + int base_facet_level, struct pivot_dimension **dp) +{ + char *error = NULL; + + const struct spvdx_facet_level *fl + = find_facet_level (v, base_facet_level + n); + if (fl) + { + struct area_style *area = (axis_type == PIVOT_AXIS_COLUMN + ? &table->areas[PIVOT_AREA_COLUMN_LABELS] + : axis_type == PIVOT_AXIS_ROW + ? &table->areas[PIVOT_AREA_ROW_LABELS] + : NULL); + if (area && fl->axis->label) + { + area_style_uninit (area); + decode_spvdx_style (fl->axis->label->style, + fl->axis->label->text_frame_style, area); + } + } + + if (axis_type == PIVOT_AXIS_ROW) + { + const struct spvdx_facet_level *fl2 + = find_facet_level (v, base_facet_level + (n - 1)); + if (fl2) + decode_spvdx_style_incremental ( + fl2->axis->major_ticks->style, + fl2->axis->major_ticks->tick_frame_style, + &table->areas[PIVOT_AREA_ROW_LABELS]); + } + + const struct spvdx_facet_level *fl3 = find_facet_level (v, base_facet_level); + if (fl3 && fl3->axis->major_ticks->label_angle == -90) + { + if (axis_type == PIVOT_AXIS_COLUMN) + table->rotate_inner_column_labels = true; + else + table->rotate_outer_row_labels = true; + } + + /* Find the first row for each category. 
*/ + size_t max_cat = max_category (series[0]); + size_t *cat_rows = xnmalloc (max_cat + 1, sizeof *cat_rows); + for (size_t k = 0; k <= max_cat; k++) + cat_rows[k] = SIZE_MAX; + for (size_t k = 0; k < series[0]->n_values; k++) + { + const struct spv_data_value *dv = &series[0]->values[k]; + double d = dv->width < 0 ? dv->d : dv->index; + if (d >= 0 && d < SIZE_MAX - 1) + { + size_t row = d; + if (cat_rows[row] == SIZE_MAX) + cat_rows[row] = k; + } + } + + /* Drop missing categories and count what's left. */ + size_t n_cats = 0; + for (size_t k = 0; k <= max_cat; k++) + if (cat_rows[k] != SIZE_MAX) + cat_rows[n_cats++] = cat_rows[k]; + assert (n_cats > 0); + + /* Make the categories. */ + struct pivot_dimension *d = xzalloc (sizeof *d); + table->dimensions[table->n_dimensions++] = d; + + series[0]->n_index = max_cat + 1; + series[0]->index_to_category = xcalloc ( + max_cat + 1, sizeof *series[0]->index_to_category); + struct pivot_category **cats = xnmalloc (n_cats, sizeof *cats); + for (size_t k = 0; k < n_cats; k++) + { + struct spv_data_value *dv = &series[0]->values[cat_rows[k]]; + /* Numeric values (width < 0) are indexed by value, strings by index, + matching max_category() and the cat_rows[] loop above. */ + int dv_num = dv->width < 0 ? dv->d : dv->index; + struct pivot_category *cat = xzalloc (sizeof *cat); + char *retval = pivot_value_from_data_value ( + spv_map_lookup (&series[0]->map, dv), NULL, NULL, &cat->name); + if (retval) + { + if (error) + free (retval); + else + error = retval; + } + cat->parent = NULL; + cat->dimension = d; + cat->data_index = k; + cat->presentation_index = cat_rows[k]; + cats[k] = cat; + series[0]->index_to_category[dv_num] = cat; + + if (cat->name) + add_affixes (table, cat->name, + series[0]->affixes, series[0]->n_affixes); + } + free (cat_rows); + + struct pivot_axis *axis = &table->axes[axis_type]; + d->axis_type = axis_type; + d->level = axis->n_dimensions; + d->top_index = table->n_dimensions - 1; + d->root = xzalloc (sizeof *d->root); + *d->root = (struct pivot_category) { + .name = pivot_value_new_user_text ( + series[0]->label ? 
series[0]->label : "", -1), + .dimension = d, + .show_label = should_show_label (fl), + .data_index = SIZE_MAX, + .presentation_index = SIZE_MAX, + }; + d->data_leaves = xmemdup (cats, n_cats * sizeof *cats); + d->presentation_leaves = xmemdup (cats, n_cats * sizeof *cats); + d->n_leaves = d->allocated_leaves = n_cats; + + /* Now group them, in one pass per grouping variable, innermost first. */ + for (size_t j = 1; j < n; j++) + { + struct pivot_category **new_cats = xnmalloc (n_cats, sizeof *new_cats); + size_t n_new_cats = 0; + + /* Allocate a category index. */ + size_t max_cat = max_category (series[j]); + series[j]->n_index = max_cat + 1; + series[j]->index_to_category = xcalloc ( + max_cat + 1, sizeof *series[j]->index_to_category); + for (size_t cat1 = 0; cat1 < n_cats; ) + { + /* Find a sequence of categories cat1...cat2 (exclusive), that all + have the same value in series 'j'. (This might be only a single + category; we will drop unnamed 1-category groups later.) */ + size_t row1 = cats[cat1]->presentation_index; + const struct spv_data_value *dv1 = &series[j]->values[row1]; + size_t cat2; + for (cat2 = cat1 + 1; cat2 < n_cats; cat2++) + { + size_t row2 = cats[cat2]->presentation_index; + const struct spv_data_value *dv2 = &series[j]->values[row2]; + if (!spv_data_value_equal (dv1, dv2)) + break; + } + size_t n_subs = cat2 - cat1; + + struct pivot_category *new_cat; + const struct spv_data_value *name + = spv_map_lookup (&series[j]->map, dv1); + if (n_subs == 1 && name->width == 0) + { + /* The existing category stands on its own. */ + new_cat = cats[cat1++]; + } + else + { + /* Create a new group with cat1...cat2 (exclusive) as subcategories. 
*/ + new_cat = xzalloc (sizeof *new_cat); + *new_cat = (struct pivot_category) { + .dimension = d, + .subs = xnmalloc (n_subs, sizeof *new_cat->subs), + .n_subs = n_subs, + .show_label = true, + .data_index = SIZE_MAX, + .presentation_index = row1, + }; + char *retval = pivot_value_from_data_value (name, NULL, NULL, + &new_cat->name); + if (retval) + { + if (error) + free (retval); + else + error = retval; + } + for (size_t k = 0; k < n_subs; k++) + new_cat->subs[k] = cats[cat1++]; + + int dv1_num = dv1->width < 0 ? dv1->d : dv1->index; + series[j]->index_to_category[dv1_num] = new_cat; + } + + if (new_cat->name) + add_affixes (table, new_cat->name, + series[j]->affixes, series[j]->n_affixes); + + /* Append the new group to the list of new groups. */ + new_cats[n_new_cats++] = new_cat; + } + + free (cats); + cats = new_cats; + n_cats = n_new_cats; + } + + /* Now drop unnamed 1-category groups and add parent pointers. */ + for (size_t j = 0; j < n_cats; j++) + add_parents (cats[j], d->root, j); + + d->root->subs = cats; + d->root->n_subs = n_cats; + + if (error) + { + pivot_dimension_destroy (d); + return error; + } + + dim_seriesp[(*n_dim_seriesp)++] = series[0]; + series[0]->dimension = d; + + axis->dimensions = xnrealloc (axis->dimensions, axis->n_dimensions + 1, + sizeof *axis->dimensions); + axis->dimensions[axis->n_dimensions++] = d; + axis->extent *= d->n_leaves; + + *dp = d; + return NULL; +} + +static char * WARN_UNUSED_RESULT +add_dimensions (struct hmap *series_map, const struct spvdx_nest *nest, + enum pivot_axis_type axis_type, + const struct spvdx_visualization *v, struct pivot_table *table, + struct spv_series **dim_seriesp, size_t *n_dim_seriesp, + int level_ofs) +{ + struct pivot_axis *axis = &table->axes[axis_type]; + if (!axis->extent) + axis->extent = 1; + + if (!nest) + return NULL; + + struct spv_series **series = xnmalloc (nest->n_vars, sizeof *series); + for (size_t i = 0; i < nest->n_vars; ) + { + size_t n; + for (n = 0; i + n < 
nest->n_vars; n++) + { + series[n] = spv_series_from_ref (series_map, nest->vars[i + n]->ref); + if (!series[n] || !series[n]->n_values) + break; + } + + if (n > 0) + { + struct pivot_dimension *d; + char *error = add_dimension (series, n, axis_type, v, table, + dim_seriesp, n_dim_seriesp, + level_ofs + i, &d); + if (error) + { + free (series); + return error; + } + } + + i += n + 1; + } + free (series); + + return NULL; +} + +static char * WARN_UNUSED_RESULT +add_layers (struct hmap *series_map, + struct spvdx_layer **layers, size_t n_layers, + const struct spvdx_visualization *v, struct pivot_table *table, + struct spv_series **dim_seriesp, size_t *n_dim_seriesp, + int level_ofs) +{ + struct pivot_axis *axis = &table->axes[PIVOT_AXIS_LAYER]; + if (!axis->extent) + axis->extent = 1; + + if (!n_layers) + return NULL; + + struct spv_series **series = xnmalloc (n_layers, sizeof *series); + for (size_t i = 0; i < n_layers; ) + { + size_t n; + for (n = 0; i + n < n_layers; n++) + { + series[n] = spv_series_from_ref (series_map, + layers[i + n]->variable); + if (!series[n] || !series[n]->n_values) + break; + } + + if (n > 0) + { + struct pivot_dimension *d; + char *error = add_dimension ( + series, n, PIVOT_AXIS_LAYER, v, table, + dim_seriesp, n_dim_seriesp, level_ofs + i, &d); + if (error) + { + free (series); + return error; + } + + int index = atoi (layers[i]->value); + assert (index < d->n_leaves); + table->current_layer = xrealloc ( + table->current_layer, + axis->n_dimensions * sizeof *table->current_layer); + table->current_layer[axis->n_dimensions - 1] = index; + } + i += n + 1; + } + free (series); + + return NULL; +} + +static int +optional_int (int x, int default_value) +{ + return x != INT_MIN ? 
x : default_value; +} + +static enum pivot_area +pivot_area_from_name (const char *name) +{ + static const char *area_names[PIVOT_N_AREAS] = { + [PIVOT_AREA_TITLE] = "title", + [PIVOT_AREA_CAPTION] = "caption", + [PIVOT_AREA_FOOTER] = "footnotes", + [PIVOT_AREA_CORNER] = "cornerLabels", + [PIVOT_AREA_COLUMN_LABELS] = "columnLabels", + [PIVOT_AREA_ROW_LABELS] = "rowLabels", + [PIVOT_AREA_DATA] = "data", + [PIVOT_AREA_LAYERS] = "layers", + }; + + enum pivot_area area; + for (area = 0; area < PIVOT_N_AREAS; area++) + if (!strcmp (name, area_names[area])) + break; + return area; +} + +static enum pivot_border +pivot_border_from_name (const char *name) +{ + static const char *border_names[PIVOT_N_BORDERS] = { + [PIVOT_BORDER_TITLE] = "titleLayerSeparator", + [PIVOT_BORDER_OUTER_LEFT] = "leftOuterFrame", + [PIVOT_BORDER_OUTER_TOP] = "topOuterFrame", + [PIVOT_BORDER_OUTER_RIGHT] = "rightOuterFrame", + [PIVOT_BORDER_OUTER_BOTTOM] = "bottomOuterFrame", + [PIVOT_BORDER_INNER_LEFT] = "leftInnerFrame", + [PIVOT_BORDER_INNER_TOP] = "topInnerFrame", + [PIVOT_BORDER_INNER_RIGHT] = "rightInnerFrame", + [PIVOT_BORDER_INNER_BOTTOM] = "bottomInnerFrame", + [PIVOT_BORDER_DATA_LEFT] = "dataAreaLeft", + [PIVOT_BORDER_DATA_TOP] = "dataAreaTop", + [PIVOT_BORDER_DIM_ROW_HORZ] = "horizontalDimensionBorderRows", + [PIVOT_BORDER_DIM_ROW_VERT] = "verticalDimensionBorderRows", + [PIVOT_BORDER_DIM_COL_HORZ] = "horizontalDimensionBorderColumns", + [PIVOT_BORDER_DIM_COL_VERT] = "verticalDimensionBorderColumns", + [PIVOT_BORDER_CAT_ROW_HORZ] = "horizontalCategoryBorderRows", + [PIVOT_BORDER_CAT_ROW_VERT] = "verticalCategoryBorderRows", + [PIVOT_BORDER_CAT_COL_HORZ] = "horizontalCategoryBorderColumns", + [PIVOT_BORDER_CAT_COL_VERT] = "verticalCategoryBorderColumns", + }; + + enum pivot_border border; + for (border = 0; border < PIVOT_N_BORDERS; border++) + if (!strcmp (name, border_names[border])) + break; + return border; +} + +static struct pivot_category * +find_category (struct spv_series 
*series, int index) +{ + return (index >= 0 && index < series->n_index + ? series->index_to_category[index] + : NULL); +} + +static bool +int_in_array (int value, const int *array, size_t n) +{ + for (size_t i = 0; i < n; i++) + if (array[i] == value) + return true; + + return false; +} + +static void +apply_styles_to_value (struct pivot_table *table, + struct pivot_value *value, + const struct spvdx_set_format *sf, + const struct area_style *base_area_style, + const struct spvdx_style *fg, + const struct spvdx_style *bg) +{ + if (sf) + { + if (sf->reset > 0) + { + free (value->footnotes); + value->footnotes = NULL; + value->n_footnotes = 0; + } + + struct fmt_spec format = { .w = 0 }; + if (sf->format) + { + format = decode_format (sf->format); + add_affixes (table, value, sf->format->affix, sf->format->n_affix); + } + else if (sf->number_format) + { + format = decode_number_format (sf->number_format); + add_affixes (table, value, sf->number_format->affix, + sf->number_format->n_affix); + } + else if (sf->n_string_format) + { + for (size_t i = 0; i < sf->n_string_format; i++) + add_affixes (table, value, sf->string_format[i]->affix, + sf->string_format[i]->n_affix); + } + else if (sf->date_time_format) + { + format = decode_date_time_format (sf->date_time_format); + add_affixes (table, value, sf->date_time_format->affix, + sf->date_time_format->n_affix); + } + else if (sf->elapsed_time_format) + { + format = decode_elapsed_time_format (sf->elapsed_time_format); + add_affixes (table, value, sf->elapsed_time_format->affix, + sf->elapsed_time_format->n_affix); + } + + if (format.w) + { + if (value->type == PIVOT_VALUE_NUMERIC) + value->numeric.format = format; + + /* Possibly we should try to apply date and time formats too, + but none seem to occur in practice so far. */ + } + } + if (fg || bg) + { + struct area_style area; + pivot_value_get_style ( + value, + value->font_style ? value->font_style : &base_area_style->font_style, + value->cell_style ? 
value->cell_style : &base_area_style->cell_style, + &area); + decode_spvdx_style_incremental (fg, bg, &area); + pivot_value_set_style (value, &area); + area_style_uninit (&area); + } +} + +static void +decode_set_cell_properties__ (struct pivot_table *table, + struct hmap *series_map, + const struct spvdx_intersect *intersect, + const struct spvdx_style *interval, + const struct spvdx_style *graph, + const struct spvdx_style *labeling, + const struct spvdx_style *frame, + const struct spvdx_style *major_ticks, + const struct spvdx_set_format *set_format) +{ + if (graph && labeling && intersect->alternating + && !interval && !major_ticks && !frame && !set_format) + { + /* Sets alt_fg_color and alt_bg_color. */ + struct area_style area; + decode_spvdx_style (labeling, graph, &area); + table->areas[PIVOT_AREA_DATA].font_style.fg[1] + = area.font_style.fg[0]; + table->areas[PIVOT_AREA_DATA].font_style.bg[1] + = area.font_style.bg[0]; + area_style_uninit (&area); + } + else if (graph + && !labeling && !interval && !major_ticks && !frame && !set_format) + { + /* 'graph->width' likely just sets the width of the table as a + whole. */ + } + else if (!graph && !labeling && !interval && !frame && !set_format + && !major_ticks) + { + /* No-op. (Presumably there's a setMetaData we don't care about.) */ + } + else if (((set_format && spvdx_is_major_ticks (set_format->target)) + || major_ticks || frame) + && intersect->n_where == 1) + { + /* Formatting for individual row or column labels. */ + const struct spvdx_where *w = intersect->where[0]; + struct spv_series *s = spv_series_find (series_map, w->variable->id); + assert (s); + + const char *p = w->include; + + while (*p) + { + char *tail; + int include = strtol (p, &tail, 10); + + struct pivot_category *c = find_category (s, include); + if (c) + { + const struct area_style *base_area_style + = (c->dimension->axis_type == PIVOT_AXIS_ROW + ? 
&table->areas[PIVOT_AREA_ROW_LABELS] + : &table->areas[PIVOT_AREA_COLUMN_LABELS]); + apply_styles_to_value (table, c->name, set_format, + base_area_style, major_ticks, frame); + } + + if (tail == p) + break; + p = tail; + if (*p == ';') + p++; + } + } + else if ((set_format && spvdx_is_labeling (set_format->target)) + || labeling || interval) + { + /* Formatting for individual cells or groups of them with some dimensions + in common. */ + int **indexes = xcalloc (table->n_dimensions, sizeof *indexes); + size_t *n = xcalloc (table->n_dimensions, sizeof *n); + size_t *allocated = xcalloc (table->n_dimensions, sizeof *allocated); + + for (size_t i = 0; i < intersect->n_where; i++) + { + const struct spvdx_where *w = intersect->where[i]; + struct spv_series *s = spv_series_find (series_map, w->variable->id); + assert (s); + if (!s->dimension) + { + /* Group indexes may be included even though they are redundant. + Ignore them. */ + continue; + } + + size_t j = s->dimension->top_index; + + const char *p = w->include; + while (*p) + { + char *tail; + int include = strtol (p, &tail, 10); + + struct pivot_category *c = find_category (s, include); + if (c) + { + if (n[j] >= allocated[j]) + indexes[j] = x2nrealloc (indexes[j], &allocated[j], + sizeof *indexes[j]); + indexes[j][n[j]++] = c->data_index; + } + + if (tail == p) + break; + p = tail; + if (*p == ';') + p++; + } + } + +#if 0 + printf ("match:"); + for (size_t i = 0; i < table->n_dimensions; i++) + { + if (n[i]) + { + printf (" %d=(", i); + for (size_t j = 0; j < n[i]; j++) + { + if (j) + putchar (','); + printf ("%d", indexes[i][j]); + } + putchar (')'); + } + } + printf ("\n"); +#endif + + /* XXX This is inefficient in the common case where all of the dimensions + are matched. We should use a heuristic where if all of the dimensions + are matched and the product of n[*] is less than + hmap_count(&table->cells) then iterate through all the possibilities + rather than all the cells. 
Or even only do it if there is just one + possibility. */ + + struct pivot_cell *cell; + HMAP_FOR_EACH (cell, struct pivot_cell, hmap_node, &table->cells) + { + for (size_t i = 0; i < table->n_dimensions; i++) + { + if (n[i] && !int_in_array (cell->idx[i], indexes[i], n[i])) + goto skip; + } + apply_styles_to_value (table, cell->value, set_format, + &table->areas[PIVOT_AREA_DATA], + labeling, interval); + + skip: ; + } + + for (size_t i = 0; i < table->n_dimensions; i++) + free (indexes[i]); + free (indexes); + free (n); + free (allocated); + } + else + NOT_REACHED (); +} + +static void +decode_set_cell_properties (struct pivot_table *table, struct hmap *series_map, + struct spvdx_set_cell_properties **scps, + size_t n_scps) +{ + for (size_t i = 0; i < n_scps; i++) + { + const struct spvdx_set_cell_properties *scp = scps[i]; + const struct spvdx_style *interval = NULL; + const struct spvdx_style *graph = NULL; + const struct spvdx_style *labeling = NULL; + const struct spvdx_style *frame = NULL; + const struct spvdx_style *major_ticks = NULL; + const struct spvdx_set_format *set_format = NULL; + for (size_t j = 0; j < scp->n_seq; j++) + { + const struct spvxml_node *node = scp->seq[j]; + if (spvdx_is_set_style (node)) + { + const struct spvdx_set_style *set_style + = spvdx_cast_set_style (node); + if (spvdx_is_graph (set_style->target)) + graph = set_style->style; + else if (spvdx_is_labeling (set_style->target)) + labeling = set_style->style; + else if (spvdx_is_interval (set_style->target)) + interval = set_style->style; + else if (spvdx_is_major_ticks (set_style->target)) + major_ticks = set_style->style; + else + NOT_REACHED (); + } + else if (spvdx_is_set_frame_style (node)) + frame = spvdx_cast_set_frame_style (node)->style; + else if (spvdx_is_set_format (node)) + set_format = spvdx_cast_set_format (node); + else + assert (spvdx_is_set_meta_data (node)); + } + + if (scp->union_ && scp->apply_to_converse <= 0) + { + for (size_t j = 0; j < 
scp->union_->n_intersect; j++) + decode_set_cell_properties__ ( + table, series_map, scp->union_->intersect[j], + interval, graph, labeling, frame, major_ticks, set_format); + } + else if (!scp->union_ && scp->apply_to_converse > 0) + { + if ((set_format && spvdx_is_labeling (set_format->target)) + || labeling || interval) + { + struct pivot_cell *cell; + HMAP_FOR_EACH (cell, struct pivot_cell, hmap_node, &table->cells) + apply_styles_to_value (table, cell->value, set_format, + &table->areas[PIVOT_AREA_DATA], + NULL, NULL); + } + } + else if (!scp->union_ && scp->apply_to_converse <= 0) + { + /* Appears to be used to set the font for something--but what? */ + } + else + NOT_REACHED (); + } +} + +char * WARN_UNUSED_RESULT +decode_spvsx_legacy_properties (const struct spvsx_table_properties *in, + struct spv_legacy_properties **outp) +{ + struct spv_legacy_properties *out = xzalloc (sizeof *out); + char *error; + + if (!in) + { + error = xstrdup ("Legacy table lacks tableProperties"); + goto error; + } + + const struct spvsx_general_properties *g = in->general_properties; + out->omit_empty = g->hide_empty_rows != 0; + out->width_ranges[TABLE_HORZ][0] = optional_pt (g->minimum_column_width, -1); + out->width_ranges[TABLE_HORZ][1] = optional_pt (g->maximum_column_width, -1); + out->width_ranges[TABLE_VERT][0] = optional_pt (g->minimum_row_width, -1); + out->width_ranges[TABLE_VERT][1] = optional_pt (g->maximum_row_width, -1); + out->row_labels_in_corner + = g->row_dimension_labels != SPVSX_ROW_DIMENSION_LABELS_NESTED; + + const struct spvsx_footnote_properties *f = in->footnote_properties; + out->footnote_marker_superscripts + = (f->marker_position != SPVSX_MARKER_POSITION_SUBSCRIPT); + out->show_numeric_markers + = (f->number_format == SPVSX_NUMBER_FORMAT_NUMERIC); + + for (int i = 0; i < PIVOT_N_AREAS; i++) + area_style_copy (NULL, &out->areas[i], pivot_area_get_default_style (i)); + + const struct spvsx_cell_format_properties *cfp = in->cell_format_properties; + for 
(size_t i = 0; i < cfp->n_cell_style; i++) + { + const struct spvsx_cell_style *c = cfp->cell_style[i]; + const char *name = CHAR_CAST (const char *, c->node_.raw->name); + enum pivot_area area = pivot_area_from_name (name); + if (area == PIVOT_N_AREAS) + { + error = xasprintf ("unknown area \"%s\" in cellFormatProperties", + name); + goto error; + } + + struct area_style *a = &out->areas[area]; + const struct spvsx_style *s = c->style; + if (s->font_weight) + a->font_style.bold = s->font_weight == SPVSX_FONT_WEIGHT_BOLD; + if (s->font_style) + a->font_style.italic = s->font_style == SPVSX_FONT_STYLE_ITALIC; + a->font_style.underline = false; + if (s->color >= 0) + a->font_style.fg[0] = optional_color ( + s->color, (struct cell_color) CELL_COLOR_BLACK); + if (c->alternating_text_color >= 0 || s->color >= 0) + a->font_style.fg[1] = optional_color (c->alternating_text_color, + a->font_style.fg[0]); + if (s->color2 >= 0) + a->font_style.bg[0] = optional_color ( + s->color2, (struct cell_color) CELL_COLOR_WHITE); + if (c->alternating_color >= 0 || s->color2 >= 0) + a->font_style.bg[1] = optional_color (c->alternating_color, + a->font_style.bg[0]); + if (s->font_family) + { + free (a->font_style.typeface); + a->font_style.typeface = xstrdup (s->font_family); + } + + if (s->font_size) + a->font_style.size = optional_length (s->font_size, 0); + + if (s->text_alignment) + a->cell_style.halign + = (s->text_alignment == SPVSX_TEXT_ALIGNMENT_LEFT + ? TABLE_HALIGN_LEFT + : s->text_alignment == SPVSX_TEXT_ALIGNMENT_RIGHT + ? TABLE_HALIGN_RIGHT + : s->text_alignment == SPVSX_TEXT_ALIGNMENT_CENTER + ? TABLE_HALIGN_CENTER + : s->text_alignment == SPVSX_TEXT_ALIGNMENT_DECIMAL + ? TABLE_HALIGN_DECIMAL + : TABLE_HALIGN_MIXED); + if (s->label_location_vertical) + a->cell_style.valign + = (s->label_location_vertical == SPVSX_LABEL_LOCATION_VERTICAL_NEGATIVE + ? TABLE_VALIGN_BOTTOM + : s->label_location_vertical == SPVSX_LABEL_LOCATION_VERTICAL_POSITIVE + ? 
TABLE_VALIGN_TOP + : TABLE_VALIGN_CENTER); + + if (s->decimal_offset != DBL_MAX) + a->cell_style.decimal_offset = optional_px (s->decimal_offset, 0); + + if (s->margin_left != DBL_MAX) + a->cell_style.margin[TABLE_HORZ][0] = optional_px (s->margin_left, 8); + if (s->margin_right != DBL_MAX) + a->cell_style.margin[TABLE_HORZ][1] = optional_px (s->margin_right, + 11); + if (s->margin_top != DBL_MAX) + a->cell_style.margin[TABLE_VERT][0] = optional_px (s->margin_top, 1); + if (s->margin_bottom != DBL_MAX) + a->cell_style.margin[TABLE_VERT][1] = optional_px (s->margin_bottom, + 1); + } + + for (int i = 0; i < PIVOT_N_BORDERS; i++) + pivot_border_get_default_style (i, &out->borders[i]); + + const struct spvsx_border_properties *bp = in->border_properties; + for (size_t i = 0; i < bp->n_border_style; i++) + { + const struct spvsx_border_style *bin = bp->border_style[i]; + const char *name = CHAR_CAST (const char *, bin->node_.raw->name); + enum pivot_border border = pivot_border_from_name (name); + if (border == PIVOT_N_BORDERS) + { + error = xasprintf ("unknown border \"%s\" parsing borderProperties", + name); + goto error; + } + + struct table_border_style *bout = &out->borders[border]; + bout->stroke + = (bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_NONE + ? TABLE_STROKE_NONE + : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_DASHED + ? TABLE_STROKE_DASHED + : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_THICK + ? TABLE_STROKE_THICK + : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_THIN + ? TABLE_STROKE_THIN + : bin->border_style_type == SPVSX_BORDER_STYLE_TYPE_DOUBLE + ? 
TABLE_STROKE_DOUBLE + : TABLE_STROKE_SOLID); + bout->color = optional_color (bin->color, + (struct cell_color) CELL_COLOR_BLACK); + } + + const struct spvsx_printing_properties *pp = in->printing_properties; + out->print_all_layers = pp->print_all_layers > 0; + out->paginate_layers = pp->print_each_layer_on_separate_page > 0; + out->shrink_to_width = pp->rescale_wide_table_to_fit_page > 0; + out->shrink_to_length = pp->rescale_long_table_to_fit_page > 0; + out->top_continuation = pp->continuation_text_at_top > 0; + out->bottom_continuation = pp->continuation_text_at_bottom > 0; + out->continuation = xstrdup (pp->continuation_text + ? pp->continuation_text : "(cont.)"); + out->n_orphan_lines = optional_int (pp->window_orphan_lines, 2); + + *outp = out; + return NULL; + +error: + spv_legacy_properties_destroy (out); + *outp = NULL; + return error; +} + +void +spv_legacy_properties_destroy (struct spv_legacy_properties *props) +{ + if (props) + { + for (size_t i = 0; i < PIVOT_N_AREAS; i++) + area_style_uninit (&props->areas[i]); + free (props->continuation); + free (props); + } +} + +static struct spv_series * +parse_formatting (const struct spvdx_visualization *v, + const struct hmap *series_map, struct hmap *format_map) +{ + const struct spvdx_labeling *labeling = v->graph->interval->labeling; + struct spv_series *cell_format = NULL; + for (size_t i = 0; i < labeling->n_seq; i++) + { + const struct spvdx_formatting *f + = spvdx_cast_formatting (labeling->seq[i]); + if (!f) + continue; + + cell_format = spv_series_from_ref (series_map, f->variable); + for (size_t j = 0; j < f->n_format_mapping; j++) + { + const struct spvdx_format_mapping *fm = f->format_mapping[j]; + + if (fm->format) + { + struct format_mapping *out = xmalloc (sizeof *out); + out->from = fm->from; + out->to = decode_format (fm->format); + hmap_insert (format_map, &out->hmap_node, + hash_int (out->from, 0)); + } + } + } + + return cell_format; +} + +static void +format_map_destroy (struct hmap 
*format_map) +{ + struct format_mapping *fm, *next; + HMAP_FOR_EACH_SAFE (fm, next, struct format_mapping, hmap_node, format_map) + { + hmap_delete (format_map, &fm->hmap_node); + free (fm); + } + hmap_destroy (format_map); +} + +char * WARN_UNUSED_RESULT +decode_spvdx_table (const struct spvdx_visualization *v, const char *subtype, + const struct spv_legacy_properties *props, + struct spv_data *data, struct pivot_table **outp) +{ + struct pivot_table *table = pivot_table_create__ (NULL, subtype); + + struct hmap series_map = HMAP_INITIALIZER (series_map); + struct hmap format_map = HMAP_INITIALIZER (format_map); + struct spv_series **dim_series = NULL; + char *error; + + /* First get the legacy properties. */ + table->omit_empty = props->omit_empty; + for (enum table_axis axis = 0; axis < TABLE_N_AXES; axis++) + for (int i = 0; i < 2; i++) + if (props->width_ranges[axis][i] > 0) + table->sizing[axis].range[i] = props->width_ranges[axis][i]; + table->row_labels_in_corner = props->row_labels_in_corner; + + table->footnote_marker_superscripts = props->footnote_marker_superscripts; + table->show_numeric_markers = props->show_numeric_markers; + + for (size_t i = 0; i < PIVOT_N_AREAS; i++) + { + area_style_uninit (&table->areas[i]); + area_style_copy (NULL, &table->areas[i], &props->areas[i]); + } + for (size_t i = 0; i < PIVOT_N_BORDERS; i++) + table->borders[i] = props->borders[i]; + + table->print_all_layers = props->print_all_layers; + table->paginate_layers = props->paginate_layers; + table->shrink_to_fit[TABLE_HORZ] = props->shrink_to_width; + table->shrink_to_fit[TABLE_VERT] = props->shrink_to_length; + table->top_continuation = props->top_continuation; + table->bottom_continuation = props->bottom_continuation; + table->continuation = xstrdup (props->continuation); + table->n_orphan_lines = props->n_orphan_lines; + + struct spvdx_visualization_extension *ve = v->visualization_extension; + table->show_grid_lines = ve && ve->show_gridline; + + /* Sizing from the 
legacy properties can get overridden. */ + if (v->graph->cell_style->width) + { + int min_width, max_width, n = 0; + if (sscanf (v->graph->cell_style->width, "%*d%%;%dpt;%dpt%n", + &min_width, &max_width, &n) + && v->graph->cell_style->width[n] == '\0') + { + table->sizing[TABLE_HORZ].range[0] = min_width; + table->sizing[TABLE_HORZ].range[1] = max_width; + } + } + + /* Footnotes. + + Any pivot_value might refer to footnotes, so it's important to process the + footnotes early to ensure that those references can be resolved. There is + a possible problem that a footnote might itself reference an + as-yet-unprocessed footnote, but that's OK because footnote references + don't actually look at the footnote contents but only resolve a pointer to + where the footnote will go later. + + Before we really start, create all the footnotes we'll fill in. This is + because sometimes footnotes refer to themselves or to each other and we + don't want to reject those references. */ + if (v->container) + for (size_t i = 0; i < v->container->n_label_frame; i++) + { + const struct spvdx_label_frame *lf = v->container->label_frame[i]; + if (lf->label + && lf->label->purpose == SPVDX_PURPOSE_FOOTNOTE + && lf->label->n_text > 0 + && lf->label->text[0]->uses_reference > 0) + { + pivot_table_create_footnote__ ( + table, lf->label->text[0]->uses_reference - 1, + NULL, NULL); + } + } + + if (v->graph->interval->footnotes) + decode_footnotes (table, v->graph->interval->footnotes); + + struct spv_series *footnotes = NULL; + for (size_t i = 0; i < v->graph->interval->labeling->n_seq; i++) + { + const struct spvxml_node *node = v->graph->interval->labeling->seq[i]; + if (spvdx_is_footnotes (node)) + { + const struct spvdx_footnotes *f = spvdx_cast_footnotes (node); + footnotes = spv_series_from_ref (&series_map, f->variable); + decode_footnotes (table, f); + } + } + for (size_t i = 0; i < v->n_lf1; i++) + { + error = decode_label_frame (table, v->lf1[i]); + if (error) + goto exit; + } + for 
(size_t i = 0; i < v->n_lf2; i++) + { + error = decode_label_frame (table, v->lf2[i]); + if (error) + goto exit; + } + if (v->container) + for (size_t i = 0; i < v->container->n_label_frame; i++) + { + error = decode_label_frame (table, v->container->label_frame[i]); + if (error) + goto exit; + } + if (v->graph->interval->labeling->style) + { + area_style_uninit (&table->areas[PIVOT_AREA_DATA]); + decode_spvdx_style (v->graph->interval->labeling->style, + v->graph->cell_style, + &table->areas[PIVOT_AREA_DATA]); + } + + /* Decode all of the sourceVariable and derivedVariable elements. */ + struct spvxml_node **nodes = xmemdup (v->seq, v->n_seq * sizeof *v->seq); + size_t n_nodes = v->n_seq; + while (n_nodes > 0) + { + bool progress = false; + for (size_t i = 0; i < n_nodes; ) + { + error = (spvdx_is_source_variable (nodes[i]) + ? decode_spvdx_source_variable (nodes[i], data, &series_map) + : decode_spvdx_derived_variable (nodes[i], &series_map)); + if (!error) + { + nodes[i] = nodes[--n_nodes]; + progress = true; + } + else if (error == &BAD_REFERENCE) + i++; + else + { + free (nodes); + goto exit; + } + } + + if (!progress) + { + error = xasprintf ("Table has %zu variables with circular or " + "unresolved references, including variable %s.", + n_nodes, nodes[0]->id); + free (nodes); + goto exit; + } + } + free (nodes); + + const struct spvdx_cross *cross = v->graph->faceting->cross; + + assert (cross->n_seq == 1); + const struct spvdx_nest *columns = spvdx_cast_nest (cross->seq[0]); + size_t max_columns = columns ? columns->n_vars : 0; + + assert (cross->n_seq2 == 1); + const struct spvdx_nest *rows = spvdx_cast_nest (cross->seq2[0]); + size_t max_rows = rows ? 
rows->n_vars : 0; + + size_t max_layers = (v->graph->faceting->n_layers1 + + v->graph->faceting->n_layers2); + + size_t max_dims = max_columns + max_rows + max_layers; + table->dimensions = xnmalloc (max_dims, sizeof *table->dimensions); + dim_series = xnmalloc (max_dims, sizeof *dim_series); + size_t n_dim_series = 0; + + error = add_dimensions (&series_map, columns, PIVOT_AXIS_COLUMN, v, table, + dim_series, &n_dim_series, 1); + if (error) + goto exit; + + error = add_dimensions (&series_map, rows, PIVOT_AXIS_ROW, v, table, + dim_series, &n_dim_series, max_columns + 1); + if (error) + goto exit; + + error = add_layers (&series_map, v->graph->faceting->layers1, + v->graph->faceting->n_layers1, + v, table, dim_series, &n_dim_series, + max_rows + max_columns + 1); + if (error) + goto exit; + + error = add_layers (&series_map, v->graph->faceting->layers2, + v->graph->faceting->n_layers2, + v, table, dim_series, &n_dim_series, + (max_rows + max_columns + v->graph->faceting->n_layers1 + + 1)); + if (error) + goto exit; + + struct spv_series *cell = spv_series_find (&series_map, "cell"); + if (!cell) + { + error = xstrdup (_("Table lacks cell data.")); + goto exit; + } + + struct spv_series *cell_format = parse_formatting (v, &series_map, + &format_map); + + assert (table->n_dimensions == n_dim_series); + size_t *dim_indexes = xnmalloc (table->n_dimensions, sizeof *dim_indexes); + for (size_t i = 0; i < cell->n_values; i++) + { + for (size_t j = 0; j < table->n_dimensions; j++) + { + const struct spv_data_value *value = &dim_series[j]->values[i]; + const struct pivot_category *cat = find_category ( + dim_series[j], value->width < 0 ? value->d : value->index); + if (!cat) + goto skip; + dim_indexes[j] = cat->data_index; + } + + struct pivot_value *value; + error = pivot_value_from_data_value ( + &cell->values[i], cell_format ? 
&cell_format->values[i] : NULL, + &format_map, &value); + if (error) + goto exit; + + if (footnotes) + { + const struct spv_data_value *d = &footnotes->values[i]; + if (d->width >= 0) + { + const char *p = d->s; + while (*p) + { + char *tail; + int idx = strtol (p, &tail, 10); + add_footnote (value, idx, table); + if (tail == p) + break; + p = tail; + if (*p == ',') + p++; + } + } + } + + if (value->type == PIVOT_VALUE_NUMERIC + && value->numeric.x == SYSMIS + && !value->n_footnotes) + { + /* Apparently, system-missing values are just empty cells? */ + pivot_value_destroy (value); + } + else + pivot_table_put (table, dim_indexes, table->n_dimensions, value); + skip:; + } + free (dim_indexes); + + decode_set_cell_properties (table, &series_map, v->graph->facet_layout->scp1, + v->graph->facet_layout->n_scp1); + decode_set_cell_properties (table, &series_map, v->graph->facet_layout->scp2, + v->graph->facet_layout->n_scp2); + + pivot_table_assign_label_depth (table); + + format_map_destroy (&format_map); + +exit: + free (dim_series); + spv_series_destroy (&series_map); + if (error) + { + pivot_table_unref (table); + *outp = NULL; + } + else + *outp = table; + return error; +} diff --git a/src/output/spv/spv-legacy-decoder.h b/src/output/spv/spv-legacy-decoder.h new file mode 100644 index 0000000000..9ff2428b47 --- /dev/null +++ b/src/output/spv/spv-legacy-decoder.h @@ -0,0 +1,46 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifndef OUTPUT_SPV_LEGACY_DECODER_H +#define OUTPUT_SPV_LEGACY_DECODER_H 1 + +/* SPSS Viewer (SPV) legacy binary decoder. + + Used by spv.h, not useful directly. */ + +#include "libpspp/compiler.h" + +struct pivot_table; +struct spvdx_visualization; +struct spvsx_table_properties; +struct spv_data; + +struct spv_legacy_properties; + +void spv_legacy_properties_destroy (struct spv_legacy_properties *); + +char *decode_spvsx_legacy_properties (const struct spvsx_table_properties *, + struct spv_legacy_properties **) + WARN_UNUSED_RESULT; + +char *decode_spvdx_table (const struct spvdx_visualization *, + const char *subtype, + const struct spv_legacy_properties *, + struct spv_data *, + struct pivot_table **outp) + WARN_UNUSED_RESULT; + +#endif /* output/spv/spv-legacy-decoder.h */ diff --git a/src/output/spv/spv-light-decoder.c b/src/output/spv/spv-light-decoder.c new file mode 100644 index 0000000000..be58d416eb --- /dev/null +++ b/src/output/spv/spv-light-decoder.c @@ -0,0 +1,1056 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2017, 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
*/ + +#include <config.h> + +#include "output/spv/spv-light-decoder.h" + +#include +#include +#include +#include + +#include "libpspp/i18n.h" +#include "libpspp/message.h" +#include "output/pivot-table.h" +#include "output/spv/light-binary-parser.h" +#include "output/spv/spv.h" + +#include "gl/xalloc.h" +#include "gl/xsize.h" + +static char * +to_utf8 (const char *s, const char *encoding) +{ + return recode_string ("UTF-8", encoding, s, strlen (s)); +} + +static char * +to_utf8_if_nonempty (const char *s, const char *encoding) +{ + return s && s[0] ? to_utf8 (s, encoding) : NULL; +} + +static void +convert_widths (const uint32_t *in, uint32_t n, int **out, size_t *n_out) +{ + if (n) + { + *n_out = n; + *out = xnmalloc (n, sizeof **out); + for (size_t i = 0; i < n; i++) + (*out)[i] = in[i]; + } +} + +static void +convert_breakpoints (const struct spvlb_breakpoints *in, + size_t **out, size_t *n_out) +{ + if (in && in->n_breaks) + { + *n_out = in->n_breaks; + *out = xnmalloc (in->n_breaks, sizeof **out); + for (size_t i = 0; i < in->n_breaks; i++) + (*out)[i] = in->breaks[i]; + } +} + +static void +convert_keeps (const struct spvlb_keeps *in, + struct pivot_keep **out, size_t *n_out) +{ + if (in && in->n_keeps) + { + *n_out = in->n_keeps; + *out = xnmalloc (*n_out, sizeof **out); + for (size_t i = 0; i < *n_out; i++) + { + (*out)[i].ofs = in->keeps[i]->offset; + (*out)[i].n = in->keeps[i]->n; + } + } +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_color_string (const char *s, uint8_t def, + struct cell_color *colorp) +{ + int r, g, b; + if (!*s) + r = g = b = def; + else if (sscanf (s, "#%2x%2x%2x", &r, &g, &b) != 3) + return xasprintf ("bad color %s", s); + + *colorp = (struct cell_color) CELL_COLOR (r, g, b); + return NULL; +} + +static struct cell_color +decode_spvlb_color_u32 (uint32_t x) +{ + return (struct cell_color) { x >> 24, x >> 16, x >> 8, x }; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_font_style (const struct spvlb_font_style *in, + const char 
*encoding, struct font_style **outp) +{ + if (!in) + { + *outp = NULL; + return NULL; + } + + struct cell_color fg, bg; + char *error = decode_spvlb_color_string (in->fg_color, 0x00, &fg); + if (!error) + error = decode_spvlb_color_string (in->bg_color, 0xff, &bg); + if (error) + return error; + + *outp = xmalloc (sizeof **outp); + **outp = (struct font_style) { + .bold = in->bold, + .italic = in->italic, + .underline = in->underline, + .fg = { fg, fg }, + .bg = { bg, bg }, + .typeface = to_utf8 (in->typeface, encoding), + .size = in->size / 1.33, + }; + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_halign (uint32_t in, enum table_halign *halignp) +{ + switch (in) + { + case 0: + *halignp = TABLE_HALIGN_CENTER; + return NULL; + + case 2: + *halignp = TABLE_HALIGN_LEFT; + return NULL; + + case 4: + *halignp = TABLE_HALIGN_RIGHT; + return NULL; + + case 6: + case 61453: + *halignp = TABLE_HALIGN_DECIMAL; + return NULL; + + case 0xffffffad: + case 64173: + *halignp = TABLE_HALIGN_MIXED; + return NULL; + + default: + return xasprintf ("bad cell style halign %"PRIu32, in); + } +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_valign (uint32_t in, enum table_valign *valignp) +{ + switch (in) + { + case 0: + *valignp = TABLE_VALIGN_CENTER; + return NULL; + + case 1: + *valignp = TABLE_VALIGN_TOP; + return NULL; + + case 3: + *valignp = TABLE_VALIGN_BOTTOM; + return NULL; + + default: + return xasprintf ("bad cell style valign %"PRIu32, in); + } +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_cell_style (const struct spvlb_cell_style *in, + struct cell_style **outp) +{ + if (!in) + { + *outp = NULL; + return NULL; + } + + enum table_halign halign; + char *error = decode_spvlb_halign (in->halign, &halign); + if (error) + return error; + + enum table_valign valign; + error = decode_spvlb_valign (in->valign, &valign); + if (error) + return error; + + *outp = xzalloc (sizeof **outp); + **outp = (struct cell_style) { + .halign = halign, + .valign = 
valign, + .decimal_offset = in->decimal_offset, + .margin = { + [TABLE_HORZ] = { in->left_margin, in->right_margin }, + [TABLE_VERT] = { in->top_margin, in->bottom_margin }, + }, + }; + return NULL; +} + +static char *decode_spvlb_value ( + const struct pivot_table *, const struct spvlb_value *, + const char *encoding, struct pivot_value **) WARN_UNUSED_RESULT; + +static char * WARN_UNUSED_RESULT +decode_spvlb_argument (const struct pivot_table *table, + const struct spvlb_argument *in, + const char *encoding, struct pivot_argument *out) +{ + if (in->value) + { + struct pivot_value *value; + char *error = decode_spvlb_value (table, in->value, encoding, &value); + if (error) + return error; + + out->n = 1; + out->values = xmalloc (sizeof *out->values); + out->values[0] = value; + } + else + { + out->n = 0; + out->values = xnmalloc (in->n_values, sizeof *out->values); + for (size_t i = 0; i < in->n_values; i++) + { + char *error = decode_spvlb_value (table, in->values[i], encoding, + &out->values[i]); + if (error) + { + pivot_argument_uninit (out); + return error; + } + out->n++; + } + } + + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_value_show (uint8_t in, enum settings_value_show *out) +{ + switch (in) + { + case 0: *out = SETTINGS_VALUE_SHOW_DEFAULT; return NULL; + case 1: *out = SETTINGS_VALUE_SHOW_VALUE; return NULL; + case 2: *out = SETTINGS_VALUE_SHOW_LABEL; return NULL; + case 3: *out = SETTINGS_VALUE_SHOW_BOTH; return NULL; + default: + return xasprintf ("bad value show %"PRIu8, in); + } +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_value (const struct pivot_table *table, + const struct spvlb_value *in, + const char *encoding, struct pivot_value **outp) +{ + *outp = NULL; + + struct pivot_value *out = xzalloc (sizeof *out); + const struct spvlb_value_mod *vm; + + char *error; + switch (in->type) + { + case 1: + vm = in->type_01.value_mod; + out->type = PIVOT_VALUE_NUMERIC; + out->numeric.x = in->type_01.x; + error = 
spv_decode_fmt_spec (in->type_01.format, &out->numeric.format); + if (error) + return error; + break; + + case 2: + vm = in->type_02.value_mod; + out->type = PIVOT_VALUE_NUMERIC; + out->numeric.x = in->type_02.x; + error = spv_decode_fmt_spec (in->type_02.format, &out->numeric.format); + if (!error) + error = decode_spvlb_value_show (in->type_02.show, &out->numeric.show); + if (error) + return error; + out->numeric.var_name = to_utf8_if_nonempty (in->type_02.var_name, + encoding); + out->numeric.value_label = to_utf8_if_nonempty (in->type_02.value_label, + encoding); + break; + + case 3: + vm = in->type_03.value_mod; + out->type = PIVOT_VALUE_TEXT; + out->text.local = to_utf8 (in->type_03.local, encoding); + out->text.c = to_utf8 (in->type_03.c, encoding); + out->text.id = to_utf8 (in->type_03.id, encoding); + out->text.user_provided = !in->type_03.fixed; + break; + + case 4: + vm = in->type_04.value_mod; + out->type = PIVOT_VALUE_STRING; + error = decode_spvlb_value_show (in->type_04.show, &out->string.show); + if (error) + return error; + out->string.s = to_utf8 (in->type_04.s, encoding); + out->string.var_name = to_utf8 (in->type_04.var_name, encoding); + out->string.value_label = to_utf8_if_nonempty (in->type_04.value_label, + encoding); + break; + + case 5: + vm = in->type_05.value_mod; + out->type = PIVOT_VALUE_VARIABLE; + error = decode_spvlb_value_show (in->type_05.show, &out->variable.show); + if (error) + return error; + out->variable.var_name = to_utf8 (in->type_05.var_name, encoding); + out->variable.var_label = to_utf8_if_nonempty (in->type_05.var_label, + encoding); + break; + + case 6: + vm = in->type_06.value_mod; + out->type = PIVOT_VALUE_TEXT; + out->text.local = to_utf8 (in->type_06.local, encoding); + out->text.c = to_utf8 (in->type_06.c, encoding); + out->text.id = to_utf8 (in->type_06.id, encoding); + out->text.user_provided = false; + break; + + case -1: + vm = in->type_else.value_mod; + out->type = PIVOT_VALUE_TEMPLATE; + out->template.local = 
to_utf8 (in->type_else.template, encoding); + out->template.id = out->template.local; + out->template.n_args = 0; + out->template.args = xnmalloc (in->type_else.n_args, + sizeof *out->template.args); + for (size_t i = 0; i < in->type_else.n_args; i++) + { + error = decode_spvlb_argument (table, in->type_else.args[i], + encoding, &out->template.args[i]); + if (error) + { + pivot_value_destroy (out); + return error; + } + out->template.n_args++; + } + break; + + default: + assert (0); + } + + if (vm) + { + if (vm->n_subscripts) + { + out->n_subscripts = vm->n_subscripts; + out->subscripts = xnmalloc (vm->n_subscripts, + sizeof *out->subscripts); + for (size_t i = 0; i < vm->n_subscripts; i++) + out->subscripts[i] = to_utf8 (vm->subscripts[i], encoding); + } + + if (vm->n_refs) + { + out->footnotes = xnmalloc (vm->n_refs, sizeof *out->footnotes); + for (size_t i = 0; i < vm->n_refs; i++) + { + uint16_t idx = vm->refs[i]; + if (idx >= table->n_footnotes) + { + pivot_value_destroy (out); + return xasprintf ("bad footnote index: %"PRIu16" >= %zu", + idx, table->n_footnotes); + } + + out->footnotes[out->n_footnotes++] = table->footnotes[idx]; + } + } + + if (vm->style_pair) + { + error = decode_spvlb_font_style (vm->style_pair->font_style, + encoding, &out->font_style); + if (!error) + error = decode_spvlb_cell_style (vm->style_pair->cell_style, + &out->cell_style); + if (error) + { + pivot_value_destroy (out); + return error; + } + } + + if (vm->template_string + && vm->template_string->id + && vm->template_string->id[0] + && out->type == PIVOT_VALUE_TEMPLATE) + out->template.id = to_utf8 (vm->template_string->id, encoding); + } + + *outp = out; + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_area (const struct spvlb_area *in, struct area_style *out, + const char *encoding) +{ + char *error; + + struct cell_color fg0, fg1, bg0, bg1; + error = decode_spvlb_color_string (in->fg_color, 0x00, &fg0); + if (!error) + error = decode_spvlb_color_string 
(in->bg_color, 0xff, &bg0); + if (!error && in->alternate) + error = decode_spvlb_color_string (in->alt_fg_color, 0x00, &fg1); + if (!error && in->alternate) + error = decode_spvlb_color_string (in->alt_bg_color, 0xff, &bg1); + + enum table_halign halign; + if (!error) + { + error = decode_spvlb_halign (in->halign, &halign); + + /* TABLE_HALIGN_DECIMAL doesn't seem to be a real halign for areas, which + is good because there's no way to indicate the decimal offset. Just + in case: */ + if (!error && halign == TABLE_HALIGN_DECIMAL) + halign = TABLE_HALIGN_MIXED; + } + + enum table_valign valign; + if (!error) + error = decode_spvlb_valign (in->valign, &valign); + + if (error) + return error; + + *out = (struct area_style) { + .font_style = { + .bold = (in->style & 1) != 0, + .italic = (in->style & 2) != 0, + .underline = in->underline, + .fg = { fg0, in->alternate ? fg1 : fg0 }, + .bg = { bg0, in->alternate ? bg1 : bg0 }, + .typeface = to_utf8 (in->typeface, encoding), + .size = in->size / 1.33, + }, + .cell_style = { + .halign = halign, + .valign = valign, + .margin = { + [TABLE_HORZ] = { in->left_margin, in->right_margin }, + [TABLE_VERT] = { in->top_margin, in->bottom_margin }, + }, + }, + }; + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_group (const struct pivot_table *, + struct spvlb_category **, + size_t n_categories, + bool show_label, + struct pivot_category *parent, + struct pivot_dimension *, + const char *encoding); + +static char * WARN_UNUSED_RESULT +decode_spvlb_categories (const struct pivot_table *table, + struct spvlb_category **categories, + size_t n_categories, + struct pivot_category *parent, + struct pivot_dimension *dimension, + const char *encoding) +{ + for (size_t i = 0; i < n_categories; i++) + { + const struct spvlb_category *in = categories[i]; + if (in->group && in->group->merge) + { + char *error = decode_spvlb_categories ( + table, in->group->subcategories, in->group->n_subcategories, + parent, dimension, 
encoding); + if (error) + return error; + + continue; + } + + struct pivot_value *name; + char *error = decode_spvlb_value (table, in->name, encoding, &name); + if (error) + return error; + + struct pivot_category *out = xzalloc (sizeof *out); + out->name = name; + out->parent = parent; + out->dimension = dimension; + if (in->group) + { + char *error = decode_spvlb_group (table, in->group->subcategories, + in->group->n_subcategories, + true, out, dimension, encoding); + if (error) + { + pivot_category_destroy (out); + return error; + } + + out->data_index = SIZE_MAX; + out->presentation_index = SIZE_MAX; + } + else + { + out->data_index = in->leaf->leaf_index; + out->presentation_index = dimension->n_leaves; + dimension->n_leaves++; + } + + if (parent->n_subs >= parent->allocated_subs) + parent->subs = x2nrealloc (parent->subs, &parent->allocated_subs, + sizeof *parent->subs); + parent->subs[parent->n_subs++] = out; + } + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_group (const struct pivot_table *table, + struct spvlb_category **categories, + size_t n_categories, bool show_label, + struct pivot_category *category, + struct pivot_dimension *dimension, + const char *encoding) +{ + category->subs = xcalloc (n_categories, sizeof *category->subs); + category->n_subs = 0; + category->allocated_subs = 0; + category->show_label = show_label; + + return decode_spvlb_categories (table, categories, n_categories, category, + dimension, encoding); +} + +static char * WARN_UNUSED_RESULT +fill_leaves (struct pivot_category *category, + struct pivot_dimension *dimension) +{ + if (pivot_category_is_group (category)) + { + for (size_t i = 0; i < category->n_subs; i++) + { + char *error = fill_leaves (category->subs[i], dimension); + if (error) + return error; + } + } + else + { + if (category->data_index >= dimension->n_leaves) + return xasprintf ("leaf_index %zu >= n_leaves %zu", + category->data_index, dimension->n_leaves); + if 
(dimension->data_leaves[category->data_index]) + return xasprintf ("two leaves with data_index %zu", + category->data_index); + dimension->data_leaves[category->data_index] = category; + dimension->presentation_leaves[category->presentation_index] = category; + } + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_dimension (const struct pivot_table *table, + const struct spvlb_dimension *in, + size_t idx, const char *encoding, + struct pivot_dimension **outp) +{ + /* Convert most of the dimension. */ + struct pivot_value *name; + char *error = decode_spvlb_value (table, in->name, encoding, &name); + if (error) + return error; + + struct pivot_dimension *out = xzalloc (sizeof *out); + out->level = UINT_MAX; + out->top_index = idx; + out->hide_all_labels = in->props->hide_all_labels; + + out->root = xzalloc (sizeof *out->root); + *out->root = (struct pivot_category) { + .name = name, + .dimension = out, + .data_index = SIZE_MAX, + .presentation_index = SIZE_MAX, + }; + error = decode_spvlb_group (table, in->categories, in->n_categories, + !in->props->hide_dim_label, out->root, + out, encoding); + if (error) + goto error; + + /* Allocate and fill the array of leaves now that we know how many there + are. 
*/ + out->data_leaves = xcalloc (out->n_leaves, sizeof *out->data_leaves); + out->presentation_leaves = xcalloc (out->n_leaves, + sizeof *out->presentation_leaves); + out->allocated_leaves = out->n_leaves; + error = fill_leaves (out->root, out); + if (error) + goto error; + for (size_t i = 0; i < out->n_leaves; i++) + { + assert (out->data_leaves[i] != NULL); + assert (out->presentation_leaves[i] != NULL); + } + *outp = out; + return NULL; + +error: + pivot_dimension_destroy (out); + return error; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_stroke (uint32_t stroke_type, enum table_stroke *strokep) +{ + enum table_stroke strokes[] = { + TABLE_STROKE_NONE, + TABLE_STROKE_SOLID, + TABLE_STROKE_DASHED, + TABLE_STROKE_THICK, + TABLE_STROKE_THIN, + TABLE_STROKE_DOUBLE, + }; + + if (stroke_type >= sizeof strokes / sizeof *strokes) + return xasprintf ("bad stroke %"PRIu32, stroke_type); + + *strokep = strokes[stroke_type]; + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_border (const struct spvlb_border *in, struct pivot_table *table) + +{ + if (in->border_type >= PIVOT_N_BORDERS) + return xasprintf ("bad border type %"PRIu32, in->border_type); + + struct table_border_style *out = &table->borders[in->border_type]; + out->color = decode_spvlb_color_u32 (in->color); + return decode_spvlb_stroke (in->stroke_type, &out->stroke); +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_axis (const uint32_t *dimension_indexes, size_t n_dimensions, + enum pivot_axis_type axis_type, struct pivot_table *table) +{ + struct pivot_axis *axis = &table->axes[axis_type]; + axis->dimensions = xcalloc (n_dimensions, sizeof *axis->dimensions); + axis->n_dimensions = n_dimensions; + axis->extent = 1; + for (size_t i = 0; i < n_dimensions; i++) + { + uint32_t idx = dimension_indexes[i]; + if (idx >= table->n_dimensions) + return xasprintf ("bad dimension index %"PRIu32" >= %zu", + idx, table->n_dimensions); + + struct pivot_dimension *d = table->dimensions[idx]; + if 
(d->level != UINT_MAX) + return xasprintf ("duplicate dimension %"PRIu32, idx); + + axis->dimensions[i] = d; + d->axis_type = axis_type; + d->level = i; + + axis->extent *= d->n_leaves; + } + + return NULL; +} + +static char * +decode_data_index (uint64_t in, const struct pivot_table *table, + size_t *out) +{ + uint64_t remainder = in; + for (size_t i = table->n_dimensions - 1; i > 0; i--) + { + const struct pivot_dimension *d = table->dimensions[i]; + if (d->n_leaves) + { + out[i] = remainder % d->n_leaves; + remainder /= d->n_leaves; + } + else + out[i] = 0; + } + if (remainder >= table->dimensions[0]->n_leaves) + return xasprintf ("out of range cell data index %"PRIu64, in); + + out[0] = remainder; + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_cells (struct spvlb_cell **in, size_t n_in, + struct pivot_table *table, const char *encoding) +{ + if (!table->n_dimensions) + return NULL; + + size_t *dindexes = xnmalloc (table->n_dimensions, sizeof *dindexes); + for (size_t i = 0; i < n_in; i++) + { + struct pivot_value *value; + char *error = decode_data_index (in[i]->index, table, dindexes); + if (!error) + error = decode_spvlb_value (table, in[i]->value, encoding, &value); + if (error) + { + free (dindexes); + return error; + } + pivot_table_put (table, dindexes, table->n_dimensions, value); + } + free (dindexes); + + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_spvlb_footnote (const struct spvlb_footnote *in, const char *encoding, + size_t idx, struct pivot_table *table) +{ + struct pivot_value *content; + char *error = decode_spvlb_value (table, in->text, encoding, &content); + if (error) + return error; + + struct pivot_value *marker = NULL; + if (in->marker) + { + error = decode_spvlb_value (table, in->marker, encoding, &marker); + if (error) + { + pivot_value_destroy (content); + return error; + } + if (marker->type == PIVOT_VALUE_TEXT) + marker->text.user_provided = false; + } + + struct pivot_footnote *f = 
pivot_table_create_footnote__ ( + table, idx, marker, content); + f->show = (int32_t) in->show > 0; + return NULL; +} + +static char * WARN_UNUSED_RESULT +decode_current_layer (uint64_t current_layer, struct pivot_table *table) +{ + const struct pivot_axis *axis = &table->axes[PIVOT_AXIS_LAYER]; + table->current_layer = xnmalloc (axis->n_dimensions, + sizeof *table->current_layer); + + for (size_t i = 0; i < axis->n_dimensions; i++) + { + const struct pivot_dimension *d = axis->dimensions[i]; + if (d->n_leaves) + { + table->current_layer[i] = current_layer % d->n_leaves; + current_layer /= d->n_leaves; + } + else + table->current_layer[i] = 0; + } + if (current_layer > 0) + return xasprintf ("out of range layer data index %"PRIu64, current_layer); + return NULL; +} + +char * WARN_UNUSED_RESULT +decode_spvlb_table (const struct spvlb_table *in, struct pivot_table **outp) +{ + *outp = NULL; + if (in->header->version != 1 && in->header->version != 3) + return xasprintf ("unknown version %"PRIu32" (expected 1 or 3)", + in->header->version); + + char *error = NULL; + struct pivot_table *out = xzalloc (sizeof *out); + out->ref_cnt = 1; + hmap_init (&out->cells); + + const struct spvlb_y1 *y1 = (in->formats->x0 ? in->formats->x0->y1 + : in->formats->x3 ? in->formats->x3->y1 + : NULL); + const char *encoding; + if (y1) + encoding = y1->charset; + else + { + const char *dot = strchr (in->formats->locale, '.'); + encoding = dot ? dot + 1 : "windows-1252"; + } + + /* Display settings. 
*/ + out->show_numeric_markers = !in->ts->show_alphabetic_markers; + out->rotate_inner_column_labels = in->header->rotate_inner_column_labels; + out->rotate_outer_row_labels = in->header->rotate_outer_row_labels; + out->row_labels_in_corner = in->ts->show_row_labels_in_corner; + out->show_grid_lines = in->borders->show_grid_lines; + out->show_caption = true; + out->footnote_marker_superscripts = in->ts->footnote_marker_superscripts; + out->omit_empty = in->ts->omit_empty; + + const struct spvlb_x1 *x1 = in->formats->x1; + if (x1) + { + error = decode_spvlb_value_show (x1->show_values, &out->show_values); + if (!error) + error = decode_spvlb_value_show (x1->show_variables, + &out->show_variables); + if (error) + goto error; + + out->show_caption = x1->show_caption; + } + + /* Column and row display settings. */ + out->sizing[TABLE_VERT].range[0] = in->header->min_row_height; + out->sizing[TABLE_VERT].range[1] = in->header->max_row_height; + out->sizing[TABLE_HORZ].range[0] = in->header->min_col_width; + out->sizing[TABLE_HORZ].range[1] = in->header->max_col_width; + + convert_widths (in->formats->widths, in->formats->n_widths, + &out->sizing[TABLE_HORZ].widths, + &out->sizing[TABLE_HORZ].n_widths); + + const struct spvlb_x2 *x2 = in->formats->x2; + if (x2) + convert_widths (x2->row_heights, x2->n_row_heights, + &out->sizing[TABLE_VERT].widths, + &out->sizing[TABLE_VERT].n_widths); + + convert_breakpoints (in->ts->row_breaks, + &out->sizing[TABLE_VERT].breaks, + &out->sizing[TABLE_VERT].n_breaks); + convert_breakpoints (in->ts->col_breaks, + &out->sizing[TABLE_HORZ].breaks, + &out->sizing[TABLE_HORZ].n_breaks); + + convert_keeps (in->ts->row_keeps, + &out->sizing[TABLE_VERT].keeps, + &out->sizing[TABLE_VERT].n_keeps); + convert_keeps (in->ts->col_keeps, + &out->sizing[TABLE_HORZ].keeps, + &out->sizing[TABLE_HORZ].n_keeps); + + out->notes = to_utf8_if_nonempty (in->ts->notes, encoding); + out->table_look = to_utf8_if_nonempty (in->ts->table_look, encoding); + + /* 
Print settings. */ + out->print_all_layers = in->ps->all_layers; + out->paginate_layers = in->ps->paginate_layers; + out->shrink_to_fit[TABLE_HORZ] = in->ps->fit_width; + out->shrink_to_fit[TABLE_VERT] = in->ps->fit_length; + out->top_continuation = in->ps->top_continuation; + out->bottom_continuation = in->ps->bottom_continuation; + out->continuation = xstrdup (in->ps->continuation_string); + out->n_orphan_lines = in->ps->n_orphan_lines; + + /* Format settings. */ + out->epoch = in->formats->y0->epoch; + out->decimal = in->formats->y0->decimal; + out->grouping = in->formats->y0->grouping; + const struct spvlb_custom_currency *cc = in->formats->custom_currency; + for (int i = 0; i < 5; i++) + if (cc && i < cc->n_ccs) + out->ccs[i] = xstrdup (cc->ccs[i]); + out->small = in->formats->x3 ? in->formats->x3->small : 0; + + /* Command information. */ + if (y1) + { + out->command_local = to_utf8 (y1->command_local, encoding); + out->command_c = to_utf8 (y1->command, encoding); + out->language = xstrdup (y1->language); + /* charset? */ + out->locale = xstrdup (y1->locale); + } + + /* Source information. */ + const struct spvlb_x3 *x3 = in->formats->x3; + if (x3) + { + if (x3->dataset && x3->dataset[0] && x3->dataset[0] != 4) + out->dataset = to_utf8 (x3->dataset, encoding); + out->datafile = to_utf8_if_nonempty (x3->datafile, encoding); + out->date = x3->date; + } + + /* Footnotes. + + Any pivot_value might refer to footnotes, so it's important to process the + footnotes early to ensure that those references can be resolved. There is + a possible problem that a footnote might itself reference an + as-yet-unprocessed footnote, but that's OK because footnote references + don't actually look at the footnote contents but only resolve a pointer to + where the footnote will go later. + + Before we really start, create all the footnotes we'll fill in. This is + because sometimes footnotes refer to themselves or to each other and we + don't want to reject those references. 
*/ + const struct spvlb_footnotes *fn = in->footnotes; + if (fn->n_footnotes > 0) + { + pivot_table_create_footnote__ (out, fn->n_footnotes - 1, NULL, NULL); + for (size_t i = 0; i < fn->n_footnotes; i++) + { + error = decode_spvlb_footnote (in->footnotes->footnotes[i], + encoding, i, out); + if (error) + goto error; + } + } + + /* Title and caption. */ + error = decode_spvlb_value (out, in->titles->user_title, encoding, + &out->title); + if (error) + goto error; + + error = decode_spvlb_value (out, in->titles->subtype, encoding, + &out->subtype); + if (error) + goto error; + + if (in->titles->corner_text) + { + error = decode_spvlb_value (out, in->titles->corner_text, + encoding, &out->corner_text); + if (error) + goto error; + } + + if (in->titles->caption) + { + error = decode_spvlb_value (out, in->titles->caption, encoding, + &out->caption); + if (error) + goto error; + } + + + /* Styles. */ + for (size_t i = 0; i < PIVOT_N_AREAS; i++) + { + error = decode_spvlb_area (in->areas->areas[i], &out->areas[i], + encoding); + if (error) + goto error; + } + for (size_t i = 0; i < PIVOT_N_BORDERS; i++) + { + error = decode_spvlb_border (in->borders->borders[i], out); + if (error) + goto error; + } + + /* Dimensions. */ + out->n_dimensions = in->dimensions->n_dims; + out->dimensions = xcalloc (out->n_dimensions, sizeof *out->dimensions); + for (size_t i = 0; i < out->n_dimensions; i++) + { + error = decode_spvlb_dimension (out, in->dimensions->dims[i], + i, encoding, &out->dimensions[i]); + if (error) + goto error; + } + + /* Axes. 
*/ + size_t a = in->axes->n_layers; + size_t b = in->axes->n_rows; + size_t c = in->axes->n_columns; + if (size_overflow_p (xsum3 (a, b, c)) || a + b + c != out->n_dimensions) + { + error = xasprintf ("dimensions do not sum correctly " + "(%zu + %zu + %zu != %zu)", + a, b, c, out->n_dimensions); + goto error; + } + error = decode_spvlb_axis (in->axes->layers, in->axes->n_layers, + PIVOT_AXIS_LAYER, out); + if (error) + goto error; + error = decode_spvlb_axis (in->axes->rows, in->axes->n_rows, + PIVOT_AXIS_ROW, out); + if (error) + goto error; + error = decode_spvlb_axis (in->axes->columns, in->axes->n_columns, + PIVOT_AXIS_COLUMN, out); + if (error) + goto error; + + pivot_table_assign_label_depth (out); + + error = decode_current_layer (in->ts->current_layer, out); + if (error) + goto error; + + /* Data. */ + error = decode_spvlb_cells (in->cells->cells, in->cells->n_cells, out, + encoding); + if (error) + goto error; + + *outp = out; + return NULL; + +error: + pivot_table_unref (out); + return error; +} diff --git a/src/output/spv/spv-light-decoder.h b/src/output/spv/spv-light-decoder.h new file mode 100644 index 0000000000..158707b559 --- /dev/null +++ b/src/output/spv/spv-light-decoder.h @@ -0,0 +1,33 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>.
*/ + +#ifndef OUTPUT_SPV_LIGHT_DECODER_H +#define OUTPUT_SPV_LIGHT_DECODER_H 1 + +/* SPSS Viewer (SPV) light binary decoder. + + Used by spv.h, not useful directly. */ + +#include "libpspp/compiler.h" + +struct pivot_table; +struct spvlb_table; + +char *decode_spvlb_table (const struct spvlb_table *, + struct pivot_table **outp) + WARN_UNUSED_RESULT; + +#endif /* output/spv/spv-light-decoder.h */ diff --git a/src/output/spv/spv-output.c b/src/output/spv/spv-output.c new file mode 100644 index 0000000000..f83ff4542b --- /dev/null +++ b/src/output/spv/spv-output.c @@ -0,0 +1,51 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include <config.h> + +#include "output/spv/spv-output.h" + +#include "output/pivot-table.h" +#include "output/spv/spv.h" +#include "output/text-item.h" + +#include "gl/xalloc.h" + +void +spv_text_submit (const struct spv_item *in) +{ + enum spv_item_class class = spv_item_get_class (in); + enum text_item_type type + = (class == SPV_CLASS_HEADINGS ? TEXT_ITEM_TITLE + : class == SPV_CLASS_PAGETITLE ?
TEXT_ITEM_PAGE_TITLE + : TEXT_ITEM_LOG); + const struct pivot_value *value = spv_item_get_text (in); + char *text = pivot_value_to_string (value, SETTINGS_VALUE_SHOW_DEFAULT, + SETTINGS_VALUE_SHOW_DEFAULT); + struct text_item *item = text_item_create_nocopy (type, text); + const struct font_style *font = value->font_style; + if (font) + { + item->bold = font->bold; + item->italic = font->italic; + item->underline = font->underline; + item->markup = font->markup; + if (font->typeface) + item->typeface = xstrdup (font->typeface); + item->size = font->size; + } + text_item_submit (item); +} diff --git a/src/output/spv/spv-output.h b/src/output/spv/spv-output.h new file mode 100644 index 0000000000..b100c0cef7 --- /dev/null +++ b/src/output/spv/spv-output.h @@ -0,0 +1,26 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#ifndef OUTPUT_SPV_OUTPUT_H +#define OUTPUT_SPV_OUTPUT_H 1 + +/* Interface between SPVs and the PSPP output engine. */ + +struct spv_item; + +void spv_text_submit (const struct spv_item *); + +#endif /* output/spv/spv-output.h */ diff --git a/src/output/spv/spv-select.c b/src/output/spv/spv-select.c new file mode 100644 index 0000000000..dacd53ec91 --- /dev/null +++ b/src/output/spv/spv-select.c @@ -0,0 +1,219 @@ +/* PSPP - a program for statistical analysis.
+ Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include <config.h> + +#include "spv-select.h" + +#include + +#include "libpspp/assertion.h" +#include "libpspp/bit-vector.h" +#include "output/spv/spv.h" + +#include "gl/c-ctype.h" +#include "gl/xalloc.h" + +static struct spv_item * +find_command_item (struct spv_item *item) +{ + /* A command item itself does not have a command item. */ + if (!item->parent || !item->parent->parent) + return NULL; + + do + { + item = item->parent; + } + while (item->parent && item->parent->parent); + return item; +} + +static bool +string_matches (const char *pattern, const char *s) +{ + /* XXX This should be a Unicode case insensitive comparison.
*/ + while (c_tolower (*pattern) == c_tolower (*s)) + { + if (*pattern == '\0') + return true; + + pattern++; + s++; + } + + return pattern[0] == '*' && pattern[1] == '\0'; +} + +static int +string_array_matches (const char *name, const struct string_array *array) +{ + if (!array->n) + return -1; + else if (!name) + return false; + + for (size_t i = 0; i < array->n; i++) + if (string_matches (array->strings[i], name)) + return true; + + return false; +} + +static bool +match (const char *name, + const struct string_array *white, + const struct string_array *black) +{ + return (string_array_matches (name, white) != false + && string_array_matches (name, black) != true); +} + +static int +match_instance (const int *instances, size_t n_instances, + int instance_within_command) +{ + int retval = false; + for (size_t i = 0; i < n_instances; i++) + { + if (instances[i] == instance_within_command) + return true; + else if (instances[i] == -1) + retval = -1; + } + return retval; +} + +static void +select_matches (const struct spv_reader *spv, const struct spv_criteria *c, + unsigned long int *include) +{ + struct spv_item *item; + struct spv_item *command_item = NULL; + int instance_within_command = 0; + int last_instance = -1; + ssize_t index = -1; + SPV_ITEM_FOR_EACH_SKIP_ROOT (item, spv_get_root (spv)) + { + index++; + + struct spv_item *new_command_item = find_command_item (item); + if (new_command_item != command_item) + { + if (last_instance >= 0) + { + bitvector_set1 (include, last_instance); + last_instance = -1; + } + + command_item = new_command_item; + instance_within_command = 0; + } + + if (!((1u << spv_item_get_class (item)) & c->classes)) + continue; + + if (!c->include_hidden && !spv_item_is_visible (item)) + continue; + + if (c->error) + { + spv_item_load (item); + if (!item->error) + continue; + } + + if (!match (spv_item_get_command_id (item), + &c->include.commands, &c->exclude.commands)) + continue; + + if (!match (spv_item_get_subtype (item), + 
&c->include.subtypes, &c->exclude.subtypes)) + continue; + + if (!match (spv_item_get_label (item), + &c->include.labels, &c->exclude.labels)) + continue; + + if (c->members.n + && !((item->xml_member + && string_array_matches (item->xml_member, &c->members)) || + (item->bin_member + && string_array_matches (item->bin_member, &c->members)))) + continue; + + if (c->n_instances) + { + if (!command_item) + continue; + instance_within_command++; + + int include_instance = match_instance (c->instances, c->n_instances, + instance_within_command); + if (!include_instance) + continue; + else if (include_instance < 0) + { + last_instance = index; + continue; + } + } + + bitvector_set1 (include, index); + } + + if (last_instance >= 0) + bitvector_set1 (include, last_instance); +} + +void +spv_select (const struct spv_reader *spv, + const struct spv_criteria c[], size_t nc, + struct spv_item ***itemsp, size_t *n_itemsp) +{ + struct spv_item *item; + + struct spv_criteria default_criteria = SPV_CRITERIA_INITIALIZER; + if (!nc) + { + nc = 1; + c = &default_criteria; + } + + /* Count items. */ + size_t max_items = 0; + SPV_ITEM_FOR_EACH_SKIP_ROOT (item, spv_get_root (spv)) + max_items++; + + /* Allocate bitmap for items then fill it in with selected items. */ + unsigned long int *include = bitvector_allocate (max_items); + for (size_t i = 0; i < nc; i++) + select_matches (spv, &c[i], include); + + /* Copy selected items into output array. */ + size_t n_items = 0; + struct spv_item **items = xnmalloc (bitvector_count (include, max_items), + sizeof *items); + size_t i = 0; + SPV_ITEM_FOR_EACH_SKIP_ROOT (item, spv_get_root (spv)) + if (bitvector_is_set (include, i++)) + items[n_items++] = item; + *itemsp = items; + *n_itemsp = n_items; + + /* Free memory. 
*/ + free (include); +} diff --git a/src/output/spv/spv-select.h b/src/output/spv/spv-select.h new file mode 100644 index 0000000000..b8bcd98e4e --- /dev/null +++ b/src/output/spv/spv-select.h @@ -0,0 +1,73 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2019 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#ifndef OUTPUT_SPV_SELECT_H +#define OUTPUT_SPV_SELECT_H 1 + +#include "libpspp/string-array.h" + +struct spv_item; +struct spv_reader; + +/* Matching criteria for commands, subtypes, and labels. + + Each of the members is an array of strings. A string that ends in '*' + matches anything that begins with the rest of the string, otherwise a string + requires an exact (case-insensitive) match. */ +struct spv_criteria_match + { + struct string_array commands; + struct string_array subtypes; + struct string_array labels; + }; + +struct spv_criteria + { + /* Include objects that are not visible? */ + bool include_hidden; + + /* If false, include all objects. + If true, include only objects that have an error on loading. */ + bool error; + + /* Bit-mask of SPV_CLASS_* for the classes to include. */ + unsigned int classes; + + /* Include all of the objects that match 'include' and do not match + 'exclude', except that objects are included by default if 'include' is + empty.
*/ + struct spv_criteria_match include; + struct spv_criteria_match exclude; + + /* Include XML and binary member names that match (except that everything + is included by default if empty). */ + struct string_array members; + + /* Include the objects with indexes listed in INSTANCES within each of the + commands that are included. Indexes are 1-based. Index -1 means the + last object within a command. */ + int *instances; + size_t n_instances; + }; + +#define SPV_CRITERIA_INITIALIZER { .classes = SPV_ALL_CLASSES } + +void spv_select (const struct spv_reader *, + const struct spv_criteria[], size_t n_criteria, + struct spv_item ***items, size_t *n_items); + + +#endif /* output/spv/spv-select.h */ diff --git a/src/output/spv/spv-writer.c b/src/output/spv/spv-writer.c new file mode 100644 index 0000000000..1cba64bd8b --- /dev/null +++ b/src/output/spv/spv-writer.c @@ -0,0 +1,1024 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2019 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>.
*/ + +#include <config.h> + +#include "output/spv/spv-writer.h" + +#include +#include +#include +#include +#include + +#include "libpspp/array.h" +#include "libpspp/assertion.h" +#include "libpspp/cast.h" +#include "libpspp/float-format.h" +#include "libpspp/integer-format.h" +#include "libpspp/temp-file.h" +#include "libpspp/version.h" +#include "libpspp/zip-writer.h" +#include "output/page-setup-item.h" +#include "output/pivot-table.h" +#include "output/text-item.h" + +#include "gl/xalloc.h" +#include "gl/xvasprintf.h" + +#include "gettext.h" +#define _(msgid) gettext (msgid) +#define N_(msgid) (msgid) + +struct spv_writer + { + struct zip_writer *zw; + + FILE *heading; + int heading_depth; + xmlTextWriter *xml; + + int n_tables; + + int n_headings; + struct page_setup *page_setup; + bool need_page_break; + }; + +char * WARN_UNUSED_RESULT +spv_writer_open (const char *filename, struct spv_writer **writerp) +{ + *writerp = NULL; + + struct zip_writer *zw = zip_writer_create (filename); + if (!zw) + return xasprintf (_("%s: create failed"), filename); + + struct spv_writer *w = xmalloc (sizeof *w); + *w = (struct spv_writer) { .zw = zw }; + *writerp = w; + return NULL; +} + +char * WARN_UNUSED_RESULT +spv_writer_close (struct spv_writer *w) +{ + if (!w) + return NULL; + + zip_writer_add_string (w->zw, "META-INF/MANIFEST.MF", "allowPivoting=true"); + + while (w->heading_depth) + spv_writer_close_heading (w); + + char *error = NULL; + if (!zip_writer_close (w->zw)) + error = xstrdup (_("I/O error writing SPV file")); + + page_setup_destroy (w->page_setup); + free (w); + return error; +} + +void +spv_writer_set_page_setup (struct spv_writer *w, + const struct page_setup *page_setup) +{ + page_setup_destroy (w->page_setup); + w->page_setup = page_setup_clone (page_setup); +} + +static void +write_attr (struct spv_writer *w, const char *name, const char *value) +{ + xmlTextWriterWriteAttribute (w->xml, + CHAR_CAST (xmlChar *, name), + CHAR_CAST (xmlChar *, value)); +} + +static
void PRINTF_FORMAT (3, 4) +write_attr_format (struct spv_writer *w, const char *name, + const char *format, ...) +{ + va_list args; + va_start (args, format); + char *value = xvasprintf (format, args); + va_end (args); + + write_attr (w, name, value); + free (value); +} + +static void +start_elem (struct spv_writer *w, const char *name) +{ + xmlTextWriterStartElement (w->xml, CHAR_CAST (xmlChar *, name)); +} + +static void +end_elem (struct spv_writer *w) +{ + xmlTextWriterEndElement (w->xml); +} + +static void +write_text (struct spv_writer *w, const char *text) +{ + xmlTextWriterWriteString (w->xml, CHAR_CAST (xmlChar *, text)); +} + +static void +write_page_heading (struct spv_writer *w, const struct page_heading *h, + const char *name) +{ + start_elem (w, name); + if (h->n) + { + start_elem (w, "pageParagraph"); + for (size_t i = 0; i < h->n; i++) + { + start_elem (w, "text"); + write_attr (w, "type", "title"); + write_text (w, h->paragraphs[i].markup); /* XXX */ + end_elem (w); + } + end_elem (w); + } + end_elem (w); +} + +static void +write_page_setup (struct spv_writer *w, const struct page_setup *ps) +{ + start_elem (w, "pageSetup"); + write_attr_format (w, "initial-page-number", "%d", ps->initial_page_number); + write_attr (w, "chart-size", + (ps->chart_size == PAGE_CHART_AS_IS ? "as-is" + : ps->chart_size == PAGE_CHART_FULL_HEIGHT ? "full-height" + : ps->chart_size == PAGE_CHART_HALF_HEIGHT ? 
"half-height" + : "quarter-height")); + write_attr_format (w, "margin-left", "%.2fin", ps->margins[TABLE_HORZ][0]); + write_attr_format (w, "margin-right", "%.2fin", ps->margins[TABLE_HORZ][1]); + write_attr_format (w, "margin-top", "%.2fin", ps->margins[TABLE_VERT][0]); + write_attr_format (w, "margin-bottom", "%.2fin", ps->margins[TABLE_VERT][1]); + write_attr_format (w, "paper-height", "%.2fin", ps->paper[TABLE_VERT]); + write_attr_format (w, "paper-width", "%.2fin", ps->paper[TABLE_HORZ]); + write_attr (w, "reference-orientation", + ps->orientation == PAGE_PORTRAIT ? "portrait" : "landscape"); + write_attr_format (w, "space-after", "%.1fpt", ps->object_spacing * 72.0); + write_page_heading (w, &ps->headings[0], "pageHeader"); + write_page_heading (w, &ps->headings[1], "pageFooter"); + end_elem (w); +} + +static bool +spv_writer_open_file (struct spv_writer *w) +{ + w->heading = create_temp_file (); + if (!w->heading) + return false; + + w->xml = xmlNewTextWriter (xmlOutputBufferCreateFile (w->heading, NULL)); + xmlTextWriterStartDocument (w->xml, NULL, "UTF-8", NULL); + start_elem (w, "heading"); + + time_t t = time (NULL); + struct tm *tm = gmtime (&t); + char *tm_s = asctime (tm); + write_attr (w, "creation-date-time", tm_s); + + write_attr (w, "creator", version); + + write_attr (w, "creator-version", "21"); + + write_attr (w, "xmlns", "http://xml.spss.com/spss/viewer/viewer-tree"); + write_attr (w, "xmlns:vps", "http://xml.spss.com/spss/viewer/viewer-pagesetup"); + write_attr (w, "xmlns:vtx", "http://xml.spss.com/spss/viewer/viewer-text"); + write_attr (w, "xmlns:vtb", "http://xml.spss.com/spss/viewer/viewer-table"); + + start_elem (w, "label"); + write_text (w, _("Output")); + end_elem (w); + + if (w->page_setup) + { + write_page_setup (w, w->page_setup); + + page_setup_destroy (w->page_setup); + w->page_setup = NULL; + } + + return true; +} + +void +spv_writer_open_heading (struct spv_writer *w, const char *command_id, + const char *label) +{ + if 
(!w->heading) + { + if (!spv_writer_open_file (w)) + return; + } + + w->heading_depth++; + start_elem (w, "heading"); + write_attr (w, "commandName", command_id); + /* XXX locale */ + /* XXX olang */ + + start_elem (w, "label"); + write_text (w, label); + end_elem (w); +} + +static void +spv_writer_close_file (struct spv_writer *w, const char *infix) +{ + if (!w->heading) + return; + + end_elem (w); + xmlTextWriterEndDocument (w->xml); + xmlFreeTextWriter (w->xml); + + char *member_name = xasprintf ("outputViewer%010d%s.xml", + w->n_headings++, infix); + zip_writer_add (w->zw, w->heading, member_name); + free (member_name); + + w->heading = NULL; +} + +void +spv_writer_close_heading (struct spv_writer *w) +{ + const char *infix = ""; + if (w->heading_depth) + { + infix = "_heading"; + end_elem (w); + w->heading_depth--; + } + + if (!w->heading_depth) + spv_writer_close_file (w, infix); +} + +static void +start_container (struct spv_writer *w) +{ + start_elem (w, "container"); + write_attr (w, "visibility", "visible"); + if (w->need_page_break) + { + write_attr (w, "page-break-before", "always"); + w->need_page_break = false; + } +} + +void +spv_writer_put_text (struct spv_writer *w, const struct text_item *text, + const char *command_id) +{ + if (text->type == TEXT_ITEM_EJECT_PAGE) + w->need_page_break = true; + + bool initial_depth = w->heading_depth; + if (!initial_depth && !spv_writer_open_file (w)) + return; + + start_container (w); + + start_elem (w, "label"); + write_text (w, (text->type == TEXT_ITEM_TITLE ? "Title" + : text->type == TEXT_ITEM_PAGE_TITLE ? "Page Title" + : "Log")); + end_elem (w); + + start_elem (w, "vtx:text"); + write_attr (w, "type", (text->type == TEXT_ITEM_TITLE ? "title" + : text->type == TEXT_ITEM_PAGE_TITLE ?
"page-title" + : "log")); + if (command_id) + write_attr (w, "commandName", command_id); + + start_elem (w, "html"); + write_text (w, text->text); /* XXX */ + end_elem (w); /* html */ + end_elem (w); /* vtx:text */ + end_elem (w); /* container */ + + if (!initial_depth) + spv_writer_close_file (w, ""); +} + +#define H TABLE_HORZ +#define V TABLE_VERT + +struct buf + { + uint8_t *data; + size_t len; + size_t allocated; + }; + +static uint8_t * +put_uninit (struct buf *b, size_t n) +{ + while (b->allocated - b->len < n) + b->data = x2nrealloc (b->data, &b->allocated, sizeof *b->data); + uint8_t *p = &b->data[b->len]; + b->len += n; + return p; +} + +static void +put_byte (struct buf *b, uint8_t byte) +{ + *put_uninit (b, 1) = byte; +} + +static void +put_bool (struct buf *b, bool boolean) +{ + put_byte (b, boolean); +} + +static void +put_bytes (struct buf *b, const char *bytes, size_t n) +{ + memcpy (put_uninit (b, n), bytes, n); +} + +static void +put_u16 (struct buf *b, uint16_t x) +{ + put_uint16 (native_to_le16 (x), put_uninit (b, sizeof x)); +} + +static void +put_u32 (struct buf *b, uint32_t x) +{ + put_uint32 (native_to_le32 (x), put_uninit (b, sizeof x)); +} + +static void +put_u64 (struct buf *b, uint64_t x) +{ + put_uint64 (native_to_le64 (x), put_uninit (b, sizeof x)); +} + +static void +put_be32 (struct buf *b, uint32_t x) +{ + put_uint32 (native_to_be32 (x), put_uninit (b, sizeof x)); +} + +static void +put_double (struct buf *b, double x) +{ + float_convert (FLOAT_NATIVE_DOUBLE, &x, + FLOAT_IEEE_DOUBLE_LE, put_uninit (b, 8)); +} + +static void +put_float (struct buf *b, float x) +{ + float_convert (FLOAT_NATIVE_FLOAT, &x, + FLOAT_IEEE_SINGLE_LE, put_uninit (b, 4)); +} + +static void +put_string (struct buf *b, const char *s_) +{ + const char *s = s_ ? s_ : ""; + size_t len = strlen (s); + put_u32 (b, len); + memcpy (put_uninit (b, len), s, len); +} + +static void +put_bestring (struct buf *b, const char *s_) +{ + const char *s = s_ ?
s_ : ""; + size_t len = strlen (s); + put_be32 (b, len); + memcpy (put_uninit (b, len), s, len); +} + +static size_t +start_count (struct buf *b) +{ + put_u32 (b, 0); + return b->len; +} + +static void +end_count_u32 (struct buf *b, size_t start) +{ + put_uint32 (native_to_le32 (b->len - start), &b->data[start - 4]); +} + +static void +end_count_be32 (struct buf *b, size_t start) +{ + put_uint32 (native_to_be32 (b->len - start), &b->data[start - 4]); +} + +static void +put_color (struct buf *buf, const struct cell_color *color) +{ + char *s = xasprintf ("#%02"PRIx8"%02"PRIx8"%02"PRIx8, + color->r, color->g, color->b); + put_string (buf, s); + free (s); +} + +static void +put_font_style (struct buf *buf, const struct font_style *font_style) +{ + put_bool (buf, font_style->bold); + put_bool (buf, font_style->italic); + put_bool (buf, font_style->underline); + put_bool (buf, 1); + put_color (buf, &font_style->fg[0]); + put_color (buf, &font_style->bg[0]); + put_string (buf, font_style->typeface ? font_style->typeface : "SansSerif"); + put_byte (buf, ceil (font_style->size * 1.33)); +} + +static void +put_halign (struct buf *buf, enum table_halign halign, + uint32_t mixed, uint32_t decimal) +{ + put_u32 (buf, (halign == TABLE_HALIGN_RIGHT ? 4 + : halign == TABLE_HALIGN_LEFT ? 2 + : halign == TABLE_HALIGN_CENTER ? 0 + : halign == TABLE_HALIGN_MIXED ? mixed + : decimal)); +} + +static void +put_valign (struct buf *buf, enum table_valign valign) +{ + put_u32 (buf, (valign == TABLE_VALIGN_TOP ? 1 + : valign == TABLE_VALIGN_CENTER ? 
0 + : 3)); +} + +static void +put_cell_style (struct buf *buf, const struct cell_style *cell_style) +{ + put_halign (buf, cell_style->halign, 0xffffffad, 6); + put_valign (buf, cell_style->valign); + put_double (buf, cell_style->decimal_offset); + put_u16 (buf, cell_style->margin[H][0]); + put_u16 (buf, cell_style->margin[H][1]); + put_u16 (buf, cell_style->margin[V][0]); + put_u16 (buf, cell_style->margin[V][1]); +} + +static void UNUSED +put_style_pair (struct buf *buf, const struct font_style *font_style, + const struct cell_style *cell_style) +{ + if (font_style) + { + put_byte (buf, 0x31); + put_font_style (buf, font_style); + } + else + put_byte (buf, 0x58); + + if (cell_style) + { + put_byte (buf, 0x31); + put_cell_style (buf, cell_style); + } + else + put_byte (buf, 0x58); +} + +static void +put_value_mod (struct buf *buf, const struct pivot_value *value, + const char *template) +{ + if (value->n_footnotes || value->n_subscripts + || template || value->font_style || value->cell_style) + { + put_byte (buf, 0x31); + + /* Footnotes. */ + put_u32 (buf, value->n_footnotes); + for (size_t i = 0; i < value->n_footnotes; i++) + put_u16 (buf, value->footnotes[i]->idx); + + /* Subscripts. */ + put_u32 (buf, value->n_subscripts); + for (size_t i = 0; i < value->n_subscripts; i++) + put_string (buf, value->subscripts[i]); + + /* Template and style. 
*/ + uint32_t v3_start = start_count (buf); + uint32_t template_string_start = start_count (buf); + if (template) + { + uint32_t inner_start = start_count (buf); + end_count_u32 (buf, inner_start); + + put_byte (buf, 0x31); + put_string (buf, template); + } + end_count_u32 (buf, template_string_start); + put_style_pair (buf, value->font_style, value->cell_style); + end_count_u32 (buf, v3_start); + } + else + put_byte (buf, 0x58); +} + +static void +put_format (struct buf *buf, const struct fmt_spec *f) +{ + put_u32 (buf, (fmt_to_io (f->type) << 16) | (f->w << 8) | f->d); +} + +static int +show_values_to_spvlb (enum settings_value_show show) +{ + return (show == SETTINGS_VALUE_SHOW_DEFAULT ? 0 + : show == SETTINGS_VALUE_SHOW_VALUE ? 1 + : show == SETTINGS_VALUE_SHOW_LABEL ? 2 + : 3); +} + +static void +put_show_values (struct buf *buf, enum settings_value_show show) +{ + put_byte (buf, show_values_to_spvlb (show)); +} + +static void +put_value (struct buf *buf, const struct pivot_value *value) +{ + switch (value->type) + { + case PIVOT_VALUE_NUMERIC: + if (value->numeric.var_name || value->numeric.value_label) + { + put_byte (buf, 2); + put_value_mod (buf, value, NULL); + put_format (buf, &value->numeric.format); + put_double (buf, value->numeric.x); + put_string (buf, value->numeric.var_name); + put_string (buf, value->numeric.value_label); + put_show_values (buf, value->numeric.show); + } + else + { + put_byte (buf, 1); + put_value_mod (buf, value, NULL); + put_format (buf, &value->numeric.format); + put_double (buf, value->numeric.x); + } + break; + + case PIVOT_VALUE_STRING: + put_byte (buf, 4); + put_value_mod (buf, value, NULL); + put_format (buf, + &(struct fmt_spec) { FMT_A, strlen (value->string.s), 0 }); + put_string (buf, value->string.value_label); + put_string (buf, value->string.var_name); + put_show_values (buf, value->string.show); + put_string (buf, value->string.s); + break; + + case PIVOT_VALUE_VARIABLE: + put_byte (buf, 5); + put_value_mod (buf, 
value, NULL); + put_string (buf, value->variable.var_name); + put_string (buf, value->variable.var_label); + put_show_values (buf, value->variable.show); + break; + + case PIVOT_VALUE_TEXT: + put_byte (buf, 3); + put_string (buf, value->text.local); + put_value_mod (buf, value, NULL); + put_string (buf, value->text.id); + put_string (buf, value->text.c); + put_byte (buf, 1); /* XXX user-provided */ + break; + + case PIVOT_VALUE_TEMPLATE: + put_byte (buf, 0); + put_value_mod (buf, value, value->template.id); + put_string (buf, value->template.local); + put_u32 (buf, value->template.n_args); + for (size_t i = 0; i < value->template.n_args; i++) + { + const struct pivot_argument *arg = &value->template.args[i]; + assert (arg->n >= 1); + if (arg->n > 1) + { + put_u32 (buf, arg->n); + put_u32 (buf, 0); + for (size_t j = 0; j < arg->n; j++) + { + if (j > 0) + put_bytes (buf, "\0\0\0\0", 4); + put_value (buf, arg->values[j]); + } + } + else + { + put_u32 (buf, 0); + put_value (buf, arg->values[0]); + } + } + break; + + default: + NOT_REACHED (); + } +} + +static void +put_optional_value (struct buf *buf, const struct pivot_value *value) +{ + if (value) + { + put_byte (buf, 0x31); + put_value (buf, value); + } + else + put_byte (buf, 0x58); +} + +static void +put_category (struct buf *buf, const struct pivot_category *c) +{ + put_value (buf, c->name); + if (pivot_category_is_leaf (c)) + { + put_bytes (buf, "\0\0\0", 3); + put_u32 (buf, 2); + put_u32 (buf, c->data_index); + put_u32 (buf, 0); + } + else + { + put_bytes (buf, "\0\0\1", 3); + put_u32 (buf, 0); + put_u32 (buf, -1); + put_u32 (buf, c->n_subs); + for (size_t i = 0; i < c->n_subs; i++) + put_category (buf, c->subs[i]); + } +} + +static void +put_y0 (struct buf *buf, const struct pivot_table *table) +{ + put_u32 (buf, table->epoch); + put_byte (buf, table->decimal); + put_byte (buf, table->grouping); +} + +static void +put_custom_currency (struct buf *buf, const struct pivot_table *table) +{ + put_u32 (buf, 5); + 
for (int i = 0; i < 5; i++) + put_string (buf, table->ccs[i]); +} + +static void +put_x1 (struct buf *buf, const struct pivot_table *table) +{ + put_bytes (buf, "\0\1\0", 3); + put_byte (buf, 0); + put_show_values (buf, table->show_variables); + put_show_values (buf, table->show_values); + put_u32 (buf, -1); + put_u32 (buf, -1); + for (int i = 0; i < 17; i++) + put_byte (buf, 0); + put_bool (buf, false); + put_byte (buf, 1); +} + +static void +put_x2 (struct buf *buf) +{ + put_u32 (buf, 0); /* n-row-heights */ + put_u32 (buf, 0); /* n-style-map */ + put_u32 (buf, 0); /* n-styles */ + put_u32 (buf, 0); +} + +static void +put_x3 (struct buf *buf, const struct pivot_table *table) +{ + put_bytes (buf, "\1\0\4\0\0\0", 6); + put_string (buf, table->command_c); + put_string (buf, table->command_local); + put_string (buf, table->language); + put_string (buf, "UTF-8"); /* XXX */ + put_string (buf, table->locale); + put_bytes (buf, "\0\0\1\1", 4); + put_y0 (buf, table); + put_double (buf, table->small); + put_byte (buf, 1); + put_string (buf, table->dataset); + put_string (buf, table->datafile); + put_u32 (buf, 0); + put_u32 (buf, table->date); + put_u32 (buf, 0); + + /* Y2. */ + put_custom_currency (buf, table); + put_byte (buf, '.'); + put_bool (buf, 0); +} + +static void +put_light_table (struct buf *buf, uint64_t table_id, + const struct pivot_table *table) +{ + /* Header. */ + put_bytes (buf, "\1\0", 2); + put_u32 (buf, 3); + put_bool (buf, true); + put_bool (buf, false); + put_bool (buf, table->rotate_inner_column_labels); + put_bool (buf, table->rotate_outer_row_labels); + put_bool (buf, true); + put_u32 (buf, 0x15); + put_u32 (buf, table->sizing[H].range[0]); + put_u32 (buf, table->sizing[H].range[1]); + put_u32 (buf, table->sizing[V].range[0]); + put_u32 (buf, table->sizing[V].range[1]); + put_u64 (buf, table_id); + + /* Titles. 
*/ + put_value (buf, table->title); + put_value (buf, table->subtype); + put_optional_value (buf, table->title); + put_optional_value (buf, table->corner_text); + put_optional_value (buf, table->caption); + + /* Footnotes. */ + put_u32 (buf, table->n_footnotes); + for (size_t i = 0; i < table->n_footnotes; i++) + { + put_value (buf, table->footnotes[i]->content); + put_optional_value (buf, table->footnotes[i]->marker); + put_u32 (buf, 0); + } + + /* Areas. */ + for (size_t i = 0; i < PIVOT_N_AREAS; i++) + { + const struct area_style *a = &table->areas[i]; + put_byte (buf, i + 1); + put_byte (buf, 0x31); + put_string (buf, (a->font_style.typeface + ? a->font_style.typeface + : "SansSerif")); + put_float (buf, ceil (a->font_style.size * 1.33)); + put_u32 (buf, ((a->font_style.bold ? 1 : 0) + | (a->font_style.italic ? 2 : 0))); + put_bool (buf, a->font_style.underline); + put_halign (buf, a->cell_style.halign, 64173, 61453); + put_valign (buf, a->cell_style.valign); + + put_color (buf, &a->font_style.fg[0]); + put_color (buf, &a->font_style.bg[0]); + + bool alt + = (!cell_color_equal (&a->font_style.fg[0], &a->font_style.fg[1]) + || !cell_color_equal (&a->font_style.bg[0], &a->font_style.bg[1])); + put_bool (buf, alt); + if (alt) + { + put_color (buf, &a->font_style.fg[1]); + put_color (buf, &a->font_style.bg[1]); + } + else + { + put_string (buf, ""); + put_string (buf, ""); + } + + put_u32 (buf, a->cell_style.margin[H][0]); + put_u32 (buf, a->cell_style.margin[H][1]); + put_u32 (buf, a->cell_style.margin[V][0]); + put_u32 (buf, a->cell_style.margin[V][1]); + } + + /* Borders. */ + uint32_t borders_start = start_count (buf); + put_be32 (buf, 1); + put_be32 (buf, PIVOT_N_BORDERS); + for (size_t i = 0; i < PIVOT_N_BORDERS; i++) + { + const struct table_border_style *b = &table->borders[i]; + put_be32 (buf, i); + put_be32 (buf, (b->stroke == TABLE_STROKE_NONE ? 0 + : b->stroke == TABLE_STROKE_SOLID ? 1 + : b->stroke == TABLE_STROKE_DASHED ? 
2 + : b->stroke == TABLE_STROKE_THICK ? 3 + : b->stroke == TABLE_STROKE_THIN ? 4 + : 5)); + put_be32 (buf, ((b->color.alpha << 24) + | (b->color.r << 16) + | (b->color.g << 8) + | b->color.b)); + } + put_bool (buf, table->show_grid_lines); + put_bytes (buf, "\0\0\0", 3); + end_count_u32 (buf, borders_start); + + /* Print Settings. */ + uint32_t ps_start = start_count (buf); + put_be32 (buf, 1); + put_bool (buf, table->print_all_layers); + put_bool (buf, table->paginate_layers); + put_bool (buf, table->shrink_to_fit[H]); + put_bool (buf, table->shrink_to_fit[V]); + put_bool (buf, table->top_continuation); + put_bool (buf, table->bottom_continuation); + put_be32 (buf, table->n_orphan_lines); + put_bestring (buf, table->continuation); + end_count_u32 (buf, ps_start); + + /* Table Settings. */ + uint32_t ts_start = start_count (buf); + put_be32 (buf, 1); + put_be32 (buf, 4); + put_be32 (buf, 0); /* XXX current_layer */ + put_bool (buf, table->omit_empty); + put_bool (buf, table->row_labels_in_corner); + put_bool (buf, !table->show_numeric_markers); + put_bool (buf, table->footnote_marker_superscripts); + put_byte (buf, 0); + uint32_t keep_start = start_count (buf); + put_be32 (buf, 0); /* n-row-breaks */ + put_be32 (buf, 0); /* n-column-breaks */ + put_be32 (buf, 0); /* n-row-keeps */ + put_be32 (buf, 0); /* n-column-keeps */ + put_be32 (buf, 0); /* n-row-point-keeps */ + put_be32 (buf, 0); /* n-column-point-keeps */ + end_count_be32 (buf, keep_start); + put_bestring (buf, table->notes); + put_bestring (buf, table->table_look); + for (size_t i = 0; i < 82; i++) + put_byte (buf, 0); + end_count_u32 (buf, ts_start); + + /* Formats. 
*/ + put_u32 (buf, 0); /* n-widths */ + put_string (buf, "en_US.ISO_8859-1:1987"); /* XXX */ + put_u32 (buf, 0); /* XXX current-layer */ + put_bool (buf, 0); + put_bool (buf, 0); + put_bool (buf, 1); + put_y0 (buf, table); + put_custom_currency (buf, table); + uint32_t formats_start = start_count (buf); + uint32_t x1_start = start_count (buf); + put_x1 (buf, table); + uint32_t x2_start = start_count (buf); + put_x2 (buf); + end_count_u32 (buf, x2_start); + end_count_u32 (buf, x1_start); + uint32_t x3_start = start_count (buf); + put_x3 (buf, table); + end_count_u32 (buf, x3_start); + end_count_u32 (buf, formats_start); + + /* Dimensions. */ + put_u32 (buf, table->n_dimensions); + int *x2 = xnmalloc (table->n_dimensions, sizeof *x2); + for (size_t i = 0; i < table->axes[PIVOT_AXIS_LAYER].n_dimensions; i++) + x2[i] = 2; + for (size_t i = 0; i < table->axes[PIVOT_AXIS_ROW].n_dimensions; i++) + x2[i + table->axes[PIVOT_AXIS_LAYER].n_dimensions] = 0; + for (size_t i = 0; i < table->axes[PIVOT_AXIS_COLUMN].n_dimensions; i++) + x2[i + + table->axes[PIVOT_AXIS_LAYER].n_dimensions + + table->axes[PIVOT_AXIS_ROW].n_dimensions] = 1; + for (size_t i = 0; i < table->n_dimensions; i++) + { + const struct pivot_dimension *d = table->dimensions[i]; + put_value (buf, d->root->name); + put_byte (buf, 0); + put_byte (buf, x2[i]); + put_u32 (buf, 2); + put_bool (buf, !d->root->show_label); + put_bool (buf, d->hide_all_labels); + put_bool (buf, 1); + put_u32 (buf, i); + + put_u32 (buf, d->root->n_subs); + for (size_t j = 0; j < d->root->n_subs; j++) + put_category (buf, d->root->subs[j]); + } + free (x2); + + /* Axes. 
*/ + put_u32 (buf, table->axes[PIVOT_AXIS_LAYER].n_dimensions); + put_u32 (buf, table->axes[PIVOT_AXIS_ROW].n_dimensions); + put_u32 (buf, table->axes[PIVOT_AXIS_COLUMN].n_dimensions); + for (size_t i = 0; i < table->axes[PIVOT_AXIS_LAYER].n_dimensions; i++) + put_u32 (buf, table->axes[PIVOT_AXIS_LAYER].dimensions[i]->top_index); + for (size_t i = 0; i < table->axes[PIVOT_AXIS_ROW].n_dimensions; i++) + put_u32 (buf, table->axes[PIVOT_AXIS_ROW].dimensions[i]->top_index); + for (size_t i = 0; i < table->axes[PIVOT_AXIS_COLUMN].n_dimensions; i++) + put_u32 (buf, table->axes[PIVOT_AXIS_COLUMN].dimensions[i]->top_index); + + /* Cells. */ + put_u32 (buf, hmap_count (&table->cells)); + const struct pivot_cell *cell; + HMAP_FOR_EACH (cell, struct pivot_cell, hmap_node, &table->cells) + { + uint64_t index = 0; + for (size_t j = 0; j < table->n_dimensions; j++) + index = (table->dimensions[j]->n_leaves * index) + cell->idx[j]; + put_u64 (buf, index); + + put_value (buf, cell->value); + } +} + +void +spv_writer_put_table (struct spv_writer *w, const struct pivot_table *table) +{ + struct pivot_table *table_rw = CONST_CAST (struct pivot_table *, table); + if (!table_rw->subtype) + table_rw->subtype = pivot_value_new_user_text ("unknown", -1); + + int table_id = ++w->n_tables; + + bool initial_depth = w->heading_depth; + if (!initial_depth) + spv_writer_open_file (w); + + start_container (w); + + char *title = pivot_value_to_string (table->title, + SETTINGS_VALUE_SHOW_DEFAULT, + SETTINGS_VALUE_SHOW_DEFAULT); + + char *subtype = pivot_value_to_string (table->subtype, + SETTINGS_VALUE_SHOW_DEFAULT, + SETTINGS_VALUE_SHOW_DEFAULT); + + start_elem (w, "label"); + write_text (w, title); + end_elem (w); + + start_elem (w, "vtb:table"); + write_attr (w, "commandName", table->command_c); + write_attr (w, "type", "table"); /* XXX */ + write_attr (w, "subType", subtype); + write_attr_format (w, "tableId", "%d", table_id); + + free (subtype); + free (title); + + start_elem (w, 
"vtb:tableStructure"); + start_elem (w, "vtb:dataPath"); + char *data_path = xasprintf ("%010d_lightTableData.bin", table_id); + write_text (w, data_path); + end_elem (w); /* vtb:dataPath */ + end_elem (w); /* vtb:tableStructure */ + end_elem (w); /* vtb:table */ + end_elem (w); /* container */ + + if (!initial_depth) + spv_writer_close_file (w, ""); + + struct buf buf = { NULL, 0, 0 }; + put_light_table (&buf, table_id, table); + zip_writer_add_memory (w->zw, data_path, buf.data, buf.len); + free (buf.data); + + free (data_path); +} diff --git a/src/output/spv/spv-writer.h b/src/output/spv/spv-writer.h new file mode 100644 index 0000000000..1d891213ea --- /dev/null +++ b/src/output/spv/spv-writer.h @@ -0,0 +1,43 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2019 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
*/ + +#ifndef OUTPUT_SPV_WRITER_H +#define OUTPUT_SPV_WRITER_H 1 + +struct page_setup; +struct table_item_text; +struct pivot_table; +struct spv_writer; +struct text_item; + +#include "libpspp/compiler.h" + +char *spv_writer_open (const char *filename, struct spv_writer **) + WARN_UNUSED_RESULT; +char *spv_writer_close (struct spv_writer *) WARN_UNUSED_RESULT; + +void spv_writer_set_page_setup (struct spv_writer *, + const struct page_setup *); + +void spv_writer_open_heading (struct spv_writer *, const char *command_id, + const char *label); +void spv_writer_close_heading (struct spv_writer *); + +void spv_writer_put_text (struct spv_writer *, const struct text_item *, + const char *command_id); +void spv_writer_put_table (struct spv_writer *, const struct pivot_table *); + +#endif /* output/spv/spv-writer.h */ diff --git a/src/output/spv/spv.c b/src/output/spv/spv.c new file mode 100644 index 0000000000..f186e638e7 --- /dev/null +++ b/src/output/spv/spv.c @@ -0,0 +1,1218 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2017, 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
*/ + +#include + +#include "output/spv/spv.h" + +#include +#include +#include +#include +#include +#include + +#include "libpspp/assertion.h" +#include "libpspp/cast.h" +#include "libpspp/hash-functions.h" +#include "libpspp/message.h" +#include "libpspp/str.h" +#include "libpspp/zip-reader.h" +#include "output/page-setup-item.h" +#include "output/pivot-table.h" +#include "output/spv/detail-xml-parser.h" +#include "output/spv/light-binary-parser.h" +#include "output/spv/spv-css-parser.h" +#include "output/spv/spv-legacy-data.h" +#include "output/spv/spv-legacy-decoder.h" +#include "output/spv/spv-light-decoder.h" +#include "output/spv/structure-xml-parser.h" + +#include "gl/c-ctype.h" +#include "gl/intprops.h" +#include "gl/minmax.h" +#include "gl/xalloc.h" +#include "gl/xvasprintf.h" +#include "gl/xsize.h" + +#include "gettext.h" +#define _(msgid) gettext (msgid) +#define N_(msgid) (msgid) + +struct spv_reader + { + struct string zip_errs; + struct zip_reader *zip; + struct spv_item *root; + struct page_setup *page_setup; + }; + +const struct page_setup * +spv_get_page_setup (const struct spv_reader *spv) +{ + return spv->page_setup; +} + +const char * +spv_item_type_to_string (enum spv_item_type type) +{ + switch (type) + { + case SPV_ITEM_HEADING: return "heading"; + case SPV_ITEM_TEXT: return "text"; + case SPV_ITEM_TABLE: return "table"; + case SPV_ITEM_GRAPH: return "graph"; + case SPV_ITEM_MODEL: return "model"; + case SPV_ITEM_OBJECT: return "object"; + default: return "**error**"; + } +} + +const char * +spv_item_class_to_string (enum spv_item_class class) +{ + switch (class) + { +#define SPV_CLASS(ENUM, NAME) case SPV_CLASS_##ENUM: return NAME; + SPV_CLASSES +#undef SPV_CLASS + default: return NULL; + } +} + +enum spv_item_class +spv_item_class_from_string (const char *name) +{ +#define SPV_CLASS(ENUM, NAME) \ + if (!strcmp (name, NAME)) return SPV_CLASS_##ENUM; + SPV_CLASSES +#undef SPV_CLASS + + return SPV_N_CLASSES; +} + +enum spv_item_type 
+spv_item_get_type (const struct spv_item *item) +{ + return item->type; +} + +enum spv_item_class +spv_item_get_class (const struct spv_item *item) +{ + const char *label = spv_item_get_label (item); + if (!label) + label = ""; + + switch (item->type) + { + case SPV_ITEM_HEADING: + return SPV_CLASS_HEADINGS; + + case SPV_ITEM_TEXT: + return (!strcmp (label, "Title") ? SPV_CLASS_OUTLINEHEADERS + : !strcmp (label, "Log") ? SPV_CLASS_LOGS + : !strcmp (label, "Page Title") ? SPV_CLASS_PAGETITLE + : SPV_CLASS_TEXTS); + + case SPV_ITEM_TABLE: + return (!strcmp (label, "Warnings") ? SPV_CLASS_WARNINGS + : !strcmp (label, "Notes") ? SPV_CLASS_NOTES + : SPV_CLASS_TABLES); + + case SPV_ITEM_GRAPH: + return SPV_CLASS_CHARTS; + + case SPV_ITEM_MODEL: + return SPV_CLASS_MODELS; + + case SPV_ITEM_OBJECT: + return SPV_CLASS_OTHER; + + case SPV_ITEM_TREE: + return SPV_CLASS_TREES; + + default: + return SPV_CLASS_UNKNOWN; + } +} + +const char * +spv_item_get_label (const struct spv_item *item) +{ + return item->label; +} + +bool +spv_item_is_heading (const struct spv_item *item) +{ + return item->type == SPV_ITEM_HEADING; +} + +size_t +spv_item_get_n_children (const struct spv_item *item) +{ + return item->n_children; +} + +struct spv_item * +spv_item_get_child (const struct spv_item *item, size_t idx) +{ + assert (idx < item->n_children); + return item->children[idx]; +} + +bool +spv_item_is_table (const struct spv_item *item) +{ + return item->type == SPV_ITEM_TABLE; +} + +bool +spv_item_is_text (const struct spv_item *item) +{ + return item->type == SPV_ITEM_TEXT; +} + +const struct pivot_value * +spv_item_get_text (const struct spv_item *item) +{ + assert (spv_item_is_text (item)); + return item->text; +} + +struct spv_item * +spv_item_next (const struct spv_item *item) +{ + if (item->n_children) + return item->children[0]; + + while (item->parent) + { + size_t idx = item->parent_idx + 1; + item = item->parent; + if (idx < item->n_children) + return item->children[idx]; + } + 
+ return NULL; +} + +const struct spv_item * +spv_item_get_parent (const struct spv_item *item) +{ + return item->parent; +} + +size_t +spv_item_get_level (const struct spv_item *item) +{ + int level = 0; + for (; item->parent; item = item->parent) + level++; + return level; +} + +const char * +spv_item_get_command_id (const struct spv_item *item) +{ + return item->command_id; +} + +const char * +spv_item_get_subtype (const struct spv_item *item) +{ + return item->subtype; +} + +bool +spv_item_is_visible (const struct spv_item *item) +{ + return item->visible; +} + +static void +spv_item_destroy (struct spv_item *item) +{ + if (item) + { + free (item->structure_member); + + free (item->label); + free (item->command_id); + + for (size_t i = 0; i < item->n_children; i++) + spv_item_destroy (item->children[i]); + free (item->children); + + pivot_table_unref (item->table); + spv_legacy_properties_destroy (item->legacy_properties); + free (item->bin_member); + free (item->xml_member); + free (item->subtype); + + pivot_value_destroy (item->text); + + free (item->object_type); + free (item->uri); + + free (item); + } +} + +static void +spv_heading_add_child (struct spv_item *parent, struct spv_item *child) +{ + assert (parent->type == SPV_ITEM_HEADING); + assert (!child->parent); + + child->parent = parent; + child->parent_idx = parent->n_children; + + if (parent->n_children >= parent->allocated_children) + parent->children = x2nrealloc (parent->children, + &parent->allocated_children, + sizeof *parent->children); + parent->children[parent->n_children++] = child; +} + +static xmlNode * +find_xml_child_element (xmlNode *parent, const char *child_name) +{ + for (xmlNode *node = parent->children; node; node = node->next) + if (node->type == XML_ELEMENT_NODE + && node->name + && !strcmp (CHAR_CAST (char *, node->name), child_name)) + return node; + + return NULL; +} + +static char * +get_xml_attr (const xmlNode *node, const char *name) +{ + return CHAR_CAST (char *, 
xmlGetProp (node, CHAR_CAST (xmlChar *, name))); +} + +static void +put_xml_attr (const char *name, const char *value, struct string *dst) +{ + if (!value) + return; + + ds_put_format (dst, " %s=\"", name); + for (const char *p = value; *p; p++) + { + switch (*p) + { + case '\n': + ds_put_cstr (dst, "&#10;"); + break; + case '&': + ds_put_cstr (dst, "&amp;"); + break; + case '<': + ds_put_cstr (dst, "&lt;"); + break; + case '>': + ds_put_cstr (dst, "&gt;"); + break; + case '"': + ds_put_cstr (dst, "&quot;"); + break; + default: + ds_put_byte (dst, *p); + break; + } + } + ds_put_byte (dst, '"'); +} + +static void +extract_html_text (const xmlNode *node, int base_font_size, struct string *s) +{ + if (node->type == XML_ELEMENT_NODE) + { + const char *name = CHAR_CAST (char *, node->name); + if (!strcmp (name, "br")) + ds_put_byte (s, '\n'); + else if (strcmp (name, "style")) + { + const char *tag = NULL; + if (strchr ("biu", name[0]) && name[1] == '\0') + { + tag = name; + ds_put_format (s, "<%s>", tag); + } + else if (!strcmp (name, "font")) + { + tag = "span"; + ds_put_format (s, "<%s", tag); + + char *face = get_xml_attr (node, "face"); + put_xml_attr ("face", face, s); + free (face); + + char *color = get_xml_attr (node, "color"); + if (color) + { + if (color[0] == '#') + put_xml_attr ("color", color, s); + else + { + uint8_t r, g, b; + if (sscanf (color, "rgb (%"SCNu8", %"SCNu8", %"SCNu8" )", + &r, &g, &b) == 3) + { + char color2[8]; + snprintf (color2, sizeof color2, + "#%02"PRIx8"%02"PRIx8"%02"PRIx8, + r, g, b); + put_xml_attr ("color", color2, s); + } + } + } + free (color); + + char *size_s = get_xml_attr (node, "size"); + int html_size = size_s ? 
atoi (size_s) : 0; + free (size_s); + if (html_size >= 1 && html_size <= 7) + { + static const double scale[7] = { + .444, .556, .667, .778, 1.0, 1.33, 2.0 + }; + double size = base_font_size * scale[html_size - 1]; + + char size2[INT_BUFSIZE_BOUND (int)]; + snprintf (size2, sizeof size2, "%.0f", size * 1024.); + put_xml_attr ("size", size2, s); + } + + ds_put_cstr (s, ">"); + } + for (const xmlNode *child = node->children; child; + child = child->next) + extract_html_text (child, base_font_size, s); + if (tag) + ds_put_format (s, "</%s>", tag); + } + } + else if (node->type == XML_TEXT_NODE) + { + /* U+00A0 NONBREAKING SPACE is really, really common in SPV text and it + makes it impossible to break syntax across lines. Translate it into a + regular space. (Note that U+00A0 is C2 A0 in UTF-8.) + + Do the same for U+2007 FIGURE SPACE, which also crops out weirdly + sometimes. */ + ds_extend (s, ds_length (s) + xmlStrlen (node->content)); + for (const uint8_t *p = node->content; *p; ) + { + int c; + if (p[0] == 0xc2 && p[1] == 0xa0) + { + c = ' '; + p += 2; + } + else if (p[0] == 0xe2 && p[1] == 0x80 && p[2] == 0x87) + { + c = ' '; + p += 3; + } + else + c = *p++; + + if (c_isspace (c)) + { + int last = ds_last (s); + if (last != EOF && !c_isspace (last)) + ds_put_byte (s, c); + } + else if (c == '<') + ds_put_cstr (s, "&lt;"); + else if (c == '>') + ds_put_cstr (s, "&gt;"); + else if (c == '&') + ds_put_cstr (s, "&amp;"); + else + ds_put_byte (s, c); + } + } +} + +static xmlDoc * +parse_embedded_html (const xmlNode *node) +{ + /* Extract HTML from XML node. 
*/ + char *html_s = CHAR_CAST (char *, xmlNodeGetContent (node)); + if (!html_s) + xalloc_die (); + + xmlDoc *html_doc = htmlReadMemory ( + html_s, strlen (html_s), + NULL, "UTF-8", (HTML_PARSE_RECOVER | HTML_PARSE_NOERROR + | HTML_PARSE_NOWARNING | HTML_PARSE_NOBLANKS + | HTML_PARSE_NONET)); + free (html_s); + + return html_doc; +} + +/* Given NODE, which should contain HTML content, returns the text within that + content as an allocated string. The caller must eventually free the + returned string (with free()). */ +static char * +decode_embedded_html (const xmlNode *node, struct font_style *font_style) +{ + struct string markup = DS_EMPTY_INITIALIZER; + *font_style = (struct font_style) FONT_STYLE_INITIALIZER; + font_style->size = 10; + + xmlDoc *html_doc = parse_embedded_html (node); + if (html_doc) + { + xmlNode *root = xmlDocGetRootElement (html_doc); + xmlNode *head = root ? find_xml_child_element (root, "head") : NULL; + xmlNode *style = head ? find_xml_child_element (head, "style") : NULL; + if (style) + { + uint8_t *style_s = xmlNodeGetContent (style); + spv_parse_css_style (CHAR_CAST (char *, style_s), font_style); + xmlFree (style_s); + } + + if (root) + extract_html_text (root, font_style->size, &markup); + xmlFreeDoc (html_doc); + } + + font_style->markup = true; + return ds_steal_cstr (&markup); +} + +static char * +xstrdup_if_nonempty (const char *s) +{ + return s && s[0] ? 
xstrdup (s) : NULL; +} + +static void +decode_container_text (const struct spvsx_container_text *ct, + struct spv_item *item) +{ + item->type = SPV_ITEM_TEXT; + item->command_id = xstrdup_if_nonempty (ct->command_name); + + item->text = xzalloc (sizeof *item->text); + item->text->type = PIVOT_VALUE_TEXT; + item->text->font_style = xmalloc (sizeof *item->text->font_style); + item->text->text.local = decode_embedded_html (ct->html->node_.raw, + item->text->font_style); +} + +static void +decode_page_p (const xmlNode *in, struct page_paragraph *out) +{ + char *style = get_xml_attr (in, "style"); + out->halign = (style && strstr (style, "center") ? TABLE_HALIGN_CENTER + : style && strstr (style, "right") ? TABLE_HALIGN_RIGHT + : TABLE_HALIGN_LEFT); + free (style); + + struct font_style font_style; + out->markup = decode_embedded_html (in, &font_style); + font_style_uninit (&font_style); +} + +static void +decode_page_paragraph (const struct spvsx_page_paragraph *page_paragraph, + struct page_heading *ph) +{ + memset (ph, 0, sizeof *ph); + + const struct spvsx_page_paragraph_text *page_paragraph_text + = page_paragraph->page_paragraph_text; + if (!page_paragraph_text) + return; + + xmlDoc *html_doc = parse_embedded_html (page_paragraph_text->node_.raw); + if (!html_doc) + return; + + xmlNode *root = xmlDocGetRootElement (html_doc); + xmlNode *body = find_xml_child_element (root, "body"); + if (body) + for (const xmlNode *node = body->children; node; node = node->next) + if (node->type == XML_ELEMENT_NODE + && !strcmp (CHAR_CAST (const char *, node->name), "p")) + { + ph->paragraphs = xrealloc (ph->paragraphs, + (ph->n + 1) * sizeof *ph->paragraphs); + decode_page_p (node, &ph->paragraphs[ph->n++]); + } + xmlFreeDoc (html_doc); +} + +void +spv_item_load (const struct spv_item *item) +{ + if (spv_item_is_table (item)) + spv_item_get_table (item); +} + +bool +spv_item_is_light_table (const struct spv_item *item) +{ + return item->type == SPV_ITEM_TABLE && 
!item->xml_member; +} + +char * WARN_UNUSED_RESULT +spv_item_get_raw_light_table (const struct spv_item *item, + void **data, size_t *size) +{ + return zip_member_read_all (item->spv->zip, item->bin_member, data, size); +} + +char * WARN_UNUSED_RESULT +spv_item_get_light_table (const struct spv_item *item, + struct spvlb_table **tablep) +{ + *tablep = NULL; + + if (!spv_item_is_light_table (item)) + return xstrdup ("not a light binary table object"); + + void *data; + size_t size; + char *error = spv_item_get_raw_light_table (item, &data, &size); + if (error) + return error; + + struct spvbin_input input; + spvbin_input_init (&input, data, size); + + struct spvlb_table *table; + error = (!size + ? xasprintf ("light table member is empty") + : !spvlb_parse_table (&input, &table) + ? spvbin_input_to_error (&input, NULL) + : input.ofs != input.size + ? xasprintf ("expected end of file at offset %#zx", input.ofs) + : NULL); + if (error) + { + struct string s = DS_EMPTY_INITIALIZER; + spv_item_format_path (item, &s); + ds_put_format (&s, " (%s): %s", item->bin_member, error); + + free (error); + error = ds_steal_cstr (&s); + } + free (data); + if (!error) + *tablep = table; + return error; +} + +static char * +pivot_table_open_light (struct spv_item *item) +{ + assert (spv_item_is_light_table (item)); + + struct spvlb_table *raw_table; + char *error = spv_item_get_light_table (item, &raw_table); + if (!error) + error = decode_spvlb_table (raw_table, &item->table); + spvlb_free_table (raw_table); + + return error; +} + +bool +spv_item_is_legacy_table (const struct spv_item *item) +{ + return item->type == SPV_ITEM_TABLE && item->xml_member; +} + +char * WARN_UNUSED_RESULT +spv_item_get_raw_legacy_data (const struct spv_item *item, + void **data, size_t *size) +{ + if (!spv_item_is_legacy_table (item)) + return xstrdup ("not a legacy table object"); + + return zip_member_read_all (item->spv->zip, item->bin_member, data, size); +} + +char * WARN_UNUSED_RESULT 
+spv_item_get_legacy_data (const struct spv_item *item, struct spv_data *data) +{ + void *raw; + size_t size; + char *error = spv_item_get_raw_legacy_data (item, &raw, &size); + if (!error) + { + error = spv_legacy_data_decode (raw, size, data); + free (raw); + } + + return error; +} + +static char * WARN_UNUSED_RESULT +spv_read_xml_member (struct spv_reader *spv, const char *member_name, + bool keep_blanks, const char *root_element_name, + xmlDoc **docp) +{ + *docp = NULL; + + struct zip_member *zm = zip_member_open (spv->zip, member_name); + if (!zm) + return ds_steal_cstr (&spv->zip_errs); + + xmlParserCtxt *parser; + xmlKeepBlanksDefault (keep_blanks); + parser = xmlCreatePushParserCtxt(NULL, NULL, NULL, 0, NULL); + if (!parser) + { + zip_member_finish (zm); + return xasprintf (_("%s: Failed to create XML parser"), member_name); + } + + int retval; + char buf[4096]; + while ((retval = zip_member_read (zm, buf, sizeof buf)) > 0) + xmlParseChunk (parser, buf, retval, false); + xmlParseChunk (parser, NULL, 0, true); + + xmlDoc *doc = parser->myDoc; + bool well_formed = parser->wellFormed; + xmlFreeParserCtxt (parser); + + if (retval < 0) + { + char *error = ds_steal_cstr (&spv->zip_errs); + zip_member_finish (zm); + xmlFreeDoc (doc); + return error; + } + zip_member_finish (zm); + + if (!well_formed) + { + xmlFreeDoc (doc); + return xasprintf(_("%s: document is not well-formed"), member_name); + } + + const xmlNode *root_node = xmlDocGetRootElement (doc); + assert (root_node->type == XML_ELEMENT_NODE); + if (strcmp (CHAR_CAST (char *, root_node->name), root_element_name)) + { + xmlFreeDoc (doc); + return xasprintf(_("%s: root node is \"%s\" but \"%s\" was expected"), + member_name, + CHAR_CAST (char *, root_node->name), root_element_name); + } + + *docp = doc; + return NULL; +} + +char * WARN_UNUSED_RESULT +spv_item_get_legacy_table (const struct spv_item *item, xmlDoc **docp) +{ + assert (spv_item_is_legacy_table (item)); + + return spv_read_xml_member 
(item->spv, item->xml_member, false, + "visualization", docp); +} + +char * WARN_UNUSED_RESULT +spv_item_get_structure (const struct spv_item *item, struct _xmlDoc **docp) +{ + return spv_read_xml_member (item->spv, item->structure_member, false, + "heading", docp); +} + +static const char * +identify_item (const struct spv_item *item) +{ + return (item->label ? item->label + : item->command_id ? item->command_id + : spv_item_type_to_string (item->type)); +} + +void +spv_item_format_path (const struct spv_item *item, struct string *s) +{ + enum { MAX_STACK = 32 }; + const struct spv_item *stack[MAX_STACK]; + size_t n = 0; + + while (item != NULL && item->parent && n < MAX_STACK) + { + stack[n++] = item; + item = item->parent; + } + + while (n > 0) + { + item = stack[--n]; + ds_put_byte (s, '/'); + + const char *name = identify_item (item); + ds_put_cstr (s, name); + + if (item->parent) + { + size_t total = 1; + size_t index = 1; + for (size_t i = 0; i < item->parent->n_children; i++) + { + const struct spv_item *sibling = item->parent->children[i]; + if (sibling == item) + index = total; + else if (!strcmp (name, identify_item (sibling))) + total++; + } + if (total > 1) + ds_put_format (s, "[%zu]", index); + } + } +} + +static char * WARN_UNUSED_RESULT +pivot_table_open_legacy (struct spv_item *item) +{ + assert (spv_item_is_legacy_table (item)); + + struct spv_data data; + char *error = spv_item_get_legacy_data (item, &data); + if (error) + { + struct string s = DS_EMPTY_INITIALIZER; + spv_item_format_path (item, &s); + ds_put_format (&s, " (%s): %s", item->bin_member, error); + + free (error); + return ds_steal_cstr (&s); + } + + xmlDoc *doc; + error = spv_read_xml_member (item->spv, item->xml_member, false, + "visualization", &doc); + if (error) + { + spv_data_uninit (&data); + return error; + } + + struct spvxml_context ctx = SPVXML_CONTEXT_INIT (ctx); + struct spvdx_visualization *v; + spvdx_parse_visualization (&ctx, xmlDocGetRootElement (doc), &v); + error = 
spvxml_context_finish (&ctx, &v->node_); + + if (!error) + error = decode_spvdx_table (v, item->subtype, item->legacy_properties, + &data, &item->table); + + if (error) + { + struct string s = DS_EMPTY_INITIALIZER; + spv_item_format_path (item, &s); + ds_put_format (&s, " (%s): %s", item->xml_member, error); + + free (error); + error = ds_steal_cstr (&s); + } + + spv_data_uninit (&data); + spvdx_free_visualization (v); + if (doc) + xmlFreeDoc (doc); + + return error; +} + +struct pivot_table * +spv_item_get_table (const struct spv_item *item_) +{ + struct spv_item *item = CONST_CAST (struct spv_item *, item_); + + assert (spv_item_is_table (item)); + if (!item->table) + { + char *error = (item->xml_member + ? pivot_table_open_legacy (item) + : pivot_table_open_light (item)); + if (error) + { + item->error = true; + msg (ME, "%s", error); + item->table = pivot_table_create_for_text ( + pivot_value_new_text (N_("Error")), + pivot_value_new_user_text (error, -1)); + free (error); + } + } + + return item->table; +} + +/* Constructs a new spv_item from XML and stores it in *ITEMP. Returns NULL if + successful, otherwise an error message for the caller to use and free (with + free()). + + XML should be a 'heading' or 'container' element. 
*/ +static char * WARN_UNUSED_RESULT +spv_decode_container (const struct spvsx_container *c, + const char *structure_member, + struct spv_item *parent) +{ + struct spv_item *item = xzalloc (sizeof *item); + item->spv = parent->spv; + item->label = xstrdup (c->label->text); + item->visible = c->visibility == SPVSX_VISIBILITY_VISIBLE; + item->structure_member = xstrdup (structure_member); + + assert (c->n_seq == 1); + struct spvxml_node *content = c->seq[0]; + if (spvsx_is_container_text (content)) + decode_container_text (spvsx_cast_container_text (content), item); + else if (spvsx_is_table (content)) + { + item->type = SPV_ITEM_TABLE; + + struct spvsx_table *table = spvsx_cast_table (content); + const struct spvsx_table_structure *ts = table->table_structure; + item->bin_member = xstrdup (ts->data_path->text); + item->command_id = xstrdup_if_nonempty (table->command_name); + item->subtype = xstrdup_if_nonempty (table->sub_type); + if (ts->path) + { + item->xml_member = ts->path ? xstrdup (ts->path->text) : NULL; + char *error = decode_spvsx_legacy_properties ( + table->table_properties, &item->legacy_properties); + if (error) + { + spv_item_destroy (item); + return error; + } + } + } + else if (spvsx_is_graph (content)) + { + struct spvsx_graph *graph = spvsx_cast_graph (content); + item->type = SPV_ITEM_GRAPH; + item->command_id = xstrdup_if_nonempty (graph->command_name); + /* XXX */ + } + else if (spvsx_is_model (content)) + { + struct spvsx_model *model = spvsx_cast_model (content); + item->type = SPV_ITEM_MODEL; + item->command_id = xstrdup_if_nonempty (model->command_name); + /* XXX */ + } + else if (spvsx_is_object (content)) + { + struct spvsx_object *object = spvsx_cast_object (content); + item->type = SPV_ITEM_OBJECT; + item->object_type = xstrdup (object->type); + item->uri = xstrdup (object->uri); + } + else if (spvsx_is_image (content)) + { + struct spvsx_image *image = spvsx_cast_image (content); + item->type = SPV_ITEM_OBJECT; + item->object_type = 
xstrdup ("image"); + item->uri = xstrdup (image->data_path->text); + } + else if (spvsx_is_tree (content)) + { + struct spvsx_tree *tree = spvsx_cast_tree (content); + item->type = SPV_ITEM_TREE; + item->object_type = xstrdup ("tree"); + item->uri = xstrdup (tree->data_path->text); + } + else + NOT_REACHED (); + + spv_heading_add_child (parent, item); + return NULL; +} + +static char * WARN_UNUSED_RESULT +spv_decode_children (struct spv_reader *spv, const char *structure_member, + struct spvxml_node **seq, size_t n_seq, + struct spv_item *parent) +{ + for (size_t i = 0; i < n_seq; i++) + { + const struct spvxml_node *node = seq[i]; + + char *error; + if (spvsx_is_container (node)) + { + const struct spvsx_container *container + = spvsx_cast_container (node); + error = spv_decode_container (container, structure_member, parent); + } + else if (spvsx_is_heading (node)) + { + const struct spvsx_heading *subheading = spvsx_cast_heading (node); + struct spv_item *subitem = xzalloc (sizeof *subitem); + subitem->structure_member = xstrdup (structure_member); + subitem->spv = parent->spv; + subitem->type = SPV_ITEM_HEADING; + subitem->label = xstrdup (subheading->label->text); + if (subheading->command_name) + subitem->command_id = xstrdup (subheading->command_name); + subitem->visible = !subheading->heading_visibility_present; + spv_heading_add_child (parent, subitem); + + error = spv_decode_children (spv, structure_member, + subheading->seq, subheading->n_seq, + subitem); + } + else + NOT_REACHED (); + + if (error) + return error; + } + + return NULL; +} + +static struct page_setup * +decode_page_setup (const struct spvsx_page_setup *in, const char *file_name) +{ + struct page_setup *out = xmalloc (sizeof *out); + *out = (struct page_setup) PAGE_SETUP_INITIALIZER; + + out->initial_page_number = in->initial_page_number; + + if (in->paper_width != DBL_MAX) + out->paper[TABLE_HORZ] = in->paper_width; + if (in->paper_height != DBL_MAX) + out->paper[TABLE_VERT] = 
in->paper_height; + + if (in->margin_left != DBL_MAX) + out->margins[TABLE_HORZ][0] = in->margin_left; + if (in->margin_right != DBL_MAX) + out->margins[TABLE_HORZ][1] = in->margin_right; + if (in->margin_top != DBL_MAX) + out->margins[TABLE_VERT][0] = in->margin_top; + if (in->margin_bottom != DBL_MAX) + out->margins[TABLE_VERT][1] = in->margin_bottom; + + if (in->space_after != DBL_MAX) + out->object_spacing = in->space_after; + + if (in->chart_size) + out->chart_size = (in->chart_size == SPVSX_CHART_SIZE_FULL_HEIGHT + ? PAGE_CHART_FULL_HEIGHT + : in->chart_size == SPVSX_CHART_SIZE_HALF_HEIGHT + ? PAGE_CHART_HALF_HEIGHT + : in->chart_size == SPVSX_CHART_SIZE_QUARTER_HEIGHT + ? PAGE_CHART_QUARTER_HEIGHT + : PAGE_CHART_AS_IS); + + decode_page_paragraph (in->page_header->page_paragraph, &out->headings[0]); + decode_page_paragraph (in->page_footer->page_paragraph, &out->headings[1]); + + out->file_name = xstrdup (file_name); + + return out; +} + +static char * WARN_UNUSED_RESULT +spv_heading_read (struct spv_reader *spv, + const char *file_name, const char *member_name) +{ + xmlDoc *doc; + char *error = spv_read_xml_member (spv, member_name, true, "heading", &doc); + if (error) + return error; + + struct spvxml_context ctx = SPVXML_CONTEXT_INIT (ctx); + struct spvsx_root_heading *root; + spvsx_parse_root_heading (&ctx, xmlDocGetRootElement (doc), &root); + error = spvxml_context_finish (&ctx, &root->node_); + + if (!error && root->page_setup) + spv->page_setup = decode_page_setup (root->page_setup, file_name); + + for (size_t i = 0; !error && i < root->n_seq; i++) + error = spv_decode_children (spv, member_name, root->seq, root->n_seq, + spv->root); + + if (error) + { + char *s = xasprintf ("%s: %s", member_name, error); + free (error); + error = s; + } + + spvsx_free_root_heading (root); + xmlFreeDoc (doc); + + return error; +} + +struct spv_item * +spv_get_root (const struct spv_reader *spv) +{ + return spv->root; +} + +static int +spv_detect__ (struct zip_reader 
*zip, char **errorp) +{ + *errorp = NULL; + + const char *member = "META-INF/MANIFEST.MF"; + if (!zip_reader_contains_member (zip, member)) + return 0; + + void *data; + size_t size; + *errorp = zip_member_read_all (zip, "META-INF/MANIFEST.MF", + &data, &size); + if (*errorp) + return -1; + + const char *magic = "allowPivoting=true"; + bool is_spv = size == strlen (magic) && !memcmp (magic, data, size); + free (data); + + return is_spv; +} + +/* Returns NULL if FILENAME is an SPV file, otherwise an error string that the + caller must eventually free(). */ +char * WARN_UNUSED_RESULT +spv_detect (const char *filename) +{ + struct string zip_error; + struct zip_reader *zip = zip_reader_create (filename, &zip_error); + if (!zip) + return ds_steal_cstr (&zip_error); + + char *error; + if (spv_detect__ (zip, &error) <= 0 && !error) + error = xasprintf("%s: not an SPV file", filename); + zip_reader_destroy (zip); + ds_destroy (&zip_error); + return error; +} + +char * WARN_UNUSED_RESULT +spv_open (const char *filename, struct spv_reader **spvp) +{ + *spvp = NULL; + + struct spv_reader *spv = xzalloc (sizeof *spv); + ds_init_empty (&spv->zip_errs); + spv->zip = zip_reader_create (filename, &spv->zip_errs); + if (!spv->zip) + { + char *error = ds_steal_cstr (&spv->zip_errs); + spv_close (spv); + return error; + } + + char *error; + int detect = spv_detect__ (spv->zip, &error); + if (detect <= 0) + { + spv_close (spv); + return error ? 
error : xasprintf("%s: not an SPV file", filename); + } + + spv->root = xzalloc (sizeof *spv->root); + spv->root->spv = spv; + spv->root->type = SPV_ITEM_HEADING; + for (size_t i = 0; ; i++) + { + const char *member_name = zip_reader_get_member_name (spv->zip, i); + if (!member_name) + break; + + struct substring member_name_ss = ss_cstr (member_name); + if (ss_starts_with (member_name_ss, ss_cstr ("outputViewer")) + && ss_ends_with (member_name_ss, ss_cstr (".xml"))) + { + char *error = spv_heading_read (spv, filename, member_name); + if (error) + { + spv_close (spv); + return error; + } + } + } + + *spvp = spv; + return NULL; +} + +void +spv_close (struct spv_reader *spv) +{ + if (spv) + { + ds_destroy (&spv->zip_errs); + zip_reader_destroy (spv->zip); + spv_item_destroy (spv->root); + page_setup_destroy (spv->page_setup); + free (spv); + } +} + +char * WARN_UNUSED_RESULT +spv_decode_fmt_spec (uint32_t u32, struct fmt_spec *out) +{ + if (!u32 + || (u32 == 0x10000 || u32 == 1 /* both used as string formats */)) + { + *out = fmt_for_output (FMT_F, 40, 2); + return NULL; + } + + uint8_t raw_type = u32 >> 16; + uint8_t w = u32 >> 8; + uint8_t d = u32; + + msg_disable (); + *out = (struct fmt_spec) { .type = FMT_F, .w = w, .d = d }; + bool ok = raw_type >= 40 || fmt_from_io (raw_type, &out->type); + if (ok) + { + fmt_fix_output (out); + ok = fmt_check_width_compat (out, 0); + } + msg_enable (); + + if (!ok) + { + *out = fmt_for_output (FMT_F, 40, 2); + return xasprintf ("bad format %#"PRIx32, u32); + } + + return NULL; +} diff --git a/src/output/spv/spv.h b/src/output/spv/spv.h new file mode 100644 index 0000000000..0bf2ff7014 --- /dev/null +++ b/src/output/spv/spv.h @@ -0,0 +1,202 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2017 Free Software Foundation, Inc. 
+ + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifndef OUTPUT_SPV_H +#define OUTPUT_SPV_H 1 + +/* SPSS Viewer (SPV) file reader. + + An SPV file, represented as struct spv_reader, contains a number of + top-level headings, each of which recursively contains other headings and + tables. Here, we model a heading, text, table, or other element as an + "item", and an SPV file as a single root item that contains each of the + top-level headings as a child item. + */ + +#include +#include +#include +#include "libpspp/compiler.h" + +struct fmt_spec; +struct pivot_table; +struct spv_data; +struct spv_reader; +struct spvlb_table; +struct string; +struct _xmlDoc; + +/* SPV files. */ + +char *spv_open (const char *filename, struct spv_reader **) WARN_UNUSED_RESULT; +void spv_close (struct spv_reader *); + +char *spv_detect (const char *filename) WARN_UNUSED_RESULT; + +const char *spv_get_errors (const struct spv_reader *); +void spv_clear_errors (struct spv_reader *); + +struct spv_item *spv_get_root (const struct spv_reader *); +void spv_item_dump (const struct spv_item *, int indentation); + +const struct page_setup *spv_get_page_setup (const struct spv_reader *); + +/* Items. + + An spv_item represents one of the elements that can occur in an SPV file. Items + form a tree because "heading" items can have an arbitrary number of child + items, which in turn may also be headings. 
The root item, that is, the item + returned by spv_get_root(), is always a heading. */ + +enum spv_item_type + { + SPV_ITEM_HEADING, + SPV_ITEM_TEXT, + SPV_ITEM_TABLE, + SPV_ITEM_GRAPH, + SPV_ITEM_MODEL, + SPV_ITEM_OBJECT, + SPV_ITEM_TREE, + }; + +const char *spv_item_type_to_string (enum spv_item_type); + +#define SPV_CLASSES \ + SPV_CLASS(CHARTS, "charts") \ + SPV_CLASS(HEADINGS, "headings") \ + SPV_CLASS(LOGS, "logs") \ + SPV_CLASS(MODELS, "models") \ + SPV_CLASS(TABLES, "tables") \ + SPV_CLASS(TEXTS, "texts") \ + SPV_CLASS(TREES, "trees") \ + SPV_CLASS(WARNINGS, "warnings") \ + SPV_CLASS(OUTLINEHEADERS, "outlineheaders") \ + SPV_CLASS(PAGETITLE, "pagetitle") \ + SPV_CLASS(NOTES, "notes") \ + SPV_CLASS(UNKNOWN, "unknown") \ + SPV_CLASS(OTHER, "other") +enum spv_item_class + { +#define SPV_CLASS(ENUM, NAME) SPV_CLASS_##ENUM, + SPV_CLASSES +#undef SPV_CLASS + }; +enum + { +#define SPV_CLASS(ENUM, NAME) +1 + SPV_N_CLASSES = SPV_CLASSES +#undef SPV_CLASS +}; +#define SPV_ALL_CLASSES ((1u << SPV_N_CLASSES) - 1) + +const char *spv_item_class_to_string (enum spv_item_class); +enum spv_item_class spv_item_class_from_string (const char *); + +struct spv_item + { + struct spv_reader *spv; + struct spv_item *parent; + size_t parent_idx; /* item->parent->children[parent_idx] == item */ + + bool error; + + char *structure_member; + + enum spv_item_type type; + char *label; + char *command_id; /* Unique command identifier. */ + + /* Whether the item is visible. + For SPV_ITEM_HEADING, false indicates that the item is collapsed. + For SPV_ITEM_TABLE, false indicates that the item is not shown. */ + bool visible; + + /* SPV_ITEM_HEADING only. */ + struct spv_item **children; + size_t n_children, allocated_children; + + /* SPV_ITEM_TABLE only. */ + struct pivot_table *table; /* NULL if not yet loaded. */ + struct spv_legacy_properties *legacy_properties; + char *bin_member; + char *xml_member; + char *subtype; + + /* SPV_ITEM_TEXT only. 
*/ + struct pivot_value *text; + + /* SPV_ITEM_OBJECT only. */ + char *object_type; + char *uri; + }; + +void spv_item_format_path (const struct spv_item *, struct string *); + +void spv_item_load (const struct spv_item *); + +enum spv_item_type spv_item_get_type (const struct spv_item *); +enum spv_item_class spv_item_get_class (const struct spv_item *); + +const char *spv_item_get_label (const struct spv_item *); + +bool spv_item_is_heading (const struct spv_item *); +size_t spv_item_get_n_children (const struct spv_item *); +struct spv_item *spv_item_get_child (const struct spv_item *, size_t idx); + +bool spv_item_is_table (const struct spv_item *); +struct pivot_table *spv_item_get_table (const struct spv_item *); + +bool spv_item_is_text (const struct spv_item *); +const struct pivot_value *spv_item_get_text (const struct spv_item *); + +bool spv_item_is_visible (const struct spv_item *); + +#define SPV_ITEM_FOR_EACH(ITER, ROOT) \ + for ((ITER) = (ROOT); (ITER) != NULL; (ITER) = spv_item_next(ITER)) +#define SPV_ITEM_FOR_EACH_SKIP_ROOT(ITER, ROOT) \ + for ((ITER) = (ROOT); ((ITER) = spv_item_next(ITER)) != NULL; ) +struct spv_item *spv_item_next (const struct spv_item *); + +const struct spv_item *spv_item_get_parent (const struct spv_item *); +size_t spv_item_get_level (const struct spv_item *); + +const char *spv_item_get_member_name (const struct spv_item *); +const char *spv_item_get_command_id (const struct spv_item *); +const char *spv_item_get_subtype (const struct spv_item *); + +char *spv_item_get_structure (const struct spv_item *, struct _xmlDoc **) + WARN_UNUSED_RESULT; + +bool spv_item_is_light_table (const struct spv_item *); +char *spv_item_get_light_table (const struct spv_item *, + struct spvlb_table **) + WARN_UNUSED_RESULT; +char *spv_item_get_raw_light_table (const struct spv_item *, + void **data, size_t *size) + WARN_UNUSED_RESULT; + +bool spv_item_is_legacy_table (const struct spv_item *); +char *spv_item_get_raw_legacy_data (const 
struct spv_item *item, + void **data, size_t *size) + WARN_UNUSED_RESULT; +char *spv_item_get_legacy_data (const struct spv_item *, struct spv_data *) + WARN_UNUSED_RESULT; +char *spv_item_get_legacy_table (const struct spv_item *, struct _xmlDoc **) + WARN_UNUSED_RESULT; + +char *spv_decode_fmt_spec (uint32_t u32, struct fmt_spec *) WARN_UNUSED_RESULT; + +#endif /* output/spv/spv.h */ diff --git a/src/output/spv/spvbin-helpers.c b/src/output/spv/spvbin-helpers.c new file mode 100644 index 0000000000..e405310395 --- /dev/null +++ b/src/output/spv/spvbin-helpers.c @@ -0,0 +1,358 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
*/ + +#include + +#include "output/spv/spvbin-helpers.h" + +#include +#include + +#include "libpspp/float-format.h" +#include "libpspp/integer-format.h" +#include "libpspp/str.h" + +#include "gl/xmemdup0.h" + +void +spvbin_input_init (struct spvbin_input *input, const void *data, size_t size) +{ + *input = (struct spvbin_input) { .data = data, .size = size }; +} + +bool +spvbin_input_at_end (const struct spvbin_input *input) +{ + return input->ofs >= input->size; +} + +char * +spvbin_input_to_error (const struct spvbin_input *input, const char *name) +{ + struct string s = DS_EMPTY_INITIALIZER; + if (name) + ds_put_format (&s, "%s: ", name); + ds_put_cstr (&s, "parse error decoding "); + for (size_t i = input->n_errors; i-- > 0; ) + if (i < SPVBIN_MAX_ERRORS) + ds_put_format (&s, "/%s@%#zx", input->errors[i].name, + input->errors[i].start); + ds_put_format (&s, " near %#zx", input->error_ofs); + return ds_steal_cstr (&s); +} + + +bool +spvbin_match_bytes (struct spvbin_input *input, const void *bytes, size_t n) +{ + if (input->size - input->ofs < n + || memcmp (&input->data[input->ofs], bytes, n)) + return false; + + input->ofs += n; + return true; +} + +bool +spvbin_match_byte (struct spvbin_input *input, uint8_t byte) +{ + return spvbin_match_bytes (input, &byte, 1); +} + +bool +spvbin_parse_bool (struct spvbin_input *input, bool *p) +{ + if (input->ofs >= input->size || input->data[input->ofs] > 1) + return false; + if (p) + *p = input->data[input->ofs]; + input->ofs++; + return true; +} + +static const void * +spvbin_parse__ (struct spvbin_input *input, size_t n) +{ + if (input->size - input->ofs < n) + return NULL; + + const void *src = &input->data[input->ofs]; + input->ofs += n; + return src; +} + +bool +spvbin_parse_byte (struct spvbin_input *input, uint8_t *p) +{ + const void *src = spvbin_parse__ (input, sizeof *p); + if (src && p) + *p = *(const uint8_t *) src; + return src != NULL; +} + +bool +spvbin_parse_int16 (struct spvbin_input *input, uint16_t *p) 
+{ + const void *src = spvbin_parse__ (input, sizeof *p); + if (src && p) + *p = le_to_native16 (get_uint16 (src)); + return src != NULL; +} + +bool +spvbin_parse_int32 (struct spvbin_input *input, uint32_t *p) +{ + const void *src = spvbin_parse__ (input, sizeof *p); + if (src && p) + *p = le_to_native32 (get_uint32 (src)); + return src != NULL; +} + +bool +spvbin_parse_int64 (struct spvbin_input *input, uint64_t *p) +{ + const void *src = spvbin_parse__ (input, sizeof *p); + if (src && p) + *p = le_to_native64 (get_uint64 (src)); + return src != NULL; +} + +bool +spvbin_parse_be16 (struct spvbin_input *input, uint16_t *p) +{ + const void *src = spvbin_parse__ (input, sizeof *p); + if (src && p) + *p = be_to_native16 (get_uint16 (src)); + return src != NULL; +} + +bool +spvbin_parse_be32 (struct spvbin_input *input, uint32_t *p) +{ + const void *src = spvbin_parse__ (input, sizeof *p); + if (src && p) + *p = be_to_native32 (get_uint32 (src)); + return src != NULL; +} + +bool +spvbin_parse_be64 (struct spvbin_input *input, uint64_t *p) +{ + const void *src = spvbin_parse__ (input, sizeof *p); + if (src && p) + *p = be_to_native64 (get_uint64 (src)); + return src != NULL; +} + +bool +spvbin_parse_double (struct spvbin_input *input, double *p) +{ + const void *src = spvbin_parse__ (input, 8); + if (src && p) + *p = float_get_double (FLOAT_IEEE_DOUBLE_LE, src); + return src != NULL; +} + +bool +spvbin_parse_float (struct spvbin_input *input, double *p) +{ + const void *src = spvbin_parse__ (input, 4); + if (src && p) + *p = float_get_double (FLOAT_IEEE_SINGLE_LE, src); + return src != NULL; +} + +static bool +spvbin_parse_string__ (struct spvbin_input *input, + uint32_t (*raw_to_native32) (uint32_t), + char **p) +{ + *p = NULL; + + uint32_t length; + if (input->size - input->ofs < sizeof length) + return false; + + const uint8_t *src = &input->data[input->ofs]; + length = raw_to_native32 (get_uint32 (src)); + if (input->size - input->ofs - sizeof length < length) + 
return false; + + if (p) + *p = xmemdup0 (src + sizeof length, length); + input->ofs += sizeof length + length; + return true; +} + +bool +spvbin_parse_string (struct spvbin_input *input, char **p) +{ + return spvbin_parse_string__ (input, le_to_native32, p); +} + +bool +spvbin_parse_bestring (struct spvbin_input *input, char **p) +{ + return spvbin_parse_string__ (input, be_to_native32, p); +} + +void +spvbin_error (struct spvbin_input *input, const char *name, size_t start) +{ + if (!input->n_errors) + input->error_ofs = input->ofs; + + /* We keep track of the error depth regardless of whether we can store all of + them. The parser needs this to accurately save and restore error + state. */ + if (input->n_errors < SPVBIN_MAX_ERRORS) + { + input->errors[input->n_errors].name = name; + input->errors[input->n_errors].start = start; + } + input->n_errors++; +} + +void +spvbin_print_header (const char *title, size_t start UNUSED, size_t len UNUSED, int indent) +{ + for (int i = 0; i < indent * 4; i++) + putchar (' '); + fputs (title, stdout); +#if 0 + if (start != SIZE_MAX) + printf (" (0x%zx, %zu)", start, len); +#endif + fputs (": ", stdout); +} + +void +spvbin_print_presence (const char *title, int indent, bool present) +{ + spvbin_print_header (title, -1, -1, indent); + puts (present ? "present" : "absent"); +} + +void +spvbin_print_bool (const char *title, int indent, bool x) +{ + spvbin_print_header (title, -1, -1, indent); + printf ("%s\n", x ? 
"true" : "false"); +} + +void +spvbin_print_byte (const char *title, int indent, uint8_t x) +{ + spvbin_print_header (title, -1, -1, indent); + printf ("%"PRIu8"\n", x); +} + +void +spvbin_print_int16 (const char *title, int indent, uint16_t x) +{ + spvbin_print_header (title, -1, -1, indent); + printf ("%"PRIu16"\n", x); +} + +void +spvbin_print_int32 (const char *title, int indent, uint32_t x) +{ + spvbin_print_header (title, -1, -1, indent); + printf ("%"PRIu32"\n", x); +} + +void +spvbin_print_int64 (const char *title, int indent, uint64_t x) +{ + spvbin_print_header (title, -1, -1, indent); + printf ("%"PRIu64"\n", x); +} + +void +spvbin_print_double (const char *title, int indent, double x) +{ + spvbin_print_header (title, -1, -1, indent); + printf ("%g\n", x); +} + +void +spvbin_print_string (const char *title, int indent, const char *s) +{ + spvbin_print_header (title, -1, -1, indent); + if (s) + printf ("\"%s\"\n", s); + else + printf ("none\n"); +} + +void +spvbin_print_case (const char *title, int indent, int x) +{ + spvbin_print_header (title, -1, -1, indent); + printf ("%d\n", x); +} + +struct spvbin_position +spvbin_position_save (const struct spvbin_input *input) +{ + struct spvbin_position pos = { input->ofs }; + return pos; +} + +void +spvbin_position_restore (struct spvbin_position *pos, + struct spvbin_input *input) +{ + input->ofs = pos->ofs; +} + +static bool +spvbin_limit_parse__ (struct spvbin_limit *limit, struct spvbin_input *input, + uint32_t (*raw_to_native32) (uint32_t)) +{ + limit->size = input->size; + + uint32_t count; + if (input->size - input->ofs < sizeof count) + return false; + + const uint8_t *src = &input->data[input->ofs]; + count = raw_to_native32 (get_uint32 (src)); + if (input->size - input->ofs - sizeof count < count) + return false; + + input->ofs += sizeof count; + input->size = input->ofs + count; + return true; +} + +bool +spvbin_limit_parse (struct spvbin_limit *limit, struct spvbin_input *input) +{ + return 
spvbin_limit_parse__ (limit, input, le_to_native32); +} + +bool +spvbin_limit_parse_be (struct spvbin_limit *limit, struct spvbin_input *input) +{ + return spvbin_limit_parse__ (limit, input, be_to_native32); +} + +void +spvbin_limit_pop (struct spvbin_limit *limit, struct spvbin_input *input) +{ + input->size = limit->size; +} diff --git a/src/output/spv/spvbin-helpers.h b/src/output/spv/spvbin-helpers.h new file mode 100644 index 0000000000..1cb8e34072 --- /dev/null +++ b/src/output/spv/spvbin-helpers.h @@ -0,0 +1,94 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . 
*/ + +#ifndef SPVBIN_HELPERS_H +#define SPVBIN_HELPERS_H 1 + +#include +#include +#include + +struct spvbin_input + { + const uint8_t *data; + size_t ofs; + size_t size; + int version; + +#define SPVBIN_MAX_ERRORS 16 + struct + { + const char *name; + size_t start; + } + errors[SPVBIN_MAX_ERRORS]; + size_t n_errors; + size_t error_ofs; + }; + +void spvbin_input_init (struct spvbin_input *, const void *, size_t); +bool spvbin_input_at_end (const struct spvbin_input *); + +char *spvbin_input_to_error (const struct spvbin_input *, const char *name); + +bool spvbin_match_bytes (struct spvbin_input *, const void *, size_t); +bool spvbin_match_byte (struct spvbin_input *, uint8_t); + +bool spvbin_parse_bool (struct spvbin_input *, bool *); +bool spvbin_parse_byte (struct spvbin_input *, uint8_t *); +bool spvbin_parse_int16 (struct spvbin_input *, uint16_t *); +bool spvbin_parse_int32 (struct spvbin_input *, uint32_t *); +bool spvbin_parse_int64 (struct spvbin_input *, uint64_t *); +bool spvbin_parse_be16 (struct spvbin_input *, uint16_t *); +bool spvbin_parse_be32 (struct spvbin_input *, uint32_t *); +bool spvbin_parse_be64 (struct spvbin_input *, uint64_t *); +bool spvbin_parse_double (struct spvbin_input *, double *); +bool spvbin_parse_float (struct spvbin_input *, double *); +bool spvbin_parse_string (struct spvbin_input *, char **); +bool spvbin_parse_bestring (struct spvbin_input *, char **); + +void spvbin_error (struct spvbin_input *, const char *name, size_t start); + +void spvbin_print_header (const char *title, size_t start, size_t len, + int indent); +void spvbin_print_presence (const char *title, int indent, bool); +void spvbin_print_bool (const char *title, int indent, bool); +void spvbin_print_byte (const char *title, int indent, uint8_t); +void spvbin_print_int16 (const char *title, int indent, uint16_t); +void spvbin_print_int32 (const char *title, int indent, uint32_t); +void spvbin_print_int64 (const char *title, int indent, uint64_t); +void 
spvbin_print_double (const char *title, int indent, double); +void spvbin_print_string (const char *title, int indent, const char *); +void spvbin_print_case (const char *title, int indent, int); + +struct spvbin_position + { + size_t ofs; + }; + +struct spvbin_position spvbin_position_save (const struct spvbin_input *); +void spvbin_position_restore (struct spvbin_position *, struct spvbin_input *); + +struct spvbin_limit + { + size_t size; + }; + +bool spvbin_limit_parse (struct spvbin_limit *, struct spvbin_input *); +bool spvbin_limit_parse_be (struct spvbin_limit *, struct spvbin_input *); +void spvbin_limit_pop (struct spvbin_limit *, struct spvbin_input *); + +#endif /* output/spv/spvbin-helpers.h */ diff --git a/src/output/spv/spvxml-helpers.c b/src/output/spv/spvxml-helpers.c new file mode 100644 index 0000000000..eb99ccc3a5 --- /dev/null +++ b/src/output/spv/spvxml-helpers.c @@ -0,0 +1,880 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. 
*/ + +#include + +#include "output/spv/spvxml-helpers.h" + +#include +#include +#include + +#include "libpspp/cast.h" +#include "libpspp/compiler.h" +#include "libpspp/hash-functions.h" +#include "libpspp/str.h" + +#include "gl/xvasprintf.h" + +char * WARN_UNUSED_RESULT +spvxml_context_finish (struct spvxml_context *ctx, struct spvxml_node *root) +{ + if (!ctx->error) + root->class_->spvxml_node_collect_ids (ctx, root); + if (!ctx->error) + root->class_->spvxml_node_resolve_refs (ctx, root); + + hmap_destroy (&ctx->id_map); + + return ctx->error; +} + +void +spvxml_node_context_uninit (struct spvxml_node_context *nctx) +{ + for (struct spvxml_attribute *a = nctx->attrs; + a < &nctx->attrs[nctx->n_attrs]; a++) + free (a->value); +} + +static const char * +xml_element_type_to_string (xmlElementType type) +{ + switch (type) + { + case XML_ELEMENT_NODE: return "element"; + case XML_ATTRIBUTE_NODE: return "attribute"; + case XML_TEXT_NODE: return "text"; + case XML_CDATA_SECTION_NODE: return "CDATA section"; + case XML_ENTITY_REF_NODE: return "entity reference"; + case XML_ENTITY_NODE: return "entity"; + case XML_PI_NODE: return "PI"; + case XML_COMMENT_NODE: return "comment"; + case XML_DOCUMENT_NODE: return "document"; + case XML_DOCUMENT_TYPE_NODE: return "document type"; + case XML_DOCUMENT_FRAG_NODE: return "document fragment"; + case XML_NOTATION_NODE: return "notation"; + case XML_HTML_DOCUMENT_NODE: return "HTML document"; + case XML_DTD_NODE: return "DTD"; + case XML_ELEMENT_DECL: return "element declaration"; + case XML_ATTRIBUTE_DECL: return "attribute declaration"; + case XML_ENTITY_DECL: return "entity declaration"; + case XML_NAMESPACE_DECL: return "namespace declaration"; + case XML_XINCLUDE_START: return "XInclude start"; + case XML_XINCLUDE_END: return "XInclude end"; + case XML_DOCB_DOCUMENT_NODE: return "docb document"; + default: return ""; + } +} + +static void +spvxml_format_node_path (const xmlNode *node, struct string *s) +{ + enum { MAX_STACK = 
32 }; + const xmlNode *stack[MAX_STACK]; + size_t n = 0; + + while (node != NULL && node->type != XML_DOCUMENT_NODE && n < MAX_STACK) + { + stack[n++] = node; + node = node->parent; + } + + while (n > 0) + { + node = stack[--n]; + ds_put_byte (s, '/'); + if (node->name) + ds_put_cstr (s, CHAR_CAST (char *, node->name)); + if (node->type == XML_ELEMENT_NODE) + { + if (node->parent) + { + size_t total = 1; + size_t index = 1; + for (const xmlNode *sibling = node->parent->children; + sibling; sibling = sibling->next) + { + if (sibling == node) + index = total; + else if (sibling->type == XML_ELEMENT_NODE + && !strcmp (CHAR_CAST (char *, sibling->name), + CHAR_CAST (char *, node->name))) + total++; + } + if (total > 1) + ds_put_format (s, "[%zu]", index); + } + } + else + ds_put_format (s, "(%s)", xml_element_type_to_string (node->type)); + } +} + +static struct spvxml_node * +spvxml_node_find (struct spvxml_context *ctx, const char *name, + unsigned int hash) +{ + struct spvxml_node *node; + HMAP_FOR_EACH_WITH_HASH (node, struct spvxml_node, id_node, hash, + &ctx->id_map) + if (!strcmp (node->id, name)) + return node; + + return NULL; +} + +void +spvxml_node_collect_id (struct spvxml_context *ctx, struct spvxml_node *node) +{ + if (!node->id) + return; + + unsigned int hash = hash_string (node->id, 0); + struct spvxml_node *other = spvxml_node_find (ctx, node->id, hash); + if (other) + { + if (!ctx->error) + { + struct string node_path = DS_EMPTY_INITIALIZER; + spvxml_format_node_path (node->raw, &node_path); + + struct string other_path = DS_EMPTY_INITIALIZER; + spvxml_format_node_path (other->raw, &other_path); + + ctx->error = xasprintf ("Nodes %s and %s both have ID \"%s\".", + ds_cstr (&node_path), + ds_cstr (&other_path), node->id); + + ds_destroy (&node_path); + ds_destroy (&other_path); + } + + return; + } + + hmap_insert (&ctx->id_map, &node->id_node, hash); +} + +struct spvxml_node * +spvxml_node_resolve_ref (struct spvxml_context *ctx, + const xmlNode *src, 
const char *attr_name, + const struct spvxml_node_class *const *classes, + size_t n) +{ + char *dst_id = CHAR_CAST ( + char *, xmlGetProp (CONST_CAST (xmlNode *, src), + CHAR_CAST (xmlChar *, attr_name))); + if (!dst_id) + return NULL; + + struct spvxml_node *dst = spvxml_node_find (ctx, dst_id, + hash_string (dst_id, 0)); + if (!dst) + { + struct string node_path = DS_EMPTY_INITIALIZER; + spvxml_format_node_path (src, &node_path); + + ctx->error = xasprintf ( + "%s: Attribute %s has unknown target ID \"%s\".", + ds_cstr (&node_path), attr_name, dst_id); + + ds_destroy (&node_path); + free (dst_id); + return NULL; + } + + if (!n) + { + free (dst_id); + return dst; + } + for (size_t i = 0; i < n; i++) + if (classes[i] == dst->class_) + { + free (dst_id); + return dst; + } + + if (!ctx->error) + { + struct string s = DS_EMPTY_INITIALIZER; + spvxml_format_node_path (src, &s); + + ds_put_format (&s, ": Attribute \"%s\" should refer to a \"%s\"", + attr_name, classes[0]->name); + if (n == 2) + ds_put_format (&s, " or \"%s\"", classes[1]->name); + else if (n > 2) + { + for (size_t i = 1; i < n - 1; i++) + ds_put_format (&s, ", \"%s\"", classes[i]->name); + ds_put_format (&s, ", or \"%s\"", classes[n - 1]->name); + } + ds_put_format (&s, " element, but its target ID \"%s\" " + "actually refers to a \"%s\" element.", + dst_id, dst->class_->name); + + ctx->error = ds_steal_cstr (&s); + } + + free (dst_id); + return NULL; +} + +void PRINTF_FORMAT (2, 3) +spvxml_attr_error (struct spvxml_node_context *nctx, const char *format, ...) 
+{ + if (nctx->up->error) + return; + + struct string s = DS_EMPTY_INITIALIZER; + ds_put_cstr (&s, "error parsing attributes of "); + spvxml_format_node_path (nctx->parent, &s); + + va_list args; + va_start (args, format); + ds_put_cstr (&s, ": "); + ds_put_vformat (&s, format, args); + va_end (args); + + nctx->up->error = ds_steal_cstr (&s); +} + +/* xmlGetPropNodeValueInternal() is from tree.c in libxml2 2.9.4+dfsg1, which + is covered by the following copyright and license: + + Except where otherwise noted in the source code (e.g. the files hash.c, + list.c and the trio files, which are covered by a similar licence but with + different Copyright notices) all the files are: + + Copyright (C) 1998-2012 Daniel Veillard. All Rights Reserved. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to + deal in the Software without restriction, including without limitation the + rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + sell copies of the Software, and to permit persons to whom the Software is + fur- nished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FIT- NESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + IN THE SOFTWARE. 
+*/ +static xmlChar* +xmlGetPropNodeValueInternal(const xmlAttr *prop) +{ + if (prop == NULL) + return(NULL); + if (prop->type == XML_ATTRIBUTE_NODE) { + /* + * Note that we return at least the empty string. + * TODO: Do we really always want that? + */ + if (prop->children != NULL) { + if ((prop->children->next == NULL) && + ((prop->children->type == XML_TEXT_NODE) || + (prop->children->type == XML_CDATA_SECTION_NODE))) + { + /* + * Optimization for the common case: only 1 text node. + */ + return(xmlStrdup(prop->children->content)); + } else { + xmlChar *ret; + + ret = xmlNodeListGetString(prop->doc, prop->children, 1); + if (ret != NULL) + return(ret); + } + } + return(xmlStrdup((xmlChar *)"")); + } else if (prop->type == XML_ATTRIBUTE_DECL) { + return(xmlStrdup(((xmlAttributePtr)prop)->defaultValue)); + } + return(NULL); +} + +static struct spvxml_attribute * +find_attribute (struct spvxml_node_context *nctx, const char *name) +{ + /* XXX This is linear search but we could use binary search. 
*/ + for (struct spvxml_attribute *a = nctx->attrs; + a < &nctx->attrs[nctx->n_attrs]; a++) + if (!strcmp (a->name, name)) + return a; + + return NULL; +} + +static void +format_attribute (struct string *s, const xmlAttr *attr) +{ + const char *name = CHAR_CAST (char *, attr->name); + char *value = CHAR_CAST (char *, xmlGetPropNodeValueInternal (attr)); + ds_put_format (s, "%s=\"%s\"", name, value); + free (value); +} + +void +spvxml_parse_attributes (struct spvxml_node_context *nctx) +{ + for (const xmlAttr *node = nctx->parent->properties; node; node = node->next) + { + const char *node_name = CHAR_CAST (char *, node->name); + struct spvxml_attribute *a = find_attribute (nctx, node_name); + if (!a) + { + if (!strcmp (node_name, "id")) + continue; + + struct string unexpected = DS_EMPTY_INITIALIZER; + format_attribute (&unexpected, node); + int n = 1; + + for (node = node->next; node; node = node->next) + { + node_name = CHAR_CAST (char *, node->name); + if (!find_attribute (nctx, node_name) + && strcmp (node_name, "id")) + { + ds_put_byte (&unexpected, ' '); + format_attribute (&unexpected, node); + n++; + } + } + + spvxml_attr_error (nctx, "Node has unexpected attribute%s: %s", + n > 1 ? 
"s" : "", ds_cstr (&unexpected)); + ds_destroy (&unexpected); + return; + } + if (a->value) + { + spvxml_attr_error (nctx, "Duplicate attribute \"%s\".", a->name); + return; + } + a->value = CHAR_CAST (char *, xmlGetPropNodeValueInternal (node)); + } + + for (struct spvxml_attribute *a = nctx->attrs; + a < &nctx->attrs[nctx->n_attrs]; a++) + { + if (a->required && !a->value) + spvxml_attr_error (nctx, "Missing required attribute \"%s\".", + a->name); + return; + } +} + +int +spvxml_attr_parse_enum (struct spvxml_node_context *nctx, + const struct spvxml_attribute *a, + const struct spvxml_enum enums[]) +{ + if (!a->value) + return 0; + + for (const struct spvxml_enum *e = enums; e->name; e++) + if (!strcmp (a->value, e->name)) + return e->value; + + for (const struct spvxml_enum *e = enums; e->name; e++) + if (!strcmp (e->name, "OTHER")) + return e->value; + + spvxml_attr_error (nctx, "Attribute %s has unexpected value \"%s\".", + a->name, a->value); + return 0; +} + +int +spvxml_attr_parse_bool (struct spvxml_node_context *nctx, + const struct spvxml_attribute *a) +{ + static const struct spvxml_enum bool_enums[] = { + { "true", 1 }, + { "false", 0 }, + { NULL, 0 }, + }; + + return !a->value ? 
-1 : spvxml_attr_parse_enum (nctx, a, bool_enums); +} + +bool +spvxml_attr_parse_fixed (struct spvxml_node_context *nctx, + const struct spvxml_attribute *a, + const char *attr_value) +{ + const struct spvxml_enum fixed_enums[] = { + { attr_value, true }, + { NULL, 0 }, + }; + + return spvxml_attr_parse_enum (nctx, a, fixed_enums); +} + +int +spvxml_attr_parse_int (struct spvxml_node_context *nctx, + const struct spvxml_attribute *a) +{ + if (!a->value) + return INT_MIN; + + char *tail = NULL; + int save_errno = errno; + errno = 0; + long int integer = strtol (a->value, &tail, 10); + if (errno || *tail || integer <= INT_MIN || integer > INT_MAX) + { + spvxml_attr_error (nctx, "Attribute %s has unexpected value " + "\"%s\" expecting small integer.", a->name, a->value); + integer = INT_MIN; + } + errno = save_errno; + + return integer; +} + +static int +lookup_color_name (const char *s) +{ + struct color + { + struct hmap_node hmap_node; + const char *name; + int code; + }; + + static struct color colors[] = + { + { .name = "aliceblue", .code = 0xf0f8ff }, + { .name = "antiquewhite", .code = 0xfaebd7 }, + { .name = "aqua", .code = 0x00ffff }, + { .name = "aquamarine", .code = 0x7fffd4 }, + { .name = "azure", .code = 0xf0ffff }, + { .name = "beige", .code = 0xf5f5dc }, + { .name = "bisque", .code = 0xffe4c4 }, + { .name = "black", .code = 0x000000 }, + { .name = "blanchedalmond", .code = 0xffebcd }, + { .name = "blue", .code = 0x0000ff }, + { .name = "blueviolet", .code = 0x8a2be2 }, + { .name = "brown", .code = 0xa52a2a }, + { .name = "burlywood", .code = 0xdeb887 }, + { .name = "cadetblue", .code = 0x5f9ea0 }, + { .name = "chartreuse", .code = 0x7fff00 }, + { .name = "chocolate", .code = 0xd2691e }, + { .name = "coral", .code = 0xff7f50 }, + { .name = "cornflowerblue", .code = 0x6495ed }, + { .name = "cornsilk", .code = 0xfff8dc }, + { .name = "crimson", .code = 0xdc143c }, + { .name = "cyan", .code = 0x00ffff }, + { .name = "darkblue", .code = 0x00008b }, + { .name 
= "darkcyan", .code = 0x008b8b }, + { .name = "darkgoldenrod", .code = 0xb8860b }, + { .name = "darkgray", .code = 0xa9a9a9 }, + { .name = "darkgreen", .code = 0x006400 }, + { .name = "darkgrey", .code = 0xa9a9a9 }, + { .name = "darkkhaki", .code = 0xbdb76b }, + { .name = "darkmagenta", .code = 0x8b008b }, + { .name = "darkolivegreen", .code = 0x556b2f }, + { .name = "darkorange", .code = 0xff8c00 }, + { .name = "darkorchid", .code = 0x9932cc }, + { .name = "darkred", .code = 0x8b0000 }, + { .name = "darksalmon", .code = 0xe9967a }, + { .name = "darkseagreen", .code = 0x8fbc8f }, + { .name = "darkslateblue", .code = 0x483d8b }, + { .name = "darkslategray", .code = 0x2f4f4f }, + { .name = "darkslategrey", .code = 0x2f4f4f }, + { .name = "darkturquoise", .code = 0x00ced1 }, + { .name = "darkviolet", .code = 0x9400d3 }, + { .name = "deeppink", .code = 0xff1493 }, + { .name = "deepskyblue", .code = 0x00bfff }, + { .name = "dimgray", .code = 0x696969 }, + { .name = "dimgrey", .code = 0x696969 }, + { .name = "dodgerblue", .code = 0x1e90ff }, + { .name = "firebrick", .code = 0xb22222 }, + { .name = "floralwhite", .code = 0xfffaf0 }, + { .name = "forestgreen", .code = 0x228b22 }, + { .name = "fuchsia", .code = 0xff00ff }, + { .name = "gainsboro", .code = 0xdcdcdc }, + { .name = "ghostwhite", .code = 0xf8f8ff }, + { .name = "gold", .code = 0xffd700 }, + { .name = "goldenrod", .code = 0xdaa520 }, + { .name = "gray", .code = 0x808080 }, + { .name = "green", .code = 0x008000 }, + { .name = "greenyellow", .code = 0xadff2f }, + { .name = "grey", .code = 0x808080 }, + { .name = "honeydew", .code = 0xf0fff0 }, + { .name = "hotpink", .code = 0xff69b4 }, + { .name = "indianred", .code = 0xcd5c5c }, + { .name = "indigo", .code = 0x4b0082 }, + { .name = "ivory", .code = 0xfffff0 }, + { .name = "khaki", .code = 0xf0e68c }, + { .name = "lavender", .code = 0xe6e6fa }, + { .name = "lavenderblush", .code = 0xfff0f5 }, + { .name = "lawngreen", .code = 0x7cfc00 }, + { .name = "lemonchiffon", 
.code = 0xfffacd }, + { .name = "lightblue", .code = 0xadd8e6 }, + { .name = "lightcoral", .code = 0xf08080 }, + { .name = "lightcyan", .code = 0xe0ffff }, + { .name = "lightgoldenrodyellow", .code = 0xfafad2 }, + { .name = "lightgray", .code = 0xd3d3d3 }, + { .name = "lightgreen", .code = 0x90ee90 }, + { .name = "lightgrey", .code = 0xd3d3d3 }, + { .name = "lightpink", .code = 0xffb6c1 }, + { .name = "lightsalmon", .code = 0xffa07a }, + { .name = "lightseagreen", .code = 0x20b2aa }, + { .name = "lightskyblue", .code = 0x87cefa }, + { .name = "lightslategray", .code = 0x778899 }, + { .name = "lightslategrey", .code = 0x778899 }, + { .name = "lightsteelblue", .code = 0xb0c4de }, + { .name = "lightyellow", .code = 0xffffe0 }, + { .name = "lime", .code = 0x00ff00 }, + { .name = "limegreen", .code = 0x32cd32 }, + { .name = "linen", .code = 0xfaf0e6 }, + { .name = "magenta", .code = 0xff00ff }, + { .name = "maroon", .code = 0x800000 }, + { .name = "mediumaquamarine", .code = 0x66cdaa }, + { .name = "mediumblue", .code = 0x0000cd }, + { .name = "mediumorchid", .code = 0xba55d3 }, + { .name = "mediumpurple", .code = 0x9370db }, + { .name = "mediumseagreen", .code = 0x3cb371 }, + { .name = "mediumslateblue", .code = 0x7b68ee }, + { .name = "mediumspringgreen", .code = 0x00fa9a }, + { .name = "mediumturquoise", .code = 0x48d1cc }, + { .name = "mediumvioletred", .code = 0xc71585 }, + { .name = "midnightblue", .code = 0x191970 }, + { .name = "mintcream", .code = 0xf5fffa }, + { .name = "mistyrose", .code = 0xffe4e1 }, + { .name = "moccasin", .code = 0xffe4b5 }, + { .name = "navajowhite", .code = 0xffdead }, + { .name = "navy", .code = 0x000080 }, + { .name = "oldlace", .code = 0xfdf5e6 }, + { .name = "olive", .code = 0x808000 }, + { .name = "olivedrab", .code = 0x6b8e23 }, + { .name = "orange", .code = 0xffa500 }, + { .name = "orangered", .code = 0xff4500 }, + { .name = "orchid", .code = 0xda70d6 }, + { .name = "palegoldenrod", .code = 0xeee8aa }, + { .name = "palegreen", 
.code = 0x98fb98 }, + { .name = "paleturquoise", .code = 0xafeeee }, + { .name = "palevioletred", .code = 0xdb7093 }, + { .name = "papayawhip", .code = 0xffefd5 }, + { .name = "peachpuff", .code = 0xffdab9 }, + { .name = "peru", .code = 0xcd853f }, + { .name = "pink", .code = 0xffc0cb }, + { .name = "plum", .code = 0xdda0dd }, + { .name = "powderblue", .code = 0xb0e0e6 }, + { .name = "purple", .code = 0x800080 }, + { .name = "red", .code = 0xff0000 }, + { .name = "rosybrown", .code = 0xbc8f8f }, + { .name = "royalblue", .code = 0x4169e1 }, + { .name = "saddlebrown", .code = 0x8b4513 }, + { .name = "salmon", .code = 0xfa8072 }, + { .name = "sandybrown", .code = 0xf4a460 }, + { .name = "seagreen", .code = 0x2e8b57 }, + { .name = "seashell", .code = 0xfff5ee }, + { .name = "sienna", .code = 0xa0522d }, + { .name = "silver", .code = 0xc0c0c0 }, + { .name = "skyblue", .code = 0x87ceeb }, + { .name = "slateblue", .code = 0x6a5acd }, + { .name = "slategray", .code = 0x708090 }, + { .name = "slategrey", .code = 0x708090 }, + { .name = "snow", .code = 0xfffafa }, + { .name = "springgreen", .code = 0x00ff7f }, + { .name = "steelblue", .code = 0x4682b4 }, + { .name = "tan", .code = 0xd2b48c }, + { .name = "teal", .code = 0x008080 }, + { .name = "thistle", .code = 0xd8bfd8 }, + { .name = "tomato", .code = 0xff6347 }, + { .name = "turquoise", .code = 0x40e0d0 }, + { .name = "violet", .code = 0xee82ee }, + { .name = "wheat", .code = 0xf5deb3 }, + { .name = "white", .code = 0xffffff }, + { .name = "whitesmoke", .code = 0xf5f5f5 }, + { .name = "yellow", .code = 0xffff00 }, + { .name = "yellowgreen", .code = 0x9acd32 }, + }; + + static struct hmap color_table = HMAP_INITIALIZER (color_table); + + if (hmap_is_empty (&color_table)) + for (size_t i = 0; i < sizeof colors / sizeof *colors; i++) + hmap_insert (&color_table, &colors[i].hmap_node, + hash_string (colors[i].name, 0)); + + const struct color *color; + HMAP_FOR_EACH_WITH_HASH (color, struct color, hmap_node, + hash_string (s, 
0), &color_table) + if (!strcmp (color->name, s)) + return color->code; + return -1; +} + +int +spvxml_attr_parse_color (struct spvxml_node_context *nctx, + const struct spvxml_attribute *a) +{ + if (!a->value || !strcmp (a->value, "transparent")) + return -1; + + int r, g, b; + if (sscanf (a->value, "#%2x%2x%2x", &r, &g, &b) == 3 + || sscanf (a->value, "%2x%2x%2x", &r, &g, &b) == 3) + return (r << 16) | (g << 8) | b; + + int code = lookup_color_name (a->value); + if (code >= 0) + return code; + + spvxml_attr_error (nctx, "Attribute %s has unexpected value " + "\"%s\" expecting #rrggbb or rrggbb or web color name.", + a->name, a->value); + return 0; +} + +static bool +try_strtod (char *s, char **tail, double *real) +{ + char *comma = strchr (s, ','); + if (comma) + *comma = '.'; + + int save_errno = errno; + errno = 0; + *tail = NULL; + *real = strtod (s, tail); + bool ok = errno == 0; + errno = save_errno; + + if (!ok) + *real = DBL_MAX; + return ok; +} + +double +spvxml_attr_parse_real (struct spvxml_node_context *nctx, + const struct spvxml_attribute *a) +{ + if (!a->value) + return DBL_MAX; + + char *tail; + double real; + if (!try_strtod (a->value, &tail, &real) || *tail) + spvxml_attr_error (nctx, "Attribute %s has unexpected value " + "\"%s\" expecting real number.", a->name, a->value); + + return real; +} + +double +spvxml_attr_parse_dimension (struct spvxml_node_context *nctx, + const struct spvxml_attribute *a) +{ + if (!a->value) + return DBL_MAX; + + char *tail; + double real; + if (!try_strtod (a->value, &tail, &real)) + goto error; + + tail += strspn (tail, " \t\r\n"); + + struct unit + { + const char *name; + double divisor; + }; + static const struct unit units[] = { + +/* If you add anything to this table, update the table in + doc/dev/spv-file-format.texi also. */ + + /* Inches. */ + { "in", 1.0 }, + { "인치", 1.0 }, + { "pol.", 1.0 }, + { "cala", 1.0 }, + { "cali", 1.0 }, + + /* Device-independent pixels. */ + { "px", 96.0 }, + + /* Points. 
*/ + { "pt", 72.0 }, + { "пт", 72.0 }, + { "", 72.0 }, + + /* Centimeters. */ + { "cm", 2.54 }, + { "см", 2.54 }, + }; + + for (size_t i = 0; i < sizeof units / sizeof *units; i++) + if (!strcmp (units[i].name, tail)) + return real / units[i].divisor; + goto error; + +error: + spvxml_attr_error (nctx, "Attribute %s has unexpected value " + "\"%s\" expecting dimension.", a->name, a->value); + return DBL_MAX; +} + +struct spvxml_node * +spvxml_attr_parse_ref (struct spvxml_node_context *nctx UNUSED, + const struct spvxml_attribute *a UNUSED) +{ + return NULL; +} + +void PRINTF_FORMAT (3, 4) +spvxml_content_error (struct spvxml_node_context *nctx, const xmlNode *node, + const char *format, ...) +{ + if (nctx->up->error) + return; + + struct string s = DS_EMPTY_INITIALIZER; + + ds_put_cstr (&s, "error parsing content of "); + spvxml_format_node_path (nctx->parent, &s); + + if (node) + { + ds_put_format (&s, " at %s", xml_element_type_to_string (node->type)); + if (node->name) + ds_put_format (&s, " \"%s\"", node->name); + } + else + ds_put_format (&s, " at end of content"); + + va_list args; + va_start (args, format); + ds_put_cstr (&s, ": "); + ds_put_vformat (&s, format, args); + va_end (args); + + //puts (ds_cstr (&s)); + + nctx->up->error = ds_steal_cstr (&s); +} + +bool +spvxml_content_parse_element (struct spvxml_node_context *nctx, + xmlNode **nodep, + const char *elem_name, xmlNode **outp) +{ + xmlNode *node = *nodep; + while (node) + { + if (node->type == XML_ELEMENT_NODE + && (!strcmp (CHAR_CAST (char *, node->name), elem_name) + || !strcmp (elem_name, "any"))) + { + *outp = node; + *nodep = node->next; + return true; + } + else if (node->type != XML_COMMENT_NODE) + break; + + node = node->next; + } + + spvxml_content_error (nctx, node, "\"%s\" element expected.", elem_name); + *outp = NULL; + return false; +} + +bool +spvxml_content_parse_text (struct spvxml_node_context *nctx UNUSED, xmlNode **nodep, + char **textp) +{ + struct string text = 
DS_EMPTY_INITIALIZER; + + xmlNode *node = *nodep; + while (node) + { + if (node->type == XML_TEXT_NODE || node->type == XML_CDATA_SECTION_NODE) + { + char *segment = CHAR_CAST (char *, xmlNodeGetContent (node)); + if (!text.ss.string) + { + text.ss = ss_cstr (segment); + text.capacity = text.ss.length; + } + else + { + ds_put_cstr (&text, segment); + free (segment); + } + } + else if (node->type != XML_COMMENT_NODE) + break; + + node = node->next; + } + *nodep = node; + + *textp = ds_steal_cstr (&text); + + return true; +} + +bool +spvxml_content_parse_end (struct spvxml_node_context *nctx, xmlNode *node) +{ + for (;;) + { + if (!node) + return true; + else if (node->type != XML_COMMENT_NODE) + break; + + node = node->next; + } + + struct string s = DS_EMPTY_INITIALIZER; + + for (int i = 0; i < 4 && node; i++, node = node->next) + { + if (i) + ds_put_cstr (&s, ", "); + ds_put_cstr (&s, xml_element_type_to_string (node->type)); + if (node->name) + ds_put_format (&s, " \"%s\"", node->name); + } + if (node) + ds_put_format (&s, ", ..."); + + spvxml_content_error (nctx, node, "Extra content found expecting end: %s", + ds_cstr (&s)); + ds_destroy (&s); + + return false; +} + diff --git a/src/output/spv/spvxml-helpers.h b/src/output/spv/spvxml-helpers.h new file mode 100644 index 0000000000..16f7335013 --- /dev/null +++ b/src/output/spv/spvxml-helpers.h @@ -0,0 +1,126 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#ifndef SPVXML_HELPERS_H +#define SPVXML_HELPERS_H 1 + +#include +#include +#include +#include "libpspp/compiler.h" +#include "libpspp/hmap.h" + +struct spvxml_node; + +struct spvxml_context + { + struct hmap id_map; + + char *error; + bool hard_error; + }; + +#define SPVXML_CONTEXT_INIT(CONTEXT) \ + { HMAP_INITIALIZER ((CONTEXT).id_map), NULL, false } + +char *spvxml_context_finish (struct spvxml_context *, struct spvxml_node *root) + WARN_UNUSED_RESULT; + +struct spvxml_node_context + { + struct spvxml_context *up; + const xmlNode *parent; + + struct spvxml_attribute *attrs; + size_t n_attrs; + }; + +void spvxml_node_context_uninit (struct spvxml_node_context *); + +struct spvxml_node_class + { + const char *name; + void (*spvxml_node_free) (struct spvxml_node *); + void (*spvxml_node_collect_ids) (struct spvxml_context *, + struct spvxml_node *); + void (*spvxml_node_resolve_refs) (struct spvxml_context *, + struct spvxml_node *); + }; + +struct spvxml_node + { + struct hmap_node id_node; + char *id; + + const struct spvxml_node_class *class_; + const xmlNode *raw; + }; + +void spvxml_node_collect_id (struct spvxml_context *, struct spvxml_node *); +struct spvxml_node *spvxml_node_resolve_ref ( + struct spvxml_context *, const xmlNode *, const char *attr_name, + const struct spvxml_node_class *const *, size_t n); + +/* Attribute parsing. */ +struct spvxml_attribute + { + const char *name; + bool required; + char *value; + }; + +void spvxml_parse_attributes (struct spvxml_node_context *); +void spvxml_attr_error (struct spvxml_node_context *, const char *format, ...) 
+ PRINTF_FORMAT (2, 3); + +struct spvxml_enum + { + const char *name; + int value; + }; + +int spvxml_attr_parse_enum (struct spvxml_node_context *, + const struct spvxml_attribute *, + const struct spvxml_enum[]); +int spvxml_attr_parse_bool (struct spvxml_node_context *, + const struct spvxml_attribute *); +bool spvxml_attr_parse_fixed (struct spvxml_node_context *, + const struct spvxml_attribute *, + const char *attr_value); +int spvxml_attr_parse_int (struct spvxml_node_context *, + const struct spvxml_attribute *); +int spvxml_attr_parse_color (struct spvxml_node_context *, + const struct spvxml_attribute *); +double spvxml_attr_parse_real (struct spvxml_node_context *, + const struct spvxml_attribute *); +double spvxml_attr_parse_dimension (struct spvxml_node_context *, + const struct spvxml_attribute *); +struct spvxml_node *spvxml_attr_parse_ref (struct spvxml_node_context *, + const struct spvxml_attribute *); + +/* Content parsing. */ + +void spvxml_content_error (struct spvxml_node_context *, const xmlNode *, + const char *format, ...) + PRINTF_FORMAT (3, 4); +bool spvxml_content_parse_element (struct spvxml_node_context *, xmlNode **, + const char *elem_name, xmlNode **); +bool spvxml_content_parse_text (struct spvxml_node_context *, xmlNode **, + char **textp); +void spvxml_content_parse_etc (xmlNode **); +bool spvxml_content_parse_end (struct spvxml_node_context *, xmlNode *); + +#endif /* output/spv/spvxml-helpers.h */ diff --git a/src/output/spv/structure-xml.grammar b/src/output/spv/structure-xml.grammar new file mode 100644 index 0000000000..61dc5d48f4 --- /dev/null +++ b/src/output/spv/structure-xml.grammar @@ -0,0 +1,193 @@ +# PSPP - a program for statistical analysis. +# Copyright (C) 2017, 2018, 2019 Free Software Foundation, Inc. 
+# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +heading[root_heading] + :creator-version? + :creator? + :creation-date-time? + :lockReader=bool? + :schemaLocation? +=> label pageSetup? (container | heading)* + +heading + :creator-version? + :commandName? + :visibility[heading_visibility]=(collapsed)? + :locale? + :olang? +=> label (container | heading)* + +label => TEXT + +container + :visibility=(visible | hidden) + :page-break-before=(always)? + :text-align=(left | center)? + :width=dimension +=> label (table | container_text | graph | model | object | image | tree) + +text[container_text] + :type[text_type]=(title | log | text | page-title) + :commandName? + :creator-version? +=> html + +html :lang=(en) => TEXT + +table + :VDPId? + :ViZmlSource? + :activePageId=int? + :commandName + :creator-version? + :displayFiltering=bool? + :maxNumCells=int? + :orphanTolerance=int? + :rowBreakNumber=int? + :subType + :tableId + :tableLookId? + :type[table_type]=(table | note | warning) +=> tableProperties? tableStructure + +tableProperties +=> generalProperties footnoteProperties cellFormatProperties borderProperties printingProperties + +generalProperties + :hideEmptyRows=bool? + :maximumColumnWidth=dimension? + :maximumRowWidth=dimension? + :minimumColumnWidth=dimension? + :minimumRowWidth=dimension? + :rowDimensionLabels=(inCorner | nested)? 
+=> EMPTY + +footnoteProperties + :markerPosition=(superscript | subscript)? + :numberFormat=(alphabetic | numeric)? +=> EMPTY + +cellFormatProperties => cell_style+ + +any[cell_style] + :alternatingColor=color? + :alternatingTextColor=color? +=> style + +style + :color=color? + :color2=color? + :font-family? + :font-size? + :font-style=(regular | italic)? + :font-weight=(regular | bold)? + :labelLocationVertical=(positive | negative | center)? + :margin-bottom=dimension? + :margin-left=dimension? + :margin-right=dimension? + :margin-top=dimension? + :textAlignment=(left | right | center | decimal | mixed)? + :decimal-offset=dimension? +=> EMPTY + +borderProperties => border_style+ + +any[border_style] + :borderStyleType=(none | solid | dashed | thick | thin | double)? + :color=color? +=> EMPTY + +printingProperties + :printAllLayers=bool? + :rescaleLongTableToFitPage=bool? + :rescaleWideTableToFitPage=bool? + :windowOrphanLines=int? + :continuationText? + :continuationTextAtBottom=bool? + :continuationTextAtTop=bool? + :printEachLayerOnSeparatePage=bool? +=> EMPTY + +tableStructure => path? dataPath + +graph + :VDPId? + :ViZmlSource? + :commandName? + :creator-version? + :dataMapId? + :dataMapURI? + :editor? + :refMapId? + :refMapURI? + :csvFileIds? + :csvFileNames? +=> dataPath? path csvPath? + +model + :PMMLContainerId? + :PMMLId + :StatXMLContainerId + :VDPId + :auxiliaryViewName + :commandName + :creator-version + :mainViewName +=> ViZml? dataPath? path | pmmlContainerPath statsContainerPath + +tree + :commandName + :creator-version + :name + :type +=> dataPath path + +pmmlContainerPath => TEXT + +statsContainerPath => TEXT + +ViZml :viewName? => TEXT + +dataPath => TEXT + +path => TEXT + +csvPath => TEXT + +pageSetup + :initial-page-number=int? + :chart-size=(as-is | full-height | half-height | quarter-height | OTHER)? + :margin-left=dimension? + :margin-right=dimension? + :margin-top=dimension? + :margin-bottom=dimension? + :paper-height=dimension? 
+ :paper-width=dimension? + :reference-orientation? + :space-after=dimension? +=> pageHeader pageFooter + +pageHeader => pageParagraph? + +pageFooter => pageParagraph? + +pageParagraph => pageParagraph_text + +text[pageParagraph_text] :type=(title | text) => TEXT + +object :type :uri => EMPTY + +image :VDPId :commandName => dataPath diff --git a/src/output/spv/xml-parser-generator b/src/output/spv/xml-parser-generator new file mode 100644 index 0000000000..eb07f17224 --- /dev/null +++ b/src/output/spv/xml-parser-generator @@ -0,0 +1,1092 @@ +#! /usr/bin/python + +# PSPP - a program for statistical analysis. +# Copyright (C) 2017, 2018, 2019 Free Software Foundation, Inc. +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see . 
+ +import getopt +import os +import re +import struct +import sys + +n_errors = 0 + +def error(msg): + global n_errors + sys.stderr.write("%s:%d: %s\n" % (file_name, line_number, msg)) + n_errors += 1 + + +def fatal(msg): + error(msg) + sys.exit(1) + + +def get_line(): + global line + global line_number + line = input_file.readline() + line = re.sub('#.*', '\n', line) + line_number += 1 + + +def expect(type): + if token[0] != type: + fatal("syntax error expecting %s" % type) + + +def match(type): + if token[0] == type: + get_token() + return True + else: + return False + + +def must_match(type): + expect(type) + get_token() + + +def match_id(id_): + if token == ('id', id_): + get_token() + return True + else: + return False + + +def is_idchar(c): + return c.isalnum() or c in '-_' + + +def get_token(): + global token + global line + prev = token + while True: + if line == "": + if token == ('eof', ): + fatal("unexpected end of input") + get_line() + if not line: + token = ('eof', ) + break + elif line == '\n': + token = (';', ) + break + + line = line.lstrip() + if line == "": + continue + + if line.startswith('=>'): + token = (line[:2],) + line = line[2:] + elif line[0] in '[]()?|*+=:': + token = (line[0],) + line = line[1:] + elif is_idchar(line[0]): + n = 1 + while n < len(line) and is_idchar(line[n]): + n += 1 + s = line[:n] + token = ('id', s) + line = line[n:] + else: + fatal("unknown character %c" % line[0]) + break + + +def usage(): + argv0 = os.path.basename(sys.argv[0]) + print('''\ +%(argv0)s, parser generator for SPV XML members +usage: %(argv0)s GRAMMAR header PREFIX + %(argv0)s GRAMMAR code PREFIX HEADER_NAME + where GRAMMAR contains grammar definitions\ +''' % {"argv0": argv0}) + sys.exit(0) + + +def parse_term(): + if match('('): + sub = parse_alternation() + must_match(')') + return sub + else: + member_name, nonterminal_name = parse_name() + if member_name.isupper(): + fatal('%s; unknown terminal' % member_name) + else: + return {'type': 
'nonterminal', + 'nonterminal_name': nonterminal_name, + 'member_name': member_name} + + +def parse_quantified(): + item = parse_term() + if token[0] in ['*', '+', '?']: + item = {'type': token[0], 'item': item} + get_token() + return item + + +def parse_sequence(): + if match_id('EMPTY'): + return {'type': 'empty'} + items = [] + while True: + sub = parse_quantified() + if sub['type'] == 'sequence': + items.extend(sub[1:]) + else: + items.append(sub) + if token[0] in ('|', ';', ')', 'eof'): + break + return {'type': 'sequence', 'items': items} if len(items) > 1 else items[0] + + +def parse_alternation(): + items = [parse_sequence()] + while match('|'): + items.append(parse_sequence()) + if len(items) > 1: + return {'type': '|', 'items': items} + else: + return items[0] + + +def parse_name(): + # The name used in XML for the attribute or element always comes + # first. + expect('id') + xml_name = token[1] + get_token() + + # If a different name is needed to disambiguate when the same name + # is used in different contexts in XML, it comes later, in + # brackets. 
+ if match('['): + expect('id') + unique_name = token[1] + get_token() + must_match(']') + else: + unique_name = xml_name + + return unique_name, xml_name + + +enums = {} +def parse_production(): + unique_name, xml_name = parse_name() + + attr_xml_names = set() + attributes = {} + while match(':'): + attr_unique_name, attr_xml_name = parse_name() + if match('='): + if match('('): + attr_value = set() + while not match(')'): + expect('id') + attr_value.add(token[1]) + get_token() + match('|') + + global enums + if attr_unique_name not in enums: + enums[attr_unique_name] = attr_value + elif enums[attr_unique_name] != attr_value: + sys.stderr.write('%s: different enums with same name\n' + % attr_unique_name) + sys.exit(1) + elif match_id('bool'): + attr_value = set(('true', 'false')) + elif match_id('dimension'): + attr_value = 'dimension' + elif match_id('real'): + attr_value = 'real' + elif match_id('int'): + attr_value = 'int' + elif match_id('color'): + attr_value = 'color' + elif match_id('ref'): + if token[0] == 'id': + ref_type = token[1] + attr_value = ('ref', ref_type) + get_token() + elif match('('): + ref_types = set() + while not match(')'): + expect('id') + ref_types.add(token[1]) + get_token() + match('|') + attr_value = ('ref', ref_types) + else: + attr_value = ('ref', None) + else: + fatal("unknown attribute value type") + else: + attr_value = 'string' + attr_required = not match('?') + + if attr_xml_name == 'id': + if attr_value != 'string': + fatal("id attribute must have string type") + attr_value = 'id' + + if attr_unique_name in attributes: + fatal("production %s has two attributes %s" % (unique_name, + attr_unique_name)) + if attr_xml_name in attr_xml_names: + fatal("production %s has two attributes %s" % (unique_name, + attr_xml_name)) + attr_xml_names.add(attr_xml_name) + attributes[attr_unique_name] = (attr_xml_name, + attr_value, attr_required) + if 'id' not in attributes: + attributes["id"] = ('id', 'id', False) + + must_match('=>') + + if 
match_id('TEXT'): + rhs = {'type': 'text'} + elif match_id('ETC'): + rhs = {'type': 'etc'} + else: + rhs = parse_alternation() + + n = 0 + for a in rhs['items'] if rhs['type'] == '|' else (rhs,): + for term in a['items'] if a['type'] == 'sequence' else (a,): + if term['type'] == 'empty': + pass + elif term['type'] == 'nonterminal': + pass + elif term['type'] == '?' and term['item']['type'] == 'nonterminal': + pass + elif (term['type'] in ('*', '+') + and term['item']['type'] == 'nonterminal'): + pass + else: + n += 1 + term['seq_name'] = 'seq' if n == 1 else 'seq%d' % n + + return unique_name, xml_name, attributes, rhs + + +used_enums = set() +def print_members(attributes, rhs, indent): + attrs = [] + new_enums = set() + for unique_name, (xml_name, value, required) in attributes.items(): + c_name = name_to_id(unique_name) + if type(value) is set: + if len(value) <= 1: + if not required: + attrs += [('bool %s_present;' % c_name, + 'True if attribute present')] + elif value == set(('true', 'false')): + if required: + attrs += [('bool %s;' % c_name, None)] + else: + attrs += [('int %s;' % c_name, + '-1 if not present, otherwise 0 or 1')] + else: + attrs += [('enum %s%s %s;' % (prefix, c_name, c_name), + 'Always nonzero' if required else + 'Zero if not present')] + + global used_enums + if unique_name not in used_enums: + new_enums.add(unique_name) + elif value == 'dimension' or value == 'real': + attrs += [('double %s;' % c_name, + 'In inches. 
' + ('Always present' if required else + 'DBL_MAX if not present'))] + elif value == 'int': + attrs += [('int %s;' % c_name, + 'Always present' if required + else 'INT_MIN if not present')] + elif value == 'color': + attrs += [('int %s;' % c_name, + 'Always present' if required + else '-1 if not present')] + elif value == 'string': + attrs += [('char *%s;' % c_name, + 'Always nonnull' if required else 'Possibly null')] + elif value[0] == 'ref': + struct = ('spvxml_node' + if value[1] is None or type(value[1]) is set + else '%s%s' % (prefix, name_to_id(value[1]))) + attrs += [('struct %s *%s;' % (struct, c_name), + 'Always nonnull' if required else 'Possibly null')] + elif value == 'id': + pass + else: + assert False + + for enum_name in new_enums: + used_enums.add(enum_name) + c_name = name_to_id(enum_name) + print '\nenum %s%s {' % (prefix, c_name) + i = 0 + for value in sorted(enums[enum_name]): + print ' %s%s_%s%s,' % (prefix.upper(), + c_name.upper(), + name_to_id(value).upper(), + ' = 1' if i == 0 else '') + i += 1 + print '};' + print 'const char *%s%s_to_string (enum %s%s);' % ( + prefix, c_name, prefix, c_name) + + print '\nstruct %s%s {' % (prefix, name_to_id(name)) + print '%sstruct spvxml_node node_;' % indent + + if attrs: + print '\n%s/* Attributes. */' % indent + for decl, comment in attrs: + line = '%s%s' % (indent, decl) + if comment: + n_spaces = max(35 - len(line), 1) + line += '%s/* %s. */' % (' ' * n_spaces, comment) + print line + + if rhs['type'] == 'etc' or rhs['type'] == 'empty': + return + + print '\n%s/* Content. */' % indent + if rhs['type'] == 'text': + print '%schar *text; /* Always nonnull. 
*/' % indent + return + + for a in rhs['items'] if rhs['type'] == '|' else (rhs,): + for term in a['items'] if a['type'] == 'sequence' else (a,): + if term['type'] == 'empty': + pass + elif term['type'] == 'nonterminal': + nt_name = name_to_id(term['nonterminal_name']) + member_name = name_to_id(term['member_name']) + print '%sstruct %s%s *%s; /* Always nonnull. */' % ( + indent, prefix, nt_name, member_name) + elif term['type'] == '?' and term['item']['type'] == 'nonterminal': + nt_name = name_to_id(term['item']['nonterminal_name']) + member_name = name_to_id(term['item']['member_name']) + print '%sstruct %s%s *%s; /* Possibly null. */' % ( + indent, prefix, nt_name, member_name) + elif (term['type'] in ('*', '+') + and term['item']['type'] == 'nonterminal'): + nt_name = name_to_id(term['item']['nonterminal_name']) + member_name = name_to_id(term['item']['member_name']) + print '%sstruct %s%s **%s;' % (indent, prefix, + nt_name, member_name) + print '%ssize_t n_%s;' % (indent, member_name) + else: + seq_name = term['seq_name'] + print '%sstruct spvxml_node **%s;' % (indent, seq_name) + print '%ssize_t n_%s;' % (indent, seq_name) + + +def bytes_to_hex(s): + return ''.join(['"'] + ["\\x%02x" % ord(x) for x in s] + ['"']) + + +class Parser_Context(object): + def __init__(self, function_name, productions): + self.suffixes = {} + self.bail = 'error' + self.need_error_handler = False + self.parsers = {} + self.parser_index = 0 + self.productions = productions + + self.function_name = function_name + self.functions = [] + def gen_name(self, prefix): + n = self.suffixes.get(prefix, 0) + 1 + self.suffixes[prefix] = n + return '%s%d' % (prefix, n) if n > 1 else prefix + def new_function(self, type_name): + f = Function('%s_%d' % (self.function_name, len(self.functions) + 1), + type_name) + self.functions += [f] + return f + + +def print_attribute_decls(name, attributes): + if attributes: + print(' enum {') + for unique_name, (xml_name, value, required) in 
sorted(attributes.items()): + c_name = name_to_id(unique_name) + print(' ATTR_%s,' % c_name.upper()) + print(' };') + print(' struct spvxml_attribute attrs[] = {') + for unique_name, (xml_name, value, required) in sorted(attributes.items()): + c_name = name_to_id(unique_name) + print(' [ATTR_%s] = { "%s", %s, NULL },' + % (c_name.upper(), xml_name, 'true' if required else 'false')) + print(' };') + print(' enum { N_ATTRS = sizeof attrs / sizeof *attrs };') + + +def print_parser_for_attributes(name, attributes): + print(' /* Parse attributes. */') + print(' spvxml_parse_attributes (&nctx);') + + if not attributes: + return + + for unique_name, (xml_name, value, required) in sorted(attributes.items()): + c_name = name_to_id(unique_name) + params = '&nctx, &attrs[ATTR_%s]' % c_name.upper() + if type(value) is set: + if len(value) <= 1: + if required: + print(' spvxml_attr_parse_fixed (%s, "%s");' + % (params, tuple(value)[0])) + else: + print(' p->%s_present = spvxml_attr_parse_fixed (\n' + ' %s, "%s");' + % (c_name, params, tuple(value)[0])) + elif value == set(('true', 'false')): + print(' p->%s = spvxml_attr_parse_bool (%s);' + % (c_name, params)) + else: + map_name = '%s%s_map' % (prefix, c_name) + print(' p->%s = spvxml_attr_parse_enum (\n' + ' %s, %s);' + % (c_name, params, map_name)) + elif value in ('real', 'dimension', 'int', 'color'): + print(' p->%s = spvxml_attr_parse_%s (%s);' + % (c_name, value, params)) + elif value == 'string': + print(' p->%s = attrs[ATTR_%s].value;\n' + ' attrs[ATTR_%s].value = NULL;' + % (c_name, c_name.upper(), + c_name.upper())) + elif value == 'id': + print(' p->node_.id = attrs[ATTR_%s].value;\n' + ' attrs[ATTR_%s].value = NULL;' + % (c_name.upper(), c_name.upper())) + elif value[0] == 'ref': + pass + else: + assert False + print('''\ + if (ctx->error) { + spvxml_node_context_uninit (&nctx); + ctx->hard_error = true; + %sfree_%s (p); + return false; + }''' + % (prefix, name_to_id(name))) + +class Function(object): + def 
__init__(self, function_name, type_name): + self.function_name = function_name + self.type_name = type_name + self.suffixes = {} + self.code = [] + def gen_name(self, prefix): + n = self.suffixes.get(prefix, 0) + 1 + self.suffixes[prefix] = n + return '%s%d' % (prefix, n) if n > 1 else prefix + def print_(self): + print(''' +static bool +%s (struct spvxml_node_context *nctx, xmlNode **input, struct %s *p) +{''' + % (self.function_name, self.type_name)) + while self.code and self.code[0] == '': + self.code = self.code[1:] + for line in self.code: + print(' %s' % line if line else '') + print(' return true;') + print('}') + +STATE_START = 0 +STATE_ALTERNATION = 1 +STATE_SEQUENCE = 2 +STATE_REPETITION = 3 +STATE_OPTIONAL = 4 +STATE_GENERAL = 5 + +def generate_content_parser(nonterminal, rhs, function, ctx, state, seq_name): + seq_name = seq_name if seq_name else rhs.get('seq_name') + ctx.parser_index += 1 + + if rhs['type'] == 'etc': + function.code += ['spvxml_content_parse_etc (input);'] + elif rhs['type'] == 'text': + function.code += ['if (!spvxml_content_parse_text (nctx, input, &p->text))', + ' return false;'] + elif rhs['type'] == '|': + for i in range(len(rhs['items'])): + choice = rhs['items'][i] + subfunc = ctx.new_function(function.type_name) + generate_content_parser(nonterminal, choice, subfunc, ctx, + STATE_ALTERNATION + if state == STATE_START + else STATE_GENERAL, seq_name) + function.code += ['%(start)s!%(tryfunc)s (nctx, input, p, %(subfunc)s)%(end)s' + % {'start': 'if (' if i == 0 else ' && ', + 'subfunc': subfunc.function_name, + 'tryfunc': '%stry_parse_%s' + % (prefix, name_to_id(nonterminal)), + 'end': ')' if i == len(rhs['items']) - 1 else ''}] + function.code += [' {', + ' spvxml_content_error (nctx, *input, "Syntax error.");', + ' return false;', + ' }'] + elif rhs['type'] == 'sequence': + for element in rhs['items']: + generate_content_parser(nonterminal, element, function, ctx, + STATE_SEQUENCE + if state in (STATE_START, + 
STATE_ALTERNATION) + else STATE_GENERAL, seq_name) + elif rhs['type'] == 'empty': + function.code += ['(void) nctx;'] + function.code += ['(void) input;'] + function.code += ['(void) p;'] + elif rhs['type'] in ('*', '+', '?'): + subfunc = ctx.new_function(function.type_name) + generate_content_parser(nonterminal, rhs['item'], subfunc, ctx, + (STATE_OPTIONAL + if rhs['type'] == '?' + else STATE_REPETITION) + if state in (STATE_START, + STATE_ALTERNATION, + STATE_SEQUENCE) + else STATE_GENERAL, seq_name) + next_name = function.gen_name('next') + args = {'subfunc': subfunc.function_name, + 'tryfunc': '%stry_parse_%s' % (prefix, + name_to_id (nonterminal))} + if rhs['type'] == '?': + function.code += [ + '%(tryfunc)s (nctx, input, p, %(subfunc)s);' % args] + else: + if rhs['type'] == '+': + function.code += ['if (!%(subfunc)s (nctx, input, p))' % args, + ' return false;'] + function.code += [ + 'while (%(tryfunc)s (nctx, input, p, %(subfunc)s))' % args, + ' continue;'] + elif rhs['type'] == 'nonterminal': + node_name = function.gen_name('node') + function.code += [ + '', + 'xmlNode *%s;' % node_name, + 'if (!spvxml_content_parse_element (nctx, input, "%s", &%s))' + % (ctx.productions[rhs['nonterminal_name']][0], node_name), + ' return false;'] + if state in (STATE_START, + STATE_ALTERNATION, + STATE_SEQUENCE, + STATE_OPTIONAL): + target = '&p->%s' % name_to_id(rhs['member_name']) + else: + assert state in (STATE_REPETITION, STATE_GENERAL) + member = name_to_id(rhs['member_name']) if state == STATE_REPETITION else seq_name + function.code += ['struct %s%s *%s;' % ( + prefix, name_to_id(rhs['nonterminal_name']), member)] + target = '&%s' % member + function.code += [ + 'if (!%sparse_%s (nctx->up, %s, %s))' + % (prefix, name_to_id(rhs['nonterminal_name']), node_name, target), + ' return false;'] + if state in (STATE_REPETITION, STATE_GENERAL): + function.code += [ + 'p->%s = xrealloc (p->%s, sizeof *p->%s * (p->n_%s + 1));' + % (member, member, member, member), + 
'p->%s[p->n_%s++] = %s;' % (member, member, + '&%s->node_' % member + if state == STATE_GENERAL + else member)] + else: + assert False + +def print_parser(name, production, productions, indent): + xml_name, attributes, rhs = production + + print(''' +static bool UNUSED +%(prefix)stry_parse_%(name)s ( + struct spvxml_node_context *nctx, xmlNode **input, + struct %(prefix)s%(name)s *p, + bool (*sub) (struct spvxml_node_context *, + xmlNode **, + struct %(prefix)s%(name)s *)) +{ + xmlNode *next = *input; + bool ok = sub (nctx, &next, p); + if (ok) + *input = next; + else if (!nctx->up->hard_error) { + free (nctx->up->error); + nctx->up->error = NULL; + } + return ok; +}''' + % {'prefix': prefix, + 'name': name_to_id(name)}) + + ctx = Parser_Context('%sparse_%s' % (prefix, name_to_id(name)), + productions) + if rhs['type'] not in ('empty', 'etc'): + function = ctx.new_function('%s%s' % (prefix, name_to_id(name))) + generate_content_parser(name, rhs, function, ctx, 0, None) + for f in reversed(ctx.functions): + f.print_() + + print(''' +bool +%(prefix)sparse_%(name)s ( + struct spvxml_context *ctx, xmlNode *input, + struct %(prefix)s%(name)s **p_) +{''' + % {'prefix': prefix, + 'name': name_to_id(name)}) + + print_attribute_decls(name, attributes) + + print(' struct spvxml_node_context nctx = {') + print(' .up = ctx,') + print(' .parent = input,') + print(' .attrs = attrs,') + print(' .n_attrs = N_ATTRS,') + print(' };') + print('') + print(' *p_ = NULL;') + print(' struct %(prefix)s%(name)s *p = xzalloc (sizeof *p);' + % {'prefix': prefix, + 'name': name_to_id(name)}) + print(' p->node_.raw = input;') + print(' p->node_.class_ = &%(prefix)s%(name)s_class;' + % {'prefix': prefix, + 'name': name_to_id(name)}) + print('') + + print_parser_for_attributes(name, attributes) + + if rhs['type'] == 'empty': + print(''' + /* Parse content. 
*/ + if (!spvxml_content_parse_end (&nctx, input->children)) { + ctx->hard_error = true; + spvxml_node_context_uninit (&nctx); + %sfree_%s (p); + return false; + }''' + % (prefix, name_to_id(name))) + elif rhs['type'] == 'etc': + print(''' + /* Ignore content. */ +''') + else: + print(''' + /* Parse content. */ + input = input->children; + if (!%s (&nctx, &input, p) + || !spvxml_content_parse_end (&nctx, input)) { + ctx->hard_error = true; + spvxml_node_context_uninit (&nctx); + %sfree_%s (p); + return false; + }''' + % (function.function_name, + prefix, name_to_id(name))) + + print(''' + spvxml_node_context_uninit (&nctx); + *p_ = p; + return true;''') + + print "}" + + +def print_free_members(attributes, rhs, indent): + for unique_name, (xml_name, value, required) in attributes.items(): + c_name = name_to_id(unique_name) + if (type(value) is set + or value in ('dimension', 'real', 'int', 'color', 'id') + or value[0] == 'ref'): + pass + elif value == 'string': + print(' free (p->%s);' % c_name); + else: + assert False + + if rhs['type'] in ('etc', 'empty'): + pass + elif rhs['type'] == 'text': + print(' free (p->text);') + else: + n = 0 + for a in rhs['items'] if rhs['type'] == '|' else (rhs,): + for term in a['items'] if a['type'] == 'sequence' else (a,): + if term['type'] == 'empty': + pass + elif (term['type'] == 'nonterminal' + or (term['type'] == '?' 
+ and term['item']['type'] == 'nonterminal')): + if term['type'] == '?': + term = term['item'] + nt_name = name_to_id(term['nonterminal_name']) + member_name = name_to_id(term['member_name']) + print(' %sfree_%s (p->%s);' % (prefix, nt_name, + member_name)) + elif (term['type'] in ('*', '+') + and term['item']['type'] == 'nonterminal'): + nt_name = name_to_id(term['item']['nonterminal_name']) + member_name = name_to_id(term['item']['member_name']) + print('''\ + for (size_t i = 0; i < p->n_%s; i++) + %sfree_%s (p->%s[i]); + free (p->%s);''' + % (member_name, + prefix, nt_name, member_name, + member_name)) + else: + n += 1 + seq_name = 'seq' if n == 1 else 'seq%d' % n + print('''\ + for (size_t i = 0; i < p->n_%s; i++) + p->%s[i]->class_->spvxml_node_free (p->%s[i]); + free (p->%s);''' + % (seq_name, + seq_name, seq_name, + seq_name)) + print(' free (p->node_.id);') + print(' free (p);') + + +def print_free(name, production, indent): + xml_name, attributes, rhs = production + + print ''' +void +%(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *p) +{ + if (!p) + return; +''' % {'prefix': prefix, + 'name': name_to_id(name)} + + print_free_members(attributes, rhs, ' ' * 4) + + print('}') + +def name_to_id(s): + return s[0].lower() + ''.join(['_%c' % x.lower() if x.isupper() else x + for x in s[1:]]).replace('-', '_') + + +def print_recurse_members(attributes, rhs, function): + if rhs['type'] == 'etc' or rhs['type'] == 'empty': + pass + elif rhs['type'] == 'text': + pass + else: + n = 0 + for a in rhs['items'] if rhs['type'] == '|' else (rhs,): + for term in a['items'] if a['type'] == 'sequence' else (a,): + if term['type'] == 'empty': + pass + elif (term['type'] == 'nonterminal' + or (term['type'] == '?' 
+ and term['item']['type'] == 'nonterminal')): + if term['type'] == '?': + term = term['item'] + nt_name = name_to_id(term['nonterminal_name']) + member_name = name_to_id(term['member_name']) + print(' %s%s_%s (ctx, p->%s);' + % (prefix, function, nt_name, member_name)) + elif (term['type'] in ('*', '+') + and term['item']['type'] == 'nonterminal'): + nt_name = name_to_id(term['item']['nonterminal_name']) + member_name = name_to_id(term['item']['member_name']) + print('''\ + for (size_t i = 0; i < p->n_%s; i++) + %s%s_%s (ctx, p->%s[i]);''' + % (member_name, + prefix, function, nt_name, member_name)) + else: + n += 1 + seq_name = 'seq' if n == 1 else 'seq%d' % n + print('''\ + for (size_t i = 0; i < p->n_%s; i++) + p->%s[i]->class_->spvxml_node_%s (ctx, p->%s[i]);''' + % (seq_name, + seq_name, function, seq_name)) + + +def print_collect_ids(name, production): + xml_name, attributes, rhs = production + + print ''' +void +%(prefix)scollect_ids_%(name)s (struct spvxml_context *ctx, struct %(prefix)s%(name)s *p) +{ + if (!p) + return; + + spvxml_node_collect_id (ctx, &p->node_); +''' % {'prefix': prefix, + 'name': name_to_id(name)} + + print_recurse_members(attributes, rhs, 'collect_ids') + + print('}') + + +def print_resolve_refs(name, production): + xml_name, attributes, rhs = production + + print ''' +bool +%(prefix)sis_%(name)s (const struct spvxml_node *node) +{ + return node->class_ == &%(prefix)s%(name)s_class; +} + +struct %(prefix)s%(name)s * +%(prefix)scast_%(name)s (const struct spvxml_node *node) +{ + return (node && %(prefix)sis_%(name)s (node) + ? 
UP_CAST (node, struct %(prefix)s%(name)s, node_) + : NULL); +} + +void +%(prefix)sresolve_refs_%(name)s (struct spvxml_context *ctx UNUSED, struct %(prefix)s%(name)s *p UNUSED) +{ + if (!p) + return; +''' % {'prefix': prefix, + 'name': name_to_id(name)} + + i = 0 + for unique_name, (xml_name, value, required) in sorted(attributes.items()): + c_name = name_to_id(unique_name) + if type(value) is set or value[0] != 'ref': + continue + + if value[1] is None: + print(' p->%s = spvxml_node_resolve_ref (ctx, p->node_.raw, \"%s\", NULL, 0);' + % (c_name, xml_name)) + else: + i += 1 + name = 'classes' + if i > 1: + name += '%d' % i + if type(value[1]) is set: + print(' static const struct spvxml_node_class *const %s[] = {' % name) + for ref_type in value[1]: + print(' &%(prefix)s%(ref_type)s_class,' + % {'prefix': prefix, + 'ref_type': name_to_id(ref_type)}) + print(' };'); + print(' const size_t n_%s = sizeof %s / sizeof *%s;' + % (name, name, name)) + print(' p->%(member)s = spvxml_node_resolve_ref (ctx, p->node_.raw, \"%(attr)s\", %(name)s, n_%(name)s);' + % {"member": c_name, + "attr": xml_name, + 'prefix': prefix, + 'name': name + }) + else: + print(' static const struct spvxml_node_class *const %s' % name) + print(' = &%(prefix)s%(ref_type)s_class;' + % {'prefix': prefix, + 'ref_type': name_to_id(value[1])}) + print(' p->%(member)s = %(prefix)scast_%(ref_type)s (spvxml_node_resolve_ref (ctx, p->node_.raw, \"%(attr)s\", &%(name)s, 1));' + % {"member": c_name, + "attr": xml_name, + 'prefix': prefix, + 'name': name, + 'ref_type': name_to_id(value[1])}) + + + print_recurse_members(attributes, rhs, 'resolve_refs') + + print('}') + + +def name_to_id(s): + return s[0].lower() + ''.join(['_%c' % x.lower() if x.isupper() else x + for x in s[1:]]).replace('-', '_') + + +if __name__ == "__main__": + argv0 = sys.argv[0] + try: + options, args = getopt.gnu_getopt(sys.argv[1:], 'h', ['help']) + except getopt.GetoptError as e: + sys.stderr.write("%s: %s\n" % (argv0, e.msg)) + 
sys.exit(1) + + for key, value in options: + if key in ['-h', '--help']: + usage() + else: + sys.exit(0) + + if len(args) < 3: + sys.stderr.write("%s: bad usage (use --help for help)\n" % argv0) + sys.exit(1) + + global file_name + global prefix + file_name, output_type, prefix = args[:3] + input_file = open(file_name) + + prefix = '%s_' % prefix + + global line + global line_number + line = "" + line_number = 0 + + productions = {} + + global token + token = ('start', ) + get_token() + while True: + while match(';'): + pass + if token[0] == 'eof': + break + + name, xml_name, attributes, rhs = parse_production() + if name in productions: + fatal("%s: duplicate production" % name) + productions[name] = (xml_name, attributes, rhs) + + print '/* Generated automatically -- do not modify! -*- buffer-read-only: t -*- */' + if output_type == 'code' and len(args) == 4: + header_name = args[3] + + print """\ +#include +#include %s +#include +#include +#include +#include "libpspp/cast.h" +#include "libpspp/str.h" +#include "gl/xalloc.h" + +""" % header_name + for enum_name, values in sorted(enums.items()): + if len(values) <= 1: + continue + + c_name = name_to_id(enum_name) + print('\nstatic const struct spvxml_enum %s%s_map[] = {' + % (prefix, c_name)) + for value in sorted(values): + print(' { "%s", %s%s_%s },' % (value, prefix.upper(), + c_name.upper(), + name_to_id(value).upper())) + print(' { NULL, 0 },') + print('};') + print('\nconst char *') + print('%s%s_to_string (enum %s%s %s)' + % (prefix, c_name, prefix, c_name, c_name)) + print('{') + print(' switch (%s) {' % c_name) + for value in sorted(values): + print(' case %s%s_%s: return "%s";' + % (prefix.upper(), c_name.upper(), + name_to_id(value).upper(), value)) + print(' default: return NULL;') + print(' }') + print('}') + + for name, (xml_name, attributes, rhs) in sorted(productions.items()): + print('static void %(prefix)scollect_ids_%(name)s (struct spvxml_context *, struct %(prefix)s%(name)s *);\n' + 'static 
void %(prefix)sresolve_refs_%(name)s (struct spvxml_context *ctx UNUSED, struct %(prefix)s%(name)s *p UNUSED);\n' + % {'prefix': prefix, + 'name': name_to_id(name)}) + for name, production in sorted(productions.items()): + print_parser(name, production, productions, ' ' * 4) + print_free(name, production, ' ' * 4) + print_collect_ids(name, production) + print_resolve_refs(name, production) + print(''' +static void +%(prefix)sdo_free_%(name)s (struct spvxml_node *node) +{ + %(prefix)sfree_%(name)s (UP_CAST (node, struct %(prefix)s%(name)s, node_)); +} + +static void +%(prefix)sdo_collect_ids_%(name)s (struct spvxml_context *ctx, struct spvxml_node *node) +{ + %(prefix)scollect_ids_%(name)s (ctx, UP_CAST (node, struct %(prefix)s%(name)s, node_)); +} + +static void +%(prefix)sdo_resolve_refs_%(name)s (struct spvxml_context *ctx, struct spvxml_node *node) +{ + %(prefix)sresolve_refs_%(name)s (ctx, UP_CAST (node, struct %(prefix)s%(name)s, node_)); +} + +struct spvxml_node_class %(prefix)s%(name)s_class = { + "%(class)s", + %(prefix)sdo_free_%(name)s, + %(prefix)sdo_collect_ids_%(name)s, + %(prefix)sdo_resolve_refs_%(name)s, +}; +''' + % {'prefix': prefix, + 'name': name_to_id(name), + 'class': (name if name == production[0] + else '%s (%s)' % (name, production[0]))}) + elif output_type == 'header' and len(args) == 3: + print """\ +#ifndef %(PREFIX)sPARSER_H +#define %(PREFIX)sPARSER_H + +#include +#include +#include +#include "output/spv/spvxml-helpers.h"\ +""" % {'PREFIX': prefix.upper()} + for name, (xml_name, attributes, rhs) in sorted(productions.items()): + print_members(attributes, rhs, ' ' * 4) + print('''}; + +extern struct spvxml_node_class %(prefix)s%(name)s_class; + +bool %(prefix)sparse_%(name)s (struct spvxml_context *, xmlNode *input, struct %(prefix)s%(name)s **); +void %(prefix)sfree_%(name)s (struct %(prefix)s%(name)s *); +bool %(prefix)sis_%(name)s (const struct spvxml_node *); +struct %(prefix)s%(name)s *%(prefix)scast_%(name)s (const struct 
spvxml_node *);''' + % {'prefix': prefix, + 'name': name_to_id(name)}) + print """\ + +#endif /* %(PREFIX)sPARSER_H */""" % {'PREFIX': prefix.upper()} + else: + sys.stderr.write("%s: bad usage (use --help for help)\n" % argv0) diff --git a/src/output/table-item.c b/src/output/table-item.c index 6243f03109..a1595479bb 100644 --- a/src/output/table-item.c +++ b/src/output/table-item.c @@ -131,6 +131,7 @@ table_item_create (struct table *table, const char *title, const char *caption) item->title = table_item_text_create (title); item->layers = NULL; item->caption = table_item_text_create (caption); + item->pt = NULL; return item; } @@ -222,6 +223,7 @@ table_item_destroy (struct output_item *output_item) table_item_text_destroy (item->title); table_item_text_destroy (item->caption); table_item_layers_destroy (item->layers); + pivot_table_unref (item->pt); table_unref (item->table); free (item); } diff --git a/src/output/table-item.h b/src/output/table-item.h index 24740f7eab..a16c82ebb3 100644 --- a/src/output/table-item.h +++ b/src/output/table-item.h @@ -75,6 +75,7 @@ struct table_item struct table_item_text *title; /* Null if there is no title. */ struct table_item_text *caption; /* Null if there is no caption. */ struct table_item_layers *layers; /* Null if there is no layer info. 
*/ + struct pivot_table *pt; }; struct table_item *table_item_create (struct table *, const char *title, diff --git a/src/ui/gui/psppire-output-window.c b/src/ui/gui/psppire-output-window.c index ad1f03db5f..de57f17f9d 100644 --- a/src/ui/gui/psppire-output-window.c +++ b/src/ui/gui/psppire-output-window.c @@ -253,6 +253,7 @@ struct file_types enum { FT_AUTO = 0, + FT_SPV, FT_PDF, FT_HTML, FT_ODT, @@ -267,6 +268,7 @@ enum struct file_types ft[n_FT] = { {N_("Infer file type from extension"), NULL}, + {N_("SPSS Viewer (*.spv)"), ".spv"}, {N_("PDF (*.pdf)"), ".pdf"}, {N_("HTML (*.html)"), ".html"}, {N_("OpenDocument (*.odt)"), ".odt"}, @@ -453,6 +455,9 @@ psppire_output_window_export (PsppireOutputWindow *window) switch (file_type) { + case FT_SPV: + export_output (window, &options, "spv"); + break; case FT_PDF: export_output (window, &options, "pdf"); break; diff --git a/src/ui/gui/psppire-window.c b/src/ui/gui/psppire-window.c index 771bbb3731..3c80523a9a 100644 --- a/src/ui/gui/psppire-window.c +++ b/src/ui/gui/psppire-window.c @@ -32,6 +32,11 @@ #include "data/file-handle-def.h" #include "data/dataset.h" #include "libpspp/version.h" +#include "output/group-item.h" +#include "output/pivot-table.h" +#include "output/spv/spv.h" +#include "output/spv/spv-output.h" +#include "output/spv/spv-select.h" #include "helper.h" #include "psppire-data-window.h" @@ -656,9 +661,12 @@ psppire_window_file_chooser_dialog (PsppireWindow *toplevel) gtk_file_filter_set_name (filter, _("Data and Syntax Files")); gtk_file_filter_add_mime_type (filter, "application/x-spss-sav"); gtk_file_filter_add_mime_type (filter, "application/x-spss-por"); + gtk_file_filter_add_mime_type (filter, "application/x-spss-spv"); gtk_file_filter_add_pattern (filter, "*.zsav"); gtk_file_filter_add_pattern (filter, "*.sps"); gtk_file_filter_add_pattern (filter, "*.SPS"); + gtk_file_filter_add_pattern (filter, "*.spv"); + gtk_file_filter_add_pattern (filter, "*.SPV"); gtk_file_chooser_add_filter 
(GTK_FILE_CHOOSER (dialog), filter); filter = gtk_file_filter_new (); @@ -678,6 +686,12 @@ psppire_window_file_chooser_dialog (PsppireWindow *toplevel) gtk_file_filter_add_pattern (filter, "*.SPS"); gtk_file_chooser_add_filter (GTK_FILE_CHOOSER (dialog), filter); + filter = gtk_file_filter_new (); + gtk_file_filter_set_name (filter, _("Output Files (*.spv) ")); + gtk_file_filter_add_pattern (filter, "*.spv"); + gtk_file_filter_add_pattern (filter, "*.SPV"); + gtk_file_chooser_add_filter (GTK_FILE_CHOOSER (dialog), filter); + filter = gtk_file_filter_new (); gtk_file_filter_set_name (filter, _("All Files")); gtk_file_filter_add_pattern (filter, "*"); @@ -711,6 +725,112 @@ psppire_window_file_chooser_dialog (PsppireWindow *toplevel) return dialog; } +struct item_path + { + const struct spv_item **nodes; + size_t n; + +#define N_STUB 10 + const struct spv_item *stub[N_STUB]; + }; + +static void +swap_nodes (const struct spv_item **a, const struct spv_item **b) +{ + const struct spv_item *tmp = *a; + *a = *b; + *b = tmp; +} + +static void +get_path (const struct spv_item *item, struct item_path *path) +{ + size_t allocated = 10; + path->nodes = path->stub; + path->n = 0; + + while (item) + { + if (path->n >= allocated) + { + if (path->nodes == path->stub) + path->nodes = xmemdup (path->stub, sizeof path->stub); + path->nodes = x2nrealloc (path->nodes, &allocated, + sizeof *path->nodes); + } + path->nodes[path->n++] = item; + item = item->parent; + } + + for (size_t i = 0; i < path->n / 2; i++) + swap_nodes (&path->nodes[i], &path->nodes[path->n - i - 1]); +} + +static void +free_path (struct item_path *path) +{ + if (path && path->nodes != path->stub) + free (path->nodes); +} + +static void +dump_heading_transition (const struct spv_item *old, + const struct spv_item *new) +{ + if (old == new) + return; + + struct item_path old_path, new_path; + get_path (old, &old_path); + get_path (new, &new_path); + + size_t common = 0; + for (; common < old_path.n && common < 
new_path.n; common++) + if (old_path.nodes[common] != new_path.nodes[common]) + break; + + for (size_t i = common; i < old_path.n; i++) + group_close_item_submit (group_close_item_create ()); + for (size_t i = common; i < new_path.n; i++) + group_open_item_submit (group_open_item_create ( + new_path.nodes[i]->command_id)); + + free_path (&old_path); + free_path (&new_path); +} + +void +read_spv_file (const char *filename) +{ + struct spv_reader *spv; + char *error = spv_open (filename, &spv); + if (error) + { + /* XXX */ + fprintf (stderr, "%s\n", error); + return; + } + + struct spv_item **items; + size_t n_items; + spv_select (spv, NULL, 0, &items, &n_items); + struct spv_item *prev_heading = spv_get_root (spv); + for (size_t i = 0; i < n_items; i++) + { + struct spv_item *heading + = items[i]->type == SPV_ITEM_HEADING ? items[i] : items[i]->parent; + dump_heading_transition (prev_heading, heading); + if (items[i]->type == SPV_ITEM_TEXT) + spv_text_submit (items[i]); + else if (items[i]->type == SPV_ITEM_TABLE) + pivot_table_submit (spv_item_get_table (items[i])); + prev_heading = heading; + } + dump_heading_transition (prev_heading, spv_get_root (spv)); + free (items); + spv_close (spv); +} + /* Callback for the file_open action. 
Prompts for a filename and opens it */ void @@ -739,7 +859,16 @@ psppire_window_open (PsppireWindow *de) if (retval == 1) open_data_window (de, name, encoding, NULL); else if (retval == 0) - open_syntax_window (name, encoding); + { + char *error = spv_detect (name); + if (!error) + read_spv_file (name); + else + { + free (error); + open_syntax_window (name, encoding); + } + } g_free (encoding); fh_unref (fh); diff --git a/src/ui/gui/psppire-window.h b/src/ui/gui/psppire-window.h index bc1b0f0b28..c65823a298 100644 --- a/src/ui/gui/psppire-window.h +++ b/src/ui/gui/psppire-window.h @@ -120,6 +120,8 @@ GtkWidget *psppire_window_file_chooser_dialog (PsppireWindow *toplevel); void add_most_recent (const char *file_name, const char *mime_type, const char *encoding); +void read_spv_file (const char *filename); + G_END_DECLS #endif /* __PSPPIRE_WINDOW_H__ */ diff --git a/src/ui/gui/psppire.c b/src/ui/gui/psppire.c index d2262e45e6..e3f8589f71 100644 --- a/src/ui/gui/psppire.c +++ b/src/ui/gui/psppire.c @@ -38,6 +38,7 @@ #include "output/driver.h" #include "output/journal.h" #include "output/message-item.h" +#include "output/spv/spv.h" #include "ui/gui/dict-display.h" #include "ui/gui/executor.h" @@ -193,8 +194,15 @@ psppire_preload_file (const gchar *file) w = open_data_window (NULL, filename, NULL, NULL); else if (retval == 0) { - create_data_window (); - w = open_syntax_window (filename, NULL); + char *error = spv_detect (filename); + if (!error) + read_spv_file (filename); + else + { + free (error); + create_data_window (); + open_syntax_window (filename, NULL); + } } fh_unref (fh); diff --git a/tests/automake.mk b/tests/automake.mk index c7e0d4f9d8..f3e14e7e98 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -280,7 +280,8 @@ EXTRA_DIST += \ tests/language/data-io/test.ods \ tests/language/data-io/newone.ods \ tests/language/data-io/readnames.ods \ - tests/language/stats/llz.zsav + tests/language/stats/llz.zsav \ + tests/utilities/regress.spv CLEANFILES += 
*.save pspp.* foo* @@ -426,6 +427,7 @@ TESTSUITE_AT = \ tests/ui/terminal/main.at \ tests/ui/syntax-gen.at \ tests/utilities/pspp-convert.at \ + tests/utilities/pspp-output.at \ tests/perl-module.at TESTSUITE = $(srcdir)/tests/testsuite diff --git a/tests/utilities/pspp-output.at b/tests/utilities/pspp-output.at new file mode 100644 index 0000000000..a577867f76 --- /dev/null +++ b/tests/utilities/pspp-output.at @@ -0,0 +1,190 @@ +AT_BANNER([pspp-output]) + +AT_SETUP([pspp-output dir]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv], [0], [dnl +- heading "Set" command "Set" +- heading "Title" command "Title" + - text "Page Title" command "Title" +- heading "Data List" command "Data List" + - table "Reading 1 record from INLINE." command "Data List" subtype "Fixed Data Records" +- heading "Begin Data" command "Begin Data" +- heading "List" command "List" + - table "Data List" command "List" +- heading "Frequencies" command "Frequencies" + - table "Statistics" command "Frequencies" + - table "v0" command "Frequencies" subtype "Frequencies" + - table "v1" command "Frequencies" subtype "Frequencies" + - table "v2" command "Frequencies" subtype "Frequencies" +- heading "Regression" command "Regression" + - table "Model Summary (v2)" command "Regression" subtype "Model Summary" + - table "ANOVA (v2)" command "Regression" subtype "ANOVA" + - table "Coefficients (v2)" command "Regression" subtype "Coefficients" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --select equal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --select=headings], + [0], [dnl +- heading "Set" command "Set" +- heading "Title" command "Title" +- heading "Data List" command "Data List" +- heading "Begin Data" command "Begin Data" +- heading "List" command "List" +- heading "Frequencies" command "Frequencies" +- heading "Regression" command "Regression" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --select unequal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --select=^headings], 
+ [0], [dnl + - text "Page Title" command "Title" + - table "Reading 1 record from INLINE." command "Data List" subtype "Fixed Data Records" + - table "Data List" command "List" + - table "Statistics" command "Frequencies" + - table "v0" command "Frequencies" subtype "Frequencies" + - table "v1" command "Frequencies" subtype "Frequencies" + - table "v2" command "Frequencies" subtype "Frequencies" + - table "Model Summary (v2)" command "Regression" subtype "Model Summary" + - table "ANOVA (v2)" command "Regression" subtype "ANOVA" + - table "Coefficients (v2)" command "Regression" subtype "Coefficients" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --commands equal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --commands='reg*'], + [0], [dnl +- heading "Regression" command "Regression" + - table "Model Summary (v2)" command "Regression" subtype "Model Summary" + - table "ANOVA (v2)" command "Regression" subtype "ANOVA" + - table "Coefficients (v2)" command "Regression" subtype "Coefficients" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --commands unequal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --commands='^reg*'], + [0], [dnl +- heading "Set" command "Set" +- heading "Title" command "Title" + - text "Page Title" command "Title" +- heading "Data List" command "Data List" + - table "Reading 1 record from INLINE." 
command "Data List" subtype "Fixed Data Records" +- heading "Begin Data" command "Begin Data" +- heading "List" command "List" + - table "Data List" command "List" +- heading "Frequencies" command "Frequencies" + - table "Statistics" command "Frequencies" + - table "v0" command "Frequencies" subtype "Frequencies" + - table "v1" command "Frequencies" subtype "Frequencies" + - table "v2" command "Frequencies" subtype "Frequencies" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --subtypes equal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --subtypes='freq*'], + [0], [dnl + - table "v0" command "Frequencies" subtype "Frequencies" + - table "v1" command "Frequencies" subtype "Frequencies" + - table "v2" command "Frequencies" subtype "Frequencies" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --subtypes unequal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --subtypes='^freq*'], + [0], [dnl +- heading "Set" command "Set" +- heading "Title" command "Title" + - text "Page Title" command "Title" +- heading "Data List" command "Data List" + - table "Reading 1 record from INLINE." 
command "Data List" subtype "Fixed Data Records" +- heading "Begin Data" command "Begin Data" +- heading "List" command "List" + - table "Data List" command "List" +- heading "Frequencies" command "Frequencies" + - table "Statistics" command "Frequencies" +- heading "Regression" command "Regression" + - table "Model Summary (v2)" command "Regression" subtype "Model Summary" + - table "ANOVA (v2)" command "Regression" subtype "ANOVA" + - table "Coefficients (v2)" command "Regression" subtype "Coefficients" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --labels equal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --labels='v*'], + [0], [dnl + - table "v0" command "Frequencies" subtype "Frequencies" + - table "v1" command "Frequencies" subtype "Frequencies" + - table "v2" command "Frequencies" subtype "Frequencies" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --labels unequal]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --labels='^data*'], + [0], [dnl +- heading "Set" command "Set" +- heading "Title" command "Title" + - text "Page Title" command "Title" + - table "Reading 1 record from INLINE." command "Data List" subtype "Fixed Data Records" +- heading "Begin Data" command "Begin Data" +- heading "List" command "List" +- heading "Frequencies" command "Frequencies" + - table "Statistics" command "Frequencies" + - table "v0" command "Frequencies" subtype "Frequencies" + - table "v1" command "Frequencies" subtype "Frequencies" + - table "v2" command "Frequencies" subtype "Frequencies" +- heading "Regression" command "Regression" + - table "Model Summary (v2)" command "Regression" subtype "Model Summary" + - table "ANOVA (v2)" command "Regression" subtype "ANOVA" + - table "Coefficients (v2)" command "Regression" subtype "Coefficients" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --instances]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --instances=1], + [0], [dnl + - text "Page Title" command "Title" + - table "Reading 1 record from INLINE." 
command "Data List" subtype "Fixed Data Records" + - table "Data List" command "List" + - table "Statistics" command "Frequencies" + - table "Model Summary (v2)" command "Regression" subtype "Model Summary" +]) +AT_CLEANUP + +AT_SETUP([pspp-output --instances=last]) +AT_KEYWORDS([--instances last]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --instances=last], + [0], [dnl + - text "Page Title" command "Title" + - table "Reading 1 record from INLINE." command "Data List" subtype "Fixed Data Records" + - table "Data List" command "List" + - table "v2" command "Frequencies" subtype "Frequencies" + - table "Coefficients (v2)" command "Regression" subtype "Coefficients" +]) +AT_CLEANUP + +dnl XXX Currently PSPP doesn't output hidden items so no tests +dnl XXX for --show-hidden. + +AT_SETUP([pspp-output --or]) +AT_CHECK([pspp-output dir $srcdir/utilities/regress.spv --select=headings --or --labels='v*'], + [0], [dnl +- heading "Set" command "Set" +- heading "Title" command "Title" +- heading "Data List" command "Data List" +- heading "Begin Data" command "Begin Data" +- heading "List" command "List" +- heading "Frequencies" command "Frequencies" + - table "v0" command "Frequencies" subtype "Frequencies" + - table "v1" command "Frequencies" subtype "Frequencies" + - table "v2" command "Frequencies" subtype "Frequencies" +- heading "Regression" command "Regression" +]) +AT_CLEANUP + +AT_SETUP([pspp-output convert]) +AT_CHECK([pspp-output convert $srcdir/utilities/regress.spv -O format=csv - --subtypes='model*'], [0], [dnl +Table: Model Summary (v2) +R,R Square,Adjusted R Square,Std. 
Error of the Estimate +.96,.92,.91,1.49 +]) +AT_CLEANUP diff --git a/tests/utilities/regress.spv b/tests/utilities/regress.spv new file mode 100644 index 0000000000..4979fc1044 Binary files /dev/null and b/tests/utilities/regress.spv differ diff --git a/utilities/automake.mk b/utilities/automake.mk index 1d58b619a9..f4cf8983c8 100644 --- a/utilities/automake.mk +++ b/utilities/automake.mk @@ -32,3 +32,14 @@ utilities_pspp_convert_LDFLAGS = $(PSPP_LDFLAGS) $(PG_LDFLAGS) if RELOCATABLE_VIA_LD utilities_pspp_convert_LDFLAGS += `$(RELOCATABLE_LDFLAGS) $(bindir)` endif + +bin_PROGRAMS += utilities/pspp-output +dist_man_MANS += utilities/pspp-output.1 +utilities_pspp_output_SOURCES = utilities/pspp-output.c +utilities_pspp_output_CPPFLAGS = \ + $(LIBXML2_CFLAGS) $(AM_CPPFLAGS) -DINSTALLDIR=\"$(bindir)\" +utilities_pspp_output_LDADD = \ + src/libpspp.la \ + src/libpspp-core.la \ + $(CAIRO_LIBS) +utilities_pspp_output_LDFLAGS = $(PSPP_LDFLAGS) $(LIBXML2_LIBS) diff --git a/utilities/pspp-convert.1 b/utilities/pspp-convert.1 index 9c96dead1e..86530ec76c 100644 --- a/utilities/pspp-convert.1 +++ b/utilities/pspp-convert.1 @@ -7,7 +7,7 @@ .TH pspp\-convert 1 "October 2013" "PSPP" "PSPP Manual" . .SH NAME -pspp\-convert \- convert SPSS files to other formats +pspp\-convert \- convert SPSS data files to other formats . .SH SYNOPSIS \fBpspp\-convert\fR [\fIoptions\fR] \fIinput\fR \fIoutput\fR @@ -131,5 +131,6 @@ Ben Pfaff. . .SH "SEE ALSO" . +.BR pspp\-output (1), .BR pspp (1), .BR psppire (1). diff --git a/utilities/pspp-output.1 b/utilities/pspp-output.1 new file mode 100644 index 0000000000..a9a4c3974c --- /dev/null +++ b/utilities/pspp-output.1 @@ -0,0 +1,183 @@ +.\" -*- nroff -*- +.de IQ +. br +. ns +. IP "\\$1" +.. +.TH pspp\-output 1 "December 2019" "PSPP" "PSPP Manual" +. +.SH NAME +pspp\-output \- convert and operate on SPSS viewer (SPV) files +. 
+.SH SYNOPSIS +\fBpspp\-output detect \fIfile\fR +.br +\fBpspp\-output \fR[\fIoptions\fR] \fBdir\fR \fIfile\fR +.br +\fBpspp\-output \fR[\fIoptions\fR] \fBconvert\fR \fIsource destination\fR +.br +\fBpspp\-output \-\-help\fR | \fB\-h\fR +.br +\fBpspp\-output \-\-version\fR | \fB\-v\fR +. +.SH DESCRIPTION +.PP +\fBpspp\-output\fR is a command-line utility accompanying PSPP. +It supports multiple operations on SPSS viewer or \fB.spv\fR files, +here called SPV files. SPSS 16 and later writes SPV files to +represent the contents of its output editor. +.PP +SPSS 15 and earlier versions instead use \fB.spo\fR files. +\fBpspp\-output\fR does not support this format. +.PP +\fBpspp\-output\fR has a number of subcommands, documented separately +below. \fBpspp\-output\fR also has several undocumented command forms +that developers may find useful for debugging. +. +.SS The \fBdetect\fR command +When invoked as \fBpspp\-output detect \fIfile\fR, \fBpspp\-output\fR +reads enough of \fIfile\fR to determine whether it is an SPV file. If +so, it exits successfully without outputting anything. When +\fIfile\fR is not an SPV file or if some other error occurs, +\fBpspp\-output\fR prints an error message and exits with a failure +indication. +. +.SS The \fBdir\fR command +When invoked as \fBpspp\-output dir \fIfile\fR, \fBpspp\-output\fR +prints on stdout a table of contents for SPV file \fIfile\fR. By +default, this table lists every object in the file, except for hidden +objects. See the \fBInput Selection Options\fR section below for +information on the options available to select a subset of objects. +.PP +The following additional option for \fBdir\fR is intended mainly for +use by PSPP developers: +. +.IP "\fB\-\-member\-names\fR" +Also show the names of the Zip members associated with each object. +. 
+.SS The \fBconvert\fR command +When invoked as \fBpspp\-output convert \fIsource destination\fR, +\fBpspp\-output\fR reads the SPV file \fIsource\fR and converts it +to another format, writing the output to \fIdestination\fR. +.PP +By default, \fBpspp\-output\fR infers the intended format for +\fIdestination\fR from its extension: +. +.IP \fBcsv\fR +.IQ \fBtxt\fR +Comma-separated value. Each value is formatted according to its +variable's print format. The first line in the file contains variable +names. +. +.IP \fBsav\fR +.IQ \fBsys\fR +SPSS system file. +. +.IP \fBpor\fR +SPSS portable file. +. +.IP \fBsps\fR +SPSS syntax file. (Only encrypted syntax files may be converted to +this format.) +.PP +See the \fBInput Selection Options\fR section below for information on +the options available to select a subset of objects to include in the +output. The following additional options are accepted: +.IP "\fB-O format=\fIformat\fR" +Overrides the format inferred from the output file's extension. +\fIformat\fR must be one of the extensions listed above. +.IP "\fB-O \fIoption\fB=\fIvalue\fR" +Sets an option for the output file format. Refer to the PSPP manual +for details of the available output options. +.IP \fB\-F\fR +.IQ \fB\-\-force\fR +By default, if the source is corrupt or otherwise cannot be processed, +the destination is not written. This option makes \fBpspp\-output\fR +write the output as best it can, even with errors. +.SS "Input Selection Options" +The \fBdir\fR and \fBconvert\fR commands, by default, operate on all +of the objects in the source SPV file, except for objects that are not +visible in the output viewer window. The user may specify these +options to select a subset of the input objects. When multiple +options are used, only objects that satisfy all of them are selected: +.IP "\fB\-\-select=\fR[\fB^\fR]\fIclass\fR..." +Include only objects of the given \fIclass\fR; with leading \fB^\fR, +include only objects not in the class.
Use commas to separate +multiple classes. The supported classes are: +.RS +.IP +\fBcharts headings logs models tables texts trees warnings +outlineheaders pagetitle notes unknown other\fR +.RE +.IP +Use \fB\-\-select=help\fR to print this list of classes. +.IP "\fB\-\-commands=\fR[\fB^\fR]\fIcommand\fR..." +.IQ "\fB\-\-subtypes=\fR[\fB^\fR]\fIsubtype\fR..." +.IQ "\fB\-\-labels=\fR[\fB^\fR]\fIlabel\fR..." +Include only objects with the specified \fIcommand\fR, \fIsubtype\fR, +or \fIlabel\fR. With a leading \fB^\fR, include only the objects +that do not match. Multiple values may be specified separated by +commas. An asterisk at the end of a value acts as a wildcard. +.IP +The \fB\-\-commands\fR option matches command identifiers, case +insensitively. All of the objects produced by a single command use +the same, unique command identifier. Command identifiers are always +in English regardless of the language used for output. They often +differ from the command name in PSPP syntax. Use the +\fBpspp\-output\fR program's \fBdir\fR command to print command +identifiers in particular output. +.IP +The \fB\-\-subtypes\fR option matches particular tables within a +command, case insensitively. Subtypes are not necessarily unique: two +commands that produce similar output tables may use the same subtype. +Subtypes are always in English and \fBdir\fR will print them. +.IP +The \fB\-\-labels\fR option matches the labels in table output (that +is, the table titles). Labels are affected by the output language, +variable names and labels, split file settings, and other factors. +.IP "\fB\-\-instances=\fIinstance\fR..." +Include the specified \fIinstance\fR of an object that matches the +other criteria within a single command. The \fIinstance\fR may be a +number (1 for the first instance, 2 for the second, and so on) or +\fBlast\fR for the last instance. +.IP "\fB\-\-show\-hidden\fR" +Include hidden output objects in the output. By default, they are +excluded.
+.IP "\fB\-\-or\fR" +Separates two sets of selection options. Objects selected by either +set of options are included in the output. +.PP +The following additional input selection options are intended mainly +for use by PSPP developers: +.IP "\fB\-\-errors\fR" +Include only objects that cause an error when read. With the +\fBconvert\fR command, this is most useful in conjunction with the +\fB\-\-force\fR option. +.IP "\fB\-\-members=\fImember\fR..." +Include only the objects that include a listed Zip file \fImember\fR. +More than one name may be included, comma-separated. The members in +an SPV file may be listed with the \fBdir\fR command by adding the +\fB\-\-member\-names\fR option or with the \fBzipinfo\fR program +included with many operating systems. Error messages that +\fBpspp\-output\fR prints when it reads SPV files also often include +member names. +.IP "\fB\-\-member\-names\fR" +Displays the name of the Zip member or members associated with each +object just above the object itself. +.SH "OPTIONS" +.IP "\fB\-h\fR" +.IQ "\fB\-\-help\fR" +Prints a usage message on stdout and exits. +. +.IP "\fB\-v\fR" +.IQ "\fB\-\-version\fR" +Prints version information on stdout and exits. +. +.SH "AUTHORS" +Ben Pfaff. +. +.SH "SEE ALSO" +. +.BR pspp\-convert (1), +.BR pspp (1), +.BR psppire (1). diff --git a/utilities/pspp-output.c b/utilities/pspp-output.c new file mode 100644 index 0000000000..c57dda8e6d --- /dev/null +++ b/utilities/pspp-output.c @@ -0,0 +1,1022 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2017, 2018 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version.
+ + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#include + +#include +#include +#include +#include + +#include "data/file-handle-def.h" +#include "data/settings.h" +#include "libpspp/i18n.h" +#include "libpspp/message.h" +#include "libpspp/string-map.h" +#include "libpspp/string-set.h" +#include "output/driver.h" +#include "output/group-item.h" +#include "output/page-setup-item.h" +#include "output/pivot-table.h" +#include "output/spv/light-binary-parser.h" +#include "output/spv/spv-legacy-data.h" +#include "output/spv/spv-output.h" +#include "output/spv/spv-select.h" +#include "output/spv/spv.h" +#include "output/table-item.h" +#include "output/text-item.h" + +#include "gl/c-ctype.h" +#include "gl/error.h" +#include "gl/progname.h" +#include "gl/version-etc.h" +#include "gl/xalloc.h" + +#include +#include +#include + +#include "gettext.h" +#define _(msgid) gettext (msgid) + +/* -O key=value: Output driver options. */ +static struct string_map output_options + = STRING_MAP_INITIALIZER (output_options); + +/* --member-names: Include .zip member name in "dir" output. */ +static bool show_member_names; + +/* --show-hidden, --select, --commands, ...: Selection criteria. */ +static struct spv_criteria *criteria; +static size_t n_criteria, allocated_criteria; + +/* --or: Add new element to 'criteria' array. */ +static bool new_criteria; + +/* --sort: Sort members under dump-light-table, to make comparisons easier. */ +static bool sort; + +/* --raw: Dump raw binary data in dump-light-table. */ +static bool raw; + +/* -f, --force: Keep output file even on error. */ +static bool force; + +/* Number of warnings issued.
*/ +static size_t n_warnings; + +static void usage (void); +static void parse_options (int argc, char **argv); + +static void +dump_item (const struct spv_item *item) +{ + if (show_member_names && (item->xml_member || item->bin_member)) + { + const char *x = item->xml_member; + const char *b = item->bin_member; + char *s = (x && b + ? xasprintf (_("%s and %s:"), x, b) + : xasprintf ("%s:", x ? x : b)); + text_item_submit (text_item_create_nocopy (TEXT_ITEM_TITLE, s)); + } + + switch (spv_item_get_type (item)) + { + case SPV_ITEM_HEADING: + break; + + case SPV_ITEM_TEXT: + spv_text_submit (item); + break; + + case SPV_ITEM_TABLE: + pivot_table_submit (pivot_table_ref (spv_item_get_table (item))); + break; + + case SPV_ITEM_GRAPH: + break; + + case SPV_ITEM_MODEL: + break; + + case SPV_ITEM_OBJECT: + break; + + case SPV_ITEM_TREE: + break; + + default: + abort (); + } +} + +static void +print_item_directory (const struct spv_item *item) +{ + for (int i = 1; i < spv_item_get_level (item); i++) + printf (" "); + + enum spv_item_type type = spv_item_get_type (item); + printf ("- %s", spv_item_type_to_string (type)); + + const char *label = spv_item_get_label (item); + if (label) + printf (" \"%s\"", label); + + if (type == SPV_ITEM_TABLE) + { + const struct pivot_table *table = spv_item_get_table (item); + char *title = pivot_value_to_string (table->title, + SETTINGS_VALUE_SHOW_DEFAULT, + SETTINGS_VALUE_SHOW_DEFAULT); + if (!label || strcmp (title, label)) + printf (" title \"%s\"", title); + free (title); + } + + const char *command_id = spv_item_get_command_id (item); + if (command_id) + printf (" command \"%s\"", command_id); + + const char *subtype = spv_item_get_subtype (item); + if (subtype && (!label || strcmp (label, subtype))) + printf (" subtype \"%s\"", subtype); + + if (!spv_item_is_visible (item)) + printf (" (hidden)"); + if (show_member_names && (item->xml_member || item->bin_member)) + { + if (item->xml_member && item->bin_member) + printf (" in %s and 
%s", item->xml_member, item->bin_member); + else if (item->xml_member) + printf (" in %s", item->xml_member); + else if (item->bin_member) + printf (" in %s", item->bin_member); + } + putchar ('\n'); +} + +static void +run_detect (int argc UNUSED, char **argv) +{ + char *err = spv_detect (argv[1]); + if (err) + error (1, 0, "%s", err); +} + +static void +run_directory (int argc UNUSED, char **argv) +{ + struct spv_reader *spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + for (size_t i = 0; i < n_items; i++) + print_item_directory (items[i]); + free (items); + + spv_close (spv); +} + +struct item_path + { + const struct spv_item **nodes; + size_t n; + +#define N_STUB 10 + const struct spv_item *stub[N_STUB]; + }; + +static void +swap_nodes (const struct spv_item **a, const struct spv_item **b) +{ + const struct spv_item *tmp = *a; + *a = *b; + *b = tmp; +} + +static void +get_path (const struct spv_item *item, struct item_path *path) +{ + size_t allocated = 10; + path->nodes = path->stub; + path->n = 0; + + while (item) + { + if (path->n >= allocated) + { + if (path->nodes == path->stub) + path->nodes = xmemdup (path->stub, sizeof path->stub); + path->nodes = x2nrealloc (path->nodes, &allocated, + sizeof *path->nodes); + } + path->nodes[path->n++] = item; + item = item->parent; + } + + for (size_t i = 0; i < path->n / 2; i++) + swap_nodes (&path->nodes[i], &path->nodes[path->n - i - 1]); +} + +static void +free_path (struct item_path *path) +{ + if (path && path->nodes != path->stub) + free (path->nodes); +} + +static void +dump_heading_transition (const struct spv_item *old, + const struct spv_item *new) +{ + if (old == new) + return; + + struct item_path old_path, new_path; + get_path (old, &old_path); + get_path (new, &new_path); + + size_t common = 0; + for (; common < old_path.n && common < new_path.n; common++) + if 
(old_path.nodes[common] != new_path.nodes[common]) + break; + + for (size_t i = common; i < old_path.n; i++) + group_close_item_submit (group_close_item_create ()); + for (size_t i = common; i < new_path.n; i++) + group_open_item_submit (group_open_item_create ( + new_path.nodes[i]->command_id)); + + free_path (&old_path); + free_path (&new_path); +} + +static void +run_convert (int argc UNUSED, char **argv) +{ + struct spv_reader *spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + output_engine_push (); + output_set_filename (argv[1]); + string_map_replace (&output_options, "output-file", argv[2]); + struct output_driver *driver = output_driver_create (&output_options); + if (!driver) + exit (EXIT_FAILURE); + output_driver_register (driver); + + const struct page_setup *ps = spv_get_page_setup (spv); + if (ps) + page_setup_item_submit (page_setup_item_create (ps)); + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + struct spv_item *prev_heading = spv_get_root (spv); + for (size_t i = 0; i < n_items; i++) + { + struct spv_item *heading + = items[i]->type == SPV_ITEM_HEADING ? items[i] : items[i]->parent; + dump_heading_transition (prev_heading, heading); + dump_item (items[i]); + prev_heading = heading; + } + dump_heading_transition (prev_heading, spv_get_root (spv)); + free (items); + + spv_close (spv); + + output_engine_pop (); + fh_done (); + + if (n_warnings && !force) + { + /* XXX There could be other files to unlink, e.g. the ascii driver can + produce additional files with the charts. 
*/ + unlink (argv[2]); + } +} + +static void +run_dump (int argc UNUSED, char **argv) +{ + struct spv_reader *spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + for (size_t i = 0; i < n_items; i++) + if (items[i]->type == SPV_ITEM_TABLE) + { + pivot_table_dump (spv_item_get_table (items[i]), 0); + putchar ('\n'); + } + free (items); + + spv_close (spv); +} + +static int +compare_borders (const void *a_, const void *b_) +{ + const struct spvlb_border *const *ap = a_; + const struct spvlb_border *const *bp = b_; + uint32_t a = (*ap)->border_type; + uint32_t b = (*bp)->border_type; + + return a < b ? -1 : a > b; +} + +static int +compare_cells (const void *a_, const void *b_) +{ + const struct spvlb_cell *const *ap = a_; + const struct spvlb_cell *const *bp = b_; + uint64_t a = (*ap)->index; + uint64_t b = (*bp)->index; + + return a < b ? -1 : a > b; +} + +static void +run_dump_light_table (int argc UNUSED, char **argv) +{ + if (raw && isatty (STDOUT_FILENO)) + error (1, 0, "not writing binary data to tty"); + + struct spv_reader *spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + for (size_t i = 0; i < n_items; i++) + { + if (!spv_item_is_light_table (items[i])) + continue; + + char *error; + if (raw) + { + void *data; + size_t size; + error = spv_item_get_raw_light_table (items[i], &data, &size); + if (!error) + { + fwrite (data, size, 1, stdout); + free (data); + } + } + else + { + struct spvlb_table *table; + error = spv_item_get_light_table (items[i], &table); + if (!error) + { + if (sort) + { + qsort (table->borders->borders, table->borders->n_borders, + sizeof *table->borders->borders, compare_borders); + qsort (table->cells->cells, table->cells->n_cells, + sizeof 
*table->cells->cells, compare_cells); + } + spvlb_print_table (items[i]->bin_member, 0, table); + spvlb_free_table (table); + } + } + if (error) + { + msg (ME, "%s", error); + free (error); + } + } + + free (items); + + spv_close (spv); +} + +static void +run_dump_legacy_data (int argc UNUSED, char **argv) +{ + struct spv_reader *spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + for (size_t i = 0; i < n_items; i++) + if (spv_item_is_legacy_table (items[i])) + { + struct spv_data data; + char *error; + if (raw) + { + void *data; + size_t size; + error = spv_item_get_raw_legacy_data (items[i], &data, &size); + if (!error) + { + fwrite (data, size, 1, stdout); + free (data); + } + } + else + { + error = spv_item_get_legacy_data (items[i], &data); + if (!error) + { + printf ("%s:\n", items[i]->bin_member); + spv_data_dump (&data, stdout); + spv_data_uninit (&data); + printf ("\n"); + } + } + + if (error) + { + msg (ME, "%s", error); + free (error); + } + } + free (items); + + spv_close (spv); +} + +/* This is really bogus. + + XPath doesn't have any notion of a default XML namespace, but all of the + elements in the documents we're interested in have a namespace. Thus, we'd + need to require the XPath expressions to have a namespace on every single + element: vis:sourceVariable, vis:graph, and so on. That's a pain. So, + instead, we remove the default namespace from everyplace it occurs. XPath + does support the null namespace, so this allows sourceVariable, graph, + etc. to work. 
+ + See http://plasmasturm.org/log/259/ and + https://mail.gnome.org/archives/xml/2003-April/msg00144.html for more + information. */ +static void +remove_default_xml_namespace (xmlNode *node) +{ + if (node->ns && !node->ns->prefix) + node->ns = NULL; + + for (xmlNode *child = node->children; child; child = child->next) + remove_default_xml_namespace (child); +} + +static void +register_ns (xmlXPathContext *ctx, const char *prefix, const char *uri) +{ + xmlXPathRegisterNs (ctx, CHAR_CAST (xmlChar *, prefix), + CHAR_CAST (xmlChar *, uri)); +} + +static xmlXPathContext * +create_xpath_context (xmlDoc *doc) +{ + xmlXPathContext *ctx = xmlXPathNewContext (doc); + register_ns (ctx, "vgr", "http://xml.spss.com/spss/viewer/viewer-graph"); + register_ns (ctx, "vizml", "http://xml.spss.com/visualization"); + register_ns (ctx, "vmd", "http://xml.spss.com/spss/viewer/viewer-model"); + register_ns (ctx, "vps", "http://xml.spss.com/spss/viewer/viewer-pagesetup"); + register_ns (ctx, "vst", "http://xml.spss.com/spss/viewer/viewer-style"); + register_ns (ctx, "vtb", "http://xml.spss.com/spss/viewer/viewer-table"); + register_ns (ctx, "vtl", "http://xml.spss.com/spss/viewer/table-looks"); + register_ns (ctx, "vtt", "http://xml.spss.com/spss/viewer/viewer-treemodel"); + register_ns (ctx, "vtx", "http://xml.spss.com/spss/viewer/viewer-text"); + register_ns (ctx, "xsi", "http://www.w3.org/2001/XMLSchema-instance"); + return ctx; +} + +static void +dump_xml (int argc, char **argv, const char *member_name, + char *error_s, xmlDoc *doc) +{ + if (!error_s) + { + if (argc == 2) + { + printf ("<!-- %s -->\n", member_name); + xmlElemDump (stdout, NULL, xmlDocGetRootElement (doc)); + putchar ('\n'); + } + else + { + bool any_results = false; + + remove_default_xml_namespace (xmlDocGetRootElement (doc)); + for (int i = 2; i < argc; i++) + { + xmlXPathContext *xpath_ctx = create_xpath_context (doc); + xmlXPathSetContextNode (xmlDocGetRootElement (doc), + xpath_ctx); + xmlXPathObject *xpath_obj = 
xmlXPathEvalExpression ( + CHAR_CAST (xmlChar *, argv[i]), xpath_ctx); + if (!xpath_obj) + error (1, 0, _("%s: invalid XPath expression"), argv[i]); + + const xmlNodeSet *nodes = xpath_obj->nodesetval; + if (nodes && nodes->nodeNr > 0) + { + if (!any_results) + { + printf ("<!-- %s -->\n", member_name); + any_results = true; + } + for (int j = 0; j < nodes->nodeNr; j++) + { + xmlElemDump (stdout, doc, nodes->nodeTab[j]); + putchar ('\n'); + } + } + + xmlXPathFreeObject (xpath_obj); + xmlXPathFreeContext (xpath_ctx); + } + if (any_results) + putchar ('\n'); + } + xmlFreeDoc (doc); + } + else + { + printf ("<!-- %s -->\n", member_name); + msg (ME, "%s", error_s); + free (error_s); + } +} + +static void +run_dump_legacy_table (int argc, char **argv) +{ + struct spv_reader *spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + for (size_t i = 0; i < n_items; i++) + if (spv_item_is_legacy_table (items[i])) + { + xmlDoc *doc; + char *error_s = spv_item_get_legacy_table (items[i], &doc); + dump_xml (argc, argv, items[i]->xml_member, error_s, doc); + } + free (items); + + spv_close (spv); +} + +static void +run_dump_structure (int argc, char **argv) +{ + struct spv_reader *spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + const char *last_structure_member = NULL; + for (size_t i = 0; i < n_items; i++) + if (!last_structure_member || strcmp (items[i]->structure_member, + last_structure_member)) + { + last_structure_member = items[i]->structure_member; + + xmlDoc *doc; + char *error_s = spv_item_get_structure (items[i], &doc); + dump_xml (argc, argv, items[i]->structure_member, error_s, doc); + } + free (items); + + spv_close (spv); +} + +static void +run_is_legacy (int argc UNUSED, char **argv) +{ + struct spv_reader 
*spv; + char *err = spv_open (argv[1], &spv); + if (err) + error (1, 0, "%s", err); + + bool is_legacy = false; + + struct spv_item **items; + size_t n_items; + spv_select (spv, criteria, n_criteria, &items, &n_items); + for (size_t i = 0; i < n_items; i++) + if (spv_item_is_legacy_table (items[i])) + { + is_legacy = true; + break; + } + free (items); + + spv_close (spv); + + exit (is_legacy ? EXIT_SUCCESS : EXIT_FAILURE); +} + +struct command + { + const char *name; + int min_args, max_args; + void (*run) (int argc, char **argv); + }; + +static const struct command commands[] = + { + { "detect", 1, 1, run_detect }, + { "dir", 1, 1, run_directory }, + { "convert", 2, 2, run_convert }, + + /* Undocumented commands. */ + { "dump", 1, 1, run_dump }, + { "dump-light-table", 1, 1, run_dump_light_table }, + { "dump-legacy-data", 1, 1, run_dump_legacy_data }, + { "dump-legacy-table", 1, INT_MAX, run_dump_legacy_table }, + { "dump-structure", 1, INT_MAX, run_dump_structure }, + { "is-legacy", 1, 1, run_is_legacy }, + }; +static const int n_commands = sizeof commands / sizeof *commands; + +static const struct command * +find_command (const char *name) +{ + for (size_t i = 0; i < n_commands; i++) + { + const struct command *c = &commands[i]; + if (!strcmp (name, c->name)) + return c; + } + return NULL; +} + +static void +emit_msg (const struct msg *m, void *aux UNUSED) +{ + if (m->severity == MSG_S_ERROR || m->severity == MSG_S_WARNING) + n_warnings++; + + char *s = msg_to_string (m); + fprintf (stderr, "%s\n", s); + free (s); +} + +int +main (int argc, char **argv) +{ + set_program_name (argv[0]); + msg_set_handler (emit_msg, NULL); + settings_init (); + i18n_init (); + + parse_options (argc, argv); + + argc -= optind; + argv += optind; + + if (argc < 1) + error (1, 0, _("missing command name (use --help for help)")); + + const struct command *c = find_command (argv[0]); + if (!c) + error (1, 0, _("unknown command \"%s\" (use --help for help)"), argv[0]); + + int n_args = 
argc - 1; + if (n_args < c->min_args || n_args > c->max_args) + { + if (c->min_args == c->max_args) + error (1, 0, _("\"%s\" command takes exactly %d argument%s"), + c->name, c->min_args, c->min_args == 1 ? "" : "s"); + else if (c->max_args == INT_MAX) + error (1, 0, _("\"%s\" command requires at least %d argument%s"), + c->name, c->min_args, c->min_args == 1 ? "" : "s"); + else + error (1, 0, _("\"%s\" command requires between %d and %d arguments"), + c->name, c->min_args, c->max_args); + } + + c->run (argc, argv); + + i18n_done (); + + return n_warnings ? EXIT_FAILURE : EXIT_SUCCESS; +} + +static struct spv_criteria * +get_criteria (void) +{ + if (!n_criteria || new_criteria) + { + new_criteria = false; + if (n_criteria >= allocated_criteria) + criteria = x2nrealloc (criteria, &allocated_criteria, + sizeof *criteria); + criteria[n_criteria++] = (struct spv_criteria) SPV_CRITERIA_INITIALIZER; + } + + return &criteria[n_criteria - 1]; +} + +static void +parse_select (char *arg) +{ + bool invert = arg[0] == '^'; + arg += invert; + + unsigned classes = 0; + for (char *token = strtok (arg, ","); token; token = strtok (NULL, ",")) + { + if (!strcmp (token, "all")) + classes = SPV_ALL_CLASSES; + else if (!strcmp (token, "help")) + { + puts (_("The following object classes are supported:")); + for (int class = 0; class < SPV_N_CLASSES; class++) + printf ("- %s\n", spv_item_class_to_string (class)); + exit (0); + } + else + { + int class = spv_item_class_from_string (token); + if (class == SPV_N_CLASSES) + error (1, 0, _("%s: unknown object class (use --select=help " + "for help)"), token); + classes |= 1u << class; + } + } + + struct spv_criteria *c = get_criteria (); + c->classes = invert ? 
classes ^ SPV_ALL_CLASSES : classes; +} + +static struct spv_criteria_match * +get_criteria_match (const char **arg) +{ + struct spv_criteria *c = get_criteria (); + if ((*arg)[0] == '^') + { + (*arg)++; + return &c->exclude; + } + else + return &c->include; +} + +static void +parse_commands (const char *arg) +{ + struct spv_criteria_match *cm = get_criteria_match (&arg); + string_array_parse (&cm->commands, ss_cstr (arg), ss_cstr (",")); +} + +static void +parse_subtypes (const char *arg) +{ + struct spv_criteria_match *cm = get_criteria_match (&arg); + string_array_parse (&cm->subtypes, ss_cstr (arg), ss_cstr (",")); +} + +static void +parse_labels (const char *arg) +{ + struct spv_criteria_match *cm = get_criteria_match (&arg); + string_array_parse (&cm->labels, ss_cstr (arg), ss_cstr (",")); +} + +static void +parse_instances (char *arg) +{ + struct spv_criteria *c = get_criteria (); + size_t allocated_instances = c->n_instances; + + for (char *token = strtok (arg, ","); token; token = strtok (NULL, ",")) + { + if (c->n_instances >= allocated_instances) + c->instances = x2nrealloc (c->instances, &allocated_instances, + sizeof *c->instances); + + c->instances[c->n_instances++] = (!strcmp (token, "last") ? -1 + : atoi (token)); + } +} + +static void +parse_members (const char *arg) +{ + struct spv_criteria *cm = get_criteria (); + string_array_parse (&cm->members, ss_cstr (arg), ss_cstr (",")); +} + +static void +parse_options (int argc, char *argv[]) +{ + for (;;) + { + enum + { + OPT_MEMBER_NAMES = UCHAR_MAX + 1, + OPT_SHOW_HIDDEN, + OPT_SELECT, + OPT_COMMANDS, + OPT_SUBTYPES, + OPT_LABELS, + OPT_INSTANCES, + OPT_MEMBERS, + OPT_ERRORS, + OPT_OR, + OPT_SORT, + OPT_RAW, + }; + static const struct option long_options[] = + { + /* Input selection options. 
*/ + { "show-hidden", no_argument, NULL, OPT_SHOW_HIDDEN }, + { "select", required_argument, NULL, OPT_SELECT }, + { "commands", required_argument, NULL, OPT_COMMANDS }, + { "subtypes", required_argument, NULL, OPT_SUBTYPES }, + { "labels", required_argument, NULL, OPT_LABELS }, + { "instances", required_argument, NULL, OPT_INSTANCES }, + { "members", required_argument, NULL, OPT_MEMBERS }, + { "errors", no_argument, NULL, OPT_ERRORS }, + { "or", no_argument, NULL, OPT_OR }, + + /* "dir" command options. */ + { "member-names", no_argument, NULL, OPT_MEMBER_NAMES }, + + /* "convert" command options. */ + { "force", no_argument, NULL, 'f' }, + + /* "dump-light-table" command options. */ + { "sort", no_argument, NULL, OPT_SORT }, + { "raw", no_argument, NULL, OPT_RAW }, + + { "help", no_argument, NULL, 'h' }, + { "version", no_argument, NULL, 'v' }, + + { NULL, 0, NULL, 0 }, + }; + + int c; + + c = getopt_long (argc, argv, "O:hvf", long_options, NULL); + if (c == -1) + break; + + switch (c) + { + case 'O': + output_driver_parse_option (optarg, &output_options); + break; + + case OPT_MEMBER_NAMES: + show_member_names = true; + break; + + case OPT_SHOW_HIDDEN: + get_criteria ()->include_hidden = true; + break; + + case OPT_SELECT: + parse_select (optarg); + break; + + case OPT_COMMANDS: + parse_commands (optarg); + break; + + case OPT_SUBTYPES: + parse_subtypes (optarg); + break; + + case OPT_LABELS: + parse_labels (optarg); + break; + + case OPT_INSTANCES: + parse_instances (optarg); + break; + + case OPT_MEMBERS: + parse_members (optarg); + break; + + case OPT_ERRORS: + get_criteria ()->error = true; + break; + + case OPT_OR: + new_criteria = true; + break; + + case OPT_SORT: + sort = true; + break; + + case OPT_RAW: + raw = true; + break; + + case 'f': + force = true; + break; + + case 'v': + version_etc (stdout, "pspp-output", PACKAGE_NAME, PACKAGE_VERSION, + "Ben Pfaff", "John Darrington", NULL_SENTINEL); + exit (EXIT_SUCCESS); + + case 'h': + usage (); + exit 
(EXIT_SUCCESS); + + default: + exit (EXIT_FAILURE); + } + } +} + +static void +usage (void) +{ + struct string s = DS_EMPTY_INITIALIZER; + struct string_set formats = STRING_SET_INITIALIZER(formats); + output_get_supported_formats (&formats); + const char *format; + const struct string_set_node *node; + STRING_SET_FOR_EACH (format, node, &formats) + { + if (!ds_is_empty (&s)) + ds_put_byte (&s, ' '); + ds_put_cstr (&s, format); + } + string_set_destroy (&formats); + + printf ("\ +%s, a utility for working with SPSS viewer (.spv) files.\n\ +Usage: %s [OPTION]... COMMAND ARG...\n\ +\n\ +The following commands are available:\n\ + detect FILE Detect whether FILE is an SPV file.\n\ + dir FILE List tables and other items in FILE.\n\ + convert SOURCE DEST Convert .spv SOURCE to DEST.\n\ +\n\ +Input selection options for \"dir\" and \"convert\":\n\ + --select=CLASS... include only some kinds of objects\n\ + --select=help print known object classes\n\ + --commands=COMMAND... include only specified COMMANDs\n\ + --subtypes=SUBTYPE... include only specified SUBTYPEs of output\n\ + --labels=LABEL... include only output objects with the given LABELs\n\ + --instances=INSTANCE... include only the given object INSTANCEs\n\ + --show-hidden include hidden output objects\n\ + --or separate two sets of selection options\n\ +\n\ +\"convert\" by default infers the destination's format from its extension.\n\ +The known extensions are: %s\n\ +The following options override \"convert\" behavior:\n\ + -O format=FORMAT set destination format to FORMAT\n\ + -O OPTION=VALUE set output option\n\ + -f, --force keep output file even given errors\n\ +Other options:\n\ + --help display this help and exit\n\ + --version output version information and exit\n", + program_name, program_name, ds_cstr (&s)); + ds_destroy (&s); +}
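Reviewer note: the `--select=CLASS...` handling in `parse_select` can be exercised in isolation. The following is only a sketch under stated assumptions, not the code in this commit: the class names in `demo_classes`, the `DEMO_*` constants, and the function `demo_parse_select` are hypothetical stand-ins for `spv_item_class_from_string` and `SPV_ALL_CLASSES`, and the sketch returns `false` on an unknown name where the real code calls `error`. It shows a comma-separated list, optionally prefixed with `^` to invert the selection, being turned into a class bitmask, with each comparison made against the current token.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical class names, for illustration only.  In pspp-output the real
   names come from spv_item_class_to_string (). */
static const char *const demo_classes[] = { "charts", "tables", "texts" };
enum { DEMO_N_CLASSES = sizeof demo_classes / sizeof *demo_classes };
#define DEMO_ALL_CLASSES ((1u << DEMO_N_CLASSES) - 1)

/* Parses ARG, a comma-separated list of class names optionally preceded by
   '^' to invert the selection, and stores the resulting bitmask in *MASK.
   strtok () modifies ARG in place.  Returns true if successful, false if a
   name is not recognized (an assumption of this sketch; *MASK is then left
   unchanged). */
static bool
demo_parse_select (char *arg, unsigned int *mask)
{
  bool invert = arg[0] == '^';
  arg += invert;

  unsigned int classes = 0;
  for (char *token = strtok (arg, ","); token; token = strtok (NULL, ","))
    {
      /* Compare the current token, not ARG, which after the first strtok ()
         call always points at the first token only. */
      if (!strcmp (token, "all"))
        {
          classes = DEMO_ALL_CLASSES;
          continue;
        }

      int class;
      for (class = 0; class < DEMO_N_CLASSES; class++)
        if (!strcmp (token, demo_classes[class]))
          break;
      if (class == DEMO_N_CLASSES)
        return false;
      classes |= 1u << class;
    }

  *mask = invert ? classes ^ DEMO_ALL_CLASSES : classes;
  return true;
}
```

With these hypothetical names, passing a writable copy of `tables,texts` sets bits 1 and 2, and `^tables` selects every class except bit 1. The argument must be writable because `strtok` overwrites the separators.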