X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=spv-file-format.texi;h=f9d698706e50acf0ba48e577f2f1647a2737a118;hb=8af10bb39253b97589c5f4b455b708c8fb9e233b;hp=bb615a2384333bd3fd4166b21a38f04dceffeefd;hpb=2aed65a53e0d5ae8d7abc77f6cbd7cf055b37ceb;p=pspp

diff --git a/spv-file-format.texi b/spv-file-format.texi
index bb615a2384..f9d698706e 100644
--- a/spv-file-format.texi
+++ b/spv-file-format.texi
@@ -411,8 +411,12 @@ In every example in the corpus, @code{x2} is 18 and the bytes that
 follow it are @code{00 00 00 01 00 00 00 00 00 00 00 00 00 02 00 00 00
 00}. The meaning of these bytes is unknown.
 
-Observed values of @code{x3} vary from 16 to 150. The bytes that
-follow it vary somewhat.
+In every example in the corpus for version 1, @code{x3} is 16 and the
+bytes that follow it are @code{00 00 00 01 00 00 00 01 00 00 00 00 01
+01 01 01}. In version 3, observed @code{x3} varies from 117 to 150 and
+the bytes that follow it vary somewhat and often include a readable
+text string, e.g. ``Default'' or ``Academic'', which appears to be the
+name of a ``TableLook''.
 
 Observed values of @code{x4} vary from 0 to 17. Out of 7060 examples
 in the corpus, it is nonzero only 36 times.
@@ -557,6 +561,7 @@ be naive.
 
 @example
 data := int[layers] int[rows] int[columns] int*[n-dimensions]
+        int[n-data] datum*[n-data]
 @end example
 
 The values of @code{layers}, @code{rows}, and @code{columns} each
@@ -571,3 +576,59 @@ specify the dimensions represented by rows, and the final
 @code{columns} of them specify the dimensions represented by columns.
 When there is more than one dimension of a given kind, the inner
 dimensions are given first.
+
+@example
+datum := int64[index] 00? value    /* @r{version 1} */
+datum := int64[index] value        /* @r{version 3} */
+@end example
+
+The format of a datum varies slightly from version 1 to version 3: in
+version 1 it allows for an extra optional 00 byte.
+
+A datum consists of an index and a value. Suppose there are @math{d}
+dimensions and dimension @math{i} for @math{0 \le i < d} has
+@math{n_i} categories. Consider the datum at coordinates @math{x_i}
+for @math{0 \le i < d}; note that @math{0 \le x_i < n_i}. Then the
+index is calculated by the following algorithm:
+
+@display
+let index = 0
+for each @math{i} from 0 to @math{d - 1}:
+    index = @math{n_i \times} index + @math{x_i}
+@end display
+
+For example, suppose there are 3 dimensions with 3, 4, and 5
+categories, respectively. The datum at coordinates (1, 2, 3) has
+index @math{5 \times (4 \times (3 \times 0 + 1) + 2) + 3 = 33}.
+
+@example
+value := 00? 00? 00? 00? raw-value
+raw-value := 01 opt-value int32[format] double
+           | 02 opt-value int32[format] double string[varname] string[vallab]
+             (01 | 02 | 03)
+           | 03 string[local] opt-value string[id] string[c] (00 | 01)
+           | 04 opt-value int32[format] string[vallab] string[varname]
+             (01 | 02 | 03) string[vallab]
+           | 05 opt-value string[varname] string[varlabel] (01 | 02 | 03)
+           | opt-value string[format] int32[n-substs] substitution*[n-substs]
+substitution := i0 value
+              | int32[x] value*[x + 1]      /* @r{x > 0} */
+opt-value := 31 i0 (i0 | i1 string) opt-value-i0-v1    /* @r{version 1} */
+           | 31 i0 (i0 | i1 string) opt-value-i0-v3    /* @r{version 3} */
+           | 31 i1 int32[footnote-number] nested-string
+           | 31 i2 (00 | 02) 00 (i1 | i2 | i3) nested-string
+           | 31 i3 00 00 01 00 i2 nested-string
+           | 58
+opt-value-i0-v1 := 00 (i1 | i2) 00 00 int32 00 00
+opt-value-i0-v3 := count(counted-string
+                         (58
+                          | 31 01? 00? 00? 00? 01
+                            string[fgcolor] string[bgcolor] string[typeface]
+                            byte)
+                         (58
+                          | 31 i0 i0 i0 i0 01 00 (01 | 02 | 08)
+                            00 08 00 0a 00))
+
+nested-string := 00 00 count(counted-string 58 58)
+counted-string := count((i0 (58 | 31 string))?)
+@end example
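
The datum index calculation that this patch documents (the "let index = 0 ... index = n_i * index + x_i" algorithm) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function and argument names (`datum_index`, `datum_coords`, `coords`, `sizes`) are hypothetical and do not come from PSPP. The inverse function is not described in the text; it is included only to show that the mapping appears to be reversible when the category counts are known.

```python
from typing import List, Sequence


def datum_index(coords: Sequence[int], sizes: Sequence[int]) -> int:
    """Fold per-dimension coordinates x_i into one index.

    Follows the documented recurrence: index = n_i * index + x_i,
    for i from 0 to d - 1, starting from index = 0.
    """
    if len(coords) != len(sizes):
        raise ValueError("coords and sizes must have the same length")
    index = 0
    for x, n in zip(coords, sizes):
        # The text notes that 0 <= x_i < n_i must hold.
        if not 0 <= x < n:
            raise ValueError(f"coordinate {x} out of range for {n} categories")
        index = n * index + x
    return index


def datum_coords(index: int, sizes: Sequence[int]) -> List[int]:
    """Hypothetical inverse: recover the coordinates from an index.

    Not part of the documented format; it simply undoes the recurrence
    by peeling off the last dimension first.
    """
    coords = []
    for n in reversed(sizes):
        index, x = divmod(index, n)
        coords.append(x)
    if index != 0:
        raise ValueError("index out of range for the given sizes")
    coords.reverse()
    return coords


# The worked example from the text: 3 dimensions with 3, 4, and 5
# categories, datum at coordinates (1, 2, 3).
print(datum_index((1, 2, 3), (3, 4, 5)))  # prints 33
```

Because the recurrence multiplies the running index by each dimension's category count before adding the coordinate, the last dimension varies fastest, which matches the worked example in the text.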