X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=spv-file-format.texi;h=a5f45e5b4979f7a86307413c2ad4a8c304658a5c;hb=463238cd3f894fc6fb5cebbcc7bb2b9584c640a8;hp=c58723379517a7ddfd275184a96020313f147703;hpb=86009a9088ecdfddfacd7974f30b88cb89937d55;p=pspp

diff --git a/spv-file-format.texi b/spv-file-format.texi
index c587233795..a5f45e5b49 100644
--- a/spv-file-format.texi
+++ b/spv-file-format.texi
@@ -554,3 +554,45 @@
 are terminal categories that directly represent data values for a
 variable (e.g. in a frequency table or crosstabulation, a group of
 values in a variable being tabulated) and i0 otherwise, but this might be naive.
+
+@example
+data := int[layers] int[rows] int[columns] int*[n-dimensions]
+        int[n-data] datum*[n-data]
+@end example
+
+The values of @code{layers}, @code{rows}, and @code{columns} each
+specifies the number of dimensions represented in layers or rows or
+columns, respectively, and their values sum to the number of
+dimensions.
+
+The @code{n-dimensions} integers are a permutation of the 0-based
+dimension numbers. The first @code{layers} of them specify each of
+the dimensions represented by layers, the next @code{rows} of them
+specify the dimensions represented by rows, and the final
+@code{columns} of them specify the dimensions represented by columns.
+When there is more than one dimension of a given kind, the inner
+dimensions are given first.
+
+@example
+datum := int64[index] 00? value      @r{# Version 1.}
+datum := int64[index] value          @r{# Version 3.}
+@end example
+
+A datum consists of an index and a value. Suppose there are @math{d}
+dimensions and dimension @math{i} for @math{0 \le i < d} has
+@math{n_i} categories. Consider the datum at coordinates @math{x_i}
+for @math{0 \le i < d}; note that @math{0 \le x_i < n_i}. Then the
+index is calculated by the following algorithm:
+
+@display
+let index = 0
+for each @math{i} from 0 to @math{d - 1}:
+    index = @math{n_i \times} index + @math{x_i}
+@end display
+
+For example, suppose there are 3 dimensions with 3, 4, and 5
+categories, respectively. The datum at coordinates (1, 2, 3) has
+index @math{5 \times (4 \times (3 \times 0 + 1) + 2) + 3 = 33}.
+
+The format of a datum varies slightly from version 1 to version 3, in
+that version 1 has an extra optional 00 byte.
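
The index algorithm added in this hunk can be checked with a short Python sketch. The function names `datum_index` and `datum_coords` are hypothetical and do not come from PSPP. The inverse function is not described in the patch; it is an assumption that follows from reading the algorithm as a mixed-radix encoding in which the last dimension varies fastest.

```python
from typing import List, Sequence


def datum_index(coords: Sequence[int], sizes: Sequence[int]) -> int:
    """Fold coordinates x_i into one index using index = n_i * index + x_i.

    `sizes` holds n_i, the number of categories in each dimension.
    """
    if len(coords) != len(sizes):
        raise ValueError("coords and sizes must have the same length")
    index = 0
    for x, n in zip(coords, sizes):
        if not 0 <= x < n:
            raise ValueError(f"coordinate {x} out of range for size {n}")
        index = n * index + x
    return index


def datum_coords(index: int, sizes: Sequence[int]) -> List[int]:
    """Assumed inverse: peel coordinates off from the last dimension back."""
    coords = []
    for n in reversed(sizes):
        index, x = divmod(index, n)
        coords.append(x)
    if index != 0:
        raise ValueError("index too large for the given sizes")
    coords.reverse()
    return coords


if __name__ == "__main__":
    # Worked example from the text: sizes 3, 4, 5 and coordinates (1, 2, 3).
    print(datum_index((1, 2, 3), (3, 4, 5)))
    print(datum_coords(33, (3, 4, 5)))
```

With the sizes 3, 4, 5 from the example, `datum_index((1, 2, 3), (3, 4, 5))` reproduces the value 33 given in the text, and `datum_coords` maps 33 back to the same coordinates.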