X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fdev%2Fsystem-file-format.texi;h=164807b80115e4796bb394c0336522b0d84344ba;hb=db714493fb4cfee9aac97f897aaa795d5beb85ee;hp=70fa385c7525efea051a5bfb90fbf046bdb7e14b;hpb=317ba3232cb1361259bba6aa8444b141068dd0d3;p=pspp diff --git a/doc/dev/system-file-format.texi b/doc/dev/system-file-format.texi index 70fa385c75..164807b801 100644 --- a/doc/dev/system-file-format.texi +++ b/doc/dev/system-file-format.texi @@ -96,6 +96,8 @@ Each type of record is described separately below. * Variable Display Parameter Record:: * Long Variable Names Record:: * Very Long String Record:: +* Character Encoding Record:: +* Data File and Variable Attributes Records:: * Miscellaneous Informational Records:: * Dictionary Termination Record:: * Data Record:: @@ -545,9 +547,14 @@ Compression code. Always set to 1. Machine endianness. 1 indicates big-endian, 2 indicates little-endian. @item int32 character_code; +@anchor{character-code} Character code. 1 indicates EBCDIC, 2 indicates 7-bit ASCII, 3 indicates 8-bit ASCII, 4 indicates DEC Kanji. Windows code page numbers are also valid. + +Experience has shown that in many files, this field is ignored or incorrect. +For a more reliable indication of the file's character encoding +see @ref{Character Encoding Record}. @end table @node Machine Floating-Point Info Record @@ -791,6 +798,125 @@ After the last tuple, there may be a single byte 00, or @{00, 09@}. The total length is @code{count} bytes. @end table +@node Character Encoding Record +@section Character Encoding Record + +This record, if present, indicates the character encoding for string data, +long variable names, variable labels, value labels and other strings in the +file. + +@example +/* @r{Header.} */ +int32 rec_type; +int32 subtype; +int32 size; +int32 count; + +/* @r{Exactly @code{count} bytes of data.} */ +char encoding[]; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 7. 
+ +@item int32 subtype; +Record subtype. Always set to 20. + +@item int32 size; +The size of each element in the @code{encoding} member. Always set to 1. + +@item int32 count; +The total number of bytes in @code{encoding}. + +@item char encoding[]; +The name of the character encoding. Normally this will be an official IANA character set name or alias. +See @url{http://www.iana.org/assignments/character-sets}. +@end table + +This record is not present in files generated by older software. +See also @ref{character-code}. + + +@node Data File and Variable Attributes Records +@section Data File and Variable Attributes Records + +The data file and variable attributes records represent custom +attributes for the system file or for individual variables in the +system file, as defined on the DATAFILE ATTRIBUTE (@pxref{DATAFILE +ATTRIBUTE,,,pspp, PSPP Users Guide}) and VARIABLE ATTRIBUTE commands +(@pxref{VARIABLE ATTRIBUTE,,,pspp, PSPP Users Guide}), respectively. + +@example +/* @r{Header.} */ +int32 rec_type; +int32 subtype; +int32 size; +int32 count; + +/* @r{Exactly @code{count} bytes of data.} */ +char attributes[]; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 7. + +@item int32 subtype; +Record subtype. Always set to 17 for a data file attribute record or +to 18 for a variable attributes record. + +@item int32 size; +The size of each element in the @code{attributes} member. Always set to 1. + +@item int32 count; +The total number of bytes in @code{attributes}. + +@item char attributes[]; +The attributes, in a text-based format. + +In record subtype 17, this field contains a single attribute set. An +attribute set is a sequence of one or more attributes concatenated +together. Each attribute consists of a name, which has the same +syntax as a variable name, followed by, inside parentheses, a sequence +of one or more values. Each value consists of a string enclosed in +single quotes (@code{'}) followed by a line feed (byte 0x0a). 
A value +may contain single quote characters, which are not themselves escaped +or quoted or required to be present in pairs. There is no apparent +way to embed a line feed in a value. There is no distinction between +an attribute with a single value and an attribute array with one +element. + +In record subtype 18, this field contains a sequence of one or more +variable attribute sets. If more than one variable attribute set is +present, each one after the first is delimited from the previous by +@code{/}. Each variable attribute set consists of a variable name, +followed by @code{:}, followed by an attribute set with the same +syntax as in record subtype 17. + +The total length is @code{count} bytes. +@end table + +@subheading Example + +A system file produced with the following VARIABLE ATTRIBUTE commands +in effect: + +@example +VARIABLE ATTRIBUTE VARIABLES=dummy ATTRIBUTE=fred[1]('23') fred[2]('34'). +VARIABLE ATTRIBUTE VARIABLES=dummy ATTRIBUTE=bert('123'). +@end example + +@noindent +will contain a variable attribute record with the following contents: + +@example +00000000 07 00 00 00 12 00 00 00 01 00 00 00 22 00 00 00 |............"...| +00000010 64 75 6d 6d 79 3a 66 72 65 64 28 27 32 33 27 0a |dummy:fred('23'.| +00000020 27 33 34 27 0a 29 62 65 72 74 28 27 31 32 33 27 |'34'.)bert('123'| +00000030 0a 29 |.) | +@end example + @node Miscellaneous Informational Records @section Miscellaneous Informational Records
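Reviewer note on the variable attributes record added by this patch: the text format described for @code{attributes} can be checked against the hex dump in the example with a short script. The following is only a minimal sketch, not PSPP's reader. The function names (parse_record, parse_attribute_set, parse_variable_attributes) are hypothetical, the header is assumed to be little-endian as in the dump, and the data is assumed to be ASCII-compatible. Because a value may contain unescaped single quotes but apparently no line feed, the sketch treats the first quote that is immediately followed by a line feed as the end of the value.

```python
import struct

# The 50 bytes from the example hex dump: a 16-byte header followed by
# 0x22 (34) bytes of attribute text.
RECORD = bytes.fromhex(
    "07000000" "12000000" "01000000" "22000000"
    "64756d6d793a6672656428273233270a"
    "273334270a2962657274282731323327"
    "0a29"
)


def parse_record(record):
    """Split an extension record into (rec_type, subtype, data bytes).

    Assumption: the four int32 header fields are little-endian.
    """
    rec_type, subtype, size, count = struct.unpack_from("<4i", record, 0)
    return rec_type, subtype, record[16:16 + size * count]


def parse_attribute_set(text, pos=0):
    """Parse one attribute set starting at pos.

    Returns (dict of name -> list of values, position after the set).
    Parsing stops at end of text or at the '/' that delimits variables.
    """
    attrs = {}
    while pos < len(text) and text[pos] != "/":
        open_paren = text.index("(", pos)
        name = text[pos:open_paren]
        pos = open_paren + 1
        values = []
        while text[pos] != ")":
            if text[pos] != "'":
                raise ValueError("expected opening quote at offset %d" % pos)
            # A value ends at the first quote directly followed by a line feed.
            end = text.index("'\n", pos + 1)
            values.append(text[pos + 1:end])
            pos = end + 2
        pos += 1  # skip the closing ")"
        attrs[name] = values
    return attrs, pos


def parse_variable_attributes(text):
    """Parse the text of a subtype 18 record into {variable: {name: values}}."""
    result = {}
    pos = 0
    while pos < len(text):
        colon = text.index(":", pos)
        variable = text[pos:colon]
        result[variable], pos = parse_attribute_set(text, colon + 1)
        if pos < len(text) and text[pos] == "/":
            pos += 1  # skip the delimiter before the next variable
    return result


rec_type, subtype, data = parse_record(RECORD)
parsed = parse_variable_attributes(data.decode("ascii"))
```

Calling parse_attribute_set directly on the text of a subtype 17 record should handle the data file attributes case, since that record holds a single attribute set with no variable name prefix. Note that the array indexes written as fred[1] and fred[2] in the commands are not stored in the record; only the order of the values is.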