Machine endianness. 1 indicates big-endian, 2 indicates little-endian.
@item int32 character_code;
-@anchor{character-code}
-Character code. 1 indicates EBCDIC, 2 indicates 7-bit ASCII, 3
-indicates 8-bit ASCII, 4 indicates DEC Kanji.
-Windows code page numbers are also valid.
-
-Experience has shown that in many files, this field is ignored or incorrect.
-For a more reliable indication of the file's character encoding
-see @ref{Character Encoding Record}.
+@anchor{character-code} Character code. The following values have
+been actually observed in system files:
+
+@table @asis
+@item 2
+7-bit ASCII.
+
+@item 1250
+The @code{windows-1250} code page for Central European and Eastern
+European languages.
+
+@item 1252
+The @code{windows-1252} code page for Western European languages.
+
+@item 28591
+ISO 8859-1.
+
+@item 65001
+UTF-8.
+@end table
+
+The following additional values are known to be defined:
+
+@table @asis
+@item 1
+EBCDIC.
+
+@item 3
+8-bit ``ASCII''.
+
+@item 4
+DEC Kanji.
+@end table
+
+Other Windows code page numbers are known to be generally valid.
+
+Old versions of SPSS always wrote value 2 in this field, regardless of
+the encoding in use. Newer versions also write the character encoding
+as a string (see @ref{Character Encoding Record}).
@end table
@node Machine Floating-Point Info Record
See @url{http://www.iana.org/assignments/character-sets}.
@end table
-This record is not present in files generated by older software.
-See also @ref{character-code}.
+This record is not present in files generated by older software. See
+also the @code{character_code} field in the machine integer info
+record (@pxref{character-code}).
+
+When the character encoding record and the machine integer info record
+are both present, all system files observed in practice indicate the
+same character encoding, e.g.@: 1252 as @code{character_code} and
+@code{windows-1252} as @code{encoding}, 65001 and @code{UTF-8}, etc.
+
+If, for testing purposes, a file is crafted with different
+@code{character_code} and @code{encoding}, it seems that
+@code{character_code} controls the encoding for all strings in the
+system file before the dictionary termination record, including
+strings in data (e.g.@: string missing values), and @code{encoding}
+controls the encoding for strings following the dictionary termination
+record.
@node Long String Value Labels Record
@section Long String Value Labels Record