X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fdev%2Fi18n.texi;h=09b5b419cf74d6636183c505557772f123d60f3d;hb=8e27b1a0dba7f33b7acb0d8894efe2045b0bb98f;hp=97077d344e3135548fa4cc91c633062f4e0440e9;hpb=e0e0fab736b71dd934d82bcc91a5ba674d3491d5;p=pspp

diff --git a/doc/dev/i18n.texi b/doc/dev/i18n.texi
index 97077d344e..09b5b419cf 100644
--- a/doc/dev/i18n.texi
+++ b/doc/dev/i18n.texi
@@ -47,13 +47,12 @@ report generated by pspp. Non-data related strings (Eg: ``Page number'',
 @subsection The data locale
 This locale is the one associated with the data being analysed with pspp.
 The only important aspect of this locale is the character encoding.
-@footnote {It might also be desirable for the LC_COLLATE category to be used for the purposes of sorting data.}
+@footnote{It might also be desirable for the LC_COLLATE category to be used for the purposes of sorting data.}
 The dictionary pertaining to the data contains a field denoting the encoding.
 Any string data stored in a @union{value} will be encoded in the
 dictionary's character set.
 
 
-
 @section System files
 @file{*.sav} files contain a field which is supposed to identify the encoding
 of the data they contain (@pxref{Machine Integer Info Record}).
@@ -103,25 +102,20 @@ It is the caller's responsibility to free the returned string when no
 longer required.
 @end deftypefun
 
+In order to minimise the number of conversions required, and to simplify
+design, PSPP attempts to store all internal strings in UTF8 encoding.
+Thus, when reading system and portable files (or any other data source),
+the following items are immediately converted to UTF8 encoding:
+@itemize
+@item Variable names
+@item Variable labels
+@item Value labels
+@end itemize
+Conversely, when writing system files, these are converted back to the
+encoding of that system file.
 
-For example, in order to display a string variable's value in a label widget in the psppire gui one would use code similar to
-@example
-
-struct variable *var = /* assigned from somewhere */
-struct case c = /* from somewhere else */
-
-const union value *val = case_data (&c, var);
-
-char *utf8string = recode_string (UTF8, dict_get_encoding (dict), val->s,
-                                  var_get_width (var));
-
-GtkWidget *entry = gtk_entry_new();
-gtk_entry_set_text (entry, utf8string);
-gtk_widget_show (entry);
-
-free (utf8string);
-
-@end example
+String data stored in union values are left in their original encoding.
+These will be converted by the data_in/data_out functions.