X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fdev%2Fi18n.texi;h=ab80b2f7735dda72bff6abcfab19662ce68c5775;hb=a9e49cdd81db02cef2a41c1ad3584d74ae3d7476;hp=97077d344e3135548fa4cc91c633062f4e0440e9;hpb=14aac9fe7a7efbb6c9bded2ed5969a643cb76645;p=pspp

diff --git a/doc/dev/i18n.texi b/doc/dev/i18n.texi
index 97077d344e..ab80b2f773 100644
--- a/doc/dev/i18n.texi
+++ b/doc/dev/i18n.texi
@@ -1,3 +1,13 @@
+@c PSPP - a program for statistical analysis.
+@c Copyright (C) 2019 Free Software Foundation, Inc.
+@c Permission is granted to copy, distribute and/or modify this document
+@c under the terms of the GNU Free Documentation License, Version 1.3
+@c or any later version published by the Free Software Foundation;
+@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+@c A copy of the license is included in the section entitled "GNU
+@c Free Documentation License".
+@c
+
 @node Internationalisation
 @chapter Internationalisation
 
@@ -27,7 +37,7 @@ It's rarely, if ever, necessary to interrogate the system to find out
 the values of the 3 locales.
 However it's important to be aware of the source (destination) locale
 when reading (writing) string data.
-When transfering data between a source and a destination, the appropriate
+When transferring data between a source and a destination, the appropriate
 recoding must be performed.
 
 
@@ -47,13 +57,12 @@ report generated by pspp. Non-data related strings (Eg: ``Page number'',
 @subsection The data locale
 This locale is the one associated with the data being analysed with pspp.
 The only important aspect of this locale is the character encoding.
-@footnote {It might also be desirable for the LC_COLLATE category to be used for the purposes of sorting data.}
+@footnote{It might also be desirable for the LC_COLLATE category to be used for the purposes of sorting data.}
 The dictionary pertaining to the data contains a field denoting the encoding.
 Any string data stored in a @union{value} will be encoded in the
 dictionary's character set.
 
 
-
 @section System files
 @file{*.sav} files contain a field which is supposed to identify the encoding
 of the data they contain (@pxref{Machine Integer Info Record}).
@@ -103,25 +112,20 @@ It is the caller's responsibility to free the returned string when no
 longer required.
 @end deftypefun
 
+In order to minimise the number of conversions required, and to simplify
+design, PSPP attempts to store all internal strings in UTF8 encoding.
+Thus, when reading system and portable files (or any other data source),
+the following items are immediately converted to UTF8 encoding:
+@itemize
+@item Variable names
+@item Variable labels
+@item Value labels
+@end itemize
+Conversely, when writing system files, these are converted back to the
+encoding of that system file.
 
-For example, in order to display a string variable's value in a label widget in the psppire gui one would use code similar to
-@example
-
-struct variable *var = /* assigned from somewhere */
-struct case c = /* from somewhere else */
-
-const union value *val = case_data (&c, var);
-
-char *utf8string = recode_string (UTF8, dict_get_encoding (dict), val->s,
-                                  var_get_width (var));
-
-GtkWidget *entry = gtk_entry_new();
-gtk_entry_set_text (entry, utf8string);
-gtk_widget_show (entry);
-
-free (utf8string);
-
-@end example
+String data stored in union values are left in their original encoding.
+These will be converted by the data_in/data_out functions.