X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fdev%2Fi18n.texi;h=876f359bff996e274b66d9b3356d2561692af379;hb=f790dbda9d498eef9c9c0a49078adbeecf768d56;hp=039e32b29f04756ffb989fb94ebb725cbded4f20;hpb=537fdeb3702c011e05d7826a8d556a7beeba2605;p=pspp
diff --git a/doc/dev/i18n.texi b/doc/dev/i18n.texi
index 039e32b29f..876f359bff 100644
--- a/doc/dev/i18n.texi
+++ b/doc/dev/i18n.texi
@@ -11,9 +11,9 @@ in which they are addressed.
 Pspp has three ``working'' locales:
 @itemize
-@item The local of the user interface.
-@item The local of the output.
-@item The local of the data. Only the character encoding is relevant.
+@item The locale of the user interface.
+@item The locale of the output.
+@item The locale of the data. Only the character encoding is relevant.
 @end itemize
 Each of these locales may, at different times take
@@ -27,7 +27,7 @@ It's rarely, if ever, necessary to interrogate the system to find out
 the values of the 3 locales.
 However it's important to be aware of the
 source (destination) locale when reading (writing) string data.
-When transfering data between a source and a destination, the appropriate
+When transferring data between a source and a destination, the appropriate
 recoding must be performed.
@@ -47,13 +47,12 @@ report generated by pspp. Non-data related strings (Eg: ``Page number'',
 @subsection The data locale
 This locale is the one associated with the data being analysed with pspp.
 The only important aspect of this locale is the character encoding.
-@footnote {It might also be desirable for the LC_COLLATE category to be used for the purposes of sorting data.}
+@footnote{It might also be desirable for the LC_COLLATE category to be used for the purposes of sorting data.}
 The dictionary pertaining to the data contains a field denoting the encoding.
 Any string data stored in a @union{value} will be encoded in the
 dictionary's character set.
-
 @section System files
 @file{*.sav} files contain a field which is supposed to identify the encoding
 of the data they contain (@pxref{Machine Integer Info Record}).
@@ -71,7 +70,7 @@ remains unset.
 @section GUI
 The psppire graphic user interface is written using the Gtk+ api, for which
 all strings must be encoded in UTF8.
-All strings passed to the Gtk+/Glib library functions (except for filenames)
+All strings passed to the GTK+/GLib library functions (except for filenames)
 must be UTF-8 encoded otherwise errors will occur.
 Thus, for the purposes of the programming psppire, the user interface locale
 should be assumed to be UTF8, even if setlocale and/or nl_langinfo
@@ -81,12 +80,12 @@ indicates otherwise.
 The GLib API has some special functions for dealing with filenames.
 Strings returned from functions like gtk_file_chooser_dialog_get_name are not,
 in general, encoded in UTF8, but in ``filename'' encoding.
-If that filename is passed to another Glib function which expects a filename,
+If that filename is passed to another GLib function which expects a filename,
 no conversion is necessary.
 If it's passed to a function for the purposes of displaying it (eg. in a
 window's title-bar) it must be converted to UTF8 --- there is a special
 function for this: g_filename_display_name or g_filename_basename.
-If however, a filename needs to be passed outside of Gtk/Glib (for example to fopen) it must be converted to the local system encoding.
+If however, a filename needs to be passed outside of GTK+/GLib (for example to fopen) it must be converted to the local system encoding.
 @section Existing locale handling functions
@@ -103,25 +102,20 @@ It is the caller's responsibility to free the returned string when no
 longer required.
 @end deftypefun
+In order to minimise the number of conversions required, and to simplify
+design, PSPP attempts to store all internal strings in UTF8 encoding.
+Thus, when reading system and portable files (or any other data source),
+the following items are immediately converted to UTF8 encoding:
+@itemize
+@item Variable names
+@item Variable labels
+@item Value labels
+@end itemize
+Conversely, when writing system files, these are converted back to the
+encoding of that system file.
-For example, in order to display a string variable's value in a label widget in the psppire gui one would use code similar to
-@example
-
-struct variable *var = /* assigned from somewhere */
-struct case c = /* from somewhere else */
-
-const union value *val = case_data (&c, var);
-
-char *utf8string = recode_string (UTF8, dict_get_encoding (dict), val->s,
-                                  var_get_width (var));
-
-GtkWidget *entry = gtk_entry_new();
-gtk_entry_set_text (entry, utf8string);
-gtk_widget_show (entry);
-
-free (utf8string);
-
-@end example
+String data stored in union values are left in their original encoding.
+These will be converted by the data_in/data_out functions.
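
The patched text says that string data must be recoded whenever it moves between a source and a destination encoding, and that the caller frees the returned string. The following is a minimal sketch of such a recoding step. It does not use PSPP's own recode_string; instead it assumes POSIX iconv is available. The name sketch_recode is hypothetical, and its argument order (destination encoding, source encoding, text, length) is only modelled on the call shown in the example removed by this patch. The output buffer is sized for conversions to UTF-8; other targets may not fit, in which case the function returns NULL.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of PSPP): converts the first LEN bytes
   of TEXT from encoding FROM to encoding TO using POSIX iconv.
   Returns a newly allocated, null-terminated string which the caller
   must free, or NULL if the conversion fails. */
char *
sketch_recode (const char *to, const char *from, const char *text, size_t len)
{
  assert (to != NULL && from != NULL && text != NULL);

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return NULL;                /* Unknown or unsupported encoding pair. */

  /* Assumption: four output bytes per input byte is enough.  This holds
     for conversions into UTF-8 but not necessarily for other targets. */
  size_t out_size = len * 4 + 1;
  char *out = malloc (out_size);
  if (out == NULL)
    {
      iconv_close (cd);
      return NULL;
    }

  char *in_ptr = (char *) text; /* iconv takes a non-const pointer. */
  size_t in_left = len;
  char *out_ptr = out;
  size_t out_left = out_size - 1;       /* Reserve room for the terminator. */

  /* Convert the input, then flush any pending shift sequence. */
  if (iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t) -1
      || iconv (cd, NULL, NULL, &out_ptr, &out_left) == (size_t) -1)
    {
      iconv_close (cd);
      free (out);
      return NULL;
    }

  iconv_close (cd);
  *out_ptr = '\0';
  return out;
}
```

A caller displaying data in the GUI would request "UTF-8" as the destination and pass the dictionary's encoding as the source, then free the result once the widget has copied it.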