X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fdev%2Fconcepts.texi;h=06652d62653b1ec21ccdbc387f8993ae8f762b1f;hb=dc4da1f8120bddad12c1326714438f05b594e6e1;hp=29d2b584d68532d7308739dfd29f7ed45750e3f5;hpb=c41f14854e73ad44824b54933ae96eb52f781fc2;p=pspp-builds.git diff --git a/doc/dev/concepts.texi b/doc/dev/concepts.texi index 29d2b584..06652d62 100644 --- a/doc/dev/concepts.texi +++ b/doc/dev/concepts.texi @@ -117,76 +117,88 @@ case when it processes it later. @subsection Runtime Typed Values When a value's type is only known at runtime, it is often represented -as a @union{value}, defined in @file{data/value.h}. @union{value} has -two members: a @code{double} named @samp{f} to store a numeric value -and an array of @code{char} named @samp{s} to a store a string value. -A @union{value} does not identify the type or width of the data it -contains. Code that works with @union{values}s must therefore have -external knowledge of its content, often through the type and width of -a @struct{variable} (@pxref{Variables}). - -@cindex MAX_SHORT_STRING -@cindex short string -@cindex long string -@cindex string value -The array of @code{char} in @union{value} has only a small, fixed -capacity of @code{MAX_SHORT_STRING} bytes. A value that -fits within this capacity is called a @dfn{short string}. Any wider -string value, which must be represented by more than one -@union{value}, is called a @dfn{long string}. - -@deftypefn Macro int MAX_SHORT_STRING -Maximum width of a short string value, never less than 8 bytes. It is -wider than 8 bytes on systems where @code{double} is either larger -than 8 bytes or has stricter alignment than 8 bytes. -@end deftypefn +as a @union{value}, defined in @file{data/value.h}. A @union{value} +does not identify the type or width of the data it contains. 
Code +that works with @union{value}s must therefore have external knowledge +of their content, often through the type and width of a +@struct{variable} (@pxref{Variables}). + +@union{value} has one member that clients are permitted to access +directly, a @code{double} named @samp{f} that stores the content of a +numeric @union{value}. It has other members that store the content of +string @union{value}s, but client code should use accessor functions +instead of referring to these directly. + +PSPP provides some functions for working with @union{value}s. The +most useful are described below. To use these functions, recall that +a numeric value has a width of 0. -@deftypefn Macro int MIN_LONG_STRING -Minimum width of a long string value, that is, @code{MAX_SHORT_STRING -+ 1}. -@end deftypefn +@deftypefun void value_init (union value *@var{value}, int @var{width}) +Initializes @var{value} as a value of the given @var{width}. After +initialization, the data in @var{value} are indeterminate; the caller +is responsible for storing initial data in it. +@end deftypefun -Long string variables are slightly harder to work with than short -string values, because they cannot be conveniently and efficiently -allocated as block scope variables or structure members. The PSPP -language exposes this inconvenience to the user: there are many -circumstances in PSPP syntax where short strings are allowed but not -long strings. Short string variables, for example, may have -user-missing values, but long string variables may not (@pxref{Missing -Observations,,,pspp, PSPP Users Guide}). +@deftypefun void value_destroy (union value *@var{value}, int @var{width}) +Frees auxiliary storage associated with @var{value}, which must have +the given @var{width}. +@end deftypefun -PSPP provides a few functions for working with @union{value}s. The -most useful are described below. To use these functions, recall that -a numeric value has a width of 0.
+@deftypefun bool value_needs_init (int @var{width}) +For some widths, @func{value_init} and @func{value_destroy} do not +actually do anything, because no additional storage is needed beyond +the size of @union{value}. This function returns true if @var{width} +is such a width, in which case there is no actual need to call those +functions. This can be a useful optimization if a large number of +@union{value}s of such a width are to be initialized or destroyed. -@deftypefun size_t value_cnt_from_width (int @var{width}) -Returns the number of consecutive @union{value}s that must be -allocated to store a value of the given @var{width}. For a numeric or -short string value, the return value is 1; for long string -variables, it is greater than 1. +This function returns false if @func{value_init} and +@func{value_destroy} are actually required for the given @var{width}. +@end deftypefun + +@deftypefun double value_num (const union value *@var{value}) +Returns the numeric value in @var{value}, which must have been +initialized as a numeric value. Equivalent to @code{@var{value}->f}. +@end deftypefun + +@deftypefun {const char *} value_str (const union value *@var{value}, int @var{width}) +@deftypefunx {char *} value_str_rw (union value *@var{value}, int @var{width}) +Returns the string value in @var{value}, which must have been +initialized with positive width @var{width}. The string returned is +not null-terminated. Only @var{width} bytes of returned data may be +accessed. + +The two different functions exist only for @code{const}-correctness. +Otherwise they are identical. + +It is important that @var{width} be the correct value that was passed +to @func{value_init}. Passing a smaller or larger value (e.g.@: +because that number of bytes will be accessed) will not always work +and should be avoided.
@end deftypefun @deftypefun void value_copy (union value *@var{dst}, @ const union value *@var{src}, @ int @var{width}) -Copies a value of the given @var{width} from the @union{value} array -starting at @var{src} to the one starting at @var{dst}. The two -arrays must not overlap. +Copies the contents of @union{value} @var{src} to @var{dst}. Both +@var{dst} and @var{src} must have been initialized with the specified +@var{width}. @end deftypefun @deftypefun void value_set_missing (union value *@var{value}, int @var{width}) Sets @var{value} to @code{SYSMIS} if it is numeric or to all spaces if -it is alphanumeric, according to @var{width}. @var{value} must point -to the start of a @union{value} array of the given @var{width}. +it is alphanumeric, according to @var{width}. @var{value} must have +been initialized with the specified @var{width}. @end deftypefun @anchor{value_is_resizable} @deftypefun bool value_is_resizable (const union value *@var{value}, int @var{old_width}, int @var{new_width}) -Determines whether @var{value} may be resized from @var{old_width} to -@var{new_width}. Resizing is possible if the following criteria are -met. First, @var{old_width} and @var{new_width} must be both numeric -or both string widths. Second, if @var{new_width} is a short string -width and less than @var{old_width}, resizing is allowed only if bytes +Determines whether @var{value}, which must have been initialized with +the specified @var{old_width}, may be resized to @var{new_width}. +Resizing is possible if the following criteria are met. First, +@var{old_width} and @var{new_width} must be both numeric or both +string widths. Second, if @var{new_width} is a short string width and +less than @var{old_width}, resizing is allowed only if bytes @var{new_width} through @var{old_width} in @var{value} contain only spaces. 
@@ -196,9 +208,36 @@ These rules are part of those used by @func{mv_is_resizable} and @deftypefun void value_resize (union value *@var{value}, int @var{old_width}, int @var{new_width}) Resizes @var{value} from @var{old_width} to @var{new_width}, which -must be allowed by the rules stated above. This has an effect only if -@var{new_width} is greater than @var{old_width}, in which case the -bytes newly added to @var{value} are cleared to spaces. +must be allowed by the rules stated above. @var{value} must have been +initialized with the specified @var{old_width} before calling this +function. After resizing, @var{value} has width @var{new_width}. + +If @var{new_width} is greater than @var{old_width}, @var{value} will +be padded on the right with spaces to the new width. If +@var{new_width} is less than @var{old_width}, the rightmost bytes of +@var{value} are truncated. +@end deftypefun + +@deftypefun bool value_equal (const union value *@var{a}, const union value *@var{b}, int @var{width}) +Compares @var{a} and @var{b}, which must both have width +@var{width}. Returns true if their contents are the same, false if +they differ. +@end deftypefun + +@deftypefun int value_compare_3way (const union value *@var{a}, const union value *@var{b}, int @var{width}) +Compares @var{a} and @var{b}, which must both have width +@var{width}. Returns -1 if @var{a} is less than @var{b}, 0 if they +are equal, or 1 if @var{a} is greater than @var{b}. + +Numeric values are compared numerically, with @code{SYSMIS} comparing +less than any real number. String values are compared +lexicographically byte-by-byte. +@end deftypefun + +@deftypefun size_t value_hash (const union value *@var{value}, int @var{width}, unsigned int @var{basis}) +Computes and returns a hash of @var{value}, which must have the +specified @var{width}. The value in @var{basis} is folded into the +hash. @end deftypefun @node Input and Output Formats @@ -259,6 +298,8 @@ the data in fields represented by formats.
These functions construct @struct{fmt_spec}s and verify that they are valid. + + @deftypefun {struct fmt_spec} fmt_for_input (enum fmt_type @var{type}, int @var{w}, int @var{d}) @deftypefunx {struct fmt_spec} fmt_for_output (enum fmt_type @var{type}, int @var{w}, int @var{d}) Constructs a @struct{fmt_spec} with the given @var{type}, @var{w}, and @@ -340,6 +381,10 @@ identical, false otherwise. @var{format} need not be a valid input or output format specifier. @end deftypefun +@deftypefun void fmt_resize (struct fmt_spec *@var{fmt}, int @var{width}) +Adjusts @var{fmt} so that it is a valid format for a @union{value} of width @var{width}. +@end deftypefun + @node Obtaining Properties of Format Types @subsection Obtaining Properties of Format Types @@ -552,16 +597,29 @@ equal to @code{decimal}, or it may be set to 0 to disable grouping. The following functions are provided for working with numeric formatting styles. -@deftypefun {struct fmt_number_style *} fmt_number_style_create (void) -Creates and returns a new @struct{fmt_number_style} with all of the +@deftypefun void fmt_number_style_init (struct fmt_number_style *@var{style}) +Initializes a @struct{fmt_number_style} with all of the prefixes and suffixes set to the empty string, @samp{.} as the decimal point character, and grouping disables. @end deftypefun + @deftypefun void fmt_number_style_destroy (struct fmt_number_style *@var{style}) Destroys @var{style}, freeing its storage. @end deftypefun +@deftypefun {struct fmt_number_style *} fmt_create (void) +Creates an array of all the styles used by PSPP and +calls @func{fmt_number_style_init} on each of them. +@end deftypefun + +@deftypefun void fmt_done (struct fmt_number_style *@var{styles}) +A wrapper function that takes an array of @struct{fmt_number_style}, calls +@func{fmt_number_style_destroy} on each of them, and then frees the array.
+@end deftypefun + + + @deftypefun int fmt_affix_width (const struct fmt_number_style *@var{style}) Returns the total length of @var{style}'s @code{prefix} and @code{suffix}. @end deftypefun @@ -579,31 +637,16 @@ work with these global styles: Returns the numeric style for the given format @var{type}. @end deftypefun -@deftypefun void fmt_set_style (enum fmt_type @var{type}, struct fmt_number_style *@var{style}) -Replaces the current numeric style for format @var{type} by the given -@var{style}, which becomes owned by the callee. @var{type} must be a -custom currency format and @var{style} must follow all the rules for -numeric styles explained above. +@deftypefun void fmt_check_style (const struct fmt_number_style *@var{style}) +Asserts that @var{style} is self-consistent. @end deftypefun -@deftypefun int fmt_decimal_char (enum fmt_type @var{type}) -Returns the decimal point character for the given format @var{type}. -Equivalent to @code{fmt_get_style (@var{type})->decimal}. -@end deftypefun -@deftypefun int fmt_grouping_char (enum fmt_type @var{type}) -Returns the grouping character for the given format @var{type}, or 0 -if @var{type} output should not be grouped. Equivalent to -@code{fmt_get_style (@var{type})->grouping}. +@deftypefun {const char *} fmt_name (enum fmt_type @var{type}) +Returns the name of the given format @var{type}. @end deftypefun -@deftypefun void fmt_set_decimal (char @var{decimal}) -Changes the decimal point character for the basic numeric formats to -@var{decimal}, which must be @samp{.} or @samp{,}. The F, E, COMMA, -DOLLAR, and PCT will use the specified decimal point character, and the -opposite character for grouping where appropriate. The DOT format -uses the reverse choices. -@end deftypefun + + @node Formatted Data Input and Output @subsection Formatted Data Input and Output @@ -611,18 +654,17 @@ uses the reverse choices. These functions provide the ability to convert data fields into @union{value}s and vice versa.
-@deftypefun bool data_in (struct substring @var{input}, enum legacy_encoding @var{legacy_encoding}, enum fmt_type @var{type}, int @var{implied_decimals}, int @var{first_column}, union value *@var{output}, int @var{width}) +@deftypefun bool data_in (struct substring @var{input}, const char *@var{encoding}, enum fmt_type @var{type}, int @var{implied_decimals}, int @var{first_column}, const struct dictionary *@var{dict}, union value *@var{output}, int @var{width}) Parses @var{input} as a field containing data in the given format -@var{type}. The resulting value is stored in @var{output}, which has -the given @var{width}. For consistency, @var{width} must be 0 if +@var{type}. The resulting value is stored in @var{output}, which the +caller must have initialized with the given @var{width}. For +consistency, @var{width} must be 0 if @var{type} is a numeric format type and greater than 0 if @var{type} is a string format type. - -Ordinarily @var{legacy_encoding} should be @code{LEGACY_NATIVE}, -indicating that @var{input} is encoded in the character set -conventionally used on the host machine. It may be set to -@code{LEGACY_EBCDIC} to cause @var{input} to be re-encoded from EBCDIC -during data parsing. +@var{encoding} should be set to indicate the character +encoding of @var{input}. +@var{dict} must be a pointer to the dictionary with which @var{output} +is associated. If @var{input} is the empty string (with length 0), @var{output} is set to the value set on SET BLANKS (@pxref{SET BLANKS,,,pspp, PSPP @@ -657,21 +699,15 @@ not propagated to the caller as errors. This function is declared in @file{data/data-in.h}. 
@end deftypefun -@deftypefun void data_out (const union value *@var{input}, const struct fmt_spec *@var{format}, char *@var{output}) -@deftypefunx void data_out_legacy (const union value *@var{input}, enum legacy_encoding @var{legacy_encoding}, const struct fmt_spec *@var{format}, char *@var{output}) -Converts the data pointed to by @var{input} into a data field in -@var{output} according to output format specifier @var{format}, which -must be a valid output format. Exactly @code{@var{format}->w} bytes -are written to @var{output}. The width of @var{input} is also +@deftypefun {char *} data_out (const union value *@var{input}, const struct fmt_spec *@var{format}) +@deftypefunx {char *} data_out_legacy (const union value *@var{input}, const char *@var{encoding}, const struct fmt_spec *@var{format}) +Converts the data pointed to by @var{input} into a string value, which +will be encoded in UTF-8, according to output format specifier @var{format}. +@var{format} +must be a valid output format. The width of @var{input} is inferred from @var{format} using an algorithm equivalent to @func{fmt_var_width}. -If @func{data_out} is called, or @func{data_out_legacy} is called with -@var{legacy_encoding} set to @code{LEGACY_NATIVE}, @var{output} will -be encoded in the character set conventionally used on the host -machine. If @var{legacy_encoding} is set to @code{LEGACY_EBCDIC}, -@var{output} will be re-encoded from EBCDIC during data output. - When @var{input} contains data that cannot be represented in the given @var{format}, @func{data_out} may output a message using @func{msg}, @c (@pxref{msg}), @@ -699,28 +735,7 @@ variable, is most conveniently executed through functions on A @struct{missing_values} is essentially a set of @union{value}s that have a common value width (@pxref{Values}). For a set of missing values associated with a variable (the common case), the set's -width is the same as the variable's width.
The contents of a set of -missing values is subject to some restrictions. Regardless of width, -a set of missing values is allowed to be empty. Otherwise, its -possible contents depend on its width: - -@table @asis -@item 0 (numeric values) -Up to three discrete numeric values, or a range of numeric values -(which includes both ends of the range), or a range plus one discrete -numeric value. - -@item 1@dots{}@t{MAX_SHORT_STRING} - 1 (short string values) -Up to three discrete string values (with the same width as the set). - -@item @t{MAX_SHORT_STRING}@dots{}@t{MAX_STRING} (long string values) -Always empty. -@end table - -These somewhat arbitrary restrictions are the same as those imposed by -SPSS. In PSPP we could easily eliminate these restrictions, but doing -so would also require us to extend the system file format in an -incompatible way, which we consider a bad tradeoff. +width is the same as the variable's width. Function prototypes and other declarations related to missing values are declared in @file{data/missing-values.h}. @@ -729,18 +744,37 @@ are declared in @file{data/missing-values.h}. Opaque type that represents a set of missing values. @end deftp +The contents of a set of missing values is subject to some +restrictions. Regardless of width, a set of missing values is allowed +to be empty. A set of numeric missing values may contain up to three +discrete numeric values, or a range of numeric values (which includes +both ends of the range), or a range plus one discrete numeric value. +A set of string missing values may contain up to three discrete string +values (with the same width as the set), but ranges are not supported. + +In addition, values in string missing values wider than +@code{MV_MAX_STRING} bytes may contain non-space characters only in +their first @code{MV_MAX_STRING} bytes; all the bytes after the first +@code{MV_MAX_STRING} must be spaces. @xref{mv_is_acceptable}, for a +function that tests a value against these constraints. 
+ +@deftypefn Macro int MV_MAX_STRING +Number of bytes in a string missing value that are not required to be +spaces. The current value is 8, a value which is fixed by the system +file format. In PSPP we could easily eliminate this restriction, but +doing so would also require us to extend the system file format in an +incompatible way, which we consider a bad tradeoff. +@end deftypefn + The most often useful functions for missing values are those for testing whether a given value is missing, described in the following section. Several other functions for creating, inspecting, and modifying @struct{missing_values} objects are described afterward, but -these functions are much more rarely useful. No function for -destroying a @struct{missing_values} is provided, because -@struct{missing_values} does not contain any pointers or other -references to resources that need deallocation. +these functions are much more rarely useful. @menu * Testing for Missing Values:: -* Initializing User-Missing Value Sets:: +* Creating and Destroying User-Missing Values:: * Changing User-Missing Value Set Width:: * Inspecting User-Missing Value Sets:: * Modifying User-Missing Value Sets:: @@ -792,8 +826,10 @@ missing. @end deftp @end deftypefun -@node Initializing User-Missing Value Sets -@subsection Initializing User-Missing Value Sets +@node Creating and Destroying User-Missing Values +@subsection Creation and Destruction + +These functions create and destroy @struct{missing_values} objects. @deftypefun void mv_init (struct missing_values *@var{mv}, int @var{width}) Initializes @var{mv} as a set of user-missing values. The set is @@ -801,6 +837,10 @@ initially empty. Any values added to it must have the specified @var{width}. @end deftypefun +@deftypefun void mv_destroy (struct missing_values *@var{mv}) +Destroys @var{mv}, which must not be referred to again. 
+@end deftypefun + @deftypefun void mv_copy (struct missing_values *@var{mv}, const struct missing_values *@var{old}) Initializes @var{mv} as a copy of the existing set of user-missing values @var{old}. @@ -830,11 +870,9 @@ the required width, may be used instead. Tests whether @var{mv}'s width may be changed to @var{new_width} using @func{mv_resize}. Returns true if it is allowed, false otherwise. -If @var{new_width} is a long string width, @var{mv} may be resized -only if it is empty. Otherwise, if @var{mv} contains any missing -values, then it may be resized only if each missing value may be -resized, as determined by @func{value_is_resizable} -(@pxref{value_is_resizable}). +If @var{mv} contains any missing values, then it may be resized only +if each missing value may be resized, as determined by +@func{value_is_resizable} (@pxref{value_is_resizable}). @end deftypefun @anchor{mv_resize} @@ -853,8 +891,8 @@ width. These functions inspect the properties and contents of @struct{missing_values} objects. -The first set of functions inspects the discrete values that numeric -and short string sets of user-missing values may contain: +The first set of functions inspects the discrete values that sets of +user-missing values may contain: @deftypefun bool mv_is_empty (const struct missing_values *@var{mv}) Returns true if @var{mv} contains no user-missing values, false if it @@ -879,11 +917,12 @@ values, that is, if @func{mv_n_values} would return nonzero for @var{mv}. @end deftypefun -@deftypefun void mv_get_value (const struct missing_values *@var{mv}, union value *@var{value}, int @var{index}) -Copies the discrete user-missing value in @var{mv} with the given -@var{index} into @var{value}. The index must be less than the number -of discrete user-missing values in @var{mv}, as reported by -@func{mv_n_values}. 
+@deftypefun {const union value *} mv_get_value (const struct missing_values *@var{mv}, int @var{index}) +Returns the discrete user-missing value in @var{mv} with the given +@var{index}. The caller must not modify or free the returned value or +refer to it after modifying or freeing @var{mv}. The index must be +less than the number of discrete user-missing values in @var{mv}, as +reported by @func{mv_n_values}. @end deftypefun The second set of functions inspects the single range of values that @@ -905,7 +944,7 @@ include a range. These functions modify the contents of @struct{missing_values} objects. -The first set of functions applies to all sets of user-missing values: +The next set of functions applies to all sets of user-missing values: @deftypefun bool mv_add_value (struct missing_values *@var{mv}, const union value *@var{value}) @deftypefunx bool mv_add_str (struct missing_values *@var{mv}, const char @var{value}[]) @@ -913,8 +952,8 @@ The first set of functions applies to all sets of user-missing values: Attempts to add the given discrete @var{value} to set of user-missing values @var{mv}. @var{value} must have the same width as @var{mv}. Returns true if @var{value} was successfully added, false if the set -could not accept any more discrete values. (Always returns false if -@var{mv} is a set of long string user-missing values.) +could not accept any more discrete values or if @var{value} is not an +acceptable user-missing value (see @func{mv_is_acceptable} below). These functions are equivalent, except for the form in which @var{value} is provided, so you may use whichever function is most @@ -926,10 +965,22 @@ Removes a discrete value from @var{mv} (which must contain at least one discrete value) and stores it in @var{value}. 
@end deftypefun -@deftypefun void mv_replace_value (struct missing_values *@var{mv}, const union value *@var{value}, int @var{index}) -Replaces the discrete value with the given @var{index} in @var{mv} -(which must contain at least @var{index} + 1 discrete values) with -@var{value}. +@deftypefun bool mv_replace_value (struct missing_values *@var{mv}, const union value *@var{value}, int @var{index}) +Attempts to replace the discrete value with the given @var{index} in +@var{mv} (which must contain at least @var{index} + 1 discrete values) +by @var{value}. Returns true if successful, false if @var{value} is +not an acceptable user-missing value (see @func{mv_is_acceptable} +below). +@end deftypefun + +@deftypefun bool mv_is_acceptable (const union value *@var{value}, int @var{width}) +@anchor{mv_is_acceptable} +Returns true if @var{value}, which must have the specified +@var{width}, may be added to a missing value set of the same +@var{width}, false if it cannot. As described above, all numeric +values and string values of width @code{MV_MAX_STRING} or less may be +added, but string values of greater width may be added only if bytes +beyond the first @code{MV_MAX_STRING} are all spaces. @end deftypefun The second set of functions applies only to numeric sets of @@ -961,12 +1012,7 @@ All of the values in a set of value labels have the same width, which for a set of value labels owned by a variable (the common case) is the same as its variable. -Numeric and short string sets of value labels may contain any number -of entries. Long string sets of value labels may not contain any -value labels at all, due to a corresponding restriction in SPSS. In -PSPP we could easily eliminate this restriction, but doing so would -also require us to extend the system file format in an incompatible -way, which we consider a bad tradeoff. +Sets of value labels may contain any number of entries. It is rarely necessary to interact directly with a @struct{val_labs} object.
Instead, the most common operation, looking up the label for @@ -1047,31 +1093,24 @@ value in it may be resized to that width, as determined by Changes the width of @var{val_labs}'s values to @var{new_width}, which must be a valid new width as determined by @func{val_labs_can_set_width}. - -If @var{new_width} is a long string width, this function deletes all -value labels from @var{val_labs}. @end deftypefun @node Value Labels Adding and Removing Labels @subsection Adding and Removing Labels These functions add and remove value labels from a @struct{val_labs} -object. These functions apply only to numeric and short string sets -of value labels. They have no effect on long string sets of value -labels, since these sets are always empty. +object. @deftypefun bool val_labs_add (struct val_labs *@var{val_labs}, union value @var{value}, const char *@var{label}) Adds @var{label} to in @var{var_labs} as a label for @var{value}, which must have the same width as the set of value labels. Returns -true if successful, false if @var{value} already has a label or if -@var{val_labs} has long string width. +true if successful, false if @var{value} already has a label. @end deftypefun @deftypefun void val_labs_replace (struct val_labs *@var{val_labs}, union value @var{value}, const char *@var{label}) Adds @var{label} to in @var{var_labs} as a label for @var{value}, which must have the same width as the set of value labels. If @var{value} already has a label in @var{var_labs}, it is replaced. -Has no effect if @var{var_labs} has long string width. @end deftypefun @deftypefun bool val_labs_remove (struct val_labs *@var{val_labs}, union value @var{value}) @@ -1084,75 +1123,65 @@ was removed, false otherwise. @subsection Iterating through Value Labels These functions allow iteration through the set of value labels -represented by a @struct{val_labs} object. They are usually used in -the context of a @code{for} loop: +represented by a @struct{val_labs} object. 
They may be used in the +context of a @code{for} loop: @example struct val_labs val_labs; -struct val_labs_iterator *i; -struct val_lab *vl; +const struct val_lab *vl; @dots{} -for (vl = val_labs_first (val_labs, &i); vl != NULL; - vl = val_labs_next (val_labs, &i)) +for (vl = val_labs_first (val_labs); vl != NULL; + vl = val_labs_next (val_labs, vl)) @{ @dots{}@r{do something with @code{vl}}@dots{} @} @end example -The value labels in a @struct{val_labs} must not be modified as it is -undergoing iteration. +Value labels should not be added or deleted from a @struct{val_labs} +as it is undergoing iteration. -@deftp {Structure} {struct val_lab} -Represents a value label for iteration purposes, with two -client-visible members: - -@table @code -@item union value value -Value being labeled, of the same width as the @struct{val_labs} being -iterated. - -@item const char *label -The label, as a null-terminated string. -@end table -@end deftp - -@deftp {Structure} {struct val_labs_iterator} -Opaque object that represents the current state of iteration through a -set of value value labels. Automatically destroyed by successful -completion of iteration. Must be destroyed manually in other -circumstances, by calling @func{val_labs_done}. -@end deftp - -@deftypefun {struct val_lab *} val_labs_first (const struct val_labs *@var{val_labs}, struct val_labs_iterator **@var{iterator}) -If @var{val_labs} contains at least one value label, starts an -iteration through @var{val_labs}, initializes @code{*@var{iterator}} -to point to a newly allocated iterator, and returns the first value -label in @var{val_labs}. If @var{val_labs} is empty, sets -@code{*@var{iterator}} to null and returns a null pointer. +@deftypefun {const struct val_lab *} val_labs_first (const struct val_labs *@var{val_labs}) +Returns the first value label in @var{val_labs}, if it contains at +least one value label, or a null pointer if it does not contain any +value labels.
+@end deftypefun -This function creates iterators that traverse sets of value labels in -no particular order. +@deftypefun {const struct val_lab *} val_labs_next (const struct val_labs *@var{val_labs}, const struct val_lab *@var{vl}) +Returns the value label in @var{val_labs} following @var{vl}, if +@var{vl} is not the last value label in @var{val_labs}, or a null +pointer if there are no value labels following @var{vl}. @end deftypefun -@deftypefun {struct val_lab *} val_labs_first_sorted (const struct val_labs *@var{val_labs}, struct val_labs_iterator **@var{iterator}) -Same as @func{val_labs_first}, except that the created iterator -traverses the set of value labels in ascending order of value. +@deftypefun {const struct val_lab **} val_labs_sorted (const struct val_labs *@var{val_labs}) +Allocates and returns an array of pointers to value labels, which are +sorted in increasing order by value. The array has +@code{val_labs_count (@var{val_labs})} elements. The caller is +responsible for freeing the array with @func{free} (but must not free +any of the @struct{val_lab} elements that the array points to). @end deftypefun -@deftypefun {struct val_lab *} val_labs_next (const struct val_labs *@var{val_labs}, struct val_labs_iterator **@var{iterator}) -Advances an iterator created with @func{val_labs_first} or -@func{val_labs_first_sorted} to the next value label, which is -returned. If the set of value labels is exhausted, returns a null -pointer after freeing @code{*@var{iterator}} and setting it to a null -pointer. +The iteration functions above work with pointers to @struct{val_lab}, +which is an opaque data structure that users of @struct{val_labs} must +not modify or free directly. The following functions work with +objects of this type: + +@deftypefun {const union value *} val_lab_get_value (const struct val_lab *@var{vl}) +Returns the value of value label @var{vl}. The caller must not modify +or free the returned value.
(To achieve a similar result, remove the
+value label with @func{val_labs_remove}, then add the new value with
+@func{val_labs_add}.)
+
+The width of the returned value cannot be determined directly from
+@var{vl}. It may be obtained by calling @func{val_labs_get_width} on
+the @struct{val_labs} that @var{vl} is in.
 @end deftypefun
 
-@deftypefun void val_labs_done (struct val_labs_iterator **@var{iterator})
-Frees @code{*@var{iterator}} and sets it to a null pointer. Does
-not need to be called explicitly if @func{val_labs_next} returns a
-null pointer, indicating that all value labels have been visited.
+@deftypefun {const char *} val_lab_get_label (const struct val_lab *@var{vl})
+Returns the label in @var{vl} as a null-terminated string. The caller
+must not modify or free the returned string. (Use
+@func{val_labs_replace} to change a value label.)
 @end deftypefun
 
 @node Variables
@@ -1276,22 +1305,6 @@ Returns true if @var{var} is an alphanumeric (string) variable, false
 otherwise.
 @end deftypefun
 
-@deftypefun bool var_is_short_string (const struct variable *@var{var})
-Returns true if @var{var} is a string variable of width
-@code{MAX_SHORT_STRING} or less, false otherwise.
-@end deftypefun
-
-@deftypefun bool var_is_long_string (const struct variable *var{var})
-Returns true if @var{var} is a string variable of width greater than
-@code{MAX_SHORT_STRING}, false otherwise.
-@end deftypefun
-
-@deftypefun size_t var_get_value_cnt (const struct variable *@var{var})
-Returns the number of @union{value}s needed to hold an instance of
-variable @var{var}. @code{var_get_value_cnt (var)} is equivalent to
-@code{value_cnt_from_width (var_get_width (var))}.
-@end deftypefun
-
 @node Variable Missing Values
 @subsection Variable Missing Values
 
@@ -1309,8 +1322,8 @@ Tests whether @var{value} is a missing value of the given @var{class}
 for variable @var{var} and returns true if so, false otherwise.
 @func{var_is_num_missing} may only be applied to numeric variables;
 @func{var_is_str_missing} may only be applied to string variables.
-For string variables, @var{value} must contain exactly as many
-characters as @var{var}'s width.
+@var{value} must have been initialized with the same width as
+@var{var}.
 
 @code{var_is_@var{type}_missing (@var{var}, @var{value},
 @var{class})} is equivalent to @code{mv_is_@var{type}_missing
@@ -1335,7 +1348,7 @@ resizable to @var{var}'s width (@pxref{mv_resize}). The caller retains
 ownership of @var{miss}.
 @end deftypefun
 
-b@deftypefun void var_clear_missing_values (struct variable *@var{var})
+@deftypefun void var_clear_missing_values (struct variable *@var{var})
 Clears @var{var}'s missing values. Equivalent to
 @code{var_set_missing_values (@var{var}, NULL)}.
 @end deftypefun
@@ -1356,11 +1369,13 @@ value:
 
 @deftypefun {const char *} var_lookup_value_label (const struct variable *@var{var}, const union value *@var{value})
 Looks for a label for @var{value} in @var{var}'s set of value labels.
-Returns the label if one exists, otherwise a null pointer.
+@var{value} must have the same width as @var{var}. Returns the label
+if one exists, otherwise a null pointer.
 @end deftypefun
 
 @deftypefun void var_append_value_name (const struct variable *@var{var}, const union value *@var{value}, struct string *@var{str})
 Looks for a label for @var{value} in @var{var}'s set of value labels.
+@var{value} must have the same width as @var{var}.
 If a label exists, it will be appended to the string pointed to by
 @var{str}. Otherwise, it formats @var{value} using @var{var}'s
 print format (@pxref{Input and Output Formats})
@@ -1402,20 +1417,19 @@ the variable (making a second copy):
 
 @deftypefun bool var_add_value_label (struct variable *@var{var}, const union value *@var{value}, const char *@var{label})
 Attempts to add a copy of @var{label} as a label for @var{value} for
If @var{value} already has a label, then the old -label is retained. Returns true if a label is added, false if there -was an existing label for @var{value} or if @var{var} is a long string -variable. Either way, the caller retains ownership of @var{value} and -@var{label}. +the given @var{var}. @var{value} must have the same width as +@var{var}. If @var{value} already has a label, then the old label is +retained. Returns true if a label is added, false if there was an +existing label for @var{value}. Either way, the caller retains +ownership of @var{value} and @var{label}. @end deftypefun @deftypefun void var_replace_value_label (struct variable *@var{var}, const union value *@var{value}, const char *@var{label}) Attempts to add a copy of @var{label} as a label for @var{value} for -the given @var{var}. If @var{value} already has a label, then +the given @var{var}. @var{value} must have the same width as +@var{var}. If @var{value} already has a label, then @var{label} replaces the old label. Either way, the caller retains ownership of @var{value} and @var{label}. - -If @var{var} is a long string variable, this function has no effect. @end deftypefun @node Variable Print and Write Formats