From: Ben Pfaff Date: Sun, 2 May 2010 02:09:04 +0000 (-0700) Subject: Implement MRSETS command. X-Git-Tag: v0.7.5~45 X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=c5ad65b0351ab1d897eb072eeaec06fb37802b01;p=pspp-builds.git Implement MRSETS command. Thanks to Tom Wilson for reporting the format of the MRSETS record in system files, and to John Darrington for examples of additional types of records. --- diff --git a/NEWS b/NEWS index f1a0ee4e..ce98d876 100644 --- a/NEWS +++ b/NEWS @@ -1,10 +1,14 @@ PSPP NEWS -- history of user-visible changes. -Time-stamp: <2009-12-05 20:44:30 blp> -Copyright (C) 1996-9, 2000, 2008, 2009 Free Software Foundation, Inc. +Time-stamp: <2010-04-25 21:08:59 blp> +Copyright (C) 1996-9, 2000, 2008, 2009, 2010 Free Software Foundation, Inc. See the end for copying conditions. Please send PSPP bug reports to bug-gnu-pspp@gnu.org. +Changes from 0.7.3 to 0.7.4: + + * The MRSETS command is now implemented. + Changes from 0.7.2 to 0.7.3: * Charts are now produced with Cairo and Pango, instead of libplot. diff --git a/doc/dev/system-file-format.texi b/doc/dev/system-file-format.texi index d1d00873..1309cb79 100644 --- a/doc/dev/system-file-format.texi +++ b/doc/dev/system-file-format.texi @@ -78,6 +78,7 @@ Each type of record is described separately below. * Document Record:: * Machine Integer Info Record:: * Machine Floating-Point Info Record:: +* Multiple Response Sets Records:: * Variable Display Parameter Record:: * Long Variable Names Record:: * Very Long String Record:: @@ -595,6 +596,129 @@ The value used for HIGHEST in missing values. The value used for LOWEST in missing values. @end table +@node Multiple Response Sets Records +@section Multiple Response Sets Records + +The system file format has two different types of records that +represent multiple response sets (@pxref{MRSETS,,,pspp, PSPP Users +Guide}). The first type of record describes multiple response sets +that can be understood by SPSS before version 14. 
The second type of +record, with a closely related format, is used for multiple dichotomy +sets that use the CATEGORYLABELS=COUNTEDVALUES feature added in +version 14. + +@example +/* @r{Header.} */ +int32 rec_type; +int32 subtype; +int32 size; +int32 count; + +/* @r{Exactly @code{count} bytes of data.} */ +char mrsets[]; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 7. + +@item int32 subtype; +Record subtype. Set to 7 for records that describe multiple response +sets understood by SPSS before version 14, or to 19 for records that +describe dichotomy sets that use the CATEGORYLABELS=COUNTEDVALUES +feature added in version 14. + +@item int32 size; +The size of each element in the @code{mrsets} member. Always set to 1. + +@item int32 count; +The total number of bytes in @code{mrsets}. + +@item char mrsets[]; +A series of multiple response sets, each of which consists of the +following: + +@itemize @bullet +@item +The set's name (an identifier that begins with @samp{$}). + +@item +An equals sign (@samp{=}). + +@item +@samp{C} for a multiple category set, @samp{D} for a multiple +dichotomy set with CATEGORYLABELS=VARLABELS, or @samp{E} for a +multiple dichotomy set with CATEGORYLABELS=COUNTEDVALUES. + +@item +For a multiple dichotomy set with CATEGORYLABELS=COUNTEDVALUES, a +space, followed by a number expressed as decimal digits, followed by a +space. If LABELSOURCE=VARLABEL was specified on MRSETS, then the +number is 11; otherwise it is 1.@footnote{This part of the format may +not be fully understood, because only a single example of each +possibility has been examined.} + +@item +For either kind of multiple dichotomy set, the counted value, as a +positive integer count specified as decimal digits, followed by a +space, followed by as many string bytes as specified in the count. If +the set contains numeric variables, the string consists of the counted +integer value expressed as decimal digits. 
If the set contains string +variables, the string contains the counted string value. Either way, +the string may be padded on the right with spaces (older versions of +SPSS seem to always pad to a width of 8 bytes; newer versions don't). + +@item +A space. + +@item +The multiple response set's label, using the same format as for the +counted value for multiple dichotomy sets. A string of length 0 means +that the set does not have a label. A string of length 0 is also +written if LABELSOURCE=VARLABEL was specified. + +@item +A space. + +@item +The names of the variables in the set, each separated from the +previous by a single space. + +@item +A line feed (byte 0x0a). +@end itemize +@end table + +Example: Given appropriate variable definitions, consider the +following MRSETS command: + +@example +MRSETS /MCGROUP NAME=$a LABEL='my mcgroup' VARIABLES=a b c + /MDGROUP NAME=$b VARIABLES=g e f d VALUE=55 + /MDGROUP NAME=$c LABEL='mdgroup #2' VARIABLES=h i j VALUE='Yes' + /MDGROUP NAME=$d LABEL='third mdgroup' CATEGORYLABELS=COUNTEDVALUES + VARIABLES=k l m VALUE=34 + /MDGROUP NAME=$e CATEGORYLABELS=COUNTEDVALUES LABELSOURCE=VARLABEL + VARIABLES=n o p VALUE='choice'. +@end example + +The above would generate the following multiple response set record of +subtype 7: + +@example +$a=C 10 my mcgroup a b c +$b=D2 55 0 g e f d +$c=D3 Yes 10 mdgroup #2 h i j +@end example + +It would also generate the following multiple response set record with +subtype 19: + +@example +$d=E 1 2 34 13 third mdgroup k l m +$e=E 11 6 choice 0 n o p +@end example + @node Variable Display Parameter Record @section Variable Display Parameter Record diff --git a/doc/variables.texi b/doc/variables.texi index c9e270bf..6b4e574e 100644 --- a/doc/variables.texi +++ b/doc/variables.texi @@ -12,6 +12,7 @@ several utility functions for examining and adjusting them. * LEAVE:: Don't clear variables between cases. * MISSING VALUES:: Set missing values for variables. 
* MODIFY VARS:: Rename, reorder, and drop variables. +* MRSETS:: Add, modify, and list multiple response sets. * NUMERIC:: Create new numeric variables. * PRINT FORMATS:: Set variable print formats. * RENAME VARIABLES:: Rename variables. @@ -293,6 +294,113 @@ Formats}); otherwise, the default is F8.2. Variables created with @cmd{NUMERIC} are initialized to the system-missing value. +@node MRSETS +@section MRSETS +@vindex MRSETS + +@display +MRSETS + /MDGROUP NAME=name VARIABLES=var_list VALUE=value + [CATEGORYLABELS=@{VARLABELS,COUNTEDVALUES@}] + [@{LABEL='label',LABELSOURCE=VARLABEL@}] + + /MCGROUP NAME=name VARIABLES=var_list [LABEL='label'] + + /DELETE NAME=@{[names],ALL@} + + /DISPLAY NAME=@{[names],ALL@} +@end display + +@cmd{MRSETS} creates, modifies, deletes, and displays multiple +response sets. A multiple response set is a set of variables that +represent multiple responses to a single survey question in one of the +two following ways: + +@itemize @bullet +@item +A @dfn{multiple dichotomy set} is analogous to a survey question with +a set of checkboxes. Each variable in the set is treated in a Boolean +fashion: one value (the "counted value") means that the box was +checked, and any other value means that it was not. + +@item +A @dfn{multiple category set} represents a survey question where the +respondent is instructed to list up to @var{n} choices. Each variable +represents one of the responses. +@end itemize + +Any number of subcommands may be specified in any order. + +The MDGROUP subcommand creates a new multiple dichotomy set or +replaces an existing multiple response set. The NAME, VARIABLES, and +VALUE specifications are required. The others are optional: + +@itemize @bullet +@item +NAME specifies the name used in syntax for the new multiple dichotomy +set. The name must begin with @samp{$}; it must otherwise follow the +rules for identifiers (@pxref{Tokens}). + +@item +VARIABLES specifies the variables that belong to the set. 
At least +two variables must be specified. The variables must be all string or +all numeric. + +@item +VALUE specifies the counted value. If the variables are numeric, the +value must be an integer. If the variables are strings, then the +value must be a string that is no longer than the shortest of the +variables in the set (ignoring trailing spaces). + +@item +CATEGORYLABELS optionally specifies the source of the labels for each +category in the set: + +@itemize @minus +@item +VARLABELS, the default, uses variable labels or, for variables without +variable labels, variable names. PSPP warns if two variables have the +same variable label, since these categories cannot be distinguished in +output. + +@item +COUNTEDVALUES instead uses each variable's value label for the counted +value. PSPP warns if two variables have the same value label for the +counted value or if one of the variables lacks a value label, since +such categories cannot be distinguished in output. +@end itemize + +@item +LABEL optionally specifies a label for the multiple response set. If +neither LABEL nor LABELSOURCE=VARLABEL is specified, the set is +unlabeled. + +@item +LABELSOURCE=VARLABEL draws the multiple response set's label from the +first variable label among the variables in the set; if none of the +variables has a label, the name of the first variable is used. +LABELSOURCE=VARLABEL must be used with CATEGORYLABELS=COUNTEDVALUES. +It is mutually exclusive with LABEL. +@end itemize + +The MCGROUP subcommand creates a new multiple category set or +replaces an existing multiple response set. The NAME and VARIABLES +specifications are required, and LABEL is optional. Their meanings +are as described above for MDGROUP. PSPP warns if two variables in the +set have different value labels for a single value, since each of the +variables in the set should have the same possible categories. + +The DELETE subcommand deletes multiple response groups. 
A list of +groups may be named within a set of required square brackets, or ALL +may be used to delete all groups. + +The DISPLAY subcommand displays information about defined multiple +response sets. Its syntax is the same as the DELETE subcommand. + +Multiple response sets are saved to and read from system files by, +e.g., the @cmd{SAVE} and @cmd{GET} commands. Otherwise, multiple +response sets are currently used only by third-party software. + @node PRINT FORMATS @section PRINT FORMATS @vindex PRINT FORMATS diff --git a/src/data/automake.mk b/src/data/automake.mk index efa0834f..e508ff7e 100644 --- a/src/data/automake.mk +++ b/src/data/automake.mk @@ -73,6 +73,8 @@ src_data_libdata_la_SOURCES = \ src/data/missing-values.h \ src/data/make-file.c \ src/data/make-file.h \ + src/data/mrset.c \ + src/data/mrset.h \ src/data/procedure.c \ src/data/procedure.h \ src/data/por-file-reader.c \ diff --git a/src/data/dictionary.c b/src/data/dictionary.c index 03548c44..1976ec03 100644 --- a/src/data/dictionary.c +++ b/src/data/dictionary.c @@ -18,6 +18,7 @@ #include "data/dictionary.h" +#include #include #include @@ -25,6 +26,7 @@ #include "data/case.h" #include "data/category.h" #include "data/identifier.h" +#include "data/mrset.h" #include "data/settings.h" #include "data/value-labels.h" #include "data/vardict.h" @@ -66,6 +68,8 @@ struct dictionary struct vector **vector; /* Vectors of variables. */ size_t vector_cnt; /* Number of vectors. */ struct attrset attributes; /* Custom attributes. */ + struct mrset **mrsets; /* Multiple response sets. */ + size_t n_mrsets; /* Number of multiple response sets. 
*/ char *encoding; /* Character encoding of string data */ @@ -77,6 +81,8 @@ struct dictionary void *changed_data; }; +static void dict_unset_split_var (struct dictionary *, struct variable *); +static void dict_unset_mrset_var (struct dictionary *, struct variable *); void dict_set_encoding (struct dictionary *d, const char *enc) @@ -222,6 +228,20 @@ dict_clone (const struct dictionary *s) dict_set_attributes (d, dict_get_attributes (s)); + for (i = 0; i < s->n_mrsets; i++) + { + const struct mrset *old = s->mrsets[i]; + struct mrset *new; + size_t j; + + /* Clone old mrset, then replace vars from D by vars from S. */ + new = mrset_clone (old); + for (j = 0; j < new->n_vars; j++) + new->vars[j] = dict_lookup_var_assert (d, var_get_name (new->vars[j])); + + dict_add_mrset (d, new); + } + return d; } @@ -278,6 +298,7 @@ dict_destroy (struct dictionary *d) dict_clear (d); hmap_destroy (&d->name_map); attrset_destroy (&d->attributes); + free (d->mrsets); free (d); } } @@ -577,6 +598,7 @@ dict_delete_var (struct dictionary *d, struct variable *v) var_clear_aux (v); dict_unset_split_var (d, v); + dict_unset_mrset_var (d, v); if (d->weight == v) dict_set_weight (d, NULL); @@ -1153,7 +1175,7 @@ dict_get_split_cnt (const struct dictionary *d) /* Removes variable V, which must be in D, from D's set of split variables. */ -void +static void dict_unset_split_var (struct dictionary *d, struct variable *v) { int orig_count; @@ -1362,7 +1384,138 @@ dict_clear_vectors (struct dictionary *d) d->vector = NULL; d->vector_cnt = 0; } + +/* Multiple response sets. */ + +/* Returns the multiple response set in DICT with index IDX, which must be + between 0 and the count returned by dict_get_n_mrsets(), exclusive. */ +const struct mrset * +dict_get_mrset (const struct dictionary *dict, size_t idx) +{ + assert (idx < dict->n_mrsets); + return dict->mrsets[idx]; +} + +/* Returns the number of multiple response sets in DICT. 
*/ +size_t +dict_get_n_mrsets (const struct dictionary *dict) +{ + return dict->n_mrsets; +} + +/* Looks for a multiple response set named NAME in DICT. If it finds one, + returns its index; otherwise, returns SIZE_MAX. */ +static size_t +dict_lookup_mrset_idx (const struct dictionary *dict, const char *name) +{ + size_t i; + + for (i = 0; i < dict->n_mrsets; i++) + if (!strcasecmp (name, dict->mrsets[i]->name)) + return i; + + return SIZE_MAX; +} + +/* Looks for a multiple response set named NAME in DICT. If it finds one, + returns it; otherwise, returns NULL. */ +const struct mrset * +dict_lookup_mrset (const struct dictionary *dict, const char *name) +{ + size_t idx = dict_lookup_mrset_idx (dict, name); + return idx != SIZE_MAX ? dict->mrsets[idx] : NULL; +} + +/* Adds MRSET to DICT, replacing any existing set with the same name. Returns + true if a set was replaced, false if none existed with the specified name. + + Ownership of MRSET is transferred to DICT. */ +bool +dict_add_mrset (struct dictionary *dict, struct mrset *mrset) +{ + size_t idx; + + assert (mrset_ok (mrset, dict)); + + idx = dict_lookup_mrset_idx (dict, mrset->name); + if (idx == SIZE_MAX) + { + dict->mrsets = xrealloc (dict->mrsets, + (dict->n_mrsets + 1) * sizeof *dict->mrsets); + dict->mrsets[dict->n_mrsets++] = mrset; + return true; + } + else + { + mrset_destroy (dict->mrsets[idx]); + dict->mrsets[idx] = mrset; + return false; + } +} + +/* Looks for a multiple response set in DICT named NAME. If found, removes it + from DICT and returns true. If none is found, returns false without + modifying DICT. + Deleting one multiple response set causes the indexes of other sets within + DICT to change. 
*/ +bool +dict_delete_mrset (struct dictionary *dict, const char *name) +{ + size_t idx = dict_lookup_mrset_idx (dict, name); + if (idx != SIZE_MAX) + { + mrset_destroy (dict->mrsets[idx]); + dict->mrsets[idx] = dict->mrsets[--dict->n_mrsets]; + return true; + } + else + return false; +} + +/* Deletes all multiple response sets from DICT. */ +void +dict_clear_mrsets (struct dictionary *dict) +{ + size_t i; + + for (i = 0; i < dict->n_mrsets; i++) + mrset_destroy (dict->mrsets[i]); + free (dict->mrsets); + dict->mrsets = NULL; + dict->n_mrsets = 0; +} + +/* Removes VAR, which must be in DICT, from DICT's multiple response sets. */ +static void +dict_unset_mrset_var (struct dictionary *dict, struct variable *var) +{ + size_t i; + + assert (dict_contains_var (dict, var)); + + for (i = 0; i < dict->n_mrsets; ) + { + struct mrset *mrset = dict->mrsets[i]; + size_t j; + + for (j = 0; j < mrset->n_vars; ) + if (mrset->vars[j] == var) + remove_element (mrset->vars, mrset->n_vars--, + sizeof *mrset->vars, j); + else + j++; + + if (mrset->n_vars < 2) + { + mrset_destroy (mrset); + dict->mrsets[i] = dict->mrsets[--dict->n_mrsets]; + } + else + i++; + } +} + /* Returns D's attribute set. The caller may examine or modify the attribute set, but must not destroy it. Destroying D or calling dict_set_attributes for D will also destroy D's diff --git a/src/data/dictionary.h b/src/data/dictionary.h index 0b3c6cd2..96529a1b 100644 --- a/src/data/dictionary.h +++ b/src/data/dictionary.h @@ -120,7 +120,6 @@ const struct variable *const *dict_get_split_vars (const struct dictionary *); size_t dict_get_split_cnt (const struct dictionary *); void dict_set_split_vars (struct dictionary *, struct variable *const *, size_t cnt); -void dict_unset_split_var (struct dictionary *, struct variable *); /* File label. 
*/ const char *dict_get_label (const struct dictionary *); @@ -149,6 +148,16 @@ const struct vector *dict_lookup_vector (const struct dictionary *, const char *name); void dict_clear_vectors (struct dictionary *); +/* Multiple response sets. */ +const struct mrset *dict_get_mrset (const struct dictionary *, size_t idx); +size_t dict_get_n_mrsets (const struct dictionary *); +const struct mrset *dict_lookup_mrset (const struct dictionary *, + const char *name); + +bool dict_add_mrset (struct dictionary *, struct mrset *); +bool dict_delete_mrset (struct dictionary *, const char *name); +void dict_clear_mrsets (struct dictionary *); + /* Attributes. */ struct attrset *dict_get_attributes (const struct dictionary *); void dict_set_attributes (struct dictionary *, const struct attrset *); diff --git a/src/data/mrset.c b/src/data/mrset.c new file mode 100644 index 00000000..2f05edb8 --- /dev/null +++ b/src/data/mrset.c @@ -0,0 +1,104 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2010 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include + +#include "data/mrset.h" + +#include + +#include "data/dictionary.h" +#include "data/val-type.h" +#include "data/variable.h" + +#include "gl/xalloc.h" + +/* Creates and returns a clone of OLD. The caller is responsible for freeing + the new multiple response set (using mrset_destroy()). 
*/ +struct mrset * +mrset_clone (const struct mrset *old) +{ + struct mrset *new; + + new = xmalloc (sizeof *new); + new->name = xstrdup (old->name); + new->label = old->label != NULL ? xstrdup (old->label) : NULL; + new->type = old->type; + new->vars = xmemdup (old->vars, old->n_vars * sizeof *old->vars); + new->n_vars = old->n_vars; + + new->cat_source = old->cat_source; + new->label_from_var_label = old->label_from_var_label; + value_clone (&new->counted, &old->counted, old->width); + new->width = old->width; + + return new; +} + +/* Frees MRSET and the data that it contains. */ +void +mrset_destroy (struct mrset *mrset) +{ + if (mrset != NULL) + { + free (mrset->name); + free (mrset->label); + free (mrset->vars); + value_destroy (&mrset->counted, mrset->width); + } +} + +/* Checks various constraints on MRSET: + + - MRSET has a valid name for a multiple response set (beginning with '$'). + + - MRSET has a valid type. + + - MRSET has at least 2 variables. + + - All of MRSET's variables are in DICT. + + - All of MRSET's variables are the same type (numeric or string). + + - If MRSET is a multiple dichotomy set, its counted value has the same type + as and is no wider than its narrowest variable. + + Returns true if all the constraints are satisfied, otherwise false. 
*/ +bool +mrset_ok (const struct mrset *mrset, const struct dictionary *dict) +{ + enum val_type type; + size_t i; + + if (mrset->name == NULL + || mrset->name[0] != '$' + || (mrset->type != MRSET_MD && mrset->type != MRSET_MC) + || mrset->vars == NULL + || mrset->n_vars < 2) + return false; + + type = var_get_type (mrset->vars[0]); + if (mrset->type == MRSET_MD && type != val_type_from_width (mrset->width)) + return false; + for (i = 0; i < mrset->n_vars; i++) + if (!dict_contains_var (dict, mrset->vars[i]) + || type != var_get_type (mrset->vars[i]) + || (mrset->type == MRSET_MD + && mrset->width > var_get_width (mrset->vars[i]))) + return false; + + return true; +} diff --git a/src/data/mrset.h b/src/data/mrset.h new file mode 100644 index 00000000..c531db7a --- /dev/null +++ b/src/data/mrset.h @@ -0,0 +1,82 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2010 Free Software Foundation, Inc. + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#ifndef DATA_MRSET_H +#define DATA_MRSET_H 1 + +/* Multiple response set data structure. + + A multiple response set (mrset) is a set of variables that represent + multiple responses to a single survey question in one of the two following + ways: + + - A multiple dichotomy set represents a survey question with a set of + checkboxes. 
Each variable in the set is treated in a Boolean fashion: + one value (the "counted value") means that the box was checked, and any + other value means that it was not. + + - A multiple category set represents a survey question where the + respondent is instructed to "list up to N choices". Each variable + represents one of the responses. + + The set of functions provided here are skeletal. Undoubtedly they will grow + as PSPP begins to make use of multiple response sets, as opposed to merely + maintaining them as part of the dictionary. + */ + +#include +#include + +#include "data/value.h" + +struct dictionary; + +/* Type of a multiple response set. */ +enum mrset_type + { + MRSET_MD, /* Multiple dichotomy group. */ + MRSET_MC /* Multiple category group. */ + }; + +/* Source of category labels for a multiple dichotomy group. */ +enum mrset_md_cat_source + { + MRSET_VARLABELS, /* Variable labels. */ + MRSET_COUNTEDVALUES /* Value labels for the counted value. */ + }; + +/* A multiple response set. */ +struct mrset + { + char *name; /* Name for syntax. Always begins with "$". */ + char *label; /* Human-readable label for group. */ + enum mrset_type type; /* Group type. */ + struct variable **vars; /* Constituent variables. */ + size_t n_vars; /* Number of constituent variables. */ + + /* MRSET_MD only. */ + enum mrset_md_cat_source cat_source; /* Source of category labels. */ + bool label_from_var_label; /* 'label' taken from variable label? */ + union value counted; /* Counted value. */ + int width; /* Width of 'counted'. 
*/ + }; + +struct mrset *mrset_clone (const struct mrset *); +void mrset_destroy (struct mrset *); + +bool mrset_ok (const struct mrset *, const struct dictionary *); + +#endif /* data/mrset.h */ diff --git a/src/data/sys-file-reader.c b/src/data/sys-file-reader.c index d9d26d0a..f284f56e 100644 --- a/src/data/sys-file-reader.c +++ b/src/data/sys-file-reader.c @@ -16,8 +16,8 @@ #include -#include -#include +#include "data/sys-file-reader.h" +#include "data/sys-file-private.h" #include #include @@ -25,36 +25,37 @@ #include #include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "c-ctype.h" -#include "inttostr.h" -#include "minmax.h" -#include "unlocked-io.h" -#include "xalloc.h" -#include "xsize.h" +#include "data/attributes.h" +#include "data/case.h" +#include "data/casereader-provider.h" +#include "data/casereader.h" +#include "data/dictionary.h" +#include "data/file-handle-def.h" +#include "data/file-name.h" +#include "data/format.h" +#include "data/missing-values.h" +#include "data/mrset.h" +#include "data/short-names.h" +#include "data/value-labels.h" +#include "data/value.h" +#include "data/variable.h" +#include "libpspp/array.h" +#include "libpspp/assertion.h" +#include "libpspp/compiler.h" +#include "libpspp/hash.h" +#include "libpspp/i18n.h" +#include "libpspp/message.h" +#include "libpspp/misc.h" +#include "libpspp/pool.h" +#include "libpspp/str.h" +#include "libpspp/stringi-set.h" + +#include "gl/c-ctype.h" +#include "gl/inttostr.h" +#include "gl/minmax.h" +#include "gl/unlocked-io.h" +#include "gl/xalloc.h" +#include "gl/xsize.h" #include "gettext.h" #define _(msgid) gettext (msgid) @@ -100,6 +101,8 @@ static struct variable **make_var_by_value_idx (struct sfm_reader *, static struct variable *lookup_var_by_value_idx (struct sfm_reader *, struct variable **, int 
value_idx); +static struct variable *lookup_var_by_short_name (struct dictionary *, + const char *short_name); static void sys_msg (struct sfm_reader *r, int class, const char *format, va_list args) @@ -128,12 +131,15 @@ static void text_warn (struct sfm_reader *r, struct text_record *text, const char *format, ...) PRINTF_FORMAT (3, 4); static char *text_get_token (struct text_record *, - struct substring delimiters); + struct substring delimiters, char *delimiter); static bool text_match (struct text_record *, char c); static bool text_read_short_name (struct sfm_reader *, struct dictionary *, struct text_record *, struct substring delimiters, struct variable **); +static const char *text_parse_counted_string (struct sfm_reader *, + struct text_record *); +static size_t text_pos (const struct text_record *); static bool close_reader (struct sfm_reader *r); @@ -169,6 +175,8 @@ static void read_machine_integer_info (struct sfm_reader *, ); static void read_machine_float_info (struct sfm_reader *, size_t size, size_t count); +static void read_mrsets (struct sfm_reader *, size_t size, size_t count, + struct dictionary *); static void read_display_parameters (struct sfm_reader *, size_t size, size_t count, struct dictionary *); @@ -825,8 +833,9 @@ read_extension_record (struct sfm_reader *r, struct dictionary *dict, break; case 7: - /* Used by the MRSETS command. */ - break; + case 19: + read_mrsets (r, size, count, dict); + return; case 8: /* Used by the SPSS Data Entry software. */ @@ -1007,6 +1016,160 @@ read_machine_float_info (struct sfm_reader *r, size_t size, size_t count) lowest, "LOWEST"); } +/* Read record type 7, subtype 7 or 19. 
*/ +static void +read_mrsets (struct sfm_reader *r, size_t size, size_t count, + struct dictionary *dict) +{ + struct text_record *text; + struct mrset *mrset; + + text = open_text_record (r, size * count); + for (;;) + { + const char *name, *label, *counted; + struct stringi_set var_names; + size_t allocated_vars; + char delimiter; + int width; + + mrset = xzalloc (sizeof *mrset); + + name = text_get_token (text, ss_cstr ("="), NULL); + if (name == NULL) + break; + mrset->name = xstrdup (name); + + if (text_match (text, 'C')) + { + mrset->type = MRSET_MC; + if (!text_match (text, ' ')) + { + sys_warn (r, _("Missing space following 'C' at offset %zu " + "in MRSETS record"), text_pos (text)); + break; + } + } + else if (text_match (text, 'D')) + { + mrset->type = MRSET_MD; + mrset->cat_source = MRSET_VARLABELS; + } + else if (text_match (text, 'E')) + { + char *number; + + mrset->type = MRSET_MD; + mrset->cat_source = MRSET_COUNTEDVALUES; + if (!text_match (text, ' ')) + { + sys_warn (r, _("Missing space following 'E' at offset %zu " + "in MRSETS record"), text_pos (text)); + break; + } + + number = text_get_token (text, ss_cstr (" "), NULL); + if (!strcmp (number, "11")) + mrset->label_from_var_label = true; + else if (strcmp (number, "1")) + sys_warn (r, _("Unexpected label source value \"%s\" " + "following 'E' at offset %zu in MRSETS record"), + number, text_pos (text)); + } + else + { + sys_warn (r, _("Missing 'C', 'D', or 'E' at offset %zu " + "in MRSETS record."), + text_pos (text)); + break; + } + + if (mrset->type == MRSET_MD) + { + counted = text_parse_counted_string (r, text); + if (counted == NULL) + break; + } + + label = text_parse_counted_string (r, text); + if (label == NULL) + break; + mrset->label = label[0] != '\0' ? 
xstrdup (label) : NULL; + + stringi_set_init (&var_names); + allocated_vars = 0; + width = INT_MAX; + do + { + struct variable *var; + const char *var_name; + + var_name = text_get_token (text, ss_cstr (" \n"), &delimiter); + if (var_name == NULL) + { + sys_warn (r, _("Missing new-line parsing variable names " + "at offset %zu in MRSETS record."), + text_pos (text)); + break; + } + + var = lookup_var_by_short_name (dict, var_name); + if (var == NULL) + continue; + if (!stringi_set_insert (&var_names, var_name)) + { + sys_warn (r, _("Duplicate variable name %s " + "at offset %zu in MRSETS record."), + var_name, text_pos (text)); + continue; + } + + if (mrset->label == NULL && mrset->label_from_var_label + && var_has_label (var)) + mrset->label = xstrdup (var_get_label (var)); + + if (mrset->n_vars + && var_get_type (var) != var_get_type (mrset->vars[0])) + { + sys_warn (r, _("MRSET %s contains both string and " + "numeric variables."), name); + continue; + } + width = MIN (width, var_get_width (var)); + + if (mrset->n_vars >= allocated_vars) + mrset->vars = x2nrealloc (mrset->vars, &allocated_vars, + sizeof *mrset->vars); + mrset->vars[mrset->n_vars++] = var; + } + while (delimiter != '\n'); + + if (mrset->n_vars < 2) + { + sys_warn (r, _("MRSET %s has only %zu variables."), mrset->name, + mrset->n_vars); + mrset_destroy (mrset); + continue; + } + + if (mrset->type == MRSET_MD) + { + mrset->width = width; + value_init (&mrset->counted, width); + if (width == 0) + mrset->counted.f = strtod (counted, NULL); + else + value_copy_str_rpad (&mrset->counted, width, + (const uint8_t *) counted, ' '); + } + + dict_add_mrset (dict, mrset); + mrset = NULL; + } + mrset_destroy (mrset); + close_text_record (r, text); +} + /* Read record type 7, subtype 11, which specifies how variables should be displayed in GUI environments. */ static void @@ -1355,7 +1518,7 @@ read_attributes (struct sfm_reader *r, struct text_record *text, int index; /* Parse the key. 
*/ - key = text_get_token (text, ss_cstr ("(")); + key = text_get_token (text, ss_cstr ("("), NULL); if (key == NULL) return; @@ -1366,7 +1529,7 @@ read_attributes (struct sfm_reader *r, struct text_record *text, char *value; size_t length; - value = text_get_token (text, ss_cstr ("\n")); + value = text_get_token (text, ss_cstr ("\n"), NULL); if (value == NULL) { text_warn (r, text, _("Error parsing attribute value %s[%d]"), @@ -1952,7 +2115,7 @@ read_variable_to_value_pair (struct sfm_reader *r, struct dictionary *dict, if (!text_read_short_name (r, dict, text, ss_cstr ("="), var)) return false; - *value = text_get_token (text, ss_buffer ("\t\0", 2)); + *value = text_get_token (text, ss_buffer ("\t\0", 2), NULL); if (*value == NULL) return false; @@ -1969,13 +2132,13 @@ text_read_short_name (struct sfm_reader *r, struct dictionary *dict, struct text_record *text, struct substring delimiters, struct variable **var) { - char *short_name = text_get_token (text, delimiters); + char *short_name = text_get_token (text, delimiters, NULL); if (short_name == NULL) return false; *var = lookup_var_by_short_name (dict, short_name); if (*var == NULL) - text_warn (r, text, _("Variable map refers to unknown variable %s."), + text_warn (r, text, _("Dictionary record refers to unknown variable %s."), short_name); return true; } @@ -1997,16 +2160,78 @@ text_warn (struct sfm_reader *r, struct text_record *text, } static char * -text_get_token (struct text_record *text, struct substring delimiters) +text_get_token (struct text_record *text, struct substring delimiters, + char *delimiter) { struct substring token; + char *end; if (!ss_tokenize (text->buffer, delimiters, &text->pos, &token)) return NULL; - ss_data (token)[ss_length (token)] = '\0'; + + end = &ss_data (token)[ss_length (token)]; + if (delimiter != NULL) + *delimiter = *end; + *end = '\0'; return ss_data (token); } +/* Reads an integer value expressed in decimal, then a space, then a string that + consists of exactly as 
many bytes as specified by the integer, then a space, + from TEXT. Returns the string, null-terminated, as a subset of TEXT's + buffer (so the caller should not free the string). */ +static const char * +text_parse_counted_string (struct sfm_reader *r, struct text_record *text) +{ + size_t start; + size_t n; + char *s; + + start = text->pos; + n = 0; + for (;;) + { + int c = text->buffer.string[text->pos]; + if (c < '0' || c > '9') + break; + n = (n * 10) + (c - '0'); + text->pos++; + } + if (start == text->pos) + { + sys_warn (r, _("Expecting digit at offset %zu in MRSETS record."), + text->pos); + return NULL; + } + + if (!text_match (text, ' ')) + { + sys_warn (r, _("Expecting space at offset %zu in MRSETS record."), + text->pos); + return NULL; + } + + if (text->pos + n > text->buffer.length) + { + sys_warn (r, _("%zu-byte string starting at offset %zu " + "exceeds record length %zu."), + n, text->pos, text->buffer.length); + return NULL; + } + + s = &text->buffer.string[text->pos]; + if (s[n] != ' ') + { + sys_warn (r, + _("Expecting space at offset %zu following %zu-byte string."), + text->pos + n, n); + return NULL; + } + s[n] = '\0'; + text->pos += n + 1; + return s; +} + static bool text_match (struct text_record *text, char c) { @@ -2018,6 +2243,13 @@ text_match (struct text_record *text, char c) else return false; } + +/* Returns the current byte offset inside the TEXT's string. */ +static size_t +text_pos (const struct text_record *text) +{ + return text->pos; +} /* Messages. 
*/ diff --git a/src/data/sys-file-writer.c b/src/data/sys-file-writer.c index 0d93bd12..a761d4a2 100644 --- a/src/data/sys-file-writer.c +++ b/src/data/sys-file-writer.c @@ -16,8 +16,8 @@ #include -#include "sys-file-writer.h" -#include "sys-file-private.h" +#include "data/sys-file-writer.h" +#include "data/sys-file-private.h" #include #include @@ -26,32 +26,33 @@ #include #include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "minmax.h" -#include "unlocked-io.h" -#include "xalloc.h" +#include "data/attributes.h" +#include "data/case.h" +#include "data/casewriter-provider.h" +#include "data/casewriter.h" +#include "data/dictionary.h" +#include "data/file-handle-def.h" +#include "data/file-name.h" +#include "data/format.h" +#include "data/make-file.h" +#include "data/missing-values.h" +#include "data/mrset.h" +#include "data/settings.h" +#include "data/short-names.h" +#include "data/value-labels.h" +#include "data/variable.h" +#include "libpspp/float-format.h" +#include "libpspp/i18n.h" +#include "libpspp/integer-format.h" +#include "libpspp/message.h" +#include "libpspp/misc.h" +#include "libpspp/str.h" +#include "libpspp/version.h" + +#include "gl/xmemdup0.h" +#include "gl/minmax.h" +#include "gl/unlocked-io.h" +#include "gl/xalloc.h" #include "gettext.h" #define _(msgid) gettext (msgid) @@ -113,6 +114,9 @@ static void write_vls_length_table (struct sfm_writer *w, static void write_long_string_value_labels (struct sfm_writer *, const struct dictionary *); +static void write_mrsets (struct sfm_writer *, const struct dictionary *, + bool pre_v14); + static void write_variable_display_parameters (struct sfm_writer *w, const struct dictionary *dict); @@ -241,6 +245,8 @@ sfm_open_writer (struct file_handle *fh, struct dictionary *d, write_integer_info_record (w); write_float_info_record (w); + 
write_mrsets (w, d, true); + write_variable_display_parameters (w, d); if (opts.version >= 3) @@ -254,6 +260,8 @@ sfm_open_writer (struct file_handle *fh, struct dictionary *d, write_data_file_attributes (w, d); write_variable_attributes (w, d); + write_mrsets (w, d, false); + write_encoding_record (w, d); /* Write end-of-headers record. */ @@ -621,6 +629,64 @@ write_variable_attributes (struct sfm_writer *w, const struct dictionary *d) ds_destroy (&s); } +/* Write multiple response sets. If PRE_V14 is true, writes sets supported by + SPSS before release 14, otherwise writes sets supported only by later + versions. */ +static void +write_mrsets (struct sfm_writer *w, const struct dictionary *dict, + bool pre_v14) +{ + struct string s = DS_EMPTY_INITIALIZER; + size_t n_mrsets; + size_t i; + + n_mrsets = dict_get_n_mrsets (dict); + if (n_mrsets == 0) + return; + + for (i = 0; i < n_mrsets; i++) + { + const struct mrset *mrset = dict_get_mrset (dict, i); + const char *label; + size_t j; + + if ((mrset->type != MRSET_MD || mrset->cat_source != MRSET_COUNTEDVALUES) + != pre_v14) + continue; + + ds_put_format (&s, "%s=", mrset->name); + if (mrset->type == MRSET_MD) + { + char *counted; + + if (mrset->cat_source == MRSET_COUNTEDVALUES) + ds_put_format (&s, "E %d ", mrset->label_from_var_label ? 11 : 1); + else + ds_put_char (&s, 'D'); + + if (mrset->width == 0) + counted = xasprintf ("%.0f", mrset->counted.f); + else + counted = xmemdup0 (value_str (&mrset->counted, mrset->width), + mrset->width); + ds_put_format (&s, "%zu %s", strlen (counted), counted); + free (counted); + } + else + ds_put_char (&s, 'C'); + ds_put_char (&s, ' '); + + label = mrset->label && !mrset->label_from_var_label ? 
mrset->label : ""; + ds_put_format (&s, "%zu %s", strlen (label), label); + + for (j = 0; j < mrset->n_vars; j++) + ds_put_format (&s, " %s", var_get_short_name (mrset->vars[j], 0)); + ds_put_char (&s, '\n'); + } + write_attribute_record (w, &s, 7); + ds_destroy (&s); +} + /* Write the alignment, width and scale values. */ static void write_variable_display_parameters (struct sfm_writer *w, diff --git a/src/data/value-labels.c b/src/data/value-labels.c index ff78b9d2..b1c0dc7a 100644 --- a/src/data/value-labels.c +++ b/src/data/value-labels.c @@ -118,6 +118,13 @@ val_labs_clear (struct val_labs *vls) } } +/* Returns the width of VLS. */ +int +val_labs_get_width (const struct val_labs *vls) +{ + return vls->width; +} + /* Returns the number of value labels in VLS. Returns 0 if VLS is null. */ size_t diff --git a/src/language/command.def b/src/language/command.def index d51f3295..80b471b6 100644 --- a/src/language/command.def +++ b/src/language/command.def @@ -1,5 +1,5 @@ /* PSPP - a program for statistical analysis. - Copyright (C) 2006, 2009 Free Software Foundation, Inc. + Copyright (C) 2006, 2009, 2010 Free Software Foundation, Inc. 
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -72,6 +72,7 @@ DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "IF", cmd_if) DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "LEAVE", cmd_leave) DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "LOOP", cmd_loop) DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "MISSING VALUES", cmd_missing_values) +DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "MRSETS", cmd_mrsets) DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "NUMERIC", cmd_numeric) DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "PRINT EJECT", cmd_print_eject) DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "PRINT FORMATS", cmd_print_formats) @@ -202,7 +203,6 @@ UNIMPL_CMD ("MODEL CLOSE ", "Close server connection") UNIMPL_CMD ("MODEL HANDLE", "Define server connection") UNIMPL_CMD ("MODEL LIST ", "Show existing models") UNIMPL_CMD ("MODEL NAME ", "Specify model label") -UNIMPL_CMD ("MRSETS", "Multiple response sets") UNIMPL_CMD ("MULTIPLE CORRESPONDENCE", "Multiple correspondence analysis") UNIMPL_CMD ("MULT RESPONSE", "Multiple reponse analysis") UNIMPL_CMD ("MVA", "Missing value analysis") diff --git a/src/language/dictionary/automake.mk b/src/language/dictionary/automake.mk index 2aa91842..15243b65 100644 --- a/src/language/dictionary/automake.mk +++ b/src/language/dictionary/automake.mk @@ -7,6 +7,7 @@ language_dictionary_sources = \ src/language/dictionary/formats.c \ src/language/dictionary/missing-values.c \ src/language/dictionary/modify-variables.c \ + src/language/dictionary/mrsets.c \ src/language/dictionary/numeric.c \ src/language/dictionary/rename-variables.c \ src/language/dictionary/split-file.c \ diff --git a/src/language/dictionary/mrsets.c b/src/language/dictionary/mrsets.c new file mode 100644 index 00000000..6e8a0508 --- /dev/null +++ b/src/language/dictionary/mrsets.c @@ -0,0 +1,605 @@ +/* PSPP - a program for statistical analysis. + Copyright (C) 2010 Free Software Foundation, Inc. 
+ + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include <config.h> + +#include "data/data-out.h" +#include "data/dictionary.h" +#include "data/mrset.h" +#include "data/procedure.h" +#include "data/value-labels.h" +#include "data/variable.h" +#include "language/command.h" +#include "language/lexer/lexer.h" +#include "language/lexer/variable-parser.h" +#include "libpspp/assertion.h" +#include "libpspp/hmap.h" +#include "libpspp/message.h" +#include "libpspp/str.h" +#include "libpspp/stringi-map.h" +#include "libpspp/stringi-set.h" +#include "output/tab.h" + +#include "gl/xalloc.h" + +#include "gettext.h" +#define _(msgid) gettext (msgid) + +static bool parse_group (struct lexer *, struct dictionary *, enum mrset_type); +static bool parse_delete (struct lexer *, struct dictionary *); +static bool parse_display (struct lexer *, struct dictionary *); + +int +cmd_mrsets (struct lexer *lexer, struct dataset *ds) +{ + struct dictionary *dict = dataset_dict (ds); + + while (lex_match (lexer, '/')) + { + bool ok; + + if (lex_match_id (lexer, "MDGROUP")) + ok = parse_group (lexer, dict, MRSET_MD); + else if (lex_match_id (lexer, "MCGROUP")) + ok = parse_group (lexer, dict, MRSET_MC); + else if (lex_match_id (lexer, "DELETE")) + ok = parse_delete (lexer, dict); + else if (lex_match_id (lexer, "DISPLAY")) + ok = parse_display (lexer, dict); + else + { + ok = false; + lex_error (lexer, NULL); + } + + if (!ok) + return CMD_FAILURE; + } + 
+ return lex_end_of_command (lexer); +} + +static bool +parse_group (struct lexer *lexer, struct dictionary *dict, + enum mrset_type type) +{ + const char *subcommand_name = type == MRSET_MD ? "MDGROUP" : "MCGROUP"; + struct mrset *mrset; + bool labelsource_varlabel; + bool has_value; + + mrset = xzalloc (sizeof *mrset); + mrset->type = type; + mrset->cat_source = MRSET_VARLABELS; + + labelsource_varlabel = false; + has_value = false; + while (lex_token (lexer) != '/' && lex_token (lexer) != '.') + { + if (lex_match_id (lexer, "NAME")) + { + if (!lex_force_match (lexer, '=') || !lex_force_id (lexer)) + goto error; + if (lex_tokid (lexer)[0] != '$') + { + msg (SE, _("%s is not a valid name for a multiple response " + "set. Multiple response set names must begin with " + "`$'."), lex_tokid (lexer)); + goto error; + } + + free (mrset->name); + mrset->name = xstrdup (lex_tokid (lexer)); + lex_get (lexer); + } + else if (lex_match_id (lexer, "VARIABLES")) + { + if (!lex_force_match (lexer, '=')) + goto error; + + free (mrset->vars); + if (!parse_variables (lexer, dict, &mrset->vars, &mrset->n_vars, + PV_SAME_TYPE | PV_NO_SCRATCH)) + goto error; + + if (mrset->n_vars < 2) + { + msg (SE, _("VARIABLES specified only variable %s on %s, but " + "at least two variables are required."), + var_get_name (mrset->vars[0]), subcommand_name); + goto error; + } + } + else if (lex_match_id (lexer, "LABEL")) + { + if (!lex_force_match (lexer, '=') || !lex_force_string (lexer)) + goto error; + + free (mrset->label); + mrset->label = ds_xstrdup (lex_tokstr (lexer)); + lex_get (lexer); + } + else if (type == MRSET_MD && lex_match_id (lexer, "LABELSOURCE")) + { + if (!lex_force_match (lexer, '=') + || !lex_force_match_id (lexer, "VARLABEL")) + goto error; + + labelsource_varlabel = true; + } + else if (type == MRSET_MD && lex_match_id (lexer, "VALUE")) + { + if (!lex_force_match (lexer, '=')) + goto error; + + has_value = true; + if (lex_is_number (lexer)) + { + if (!lex_is_integer 
(lexer)) + { + msg (SE, _("Numeric VALUE must be an integer.")); + goto error; + } + value_destroy (&mrset->counted, mrset->width); + mrset->counted.f = lex_integer (lexer); + mrset->width = 0; + } + else if (lex_is_string (lexer)) + { + const char *s = ds_cstr (lex_tokstr (lexer)); + int width; + + /* Trim off trailing spaces, but don't trim the string until + it's empty because a width of 0 is a numeric type. */ + width = strlen (s); + while (width > 1 && s[width - 1] == ' ') + width--; + + value_destroy (&mrset->counted, mrset->width); + value_init (&mrset->counted, width); + memcpy (value_str_rw (&mrset->counted, width), s, width); + mrset->width = width; + } + else + { + lex_error (lexer, NULL); + goto error; + } + lex_get (lexer); + } + else if (type == MRSET_MD && lex_match_id (lexer, "CATEGORYLABELS")) + { + if (!lex_force_match (lexer, '=')) + goto error; + + if (lex_match_id (lexer, "VARLABELS")) + mrset->cat_source = MRSET_VARLABELS; + else if (lex_match_id (lexer, "COUNTEDVALUES")) + mrset->cat_source = MRSET_COUNTEDVALUES; + else + { + lex_error (lexer, NULL); + goto error; + } + } + else + { + lex_error (lexer, NULL); + goto error; + } + } + + if (mrset->name == NULL) + { + msg (SE, _("Required %s specification missing from %s subcommand."), + "NAME", subcommand_name); + goto error; + } + else if (mrset->n_vars == 0) + { + msg (SE, _("Required %s specification missing from %s subcommand."), + "VARIABLES", subcommand_name); + goto error; + } + + if (type == MRSET_MD) + { + /* Check that VALUE is specified and is valid for the VARIABLES. 
*/ + if (!has_value) + { + msg (SE, _("Required %s specification missing from %s subcommand."), + "VALUE", subcommand_name); + goto error; + } + else if (var_is_alpha (mrset->vars[0])) + { + if (mrset->width == 0) + { + msg (SE, _("MDGROUP subcommand for group %s specifies a string " + "VALUE, but the variables specified for this group " + "are numeric."), + mrset->name); + goto error; + } + else { + const struct variable *shortest_var; + int min_width; + size_t i; + + shortest_var = NULL; + min_width = INT_MAX; + for (i = 0; i < mrset->n_vars; i++) + { + int width = var_get_width (mrset->vars[i]); + if (width < min_width) + { + shortest_var = mrset->vars[i]; + min_width = width; + } + } + if (mrset->width > min_width) + { + msg (SE, _("VALUE string on MDGROUP subcommand for group " + "%s is %d bytes long, but it must be no longer " + "than the narrowest variable in the group, " + "which is %s with a width of %d bytes."), + mrset->name, mrset->width, + var_get_name (shortest_var), min_width); + goto error; + } + } + } + else + { + if (mrset->width != 0) + { + msg (SE, _("MDGROUP subcommand for group %s specifies a string " + "VALUE, but the variables specified for this group " + "are numeric."), + mrset->name); + goto error; + } + } + + /* Implement LABELSOURCE=VARLABEL. */ + if (labelsource_varlabel) + { + if (mrset->cat_source != MRSET_COUNTEDVALUES) + msg (SW, _("MDGROUP subcommand for group %s specifies " + "LABELSOURCE=VARLABEL but not " + "CATEGORYLABELS=COUNTEDVALUES. " + "Ignoring LABELSOURCE."), + mrset->name); + else if (mrset->label) + msg (SW, _("MDGROUP subcommand for group %s specifies both LABEL " + "and LABELSOURCE, but only one of these subcommands " + "may be used at a time. 
Ignoring LABELSOURCE."), + mrset->name); + else + { + size_t i; + + mrset->label_from_var_label = true; + for (i = 0; mrset->label == NULL && i < mrset->n_vars; i++) + { + const char *label = var_get_label (mrset->vars[i]); + if (label != NULL) + { + mrset->label = xstrdup (label); + break; + } + } + } + } + + /* Warn if categories cannot be distinguished in output. */ + if (mrset->cat_source == MRSET_VARLABELS) + { + struct stringi_map seen; + size_t i; + + stringi_map_init (&seen); + for (i = 0; i < mrset->n_vars; i++) + { + const struct variable *var = mrset->vars[i]; + const char *name = var_get_name (var); + const char *label = var_get_label (var); + if (label != NULL) + { + const char *other_name = stringi_map_find (&seen, label); + + if (other_name == NULL) + stringi_map_insert (&seen, label, name); + else + msg (SW, _("Variables %s and %s specified as part of " + "multiple dichotomy group %s have the same " + "variable label. Categories represented by " + "these variables will not be distinguishable " + "in output."), + other_name, name, mrset->name); + } + } + stringi_map_destroy (&seen); + } + else + { + struct stringi_map seen; + size_t i; + + stringi_map_init (&seen); + for (i = 0; i < mrset->n_vars; i++) + { + const struct variable *var = mrset->vars[i]; + const char *name = var_get_name (var); + const struct val_labs *val_labs; + union value value; + const char *label; + + value_clone (&value, &mrset->counted, mrset->width); + value_resize (&value, mrset->width, var_get_width (var)); + + val_labs = var_get_value_labels (var); + label = val_labs_find (val_labs, &value); + if (label == NULL) + msg (SW, _("Variable %s specified as part of multiple " + "dichotomy group %s (which has " + "CATEGORYLABELS=COUNTEDVALUES) has no value label " + "for its counted value. 
This category will not " + "be distinguishable in output."), + name, mrset->name); + else + { + const char *other_name = stringi_map_find (&seen, label); + + if (other_name == NULL) + stringi_map_insert (&seen, label, name); + else + msg (SW, _("Variables %s and %s specified as part of " + "multiple dichotomy group %s (which has " + "CATEGORYLABELS=COUNTEDVALUES) have the same " + "value label for the the group's counted " + "value. These categories will not be " + "distinguishable in output."), + other_name, name, mrset->name); + } + } + stringi_map_destroy (&seen); + } + } + else /* MCGROUP. */ + { + /* Warn if categories cannot be distinguished in output. */ + struct category + { + struct hmap_node hmap_node; + union value value; + int width; + const char *label; + const char *var_name; + bool warned; + }; + + struct category *c, *next; + struct hmap categories; + size_t i; + + hmap_init (&categories); + for (i = 0; i < mrset->n_vars; i++) + { + const struct variable *var = mrset->vars[i]; + const char *name = var_get_name (var); + int width = var_get_width (var); + const struct val_labs *val_labs; + const struct val_lab *vl; + + val_labs = var_get_value_labels (var); + for (vl = val_labs_first (val_labs); vl != NULL; + vl = val_labs_next (val_labs, vl)) + { + const union value *value = val_lab_get_value (vl); + const char *label = val_lab_get_label (vl); + unsigned int hash = value_hash (value, width, 0); + + HMAP_FOR_EACH_WITH_HASH (c, struct category, hmap_node, + hash, &categories) + { + if (width == c->width + && value_equal (value, &c->value, width)) + { + if (!c->warned && strcasecmp (c->label, label)) + { + char *s = data_out (value, var_get_encoding (var), + var_get_print_format (var)); + c->warned = true; + msg (SW, _("Variables specified on MCGROUP should " + "have the same categories, but %s and %s " + "(and possibly others) in multiple " + "category group %s have different " + "value labels for value %s."), + c->var_name, name, mrset->name, s); + 
free (s); + } + goto found; + } + } + + c = xmalloc (sizeof *c); + value_clone (&c->value, value, width); + c->width = width; + c->label = label; + c->var_name = name; + c->warned = false; + hmap_insert (&categories, &c->hmap_node, hash); + + found: ; + } + } + + HMAP_FOR_EACH_SAFE (c, next, struct category, hmap_node, &categories) + { + value_destroy (&c->value, c->width); + hmap_delete (&categories, &c->hmap_node); + free (c); + } + hmap_destroy (&categories); + } + + dict_add_mrset (dict, mrset); + return true; + +error: + mrset_destroy (mrset); + return false; +} + +static bool +parse_mrset_names (struct lexer *lexer, struct dictionary *dict, + struct stringi_set *mrset_names) +{ + if (!lex_force_match_id (lexer, "NAME") || !lex_force_match (lexer, '=')) + return false; + + stringi_set_init (mrset_names); + if (lex_match (lexer, '[')) + { + while (!lex_match (lexer, ']')) + { + if (!lex_force_id (lexer)) + return false; + if (dict_lookup_mrset (dict, lex_tokid (lexer)) == NULL) + { + msg (SE, _("No multiple response set named %s."), + lex_tokid (lexer)); + stringi_set_destroy (mrset_names); + return false; + } + stringi_set_insert (mrset_names, lex_tokid (lexer)); + lex_get (lexer); + } + } + else if (lex_match (lexer, T_ALL)) + { + size_t n_sets = dict_get_n_mrsets (dict); + size_t i; + + for (i = 0; i < n_sets; i++) + stringi_set_insert (mrset_names, dict_get_mrset (dict, i)->name); + } + + return true; +} + +static bool +parse_delete (struct lexer *lexer, struct dictionary *dict) +{ + const struct stringi_set_node *node; + struct stringi_set mrset_names; + const char *name; + + if (!parse_mrset_names (lexer, dict, &mrset_names)) + return false; + + STRINGI_SET_FOR_EACH (name, node, &mrset_names) + dict_delete_mrset (dict, name); + stringi_set_destroy (&mrset_names); + + return true; +} + +static bool +parse_display (struct lexer *lexer, struct dictionary *dict) +{ + struct string details, var_names; + struct stringi_set mrset_names_set; + char **mrset_names; 
+ struct tab_table *table; + size_t i, n; + + if (!parse_mrset_names (lexer, dict, &mrset_names_set)) + return false; + + n = stringi_set_count (&mrset_names_set); + if (n == 0) + { + if (dict_get_n_mrsets (dict) == 0) + msg (SN, _("The active file dictionary does not contain any multiple " + "response sets.")); + stringi_set_destroy (&mrset_names_set); + return true; + } + + table = tab_create (3, n + 1); + tab_headers (table, 0, 0, 1, 0); + tab_box (table, TAL_1, TAL_1, TAL_1, TAL_1, 0, 0, 2, n); + tab_hline (table, TAL_2, 0, 2, 1); + tab_title (table, "%s", _("Multiple Response Sets")); + tab_text (table, 0, 0, TAB_EMPH | TAB_LEFT, _("Name")); + tab_text (table, 1, 0, TAB_EMPH | TAB_LEFT, _("Variables")); + tab_text (table, 2, 0, TAB_EMPH | TAB_LEFT, _("Details")); + + ds_init_empty (&details); + ds_init_empty (&var_names); + mrset_names = stringi_set_get_sorted_array (&mrset_names_set); + for (i = 0; i < n; i++) + { + const struct mrset *mrset = dict_lookup_mrset (dict, mrset_names[i]); + const int row = i + 1; + size_t j; + + /* Details. */ + ds_clear (&details); + ds_put_format (&details, "%s\n", (mrset->type == MRSET_MD + ? _("Multiple dichotomy set") + : _("Multiple category set"))); + if (mrset->label != NULL) + ds_put_format (&details, "%s: %s\n", _("Label"), mrset->label); + if (mrset->type == MRSET_MD) + { + if (mrset->label != NULL || mrset->label_from_var_label) + ds_put_format (&details, "%s: %s\n", _("Label source"), + (mrset->label_from_var_label + ? _("First variable label among variables") + : _("Provided by user"))); + ds_put_format (&details, "%s: ", _("Counted value")); + if (mrset->width == 0) + ds_put_format (&details, "%.0f\n", mrset->counted.f); + else + ds_put_format (&details, "\"%.*s\"\n", mrset->width, + value_str (&mrset->counted, mrset->width)); + ds_put_format (&details, "%s: %s\n", _("Category label source"), + (mrset->cat_source == MRSET_VARLABELS + ? 
_("Variable labels") + : _("Value labels of counted value"))); + } + + /* Variable names. */ + ds_clear (&var_names); + for (j = 0; j < mrset->n_vars; j++) + ds_put_format (&var_names, "%s\n", var_get_name (mrset->vars[j])); + + tab_text (table, 0, row, TAB_LEFT, mrset_names[i]); + tab_text (table, 1, row, TAB_LEFT, ds_cstr (&var_names)); + tab_text (table, 2, row, TAB_LEFT, ds_cstr (&details)); + } + free (mrset_names); + ds_destroy (&var_names); + ds_destroy (&details); + stringi_set_destroy (&mrset_names_set); + + tab_submit (table); + + return true; +} diff --git a/tests/automake.mk b/tests/automake.mk index 8cd379bc..a342e549 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -417,6 +417,7 @@ EXTRA_DIST += \ $(TESTSUITE) TESTSUITE_AT = \ tests/testsuite.at \ + tests/language/dictionary/mrsets.at \ tests/language/stats/aggregate.at \ tests/language/stats/autorecode.at \ tests/language/stats/crosstabs.at \ diff --git a/tests/dissect-sysfile.c b/tests/dissect-sysfile.c index 17eaf976..ecb8d59b 100644 --- a/tests/dissect-sysfile.c +++ b/tests/dissect-sysfile.c @@ -64,6 +64,7 @@ static void read_machine_integer_info (struct sfm_reader *, size_t size, size_t count); static void read_machine_float_info (struct sfm_reader *, size_t size, size_t count); +static void read_mrsets (struct sfm_reader *, size_t size, size_t count); static void read_display_parameters (struct sfm_reader *, size_t size, size_t count); static void read_long_var_name_map (struct sfm_reader *r, @@ -90,6 +91,8 @@ static bool read_variable_to_value_pair (struct text_record *, char **key, char **value); static char *text_tokenize (struct text_record *, int delimiter); static bool text_match (struct text_record *text, int c); +static const char *text_parse_counted_string (struct text_record *); +static size_t text_pos (const struct text_record *); static void usage (int exit_code); static void sys_warn (struct sfm_reader *, const char *, ...) 
@@ -523,8 +526,9 @@ read_extension_record (struct sfm_reader *r) break; case 7: - /* Unknown purpose. */ - break; + case 19: + read_mrsets (r, size, count); + return; case 11: read_display_parameters (r, size, count); @@ -632,6 +636,107 @@ read_machine_float_info (struct sfm_reader *r, size_t size, size_t count) lowest, "LOWEST"); } +/* Read record type 7, subtype 7. */ +static void +read_mrsets (struct sfm_reader *r, size_t size, size_t count) +{ + struct text_record *text; + + printf ("%08lx: multiple response sets\n", ftell (r->file)); + text = open_text_record (r, size * count); + for (;;) + { + const char *name; + enum { MRSET_MC, MRSET_MD } type; + bool cat_label_from_counted_values = false; + bool label_from_var_label = false; + const char *counted; + const char *label; + const char *variables; + + name = text_tokenize (text, '='); + if (name == NULL) + break; + + if (text_match (text, 'C')) + { + type = MRSET_MC; + counted = NULL; + if (!text_match (text, ' ')) + { + sys_warn (r, "missing space following 'C' at offset %zu " + "in mrsets record", text_pos (text)); + break; + } + } + else if (text_match (text, 'D')) + { + type = MRSET_MD; + } + else if (text_match (text, 'E')) + { + char *number; + + type = MRSET_MD; + cat_label_from_counted_values = true; + + if (!text_match (text, ' ')) + { + sys_warn (r, _("Missing space following 'E' at offset %zu " + "in MRSETS record"), text_pos (text)); + break; + } + + number = text_tokenize (text, ' '); + if (!strcmp (number, "11")) + label_from_var_label = true; + else if (strcmp (number, "1")) + sys_warn (r, _("Unexpected label source value \"%s\" " + "following 'E' at offset %zu in MRSETS record"), + number, text_pos (text)); + + } + else + { + sys_warn (r, "missing 'C', 'D', or 'E' at offset %zu " + "in mrsets record", text_pos (text)); + break; + } + + if (type == MRSET_MD) + { + counted = text_parse_counted_string (text); + if (counted == NULL) + break; + } + + label = text_parse_counted_string (text); + if 
(label == NULL) + break; + + variables = text_tokenize (text, '\n'); + if (variables == NULL) + { + sys_warn (r, "missing variable names following label " + "at offset %zu in mrsets record", text_pos (text)); + break; + } + + printf ("\t\"%s\": multiple %s set", + name, type == MRSET_MC ? "category" : "dichotomy"); + if (counted != NULL) + printf (", counted value \"%s\"", counted); + if (cat_label_from_counted_values) + printf (", category labels from counted values"); + if (label[0] != '\0') + printf (", label \"%s\"", label); + if (label_from_var_label) + printf (", label from variable label"); + printf(", variables \"%s\"\n", variables); + } + close_text_record (text); +} + /* Read record type 7, subtype 11. */ static void read_display_parameters (struct sfm_reader *r, size_t size, size_t count) @@ -1033,6 +1138,7 @@ read_compressed_data (struct sfm_reader *r) /* State. */ struct text_record { + struct sfm_reader *reader; /* Reader. */ char *buffer; /* Record contents. */ size_t size; /* Size of buffer. */ size_t pos; /* Current position in buffer. */ @@ -1046,6 +1152,8 @@ open_text_record (struct sfm_reader *r, size_t size) struct text_record *text = xmalloc (sizeof *text); char *buffer = xmalloc (size + 1); read_bytes (r, buffer, size); + buffer[size] = '\0'; + text->reader = r; text->buffer = buffer; text->size = size; text->pos = 0; @@ -1088,6 +1196,54 @@ text_match (struct text_record *text, int c) return false; } +/* Reads a integer value expressed in decimal, then a space, then a string that + consists of exactly as many bytes as specified by the integer, then a space, + from TEXT. Returns the string, null-terminated, as a subset of TEXT's + buffer (so the caller should not free the string). 
*/ +static const char * +text_parse_counted_string (struct text_record *text) +{ + size_t start; + size_t n; + char *s; + + start = text->pos; + n = 0; + while (isdigit ((unsigned char) text->buffer[text->pos])) + n = (n * 10) + (text->buffer[text->pos++] - '0'); + if (start == text->pos) + { + sys_error (text->reader, "expecting digit at offset %zu in record", + text->pos); + return NULL; + } + + if (!text_match (text, ' ')) + { + sys_error (text->reader, "expecting space at offset %zu in record", + text->pos); + return NULL; + } + + if (text->pos + n > text->size) + { + sys_error (text->reader, "%zu-byte string starting at offset %zu " + "exceeds record length %zu", n, text->pos, text->size); + return NULL; + } + + s = &text->buffer[text->pos]; + if (s[n] != ' ') + { + sys_error (text->reader, "expecting space at offset %zu following " + "%zu-byte string", text->pos + n, n); + return NULL; + } + s[n] = '\0'; + text->pos += n + 1; + return s; +} + /* Reads a variable=value pair from TEXT. Looks up the variable in DICT and stores it into *VAR. Stores a null-terminated value into *VALUE. */ @@ -1106,6 +1262,13 @@ read_variable_to_value_pair (struct text_record *text, text->pos++; return true; } + +/* Returns the current byte offset inside the TEXT's string. */ +static size_t +text_pos (const struct text_record *text) +{ + return text->pos; +} static void usage (int exit_code) diff --git a/tests/language/dictionary/mrsets.at b/tests/language/dictionary/mrsets.at new file mode 100644 index 00000000..4567173d --- /dev/null +++ b/tests/language/dictionary/mrsets.at @@ -0,0 +1,313 @@ +AT_BANNER([MRSETS]) + +m4_define([DEFINE_MRSETS_DATA], + [DATA LIST NOTABLE /w x y z 1-4 a b c d 5-8 (a). +BEGIN DATA. +1234acbd +5678efgh +END DATA.]) + +m4_define([DEFINE_MRSETS], + [DEFINE_MRSETS_DATA + +[VARIABLE LABEL + w 'duplicate variable label' + x 'Variable x' + z 'Duplicate variable label'. 
+VALUE LABELS + /w 1 'w value 1' + /y 1 'duplicate Value label' + /z 1 'duplicate value Label' + /a b c d 'a' 'burger' 'b' 'fries' 'c' 'shake' 'd' 'taco'. +ADD VALUE LABELS + /b 'b' 'Fries' + /c 'b' 'XXX'. +MRSETS + /MDGROUP NAME=$a + LABEL='First multiple dichotomy group' + CATEGORYLABELS=VARLABELS + VARIABLES=w x y z + VALUE=5 + /MDGROUP NAME=$b + CATEGORYLABELS=COUNTEDVALUES + VARIABLES=z y + VALUE=123 + /MDGROUP NAME=$c + LABELSOURCE=VARLABEL + CATEGORYLABELS=COUNTEDVALUES + VARIABLES=w x y z + VALUE=1 + /MDGROUP NAME=$d + LABELSOURCE=VARLABEL + VARIABLES=a b c d + VALUE='c' + /MCGROUP NAME=$e + LABEL='First multiple category group' + VARIABLES=w x y z + /MCGROUP NAME=$f + VARIABLES=a b c d. +]]) + +m4_define([DEFINE_MRSETS_OUTPUT], + [mrsets.sps:25: warning: MRSETS: Variables w and z specified as part of multiple dichotomy group $a have the same variable label. Categories represented by these variables will not be distinguishable in output. + +mrsets.sps:29: warning: MRSETS: Variable z specified as part of multiple dichotomy group $b (which has CATEGORYLABELS=COUNTEDVALUES) has no value label for its counted value. This category will not be distinguishable in output. + +mrsets.sps:29: warning: MRSETS: Variable y specified as part of multiple dichotomy group $b (which has CATEGORYLABELS=COUNTEDVALUES) has no value label for its counted value. This category will not be distinguishable in output. + +mrsets.sps:34: warning: MRSETS: Variable x specified as part of multiple dichotomy group $c (which has CATEGORYLABELS=COUNTEDVALUES) has no value label for its counted value. This category will not be distinguishable in output. + +mrsets.sps:34: warning: MRSETS: Variables y and z specified as part of multiple dichotomy group $c (which has CATEGORYLABELS=COUNTEDVALUES) have the same value label for the the group's counted value. These categories will not be distinguishable in output. 
+
+mrsets.sps:38: warning: MRSETS: MDGROUP subcommand for group $d specifies LABELSOURCE=VARLABEL but not CATEGORYLABELS=COUNTEDVALUES. Ignoring LABELSOURCE.
+
+"mrsets.sps:41: warning: MRSETS: Variables specified on MCGROUP should have the same categories, but w and y (and possibly others) in multiple category group $e have different value labels for value 1."
+
+"mrsets.sps:42: warning: MRSETS: Variables specified on MCGROUP should have the same categories, but a and c (and possibly others) in multiple category group $f have different value labels for value b."
+])
+
+m4_define([MRSETS_DISPLAY_OUTPUT],
+  [Table: Multiple Response Sets
+Name,Variables,Details
+$a,"w
+x
+y
+z
+","Multiple dichotomy set
+Label: First multiple dichotomy group
+Label source: Provided by user
+Counted value: 5
+Category label source: Variable labels
+"
+$b,"z
+y
+","Multiple dichotomy set
+Counted value: 123
+Category label source: Value labels of counted value
+"
+$c,"w
+x
+y
+z
+","Multiple dichotomy set
+Label: duplicate variable label
+Label source: First variable label among variables
+Counted value: 1
+Category label source: Value labels of counted value
+"
+$d,"a
+b
+c
+d
+","Multiple dichotomy set
+Counted value: ""c""
+Category label source: Variable labels
+"
+$e,"w
+x
+y
+z
+","Multiple category set
+Label: First multiple category group
+"
+$f,"a
+b
+c
+d
+","Multiple category set
+"
+])
+
+AT_SETUP([MRSETS add, display, delete])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS
+[MRSETS
+    /DISPLAY NAME=[$a]
+    /DISPLAY NAME=ALL
+    /DELETE NAME=[$c]
+    /DISPLAY NAME=ALL
+    /DELETE NAME=ALL
+    /DISPLAY NAME=ALL.
+]])
+AT_CHECK([pspp -O format=csv mrsets.sps], [0],
+  [DEFINE_MRSETS_OUTPUT
+Table: Multiple Response Sets
+Name,Variables,Details
+$a,"w
+x
+y
+z
+","Multiple dichotomy set
+Label: First multiple dichotomy group
+Label source: Provided by user
+Counted value: 5
+Category label source: Variable labels
+"
+
+MRSETS_DISPLAY_OUTPUT
+Table: Multiple Response Sets
+Name,Variables,Details
+$a,"w
+x
+y
+z
+","Multiple dichotomy set
+Label: First multiple dichotomy group
+Label source: Provided by user
+Counted value: 5
+Category label source: Variable labels
+"
+$b,"z
+y
+","Multiple dichotomy set
+Counted value: 123
+Category label source: Value labels of counted value
+"
+$d,"a
+b
+c
+d
+","Multiple dichotomy set
+Counted value: ""c""
+Category label source: Variable labels
+"
+$e,"w
+x
+y
+z
+","Multiple category set
+Label: First multiple category group
+"
+$f,"a
+b
+c
+d
+","Multiple category set
+"
+
+mrsets.sps:50: note: MRSETS: The active file dictionary does not contain any multiple response sets.
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS read and write])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS
+SAVE OUTFILE='mrsets.sav'.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [0], [DEFINE_MRSETS_OUTPUT])
+AT_DATA([mrsets2.sps],
+  [GET FILE='mrsets.sav'.
+MRSETS /DISPLAY NAME=ALL.
+])
+AT_CHECK([pspp -O format=csv mrsets2.sps], [0], [MRSETS_DISPLAY_OUTPUT],
+  [], [hd mrsets.sav])
+AT_CLEANUP
+
+AT_SETUP([MRSETS names must begin with $])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MCGROUP NAME=x.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  [mrsets.sps:6: error: MRSETS: x is not a valid name for a multiple response set. Multiple response set names must begin with `$'.
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS must have at least 2 variables])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MCGROUP NAME=$x VARIABLES=a.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  ["mrsets.sps:6: error: MRSETS: VARIABLES specified only variable a on MCGROUP, but at least two variables are required."
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS does not allow noninteger VALUE])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MDGROUP VALUE=1.5.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  [mrsets.sps:6: error: MRSETS: Numeric VALUE must be an integer.
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS requires NAME to define a group])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MCGROUP VARIABLES=a b c.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  [mrsets.sps:6: error: MRSETS: Required NAME specification missing from MCGROUP subcommand.
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS requires VARIABLES to define a group])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MCGROUP NAME=$Mcgroup.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  [mrsets.sps:6: error: MRSETS: Required VARIABLES specification missing from MCGROUP subcommand.
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS variables must be same type])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MCGROUP NAME=$mygroup VARIABLES=a b x y.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  [mrsets.sps:6: error: MRSETS: a and x are not the same type. All variables in this variable list must be of the same type. x will be omitted from the list.
+
+mrsets.sps:6: error: MRSETS: a and y are not the same type. All variables in this variable list must be of the same type. y will be omitted from the list.
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS variables and VALUE must be same type])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MDGROUP NAME=$group1 VARIABLES=a b VALUE=1.
+MRSETS /MDGROUP NAME=$group2 VARIABLES=x y VALUE='abc'.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  ["mrsets.sps:6: error: MRSETS: MDGROUP subcommand for group $group1 specifies a string VALUE, but the variables specified for this group are numeric."
+
+"mrsets.sps:7: error: MRSETS: MDGROUP subcommand for group $group2 specifies a string VALUE, but the variables specified for this group are numeric."
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS VALUE must not be too wide])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MDGROUP NAME=$group1 VARIABLES=a b VALUE='abc'.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  ["mrsets.sps:6: error: MRSETS: VALUE string on MDGROUP subcommand for group $group1 is 3 bytes long, but it must be no longer than the narrowest variable in the group, which is a with a width of 1 bytes."
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS LABEL and LABELSOURCE are exclusive])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+MRSETS /MDGROUP NAME=$group1 VARIABLES=a b VALUE='a'
+  LABEL='label' LABELSOURCE=VARLABEL.
+])
+AT_CHECK([pspp -O format=csv mrsets.sps], [0],
+  [mrsets.sps:7: warning: MRSETS: MDGROUP subcommand for group $group1 specifies LABELSOURCE=VARLABEL but not CATEGORYLABELS=COUNTEDVALUES. Ignoring LABELSOURCE.
+])
+AT_CLEANUP
+
+AT_SETUP([MRSETS DISPLAY or DELETE unknown group])
+AT_DATA([mrsets.sps],
+  [DEFINE_MRSETS_DATA
+[MRSETS /DISPLAY NAME=[$x].
+MRSETS /DELETE NAME=[$y].
+]])
+AT_CHECK([pspp -O format=csv mrsets.sps], [1],
+  [mrsets.sps:6: error: MRSETS: No multiple response set named $x.
+
+mrsets.sps:7: error: MRSETS: No multiple response set named $y.
+])
+AT_CLEANUP
diff --git a/tests/testsuite.at b/tests/testsuite.at
index 7cd9c8ab..1b4d7e44 100644
--- a/tests/testsuite.at
+++ b/tests/testsuite.at
@@ -6,6 +6,7 @@ m4_ifndef([AT_SKIP_IF],
      [AT_CHECK([($1) \
         && exit 77 || exit 0], [0], [ignore], [ignore])])])
 
+m4_include([tests/language/dictionary/mrsets.at])
 m4_include([tests/language/stats/aggregate.at])
 m4_include([tests/language/stats/autorecode.at])
 m4_include([tests/language/stats/crosstabs.at])