in testing.
-Time-stamp: <2004-03-24 19:52:48 blp>
+Time-stamp: <2004-03-26 00:07:35 blp>
What Ben's working on now.
--------------------------
-Expression parser/optimizer/evaluator revisions:
-
- * Testing.
-
- * Add random distributions.
-
- * Get rid of Boolean/integer type mismatch errors.
-
Procedures need to be able to make multiple passes.
Write a better descriptive stats evaluator based on NR two-pass technique,
TODO
----
+The expression tests need tests for XDATE and a few others, see
+tests/xforms/expressions.sh comments for details.
+
+Expressions need random distribution functions.
+
There needs to be another layer onto the lexer, which should probably be
entirely rewritten anyway. The lexer needs to read entire *commands* at a
time, not just a *line* at a time. It also needs to support arbitrary putback,
+Fri Mar 26 00:07:46 2004 Ben Pfaff <blp@gnu.org>
+
+ * pspp.texi: Update chapter on expressions.
+
Sat Mar 20 00:53:10 WST 2004 John Darrington <john@darrington.wattle.id.au>
* pspp.texi: Added a brief mention of the SHOW command.
full-fledged expressions in themselves.
@menu
-* Booleans:: Boolean values.
+* Boolean Values:: Boolean values.
* Missing Values in Expressions:: Using missing values in expressions.
* Grouping Operators:: ( )
* Arithmetic Operators:: + - * / **
* Order of Operations:: Operator precedence.
@end menu
-@node Booleans, Missing Values in Expressions, Expressions, Expressions
-@section Boolean values
+@node Boolean Values, Missing Values in Expressions, Expressions, Expressions
+@section Boolean Values
@cindex Boolean
@cindex values, Boolean
-There is a third type for arguments and results, the @dfn{Boolean} type,
-which is used to represent true/false conditions. Booleans have only
-three possible values: 0 (false), 1 (true), and system-missing.
-System-missing is neither true nor false.
+Some PSPP operators and expressions work with Boolean values, which
+represent true/false conditions. Booleans have only three possible
+values: 0 (false), 1 (true), and system-missing (unknown).
+System-missing is neither true nor false and indicates that the true
+value is unknown.
-@itemize @bullet
-@item
-A numeric expression that has value 0, 1, or system-missing may be used
-in place of a Boolean. Thus, the expression @code{0 AND 1} is valid
-(although it is always false).
-
-@item
-A numeric expression with any other value will cause an error if it is
-used as a Boolean. So, @code{2 OR 3} is invalid.
-
-@item
-A Boolean expression may not be used in place of a numeric expression.
-Thus, @code{(1>2) + (3<4)} is invalid.
+Boolean-typed operands or function arguments must take on one of these
+three values. Other values are considered false, but cause an error
+when the expression is evaluated.
-@item
Strings and Booleans are not compatible, and neither may be used in
place of the other.
-@end itemize
-@node Missing Values in Expressions, Grouping Operators, Booleans, Expressions
+@node Missing Values in Expressions, Grouping Operators, Boolean Values, Expressions
@section Missing Values in Expressions
String missing values are not treated specially in expressions. Most
descriptions.
User-missing values for numeric variables are always transformed into
-the system-missing value, except inside the arguments to the
-@code{VALUE}, @code{SYSMIS}, and @code{MISSING} functions.
+the system-missing value, except inside the arguments to the
+@code{VALUE} and @code{SYSMIS} functions.
The missing-value functions can be used to precisely control how missing
values are treated in expressions. @xref{Missing Value Functions}, for
@cindex logical intersection
@item @var{a} AND @var{b}
@itemx @var{a} & @var{b}
-True if both @var{a} and @var{b} are true. However, if one argument is
-false and the other is missing, the result is false, not missing. If
+True if both @var{a} and @var{b} are true, false otherwise. If one
+argument is false, the result is false even if the other is missing. If
both arguments are missing, the result is missing.
@cindex @code{OR}
@item @var{a} OR @var{b}
@itemx @var{a} | @var{b}
True if at least one of @var{a} and @var{b} is true. If one argument is
-true and the other is missing, the result is true, not missing. If both
+true, the result is true even if the other argument is missing. If both
arguments are missing, the result is missing.
@cindex @code{NOT}
@cindex logical inversion
@item NOT @var{a}
@itemx ~ @var{a}
-True if @var{a} is false.
+True if @var{a} is false. If the argument is missing, then the result
+is missing.
@end table
@node Relational Operators, Functions, Logical Operators, Expressions
The relational operators take numeric or string arguments and produce Boolean
results.
-Note that, with numeric arguments, PSPP does not make exact
-relational tests. Instead, two numbers are considered to be equal even
-if they differ by a small amount. This amount, @dfn{epsilon}, is
-dependent on the PSPP configuration and determined at compile
-time. (The default value is 0.000000001, or
-@ifinfo
-@code{10**(-9)}.)
-@end ifinfo
-@tex
-$10 ^{-9}$.)
-@end tex
-Use of epsilon allows for round-off errors. Use of epsilon is also
-idiotic, but the author is not a numeric analyst.
-
Strings cannot be compared to numbers. When strings of different
lengths are compared, the shorter string is right-padded with spaces
to match the length of the longer string.
@cindex arccosine
@cindex inverse cosine
-@deftypefn {Function} {} ACOS (@var{number})
-@deftypefnx {Function} {} ARCOS (@var{number})
+@deftypefn {Function} {} ARCOS (@var{number})
Takes the arccosine, in radians, of @var{number}. Results in
-system-missing if @var{number} is not between -1 and 1. Portability:
-none.
+system-missing if @var{number} is not between -1 and 1.
@end deftypefn
@cindex arcsine
Takes the arctangent, in radians, of @var{number}.
@end deftypefn
-@cindex arcsine
-@cindex inverse sine
-@deftypefn {Function} {} ASIN (@var{number})
-Takes the arcsine, in radians, of @var{number}. Results in
-system-missing if @var{number} is not between -1 and 1 inclusive.
-Portability: none.
-@end deftypefn
-
-@cindex arctangent
-@cindex inverse tangent
-@deftypefn {Function} {} ATAN (@var{number})
-Takes the arctangent, in radians, of @var{number}.
-@end deftypefn
-
-@quotation
-@strong{Please note:} Use of the AR* group of inverse trigonometric
-functions is recommended over the A* group because they are more
-portable.
-@end quotation
-
@cindex cosine
@deftypefn {Function} {} COS (@var{angle})
Takes the cosine of @var{angle} which should be in radians.
@cindex values, missing
@cindex functions, missing-value
-Missing-value functions take various types as arguments, returning
-various types of results.
-
-@deftypefn {Function} {} MISSING (@var{variable or expression})
-@var{num} may be a single variable name or an expression. If it is a
-variable name, results in 1 if the variable has a user-missing or
-system-missing value for the current case, 0 otherwise. If it is an
-expression, results in 1 if the expression has the system-missing value,
-0 otherwise.
+Missing-value functions take various numeric arguments and yield
+various types of results. Note that the normal rules of evaluation
+apply within expression arguments to these functions. In particular,
+user-missing values for numeric variables are converted to
+system-missing values.
-@quotation
-@strong{Please note:} If the argument is a string expression other than
-a variable name, MISSING is guaranteed to return 0, because strings do
-not have a system-missing value. Also, when using a numeric expression
-argument, remember that user-missing values are converted to the
-system-missing value in most contexts. Thus, the expressions
-@code{MISSING(VAR1 @var{op} VAR2)} and @code{MISSING(VAR1) OR
-MISSING(VAR2)} are often equivalent, depending on the specific operator
-@var{op} used.
-@end quotation
+@deftypefn {Function} {} MISSING (@var{expr})
+Returns 1 if @var{expr} has the system-missing value, 0 otherwise.
@end deftypefn
@deftypefn {Function} {} NMISS (@var{expr} [, @var{expr}]@dots{})
Each argument must be a numeric expression. Returns the number of
-user- or system-missing values in the list. As a special extension,
+system-missing values in the list. As a special extension,
the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a
range of variables; see @ref{Sets of Variables}, for more details.
@end deftypefn
@deftypefn {Function} {} NVALID (@var{expr} [, @var{expr}]@dots{})
Each argument must be a numeric expression. Returns the number of
-values in the list that are not user- or system-missing. As a special extension,
+values in the list that are not system-missing. As a special extension,
the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a
range of variables; see @ref{Sets of Variables}, for more details.
@end deftypefn
-@deftypefn {Function} {} SYSMIS (@var{variable or expression})
-When given the name of a numeric variable, returns 1 if the value of
-that variable is system-missing. Otherwise, if the value is not
-missing or if it is user-missing, returns 0. If given the name of a
-string variable, always returns 1. If given an expression other than
-a single variable name, results in 1 if the value is system- or
-user-missing, 0 otherwise.
+@deftypefn {Function} {} SYSMIS (@var{expr})
+When @var{expr} is simply the name of a numeric variable, returns 1 if
+the variable has the system-missing value, 0 if it is user-missing or
+not missing. If given @var{expr} takes another form, results in 1 if
+the value is system-missing, 0 otherwise.
@end deftypefn
@deftypefn {Function} {} VALUE (@var{variable})
Prevents the user-missing values of @var{variable} from being
-transformed into system-missing values: If @var{variable} is not
-system- or user-missing, results in the value of @var{variable}. If
-@var{variable} is user-missing, results in the value of @var{variable}
-anyway. If @var{variable} is system-missing, results in system-missing.
+transformed into system-missing values, and always results in the
+actual value of @var{variable}, whether it is user-missing,
+system-missing or not missing at all.
@end deftypefn
@node Pseudo-Random Numbers, Set Membership, Missing Value Functions, Functions
@end deftypefn
@cindex variance
-@deftypefn {Function} {} VAR (@var{number}, @var{number}[, @dots{}])
-Results in the variance of the values of @var{number}. This function
-requires at least two valid arguments to give a non-missing result.
-@end deftypefn
-
@deftypefn {Function} {} VARIANCE (@var{number}, @var{number}[, @dots{}])
Results in the variance of the values of @var{number}. This function
requires at least two valid arguments to give a non-missing result.
-(Use VAR in preference to VARIANCE for reasons of portability.)
@end deftypefn
@node String Functions, Time & Date, Statistical Functions, Functions
@cindex numbers, converting from strings
@cindex strings, converting to numbers
-@deftypefn {Function} {} NUMBER (@var{string})
-Returns the number produced when @var{string} is interpreted according
-to format F@var{x}.0, where @var{x} is the number of characters in
-@var{string}. If @var{string} does not form a proper number,
-system-missing is returned without an error message. Portability: none.
-@end deftypefn
-
@deftypefn {Function} {} NUMBER (@var{string}, @var{format})
Returns the number produced when @var{string} is interpreted according
-to format specifier @var{format}. Only the number of characters in
-@var{string} specified by @var{format} are examined. For example,
-@code{NUMBER("123", F3.0)} and @code{NUMBER("1234", F3.0)} both have
-value 123. If @var{string} does not form a proper number,
-system-missing is returned without an error message.
+to format specifier @var{format}. If the format width @var{w} is less
+than the length of @var{string}, then only the first @var{w}
+characters in @var{string} are used, e.g.@: @code{NUMBER("123", F3.0)}
+and @code{NUMBER("1234", F3.0)} both have value 123. If @var{w} is
+greater than @var{string}'s length, then it is treated as if it were
+right-padded with spaces. If @var{string} is not in the correct
+format for @var{format}, system-missing is returned.
@end deftypefn
@cindex strings, searching backwards
+Fri Mar 26 00:54:57 2004 Ben Pfaff <blp@gnu.org>
+
+ * var-labs.c: (cmd_variable_labels) For compatibility, don't allow
+ `/' at start. Check return value of parse_variables() for error
+ return.
+
+Fri Mar 26 00:19:27 2004 Ben Pfaff <blp@gnu.org>
+
+ Revamp expressions: make the code a little nicer, and fix bugs
+ found in testing.
+
+ * expr-evl.c: (expr_evaluate) Make expression argument const.
+ Support OP_ADD, OP_SUB, OP_MUL, OP_DIV instead of OP_PLUS, OP_MUL.
+ OP_POW is missing for arg 2 <= 0. OP_LOG is natural log, not
+ base-10 log. Fix OP_ANY, OP_ANY_STRING, OP_RANGE, OP_RANGE_STRING
+ off-by-ones. Add OP_MAX_STRING, OP_MIN_STRING. Fix OP_TIME_HMS,
+ OP_DATE_WKYR boundary conditions. Add OP_CTIME_DAYS,
+ OP_CTIME_HOURS, OP_CTIME_MINUTES, OP_CTIME_DAYS, OP_CTIME_SECONDS.
+ Support OP_INDEX_2, OP_INDEX_3, OP_RINDEX_2, OP_RINDEX_3 instead
+ of OP_INDEX, OP_INDEX_OPT, OP_RINDEX, OP_RINDEX_OPT. Merge
+ OP_LPAD_OPT into OP_LPAD, OP_RPAD_OPT into OP_RPAD, OP_LTRIM_OPT
+ into OP_LTRIM, OP_RTRIM_OPT into OP_RTRIM, OP_NUMBER_OPT into
+ OP_NUMBER. Fix OP_RTRIM fragility. Support OP_SUBSTR_2,
+ OP_SUBSTR_3 instead of OP_SUBSTR, OP_SUBSTR_OPT. Remove OP_INV.
+ Simplify OP_SYSMIS. Remove OP_STR_MIS.
+
+ * expr-opt.c: (optimize_expression) Rewrite.
+ (macro n0) Removed.
+ (macro n1) Removed.
+ (macro n2) Removed.
+ (macro s0) Removed.
+ (macro s0l) Removed.
+ (macro s1) Removed.
+ (macro s1l) Removed.
+ (macro s2) Removed.
+ (macro s2l) Removed.
+ (macro s) Removed.
+ (macro sl) Removed.
+ (eq_num_con) New function.
+ (optimize_tree) New function.
+ (macro rnc) Removed.
+ (macro frnc) Removed.
+ (str_search) Add const to string params.
+ (str_rsearch) Ditto.
+ (evaluate_tree_no_missing) Renamed from evaluate_tree. Add num[],
+ str[], str_len[] locals to substitute for most of removed macros.
+ Support OP_ADD, OP_SUB, OP_MUL, OP_DIV instead of OP_PLUS, OP_MUL.
+ Removed support for missing values because we're never called with
+ missing values. Use set_number() or set_number_errno() instead of
+ rnc or frnc. Removed any stuff that caused trouble in testing.
+ We can re-add it later if it really slows anything. Fix some
+ random problems.
+ (evaluate_tree_with_missing) Not yet supported. To be added later
+ if it's important.
+ (repl_num_con) Removed.
+ (collapse_node) New function.
+ (force_repl_num_con) Removed.
+ (set_number) New function.
+ (set_number_errno) New function.
+ (repl_str_con) Removed.
+ (set_string) New function.
+ (yrmoda) Tighten boundary conditions. Adopt 2030 cut-off for
+ 2-digit years.
+ (dump_node) No special case for OP_AND, OP_OR.
+
+ * expr-prs.c: (expr_prs) Honor EXPR_NO_OPTIMIZE bit. Rewrite.
+ (expr_get_type) New function.
+ (type_check) Rewrite.
+ (type_coercion) New function.
+ (struct operator) New structure.
+ (match_operator New function.
+ (parse_binary_operators) New function.
+ (parse_inverting_unary_operator) New function.
+ (parse_or) Rewritten.
+ (parse_and) Rewritten.
+ (parse_not) Rewritten.
+ (parse_rel) Rewritten.
+ (parse_add) Rewritten.
+ (parse_mul) Rewritten.
+ (parse_neg) Rewritten.
+ (parse_exp) Rewritten.
+ (parse_sysvar) Add $TRUE, $FALSE system variables.
+ Get $LENGTH, $WIDTH from get_viewlength(), get_viewwidth().
+ (parse_primary) Use allocate_var_node(), allocate_num_con(),
+ allocate_str_con().
+ (struct function) Remove desc, change `func' prototype.
+ (unary_func) Remove special cases.
+ (MISSING_func) Reduce to unary_func() that just returns a boolean.
+ (SYSMIS_func) Handle SYSMIS((x)) like SYSMIS(x).
+ (VALUE_func) Use allocate_var_node().
+ (nary_num_func) Allow MIN and MAX for strings.
+ Use allocate_var_node(). Properly clean up.
+ Fix return type.
+ (generic_str_func) Use local table instead of removed `desc'
+ member. Mostly rewrite.
+ (get_num_args) Revise error message.
+ (parse_function) Return EXPR_ERROR, not 0 on error.
+ (macro op) Removed.
+ (macro varies) Removed.
+ (ops[]) Use expr.def.
+ (free_node) Do nothing if node is null.
+ (allocate_num_con) New function.
+ (allocate_str_con) New function.
+ (allocate_var_node) New function.
+ (allocate_binary_nonterminal) New function.
+ (append_nonterminal_arg) Removed.
+ (static var func_tab[]) Revised.
+ (expr_debug_print_postfix) Make parameter const.
+ Use printf() instead of debug_printf().
+
+ * expr.def: New file.
+
+ * expr.h: Change PXP_* to EXPR_*, all references updated. Also
+ use named enum instead of unnamed, all references updated. Add
+ EXPR_ANY, EXPR_NO_OPTIMIZE.
+
+ * exprP.h: Remove EX_*. Add DEFINE_OPERATOR. Use expr.def
+ instead of defining OP_* directly.
+ (macro IS_TERMINAL) New macro.
+ (macro IS_NONTERMINAL) New macro.
+ (enum OP_NO_FLAGS) New.
+
+Fri Mar 26 00:18:01 2004 Ben Pfaff <blp@gnu.org>
+
+ * error.c: (err_assert_fail) msg variable needs to be non-const.
+
+Fri Mar 26 00:17:24 2004 Ben Pfaff <blp@gnu.org>
+
+ * debug.c: (cmd_debug_evaluate) Rewrite.
+
+Fri Mar 26 00:15:13 2004 Ben Pfaff <blp@gnu.org>
+
+ Fix some CROSSTABS bit rot stupidity.
+
+ * crosstabs.q: Reorder the CELLS subcommands for compatibility.
+ (internal_cmd_crosstabs) Initializes cells[] correctly.
+ (float_M_suffix) Rename format_cell_entry(), change prototype,
+ rewrite.
+ (display_crosstabulation) Fix cell formatting.
+
+Fri Mar 26 00:14:09 2004 Ben Pfaff <blp@gnu.org>
+
+ Make lex_rest_of_line(), lex_entire_end() not discard lines. Have
+ to call lex_discard_line() to do that.
+
+ * command.c: (run_command) Call lex_discard_line() after
+ lex_rest_of_line().
+
+ * lexer.c: (lex_entire_end) Change behavior.
+ (lex_rest_of_line) Change behavior. Return const char *.
+ (lex_discard_line) Don't clear getl_buf, don't emit message.
+
+ * main.c: (handle_error) Emit message here.
+
+ * repeat.c: (internal_cmd_do_repeat) Use lex_discard_line()
+ instead of lex_entire_line().
+
+ * str.c: (mm_find_reverse) Make length params size_t. Rewrite.
+
+ * title.c: (get_title) Call lex_discard_line() after
+ lex_rest_of_line().
+ (cmd_file_label) Ditto.
+ (cmd_document) Deal with const char * return value.
+
+Fri Mar 26 00:10:16 2004 Ben Pfaff <blp@gnu.org>
+
+ Removed REMARK command.
+
+ * command.c: (extract_prefix) Removed.
+ (output_line) Removed.
+ (cmd_remark) Removed.
+
+ * command.def: Remove REMARK.
+
+Fri Mar 26 00:08:38 2004 Ben Pfaff <blp@gnu.org>
+
+ Added abort() after lots of assert(0) invocations to avoid some
+ compiler warnings. We really need a NOT_REACHED macro.
+
Tue Mar 23 08:00:42 WAST 2004 John Darrington <john@darrington.wattle.id.au>
* sort.c: Added missing call to temp_file_close. Changed error
break;
default:
assert (0);
+ abort ();
}
ls_create (ascii_pool, s, value);
}
break;
default:
assert (0);
+ abort ();
}
if (!on)
{
break;
default:
assert (0);
+ abort ();
}
else
{
break;
default:
assert (0);
+ abort ();
}
}
break;
default:
assert (0);
+ abort ();
}
if (off)
output_string (this, ls_value (off), ls_end (off));
break;
default:
assert (0);
+ abort ();
}
if (on)
output_string (this, ls_value (on), ls_end (on));
break;
default:
assert (0);
+ abort ();
}
output_char (this, 1, ch);
n_chars += ep - bp + 1;
return CMD_SUCCESS;
}
-/* Extracts a null-terminated 8-or-fewer-character PREFIX from STRING.
- PREFIX is converted to lowercase. Removes trailing spaces from
- STRING as a side effect. */
-static void
-extract_prefix (char *string, char *prefix)
-{
- /* Length of STRING. */
- int len;
-
- /* Points to the null terminator in STRING (`end pointer'). */
- char *ep;
-
- /* Strip spaces from end of STRING. */
- len = strlen (string);
- while (len && isspace ((unsigned char) string[len - 1]))
- string[--len] = 0;
-
- /* Find null terminator. */
- ep = memchr (string, '\0', 8);
- if (!ep)
- ep = &string[8];
-
- /* Copy prefix, converting to lowercase. */
- while (string < ep)
- *prefix++ = tolower ((unsigned char) (*string++));
- *prefix = 0;
-}
-
-/* Prints STRING on the console and to the listing file, replacing \n
- by newline. */
-static void
-output_line (char *string)
-{
- /* Location of \n in line read in. */
- char *cp;
-
- cp = strstr (string, "\\n");
- while (cp)
- {
- *cp = 0;
- tab_output_text (TAB_LEFT | TAT_NOWRAP, string);
- string = &cp[2];
- cp = strstr (string, "\\n");
- }
- tab_output_text (TAB_LEFT | TAT_NOWRAP, string);
-}
-
-/* Parse and execute REMARK command. */
-int
-cmd_remark ()
-{
- /* Points to the line read in. */
- char *s;
-
- /* Index into s. */
- char *cp;
-
- /* 8-character sentinel used to terminate remark. */
- char sentinel[9];
-
- /* Beginning of line used to compare with SENTINEL. */
- char prefix[9];
-
- som_blank_line ();
-
- s = lex_rest_of_line (NULL);
- if (*s == '-')
- {
- output_line (&s[1]);
- return CMD_SUCCESS;
- }
-
- /* Read in SENTINEL from end of current line. */
- cp = s;
- while (isspace ((unsigned char) *cp))
- cp++;
- extract_prefix (cp, sentinel);
- if (sentinel[0] == 0)
- {
- msg (SE, _("The sentinel may not be the empty string."));
- return CMD_FAILURE;
- }
-
- /* Read in other lines until we encounter the sentinel. */
- while (getl_read_line ())
- {
- extract_prefix (ds_value (&getl_buf), prefix);
- if (!strcmp (sentinel, prefix))
- break;
-
- /* Output the line. */
- output_line (ds_value (&getl_buf));
- }
-
- /* Calling lex_entire_line() forces the sentinel line to be
- discarded. */
- getl_prompt = GETL_PRPT_STANDARD;
- lex_entire_line ();
-
- return CMD_SUCCESS;
-}
-
/* Parses the N command. */
int
cmd_n_of_cases (void)
static int
run_command (void)
{
- char *cmd;
+ const char *cmd;
int string;
/* Handle either a string argument or a full-line argument. */
else
{
cmd = lex_rest_of_line (NULL);
+ lex_discard_line ();
string = 0;
}
}
DEFCMD ("RECODE", ERRO, INPU, TRAN, TRAN, cmd_recode)
DEFCMD ("RECORD TYPE", ERRO, INPU, ERRO, ERRO, cmd_record_type)
UNIMPL ("REFORMAT", ERRO, ERRO, TRAN, TRAN)
-SPCCMD ("REMARK", INIT, INPU, TRAN, PROC, cmd_remark)
DEFCMD ("RENAME VARIABLES", ERRO, INPU, TRAN, PROC, cmd_rename_variables)
DEFCMD ("REPEATING DATA", ERRO, INPU, ERRO, ERRO, cmd_repeating_data)
DEFCMD ("REREAD", ERRO, INPU, ERRO, ERRO, cmd_reread)
compute = compute_trns_create ();
/* Test expression. */
- compute->test = expr_parse (PXP_BOOLEAN);
+ compute->test = expr_parse (EXPR_BOOLEAN);
if (compute->test == NULL)
goto fail;
assert (type == NUMERIC || type == ALPHA);
- compute->rvalue = expr_parse (type == ALPHA ? PXP_STRING : PXP_NUMERIC);
+ compute->rvalue = expr_parse (type == ALPHA ? EXPR_STRING : EXPR_NUMERIC);
if (compute->rvalue == NULL)
return 0;
lex_get ();
if (!lex_force_match ('('))
goto lossage;
- lvalue->element = expr_parse (PXP_NUMERIC);
+ lvalue->element = expr_parse (EXPR_NUMERIC);
if (lvalue->element == NULL)
goto lossage;
if (!lex_force_match (')'))
tabl:!tables/notables,
box:!box/nobox,
pivot:!pivot/nopivot;
- +cells[cl_]=count,none,row,column,total,expected,residual,sresidual,
+ +cells[cl_]=count,none,expected,row,column,total,residual,sresidual,
asresidual,all;
+statistics[st_]=chisq,phi,cc,lambda,uc,none,btau,ctau,risk,gamma,d,
kappa,eta,corr,all.
static int
internal_cmd_crosstabs (void)
{
+ int i;
+
variables = NULL;
variables_cnt = 0;
xtab = NULL;
if (!cmd.sbc_cells)
{
cmd.a_cells[CRS_CL_COUNT] = 1;
- num_cells = 1;
}
else
{
- int i;
int count = 0;
for (i = 0; i < CRS_CL_count; i++)
cmd.a_cells[CRS_CL_ALL] = 0;
}
cmd.a_cells[CRS_CL_NONE] = 0;
- for (num_cells = i = 0; i < CRS_CL_count; i++)
- if (cmd.a_cells[i])
- cmd.a_cells[num_cells++] = i;
}
+ for (num_cells = i = 0; i < CRS_CL_count; i++)
+ if (cmd.a_cells[i])
+ cells[num_cells++] = i;
/* STATISTICS. */
if (cmd.sbc_statistics)
x->vars[first_difference]);
}
-/* Put value V into cell (C,R) of TABLE, suffixed with letter M. */
+/* Put VALUE into cell (C,R) of TABLE, suffixed with character
+ SUFFIX if nonzero. If MARK_MISSING is nonzero the entry is
+ additionally suffixed with a letter `M'. */
static void
-float_M_suffix (struct tab_table *table, int c, int r, double v)
+format_cell_entry (struct tab_table *table, int c, int r, double value,
+ char suffix, int mark_missing)
{
- static const struct fmt_spec f = {FMT_F, 8, 0};
+ const struct fmt_spec f = {FMT_F, 10, 1};
+ union value v;
struct len_string s;
-
- s.length = 9;
- s.string = tab_alloc (table, 9);
- s.string[8] = 'M';
- format_short (s.string, &f, (union value *) &v);
+
+ s.length = 10;
+ s.string = tab_alloc (table, 16);
+ v.f = value;
+ data_out (s.string, &f, &v);
while (*s.string == ' ')
{
s.length--;
s.string++;
}
+ if (suffix != 0)
+ s.string[s.length++] = suffix;
+ if (mark_missing)
+ s.string[s.length++] = 'M';
+
tab_raw (table, c, r, TAB_RIGHT, &s);
}
tab_hline (table, TAL_1, -1, n_cols, 0);
for (c = 0; c < n_cols; c++)
{
- double expected_value = row_tot[r] * col_tot[c] / W;
+ int mark_missing = 0;
+ double expected_value = row_tot[r] * col_tot[c] / W;
+ if (cmd.miss == CRS_REPORT
+ && (is_num_user_missing (cols[c].f, x->vars[COL_VAR])
+ || is_num_user_missing (rows[r].f, x->vars[ROW_VAR])))
+ mark_missing = 1;
for (i = 0; i < num_cells; i++)
{
double v;
+ int suffix = 0;
switch (cells[i])
{
break;
case CRS_CL_ROW:
v = *mp / row_tot[r] * 100.;
+ suffix = '%';
break;
case CRS_CL_COLUMN:
v = *mp / col_tot[c] * 100.;
+ suffix = '%';
break;
case CRS_CL_TOTAL:
v = *mp / W * 100.;
+ suffix = '%';
break;
case CRS_CL_EXPECTED:
v = expected_value;
break;
default:
assert (0);
+ abort ();
}
- if (cmd.miss == CRS_REPORT
- && (is_num_user_missing (cols[c].f, x->vars[COL_VAR])
- || is_num_user_missing (rows[r].f, x->vars[ROW_VAR])))
- float_M_suffix (table, c, i, v);
- else if (v != 0.)
- tab_float (table, c, i, TAB_RIGHT, v, 8, 0);
+ format_cell_entry (table, c, i, v, suffix, mark_missing);
}
mp++;
int r, i;
tab_offset (table, -1, tab_row (table) - num_cells * n_rows);
- for (r = 0; r < n_rows; r++)
- for (i = 0; i < num_cells; i++)
- {
- double v;
-
- switch (cells[i])
- {
- case CRS_CL_COUNT:
- v = row_tot[r];
- break;
- case CRS_CL_ROW:
- v = 100.;
- break;
- case CRS_CL_COLUMN:
- v = row_tot[r] / W * 100.;
- break;
- case CRS_CL_TOTAL:
- v = row_tot[r] / W * 100.;
- break;
- case CRS_CL_EXPECTED:
- case CRS_CL_RESIDUAL:
- case CRS_CL_SRESIDUAL:
- case CRS_CL_ASRESIDUAL:
- v = 0.;
- break;
- default:
- assert (0);
- }
-
- if (cmd.miss == CRS_REPORT
- && is_num_user_missing (rows[r].f, x->vars[ROW_VAR]))
- float_M_suffix (table, n_cols, 0, v);
- else if (v != 0.)
- tab_float (table, n_cols, 0, TAB_RIGHT, v, 8, 0);
-
- tab_next_row (table);
- }
+ for (r = 0; r < n_rows; r++)
+ {
+ char suffix = 0;
+ int mark_missing = 0;
+
+ if (cmd.miss == CRS_REPORT
+ && is_num_user_missing (rows[r].f, x->vars[ROW_VAR]))
+ mark_missing = 1;
+
+ for (i = 0; i < num_cells; i++)
+ {
+ double v;
+
+ switch (cells[i])
+ {
+ case CRS_CL_COUNT:
+ v = row_tot[r];
+ break;
+ case CRS_CL_ROW:
+ v = 100.;
+ suffix = '%';
+ break;
+ case CRS_CL_COLUMN:
+ v = row_tot[r] / W * 100.;
+ suffix = '%';
+ break;
+ case CRS_CL_TOTAL:
+ v = row_tot[r] / W * 100.;
+ suffix = '%';
+ break;
+ case CRS_CL_EXPECTED:
+ case CRS_CL_RESIDUAL:
+ case CRS_CL_SRESIDUAL:
+ case CRS_CL_ASRESIDUAL:
+ v = 0.;
+ break;
+ default:
+ assert (0);
+ abort ();
+ }
+
+ format_cell_entry (table, n_cols, 0, v, suffix, mark_missing);
+ tab_next_row (table);
+ }
+ }
}
/* Column totals, grand total. */
{
- int c, j;
+ int c;
int last_row = 0;
if (num_cells > 1)
for (c = 0; c <= n_cols; c++)
{
double ct = c < n_cols ? col_tot[c] : W;
- int i;
+ int mark_missing = 0;
+ char suffix = 0;
+ int i;
- for (i = j = 0; i < num_cells; i++)
+ if (cmd.miss == CRS_REPORT && c < n_cols
+ && is_num_user_missing (cols[c].f, x->vars[COL_VAR]))
+ mark_missing = 1;
+
+ for (i = 0; i < num_cells; i++)
{
double v;
{
case CRS_CL_COUNT:
v = ct;
+ suffix = '%';
break;
case CRS_CL_ROW:
v = ct / W * 100.;
+ suffix = '%';
break;
case CRS_CL_COLUMN:
v = 100.;
+ suffix = '%';
break;
case CRS_CL_TOTAL:
v = ct / W * 100.;
+ suffix = '%';
break;
case CRS_CL_EXPECTED:
case CRS_CL_RESIDUAL:
continue;
default:
assert (0);
+ abort ();
}
- if (cmd.miss == CRS_REPORT && c < n_cols
- && is_num_user_missing (cols[c].f, x->vars[COL_VAR]))
- float_M_suffix (table, c, j, v);
- else if (v != 0.)
- tab_float (table, c, j, TAB_RIGHT, v, 8, 0);
-
- j++;
+ format_cell_entry (table, c, i, v, suffix, mark_missing);
}
- last_row = j;
+ last_row = i;
}
tab_offset (table, -1, tab_row (table) + last_row);
default:
assert (0);
+ abort ();
}
}
case FMT_A:
assert (0);
+ abort ();
case FMT_AHEX:
assert (0);
+ abort ();
case FMT_IB:
ok = convert_IB (s, fp, number);
default:
assert (0);
+ abort ();
}
}
else
default:
assert (0);
+ abort ();
}
}
02111-1307, USA. */
#include <config.h>
+#include <assert.h>
+#include <stdio.h>
#include "command.h"
#include "error.h"
#include "expr.h"
cmd_debug_evaluate (void)
{
struct expression *expr;
+ union value value;
+ enum expr_type expr_flags;
+ int dump_postfix = 0;
discard_variables ();
- expr = expr_parse (PXP_NONE);
- if (!expr)
- return CMD_FAILURE;
- expr_free (expr);
- if (token != '.')
+ expr_flags = 0;
+ if (lex_match_id ("NOOPTIMIZE"))
+ expr_flags |= EXPR_NO_OPTIMIZE;
+ if (lex_match_id ("POSTFIX"))
+ dump_postfix = 1;
+ if (token != '/')
{
- msg (SE, _("Extra characters after expression."));
+ lex_force_match ('/');
return CMD_FAILURE;
}
+ fprintf (stderr, "%s => ", lex_rest_of_line (NULL));
+ lex_get ();
+
+ expr = expr_parse (EXPR_ANY | expr_flags);
+ if (!expr || token != '.')
+ {
+ fprintf (stderr, "error\n");
+ return CMD_FAILURE;
+ }
+
+ if (dump_postfix)
+ expr_debug_print_postfix (expr);
+ else
+ {
+ expr_evaluate (expr, NULL, 0, &value);
+ switch (expr_get_type (expr))
+ {
+ case EXPR_NUMERIC:
+ if (value.f == SYSMIS)
+ fprintf (stderr, "sysmis\n");
+ else
+ fprintf (stderr, "%.2f\n", value.f);
+ break;
+
+ case EXPR_BOOLEAN:
+ if (value.f == SYSMIS)
+ fprintf (stderr, "sysmis\n");
+ else if (value.f == 0.0)
+ fprintf (stderr, "false\n");
+ else
+ fprintf (stderr, "true\n");
+ break;
+
+ case EXPR_STRING:
+ fputc ('"', stderr);
+ fwrite (value.c + 1, value.c[0], 1, stderr);
+ fputs ("\"\n", stderr);
+ break;
+
+ default:
+ assert (0);
+ }
+ }
+ expr_free (expr);
return CMD_SUCCESS;
}
struct do_if_trns *t;
struct expression *e;
- e = expr_parse (PXP_BOOLEAN);
+ e = expr_parse (EXPR_BOOLEAN);
if (!e)
return NULL;
if (token != '.')
void
err_assert_fail(const char *expr, const char *file, int line)
{
- const char msg[256];
+ char msg[256];
snprintf(msg,256,"Assertion failed: %s:%d; (%s)",file,line,expr);
request_bug_report_and_abort( msg );
}
#include "vfmP.h"
double
-expr_evaluate (struct expression *e, const struct ccase *c, int case_num,
+expr_evaluate (const struct expression *e, const struct ccase *c, int case_num,
union value *v)
{
unsigned char *op = e->op;
{
switch (*op++)
{
- case OP_PLUS:
- sp -= *op - 1;
- if (sp->f != SYSMIS)
- for (i = 1; i < *op; i++)
- {
- if (sp[i].f == SYSMIS)
- {
- sp->f = SYSMIS;
- break;
- }
- else
- sp->f += sp[i].f;
- }
- op++;
+ case OP_ADD:
+ sp--;
+ if (sp[1].f == SYSMIS)
+ sp[0].f = SYSMIS;
+ else if (sp[0].f != SYSMIS)
+ sp[0].f += sp[1].f;
+ break;
+ case OP_SUB:
+ sp--;
+ if (sp[1].f == SYSMIS)
+ sp[0].f = SYSMIS;
+ else if (sp[0].f != SYSMIS)
+ sp[0].f -= sp[1].f;
break;
case OP_MUL:
- sp -= *op - 1;
- if (sp->f != SYSMIS)
- for (i = 1; i < *op; i++)
- {
- if (sp[i].f == SYSMIS)
- {
- sp->f = SYSMIS;
- break;
- }
- else
- sp->f *= sp[i].f;
- }
- op++;
+ sp--;
+ if (sp[1].f == SYSMIS)
+ sp[0].f = SYSMIS;
+ else if (sp[0].f != SYSMIS)
+ sp[0].f *= sp[1].f;
+ break;
+ case OP_DIV:
+ sp--;
+ if (sp[1].f == SYSMIS || sp[1].f == 0.)
+ sp[0].f = SYSMIS;
+ else if (sp[0].f != SYSMIS)
+ sp[0].f /= sp[1].f;
break;
case OP_POW:
sp--;
else if (sp[1].f == SYSMIS)
{
if (sp[0].f == 0.0)
- /* SYSMIS**0 */
sp->f = 0.0;
else
sp->f = SYSMIS;
}
- else if (sp[0].f == 0.0 && sp[1].f == 0.0)
+ else if (sp[0].f == 0.0 && sp[1].f <= 0.0)
sp->f = SYSMIS;
else
sp->f = pow (sp[0].f, sp[1].f);
break;
/* String operators. */
- case OP_STRING_EQ:
+ case OP_EQ_STRING:
sp--;
sp[0].f = st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[1].c[1], sp[1].c[0]) == 0;
break;
- case OP_STRING_GE:
+ case OP_GE_STRING:
sp--;
sp[0].f = st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[1].c[1], sp[1].c[0]) >= 0;
break;
- case OP_STRING_GT:
+ case OP_GT_STRING:
sp--;
sp[0].f = st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[1].c[1], sp[1].c[0]) > 0;
break;
- case OP_STRING_LE:
+ case OP_LE_STRING:
sp--;
sp[0].f = st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[1].c[1], sp[1].c[0]) <= 0;
break;
- case OP_STRING_LT:
+ case OP_LT_STRING:
sp--;
sp[0].f = st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[1].c[1], sp[1].c[0]) < 0;
break;
- case OP_STRING_NE:
+ case OP_NE_STRING:
sp--;
sp[0].f = st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[1].c[1], sp[1].c[0]) != 0;
if (sp->f != SYSMIS)
{
errno = 0;
- sp->f = log10 (sp->f);
+ sp->f = log (sp->f);
if (errno)
sp->f = SYSMIS;
}
sp -= n_args - 1;
if (sp->f == SYSMIS)
break;
- for (i = 1; i <= n_args; i++)
+ for (i = 1; i < n_args; i++)
if (sp[0].f == sp[i].f)
{
sp->f = 1.0;
case OP_ANY_STRING:
{
int n_args = *op++;
+ int result = 0.0;
sp -= n_args - 1;
- for (i = 1; i <= n_args; i++)
+ for (i = 1; i < n_args; i++)
if (!st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[i].c[1], sp[i].c[0]))
{
- sp->f = 1.0;
- goto main_loop;
+ result = 1.0;
+ break;
}
- sp->f = 0.0;
+ sp->f = result;
}
break;
case OP_CFVAR:
sp->f = max;
}
break;
+ case OP_MAX_STRING:
+ {
+ int n_args = *op++;
+ int max_idx = 0;
+
+ sp -= n_args - 1;
+ for (i = 1; i < n_args; i++)
+ if (st_compare_pad (&sp[i].c[1], sp[i].c[0],
+ &sp[max_idx].c[1], sp[max_idx].c[0]) > 0)
+ max_idx = i;
+ sp[0].c = sp[max_idx].c;
+ }
+ break;
case OP_MEAN:
{
int n_args = *op++;
sp->f = min;
}
break;
+ case OP_MIN_STRING:
+ {
+ int n_args = *op++;
+ int min_idx = 0;
+
+ sp -= n_args - 1;
+ for (i = 1; i < n_args; i++)
+ if (st_compare_pad (&sp[i].c[1], sp[i].c[0],
+ &sp[min_idx].c[1], sp[min_idx].c[0]) < 0)
+ min_idx = i;
+ sp[0].c = sp[min_idx].c;
+ }
+ break;
case OP_NMISS:
{
int n_args = *op++;
sp -= n_args - 1;
if (sp->f == SYSMIS)
break;
- for (i = 1; i <= n_args; i += 2)
+ for (i = 1; i < n_args; i += 2)
if (sp[i].f == SYSMIS || sp[i + 1].f == SYSMIS)
continue;
else if (sp[0].f >= sp[i].f && sp[0].f <= sp[i + 1].f)
int n_args = *op++;
sp -= n_args - 1;
- for (i = 1; i <= n_args; i += 2)
+ for (i = 1; i < n_args; i += 2)
if (st_compare_pad (&sp[0].c[1], sp[0].c[0],
&sp[i].c[1], sp[i].c[0]) >= 0
&& st_compare_pad (&sp[0].c[1], sp[0].c[0],
break;
/* Time construction function. */
- case OP_TIME_HMS:
- sp -= 2;
- if (sp[0].f == SYSMIS || sp[1].f == SYSMIS || sp[2].f == SYSMIS)
- sp->f = SYSMIS;
- else
- sp->f = 60. * (60. * sp[0].f + sp[1].f) + sp[2].f;
+ case OP_TIME_HMS:
+ sp -= 2;
+ if (sp[0].f == SYSMIS || sp[1].f == SYSMIS || sp[2].f == SYSMIS)
+ sp->f = SYSMIS;
+ else
+ {
+ double min, max;
+ min = min (sp[0].f, min (sp[1].f, sp[2].f));
+ max = max (sp[0].f, max (sp[1].f, sp[2].f));
+ if (min < 0. && max > 0.)
+ {
+ msg (SW, _("TIME.HMS cannot mix positive and negative "
+ "in its arguments."));
+ sp->f = SYSMIS;
+ }
+ else
+ sp->f = 60. * (60. * sp[0].f + sp[1].f) + sp[2].f;
+ }
+ break;
+ case OP_CTIME_DAYS:
+ if (sp->f != SYSMIS)
+ sp->f /= 60. * 60. * 24.;
break;
+ case OP_CTIME_HOURS:
+ if (sp->f != SYSMIS)
+ sp->f /= 60. * 60;
+ break;
+ case OP_CTIME_MINUTES:
+ if (sp->f != SYSMIS)
+ sp->f /= 60.;
+ break;
+ case OP_TIME_DAYS:
+ if (sp->f != SYSMIS)
+ sp->f *= 60. * 60. * 24.;
+ break;
+ case OP_CTIME_SECONDS:
+ /* No-op. */
+ break;
/* Date construction functions. */
case OP_DATE_DMY:
break;
case OP_DATE_WKYR:
sp--;
- if (sp[0].f == SYSMIS)
+ if (sp[0].f == SYSMIS || sp[1].f == SYSMIS)
sp->f = SYSMIS;
- else
+ else if (sp[0].f < 0. || sp[0].f > 53.)
+ {
+ msg (SW, _("Week argument to WKYR must be in range 0 to 53."));
+ sp->f = SYSMIS;
+ }
+ else
{
- sp[1].f = yrmoda (sp[1].f, 1, 1);
- if (sp->f != SYSMIS)
- sp[1].f = 60. * 60. * 24. * (sp[1].f + 7. * (floor (sp[0].f) - 1.));
- sp->f = sp[1].f;
+ double result = yrmoda (sp[1].f, 1, 1);
+ if (result != SYSMIS)
+ result = 60. * 60. * 24. * (result + 7. * (sp->f - 1.));
+ sp->f = result;
}
break;
case OP_DATE_YRDAY:
sp[0].c = dest;
}
break;
- case OP_INDEX:
+ case OP_INDEX_2:
sp--;
if (sp[1].c[0] == 0)
sp->f = SYSMIS;
else
{
int last = sp[0].c[0] - sp[1].c[0];
+ int result = 0;
for (i = 0; i <= last; i++)
- if (!memcmp (&sp[0].c[i + 1], &sp[0].c[1], sp[0].c[0]))
+ if (!memcmp (&sp[0].c[i + 1], &sp[1].c[1], sp[1].c[0]))
{
- sp->f = i + 1;
- goto main_loop;
+ result = i + 1;
+ break;
}
- sp->f = 0.0;
+ sp->f = result;
}
break;
- case OP_INDEX_OPT:
- {
- /* Length of each search string. */
- int part_len = sp[2].f;
-
- sp -= 2;
- if (sp[1].c[0] == 0 || part_len <= 0 || sp[2].f == SYSMIS
- || sp[1].c[0] % part_len != 0)
- sp->f = SYSMIS;
- else
- {
- /* Last possible index. */
- int last = sp[0].c[0] - part_len;
-
- for (i = 0; i <= last; i++)
- for (j = 0; j < sp[1].c[0]; j += part_len)
- if (!memcmp (&sp[0].c[i], &sp[1].c[j], part_len))
- {
- sp->f = i + 1;
- goto main_loop;
- }
- sp->f = 0.0;
- }
- }
- break;
- case OP_RINDEX:
+ case OP_INDEX_3:
+ sp -= 2;
+ if (sp[1].c[0] == 0)
+ {
+ sp->f = SYSMIS;
+ break;
+ }
+ else if (sp[2].f == SYSMIS)
+ {
+ msg (SW, _("Argument 3 of RINDEX may not be system-missing."));
+ sp->f = SYSMIS;
+ }
+ else
+ {
+ int part_len = sp[2].f;
+ int result = 0;
+ if (part_len < 0 || part_len > sp[1].c[0]
+ || sp[1].c[0] % part_len != 0)
+ {
+ msg (SW, _("Argument 3 of RINDEX must be between 1 and "
+ "the length of argument 2, and it must "
+ "evenly divide the length of argument 2."));
+ sp->f = SYSMIS;
+ break;
+ }
+ else
+ {
+ int last = sp[0].c[0] - part_len;
+ for (i = 0; i <= last; i++)
+ for (j = 0; j < sp[1].c[0]; j += part_len)
+ if (!memcmp (&sp[0].c[i + 1], &sp[1].c[j + 1], part_len))
+ {
+ result = i + 1;
+ goto index_3_out;
+ }
+ index_3_out:
+ sp->f = result;
+ }
+ }
+ break;
+ case OP_RINDEX_2:
sp--;
if (sp[1].c[0] == 0)
sp->f = SYSMIS;
else
{
+ int result = 0;
for (i = sp[0].c[0] - sp[1].c[0]; i >= 0; i--)
- if (!memcmp (&sp[0].c[i + 1], &sp[0].c[1], sp[0].c[0]))
+ if (!memcmp (&sp[0].c[i + 1], &sp[1].c[1], sp[1].c[0]))
{
- sp->f = i + 1;
- goto main_loop;
+ result = i + 1;
+ break;
}
- sp->f = 0.0;
+ sp->f = result;
}
break;
- case OP_RINDEX_OPT:
- {
- /* Length of each search string. */
- int part_len = sp[2].f;
-
- sp -= 2;
- if (sp[1].c[0] == 0 || part_len <= 0 || sp[2].f == SYSMIS
- || sp[1].c[0] % part_len != 0)
- sp->f = SYSMIS;
- else
- {
- for (i = sp[0].c[0] - part_len; i >= 0; i--)
- for (j = 0; j < sp[1].c[0]; j += part_len)
- if (!memcmp (&sp[0].c[i], &sp[1].c[j], part_len))
- {
- sp->f = i + 1;
- goto main_loop;
- }
- sp->f = 0.0;
- }
- }
+ case OP_RINDEX_3:
+ sp -= 2;
+ if (sp[1].c[0] == 0)
+ {
+ sp->f = SYSMIS;
+ break;
+ }
+ else if (sp[2].f == SYSMIS)
+ {
+ msg (SW, _("Argument 3 of RINDEX may not be system-missing."));
+ sp->f = SYSMIS;
+ }
+ else
+ {
+ int part_len = sp[2].f;
+ int result = 0;
+ if (part_len < 0 || part_len > sp[1].c[0]
+ || sp[1].c[0] % part_len != 0)
+ {
+ msg (SW, _("Argument 3 of RINDEX must be between 1 and "
+ "the length of argument 2, and it must "
+ "evenly divide the length of argument 2."));
+ sp->f = SYSMIS;
+ }
+ else
+ {
+ for (i = sp[0].c[0] - part_len; i >= 0; i--)
+ for (j = 0; j < sp[1].c[0]; j += part_len)
+ if (!memcmp (&sp[0].c[i + 1], &sp[1].c[j + 1], part_len))
+ {
+ result = i + 1;
+ goto rindex_3_out;
+ }
+ rindex_3_out:
+ sp->f = result;
+ }
+ }
break;
case OP_LENGTH:
sp->f = sp[0].c[0];
sp[0].c[i] = toupper ((unsigned char) (sp[0].c[i]));
break;
case OP_LPAD:
- {
- int len;
- sp--;
- len = sp[1].f;
- if (sp[1].f == SYSMIS || len < 0 || len > 255)
- sp->c[0] = 0;
- else if (len > sp[0].c[0])
- {
- unsigned char *dest;
-
- dest = pool_alloc (e->pool, len + 1);
- dest[0] = len;
- memset (&dest[1], ' ', len - sp->c[0]);
- memcpy (&dest[len - sp->c[0] + 1], &sp->c[1], sp->c[0]);
- sp->c = dest;
- }
- }
- break;
- case OP_LPAD_OPT:
{
int len;
sp -= 2;
}
break;
case OP_RPAD:
- {
- int len;
- sp--;
- len = sp[1].f;
- if (sp[1].f == SYSMIS || len < 0 || len > 255)
- sp->c[0] = 0;
- else if (len > sp[0].c[0])
- {
- unsigned char *dest;
-
- dest = pool_alloc (e->pool, len + 1);
- dest[0] = len;
- memcpy (&dest[1], &sp->c[1], sp->c[0]);
- memset (&dest[sp->c[0] + 1], ' ', len - sp->c[0]);
- sp->c = dest;
- }
- }
- break;
- case OP_RPAD_OPT:
{
int len;
sp -= 2;
}
break;
case OP_LTRIM:
- {
- int len = sp[0].c[0];
-
- i = 1;
- while (i <= len && sp[0].c[i] == ' ')
- i++;
- if (--i)
- {
- sp[0].c[i] = sp[0].c[0] - i;
- sp->c = &sp[0].c[i];
- }
- }
- break;
- case OP_LTRIM_OPT:
{
sp--;
if (sp[1].c[0] != 1)
}
break;
case OP_RTRIM:
- assert (' ' != 0);
- while (sp[0].c[sp[0].c[0]] == ' ')
- sp[0].c[0]--;
- break;
- case OP_RTRIM_OPT:
sp--;
if (sp[1].c[0] != 1)
sp[0].c[0] = 0;
else
{
- /* Note that NULs are not allowed in strings. This code
- needs to change if this decision is changed. */
int cmp = sp[1].c[1];
- while (sp[0].c[sp[0].c[0]] == cmp)
+ while (sp[0].c[0] > 0 && sp[0].c[sp[0].c[0]] == cmp)
sp[0].c[0]--;
}
break;
case OP_NUMBER:
{
struct data_in di;
-
- di.s = &sp->c[1];
- di.e = &sp->c[1] + sp->c[0];
- di.v = sp;
- di.flags = DI_IGNORE_ERROR;
- di.f1 = 1;
- di.format.type = FMT_F;
- di.format.w = sp->c[0];
- di.format.d = 0;
- data_in (&di);
- }
- break;
- case OP_NUMBER_OPT:
- {
- struct data_in di;
+ union value out;
di.s = &sp->c[1];
- di.e = &sp->c[1] + sp->c[0];
- di.v = sp;
- di.flags = DI_IGNORE_ERROR;
+ di.v = &out;
+ di.flags = 0;
di.f1 = 1;
di.format.type = *op++;
di.format.w = *op++;
di.format.d = *op++;
+ di.e = &sp->c[1] + min (sp->c[0], di.format.w);
data_in (&di);
+ sp->f = out.f;
}
break;
case OP_STRING:
sp->c = dest;
}
break;
- case OP_SUBSTR:
+ case OP_SUBSTR_2:
{
int index;
}
}
break;
- case OP_SUBSTR_OPT:
+ case OP_SUBSTR_3:
{
int index;
int n;
break;
/* Artificial. */
- case OP_INV:
- if (sp->f != SYSMIS)
- sp->f = 1. / sp->f;
- break;
case OP_SQUARE:
if (sp->f != SYSMIS)
sp->f *= sp->f;
sp->f *= rng_get_double (pspp_rng ());
break;
case OP_SYSMIS:
- if (sp[0].f == SYSMIS || !finite (sp[0].f))
- sp->f = 1.0;
- else
- sp->f = 0.0;
+ sp->f = sp->f == SYSMIS || !finite (sp->f);
break;
case OP_VEC_ELEM_NUM:
{
assert (c != NULL);
sp->f = c->data[*op++].f == SYSMIS;
break;
- case OP_STR_MIS:
- sp++;
- assert (c != NULL);
- sp->f = is_str_user_missing (c->data[(*vars)->fv].s, *vars);
- vars++;
- break;
case OP_NUM_VAL:
sp++;
assert (c != NULL);
main_loop: ;
}
finished:
- if (e->type != EX_STRING)
+ if (e->type != EXPR_STRING)
{
double value = sp->f;
if (!finite (value))
#include "str.h"
#include "var.h"
-/*
- Expression "optimizer"
+static void evaluate_tree_no_missing (union any_node **);
+static void evaluate_tree_with_missing (union any_node **, size_t count);
+static void optimize_tree (union any_node **);
- Operates on the tree representation of expressions.
- optimize_expression() performs the optimizations listed below:
+static void collapse_node (union any_node **node, size_t child_idx);
+static void set_number (union any_node **node, double);
+static void set_number_errno (union any_node **node, double);
+static void set_string (union any_node **node, const char *, size_t);
- 1. Constant folding
- Any operation with constant operands is replaced by its value.
- (Exception: random-number-generator functions.)
-
- 2. Strength reduction (x is any expression; a is a numeric constant)
- x/0 => SYSMIS
- x*0 => 0
- x**0 => 1
- x**1, x+0, x-0, x*1 => x
- x**2 => sqr(x)
- x/a => x*(1/a) (where 1/a is evaluated at optimization time)
-
- I thought about adding additional optimizations but decided that what
- is here could already be considered overkill.
- */
-
-static struct nonterm_node *evaluate_tree (struct nonterm_node * n);
-static struct nonterm_node *optimize_tree (struct nonterm_node * n);
-
-struct nonterm_node *
-optimize_expression (struct nonterm_node * n)
+void
+optimize_expression (union any_node **node)
{
+ int nonconst = 0; /* Number of nonconstant children. */
+ int sysmis = 0; /* Number of system-missing children. */
+ struct nonterm_node *nonterm;
int i;
- /* Set to 1 if a child is nonconstant. */
- int nonconst = 0;
-
- /* Number of system-missing children. */
- int sysmis = 0;
-
/* We can't optimize a terminal node. */
- if (n->type > OP_TERMINAL)
- return n;
+ if (IS_TERMINAL ((*node)->type))
+ return;
+ nonterm = &(*node)->nonterm;
/* Start by optimizing all the children. */
- for (i = 0; i < n->n; i++)
+ for (i = 0; i < nonterm->n; i++)
{
- n->arg[i] = ((union any_node *)
- optimize_expression ((struct nonterm_node *) n->arg[i]));
- if (n->arg[i]->type == OP_NUM_CON)
+ optimize_expression (&nonterm->arg[i]);
+ if (nonterm->arg[i]->type == OP_NUM_CON)
{
- if (n->arg[i]->num_con.value == SYSMIS)
+ if (nonterm->arg[i]->num_con.value == SYSMIS)
sysmis++;
}
- else if (n->arg[i]->type != OP_STR_CON)
- nonconst = 1;
+ else if (nonterm->arg[i]->type != OP_STR_CON)
+ nonconst++;
}
- if (sysmis && !(ops[n->type].flags & OP_ABSORB_MISS))
- /* Just about any operation produces SYSMIS when given any SYSMIS
- arguments. */
+ if (sysmis && !(ops[nonterm->type].flags & OP_ABSORB_MISS))
+ {
+ /* Most operation produce SYSMIS given any SYSMIS
+ argument. */
+ set_number (node, SYSMIS);
+ }
+ else if (!nonconst)
+ {
+ /* Evaluate constant expressions. */
+ if (!sysmis)
+ evaluate_tree_no_missing (node);
+ else
+ evaluate_tree_with_missing (node, sysmis);
+ }
+ else
{
- struct num_con_node *num = xmalloc (sizeof *num);
- free_node ((union any_node *) n);
- num->type = OP_NUM_CON;
- num->value = SYSMIS;
- n = (struct nonterm_node *) num;
+ /* A few optimization possibilities are still left. */
+ optimize_tree (node);
}
- else if (!nonconst)
- /* If all the children of this node are constants, then there are
- obvious optimizations. */
- n = evaluate_tree (n);
- else
- /* Otherwise, we may be able to make certain optimizations
- anyway. */
- n = optimize_tree (n);
- return n;
}
-static struct nonterm_node *repl_num_con (struct nonterm_node *, double);
-static struct nonterm_node *force_repl_num_con (struct nonterm_node *, double);
-static struct nonterm_node *repl_str_con (struct nonterm_node *, char *, int);
-
-#define n0 n->arg[0]->num_con.value
-#define n1 n->arg[1]->num_con.value
-#define n2 n->arg[2]->num_con.value
-
-#define s0 n->arg[0]->str_con.s
-#define s0l n->arg[0]->str_con.len
-#define s1 n->arg[1]->str_con.s
-#define s1l n->arg[1]->str_con.len
-#define s2 n->arg[2]->str_con.s
-#define s2l n->arg[2]->str_con.len
-#define s(X) n->arg[X]->str_con.s
-#define sl(X) n->arg[X]->str_con.len
-
-static struct nonterm_node *
-optimize_tree (struct nonterm_node * n)
+static int
+eq_num_con (union any_node *node, double number)
{
- int i;
-
- errno = 0;
- if (n->type == OP_PLUS || n->type == OP_MUL)
- {
- /* Default constant value. */
- double def = n->type == OP_MUL ? 1.0 : 0.0;
-
- /* Total value of all the constants. */
- double cval = def;
-
- /* Number of nonconst arguments. */
- int nvar = 0;
-
- /* New node. */
- struct nonterm_node *m = NULL;
-
- /* Argument copying counter. */
- int c;
-
- /* 1=SYSMIS encountered */
- int sysmis = 0;
-
- for (i = 0; i < n->n; i++)
- if (n->arg[i]->type == OP_NUM_CON)
- {
- if (n->arg[i]->num_con.value != SYSMIS)
- {
- if (n->type == OP_MUL)
- cval *= n->arg[i]->num_con.value;
- else
- cval += n->arg[i]->num_con.value;
- }
- else
- sysmis++;
- }
- else
- nvar++;
-
- /* 0*SYSMIS=0, 0/SYSMIS=0; otherwise, SYSMIS and infinities
- produce SYSMIS. */
- if (cval == 0.0 && n->type == OP_MUL)
- nvar = 0;
- else if (sysmis || !finite (cval))
- {
- nvar = 0;
- cval = SYSMIS;
- }
-
- /* If no nonconstant terms, replace with a constant node. */
- if (nvar == 0)
- return force_repl_num_con (n, cval);
+ return node->type == OP_NUM_CON && node->num_con.value == number;
+}
- if (nvar == 1 && cval == def)
- {
- /* If there is exactly one nonconstant term and no constant
- terms, replace with the nonconstant term. */
- for (i = 0; i < n->n; i++)
- if (n->arg[i]->type != OP_NUM_CON)
- m = (struct nonterm_node *) n->arg[i];
- else
- free_node (n->arg[i]);
- }
- else
- {
- /* Otherwise consolidate all the nonconstant terms. */
- m = xmalloc (sizeof (struct nonterm_node)
- + ((nvar + (cval != def) - 1)
- * sizeof (union any_node *)));
- for (i = c = 0; i < n->n; i++)
- if (n->arg[i]->type != OP_NUM_CON)
- m->arg[c++] = n->arg[i];
- else
- free_node (n->arg[i]);
-
- if (cval != def)
- {
- m->arg[c] = xmalloc (sizeof (struct num_con_node));
- m->arg[c]->num_con.type = OP_NUM_CON;
- m->arg[c]->num_con.value = cval;
- c++;
- }
-
- m->type = n->type;
- m->n = c;
- }
- free (n);
- n = m;
- }
- else if (n->type == OP_POW)
+static void
+optimize_tree (union any_node **node)
+{
+ struct nonterm_node *n = &(*node)->nonterm;
+
+ /* x+0, x-0, 0+x => x. */
+ if ((n->type == OP_ADD || n->type == OP_SUB) && eq_num_con (n->arg[1], 0.))
+ collapse_node (node, 1);
+ else if (n->type == OP_ADD && eq_num_con (n->arg[0], 0.))
+ collapse_node (node, 0);
+
+ /* x*1, x/1, 1*x => x. */
+ else if ((n->type == OP_MUL || n->type == OP_DIV)
+ && eq_num_con (n->arg[1], 1.))
+ collapse_node (node, 0);
+ else if (n->type == OP_MUL && eq_num_con (n->arg[0], 1.))
+ collapse_node (node, 1);
+
+ /* 0*x, 0/x, x*0, MOD(0,x) => x. */
+ else if (((n->type == OP_MUL || n->type == OP_DIV || n->type == OP_MOD)
+ && eq_num_con (n->arg[0], 0.))
+ || (n->type == OP_MUL && eq_num_con (n->arg[1], 0.)))
+ set_number (node, 0.);
+
+ /* x**1 => x. */
+ else if (n->type == OP_POW && eq_num_con (n->arg[1], 1))
+ collapse_node (node, 0);
+
+ /* x**2 => SQUARE(x). */
+ else if (n->type == OP_POW && eq_num_con (n->arg[2], 2))
{
- if (n->arg[1]->type == OP_NUM_CON)
- {
- if (n1 == 1.0)
- {
- struct nonterm_node *m = (struct nonterm_node *) n->arg[0];
-
- free_node (n->arg[1]);
- free (n);
- return m;
- }
- else if (n1 == 2.0)
- {
- n = xrealloc (n, sizeof (struct nonterm_node));
- n->type = OP_SQUARE;
- n->n = 1;
- }
- }
+ n->type = OP_SQUARE;
+ n->n = 1;
}
- return n;
}
-#define rnc(D) \
- (n = repl_num_con (n, D))
-
-#define frnc(D) \
- (n = force_repl_num_con (n, D))
-
/* Finds the first NEEDLE of length NEEDLE_LEN in a HAYSTACK of length
HAYSTACK_LEN. Returns a 1-based index, 0 on failure. */
-static inline int
-str_search (char *haystack, int haystack_len, char *needle, int needle_len)
+static int
+str_search (const char *haystack, int haystack_len,
+ const char *needle, int needle_len)
{
char *p = memmem (haystack, haystack_len, needle, needle_len);
return p ? p - haystack + 1 : 0;
/* Finds the last NEEDLE of length NEEDLE_LEN in a HAYSTACK of length
HAYSTACK_LEN. Returns a 1-based index, 0 on failure. */
-static inline int
-str_rsearch (char *haystack, int haystack_len, char *needle, int needle_len)
+static int
+str_rsearch (const char *haystack, int haystack_len,
+ const char *needle, int needle_len)
{
char *p = mm_find_reverse (haystack, haystack_len, needle, needle_len);
return p ? p - haystack + 1 : 0;
}
-static struct nonterm_node *
-evaluate_tree (struct nonterm_node * n)
+static void
+evaluate_tree_no_missing (union any_node **node)
{
- static char *strbuf;
- int add;
- int len;
+ struct nonterm_node *n = &(*node)->nonterm;
+ double num[3];
+ char *str[3];
+ size_t str_len[3];
int i;
- if (!strbuf)
- strbuf = xmalloc (256);
errno = 0;
+ for (i = 0; i < n->n && i < 3; i++)
+ {
+ union any_node *arg = n->arg[i];
+
+ if (arg->type == OP_NUM_CON)
+ num[i] = arg->num_con.value;
+ else if (arg->type == OP_STR_CON)
+ {
+ str[i] = arg->str_con.s;
+ str_len[i] = arg->str_con.len;
+ }
+ }
+
switch (n->type)
{
- case OP_PLUS:
+ case OP_ADD:
+ set_number (node, num[0] + num[1]);
+ break;
+
+ case OP_SUB:
+ set_number (node, num[0] - num[1]);
+ break;
+
case OP_MUL:
- return optimize_tree (n);
+ set_number (node, num[0] * num[1]);
+ break;
+
+ case OP_DIV:
+ if (num[1] != 0.)
+ set_number (node, num[0] / num[1]);
+ break;
case OP_POW:
- if (n0 == 0.0 && n1 == 0.0)
- frnc (SYSMIS);
- else if (n0 == SYSMIS && n1 == 0.0)
- frnc (1.0);
- else if (n0 == 0.0 && n1 == SYSMIS)
- frnc (0.0);
+ if (num[0] == 0. && num[1] == 0.)
+ set_number (node, SYSMIS);
else
- rnc (pow (n0, n1));
+ set_number_errno (node, pow (num[0], num[1]));
break;
case OP_AND:
- if (n0 == 0.0 || n1 == 0.0)
- frnc (0.0);
- else if (n0 == SYSMIS || n1 == SYSMIS)
- frnc (SYSMIS);
- else
- frnc (1.0);
+ set_number (node, num[0] && num[1]);
break;
+
case OP_OR:
- if (n0 == 1.0 || n1 == 1.0)
- frnc (1.0);
- else if (n0 == SYSMIS || n1 == SYSMIS)
- frnc (SYSMIS);
- else
- frnc (0.0);
+ set_number (node, num[0] || num[1]);
break;
+
case OP_NOT:
- rnc (n0 == 0.0 ? 1.0 : 0.0);
+ set_number (node, !num[0]);
break;
case OP_EQ:
- rnc (n0 == n1);
+ set_number (node, num[0] == num[1]);
break;
case OP_GE:
- rnc (n0 >= n1);
+ set_number (node, num[0] >= num[1]);
break;
case OP_GT:
- rnc (n0 > n1);
+ set_number (node, num[0] > num[1]);
break;
case OP_LE:
- rnc (n0 <= n1);
+ set_number (node, num[0] <= num[1]);
break;
case OP_LT:
- rnc (n0 < n1);
+ set_number (node, num[0] < num[1]);
break;
case OP_NE:
- rnc (n0 != n1);
+ set_number (node, num[0] != num[1]);
break;
/* String operators. */
- case OP_STRING_EQ:
- rnc (st_compare_pad (s0, s0l, s1, s1l) == 0);
+ case OP_EQ_STRING:
+ set_number (node, st_compare_pad (str[0], str_len[0],
+ str[1], str_len[1]) == 0);
break;
- case OP_STRING_GE:
- rnc (st_compare_pad (s0, s0l, s1, s1l) >= 0);
+ case OP_GE_STRING:
+ set_number (node, st_compare_pad (str[0], str_len[0],
+ str[1], str_len[1]) >= 0);
break;
- case OP_STRING_GT:
- rnc (st_compare_pad (s0, s0l, s1, s1l) > 0);
+ case OP_GT_STRING:
+ set_number (node, st_compare_pad (str[0], str_len[0],
+ str[1], str_len[1]) > 0);
break;
- case OP_STRING_LE:
- rnc (st_compare_pad (s0, s0l, s1, s1l) <= 0);
+ case OP_LE_STRING:
+ set_number (node, st_compare_pad (str[0], str_len[0],
+ str[1], str_len[1]) <= 0);
break;
- case OP_STRING_LT:
- rnc (st_compare_pad (s0, s0l, s1, s1l) < 0);
+ case OP_LT_STRING:
+ set_number (node, st_compare_pad (str[0], str_len[0],
+ str[1], str_len[1]) < 0);
break;
- case OP_STRING_NE:
- rnc (st_compare_pad (s0, s0l, s1, s1l) != 0);
+ case OP_NE_STRING:
+ set_number (node, st_compare_pad (str[0], str_len[0],
+ str[1], str_len[1]) != 0);
break;
/* Unary functions. */
case OP_NEG:
- rnc (-n0);
+ set_number (node, -num[0]);
break;
case OP_ABS:
- rnc (fabs (n0));
+ set_number (node, fabs (num[0]));
break;
case OP_ARCOS:
- rnc (acos (n0));
+ set_number_errno (node, acos (num[0]));
break;
case OP_ARSIN:
- rnc (asin (n0));
+ set_number_errno (node, asin (num[0]));
break;
case OP_ARTAN:
- rnc (atan (n0));
+ set_number_errno (node, atan (num[0]));
break;
case OP_COS:
- rnc (cos (n0));
+ set_number_errno (node, cos (num[0]));
break;
case OP_EXP:
- rnc (exp (n0));
+ set_number_errno (node, exp (num[0]));
break;
case OP_LG10:
- rnc (log10 (n0));
+ set_number_errno (node, log10 (num[0]));
break;
case OP_LN:
- rnc (log (n0));
+ set_number_errno (node, log (num[0]));
break;
case OP_MOD10:
- rnc (fmod (n0, 10));
+ set_number_errno (node, fmod (num[0], 10));
break;
case OP_RND:
- rnc (n0 >= 0.0 ? floor (n0 + 0.5) : -floor (-n0 + 0.5));
+ if (num[0] >= 0.0)
+ set_number_errno (node, floor (num[0] + 0.5));
+ else
+ set_number_errno (node, -floor (-num[0] + 0.5));
break;
case OP_SIN:
- rnc (sin (n0));
+ set_number_errno (node, sin (num[0]));
break;
case OP_SQRT:
- rnc (sqrt (n0));
+ set_number_errno (node, sqrt (num[0]));
break;
case OP_TAN:
- rnc (tan (n0));
+ set_number_errno (node, tan (num[0]));
break;
case OP_TRUNC:
- rnc (n0 >= 0.0 ? floor (n0) : -floor (-n0));
+ if (num[0] >= 0.0)
+ set_number_errno (node, floor (num[0]));
+ else
+ set_number_errno (node, -floor (-num[0]));
break;
/* N-ary numeric functions. */
case OP_ANY:
- if (n0 == SYSMIS)
- frnc (SYSMIS);
- else
- {
- int sysmis = 1;
- double ni;
-
- for (i = 1; i < n->n; i++)
- {
- ni = n->arg[i]->num_con.value;
- if (n0 == ni)
- {
- frnc (1.0);
- goto any_done;
- }
- if (ni != SYSMIS)
- sysmis = 0;
- }
- frnc (sysmis ? SYSMIS : 0.0);
- }
- any_done:
- break;
- case OP_ANY_STRING:
- for (i = 1; i < n->n; i++)
- if (!st_compare_pad (n->arg[0]->str_con.s, n->arg[0]->str_con.len,
- n->arg[i]->str_con.s, n->arg[i]->str_con.len))
- {
- frnc (1.0);
- goto any_string_done;
- }
- frnc (0.0);
- any_string_done:
+ {
+ double result = 0.0;
+ for (i = 1; i < n->n; i++)
+ if (num[0] == n->arg[i]->num_con.value)
+ {
+ result = 1.0;
+ break;
+ }
+ set_number (node, result);
+ }
+ break;
+ case OP_ANY_STRING:
+ {
+ double result = 0.0;
+ for (i = 1; i < n->n; i++)
+ if (!st_compare_pad (n->arg[0]->str_con.s, n->arg[0]->str_con.len,
+ n->arg[i]->str_con.s, n->arg[i]->str_con.len))
+ {
+ result = 1.0;
+ break;
+ }
+ set_number (node, result);
+ }
break;
case OP_CFVAR:
case OP_SD:
case OP_SUM:
case OP_VARIANCE:
+ /* FIXME */
+ break;
+
+ case OP_RANGE:
{
- double d[2] =
- {0.0, 0.0}; /* sum, sum of squares */
- double min = DBL_MAX; /* minimum value */
- double max = -DBL_MAX; /* maximum value */
- double ni; /* value of i'th argument */
- int nv = 0; /* number of valid arguments */
-
- for (i = 0; i < n->n; i++)
- {
- ni = n->arg[i]->num_con.value;
- if (ni != SYSMIS)
- {
- nv++;
- d[0] += ni;
- d[1] += ni * ni;
- if (ni < min)
- min = ni;
- if (ni > max)
- max = ni;
- }
- }
- if (n->type == OP_NMISS)
- frnc (i - nv);
- else if (n->type == OP_NVALID)
- frnc (nv);
- else if (nv >= (int) n->arg[i])
- {
- switch (n->type)
- {
- case OP_CFVAR:
- frnc (calc_cfvar (d, nv));
- break;
- case OP_MAX:
- frnc (max);
- break;
- case OP_MEAN:
- frnc (calc_mean (d, nv));
- break;
- case OP_MIN:
- frnc (min);
- break;
- case OP_SD:
- frnc (calc_stddev (calc_variance (d, nv)));
- break;
- case OP_SUM:
- frnc (d[0]);
- break;
- case OP_VARIANCE:
- frnc (calc_variance (d, nv));
- break;
- }
- }
- else
- frnc (SYSMIS);
+ double result = 0.0;
+
+ for (i = 1; i < n->n; i += 2)
+ {
+ double min = n->arg[i]->num_con.value;
+ double max = n->arg[i + 1]->num_con.value;
+ if (num[0] >= min && num[0] <= max)
+ {
+ result = 1.0;
+ break;
+ }
+ }
+ set_number (node, result);
}
break;
- case OP_RANGE:
- if (n0 == SYSMIS)
- frnc (SYSMIS);
- else
- {
- double min, max;
- int sysmis = 1;
-
- for (i = 1; i < n->n; i += 2)
- {
- min = n->arg[i]->num_con.value;
- max = n->arg[i + 1]->num_con.value;
- if (min == SYSMIS || max == SYSMIS)
- continue;
- sysmis = 0;
- if (n0 >= min && n0 <= max)
- {
- frnc (1.0);
- goto range_done;
- }
- }
- frnc (sysmis ? SYSMIS : 0.0);
- }
- range_done:
- break;
- case OP_RANGE_STRING:
- for (i = 1; i < n->n; i += 2)
- if (st_compare_pad (n->arg[0]->str_con.s, n->arg[0]->str_con.len,
- n->arg[i]->str_con.s, n->arg[i]->str_con.len) >= 0
- && st_compare_pad (n->arg[0]->str_con.s, n->arg[0]->str_con.len,
- n->arg[i + 1]->str_con.s,
- n->arg[i + 1]->str_con.len) <= 0)
- {
- frnc (1.0);
- goto range_str_done;
- }
- frnc (0.0);
- range_str_done:
- break;
- /* Time function. */
+ case OP_RANGE_STRING:
+ {
+ double result = 0.0;
+
+ for (i = 1; i < n->n; i += 2)
+ {
+ const char *min = n->arg[i]->str_con.s;
+ size_t min_len = n->arg[i]->str_con.len;
+ const char *max = n->arg[i + 1]->str_con.s;
+ size_t max_len = n->arg[i + 1]->str_con.len;
+
+ if (st_compare_pad (str[0], str_len[0], min, min_len) >= 0
+ && st_compare_pad (str[0], str_len[0], max, max_len) <= 0)
+ {
+ result = 1.0;
+ break;
+ }
+ }
+ set_number (node, result);
+ break;
+ }
+
+ /* Time functions. */
case OP_TIME_HMS:
- rnc (60. * (60. * n0 + n1) + n2);
+ {
+ double min, max;
+ min = min (num[0], min (num[1], num[2]));
+ max = max (num[0], max (num[1], num[2]));
+ if (min < 0. && max > 0.)
+ break;
+ set_number (node, 60. * (60. * num[0] + num[1]) + num[2]);
+ }
+ break;
+ case OP_CTIME_DAYS:
+ set_number (node, num[0] / (60. * 60. * 24.));
+ break;
+ case OP_CTIME_HOURS:
+ set_number (node, num[0] / (60. * 60.));
+ break;
+ case OP_CTIME_MINUTES:
+ set_number (node, num[0] / 60.);
+ break;
+ case OP_TIME_DAYS:
+ set_number (node, num[0] * (60. * 60. * 24.));
+ break;
+ case OP_CTIME_SECONDS:
+ set_number (node, num[0]);
break;
/* Date construction functions. */
case OP_DATE_DMY:
- rnc (60. * 60. * 24. * yrmoda (n2, n1, n0));
+ set_number (node, 60. * 60. * 24. * yrmoda (num[2], num[1], num[0]));
break;
case OP_DATE_MDY:
- rnc (60. * 60. * 24. * yrmoda (n2, n0, n1));
+ set_number (node, 60. * 60. * 24. * yrmoda (num[2], num[0], num[1]));
break;
case OP_DATE_MOYR:
- rnc (60. * 60. * 24. * yrmoda (n1, n0, 1));
+ set_number (node, 60. * 60. * 24. * yrmoda (num[1], num[0], 1));
break;
case OP_DATE_QYR:
- rnc (60. * 60. * 24. * yrmoda (n1, 3 * (int) n0 - 2, 1));
+ set_number (node,
+ 60. * 60. * 24. * yrmoda (num[1], 3 * (int) num[0] - 2, 1));
break;
case OP_DATE_WKYR:
{
- double t = yrmoda (n1, 1, 1);
+ double t = yrmoda (num[1], 1, 1);
+ if (num[0] < 0. || num[0] > 53.)
+ break;
if (t != SYSMIS)
- t = 60. * 60. * 24. * (t + 7. * (n0 - 1));
- rnc (t);
+ t = 60. * 60. * 24. * (t + 7. * (num[0] - 1));
+ set_number (node, t);
}
break;
case OP_DATE_YRDAY:
{
- double t = yrmoda (n0, 1, 1);
+ double t = yrmoda (num[0], 1, 1);
if (t != SYSMIS)
- t = 60. * 60. * 24. * (t + n0 - 1);
- rnc (t);
+ t = 60. * 60. * 24. * (t + num[1] - 1);
+ set_number (node, t);
}
break;
case OP_YRMODA:
- rnc (yrmoda (n0, n1, n2));
+ set_number (node, yrmoda (num[0], num[1], num[2]));
break;
+
/* Date extraction functions. */
case OP_XDATE_DATE:
- rnc (floor (n0 / 60. / 60. / 24.) * 60. * 60. * 24.);
+ set_number_errno (node,
+ floor (num[0] / 60. / 60. / 24.) * 60. * 60. * 24.);
break;
case OP_XDATE_HOUR:
- rnc (fmod (floor (n0 / 60. / 60.), 24.));
+ set_number_errno (node, fmod (floor (num[0] / 60. / 60.), 24.));
break;
case OP_XDATE_JDAY:
- rnc (julian_to_jday (n0 / 86400.));
+ set_number (node, julian_to_jday (num[0] / 86400.));
break;
case OP_XDATE_MDAY:
{
int day;
- julian_to_calendar (n0 / 86400., NULL, NULL, &day);
- rnc (day);
+ julian_to_calendar (num[0] / 86400., NULL, NULL, &day);
+ set_number (node, day);
}
break;
case OP_XDATE_MINUTE:
- rnc (fmod (floor (n0 / 60.), 60.));
+ set_number_errno (node, fmod (floor (num[0] / 60.), 60.));
break;
case OP_XDATE_MONTH:
{
int month;
- julian_to_calendar (n0 / 86400., NULL, &month, NULL);
- rnc (month);
+ julian_to_calendar (num[0] / 86400., NULL, &month, NULL);
+ set_number (node, month);
}
break;
case OP_XDATE_QUARTER:
{
int month;
- julian_to_calendar (n0 / 86400., NULL, &month, NULL);
- rnc ((month - 1) / 3 + 1);
+ julian_to_calendar (num[0] / 86400., NULL, &month, NULL);
+ set_number (node, (month - 1) / 3 + 1);
}
break;
case OP_XDATE_SECOND:
- rnc (fmod (n0, 60.));
+ set_number_errno (node, fmod (num[0], 60.));
break;
case OP_XDATE_TDAY:
- rnc (floor (n0 / 60. / 60. / 24.));
+ set_number_errno (node, floor (num[0] / 60. / 60. / 24.));
break;
case OP_XDATE_TIME:
- rnc (n0 - floor (n0 / 60. / 60. / 24.) * 60. * 60. * 24.);
+ set_number_errno (node, num[0] - (floor (num[0] / 60. / 60. / 24.)
+ * 60. * 60. * 24.));
break;
case OP_XDATE_WEEK:
- rnc ((julian_to_jday (n0) - 1) / 7 + 1);
+ set_number (node, (julian_to_jday (num[0]) - 1) / 7 + 1);
break;
case OP_XDATE_WKDAY:
- rnc (julian_to_wday (n0));
+ set_number (node, julian_to_wday (num[0]));
break;
case OP_XDATE_YEAR:
{
int year;
- julian_to_calendar (n0 / 86400., &year, NULL, NULL);
- rnc (year);
+ julian_to_calendar (num[0] / 86400., &year, NULL, NULL);
+ set_number (node, year);
}
break;
/* String functions. */
case OP_CONCAT:
{
- len = s0l;
- memcpy (strbuf, s0, len);
+ char string[256];
+ int length = str_len[0];
+ memcpy (string, str[0], length);
for (i = 1; i < n->n; i++)
{
- add = sl (i);
- if (add + len > 255)
- add = 255 - len;
- memcpy (&strbuf[len], s (i), add);
- len += add;
+ int add = n->arg[i]->str_con.len;
+ if (add + length > 255)
+ add = 255 - length;
+ memcpy (&string[length], n->arg[i]->str_con.s, add);
+ length += add;
}
- n = repl_str_con (n, strbuf, len);
+ set_string (node, string, length);
}
break;
- case OP_INDEX:
- rnc (s1l ? str_search (s0, s0l, s1, s1l) : SYSMIS);
- break;
- case OP_INDEX_OPT:
- if (n2 == SYSMIS || (int) n2 <= 0 || s1l % (int) n2)
- {
- msg (SW, _("While optimizing a constant expression, there was "
- "a bad value for the third argument to INDEX."));
- frnc (SYSMIS);
- }
- else
- {
- int pos = 0;
- int c = s1l / (int) n2;
- int r;
-
- for (i = 0; i < c; i++)
- {
- r = str_search (s0, s0l, s (i), sl (i));
- if (r < pos || pos == 0)
- pos = r;
- }
- frnc (pos);
- }
- break;
- case OP_RINDEX:
- rnc (str_rsearch (s0, s0l, s1, s1l));
- break;
- case OP_RINDEX_OPT:
- if (n2 == SYSMIS || (int) n2 <= 0 || s1l % (int) n2)
- {
- msg (SE, _("While optimizing a constant expression, there was "
- "a bad value for the third argument to RINDEX."));
- frnc (SYSMIS);
- }
- else
- {
- int pos = 0;
- int c = s1l / n2;
- int r;
-
- for (i = 0; i < c; i++)
- {
- r = str_rsearch (s0, s0l, s (i), sl (i));
- if (r > pos)
- pos = r;
- }
- frnc (pos);
- }
+ case OP_INDEX_2:
+ case OP_INDEX_3:
+ case OP_RINDEX_2:
+ case OP_RINDEX_3:
+ {
+ int result, chunk_width, chunk_cnt;
+
+ if (n->type == OP_INDEX_2 || n->type == OP_RINDEX_2)
+ chunk_width = str_len[1];
+ else
+ chunk_width = num[2];
+ if (chunk_width <= 0 || chunk_width > str_len[1]
+ || str_len[1] % chunk_width != 0)
+ break;
+ chunk_cnt = str_len[1] / chunk_width;
+
+ result = 0;
+ for (i = 0; i < chunk_cnt; i++)
+ {
+ const char *chunk = str[1] + chunk_width * i;
+ int ofs;
+ if (n->type == OP_INDEX_2 || n->type == OP_INDEX_3)
+ {
+ ofs = str_search (str[0], str_len[0], chunk, chunk_width);
+ if (ofs < result || result == 0)
+ result = ofs;
+ }
+ else
+ {
+ ofs = str_rsearch (str[0], str_len[0], chunk, chunk_width);
+ if (ofs > result)
+ result = ofs;
+ }
+ }
+ set_number (node, result);
+ }
break;
case OP_LENGTH:
- frnc (s0l);
+ set_number (node, str_len[0]);
break;
case OP_LOWER:
{
char *cp;
- for (cp = &s0[s0l]; cp >= s0; cp--)
- *cp = tolower ((unsigned char) (*cp));
- n = repl_str_con (n, s0, s0l);
+ for (cp = str[0]; cp < str[0] + str_len[0]; cp++)
+ *cp = tolower ((unsigned char) *cp);
}
break;
case OP_UPPER:
{
char *cp;
- for (cp = &s0[s0[0] + 1]; cp > s0; cp--)
- *cp = toupper ((unsigned char) (*cp));
- n = repl_str_con (n, s0, s0l);
+ for (cp = str[0]; cp < str[0] + str_len[0]; cp++)
+ *cp = toupper ((unsigned char) *cp);
}
break;
case OP_LPAD:
- case OP_LPAD_OPT:
case OP_RPAD:
- case OP_RPAD_OPT:
{
- int c;
-
- if (n1 == SYSMIS)
- {
- n = repl_str_con (n, NULL, 0);
- break;
- }
- len = n1;
- len = range (len, 1, 255);
- add = max (n1 - s0l, 0);
-
- if (n->type == OP_LPAD_OPT || n->type == OP_RPAD_OPT)
- {
- if (s2l < 1)
- {
- c = n->type == OP_LPAD_OPT ? 'L' : 'R';
- msg (SE, _("Third argument to %cPAD() must be at least one "
- "character in length."), c);
- c = ' ';
- }
- else
- c = s2[0];
- }
- else
- c = ' ';
-
- if (n->type == OP_LPAD || n->type == OP_LPAD_OPT)
- memmove (&s0[add], s0, len);
- if (n->type == OP_LPAD || n->type == OP_LPAD_OPT)
- memset (s0, c, add);
- else
- memset (&s0[s0l], c, add);
-
- n = repl_str_con (n, s0, len);
+ char string[256];
+ int len, pad_len;
+ char pad_char;
+
+ /* Target length. */
+ len = num[1];
+ if (len < 1 || len > 255)
+ break;
+
+ /* Pad character. */
+ if (str_len[2] != 1)
+ break;
+ pad_char = str[2][0];
+
+ if (str_len[0] >= len)
+ len = str_len[0];
+ pad_len = len - str_len[0];
+ if (n->type == OP_LPAD)
+ {
+ memset (string, pad_char, pad_len);
+ memcpy (string + pad_len, str[0], str_len[0]);
+ }
+ else
+ {
+ memcpy (string, str[0], str_len[0]);
+ memset (string + str_len[0], pad_char, pad_len);
+ }
+
+ set_string (node, string, len);
}
break;
case OP_LTRIM:
- case OP_LTRIM_OPT:
case OP_RTRIM:
- case OP_RTRIM_OPT:
{
- int c = ' ';
- char *cp = s0;
-
- if (n->type == OP_LTRIM_OPT || n->type == OP_RTRIM_OPT)
- {
- if (s1l < 1)
- {
- c = n->type == OP_LTRIM_OPT ? 'L' : 'R';
- msg (SE, _("Second argument to %cTRIM() must be at least one "
- "character in length."), c);
- }
- else
- c = s1[0];
- }
- len = s0l;
- if (n->type == OP_LTRIM || n->type == OP_LTRIM_OPT)
- {
- while (*cp == c && cp < &s0[len])
- cp++;
- len -= cp - s0;
- }
+ char pad_char;
+ const char *cp = str[0];
+ int len = str_len[0];
+
+ /* Pad character. */
+ if (str_len[1] != 1)
+ break;
+ pad_char = str[1][0];
+
+ if (n->type == OP_LTRIM)
+ while (len > 0 && *cp == pad_char)
+ cp++, len--;
else
- while (len > 0 && s0[len - 1] == c)
+ while (len > 0 && str[0][len - 1] == pad_char)
len--;
- n = repl_str_con (n, cp, len);
- }
- break;
- case OP_NUMBER:
- case OP_NUMBER_OPT:
- {
- union value v;
- struct data_in di;
-
- di.s = s0;
- di.e = s0 + s0l;
- di.v = &v;
- di.flags = DI_IGNORE_ERROR;
- di.f1 = 1;
-
- if (n->type == OP_NUMBER_OPT)
- {
- di.format.type = (int) n->arg[1];
- di.format.w = (int) n->arg[2];
- di.format.d = (int) n->arg[3];
- }
- else
- {
- di.format.type = FMT_F;
- di.format.w = s0l;
- di.format.d = 0;
- }
-
- data_in (&di);
- frnc (v.f);
- }
- break;
- case OP_STRING:
- {
- union value v;
- struct fmt_spec f;
- f.type = (int) n->arg[1];
- f.w = (int) n->arg[2];
- f.d = (int) n->arg[3];
- v.f = n0;
-
- assert ((formats[f.type].cat & FCAT_STRING) == 0);
- data_out (strbuf, &f, &v);
- n = repl_str_con (n, strbuf, f.w);
+ set_string (node, cp, len);
}
break;
- case OP_SUBSTR:
- case OP_SUBSTR_OPT:
+ case OP_SUBSTR_2:
+ case OP_SUBSTR_3:
{
- int pos = (int) n1;
- if (pos > s0l || pos <= 0 || n1 == SYSMIS
- || (n->type == OP_SUBSTR_OPT && n2 == SYSMIS))
- n = repl_str_con (n, NULL, 0);
+ int pos = (int) num[1];
+ if (pos > str_len[0] || pos <= 0 || num[1] == SYSMIS
+ || (n->type == OP_SUBSTR_3 && num[2] == SYSMIS))
+ set_string (node, NULL, 0);
else
{
- if (n->type == OP_SUBSTR_OPT)
+ int len;
+ if (n->type == OP_SUBSTR_3)
{
- len = (int) n2;
- if (len + pos - 1 > s0l)
- len = s0l - pos + 1;
+ len = (int) num[2];
+ if (len + pos - 1 > str_len[0])
+ len = str_len[0] - pos + 1;
}
else
- len = s0l - pos + 1;
- n = repl_str_con (n, &s0[pos - 1], len);
+ len = str_len[0] - pos + 1;
+ set_string (node, &str[0][pos - 1], len);
}
}
break;
/* Weirdness. */
- case OP_INV:
- rnc (1.0 / n0);
- break;
case OP_MOD:
- if (n0 == 0.0 && n1 == SYSMIS)
- frnc (0.0);
+ if (num[0] == 0.0 && num[1] == SYSMIS)
+ set_number (node, 0.0);
else
- rnc (fmod (n0, n1));
+ set_number (node, fmod (num[0], num[1]));
break;
case OP_NUM_TO_BOOL:
- if (n0 == 0.0)
- n0 = 0.0;
- else if (n0 == 1.0)
- n0 = 1.0;
- else if (n0 != SYSMIS)
+ if (num[0] == 0.0)
+ num[0] = 0.0;
+ else if (num[0] == 1.0)
+ num[0] = 1.0;
+ else if (num[0] != SYSMIS)
{
msg (SE, _("When optimizing a constant expression, an integer "
"that was being used as an Boolean value was found "
"to have a constant value other than 0, 1, or SYSMIS."));
- n0 = 0.0;
+ num[0] = 0.0;
}
- rnc (n0);
+ set_number (node, num[0]);
break;
}
- return n;
}
-#undef n0
-#undef n1
-#undef n2
-
-#undef s0
-#undef s0l
-#undef s1
-#undef s1l
-#undef s2
-#undef s2l
-#undef s
-#undef sl
-
-#undef rnc
-#undef frnc
+static void
+evaluate_tree_with_missing (union any_node **node UNUSED, size_t count UNUSED)
+{
+ /* FIXME */
+}
-static struct nonterm_node *
-repl_num_con (struct nonterm_node * n, double d)
+static void
+collapse_node (union any_node **node, size_t child_idx)
{
- int i;
- if (!finite (d) || errno)
- d = SYSMIS;
- else
- for (i = 0; i < n->n; i++)
- if (n->arg[i]->type == OP_NUM_CON && n->arg[i]->num_con.value == SYSMIS)
- {
- d = SYSMIS;
- break;
- }
- return force_repl_num_con (n, d);
+ struct nonterm_node *nonterm = &(*node)->nonterm;
+ union any_node *child;
+
+ child = nonterm->arg[child_idx];
+ nonterm->arg[child_idx] = NULL;
+ free_node (*node);
+ *node = child;
}
-static struct nonterm_node *
-force_repl_num_con (struct nonterm_node * n, double d)
+
+static void
+set_number (union any_node **node, double value)
{
struct num_con_node *num;
+
+ free_node (*node);
- if (!finite (d) || errno)
- d = SYSMIS;
- free_node ((union any_node *) n);
- num = xmalloc (sizeof *num);
+ *node = xmalloc (sizeof *num);
+ num = &(*node)->num_con;
num->type = OP_NUM_CON;
- num->value = d;
- return (struct nonterm_node *) num;
+ num->value = finite (value) ? value : SYSMIS;
}
-static struct nonterm_node *
-repl_str_con (struct nonterm_node * n, char *s, int len)
+static void
+set_number_errno (union any_node **node, double value)
+{
+ if (errno == EDOM || errno == ERANGE)
+ value = SYSMIS;
+ set_number (node, value);
+}
+
+static void
+set_string (union any_node **node, const char *string, size_t length)
{
struct str_con_node *str;
/* The ordering here is important since the source string may be
part of a subnode of n. */
- str = xmalloc (sizeof *str + len - 1);
+ str = xmalloc (sizeof *str + length - 1);
str->type = OP_STR_CON;
- str->len = len;
- memcpy (str->s, s, len);
- free_node ((union any_node *) n);
- return (struct nonterm_node *) str;
+ str->len = length;
+ memcpy (str->s, string, length);
+ free_node (*node);
+ *node = (union any_node *) str;
}
/* Returns the number of days since 10 Oct 1582 for the date
month = floor (month + EPSILON);
day = floor (day + EPSILON);
- if (year >= 0. && year <= 199.)
+ if (year >= 0. && year <= 29.)
+ year += 2000.;
+ else if (year >= 30. && year <= 99.)
year += 1900.;
if ((year < 1582. || year > 19999.)
|| (year == 1582. && (month < 10. || (month == 10. && day < 15.)))
- || (month < -1 || month > 13)
- || (day < -1 || day > 32))
+ || (month < 0 || month > 13)
+ || (day < 0 || day > 31))
return SYSMIS;
return calendar_to_julian (year, month, day);
}
static void
dump_node (struct expr_dump_state *eds, union any_node * n)
{
- if (n->type == OP_AND || n->type == OP_OR)
- {
- int i;
-
- dump_node (eds, n->nonterm.arg[0]);
- for (i = 1; i < n->nonterm.n; i++)
- {
- dump_node (eds, n->nonterm.arg[i]);
- emit (eds, n->type);
- }
- return;
- }
- else if (n->type < OP_TERMINAL)
+ if (IS_NONTERMINAL (n->type))
{
int i;
for (i = 0; i < n->nonterm.n; i++)
emit (eds, (int) n->nonterm.arg[n->nonterm.n + 1]);
emit (eds, (int) n->nonterm.arg[n->nonterm.n + 2]);
}
- return;
}
-
- emit (eds, n->type);
- if (n->type == OP_NUM_CON)
- emit_num_con (eds, n->num_con.value);
- else if (n->type == OP_STR_CON)
- emit_str_con (eds, n->str_con.s, n->str_con.len);
- else if (n->type == OP_NUM_VAR || n->type == OP_STR_VAR
- || n->type == OP_STR_MIS)
- emit_var (eds, n->var.v);
- else if (n->type == OP_NUM_LAG || n->type == OP_STR_LAG)
+ else
{
- emit_var (eds, n->lag.v);
- emit (eds, n->lag.lag);
+ emit (eds, n->type);
+ if (n->type == OP_NUM_CON)
+ emit_num_con (eds, n->num_con.value);
+ else if (n->type == OP_STR_CON)
+ emit_str_con (eds, n->str_con.s, n->str_con.len);
+ else if (n->type == OP_NUM_VAR || n->type == OP_STR_VAR)
+ emit_var (eds, n->var.v);
+ else if (n->type == OP_NUM_LAG || n->type == OP_STR_LAG)
+ {
+ emit_var (eds, n->lag.v);
+ emit (eds, n->lag.lag);
+ }
+ else if (n->type == OP_NUM_SYS || n->type == OP_NUM_VAL)
+ emit (eds, n->var.v->fv);
+ else
+ assert (n->type == OP_CASENUM);
}
- else if (n->type == OP_NUM_SYS || n->type == OP_NUM_VAL)
- emit (eds, n->var.v->fv);
- else
- assert (n->type == OP_CASENUM);
}
static void
#include "error.h"
#include "lexer.h"
#include "misc.h"
+#include "settings.h"
#include "str.h"
#include "var.h"
#include "vfm.h"
\f
/* Declarations. */
-/* Lowest precedence. */
-static int parse_or (union any_node **n);
-static int parse_and (union any_node **n);
-static int parse_not (union any_node **n);
-static int parse_rel (union any_node **n);
-static int parse_add (union any_node **n);
-static int parse_mul (union any_node **n);
-static int parse_neg (union any_node **n);
-static int parse_exp (union any_node **n);
-static int parse_primary (union any_node **n);
-static int parse_function (union any_node **n);
-/* Highest precedence. */
+/* Recursive descent parser in order of increasing precedence. */
+typedef enum expr_type parse_recursively_func (union any_node **);
+static parse_recursively_func parse_or, parse_and, parse_not;
+static parse_recursively_func parse_rel, parse_add, parse_mul;
+static parse_recursively_func parse_neg, parse_exp;
+static parse_recursively_func parse_primary, parse_function;
/* Utility functions. */
-static const char *expr_type_name (int type);
-static const char *type_name (int type);
+static const char *expr_type_name (enum expr_type type);
+static const char *var_type_name (int var_type);
static void make_bool (union any_node **n);
static union any_node *allocate_nonterminal (int op, union any_node *n);
-static union any_node *append_nonterminal_arg (union any_node *,
- union any_node *);
-static int type_check (union any_node **n, int type, int flags);
+static union any_node *allocate_binary_nonterminal (int op, union any_node *,
+ union any_node *);
+static union any_node *allocate_num_con (double value);
+static union any_node *allocate_str_con (const char *string, size_t length);
+static union any_node *allocate_var_node (int type, struct variable *);
+static int type_check (union any_node **n,
+ enum expr_type actual_type,
+ enum expr_type expected_type);
static algo_compare_func compare_functions;
static void init_func_tab (void);
#if DEBUGGING
static void debug_print_tree (union any_node *, int);
#endif
-
-#if GLOBAL_DEBUGGING
-static void debug_print_postfix (struct expression *);
-#endif
\f
/* Public functions. */
}
struct expression *
-expr_parse (int flags)
+expr_parse (enum expr_type expected_type)
{
struct expression *e;
union any_node *n;
- int type;
+ enum expr_type actual_type;
+ int optimize = (expected_type & EXPR_NO_OPTIMIZE) == 0;
+
+ expected_type &= ~EXPR_NO_OPTIMIZE;
/* Make sure the table of functions is initialized. */
init_func_tab ();
/* Parse the expression. */
- type = parse_or (&n);
- if (type == EX_ERROR)
+ actual_type = parse_or (&n);
+ if (actual_type == EXPR_ERROR)
return NULL;
/* Enforce type rules. */
- if (!type_check (&n, type, flags))
+ if (!type_check (&n, actual_type, expected_type))
{
free_node (n);
return NULL;
}
/* Optimize the expression as best we can. */
- n = (union any_node *) optimize_expression ((struct nonterm_node *) n);
+ if (optimize)
+ optimize_expression (&n);
/* Dump the tree-based expression to a postfix representation for
best evaluation speed, and destroy the tree. */
e = xmalloc (sizeof *e);
- e->type = type;
+ e->type = actual_type;
dump_expression (n, e);
free_node (n);
- /* If we're debugging or the user requested it, print the postfix
- representation. */
-#if DEBUGGING
- debug_print_postfix (e);
-#endif
-
return e;
}
+/* Returns the type of EXPR. */
+enum expr_type
+expr_get_type (const struct expression *expr)
+{
+ assert (expr != NULL);
+ return expr->type;
+}
+
static int
-type_check (union any_node **n, int type, int flags)
+type_check (union any_node **n, enum expr_type actual_type, enum expr_type expected_type)
{
- /* Enforce PXP_BOOLEAN flag. */
- if (flags & PXP_BOOLEAN)
+ switch (expected_type)
{
- if (type == EX_STRING)
+ case EXPR_BOOLEAN:
+ case EXPR_NUMERIC:
+ if (actual_type == EXPR_STRING)
{
- msg (SE, _("A string expression was supplied in a place "
- "where a Boolean expression was expected."));
+ msg (SE, _("Type mismatch: expression has string type, "
+ "but a numeric value is required here."));
return 0;
}
- else if (type == EX_NUMERIC)
+ if (actual_type == EXPR_NUMERIC && expected_type == EXPR_BOOLEAN)
*n = allocate_nonterminal (OP_NUM_TO_BOOL, *n);
- }
-
- /* Enforce PXP_NUMERIC flag. */
- if ((flags & PXP_NUMERIC) && (type != EX_NUMERIC))
- {
- msg (SE, _("A numeric expression was expected in a place "
- "where one was not supplied."));
- return 0;
- }
+ break;
+
+ case EXPR_STRING:
+ if (actual_type != EXPR_STRING)
+ {
+ msg (SE, _("Type mismatch: expression has numeric type, "
+ "but a string value is required here."));
+ return 0;
+ }
+ break;
+
+ case EXPR_ANY:
+ break;
- /* Enforce PXP_STRING flag. */
- if ((flags & PXP_STRING) && (type != EX_STRING))
- {
- msg (SE, _("A string expression was expected in a place "
- "where one was not supplied."));
- return 0;
+ default:
+ assert (0);
}
-
+
return 1;
}
\f
/* Recursive-descent expression parser. */
-/* Parses the OR level. */
+/* Coerces *NODE, of type ACTUAL_TYPE, to type REQUIRED_TYPE, and
+ returns success. If ACTUAL_TYPE cannot be coerced to the
+ desired type then we issue an error message about operator
+ OPERATOR_NAME and free *NODE. */
static int
-parse_or (union any_node **n)
+type_coercion (enum expr_type actual_type, enum expr_type required_type,
+ union any_node **node,
+ const char *operator_name)
{
- char typ[] = N_("The OR operator cannot take string operands.");
- union any_node *c;
- int type;
+ assert (required_type == EXPR_NUMERIC
+ || required_type == EXPR_BOOLEAN
+ || required_type == EXPR_STRING);
- type = parse_and (n);
- if (type == EX_ERROR || token != T_OR)
- return type;
- if (type == EX_STRING)
+ if (actual_type == required_type)
{
- free_node (*n);
- msg (SE, gettext (typ));
+ /* Type match. */
+ return 1;
+ }
+ else if (actual_type == EXPR_ERROR)
+ {
+ /* Error already reported. */
+ *node = NULL;
return 0;
}
- else if (type == EX_NUMERIC)
- make_bool (n);
-
- c = allocate_nonterminal (OP_OR, *n);
- for (;;)
+ else if (actual_type == EXPR_BOOLEAN && required_type == EXPR_NUMERIC)
{
- lex_get ();
- type = parse_and (n);
- if (type == EX_ERROR)
- goto fail;
- else if (type == EX_STRING)
- {
- msg (SE, gettext (typ));
- goto fail;
- }
- else if (type == EX_NUMERIC)
- make_bool (n);
- c = append_nonterminal_arg (c, *n);
-
- if (token != T_OR)
- break;
+ /* Boolean -> numeric: nothing to do. */
+ return 1;
}
- *n = c;
- return EX_BOOLEAN;
+ else if (actual_type == EXPR_NUMERIC && required_type == EXPR_BOOLEAN)
+ {
+ /* Numeric -> Boolean: insert conversion. */
+ make_bool (node);
+ return 1;
+ }
+ else
+ {
+ /* We want a string and got a number/Boolean, or vice versa. */
+ assert ((actual_type == EXPR_STRING) != (required_type == EXPR_STRING));
-fail:
- free_node (c);
- return EX_ERROR;
+ if (required_type == EXPR_STRING)
+ msg (SE, _("Type mismatch: operands of %s operator must be strings."),
+ operator_name);
+ else
+ msg (SE, _("Type mismatch: operands of %s operator must be numeric."),
+ operator_name);
+ free_node (*node);
+ *node = NULL;
+ return 0;
+ }
}
-/* Parses the AND level. */
+/* An operator. */
+struct operator
+ {
+ int token; /* Operator token. */
+ int type; /* Operator node type. */
+ const char *name; /* Operator name. */
+ };
+
+/* Attempts to match the current token against the tokens for the
+ OP_CNT operators in OPS[]. If successful, returns nonzero
+ and, if OPERATOR is non-null, sets *OPERATOR to the operator.
+ On failure, returns zero and, if OPERATOR is non-null, sets
+ *OPERATOR to a null pointer. */
static int
-parse_and (union any_node ** n)
+match_operator (const struct operator ops[], size_t op_cnt,
+ const struct operator **operator)
{
- static const char typ[]
- = N_("The AND operator cannot take string operands.");
- union any_node *c;
- int type = parse_not (n);
+ const struct operator *op;
- if (type == EX_ERROR)
- return EX_ERROR;
- if (token != T_AND)
- return type;
- if (type == EX_STRING)
+ for (op = ops; op < ops + op_cnt; op++)
{
- free_node (*n);
- msg (SE, gettext (typ));
- return 0;
+ if (op->token == '-')
+ lex_negative_to_dash ();
+ if (lex_match (op->token))
+ {
+ if (operator != NULL)
+ *operator = op;
+ return 1;
+ }
}
- else if (type == EX_NUMERIC)
- make_bool (n);
-
- c = allocate_nonterminal (OP_AND, *n);
- for (;;)
- {
- lex_get ();
- type = parse_not (n);
- if (type == EX_ERROR)
- goto fail;
- else if (type == EX_STRING)
- {
- msg (SE, gettext (typ));
- goto fail;
- }
- else if (type == EX_NUMERIC)
- make_bool (n);
- c = append_nonterminal_arg (c, *n);
-
- if (token != T_AND)
- break;
- }
- *n = c;
- return EX_BOOLEAN;
-
-fail:
- free_node (c);
- return EX_ERROR;
+ if (operator != NULL)
+ *operator = NULL;
+ return 0;
}
-/* Parses the NOT level. */
-static int
-parse_not (union any_node ** n)
+/* Parses a chain of left-associative operator/operand pairs.
+ The operators' operands uniformly must be type REQUIRED_TYPE.
+ There are OP_CNT operators, specified in OPS[]. The next
+ higher level is parsed by PARSE_NEXT_LEVEL. If CHAIN_WARNING
+ is non-null, then it will be issued as a warning if more than
+ one operator/operand pair is parsed. */
+static enum expr_type
+parse_binary_operators (union any_node **node,
+ enum expr_type actual_type,
+ enum expr_type required_type,
+ enum expr_type result_type,
+ const struct operator ops[], size_t op_cnt,
+ parse_recursively_func *parse_next_level,
+ const char *chain_warning)
{
- static const char typ[]
- = N_("The NOT operator cannot take a string operand.");
- int not = 0;
- int type;
+ int op_count;
+ const struct operator *operator;
- while (lex_match (T_NOT))
- not ^= 1;
- type = parse_rel (n);
- if (!not || type == EX_ERROR)
- return type;
+ if (actual_type == EXPR_ERROR)
+ return EXPR_ERROR;
- if (type == EX_STRING)
+ for (op_count = 0; match_operator (ops, op_cnt, &operator); op_count++)
{
- free_node (*n);
- msg (SE, gettext (typ));
- return 0;
+ union any_node *rhs;
+
+ /* Convert the left-hand side to type REQUIRED_TYPE. */
+ if (!type_coercion (actual_type, required_type, node, operator->name))
+ return EXPR_ERROR;
+
+ /* Parse the right-hand side and coerce to type
+ REQUIRED_TYPE. */
+ if (!type_coercion (parse_next_level (&rhs), required_type,
+ &rhs, operator->name))
+ {
+ free_node (*node);
+ *node = NULL;
+ return EXPR_ERROR;
+ }
+ *node = allocate_binary_nonterminal (operator->type, *node, rhs);
+
+ /* The result is of type RESULT_TYPE. */
+ actual_type = result_type;
}
- else if (type == EX_NUMERIC)
- make_bool (n);
- *n = allocate_nonterminal (OP_NOT, *n);
- return EX_BOOLEAN;
+ if (op_count > 1 && chain_warning != NULL)
+ msg (SW, chain_warning);
+
+ return actual_type;
}
-static int
-parse_rel (union any_node ** n)
+static enum expr_type
+parse_inverting_unary_operator (union any_node **node,
+ enum expr_type required_type,
+ const struct operator *operator,
+ parse_recursively_func *parse_next_level)
{
- static const char typ[]
- = N_("Strings cannot be compared with numeric or Boolean "
- "values with the relational operators "
- "= >= > <= < <>.");
- union any_node *c;
- int type = parse_add (n);
+ unsigned op_count;
+
+ op_count = 0;
+ while (match_operator (operator, 1, NULL))
+ op_count++;
+ if (op_count == 0)
+ return parse_next_level (node);
+
+ if (!type_coercion (parse_next_level (node), required_type,
+ node, operator->name))
+ return EXPR_ERROR;
+ if (op_count % 2 != 0)
+ *node = allocate_nonterminal (operator->type, *node);
+ return required_type;
+}
- if (type == EX_ERROR)
- return EX_ERROR;
- if (token == '=')
- token = T_EQ;
- if (token < T_EQ || token > T_NE)
- return type;
+/* Parses the OR level. */
+static enum expr_type
+parse_or (union any_node **n)
+{
+ static const struct operator ops[] =
+ {
+ { T_OR, OP_OR, "logical disjunction (\"OR\")" },
+ };
+
+ return parse_binary_operators (n, parse_and (n), EXPR_BOOLEAN, EXPR_BOOLEAN,
+ ops, sizeof ops / sizeof *ops,
+ parse_and, NULL);
+}
- for (;;)
+/* Parses the AND level. */
+static enum expr_type
+parse_and (union any_node ** n)
+{
+ static const struct operator ops[] =
{
- int t;
+ { T_AND, OP_AND, "logical conjunction (\"AND\")" },
+ };
+
+ return parse_binary_operators (n, parse_not (n), EXPR_BOOLEAN, EXPR_BOOLEAN,
+ ops, sizeof ops / sizeof *ops,
+ parse_not, NULL);
+}
- c = allocate_nonterminal (token - T_EQ
- + (type == EX_NUMERIC ? OP_EQ : OP_STRING_EQ),
- *n);
- lex_get ();
+/* Parses the NOT level. */
+static enum expr_type
+parse_not (union any_node ** n)
+{
+ static const struct operator op
+ = { T_NOT, OP_NOT, "logical negation (\"NOT-\")" };
+ return parse_inverting_unary_operator (n, EXPR_BOOLEAN, &op, parse_rel);
+}
- t = parse_add (n);
- if (t == EX_ERROR)
- goto fail;
- if (t == EX_BOOLEAN && type == EX_NUMERIC)
- make_bool (&c->nonterm.arg[0]);
- else if (t == EX_NUMERIC && type == EX_BOOLEAN)
- make_bool (n);
- else if (t != type)
- {
- msg (SE, gettext (typ));
- goto fail;
- }
+/* Parse relational operators. */
+static enum expr_type
+parse_rel (union any_node **n)
+{
+ static const struct operator numeric_ops[] =
+ {
+ { '=', OP_EQ, "numeric equality (\"=\")" },
+ { T_EQ, OP_EQ, "numeric equality (\"EQ\")" },
+ { T_GE, OP_GE, "numeric greater-than-or-equal-to (\">=\")" },
+ { T_GT, OP_GT, "numeric greater than (\">\")" },
+ { T_LE, OP_LE, "numeric less-than-or-equal-to (\"<=\")" },
+ { T_LT, OP_LT, "numeric less than (\"<\")" },
+ { T_NE, OP_NE, "numeric inequality (\"<>\")" },
+ };
+
+ static const struct operator string_ops[] =
+ {
+ { '=', OP_EQ_STRING, "string equality (\"=\")" },
+ { T_EQ, OP_EQ_STRING, "string equality (\"EQ\")" },
+ { T_GE, OP_GE_STRING, "string greater-than-or-equal-to (\">=\")" },
+ { T_GT, OP_GT_STRING, "string greater than (\">\")" },
+ { T_LE, OP_LE_STRING, "string less-than-or-equal-to (\"<=\")" },
+ { T_LT, OP_LT_STRING, "string less than (\"<\")" },
+ { T_NE, OP_NE_STRING, "string inequality (\"<>\")" },
+ };
- c = append_nonterminal_arg (c, *n);
- *n = c;
+ int type = parse_add (n);
- if (token == '=')
- token = T_EQ;
- if (token < T_EQ || token > T_NE)
- break;
+ const char *chain_warning =
+ _("Chaining relational operators (e.g. \"a < b < c\") will "
+ "not produce the mathematically expected result. "
+ "Use the AND logical operator to fix the problem "
+ "(e.g. \"a < b AND b < c\"). "
+ "If chaining is really intended, parentheses will disable "
+ "this warning (e.g. \"(a < b) < c\".)");
- type = EX_BOOLEAN;
- }
- return EX_BOOLEAN;
+ switch (type)
+ {
+ case EXPR_ERROR:
+ return EXPR_ERROR;
+
+ case EXPR_NUMERIC:
+ case EXPR_BOOLEAN:
+ return parse_binary_operators (n,
+ type, EXPR_NUMERIC, EXPR_BOOLEAN,
+ numeric_ops,
+ sizeof numeric_ops / sizeof *numeric_ops,
+ parse_add, chain_warning);
+
+ case EXPR_STRING:
+ return parse_binary_operators (n,
+ type, EXPR_STRING, EXPR_BOOLEAN,
+ string_ops,
+ sizeof string_ops / sizeof *string_ops,
+ parse_add, chain_warning);
-fail:
- free_node (c);
- return EX_ERROR;
+ default:
+ assert (0);
+ abort ();
+ }
}
/* Parses the addition and subtraction level. */
-static int
+static enum expr_type
parse_add (union any_node **n)
{
- static const char typ[]
- = N_("The `+' and `-' operators may only be used with "
- "numeric operands.");
- union any_node *c;
- int type;
- int op;
-
- type = parse_mul (n);
- lex_negative_to_dash ();
- if (type == EX_ERROR || (token != '+' && token != '-'))
- return type;
- if (type != EX_NUMERIC)
+ static const struct operator ops[] =
{
- free_node (*n);
- msg (SE, gettext (typ));
- return 0;
- }
-
- c = allocate_nonterminal (OP_PLUS, *n);
- for (;;)
- {
- op = token;
- lex_get ();
-
- type = parse_mul (n);
- if (type == EX_ERROR)
- goto fail;
- else if (type != EX_NUMERIC)
- {
- msg (SE, gettext (typ));
- goto fail;
- }
- if (op == '-')
- *n = allocate_nonterminal (OP_NEG, *n);
- c = append_nonterminal_arg (c, *n);
-
- lex_negative_to_dash ();
- if (token != '+' && token != '-')
- break;
- }
- *n = c;
- return EX_NUMERIC;
-
-fail:
- free_node (c);
- return EX_ERROR;
+ { '+', OP_ADD, "addition (\"+\")" },
+ { '-', OP_SUB, "subtraction (\"-\")-" },
+ };
+
+ return parse_binary_operators (n, parse_mul (n), EXPR_NUMERIC, EXPR_NUMERIC,
+ ops, sizeof ops / sizeof *ops,
+ parse_mul, NULL);
}
/* Parses the multiplication and division level. */
-static int
+static enum expr_type
parse_mul (union any_node ** n)
{
- static const char typ[]
- = N_("The `*' and `/' operators may only be used with "
- "numeric operands.");
-
- union any_node *c;
- int type;
- int op;
-
- type = parse_neg (n);
- if (type == EX_ERROR || (token != '*' && token != '/'))
- return type;
- if (type != EX_NUMERIC)
- {
- free_node (*n);
- msg (SE, gettext (typ));
- return 0;
- }
-
- c = allocate_nonterminal (OP_MUL, *n);
- for (;;)
+ static const struct operator ops[] =
{
- op = token;
- lex_get ();
-
- type = parse_neg (n);
- if (type == EX_ERROR)
- goto fail;
- else if (type != EX_NUMERIC)
- {
- msg (SE, gettext (typ));
- goto fail;
- }
- if (op == '/')
- *n = allocate_nonterminal (OP_INV, *n);
- c = append_nonterminal_arg (c, *n);
-
- if (token != '*' && token != '/')
- break;
- }
- *n = c;
- return EX_NUMERIC;
-
-fail:
- free_node (c);
- return EX_ERROR;
+ { '*', OP_MUL, "multiplication (\"*\")" },
+ { '/', OP_DIV, "division (\"/\")" },
+ };
+
+ return parse_binary_operators (n, parse_neg (n), EXPR_NUMERIC, EXPR_NUMERIC,
+ ops, sizeof ops / sizeof *ops,
+ parse_neg, NULL);
}
/* Parses the unary minus level. */
-static int
+static enum expr_type
parse_neg (union any_node **n)
{
- static const char typ[]
- = N_("The unary minus (-) operator can only take a numeric operand.");
-
- int neg = 0;
- int type;
-
- for (;;)
- {
- lex_negative_to_dash ();
- if (!lex_match ('-'))
- break;
- neg ^= 1;
- }
- type = parse_exp (n);
- if (!neg || type == EX_ERROR)
- return type;
- if (type != EX_NUMERIC)
- {
- free_node (*n);
- msg (SE, gettext (typ));
- return 0;
- }
-
- *n = allocate_nonterminal (OP_NEG, *n);
- return EX_NUMERIC;
+ static const struct operator op = { '-', OP_NEG, "negation (\"-\")" };
+ return parse_inverting_unary_operator (n, EXPR_NUMERIC, &op, parse_exp);
}
-static int
+static enum expr_type
parse_exp (union any_node **n)
{
- static const char typ[]
- = N_("Both operands to the ** operator must be numeric.");
-
- union any_node *c;
- int type;
-
- type = parse_primary (n);
- if (type == EX_ERROR || token != T_EXP)
- return type;
- if (type != EX_NUMERIC)
- {
- free_node (*n);
- msg (SE, gettext (typ));
- return 0;
- }
-
- for (;;)
+ static const struct operator ops[] =
{
- c = allocate_nonterminal (OP_POW, *n);
- lex_get ();
-
- type = parse_primary (n);
- if (type == EX_ERROR)
- goto fail;
- else if (type != EX_NUMERIC)
- {
- msg (SE, gettext (typ));
- goto fail;
- }
- *n = append_nonterminal_arg (c, *n);
-
- if (token != T_EXP)
- break;
- }
- return EX_NUMERIC;
-
-fail:
- free_node (c);
- return EX_ERROR;
+ { T_EXP, OP_POW, "exponentiation (\"**\")" },
+ };
+
+ const char *chain_warning =
+ _("The exponentiation operator (\"**\") is left-associative, "
+ "even though right-associative semantics are more useful. "
+ "That is, \"a**b**c\" equals \"(a**b)**c\", not as \"a**(b**c)\". "
+ "To disable this warning, insert parentheses.");
+
+ return parse_binary_operators (n,
+ parse_primary (n), EXPR_NUMERIC, EXPR_NUMERIC,
+ ops, sizeof ops / sizeof *ops,
+ parse_primary, chain_warning);
}
/* Parses system variables. */
-static int
+static enum expr_type
parse_sysvar (union any_node **n)
{
if (!strcmp (tokid, "$CASENUM"))
{
*n = xmalloc (sizeof (struct casenum_node));
(*n)->casenum.type = OP_CASENUM;
- return EX_NUMERIC;
+ return EXPR_NUMERIC;
+ }
+ else if (!strcmp (tokid, "$DATE"))
+ {
+ static const char *months[12] =
+ {
+ "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
+ "JUL", "AUG", "SEP", "OCT", "NOV", "DEC",
+ };
+
+ struct tm *time;
+ char temp_buf[10];
+
+ time = localtime (&last_vfm_invocation);
+ sprintf (temp_buf, "%02d %s %02d", abs (time->tm_mday) % 100,
+ months[abs (time->tm_mon) % 12], abs (time->tm_year) % 100);
+
+ *n = xmalloc (sizeof (struct str_con_node) + 8);
+ (*n)->str_con.type = OP_STR_CON;
+ (*n)->str_con.len = 9;
+ memcpy ((*n)->str_con.s, temp_buf, 9);
+ return EXPR_STRING;
}
else
{
+ enum expr_type type;
double d;
- if (!strcmp (tokid, "$SYSMIS"))
+ type = EXPR_NUMERIC;
+ if (!strcmp (tokid, "$TRUE"))
+ {
+ d = 1.0;
+ type = EXPR_BOOLEAN;
+ }
+ else if (!strcmp (tokid, "$FALSE"))
+ {
+ d = 0.0;
+ type = EXPR_BOOLEAN;
+ }
+ else if (!strcmp (tokid, "$SYSMIS"))
d = SYSMIS;
else if (!strcmp (tokid, "$JDATE"))
{
struct tm *time = localtime (&last_vfm_invocation);
d = yrmoda (time->tm_year + 1900, time->tm_mon + 1, time->tm_mday);
}
- else if (!strcmp (tokid, "$DATE"))
- {
- static const char *months[12] =
- {
- "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
- "JUL", "AUG", "SEP", "OCT", "NOV", "DEC",
- };
-
- struct tm *time;
- char temp_buf[10];
-
- time = localtime (&last_vfm_invocation);
- sprintf (temp_buf, "%02d %s %02d", abs (time->tm_mday) % 100,
- months[abs (time->tm_mon) % 12], abs (time->tm_year) % 100);
-
- *n = xmalloc (sizeof (struct str_con_node) + 8);
- (*n)->str_con.type = OP_STR_CON;
- (*n)->str_con.len = 9;
- memcpy ((*n)->str_con.s, temp_buf, 9);
- return EX_STRING;
- }
else if (!strcmp (tokid, "$TIME"))
{
struct tm *time;
+ time->tm_sec);
}
else if (!strcmp (tokid, "$LENGTH"))
- {
- msg (SW, _("Use of $LENGTH is obsolete, returning default of 66."));
- d = 66.0;
- }
+ d = get_viewlength ();
else if (!strcmp (tokid, "$WIDTH"))
- {
- msg (SW, _("Use of $WIDTH is obsolete, returning default of 131."));
- d = 131.0;
- }
+ d = get_viewwidth ();
else
{
msg (SE, _("Unknown system variable %s."), tokid);
- return EX_ERROR;
+ return EXPR_ERROR;
}
*n = xmalloc (sizeof (struct num_con_node));
(*n)->num_con.type = OP_NUM_CON;
(*n)->num_con.value = d;
- return EX_NUMERIC;
+ return type;
}
}
/* Parses numbers, varnames, etc. */
-static int
+static enum expr_type
parse_primary (union any_node **n)
{
switch (token)
/* $ at the beginning indicates a system variable. */
if (tokid[0] == '$')
{
- int type = parse_sysvar (n);
+ enum expr_type type = parse_sysvar (n);
lex_get ();
return type;
}
if (v == NULL)
{
lex_error (_("expecting variable name"));
- return EX_ERROR;
+ return EXPR_ERROR;
}
- *n = xmalloc (sizeof (struct var_node));
- (*n)->var.type = v->type == NUMERIC ? OP_NUM_VAR : OP_STR_VAR;
- (*n)->var.v = v;
- return v->type == NUMERIC ? EX_NUMERIC : EX_STRING;
+ if (v->type == NUMERIC)
+ {
+ *n = allocate_var_node (OP_NUM_VAR, v);
+ return EXPR_NUMERIC;
+ }
+ else
+ {
+ *n = allocate_var_node (OP_STR_VAR, v);
+ return EXPR_STRING;
+ }
}
case T_NUM:
- *n = xmalloc (sizeof (struct num_con_node));
- (*n)->num_con.type = OP_NUM_CON;
- (*n)->num_con.value = tokval;
+ *n = allocate_num_con (tokval);
lex_get ();
- return EX_NUMERIC;
+ return EXPR_NUMERIC;
case T_STRING:
{
- *n = xmalloc (sizeof (struct str_con_node) + ds_length (&tokstr) - 1);
- (*n)->str_con.type = OP_STR_CON;
- (*n)->str_con.len = ds_length (&tokstr);
- memcpy ((*n)->str_con.s, ds_value (&tokstr), ds_length (&tokstr));
+ *n = allocate_str_con (ds_value (&tokstr), ds_length (&tokstr));
lex_get ();
- return EX_STRING;
+ return EXPR_STRING;
}
case '(':
{
lex_error (_("expecting `)'"));
free_node (*n);
- return EX_ERROR;
+ return EXPR_ERROR;
}
return t;
}
default:
lex_error (_("in expression"));
- return EX_ERROR;
+ return EXPR_ERROR;
}
}
\f
{
const char *s;
int t;
- int (*func) (struct function *, int, union any_node **);
- const char *desc;
+ enum expr_type (*func) (const struct function *, int, union any_node **);
};
static struct function func_tab[];
static int func_count;
-static int get_num_args (struct function *, int, union any_node **);
+static int get_num_args (const struct function *, int, union any_node **);
-static int
-unary_func (struct function * f, int x UNUSED, union any_node ** n)
+static enum expr_type
+unary_func (const struct function *f, int x UNUSED, union any_node ** n)
{
- double divisor;
- struct nonterm_node *c;
-
if (!get_num_args (f, 1, n))
- return EX_ERROR;
-
- switch (f->t)
- {
- case OP_CTIME_DAYS:
- divisor = 1 / 60. / 60. / 24.;
- goto multiply;
- case OP_CTIME_HOURS:
- divisor = 1 / 60. / 60.;
- goto multiply;
- case OP_CTIME_MINUTES:
- divisor = 1 / 60.;
- goto multiply;
- case OP_TIME_DAYS:
- divisor = 60. * 60. * 24.;
- goto multiply;
-
- case OP_CTIME_SECONDS:
- c = &(*n)->nonterm;
- *n = (*n)->nonterm.arg[0];
- free (c);
- return EX_NUMERIC;
- }
- return EX_NUMERIC;
-
-multiply:
- /* Arrive here when we encounter an operation that is just a
- glorified version of a multiplication or division. Converts the
- operation directly into that multiplication. */
- c = xmalloc (sizeof (struct nonterm_node) + sizeof (union any_node *));
- c->type = OP_MUL;
- c->n = 2;
- c->arg[0] = (*n)->nonterm.arg[0];
- c->arg[1] = xmalloc (sizeof (struct num_con_node));
- c->arg[1]->num_con.type = OP_NUM_CON;
- c->arg[1]->num_con.value = divisor;
- free (*n);
- *n = (union any_node *) c;
- return EX_NUMERIC;
+ return EXPR_ERROR;
+ return EXPR_NUMERIC;
}
-static int
-binary_func (struct function * f, int x UNUSED, union any_node ** n)
+static enum expr_type
+binary_func (const struct function *f, int x UNUSED, union any_node ** n)
{
if (!get_num_args (f, 2, n))
- return EX_ERROR;
- return EX_NUMERIC;
+ return EXPR_ERROR;
+ return EXPR_NUMERIC;
}
-static int
-ternary_func (struct function * f, int x UNUSED, union any_node ** n)
+static enum expr_type
+ternary_func (const struct function *f, int x UNUSED, union any_node **n)
{
if (!get_num_args (f, 3, n))
- return EX_ERROR;
- return EX_NUMERIC;
+ return EXPR_ERROR;
+ return EXPR_NUMERIC;
}
-static int
-MISSING_func (struct function * f, int x UNUSED, union any_node ** n)
+static enum expr_type
+MISSING_func (const struct function *f, int x UNUSED, union any_node **n)
{
- if (token == T_ID
- && dict_lookup_var (default_dict, tokid) != NULL
- && lex_look_ahead () == ')')
- {
- struct var_node *c = xmalloc (sizeof *c);
- c->v = parse_variable ();
- c->type = c->v->type == ALPHA ? OP_STR_MIS : OP_NUM_SYS;
- *n = (union any_node *) c;
- return EX_BOOLEAN;
- }
if (!get_num_args (f, 1, n))
- return EX_ERROR;
- return EX_BOOLEAN;
+ return EXPR_ERROR;
+ return EXPR_BOOLEAN;
}
-static int
-SYSMIS_func (struct function * f UNUSED, int x UNUSED, union any_node ** n)
+static enum expr_type
+SYSMIS_func (const struct function *f, int x UNUSED, union any_node **n)
{
- int t;
-
- if (token == T_ID
- && dict_lookup_var (default_dict, tokid)
- && lex_look_ahead () == ')')
- {
- struct variable *v;
- v = parse_variable ();
- if (v->type == ALPHA)
- {
- struct num_con_node *c = xmalloc (sizeof *c);
- c->type = OP_NUM_CON;
- c->value = 0;
- return EX_BOOLEAN;
- }
- else
- {
- struct var_node *c = xmalloc (sizeof *c);
- c->type = OP_NUM_SYS;
- c->v = v;
- return EX_BOOLEAN;
- }
- }
-
- t = parse_or (n);
- if (t == EX_ERROR)
- return t;
- else if (t == EX_NUMERIC)
- {
- *n = allocate_nonterminal (OP_SYSMIS, *n);
- return EX_BOOLEAN;
- }
- else /* EX_STRING or EX_BOOLEAN */
+ if (!get_num_args (f, 1, n))
+ return EXPR_ERROR;
+ if ((*n)->nonterm.arg[0]->type == OP_NUM_VAR)
{
- /* Return constant `true' value. */
+ struct variable *v = (*n)->nonterm.arg[0]->var.v;
free_node (*n);
- *n = xmalloc (sizeof (struct num_con_node));
- (*n)->num_con.type = OP_NUM_CON;
- (*n)->num_con.value = 1.0;
- return EX_BOOLEAN;
+ *n = allocate_var_node (OP_NUM_SYS, v);
}
+ return EXPR_BOOLEAN;
}
-static int
-VALUE_func (struct function *f UNUSED, int x UNUSED, union any_node **n)
+static enum expr_type
+VALUE_func (const struct function *f UNUSED, int x UNUSED, union any_node **n)
{
struct variable *v = parse_variable ();
if (!v)
- return EX_ERROR;
- *n = xmalloc (sizeof (struct var_node));
- (*n)->var.v = v;
+ return EXPR_ERROR;
if (v->type == NUMERIC)
{
- (*n)->var.type = OP_NUM_VAL;
- return EX_NUMERIC;
+ *n = allocate_var_node (OP_NUM_VAL, v);
+ return EXPR_NUMERIC;
}
else
{
- (*n)->var.type = OP_STR_VAR;
- return EX_STRING;
+ *n = allocate_var_node (OP_STR_VAR, v);
+ return EXPR_STRING;
}
}
-static int
-LAG_func (struct function *f UNUSED, int x UNUSED, union any_node **n)
+static enum expr_type
+LAG_func (const struct function *f UNUSED, int x UNUSED, union any_node **n)
{
struct variable *v = parse_variable ();
int nlag = 1;
if (!v)
- return EX_ERROR;
+ return EXPR_ERROR;
if (lex_match (','))
{
if (!lex_integer_p () || lex_integer () <= 0 || lex_integer () > 1000)
{
msg (SE, _("Argument 2 to LAG must be a small positive "
"integer constant."));
- return 0;
+ return EXPR_ERROR;
}
nlag = lex_integer ();
(*n)->lag.type = (v->type == NUMERIC ? OP_NUM_LAG : OP_STR_LAG);
(*n)->lag.v = v;
(*n)->lag.lag = nlag;
- return (v->type == NUMERIC ? EX_NUMERIC : EX_STRING);
+ return (v->type == NUMERIC ? EXPR_NUMERIC : EXPR_STRING);
}
/* This screwball function parses n-ary operators:
- 1. NMISS, NVALID, SUM, MEAN, MIN, MAX: any number of (numeric) arguments.
- 2. SD, VARIANCE, CFVAR: at least two (numeric) arguments.
- 3. RANGE: An odd number of arguments, but at least three.
- All arguments must be the same type.
- 4. ANY: At least two arguments. All arguments must be the same type.
+
+ 1. NMISS, NVALID, SUM, MEAN: any number of numeric
+ arguments.
+
+ 2. SD, VARIANCE, CFVAR: at least two numeric arguments.
+
+ 3. RANGE: An odd number of arguments, but at least three, and
+ all of the same type.
+
+ 4. ANY: At least two arguments, all of the same type.
+
+ 5. MIN, MAX: Any number of arguments, all of the same type.
*/
-static int
-nary_num_func (struct function *f, int min_args, union any_node **n)
+static enum expr_type
+nary_num_func (const struct function *f, int min_args, union any_node **n)
{
/* Argument number of current argument (used for error messages). */
- int argn = 1;
+ int arg_idx = 1;
/* Number of arguments. */
int nargs;
int m = 16;
/* Type of arguments. */
- int type = (f->t == OP_ANY || f->t == OP_RANGE) ? -1 : NUMERIC;
+ int type = (f->t == OP_ANY || f->t == OP_RANGE
+ || f->t == OP_MIN || f->t == OP_MAX) ? -1 : NUMERIC;
*n = xmalloc (sizeof (struct nonterm_node) + sizeof (union any_node *[15]));
(*n)->nonterm.type = f->t;
msg (SE, _("Type mismatch in argument %d of %s, which was "
"expected to be of %s type. It was actually "
"of %s type. "),
- argn, f->s, type_name (type), type_name (v[j]->type));
+ arg_idx, f->s, var_type_name (type), var_type_name (v[j]->type));
free (v);
goto fail;
}
for (j = 0; j < nv; j++)
{
union any_node **c = &(*n)->nonterm.arg[(*n)->nonterm.n++];
- *c = xmalloc (sizeof (struct var_node));
- (*c)->var.type = (type == NUMERIC ? OP_NUM_VAR : OP_STR_VAR);
- (*c)->var.v = v[j];
+ *c = allocate_var_node ((type == NUMERIC
+ ? OP_NUM_VAR : OP_STR_VAR),
+ v[j]);
}
}
else
union any_node *c;
int t = parse_or (&c);
- if (t == EX_ERROR)
+ if (t == EXPR_ERROR)
goto fail;
- if (t == EX_BOOLEAN)
+ if (t == EXPR_BOOLEAN)
{
free_node (c);
msg (SE, _("%s cannot take Boolean operands."), f->s);
}
if (type == -1)
{
- if (t == EX_NUMERIC)
+ if (t == EXPR_NUMERIC)
type = NUMERIC;
- else if (t == EX_STRING)
+ else if (t == EXPR_STRING)
type = ALPHA;
}
- else if ((t == EX_NUMERIC) ^ (type == NUMERIC))
+ else if ((t == EXPR_NUMERIC) ^ (type == NUMERIC))
{
free_node (c);
msg (SE, _("Type mismatch in argument %d of %s, which was "
"expected to be of %s type. It was actually "
"of %s type. "),
- argn, f->s, type_name (type), expr_type_name (t));
+ arg_idx, f->s, var_type_name (type), expr_type_name (t));
goto fail;
}
if ((*n)->nonterm.n + 1 >= m)
goto fail;
}
- argn++;
+ arg_idx++;
}
*n = xrealloc (*n, (sizeof (struct nonterm_node)
+ ((*n)->nonterm.n) * sizeof (union any_node *)));
{
msg (SE, _("RANGE requires an odd number of arguments, but "
"at least three."));
- return 0;
+ goto fail;
}
}
else if (f->t == OP_SD || f->t == OP_VARIANCE
if (nargs < 2)
{
msg (SE, _("%s requires at least two arguments."), f->s);
- return 0;
+ goto fail;
}
}
{
msg (SE, _("%s.%d requires at least %d arguments."),
f->s, min_args, min_args);
- return 0;
+ goto fail;
}
- if (f->t == OP_ANY || f->t == OP_RANGE)
+ if (f->t == OP_MIN || f->t == OP_MAX)
+ {
+ if (type == ALPHA)
+ {
+ if (f->t == OP_MIN)
+ (*n)->type = OP_MIN_STRING;
+ else if (f->t == OP_MAX)
+ (*n)->type = OP_MAX_STRING;
+ else
+ assert (0);
+ return EXPR_STRING;
+ }
+ else
+ return EXPR_NUMERIC;
+ }
+ else if (f->t == OP_ANY || f->t == OP_RANGE)
{
- if (type == T_STRING)
- f->t++;
- return EX_BOOLEAN;
+ if (type == ALPHA)
+ {
+ if (f->t == OP_ANY)
+ (*n)->type = OP_ANY_STRING;
+ else if (f->t == OP_RANGE)
+ (*n)->type = OP_RANGE_STRING;
+ else
+ assert (0);
+ }
+ return EXPR_BOOLEAN;
}
else
- return EX_NUMERIC;
+ return EXPR_NUMERIC;
fail:
free_node (*n);
- return EX_ERROR;
+ return EXPR_ERROR;
}
-static int
-CONCAT_func (struct function * f UNUSED, int x UNUSED, union any_node ** n)
+static enum expr_type
+CONCAT_func (const struct function *f UNUSED, int x UNUSED, union any_node **n)
{
int m = 0;
+ (m - 1) * sizeof (union any_node *)));
}
type = parse_or (&(*n)->nonterm.arg[(*n)->nonterm.n]);
- if (type == EX_ERROR)
+ if (type == EXPR_ERROR)
goto fail;
- if (type != EX_STRING)
+ if (type != EXPR_STRING)
{
msg (SE, _("Argument %d to CONCAT is type %s. All arguments "
"to CONCAT must be strings."),
}
*n = xrealloc (*n, (sizeof (struct nonterm_node)
+ ((*n)->nonterm.n - 1) * sizeof (union any_node *)));
- return EX_STRING;
+ return EXPR_STRING;
fail:
free_node (*n);
- return EX_ERROR;
+ return EXPR_ERROR;
}
/* Parses a string function according to f->desc. f->desc[0] is the
args by a slash (`/'). Codes are `n', numeric arg; `s', string
arg; and `f', format spec (this must be the last arg). If the
optional args are included, the type becomes f->t+1. */
-static int
-generic_str_func (struct function *f, int x UNUSED, union any_node ** n)
+static enum expr_type
+generic_str_func (const struct function *f, int x UNUSED, union any_node **n)
{
- int max_args = 0;
- int type;
+ struct string_function
+ {
+ int t1, t2;
+ enum expr_type return_type;
+ const char *arg_types;
+ };
+
+ static const struct string_function string_func_tab[] =
+ {
+ {OP_INDEX_2, OP_INDEX_3, EXPR_NUMERIC, "ssN"},
+ {OP_RINDEX_2, OP_RINDEX_3, EXPR_NUMERIC, "ssN"},
+ {OP_LENGTH, 0, EXPR_NUMERIC, "s"},
+ {OP_LOWER, 0, EXPR_STRING, "s"},
+ {OP_UPPER, 0, EXPR_STRING, "s"},
+ {OP_LPAD, 0, EXPR_STRING, "snS"},
+ {OP_RPAD, 0, EXPR_STRING, "snS"},
+ {OP_LTRIM, 0, EXPR_STRING, "sS"},
+ {OP_RTRIM, 0, EXPR_STRING, "sS"},
+ {OP_NUMBER, 0, EXPR_NUMERIC, "sf"},
+ {OP_STRING, 0, EXPR_STRING, "nf"},
+ {OP_SUBSTR_2, OP_SUBSTR_3, EXPR_STRING, "snN"},
+ };
+
+ const int string_func_cnt = sizeof string_func_tab / sizeof *string_func_tab;
+
+ const struct string_function *sf;
+ int arg_cnt;
const char *cp;
+ struct nonterm_node *nonterm;
+
+ /* Find string_function that corresponds to f. */
+ for (sf = string_func_tab; sf < string_func_tab + string_func_cnt; sf++)
+ if (f->t == sf->t1)
+ break;
+ assert (sf < string_func_tab + string_func_cnt);
/* Count max number of arguments. */
- cp = &f->desc[1];
- while (*cp)
+ arg_cnt = 0;
+ for (cp = sf->arg_types; *cp != '\0'; cp++)
{
- if (*cp == 'n' || *cp == 's')
- max_args++;
- else if (*cp == 'f')
- max_args += 3;
- cp++;
+ if (*cp != 'f')
+ arg_cnt++;
+ else
+ arg_cnt += 3;
}
- cp = &f->desc[1];
+ /* Allocate node. */
*n = xmalloc (sizeof (struct nonterm_node)
- + (max_args - 1) * sizeof (union any_node *));
- (*n)->nonterm.type = f->t;
- (*n)->nonterm.n = 0;
+ + (arg_cnt - 1) * sizeof (union any_node *));
+ nonterm = &(*n)->nonterm;
+ nonterm->type = sf->t1;
+ nonterm->n = 0;
+
+ /* Parse arguments. */
+ cp = sf->arg_types;
for (;;)
{
- if (*cp == 'n' || *cp == 's')
+ if (*cp == 'n' || *cp == 's' || *cp == 'N' || *cp == 'S')
{
- int t = *cp == 'n' ? EX_NUMERIC : EX_STRING;
- type = parse_or (&(*n)->nonterm.arg[(*n)->nonterm.n]);
+ enum expr_type wanted_type
+ = *cp == 'n' || *cp == 'N' ? EXPR_NUMERIC : EXPR_STRING;
+ enum expr_type actual_type = parse_or (&nonterm->arg[nonterm->n]);
- if (type == EX_ERROR)
+ if (actual_type == EXPR_ERROR)
goto fail;
- if (type != t)
+ else if (actual_type == EXPR_BOOLEAN)
+ actual_type = EXPR_NUMERIC;
+ if (actual_type != wanted_type)
{
msg (SE, _("Argument %d to %s was expected to be of %s type. "
"It was actually of type %s."),
- (*n)->nonterm.n + 1, f->s,
- *cp == 'n' ? _("numeric") : _("string"),
- expr_type_name (type));
+ nonterm->n + 1, f->s,
+ expr_type_name (actual_type), expr_type_name (wanted_type));
goto fail;
}
- (*n)->nonterm.n++;
+ nonterm->n++;
}
else if (*cp == 'f')
{
msg (SE, _("%s is not a numeric format."), fmt_to_string (&fmt));
goto fail;
}
- (*n)->nonterm.arg[(*n)->nonterm.n + 0] = (union any_node *) fmt.type;
- (*n)->nonterm.arg[(*n)->nonterm.n + 1] = (union any_node *) fmt.w;
- (*n)->nonterm.arg[(*n)->nonterm.n + 2] = (union any_node *) fmt.d;
+ nonterm->arg[nonterm->n + 0] = (union any_node *) fmt.type;
+ nonterm->arg[nonterm->n + 1] = (union any_node *) fmt.w;
+ nonterm->arg[nonterm->n + 2] = (union any_node *) fmt.d;
break;
}
else
assert (0);
- if (*++cp == 0)
+ /* We're done if no args are left. */
+ cp++;
+ if (*cp == 0)
break;
- if (*cp == '/')
+
+ /* Optional arguments are named with capital letters. */
+ if (isupper ((unsigned char) *cp))
{
- cp++;
- if (lex_match (','))
- {
- (*n)->nonterm.type++;
- continue;
- }
- else
- break;
+ if (!lex_match (','))
+ {
+ if (sf->t2 == 0)
+ {
+ if (*cp == 'N')
+ nonterm->arg[nonterm->n++] = allocate_num_con (SYSMIS);
+ else if (*cp == 'S')
+ nonterm->arg[nonterm->n++] = allocate_str_con (" ", 1);
+ else
+ assert (0);
+ }
+ break;
+ }
+
+ if (sf->t2 != 0)
+ nonterm->type = sf->t2;
}
else if (!lex_match (','))
{
}
}
- return f->desc[0] == 'n' ? EX_NUMERIC : EX_STRING;
+ return sf->return_type;
fail:
free_node (*n);
- return EX_ERROR;
+ return EXPR_ERROR;
}
\f
/* General function parsing. */
static int
-get_num_args (struct function *f, int num_args, union any_node **n)
+get_num_args (const struct function *f, int num_args, union any_node **n)
{
int t;
int i;
for (i = 0;;)
{
t = parse_or (&(*n)->nonterm.arg[i]);
- if (t == EX_ERROR)
+ if (t == EXPR_ERROR)
goto fail;
(*n)->nonterm.n++;
- if (t != EX_NUMERIC)
+
+ if (t == EXPR_STRING)
{
- msg (SE, _("Type mismatch in argument %d of %s, which was expected "
- "to be numeric. It was actually type %s."),
- i + 1, f->s, expr_type_name (t));
+ msg (SE, _("Type mismatch in argument %d of %s. A string "
+ "expression was supplied where only a numeric expression "
+ "is allowed."),
+ i + 1, f->s);
goto fail;
}
if (++i >= num_args)
return 0;
}
-static int
+static enum expr_type
parse_function (union any_node ** n)
{
- struct function *fp;
+ const struct function *fp;
char fname[32], *cp;
int t;
int min_args;
(*n)->nonterm.n = 0;
t = parse_or (&(*n)->nonterm.arg[0]);
- if (t == EX_ERROR)
+ if (t == EXPR_ERROR)
goto fail;
- if (t != EX_NUMERIC)
+ if (t != EXPR_NUMERIC)
{
msg (SE, _("The index value after a vector name must be numeric."));
goto fail;
}
((*n)->nonterm.arg[1]) = (union any_node *) v->idx;
- return v->var[0]->type == NUMERIC ? EX_NUMERIC : EX_STRING;
+ return v->var[0]->type == NUMERIC ? EXPR_NUMERIC : EXPR_STRING;
}
ds_truncate (&tokstr, 31);
lex_get ();
if (!lex_force_match ('('))
- return 0;
+ return EXPR_ERROR;
{
struct function f;
if (!fp)
{
msg (SE, _("There is no function named %s."), fname);
- return 0;
+ return EXPR_ERROR;
}
if (min_args && fp->func != nary_num_func)
{
msg (SE, _("Function %s may not be given a minimum number of "
"arguments."), fname);
- return 0;
+ return EXPR_ERROR;
}
t = fp->func (fp, min_args, n);
- if (t == EX_ERROR)
- return EX_ERROR;
+ if (t == EXPR_ERROR)
+ return EXPR_ERROR;
if (!lex_match (')'))
{
lex_error (_("expecting `)' after %s function"), fname);
fail:
free_node (*n);
- return EX_ERROR;
+ return EXPR_ERROR;
}
-
-#if GLOBAL_DEBUGGING
-#define op(a,b,c,d) {a,b,c,d}
-#else
-#define op(a,b,c,d) {b,c,d}
-#endif
-
-#define varies 0
-
-struct op_desc ops[OP_SENTINEL + 1] =
-{
- op ("!?ERROR?!", 000, 0, 0),
-
- op ("plus", 001, varies, 1),
- op ("mul", 011, varies, 1),
- op ("pow", 010, -1, 0),
- op ("and", 010, -1, 0),
- op ("or", 010, -1, 0),
- op ("not", 000, 0, 0),
- op ("eq", 000, -1, 0),
- op ("ge", 000, -1, 0),
- op ("gt", 000, -1, 0),
- op ("le", 000, -1, 0),
- op ("lt", 000, -1, 0),
- op ("ne", 000, -1, 0),
-
- op ("string-eq", 000, -1, 0),
- op ("string-ge", 000, -1, 0),
- op ("string-gt", 000, -1, 0),
- op ("string-le", 000, -1, 0),
- op ("string-lt", 000, -1, 0),
- op ("string-ne", 000, -1, 0),
-
- op ("neg", 000, 0, 0),
- op ("abs", 000, 0, 0),
- op ("arcos", 000, 0, 0),
- op ("arsin", 000, 0, 0),
- op ("artan", 000, 0, 0),
- op ("cos", 000, 0, 0),
- op ("exp", 000, 0, 0),
- op ("lg10", 000, 0, 0),
- op ("ln", 000, 0, 0),
- op ("mod10", 000, 0, 0),
- op ("rnd", 000, 0, 0),
- op ("sin", 000, 0, 0),
- op ("sqrt", 000, 0, 0),
- op ("tan", 000, 0, 0),
- op ("trunc", 000, 0, 0),
-
- op ("any", 011, varies, 1),
- op ("any-string", 001, varies, 1),
- op ("cfvar", 013, varies, 2),
- op ("max", 013, varies, 2),
- op ("mean", 013, varies, 2),
- op ("min", 013, varies, 2),
- op ("nmiss", 011, varies, 1),
- op ("nvalid", 011, varies, 1),
- op ("range", 011, varies, 1),
- op ("range-string", 001, varies, 1),
- op ("sd", 013, varies, 2),
- op ("sum", 013, varies, 2),
- op ("variance", 013, varies, 2),
-
- op ("time_hms", 000, -2, 0),
- op ("ctime_days?!", 000, 0, 0),
- op ("ctime_hours?!", 000, 0, 0),
- op ("ctime_minutes?!", 000, 0, 0),
- op ("ctime_seconds?!", 000, 0, 0),
- op ("time_days?!", 000, 0, 0),
-
- op ("date_dmy", 000, -2, 0),
- op ("date_mdy", 000, -2, 0),
- op ("date_moyr", 000, -1, 0),
- op ("date_qyr", 000, -1, 0),
- op ("date_wkyr", 000, -1, 0),
- op ("date_yrday", 000, -1, 0),
- op ("yrmoda", 000, -2, 0),
-
- op ("xdate_date", 000, 0, 0),
- op ("xdate_hour", 000, 0, 0),
- op ("xdate_jday", 000, 0, 0),
- op ("xdate_mday", 000, 0, 0),
- op ("xdate_minute", 000, 0, 0),
- op ("xdate_month", 000, 0, 0),
- op ("xdate_quarter", 000, 0, 0),
- op ("xdate_second", 000, 0, 0),
- op ("xdate_tday", 000, 0, 0),
- op ("xdate_time", 000, 0, 0),
- op ("xdate_week", 000, 0, 0),
- op ("xdate_wkday", 000, 0, 0),
- op ("xdate_year", 000, 0, 0),
-
- op ("concat", 001, varies, 1),
- op ("index-2", 000, -1, 0),
- op ("index-3", 000, -2, 0),
- op ("rindex-2", 000, -1, 0),
- op ("rindex-3", 000, -2, 0),
- op ("length", 000, 0, 0),
- op ("lower", 000, 0, 0),
- op ("upcas", 000, 0, 0),
- op ("lpad-2", 010, -1, 0),
- op ("lpad-3", 010, -2, 0),
- op ("rpad-2", 010, -1, 0),
- op ("rpad-3", 010, -2, 0),
- op ("ltrim-1", 000, 0, 0),
- op ("ltrim-2", 000, -1, 0),
- op ("rtrim-1", 000, 0, 0),
- op ("rtrim-2", 000, -1, 0),
- op ("number-1", 010, 0, 0),
- op ("number-2", 014, 0, 3),
- op ("string", 004, 0, 3),
- op ("substr-2", 010, -1, 0),
- op ("substr-3", 010, -2, 0),
-
- op ("inv", 000, 0, 0),
- op ("square", 000, 0, 0),
- op ("num-to-Bool", 000, 0, 0),
-
- op ("mod", 010, -1, 0),
- op ("normal", 000, 0, 0),
- op ("uniform", 000, 0, 0),
- op ("sysmis", 010, 0, 0),
- op ("vec-elem-num", 002, 0, 1),
- op ("vec-elem-str", 002, 0, 1),
-
- op ("!?TERMINAL?!", 000, 0, 0),
- op ("num-con", 000, +1, 0),
- op ("str-con", 000, +1, 0),
- op ("num-var", 000, +1, 0),
- op ("str-var", 000, +1, 0),
- op ("num-lag", 000, +1, 1),
- op ("str-lag", 000, +1, 1),
- op ("num-sys", 000, +1, 1),
- op ("num-val", 000, +1, 1),
- op ("str-mis", 000, +1, 1),
- op ("$casenum", 000, +1, 0),
- op ("!?SENTINEL?!", 000, 0, 0),
-};
-
-#undef op
-#undef varies
-\f
\f
/* Utility functions. */
static const char *
-expr_type_name (int type)
+expr_type_name (enum expr_type type)
{
switch (type)
{
- case EX_ERROR:
+ case EXPR_ERROR:
return _("error");
- case EX_BOOLEAN:
+ case EXPR_BOOLEAN:
return _("Boolean");
- case EX_NUMERIC:
+ case EXPR_NUMERIC:
return _("numeric");
- case EX_STRING:
+ case EXPR_STRING:
return _("string");
default:
}
static const char *
-type_name (int type)
+var_type_name (int type)
{
switch (type)
{
void
free_node (union any_node *n)
{
- if (n->type < OP_TERMINAL)
+ if (n != NULL)
{
- int i;
-
- for (i = 0; i < n->nonterm.n; i++)
- free_node (n->nonterm.arg[i]);
+ if (IS_NONTERMINAL (n->type))
+ {
+ int i;
+
+ for (i = 0; i < n->nonterm.n; i++)
+ free_node (n->nonterm.arg[i]);
+ }
+ free (n);
}
- free (n);
+}
+
+static union any_node *
+allocate_num_con (double value)
+{
+ union any_node *c;
+
+ c = xmalloc (sizeof (struct num_con_node));
+ c->num_con.type = OP_NUM_CON;
+ c->num_con.value = value;
+
+ return c;
+}
+
+static union any_node *
+allocate_str_con (const char *string, size_t length)
+{
+ union any_node *c;
+
+ c = xmalloc (sizeof (struct str_con_node) + length - 1);
+ c->str_con.type = OP_STR_CON;
+ c->str_con.len = length;
+ memcpy (c->str_con.s, string, length);
+
+ return c;
+}
+
+static union any_node *
+allocate_var_node (int type, struct variable *variable)
+{
+ union any_node *c;
+
+ c = xmalloc (sizeof (struct var_node));
+ c->var.type = type;
+ c->var.v = variable;
+
+ return c;
}
union any_node *
return c;
}
-union any_node *
-append_nonterminal_arg (union any_node *a, union any_node *b)
+static union any_node *
+allocate_binary_nonterminal (int op, union any_node *lhs, union any_node *rhs)
{
- a = xrealloc (a, sizeof *a + sizeof *a->nonterm.arg * a->nonterm.n);
- a->nonterm.arg[a->nonterm.n++] = b;
- return a;
+ union any_node *node;
+
+ node = xmalloc (sizeof node->nonterm + sizeof *node->nonterm.arg);
+ node->nonterm.type = op;
+ node->nonterm.n = 2;
+ node->nonterm.arg[0] = lhs;
+ node->nonterm.arg[1] = rhs;
+
+ return node;
}
\f
static struct function func_tab[] =
{
- {"ABS", OP_ABS, unary_func, NULL},
- {"ACOS", OP_ARCOS, unary_func, NULL},
- {"ARCOS", OP_ARCOS, unary_func, NULL},
- {"ARSIN", OP_ARSIN, unary_func, NULL},
- {"ARTAN", OP_ARTAN, unary_func, NULL},
- {"ASIN", OP_ARSIN, unary_func, NULL},
- {"ATAN", OP_ARTAN, unary_func, NULL},
- {"COS", OP_COS, unary_func, NULL},
- {"EXP", OP_EXP, unary_func, NULL},
- {"LG10", OP_LG10, unary_func, NULL},
- {"LN", OP_LN, unary_func, NULL},
- {"MOD10", OP_MOD10, unary_func, NULL},
- {"NORMAL", OP_NORMAL, unary_func, NULL},
- {"RND", OP_RND, unary_func, NULL},
- {"SIN", OP_SIN, unary_func, NULL},
- {"SQRT", OP_SQRT, unary_func, NULL},
- {"TAN", OP_TAN, unary_func, NULL},
- {"TRUNC", OP_TRUNC, unary_func, NULL},
- {"UNIFORM", OP_UNIFORM, unary_func, NULL},
-
- {"TIME.DAYS", OP_TIME_DAYS, unary_func, NULL},
- {"TIME.HMS", OP_TIME_HMS, ternary_func, NULL},
-
- {"CTIME.DAYS", OP_CTIME_DAYS, unary_func, NULL},
- {"CTIME.HOURS", OP_CTIME_HOURS, unary_func, NULL},
- {"CTIME.MINUTES", OP_CTIME_MINUTES, unary_func, NULL},
- {"CTIME.SECONDS", OP_CTIME_SECONDS, unary_func, NULL},
-
- {"DATE.DMY", OP_DATE_DMY, ternary_func, NULL},
- {"DATE.MDY", OP_DATE_MDY, ternary_func, NULL},
- {"DATE.MOYR", OP_DATE_MOYR, binary_func, NULL},
- {"DATE.QYR", OP_DATE_QYR, binary_func, NULL},
- {"DATE.WKYR", OP_DATE_WKYR, binary_func, NULL},
- {"DATE.YRDAY", OP_DATE_YRDAY, binary_func, NULL},
-
- {"XDATE.DATE", OP_XDATE_DATE, unary_func, NULL},
- {"XDATE.HOUR", OP_XDATE_HOUR, unary_func, NULL},
- {"XDATE.JDAY", OP_XDATE_JDAY, unary_func, NULL},
- {"XDATE.MDAY", OP_XDATE_MDAY, unary_func, NULL},
- {"XDATE.MINUTE", OP_XDATE_MINUTE, unary_func, NULL},
- {"XDATE.MONTH", OP_XDATE_MONTH, unary_func, NULL},
- {"XDATE.QUARTER", OP_XDATE_QUARTER, unary_func, NULL},
- {"XDATE.SECOND", OP_XDATE_SECOND, unary_func, NULL},
- {"XDATE.TDAY", OP_XDATE_TDAY, unary_func, NULL},
- {"XDATE.TIME", OP_XDATE_TIME, unary_func, NULL},
- {"XDATE.WEEK", OP_XDATE_WEEK, unary_func, NULL},
- {"XDATE.WKDAY", OP_XDATE_WKDAY, unary_func, NULL},
- {"XDATE.YEAR", OP_XDATE_YEAR, unary_func, NULL},
-
- {"MISSING", OP_SYSMIS, MISSING_func, NULL},
- {"MOD", OP_MOD, binary_func, NULL},
- {"SYSMIS", OP_SYSMIS, SYSMIS_func, NULL},
- {"VALUE", OP_NUM_VAL, VALUE_func, NULL},
- {"LAG", OP_NUM_LAG, LAG_func, NULL},
- {"YRMODA", OP_YRMODA, ternary_func, NULL},
-
- {"ANY", OP_ANY, nary_num_func, NULL},
- {"CFVAR", OP_CFVAR, nary_num_func, NULL},
- {"MAX", OP_MAX, nary_num_func, NULL},
- {"MEAN", OP_MEAN, nary_num_func, NULL},
- {"MIN", OP_MIN, nary_num_func, NULL},
- {"NMISS", OP_NMISS, nary_num_func, NULL},
- {"NVALID", OP_NVALID, nary_num_func, NULL},
- {"RANGE", OP_RANGE, nary_num_func, NULL},
- {"SD", OP_SD, nary_num_func, NULL},
- {"SUM", OP_SUM, nary_num_func, NULL},
- {"VARIANCE", OP_VARIANCE, nary_num_func, NULL},
-
- {"CONCAT", OP_CONCAT, CONCAT_func, NULL},
- {"INDEX", OP_INDEX, generic_str_func, "nss/n"},
- {"RINDEX", OP_RINDEX, generic_str_func, "nss/n"},
- {"LENGTH", OP_LENGTH, generic_str_func, "ns"},
- {"LOWER", OP_LOWER, generic_str_func, "ss"},
- {"UPCAS", OP_UPPER, generic_str_func, "ss"},
- {"LPAD", OP_LPAD, generic_str_func, "ssn/s"},
- {"RPAD", OP_RPAD, generic_str_func, "ssn/s"},
- {"LTRIM", OP_LTRIM, generic_str_func, "ss/s"},
- {"RTRIM", OP_RTRIM, generic_str_func, "ss/s"},
- {"NUMBER", OP_NUMBER, generic_str_func, "ns/f"},
- {"STRING", OP_STRING, generic_str_func, "snf"},
- {"SUBSTR", OP_SUBSTR, generic_str_func, "ssn/n"},
+ {"ABS", OP_ABS, unary_func},
+ {"ACOS", OP_ARCOS, unary_func},
+ {"ARCOS", OP_ARCOS, unary_func},
+ {"ARSIN", OP_ARSIN, unary_func},
+ {"ARTAN", OP_ARTAN, unary_func},
+ {"ASIN", OP_ARSIN, unary_func},
+ {"ATAN", OP_ARTAN, unary_func},
+ {"COS", OP_COS, unary_func},
+ {"EXP", OP_EXP, unary_func},
+ {"LG10", OP_LG10, unary_func},
+ {"LN", OP_LN, unary_func},
+ {"MOD10", OP_MOD10, unary_func},
+ {"NORMAL", OP_NORMAL, unary_func},
+ {"RND", OP_RND, unary_func},
+ {"SIN", OP_SIN, unary_func},
+ {"SQRT", OP_SQRT, unary_func},
+ {"TAN", OP_TAN, unary_func},
+ {"TRUNC", OP_TRUNC, unary_func},
+ {"UNIFORM", OP_UNIFORM, unary_func},
+
+ {"TIME.DAYS", OP_TIME_DAYS, unary_func},
+ {"TIME.HMS", OP_TIME_HMS, ternary_func},
+
+ {"CTIME.DAYS", OP_CTIME_DAYS, unary_func},
+ {"CTIME.HOURS", OP_CTIME_HOURS, unary_func},
+ {"CTIME.MINUTES", OP_CTIME_MINUTES, unary_func},
+ {"CTIME.SECONDS", OP_CTIME_SECONDS, unary_func},
+
+ {"DATE.DMY", OP_DATE_DMY, ternary_func},
+ {"DATE.MDY", OP_DATE_MDY, ternary_func},
+ {"DATE.MOYR", OP_DATE_MOYR, binary_func},
+ {"DATE.QYR", OP_DATE_QYR, binary_func},
+ {"DATE.WKYR", OP_DATE_WKYR, binary_func},
+ {"DATE.YRDAY", OP_DATE_YRDAY, binary_func},
+
+ {"XDATE.DATE", OP_XDATE_DATE, unary_func},
+ {"XDATE.HOUR", OP_XDATE_HOUR, unary_func},
+ {"XDATE.JDAY", OP_XDATE_JDAY, unary_func},
+ {"XDATE.MDAY", OP_XDATE_MDAY, unary_func},
+ {"XDATE.MINUTE", OP_XDATE_MINUTE, unary_func},
+ {"XDATE.MONTH", OP_XDATE_MONTH, unary_func},
+ {"XDATE.QUARTER", OP_XDATE_QUARTER, unary_func},
+ {"XDATE.SECOND", OP_XDATE_SECOND, unary_func},
+ {"XDATE.TDAY", OP_XDATE_TDAY, unary_func},
+ {"XDATE.TIME", OP_XDATE_TIME, unary_func},
+ {"XDATE.WEEK", OP_XDATE_WEEK, unary_func},
+ {"XDATE.WKDAY", OP_XDATE_WKDAY, unary_func},
+ {"XDATE.YEAR", OP_XDATE_YEAR, unary_func},
+
+ {"MISSING", OP_SYSMIS, MISSING_func},
+ {"MOD", OP_MOD, binary_func},
+ {"SYSMIS", OP_SYSMIS, SYSMIS_func},
+ {"VALUE", OP_NUM_VAL, VALUE_func},
+ {"LAG", OP_NUM_LAG, LAG_func},
+ {"YRMODA", OP_YRMODA, ternary_func},
+
+ {"ANY", OP_ANY, nary_num_func},
+ {"CFVAR", OP_CFVAR, nary_num_func},
+ {"MAX", OP_MAX, nary_num_func},
+ {"MEAN", OP_MEAN, nary_num_func},
+ {"MIN", OP_MIN, nary_num_func},
+ {"NMISS", OP_NMISS, nary_num_func},
+ {"NVALID", OP_NVALID, nary_num_func},
+ {"RANGE", OP_RANGE, nary_num_func},
+ {"SD", OP_SD, nary_num_func},
+ {"SUM", OP_SUM, nary_num_func},
+ {"VAR", OP_VARIANCE, nary_num_func},
+ {"VARIANCE", OP_VARIANCE, nary_num_func},
+
+ {"CONCAT", OP_CONCAT, CONCAT_func},
+
+ {"INDEX", OP_INDEX_2, generic_str_func},
+ {"RINDEX", OP_RINDEX_2, generic_str_func},
+ {"LENGTH", OP_LENGTH, generic_str_func},
+ {"LOWER", OP_LOWER, generic_str_func},
+ {"UPCASE", OP_UPPER, generic_str_func},
+ {"LPAD", OP_LPAD, generic_str_func},
+ {"RPAD", OP_RPAD, generic_str_func},
+ {"LTRIM", OP_LTRIM, generic_str_func},
+ {"RTRIM", OP_RTRIM, generic_str_func},
+ {"NUMBER", OP_NUMBER, generic_str_func},
+ {"STRING", OP_STRING, generic_str_func},
+ {"SUBSTR", OP_SUBSTR_2, generic_str_func},
};
/* An algo_compare_func that compares functions A and B based on
}
#endif /* DEBUGGING */
-#if GLOBAL_DEBUGGING
-static void
-debug_print_postfix (struct expression * e)
+void
+expr_debug_print_postfix (const struct expression *e)
{
- unsigned char *o;
- double *num = e->num;
- unsigned char *str = e->str;
- struct variable **v = e->var;
+ const unsigned char *o;
+ const double *num = e->num;
+ const unsigned char *str = e->str;
+ struct variable *const *v = e->var;
int t;
- debug_printf ((_("postfix:")));
+ printf ("postfix:");
for (o = e->op; *o != OP_SENTINEL;)
{
t = *o++;
- if (t < OP_TERMINAL)
+ if (IS_NONTERMINAL (t))
{
- debug_printf ((" %s", ops[t].name));
+ printf (" %s", ops[t].name);
if (ops[t].flags & OP_VAR_ARGS)
{
- debug_printf (("(%d)", *o));
+ printf ("(%d)", *o);
o++;
}
if (ops[t].flags & OP_MIN_ARGS)
{
- debug_printf ((".%d", *o));
+ printf (".%d", *o);
o++;
}
if (ops[t].flags & OP_FMT_SPEC)
f.type = (int) *o++;
f.w = (int) *o++;
f.d = (int) *o++;
- debug_printf (("(%s)", fmt_to_string (&f)));
+ printf ("(%s)", fmt_to_string (&f));
}
}
else if (t == OP_NUM_CON)
{
if (*num == SYSMIS)
- debug_printf ((" SYSMIS"));
+ printf (" SYSMIS");
else
- debug_printf ((" %f", *num));
+ printf (" %f", *num);
num++;
}
else if (t == OP_STR_CON)
{
- debug_printf ((" \"%.*s\"", *str, &str[1]));
+ printf (" \"%.*s\"", *str, &str[1]);
str += str[0] + 1;
}
else if (t == OP_NUM_VAR || t == OP_STR_VAR)
{
- debug_printf ((" %s", (*v)->name));
+ printf (" %s", (*v)->name);
v++;
}
else if (t == OP_NUM_SYS)
{
- debug_printf ((" SYSMIS(#%d)", *o));
+ printf (" SYSMIS(#%d)", *o);
o++;
}
else if (t == OP_NUM_VAL)
{
- debug_printf ((" VALUE(#%d)", *o));
+ printf (" VALUE(#%d)", *o);
o++;
}
else if (t == OP_NUM_LAG || t == OP_STR_LAG)
{
- debug_printf ((" LAG(%s,%d)", (*v)->name, *o));
+ printf (" LAG(%s,%d)", (*v)->name, *o);
o++;
v++;
}
else
{
- printf ("debug_print_postfix(): %d\n", t);
+ printf ("%d unknown\n", t);
assert (0);
}
}
- debug_putc ('\n', stdout);
+ putchar ('\n');
}
-#endif /* GLOBAL_DEBUGGING */
+\f
+#define DEFINE_OPERATOR(NAME, STACK_DELTA, FLAGS, ARGS) \
+ {#NAME, STACK_DELTA, FLAGS, ARGS},
+struct op_desc ops[OP_SENTINEL] =
+ {
+#include "expr.def"
+ };
--- /dev/null
+/* One operand, one result. */
+#define UNARY_OPERATOR(NAME) DEFINE_OPERATOR (NAME, 0, OP_NO_FLAGS, 0)
+
+/* Two operands, one result. */
+#define BINARY_OPERATOR(NAME) DEFINE_OPERATOR (NAME, -1, OP_NO_FLAGS, 0)
+
+/* Three operands, one result. */
+#define TERNARY_OPERATOR(NAME) DEFINE_OPERATOR (NAME, -2, OP_NO_FLAGS, 0)
+
+/* Variable operands, one result. */
+#define NARY_OPERATOR(NAME, FLAGS, ARGS) \
+ DEFINE_OPERATOR (NAME, 0, OP_VAR_ARGS | OP_ABSORB_MISS | (FLAGS), ARGS)
+
+/* No operands, one result. */
+#define TERMINAL(NAME, ARGS) DEFINE_OPERATOR (NAME, +1, OP_NO_FLAGS, ARGS)
+
+/* Basic operators. */
+BINARY_OPERATOR (ADD)
+BINARY_OPERATOR (SUB)
+DEFINE_OPERATOR (MUL, -1, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (DIV, -1, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (MOD, -1, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (POW, -1, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (AND, -1, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (OR, -1, OP_ABSORB_MISS, 0)
+UNARY_OPERATOR (NOT)
+
+/* Numeric relational operators. */
+BINARY_OPERATOR (EQ)
+BINARY_OPERATOR (GE)
+BINARY_OPERATOR (GT)
+BINARY_OPERATOR (LE)
+BINARY_OPERATOR (LT)
+BINARY_OPERATOR (NE)
+
+/* String relational operators. */
+BINARY_OPERATOR (EQ_STRING)
+BINARY_OPERATOR (GE_STRING)
+BINARY_OPERATOR (GT_STRING)
+BINARY_OPERATOR (LE_STRING)
+BINARY_OPERATOR (LT_STRING)
+BINARY_OPERATOR (NE_STRING)
+
+/* Unary functions. */
+UNARY_OPERATOR (NEG)
+UNARY_OPERATOR (ABS)
+UNARY_OPERATOR (ARCOS)
+UNARY_OPERATOR (ARSIN)
+UNARY_OPERATOR (ARTAN)
+UNARY_OPERATOR (COS)
+UNARY_OPERATOR (EXP)
+UNARY_OPERATOR (LG10)
+UNARY_OPERATOR (LN)
+UNARY_OPERATOR (MOD10)
+UNARY_OPERATOR (RND)
+UNARY_OPERATOR (SIN)
+UNARY_OPERATOR (SQRT)
+UNARY_OPERATOR (TAN)
+UNARY_OPERATOR (TRUNC)
+
+/* N-ary numeric functions. */
+NARY_OPERATOR (ANY, 0, 1)
+NARY_OPERATOR (ANY_STRING, 0, 1)
+NARY_OPERATOR (CFVAR, OP_MIN_ARGS, 2)
+NARY_OPERATOR (MAX, OP_MIN_ARGS, 2)
+NARY_OPERATOR (MAX_STRING, 0, 1)
+NARY_OPERATOR (MEAN, OP_MIN_ARGS, 2)
+NARY_OPERATOR (MIN, OP_MIN_ARGS, 2)
+NARY_OPERATOR (MIN_STRING, 0, 1)
+NARY_OPERATOR (NMISS, 0, 1)
+NARY_OPERATOR (NVALID, 0, 1)
+NARY_OPERATOR (RANGE, 0, 1)
+NARY_OPERATOR (RANGE_STRING, 0, 1)
+NARY_OPERATOR (SD, OP_MIN_ARGS, 2)
+NARY_OPERATOR (SUM, OP_MIN_ARGS, 2)
+NARY_OPERATOR (VARIANCE, OP_MIN_ARGS, 2)
+
+/* Time construction & extraction functions. */
+TERNARY_OPERATOR (TIME_HMS)
+UNARY_OPERATOR (CTIME_DAYS)
+UNARY_OPERATOR (CTIME_HOURS)
+UNARY_OPERATOR (CTIME_MINUTES)
+UNARY_OPERATOR (CTIME_SECONDS)
+UNARY_OPERATOR (TIME_DAYS)
+
+/* Date construction functions. */
+TERNARY_OPERATOR (DATE_DMY)
+TERNARY_OPERATOR (DATE_MDY)
+BINARY_OPERATOR (DATE_MOYR)
+BINARY_OPERATOR (DATE_QYR)
+BINARY_OPERATOR (DATE_WKYR)
+BINARY_OPERATOR (DATE_YRDAY)
+TERNARY_OPERATOR (YRMODA)
+
+/* Date extraction functions. */
+UNARY_OPERATOR (XDATE_DATE)
+UNARY_OPERATOR (XDATE_HOUR)
+UNARY_OPERATOR (XDATE_JDAY)
+UNARY_OPERATOR (XDATE_MDAY)
+UNARY_OPERATOR (XDATE_MINUTE)
+UNARY_OPERATOR (XDATE_MONTH)
+UNARY_OPERATOR (XDATE_QUARTER)
+UNARY_OPERATOR (XDATE_SECOND)
+UNARY_OPERATOR (XDATE_TDAY)
+UNARY_OPERATOR (XDATE_TIME)
+UNARY_OPERATOR (XDATE_WEEK)
+UNARY_OPERATOR (XDATE_WKDAY)
+UNARY_OPERATOR (XDATE_YEAR)
+
+/* String functions. */
+NARY_OPERATOR (CONCAT, 0, 0)
+BINARY_OPERATOR (INDEX_2)
+TERNARY_OPERATOR (INDEX_3)
+BINARY_OPERATOR (RINDEX_2)
+TERNARY_OPERATOR (RINDEX_3)
+UNARY_OPERATOR (LENGTH)
+UNARY_OPERATOR (LOWER)
+UNARY_OPERATOR (UPPER)
+DEFINE_OPERATOR (LPAD, -2, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (RPAD, -2, OP_ABSORB_MISS, 0)
+UNARY_OPERATOR (LTRIM)
+UNARY_OPERATOR (RTRIM)
+DEFINE_OPERATOR (NUMBER, -1, OP_FMT_SPEC, 3)
+DEFINE_OPERATOR (STRING, 0, OP_FMT_SPEC | OP_ABSORB_MISS, 3)
+DEFINE_OPERATOR (SUBSTR_2, -1, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (SUBSTR_3, -2, OP_ABSORB_MISS, 0)
+
+/* Artificial. */
+UNARY_OPERATOR (SQUARE)
+UNARY_OPERATOR (NUM_TO_BOOL)
+
+/* Weirdness. */
+UNARY_OPERATOR (NORMAL)
+UNARY_OPERATOR (UNIFORM)
+DEFINE_OPERATOR (SYSMIS, 0, OP_ABSORB_MISS, 0)
+DEFINE_OPERATOR (VEC_ELEM_NUM, 0, OP_MIN_ARGS, 1)
+DEFINE_OPERATOR (VEC_ELEM_STR, 0, OP_MIN_ARGS, 1)
+
+/* Terminals. */
+TERMINAL (NUM_CON, 0)
+TERMINAL (STR_CON, 0)
+TERMINAL (NUM_VAR, 0)
+TERMINAL (STR_VAR, 0)
+TERMINAL (NUM_LAG, 1)
+TERMINAL (STR_LAG, 1)
+TERMINAL (NUM_SYS, 1)
+TERMINAL (NUM_VAL, 1)
+TERMINAL (STR_MIS, 1)
+TERMINAL (CASENUM, 0)
+
+#undef DEFINE_OPERATOR
#define expr_h 1
/* Expression parsing flags. */
-enum
+enum expr_type
{
- PXP_NONE = 000, /* No flags. */
-
- /* Specify expression type. */
- PXP_BOOLEAN = 002, /* Coerce return value to Boolean. */
- PXP_NUMERIC = 004, /* Must be numeric result type. */
- PXP_STRING = 010 /* Must be string result type. */
+ EXPR_ANY = 0, /* Any type. */
+ EXPR_BOOLEAN = 1, /* Must be numeric; coerce to Boolean. */
+ EXPR_NUMERIC = 2, /* Must be numeric result type. */
+ EXPR_STRING = 3, /* Must be string result type. */
+ EXPR_ERROR = 4, /* Indicates an error. */
+ EXPR_NO_OPTIMIZE = 0x1000 /* May be set in expr_parse()
+ argument to disable optimization. */
};
struct expression;
struct ccase;
union value;
-struct expression *expr_parse (int flags);
-double expr_evaluate (struct expression *, const struct ccase *, int case_num,
- union value *);
+struct expression *expr_parse (enum expr_type);
+enum expr_type expr_get_type (const struct expression *);
+double expr_evaluate (const struct expression *, const struct ccase *,
+ int case_num, union value *);
void expr_free (struct expression *);
+void expr_debug_print_postfix (const struct expression *);
#endif /* expr.h */
/*#define DEBUGGING 1*/
#include "debug-print.h"
-#if GLOBAL_DEBUGGING
-void debug_print_expr (struct expression *);
void debug_print_op (short int *);
-#endif
-/* Expression types. */
-enum
- {
- EX_ERROR, /* Error value for propagation. */
- EX_BOOLEAN, /* Numeric value that's 0, 1, or SYSMIS. */
- EX_NUMERIC, /* Numeric value. */
- EX_STRING /* String value. */
- };
-/* Expression operators.
- The ordering below is important. Do not change it. */
+/* Expression operators. */
+#define DEFINE_OPERATOR(NAME, STACK_DELTA, FLAGS, ARGS) \
+ OP_##NAME,
enum
{
- OP_ERROR,
-
- /* Basic operators. */
- OP_PLUS,
- OP_MUL,
- OP_POW,
- OP_AND,
- OP_OR,
- OP_NOT,
-
- /* Numeric relational operators. */
- OP_EQ,
- OP_GE,
- OP_GT,
- OP_LE,
- OP_LT,
- OP_NE,
-
- /* String relational operators. */
- OP_STRING_EQ,
- OP_STRING_GE,
- OP_STRING_GT,
- OP_STRING_LE,
- OP_STRING_LT,
- OP_STRING_NE,
-
- /* Unary functions. */
- OP_NEG,
- OP_ABS,
- OP_ARCOS,
- OP_ARSIN,
- OP_ARTAN,
- OP_COS,
- OP_EXP,
- OP_LG10,
- OP_LN,
- OP_MOD10,
- OP_RND,
- OP_SIN,
- OP_SQRT,
- OP_TAN,
- OP_TRUNC,
-
- /* N-ary numeric functions. */
- OP_ANY,
- OP_ANY_STRING,
- OP_CFVAR,
- OP_MAX,
- OP_MEAN,
- OP_MIN,
- OP_NMISS,
- OP_NVALID,
- OP_RANGE,
- OP_RANGE_STRING,
- OP_SD,
- OP_SUM,
- OP_VARIANCE,
-
- /* Time construction & extraction functions. */
- OP_TIME_HMS,
-
- /* These never appear in a tree or an expression.
- They disappear in parse.c:unary_func(). */
- OP_CTIME_DAYS,
- OP_CTIME_HOURS,
- OP_CTIME_MINUTES,
- OP_CTIME_SECONDS,
- OP_TIME_DAYS,
-
- /* Date construction functions. */
- OP_DATE_DMY,
- OP_DATE_MDY,
- OP_DATE_MOYR,
- OP_DATE_QYR,
- OP_DATE_WKYR,
- OP_DATE_YRDAY,
- OP_YRMODA,
-
- /* Date extraction functions. */
- OP_XDATE_DATE,
- OP_XDATE_HOUR,
- OP_XDATE_JDAY,
- OP_XDATE_MDAY,
- OP_XDATE_MINUTE,
- OP_XDATE_MONTH,
- OP_XDATE_QUARTER,
- OP_XDATE_SECOND,
- OP_XDATE_TDAY,
- OP_XDATE_TIME,
- OP_XDATE_WEEK,
- OP_XDATE_WKDAY,
- OP_XDATE_YEAR,
-
- /* String functions. */
- OP_CONCAT,
- OP_INDEX,
- OP_INDEX_OPT,
- OP_RINDEX,
- OP_RINDEX_OPT,
- OP_LENGTH,
- OP_LOWER,
- OP_UPPER,
- OP_LPAD,
- OP_LPAD_OPT,
- OP_RPAD,
- OP_RPAD_OPT,
- OP_LTRIM,
- OP_LTRIM_OPT,
- OP_RTRIM,
- OP_RTRIM_OPT,
- OP_NUMBER,
- OP_NUMBER_OPT,
- OP_STRING,
- OP_SUBSTR,
- OP_SUBSTR_OPT,
-
- /* Artificial. */
- OP_INV, /* Reciprocal. */
- OP_SQUARE, /* Squares the argument. */
- OP_NUM_TO_BOOL, /* Converts ~0=>0, ~1=>1, SYSMIS=>SYSMIS,
- others=>0 with a warning. */
-
- /* Weirdness. */
- OP_MOD, /* Modulo function. */
- OP_NORMAL, /* Normally distributed PRNG. */
- OP_UNIFORM, /* Uniformly distributed PRNG. */
- OP_SYSMIS, /* Tests whether for SYSMIS argument. */
- OP_VEC_ELEM_NUM, /* Element of a numeric vector. */
- OP_VEC_ELEM_STR, /* Element of a string vector. */
-
- /* Terminals. */
- OP_TERMINAL, /* Not a valid type. Boundary
- between terminals and nonterminals. */
-
- OP_NUM_CON, /* Numeric constant. */
- OP_STR_CON, /* String literal. */
- OP_NUM_VAR, /* Numeric variable reference. */
- OP_STR_VAR, /* String variable reference. */
- OP_NUM_LAG, /* Numeric variable from an earlier case. */
- OP_STR_LAG, /* String variable from an earlier case. */
- OP_NUM_SYS, /* SYSMIS(numvar). */
- OP_NUM_VAL, /* VALUE(numvar). */
- OP_STR_MIS, /* MISSING(strvar). */
- OP_CASENUM, /* $CASENUM. */
- OP_SENTINEL /* Sentinel. */
+#include "expr.def"
+ OP_SENTINEL
};
+#define IS_TERMINAL(OPERATOR) (ops[OPERATOR].height > 0)
+#define IS_NONTERMINAL(OPERATOR) !IS_TERMINAL (OPERATOR)
+
/* Flags that describe operators. */
enum
{
+ OP_NO_FLAGS = 0, /* No flags. */
OP_VAR_ARGS = 001, /* 1=Variable number of args. */
OP_MIN_ARGS = 002, /* 1=Can specific min args with .X. */
OP_FMT_SPEC = 004, /* 1=Includes a format specifier. */
/* Describes an operator. */
struct op_desc
{
-#if GLOBAL_DEBUGGING
const char *name; /* Operator name. */
-#endif
- unsigned char flags; /* Flags. */
signed char height; /* Effect on stack height. */
+ unsigned char flags; /* Flags. */
unsigned char skip; /* Number of operator item arguments. */
};
struct var_node
{
int type; /* OP_NUM_VAR, OP_NUM_SYS, OP_NUM_VAL,
- OP_STR_MIS, or OP_STR_VAR. */
+ or OP_STR_VAR. */
struct variable *v; /* Variable. */
};
/* An expression. */
struct expression
{
- int type; /* Type of expression result. */
+ enum expr_type type; /* Type of expression result. */
unsigned char *op; /* Operators. */
struct variable **var; /* Variables. */
double *num; /* Numeric operands. */
struct pool *pool; /* Pool for evaluation temporaries. */
};
-struct nonterm_node *optimize_expression (struct nonterm_node *);
+void optimize_expression (union any_node **);
void dump_expression (union any_node *, struct expression *);
void free_node (union any_node *);
name = mtf.tail->last;
sbc = "LAST";
}
- else
- assert (0);
+ else
+ {
+ assert (0);
+ abort ();
+ }
lex_match ('=');
if (token != T_ID)
default:
assert (0);
+ abort ();
}
line = readline (prompt);
break;
default:
assert (0);
+ abort ();
}
if (*dest)
free (*dest);
int case_num UNUSED)
{
assert (0);
+ abort ();
}
/* REREAD transformation. */
return CMD_FAILURE;
}
- e = expr_parse (PXP_NUMERIC);
+ e = expr_parse (EXPR_NUMERIC);
if (!e)
return CMD_FAILURE;
}
\f
/* Weird line processing functions. */
-/* Discards the rest of the current input line for tokenization
- purposes, but returns the entire contents of the line for use by
- the caller. */
-char *
+/* Returns the entire contents of the current line. */
+const char *
lex_entire_line (void)
{
- prog = ds_end (&getl_buf);
- dot = 0;
return ds_value (&getl_buf);
}
/* As lex_entire_line(), but only returns the part of the current line
that hasn't already been tokenized.
- If HAD_DOT is non-null, stores nonzero into *HAD_DOT if the line
+ If END_DOT is non-null, stores nonzero into *END_DOT if the line
ends with a terminal dot, or zero if it doesn't. */
-char *
-lex_rest_of_line (int *had_dot)
+const char *
+lex_rest_of_line (int *end_dot)
{
- char *s = prog;
- prog = ds_end (&getl_buf);
-
- if (had_dot)
- *had_dot = dot;
- dot = 0;
-
- return s;
+ if (end_dot)
+ *end_dot = dot;
+ return prog;
}
/* Causes the rest of the current input line to be ignored for
void
lex_discard_line (void)
{
- msg (SW, _("The rest of this command has been discarded."));
-
- ds_clear (&getl_buf);
- prog = ds_value (&getl_buf);
+ prog = ds_end (&getl_buf);
dot = put_token = 0;
}
void lex_put_back_id (const char *tokid);
/* Weird line processing functions. */
-char *lex_entire_line (void);
-char *lex_rest_of_line (int *had_dot);
+const char *lex_entire_line (void);
+const char *lex_rest_of_line (int *end_dot);
void lex_discard_line (void);
void lex_set_prog (char *p);
assert (token == '=');
lex_get ();
- one->init = expr_parse (PXP_NUMERIC);
+ one->init = expr_parse (EXPR_NUMERIC);
if (!one->init)
return 0;
expr_free (one->init);
return 0;
}
- one->term = expr_parse (PXP_NUMERIC);
+ one->term = expr_parse (EXPR_NUMERIC);
if (!one->term)
{
expr_free (one->init);
if (lex_match (T_BY))
{
- one->incr = expr_parse (PXP_NUMERIC);
+ one->incr = expr_parse (EXPR_NUMERIC);
if (!one->incr)
return 0;
}
{
two->flags |= LPC_COND;
- two->cond = expr_parse (PXP_BOOLEAN);
+ two->cond = expr_parse (EXPR_BOOLEAN);
if (!two->cond)
return 0;
}
/* Parse the expression if any. */
if (lex_match_id ("IF"))
{
- thr->cond = expr_parse (PXP_BOOLEAN);
+ thr->cond = expr_parse (EXPR_BOOLEAN);
if (!thr->cond)
return 0;
}
while (token != T_STOP && token != '.')
lex_get ();
}
- else
- lex_discard_line ();
+ else
+ {
+ msg (SW, _("The rest of this command has been discarded."));
+ lex_discard_line ();
+ }
}
break;
default:
assert (0);
+ abort ();
}
break;
case 2:
break;
default:
assert (0);
+ abort ();
}
{
break;
default:
assert (0);
+ abort ();
}
break;
default:
assert (0);
+ abort ();
}
c->n_rows[wr->content]++;
break;
default:
assert (0);
+ abort ();
}
/* Create each variable. */
break;
default:
assert (0);
+ abort ();
}
if (setting)
x->output_options |= mask;
break;
default:
assert (0);
+ abort ();
}
if (*dest)
free (*dest);
break;
default:
assert (0);
+ abort ();
}
lp = line;
case PRT_ERROR:
default:
assert (0);
+ abort ();
}
if (pot_w > w)
w = pot_w;
if (token != '.')
{
- e = expr_parse (PXP_NUMERIC);
+ e = expr_parse (EXPR_NUMERIC);
if (token != '.')
{
expr_free (e);
break;
default:
assert (0);
+ abort ();
}
if (!cp)
continue;
REPEAT line. We should actually check for the PRINT specifier.
This can be done easier when we buffer entire commands instead of
doing it token by token; see TODO. */
- lex_entire_line ();
+ lex_discard_line ();
/* Tie up the loose end of the chain. */
if (line_buf_head == NULL)
struct expression *e;
struct select_if_trns *t;
- e = expr_parse (PXP_BOOLEAN);
+ e = expr_parse (EXPR_BOOLEAN);
if (!e)
return CMD_FAILURE;
{
struct expression *e;
- e = expr_parse (PXP_BOOLEAN);
+ e = expr_parse (EXPR_BOOLEAN);
if (!e)
return CMD_FAILURE;
break;
default:
assert (0);
+ abort ();
}
sv.n_missing_values = nm;
return y - (-x) % y;
assert (0);
+ abort ();
}
/* A run of data for use in merging. */
return 0;
}
-/* Compares S0 of length S0L to S1 of length S1L. The shorter string
- is considered to be padded with spaces to the length of the
- longer. */
+/* Compares A of length A_LEN to B of length B_LEN. The shorter
+ string is considered to be padded with spaces to the length of
+ the longer. */
int
-st_compare_pad (const char *s0, int s0l, const char *s1, int s1l)
+st_compare_pad (const char *a, size_t a_len, const char *b, size_t b_len)
{
- /* 254 spaces. */
- static char blanks[254] =
- " "
- " "
- " "
- " "
- " ";
+ size_t min_len;
+ int result;
- int diff = s0l - s1l;
- int r;
-
- if (diff == 0)
- {
- if (s0l == 0)
- return 0;
- return memcmp (s0, s1, s0l);
- }
- else if (diff > 0)
+ min_len = a_len < b_len ? a_len : b_len;
+ result = memcmp (a, b, min_len);
+ if (result != 0)
+ return result;
+ else
{
- /* s0l > s1l */
- if (s1l)
- {
- r = memcmp (s0, s1, s1l);
- if (r)
- return r;
- }
- return memcmp (&s0[s1l], blanks, diff);
- }
- else
- /* diff<0 */
- {
- /* s0l < s1l */
- if (s0l)
- {
- r = memcmp (s0, s1, s0l);
- if (r)
- return r;
- }
- return memcmp (blanks, &s1[s0l], -diff);
+ size_t idx;
+
+ if (a_len < b_len)
+ {
+ for (idx = min_len; idx < b_len; idx++)
+ if (' ' != b[idx])
+ return ' ' > b[idx] ? 1 : -1;
+ }
+ else
+ {
+ for (idx = min_len; idx < a_len; idx++)
+ if (a[idx] != ' ')
+ return a[idx] > ' ' ? 1 : -1;
+ }
+ return 0;
}
}
void mm_reverse (void *, size_t);
char *mm_find_reverse (const char *, size_t, const char *, size_t);
-int st_compare_pad (const char *, int, const char *, int);
+int st_compare_pad (const char *, size_t, const char *, size_t);
char *st_spaces (int);
void st_bare_pad_copy (char *dest, const char *src, size_t n);
void st_bare_pad_len_copy (char *dest, const char *src, size_t n, size_t len);
if (*title)
free (*title);
*title = xstrdup (lex_rest_of_line (NULL));
+ lex_discard_line ();
for (cp = *title; *cp; cp++)
*cp = toupper ((unsigned char) (*cp));
token = '.';
int
cmd_file_label (void)
{
- char *label;
+ const char *label;
label = lex_rest_of_line (NULL);
+ lex_discard_line ();
while (isspace ((unsigned char) *label))
label++;
for (;;)
{
int had_dot;
- char *line;
+ const char *orig_line;
+ char *copy_line;
- line = lex_rest_of_line (&had_dot);
- while (isspace ((unsigned char) *line))
- line++;
+ orig_line = lex_rest_of_line (&had_dot);
+ lex_discard_line ();
+ while (isspace ((unsigned char) *orig_line))
+ orig_line++;
+ copy_line = xmalloc (strlen (orig_line) + 2);
+ strcpy (copy_line, orig_line);
if (had_dot)
- {
- char *cp = strchr (line, 0);
- *cp++ = '.';
- *cp = 0;
- }
+ strcat (copy_line, ".");
- add_document_line (line, 3);
+ add_document_line (copy_line, 3);
+ free (copy_line);
lex_get_line ();
if (had_dot)
int i;
- lex_match ('/');
do
{
- parse_variables (default_dict, &v, &nv, PV_NONE);
+ if (!parse_variables (default_dict, &v, &nv, PV_NONE))
+ return CMD_PART_SUCCESS_MAYBE;
if (token != T_STRING)
{
return _("scratch");
default:
assert (0);
+ abort ();
}
}
/* Parses a set of variables from dictionary D given options
OPTS. Resulting list of variables stored in *VAR and the
- number of variables into *CNT. */
+ number of variables into *CNT. Returns nonzero only if
+ successful. */
int
parse_variables (struct dictionary *d, struct variable ***var, int *cnt,
int opts)
+Fri Mar 26 00:55:48 2004 Ben Pfaff <blp@gnu.org>
+
+ * Makefile.am: (TESTS) Add xforms/expressions.sh, remove
+ command/compute.sh.
+
+ * command/beg-data.sh: Remove REMARK usage.
+
+ * command/bignum.sh: Ditto.
+
+ * command/list.sh: Ditto.
+
+ * command/print.sh: Ditto.
+
Sat Mar 20 18:11:15 2004 Ben Pfaff <blp@gnu.org>
* command/list.sh: Update output.
command/beg-data.sh \
command/bignum.sh \
command/count.sh \
- command/compute.sh \
command/descriptives.sh \
command/erase.sh \
command/file-label.sh \
bugs/random.sh \
bugs/t-test.sh \
bugs/temporary.sh \
- bugs/val-labs.sh
+ bugs/val-labs.sh \
+ xforms/expressions.sh
noinst_PROGRAMS = gengarbage
cat > $TEMPDIR/prog.sps << EOF_foobar
title 'Test BEGIN DATA ... END DATA'.
-remark EOF
-----------------------------------------------------------------------
-First we show that we can input data with BEGIN DATA/END DATA after
-a procedure.
-----------------------------------------------------------------------
-EOF
data list /A B 1-2.
list.
begin data.
90
end data.
-remark EOF
-----------------------------------------------------------------------
-Next we show that BEGIN DATA/END DATA work fine on their own as well.
-----------------------------------------------------------------------
-EOF
data list /A B 1-2.
begin data.
09
activity="compare data"
diff -b -B $TEMPDIR/pspp.list - << foobar
-----------------------------------------------------------------------
-First we show that we can input data with BEGIN DATA/END DATA after
-a procedure.
-----------------------------------------------------------------------
-
1.1 DATA LIST. Reading 1 record from the command file.
+--------+------+-------+------+
|Variable|Record|Columns|Format|
7 8
9 0
-----------------------------------------------------------------------
-Next we show that BEGIN DATA/END DATA work fine on their own as well.
-----------------------------------------------------------------------
-
2.1 DATA LIST. Reading 1 record from the command file.
+--------+------+-------+------+
|Variable|Record|Columns|Format|
title 'Test use of big numbers'.
*** Do the portable output.
-remark EOF
-----------------------------------------------------------------------
-Testing use of big numbers.\n
-The numbers in the data file are designed for IEEE754 double
-format--if your system uses something different then the test needs to
-be adjusted for whatever are big numbers to your system.
-----------------------------------------------------------------------
-EOF
data list file='$TEMPDIR/bignum.data'/BIGNUM 1-40.
list.
*** Do the nonportable output for fun.
-remark EOF
-NOCOMP
-SUCCESS?
-----------------------------------------------------------------------
-This test merely shows whether your system can successfully handle
-floating-point overflow. If you get a fatal exception at this point,
-the source needs some editing--glob.c should mask overflow exceptions
-in init_glob(). Again, the numbers are specific to IEEE754 double
-format.
-----------------------------------------------------------------------
-EOF
descriptives BIGNUM.
-rem-SUCCESS
foobar
if [ $? -ne 0 ] ; then no_result ; fi
+++ /dev/null
-#!/bin/sh
-
-# This program tests the COMPUTE command
-# (it also gives LPAD and RPAD a work out)
-
-TEMPDIR=/tmp/pspp-tst-$$
-
-here=`pwd`;
-
-# ensure that top_srcdir is absolute
-cd $top_srcdir; top_srcdir=`pwd`
-
-export STAT_CONFIG_PATH=$top_srcdir/config
-
-
-cleanup()
-{
- rm -rf $TEMPDIR
-}
-
-
-fail()
-{
- echo $activity
- echo FAILED
- cleanup;
- exit 1;
-}
-
-
-no_result()
-{
- echo $activity
- echo NO RESULT;
- cleanup;
- exit 2;
-}
-
-pass()
-{
- cleanup;
- exit 0;
-}
-
-mkdir -p $TEMPDIR
-
-cd $TEMPDIR
-
-activity="create program"
-cat > $TEMPDIR/compute.stat <<EOF
-data list /w 1-3(a).
-begin data.
-123
-456
-919
-572
-end data.
-
-string z(a6).
-compute x=number(w).
-compute y=number(w,f8).
-compute z=lpad(
- rpad(
- substr(string(x,f6),4,1),
- 3,'@'),
- 6,'*').
-compute y=y+1e-10.
-list.
-EOF
-if [ $? -ne 0 ] ; then no_result ; fi
-
-
-activity="run program"
-$SUPERVISOR $here/../src/pspp --testing-mode -o raw-ascii $TEMPDIR/compute.stat
-if [ $? -ne 0 ] ; then no_result ; fi
-
-diff -B -b $TEMPDIR/pspp.list - <<EOF
-1.1 DATA LIST. Reading 1 record from the command file.
-+--------+------+-------+------+
-|Variable|Record|Columns|Format|
-#========#======#=======#======#
-|W | 1| 1- 3|A3 |
-+--------+------+-------+------+
-
- W Z X Y
---- ------ -------- --------
-123 ***1@@ 123.00 123.00
-456 ***4@@ 456.00 456.00
-919 ***9@@ 919.00 919.00
-572 ***5@@ 572.00 572.00
-
-EOF
-if [ $? -ne 0 ] ; then fail ; fi
-
-
-pass;
activity="create program"
cat > $TEMPDIR/list.stat << foobar
*** Single lines.
-remark EOF
-----------------------------------------------------------------------
-Testing use of LIST in single-line cases.
-----------------------------------------------------------------------
-EOF
data list file='$top_srcdir/tests/weighting.data'/AVAR 1-5 BVAR 6-10.
weight by BVAR.
list.
list /format numbered weight.
*** Multiple lines.
-remark EOF
-----------------------------------------------------------------------
-Testing use of LIST in multi-line cases.
-----------------------------------------------------------------------
-EOF
data list file='$top_srcdir/tests/list.data' notable /X000 to X126 1-127.
*list /cases=from 1 to 25 by 5 /format numbered.
list x000 to x030.
activity="compare results"
diff -b -B $TEMPDIR/pspp.list - <<EOF
-----------------------------------------------------------------------
-Testing use of LIST in single-line cases.
-----------------------------------------------------------------------
-
1.1 DATA LIST. Reading 1 record from file $top_srcdir/tests/weighting.data.
+--------+------+-------+------+
|Variable|Record|Columns|Format|
51 80 1
52 94 1
-----------------------------------------------------------------------
-Testing use of LIST in multi-line cases.
-----------------------------------------------------------------------
-
X000 X001 X002 X003 X004 X005 X006 X007 X008 X009 X010 X011 X012 X013 X014 X015 X016 X017 X018 X019 X020 X021 X022 X023 X024 X025 X026 X027 X028 X029 X030
---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
7 6 7 5 3 2 4 6 6 3 4 8 5 1 3 7 8 9 0 7 3 4 8 3 1 0 6 4 0 9 1
cat > $TEMPDIR/print.stat << foobar
title 'Test PRINT transformation'.
-remark EOF
-----------------------------------------------------------------------
-There is no test for DATA LIST FIXED since it is imagined that the
-rest of the tests give it a pretty good workout.
-----------------------------------------------------------------------
-EOF
-remark EOF
-----------------------------------------------------------------------
-Testing use of DATA LIST FREE.
-----------------------------------------------------------------------
-EOF
data list free table file='$TEMPDIR/data-list.data'/A B C D.
print outfile="foo" table/A(f8.2) '/' B(e8.2) '/' C(n10) '/' D(rbhex16) '/'.
print space a.
print outfile="foo" /a b c d.
list.
-remark EOF
-----------------------------------------------------------------------
-Testing use of DATA LIST LIST.
-----------------------------------------------------------------------
-EOF
data list list table file='$TEMPDIR/data-list.data'/A B C D.
print table/A B C D.
list.
activity="compare output"
diff -b -B $TEMPDIR/pspp.list - << EOF
-----------------------------------------------------------------------
-There is no test for DATA LIST FIXED since it is imagined that the
-rest of the tests give it a pretty good workout.
-----------------------------------------------------------------------
-
-----------------------------------------------------------------------
-Testing use of DATA LIST FREE.
-----------------------------------------------------------------------
-
1.1 DATA LIST. Reading free-form data from file $TEMPDIR/data-list.data.
+--------+------+
|Variable|Format|
. 10.00 11.00 12.00
-----------------------------------------------------------------------
-Testing use of DATA LIST LIST.
-----------------------------------------------------------------------
-
3.1 DATA LIST. Reading free-form data from file $TEMPDIR/data-list.data.
+--------+------+
|Variable|Format|
--- /dev/null
+#! /bin/sh
+
+# Tests the expression optimizer and evaluator.
+
+TEMPDIR=/tmp/pspp-tst-$$
+
+here=`pwd`;
+
+# ensure that top_srcdir is absolute
+cd $top_srcdir; top_srcdir=`pwd`
+
+export STAT_CONFIG_PATH=$top_srcdir/config
+
+
+cleanup()
+{
+ #rm -rf $TEMPDIR
+ :
+}
+
+
+fail()
+{
+ echo $activity
+ echo FAILED
+ cleanup;
+ exit 1;
+}
+
+
+no_result()
+{
+ echo $activity
+ echo NO RESULT;
+ cleanup;
+ exit 2;
+}
+
+pass()
+{
+ cleanup;
+ exit 0;
+}
+
+mkdir -p $TEMPDIR
+
+cd $TEMPDIR
+activity="create expressions list"
+sed -ne 's/#.*//;/^[ ]*$/!p' > $TEMPDIR/expr-list <<'EOF'
+
+# Number syntax.
+1e2 => 100.00
+1e+2 => 100.00
+1e-2 => 0.01
+1e-99 => 0.00
+
+# Test using numeric/string values as Booleans and vice-versa
+0 AND 1 => false
+$true AND 1 => true
+1 OR $false => true
+1 OR $sysmis => true
+2 OR $sysmis => sysmis
+2 AND $sysmis => false
+'string' AND $sysmis => error
+0 AND $sysmis => false
+(1>2) + 1 => 1.00
+$true + $false => 1.00
+
+# Addition and subtraction.
+1 + 2 => 3.00
+1 + $true => 2.00
+$sysmis + 1 => sysmis
+7676 + $sysmis => sysmis
+('foo') + 5 => error
+('foo') + ('bar') => error # Arithmetic concatenation requires CONCAT.
+'foo' + 'bar' => "foobar" # Lexical concatentation succeeds.
+1 +3 - 2 +4 -5 => 1.00
+1 - $true => 0.00
+$true - 4/3 => -0.33
+'string' - 1e10 => error
+9.5 - '' => error
+1 - 2 => -1.00
+52 -23 => 29.00
+
+# Multiplication and division
+5 * 10 => 50.00
+10 * $true => 10.00
+$true * 5 => 5.00
+1.5 * $true => 1.50
+5 * $sysmis => sysmis
+$sysmis * 15 => sysmis
+2 * 5 / 10 => 1.00
+1 / 2 => 0.50
+2 / 5 => 0.40
+12 / 3 / 2 => 2.00
+
+# Exponentiation.
+2**8 => 256.00
+(2**3)**4 => 4096.00 # Irritating, but compatible.
+2**3**4 => 4096.00
+
+# Unary minus.
+2+-3 => -1.00
+2*-3 => -6.00
+-3**2 => -9.00
+(-3)**2 => 9.00
+2**-1 => 0.50
+0**0 => sysmis
+0**-1 => sysmis
+(-3)**1.5 => sysmis
+
+# AND truth table.
+$false AND $false => false
+$false AND $true => false
+$false AND $sysmis => false
+$true AND $false => false
+$true AND $true => true
+$true AND $sysmis => sysmis
+$sysmis AND $false => false
+$sysmis AND $true => sysmis
+$sysmis AND $sysmis => sysmis
+$false & $false => false
+$false & $true => false
+$false & $sysmis => false
+$true & $false => false
+$true & $true => true
+$true & $sysmis => sysmis
+$sysmis & $false => false
+$sysmis & $true => sysmis
+$sysmis & $sysmis => sysmis
+
+# OR truth table.
+$false OR $false => false
+$false OR $true => true
+$false OR $sysmis => sysmis
+$true OR $false => true
+$true OR $true => true
+$true OR $sysmis => true
+$sysmis OR $false => sysmis
+$sysmis OR $true => true
+$sysmis OR $sysmis => sysmis
+$false | $false => false
+$false | $true => true
+$false | $sysmis => sysmis
+$true | $false => true
+$true | $true => true
+$true | $sysmis => true
+$sysmis | $false => sysmis
+$sysmis | $true => true
+$sysmis | $sysmis => sysmis
+
+# NOT truth table.
+not $false => true
+not 0 => true
+not 2.5 => true
+not $true => false
+not 1 => false
+not $sysmis => sysmis
+~ $false => true
+~ 0 => true
+~ 2.5 => true
+~ $true => false
+~ 1 => false
+~ $sysmis => sysmis
+
+# Relational operators.
+1 eq 1 => true
+1 = 1 => true
+1 eq 2 => false
+2 = 3 => false
+1 eq 'foobar' => error
+5 eq 'foobar' => error
+'baz' = 10 => error
+'quux' = 5.55 => error
+'foobar' = 'foobar' => true
+'quux' = 'bar' => false
+'bar ' = 'bar' => true
+'asdf ' = 'asdf ' => true
+'asdfj ' = 'asdf' => false
+1 + 2 = 3 => true # Check precedence.
+1 >= 2 = 2 ge 3 => false # Check precedence.
+3 ne 2 ~= 1 => false # Mathematically true.
+3 > 2 > 1 => false # Mathematically true.
+
+1 <= 2 => true
+2.5 <= 1.5 => false
+1 le 2 => true
+2 <= 2 => true
+2 le 2 => true
+2 < = 2 => error # Make sure <= token can't be split.
+1 <= 'foobar' => error
+5 <= 'foobar' => error
+'baz' <= 10 => error
+'quux' <= 5.55 => error
+'0123' <= '0123' => true
+'0123' <= '0124' => true
+'0124' le '0123' => false
+'0123 ' <= '0123' => true
+'0123' le '0123 ' => true
+
+1 < 2 => true
+2.5 < 1.5 => false
+3.5 lt 4 => true
+4 lt 3.5 => false
+1 lt 'foobar' => error
+5 lt 'foobar' => error
+'baz' < 10 => error
+'quux' < 5.55 => error
+'0123' lt '0123' => false
+'0123' < '0124' => true
+'0124' lt '0123' => false
+'0123 ' < '0123' => false
+'0123' lt '0123 ' => false
+
+1 >= 2 => false
+2.5 >= 1.5 => true
+1 ge 2 => false
+2 >= 2 => true
+2 ge 2 => true
+2 > = 2 => error # Make sure >= token can't be split.
+1 >= 'foobar' => error
+5 ge 'foobar' => error
+'baz' ge 10 => error
+'quux' >= 5.55 => error
+'0123' ge '0123' => true
+'0123' >= '0124' => false
+'0124' >= '0123' => true
+'0123 ' ge '0123' => true
+'0123' >= '0123 ' => true
+
+1 > 2 => false
+2.5 > 1.5 => true
+3.5 gt 4 => false
+4 gt 3.5 => true
+1 gt 'foobar' => error
+5 gt 'foobar' => error
+'baz' > 10 => error
+'quux' > 5.55 => error
+'0123' gt '0123' => false
+'0123' > '0124' => false
+'0124' gt '0123' => true
+'0123 ' > '0123' => false
+'0123' gt '0123 ' => false
+
+1 ne 1 => false
+1 ~= 1 => false
+1 <> 2 => true
+2 ne 3 => true
+1 ~= 'foobar' => error
+5 <> 'foobar' => error
+'baz' ne 10 => error
+'quux' ~= 5.55 => error
+'foobar' <> 'foobar' => false
+'quux' ne 'bar' => true
+'bar ' <> 'bar' => false
+'asdf ' ~= 'asdf ' => false
+'asdfj ' ne 'asdf' => true
+1 < > 1 => error # <> token can't be split
+1 ~ = 1 => error # ~= token can't be split
+
+exp(10) => 22026.47
+exp('x') => error
+
+lg10(500) => 2.70
+lg10('x') => error
+
+ln(10) => 2.30
+ln('x') => error
+
+sqrt(500) => 22.36
+sqrt('x') => error
+
+abs(-10.5) => 10.50
+abs(-55.79) => 55.79
+abs(22) => 22.00
+abs(0) => 0.00
+
+mod(55.5, 2) => 1.50
+mod(-55.5, 2) => -1.50
+mod(55.5, -2) => 1.50
+mod(-55.5, -2) => -1.50
+mod('a', 2) => error
+mod(2, 'a') => error
+mod('a', 'b') => error
+
+mod10(55.5) => 5.50
+mod10(-55.5) => -5.50
+mod10('x') => error
+
+rnd(5.4) => 5.00
+rnd(5.6) => 6.00
+rnd(-5.4) => -5.00
+rnd(-5.6) => -6.00
+rnd('x') => error
+
+trunc(1.2) => 1.00
+trunc(1.9) => 1.00
+trunc(-1.2) => -1.00
+trunc(-1.9) => -1.00
+trunc('x') => error
+
+acos(.5) / 3.14159 * 180 => 60.00
+arcos(.75) / 3.14159 * 180 => 41.41
+arcos(-.5) / 3.14159 * 180 => 120.00
+acos(-.75) / 3.14159 * 180 => 138.59
+acos(-1) / 3.14159 * 180 => 180.00
+arcos(1) / 3.14159 * 180 => 0.00
+acos(-1.01) => sysmis
+arcos(1.01) => sysmis
+acos('x') => error
+
+arsin(.5) / 3.14159 * 180 => 30.00
+asin(.25) / 3.14159 * 180 => 14.48
+arsin(-.5) / 3.14159 * 180 => -30.00
+asin(-.25) / 3.14159 * 180 => -14.48
+arsin(-1.01) => sysmis
+asin(1.01) => sysmis
+arsin('x') => error
+
+artan(1) / 3.14159 * 180 => 45.00
+atan(10) / 3.14159 * 180 => 84.29
+artan(-1) / 3.14159 * 180 => -45.00
+atan(-10) / 3.14159 * 180 => -84.29
+artan('x') => error
+
+cos(60 / 180 * 3.14159) => 0.50
+cos(45 / 180 * 3.14159) => 0.71
+cos(30 / 180 * 3.14159) => 0.87
+cos(15 / 180 * 3.14159) => 0.97
+cos(-60 / 180 * 3.14159) => 0.50
+cos(-45 / 180 * 3.14159) => 0.71
+cos(-30 / 180 * 3.14159) => 0.87
+cos(-15 / 180 * 3.14159) => 0.97
+cos(123 / 180 * 3.14159) => -0.54
+cos(321 / 180 * 3.14159) => 0.78
+cos('x') => error
+
+sin(60 / 180 * 3.14159) => 0.87
+sin(45 / 180 * 3.14159) => 0.71
+sin(30 / 180 * 3.14159) => 0.50
+sin(15 / 180 * 3.14159) => 0.26
+sin(-60 / 180 * 3.14159) => -0.87
+sin(-45 / 180 * 3.14159) => -0.71
+sin(-30 / 180 * 3.14159) => -0.50
+sin(-15 / 180 * 3.14159) => -0.26
+sin(123 / 180 * 3.14159) => 0.84
+sin(321 / 180 * 3.14159) => -0.63
+sin('x') => error
+
+tan(60 / 180 * 3.14159) => 1.73
+tan(45 / 180 * 3.14159) => 1.00
+tan(30 / 180 * 3.14159) => 0.58
+tan(15 / 180 * 3.14159) => 0.27
+tan(-60 / 180 * 3.14159) => -1.73
+tan(-45 / 180 * 3.14159) => -1.00
+tan(-30 / 180 * 3.14159) => -0.58
+tan(-15 / 180 * 3.14159) => -0.27
+tan(123 / 180 * 3.14159) => -1.54
+tan(321 / 180 * 3.14159) => -0.81
+tan('x') => error
+
+# FIXME: a variable name as the argument to SYSMIS is a special case
+# that we don't yet test. We also can't test VALUE this way.
+missing(10) => false
+missing($sysmis) => true
+missing(asin(1.01)) => true
+missing(asin(.5)) => false
+missing(' ') => error
+nmiss($sysmis) => 1.00
+nmiss(0) => 0.00
+nmiss($sysmis, $sysmis, $sysmis) => 3.00
+nmiss(1, 2, 3, 4) => 0.00
+nmiss(1, $sysmis, $sysmis, 2, 2, $sysmis, $sysmis, 3, 4) => 4.00
+nvalid($sysmis) => 0.00
+nvalid(0) => 1.00
+nvalid($sysmis, $sysmis, $sysmis) => 0.00
+nvalid(1, 2, 3, 4) => 4.00
+nvalid(1, $sysmis, $sysmis, 2, 2, $sysmis, $sysmis, 3, 4) => 5.00
+sysmis(10) => false
+sysmis($sysmis) => true
+sysmis(asin(1.01)) => true
+sysmis(asin(.5)) => false
+sysmis(' ') => error
+
+any($sysmis, 1, $sysmis, 3) => sysmis
+any(1, 1, 2, 3) => true
+any(2, 1, 2, 3) => true
+any(3, 1, 2, 3) => true
+any(5, 1, 2, 3) => false
+any(1, 1, 1, 1) => true
+any($sysmis, 1, 1, 1) => sysmis
+any(1, $sysmis, $sysmis, $sysmis) => sysmis
+any($sysmis, $sysmis, $sysmis, $sysmis) => sysmis
+any(1) => error
+any('1', 2, 3, 4) => error
+any(1, '2', 3, 4) => error
+any(1, 2, '3', 4) => error
+any(1, 2, 3, '4') => error
+
+any('', 'a', '', 'c') => true
+any('a', 'a', 'b', 'c') => true
+any('b', 'a', 'b', 'c') => true
+any('c', 'a', 'b', 'c') => true
+any('e', 'a', 'b', 'c') => false
+any('a', 'a', 'a', 'a') => true
+any('', 'a', 'a', 'a') => false
+any('a', '', '', '') => false
+any('a') => error
+any('a', 'a ', 'b', 'c') => true
+any('b ', 'a', 'b', 'c') => true
+any('c ', 'a', 'b', 'c ') => true
+any(a, 'b', 'c', 'd') => error
+any('a', b, 'c', 'd') => error
+any('a', 'b', c, 'd') => error
+any('a', 'b', 'c', d) => error
+
+range(5, 1, 10) => true
+range(1, 1, 10) => true
+range(10, 1, 10) => true
+range(-1, 1, 10) => false
+range(12, 1, 10) => false
+range($sysmis, 1, 10) => sysmis
+range(5, 1, $sysmis) => sysmis
+range(5, $sysmis, 10) => sysmis
+range($sysmis, $sysmis, 10) => sysmis
+range($sysmis, 1, $sysmis) => sysmis
+range($sysmis, $sysmis, $sysmis) => sysmis
+range(0, 1, 8, 10, 18) => false
+range(1, 1, 8, 10, 18) => true
+range(6, 1, 8, 10, 18) => true
+range(8, 1, 8, 10, 18) => true
+range(9, 1, 8, 10, 18) => false
+range(10, 1, 8, 10, 18) => true
+range(13, 1, 8, 10, 18) => true
+range(16, 1, 8, 10, 18) => true
+range(18, 1, 8, 10, 18) => true
+range(20, 1, 8, 10, 18) => false
+range(1) => error
+range(1, 2) => error
+range(1, 2, 3, 4) => error
+range(1, 2, 3, 4, 5, 6) => error
+range('1', 2, 3) => error
+range(1, '2', 3) => error
+range(1, 2, '3') => error
+
+range('123', '111', '888') => true
+range('111', '111', '888') => true
+range('888', '111', '888') => true
+range('110', '111', '888') => false
+range('889', '111', '888') => false
+range('000', '111', '888') => false
+range('999', '111', '888') => false
+range('123 ', '111', '888') => true
+range('123', '111 ', '888') => true
+range('123', '111', '888 ') => true
+range('123', '111 ', '888 ') => true
+range('00', '01', '08', '10', '18') => false
+range('01', '01', '08', '10', '18') => true
+range('06', '01', '08', '10', '18') => true
+range('08', '01', '08', '10', '18') => true
+range('09', '01', '08', '10', '18') => false
+range('10', '01', '08', '10', '18') => true
+range('15', '01', '08', '10', '18') => true
+range('18', '01', '08', '10', '18') => true
+range('19', '01', '08', '10', '18') => false
+range('1') => error
+range('1', '2') => error
+range('1', '2', '3', '4') => error
+range('1', '2', '3', '4', '5', '6') => error
+range(1, '2', '3') => error
+range('1', 2, '3') => error
+range('1', '2', 3) => error
+
+cfvar(1, 2, 3, 4, 5) => 0.53
+cfvar(1, $sysmis, 2, 3, $sysmis, 4, 5) => 0.53
+cfvar(1, 2) => 0.47
+cfvar(1) => error
+cfvar(1, $sysmis) => sysmis
+cfvar(1, 2, 3, $sysmis) => 0.50
+cfvar.4(1, 2, 3, $sysmis) => sysmis
+cfvar.4(1, 2, 3) => error
+cfvar('x') => error
+cfvar('x', 1, 2, 3) => error
+
+max(1, 2, 3, 4, 5) => 5.00
+max(1, $sysmis, 2, 3, $sysmis, 4, 5) => 5.00
+max(1, 2) => 2.00
+max() => error
+max(1) => 1.00
+max(1, $sysmis) => 1.00
+max(1, 2, 3, $sysmis) => 3.00
+max.4(1, 2, 3, $sysmis) => sysmis
+max.4(1, 2, 3) => error
+
+max("2", "3", "5", "1", "4") => "5"
+max("1", "2") => "2"
+max("1") => "1"
+
+mean(1, 2, 3, 4, 5) => 3.00
+mean(1, $sysmis, 2, 3, $sysmis, 4, 5) => 3.00
+mean(1, 2) => 1.50
+mean() => error
+mean(1) => 1.00
+mean(1, $sysmis) => 1.00
+mean(1, 2, 3, $sysmis) => 2.00
+mean.4(1, 2, 3, $sysmis) => sysmis
+mean.4(1, 2, 3) => error
+
+min(1, 2, 3, 4, 5) => 1.00
+min(1, $sysmis, 2, 3, $sysmis, 4, 5) => 1.00
+min(1, 2) => 1.00
+min() => error
+min(1) => 1.00
+min(1, $sysmis) => 1.00
+min(1, 2, 3, $sysmis) => 1.00
+min.4(1, 2, 3, $sysmis) => sysmis
+min.4(1, 2, 3) => error
+
+min("2", "3", "5", "1", "4") => "1"
+min("1", "2") => "1"
+min("1") => "1"
+
+sd(1, 2, 3, 4, 5) => 1.58
+sd(1, $sysmis, 2, 3, $sysmis, 4, 5) => 1.58
+sd(1, 2) => 0.71
+sd(1) => error
+sd(1, $sysmis) => sysmis
+sd(1, 2, 3, $sysmis) => 1.00
+sd.4(1, 2, 3, $sysmis) => sysmis
+sd.4(1, 2, 3) => error
+sd('x') => error
+sd('x', 1, 2, 3) => error
+
+sum(1, 2, 3, 4, 5) => 15.00
+sum(1, $sysmis, 2, 3, $sysmis, 4, 5) => 15.00
+sum(1, 2) => 3.00
+sum() => error
+sum(1) => 1.00
+sum(1, $sysmis) => 1.00
+sum(1, 2, 3, $sysmis) => 6.00
+sum.4(1, 2, 3, $sysmis) => sysmis
+sum.4(1, 2, 3) => error
+
+variance(1, 2, 3, 4, 5) => 2.50
+variance(1, $sysmis, 2, 3, $sysmis, 4, 5) => 2.50
+variance(1, 2) => 0.50
+variance(1) => error
+variance(1, $sysmis) => sysmis
+variance(1, 2, 3, $sysmis) => 1.00
+variance.4(1, 2, 3, $sysmis) => sysmis
+variance.4(1, 2, 3) => error
+variance('x') => error
+variance('x', 1, 2, 3) => error
+
+concat('') => ""
+concat('a', 'b') => "ab"
+concat('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h') => "abcdefgh"
+concat('abcdefgh', 'ijklmnopq') => "abcdefghijklmnopq"
+concat('a', 1) => error
+concat(1, 2) => error
+
+index('abcbcde', 'bc') => 2.00
+index('abcbcde', 'bcd') => 4.00
+index('abcbcde', 'bcbc') => 2.00
+index('abcdefgh', 'abc') => 1.00
+index('abcdefgh', 'bcd') => 2.00
+index('abcdefgh', 'cde') => 3.00
+index('abcdefgh', 'def') => 4.00
+index('abcdefgh', 'efg') => 5.00
+index('abcdefgh', 'fgh') => 6.00
+index('abcdefgh', 'fghi') => 0.00
+index('abcdefgh', 'x') => 0.00
+index('abcdefgh', 'abch') => 0.00
+index('banana', 'na') => 3.00
+index('banana', 'ana') => 2.00
+index('', 'x') => 0.00
+index('', '') => sysmis
+index('abcdefgh', '') => sysmis
+index('abcdefgh', 'alkjsfdjlskalkjfa') => 0.00
+
+index('abcbcde', 'bc', 1) => 2.00
+index('abcbcde', 'dc', 1) => 3.00
+index('abcbcde', 'abc', 1) => 1.00
+index('abcbcde', 'bc', 2) => 2.00
+index('abcbcde', 'dc', 2) => 0.00
+index('abcbcde', 'abc', 1) => 1.00
+index('abcbcde', 'bccb', 2) => 2.00
+index('abcbcde', 'bcbc', 2) => 2.00
+index('abcbcde', 'bcbc', $sysmis) => sysmis
+
+rindex('abcbcde', 'bc') => 4.00
+rindex('abcbcde', 'bcd') => 4.00
+rindex('abcbcde', 'bcbc') => 2.00
+rindex('abcdefgh', 'abc') => 1.00
+rindex('abcdefgh', 'bcd') => 2.00
+rindex('abcdefgh', 'cde') => 3.00
+rindex('abcdefgh', 'def') => 4.00
+rindex('abcdefgh', 'efg') => 5.00
+rindex('abcdefgh', 'fgh') => 6.00
+rindex('abcdefgh', 'fghi') => 0.00
+rindex('abcdefgh', 'x') => 0.00
+rindex('abcdefgh', 'abch') => 0.00
+rindex('banana', 'na') => 5.00
+rindex('banana', 'ana') => 4.00
+rindex('', 'x') => 0.00
+rindex('', '') => sysmis
+rindex('abcdefgh', '') => sysmis
+rindex('abcdefgh', 'alkjsfdjlskalkjfa') => 0.00
+
+rindex('abcbcde', 'bc', 1) => 5.00
+rindex('abcbcde', 'dc', 1) => 6.00
+rindex('abcbcde', 'abc', 1) => 5.00
+rindex('abcbcde', 'bc', 2) => 4.00
+rindex('abcbcde', 'dc', 2) => 0.00
+rindex('abcbcde', 'abc', 1) => 5.00
+rindex('abcbcde', 'bccb', 2) => 4.00
+rindex('abcbcde', 'bcbc', 2) => 4.00
+rindex('abcbcde', 'bcbc', $sysmis) => sysmis
+rindex('abcbcde', 'bcbcg', 2) => sysmis
+rindex('abcbcde', 'bcbcg', $sysmis) => sysmis
+rindex('abcbcde', 'bcbcg', 'x') => error
+rindex(1, 'bcdfkjl', 2) => error
+rindex('aksj', 2, 2) => error
+rindex(1, 2, 3) => error
+rindex(1, 2, '3') => error
+
+length('') => 0.00
+length('a') => 1.00
+length('xy') => 2.00
+length('adsf ') => 8.00
+length('abcdefghijkl') => 12.00
+length(0) => error
+length($sysmis) => error
+
+lower('ABCDEFGHIJKLMNOPQRSTUVWXYZ!@%&*(089') => "abcdefghijklmnopqrstuvwxyz!@%&*(089"
+lower('') => ""
+lower(1) => error
+
+lpad('abc', -1) => ""
+lpad('abc', 0) => "abc"
+lpad('abc', 2) => "abc"
+lpad('abc', 3) => "abc"
+lpad('abc', 10) => " abc"
+lpad('abc', 256) => ""
+lpad('abc', $sysmis) => ""
+lpad('abc', -1, '*') => ""
+lpad('abc', 0, '*') => "abc"
+lpad('abc', 2, '*') => "abc"
+lpad('abc', 3, '*') => "abc"
+lpad('abc', 10, '*') => "*******abc"
+lpad('abc', 256, '*') => ""
+lpad('abc', $sysmis, '*') => ""
+lpad('abc', $sysmis, '') => ""
+lpad('abc', $sysmis, 'xy') => ""
+lpad(0, 10) => error
+lpad('abc', 'def') => error
+lpad(0, 10, ' ') => error
+lpad('abc', 'def', ' ') => error
+lpad('x', 5, 0) => error
+lpad('x', 5, 2) => error
+
+number("123", f3.0) => 123.00
+number(" 123", f3.0) => 12.00
+number("123", f3.1) => 12.30
+number(" ", f3.1) => sysmis
+
+ltrim(' abc') => "abc"
+rtrim(' abc ') => " abc"
+ltrim('abc') => "abc"
+ltrim(' abc') => " abc"
+ltrim(' ') => ""
+ltrim('') => ""
+ltrim(8) => error
+ltrim('***abc', '*') => "abc"
+ltrim('abc', '*') => "abc"
+ltrim('*abc', '*') => "abc"
+ltrim('', '*') => ""
+ltrim(8, '*') => error
+ltrim(' x', 8) => error
+ltrim(8, 9) => error
+
+rpad('abc', -1) => ""
+rpad('abc', 0) => "abc"
+rpad('abc', 2) => "abc"
+rpad('abc', 3) => "abc"
+rpad('abc', 10) => "abc "
+rpad('abc', 256) => ""
+rpad('abc', $sysmis) => ""
+rpad('abc', -1, '*') => ""
+rpad('abc', 0, '*') => "abc"
+rpad('abc', 2, '*') => "abc"
+rpad('abc', 3, '*') => "abc"
+rpad('abc', 10, '*') => "abc*******"
+rpad('abc', 256, '*') => ""
+rpad('abc', $sysmis, '*') => ""
+rpad('abc', $sysmis, '') => ""
+rpad('abc', $sysmis, 'xy') => ""
+rpad(0, 10) => error
+rpad('abc', 'def') => error
+rpad(0, 10, ' ') => error
+rpad('abc', 'def', ' ') => error
+rpad('x', 5, 0) => error
+rpad('x', 5, 2) => error
+
+rtrim('abc ') => "abc"
+rtrim(' abc ') => " abc"
+rtrim('abc') => "abc"
+rtrim('abc ') => "abc "
+rtrim(' ') => ""
+rtrim('') => ""
+rtrim(8) => error
+rtrim('abc***', '*') => "abc"
+rtrim('abc', '*') => "abc"
+rtrim('abc*', '*') => "abc"
+rtrim('', '*') => ""
+rtrim(8, '*') => error
+rtrim(' x', 8) => error
+rtrim(8, 9) => error
+
+string(123.56, f5.1) => "123.6"
+string($sysmis, f5.1) => " . "
+string("abc", A5) => error
+
+substr('abcdefgh', -5) => ""
+substr('abcdefgh', 0) => ""
+substr('abcdefgh', 1) => "abcdefgh"
+substr('abcdefgh', 3) => "cdefgh"
+substr('abcdefgh', 5) => "efgh"
+substr('abcdefgh', 6) => "fgh"
+substr('abcdefgh', 7) => "gh"
+substr('abcdefgh', 8) => "h"
+substr('abcdefgh', 9) => ""
+substr('abcdefgh', 10) => ""
+substr('abcdefgh', 20) => ""
+substr('abcdefgh', $sysmis) => ""
+substr(0, 10) => error
+substr('abcd', 'abc') => error
+substr(0, 'abc') => error
+
+substr('abcdefgh', 0, 0) => ""
+substr('abcdefgh', 3, 0) => ""
+substr('abcdefgh', 5, 0) => ""
+substr('abcdefgh', 9, 0) => ""
+substr('abcdefgh', 0, 1) => ""
+substr('abcdefgh', 0, 5) => ""
+substr('abcdefgh', 1, 8) => "abcdefgh"
+substr('abcdefgh', 1, 10) => "abcdefgh"
+substr('abcdefgh', 1, 20) => "abcdefgh"
+substr('abcdefgh', 3, 4) => "cdef"
+substr('abcdefgh', 5, 2) => "ef"
+substr('abcdefgh', 6, 1) => "f"
+substr('abcdefgh', 7, 10) => "gh"
+substr('abcdefgh', 8, 1) => "h"
+substr('abcdefgh', 8, 2) => "h"
+substr('abcdefgh', 9, 11) => ""
+substr('abcdefgh', 10, 52) => ""
+substr('abcdefgh', 20, 1) => ""
+substr('abcdefgh', $sysmis, 2) => ""
+substr('abcdefgh', 9, $sysmis) => ""
+substr('abcdefgh', $sysmis, $sysmis) => ""
+substr('abc', 1, 'x') => error
+substr(0, 10, 1) => error
+substr(0, 10, 'x') => error
+substr('abcd', 'abc', 0) => error
+substr('abcd', 'abc', 'j') => error
+substr(0, 'abc', 4) => error
+substr(0, 'abc', 'k') => error
+
+upcase('abcdefghijklmnopqrstuvwxyz!@%&*(089') => "ABCDEFGHIJKLMNOPQRSTUVWXYZ!@%&*(089"
+upcase('') => ""
+upcase(1) => error
+
+time.days(1) => 86400.00
+time.days(-1) => -86400.00
+time.days(0.5) => 43200.00
+time.days('x') => error
+time.days($sysmis) => sysmis
+
+time.hms(5, 6, 7) => 18367.00
+time.hms(5, 6, 0) => 18360.00
+time.hms(5, 0, 7) => 18007.00
+time.hms(0, 6, 7) => 367.00
+time.hms(-5, 6, -7) => sysmis
+time.hms(-5, 5, -7) => sysmis
+time.hms($sysmis, 6, 7) => sysmis
+time.hms(5, $sysmis, 7) => sysmis
+time.hms(5, $sysmis, 7) => sysmis
+time.hms($sysmis, $sysmis, 7) => sysmis
+time.hms(5, $sysmis, $sysmis) => sysmis
+time.hms($sysmis, $sysmis, 7) => sysmis
+time.hms($sysmis, $sysmis, $sysmis) => sysmis
+
+ctime.days(106272) => 1.23
+ctime.hours(106272) => 29.52
+ctime.minutes(106272) => 1771.20
+ctime.seconds(106272) => 106272.00
+ctime.days(-106272) => -1.23
+ctime.hours(-106272) => -29.52
+ctime.minutes(-106272) => -1771.20
+ctime.seconds(-106272) => -106272.00
+ctime.days($sysmis) => sysmis
+ctime.hours($sysmis) => sysmis
+ctime.minutes($sysmis) => sysmis
+ctime.seconds($sysmis) => sysmis
+ctime.days('a') => error
+ctime.hours('b') => error
+ctime.minutes('c') => error
+ctime.seconds('d') => error
+
+ctime.days(date.dmy(15,10,1582)) => 1.00
+ctime.days(date.dmy(6,9,1719)) => 50000.00
+ctime.days(date.dmy(24,1,1583)) => 102.00
+ctime.days(date.dmy(14,12,1585)) => 1157.00
+ctime.days(date.dmy(26,11,1621)) => 14288.00
+ctime.days(date.dmy(25,12,1821)) => 87365.00
+ctime.days(date.dmy(3,12,1882)) => 109623.00
+ctime.days(date.dmy(6,4,2002)) => 153211.00
+ctime.days(date.dmy(19,12,1999)) => 152372.00
+ctime.days(date.dmy(1,10,1978)) => 144623.00
+ctime.days(date.dmy(0,10,1978)) => 144622.00
+ctime.days(date.dmy(32,10,1978)) => sysmis
+ctime.days(date.dmy(31,0,1978)) => 144349.00
+ctime.days(date.dmy(31,13,1978)) => 144745.00
+ctime.days(date.dmy($sysmis,10,1978)) => sysmis
+ctime.days(date.dmy(31,$sysmis,1978)) => sysmis
+ctime.days(date.dmy(31,10,$sysmis)) => sysmis
+ctime.days(date.dmy($sysmis,$sysmis,1978)) => sysmis
+ctime.days(date.dmy(31,$sysmis,$sysmis)) => sysmis
+ctime.days(date.dmy($sysmis,10,$sysmis)) => sysmis
+ctime.days(date.dmy($sysmis,$sysmis,$sysmis)) => sysmis
+date.dmy('a',1,2) => error
+date.dmy(1,'a',2) => error
+date.dmy(1,2,'a') => error
+
+ctime.days(date.mdy(10,15,1582)) => 1.00
+ctime.days(date.mdy(9,6,1719)) => 50000.00
+ctime.days(date.mdy(1,24,1583)) => 102.00
+ctime.days(date.mdy(12,14,1585)) => 1157.00
+ctime.days(date.mdy(11,26,1621)) => 14288.00
+ctime.days(date.mdy(12,25,1821)) => 87365.00
+ctime.days(date.mdy(12,3,1882)) => 109623.00
+ctime.days(date.mdy(4,6,2002)) => 153211.00
+ctime.days(date.mdy(12,19,1999)) => 152372.00
+ctime.days(date.mdy(10,1,1978)) => 144623.00
+ctime.days(date.mdy(10,0,1978)) => 144622.00
+ctime.days(date.mdy(10,32,1978)) => sysmis
+ctime.days(date.mdy(0,31,1978)) => 144349.00
+ctime.days(date.mdy(13,31,1978)) => 144745.00
+ctime.days(date.mdy(10,$sysmis,1978)) => sysmis
+ctime.days(date.mdy($sysmis,31,1978)) => sysmis
+ctime.days(date.mdy(10,31,$sysmis)) => sysmis
+ctime.days(date.mdy($sysmis,$sysmis,1978)) => sysmis
+ctime.days(date.mdy($sysmis,31,$sysmis)) => sysmis
+ctime.days(date.mdy(10,$sysmis,$sysmis)) => sysmis
+ctime.days(date.mdy($sysmis,$sysmis,$sysmis)) => sysmis
+date.mdy('a',1,2) => error
+date.mdy(1,'a',2) => error
+date.mdy(1,2,'a') => error
+
+ctime.days(date.moyr(1,2000)) => 152385.00
+ctime.days(date.moyr(2,2000)) => 152416.00
+ctime.days(date.moyr(3,2000)) => 152445.00
+ctime.days(date.moyr(4,2000)) => 152476.00
+ctime.days(date.moyr(5,2000)) => 152506.00
+ctime.days(date.moyr(13,2000)) => 152751.00
+ctime.days(date.moyr(14,2000)) => sysmis
+ctime.days(date.moyr($sysmis,2000)) => sysmis
+ctime.days(date.moyr(1,$sysmis)) => sysmis
+ctime.days(date.moyr($sysmis,$sysmis)) => sysmis
+date.moyr('a',2000) => error
+date.moyr(5,'a') => error
+date.moyr('a','b') => error
+
+ctime.days(date.qyr(1,2000)) => 152385.00
+ctime.days(date.qyr(2,2000)) => 152476.00
+ctime.days(date.qyr(5,2000)) => 152751.00
+ctime.days(date.qyr(6,2000)) => sysmis
+ctime.days(date.qyr($sysmis,2000)) => sysmis
+ctime.days(date.qyr(1,$sysmis)) => sysmis
+ctime.days(date.qyr($sysmis,$sysmis)) => sysmis
+date.qyr('a',2000) => error
+date.qyr(5,'a') => error
+date.qyr('a','b') => error
+
+ctime.days(date.wkyr(1,2000)) => 152385.00
+ctime.days(date.wkyr(15,1999)) => 152118.00
+ctime.days(date.wkyr(36,1999)) => 152265.00
+ctime.days(date.wkyr(54,1999)) => sysmis
+ctime.days(date.wkyr($sysmis,1999)) => sysmis
+ctime.days(date.wkyr(1,$sysmis)) => sysmis
+ctime.days(date.wkyr($sysmis,$sysmis)) => sysmis
+date.wkyr('a',1999) => error
+date.wkyr(5,'a') => error
+date.wkyr('a','b') => error
+
+ctime.days(date.yrday(2000,1)) => 152385.00
+ctime.days(date.yrday(2000,100)) => 152484.00
+ctime.days(date.yrday(2000,253)) => 152637.00
+ctime.days(date.yrday(2000,500)) => 152884.00
+ctime.days(date.yrday(2000,-100)) => 152284.00
+ctime.days(date.yrday(1999,$sysmis)) => sysmis
+ctime.days(date.yrday($sysmis,1)) => sysmis
+ctime.days(date.yrday($sysmis,$sysmis)) => sysmis
+date.yrday(1999,'a') => error
+date.yrday('a',5) => error
+date.yrday('a','b') => error
+
+# FIXME: XDATE.* functions
+# FIXME: LAG
+# FIXME: YRMODA
+EOF
+if [ $? -ne 0 ] ; then no_result ; fi
+
+activity="create optimizing input"
+echo 'set mxwarn 1000.
+set mxerr 1000.' > $TEMPDIR/expr-opt.stat
+sed < $TEMPDIR/expr-list >> $TEMPDIR/expr-opt.stat \
+ -e 's#^\(.*\) => \(.*\)$#DEBUG EVALUATE/\1.#'
+if [ $? -ne 0 ] ; then no_result ; fi
+
+activity="run optimizing program"
+$SUPERVISOR $here/../src/pspp --testing-mode -o raw-ascii \
+ $TEMPDIR/expr-opt.stat >$TEMPDIR/expr-opt.err 2> $TEMPDIR/expr-opt.out
+
+activity="compare optimizing output"
+diff -B -b $TEMPDIR/expr-list $TEMPDIR/expr-opt.out
+if [ $? -ne 0 ] ; then fail ; fi
+
+activity="create non-optimizing input"
+echo 'set mxwarn 1000.
+set mxerr 1000.' > $TEMPDIR/expr-noopt.stat
+sed < $TEMPDIR/expr-list >> $TEMPDIR/expr-noopt.stat \
+ -e 's#^\(.*\) => \(.*\)$#DEBUG EVALUATE NOOPTIMIZE/\1.#'
+if [ $? -ne 0 ] ; then no_result ; fi
+
+activity="run non-optimizing program"
+$SUPERVISOR $here/../src/pspp --testing-mode -o raw-ascii \
+ $TEMPDIR/expr-noopt.stat >$TEMPDIR/expr-noopt.err 2> $TEMPDIR/expr-noopt.out
+
+activity="compare non-optimizing output"
+diff -B -b $TEMPDIR/expr-list $TEMPDIR/expr-noopt.out
+if [ $? -ne 0 ] ; then fail ; fi
+
+pass