X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fpspp.texi;h=30a514cbf6088081affb30a5dc7edf7f6c488203;hb=97d6c6f6b1922621ca013668eba9a9a9f71d60fe;hp=cda0308fd0aa9c87012c6f88796098ab9325337e;hpb=48cf1d7de82d12cdf3c0433d49d3c66f820f1609;p=pspp diff --git a/doc/pspp.texi b/doc/pspp.texi index cda0308fd0..30a514cbf6 100644 --- a/doc/pspp.texi +++ b/doc/pspp.texi @@ -3573,7 +3573,7 @@ as arguments. With few exceptions, operator arguments may be full-fledged expressions in themselves. @menu -* Booleans:: Boolean values. +* Boolean Values:: Boolean values. * Missing Values in Expressions:: Using missing values in expressions. * Grouping Operators:: ( ) * Arithmetic Operators:: + - * / ** @@ -3583,36 +3583,25 @@ full-fledged expressions in themselves. * Order of Operations:: Operator precedence. @end menu -@node Booleans, Missing Values in Expressions, Expressions, Expressions -@section Boolean values +@node Boolean Values, Missing Values in Expressions, Expressions, Expressions +@section Boolean Values @cindex Boolean @cindex values, Boolean -There is a third type for arguments and results, the @dfn{Boolean} type, -which is used to represent true/false conditions. Booleans have only -three possible values: 0 (false), 1 (true), and system-missing. -System-missing is neither true nor false. +Some PSPP operators and expressions work with Boolean values, which +represent true/false conditions. Booleans have only three possible +values: 0 (false), 1 (true), and system-missing (unknown). +System-missing is neither true nor false and indicates that the true +value is unknown. -@itemize @bullet -@item -A numeric expression that has value 0, 1, or system-missing may be used -in place of a Boolean. Thus, the expression @code{0 AND 1} is valid -(although it is always false). +Boolean-typed operands or function arguments must take on one of these +three values. Other values are considered false, but cause an error +when the expression is evaluated. -@item -A numeric expression with any other value will cause an error if it is -used as a Boolean. So, @code{2 OR 3} is invalid. - -@item -A Boolean expression may not be used in place of a numeric expression. -Thus, @code{(1>2) + (3<4)} is invalid. - -@item Strings and Booleans are not compatible, and neither may be used in place of the other. -@end itemize -@node Missing Values in Expressions, Grouping Operators, Booleans, Expressions +@node Missing Values in Expressions, Grouping Operators, Boolean Values, Expressions @section Missing Values in Expressions String missing values are not treated specially in expressions. Most @@ -3621,8 +3610,8 @@ arguments. Exceptions are listed under particular operator descriptions. User-missing values for numeric variables are always transformed into -the system-missing value, except inside the arguments to the -@code{VALUE}, @code{SYSMIS}, and @code{MISSING} functions. +the system-missing value, except inside the arguments to the +@code{VALUE} and @code{SYSMIS} functions. The missing-value functions can be used to precisely control how missing values are treated in expressions. @xref{Missing Value Functions}, for @@ -3706,8 +3695,8 @@ system-missing value. @cindex logical intersection @item @var{a} AND @var{b} @itemx @var{a} & @var{b} -True if both @var{a} and @var{b} are true. However, if one argument is -false and the other is missing, the result is false, not missing. If +True if both @var{a} and @var{b} are true, false otherwise. If one +argument is false, the result is false even if the other is missing. If both arguments are missing, the result is missing. @cindex @code{OR} @@ -3717,7 +3706,7 @@ both arguments are missing, the result is missing. @item @var{a} OR @var{b} @itemx @var{a} | @var{b} True if at least one of @var{a} and @var{b} is true. If one argument is -true and the other is missing, the result is true, not missing. If both +true, the result is true even if the other argument is missing. If both arguments are missing, the result is missing. @cindex @code{NOT} @@ -3726,7 +3715,8 @@ arguments are missing, the result is missing. @cindex logical inversion @item NOT @var{a} @itemx ~ @var{a} -True if @var{a} is false. +True if @var{a} is false. If the argument is missing, then the result +is missing. @end table @node Relational Operators, Functions, Logical Operators, Expressions @@ -3735,20 +3725,6 @@ True if @var{a} is false. The relational operators take numeric or string arguments and produce Boolean results. -Note that, with numeric arguments, PSPP does not make exact -relational tests. Instead, two numbers are considered to be equal even -if they differ by a small amount. This amount, @dfn{epsilon}, is -dependent on the PSPP configuration and determined at compile -time. (The default value is 0.000000001, or -@ifinfo -@code{10**(-9)}.) -@end ifinfo -@tex -$10 ^{-9}$.) -@end tex -Use of epsilon allows for round-off errors. Use of epsilon is also -idiotic, but the author is not a numeric analyst. - Strings cannot be compared to numbers. When strings of different lengths are compared, the shorter string is right-padded with spaces to match the length of the longer string. @@ -3916,11 +3892,9 @@ results. @cindex arccosine @cindex inverse cosine -@deftypefn {Function} {} ACOS (@var{number}) -@deftypefnx {Function} {} ARCOS (@var{number}) +@deftypefn {Function} {} ARCOS (@var{number}) Takes the arccosine, in radians, of @var{number}. Results in -system-missing if @var{number} is not between -1 and 1. Portability: -none. +system-missing if @var{number} is not between -1 and 1. @end deftypefn @cindex arcsine @@ -3936,26 +3910,6 @@ system-missing if @var{number} is not between -1 and 1 inclusive. Takes the arctangent, in radians, of @var{number}. @end deftypefn -@cindex arcsine -@cindex inverse sine -@deftypefn {Function} {} ASIN (@var{number}) -Takes the arcsine, in radians, of @var{number}. Results in -system-missing if @var{number} is not between -1 and 1 inclusive. -Portability: none. -@end deftypefn - -@cindex arctangent -@cindex inverse tangent -@deftypefn {Function} {} ATAN (@var{number}) -Takes the arctangent, in radians, of @var{number}. -@end deftypefn - -@quotation -@strong{Please note:} Use of the AR* group of inverse trigonometric -functions is recommended over the A* group because they are more -portable. -@end quotation - @cindex cosine @deftypefn {Function} {} COS (@var{angle}) Takes the cosine of @var{angle} which should be in radians. @@ -3980,57 +3934,42 @@ Portability: none. @cindex values, missing @cindex functions, missing-value -Missing-value functions take various types as arguments, returning -various types of results. - -@deftypefn {Function} {} MISSING (@var{variable or expression}) -@var{num} may be a single variable name or an expression. If it is a -variable name, results in 1 if the variable has a user-missing or -system-missing value for the current case, 0 otherwise. If it is an -expression, results in 1 if the expression has the system-missing value, -0 otherwise. +Missing-value functions take various numeric arguments and yield +various types of results. Note that the normal rules of evaluation +apply within expression arguments to these functions. In particular, +user-missing values for numeric variables are converted to +system-missing values. -@quotation -@strong{Please note:} If the argument is a string expression other than -a variable name, MISSING is guaranteed to return 0, because strings do -not have a system-missing value. Also, when using a numeric expression -argument, remember that user-missing values are converted to the -system-missing value in most contexts. Thus, the expressions -@code{MISSING(VAR1 @var{op} VAR2)} and @code{MISSING(VAR1) OR -MISSING(VAR2)} are often equivalent, depending on the specific operator -@var{op} used. -@end quotation +@deftypefn {Function} {} MISSING (@var{expr}) +Returns 1 if @var{expr} has the system-missing value, 0 otherwise. @end deftypefn @deftypefn {Function} {} NMISS (@var{expr} [, @var{expr}]@dots{}) Each argument must be a numeric expression. Returns the number of -user- or system-missing values in the list. As a special extension, +system-missing values in the list. As a special extension, the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a range of variables; see @ref{Sets of Variables}, for more details. @end deftypefn @deftypefn {Function} {} NVALID (@var{expr} [, @var{expr}]@dots{}) Each argument must be a numeric expression. Returns the number of -values in the list that are not user- or system-missing. As a special extension, +values in the list that are not system-missing. As a special extension, the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a range of variables; see @ref{Sets of Variables}, for more details. @end deftypefn -@deftypefn {Function} {} SYSMIS (@var{variable or expression}) -When given the name of a numeric variable, returns 1 if the value of -that variable is system-missing. Otherwise, if the value is not -missing or if it is user-missing, returns 0. If given the name of a -string variable, always returns 1. If given an expression other than -a single variable name, results in 1 if the value is system- or -user-missing, 0 otherwise. +@deftypefn {Function} {} SYSMIS (@var{expr}) +When @var{expr} is simply the name of a numeric variable, returns 1 if +the variable has the system-missing value, 0 if it is user-missing or +not missing. If given @var{expr} takes another form, results in 1 if +the value is system-missing, 0 otherwise. @end deftypefn @deftypefn {Function} {} VALUE (@var{variable}) Prevents the user-missing values of @var{variable} from being -transformed into system-missing values: If @var{variable} is not -system- or user-missing, results in the value of @var{variable}. If -@var{variable} is user-missing, results in the value of @var{variable} -anyway. If @var{variable} is system-missing, results in system-missing. +transformed into system-missing values, and always results in the +actual value of @var{variable}, whether it is user-missing, +system-missing or not missing at all. @end deftypefn @node Pseudo-Random Numbers, Set Membership, Missing Value Functions, Functions @@ -4171,15 +4110,9 @@ non-missing result. @end deftypefn @cindex variance -@deftypefn {Function} {} VAR (@var{number}, @var{number}[, @dots{}]) -Results in the variance of the values of @var{number}. This function -requires at least two valid arguments to give a non-missing result. -@end deftypefn - @deftypefn {Function} {} VARIANCE (@var{number}, @var{number}[, @dots{}]) Results in the variance of the values of @var{number}. This function requires at least two valid arguments to give a non-missing result. -(Use VAR in preference to VARIANCE for reasons of portability.) @end deftypefn @node String Functions, Time & Date, Statistical Functions, Functions @@ -4258,20 +4191,15 @@ empty string. @cindex numbers, converting from strings @cindex strings, converting to numbers -@deftypefn {Function} {} NUMBER (@var{string}) -Returns the number produced when @var{string} is interpreted according -to format F@var{x}.0, where @var{x} is the number of characters in -@var{string}. If @var{string} does not form a proper number, -system-missing is returned without an error message. Portability: none. -@end deftypefn - @deftypefn {Function} {} NUMBER (@var{string}, @var{format}) Returns the number produced when @var{string} is interpreted according -to format specifier @var{format}. Only the number of characters in -@var{string} specified by @var{format} are examined. For example, -@code{NUMBER("123", F3.0)} and @code{NUMBER("1234", F3.0)} both have -value 123. If @var{string} does not form a proper number, -system-missing is returned without an error message. +to format specifier @var{format}. If the format width @var{w} is less +than the length of @var{string}, then only the first @var{w} +characters in @var{string} are used, e.g.@: @code{NUMBER("123", F3.0)} +and @code{NUMBER("1234", F3.0)} both have value 123. If @var{w} is +greater than @var{string}'s length, then it is treated as if it were +right-padded with spaces. If @var{string} is not in the correct +format for @var{format}, system-missing is returned. @end deftypefn @cindex strings, searching backwards @@ -4757,6 +4685,7 @@ results. @cindex cross-case function @cindex function, cross-case @deftypefn {Function} {} LAG (@var{variable}) +@anchor{LAG} @var{variable} must be a numeric or string variable name. @code{LAG} results in the value of that variable for the case before the current one. In case-selection procedures, @code{LAG} results in the value of @@ -6021,7 +5950,7 @@ including the active file. Records with the same values for BY variables are combined into a single record. Records with different values are output in order. Thus, multiple sorted system files are combined into a single sorted system file based on the value of the BY -variables. +variables. The results of the merge become the new active file. The BY subcommand specifies a list of variables that are used to match records from each of the system files. Variables specified must exist @@ -6054,6 +5983,9 @@ string variables. IN, FIRST, LAST, and MAP are currently not used. +@cmd{MATCH FILES} may not be specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}) if the active file is used as an input source. + @node SAVE, SYSFILE INFO, MATCH FILES, System and Portable Files @section SAVE @vindex SAVE @@ -6382,6 +6314,9 @@ MAP is currently ignored. If either DROP or KEEP is specified, the data is read; otherwise it is not. +@cmd{MODIFY VARS} may not be specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}). + @node NUMERIC, PRINT FORMATS, MODIFY VARS, Variable Attributes @section NUMERIC @vindex NUMERIC @@ -6433,6 +6368,9 @@ name. Multiple parenthesized groups of variables may be specified. @cmd{RENAME VARIABLES} takes effect immediately. It does not cause the data to be read. +@cmd{RENAME VARIABLES} may not be specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}). + @node VALUE LABELS, STRING, RENAME VARIABLES, Variable Attributes @section VALUE LABELS @vindex VALUE LABELS @@ -6740,9 +6678,13 @@ Using @cmd{COMPUTE} to assign to a variable specified on @cmd{LEAVE} (@pxref{LEAVE}) resets the variable's left state. Therefore, @code{LEAVE} should be specified following @cmd{COMPUTE}, not before. -COMPUTE is a transformation. It does not cause the active file to be +@cmd{COMPUTE} is a transformation. It does not cause the active file to be read. +When @cmd{COMPUTE} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + @node COUNT, FLIP, COMPUTE, Data Manipulation @section COUNT @vindex COUNT @@ -6887,6 +6829,9 @@ the active file is subsequently transposed using @cmd{FLIP}, this variable can be used to recreate the original variable names. +FLIP honors N OF CASES. It ignores TEMPORARY, so that ``temporary'' +transformations become permanent. + @node IF, RECODE, FLIP, Data Manipulation @section IF @vindex IF @@ -6921,6 +6866,10 @@ Using @cmd{IF} to assign to a variable specified on @cmd{LEAVE} (@pxref{LEAVE}) resets the variable's left state. Therefore, @code{LEAVE} should be specified following @cmd{IF}, not before. +When @cmd{IF} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + @node RECODE, SORT CASES, IF, Data Manipulation @section RECODE @vindex RECODE @@ -7010,13 +6959,9 @@ preceding them. @cmd{SORT CASES} attempts to sort the entire active file in main memory. If main memory is exhausted, it falls back to a merge sort algorithm that -involves writing and reading numerous temporary files. Environment -variables determine the temporary files' location. The first of -SPSSTMPDIR, SPSSXTMPDIR, or TMPDIR that is set determines the location. -Otherwise, if the compiler environment defined P_tmpdir, that is used. -Otherwise, under Unix-like OSes /tmp is used; under MS-DOS, the first of -TEMP, TMP, or root on the current drive is used; under other OSes, the -current directory. +involves writing and reading numerous temporary files. + +@cmd{SORT CASES} may not be specified following TEMPORARY. @node Data Selection, Conditionals and Looping, Data Manipulation, Top @chapter Selecting data for analysis @@ -7051,14 +6996,18 @@ To set up filtering, specify BY and a variable name. Keyword BY is optional but recommended. Cases which have a zero or system- or user-missing value are excluded from analysis, but not deleted from the data stream. Cases with other values are analyzed. +To filter based on a different condition, use +transformations such as @cmd{COMPUTE} or @cmd{RECODE} to compute a +filter variable of the required form, then specify that variable on +@cmd{FILTER}. @code{FILTER OFF} turns off case filtering. Filtering takes place immediately before cases pass to a procedure for analysis. Only one filter variable may be active at a time. Normally, case filtering continues until it is explicitly turned off with @code{FILTER -OFF}. However, if @cmd{FILTER} is placed after TEMPORARY, filtering stops -after execution of the next procedure or procedure-like command. +OFF}. However, if @cmd{FILTER} is placed after TEMPORARY, it filters only +the next procedure or procedure-like command. @node N OF CASES, PROCESS IF, FILTER, Data Selection @section N OF CASES @@ -7111,6 +7060,9 @@ read in data. @code{ESTIMATED} never limits the number of cases processed by procedures. PSPP currently does not make use of case count estimates. +When @cmd{N} is specified after @cmd{TEMPORARY}, it affects only +the next procedure (@pxref{TEMPORARY}). + @node PROCESS IF, SAMPLE, N OF CASES, Data Selection @section PROCESS IF @vindex PROCESS IF @@ -7136,6 +7088,11 @@ The effects of @cmd{PROCESS IF} are similar, but not identical, to the effects of executing @cmd{TEMPORARY}, then @cmd{SELECT IF} (@pxref{SELECT IF}). +The filtering performed by @cmd{PROCESS IF} takes place immediately +before cases pass to a procedure for analysis. Because @cmd{PROCESS +IF} affects only a single procedure, its placement relative to +@cmd{TEMPORARY} is unimportant. + @cmd{PROCESS IF} is deprecated. It is included for compatibility with old command files. New syntax files should use @cmd{SELECT IF} or @cmd{FILTER} instead. @@ -7148,10 +7105,9 @@ old command files. New syntax files should use @cmd{SELECT IF} or SAMPLE num1 [FROM num2]. @end display -@cmd{SAMPLE} is used to randomly sample a proportion of the cases in -the active file. @cmd{SAMPLE} is temporary, affecting only the next -procedure, unless that is a data transformation, such as @cmd{SELECT IF} -or @cmd{RECODE}. +@cmd{SAMPLE} randomly samples a proportion of the cases in the active +file. Unless it follows @cmd{TEMPORARY}, it operates as a +transformation, permanently removing cases from the active file. The proportion to sample can be expressed as a single number between 0 and 1. If @code{k} is the number specified, and @code{N} is the number @@ -7177,13 +7133,11 @@ active, exactly @var{m} cases will be selected @emph{from the first @var{N} cases in the active file.} @end enumerate -@cmd{SAMPLE}, @cmd{SELECT IF}, and @code{PROCESS IF} are performed in +@cmd{SAMPLE} and @cmd{SELECT IF} are performed in the order specified by the syntax file. -@cmd{SAMPLE} is ignored before @code{SORT CASES}. - @cmd{SAMPLE} is always performed before @code{N OF CASES}, regardless -of ordering in the syntax file. @xref{N OF CASES}. +of ordering in the syntax file (@pxref{N OF CASES}). The same values for @cmd{SAMPLE} may result in different samples. To obtain the same sample, use the @code{SET} command to set the random @@ -7214,6 +7168,10 @@ Place @cmd{SELECT IF} as early in the command file as possible. Cases that are deleted early can be processed more efficiently in time and space. +When @cmd{SELECT IF} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + @node SPLIT FILE, TEMPORARY, SELECT IF, Data Selection @section SPLIT FILE @vindex SPLIT FILE @@ -7237,6 +7195,9 @@ variable values for the group are printed along with the analysis. Specify OFF to disable @cmd{SPLIT FILE} and resume analysis of the entire active file as a single group of data. +When @cmd{SPLIT FILE} is specified after @cmd{TEMPORARY}, it affects only +the next procedure (@pxref{TEMPORARY}). + @node TEMPORARY, WEIGHT, SPLIT FILE, Data Selection @section TEMPORARY @vindex TEMPORARY @@ -7250,11 +7211,13 @@ following its execution temporary. These transformations will affect only the execution of the next procedure or procedure-like command. Their effects will not be saved to the active file. -The only specification is the command name. +The only specification on @cmd{TEMPORARY} is the command name. @cmd{TEMPORARY} may not appear within a @cmd{DO IF} or @cmd{LOOP} -construct. It may -appear only once between procedures and procedure-like commands. +construct. It may appear only once between procedures and +procedure-like commands. + +Scratch variables cannot be used following @cmd{TEMPORARY}. An example may help to clarify: @@ -7309,6 +7272,9 @@ integers, but negative and system-missing values for the weighting variable are interpreted as weighting factors of 0. User-missing values are not treated specially. +When @cmd{WEIGHT} is specified after @cmd{TEMPORARY}, it affects only +the next procedure (@pxref{TEMPORARY}). + @cmd{WEIGHT} does not cause cases in the active file to be replicated in memory. @@ -7369,6 +7335,10 @@ the boolean expression on the first @cmd{ELSE IF}, if present, is tested in turn, with the same rules applied. If all expressions evaluate to false, then the @cmd{ELSE} code block is executed, if it is present. +When @cmd{DO IF} or @cmd{ELSE IF} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + @node DO REPEAT, LOOP, DO IF, Conditionals and Looping @section DO REPEAT @vindex DO REPEAT @@ -7468,6 +7438,10 @@ loop is executed MXLOOPS (@pxref{SET}) times. @cmd{BREAK} also terminates @cmd{LOOP} execution (@pxref{BREAK}). +When @cmd{LOOP} or @cmd{END LOOP} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + @node Statistics, Utilities, Conditionals and Looping, Top @chapter Statistics @@ -8011,6 +7985,7 @@ encountered in the input. * INCLUDE:: Include a file within the current one. * QUIT:: Terminate the PSPP session. * SET:: Adjust PSPP runtime parameters. +* SHOW:: Display runtime parameters. * SUBTITLE:: Provide a document subtitle. * TITLE:: Provide a document title. @end menu @@ -8190,7 +8165,7 @@ to the operating system. This command is not valid within a command file. -@node SET, SUBTITLE, QUIT, Utilities +@node SET, SHOW, QUIT, Utilities @section SET @vindex SET @@ -8505,7 +8480,33 @@ Be aware that this setting does not guarantee safety (commands can still overwrite files, for instance) but it is an improvement. @end table -@node SUBTITLE, TITLE, SET, Utilities +@node SHOW, SUBTITLE, SET, Utilities +@comment node-name, next, previous, up +@section SHOW +@vindex SHOW + +@display +SHOW + /@var{subcommand} + +@end display + +@cmd{SHOW} can be used to display the current state of PSPP's +execution parameters. All of the parameters which can be changed +using @code{SET} @xref{SET}, can be examined using @cmd{SHOW}, by +using a subcommand with the same name. +In addition, @code{SHOW} supports the following subcommands: + +@table @code +@item WARRANTY +Show details of the lack of warranty for PSPP. +@item COPYING +Display the terms of PSPP's copyright licence @ref{License}. +@end table + + + +@node SUBTITLE, TITLE, SHOW, Utilities @section SUBTITLE @vindex SUBTITLE