1 This is pspp.info, produced by makeinfo version 4.0 from pspp.texi.
4 * PSPP: (pspp). Statistical analysis package.
7 PSPP, for statistical analysis of sampled data, by Ben Pfaff.
9 This file documents PSPP, a statistical package for analysis of
10 sampled data that uses a command language compatible with SPSS.
12 Copyright (C) 1996-9, 2000 Free Software Foundation, Inc.
14 This version of the PSPP documentation is consistent with version 2
17 Permission is granted to make and distribute verbatim copies of this
18 manual provided the copyright notice and this permission notice are
19 preserved on all copies.
21 Permission is granted to copy and distribute modified versions of
22 this manual under the conditions for verbatim copying, provided that the
23 entire resulting derived work is distributed under the terms of a
24 permission notice identical to this one.
26 Permission is granted to copy and distribute translations of this
27 manual into another language, under the above condition for modified
28 versions, except that this permission notice may be stated in a
29 translation approved by the Free Software Foundation.
32 File: pspp.info, Node: Input/Output Formats, Next: Scratch Variables, Prev: Sets of Variables, Up: Variables
34 Input and Output Formats
35 ------------------------
37 Data that PSPP inputs and outputs must have one of a number of
38 formats. These formats are described, in general, by a format
39 specification of the form `NAMEw.d', where NAME is the format name and
40 W is a field width. D is the optional desired number of decimal
41 places, if appropriate. If D is not included then it is assumed to be
42 0. Some formats do not allow D to be specified.
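As an illustration of the `NAMEw.d' notation, the following Python sketch splits a specification into its three parts. It is not taken from PSPP; the names `FormatSpec' and `parse_format' are hypothetical, and it assumes that format names are purely alphabetic. It does not check whether a given format allows D.

```python
import re
from typing import NamedTuple


class FormatSpec(NamedTuple):
    """One parsed format specification (hypothetical helper type)."""
    name: str
    w: int
    d: int


# NAME is letters, W is digits, and ".d" is optional.
_SPEC = re.compile(r"([A-Za-z]+)(\d+)(?:\.(\d+))?")


def parse_format(spec: str) -> FormatSpec:
    """Split a specification such as 'F8.2' into name, width, decimals."""
    m = _SPEC.fullmatch(spec.strip())
    if m is None:
        raise ValueError(f"not a format specification: {spec!r}")
    name, w, d = m.groups()
    # If D is not included, it is assumed to be 0.
    return FormatSpec(name.upper(), int(w), int(d) if d is not None else 0)
```

For example, `parse_format("F8.2")' yields name F, width 8, and 2 decimal places, while `parse_format("A10")' yields 0 decimal places.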
44 When an input format is specified on DATA LIST or another command,
45 then it is converted to an output format for the purposes of PRINT and
46 other data output commands. For most purposes, input and output
47 formats are the same; the salient differences are described below.
49 Below are listed the input and output formats supported by PSPP. If
50 an input format is mapped to a different output format by default, then
51 that mapping is indicated with =>. Each format has the listed bounds
52 on input width (iw) and output width (ow).
54 The standard numeric input and output formats are given in the
57 Fw.d: 1 <= iw,ow <= 40
58 Standard decimal format with D decimal places. If the number is
59 too large to fit within the field width, it is expressed in
60 scientific notation (`1.2+34') if w >= 6, with always at least two
61 digits in the exponent. When used as an input format, scientific
62 notation is allowed but an E or an F must be used to introduce the
65 The default output format is the same as the input format, except
66 if D > 1. In that case the output W is always made to be at least
69 Ew.d: 1 <= iw <= 40; 6 <= ow <= 40
70 For input this is equivalent to F format except that no E or F is
required to introduce the exponent. For output, produces scientific
72 notation in the form `1.2+34'. There are always at least two
73 digits given in the exponent.
75 The default output W is the largest of the input W, the input D +
76 7, and 10. The default output D is the input D, but at least 3.
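The default-width rule for E format can be written out as a small Python sketch. The function name `e_default_output' is hypothetical; only the arithmetic comes from the rule stated above.

```python
def e_default_output(input_w: int, input_d: int) -> tuple:
    """Return the default output (W, D) for an E input format."""
    # Output W is the largest of the input W, the input D + 7, and 10.
    output_w = max(input_w, input_d + 7, 10)
    # Output D is the input D, but at least 3.
    output_d = max(input_d, 3)
    return output_w, output_d
```

Under this rule an input format of E8.2 would be given the output format E10.3, and E12.6 would become E13.6.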
78 COMMAw.d: 1 <= iw,ow <= 40
79 Equivalent to F format, except that groups of three digits are
80 comma-separated on output. If the number is too large to express
81 in the field width, then first commas are eliminated, then if
82 there is still not enough space the number is expressed in
83 scientific notation given that w >= 6. Commas are allowed and
84 ignored when this is used as an input format.
86 DOTw.d: 1 <= iw,ow <= 40
87 Equivalent to COMMA format except that the roles of comma and
88 decimal point are interchanged. However: If SET /DECIMAL=DOT is
89 in effect, then COMMA uses `,' for a decimal point and DOT uses
90 `.' for a decimal point.
92 DOLLARw.d: 1 <= iw <= 40; 2 <= ow <= 40
93 Equivalent to COMMA format, except that the number is prefixed by a
94 dollar sign (`$') if there is room. On input the value is allowed
95 to be prefixed by a dollar sign, which is ignored.
97 The default output W is the input W, but at least 2.
99 PCTw.d: 2 <= iw,ow <= 40
100 Equivalent to F format, except that the number is suffixed by a
101 percent sign (`%') if there is room. On input the value is
102 allowed to be suffixed by a percent sign, which is ignored.
104 The default output W is the input W, but at least 2.
106 Nw.d: 1 <= iw,ow <= 40
107 Only digits are allowed within the field width. The decimal point
108 is assumed to be D digits from the right margin.
110 The default output format is F with the same W and D, except if D
111 > 1. In that case the output W is always made to be at least 2 +
114 Zw.d => F: 1 <= iw,ow <= 40
115 Zoned decimal input. If you need to use this then you know how.
117 IBw.d => F: 1 <= iw,ow <= 8
118 Integer binary format. The field is interpreted as a fixed-point
119 positive or negative binary number in two's-complement notation.
120 The location of the decimal point is implied. Endianness is the
121 same as the host machine.
123 The default output format is F8.2 if D is 0. Otherwise it is F,
124 with output W as 9 + input D and output D as input D.
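A minimal Python sketch of how an IB field might be decoded and how its default output format follows from D. The function names are hypothetical. The byte order defaults to that of the host, as described above, and can be overridden for testing.

```python
import sys


def decode_ib(field: bytes, d: int = 0, byteorder: str = sys.byteorder) -> float:
    """Decode a two's-complement field with D implied decimal places."""
    if not 1 <= len(field) <= 8:
        raise ValueError("IB width must be between 1 and 8 bytes")
    raw = int.from_bytes(field, byteorder=byteorder, signed=True)
    # The decimal point is implied, so scale by 10**d.
    return raw / 10 ** d


def ib_default_output(d: int) -> tuple:
    """Default output format for IB input: F8.2 if D is 0, else F(9+D).D."""
    return ("F", 8, 2) if d == 0 else ("F", 9 + d, d)
```

For instance, the two big-endian bytes FB 2E read with D = 2 decode to -12.34.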
PIBw.d => F: 1 <= iw,ow <= 8
127 Positive integer binary format. The field is interpreted as a
128 fixed-point positive binary number. The location of the decimal
point is implied. Endianness is the same as the host machine.
131 The default output format follows the rules for IB format.
133 Pw.d => F: 1 <= iw,ow <= 16
134 Binary coded decimal format. Each byte from left to right, except
135 the rightmost, represents two digits. The upper nibble of each
136 byte is more significant. The upper nibble of the final byte is
137 the least significant digit. The lower nibble of the final byte
138 is the sign; a value of D represents a negative sign and all other
139 values are considered positive. The decimal point is implied.
141 The default output format follows the rules for IB format.
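The nibble layout of P format can be illustrated with a short Python sketch. The function name `decode_p' is hypothetical, and rejecting digit nibbles above 9 is an assumption; the text above does not say how such input is treated.

```python
def decode_p(field: bytes, d: int = 0) -> float:
    """Decode a packed (binary coded decimal) field with D implied decimals."""
    if not 1 <= len(field) <= 16:
        raise ValueError("P width must be between 1 and 16 bytes")
    digits = []
    # Every byte except the rightmost holds two digits, upper nibble first.
    for byte in field[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    # The final byte holds the least significant digit and then the sign.
    digits.append(field[-1] >> 4)
    sign_nibble = field[-1] & 0x0F
    if any(n > 9 for n in digits):
        raise ValueError("digit nibble out of range")  # assumption
    value = 0
    for n in digits:
        value = value * 10 + n
    # A sign nibble of D means negative; all other values mean positive.
    if sign_nibble == 0xD:
        value = -value
    return value / 10 ** d
```

The three bytes 12 34 5D with D = 2 therefore decode to -123.45.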
143 PKw.d => F: 1 <= iw,ow <= 16
Positive binary coded decimal format. Same as P but the last byte
145 is the same as the others.
147 The default output format follows the rules for IB format.
149 RBw => F: 2 <= iw,ow <= 8
150 Binary C architecture-dependent "double" format. For a standard
151 IEEE754 implementation W should be 8.
153 The default output format follows the rules for IB format.
155 PIBHEXw.d => F: 2 <= iw,ow <= 16
156 PIB format encoded as textual hex digit pairs. W must be even.
158 The input width is mapped to a default output width as follows:
159 2=>4, 4=>6, 6=>9, 8=>11, 10=>14, 12=>16, 14=>18, 16=>21. No
160 allowances are made for decimal places.
162 RBHEXw => F: 4 <= iw,ow <= 16
RB format encoded as textual hex digit pairs. W must be even.
165 The default output format is F8.2.
167 CCAw.d: 1 <= ow <= 40
168 CCBw.d: 1 <= ow <= 40
169 CCCw.d: 1 <= ow <= 40
170 CCDw.d: 1 <= ow <= 40
171 CCEw.d: 1 <= ow <= 40
172 User-defined custom currency formats. May not be used as an input
173 format. *Note SET::, for more details.
175 The date and time numeric input and output formats accept a number of
176 possible formats. Before describing the formats themselves, some
177 definitions of the elements that make up their formats will be helpful:
`leader'
     All formats accept an optional whitespace leader.
`day'
     An integer between 1 and 31 representing the day of month.
`day-count'
     An integer representing a number of days.
`date-delimiter'
     One or more characters of whitespace or the following characters:
`month'
     A month name in one of the following forms:
        * An integer between 1 and 12.
        * Roman numerals representing an integer between 1 and 12.
        * At least the first three characters of an English month name
          (January, February, ...).
`year'
     An integer year number between 1582 and 19999, or between 1 and
     199. Years between 1 and 199 will have 1900 added.
`julian'
     A single number with a year number in the first 2, 3, or 4 digits
     (as above) and the day number within the year in the last 3 digits.
`quarter'
     An integer between 1 and 4 representing a quarter.
`q-delimiter'
     The letter `Q' or `q'.
`week'
     An integer between 1 and 53 representing a week within a year.
`wk-delimiter'
     The letters `wk' in any case.
`time-delimiter'
     At least one character of whitespace or `:' or `.'.
`hour'
     An integer greater than 0 representing an hour.
`minute'
     An integer between 0 and 59 representing a minute within an hour.
`opt-second'
     Optionally, a time-delimiter followed by a real number
     representing a number of seconds.
`hour24'
     An integer between 0 and 23 representing an hour within a day.
`weekday'
     At least the first two characters of an English day word.
`spaces'
     Any amount or no amount of whitespace.
`sign'
     An optional positive or negative sign.
`trailer'
     All formats accept an optional whitespace trailer.
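The year rule and the combined year and day-of-year number described above can be sketched in Python as follows. The function names are hypothetical, and the upper bound of 366 on the day number is an assumption that the text does not state.

```python
def normalize_year(year: int) -> int:
    """Apply the year rule: 1 to 199 have 1900 added, 1582 to 19999 are kept."""
    if 1 <= year <= 199:
        return year + 1900
    if 1582 <= year <= 19999:
        return year
    raise ValueError(f"year out of range: {year}")


def parse_julian(text: str) -> tuple:
    """Split a number such as '99123' into (year, day within year)."""
    digits = text.strip()  # optional whitespace leader and trailer
    if not (digits.isascii() and digits.isdigit()) or not 5 <= len(digits) <= 7:
        raise ValueError(f"not a year/day number: {text!r}")
    # The last 3 digits are the day; the first 2, 3, or 4 are the year.
    year = normalize_year(int(digits[:-3]))
    day = int(digits[-3:])
    if not 1 <= day <= 366:  # assumption: valid day-of-year range
        raise ValueError(f"day of year out of range: {day}")
    return year, day
```

With these rules `99123' is read as day 123 of 1999, and `2000060' as day 60 of 2000.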
249 The date input formats are strung together from the above pieces. On
250 output, the date formats are always printed in a single canonical
251 manner, based on field width. The date input and output formats are
254 DATEw: 9 <= iw,ow <= 40
255 Date format. Input format: leader + day + date-delimiter + month +
256 date-delimiter + year + trailer. Output format: DD-MMM-YY for W <
257 11, DD-MMM-YYYY otherwise.
259 EDATEw: 8 <= iw,ow <= 40
260 European date format. Input format same as DATE. Output format:
261 DD.MM.YY for W < 10, DD.MM.YYYY otherwise.
263 SDATEw: 8 <= iw,ow <= 40
264 Standard date format. Input format: leader + year + date-delimiter
265 + month + date-delimiter + day + trailer. Output format: YY/MM/DD
266 for W < 10, YYYY/MM/DD otherwise.
268 ADATEw: 8 <= iw,ow <= 40
269 American date format. Input format: leader + month +
270 date-delimiter + day + date-delimiter + year + trailer. Output
271 format: MM/DD/YY for W < 10, MM/DD/YYYY otherwise.
273 JDATEw: 5 <= iw,ow <= 40
274 Julian date format. Input format: leader + julian + trailer.
275 Output format: YYDDD for W < 7, YYYYDDD otherwise.
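The width thresholds for the five date formats above can be summarized in a Python sketch. The function `format_date' is hypothetical. Printing the month abbreviation in capital letters is an assumption, and no padding to the full field width is attempted.

```python
from datetime import date

MONTH_ABBREVS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                 "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def format_date(fmt: str, w: int, value: date) -> str:
    """Render a date the way the given format and width W call for."""
    yy = f"{value.year % 100:02d}"
    yyyy = f"{value.year:04d}"
    dd = f"{value.day:02d}"
    mm = f"{value.month:02d}"
    if fmt == "DATE":    # DD-MMM-YY for W < 11, DD-MMM-YYYY otherwise
        return f"{dd}-{MONTH_ABBREVS[value.month - 1]}-{yy if w < 11 else yyyy}"
    if fmt == "EDATE":   # DD.MM.YY for W < 10, DD.MM.YYYY otherwise
        return f"{dd}.{mm}.{yy if w < 10 else yyyy}"
    if fmt == "SDATE":   # YY/MM/DD for W < 10, YYYY/MM/DD otherwise
        return f"{yy if w < 10 else yyyy}/{mm}/{dd}"
    if fmt == "ADATE":   # MM/DD/YY for W < 10, MM/DD/YYYY otherwise
        return f"{mm}/{dd}/{yy if w < 10 else yyyy}"
    if fmt == "JDATE":   # YYDDD for W < 7, YYYYDDD otherwise
        ddd = f"{value.timetuple().tm_yday:03d}"
        return f"{yy if w < 7 else yyyy}{ddd}"
    raise ValueError(f"unsupported format: {fmt}")
```

For 5 March 1999, DATE9 would give `05-MAR-99' and DATE11 would give `05-MAR-1999'.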
277 QYRw: 4 <= iw <= 40, 6 <= ow <= 40
278 Quarter/year format. Input format: leader + quarter + q-delimiter
279 + year + trailer. Output format: `Q Q YY', where the first `Q' is
280 one of the digits 1, 2, 3, 4, if W < 8, `Q Q YYYY' otherwise.
282 MOYRw: 6 <= iw,ow <= 40
283 Month/year format. Input format: leader + month + date-delimiter
284 + year + trailer. Output format: `MMM YY' for W < 8, `MMM YYYY'
287 WKYRw: 6 <= iw <= 40, 8 <= ow <= 40
288 Week/year format. Input format: leader + week + wk-delimiter +
289 year + trailer. Output format: `WW WK YY' for W < 10, `WW WK
292 DATETIMEw.d: 17 <= iw,ow <= 40
293 Date and time format. Input format: leader + day + date-delimiter
+ month + date-delimiter + year + time-delimiter + hour24 +
295 time-delimiter + minute + opt-second. Output format: `DD-MMM-YYYY
296 HH:MM'. If W > 19 then seconds `:SS' is added. If W > 22 and D >
297 0 then fractional seconds `.SS' are added.
299 TIMEw.d: 5 <= iw,ow <= 40
300 Time format. Input format: leader + sign + spaces + hour +
301 time-delimiter + minute + opt-second. Output format: `HH:MM'.
302 Seconds and fractional seconds are available with W of at least 8
303 and 10, respectively.
305 DTIMEw.d: 1 <= iw <= 40, 8 <= ow <= 40
306 Time format with day count. Input format: leader + sign + spaces +
307 day-count + time-delimiter + hour + time-delimiter + minute +
308 opt-second. Output format: `DD HH:MM'. Seconds and fractional
309 seconds are available with W of at least 8 and 10, respectively.
311 WKDAYw: 2 <= iw,ow <= 40
312 A weekday as a number between 1 and 7, where 1 is Sunday. Input
313 format: leader + weekday + trailer. Output format: as many
314 characters, in all capital letters, of the English name of the
315 weekday as will fit in the field width.
317 MONTHw: 3 <= iw,ow <= 40
318 A month as a number between 1 and 12, where 1 is January. Input
319 format: leader + month + trailer. Output format: as many
characters, in all capital letters, of the English name of the
321 month as will fit in the field width.
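The output rule for WKDAY and MONTH, as many capital letters of the English name as fit in the field width, amounts to simple truncation. A Python sketch with hypothetical function names follows. Padding of names shorter than the width is not described above and is left out.

```python
WEEKDAY_NAMES = ["SUNDAY", "MONDAY", "TUESDAY", "WEDNESDAY",
                 "THURSDAY", "FRIDAY", "SATURDAY"]
MONTH_NAMES = ["JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
               "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER",
               "DECEMBER"]


def format_wkday(number: int, w: int) -> str:
    """Weekday 1 (Sunday) through 7, truncated to width W (2 to 40)."""
    if not 1 <= number <= 7 or not 2 <= w <= 40:
        raise ValueError("weekday or width out of range")
    return WEEKDAY_NAMES[number - 1][:w]


def format_month(number: int, w: int) -> str:
    """Month 1 (January) through 12, truncated to width W (3 to 40)."""
    if not 1 <= number <= 12 or not 3 <= w <= 40:
        raise ValueError("month or width out of range")
    return MONTH_NAMES[number - 1][:w]
```

Thus weekday 1 under WKDAY3 prints as `SUN', and month 9 under MONTH40 prints as `SEPTEMBER'.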
323 There are only two formats that may be used with string variables:
325 Aw: 1 <= iw <= 255, 1 <= ow <= 254
326 The entire field is treated as a string value.
328 AHEXw => A: 2 <= iw <= 254; 2 <= ow <= 510
329 The field is composed of characters in a string encoded as textual
332 The default output W is half the input W.
335 File: pspp.info, Node: Scratch Variables, Prev: Input/Output Formats, Up: Variables
340 Most of the time, variables don't retain their values between cases.
341 Instead, either they're being read from a data file or the active file,
342 in which case they assume the value read, or, if created with COMPUTE or
343 another transformation, they're initialized to the system-missing value
344 or to blanks, depending on type.
346 However, sometimes it's useful to have a variable that keeps its
347 value between cases. You can do this with LEAVE (*note LEAVE::), or
348 you can use a "scratch variable". Scratch variables are variables whose
349 names begin with an octothorpe (`#').
351 Scratch variables have the same properties as variables left with
352 LEAVE: they retain their values between cases, and for the first case
353 they are initialized to 0 or blanks. They have the additional property
354 that they are deleted before the execution of any procedure. For this
355 reason, scratch variables can't be used for analysis. To obtain the
356 same effect, use COMPUTE (*note COMPUTE::) to copy the scratch
variable's value into an ordinary variable, then analyze that variable.
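The behavior of a scratch variable can be modeled loosely in Python. This is not PSPP syntax, only a sketch of the case loop: `scratch_total' plays the role of a hypothetical scratch variable `#TOTAL', which is initialized to 0 before the first case and keeps its value afterward, while `total' is an ordinary variable that receives a copy for each case.

```python
def running_total(values):
    """Model a scratch variable accumulating a sum across cases."""
    scratch_total = 0            # initialized once, before the first case
    cases = []
    for x in values:             # one iteration per case
        scratch_total += x       # value is retained from the previous case
        total = scratch_total    # copy into an ordinary variable
        cases.append({"x": x, "total": total})
    # Only the ordinary variables survive to be analyzed by a procedure.
    return cases
```

Calling `running_total([1, 2, 3])' gives totals of 1, 3, and 6 for the three cases.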
360 File: pspp.info, Node: Files, Next: BNF, Prev: Variables, Up: Language
365 PSPP makes use of many files each time it runs. Some of these it
366 reads, some it writes, some it creates. Here is a table listing the
367 most important of these files:
371 These names (synonyms) refer to the file that contains
372 instructions to PSPP that tell it what to do. The syntax file's
373 name is specified on the PSPP command line. Syntax files can also
374 be pulled in with the `INCLUDE' command.
377 Data files contain raw data in ASCII format suitable for being
378 read in by the `DATA LIST' command. Data can be embedded in the
379 syntax file with `BEGIN DATA' and `END DATA' commands: this makes
380 the syntax file a data file too.
383 One or more output files are created by PSPP each time it is run.
384 The output files receive the tables and charts produced by
385 statistical procedures. The output files may be in any number of
386 formats, depending on how PSPP is configured.
389 The active file is the "file" on which all PSPP procedures are
390 performed. The active file contains variable definitions and
391 cases. The active file is not necessarily a disk file: it is
392 stored in memory if there is room.
395 File: pspp.info, Node: BNF, Prev: Files, Up: Language
400 The syntax of some parts of the PSPP language is presented in this
401 manual using the formalism known as "Backus-Naur Form", or BNF. The
402 following table describes BNF:
404 * Words in all-uppercase are PSPP keyword tokens. In BNF, these are
405 often called "terminals". There are some special terminals, which
406 are actually written in lowercase for clarity:
418 A single variable name.
420 `=', `/', `+', `-', etc.
421 Operators and punctuators.
424 The terminal dot. This is not necessarily an actual dot in
425 the syntax file: *Note Commands::, for more details.
427 * Other words in all lowercase refer to BNF definitions, called
428 "productions". These productions are also known as
429 "nonterminals". Some nonterminals are very common, so they are
430 defined here in English for clarity:
433 A list of one or more variable names or the keyword `ALL'.
436 An expression. *Note Expressions::, for details.
438 * `::=' means "is defined as". The left side of `::=' gives the
439 name of the nonterminal being defined. The right side of `::='
440 gives the definition of that nonterminal. If the right side is
441 empty, then one possible expansion of that nonterminal is nothing.
442 A BNF definition is called a "production".
444 * So, the key difference between a terminal and a nonterminal is
445 that a terminal cannot be broken into smaller parts--in fact,
446 every terminal is a single token (*note Tokens::). On the other
447 hand, nonterminals are composed of a (possibly empty) sequence of
448 terminals and nonterminals. Thus, terminals indicate the deepest
449 level of syntax description. (In parsing theory, terminals are
450 the leaves of the parse tree; nonterminals form the branches.)
452 * The first nonterminal defined in a set of productions is called the
453 "start symbol". The start symbol defines the entire syntax for
457 File: pspp.info, Node: Expressions, Next: Data Input and Output, Prev: Language, Up: Top
459 Mathematical Expressions
460 ************************
462 Some PSPP commands use expressions, which share a common syntax
463 among all PSPP commands. Expressions are made up of "operands", which
464 can be numbers, strings, or variable names, separated by "operators".
465 There are five types of operators: grouping, arithmetic, logical,
466 relational, and functions.
468 Every operator takes one or more "arguments" as input and produces
469 or "returns" exactly one result as output. Both strings and numeric
470 values can be used as arguments and are produced as results, but each
471 operator accepts only specific combinations of numeric and string values
472 as arguments. With few exceptions, operator arguments may be
473 full-fledged expressions in themselves.
477 * Booleans:: Boolean values.
478 * Missing Values in Expressions:: Using missing values in expressions.
479 * Grouping Operators:: ( )
480 * Arithmetic Operators:: + - * / **
481 * Logical Operators:: AND NOT OR
482 * Relational Operators:: EQ GE GT LE LT NE
483 * Functions:: More-sophisticated operators.
484 * Order of Operations:: Operator precedence.
487 File: pspp.info, Node: Booleans, Next: Missing Values in Expressions, Prev: Expressions, Up: Expressions
492 There is a third type for arguments and results, the "Boolean" type,
493 which is used to represent true/false conditions. Booleans have only
494 three possible values: 0 (false), 1 (true), and system-missing.
System-missing is neither true nor false.
497 * A numeric expression that has value 0, 1, or system-missing may be
498 used in place of a Boolean. Thus, the expression `0 AND 1' is
valid (although it is always false).
501 * A numeric expression with any other value will cause an error if
502 it is used as a Boolean. So, `2 OR 3' is invalid.
504 * A Boolean expression may not be used in place of a numeric
505 expression. Thus, `(1>2) + (3<4)' is invalid.
507 * Strings and Booleans are not compatible, and neither may be used in
511 File: pspp.info, Node: Missing Values in Expressions, Next: Grouping Operators, Prev: Booleans, Up: Expressions
513 Missing Values in Expressions
514 =============================
516 String missing values are not treated specially in expressions. Most
517 numeric operators return system-missing when given system-missing
518 arguments. Exceptions are listed under particular operator
521 User-missing values for numeric variables are always transformed into
522 the system-missing value, except inside the arguments to the `VALUE',
523 `SYSMIS', and `MISSING' functions.
525 The missing-value functions can be used to precisely control how
526 missing values are treated in expressions. *Note Missing Value
527 Functions::, for more details.
530 File: pspp.info, Node: Grouping Operators, Next: Arithmetic Operators, Prev: Missing Values in Expressions, Up: Expressions
535 Parentheses (`()') are the grouping operators. Surround an
536 expression with parentheses to force early evaluation.
538 Parentheses also surround the arguments to functions, but in that
539 situation they act as punctuators, not as operators.
542 File: pspp.info, Node: Arithmetic Operators, Next: Logical Operators, Prev: Grouping Operators, Up: Expressions
The arithmetic operators take numeric arguments and produce numeric
results.
`A + B'
     Adds A and B, returning the sum.
`A - B'
     Subtracts B from A, returning the difference.
`A * B'
     Multiplies A and B, returning the product.
`A / B'
     Divides A by B, returning the quotient. If B is zero, the result
`A ** B'
     Returns the result of raising A to the power B. If A is negative
     and B is not an integer, the result is system-missing. The result
     of `0**0' is system-missing as well.
`- A'
     Reverses the sign of A.
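The special cases of exponentiation can be sketched in Python, using `None' as a stand-in for the system-missing value. The function name `power' is hypothetical. Returning system-missing when Python itself cannot compute the result (for example, zero raised to a negative power) is an assumption, not something stated above.

```python
from typing import Optional

SYSMIS = None  # stand-in for the system-missing value


def power(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """Raise A to the power B, following the rules for system-missing."""
    # Most numeric operators return system-missing for missing arguments.
    if a is SYSMIS or b is SYSMIS:
        return SYSMIS
    # 0**0 is system-missing.
    if a == 0 and b == 0:
        return SYSMIS
    # A negative base with a non-integer exponent is system-missing.
    if a < 0 and not float(b).is_integer():
        return SYSMIS
    try:
        return a ** b
    except (ZeroDivisionError, OverflowError):
        return SYSMIS  # assumption: uncomputable results are missing
```

Here `power(2, 10)' is 1024, while `power(-8, 1/3)' and `power(0, 0)' both give system-missing.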
572 File: pspp.info, Node: Logical Operators, Next: Relational Operators, Prev: Arithmetic Operators, Up: Expressions
577 The logical operators take logical arguments and produce logical
578 results, meaning "true or false". PSPP logical operators are not true
579 Boolean operators because they may also result in a system-missing
`A AND B'
     True if both A and B are true. However, if one argument is false
     and the other is missing, the result is false, not missing. If
     both arguments are missing, the result is missing.
`A OR B'
     True if at least one of A and B is true. If one argument is true
     and the other is missing, the result is true, not missing. If both
     arguments are missing, the result is missing.
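The treatment of missing values by AND and OR can be modeled in Python with `None' standing for system-missing. The function names are hypothetical. Two cases are not spelled out above and are assumptions here: true AND missing, and false OR missing, are both taken to be missing.

```python
from typing import Optional


def and3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued AND: a false argument wins over a missing one."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None   # includes true AND missing (assumption)
    return True


def or3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued OR: a true argument wins over a missing one."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None   # includes false OR missing (assumption)
    return False
```

For example, `and3(False, None)' is false and `or3(True, None)' is true, whereas `and3(None, None)' and `or3(None, None)' are both missing.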
599 File: pspp.info, Node: Relational Operators, Next: Functions, Prev: Logical Operators, Up: Expressions
604 The relational operators take numeric or string arguments and
605 produce Boolean results.
607 Note that, with numeric arguments, PSPP does not make exact
608 relational tests. Instead, two numbers are considered to be equal even
609 if they differ by a small amount. This amount, "epsilon", is dependent
610 on the PSPP configuration and determined at compile time. (The default
611 value is 0.000000001, or `10**(-9)'.) Use of epsilon allows for
612 round-off errors. Use of epsilon is also idiotic, but the author is
613 not a numeric analyst.
615 Strings cannot be compared to numbers. When strings of different
616 lengths are compared, the shorter string is right-padded with spaces to
617 match the length of the longer string.
619 The results of string comparisons, other than tests for equality or
620 inequality, are dependent on the character set in use. String
621 comparisons are case-sensitive.
`A EQ B'
     True if A is equal to B.
`A LE B'
     True if A is less than or equal to B.
`A LT B'
     True if A is less than B.
`A GE B'
     True if A is greater than or equal to B.
`A GT B'
     True if A is greater than B.
`A NE B'
     True if A is not equal to B.
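The two comparison rules described in this section, the epsilon tolerance for numbers and right-padding for strings, can be sketched in Python. The function names are hypothetical, the epsilon value is the default quoted above, and whether a difference of exactly epsilon counts as equal is an assumption.

```python
EPSILON = 1e-9  # the default value quoted above


def num_eq(a: float, b: float, epsilon: float = EPSILON) -> bool:
    """Numbers are equal if they differ by no more than epsilon."""
    return abs(a - b) <= epsilon


def str_eq(a: str, b: str) -> bool:
    """Right-pad the shorter string with spaces, then compare exactly."""
    width = max(len(a), len(b))
    # The comparison is case-sensitive.
    return a.ljust(width) == b.ljust(width)
```

Under these rules `num_eq(0.1 + 0.2, 0.3)' is true despite round-off, and `str_eq("abc", "abc  ")' is true, but `str_eq("abc", "ABC")' is false.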
649 File: pspp.info, Node: Functions, Next: Order of Operations, Prev: Relational Operators, Up: Expressions
654 PSPP functions provide mathematical abilities above and beyond those
655 possible using simple operators. Functions have a common syntax: each
656 is composed of a function name followed by a left parenthesis, one or
657 more arguments, and a right parenthesis. Function names are *not*
658 reserved; their names are specially treated only when followed by a
659 left parenthesis: `EXP(10)' refers to the constant value `e' raised to
660 the 10th power, but `EXP' by itself refers to the value of variable EXP.
662 The sections below describe each function in detail.
666 * Advanced Mathematics:: EXP LG10 LN SQRT
667 * Miscellaneous Mathematics:: ABS MOD MOD10 RND TRUNC
668 * Trigonometry:: ACOS ARCOS ARSIN ARTAN ASIN ATAN COS SIN TAN
669 * Missing Value Functions:: MISSING NMISS NVALID SYSMIS VALUE
670 * Pseudo-Random Numbers:: NORMAL UNIFORM
671 * Set Membership:: ANY RANGE
672 * Statistical Functions:: CFVAR MAX MEAN MIN SD SUM VARIANCE
673 * String Functions:: CONCAT INDEX LENGTH LOWER LPAD LTRIM NUMBER
674 RINDEX RPAD RTRIM STRING SUBSTR UPCASE
675 * Time & Date:: CTIME.xxx DATE.xxx TIME.xxx XDATE.xxx
676 * Miscellaneous Functions:: LAG YRMODA
677 * Functions Not Implemented:: CDF.xxx CDFNORM IDF.xxx NCDF.xxx PROBIT RV.xxx
680 File: pspp.info, Node: Advanced Mathematics, Next: Miscellaneous Mathematics, Prev: Functions, Up: Functions
682 Advanced Mathematical Functions
683 -------------------------------
Advanced mathematical functions take numeric arguments and produce
numeric results.
688 - Function: EXP (EXPONENT)
689 Returns e (approximately 2.71828) raised to power EXPONENT.
691 - Function: LG10 (NUMBER)
692 Takes the base-10 logarithm of NUMBER. If NUMBER is not positive,
693 the result is system-missing.
695 - Function: LN (NUMBER)
696 Takes the base-`e' logarithm of NUMBER. If NUMBER is not
697 positive, the result is system-missing.
699 - Function: SQRT (NUMBER)
700 Takes the square root of NUMBER. If NUMBER is negative, the
701 result is system-missing.
704 File: pspp.info, Node: Miscellaneous Mathematics, Next: Trigonometry, Prev: Advanced Mathematics, Up: Functions
706 Miscellaneous Mathematical Functions
707 ------------------------------------
709 Miscellaneous mathematical functions take numeric arguments and
710 produce numeric results.
712 - Function: ABS (NUMBER)
713 Results in the absolute value of NUMBER.
715 - Function: MOD (NUMERATOR, DENOMINATOR)
716 Returns the remainder (modulus) of NUMERATOR divided by
717 DENOMINATOR. If DENOMINATOR is 0, the result is system-missing.
718 However, if NUMERATOR is 0 and DENOMINATOR is system-missing, the
721 - Function: MOD10 (NUMBER)
722 Returns the remainder when NUMBER is divided by 10. If NUMBER is
723 negative, MOD10(NUMBER) is negative or zero.
725 - Function: RND (NUMBER)
726 Takes the absolute value of NUMBER and rounds it to an integer.
727 Then, if NUMBER was negative originally, negates the result.
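A Python sketch of RND and MOD10 as described above. The function names are hypothetical, and rounding an absolute value that ends in exactly .5 upward is an assumption, since the text says only that the absolute value is rounded to an integer.

```python
import math


def rnd(number: float) -> int:
    """Round the absolute value, then restore the original sign."""
    rounded = math.floor(abs(number) + 0.5)  # assumption: halves round up
    return -rounded if number < 0 else rounded


def mod10(number: float) -> float:
    """Remainder of division by 10, taking the sign of NUMBER."""
    # math.fmod keeps the sign of its first argument, so a negative
    # NUMBER gives a negative or zero result.
    return math.fmod(number, 10)
```

So `rnd(-2.5)' is -3 under this assumption, and `mod10(-23)' is -3.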
729 - Function: TRUNC (NUMBER)
730 Discards the fractional part of NUMBER; that is, rounds NUMBER
734 File: pspp.info, Node: Trigonometry, Next: Missing Value Functions, Prev: Miscellaneous Mathematics, Up: Functions
736 Trigonometric Functions
737 -----------------------
Trigonometric functions take numeric arguments and produce numeric
results.
742 - Function: ACOS (NUMBER)
743 - Function: ARCOS (NUMBER)
744 Takes the arccosine, in radians, of NUMBER. Results in
745 system-missing if NUMBER is not between -1 and 1. Portability:
748 - Function: ARSIN (NUMBER)
749 Takes the arcsine, in radians, of NUMBER. Results in
750 system-missing if NUMBER is not between -1 and 1 inclusive.
752 - Function: ARTAN (NUMBER)
753 Takes the arctangent, in radians, of NUMBER.
755 - Function: ASIN (NUMBER)
756 Takes the arcsine, in radians, of NUMBER. Results in
757 system-missing if NUMBER is not between -1 and 1 inclusive.
760 - Function: ATAN (NUMBER)
761 Takes the arctangent, in radians, of NUMBER.
763 *Please note:* Use of the AR* group of inverse trigonometric
764 functions is recommended over the A* group because they are more
767 - Function: COS (RADIANS)
768 Takes the cosine of RADIANS.
770 - Function: SIN (ANGLE)
Takes the sine of ANGLE, in radians.
773 - Function: TAN (ANGLE)
Takes the tangent of ANGLE, in radians. Results in system-missing at values
775 of ANGLE that are too close to odd multiples of pi/2.
779 File: pspp.info, Node: Missing Value Functions, Next: Pseudo-Random Numbers, Prev: Trigonometry, Up: Functions
781 Missing-Value Functions
782 -----------------------
784 Missing-value functions take various types as arguments, returning
785 various types of results.
787 - Function: MISSING (VARIABLE OR EXPRESSION)
The argument may be a single variable name or an expression. If it is a
789 variable name, results in 1 if the variable has a user-missing or
790 system-missing value for the current case, 0 otherwise. If it is
791 an expression, results in 1 if the expression has the
792 system-missing value, 0 otherwise.
794 *Please note:* If the argument is a string expression other
795 than a variable name, MISSING is guaranteed to return 0,
796 because strings do not have a system-missing value. Also,
797 when using a numeric expression argument, remember that
798 user-missing values are converted to the system-missing value
799 in most contexts. Thus, the expressions `MISSING(VAR1 OP
800 VAR2)' and `MISSING(VAR1) OR MISSING(VAR2)' are often
801 equivalent, depending on the specific operator OP used.
803 - Function: NMISS (EXPR [, EXPR]...)
804 Each argument must be a numeric expression. Returns the number of
805 user- or system-missing values in the list. As a special
806 extension, the syntax `VAR1 TO VAR2' may be used to refer to a
807 range of variables; see *Note Sets of Variables::, for more
810 - Function: NVALID (EXPR [, EXPR]...)
811 Each argument must be a numeric expression. Returns the number of
812 values in the list that are not user- or system-missing. As a
813 special extension, the syntax `VAR1 TO VAR2' may be used to refer
814 to a range of variables; see *Note Sets of Variables::, for more
817 - Function: SYSMIS (VARIABLE OR EXPRESSION)
818 When given the name of a numeric variable, returns 1 if the value
819 of that variable is system-missing. Otherwise, if the value is not
820 missing or if it is user-missing, returns 0. If given the name of
821 a string variable, always returns 1. If given an expression other
822 than a single variable name, results in 1 if the value is system-
823 or user-missing, 0 otherwise.
825 - Function: VALUE (VARIABLE)
826 Prevents the user-missing values of VARIABLE from being
827 transformed into system-missing values: If VARIABLE is not system-
828 or user-missing, results in the value of VARIABLE. If VARIABLE is
829 user-missing, results in the value of VARIABLE anyway. If
830 VARIABLE is system-missing, results in system-missing.
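The distinction between user-missing and system-missing values for numeric variables can be modeled in Python. Everything here is a hypothetical model, not PSPP's implementation: `NumVar' pairs a value with its declared user-missing values, `None' stands for system-missing, and `in_expression' shows the conversion that applies outside these functions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class NumVar:
    """A numeric variable's value for the current case (hypothetical)."""
    value: Optional[float]                 # None means system-missing
    user_missing: Tuple[float, ...] = ()   # declared user-missing values


def missing(v: NumVar) -> int:
    """1 if the variable is user-missing or system-missing, else 0."""
    return int(v.value is None or v.value in v.user_missing)


def sysmis(v: NumVar) -> int:
    """1 only if the variable is system-missing."""
    return int(v.value is None)


def value(v: NumVar) -> Optional[float]:
    """The stored value, even if it is user-missing."""
    return v.value


def in_expression(v: NumVar) -> Optional[float]:
    """Elsewhere, user-missing values become system-missing."""
    return None if missing(v) else v.value


def nmiss(*vs: NumVar) -> int:
    """Number of user-missing or system-missing values in the list."""
    return sum(missing(v) for v in vs)


def nvalid(*vs: NumVar) -> int:
    """Number of values that are neither user- nor system-missing."""
    return len(vs) - nmiss(*vs)
```

With `age = NumVar(99, (99,))', `missing(age)' is 1 and `sysmis(age)' is 0, `value(age)' is 99, and `in_expression(age)' is system-missing.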
833 File: pspp.info, Node: Pseudo-Random Numbers, Next: Set Membership, Prev: Missing Value Functions, Up: Functions
835 Pseudo-Random Number Generation Functions
836 -----------------------------------------
838 Pseudo-random number generation functions take numeric arguments and
839 produce numeric results.
841 The system's C library random generator is used as a basis for
842 generating random numbers, since random number generation is a
843 system-dependent task. However, Knuth's Algorithm B is used to shuffle
844 the resultant values, which is enough to make even a stream of
845 consecutive integers random enough for most applications.
847 (If you're worried about the quality of the random number generator,
848 well, you're using a statistical processing package--analyze it!)
850 - Function: NORMAL (NUMBER)
851 Results in a random number. Results from `NORMAL' are normally
852 distributed with a mean of 0 and a standard deviation of NUMBER.
854 - Function: UNIFORM (NUMBER)
855 Results in a random number between 0 and NUMBER. Results from
856 `UNIFORM' are evenly distributed across its entire range. There
857 may be a maximum on the largest random number ever generated--this
858 is often 2**31-1 (2,147,483,647), but it may be orders of magnitude
862 File: pspp.info, Node: Set Membership, Next: Statistical Functions, Prev: Pseudo-Random Numbers, Up: Functions
864 Set-Membership Functions
865 ------------------------
867 Set membership functions determine whether a value is a member of a
868 set. They take a set of numeric arguments or a set of string
869 arguments, and produce Boolean results.
871 String comparisons are performed according to the rules given in
872 *Note Relational Operators::.
874 - Function: ANY (VALUE, SET [, SET]...)
875 Results in true if VALUE is equal to any of the SET values.
876 Otherwise, results in false. If VALUE is system-missing, returns
877 system-missing. System-missing values in SET do not cause ANY to
878 return system-missing.
880 - Function: RANGE (VALUE, LOW, HIGH [, LOW, HIGH]...)
881 Results in true if VALUE is in any of the intervals bounded by LOW
882 and HIGH inclusive. Otherwise, results in false. Each LOW must
883 be less than or equal to its corresponding HIGH value. LOW and
884 HIGH must be given in pairs. If VALUE is system-missing, returns
system-missing. System-missing values in LOW and HIGH do not cause RANGE
886 to return system-missing.
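A Python sketch of ANY and RANGE for numeric arguments, with `None' standing for system-missing. The function names carry a trailing underscore only to avoid Python's built-ins. For simplicity the sketch uses exact comparison rather than the epsilon tolerance of the relational operators, and it treats an interval with a missing bound as not matching, which is an assumption.

```python
from typing import Optional


def any_(value: Optional[float], *candidates: Optional[float]) -> Optional[bool]:
    """True if VALUE equals any candidate; missing if VALUE is missing."""
    if value is None:
        return None
    # Missing candidates are skipped rather than making the result missing.
    return any(c is not None and c == value for c in candidates)


def range_(value: Optional[float], *bounds: Optional[float]) -> Optional[bool]:
    """True if VALUE lies in any inclusive LOW..HIGH interval."""
    if not bounds or len(bounds) % 2 != 0:
        raise ValueError("LOW and HIGH must be given in pairs")
    if value is None:
        return None
    pairs = zip(bounds[0::2], bounds[1::2])
    return any(low is not None and high is not None and low <= value <= high
               for low, high in pairs)
```

For example, `range_(5, 1, 3, 4, 6)' is true because 5 falls in the second interval, and `any_(None, 1, 2)' is system-missing.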
889 File: pspp.info, Node: Statistical Functions, Next: String Functions, Prev: Set Membership, Up: Functions
891 Statistical Functions
892 ---------------------
894 Statistical functions compute descriptive statistics on a list of
895 values. Some statistics can be computed on numeric or string values;
896 others can only be computed on numeric values. They result in the same
897 type as their arguments.
899 With statistical functions it is possible to specify a minimum
900 number of non-missing arguments for the function to be evaluated. To
901 do so, append a dot and the number to the function name. For instance,
902 to specify a minimum of three valid arguments to the MEAN function, use
903 `MEAN.3'.
905 - Function: CFVAR (NUMBER, NUMBER[, ...])
906 Results in the coefficient of variation of the values of NUMBER.
907 This function requires at least two valid arguments to give a
908 non-missing result. (The coefficient of variation is the standard
909 deviation divided by the mean.)
911 - Function: MAX (VALUE, VALUE[, ...])
912 Results in the value of the greatest VALUE. The VALUEs may be
913 numeric or string. Although at least two arguments must be given,
914 only one need be valid for MAX to give a non-missing result.
916 - Function: MEAN (NUMBER, NUMBER[, ...])
917 Results in the mean of the values of NUMBER. Although at least
918 two arguments must be given, only one need be valid for MEAN to
919 give a non-missing result.
921 - Function: MIN (VALUE, VALUE[, ...])
922 Results in the value of the least VALUE. The VALUEs may be
923 numeric or string. Although at least two arguments must be given,
924 only one need be valid for MIN to give a non-missing result.
926 - Function: SD (NUMBER, NUMBER[, ...])
927 Results in the standard deviation of the values of NUMBER. This
928 function requires at least two valid arguments to give a
929 non-missing result.
931 - Function: SUM (NUMBER, NUMBER[, ...])
932 Results in the sum of the values of NUMBER. Although at least two
933 arguments must be given, only one need be valid for SUM to give a
934 non-missing result.
936 - Function: VAR (NUMBER, NUMBER[, ...])
937 Results in the variance of the values of NUMBER. This function
938 requires at least two valid arguments to give a non-missing result.
940 - Function: VARIANCE (NUMBER, NUMBER[, ...])
941 Results in the variance of the values of NUMBER. This function
942 requires at least two valid arguments to give a non-missing result.
943 (Use VAR in preference to VARIANCE for reasons of portability.)
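
The minimum-valid-argument rule described in this node can be modelled in Python. This is a rough sketch, not PSPP's code: `mean_min_valid` and `cfvar` are hypothetical names, `None` stands in for system-missing, the `min_valid` parameter plays the role of the number appended after the dot, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
import statistics

# Illustrative model only; None stands in for PSPP's system-missing value.

def mean_min_valid(values, min_valid=1):
    """Mean of the non-missing values, or None if fewer than min_valid exist."""
    valid = [v for v in values if v is not None]
    if len(valid) < max(min_valid, 1):
        return None
    return sum(valid) / len(valid)


def cfvar(values, min_valid=2):
    """Coefficient of variation: standard deviation divided by the mean.

    Needs at least two valid values. Assumes the sample standard deviation;
    a mean of zero would raise ZeroDivisionError in this sketch.
    """
    valid = [v for v in values if v is not None]
    if len(valid) < max(min_valid, 2):
        return None
    return statistics.stdev(valid) / (sum(valid) / len(valid))
```

Under this model, a minimum of three valid arguments corresponds to `mean_min_valid(values, 3)`, which yields `None` when only two of the values are non-missing.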
946 File: pspp.info, Node: String Functions, Next: Time & Date, Prev: Statistical Functions, Up: Functions
948 String Functions
949 ----------------
951 String functions take various arguments and return various results.
953 - Function: CONCAT (STRING, STRING[, ...])
954 Returns a string consisting of each STRING in sequence.
955 `CONCAT("abc", "def", "ghi")' has a value of `"abcdefghi"'. The
956 resultant string is truncated to a maximum of 255 characters.
958 - Function: INDEX (HAYSTACK, NEEDLE)
959 Returns a positive integer indicating the position of the first
960 occurrence of NEEDLE in HAYSTACK. Returns 0 if HAYSTACK does not
961 contain NEEDLE. Returns system-missing if NEEDLE is an empty
962 string.
964 - Function: INDEX (HAYSTACK, NEEDLE, DIVISOR)
965 Divides NEEDLE into parts, each with length DIVISOR. Searches
966 HAYSTACK for the first occurrence of each part, and returns the
967 smallest value. Returns 0 if HAYSTACK does not contain any part
968 in NEEDLE. It is an error if the length of NEEDLE is not a
969 multiple of DIVISOR. Returns system-missing if NEEDLE is an
970 empty string.
972 - Function: LENGTH (STRING)
973 Returns the number of characters in STRING.
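
The two forms of INDEX described above can be sketched in Python. This is an illustrative model under stated assumptions, not PSPP's implementation: the function name `index` is hypothetical, `None` stands in for system-missing, an invalid DIVISOR is reported with `ValueError`, and "smallest value" is read as the smallest position among the parts that were actually found.

```python
# Illustrative model only; None stands in for PSPP's system-missing value.

def index(haystack, needle, divisor=None):
    """1-based position of the first match, 0 if none, None if needle is empty."""
    if needle == "":
        return None
    if divisor is None:
        divisor = len(needle)  # two-argument form: the needle is a single part
    if divisor <= 0 or len(needle) % divisor != 0:
        raise ValueError("length of NEEDLE must be a multiple of DIVISOR")
    # Split the needle into equal parts of length divisor.
    parts = [needle[i:i + divisor] for i in range(0, len(needle), divisor)]
    # str.find returns -1 when absent, so adding 1 maps "absent" to 0.
    positions = [haystack.find(part) + 1 for part in parts]
    found = [p for p in positions if p > 0]
    return min(found) if found else 0
```

With a DIVISOR of 2, the needle `"efbc"` is searched as the parts `"ef"` and `"bc"`; in `"abcdef"` these occur at positions 5 and 2, so the sketch reports 2.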
975 - Function: LOWER (STRING)
976 Returns a string identical to STRING except that all uppercase
977 letters are changed to lowercase letters. The definitions of
978 "uppercase" and "lowercase" are system-dependent.
980 - Function: LPAD (STRING, LENGTH)
981 If STRING is at least LENGTH characters in length, returns STRING
982 unchanged. Otherwise, returns STRING padded with spaces on the
983 left side to length LENGTH. Returns an empty string if LENGTH is
984 system-missing, negative, or greater than 255.
986 - Function: LPAD (STRING, LENGTH, PADDING)
987 If STRING is at least LENGTH characters in length, returns STRING
988 unchanged. Otherwise, returns STRING padded with PADDING on the
989 left side to length LENGTH. Returns an empty string if LENGTH is
990 system-missing, negative, or greater than 255, or if PADDING does
991 not contain exactly one character.
993 - Function: LTRIM (STRING)
994 Returns STRING, after removing leading spaces. Other whitespace,
995 such as tabs, carriage returns, line feeds, and vertical tabs, is
996 not removed.
998 - Function: LTRIM (STRING, PADDING)
999 Returns STRING, after removing leading PADDING characters. If
1000 PADDING does not contain exactly one character, returns an empty
1001 string.
1003 - Function: NUMBER (STRING)
1004 Returns the number produced when STRING is interpreted according
1005 to format FX.0, where X is the number of characters in STRING. If
1006 STRING does not form a proper number, system-missing is returned
1007 without an error message. Portability: none.
1009 - Function: NUMBER (STRING, FORMAT)
1010 Returns the number produced when STRING is interpreted according
1011 to format specifier FORMAT. Only the number of characters in
1012 STRING specified by FORMAT are examined. For example,
1013 `NUMBER("123", F3.0)' and `NUMBER("1234", F3.0)' both have value
1014 123. If STRING does not form a proper number, system-missing is
1015 returned without an error message.
1017 - Function: RINDEX (HAYSTACK, NEEDLE)
1018 Returns a positive integer indicating the position of the last
1019 occurrence of NEEDLE in HAYSTACK. Returns 0 if HAYSTACK does not
1020 contain NEEDLE. Returns system-missing if NEEDLE is an empty
1021 string.
1023 - Function: RINDEX (HAYSTACK, NEEDLE, DIVISOR)
1024 Divides NEEDLE into parts, each with length DIVISOR. Searches
1025 HAYSTACK for the last occurrence of each part, and returns the
1026 largest value. Returns 0 if HAYSTACK does not contain any part in
1027 NEEDLE. It is an error if the length of NEEDLE is not a multiple
1028 of DIVISOR. Returns system-missing if NEEDLE is an
1029 empty string.
1031 - Function: RPAD (STRING, LENGTH)
1032 If STRING is at least LENGTH characters in length, returns STRING
1033 unchanged. Otherwise, returns STRING padded with spaces on the
1034 right to length LENGTH. Returns an empty string if LENGTH is
1035 system-missing, negative, or greater than 255.
1037 - Function: RPAD (STRING, LENGTH, PADDING)
1038 If STRING is at least LENGTH characters in length, returns STRING
1039 unchanged. Otherwise, returns STRING padded with PADDING on the
1040 right to length LENGTH. Returns an empty string if LENGTH is
1041 system-missing, negative, or greater than 255, or if PADDING does
1042 not contain exactly one character.
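
The padding rules for LPAD and RPAD can be modelled in a few lines of Python. This is a sketch rather than PSPP's implementation: the names `lpad`, `rpad`, and `MAX_LENGTH` are hypothetical, `None` stands in for a system-missing LENGTH, and LENGTH is assumed to be an integer.

```python
MAX_LENGTH = 255  # maximum string length stated in the text

# Illustrative model only; None stands in for a system-missing LENGTH.

def _valid(length, padding):
    """True if LENGTH and PADDING satisfy the documented constraints."""
    return length is not None and 0 <= length <= MAX_LENGTH and len(padding) == 1


def lpad(string, length, padding=" "):
    """Pad on the left to the given length; invalid arguments give ''."""
    if not _valid(length, padding):
        return ""
    # str.rjust returns the string unchanged when it is already long enough.
    return string.rjust(length, padding)


def rpad(string, length, padding=" "):
    """Pad on the right to the given length; invalid arguments give ''."""
    if not _valid(length, padding):
        return ""
    return string.ljust(length, padding)
```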
1044 - Function: RTRIM (STRING)
1045 Returns STRING, after removing trailing spaces. Other types of
1046 whitespace are not removed.
1048 - Function: RTRIM (STRING, PADDING)
1049 Returns STRING, after removing trailing PADDING characters. If
1050 PADDING does not contain exactly one character, returns an empty
1051 string.
1053 - Function: STRING (NUMBER, FORMAT)
1054 Returns a string corresponding to NUMBER in the format given by
1055 format specifier FORMAT. For example, `STRING(123.56, F5.1)' has
1056 the value `"123.6"'.
1058 - Function: SUBSTR (STRING, START)
1059 Returns a string consisting of the value of STRING from position
1060 START onward. Returns an empty string if START is system-missing
1061 or has a value less than 1 or greater than the number of
1062 characters in STRING.
1064 - Function: SUBSTR (STRING, START, COUNT)
1065 Returns a string consisting of the first COUNT characters from
1066 STRING beginning at position START. Returns an empty string if
1067 START or COUNT is system-missing, if START is less than 1 or
1068 greater than the number of characters in STRING, or if COUNT is
1069 less than 1. Returns a string shorter than COUNT characters if
1070 START + COUNT - 1 is greater than the number of characters in
1071 STRING. Examples: `SUBSTR("abcdefg", 3, 2)' has value `"cd"';
1072 `SUBSTR("Ben Pfaff", 5, 10)' has the value `"Pfaff"'.
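
The SUBSTR rules, with their 1-based positions, can be checked against the examples above using a small Python model. The function name `substr` and the `_OMITTED` sentinel are hypothetical, and `None` stands in for system-missing; this is a sketch of the documented behavior, not PSPP's code.

```python
_OMITTED = object()  # distinguishes "no COUNT given" from a system-missing COUNT

# Illustrative model only; None stands in for PSPP's system-missing value.

def substr(string, start, count=_OMITTED):
    """Substring from 1-based position start, optionally limited to count chars."""
    if start is None or start < 1 or start > len(string):
        return ""
    if count is _OMITTED:
        return string[start - 1:]
    if count is None or count < 1:
        return ""
    # Slicing past the end simply yields a shorter string.
    return string[start - 1:start - 1 + count]


print(substr("abcdefg", 3, 2))     # cd
print(substr("Ben Pfaff", 5, 10))  # Pfaff
```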
1074 - Function: UPCASE (STRING)
1075 Returns STRING, changing lowercase letters to uppercase letters.
1078 File: pspp.info, Node: Time & Date, Next: Miscellaneous Functions, Prev: String Functions, Up: Functions
1080 Time & Date Functions
1081 ---------------------
1083 The legal range of dates for use in PSPP is 15 Oct 1582 through 31
1084 Dec 19999.
1086 *Please note:* Most time & date extraction functions will accept
1087 invalid arguments:
1089 * Negative numbers in PSPP time format.
1091 * Numbers less than 86,400 in PSPP date format.
1093 However, sensible results are not guaranteed for these invalid
1094 values. The given equivalents for these functions are definitely
1095 not guaranteed for invalid values.
1097 *Please note also:* The time & date construction functions *do*
1098 produce reasonable and useful results for out-of-range values;
1099 these are not considered invalid.
1103 * Time & Date Concepts:: How times & dates are defined and represented
1104 * Time Construction:: TIME.{DAYS HMS}
1105 * Time Extraction:: CTIME.{DAYS HOURS MINUTES SECONDS}
1106 * Date Construction:: DATE.{DMY MDY MOYR QYR WKYR YRDAY}
1107 * Date Extraction:: XDATE.{DATE HOUR JDAY MDAY MINUTE MONTH
1108 QUARTER SECOND TDAY TIME WEEK
1112 File: pspp.info, Node: Time & Date Concepts, Next: Time Construction, Prev: Time & Date, Up: Time & Date
1114 How times & dates are defined and represented
1115 .............................................
1117 Times and dates are handled by PSPP as single numbers. A "time" is
1118 an interval. PSPP measures times in seconds. Thus, the following
1119 intervals correspond with the numeric values given:
1123 1 day, 3 hours, 10 seconds 97,210
1125 10010 d, 14 min, 24 s 864,864,864
1127 A "date", on the other hand, is a particular instant in the past or
1128 the future. PSPP represents a date as a number of seconds after the
1129 midnight that began 14 Oct 1582 in the proleptic Gregorian calendar.
1130 (Historically, 15 Oct 1582 immediately followed 4 Oct 1582.) Thus, the midnights before
1131 the dates given below correspond with the numeric PSPP dates given:
1134 4 Jul 1776 6,113,318,400
1135 1 Jan 1900 10,010,390,400
1136 1 Oct 1978 12,495,427,200
1137 24 Aug 1995 13,028,601,600
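
The tabulated values can be reproduced with Python's `datetime` module, which uses the proleptic Gregorian calendar. In this sketch the name `pspp_date` is hypothetical, and taking 14 Oct 1582 as day zero is an inference from the figures in the table rather than something taken from PSPP's source.

```python
from datetime import date

# Assumption: day zero is 14 Oct 1582 (proleptic Gregorian), inferred from
# the tabulated values rather than from PSPP's source code.
EPOCH = date(1582, 10, 14)
SECONDS_PER_DAY = 24 * 60 * 60


def pspp_date(year, month, day):
    """Seconds from the epoch to the midnight before the given date."""
    return (date(year, month, day) - EPOCH).days * SECONDS_PER_DAY


for y, m, d in [(1776, 7, 4), (1900, 1, 1), (1978, 10, 1), (1995, 8, 24)]:
    print(f"{d:2d}/{m:02d}/{y}  {pspp_date(y, m, d):,}")
```

Under the same assumption, 15 Oct 1582 comes out as 86,400, which agrees with the statement that smaller numbers are not valid dates.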
1141 * A time may be added to, or subtracted from, a date, resulting in a
1142 date.
1144 * The difference of two dates may be taken, resulting in a time.
1146 * Two times may be added to, or subtracted from, each other,
1147 resulting in a time.
1149 (Adding two dates does not produce a useful result.)
1151 Since times and dates are merely numbers, the ordinary addition and
1152 subtraction operators are employed for these purposes.
1154 *Please note:* Many dates and times have extremely large
1155 values--just look at the values above. Thus, it is not a good
1156 idea to take powers of these values; also, the accuracy of some
1157 procedures may be affected. If necessary, convert times or dates
1158 in seconds to some other unit, like days or years, before
1159 performing analysis.
1162 File: pspp.info, Node: Time Construction, Next: Time Extraction, Prev: Time & Date Concepts, Up: Time & Date
1164 Functions that Produce Times
1165 ............................
1167 These functions take numeric arguments and produce numeric results in
1168 the PSPP time format.
1170 - Function: TIME.DAYS (NDAYS)
1171 Results in a time value corresponding to NDAYS days.
1172 (`TIME.DAYS(X)' is equivalent to `X * 60 * 60 * 24'.)
1174 - Function: TIME.HMS (NHOURS, NMINS, NSECS)
1175 Results in a time value corresponding to NHOURS hours, NMINS
1176 minutes, and NSECS seconds. (`TIME.HMS(H, M, S)' is equivalent to
1177 `H*60*60 + M*60 + S'.)
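
The equivalences given for TIME.DAYS and TIME.HMS translate directly into Python. The names `time_days` and `time_hms` are hypothetical stand-ins for illustration; the arithmetic is exactly the formulas quoted above.

```python
def time_days(ndays):
    """Model of TIME.DAYS: a number of days expressed in seconds."""
    return ndays * 60 * 60 * 24


def time_hms(nhours, nmins, nsecs):
    """Model of TIME.HMS: hours, minutes and seconds expressed in seconds."""
    return nhours * 60 * 60 + nmins * 60 + nsecs


# 1 day, 3 hours, 10 seconds
print(time_days(1) + time_hms(3, 0, 10))        # 97210
# 10010 days, 14 minutes, 24 seconds
print(time_days(10010) + time_hms(0, 14, 24))   # 864864864
```

Because times are plain numbers of seconds, two results can simply be added, as the two sums above do.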
1180 File: pspp.info, Node: Time Extraction, Next: Date Construction, Prev: Time Construction, Up: Time & Date
1182 Functions that Examine Times
1183 ............................
1185 These functions take numeric arguments in PSPP time format and give
1186 numeric results.
1188 - Function: CTIME.DAYS (TIME)
1189 Results in the number of days and fractional days in TIME.
1190 (`CTIME.DAYS(X)' is equivalent to `X/60/60/24'.)
1192 - Function: CTIME.HOURS (TIME)
1193 Results in the number of hours and fractional hours in TIME.
1194 (`CTIME.HOURS(X)' is equivalent to `X/60/60'.)
1196 - Function: CTIME.MINUTES (TIME)
1197 Results in the number of minutes and fractional minutes in TIME.
1198 (`CTIME.MINUTES(X)' is equivalent to `X/60'.)
1200 - Function: CTIME.SECONDS (TIME)
1201 Results in the number of seconds and fractional seconds in TIME.
1202 (`CTIME.SECONDS' does nothing; `CTIME.SECONDS(X)' is equivalent to
1203 `X'.)
1206 File: pspp.info, Node: Date Construction, Next: Date Extraction, Prev: Time Extraction, Up: Time & Date
1208 Functions that Produce Dates
1209 ............................
1211 These functions take numeric arguments and give numeric results in
1212 the PSPP date format. Arguments taken by these functions are:
1214 DAY
1215 Refers to a day of the month between 1 and 31.
1217 MONTH
1218 Refers to a month of the year between 1 and 12.
1220 QUARTER
1221 Refers to a quarter of the year between 1 and 4. The quarters of
1222 the year begin on the first days of months 1, 4, 7, and 10.
1224 WEEK
1225 Refers to a week of the year between 1 and 53.
1227 YDAY
1228 Refers to a day of the year between 1 and 366.
1230 YEAR
1231 Refers to a year between 1582 and 19999.
1233 If these functions' arguments are out-of-range, they are correctly
1234 normalized before conversion to date format. Non-integers are rounded
1237 - Function: DATE.DMY (DAY, MONTH, YEAR)
1238 - Function: DATE.MDY (MONTH, DAY, YEAR)
1239 Results in a date value corresponding to the midnight before day
1240 DAY of month MONTH of year YEAR.
1242 - Function: DATE.MOYR (MONTH, YEAR)
1243 Results in a date value corresponding to the midnight before the
1244 first day of month MONTH of year YEAR.
1246 - Function: DATE.QYR (QUARTER, YEAR)
1247 Results in a date value corresponding to the midnight before the
1248 first day of quarter QUARTER of year YEAR.
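
The construction functions DATE.DMY, DATE.MOYR, and DATE.QYR can be approximated in Python for in-range arguments. This is a sketch under assumptions: the function names are hypothetical, day zero is taken to be 14 Oct 1582 in the proleptic Gregorian calendar, and the normalization of out-of-range arguments is not modelled (Python's `date` raises `ValueError` instead).

```python
from datetime import date

# Assumption: day zero is 14 Oct 1582 in the proleptic Gregorian calendar.
EPOCH = date(1582, 10, 14)
SECONDS_PER_DAY = 24 * 60 * 60


def date_dmy(day, month, year):
    """Seconds from the epoch to the midnight before the given day."""
    return (date(year, month, day) - EPOCH).days * SECONDS_PER_DAY


def date_moyr(month, year):
    """Midnight before the first day of the given month."""
    return date_dmy(1, month, year)


def date_qyr(quarter, year):
    """Midnight before the first day of the quarter (months 1, 4, 7, 10)."""
    return date_moyr(3 * (quarter - 1) + 1, year)


# The fourth quarter of 1978 begins on 1 Oct 1978.
print(date_qyr(4, 1978) == date_dmy(1, 10, 1978))  # True
```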
1250 - Function: DATE.WKYR (WEEK, YEAR)
1251 Results in a date value corresponding to the midnight before the
1252 first day of week WEEK of year YEAR.
1254 - Function: DATE.YRDAY (YEAR, YDAY)
1255 Results in a date value corresponding to the midnight before day