This is pspp.info, produced by makeinfo version 4.0 from pspp.texi.

START-INFO-DIR-ENTRY
* PSPP: (pspp).             Statistical analysis package.
END-INFO-DIR-ENTRY

   PSPP, for statistical analysis of sampled data, by Ben Pfaff.

   This file documents PSPP, a statistical package for analysis of
sampled data that uses a command language compatible with SPSS.

   Copyright (C) 1996-9, 2000 Free Software Foundation, Inc.

   This version of the PSPP documentation is consistent with version 2
of "texinfo.tex".

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above condition for modified
versions, except that this permission notice may be stated in a
translation approved by the Free Software Foundation.


File: pspp.info,  Node: Input/Output Formats,  Next: Scratch Variables,  Prev: Sets of Variables,  Up: Variables

Input and Output Formats
------------------------

   Data that PSPP inputs and outputs must have one of a number of
formats.  These formats are described, in general, by a format
specification of the form `NAMEw.d', where NAME is the format name and
W is a field width.  D is the optional desired number of decimal
places, if appropriate.  If D is not included then it is assumed to be
0.  Some formats do not allow D to be specified.
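
   As an illustration only, the `NAMEw.d' form described above can be
split into its parts with a short Python sketch.  The helper name
`parse_format' and its regular expression are assumptions made for this
example and are not part of PSPP; the sketch does not check whether a
given format actually allows D.

```python
import re

# Hypothetical helper, not part of PSPP: splits a specification such as
# "F8.2" into its NAME, W, and D parts.
_SPEC = re.compile(r"^([A-Za-z]+)(\d+)(?:\.(\d+))?$")


def parse_format(spec):
    """Return (NAME, W, D) for a format specification string."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"not a format specification: {spec!r}")
    name, w, d = match.groups()
    # D is optional and is taken to be 0 when it is omitted.
    return name.upper(), int(w), int(d) if d is not None else 0
```

   For example, `parse_format("F8.2")' yields the name F, a width of 8,
and 2 decimal places.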

   When an input format is specified on DATA LIST or another command,
then it is converted to an output format for the purposes of PRINT and
other data output commands.  For most purposes, input and output
formats are the same; the salient differences are described below.

   Below are listed the input and output formats supported by PSPP.  If
an input format is mapped to a different output format by default, then
that mapping is indicated with =>.  Each format has the listed bounds
on input width (iw) and output width (ow).

   The standard numeric input and output formats are given in the
following table:

Fw.d: 1 <= iw,ow <= 40
     Standard decimal format with D decimal places.  If the number is
     too large to fit within the field width, it is expressed in
     scientific notation (`1.2+34') if w >= 6, with always at least two
     digits in the exponent.  When used as an input format, scientific
     notation is allowed but an E or an F must be used to introduce the
     exponent.

     The default output format is the same as the input format, except
     if D > 1.  In that case the output W is always made to be at least
     2 + D.

Ew.d: 1 <= iw <= 40; 6 <= ow <= 40
     For input this is equivalent to F format except that no E or F is
     required to introduce the exponent.  For output, produces scientific
     notation in the form `1.2+34'.  There are always at least two
     digits given in the exponent.

     The default output W is the largest of the input W, the input D +
     7, and 10.  The default output D is the input D, but at least 3.

COMMAw.d: 1 <= iw,ow <= 40
     Equivalent to F format, except that groups of three digits are
     comma-separated on output.  If the number is too large to express
     in the field width, then first commas are eliminated, then if
     there is still not enough space the number is expressed in
     scientific notation given that w >= 6.  Commas are allowed and
     ignored when this is used as an input format.

DOTw.d: 1 <= iw,ow <= 40
     Equivalent to COMMA format except that the roles of comma and
     decimal point are interchanged.  However: If SET /DECIMAL=DOT is
     in effect, then COMMA uses `,' for a decimal point and DOT uses
     `.' for a decimal point.

DOLLARw.d: 1 <= iw <= 40; 2 <= ow <= 40
     Equivalent to COMMA format, except that the number is prefixed by a
     dollar sign (`$') if there is room.  On input the value is allowed
     to be prefixed by a dollar sign, which is ignored.

     The default output W is the input W, but at least 2.

PCTw.d: 2 <= iw,ow <= 40
     Equivalent to F format, except that the number is suffixed by a
     percent sign (`%') if there is room.  On input the value is
     allowed to be suffixed by a percent sign, which is ignored.

     The default output W is the input W, but at least 2.

Nw.d: 1 <= iw,ow <= 40
     Only digits are allowed within the field width.  The decimal point
     is assumed to be D digits from the right margin.

     The default output format is F with the same W and D, except if D
     > 1.  In that case the output W is always made to be at least 2 +
     D.

Zw.d => F: 1 <= iw,ow <= 40
     Zoned decimal input.  If you need to use this then you know how.

IBw.d => F: 1 <= iw,ow <= 8
     Integer binary format.  The field is interpreted as a fixed-point
     positive or negative binary number in two's-complement notation.
     The location of the decimal point is implied.  Endianness is the
     same as the host machine.

     The default output format is F8.2 if D is 0.  Otherwise it is F,
     with output W as 9 + input D and output D as input D.

PIBw.d => F: 1 <= iw,ow <= 8
     Positive integer binary format.  The field is interpreted as a
     fixed-point positive binary number.  The location of the decimal
     point is implied.  Endianness is the same as the host machine.

     The default output format follows the rules for IB format.

Pw.d => F: 1 <= iw,ow <= 16
     Binary coded decimal format.  Each byte from left to right, except
     the rightmost, represents two digits.  The upper nibble of each
     byte is more significant.  The upper nibble of the final byte is
     the least significant digit.  The lower nibble of the final byte
     is the sign; a value of D represents a negative sign and all other
     values are considered positive.  The decimal point is implied.

     The default output format follows the rules for IB format.

PKw.d => F: 1 <= iw,ow <= 16
     Positive binary coded decimal format.  Same as P but the last byte
     is the same as the others.

     The default output format follows the rules for IB format.

RBw => F: 2 <= iw,ow <= 8
     Binary C architecture-dependent "double" format.  For a standard
     IEEE754 implementation W should be 8.

     The default output format follows the rules for IB format.

PIBHEXw.d => F: 2 <= iw,ow <= 16
     PIB format encoded as textual hex digit pairs.  W must be even.

     The input width is mapped to a default output width as follows:
     2=>4, 4=>6, 6=>9, 8=>11, 10=>14, 12=>16, 14=>18, 16=>21.  No
     allowances are made for decimal places.

RBHEXw => F: 4 <= iw,ow <= 16
     RB format encoded as textual hex digit pairs.  W must be even.

     The default output format is F8.2.

CCAw.d: 1 <= ow <= 40
CCBw.d: 1 <= ow <= 40
CCCw.d: 1 <= ow <= 40
CCDw.d: 1 <= ow <= 40
CCEw.d: 1 <= ow <= 40
     User-defined custom currency formats.  May not be used as an input
     format.  *Note SET::, for more details.
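
   The nibble layout given for P format in the table above can be
illustrated with a short Python sketch.  This is not PSPP's own code:
the function name `decode_p' is hypothetical, and rejecting nibbles
greater than 9 is an assumption, since the table does not say how such
fields are treated.

```python
from decimal import Decimal


def decode_p(field, d=0):
    """Decode a packed-decimal (P format) field, given as bytes."""
    if not field:
        raise ValueError("empty field")
    digits = []
    # Every byte except the rightmost holds two digits; the upper
    # nibble is the more significant one.
    for byte in field[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    # The final byte holds the least significant digit and the sign.
    digits.append(field[-1] >> 4)
    sign_nibble = field[-1] & 0x0F
    if any(digit > 9 for digit in digits):
        # Assumption: treat a non-decimal nibble as an error.
        raise ValueError("nibble is not a decimal digit")
    magnitude = int("".join(str(digit) for digit in digits))
    # The decimal point is implied D digits from the right.
    value = Decimal(magnitude).scaleb(-d)
    # A sign nibble of hex D is negative; all other values are positive.
    return -value if sign_nibble == 0x0D else value
```

   With this sketch, the two bytes hex 12 3D decode to -123, and the
bytes hex 12 3C with D equal to 1 decode to 12.3.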

   The date and time numeric input and output formats accept a number of
possible formats.  Before describing the formats themselves, some
definitions of the elements that make up their formats will be helpful:

"leader"
     All formats accept an optional whitespace leader.

"day"
     An integer between 1 and 31 representing the day of month.

"day-count"
     An integer representing a number of days.

"date-delimiter"
     One or more characters of whitespace or the following characters:
     `- / . ,'

"month"
     A month name in one of the following forms:
        * An integer between 1 and 12.

        * Roman numerals representing an integer between 1 and 12.

        * At least the first three characters of an English month name
          (January, February, ...).

"year"
     An integer year number between 1582 and 19999, or between 1 and
     199.  Years between 1 and 199 will have 1900 added.

"julian"
     A single number with a year number in the first 2, 3, or 4 digits
     (as above) and the day number within the year in the last 3 digits.

"quarter"
     An integer between 1 and 4 representing a quarter.

"q-delimiter"
     The letter `Q' or `q'.

"week"
     An integer between 1 and 53 representing a week within a year.

"wk-delimiter"
     The letters `wk' in any case.

"time-delimiter"
     At least one character of whitespace or `:' or `.'.

"hour"
     An integer greater than 0 representing an hour.

"minute"
     An integer between 0 and 59 representing a minute within an hour.

"opt-second"
     Optionally, a time-delimiter followed by a real number
     representing a number of seconds.

"hour24"
     An integer between 0 and 23 representing an hour within a day.

"weekday"
     At least the first two characters of an English day word.

"spaces"
     Any amount or no amount of whitespace.

"sign"
     An optional positive or negative sign.

"trailer"
     All formats accept an optional whitespace trailer.
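
   The "year" and "julian" elements defined above can be sketched in
Python as follows.  The function names are hypothetical and the sketch
is not taken from PSPP; it does not check that the day number is valid
for the year, because the definitions above do not state such a rule.

```python
def normalize_year(year):
    """Apply the year rule: 1 to 199 have 1900 added."""
    if 1 <= year <= 199:
        return year + 1900
    if 1582 <= year <= 19999:
        return year
    raise ValueError(f"year out of range: {year}")


def parse_julian(text):
    """Split a julian field into (year, day number within the year)."""
    text = text.strip()  # optional whitespace leader and trailer
    # The year occupies the first 2, 3, or 4 digits; the day number
    # occupies the last 3 digits.
    if not (text.isascii() and text.isdigit() and 5 <= len(text) <= 7):
        raise ValueError(f"not a julian date: {text!r}")
    return normalize_year(int(text[:-3])), int(text[-3:])
```

   Under these rules both `99032' and `1999032' denote day 32 of 1999.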

   The date input formats are strung together from the above pieces.  On
output, the date formats are always printed in a single canonical
manner, based on field width.  The date input and output formats are
described below:

DATEw: 9 <= iw,ow <= 40
     Date format.  Input format: leader + day + date-delimiter + month +
     date-delimiter + year + trailer.  Output format: DD-MMM-YY for W <
     11, DD-MMM-YYYY otherwise.

EDATEw: 8 <= iw,ow <= 40
     European date format.  Input format same as DATE.  Output format:
     DD.MM.YY for W < 10, DD.MM.YYYY otherwise.

SDATEw: 8 <= iw,ow <= 40
     Standard date format.  Input format: leader + year + date-delimiter
     + month + date-delimiter + day + trailer.  Output format: YY/MM/DD
     for W < 10, YYYY/MM/DD otherwise.

ADATEw: 8 <= iw,ow <= 40
     American date format.  Input format: leader + month +
     date-delimiter + day + date-delimiter + year + trailer.  Output
     format: MM/DD/YY for W < 10, MM/DD/YYYY otherwise.

JDATEw: 5 <= iw,ow <= 40
     Julian date format.  Input format: leader + julian + trailer.
     Output format: YYDDD for W < 7, YYYYDDD otherwise.

QYRw: 4 <= iw <= 40, 6 <= ow <= 40
     Quarter/year format.  Input format: leader + quarter + q-delimiter
     + year + trailer.  Output format: `Q Q YY', where the first `Q' is
     one of the digits 1, 2, 3, 4, if W < 8, `Q Q YYYY' otherwise.

MOYRw: 6 <= iw,ow <= 40
     Month/year format.  Input format: leader + month + date-delimiter
     + year + trailer.  Output format: `MMM YY' for W < 8, `MMM YYYY'
     otherwise.

WKYRw: 6 <= iw <= 40, 8 <= ow <= 40
     Week/year format.  Input format: leader + week + wk-delimiter +
     year + trailer.  Output format: `WW WK YY' for W < 10, `WW WK
     YYYY' otherwise.

DATETIMEw.d: 17 <= iw,ow <= 40
     Date and time format.  Input format: leader + day + date-delimiter
     + month + date-delimiter + year + time-delimiter + hour24 +
     time-delimiter + minute + opt-second.  Output format: `DD-MMM-YYYY
     HH:MM'.  If W > 19 then seconds `:SS' is added.  If W > 22 and D >
     0 then fractional seconds `.SS' are added.

TIMEw.d: 5 <= iw,ow <= 40
     Time format.  Input format: leader + sign + spaces + hour +
     time-delimiter + minute + opt-second.  Output format: `HH:MM'.
     Seconds and fractional seconds are available with W of at least 8
     and 10, respectively.

DTIMEw.d: 1 <= iw <= 40, 8 <= ow <= 40
     Time format with day count.  Input format: leader + sign + spaces +
     day-count + time-delimiter + hour + time-delimiter + minute +
     opt-second.  Output format: `DD HH:MM'.  Seconds and fractional
     seconds are available with W of at least 8 and 10, respectively.

WKDAYw: 2 <= iw,ow <= 40
     A weekday as a number between 1 and 7, where 1 is Sunday.  Input
     format: leader + weekday + trailer.  Output format: as many
     characters, in all capital letters, of the English name of the
     weekday as will fit in the field width.

MONTHw: 3 <= iw,ow <= 40
     A month as a number between 1 and 12, where 1 is January.  Input
     format: leader + month + trailer.  Output format: as many
     characters, in all capital letters, of the English name of the
     month as will fit in the field width.
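
   To show how the field width selects the output form, here is a
Python sketch of the DATE output rule given above.  The name
`format_date' is hypothetical, the upper-case month abbreviations are an
assumption, and no padding to the full field width is attempted.

```python
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def format_date(day, month, year, w):
    """Render a date in DATE output form for field width W."""
    if not 9 <= w <= 40:
        raise ValueError("DATE output width must be between 9 and 40")
    name = MONTHS[month - 1]
    if w < 11:
        # DD-MMM-YY: only the last two digits of the year fit.
        return f"{day:02d}-{name}-{year % 100:02d}"
    # DD-MMM-YYYY for a width of 11 or more.
    return f"{day:02d}-{name}-{year:04d}"
```

   The other date formats follow the same pattern with their own width
thresholds and delimiters.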

   There are only two formats that may be used with string variables:

Aw: 1 <= iw <= 255, 1 <= ow <= 254
     The entire field is treated as a string value.

AHEXw => A: 2 <= iw <= 254; 2 <= ow <= 510
     The field is composed of characters in a string encoded as textual
     hex digit pairs.

     The default output W is half the input W.
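
   The AHEX encoding can be illustrated in Python.  The function name
is hypothetical, and decoding the bytes as ASCII is an assumption made
for the example.

```python
def decode_ahex(field):
    """Decode a string written as textual hex digit pairs."""
    if len(field) % 2 != 0:
        raise ValueError("AHEX fields consist of hex digit pairs")
    # Each pair of hex digits encodes one character, so the decoded
    # string is half as long as the field.
    return bytes.fromhex(field).decode("ascii")
```

   A field of six hex digits therefore decodes to three characters,
which is why the default output W is half the input W.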


File: pspp.info,  Node: Scratch Variables,  Prev: Input/Output Formats,  Up: Variables

Scratch Variables
-----------------

   Most of the time, variables don't retain their values between cases.
Instead, either they're being read from a data file or the active file,
in which case they assume the value read, or, if created with COMPUTE or
another transformation, they're initialized to the system-missing value
or to blanks, depending on type.

   However, sometimes it's useful to have a variable that keeps its
value between cases.  You can do this with LEAVE (*note LEAVE::), or
you can use a "scratch variable".  Scratch variables are variables whose
names begin with an octothorpe (`#').

   Scratch variables have the same properties as variables left with
LEAVE: they retain their values between cases, and for the first case
they are initialized to 0 or blanks.  They have the additional property
that they are deleted before the execution of any procedure.  For this
reason, scratch variables can't be used for analysis.  To obtain the
same effect, use COMPUTE (*note COMPUTE::) to copy the scratch
variable's value into an ordinary variable, then analyze that variable.
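
   The behavior of a scratch variable can be modeled outside PSPP.  The
Python sketch below is only an analogy: `total' plays the role of a
scratch variable, and appending to `results' plays the role of copying
it into an ordinary variable with COMPUTE.  All names are hypothetical.

```python
def running_totals(values):
    """Model a scratch variable that accumulates a running total."""
    total = 0  # like a scratch variable: initialized to 0 for the first case
    results = []
    for x in values:  # one iteration per case
        total += x  # the value is retained from the previous case
        results.append(total)  # copy into an ordinary variable
    return results
```

   Given the cases 1, 2, 3 the copied values are 1, 3, 6.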


File: pspp.info,  Node: Files,  Next: BNF,  Prev: Variables,  Up: Language

Files Used by PSPP
==================

   PSPP makes use of many files each time it runs.  Some of these it
reads, some it writes, some it creates.  Here is a table listing the
most important of these files:

*command file*
*syntax file*
     These names (synonyms) refer to the file that contains
     instructions to PSPP that tell it what to do.  The syntax file's
     name is specified on the PSPP command line.  Syntax files can also
     be pulled in with the `INCLUDE' command.

*data file*
     Data files contain raw data in ASCII format suitable for being
     read in by the `DATA LIST' command.  Data can be embedded in the
     syntax file with `BEGIN DATA' and `END DATA' commands: this makes
     the syntax file a data file too.

*listing file*
     One or more output files are created by PSPP each time it is run.
     The output files receive the tables and charts produced by
     statistical procedures.  The output files may be in any number of
     formats, depending on how PSPP is configured.

*active file*
     The active file is the "file" on which all PSPP procedures are
     performed.  The active file contains variable definitions and
     cases.  The active file is not necessarily a disk file: it is
     stored in memory if there is room.


File: pspp.info,  Node: BNF,  Prev: Files,  Up: Language

Backus-Naur Form
================

   The syntax of some parts of the PSPP language is presented in this
manual using the formalism known as "Backus-Naur Form", or BNF.  The
following table describes BNF:

   * Words in all-uppercase are PSPP keyword tokens.  In BNF, these are
     often called "terminals".  There are some special terminals, which
     are actually written in lowercase for clarity:

    `number'
          A real number.

    `integer'
          An integer number.

    `string'
          A string.

    `var-name'
          A single variable name.

    `=', `/', `+', `-', etc.
          Operators and punctuators.

    `.'
          The terminal dot.  This is not necessarily an actual dot in
          the syntax file: *Note Commands::, for more details.

   * Other words in all lowercase refer to BNF definitions, called
     "productions".  These productions are also known as
     "nonterminals".  Some nonterminals are very common, so they are
     defined here in English for clarity:

    `var-list'
          A list of one or more variable names or the keyword `ALL'.

    `expression'
          An expression.  *Note Expressions::, for details.

   * `::=' means "is defined as".  The left side of `::=' gives the
     name of the nonterminal being defined.  The right side of `::='
     gives the definition of that nonterminal.  If the right side is
     empty, then one possible expansion of that nonterminal is nothing.
     A BNF definition is called a "production".

   * So, the key difference between a terminal and a nonterminal is
     that a terminal cannot be broken into smaller parts--in fact,
     every terminal is a single token (*note Tokens::).  On the other
     hand, nonterminals are composed of a (possibly empty) sequence of
     terminals and nonterminals.  Thus, terminals indicate the deepest
     level of syntax description.  (In parsing theory, terminals are
     the leaves of the parse tree; nonterminals form the branches.)

   * The first nonterminal defined in a set of productions is called the
     "start symbol".  The start symbol defines the entire syntax for
     that command.


File: pspp.info,  Node: Expressions,  Next: Data Input and Output,  Prev: Language,  Up: Top

Mathematical Expressions
************************

   Some PSPP commands use expressions, which share a common syntax
among all PSPP commands.  Expressions are made up of "operands", which
can be numbers, strings, or variable names, separated by "operators".
There are five types of operators: grouping, arithmetic, logical,
relational, and functions.

   Every operator takes one or more "arguments" as input and produces
or "returns" exactly one result as output.  Both strings and numeric
values can be used as arguments and are produced as results, but each
operator accepts only specific combinations of numeric and string values
as arguments.  With few exceptions, operator arguments may be
full-fledged expressions in themselves.

* Menu:

* Booleans::                       Boolean values.
* Missing Values in Expressions::  Using missing values in expressions.
* Grouping Operators::             ( )
* Arithmetic Operators::           + - * / **
* Logical Operators::              AND NOT OR
* Relational Operators::           EQ GE GT LE LT NE
* Functions::                      More-sophisticated operators.
* Order of Operations::            Operator precedence.


File: pspp.info,  Node: Booleans,  Next: Missing Values in Expressions,  Prev: Expressions,  Up: Expressions

Boolean values
==============

   There is a third type for arguments and results, the "Boolean" type,
which is used to represent true/false conditions.  Booleans have only
three possible values: 0 (false), 1 (true), and system-missing.
System-missing is neither true nor false.

   * A numeric expression that has value 0, 1, or system-missing may be
     used in place of a Boolean.  Thus, the expression `0 AND 1' is
     valid (although it is always false).

   * A numeric expression with any other value will cause an error if
     it is used as a Boolean.  So, `2 OR 3' is invalid.

   * A Boolean expression may not be used in place of a numeric
     expression.  Thus, `(1>2) + (3<4)' is invalid.

   * Strings and Booleans are not compatible, and neither may be used in
     place of the other.
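
   The rules above can be summarized in a Python sketch, with `None'
standing in for system-missing.  The function name `as_boolean' and the
choice of exception types are assumptions made for the example.

```python
def as_boolean(value):
    """Interpret a numeric value as a Boolean, or raise an error."""
    if value is None:  # None stands in for system-missing
        return None
    if isinstance(value, str):
        # Strings and Booleans are not compatible.
        raise TypeError("a string cannot be used as a Boolean")
    if value == 0:
        return False
    if value == 1:
        return True
    # Any other numeric value is an error when used as a Boolean.
    raise ValueError(f"{value!r} is not a valid Boolean value")
```

   Here `as_boolean(2)' raises an error, matching the rule that `2 OR
3' is invalid.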


File: pspp.info,  Node: Missing Values in Expressions,  Next: Grouping Operators,  Prev: Booleans,  Up: Expressions

Missing Values in Expressions
=============================

   String missing values are not treated specially in expressions.  Most
numeric operators return system-missing when given system-missing
arguments.  Exceptions are listed under particular operator
descriptions.

   User-missing values for numeric variables are always transformed into
the system-missing value, except inside the arguments to the `VALUE',
`SYSMIS', and `MISSING' functions.

   The missing-value functions can be used to precisely control how
missing values are treated in expressions.  *Note Missing Value
Functions::, for more details.


File: pspp.info,  Node: Grouping Operators,  Next: Arithmetic Operators,  Prev: Missing Values in Expressions,  Up: Expressions

Grouping Operators
==================

   Parentheses (`()') are the grouping operators.  Surround an
expression with parentheses to force early evaluation.

   Parentheses also surround the arguments to functions, but in that
situation they act as punctuators, not as operators.


File: pspp.info,  Node: Arithmetic Operators,  Next: Logical Operators,  Prev: Grouping Operators,  Up: Expressions

Arithmetic Operators
====================

   The arithmetic operators take numeric arguments and produce numeric
results.

`A + B'
     Adds A and B, returning the sum.

`A - B'
     Subtracts B from A, returning the difference.

`A * B'
     Multiplies A and B, returning the product.

`A / B'
     Divides A by B, returning the quotient.  If B is zero, the result
     is system-missing.

`A ** B'
     Returns the result of raising A to the power B.  If A is negative
     and B is not an integer, the result is system-missing.  The result
     of `0**0' is system-missing as well.

`- A'
     Reverses the sign of A.
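
   The special cases for division and exponentiation can be sketched in
Python, with `None' standing in for system-missing.  The function names
are hypothetical.  Returning system-missing for zero raised to a
negative power, or on overflow, is an assumption that the text above
does not confirm.

```python
def divide(a, b):
    """A / B, with system-missing (None) when B is zero."""
    if a is None or b is None or b == 0:
        return None
    return a / b


def power(a, b):
    """A ** B with the special cases described above."""
    if a is None or b is None:
        return None
    if a == 0 and b == 0:
        return None  # 0**0 is system-missing
    if a < 0 and not float(b).is_integer():
        return None  # negative base with a non-integer exponent
    try:
        return a ** b
    except (ZeroDivisionError, OverflowError):
        # Assumption: cases Python cannot evaluate are treated as missing.
        return None
```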


File: pspp.info,  Node: Logical Operators,  Next: Relational Operators,  Prev: Arithmetic Operators,  Up: Expressions

Logical Operators
=================

   The logical operators take logical arguments and produce logical
results, meaning "true or false".  PSPP logical operators are not true
Boolean operators because they may also result in a system-missing
value.

`A AND B'
`A & B'
     True if both A and B are true.  However, if one argument is false
     and the other is missing, the result is false, not missing.  If
     both arguments are missing, the result is missing.

`A OR B'
`A | B'
     True if at least one of A and B is true.  If one argument is true
     and the other is missing, the result is true, not missing.  If both
     arguments are missing, the result is missing.

`NOT A'
`~ A'
     True if A is false.
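
   These truth tables can be written out as a Python sketch, with `None'
standing in for system-missing.  The function names are hypothetical.
Two cases are assumptions, because the descriptions above do not state
them: a true argument combined with a missing one under AND (and a false
one under OR) is taken to be missing, and NOT of a missing value is
taken to be missing.

```python
def and3(a, b):
    """Three-valued AND over True, False, and None (missing)."""
    if a is False or b is False:
        return False  # false wins even if the other argument is missing
    if a is None or b is None:
        return None  # assumption: true AND missing is missing
    return True


def or3(a, b):
    """Three-valued OR over True, False, and None (missing)."""
    if a is True or b is True:
        return True  # true wins even if the other argument is missing
    if a is None or b is None:
        return None  # assumption: false OR missing is missing
    return False


def not3(a):
    """Three-valued NOT; assumes NOT of missing is missing."""
    return None if a is None else not a
```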


File: pspp.info,  Node: Relational Operators,  Next: Functions,  Prev: Logical Operators,  Up: Expressions

Relational Operators
====================

   The relational operators take numeric or string arguments and
produce Boolean results.

   Note that, with numeric arguments, PSPP does not make exact
relational tests.  Instead, two numbers are considered to be equal even
if they differ by a small amount.  This amount, "epsilon", is dependent
on the PSPP configuration and determined at compile time.  (The default
value is 0.000000001, or `10**(-9)'.)  Use of epsilon allows for
round-off errors.  Use of epsilon is also idiotic, but the author is
not a numeric analyst.

   Strings cannot be compared to numbers.  When strings of different
lengths are compared, the shorter string is right-padded with spaces to
match the length of the longer string.

   The results of string comparisons, other than tests for equality or
inequality, are dependent on the character set in use.  String
comparisons are case-sensitive.

`A EQ B'
`A = B'
     True if A is equal to B.

`A LE B'
`A <= B'
     True if A is less than or equal to B.

`A LT B'
`A < B'
     True if A is less than B.

`A GE B'
`A >= B'
     True if A is greater than or equal to B.

`A GT B'
`A > B'
     True if A is greater than B.

`A NE B'
`A ~= B'
`A <> B'
     True if A is not equal to B.
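
   The two comparison rules described in this section, the epsilon test
for numbers and space padding for strings, can be sketched in Python.
The names are hypothetical, the epsilon is the default value quoted
above, and the use of a strict `<' in the numeric test is an assumption.

```python
EPSILON = 1e-9  # the default value quoted above


def num_eq(a, b):
    """Numeric equality within epsilon (assumes a strict comparison)."""
    return abs(a - b) < EPSILON


def str_eq(a, b):
    """String equality after right-padding the shorter string."""
    width = max(len(a), len(b))
    # Comparison remains case-sensitive after padding with spaces.
    return a.ljust(width) == b.ljust(width)
```

   With this sketch, `num_eq(0.1 + 0.2, 0.3)' is true even though the
two floating-point values are not bit-for-bit identical.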


File: pspp.info,  Node: Functions,  Next: Order of Operations,  Prev: Relational Operators,  Up: Expressions

Functions
=========

   PSPP functions provide mathematical abilities above and beyond those
possible using simple operators.  Functions have a common syntax: each
is composed of a function name followed by a left parenthesis, one or
more arguments, and a right parenthesis.  Function names are *not*
reserved; their names are specially treated only when followed by a
left parenthesis: `EXP(10)' refers to the constant value `e' raised to
the 10th power, but `EXP' by itself refers to the value of variable EXP.

   The sections below describe each function in detail.

* Menu:

* Advanced Mathematics::        EXP LG10 LN SQRT
* Miscellaneous Mathematics::   ABS MOD MOD10 RND TRUNC
* Trigonometry::                ACOS ARCOS ARSIN ARTAN ASIN ATAN COS SIN TAN
* Missing Value Functions::     MISSING NMISS NVALID SYSMIS VALUE
* Pseudo-Random Numbers::       NORMAL UNIFORM
* Set Membership::              ANY RANGE
* Statistical Functions::       CFVAR MAX MEAN MIN SD SUM VARIANCE
* String Functions::            CONCAT INDEX LENGTH LOWER LPAD LTRIM NUMBER
                                RINDEX RPAD RTRIM STRING SUBSTR UPCASE
* Time & Date::                 CTIME.xxx DATE.xxx TIME.xxx XDATE.xxx
* Miscellaneous Functions::     LAG YRMODA
* Functions Not Implemented::   CDF.xxx CDFNORM IDF.xxx NCDF.xxx PROBIT RV.xxx


File: pspp.info,  Node: Advanced Mathematics,  Next: Miscellaneous Mathematics,  Prev: Functions,  Up: Functions

Advanced Mathematical Functions
-------------------------------

   Advanced mathematical functions take numeric arguments and produce
numeric results.

 - Function:  EXP (EXPONENT)
     Returns e (approximately 2.71828) raised to power EXPONENT.

 - Function:  LG10 (NUMBER)
     Takes the base-10 logarithm of NUMBER.  If NUMBER is not positive,
     the result is system-missing.

 - Function:  LN (NUMBER)
     Takes the base-`e' logarithm of NUMBER.  If NUMBER is not
     positive, the result is system-missing.

 - Function:  SQRT (NUMBER)
     Takes the square root of NUMBER.  If NUMBER is negative, the
     result is system-missing.


File: pspp.info,  Node: Miscellaneous Mathematics,  Next: Trigonometry,  Prev: Advanced Mathematics,  Up: Functions

Miscellaneous Mathematical Functions
------------------------------------

   Miscellaneous mathematical functions take numeric arguments and
produce numeric results.

 - Function:  ABS (NUMBER)
     Results in the absolute value of NUMBER.

 - Function:  MOD (NUMERATOR, DENOMINATOR)
     Returns the remainder (modulus) of NUMERATOR divided by
     DENOMINATOR.  If DENOMINATOR is 0, the result is system-missing.
     However, if NUMERATOR is 0 and DENOMINATOR is system-missing, the
     result is 0.

 - Function:  MOD10 (NUMBER)
     Returns the remainder when NUMBER is divided by 10.  If NUMBER is
     negative, MOD10(NUMBER) is negative or zero.

 - Function:  RND (NUMBER)
     Takes the absolute value of NUMBER and rounds it to an integer.
     Then, if NUMBER was negative originally, negates the result.

 - Function:  TRUNC (NUMBER)
     Discards the fractional part of NUMBER; that is, rounds NUMBER
     towards zero.
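
   A Python sketch of these functions follows, with `None' standing in
for system-missing.  The names are hypothetical.  Two details are
assumptions: the remainder is computed with `math.fmod', which keeps the
sign of the first argument, and RND rounds halves away from zero.

```python
import math


def mod(numerator, denominator):
    """Remainder of NUMERATOR divided by DENOMINATOR."""
    if numerator == 0 and denominator is None:
        return 0  # special case: 0 with a missing denominator
    if numerator is None or denominator is None or denominator == 0:
        return None
    return math.fmod(numerator, denominator)


def mod10(number):
    """Remainder when NUMBER is divided by 10; sign follows NUMBER."""
    return None if number is None else math.fmod(number, 10)


def rnd(number):
    """Round the absolute value, then restore the original sign."""
    if number is None:
        return None
    rounded = math.floor(abs(number) + 0.5)  # assumption: halves round up
    return -rounded if number < 0 else rounded


def trunc(number):
    """Discard the fractional part, rounding towards zero."""
    return None if number is None else math.trunc(number)
```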


File: pspp.info,  Node: Trigonometry,  Next: Missing Value Functions,  Prev: Miscellaneous Mathematics,  Up: Functions

Trigonometric Functions
-----------------------

   Trigonometric functions take numeric arguments and produce numeric
results.

 - Function:  ACOS (NUMBER)
 - Function:  ARCOS (NUMBER)
     Takes the arccosine, in radians, of NUMBER.  Results in
     system-missing if NUMBER is not between -1 and 1 inclusive.
     Portability: none.

 - Function:  ARSIN (NUMBER)
     Takes the arcsine, in radians, of NUMBER.  Results in
     system-missing if NUMBER is not between -1 and 1 inclusive.

 - Function:  ARTAN (NUMBER)
     Takes the arctangent, in radians, of NUMBER.

 - Function:  ASIN (NUMBER)
     Takes the arcsine, in radians, of NUMBER.  Results in
     system-missing if NUMBER is not between -1 and 1 inclusive.
     Portability: none.

 - Function:  ATAN (NUMBER)
     Takes the arctangent, in radians, of NUMBER.

     *Please note:* Use of the AR* group of inverse trigonometric
     functions is recommended over the A* group because they are more
     portable.

 - Function:  COS (RADIANS)
     Takes the cosine of RADIANS.

 - Function:  SIN (ANGLE)
     Takes the sine of ANGLE, which is given in radians.

 - Function:  TAN (ANGLE)
     Takes the tangent of ANGLE, which is given in radians.  Results in
     system-missing at values of ANGLE that are too close to odd
     multiples of pi/2.  Portability: none.


File: pspp.info,  Node: Missing Value Functions,  Next: Pseudo-Random Numbers,  Prev: Trigonometry,  Up: Functions

Missing-Value Functions
-----------------------

   Missing-value functions take various types as arguments, returning
various types of results.

 - Function:  MISSING (VARIABLE OR EXPRESSION)
     The argument may be a single variable name or an expression.  If
     it is a variable name, results in 1 if the variable has a
     user-missing or system-missing value for the current case, 0
     otherwise.  If it is an expression, results in 1 if the
     expression has the system-missing value, 0 otherwise.

          *Please note:* If the argument is a string expression other
          than a variable name, MISSING is guaranteed to return 0,
          because strings do not have a system-missing value.  Also,
          when using a numeric expression argument, remember that
          user-missing values are converted to the system-missing value
          in most contexts.  Thus, the expressions `MISSING(VAR1 OP
          VAR2)' and `MISSING(VAR1) OR MISSING(VAR2)' are often
          equivalent, depending on the specific operator OP used.

 - Function:  NMISS (EXPR [, EXPR]...)
     Each argument must be a numeric expression.  Returns the number of
     user- or system-missing values in the list.  As a special
     extension, the syntax `VAR1 TO VAR2' may be used to refer to a
     range of variables; see *Note Sets of Variables::, for more
     details.

 - Function:  NVALID (EXPR [, EXPR]...)
     Each argument must be a numeric expression.  Returns the number of
     values in the list that are not user- or system-missing.  As a
     special extension, the syntax `VAR1 TO VAR2' may be used to refer
     to a range of variables; see *Note Sets of Variables::, for more
     details.

 - Function:  SYSMIS (VARIABLE OR EXPRESSION)
     When given the name of a numeric variable, returns 1 if the value
     of that variable is system-missing.  Otherwise, if the value is not
     missing or if it is user-missing, returns 0.  If given the name of
     a string variable, always returns 1.  If given an expression other
     than a single variable name, results in 1 if the value is system-
     or user-missing, 0 otherwise.

 - Function:  VALUE (VARIABLE)
     Prevents the user-missing values of VARIABLE from being
     transformed into system-missing values: If VARIABLE is not system-
     or user-missing, results in the value of VARIABLE.  If VARIABLE is
     user-missing, results in the value of VARIABLE anyway.  If
     VARIABLE is system-missing, results in system-missing.
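
   The distinction between user-missing and system-missing values can
be modeled in Python.  This sketch covers numeric variables only and is
not PSPP's implementation: the `Var' class, the lower-case function
names, and the use of `None' for system-missing are all assumptions made
for illustration.

```python
from dataclasses import dataclass


@dataclass
class Var:
    """A numeric variable's value for the current case."""
    value: object = None  # None stands in for system-missing
    user_missing: frozenset = frozenset()  # declared user-missing codes


def sysmis(var):
    """1 if the variable is system-missing, 0 otherwise."""
    return 1 if var.value is None else 0


def missing(var):
    """1 if the variable is user-missing or system-missing."""
    return 1 if var.value is None or var.value in var.user_missing else 0


def value(var):
    """The variable's value, even when it is user-missing."""
    return var.value


def in_expression(var):
    """What an ordinary expression sees: user-missing becomes missing."""
    return None if missing(var) else var.value


def nmiss(*variables):
    """Number of user- or system-missing values in the list."""
    return sum(missing(var) for var in variables)


def nvalid(*variables):
    """Number of values that are neither user- nor system-missing."""
    return len(variables) - nmiss(*variables)
```

   For a variable holding 99 where 99 is declared user-missing,
`missing' gives 1, `sysmis' gives 0, and `value' still gives 99.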


File: pspp.info,  Node: Pseudo-Random Numbers,  Next: Set Membership,  Prev: Missing Value Functions,  Up: Functions

Pseudo-Random Number Generation Functions
-----------------------------------------

   Pseudo-random number generation functions take numeric arguments and
produce numeric results.

   The system's C library random generator is used as a basis for
generating random numbers, since random number generation is a
system-dependent task.  However, Knuth's Algorithm B is used to shuffle
the resultant values, which is enough to make even a stream of
consecutive integers random enough for most applications.

   (If you're worried about the quality of the random number generator,
well, you're using a statistical processing package--analyze it!)

 - Function:  NORMAL (NUMBER)
     Results in a random number.  Results from `NORMAL' are normally
     distributed with a mean of 0 and a standard deviation of NUMBER.

 - Function:  UNIFORM (NUMBER)
     Results in a random number between 0 and NUMBER.  Results from
     `UNIFORM' are evenly distributed across its entire range.  There
     may be a maximum on the largest random number ever generated--this
     is often 2**31-1 (2,147,483,647), but it may be orders of magnitude
     higher or lower.


File: pspp.info,  Node: Set Membership,  Next: Statistical Functions,  Prev: Pseudo-Random Numbers,  Up: Functions

Set-Membership Functions
------------------------

   Set membership functions determine whether a value is a member of a
set.  They take a set of numeric arguments or a set of string
arguments, and produce Boolean results.

   String comparisons are performed according to the rules given in
*Note Relational Operators::.

 - Function:  ANY (VALUE, SET [, SET]...)
     Results in true if VALUE is equal to any of the SET values.
     Otherwise, results in false.  If VALUE is system-missing, returns
     system-missing.  System-missing values in SET do not cause ANY to
     return system-missing.

 - Function:  RANGE (VALUE, LOW, HIGH [, LOW, HIGH]...)
     Results in true if VALUE is in any of the intervals bounded by LOW
     and HIGH inclusive.  Otherwise, results in false.  Each LOW must
     be less than or equal to its corresponding HIGH value.  LOW and
     HIGH must be given in pairs.  If VALUE is system-missing, returns
     system-missing.  System-missing values in LOW or HIGH do not cause
     RANGE to return system-missing.
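
     The behavior of ANY and RANGE described above can be modeled
     with a short Python sketch.  It is an illustration only:
     `None' stands in for the system-missing value, the names `any_'
     and `range_' are hypothetical, and treating a missing candidate
     or bound as "no match" is an assumption.

```python
SYSMIS = None  # stand-in for PSPP's system-missing value (assumption)

def any_(value, *candidates):
    """Model of ANY: true if value equals any candidate."""
    if value is SYSMIS:
        return SYSMIS
    # Missing candidates are skipped rather than propagated.
    return any(c is not SYSMIS and c == value for c in candidates)

def range_(value, *bounds):
    """Model of RANGE: true if value lies in any inclusive [low, high]."""
    if not bounds or len(bounds) % 2:
        raise ValueError("LOW and HIGH must be given in pairs")
    if value is SYSMIS:
        return SYSMIS
    pairs = zip(bounds[0::2], bounds[1::2])
    # A pair with a missing bound is assumed never to match.
    return any(lo is not SYSMIS and hi is not SYSMIS and lo <= value <= hi
               for lo, hi in pairs)
```

     Under this model, `any_(3, 1, 2, 3)' is true and
     `range_(3.5, 1, 3, 4, 6)' is false, since 3.5 falls between the
     two intervals.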


File: pspp.info,  Node: Statistical Functions,  Next: String Functions,  Prev: Set Membership,  Up: Functions

Statistical Functions
---------------------

   Statistical functions compute descriptive statistics on a list of
values.  Some statistics can be computed on numeric or string values;
others can only be computed on numeric values.  They result in the same
type as their arguments.

   With statistical functions it is possible to specify a minimum
number of non-missing arguments for the function to be evaluated.  To
do so, append a dot and the number to the function name.  For instance,
to specify a minimum of three valid arguments to the MEAN function, use
the name `MEAN.3'.
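
   The effect of the minimum-count suffix can be sketched in Python.
This is a model, not PSPP code: `mean_n' is a hypothetical name, `None'
stands in for a missing value, and returning system-missing when too
few arguments are valid is an assumption about what "not evaluated"
means.

```python
from statistics import mean

SYSMIS = None  # stand-in for a missing value (assumption)

def mean_n(min_valid, *args):
    """Model of MEAN.n: average the valid arguments, but only when at
    least `min_valid` of them are non-missing."""
    valid = [a for a in args if a is not SYSMIS]
    if len(valid) < min_valid:
        return SYSMIS
    return mean(valid)
```

   Here `mean_n(3, 1, 2, None)' models `MEAN.3(A, B, C)' with one
missing argument and yields the missing value, while
`mean_n(3, 1, 2, 3, None)' has three valid arguments and yields their
mean.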

 - Function:  CFVAR (NUMBER, NUMBER[, ...])
     Results in the coefficient of variation of the values of NUMBER.
     This function requires at least two valid arguments to give a
     non-missing result.  (The coefficient of variation is the standard
     deviation divided by the mean.)

 - Function:  MAX (VALUE, VALUE[, ...])
     Results in the value of the greatest VALUE.  The VALUEs may be
     numeric or string.  Although at least two arguments must be given,
     only one need be valid for MAX to give a non-missing result.

 - Function:  MEAN (NUMBER, NUMBER[, ...])
     Results in the mean of the values of NUMBER.  Although at least
     two arguments must be given, only one need be valid for MEAN to
     give a non-missing result.

 - Function:  MIN (VALUE, VALUE[, ...])
     Results in the value of the least VALUE.  The VALUEs may be
     numeric or string.  Although at least two arguments must be given,
     only one need be valid for MIN to give a non-missing result.

 - Function:  SD (NUMBER, NUMBER[, ...])
     Results in the standard deviation of the values of NUMBER.  This
     function requires at least two valid arguments to give a
     non-missing result.

 - Function:  SUM (NUMBER, NUMBER[, ...])
     Results in the sum of the values of NUMBER.  Although at least two
     arguments must be given, only one need be valid for SUM to give a
     non-missing result.

 - Function:  VAR (NUMBER, NUMBER[, ...])
     Results in the variance of the values of NUMBER.  This function
     requires at least two valid arguments to give a non-missing result.

 - Function:  VARIANCE (NUMBER, NUMBER[, ...])
     Results in the variance of the values of NUMBER.  This function
     requires at least two valid arguments to give a non-missing result.
     (Use VAR in preference to VARIANCE for reasons of portability.)


File: pspp.info,  Node: String Functions,  Next: Time & Date,  Prev: Statistical Functions,  Up: Functions

String Functions
----------------

   String functions take various arguments and return various results.

 - Function:  CONCAT (STRING, STRING[, ...])
     Returns a string consisting of each STRING in sequence.
     `CONCAT("abc", "def", "ghi")' has a value of `"abcdefghi"'.  The
     resultant string is truncated to a maximum of 255 characters.

 - Function:  INDEX (HAYSTACK, NEEDLE)
     Returns a positive integer indicating the position of the first
     occurrence of NEEDLE in HAYSTACK.  Returns 0 if HAYSTACK does not
     contain NEEDLE.  Returns system-missing if NEEDLE is an empty
     string.

 - Function:  INDEX (HAYSTACK, NEEDLE, DIVISOR)
     Divides NEEDLE into parts, each with length DIVISOR.  Searches
     HAYSTACK for the first occurrence of each part, and returns the
     smallest value.  Returns 0 if HAYSTACK does not contain any part
     in NEEDLE.  It is an error if the length of NEEDLE is not a
     multiple of DIVISOR.  Returns system-missing if NEEDLE is an
     empty string.
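
     The two forms of INDEX can be modeled with the following Python
     sketch.  It is illustrative only: `index' is a hypothetical
     name, `None' stands in for system-missing, and "the smallest
     value" is assumed to mean the smallest position among the parts
     that were actually found.

```python
def index(haystack, needle, divisor=None):
    """Model of INDEX: 1-based position of the first match, 0 if none."""
    if needle == "":
        return None                      # system-missing stand-in
    if divisor is None:
        divisor = len(needle)            # two-argument form: one part
    if divisor <= 0 or len(needle) % divisor:
        raise ValueError("length of NEEDLE must be a multiple of DIVISOR")
    # Split NEEDLE into equal parts of length DIVISOR.
    parts = [needle[i:i + divisor] for i in range(0, len(needle), divisor)]
    # str.find returns -1 when absent, so +1 gives 0 for "not found".
    positions = [haystack.find(part) + 1 for part in parts]
    found = [p for p in positions if p > 0]
    return min(found) if found else 0
```

     For example, `index("abcdef", "fc", 1)' searches for "f" and
     "c" separately and yields 3, the position of "c".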

 - Function:  LENGTH (STRING)
     Returns the number of characters in STRING.

 - Function:  LOWER (STRING)
     Returns a string identical to STRING except that all uppercase
     letters are changed to lowercase letters.  The definitions of
     "uppercase" and "lowercase" are system-dependent.

 - Function:  LPAD (STRING, LENGTH)
     If STRING is at least LENGTH characters in length, returns STRING
     unchanged.  Otherwise, returns STRING padded with spaces on the
     left side to length LENGTH.  Returns an empty string if LENGTH is
     system-missing, negative, or greater than 255.

 - Function:  LPAD (STRING, LENGTH, PADDING)
     If STRING is at least LENGTH characters in length, returns STRING
     unchanged.  Otherwise, returns STRING padded with PADDING on the
     left side to length LENGTH.  Returns an empty string if LENGTH is
     system-missing, negative, or greater than 255, or if PADDING does
     not contain exactly one character.

 - Function:  LTRIM (STRING)
     Returns STRING, after removing leading spaces.  Other whitespace,
     such as tabs, carriage returns, line feeds, and vertical tabs, is
     not removed.

 - Function:  LTRIM (STRING, PADDING)
     Returns STRING, after removing leading PADDING characters.  If
     PADDING does not contain exactly one character, returns an empty
     string.

 - Function:  NUMBER (STRING)
     Returns the number produced when STRING is interpreted according
     to format FX.0, where X is the number of characters in STRING.  If
     STRING does not form a proper number, system-missing is returned
     without an error message.  Portability: none.

 - Function:  NUMBER (STRING, FORMAT)
     Returns the number produced when STRING is interpreted according
     to format specifier FORMAT.  Only the number of characters in
     STRING specified by FORMAT are examined.  For example,
     `NUMBER("123", F3.0)' and `NUMBER("1234", F3.0)' both have value
     123.  If STRING does not form a proper number, system-missing is
     returned without an error message.

 - Function:  RINDEX (HAYSTACK, NEEDLE)
     Returns a positive integer indicating the position of the last
     occurrence of NEEDLE in HAYSTACK.  Returns 0 if HAYSTACK does not
     contain NEEDLE.  Returns system-missing if NEEDLE is an empty
     string.

 - Function:  RINDEX (HAYSTACK, NEEDLE, DIVISOR)
     Divides NEEDLE into parts, each with length DIVISOR.  Searches
     HAYSTACK for the last occurrence of each part, and returns the
     largest value.  Returns 0 if HAYSTACK does not contain any part in
     NEEDLE.  It is an error if the length of NEEDLE is not a multiple
     of DIVISOR.  Returns system-missing if NEEDLE is an empty
     string.

 - Function:  RPAD (STRING, LENGTH)
     If STRING is at least LENGTH characters in length, returns STRING
     unchanged.  Otherwise, returns STRING padded with spaces on the
     right to length LENGTH.  Returns an empty string if LENGTH is
     system-missing, negative, or greater than 255.

 - Function:  RPAD (STRING, LENGTH, PADDING)
     If STRING is at least LENGTH characters in length, returns STRING
     unchanged.  Otherwise, returns STRING padded with PADDING on the
     right to length LENGTH.  Returns an empty string if LENGTH is
     system-missing, negative, or greater than 255, or if PADDING does
     not contain exactly one character.
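
     The padding rules for LPAD and RPAD can be summarized in a small
     Python model.  The names `lpad' and `rpad' are hypothetical
     stand-ins, and `None' represents a system-missing LENGTH; this
     is a sketch of the documented rules, not PSPP's code.

```python
def _pad(string, length, padding, left):
    # An unusable LENGTH or PADDING yields an empty string.
    if length is None or length < 0 or length > 255 or len(padding) != 1:
        return ""
    # rjust/ljust return the string unchanged when it is already long enough.
    if left:
        return string.rjust(length, padding)
    return string.ljust(length, padding)

def lpad(string, length, padding=" "):
    """Model of LPAD: pad on the left to `length` characters."""
    return _pad(string, length, padding, left=True)

def rpad(string, length, padding=" "):
    """Model of RPAD: pad on the right to `length` characters."""
    return _pad(string, length, padding, left=False)
```

     Under this model `lpad("abc", 5, "*")' gives `"**abc"' and
     `rpad("abc", 2)' gives `"abc"' unchanged.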

 - Function:  RTRIM (STRING)
     Returns STRING, after removing trailing spaces.  Other types of
     whitespace are not removed.

 - Function:  RTRIM (STRING, PADDING)
     Returns STRING, after removing trailing PADDING characters.  If
     PADDING does not contain exactly one character, returns an empty
     string.

 - Function:  STRING (NUMBER, FORMAT)
     Returns a string corresponding to NUMBER in the format given by
     format specifier FORMAT.  For example, `STRING(123.56, F5.1)' has
     the value `"123.6"'.

 - Function:  SUBSTR (STRING, START)
     Returns a string consisting of the value of STRING from position
     START onward.  Returns an empty string if START is system-missing
     or has a value less than 1 or greater than the number of
     characters in STRING.

 - Function:  SUBSTR (STRING, START, COUNT)
     Returns a string consisting of the first COUNT characters from
     STRING beginning at position START.  Returns an empty string if
     START or COUNT is system-missing, if START is less than 1 or
     greater than the number of characters in STRING, or if COUNT is
     less than 1.  Returns a string shorter than COUNT characters if
     START + COUNT - 1 is greater than the number of characters in
     STRING.  Examples: `SUBSTR("abcdefg", 3, 2)' has value `"cd"';
     `SUBSTR("Ben Pfaff", 5, 10)' has the value `"Pfaff"'.
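
     The rules for both forms of SUBSTR can be checked against a
     short Python model.  The name `substr' and the use of `None'
     for system-missing are assumptions made for illustration, and
     only integer START and COUNT values are considered.

```python
_TO_END = object()  # marks the two-argument form (no COUNT given)

def substr(string, start, count=_TO_END):
    """Model of SUBSTR with 1-based START and optional COUNT."""
    if start is None or start < 1 or start > len(string):
        return ""
    if count is _TO_END:
        return string[start - 1:]        # from START onward
    if count is None or count < 1:
        return ""
    # Slicing past the end simply yields a shorter string.
    return string[start - 1:start - 1 + count]
```

     The model reproduces both examples given above:
     `substr("abcdefg", 3, 2)' and `substr("Ben Pfaff", 5, 10)'.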

 - Function:  UPCASE (STRING)
     Returns STRING, changing lowercase letters to uppercase letters.


File: pspp.info,  Node: Time & Date,  Next: Miscellaneous Functions,  Prev: String Functions,  Up: Functions

Time & Date Functions
---------------------

   The legal range of dates for use in PSPP is 15 Oct 1582 through 31
Dec 19999.

     *Please note:* Most time & date extraction functions will accept
     invalid arguments:

        * Negative numbers in PSPP time format.

        * Numbers less than 86,400 in PSPP date format.

     However, sensible results are not guaranteed for these invalid
     values.  The given equivalents for these functions are definitely
     not guaranteed for invalid values.

     *Please note also:* The time & date construction functions *do*
     produce reasonable and useful results for out-of-range values;
     these are not considered invalid.

* Menu:

* Time & Date Concepts::        How times & dates are defined and represented
* Time Construction::           TIME.{DAYS HMS}
* Time Extraction::             CTIME.{DAYS HOURS MINUTES SECONDS}
* Date Construction::           DATE.{DMY MDY MOYR QYR WKYR YRDAY}
* Date Extraction::             XDATE.{DATE HOUR JDAY MDAY MINUTE MONTH
                                       QUARTER SECOND TDAY TIME WEEK
                                       WKDAY YEAR}


File: pspp.info,  Node: Time & Date Concepts,  Next: Time Construction,  Prev: Time & Date,  Up: Time & Date

How times & dates are defined and represented
.............................................

   Times and dates are handled by PSPP as single numbers.  A "time" is
an interval.  PSPP measures times in seconds.  Thus, the following
intervals correspond with the numeric values given:

               10 minutes                        600
               1 hour                          3,600
               1 day, 3 hours, 10 seconds     97,210
               40 days                     3,456,000
               10010 d, 14 min, 24 s     864,864,864

   A "date", on the other hand, is a particular instant in the past or
the future.  PSPP represents a date as a number of seconds after the
midnight that separated 3 Oct 1582 and 4 Oct 1582.  (Please note that 15
Oct 1582 immediately followed 4 Oct 1582.)  Thus, the midnights before
the dates given below correspond with the numeric PSPP dates given:

                   15 Oct 1582                86,400
                    4 Jul 1776         6,113,318,400
                    1 Jan 1900        10,010,390,400
                    1 Oct 1978        12,495,427,200
                   24 Aug 1995        13,028,601,600

Please note:

   * A time may be added to, or subtracted from, a date, resulting in a
     date.

   * The difference of two dates may be taken, resulting in a time.

   * Two times may be added to, or subtracted from, each other,
     resulting in a time.

   (Adding two dates does not produce a useful result.)

   Since times and dates are merely numbers, the ordinary addition and
subtraction operators are employed for these purposes.

     *Please note:* Many dates and times have extremely large
     values--just look at the values above.  Thus, it is not a good
     idea to take powers of these values; also, the accuracy of some
     procedures may be affected.  If necessary, convert times or dates
     in seconds to some other unit, like days or years, before
     performing analysis.
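
   The date representation and arithmetic described in this node can
be sketched in Python.  The helper names are hypothetical.  Python's
`datetime' module uses the proleptic Gregorian calendar, in which the
day before 15 Oct 1582 is labeled 14 Oct 1582, so that label is used
for the zero point here; the module also stops at year 9999, so it
covers only part of PSPP's range.

```python
from datetime import datetime, timedelta

# Midnight one day before 15 Oct 1582, in Python's proleptic calendar.
EPOCH = datetime(1582, 10, 14)
SECONDS_PER_DAY = 86400

def to_pspp_date(year, month, day):
    """Seconds from the zero point to the midnight before the given day."""
    return (datetime(year, month, day) - EPOCH).days * SECONDS_PER_DAY

def from_pspp_date(seconds):
    """Inverse conversion, for inspecting a PSPP date value."""
    return EPOCH + timedelta(seconds=seconds)

# A date plus a time (40 days) is another date.
later = from_pspp_date(to_pspp_date(1978, 10, 1) + 40 * SECONDS_PER_DAY)

# The difference of two dates is a time, here converted to days.
days_between = (to_pspp_date(1900, 1, 1)
                - to_pspp_date(1776, 7, 4)) // SECONDS_PER_DAY
```

   Converting the difference to days before further analysis, as in
the last line, follows the advice in the note above about very large
values.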


File: pspp.info,  Node: Time Construction,  Next: Time Extraction,  Prev: Time & Date Concepts,  Up: Time & Date

Functions that Produce Times
............................

   These functions take numeric arguments and produce numeric results in
PSPP time format.

 - Function:  TIME.DAYS (NDAYS)
     Results in a time value corresponding to NDAYS days.
     (`TIME.DAYS(X)' is equivalent to `X * 60 * 60 * 24'.)

 - Function:  TIME.HMS (NHOURS, NMINS, NSECS)
     Results in a time value corresponding to NHOURS hours, NMINS
     minutes, and NSECS seconds.  (`TIME.HMS(H, M, S)' is equivalent to
     `H*60*60 + M*60 + S'.)
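
     The equivalences stated for TIME.DAYS and TIME.HMS translate
     directly into Python.  The function names below are
     hypothetical stand-ins for the PSPP functions, written only to
     make the arithmetic easy to check.

```python
def time_days(ndays):
    """Model of TIME.DAYS: days expressed in seconds."""
    return ndays * 60 * 60 * 24

def time_hms(nhours, nmins, nsecs):
    """Model of TIME.HMS: hours, minutes, and seconds in seconds."""
    return nhours * 60 * 60 + nmins * 60 + nsecs

# Times are plain numbers, so they combine by ordinary addition.
one_day_three_hours_ten_seconds = time_days(1) + time_hms(3, 0, 10)
```

     Because times are plain numbers of seconds, intervals built
     with the two functions can be added together, as in the last
     line.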


File: pspp.info,  Node: Time Extraction,  Next: Date Construction,  Prev: Time Construction,  Up: Time & Date

Functions that Examine Times
............................

   These functions take numeric arguments in PSPP time format and give
numeric results.

 - Function:  CTIME.DAYS (TIME)
     Results in the number of days and fractional days in TIME.
     (`CTIME.DAYS(X)' is equivalent to `X/60/60/24'.)

 - Function:  CTIME.HOURS (TIME)
     Results in the number of hours and fractional hours in TIME.
     (`CTIME.HOURS(X)' is equivalent to `X/60/60'.)

 - Function:  CTIME.MINUTES (TIME)
     Results in the number of minutes and fractional minutes in TIME.
     (`CTIME.MINUTES(X)' is equivalent to `X/60'.)

 - Function:  CTIME.SECONDS (TIME)
     Results in the number of seconds and fractional seconds in TIME.
     (`CTIME.SECONDS' does nothing; `CTIME.SECONDS(X)' is equivalent to
     `X'.)


File: pspp.info,  Node: Date Construction,  Next: Date Extraction,  Prev: Time Extraction,  Up: Time & Date

Functions that Produce Dates
............................

   These functions take numeric arguments and give numeric results in
the PSPP date format.  Arguments taken by these functions are:

DAY
     Refers to a day of the month between 1 and 31.

MONTH
     Refers to a month of the year between 1 and 12.

QUARTER
     Refers to a quarter of the year between 1 and 4.  The quarters of
     the year begin on the first days of months 1, 4, 7, and 10.

WEEK
     Refers to a week of the year between 1 and 53.

YDAY
     Refers to a day of the year between 1 and 366.

YEAR
     Refers to a year between 1582 and 19999.

   If these functions' arguments are out-of-range, they are correctly
normalized before conversion to date format.  Non-integers are rounded
toward zero.

 - Function:  DATE.DMY (DAY, MONTH, YEAR)
 - Function:  DATE.MDY (MONTH, DAY, YEAR)
     Results in a date value corresponding to the midnight before day
     DAY of month MONTH of year YEAR.

 - Function:  DATE.MOYR (MONTH, YEAR)
     Results in a date value corresponding to the midnight before the
     first day of month MONTH of year YEAR.

 - Function:  DATE.QYR (QUARTER, YEAR)
     Results in a date value corresponding to the midnight before the
     first day of quarter QUARTER of year YEAR.

 - Function:  DATE.WKYR (WEEK, YEAR)
     Results in a date value corresponding to the midnight before the
     first day of week WEEK of year YEAR.

 - Function:  DATE.YRDAY (YEAR, YDAY)
     Results in a date value corresponding to the midnight before day
     YDAY of year YEAR.
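
     Most of the date construction functions can be modeled in
     Python as follows.  This is a sketch under stated assumptions:
     the function names are hypothetical, the zero point is taken to
     be the midnight one day before 15 Oct 1582 (labeled 14 Oct 1582
     in Python's proleptic Gregorian calendar), and out-of-range
     arguments are not normalized as PSPP does.  DATE.WKYR is omitted
     because the text above does not say on which day a week begins.

```python
from datetime import date, timedelta

EPOCH = date(1582, 10, 14)  # day whose starting midnight is value 0

def _seconds(day):
    return (day - EPOCH).days * 86400

def date_dmy(day, month, year):
    """Model of DATE.DMY: midnight before the given day."""
    return _seconds(date(year, month, day))

def date_mdy(month, day, year):
    """Model of DATE.MDY: same as DATE.DMY with reordered arguments."""
    return date_dmy(day, month, year)

def date_moyr(month, year):
    """Model of DATE.MOYR: first day of the month."""
    return date_dmy(1, month, year)

def date_qyr(quarter, year):
    """Model of DATE.QYR: quarters begin in months 1, 4, 7, and 10."""
    return date_moyr(3 * (quarter - 1) + 1, year)

def date_yrday(year, yday):
    """Model of DATE.YRDAY: day 1 is 1 January."""
    return _seconds(date(year, 1, 1) + timedelta(days=yday - 1))
```

     For instance, `date_qyr(4, 1978)' and `date_dmy(1, 10, 1978)'
     give the same value, since the fourth quarter begins on the
     first day of month 10.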