Formfeed (ASCII 12).
@item \n
-Newline (ASCII 10)
+New-line (ASCII 10).
@item \r
Carriage return (ASCII 13).
@item line-ends=@var{line-end-type}
One of @code{cr}, @code{lf}, or @code{crlf}. This controls what is used
-for newline in the output file. Default: @code{cr}.
+for new-line in the output file. Default: @code{cr}.
@item optimize-line-size=@var{level}
@code{paginate}, described below, for a related setting. Default:
@code{"\f"}.
-@item newline-string=@var{newline-value}
+@item newline-string=@var{new-line-value}
-The string written to the output to cause a newline (carriage return
+The string written to the output to cause a new-line (carriage return
plus linefeed). The default, which can be specified explicitly with
-@code{newline-string=default}, is to use the system-dependent newline
+@code{newline-string=default}, is to use the system-dependent new-line
sequence by opening the output file in text mode. This is usually the
right choice.
current run. The configuration options are:
@table @code
+@item -a @{compatible|enhanced@}
+@itemx --algorithm=@{compatible|enhanced@}
+
+If you choose @code{compatible}, then PSPP will use the same algorithms
+as used by some proprietary statistical analysis packages.
+This is not recommended, as these algorithms are inferior and in some cases
+completely broken.
+The default setting is @code{enhanced}.
+Certain commands have subcommands which allow you to override this setting on
+a per-command basis.
+
@item -B @var{dir}
@itemx --config-dir=@var{dir}
Lists the available device driver classes, then terminates.
+@item -x @{compatible|enhanced@}
+@itemx --syntax=@{compatible|enhanced@}
+
+If you choose @code{compatible}, then PSPP will only accept command syntax that
+is compatible with the proprietary program SPSS.
+If you choose @code{enhanced} then additional syntax will be available.
+The default is @code{enhanced}.
+
+
@item -V
@item --version
full-fledged expressions in themselves.
@menu
-* Booleans:: Boolean values.
+* Boolean Values:: Boolean values.
* Missing Values in Expressions:: Using missing values in expressions.
-* Grouping Operators:: ( )
-* Arithmetic Operators:: + - * / **
-* Logical Operators:: AND NOT OR
-* Relational Operators:: EQ GE GT LE LT NE
-* Functions:: More-sophisticated operators.
-* Order of Operations:: Operator precedence.
+* Grouping Operators:: parentheses
+* Arithmetic Operators:: add sub mul div pow
+* Logical Operators:: AND NOT OR
+* Relational Operators:: EQ GE GT LE LT NE
+* Functions:: More-sophisticated operators.
+* Order of Operations:: Operator precedence.
@end menu
-@node Booleans, Missing Values in Expressions, Expressions, Expressions
-@section Boolean values
+@node Boolean Values, Missing Values in Expressions, Expressions, Expressions
+@section Boolean Values
@cindex Boolean
@cindex values, Boolean
-There is a third type for arguments and results, the @dfn{Boolean} type,
-which is used to represent true/false conditions. Booleans have only
-three possible values: 0 (false), 1 (true), and system-missing.
-System-missing is neither true nor false.
-
-@itemize @bullet
-@item
-A numeric expression that has value 0, 1, or system-missing may be used
-in place of a Boolean. Thus, the expression @code{0 AND 1} is valid
-(although it is always false).
-
-@item
-A numeric expression with any other value will cause an error if it is
-used as a Boolean. So, @code{2 OR 3} is invalid.
+Some PSPP operators and expressions work with Boolean values, which
+represent true/false conditions. Booleans have only three possible
+values: 0 (false), 1 (true), and system-missing (unknown).
+System-missing is neither true nor false and indicates that the true
+value is unknown.
-@item
-A Boolean expression may not be used in place of a numeric expression.
-Thus, @code{(1>2) + (3<4)} is invalid.
+Boolean-typed operands or function arguments must take on one of these
+three values. Other values are considered false, but cause an error
+when the expression is evaluated.
-@item
Strings and Booleans are not compatible, and neither may be used in
place of the other.
-@end itemize
-@node Missing Values in Expressions, Grouping Operators, Booleans, Expressions
+@node Missing Values in Expressions, Grouping Operators, Boolean Values, Expressions
@section Missing Values in Expressions
String missing values are not treated specially in expressions. Most
descriptions.
User-missing values for numeric variables are always transformed into
-the system-missing value, except inside the arguments to the
-@code{VALUE}, @code{SYSMIS}, and @code{MISSING} functions.
+the system-missing value, except inside the arguments to the
+@code{VALUE} and @code{SYSMIS} functions.
The missing-value functions can be used to precisely control how missing
values are treated in expressions. @xref{Missing Value Functions}, for
@cindex logical intersection
@item @var{a} AND @var{b}
@itemx @var{a} & @var{b}
-True if both @var{a} and @var{b} are true. However, if one argument is
-false and the other is missing, the result is false, not missing. If
+True if both @var{a} and @var{b} are true, false otherwise. If one
+argument is false, the result is false even if the other is missing. If
both arguments are missing, the result is missing.
@cindex @code{OR}
@item @var{a} OR @var{b}
@itemx @var{a} | @var{b}
True if at least one of @var{a} and @var{b} is true. If one argument is
-true and the other is missing, the result is true, not missing. If both
+true, the result is true even if the other argument is missing. If both
arguments are missing, the result is missing.
@cindex @code{NOT}
@cindex logical inversion
@item NOT @var{a}
@itemx ~ @var{a}
-True if @var{a} is false.
+True if @var{a} is false. If the argument is missing, then the result
+is missing.
@end table
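+
+As an illustration of the rules above, writing the system-missing
+value as @code{$SYSMIS} (an illustrative choice for this sketch), the
+cases that involve missing operands work out as follows:
+
+@example
+0 AND $SYSMIS        @result{} 0
+$SYSMIS AND $SYSMIS  @result{} system-missing
+1 OR $SYSMIS         @result{} 1
+$SYSMIS OR $SYSMIS   @result{} system-missing
+NOT $SYSMIS          @result{} system-missing
+@end example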
@node Relational Operators, Functions, Logical Operators, Expressions
The relational operators take numeric or string arguments and produce Boolean
results.
-Note that, with numeric arguments, PSPP does not make exact
-relational tests. Instead, two numbers are considered to be equal even
-if they differ by a small amount. This amount, @dfn{epsilon}, is
-dependent on the PSPP configuration and determined at compile
-time. (The default value is 0.000000001, or
-@ifinfo
-@code{10**(-9)}.)
-@end ifinfo
-@tex
-$10 ^{-9}$.)
-@end tex
-Use of epsilon allows for round-off errors. Use of epsilon is also
-idiotic, but the author is not a numeric analyst.
-
Strings cannot be compared to numbers. When strings of different
lengths are compared, the shorter string is right-padded with spaces
to match the length of the longer string.
@cindex arccosine
@cindex inverse cosine
-@deftypefn {Function} {} ACOS (@var{number})
-@deftypefnx {Function} {} ARCOS (@var{number})
+@deftypefn {Function} {} ARCOS (@var{number})
Takes the arccosine, in radians, of @var{number}. Results in
-system-missing if @var{number} is not between -1 and 1. Portability:
-none.
+system-missing if @var{number} is not between -1 and 1.
@end deftypefn
@cindex arcsine
Takes the arctangent, in radians, of @var{number}.
@end deftypefn
-@cindex arcsine
-@cindex inverse sine
-@deftypefn {Function} {} ASIN (@var{number})
-Takes the arcsine, in radians, of @var{number}. Results in
-system-missing if @var{number} is not between -1 and 1 inclusive.
-Portability: none.
-@end deftypefn
-
-@cindex arctangent
-@cindex inverse tangent
-@deftypefn {Function} {} ATAN (@var{number})
-Takes the arctangent, in radians, of @var{number}.
-@end deftypefn
-
-@quotation
-@strong{Please note:} Use of the AR* group of inverse trigonometric
-functions is recommended over the A* group because they are more
-portable.
-@end quotation
-
@cindex cosine
@deftypefn {Function} {} COS (@var{angle})
Takes the cosine of @var{angle} which should be in radians.
@cindex values, missing
@cindex functions, missing-value
-Missing-value functions take various types as arguments, returning
-various types of results.
-
-@deftypefn {Function} {} MISSING (@var{variable or expression})
-@var{num} may be a single variable name or an expression. If it is a
-variable name, results in 1 if the variable has a user-missing or
-system-missing value for the current case, 0 otherwise. If it is an
-expression, results in 1 if the expression has the system-missing value,
-0 otherwise.
+Missing-value functions take various numeric arguments and yield
+various types of results. Note that the normal rules of evaluation
+apply within expression arguments to these functions. In particular,
+user-missing values for numeric variables are converted to
+system-missing values.
-@quotation
-@strong{Please note:} If the argument is a string expression other than
-a variable name, MISSING is guaranteed to return 0, because strings do
-not have a system-missing value. Also, when using a numeric expression
-argument, remember that user-missing values are converted to the
-system-missing value in most contexts. Thus, the expressions
-@code{MISSING(VAR1 @var{op} VAR2)} and @code{MISSING(VAR1) OR
-MISSING(VAR2)} are often equivalent, depending on the specific operator
-@var{op} used.
-@end quotation
+@deftypefn {Function} {} MISSING (@var{expr})
+Returns 1 if @var{expr} has the system-missing value, 0 otherwise.
@end deftypefn
@deftypefn {Function} {} NMISS (@var{expr} [, @var{expr}]@dots{})
Each argument must be a numeric expression. Returns the number of
-user- or system-missing values in the list. As a special extension,
+system-missing values in the list. As a special extension,
the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a
range of variables; see @ref{Sets of Variables}, for more details.
@end deftypefn
@deftypefn {Function} {} NVALID (@var{expr} [, @var{expr}]@dots{})
Each argument must be a numeric expression. Returns the number of
-values in the list that are not user- or system-missing. As a special extension,
+values in the list that are not system-missing. As a special extension,
the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a
range of variables; see @ref{Sets of Variables}, for more details.
@end deftypefn
-@deftypefn {Function} {} SYSMIS (@var{variable or expression})
-When given the name of a numeric variable, returns 1 if the value of
-that variable is system-missing. Otherwise, if the value is not
-missing or if it is user-missing, returns 0. If given the name of a
-string variable, always returns 1. If given an expression other than
-a single variable name, results in 1 if the value is system- or
-user-missing, 0 otherwise.
+@deftypefn {Function} {} SYSMIS (@var{expr})
+When @var{expr} is simply the name of a numeric variable, returns 1 if
+the variable has the system-missing value, 0 if it is user-missing or
+not missing. If @var{expr} takes another form, results in 1 if
+the value is system-missing, 0 otherwise.
@end deftypefn
@deftypefn {Function} {} VALUE (@var{variable})
Prevents the user-missing values of @var{variable} from being
-transformed into system-missing values: If @var{variable} is not
-system- or user-missing, results in the value of @var{variable}. If
-@var{variable} is user-missing, results in the value of @var{variable}
-anyway. If @var{variable} is system-missing, results in system-missing.
+transformed into system-missing values, and always results in the
+actual value of @var{variable}, whether it is user-missing,
+system-missing or not missing at all.
@end deftypefn
@node Pseudo-Random Numbers, Set Membership, Missing Value Functions, Functions
@end deftypefn
@cindex variance
-@deftypefn {Function} {} VAR (@var{number}, @var{number}[, @dots{}])
-Results in the variance of the values of @var{number}. This function
-requires at least two valid arguments to give a non-missing result.
-@end deftypefn
-
@deftypefn {Function} {} VARIANCE (@var{number}, @var{number}[, @dots{}])
Results in the variance of the values of @var{number}. This function
requires at least two valid arguments to give a non-missing result.
-(Use VAR in preference to VARIANCE for reasons of portability.)
@end deftypefn
@node String Functions, Time & Date, Statistical Functions, Functions
@cindex numbers, converting from strings
@cindex strings, converting to numbers
-@deftypefn {Function} {} NUMBER (@var{string})
-Returns the number produced when @var{string} is interpreted according
-to format F@var{x}.0, where @var{x} is the number of characters in
-@var{string}. If @var{string} does not form a proper number,
-system-missing is returned without an error message. Portability: none.
-@end deftypefn
-
@deftypefn {Function} {} NUMBER (@var{string}, @var{format})
Returns the number produced when @var{string} is interpreted according
-to format specifier @var{format}. Only the number of characters in
-@var{string} specified by @var{format} are examined. For example,
-@code{NUMBER("123", F3.0)} and @code{NUMBER("1234", F3.0)} both have
-value 123. If @var{string} does not form a proper number,
-system-missing is returned without an error message.
+to format specifier @var{format}. If the format width @var{w} is less
+than the length of @var{string}, then only the first @var{w}
+characters in @var{string} are used, e.g.@: @code{NUMBER("123", F3.0)}
+and @code{NUMBER("1234", F3.0)} both have value 123. If @var{w} is
+greater than @var{string}'s length, then it is treated as if it were
+right-padded with spaces. If @var{string} is not in the correct
+format for @var{format}, system-missing is returned.
@end deftypefn
@cindex strings, searching backwards
@display
DATA LIST FREE
+ [(@{TAB,'c'@}, @dots{})]
[@{NOTABLE,TABLE@}]
FILE='filename'
END=end_var
var_list *
@end display
-In free format, the input data is structured as a series of comma- or
-whitespace-delimited fields (end of line is one form of whitespace; it
-is not treated specially). Field contents may be surrounded by matched
-pairs of apostrophes (@samp{'}) or quotes (@samp{"}), or they may be
-unenclosed. For any type of field leading white space (up to the
-apostrophe or quote, if any) is not included in the field.
-
-Multiple consecutive delimiters are equivalent to a single delimiter.
-To specify an empty field, write an empty set of single or double
-quotes; for instance, @samp{""}.
+In free format, the input data is, by default, structured as a series
+of fields separated by spaces, tabs, commas, or line breaks. Each
+field's content may be unquoted, or it may be quoted with a pair of
+apostrophes (@samp{'}) or double quotes (@samp{"}). Unquoted white
+space separates fields but is not part of any field. Any mix of
+spaces, tabs, and line breaks is equivalent to a single space for the
+purpose of separating fields, but consecutive commas will skip a
+field.
+
+Alternatively, delimiters can be specified explicitly, as a
+parenthesized, comma-separated list of single-character strings
+immediately following FREE. The word TAB may also be used to specify
+a tab character as a delimiter. When delimiters are specified
+explicitly, only the given characters, plus line breaks, separate
+fields. Furthermore, leading spaces at the beginnings of fields are
+not trimmed, consecutive delimiters define empty fields, and no form
+of quoting is allowed.
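+
+A minimal sketch of explicitly specified delimiters, assuming a
+hypothetical data file @file{scores.txt} and hypothetical variables
+@code{id} and @code{score}; here only tabs, commas, and line breaks
+would separate fields:
+
+@example
+DATA LIST FREE (TAB, ',') FILE='scores.txt' /id score.
+@end example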
The NOTABLE and TABLE subcommands are as in @cmd{DATA LIST FIXED} above.
NOTABLE is the default.
@display
DATA LIST LIST
+ [(@{TAB,'c'@}, @dots{})]
[@{NOTABLE,TABLE@}]
FILE='filename'
END=end_var
@display
FILE HANDLE handle_name
/NAME='filename'
- /RECFORM=@{VARIABLE,FIXED,SPANNED@}
+ /MODE=@{CHARACTER,IMAGE@}
/LRECL=rec_len
- /MODE=@{CHARACTER,IMAGE,BINARY,MULTIPUNCH,360@}
+ /TABWIDTH=tab_width
@end display
-Use @cmd{FILE HANDLE} to define the attributes of a file that does
-not use conventional variable-length records terminated by newline
-characters.
+Use @cmd{FILE HANDLE} to associate a file handle name with a file and
+its attributes, so that later commands can refer to the file by its
+handle name. Because names of text files can be specified directly on
+commands that access files, @cmd{FILE HANDLE} is only needed when a
+file is not an ordinary file containing lines of text. However,
+@cmd{FILE HANDLE} may be used even for text files, and it may be
+easier to specify a file's name once and later refer to it by an
+abstract handle.
Specify the file handle name as an identifier. Any given identifier may
only appear once in a PSPP run. File handles may not be reassigned to a
The NAME subcommand specifies the name of the file associated with the
handle. It is the only required subcommand.
-The RECFORM subcommand specifies how the file is laid out. VARIABLE
-specifies variable-length lines terminated with newlines, and it is the
-default. FIXED specifies fixed-length records. SPANNED is not
-supported.
-
-LRECL specifies the length of fixed-length records. It is required if
-@code{/RECFORM FIXED} is specified.
+MODE specifies a file mode. In CHARACTER mode, the default, the data
+file is opened in ANSI C text mode, so that local end of line
+conventions are followed, and each text line is read as one record.
+In CHARACTER mode, most input programs will expand tabs to spaces
+(@cmd{DATA LIST FREE} with explicitly specified delimiters is an
+exception). By default, each tab is 4 characters wide, but an
+alternate width may be specified on TABWIDTH. A tab width of 0
+suppresses tab expansion entirely.
-MODE specifies a file mode. CHARACTER, the default, causes the data
-file to be opened in ANSI C text mode. BINARY causes the data file to
-be opened in ANSI C binary mode. The other possibilities are not
-supported.
+By contrast, in IMAGE mode, the data file is opened in ANSI C binary
+mode and records are a fixed length. In IMAGE mode, LRECL specifies
+the record length in bytes, with a default of 1024. Tab characters
+are never expanded to spaces in binary mode.
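+
+As a sketch of the subcommands in the synopsis above, the following
+would define a handle for a file of fixed-length 80-byte records. The
+handle name @code{survey} and file name @file{survey.dat} are
+hypothetical, and IMAGE is the fixed-length mode keyword listed in the
+synopsis:
+
+@example
+FILE HANDLE survey
+        /NAME='survey.dat'
+        /MODE=IMAGE
+        /LRECL=80.
+@end example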
@node INPUT PROGRAM, LIST, FILE HANDLE, Data Input and Output
@section INPUT PROGRAM
The aggregation functions listed above exclude all user-missing values
from calculations. To include user-missing values, insert a period
(@samp{.}) between the function name and left parenthesis
-(e.g.~@samp{SUM.}).
+(e.g.@: @samp{SUM.}).
Normally, only a single case (for SD and SD., two cases) need be
non-missing in each group for the aggregate variable to be
value is reported.) By default, the mean, standard deviation of the
mean, minimum, and maximum are reported for each variable.
-NTILES causes the specified quartiles to be reported. For instance,
-@code{/NTILES=4} would cause quartiles to be reported. In addition,
-particular percentiles can be requested with the PERCENTILES subcommand.
+PERCENTILES causes the specified percentiles to be reported.
+The percentiles should be presented as a list of numbers between 0
+and 100 inclusive.
+The NTILES subcommand causes the percentiles to be reported at the
+boundaries of the data set divided into the specified number of ranges.
+For instance, @code{/NTILES=4} would cause quartiles to be reported.
+
@node CROSSTABS, T-TEST, FREQUENCIES, Statistics
@section CROSSTABS
Fixes for any of these deficiencies would be welcomed.
-@node T-TEST, , CROSSTABS, Statistics
+@node T-TEST, , CROSSTABS, Statistics
@comment node-name, next, previous, up
@section T-TEST
@menu
-* One Sample Mode:: Testing against a hypothesised mean
-* Independent Samples Mode:: Testing two independent groups for the same mean
-* Paired Samples Mode:: Testing two interdependet groups for the same mean
+* One Sample Mode:: Testing against a hypothesised mean
+* Independent Samples Mode:: Testing two independent groups for equal mean
+* Paired Samples Mode:: Testing two interdependent groups for equal mean
@end menu
@node One Sample Mode, Independent Samples Mode, T-TEST, T-TEST
-@comment node-name, next, previous, up
-
@subsection One Sample Mode
The @cmd{TESTVAL} subcommand invokes the One Sample mode.
of whether @cmd{/MISSING=LISTWISE} was specified.
-@node Paired Samples Mode, , Independent Samples Mode, T-TEST
+@node Paired Samples Mode, , Independent Samples Mode, T-TEST
@comment node-name, next, previous, up
@subsection Paired Samples Mode
@table @code
@item int32 rec_type;
-Record type. Always set to 3.
+Record type. Always set to 7.
@item int32 subtype;
Record subtype. Always set to 4.
@table @code
@item int32 rec_type;
-Record type. Always set to 3.
+Record type. Always set to 7.
@item int32 subtype;
-Record subtype. May take any value.
+Record subtype. May take any value. According to Aapi
+H@"am@"al@"ainen, value 5 indicates a set of grouped variables and 6
+indicates date info (probably related to USE).
@item int32 size;
Size of each piece of data in the data part. Should have the value 4 or
* Version and Date Info Record::
* Identification Records::
* Variable Count Record::
+* Case Weight Variable Record::
* Variable Records::
* Value Label Records::
* Portable File Data::
Portable files are arranged as a series of lines of exactly 80
characters each. Each line is terminated by a carriage-return,
-line-feed sequence (henceforth, ``newline''). Newlines are not
-delimiters: they are only used to avoid line-length limitations existing
-on some operating systems.
+line-feed sequence (``new-lines''). New-lines are only used to avoid
+line length limits imposed by some OSes; they are not meaningful.
The file must be terminated with a @samp{Z} character. In addition, if
the final line in the file does not have exactly 80 characters, then it
character set, as explained in the next section. Therefore, the
@samp{Z} character is not necessarily an ASCII @samp{Z}.)
-For the rest of the description of the portable file format, newlines
+For the rest of the description of the portable file format, new-lines
and the trailing @samp{Z}s will be ignored, as if they did not exist,
because they are not an important part of understanding the file
contents.
@item
Variable count.
+@item
+Case weight variable (optional).
+
@item
Variables. Each variable record may optionally be followed by a
missing value record and a variable label record.
through @samp{9} plus capital letters @samp{A} through @samp{T}.
@item
-A fraction, consisting of a radix point (@samp{.}) followed by one or
-more base-30 digits (optional).
+Optional fraction, consisting of a radix point (@samp{.}) followed by
+one or more base-30 digits.
@item
-An exponent, consisting of a plus or minus sign (@samp{+} or @samp{-})
-followed by one or more base-30 digits (optional).
+Optional exponent, consisting of a plus or minus sign (@samp{+} or
+@samp{-}) followed by one or more base-30 digits.
@item
A forward slash (@samp{/}).
@end itemize
-Integer fields take form identical to floating-point fields, but they
+Integer fields take a form identical to floating-point fields, but they
may not contain a fraction.
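+
+As a worked illustration of the base-30 notation described above
+(these values are computed here by hand, not taken from an actual
+portable file), recall that the digit @samp{F} has value 15:
+
+@example
+10/     @result{} 1*30 + 0  = 30
+1.F/    @result{} 1 + 15/30 = 1.5
+@end example
+
+The first field is also valid as an integer field; the second is not,
+because it contains a fraction.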
String fields take the form of a integer field having value @var{n},
character set translation table, followed by an 8-byte tag string.
The 200-byte segment is divided into five 40-byte sections, each of
-which represents the string @code{ASCII SPSS PORT FILE} in a different
-character set encoding. (If the file is encoded in EBCDIC then the
-string is actually @code{EBCDIC SPSS PORT FILE}, and so on.) These
-strings are padded on the right with spaces in their own character set.
+which represents the string @code{@var{charset} SPSS PORT FILE} in a
+different character set encoding, where @var{charset} is the name of
+the character set used in the file, e.g.@: @code{ASCII} or
+@code{EBCDIC}. Each string is padded on the right with spaces in its
+respective character set.
It appears that these strings exist only to inform those who might view
the file on a screen, and that they are not parsed by SPSS products.
consists of a single string field giving additional information on the
product that wrote the portable file.
-@node Variable Count Record, Variable Records, Identification Records, Portable File Format
+@node Variable Count Record, Case Weight Variable Record, Identification Records, Portable File Format
@section Variable Count Record
The variable count record has tag code @samp{4}. It consists of two
dictionary. The purpose of the second is unknown; it contains the value
161 in all portable files examined so far.
-@node Variable Records, Value Label Records, Variable Count Record, Portable File Format
+@node Case Weight Variable Record, Variable Records, Variable Count Record, Portable File Format
+@section Case Weight Variable Record
+
+The case weight variable record is optional. If it is present, it
+indicates the variable used for weighting cases; if it is absent,
+cases are unweighted. It has tag code @samp{6}. It consists of a
+single string field that names the weighting variable.
+
+@node Variable Records, Value Label Records, Case Weight Variable Record, Portable File Format
@section Variable Records
Each variable record represents a single variable. Variable records