1 This is pspp.info, produced by makeinfo version 4.0 from pspp.texi.
4 * PSPP: (pspp). Statistical analysis package.
7 PSPP, for statistical analysis of sampled data, by Ben Pfaff.
9 This file documents PSPP, a statistical package for analysis of
10 sampled data that uses a command language compatible with SPSS.
12 Copyright (C) 1996-9, 2000 Free Software Foundation, Inc.
14 This version of the PSPP documentation is consistent with version 2
17 Permission is granted to make and distribute verbatim copies of this
18 manual provided the copyright notice and this permission notice are
19 preserved on all copies.
21 Permission is granted to copy and distribute modified versions of
22 this manual under the conditions for verbatim copying, provided that the
23 entire resulting derived work is distributed under the terms of a
24 permission notice identical to this one.
26 Permission is granted to copy and distribute translations of this
27 manual into another language, under the above condition for modified
28 versions, except that this permission notice may be stated in a
29 translation approved by the Free Software Foundation.
32 File: pspp.info, Node: ASCII output options, Next: ASCII page options, Prev: ASCII driver class, Up: ASCII driver class
37 `output-file=FILENAME'
38 File to which output should be sent. This can be an ordinary
39 filename (e.g., `"pspp.ps"'), a pipe filename (e.g., `"|lpr"'), or
40 stdout (`"-"'). Default: `"pspp.list"'.
42 `char-set=CHAR-SET-TYPE'
43 One of `ascii' or `latin1'. This has no effect on output at the
44 present time. Default: `ascii'.
46 `form-feed-string=FORM-FEED-VALUE'
47 The string written to the output to cause a formfeed. See also
48 `paginate', described below, for a related setting. Default:
51 `newline-string=NEWLINE-VALUE'
52 The string written to the output to cause a newline (carriage
53 return plus linefeed). The default, which can be specified
54 explicitly with `newline-string=default', is to use the
55 system-dependent newline sequence by opening the output file in
56 text mode. This is usually the right choice.
58 However, `newline-string' can be set to any string. When this is
59 done, the output file is opened in binary mode.
62 If set, a formfeed (as set in `form-feed-string', described above)
63 will be written to the device after every page. Default: `on'.
65 `tab-width=TAB-WIDTH-VALUE'
66 The distance between tab stops for this device. If set to 0, tabs
67 will not be used in the output. Default: `8'.
69 `init=INITIALIZATION-STRING'
70 String written to the device before anything else, at the
71 beginning of the output. Default: `""' (the empty string).
73 `done=FINALIZATION-STRING'
74 String written to the device after everything else, at the end of
75 the output. Default: `""' (the empty string).
78 File: pspp.info, Node: ASCII page options, Next: ASCII font options, Prev: ASCII output options, Up: ASCII driver class
83 These options affect page setup:
86 If enabled, two lines of header information giving title and
87 subtitle, page number, date and time, and PSPP version are printed
88 at the top of every page. These two lines are in addition to any
89 top margin requested. Default: `on'.
92 Physical length of a page, in lines. Headers and margins are
93 subtracted from this value. Default: `66'.
95 `width=CHARACTER-COUNT'
96 Physical width of a page, in characters. Margins are subtracted
97 from this value. Default: `130'.
100 Number of lines per vertical inch. Not currently used. Default:
103 `cpi=CHARACTERS-PER-INCH'
104 Number of characters per horizontal inch. Not currently used.
107 `left-margin=LEFT-MARGIN-WIDTH'
108 Width of the left margin, in characters. PSPP subtracts this value
109 from the page width. Default: `0'.
111 `right-margin=RIGHT-MARGIN-WIDTH'
112 Width of the right margin, in characters. PSPP subtracts this
113 value from the page width. Default: `0'.
115 `top-margin=TOP-MARGIN-LINES'
116 Length of the top margin, in lines. PSPP subtracts this value from
117 the page length. Default: `2'.
119 `bottom-margin=BOTTOM-MARGIN-LINES'
120 Length of the bottom margin, in lines. PSPP subtracts this value
121 from the page length. Default: `2'.
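The way the page options above combine can be sketched in Python. This is only an illustration, not PSPP code: the function `usable_area' and its parameter names are hypothetical, the defaults mirror the documented ones, and the sketch assumes the headers take exactly two lines when enabled.

```python
def usable_area(length=66, width=130, headers=True,
                left_margin=0, right_margin=0,
                top_margin=2, bottom_margin=2):
    """Return (lines, columns) left for output after margins and headers.

    Hypothetical helper: the names are illustrative, not PSPP identifiers.
    """
    # The two header lines are in addition to the top margin.
    header_lines = 2 if headers else 0
    lines = length - top_margin - bottom_margin - header_lines
    columns = width - left_margin - right_margin
    return lines, columns
```

With every option left at its default, the sketch reports 60 usable lines of 130 columns.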
124 File: pspp.info, Node: ASCII font options, Prev: ASCII page options, Up: ASCII driver class
129 These are the ASCII font options:
131 `box[LINE-TYPE]=BOX-CHARS'
132 The characters used for lines in tables produced by the ASCII
133 driver can be changed using this option. LINE-TYPE is used to
134 indicate which type of line to change; BOX-CHARS is the character
135 or string of characters to use for this type of line.
137 LINE-TYPE must be a 4-digit number in base 4. The digits are in
138 the order `right', `bottom', `left', `top'. The four
139 possibilities for each digit are:
151 Special device-defined line, if one is available; otherwise,
157 Sets `|' as the character to use for a single-width line with
158 bottom and top components.
161 Sets `#' as the character to use for the intersection of four
162 double-width lines, one each from the top, bottom, left and
166 Sets `"\xda"', which under MS-DOG is a box character suitable
167 for the top-left corner of a box, as the character for the
168 intersection of two single-width lines, one each from the
199 * For all others, `+' is used unless there are double lines or
200 special lines, in which case `#' is used.
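The LINE-TYPE encoding can be illustrated with a short Python sketch. It only splits the four base-4 digits into the documented order (`right', `bottom', `left', `top'); the function name `decode_line_type' is hypothetical, and the sketch deliberately does not interpret what each digit value means.

```python
def decode_line_type(line_type):
    """Split a 4-digit base-4 LINE-TYPE string into its per-side digits.

    Hypothetical helper for illustration; it is not part of PSPP.
    """
    if len(line_type) != 4 or any(c not in "0123" for c in line_type):
        raise ValueError("LINE-TYPE must be four digits, each 0-3")
    # Digit order as documented: right, bottom, left, top.
    sides = ("right", "bottom", "left", "top")
    return {side: int(digit) for side, digit in zip(sides, line_type)}
```

For instance, `decode_line_type("0101")' describes a line with identical nonzero bottom and top components and no right or left component.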
202 `italic-on=ITALIC-ON-STRING'
203 Character sequence written to turn on italics or underline
204 printing. If this is set to `overstrike', then the driver will
205 simulate underlining by overstriking with underscore characters
206 (`_') in the manner described by `overstrike-style' and
207 `carriage-return-style'. Default: `overstrike'.
209 `italic-off=ITALIC-OFF-STRING'
210 Character sequence to turn off italics or underline printing.
211 Default: `""' (the empty string).
213 `bold-on=BOLD-ON-STRING'
214 Character sequence written to turn on bold or emphasized printing.
215 If set to `overstrike', then the driver will simulate bold
216 printing by overstriking characters in the manner described by
217 `overstrike-style' and `carriage-return-style'. Default:
220 `bold-off=BOLD-OFF-STRING'
221 Character sequence to turn off bold or emphasized printing.
222 Default: `""' (the empty string).
224 `bold-italic-on=BOLD-ITALIC-ON-STRING'
225 Character sequence written to turn on bold-italic printing. If
226 set to `overstrike', then the driver will simulate bold-italics by
227 overstriking twice, once with the character, a second time with an
228 underscore (`_') character, in the manner described by
229 `overstrike-style' and `carriage-return-style'. Default:
232 `bold-italic-off=BOLD-ITALIC-OFF-STRING'
233 Character sequence to turn off bold-italic printing. Default: `""'
236 `overstrike-style=OVERSTRIKE-OPTION'
237 Either `single' or `line':
239 * If `single' is selected, then, to overstrike a line of text,
240 the output driver will output a character, backspace,
241 overstrike, output a character, backspace, overstrike, and so
244 * If `line' is selected then the output driver will output an
245 entire line, then backspace or emit a carriage return (as
246 indicated by `carriage-return-style'), then overstrike the
249 `single' is recommended for use with ttys and programs that
250 understand overstriking in text files, such as the pager `less'.
251 `single' will also work with printer devices but results in rapid
252 back-and-forth motions of the printhead that can cause the printer
253 to physically overheat!
255 `line' is recommended for use with printer devices. Most programs
256 that understand overstriking in text files will not properly deal
261 `carriage-return-style=CARRIAGE-RETURN-TYPE'
262 Either `bs' or `cr'. This option applies only when one or more of
263 the font commands is set to `overstrike' and, at the same time,
264 `overstrike-style' is set to `line'.
266 * If `bs' is selected then the driver will return to the
267 beginning of a line by emitting a sequence of backspace
268 characters (ASCII 8).
270 * If `cr' is selected then the driver will return to the
271 beginning of a line by emitting a single carriage-return
272 character (ASCII 13).
274 Although `cr' is preferred as being more compact, `bs' is more
275 general since some devices do not interpret carriage returns in the
276 desired manner. Default: `bs'.
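The character sequences implied by `overstrike-style' and `carriage-return-style' can be sketched as follows. This is a rough Python model, not the driver's code: the function `overstrike' and its arguments are hypothetical, and the `cr' case assumes the text starts at the left edge of the line.

```python
BS, CR = "\b", "\r"   # ASCII 8 and ASCII 13


def overstrike(text, over=None, style="single", carriage_return="bs"):
    """Return TEXT followed or interleaved with its overstrike characters.

    OVER is the overstriking character ("_" for underlining); None means
    each character is struck again with itself, as for bold.
    Hypothetical helper for illustration only.
    """
    if carriage_return not in ("bs", "cr"):
        raise ValueError("carriage_return must be 'bs' or 'cr'")
    marks = text if over is None else over * len(text)
    if style == "single":
        # character, backspace, overstrike, and so on
        return "".join(ch + BS + mark for ch, mark in zip(text, marks))
    if style == "line":
        # whole line, return to its start, then the overstrike pass
        back = BS * len(text) if carriage_return == "bs" else CR
        return text + back + marks
    raise ValueError("style must be 'single' or 'line'")
```

The `single' form keeps each overstrike next to its character, which is what pagers expect; the `line' form makes one pass per line, which is gentler on a printhead.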
279 File: pspp.info, Node: HTML driver class, Next: Miscellaneous configuring, Prev: ASCII driver class, Up: Configuration
281 The HTML driver class
282 =====================
284 The `html' driver class is used to produce output for viewing in
285 tables-capable web browsers such as Emacs' w3-mode. Its configuration
286 is very simple. Currently, the output has a very plain format. In the
287 future, further work may be done on improving the output appearance.
289 There are few options for use with the `html' driver class:
291 `output-file=FILENAME'
292 File to which output should be sent. This can be an ordinary
293 filename (e.g., `"pspp.ps"'), a pipe filename (e.g., `"|lpr"'), or
294 stdout (`"-"'). Default: `"pspp.html"'.
296 `prologue-file=PROLOGUE-FILE-NAME'
297 Sets the name of the HTML prologue file. You can write your
298 own prologue if you want to customize colors or other settings: see
299 *Note HTML Prologue::. Default: `html-prologue'.
303 * HTML Prologue:: Format of the HTML prologue file.
306 File: pspp.info, Node: HTML Prologue, Prev: HTML driver class, Up: HTML driver class
311 HTML files that are generated by PSPP consist of two parts: a
312 prologue and a body. The prologue is a collection of boilerplate.
313 Only the body differs greatly between two outputs. You can tune the
314 colors and other attributes of the output by editing the prologue.
316 The prologue is dumped into the output stream essentially unmodified.
317 However, two actions are performed on its lines. First, certain lines
318 may be omitted as specified in the prologue file itself. Second,
319 variables are substituted.
321 The following lines are omitted:
323 1. All lines that contain three bangs in a row (`!!!').
325 2. Lines that contain `!title', if no title is set for the output. If
326 a title is set, then the characters `!title' are removed before the
329 3. Lines that contain `!subtitle', if no subtitle is set for the
330 output. If a subtitle is set, then the characters `!subtitle' are
331 removed before the line is output.
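The three omission rules above can be sketched in Python. This is an illustration under stated assumptions rather than PSPP's implementation: the function `filter_prologue' is hypothetical, variable substitution is left out, and the behavior for a line that matches more than one rule is a guess.

```python
def filter_prologue(lines, title=None, subtitle=None):
    """Yield the prologue lines that survive the omission rules.

    Hypothetical helper; variable substitution is not modeled here.
    """
    for line in lines:
        if "!!!" in line:                 # rule 1: always omitted
            continue
        if "!title" in line:              # rule 2
            if not title:
                continue
            line = line.replace("!title", "")
        if "!subtitle" in line:           # rule 3
            if not subtitle:
                continue
            line = line.replace("!subtitle", "")
        yield line
```

A prologue author can therefore mark comment lines with `!!!' and make heading markup conditional on whether a title or subtitle was set.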
333 The following are the variables that are substituted. Only the
334 variables listed are substituted; environment variables are not. *Note
335 Environment substitutions::.
338 PSPP version as a string: `GNU PSPP 0.1b', for example.
341 Date the file was created. Example: `Tue May 21 13:46:22 1991'.
344 Under multiuser OSes, the user's login name, taken either from the
345 environment variable `LOGNAME' or, if that fails, the result of the
346 C library function `getlogin()'. Defaults to `nobody'.
349 System hostname as reported by `gethostname()'. Defaults to
353 Document title as a string. This is the title specified in the
357 Document subtitle as a string.
360 PSPP syntax file name. Example: `mary96/first.stat'.
363 File: pspp.info, Node: Miscellaneous configuring, Next: Improving output quality, Prev: HTML driver class, Up: Configuration
365 Miscellaneous configuration
366 ===========================
368 The following environment variables can be used to further configure
372 Used to determine the user's home directory. No default value.
375 Path used to find include files in PSPP syntax files. Defaults
376 vary across operating systems:
383 * `/usr/local/lib/pspp/include'
385 * `/usr/lib/pspp/include'
387 * `/usr/local/share/pspp/include'
389 * `/usr/share/pspp/include'
403 When PSPP invokes an external pager, it uses the first of these
404 that is defined. There is a default pager only if the person who
405 compiled PSPP defined one.
408 The terminal type `termcap' or `ncurses' will use, if such support
409 was compiled into PSPP.
411 `STAT_OUTPUT_INIT_FILE'
412 The basename used to search for the driver definition file. *Note
413 Output devices::. *Note File locations::. Default: `devices'.
415 `STAT_OUTPUT_PAPERSIZE_FILE'
416 The basename used to search for the papersize file. *Note
417 papersize::. *Note File locations::. Default: `papersize'.
419 `STAT_OUTPUT_INIT_PATH'
420 The path used to search for the driver definition file and the
421 papersize file. *Note File locations::. Default: the standard
425 The `sort' procedure stores its temporary files in this directory.
426 Default: (UNIX) `/tmp', (MS-DOS) `\', (other OSes) empty string.
431 Under MS-DOS only, these variables are consulted after TMPDIR, in
435 File: pspp.info, Node: Improving output quality, Prev: Miscellaneous configuring, Up: Configuration
437 Improving output quality
438 ========================
440 When its drivers are set up properly, PSPP can produce output that
441 looks very good indeed. The PostScript driver, suitably configured, can
442 produce presentation-quality output. Here are a few guidelines for
443 producing better-looking output, regardless of output driver. Your
444 mileage may vary, of course, and everyone has different esthetic
447 * Width is important in PSPP output. Greater output width leads to
448 more readable output, to a point. Try the following to increase
451 - If you're using the ASCII driver with a dot-matrix printer,
452 figure out what you need to do to put the printer into
453 compressed mode. Put that string into the `init'
454 setting. Try to get 132 columns; 160 might be better, but
455 you might find that print that tiny is difficult to read.
457 - With the PostScript driver, try these ideas:
461 + Legal-size (8.5" x 14") paper in landscape mode.
463 + Reducing font sizes. If you're using 12-point fonts,
464 try 10 point; if you're using 10-point fonts, try 8
465 point. Some fonts are more readable than others at
468 Try to strike a balance between character size and page width.
470 * Use high-quality fonts. Many public domain fonts are poor in
471 quality. Recently, URW made some high-quality fonts available
472 under the GPL. These are probably suitable.
474 * Be sure you're using the proper font metrics. The font metrics
475 provided with PSPP may not correspond to the fonts actually being
476 printed. This can cause bizarre-looking output.
478 * Make sure that you're using good ink/ribbon/toner. Darker print is
481 * Use plain fonts with serifs, such as Times-Roman or Palatino.
482 Avoid choosing italic or bold fonts as document base fonts.
485 File: pspp.info, Node: Invocation, Next: Language, Prev: Configuration, Up: Top
490 pspp [ -B DIR | --config-dir=DIR ] [ -o DEVICE | --device=DEVICE ]
491 [ -d VAR[=VALUE] | --define=VAR[=VALUE] ] [-u VAR | --undef=VAR ]
492 [ -f FILE | --out-file=FILE ] [ -p | --pipe ] [ -I- | --no-include ]
493 [ -I DIR | --include=DIR ] [ -i | --interactive ]
494 [ -n | --edit | --dry-run | --just-print | --recon ]
495 [ -r | --no-statrc ] [ -h | --help ] [ -l | --list ]
496 [ -c COMMAND | --command COMMAND ] [ -s | --safer ]
497 [ --testing-mode ] [ -V | --version ] [ -v | --verbose ]
498 [ KEY=VALUE ] FILE....
502 * Non-option Arguments:: Specifying syntax files and output devices.
503 * Configuration Options:: Change the configuration for the current run.
504 * Input and output options:: Controlling input and output files.
505 * Language control options:: Language variants.
506 * Informational options:: Helpful information about PSPP.
509 File: pspp.info, Node: Non-option Arguments, Next: Configuration Options, Prev: Invocation, Up: Invocation
514 Syntax files and output device substitutions can be specified on
518 A file by itself on the command line will be executed as a syntax
519 file. PSPP terminates after the syntax file runs, unless the `-i'
520 or `--interactive' option is given (*note Language control
524 When two or more filenames are given on the command line, the first
525 syntax file is executed, then PSPP's dictionary is cleared, then
526 the second syntax file is executed.
529 If syntax files' names are delimited by a plus sign (`+'), then the
530 dictionary is not cleared between their executions, as if they were
531 concatenated together into a single file.
534 Defines an output device macro KEY to expand to VALUE, overriding
535 any macro having the same KEY defined in the device configuration
536 file. *Note Macro definitions::.
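The handling of non-option arguments described above can be modeled roughly in Python. The sketch rests on two assumptions that the text does not confirm: that the plus sign appears as an argument of its own, and that any argument containing `=' is a KEY=VALUE macro definition. The function `group_syntax_files' is hypothetical.

```python
def group_syntax_files(args):
    """Group syntax-file names and collect KEY=VALUE macro definitions.

    Files in the same group would run without the dictionary being
    cleared between them. Hypothetical helper, not PSPP's parser.
    """
    groups, macros, joining = [], {}, False
    for arg in args:
        if arg == "+":                     # assumption: '+' stands alone
            joining = True
        elif "=" in arg:                   # assumption: KEY=VALUE macro
            key, value = arg.split("=", 1)
            macros[key] = value
        elif joining and groups:
            groups[-1].append(arg)         # same dictionary as previous file
            joining = False
        else:
            groups.append([arg])           # dictionary cleared before this file
            joining = False
    return groups, macros
```

Each inner list then corresponds to one run between dictionary resets.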
538 There is one other way to specify a syntax file, if your operating
539 system supports it. If you have a syntax file `foobar.stat', put the
542 #! /usr/local/bin/pspp
544 at the top, and mark the file as executable with `chmod +x
545 foobar.stat'. (If PSPP is not installed in `/usr/local/bin', then
546 insert its actual installation directory into the syntax file instead.)
547 Now you should be able to invoke the syntax file just by typing its
548 name. You can include any options on the command line as usual. PSPP
549 entirely ignores any lines beginning with `#!'.
552 File: pspp.info, Node: Configuration Options, Next: Input and output options, Prev: Non-option Arguments, Up: Invocation
554 Configuration Options
555 =====================
557 Configuration options are used to change PSPP's configuration for the
558 current run. The configuration options are:
562 Sets the configuration directory to DIR. *Note File locations::.
566 Selects the output device with name DEVICE. If this option is
567 given more than once, then all devices mentioned are selected.
568 This option disables all devices besides those mentioned on the
572 `--define=VAR[=VALUE]'
573 Defines an `environment variable' named VAR having the optional
574 value VALUE specified. *Note Variable values::.
578 Undefines the `environment variable' named VAR. *Note Variable
582 File: pspp.info, Node: Input and output options, Next: Language control options, Prev: Configuration Options, Up: Invocation
584 Input and output options
585 ========================
587 Input and output options affect how PSPP reads input and writes
588 output. These are the input and output options:
592 This overrides the output file name for devices designated as
593 listing devices. If a file named FILE already exists, it is
598 Allows PSPP to be used as a filter by causing the syntax file to be
599 read from stdin and output to be written to stdout. Conflicts
600 with the `-f FILE' and `--out-file=FILE' options.
604 Clears all directories from the include path. This includes all
605 directories put in the include path by default. *Note
606 Miscellaneous configuring::.
610 Appends directory DIR to the path that is searched for include
611 files in PSPP syntax files.
615 Execute literal command COMMAND. The command is executed before
616 startup syntax files, if any.
619 Invoke heuristics to assist with testing PSPP. For use by `make
620 check' and similar scripts.
623 File: pspp.info, Node: Language control options, Next: Informational options, Prev: Input and output options, Up: Invocation
625 Language control options
626 ========================
628 Language control options control how PSPP syntax files are parsed and
629 interpreted. The available language control options are:
633 When a syntax file is specified on the command line, PSPP normally
634 terminates after processing it. Giving this option will cause
635 PSPP to bring up a command prompt after processing the syntax file.
637 In addition, this forces syntax files to be interpreted in
638 interactive mode, rather than the default batch mode. *Note
639 Tokenizing lines::, for information on the differences between
640 batch mode and interactive mode command interpretation.
647 Only the syntax of any syntax file specified or of commands
648 entered at the command line is checked. Transformations are not
649 performed and procedures are not executed. Not yet implemented.
653 Prevents the execution of the PSPP startup syntax file. Not yet
654 implemented, as startup syntax files aren't, either.
658 Disables certain unsafe operations. This includes the `ERASE' and
659 `HOST' commands, as well as use of pipes as input and output files.
662 File: pspp.info, Node: Informational options, Prev: Language control options, Up: Invocation
664 Informational options
665 =====================
667 Informational options cause information about PSPP to be written to
668 the terminal. Here are the available options:
673 Prints a message describing PSPP command-line syntax and the
674 available device driver classes, then terminates.
679 Lists the available device driver classes, then terminates.
684 Prints a brief message listing PSPP's version, warranties you don't
685 have, copying conditions and copyright, and e-mail address for bug
686 reports, then terminates.
691 Increments PSPP's verbosity level. Higher verbosity levels cause
692 PSPP to display greater amounts of information about what it is
693 doing. Often useful for debugging PSPP's configuration.
695 Each time this option is given, the verbosity level increases by
696 one. The default verbosity level is 0, in which no
697 informational messages will be displayed.
699 Higher verbosity levels cause messages to be displayed when the
700 corresponding events take place.
703 Driver and subsystem initializations.
706 Completion of driver initializations. Beginning of driver
710 Completion of driver closings.
713 Files searched for; success of searches.
716 Individual directories included in file searches.
718 Each verbosity level also includes messages from lower verbosity
722 File: pspp.info, Node: Language, Next: Expressions, Prev: Invocation, Up: Top
727 *Please note:* PSPP is not even close to completion. Only a few
728 actual statistical procedures are implemented. PSPP is a work in
731 This chapter discusses elements common to many PSPP commands. Later
732 chapters will describe individual commands in detail.
736 * Tokens:: Characters combine to form tokens.
737 * Commands:: Tokens combine to form commands.
738 * Types of Commands:: Commands come in several flavors.
739 * Order of Commands:: Commands combine to form syntax files.
740 * Missing Observations:: Handling missing observations.
741 * Variables:: The unit of data storage.
742 * Files:: Files used by PSPP.
743 * BNF:: How command syntax is described.
746 File: pspp.info, Node: Tokens, Next: Commands, Prev: Language, Up: Language
751 PSPP divides most syntax file lines into series of short chunks
752 called "tokens", "lexical elements", or "lexemes". These tokens are
753 then grouped to form commands, each of which tells PSPP to take some
754 action--read in data, write out data, perform a statistical procedure,
755 etc. The process of dividing input into tokens is "tokenization", or
756 "lexical analysis". Each type of token is described below.
758 Tokens must be separated from each other by "delimiters".
759 Delimiters include whitespace (spaces, tabs, carriage returns, line
760 feeds, vertical tabs), punctuation (commas, forward slashes, etc.), and
761 operators (plus, minus, times, divide, etc.). Note that while whitespace
762 only separates tokens, other delimiters are tokens in themselves.
765 Identifiers are names that specify variable names, commands, or
768 * The first character in an identifier must be a letter, `#', or
769 `@'. Some system identifiers begin with `$', but
770 user-defined variables' names may not begin with `$'.
772 * The remaining characters in the identifier must be letters,
773 digits, or one of the following special characters:
777 * Variable names may be any length, but only the first 8
778 characters are significant.
780 * Identifiers are not case-sensitive: `foobar', `Foobar',
781 `FooBar', `FOOBAR', and `FoObaR' are different
782 representations of the same identifier.
784 * Identifiers other than variable names may be abbreviated to
785 their first 3 characters if this abbreviation is unambiguous.
786 These identifiers are often called "keywords". (Unique
787 abbreviations of more than 3 characters are also accepted:
788 `FRE', `FREQ', and `FREQUENCIES' are equivalent when the last
791 * Whether an identifier is a keyword depends on the context.
793 * Some keywords are reserved. These keywords may not be used
794 in any context besides those explicitly described in this
795 manual. The reserved keywords are:
797 ALL AND BY EQ GE GT LE LT NE NOT OR TO WITH
799 * Since keywords are identifiers, all the rules for identifiers
800 apply. Specifically, they must be delimited as are other
801 identifiers: `WITH' is a reserved keyword, but `WITHOUT' is a
804 *Caution:* It is legal to end a variable name with a period, but
805 _don't do it!_ The variable name will be misinterpreted when it is
806 the final token on a line: `FOO.' will be divided into two separate
807 tokens, `FOO' and `.', the "terminal dot". *Note Forming commands
811 Numbers may be specified as integers or reals. Integers are
812 internally converted into reals. Scientific notation is not
813 supported. Here are some examples of valid numbers:
815 1234 3.14159265359 .707106781185 8945.
817 *Caution:* The last example will be interpreted as two tokens,
818 `8945' and `.', if it is the last token on a line.
821 Strings are literal sequences of characters enclosed in pairs of
822 single quotes (`'') or double quotes (`"').
824 * Whitespace and case of letters _are_ significant inside
827 * Whitespace characters inside a string are not delimiters.
829 * To include single-quote characters in a string, enclose the
830 string in double quotes.
832 * To include double-quote characters in a string, enclose the
833 string in single quotes.
835 * It is not possible to put both single- and double-quote
836 characters inside one string.
839 Hexstrings are string variants that use hex digits to specify
842 * A hexstring may be used anywhere that an ordinary string is
845 * A hexstring begins with `X'' or `x'', and ends with `''.
847 * No whitespace is allowed between the initial `X' and `''.
849 * Double quotes `"' may be used in place of single quotes `'' if
852 * Each pair of hex digits is internally changed into a single
853 character with the given value.
855 * If there is an odd number of hex digits, the missing last
856 digit is assumed to be `0'.
858 * *Please note:* Use of hexstrings is nonportable because the
859 same numeric values are associated with different glyphs by
860 different operating systems. Therefore, their use should be
861 confined to syntax files that will not be widely distributed.
863 * *Please note also:* The character with value 00 is reserved
864 for internal use by PSPP. Its use in strings causes an error
865 and replacement with a blank space (in ASCII, hex 20, decimal 32).
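The hexstring rules can be sketched in Python as follows. This is an illustration, not PSPP's lexer: the function `decode_hexstring' is hypothetical, it assumes the closing quote must match the opening one, it maps each value to the character with that code point, and it silently replaces value 00 instead of reporting an error.

```python
def decode_hexstring(token):
    """Decode a hexstring token such as X'4142' into its characters.

    Hypothetical helper for illustration only.
    """
    if len(token) < 3 or token[0] not in "Xx" or token[1] not in "'\"":
        raise ValueError("hexstring must begin with X' or x'")
    if token[-1] != token[1]:
        raise ValueError("hexstring must end with its opening quote")
    digits = token[2:-1]
    if len(digits) % 2:
        digits += "0"                  # missing last digit is assumed to be 0
    values = bytes.fromhex(digits)     # raises ValueError on non-hex input
    # Value 00 is reserved, so it is replaced with a blank space here.
    return "".join(" " if v == 0 else chr(v) for v in values)
```

Under these assumptions `X'414'' is read as the pairs 41 and 40, which are `A' and `@' in ASCII.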
869 Punctuation separates tokens; punctuators are delimiters. These
870 are the punctuation characters:
875 Operators describe mathematical operations. Some operators are
880 Many of the above operators are also punctuators. Punctuators are
881 distinguished from operators by context.
883 The other operators are all reserved keywords. None of these are
886 AND EQ GE GT LE LT NE OR
889 A period (`.') at the end of a line (except for whitespace) is one
890 type of "terminal dot", although not every terminal dot is a
891 period at the end of a line. *Note Forming commands of tokens:
892 Commands. A period is a terminal dot _only_ when it is at the end
893 of a line; otherwise it is part of a floating-point number. (A
894 period outside a number in the middle of a line is an error.)
896 *Please note:* The character used for the "terminal dot" can
897 be changed with the SET command. This is strongly
898 discouraged, and throughout all the remainder of this manual
899 it will be assumed that the default setting is in effect.
902 File: pspp.info, Node: Commands, Next: Types of Commands, Prev: Tokens, Up: Language
904 Forming commands of tokens
905 ==========================
907 Most PSPP commands share a common structure, diagrammed below:
909 CMD... [SBC[=][SPEC [[,]SPEC]...]] [[/[=][SPEC [[,]SPEC]...]]...].
911 In the above, rather daunting, expression, pairs of square brackets
912 (`[ ]') indicate optional elements, and names such as CMD indicate
913 parts of the syntax that vary from command to command. Ellipses
914 (`...') indicate that the preceding part may be repeated an arbitrary
915 number of times. Let's pick apart what it says above:
917 * A command begins with a command name of one or more keywords, such
918 as `FREQUENCIES', `DATA LIST', or `N OF CASES'. CMD may be
919 abbreviated to its first word if that is unambiguous; each word in
920 CMD may be abbreviated to a unique prefix of three or more
921 characters as described above.
923 * The command name may be followed by one or more "subcommands":
925 - Each subcommand begins with a unique keyword, indicated by SBC
926 above. This is analogous to the command name.
928 - The subcommand name is optionally followed by an equals sign
931 - Some subcommands accept a series of one or more specifications
932 (SPEC), optionally separated by commas.
934 - Each subcommand must be separated from the next (if any) by a
937 * Each command must be terminated with a "terminal dot". The
938 terminal dot may be given one of three ways:
940 - (most commonly) A period character at the very end of a line,
943 - (only if NULLINE is on: *Note Setting user preferences: SET,
944 for more details.) A completely blank line.
946 - (in batch mode only) Any line that is not indented from the
947 left side of the page causes a terminal dot to be inserted
948 before that line. Therefore, each command begins with a line
949 that is flush left, followed by zero or more lines that are
950 indented one or more characters from the left margin.
952 In batch mode, PSPP will ignore a plus sign, minus sign, or
953 period (`+', `-', or `.') as the first character in a line.
954 Any of these characters as the first character on a line will
955 begin a new command. This allows for visual indentation of a
956 command without that command being considered part of the
959 PSPP is in batch mode when it is reading input from a file,
960 rather than from an interactive user. Note that the other
961 forms of the terminal dot may also be used in batch mode.
963 Sometimes, one encounters syntax files that are intended to be
964 interpreted in interactive mode rather than batch mode (for
965 instance, this can happen if a session log file is used
966 directly as a syntax file). When this occurs, use the `-i'
967 command line option to force interpretation in interactive
968 mode (*note Language control options::).
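The batch-mode terminal dot rules above can be modeled with a small Python sketch. It is a simplification, not PSPP's parser: the function `split_batch_commands' and its `nulline' flag are hypothetical, blank lines are simply skipped when `nulline' is false (an assumption), and lines are joined with single spaces without any tokenization.

```python
def split_batch_commands(lines, nulline=False):
    """Split syntax lines into commands using the batch-mode rules.

    Hypothetical helper for illustration only.
    """
    commands, current = [], []

    def flush():
        text = " ".join(current).strip()
        if text:                       # empty commands are ignored
            commands.append(text)
        current.clear()

    for raw in lines:
        line = raw.rstrip()
        if not line:                   # completely blank line
            if nulline:
                flush()
            continue
        if line[0] in "+-.":           # ignored marker; begins a new command
            flush()
            line = line[1:]
        elif not line[0].isspace():    # flush-left line begins a new command
            flush()
        if line.endswith("."):         # period at the very end of the line
            current.append(line[:-1].strip())
            flush()
        else:
            current.append(line.strip())
    flush()
    return commands
```

Given the lines `DATA LIST /X 1.', `FREQUENCIES', `  /VARIABLES=X', `+  LIST', and `DESCRIPTIVES X.', the sketch yields four commands, with the indented line attached to `FREQUENCIES'.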
970 PSPP ignores empty commands when they are generated by the above
971 rules. Note that, as a consequence of these rules, each command must
975 File: pspp.info, Node: Types of Commands, Next: Order of Commands, Prev: Commands, Up: Language
980 Commands in PSPP are divided roughly into six categories:
983 Set or display various global options that affect PSPP operations.
984 May appear anywhere in a syntax file. *Note Utility commands:
987 *File definition commands*
988 Give instructions for reading data from text files or from special
989 binary "system files". Most of these commands discard any previous
990 data or variables in order to replace it with the new data and
991 variables. At least one must appear before the first command in
992 any of the categories below. *Note Data Input and Output::.
994 *Input program commands*
995 Though rarely used, these provide powerful tools for reading data
996 files in arbitrary textual or binary formats. *Note INPUT
1000 Perform operations on data and write data to output files.
1001 Transformations are not carried out until a procedure is executed.
1003 *Restricted transformations*
1004 Same as transformations for most purposes. *Note Order of
1005 Commands::, for a detailed description of the differences.
1008 Analyze data, writing results of analyses to the listing file.
1009 Cause transformations specified earlier in the file to be
1010 performed. In a more general sense, a "procedure" is any command
1011 that causes the active file (the data) to be read.
1014 File: pspp.info, Node: Order of Commands, Next: Missing Observations, Prev: Types of Commands, Up: Language
1019 PSPP does not place many restrictions on ordering of commands. The
1020 main restriction is that variables must be defined with one of the
1021 file-definition commands before they are otherwise referred to.
1023 Of course, there are specific rules, for those who are interested.
1024 PSPP possesses five internal states, called initial, INPUT PROGRAM,
1025 FILE TYPE, transformation, and procedure states. (Please note the
1026 distinction between the INPUT PROGRAM and FILE TYPE _commands_ and the
1027 INPUT PROGRAM and FILE TYPE _states_.)
1029 PSPP starts up in the initial state. Each successful completion of
1030 a command may cause a state transition. Each type of command has its
1031 own rules for state transitions:
1034 * Legal in all states, except Pennsylvania.
1036 * Do not cause state transitions. Exception: when the N OF
1037 CASES command is executed in the procedure state, it causes a
1038 transition to the transformation state.
1041 * Legal in all states.
1043 * When executed in the initial or procedure state, causes a
1044 transition to the transformation state.
1046 * Clears the active file if executed in the procedure or
1047 transformation state.
1050 * Invalid in INPUT PROGRAM and FILE TYPE states.
1052 * Causes a transition to the INPUT PROGRAM state.
1054 * Clears the active file.
1057 * Invalid in INPUT PROGRAM and FILE TYPE states.
1059 * Causes a transition to the FILE TYPE state.
1061 * Clears the active file.
1063 *Other file definition commands*
1064 * Invalid in INPUT PROGRAM and FILE TYPE states.
1066 * Cause a transition to the transformation state.
1068 * Clear the active file, except for ADD FILES, MATCH FILES, and
1072 * Invalid in initial and FILE TYPE states.
1074 * Cause a transition to the transformation state.
1076 *Restricted transformations*
1077 * Invalid in initial, INPUT PROGRAM, and FILE TYPE states.
1079 * Cause a transition to the transformation state.
1082 * Invalid in initial, INPUT PROGRAM, and FILE TYPE states.
1084 * Cause a transition to the procedure state.
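The rules above amount to a small state machine. The following Python sketch is one reading of them and is not taken from PSPP itself: the category labels (`utility`, `n_of_cases`, `input_program`, and so on), the `RULES` table, and the functions `next_state` and `run` are all names invented for this illustration. The file-definition command that is legal in every state is left out for brevity, and the rule for transformations is transcribed literally from the list above.

```python
# States, named as in the text.
INITIAL = "initial"
INPUT_PROGRAM = "INPUT PROGRAM"
FILE_TYPE = "FILE TYPE"
TRANSFORMATION = "transformation"
PROCEDURE = "procedure"

# Hypothetical category label -> (states where the command is invalid,
#                                 state entered afterwards, or None for "no change").
RULES = {
    "utility": (set(), None),
    "n_of_cases": (set(), None),  # utility command with one exception, see below
    "input_program": ({INPUT_PROGRAM, FILE_TYPE}, INPUT_PROGRAM),
    "file_type": ({INPUT_PROGRAM, FILE_TYPE}, FILE_TYPE),
    "other_file_definition": ({INPUT_PROGRAM, FILE_TYPE}, TRANSFORMATION),
    "transformation": ({INITIAL, FILE_TYPE}, TRANSFORMATION),
    "restricted_transformation": (
        {INITIAL, INPUT_PROGRAM, FILE_TYPE}, TRANSFORMATION),
    "procedure": ({INITIAL, INPUT_PROGRAM, FILE_TYPE}, PROCEDURE),
}


def next_state(state, category):
    """Return the state after one successful command of the given category."""
    invalid_in, target = RULES[category]
    if state in invalid_in:
        raise ValueError(f"{category} is invalid in the {state} state")
    # Exception noted in the text: N OF CASES executed in the procedure
    # state moves back to the transformation state.
    if category == "n_of_cases" and state == PROCEDURE:
        return TRANSFORMATION
    return state if target is None else target


def run(categories, state=INITIAL):
    """Apply a sequence of command categories, starting in the initial state."""
    for category in categories:
        state = next_state(state, category)
    return state
```

For example, `run(["other_file_definition", "transformation", "procedure"])` ends in the procedure state, while `next_state(INITIAL, "procedure")` raises an error because procedures are invalid in the initial state.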
1087 File: pspp.info, Node: Missing Observations, Next: Variables, Prev: Order of Commands, Up: Language
1089 Handling missing observations
1090 =============================
1092 PSPP includes special support for unknown numeric data values.
1093 Missing observations are assigned a special value, called the
1094 "system-missing value". This "value" actually indicates the absence of
1095 a value; it means that the actual value is unknown. Procedures
1096 automatically exclude from analyses those observations or cases that
1097 have missing values. Whether single observations or entire cases are
1098 excluded depends on the procedure.
1100 The system-missing value exists only for numeric variables. String
1101 variables always have a defined value, even if it is only a string of
1104 Variables, whether numeric or string, can have designated
1105 "user-missing values". Every user-missing value is an actual value for
1106 that variable. However, most of the time user-missing values are
1107 treated in the same way as the system-missing value. String variables
1108 that are wider than a certain width, usually 8 characters (depending on
1109 computer architecture), cannot have user-missing values.
1111 For more information on missing values, see the following sections:
1112 *Note Variables::, *Note MISSING VALUES::, *Note Expressions::. See
1113 also the documentation on individual procedures for information on how
1114 they handle missing values.
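As a rough illustration of the two kinds of exclusion mentioned above (single observations versus entire cases), consider the Python sketch below. It is not PSPP code. Representing the system-missing value by `None`, and the names `SYSMIS`, `is_missing`, `mean_excluding_missing`, and `drop_incomplete_cases`, are assumptions made only for this example.

```python
SYSMIS = None  # stand-in for the system-missing value (assumption of this sketch)


def is_missing(value, user_missing=()):
    """True for the system-missing stand-in or any designated user-missing value."""
    return value is SYSMIS or value in user_missing


def mean_excluding_missing(values, user_missing=()):
    """Exclude single observations: average only the non-missing values."""
    valid = [v for v in values if not is_missing(v, user_missing)]
    return sum(valid) / len(valid) if valid else SYSMIS


def drop_incomplete_cases(cases, user_missing=()):
    """Exclude entire cases: keep only cases with no missing value at all."""
    return [case for case in cases
            if not any(is_missing(v, user_missing) for v in case)]
```

With `99.0` designated as user-missing, `mean_excluding_missing([1.0, None, 3.0, 99.0], user_missing={99.0})` averages only `1.0` and `3.0`, which shows a user-missing value being treated the same way as the system-missing value.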
1117 File: pspp.info, Node: Variables, Next: Files, Prev: Missing Observations, Up: Language
1122 Variables are the basic unit of data storage in PSPP. All the
1123 variables in a file taken together, apart from any associated data, are
1124 said to form a "dictionary". Each case contains a value for each
1125 variable. Some details of variables are described in the sections
1130 * Attributes:: Attributes of variables.
1131 * System Variables:: Variables automatically defined by PSPP.
1132 * Sets of Variables:: Lists of variable names.
1133 * Input/Output Formats:: Input and output formats.
1134 * Scratch Variables:: Variables deleted by procedures.
1137 File: pspp.info, Node: Attributes, Next: System Variables, Prev: Variables, Up: Variables
1139 Attributes of Variables
1140 -----------------------
1142 Each variable has a number of attributes, including:
1145 This is an identifier. Each variable must have a different name.
1152 (string variables only) String variables with a width of 8
1153 characters or fewer are called "short string variables". Short
1154 string variables can be used in many procedures where "long string
1155 variables" (those with widths greater than 8) are not allowed.
1157 *Please note:* Certain systems may consider strings longer
1158 than 8 characters to be short strings. Eight characters
1159 represents a minimum figure for the maximum length of a short
1163 Variables in the dictionary are arranged in a specific order. The
1164 DISPLAY command can be used to show this order: see *Note
1168 Dexter or sinister. *Note LEAVE::.
1171 Optionally, up to three values, or a range of values, or a specific
1172 value plus a range, can be specified as "user-missing values".
1173 There is also a "system-missing value" that is assigned to an
1174 observation when there is no other obvious value for that
1175 observation. Observations with missing values are automatically
1176 excluded from analyses. User-missing values are actual data
1177 values, while the system-missing value is not a value at all.
1178 *Note Missing Observations::.
1181 A string that describes the variable. *Note VARIABLE LABELS::.
1184 Optionally, these associate each possible value of the variable
1185 with a string. *Note VALUE LABELS::.
1188 Display width, format, and (for numeric variables) number of
1189 decimal places. This attribute does not affect how data are
1190 stored, just how they are displayed. Example: a width of 8, with
1191 2 decimal places. *Note PRINT FORMATS::.
1194 Similar to print format, but used by certain commands that are
1195 designed to write to binary files. *Note WRITE FORMATS::.
1198 File: pspp.info, Node: System Variables, Next: Sets of Variables, Prev: Attributes, Up: Variables
1200 Variables Automatically Defined by PSPP
1201 ---------------------------------------
1203 There are seven system variables. These are not like ordinary
1204 variables, as they are not stored in each case. They can only be used
1205 in expressions. These system variables, whose values and output formats
1206 cannot be modified, are described below.
1209 Case number of the current case. This changes as cases are
1213 Date the PSPP process was started, in format A9, following the
1214 pattern `DD MMM YY'.
1217 Number of days between 15 Oct 1582 and the time the PSPP process
1221 Page length, in lines, in format F11.
1224 System missing value, in format F1.
1227 Number of seconds between midnight 14 Oct 1582 and the time the
1228 active file was read, in format F20.
1231 Page width, in characters, in format F3.
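The day and second counts described above can be approximated with ordinary calendar arithmetic. The Python sketch below is only an illustration: the function names are invented here, the proleptic Gregorian calendar of Python's `datetime` module is assumed, and whether PSPP counts the starting day itself is not stated above, so the day count is a plain difference of dates.

```python
from datetime import date, datetime

# Reference points taken from the descriptions above.
SECONDS_ORIGIN = datetime(1582, 10, 14)  # midnight 14 Oct 1582
DAYS_ORIGIN = date(1582, 10, 15)         # 15 Oct 1582


def seconds_since_origin(moment):
    """Whole seconds between midnight 14 Oct 1582 and the given moment."""
    return int((moment - SECONDS_ORIGIN).total_seconds())


def days_since_origin(day):
    """Days between 15 Oct 1582 and the given date (plain difference)."""
    return (day - DAYS_ORIGIN).days
```

Under these assumptions, midnight on 15 Oct 1582 lies exactly one day, or 86400 seconds, after the seconds origin.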
1234 File: pspp.info, Node: Sets of Variables, Next: Input/Output Formats, Prev: System Variables, Up: Variables
1236 Lists of variable names
1237 -----------------------
1239 There are several ways to specify a set of variables:
1241 1. (Most commonly.) List the variable names one after another,
1242 optionally separating them by commas.
1244 2. (This method cannot be used on commands that define the
1245 dictionary, such as `DATA LIST'.) The syntax is the names of two
1246 existing variables, separated by the reserved keyword `TO'. The
1247 meaning is to include every variable in the dictionary between and
1248 including the variables specified. For instance, if the
1249 dictionary contains six variables with the names `ID', `X1', `X2',
1250 `GOAL', `MET', and `NEXTGOAL', in that order, then `X2 TO MET'
1251 would include variables `X2', `GOAL', and `MET'.
1253 3. (This method can be used only on commands that define the
1254 dictionary, such as `DATA LIST'.) It is used to define sequences
1255 of variables that end in consecutive integers. The syntax is two
1256 identifiers that end in numbers. This method is best illustrated
1259 * The syntax `X1 TO X5' defines 5 variables:
1271 * The syntax `ITEM0008 TO ITEM0013' defines 6 variables:
1285 * The syntaxes `QUES001 TO QUES9' and `QUES6 TO QUES3' are both
1286 invalid, although for different reasons, which should be
1289 Note that after a set of variables has been defined on `DATA LIST'
1290 or another command with this method, the same set can be
1291 referenced on later commands using the same syntax.
1293 4. The above methods can be combined, either one after another or
1294 delimited by commas. For instance, the combined syntax `A Q5 TO
1295 Q8 X TO Z' is legal as long as each part `A', `Q5 TO Q8', `X TO Z'
1296 is individually legal.
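The two uses of `TO' can be sketched in Python as follows. This is an illustration, not PSPP's parser: the function names `dictionary_range` and `expand_numbered` are invented, and the validity checks in `expand_numbered` (matching roots, ascending numbers, consistent zero padding) are this sketch's guess at why the two invalid examples above are rejected.

```python
import re


def dictionary_range(dictionary, first, last):
    """Method 2: every variable from first to last, in dictionary order."""
    start, end = dictionary.index(first), dictionary.index(last)
    if start > end:
        raise ValueError(f"{first} comes after {last} in the dictionary")
    return dictionary[start:end + 1]


def expand_numbered(first, last):
    """Method 3: names ending in consecutive integers, e.g. X1 TO X5."""
    m1 = re.fullmatch(r"(.*?)(\d+)", first)
    m2 = re.fullmatch(r"(.*?)(\d+)", last)
    if not m1 or not m2:
        raise ValueError("both names must end in a number")
    (root1, digits1), (root2, _digits2) = m1.groups(), m2.groups()
    if root1.upper() != root2.upper():
        raise ValueError("the names must share the same root")
    low, high = int(digits1), int(m2.group(2))
    if low > high:
        raise ValueError("the numbers must be in ascending order")
    # Pad every number to the width used by the first name.
    width = len(digits1)
    names = [f"{root1}{n:0{width}d}" for n in range(low, high + 1)]
    # Assumed rule: the last generated name must match the one written.
    if names[-1].upper() != last.upper():
        raise ValueError("inconsistent zero padding")
    return names
```

With the dictionary from the example in method 2, `dictionary_range(["ID", "X1", "X2", "GOAL", "MET", "NEXTGOAL"], "X2", "MET")` selects `X2', `GOAL', and `MET'. `expand_numbered("ITEM0008", "ITEM0013")` produces six names, and both `expand_numbered("QUES001", "QUES9")` and `expand_numbered("QUES6", "QUES3")` raise errors.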