From: John Darrington Date: Sat, 30 Oct 2004 10:14:05 +0000 (+0000) Subject: Split pspp.texi into one texi file per chapter X-Git-Tag: v0.4.0~246 X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=1fc3af93c0ba6cbaf7ef09edc979096b6f16dd6f;p=pspp-builds.git Split pspp.texi into one texi file per chapter --- diff --git a/doc/ChangeLog b/doc/ChangeLog index ceaaacfa..13497e4a 100644 --- a/doc/ChangeLog +++ b/doc/ChangeLog @@ -1,534 +1,10 @@ -Sun May 30 22:44:25 2004 Ben Pfaff - * pspp.texi: Update FILE HANDLE, DATA LIST FREE, DATA LIST LIST - documentation to reflect latest changes. +Sat Oct 30 17:32:53 WST 2004 John Darrington -Mon Apr 19 22:46:37 2004 Ben Pfaff + * Started this changelog - * pspp.texi: Minor updates to data file and portable file - descriptions based on emails from Aapi Hämäläinen - . + * Removed the monolithic pspp.texi file and replaced with *.texi + wrapped by a single pspp.texinfo file -Fri Mar 26 00:07:46 2004 Ben Pfaff - - * pspp.texi: Update chapter on expressions. - -Sat Mar 20 00:53:10 WST 2004 John Darrington - - * pspp.texi: Added a brief mention of the SHOW command. - -Sun Mar 14 21:50:56 2004 Ben Pfaff - - * pspp.texi: Added details on how various features interact. - -Fri Mar 12 16:33:12 WST 2004 John Darrington - - * Added indeces for the commands in the STATISTICS section - - * Added notes on the T-TEST Independent samples command about - single values for the independent variable. - -Tue Mar 9 22:34:17 2004 Ben Pfaff - - * Updated. - -Wed Feb 18 18:15:48 2004 Ben Pfaff - - * Improve CROSSTABS description - -Wed Feb 18 21:50:36 WST 2004 John Darrington - - * Added a section on T-TEST - -Mon Jan 5 12:37:03 WAST 2004 John Darrington - - * Added documentation for the HOST command. - -Sat Dec 27 16:36:05 2003 Ben Pfaff - - * Makefile.am (MAKEINFO): Removed, since the manual validates (and - should validate from now on). - - * pspp.texi: Updated. - -Sun Jan 2 21:30:53 2000 Ben Pfaff - - * pspp.texi: Updated. 
- -Tue Mar 9 12:47:20 1999 Ben Pfaff - - * pspp.texi: Updated. - -Mon Jan 18 19:29:21 1999 Ben Pfaff - - * pspp.texi: Updated. - -Tue Jan 5 12:04:09 1999 Ben Pfaff - - * pspp.texi: Updated. - -Thu Nov 19 12:35:01 1998 Ben Pfaff - - * pspp.texi: Revised. - -Sun Aug 9 11:11:43 1998 Ben Pfaff - - * pspp.texi: Revised. - -Sat Aug 8 00:19:22 1998 Ben Pfaff - - * pspp.texi: Revised. - -Sun Jul 5 00:14:24 1998 Ben Pfaff - - * pspp.texi: Updated. - -Fri May 29 21:43:52 1998 Ben Pfaff - - * pspp.texi: Revised. - -Wed May 20 00:03:50 1998 Ben Pfaff - - * pspp.texi: Updated. - -Fri Apr 24 12:51:28 1998 Ben Pfaff - - * pspp.texi: Updated. - -Wed Apr 15 13:01:28 1998 Ben Pfaff - - * AUTHORS.html, BUGS.html, LANGUAGE.html, README.html, - THANKS.html: Removed. - - * Makefile.am: Don't reference the deleted files. - -Mon Mar 9 00:55:59 1998 Ben Pfaff - - * LANGUAGE.html: Updated. - -1998-03-05 Ben Pfaff - - * pspp.texi: Updated. - -1998-02-23 Ben Pfaff - - * pspp.texi: Updated. - -Fri Feb 13 15:35:44 1998 Ben Pfaff - - * LANGUAGE.html: Updated. - -Thu Feb 5 00:18:10 1998 Ben Pfaff - - * LANGUAGE.html: Updated. - - * pspp.texi: Revised. - -Tue Jan 13 23:44:43 1998 Ben Pfaff - - * BUGS.html: Updated. - - * LANGUAGE.html: Updated. - -Thu Jan 8 22:27:29 1998 Ben Pfaff - - * pspp.texi: Updated. - -Sun Jan 4 18:12:11 1998 Ben Pfaff - - * LANGUAGE.html: Updated. - -Wed Dec 24 22:36:09 1997 Ben Pfaff - - * pspp.texi: Updated. - -Sun Dec 21 16:18:18 1997 Ben Pfaff - - * pspp.texi: Updated. - -Fri Dec 5 22:53:35 1997 Ben Pfaff - - * fiasco.man: Renamed pspp.man. - - * fiasco.texi: Renamed pspp.texi. - -Fri Dec 5 21:52:29 1997 Ben Pfaff - - * fiasco.texi: Updated. - -Tue Dec 2 14:35:34 1997 Ben Pfaff - - * BUGS.html: Updated. - -Sat Nov 22 01:20:41 1997 Ben Pfaff - - * fiasco.texi: Revised. - -Fri Nov 21 00:02:36 1997 Ben Pfaff - - * fiasco.man, fiasco.texi: Revised. - -Tue Oct 28 16:08:01 1997 Ben Pfaff - - * fiasco.texi: Revised. 
- -Tue Oct 7 20:22:14 1997 Ben Pfaff - - * LANGUAGE.html: Updated. - -Sat Oct 4 16:19:27 1997 Ben Pfaff - - * LANGUAGE.html: Updated. - -Thu Sep 18 21:33:44 1997 Ben Pfaff - - * BUGS.html, LANGUAGE.html: Updated. - -Wed Aug 20 14:21:35 1997 Ben Pfaff - - * Makefile.am: (info_TEXINFOS) Remove FAQ.texi. - -Wed Aug 20 12:49:40 1997 Ben Pfaff - - * ANNOUNCE.html.in, FAQ.texi, HELP-WANTED.html: Removed. - - * BUGS.html, LANGUAGE.html, README.html.in: Updated per - suggestions of rms. - - * Makefile.am: (noinst_DATA) Removed ANNOUNCE.html, - HELP-WANTED.html. - (EXTRA_DIST) Removed ANNOUNCE.html, ANNOUNCE.html.in, - HELP-WANTED.html. - (MAINTAINERCLEANFILES, HTML_FORMATTER) Removed. - - * fiasco.texi: Revised. - -Sat Aug 16 10:51:51 1997 Ben Pfaff - - * ANNOUNCE.html.in, HELP-WANTED.html, README.html.in: Updated per - suggestions of rms. - - * AUTHORS.html, BUGS.html, FAQ.texi, LANGUAGE.html, THANKS.html, - fiasco.man, fiasco.texi: Updated. - - * README-i386linux.html, dist.html.in.in, fiasco.lsm.in, - changelogs.html.top, changelogs.html.bot: Removed, all references - removed. - -Thu Aug 14 22:07:02 1997 Ben Pfaff - - * ANNOUNCE.html.in, README.html.in, dist.html.in.in: Updated. - - * Makefile.am: Use $(VERSION) instead of VERSION file. - (EXTRA_DIST) Add README-i386linux. - -Thu Aug 14 11:52:20 1997 Ben Pfaff - - * ANNOUNCE.html.in, AUTHORS.html, BUGS.html, HELP-WANTED.html, - LANGUAGE.html, README-i386linux.html, README.html.in, THANKS.html, - changelogs.html.bot, changelogs.html.top: Revised. - - * Makefile.am: (noinst_DATA) Remove dist.html, add dist.html.in. - (EXTRA_DIST) Add ONEWS, remove dist.html, dist.html.in, add - dist.html.in.in. - (MAINTAINERCLEANFILES) Add dist.html.in. - (dist.html) Removed. - (dist.html.in) New target depending on dist.html.in.in. - (docfiles) New target. - - * dist.html.in: Renamed dist.html.in.in. - -Tue Aug 5 13:57:20 1997 Ben Pfaff - - * FAQ.texi, fiasco.texi: Updated. 
- -Sun Aug 3 11:34:43 1997 Ben Pfaff - - * ANNOUNCE.html.in, AUTHORS.html, BUGS.html, FAQ.texi, - LANGUAGE.html, README-i386linux.html, README.html.in, THANKS.html, - changelogs.html.bot, dist.html.in, fiasco.texi: Updated. - - * Makefile.am: (noinst_DATA, EXTRA_DIST) Add HELP-WANTED.html, - remove README-i386gnuwin32.html. - (MAINTAINERCLEANFILES) Remove README-i386gnuwin32.html, add - README-i386linux. - (README-i386linux) New target. - - * README-i386gnuwin32.html.in: Removed. - - * HELP-WANTED.html: New file. - -Thu Jul 17 21:40:28 1997 Ben Pfaff - - * Makefile.am: Generates fiasco.lsm from fiasco.lsm.in. - -Thu Jul 17 01:49:06 1997 Ben Pfaff - - * FAQ.texi: Updated. - - * Makefile.am: Completely rewritten. - - * ANNOUNCE.html.in, README-i386gnuwin32.html.in, - README-i386linux.html, README.html.in, dist.html.in, - fiasco.lsm.in: New files. - -Fri Jul 11 23:01:32 1997 Ben Pfaff - - * fiasco.texi: Updated. - -Sun Jul 6 20:46:38 1997 Ben Pfaff - - * ANNOUNCE.html, FAQ.texi, README.html: Updated. - - * Makefile.am: Add all the recent new files to EXTRA_DIST. - -Sat Jul 5 23:43:50 1997 Ben Pfaff - - * ANNOUNCE.html, FAQ.texi, README.html: Updated. - - * changelogs.html.bot: Fix copyright notice. - - * fiasco.man: New file. - -Fri Jul 4 13:23:57 1997 Ben Pfaff - - * changelogs.html.bot, changelogs.html.top: New files. - - * fiasco.lsm: New file. - - * ANNOUNCE.html, FAQ.texi, README.html: Updated. - - * Makefile.am: (EXTRA_DIST) Removed duplicate assignment. - -Wed Jun 25 22:51:39 1997 Ben Pfaff - - * FAQ.texi: Finished. - - * README.html: Updates. - -Sun Jun 22 21:59:07 1997 Ben Pfaff - - * ANNOUNCE.html, BUGS.html, LANGUAGE.html, README.html, - fiasco.texi: Updates. - - * Makefile.am: Add `FAQ.texi' to info_TEXINFOS. - - * FAQ.texi: New file. - -Tue Jun 3 23:25:51 1997 Ben Pfaff - - * AUTHORS.html, BUGS.html, README.html, THANKS.html: Updates. - - * fiasco.texi: Update. - -Sun Jun 1 11:58:27 1997 Ben Pfaff - - * fiasco.texi: Development. 
- -Fri May 30 19:39:37 1997 Ben Pfaff - - * fiasco.texi: Development. - -Mon May 5 21:57:20 1997 Ben Pfaff - - * fiasco.texi: Development. - -Fri May 2 22:07:26 1997 Ben Pfaff - - * fiasco.texi: Development. - -Thu May 1 14:58:31 1997 Ben Pfaff - - * BUGS.html: Update. - - * fiasco.texi: Development. - -Wed Apr 23 21:33:48 1997 Ben Pfaff - - * THANKS.html: Update. - -Fri Apr 18 15:42:22 1997 Ben Pfaff - - * Makefile.am: Maintainer-clean Makefile.in. - -Thu Mar 27 01:11:29 1997 Ben Pfaff - - * THANKS.html: Added Fran,cois Pinard. - -Mon Mar 24 21:47:31 1997 Ben Pfaff - - * THANKS.html: Spelling fix. - -Sat Feb 15 21:26:53 1997 Ben Pfaff - - * LANGUAGE.html: Updated. - -Fri Feb 14 23:32:58 1997 Ben Pfaff - - * BUGS.html: Updated. - -Wed Jan 22 21:54:00 1997 Ben Pfaff - - * LANGUAGE.html: RENAME VARIABLES is implemented. - -Thu Jan 16 13:08:57 1997 Ben Pfaff - - * LANGUAGE.html: MODIFY VARS now works. - - * README.html: Added `alpha.gnu.ai.mit.edu' to list of sites. - -Sat Jan 11 15:44:15 1997 Ben Pfaff - - * README.html: Commented out sunsite reference and added - ALPHA-release warning. - -Fri Jan 10 20:22:08 1997 Ben Pfaff - - * LANGUAGE.html: Reformatted. - -Thu Jan 2 19:08:23 1997 Ben Pfaff - - * LANGUAGE.html: Updated. - -Wed Jan 1 22:08:10 1997 Ben Pfaff - - * LANGUAGE.html: Updated. - -Sun Dec 29 21:36:48 1996 Ben Pfaff - - * LANGUAGE.html: Updated. - - * fiasco.texi: Updated. - -Tue Dec 24 20:42:32 1996 Ben Pfaff - - * LANGUAGE.html, README.html: Miscellaneous changes. - -Sun Dec 22 23:10:39 1996 Ben Pfaff - - * LANGUAGE.html, README.html: Miscellaneous changes. - - * AUTHORS.html, BUGS.html, THANKS.html: New files derived from - corresponding files without the `.html'. - -Sat Dec 21 21:51:04 1996 Ben Pfaff - - * AUTHORS: Grammar fix. - - * LANGUAGE.html: New file. LANGUAGE is now automatically - generated from this html source through lynx. - - * README.html: Similar situation to LANGUAGE.html. - -Sun Dec 15 15:32:16 1996 Ben Pfaff - - * LANGUAGE: Updated. 
- -Fri Dec 6 23:53:47 1996 Ben Pfaff - - * AUTHORS, BUGS, LANGUAGE, README: Updated. - - * fiasco.texi: Fixes. - -Wed Dec 4 21:34:17 1996 Ben Pfaff - - * LANGUAGE: Updated. - -Sun Dec 1 17:19:00 1996 Ben Pfaff - - * BUGS, LANGUAGE, NEWS: Misc. changes. - -Sun Nov 24 14:53:53 1996 Ben Pfaff - - * fiasco.texi: Changed many instances of `illegal' to `invalid'. - -Wed Oct 30 17:13:08 1996 Ben Pfaff - - * LANGUAGE: Updated. - - * README: Updated. - -Sat Oct 26 23:06:06 1996 Ben Pfaff - - * LANGUAGE: Updated. - -Sat Oct 26 10:39:25 1996 Ben Pfaff - - * LANGUAGE: Updated. - -Thu Oct 24 20:13:42 1996 Ben Pfaff - - * LANGUAGE: Updated. - - * README: Updated. - - * fiasco.texi: Updated. - -Thu Oct 24 17:47:14 1996 Ben Pfaff - - * LANGUAGE: Updated. - -Wed Oct 23 21:53:43 1996 Ben Pfaff - - * LANGUAGE: Updated. - -Tue Oct 22 17:27:04 1996 Ben Pfaff - - * LANGUAGE: Updated. - - * fiasco.texi: Very minor changes. - -Sun Sep 29 19:37:03 1996 Ben Pfaff - - * fiasco.texi: Continued development. - -Tue Sep 24 18:39:09 1996 Ben Pfaff - - * avl.texi, gpl.texi: Removed. - - * fiasco.texi: Changed copyright notices; deleted references to - avl.texi, gpl.texi. - -Sat Sep 21 23:16:31 1996 Ben Pfaff - - * fiasco.texi: Continued work--added to configuration chapter. - -Fri Sep 20 22:52:28 1996 Ben Pfaff - - * fiasco.texi: Continued work--added to configuration chapter. - -Thu Sep 12 18:40:33 1996 Ben Pfaff - - * fiasco.texi: Continued work--added section on bug reports. - -Wed Sep 11 22:01:41 1996 Ben Pfaff - - * fiasco.texi: Added timestamp. Started some updating. - -Tue Sep 10 21:39:00 1996 Ben Pfaff - - * LANGUAGE: Updated. - - * README: Minor change. - -Mon Sep 9 21:43:13 1996 Ben Pfaff - - * NEWS: Added automagic timestamp. - - * README: Restructured, extended. - - * BUGS, LANGUAGE: New files. - -Sat Jul 6 22:22:25 1996 Ben Pfaff - - * fiasco.texi: Remarked on broken Borland alloca(). - -Mon Jul 1 13:00:00 1996 Ben Pfaff - - * stat.texi: Renamed to `fiasco.texi'. 
- ----------------------------------------------------------------------- -Local Variables: -mode: change-log -version-control: never -End: + * Minor corrections to the documentation where I noticed it needed + them. diff --git a/doc/Makefile.am b/doc/Makefile.am index 15ea6557..430c6df0 100644 --- a/doc/Makefile.am +++ b/doc/Makefile.am @@ -1,8 +1,33 @@ ## Process this file with automake to produce Makefile.in -*- makefile -*- -info_TEXINFOS = pspp.texi +info_TEXINFOS = pspp.texinfo + +EXTRA_DIST = pspp.man \ + bugs.texi \ + command-index.texi \ + concept-index.texi \ + configuring.texi \ + credits.texi \ + data-file-format.texi \ + data-io.texi \ + data-selection.texi \ + expressions.texi \ + files.texi \ + flow-control.texi \ + function-index.texi \ + installing.texi \ + introduction.texi \ + invoking.texi \ + language.texi \ + license.texi \ + not-implemented.texi \ + portable-file-format.texi \ + q2c.texi \ + statistics.texi \ + transformation.texi \ + utilities.texi \ + variables.texi -EXTRA_DIST = pspp.man CLEANFILES = pspp.info pspp.info-* -MAINTAINERCLEANFILES=Makefile.in README.html +MAINTAINERCLEANFILES=Makefile.in diff --git a/doc/bugs.texi b/doc/bugs.texi new file mode 100644 index 00000000..1df01284 --- /dev/null +++ b/doc/bugs.texi @@ -0,0 +1,85 @@ +@node Bugs, Function Index, Not Implemented, Top +@chapter Bugs + +@menu +* Known bugs:: Pointers to other files. +* Contacting the Author:: Where to send the bug reports. +@end menu + +@node Known bugs, Contacting the Author, Bugs, Bugs +@section Known bugs + +This is the list of known bugs in PSPP. In addition, @xref{Not +Implemented}, and @xref{Functions Not Implemented}, for lists of bugs +due to features not implemented. For known bugs in individual language +features, see the documentation for that feature. + +@itemize @bullet +@item +Nothing has yet been tested exhaustively. Be cautious using PSPP to +make important decisions. 
+ +@item +@code{make check} fails on some systems that don't like the syntax. I'm +not sure why. If someone could make an attempt to track this down, it +would be appreciated. + +@item +PostScript driver bugs: + +@itemize @minus +@item +Does not support driver arguments `max-fonts-simult' or +`optimize-text-size'. + +@item +Minor problems with font-encodings. + +@item +Fails to align fonts along their baselines. + +@item +Does not support certain bizarre line intersections--should +never crop up in practice. + +@item +Does not gracefully substitute for existing fonts whose +encodings are missing. + +@item +Does not perform italic correction or left italic correction +on font changes. + +@item +Encapsulated PostScript is unimplemented. +@end itemize + +@item +ASCII driver bugs: + +@itemize @minus +@item +Does not support `infinite length' or `infinite width' paper. +@end itemize +@end itemize + +See below for information on reporting bugs not listed here. + +@node Contacting the Author, , Known bugs, Bugs +@section Contacting the Author + +The author can be contacted at e-mail address +@ifinfo +. +@end ifinfo +@iftex +@code{}. +@end iftex + +PSPP bug reports should be sent to +@ifinfo +. +@end ifinfo +@iftex +@code{}. 
+@end iftex +@setfilename ignored diff --git a/doc/command-index.texi b/doc/command-index.texi new file mode 100644 index 00000000..12ad8575 --- /dev/null +++ b/doc/command-index.texi @@ -0,0 +1,4 @@ +@node Command Index, Concept Index, Function Index, Top +@chapter Command Index +@printindex vr +@setfilename ignored diff --git a/doc/concept-index.texi b/doc/concept-index.texi new file mode 100644 index 00000000..06dfc78e --- /dev/null +++ b/doc/concept-index.texi @@ -0,0 +1,4 @@ +@node Concept Index, Installation, Command Index, Top +@chapter Concept Index +@printindex cp +@setfilename ignored diff --git a/doc/configuring.texi b/doc/configuring.texi new file mode 100644 index 00000000..ac7a518b --- /dev/null +++ b/doc/configuring.texi @@ -0,0 +1,1744 @@ +@node Configuration, Portable File Format, Installation, Top +@appendix Configuring PSPP +@cindex configuration +@cindex PSPP, configuring + +PSPP has dozens of configuration possibilities and hundreds of +settings. This is both a bane and a blessing. On one hand, it's +possible to easily accommodate diverse ranges of setups. But, on the +other, the multitude of possibilities can overwhelm the casual user. +Fortunately, the configuration mechanisms are profusely described in the +sections below@enddots{} + +@menu +* File locations:: How PSPP finds config files. +* Configuration techniques:: Many different methods of configuration@enddots{} +* Configuration files:: How configuration files are read. +* Environment variables:: All about environment variables. +* Output devices:: Describing your terminal(s) and printer(s). +* PostScript driver class:: Configuration of PostScript devices. +* ASCII driver class:: Configuration of character-code devices. +* HTML driver class:: Configuration for HTML output. +* Miscellaneous configuring:: Even more configuration variables. +* Improving output quality:: Hints for producing ever-more-lovely output. 
+@end menu + +@node File locations, Configuration techniques, Configuration, Configuration +@section Locating configuration files + +PSPP uses the same method to find most of its configuration files: + +@enumerate +@item +The @dfn{base name} of the file being sought is determined. + +@item +The path to search is determined. + +@item +Each directory in the search path, from left to right, is searched for a +file with the name of the base name. The first occurrence is read +as the configuration file. +@end enumerate + +The first two steps are elaborated below for the sake of our pedantic +friends. + +@enumerate +@item +A @dfn{base name} is a file name lacking an absolute directory +reference. Some examples of base names are: @file{ps-encodings}, +@file{devices}, @file{devps/DESC} (under UNIX), @file{devps\DESC} (under +M$ environments). + +Determining the base name is a two-step process: + +@enumerate a +@item +If the appropriate environment variable is defined, the value of that +variable is used (@pxref{Environment variables}). For instance, when +searching for the output driver initialization file, the variable +examined is @code{STAT_OUTPUT_INIT_FILE}. + +@item +Otherwise, the compiled-in default is used. For example, when searching +for the output driver initialization file, the default base name is +@file{devices}. +@end enumerate + +@strong{Please note:} If a user-specified base name does contain an +absolute directory reference, as in a file name like +@file{/home/pfaff/fonts/TR}, no path is searched---the file name is used +exactly as given---and the algorithm terminates. + +@item +The path is the first of the following that is defined: + +@itemize @bullet +@item +A variable definition for the path given in the user environment. This +is a PSPP-specific environment variable name; for instance, +@code{STAT_OUTPUT_INIT_PATH}. + +@item +In some cases, another, less-specific environment variable is checked. 
+For instance, when searching for font files, the PostScript driver first +checks for a variable with name @code{STAT_GROFF_FONT_PATH}, then for +one with name @code{GROFF_FONT_PATH}. (However, font searching has its +own list of esoteric search rules.) + +@item +The configuration file path, which is itself determined by the +following rules: + +@enumerate a +@item +If the command line contains an option of the form @samp{-B @var{path}} +or @samp{--config-dir=@var{path}}, then the value given on the +rightmost occurrence of such an option is used. + +@item +Otherwise, if the environment variable @code{STAT_CONFIG_PATH} is +defined, the value of that variable is used. + +@item +Otherwise, the compiled-in fallback default is used. On UNIX machines, +the default fallback path is + +@enumerate 1 +@item +@file{~/.pspp} + +@item +@file{/usr/local/lib/pspp} + +@item +@file{/usr/lib/pspp} +@end enumerate + +On DOS machines, the default fallback path is: + +@enumerate 1 +@item +All the paths from the DOS search path in the @samp{PATH} environment +variable, in left-to-right order. + +@item +@file{C:\PSPP}, as a last resort. +@end enumerate + +Note that the installer of PSPP can easily change this default +fallback path; thus the above should not be taken as gospel. +@end enumerate +@end itemize +@end enumerate + +As a final note: Under DOS, directories given in paths are delimited by +semicolons (@samp{;}); under UNIX, directories are delimited by colons +(@samp{:}). This corresponds with the standard path delimiter under +these OSes. + +@node Configuration techniques, Configuration files, File locations, Configuration +@section Configuration techniques + +There are many ways that PSPP can be configured. These are +described in the list below. Values given by earlier items take +precedence over those given by later items. + +@enumerate +@item +Syntax commands that modify settings, such as @cmd{SET}. @xref{SET}. + +@item +Command-line options. @xref{Invocation}. 
+ +@item +PSPP-specific environment variable contents. @xref{Environment +variables}. + +@item +General environment variable contents. @xref{Environment variables}. + +@item +Configuration file contents. @xref{Configuration files}. + +@item +Fallback defaults. +@end enumerate + +Some of the above may not apply to a particular setting. For instance, +the current pager (such as @samp{more}, @samp{most}, or @samp{less}) +cannot be determined by configuration file contents because there is no +appropriate configuration file. + +@node Configuration files, Environment variables, Configuration techniques, Configuration +@section Configuration files + +Most configuration files have a common form: + +@itemize @bullet +@item +Each line forms a separate command or directive. This means that lines +cannot be broken up, unless they are spliced together with a trailing +backslash, as described below. + +@item +Before anything else is done, trailing whitespace is removed. + +@item +When a line ends in a backslash (@samp{\}), the backslash is removed, +and the next line is read and appended to the current line. + +@itemize @minus +@item +Whitespace preceding the backslash is retained. + +@item +This rule continues to be applied until the line read does not end in a +backslash. + +@item +It is an error if the last line in the file ends in a backslash. +@end itemize + +@item +Comments are introduced by an octothorpe (@samp{#}), and continue until the +end of the line. + +@itemize @minus +@item +An octothorpe inside balanced pairs of double quotation marks (@samp{"}) +or single quotation marks (@samp{'}) does not introduce a comment. + +@item +The backslash character can be used inside balanced quotes of either +type to escape the following character as a literal character. + +(This is distinct from the use of a backslash as a line-splicing +character.) + +@item +Line splicing takes place before comment removal. 
+@end itemize + +@item +Blank lines, and lines that contain only whitespace, are ignored. +@end itemize + +@node Environment variables, Output devices, Configuration files, Configuration +@section Environment variables + +You may think the concept of environment variables is a fairly simple +one. However, the author of PSPP has found a way to complicate +even something so simple. Environment variables are further described +in the sections below: + +@menu +* Variable values:: Values of variables are determined this way. +* Environment substitutions:: How environment substitutions are made. +* Predefined variables:: A few variables are automatically defined. +@end menu + +@node Variable values, Environment substitutions, Environment variables, Environment variables +@subsection Values of environment variables + +Values for environment variables are obtained by the following means, +which are arranged in order of decreasing precedence: + +@enumerate +@item +Command-line options. @xref{Invocation}. + +@item +The @file{environment} configuration file---more on this below. + +@item +Actual environment variables (defined in the shell or other parent +process). +@end enumerate + +The @file{environment} configuration file is located through application +of the usual algorithm for configuration files (@pxref{File locations}), +except that its contents do not affect the search path used to find +@file{environment} itself. Use of @file{environment} is discouraged on +systems that allow an arbitrarily large environment; it is supported for +use on systems like MS-DOS that limit environment size. + +@file{environment} is composed of lines having the form +@samp{@var{key}=@var{value}}, where @var{key} and the equals sign +(@samp{=}) are required, and @var{value} is optional. If @var{value} is +given, variable @var{key} is given that value; if @var{value} is absent, +variable @var{key} is undefined (deleted). Variables may not be defined +with a null value. 
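+
+As an illustration only, a hypothetical @file{environment} file that
+follows the rules above might look like the sketch below. The values
+are assumptions chosen for the example, not defaults shipped with PSPP,
+and the comment lines assume the usual configuration file syntax:
+
+@example
+# Give STAT_CONFIG_PATH a value (hypothetical path).
+STAT_CONFIG_PATH=/usr/local/lib/pspp
+# Nothing after the equals sign: GROFF_FONT_PATH becomes undefined.
+GROFF_FONT_PATH=
+@end example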
+ +Environment substitutions are performed on each line in the file +(@pxref{Environment substitutions}). + +See @ref{Configuration files}, for more details on formatting of the +environment configuration file. + +@quotation +@strong{Please note:} Support for @file{environment} is not yet +implemented. +@end quotation + +@node Environment substitutions, Predefined variables, Variable values, Environment variables +@subsection Environment substitutions + +Much of the power of environment variables lies in the way that they may +be substituted into configuration files. Variable substitutions are +described below. + +The line is scanned from left to right. In this scan, all characters +other than dollar signs (@samp{$}) are retained unmolested. Dollar +signs, however, introduce an environment variable reference. References +take three forms: + +@table @code +@item $@var{var} +Replaced by the value of environment variable @var{var}, determined as +specified in @ref{Variable values}. @var{var} must be one of the +following: + +@itemize @bullet +@item +One or more letters. + +@item +Exactly one nonalphabetic character. This may not be a left brace +(@samp{@{}). +@end itemize + +@item $@{@var{var}@} +Same as above, but @var{var} may contain any character (except +@samp{@}}). + +@item $$ +Replaced by a single dollar sign. +@end table + +Undefined variables expand to an empty value. + +@node Predefined variables, , Environment substitutions, Environment variables +@subsection Predefined environment variables + +There are two environment variables predefined for use in environment +substitutions: + +@table @samp +@item VER +Defined as the version number of PSPP, as a string, in a format +something like @samp{0.9.4}. + +@item ARCH +Defined as the host architecture of PSPP, as a string, in standard +cpu-manufacturer-OS format. For instance, Debian GNU/Linux 1.1 on an +Intel machine defines this as @samp{i586-unknown-linux}. 
This is +somewhat dependent on the system used to compile PSPP. +@end table + +Nothing prevents these values from being overridden, although it's a +good idea not to do so. + +@node Output devices, PostScript driver class, Environment variables, Configuration +@section Output devices + +Configuring output devices is the most complicated aspect of configuring +PSPP. The output device configuration file is named +@file{devices}. It is searched for using the usual algorithm for +finding configuration files (@pxref{File locations}). Each line in the +file is read in the usual manner for configuration files +(@pxref{Configuration files}). + +Lines in @file{devices} are divided into three categories, described +briefly in the table below: + +@table @i +@item driver category definitions +Define a driver in terms of other drivers. + +@item macro definitions +Define environment variables local to the output driver +configuration file. + +@item device definitions +Describe the configuration of an output device. +@end table + +The following sections further elaborate the contents of the +@file{devices} file. + +@menu +* Driver categories:: How to organize the driver namespace. +* Macro definitions:: Environment variables local to @file{devices}. +* Device definitions:: Output device descriptions. +* Dimensions:: Lengths, widths, sizes, @enddots{} +* papersize:: Letter, legal, A4, envelope, @enddots{} +* Distinguishing line types:: Details on @file{devices} parsing. +* Tokenizing lines:: Dividing @file{devices} lines into tokens. +@end menu + +@node Driver categories, Macro definitions, Output devices, Output devices +@subsection Driver categories + +Drivers can be divided into categories. Drivers are specified by their +names, or by the names of the categories that they are contained in. +Only certain drivers are enabled each time PSPP is run; by +default, these are the drivers in the category `default'. 
To enable a +different set of drivers, use the @samp{-o @var{device}} command-line +option (@pxref{Invocation}). + +Categories are specified with a line of the form +@samp{@var{category}=@var{driver1} @var{driver2} @var{driver3} @var{@dots{}} +@var{driver@var{n}}}. This line specifies that the category +@var{category} is composed of drivers named @var{driver1}, +@var{driver2}, and so on. There may be any number of drivers in the +category, from zero on up. + +Categories may also be specified on the command line +(@pxref{Invocation}). + +This is all you need to know about categories. If you're still curious, +read on. + +First of all, the term `categories' is a bit of a misnomer. In fact, +the internal representation is nothing like the hierarchy that the term +seems to imply: a linear list is used to keep track of the enabled +drivers. + +When PSPP first begins reading @file{devices}, this list contains +the name of any drivers or categories specified on the command line, or +the single item `default' if none were specified. + +Each time a category definition is specified, the list is searched for +an item with the value of @var{category}. If a matching item is found, +it is deleted. If there was a match, the list of drivers (@var{driver1} +through @var{driver@var{n}}) is then appended to the list. + +Each time a driver definition line is encountered, the list is searched. +If the list contains an item with that driver's name, the driver is +enabled and the item is deleted from the list. Otherwise, the driver +is not enabled. + +It is an error if the list is not empty when the end of @file{devices} +is reached. + +@node Macro definitions, Device definitions, Driver categories, Output devices +@subsection Macro definitions + +Macro definitions take the form @samp{define @var{macroname} +@var{definition}}. In such a macro definition, the environment variable +@var{macroname} is defined to expand to the value @var{definition}. 
+Before the definition is made, however, any macros used in +@var{definition} are expanded. + +Please note the following nuances of macro usage: + +@itemize @bullet +@item +For the purposes of this section, @dfn{macro} and @dfn{environment +variable} are synonyms. + +@item +Macros may not take arguments. + +@item +Macros may not recurse. + +@item +Macros are just environment variable definitions like other environment +variable definitions, with the exception that they are limited in scope +to the @file{devices} configuration file. + +@item +Macros override all other environment variables of the same name (within +the scope of @file{devices}). + +@item +Earlier macro definitions for a particular @var{key} override later +ones. In particular, macro definitions on the command line override +those in the device definition file. @xref{Non-option Arguments}. + +@item +There are two predefined macros, whose values are determined at runtime: + +@table @samp +@item viewwidth +Defined as the width of the console screen, in columns of text. + +@item viewlength +Defined as the length of the console screen, in lines of text. +@end table +@end itemize + +@node Device definitions, Dimensions, Macro definitions, Output devices +@subsection Driver definitions + +Driver definitions are the ultimate purpose of the @file{devices} +configuration file. These are where the real action is. Driver +definitions tell PSPP where it should send its output. + +Each driver definition line is divided into four fields. These fields +are delimited by colons (@samp{:}). Each line is subjected to +environment variable interpolation before it is processed further +(@pxref{Environment substitutions}). From left to right, the four +fields are, in brief: + +@table @i +@item driver name +A unique identifier, used to determine whether to enable the driver. + +@item class name +One of the predefined driver classes supported by PSPP. The +currently supported driver classes include `postscript' and `ascii'. 
+ +@item device type(s) +Zero or more of the following keywords, delimited by spaces: + +@table @code +@item screen + +Indicates that the device is a screen display. This may reduce the +amount of buffering done by the driver, to make interactive use more +convenient. + +@item printer + +Indicates that the device is a printer. + +@item listing + +Indicates that the device is a listing file. +@end table + +These options are just hints to PSPP and do not cause the output to be +directed to the screen, or to the printer, or to a listing file---those +must be set elsewhere in the options. They are used primarily to decide +which devices should be enabled at any given time. @xref{SET}, for more +information. + +@item options +An optional set of options to pass to the driver itself. The exact +format for the options varies among drivers. +@end table + +The driver is enabled if: + +@enumerate +@item +Its driver name is specified on the command line, or + +@item +It's in a category specified on the command line, or + +@item +If no categories or driver names are specified on the command line, it +is in category @code{default}. +@end enumerate + +For more information on driver names, see @ref{Driver categories}. + +The class name must be one of those supported by PSPP. The +classes supported depend on the options with which PSPP was +compiled. See later sections in this chapter for descriptions of the +available driver classes. + +Options are dependent on the driver. See the driver descriptions for +details. + +@node Dimensions, papersize, Device definitions, Output devices +@subsection Dimensions + +Quite often in configuration it is necessary to specify a length or a +size. PSPP uses a common syntax for all such, calling them +collectively by the name @dfn{dimensions}. + +@itemize @bullet +@item +You can specify dimensions in decimal form (@samp{12.5}) or as +fractions, either as mixed numbers (@samp{12-1/2}) or raw fractions +(@samp{25/2}). 
+ +@item +A number of different units are available. These are suffixed to the +numeric part of the dimension. There must be no spaces between the +number and the unit. The available units are identical to those offered +by the popular typesetting system @TeX{}: + +@table @code +@item in +inch (1 @code{in} = 2.54 @code{cm}) + +@item " +inch (1 @code{in} = 2.54 @code{cm}) + +@item pt +printer's point (1 @code{in} = 72.27 @code{pt}) + +@item pc +pica (12 @code{pt} = 1 @code{pc}) + +@item bp +PostScript point (1 @code{in} = 72 @code{bp}) + +@item cm +centimeter + +@item mm +millimeter (10 @code{mm} = 1 @code{cm}) + +@item dd +didot point (1157 @code{dd} = 1238 @code{pt}) + +@item cc +cicero (1 @code{cc} = 12 @code{dd}) + +@item sp +scaled point (65536 @code{sp} = 1 @code{pt}) +@end table + +@item +If no explicit unit is given, PSPP attempts to guess the best unit: + +@itemize @minus +@item +Numbers less than 50 are assumed to be in inches. + +@item +Numbers 50 or greater are assumed to be in millimeters. +@end itemize +@end itemize + +@node papersize, Distinguishing line types, Dimensions, Output devices +@subsection Paper sizes + +Output drivers usually deal with some sort of hardcopy media. This +media is called @dfn{paper} by the drivers, though in reality it could +be a transparency or film or thinly veiled sarcasm. To make it easier +for you to deal with paper, PSPP allows you to have (of course!) a +configuration file that gives symbolic names, like ``letter'' or +``legal'' or ``a4'', to paper sizes, rather than forcing you to use +cryptic numbers like ``8-1/2 x 11'' or ``210 by 297''. Surprisingly +enough, this configuration file is named @file{papersize}. +@xref{Configuration files}. + +When PSPP tries to connect a symbolic paper name to a paper size, it +reads and parses each non-comment line in the file, in order. The first +field on each line must be a symbolic paper name in double quotes. +Paper names may not contain double quotes. 
Paper names are not +case-sensitive: @samp{legal} and @samp{Legal} are equivalent. + +If a match is found for the paper name, the rest of the line is parsed. +If it is found to be a pair of dimensions (@pxref{Dimensions}) separated +by either @samp{x} or @samp{by}, then those are taken to be the paper +size, in order of width followed by length. There @emph{must} be at +least one space on each side of @samp{x} or @samp{by}. + +Otherwise the line must be of the form +@samp{"@var{paper-1}"="@var{paper-2}"}. In this case the target of the +search becomes paper name @var{paper-2} and the search through the file +continues. + +@node Distinguishing line types, Tokenizing lines, papersize, Output devices +@subsection How lines are divided into types + +The lines in @file{devices} are distinguished in the following manner: + +@enumerate +@item +Leading whitespace is removed. + +@item +If the resulting line begins with the exact string @code{define}, +followed by one or more whitespace characters, the line is processed as +a macro definition. + +@item +Otherwise, the line is scanned for the first instance of a colon +(@samp{:}) or an equals sign (@samp{=}). + +@item +If a colon is encountered first, the line is processed as a driver +definition. + +@item +Otherwise, if an equals sign is encountered, the line is processed as a +macro definition. + +@item +Otherwise, the line is ill-formed. +@end enumerate + +@node Tokenizing lines, , Distinguishing line types, Output devices +@subsection How lines are divided into tokens + +Each driver definition line is run through a simple tokenizer. This +tokenizer recognizes two basic types of tokens. + +The first type is an equals sign (@samp{=}). Equals signs are both +delimiters between tokens and tokens in themselves. + +The second type is an identifier or string token. Identifiers and +strings are equivalent after tokenization, though they are written +differently. 
An identifier is any string of characters other than +whitespace or equals signs. + +A string is introduced by a single- or double-quote character (@samp{'} +or @samp{"}) and, in general, continues until the next occurrence of +that same character. The following standard C escapes can also be +embedded within strings: + +@table @code +@item \' +A single-quote (@samp{'}). + +@item \" +A double-quote (@samp{"}). + +@item \? +A question mark (@samp{?}). Included for hysterical raisins. + +@item \\ +A backslash (@samp{\}). + +@item \a +Audio bell (ASCII 7). + +@item \b +Backspace (ASCII 8). + +@item \f +Formfeed (ASCII 12). + +@item \n +New-line (ASCII 10). + +@item \r +Carriage return (ASCII 13). + +@item \t +Tab (ASCII 9). + +@item \v +Vertical tab (ASCII 11). + +@item \@var{o}@var{o}@var{o} +Each @samp{o} must be an octal digit. The character is the one having +the octal value specified. Any number of octal digits is read and +interpreted; only the lower 8 bits are used. + +@item \x@var{h}@var{h} +Each @samp{h} must be a hex digit. The character is the one having the +hexadecimal value specified. Any number of hex digits is read and +interpreted; only the lower 8 bits are used. +@end table + +Tokens, outside of quoted strings, are delimited by whitespace or equals +signs. + +@node PostScript driver class, ASCII driver class, Output devices, Configuration +@section The PostScript driver class + +The @code{postscript} driver class is used to produce output that is +acceptable to PostScript printers and to PC-based PostScript +interpreters such as Ghostscript. Continuing a long tradition, +PSPP's PostScript driver is configurable to the point of +absurdity. + +There are actually two PostScript drivers. The first one, +@samp{postscript}, produces ordinary DSC-compliant PostScript output. +The second one, @samp{epsf}, produces an Encapsulated PostScript file. +The two drivers are otherwise identical in configuration and in +operation.
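+
+As an illustration only, the two lines below sketch how one driver of
+each class might be declared in @file{devices}. The driver names
+@samp{ps-file} and @samp{eps-file} and the output file names are
+hypothetical. The four colon-delimited fields follow the layout given
+for driver definitions (@pxref{Device definitions}), and the options
+used are ones documented for the PostScript driver class:
+
+@example
+ps-file:postscript:listing:output-file="pspp.ps" paper-size=a4
+eps-file:epsf:listing:output-file="pspp.eps"
+@end example
+
+Only the class name field differs in kind between the two lines, since
+both classes accept the same options.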
+ +The PostScript driver is described in further detail below. + +@menu +* PS output options:: Output file options. +* PS page options:: Paper, margins, scaling & rotation, more! +* PS file options:: Configuration files. +* PS font options:: Default fonts, font options. +* PS line options:: Line widths, options. +* Prologue:: Details on the PostScript prologue. +* Encodings:: Details on PostScript font encodings. +@end menu + +@node PS output options, PS page options, PostScript driver class, PostScript driver class +@subsection PostScript output options + +These options deal with the form of the output and the output file +itself: + +@table @code +@item output-file=@var{filename} + +File to which output should be sent. This can be an ordinary filename +(e.g., @code{"pspp.ps"}), a pipe filename (e.g., @code{"|lpr"}), or +stdout (@code{"-"}). Default: @code{"pspp.ps"}. + +@item color=@var{boolean} + +Most of the time black-and-white PostScript devices are smart enough to +map colors to shades themselves. However, you can cause the PSPP +output driver to do an ugly simulation of this itself by +turning @code{color} off. Default: @code{on}. + +This is a boolean setting, as are many settings in the PostScript +driver. Valid positive boolean values are @samp{on}, @samp{true}, +@samp{yes}, and nonzero integers. Negative boolean values are +@samp{off}, @samp{false}, @samp{no}, and zero. + +@item data=@var{data-type} + +One of @code{clean7bit}, @code{clean8bit}, or @code{binary}. This +controls what characters will be written to the output file. PostScript +produced with @code{clean7bit} can be transmitted over 7-bit +transmission channels that use ASCII control characters for line +control. @code{clean8bit} is similar but allows characters above 127 to +be written to the output file. @code{binary} allows any character in +the output file. Default: @code{clean7bit}. + +@item line-ends=@var{line-end-type} + +One of @code{cr}, @code{lf}, or @code{crlf}.
This controls what is used +for new-line in the output file. Default: @code{cr}. + +@item optimize-line-size=@var{level} + +Either @code{0} or @code{1}. If @var{level} is @code{1}, then short +line segments will be collected and merged into longer ones. This +reduces output file size but requires more time and memory. A +@var{level} of @code{0} has the advantage of being better for +interactive environments. @code{1} is the default unless the +@code{screen} flag is set; in that case, the default is @code{0}. + +@item optimize-text-size=@var{level} + +One of @code{0}, @code{1}, or @code{2}, each higher level representing +correspondingly more aggressive space savings for text in the output +file and requiring correspondingly more time and memory. Unfortunately, +all of the levels presently behave the same. @code{1} is the default unless +the @code{screen} flag is set; in that case, the default is @code{0}. +@end table + +@node PS page options, PS file options, PS output options, PostScript driver class +@subsection PostScript page options + +These options affect page setup: + +@table @code +@item headers=@var{boolean} + +Controls whether the standard headers showing the time and date and +title and subtitle are printed at the top of each page. Default: +@code{on}. + +@item paper-size=@var{paper-size} + +Paper size, either as a symbolic name (e.g., @code{letter} or @code{a4}) +or specific measurements (e.g., @code{8-1/2x11} or @code{"210 x 297"}). +@xref{papersize, , Paper sizes}. Default: @code{letter}. + +@item orientation=@var{orientation} + +Either @code{portrait} or @code{landscape}. Default: @code{portrait}. + +@item left-margin=@var{dimension} +@itemx right-margin=@var{dimension} +@itemx top-margin=@var{dimension} +@itemx bottom-margin=@var{dimension} + +Sets the margins around the page. The headers, if enabled, are not +included in the margins; they are in addition to the margins. For a +description of dimensions, see @ref{Dimensions}. Default: @code{0.5in}.
+ +@end table + +@node PS file options, PS font options, PS page options, PostScript driver class +@subsection PostScript file options + +Oh, my. You don't really want to know about the way that the PostScript +driver deals with files, do you? Well, I suppose you're entitled, but I +warn you right now: it's not pretty. Here goes@enddots{} + +First let's look at the options that are available: + +@table @code + +@item font-dir=@var{font-directory} + +Sets the font directory. Default: @code{devps}. + +@item prologue-file=@var{prologue-file-name} + +Sets the name of the PostScript prologue file. You can write your own +prologue, though I have no idea why you'd want to: see @ref{Prologue}. +Default: @code{ps-prologue}. + +@item device-file=@var{device-file-name} + +Sets the name of the Groff-format device description file. The +PostScript driver reads this to know about the scaling of fonts +and so on. The format of such files is described in groff_font(5), +included with Groff. Default: @code{DESC}. + +@item encoding-file=@var{encoding-file-name} + +Sets the name of the encoding file. This file contains a list of all +font encodings that will be needed so that the driver can put all of +them at the top of the prologue. @xref{Encodings}. Default: +@code{ps-encodings}. + +If the specified encoding file cannot be found, this error will be +silently ignored, since most people do not need any encodings besides +the ones that can be found using @code{auto-encode}, described below. + +@item auto-encode=@var{boolean} + +When enabled, the font encodings needed by the default proportional- and +fixed-pitch fonts will automatically be dumped to the PostScript +output. Otherwise, it is assumed that the user has an encoding file +and knows how to use it (@pxref{Encodings}). There is probably no good +reason to turn off this convenient feature. Default: @code{on}. + +@end table + +Next I suppose it's time to describe the search algorithm.
When the +PostScript driver needs a file, whether that file be a font, a +PostScript prologue, or what you will, it searches in this manner: + +@enumerate + +@item +Constructs a path by taking the first of the following that is defined: + +@enumerate a + +@item +Environment variable @code{STAT_GROFF_FONT_PATH}. @xref{Environment +variables}. + +@item +Environment variable @code{GROFF_FONT_PATH}. + +@item +The compiled-in fallback default. +@end enumerate + +@item +Constructs a base name by concatenating, in order, the font directory, +a path separator (@samp{/} or @samp{\}), and the file to be found. A +typical base name would be something like @code{devps/ps-encodings}. + +@item +Searches for the base name in the path constructed above. If the file +is found, the algorithm terminates. + +@item +Searches for the base name in the standard configuration path. See +@ref{File locations}, for more details. If the file is found, the +algorithm terminates. + +@item +Removes the font directory and path separator from the +base name. Now the base name is simply the file to be found, e.g., +@code{ps-encodings}. + +@item +Searches for the base name in the path constructed in the first step. +If the file is found, the algorithm terminates. + +@item +Searches for the base name in the standard configuration path. If the +file is found, the algorithm terminates. + +@item +The algorithm terminates unsuccessfully. +@end enumerate + +So, as you see, there are several ways to configure the PostScript +drivers. Careful selection of techniques can make the configuration +very flexible indeed. + +@node PS font options, PS line options, PS file options, PostScript driver class +@subsection PostScript font options + +The list of available font options is short and sweet: + +@table @code +@item prop-font=@var{font-name} + +Sets the default proportional font. The name should be that of a +PostScript font. Default: @code{"Helvetica"}.
+ +@item fixed-font=@var{font-name} + +Sets the default fixed-pitch font. The name should be that of a +PostScript font. Default: @code{"Courier"}. + +@item font-size=@var{font-size} + +Sets the size of the default fonts, in thousandths of a point. Default: +@code{10000}. + +@end table + +@node PS line options, Prologue, PS font options, PostScript driver class +@subsection PostScript line options + +Most tables contain lines, or rules, between cells. Some features of +the way that lines are drawn in PostScript tables are user-definable: + +@table @code + +@item line-style=@var{style} + +Sets the style of the lines that divide tables into sections. +@var{style} must be either @code{thick}, in which case thick lines are +used, or @code{double}, in which case double lines are used. Default: +@code{thick}. + +@item line-gutter=@var{dimension} + +Sets the line gutter, which is the amount of whitespace on either side +of lines that border text or graphics objects. @xref{Dimensions}. +Default: @code{0.5pt}. + +@item line-spacing=@var{dimension} + +Sets the line spacing, which is the amount of whitespace that separates +lines that are side by side, as in a double line. Default: +@code{0.5pt}. + +@item line-width=@var{dimension} + +Sets the width of a typical line used in tables. Default: @code{0.5pt}. + +@item line-width-thick=@var{dimension} + +Sets the width of a thick line used in tables. Not used if +@code{line-style} is set to @code{thick}. Default: @code{1.5pt}. + +@end table + +@node Prologue, Encodings, PS line options, PostScript driver class +@subsection The PostScript prologue + +Most PostScript files that are generated mechanically by programs +consist of two parts: a prologue and a body. The prologue is generally +a collection of boilerplate. Only the body differs greatly between +two outputs from the same program. + +This is also the strategy used in the PSPP PostScript driver. In +general, the prologue supplied with PSPP will be more than sufficient.
+In this case, you will not need to read the rest of this section. +However, hackers might want to know more. Read on, if you fall into +this category. + +The prologue is dumped into the output stream essentially unmodified. +However, two actions are performed on its lines. First, certain lines +may be omitted as specified in the prologue file itself. Second, +variables are substituted. + +The following lines are omitted: + +@enumerate +@item +All lines that contain three bangs in a row (@code{!!!}). + +@item +Lines that contain @code{!eps}, if the PostScript driver is producing +ordinary PostScript output. Otherwise an EPS file is being produced, +and the line is included in the output, although everything following +@code{!eps} is deleted. + +@item +Lines that contain @code{!ps}, if the PostScript driver is producing EPS +output. Otherwise, ordinary PostScript is being produced, and the line +is included in the output, although everything following @code{!ps} is +deleted. +@end enumerate + +The following are the variables that are substituted. Only the +variables listed are substituted; environment variables are not. +@xref{Environment substitutions}. + +@table @code +@item bounding-box + +The page bounding box, in points, as four space-separated numbers. For +U.S. letter size paper, this is @samp{0 0 612 792}. + +@item creator + +PSPP version as a string: @samp{GNU PSPP 0.1b}, for example. + +@item date + +Date the file was created. Example: @samp{Tue May 21 13:46:22 1991}. + +@item data + +Value of the @code{data} PostScript driver option, as one of the strings +@samp{Clean7Bit}, @samp{Clean8Bit}, or @samp{Binary}. + +@item orientation + +Page orientation, as one of the strings @code{Portrait} or +@code{Landscape}. + +@item user + +Under multiuser OSes, the user's login name, taken either from the +environment variable @code{LOGNAME} or, if that fails, the result of the +C library function @code{getlogin()}. Defaults to @samp{nobody}. 
+ +@item host + +System hostname as reported by @code{gethostname()}. Defaults to +@samp{nowhere}. + +@item prop-font + +Name of the default proportional font, prefixed by the word +@samp{font} and a space. Example: @samp{font Times-Roman}. + +@item fixed-font + +Name of the default fixed-pitch font, prefixed by the word @samp{font} +and a space. + +@item scale-factor + +The page scaling factor as a floating-point number. Example: +@code{1.0}. Note that this is also passed as an argument to the BP +macro. + +@item paper-length +@itemx paper-width + +The paper length and paper width, respectively, in thousandths of a +point. Note that these are also passed as arguments to the BP macro. + +@item left-margin +@itemx top-margin + +The left margin and top margin, respectively, in thousandths of a +point. Note that these are also passed as arguments to the BP macro. + +@item title + +Document title as a string. This is not the title specified in the +PSPP syntax file. A typical title is the word @samp{PSPP} followed +by the syntax file name in parentheses. Example: @samp{PSPP +()}. + +@item source-file + +PSPP syntax file name. Example: @samp{mary96/first.stat}. + +@end table + +Any other questions about the PostScript prologue can best be answered +by examining the default prologue or the PSPP source. + +@node Encodings, , Prologue, PostScript driver class +@subsection PostScript encodings + +PostScript fonts often contain many more than 256 characters, in order +to accommodate foreign language characters and special symbols. +PostScript uses @dfn{encodings} to map these onto single-byte symbol +sets. Each font can have many different encodings applied to it. + +PSPP's PostScript driver needs to know which encoding to apply to each +font. It can determine this from the information encapsulated in the +Groff font description that it reads.
However, there is an additional +problem---for efficiency, the PostScript driver needs to have a complete +list of all encodings that will be used in the entire session @emph{when +it opens the output file}. For this reason, it can't use the +information built into the fonts because it doesn't know which fonts +will be used. + +As a stopgap solution, there are two mechanisms for specifying which +encodings will be used. The first mechanism is automatic and it is the +only one that most PSPP users will ever need. The second mechanism is +manual, but it is more flexible. Either mechanism or both may be used +at one time. + +The first mechanism is activated by the @samp{auto-encode} driver option +(@pxref{PS file options}). When enabled, @samp{auto-encode} causes the +PostScript driver to include the encodings used by the default +proportional and fixed-pitch fonts (@pxref{PS font options}). Many +PSPP output files will only need these encodings. + +The second mechanism is the file specified by the @samp{encoding-file} +option (@pxref{PS file options}). If it exists, this file must consist +of lines in PSPP configuration-file format (@pxref{Configuration +files}). Each line that is not a comment should name a PostScript +encoding to include in the output. + +It is not an error if an encoding is included more than once, by either +mechanism. It will appear only once in the output. It is also not an +error if an encoding is included in the output but never used. It +@emph{is} an error if an encoding is used but not included by one of +these mechanisms. In this case, the built-in PostScript encoding +@samp{ISOLatin1Encoding} is substituted. + +@node ASCII driver class, HTML driver class, PostScript driver class, Configuration +@section The ASCII driver class + +The ASCII driver class produces output that can be displayed on a +terminal or output to printers. All of its options are highly +configurable. The ASCII driver has class name @samp{ascii}. 
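+
+For instance, a @file{devices} entry along the following lines would
+declare an ASCII listing driver. This is only a sketch: the driver name
+@samp{list-file} is hypothetical, the four colon-delimited fields follow
+the layout given for driver definitions (@pxref{Device definitions}),
+and the options shown are among those documented for this driver class.
+
+@example
+list-file:ascii:listing:output-file="pspp.list" length=66 width=80
+@end example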
+ +The ASCII driver is described in further detail below. + +@menu +* ASCII output options:: Output file options. +* ASCII page options:: Page size, margins, more. +* ASCII font options:: Box character, bold & italics. +@end menu + +@node ASCII output options, ASCII page options, ASCII driver class, ASCII driver class +@subsection ASCII output options + +@table @code +@item output-file=@var{filename} + +File to which output should be sent. This can be an ordinary filename +(e.g., @code{"pspp.txt"}), a pipe filename (e.g., @code{"|lpr"}), or +stdout (@code{"-"}). Default: @code{"pspp.list"}. + +@item char-set=@var{char-set-type} + +One of @samp{ascii} or @samp{latin1}. This has no effect on output at +the present time. Default: @code{ascii}. + +@item form-feed-string=@var{form-feed-value} + +The string written to the output to cause a formfeed. See also +@code{paginate}, described below, for a related setting. Default: +@code{"\f"}. + +@item newline-string=@var{new-line-value} + +The string written to the output to cause a new-line (carriage return +plus linefeed). The default, which can be specified explicitly with +@code{newline-string=default}, is to use the system-dependent new-line +sequence by opening the output file in text mode. This is usually the +right choice. + +However, @code{newline-string} can be set to any string. When this is +done, the output file is opened in binary mode. + +@item paginate=@var{boolean} + +If set, a formfeed (as set in @code{form-feed-string}, described above) +will be written to the device after every page. Default: @code{on}. + +@item tab-width=@var{tab-width-value} + +The distance between tab stops for this device. If set to 0, tabs will +not be used in the output. Default: @code{8}. + +@item init=@var{initialization-string} + +String written to the device before anything else, at the beginning of +the output. Default: @code{""} (the empty string). + +@item done=@var{finalization-string}
+ +String written to the device after everything else, at the end of the +output. Default: @code{""} (the empty string). +@end table + +@node ASCII page options, ASCII font options, ASCII output options, ASCII driver class +@subsection ASCII page options + +These options affect page setup: + +@table @code +@item headers=@var{boolean} + +If enabled, two lines of header information giving title and subtitle, +page number, date and time, and PSPP version are printed at the top of +every page. These two lines are in addition to any top margin +requested. Default: @code{on}. + +@item length=@var{line-count} + +Physical length of a page, in lines. Headers and margins are subtracted +from this value. Default: @code{66}. + +@item width=@var{character-count} + +Physical width of a page, in characters. Margins are subtracted from +this value. Default: @code{130}. + +@item lpi=@var{lines-per-inch} + +Number of lines per vertical inch. Not currently used. Default: @code{6}. + +@item cpi=@var{characters-per-inch} + +Number of characters per horizontal inch. Not currently used. Default: +@code{10}. + +@item left-margin=@var{left-margin-width} + +Width of the left margin, in characters. PSPP subtracts this value +from the page width. Default: @code{0}. + +@item right-margin=@var{right-margin-width} + +Width of the right margin, in characters. PSPP subtracts this value +from the page width. Default: @code{0}. + +@item top-margin=@var{top-margin-lines} + +Length of the top margin, in lines. PSPP subtracts this value from +the page length. Default: @code{2}. + +@item bottom-margin=@var{bottom-margin-lines} + +Length of the bottom margin, in lines. PSPP subtracts this value from +the page length. Default: @code{2}. 
+ +@end table + +@node ASCII font options, , ASCII page options, ASCII driver class +@subsection ASCII font options + +These are the ASCII font options: + +@table @code +@item box[@var{line-type}]=@var{box-chars} + +The characters used for lines in tables produced by the ASCII driver can +be changed using this option. @var{line-type} is used to indicate which +type of line to change; @var{box-chars} is the character or string of +characters to use for this type of line. + +@var{line-type} must be a 4-digit number in base 4. The digits are in +the order `right', `bottom', `left', `top'. The four possibilities for +each digit are: + +@table @asis +@item 0 +No line. + +@item 1 +Single line. + +@item 2 +Double line. + +@item 3 +Special device-defined line, if one is available; otherwise, a double +line. +@end table + +Examples: + +@table @code +@item box[0101]="|" + +Sets @samp{|} as the character to use for a single-width line with +bottom and top components. + +@item box[2222]="#" + +Sets @samp{#} as the character to use for the intersection of four +double-width lines, one each from the top, bottom, left and right. + +@item box[1100]="\xda" + +Sets @samp{"\xda"}, which under MS-DOS is a box character suitable for +the top-left corner of a box, as the character for the intersection of +two single-width lines, one each from the right and bottom. 
+ +@end table + +Defaults: + +@itemize @bullet +@item +@code{box[0000]=" "} + +@item +@code{box[1000]="-"} +@*@code{box[0010]="-"} +@*@code{box[1010]="-"} + +@item +@code{box[0100]="|"} +@*@code{box[0001]="|"} +@*@code{box[0101]="|"} + +@item +@code{box[2000]="="} +@*@code{box[0020]="="} +@*@code{box[2020]="="} + +@item +@code{box[0200]="#"} +@*@code{box[0002]="#"} +@*@code{box[0202]="#"} + +@item +@code{box[3000]="="} +@*@code{box[0030]="="} +@*@code{box[3030]="="} + +@item +@code{box[0300]="#"} +@*@code{box[0003]="#"} +@*@code{box[0303]="#"} + +@item +For all others, @samp{+} is used unless there are double lines or +special lines, in which case @samp{#} is used. +@end itemize + +@item italic-on=@var{italic-on-string} + +Character sequence written to turn on italics or underline printing. If +this is set to @code{overstrike}, then the driver will simulate +underlining by overstriking with underscore characters (@samp{_}) in the +manner described by @code{overstrike-style} and +@code{carriage-return-style}. Default: @code{overstrike}. + +@item italic-off=@var{italic-off-string} + +Character sequence to turn off italics or underline printing. Default: +@code{""} (the empty string). + +@item bold-on=@var{bold-on-string} + +Character sequence written to turn on bold or emphasized printing. If +set to @code{overstrike}, then the driver will simulate bold printing +by overstriking characters in the manner described by +@code{overstrike-style} and @code{carriage-return-style}. Default: +@code{overstrike}. + +@item bold-off=@var{bold-off-string} + +Character sequence to turn off bold or emphasized printing. Default: +@code{""} (the empty string). + +@item bold-italic-on=@var{bold-italic-on-string} + +Character sequence written to turn on bold-italic printing.
If set to +@code{overstrike}, then the driver will simulate bold-italics by +overstriking twice, once with the character, a second time with an +underscore (@samp{_}) character, in the manner described by +@code{overstrike-style} and @code{carriage-return-style}. Default: +@code{overstrike}. + +@item bold-italic-off=@var{bold-italic-off-string} + +Character sequence to turn off bold-italic printing. Default: @code{""} +(the empty string). + +@item overstrike-style=@var{overstrike-option} + +Either @code{single} or @code{line}: + +@itemize @bullet +@item +If @code{single} is selected, then, to overstrike a line of text, the +output driver will output a character, backspace, overstrike, output a +character, backspace, overstrike, and so on along a line. + +@item +If @code{line} is selected then the output driver will output an entire +line, then backspace or emit a carriage return (as indicated by +@code{carriage-return-style}), then overstrike the entire line at once. +@end itemize + +@code{single} is recommended for use with ttys and programs that +understand overstriking in text files, such as the pager @code{less}. +@code{single} will also work with printer devices but results in rapid +back-and-forth motions of the printhead that can cause the printer to +physically overheat! + +@code{line} is recommended for use with printer devices. Most programs +that understand overstriking in text files will not properly deal with +@code{line} mode. + +Default: @code{single}. + +@item carriage-return-style=@var{carriage-return-type} + +Either @code{bs} or @code{cr}. This option applies only when one or +more of the font commands is set to @code{overstrike} and, at the same +time, @code{overstrike-style} is set to @code{line}. + +@itemize @bullet +@item +If @code{bs} is selected then the driver will return to the beginning of +a line by emitting a sequence of backspace characters (ASCII 8). 
+ +@item +If @code{cr} is selected then the driver will return to the beginning of +a line by emitting a single carriage-return character (ASCII 13). +@end itemize + +Although @code{cr} is preferred as being more compact, @code{bs} is more +general since some devices do not interpret carriage returns in the +desired manner. Default: @code{bs}. +@end table + +@node HTML driver class, Miscellaneous configuring, ASCII driver class, Configuration +@section The HTML driver class + +The @code{html} driver class is used to produce output for viewing in +tables-capable web browsers such as Emacs' w3-mode. Its configuration +is very simple. Currently, the output has a very plain format. In the +future, further work may be done on improving the output appearance. + +There are few options for use with the @code{html} driver class: + +@table @code +@item output-file=@var{filename} + +File to which output should be sent. This can be an ordinary filename +(e.g., @code{"pspp.ps"}), a pipe filename (e.g., @code{"|lpr"}), or +stdout (@code{"-"}). Default: @code{"pspp.html"}. + +@item prologue-file=@var{prologue-file-name} + +Sets the name of the HTML prologue file. You can write your own +prologue if you want to customize colors or other settings: see +@ref{HTML Prologue}. Default: @code{html-prologue}. +@end table + +@menu +* HTML Prologue:: Format of the HTML prologue file. +@end menu + +@node HTML Prologue, , HTML driver class, HTML driver class +@subsection The HTML prologue + +HTML files that are generated by PSPP consist of two parts: a prologue +and a body. The prologue is a collection of boilerplate. Only the body +differs greatly between two outputs. You can tune the colors and other +attributes of the output by editing the prologue. + +The prologue is dumped into the output stream essentially unmodified. +However, two actions are performed on its lines. First, certain lines +may be omitted as specified in the prologue file itself. Second, +variables are substituted.
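The first of these two actions, line omission, can be sketched in C using the rules listed just below. This is only an illustration, not the driver's actual code: the function names are hypothetical, the caller is assumed to know whether a title and subtitle are set, and removing every occurrence of a marker (rather than only the first) is an assumption. Variable substitution is left out because its syntax is not described here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Removes every occurrence of TAG from LINE, in place.
   (Hypothetical helper; TAG must not be empty.) */
void
strip_marker (char *line, const char *tag)
{
  size_t n = strlen (tag);
  char *p;

  while ((p = strstr (line, tag)) != NULL)
    memmove (p, p + n, strlen (p + n) + 1);
}

/* Returns true if LINE should be copied to the output, after
   removing any !title or !subtitle marker from it.  Returns false
   if the line must be omitted. */
bool
keep_prologue_line (char *line, bool has_title, bool has_subtitle)
{
  /* Lines containing three bangs in a row are always dropped. */
  if (strstr (line, "!!!") != NULL)
    return false;

  /* !title lines survive only when a title is set. */
  if (strstr (line, "!title") != NULL)
    {
      if (!has_title)
        return false;
      strip_marker (line, "!title");
    }

  /* Likewise for !subtitle lines. */
  if (strstr (line, "!subtitle") != NULL)
    {
      if (!has_subtitle)
        return false;
      strip_marker (line, "!subtitle");
    }

  return true;
}
```

A caller would pass each prologue line through `keep_prologue_line` and write the (possibly shortened) line only when the function returns true.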
+ +The following lines are omitted: + +@enumerate +@item +All lines that contain three bangs in a row (@code{!!!}). + +@item +Lines that contain @code{!title}, if no title is set for the output. If +a title is set, then the characters @code{!title} are removed before the +line is output. + +@item +Lines that contain @code{!subtitle}, if no subtitle is set for the +output. If a subtitle is set, then the characters @code{!subtitle} are +removed before the line is output. +@end enumerate + +The following are the variables that are substituted. Only the +variables listed are substituted; environment variables are not. +@xref{Environment substitutions}. + +@table @code +@item generator + +PSPP version as a string: @samp{GNU PSPP 0.1b}, for example. + +@item date + +Date the file was created. Example: @samp{Tue May 21 13:46:22 1991}. + +@item user + +Under multiuser OSes, the user's login name, taken either from the +environment variable @code{LOGNAME} or, if that fails, the result of the +C library function @code{getlogin()}. Defaults to @samp{nobody}. + +@item host + +System hostname as reported by @code{gethostname()}. Defaults to +@samp{nowhere}. + +@item title + +Document title as a string. This is the title specified in the PSPP +syntax file. + +@item subtitle + +Document subtitle as a string. + +@item source-file + +PSPP syntax file name. Example: @samp{mary96/first.stat}. +@end table + +@node Miscellaneous configuring, Improving output quality, HTML driver class, Configuration +@section Miscellaneous configuration + +The following environment variables can be used to further configure +PSPP: + +@table @code +@item HOME + +Used to determine the user's home directory. No default value. + +@item STAT_INCLUDE_PATH + +Path used to find include files in PSPP syntax files. 
Defaults vary +across operating systems: + +@table @asis +@item UNIX + +@itemize @bullet +@item +@file{.} + +@item +@file{~/.pspp/include} + +@item +@file{/usr/local/lib/pspp/include} + +@item +@file{/usr/lib/pspp/include} + +@item +@file{/usr/local/share/pspp/include} + +@item +@file{/usr/share/pspp/include} +@end itemize + +@item MS-DOS + +@itemize @bullet +@item +@file{.} + +@item +@file{C:\PSPP\INCLUDE} + +@item +@file{$PATH} +@end itemize + +@item Other OSes +No default path. +@end table + +@item STAT_PAGER +@itemx PAGER + +When PSPP invokes an external pager, it uses the first of these that +is defined. There is a default pager only if the person who compiled +PSPP defined one. + +@item TERM + +The terminal type @code{termcap} or @code{ncurses} will use, if such +support was compiled into PSPP. + +@item STAT_OUTPUT_INIT_FILE + +The basename used to search for the driver definition file. +@xref{Output devices}. @xref{File locations}. Default: @code{devices}. + +@item STAT_OUTPUT_PAPERSIZE_FILE + +The basename used to search for the papersize file. @xref{papersize}. +@xref{File locations}. Default: @code{papersize}. + +@item STAT_OUTPUT_INIT_PATH + +The path used to search for the driver definition file and the papersize +file. @xref{File locations}. Default: the standard configuration path. + +@item TMPDIR + +The @code{sort} procedure stores its temporary files in this directory. +Default: (UNIX) @file{/tmp}, (MS-DOS) @file{\}, (other OSes) empty string. + +@item TEMP +@item TMP + +Under MS-DOS only, these variables are consulted after TMPDIR, in this +order. +@end table + +@node Improving output quality, , Miscellaneous configuring, Configuration +@section Improving output quality + +When its drivers are set up properly, PSPP can produce output that +looks very good indeed. The PostScript driver, suitably configured, can +produce presentation-quality output. Here are a few guidelines for +producing better-looking output, regardless of output driver. 
Your +mileage may vary, of course, and everyone has different esthetic +preferences. + +@itemize @bullet +@item +Width is important in PSPP output. Greater output width leads to more +readable output, to a point. Try the following to increase the output +width: + +@itemize @minus +@item +If you're using the ASCII driver with a dot-matrix printer, figure out +what you need to do to put the printer into compressed mode. Put that +string into the @code{init-string} setting. Try to get 132 columns; 160 +might be better, but you might find that print that tiny is difficult to +read. + +@item +With the PostScript driver, try these ideas: + +@itemize + +@item +Landscape mode. + +@item +Legal-size (8.5" x 14") paper in landscape mode. + +@item +Reducing font sizes. If you're using 12-point fonts, try 10 point; if +you're using 10-point fonts, try 8 point. Some fonts are more readable +than others at small sizes. +@end itemize +@end itemize + +Try to strike a balance between character size and page width. + +@item +Use high-quality fonts. Many public domain fonts are poor in quality. +Recently, URW made some high-quality fonts available under the GPL. +These are probably suitable. + +@item +Be sure you're using the proper font metrics. The font metrics provided +with PSPP may not correspond to the fonts actually being printed. +This can cause bizarre-looking output. + +@item +Make sure that you're using good ink/ribbon/toner. Darker print is +easier to read. + +@item +Use plain fonts with serifs, such as Times-Roman or Palatino. Avoid +choosing italic or bold fonts as document base fonts. +@end itemize +@setfilename ignored diff --git a/doc/credits.texi b/doc/credits.texi new file mode 100644 index 00000000..9d0d7769 --- /dev/null +++ b/doc/credits.texi @@ -0,0 +1,23 @@ +@node Credits, Invocation, License, Top +@chapter Credits +@cindex credits +@cindex authors + +@cindex Pfaff, Ben +Most of PSPP, as well as this manual, +was written by Ben Pfaff. 
@xref{Contacting the Author}, for +instructions on contacting the author. + +@cindex Covington, Michael A. +@cindex Van Zandt, James +@cindex @file{ftp.cdrom.com} +@cindex @file{/pub/algorithms/c/julcal10} +@cindex @file{julcal.c} +@cindex @file{julcal.h} +The PSPP source code incorporates @code{julcal10} originally +written by Michael A. Covington and translated into C by Jim Van Zandt. +The original package can be found in directory +@url{ftp://ftp.cdrom.com/pub/algorithms/c/julcal10}. The entire +contents of that directory constitute the package. The files actually +used in PSPP are @code{julcal.c} and @code{julcal.h}. +@setfilename ignored diff --git a/doc/data-file-format.texi b/doc/data-file-format.texi new file mode 100644 index 00000000..230a72f9 --- /dev/null +++ b/doc/data-file-format.texi @@ -0,0 +1,641 @@ +@node Data File Format, q2c Input Format, Portable File Format, Top +@appendix Data File Format + +PSPP necessarily uses the same format for system files as do the +products with which it is compatible. This chapter is a description of +that format. + +There are three data types used in system files: 32-bit integers, 64-bit +floating points, and 1-byte characters. In this document these will +simply be referred to as @code{int32}, @code{flt64}, and @code{char}, +the names that are used in the PSPP source code. Every field of type +@code{int32} or @code{flt64} is aligned on a 32-bit boundary. + +The endianness of data in PSPP system files is not specified. System +files output on a computer of a particular endianness will have the +endianness of that computer. However, PSPP can read files of either +endianness, regardless of its host computer's endianness. PSPP +translates endianness for both integer and floating point numbers. + +Floating point formats are also not specified. PSPP does not +translate between floating point formats. 
This is unlikely to be a +problem as all modern computer architectures use IEEE 754 format for +floating point representation. + +The PSPP system-missing value is represented by the largest possible +negative number in the floating point format; in C, this is most likely +@code{-DBL_MAX}. There are two other important values used in missing +values: @code{HIGHEST} and @code{LOWEST}. These are represented by the +largest possible positive number (probably @code{DBL_MAX}) and the +second-largest negative number. The latter must be determined in a +system-dependent manner; in IEEE 754 format it is represented by value +@code{0xffeffffffffffffe}. + +System files are divided into records. Each record begins with an +@code{int32} giving a numeric record type. Individual record types are +described below: + +@menu +* File Header Record:: +* Variable Record:: +* Value Label Record:: +* Value Label Variable Record:: +* Document Record:: +* Machine int32 Info Record:: +* Machine flt64 Info Record:: +* Miscellaneous Informational Records:: +* Dictionary Termination Record:: +* Data Record:: +@end menu + +@node File Header Record, Variable Record, Data File Format, Data File Format +@section File Header Record + +The file header is always the first record in the file. + +@example +struct sysfile_header + @{ + char rec_type[4]; + char prod_name[60]; + int32 layout_code; + int32 case_size; + int32 compressed; + int32 weight_index; + int32 ncases; + flt64 bias; + char creation_date[9]; + char creation_time[8]; + char file_label[64]; + char padding[3]; + @}; +@end example + +@table @code +@item char rec_type[4]; +Record type code. Always set to @samp{$FL2}. This is the only record +for which the record type is not of type @code{int32}. + +@item char prod_name[60]; +Product identification string. This always begins with the characters +@samp{@@(#) SPSS DATA FILE}. 
PSPP uses the remaining characters to +give its version and the operating system name; for example, @samp{GNU +pspp 0.1.4 - sparc-sun-solaris2.5.2}. The string is truncated if it +would be longer than 60 characters; otherwise it is padded on the right +with spaces. + +@item int32 layout_code; +Always set to 2. PSPP reads this value to determine the +file's endianness. + +@item int32 case_size; +Number of data elements per case. This is the number of variables, +except that long string variables add extra data elements (one for every +8 characters after the first 8). + +@item int32 compressed; +Set to 1 if the data in the file is compressed, 0 otherwise. + +@item int32 weight_index; +If one of the variables in the data set is used as a weighting variable, +set to the index of that variable. Otherwise, set to 0. + +@item int32 ncases; +Set to the number of cases in the file if it is known, or -1 otherwise. + +In the general case it is not possible to determine the number of cases +that will be output to a system file at the time that the header is +written. The way that this is dealt with is by writing the entire +system file, including the header, then seeking back to the beginning of +the file and writing just the @code{ncases} field. For `files' in which +this is not valid, the seek operation fails. In this case, +@code{ncases} remains -1. + +@item flt64 bias; +Compression bias. Always set to 100. The significance of this value is +that only numbers between @code{(1 - bias)} and @code{(251 - bias)} can +be compressed. + +@item char creation_date[9]; +Set to the date of creation of the system file, in @samp{dd mmm yy} +format, with the month as standard English abbreviations, using an +initial capital letter and following with lowercase. If the date is not +available then this field is arbitrarily set to @samp{01 Jan 70}. + +@item char creation_time[8]; +Set to the time of creation of the system file, in @samp{hh:mm:ss} +format and using 24-hour time. 
If the time is not available then this +field is arbitrarily set to @samp{00:00:00}. + +@item char file_label[64]; +Set to the file label declared by the user, if any. Padded on the +right with spaces. + +@item char padding[3]; +Ignored padding bytes to make the structure a multiple of 32 bits in +length. Set to zeros. +@end table + +@node Variable Record, Value Label Record, File Header Record, Data File Format +@section Variable Record + +Immediately following the header must come the variable records. There +must be one variable record for every variable and every 8 characters in +a long string beyond the first 8; i.e., there must be exactly as many +variable records as the value specified for @code{case_size} in the file +header record. + +@example +struct sysfile_variable + @{ + int32 rec_type; + int32 type; + int32 has_var_label; + int32 n_missing_values; + int32 print; + int32 write; + char name[8]; + + /* The following two fields are present + only if has_var_label is 1. */ + int32 label_len; + char label[/* variable length */]; + + /* The following field is present only + if n_missing_values is not 0. */ + flt64 missing_values[/* variable length */]; + @}; +@end example + +@table @code +@item int32 rec_type; +Record type code. Always set to 2. + +@item int32 type; +Variable type code. Set to 0 for a numeric variable. For a short +string variable or the first part of a long string variable, this is set +to the width of the string. For the second and subsequent parts of a +long string variable, set to -1, and the remaining fields in the +structure are ignored. + +@item int32 has_var_label; +If this variable has a variable label, set to 1; otherwise, set to 0. + +@item int32 n_missing_values; +If the variable has no missing values, set to 0. If the variable has +one, two, or three discrete missing values, set to 1, 2, or 3, +respectively.
If the variable has a range for missing values, set to +-2; if the variable has a range for missing values plus a single +discrete value, set to -3. + +@item int32 print; +Print format for this variable. See below. + +@item int32 write; +Write format for this variable. See below. + +@item char name[8]; +Variable name. The variable name must begin with a capital letter or +the at-sign (@samp{@@}). Subsequent characters may also be octothorpes +(@samp{#}), dollar signs (@samp{$}), underscores (@samp{_}), or full +stops (@samp{.}). The variable name is padded on the right with spaces. + +@item int32 label_len; +This field is present only if @code{has_var_label} is set to 1. It is +set to the length, in characters, of the variable label, which must be a +number between 0 and 120. + +@item char label[/* variable length */]; +This field is present only if @code{has_var_label} is set to 1. It has +length @code{label_len}, rounded up to the nearest multiple of 32 bits. +The first @code{label_len} characters are the variable's variable label. + +@item flt64 missing_values[/* variable length */]; +This field is present only if @code{n_missing_values} is not 0. It has +the same number of elements as the absolute value of +@code{n_missing_values}. For discrete missing values, each element +represents one missing value. When a range is present, the first +element denotes the minimum value in the range, and the second element +denotes the maximum value in the range. When a range plus a value are +present, the third element denotes the additional discrete missing +value. HIGHEST and LOWEST are indicated as described in the chapter +introduction. +@end table + +The @code{print} and @code{write} members of sysfile_variable are output +formats coded into @code{int32} types.
The LSB (least-significant byte) +of the @code{int32} represents the number of decimal places, and the +next two bytes in order of increasing significance represent field width +and format type, respectively. The MSB (most-significant byte) is not +used and should be set to zero. + +Format types are defined as follows: +@table @asis +@item 0 +Not used. +@item 1 +@code{A} +@item 2 +@code{AHEX} +@item 3 +@code{COMMA} +@item 4 +@code{DOLLAR} +@item 5 +@code{F} +@item 6 +@code{IB} +@item 7 +@code{PIBHEX} +@item 8 +@code{P} +@item 9 +@code{PIB} +@item 10 +@code{PK} +@item 11 +@code{RB} +@item 12 +@code{RBHEX} +@item 13 +Not used. +@item 14 +Not used. +@item 15 +@code{Z} +@item 16 +@code{N} +@item 17 +@code{E} +@item 18 +Not used. +@item 19 +Not used. +@item 20 +@code{DATE} +@item 21 +@code{TIME} +@item 22 +@code{DATETIME} +@item 23 +@code{ADATE} +@item 24 +@code{JDATE} +@item 25 +@code{DTIME} +@item 26 +@code{WKDAY} +@item 27 +@code{MONTH} +@item 28 +@code{MOYR} +@item 29 +@code{QYR} +@item 30 +@code{WKYR} +@item 31 +@code{PCT} +@item 32 +@code{DOT} +@item 33 +@code{CCA} +@item 34 +@code{CCB} +@item 35 +@code{CCC} +@item 36 +@code{CCD} +@item 37 +@code{CCE} +@item 38 +@code{EDATE} +@item 39 +@code{SDATE} +@end table + +@node Value Label Record, Value Label Variable Record, Variable Record, Data File Format +@section Value Label Record + +Value label records must follow the variable records and must precede +the header termination record. Other than this, they may appear +anywhere in the system file. Every value label record must be +immediately followed by a label variable record, described below. + +Value label records begin with @code{rec_type}, an @code{int32} value +set to the record type of 3. This is followed by @code{count}, an +@code{int32} value set to the number of value labels present in this +record. + +These two fields are followed by a series of @code{count} tuples. Each +tuple is divided into two fields, the value and the label. 
The first of +these, the value, is composed of a 64-bit value, which is either a +@code{flt64} value or up to 8 characters (padded on the right to 8 +bytes) denoting a short string value. Whether the value is a +@code{flt64} or a character string is not defined inside the value label +record. + +The second field in the tuple, the label, has variable length. The +first @code{char} is a count of the number of characters in the value +label. The remainder of the field is the label itself. The field is +padded on the right to a multiple of 64 bits in length. + +@node Value Label Variable Record, Document Record, Value Label Record, Data File Format +@section Value Label Variable Record + +Every value label variable record must be immediately preceded by a +value label record, described above. + +@example +struct sysfile_value_label_variable + @{ + int32 rec_type; + int32 count; + int32 vars[/* variable length */]; + @}; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 4. + +@item int32 count; +Number of variables to which the associated value labels from the value +label record are to be applied. + +@item int32 vars[/* variable length */]; +A list of variables to which to apply the value labels. There are +@code{count} elements. +@end table + +@node Document Record, Machine int32 Info Record, Value Label Variable Record, Data File Format +@section Document Record + +There must be no more than one document record per system file. +Document records must follow the variable records and precede the +dictionary termination record. + +@example +struct sysfile_document + @{ + int32 rec_type; + int32 n_lines; + char lines[/* variable length */][80]; + @}; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 6. + +@item int32 n_lines; +Number of lines of documents present. + +@item char lines[/* variable length */][80]; +Document lines. The number of elements is defined by @code{n_lines}.
+Lines shorter than 80 characters are padded on the right with spaces. +@end table + +@node Machine int32 Info Record, Machine flt64 Info Record, Document Record, Data File Format +@section Machine @code{int32} Info Record + +There must be no more than one machine @code{int32} info record per +system file. Machine @code{int32} info records must follow the variable +records and precede the dictionary termination record. + +@example +struct sysfile_machine_int32_info + @{ + /* Header. */ + int32 rec_type; + int32 subtype; + int32 size; + int32 count; + + /* Data. */ + int32 version_major; + int32 version_minor; + int32 version_revision; + int32 machine_code; + int32 floating_point_rep; + int32 compression_code; + int32 endianness; + int32 character_code; + @}; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 7. + +@item int32 subtype; +Record subtype. Always set to 3. + +@item int32 size; +Size of each piece of data in the data part, in bytes. Always set to 4. + +@item int32 count; +Number of pieces of data in the data part. Always set to 8. + +@item int32 version_major; +PSPP major version number. In version @var{x}.@var{y}.@var{z}, this +is @var{x}. + +@item int32 version_minor; +PSPP minor version number. In version @var{x}.@var{y}.@var{z}, this +is @var{y}. + +@item int32 version_revision; +PSPP version revision number. In version @var{x}.@var{y}.@var{z}, +this is @var{z}. + +@item int32 machine_code; +Machine code. PSPP always sets this field to -1, but other +values may appear. + +@item int32 floating_point_rep; +Floating point representation code. For IEEE 754 systems this is 1. +IBM 370 sets this to 2, and DEC VAX E to 3. + +@item int32 compression_code; +Compression code. Always set to 1. + +@item int32 endianness; +Machine endianness. 1 indicates big-endian, 2 indicates little-endian. + +@item int32 character_code; +Character code.
1 indicates EBCDIC, 2 indicates 7-bit ASCII, 3 +indicates 8-bit ASCII, 4 indicates DEC Kanji. +@end table + +@node Machine flt64 Info Record, Miscellaneous Informational Records, Machine int32 Info Record, Data File Format +@section Machine @code{flt64} Info Record + +There must be no more than one machine @code{flt64} info record per +system file. Machine @code{flt64} info records must follow the variable +records and precede the dictionary termination record. + +@example +struct sysfile_machine_flt64_info + @{ + /* Header. */ + int32 rec_type; + int32 subtype; + int32 size; + int32 count; + + /* Data. */ + flt64 sysmis; + flt64 highest; + flt64 lowest; + @}; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 7. + +@item int32 subtype; +Record subtype. Always set to 4. + +@item int32 size; +Size of each piece of data in the data part, in bytes. Always set to 8. + +@item int32 count; +Number of pieces of data in the data part. Always set to 3. + +@item flt64 sysmis; +The system missing value. + +@item flt64 highest; +The value used for HIGHEST in missing values. + +@item flt64 lowest; +The value used for LOWEST in missing values. +@end table + +@node Miscellaneous Informational Records, Dictionary Termination Record, Machine flt64 Info Record, Data File Format +@section Miscellaneous Informational Records + +Miscellaneous informational records must follow the variable records and +precede the dictionary termination record. + +Miscellaneous informational records are ignored by PSPP when reading +system files. They are not written by PSPP when writing system files. + +@example +struct sysfile_misc_info + @{ + /* Header. */ + int32 rec_type; + int32 subtype; + int32 size; + int32 count; + + /* Data. */ + char data[/* variable length */]; + @}; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 7. + +@item int32 subtype; +Record subtype. May take any value.
According to Aapi +H@"am@"al@"ainen, value 5 indicates a set of grouped variables and 6 +indicates date info (probably related to USE). + +@item int32 size; +Size of each piece of data in the data part. Should have the value 4 or +8, for @code{int32} and @code{flt64}, respectively. + +@item int32 count; +Number of pieces of data in the data part. + +@item char data[/* variable length */]; +Arbitrary data. There must be @code{size} times @code{count} bytes of +data. +@end table + +@node Dictionary Termination Record, Data Record, Miscellaneous Informational Records, Data File Format +@section Dictionary Termination Record + +The dictionary termination record must follow all other records, except +for the actual cases, which it must precede. There must be exactly one +dictionary termination record in every system file. + +@example +struct sysfile_dict_term + @{ + int32 rec_type; + int32 filler; + @}; +@end example + +@table @code +@item int32 rec_type; +Record type. Always set to 999. + +@item int32 filler; +Ignored padding. Should be set to 0. +@end table + +@node Data Record, , Dictionary Termination Record, Data File Format +@section Data Record + +Data records must follow all other records in the data file. There must +be at least one data record in every system file. + +The format of data records varies depending on whether the data is +compressed. Regardless, the data is arranged in a series of 8-byte +elements. + +When data is not compressed, every case is composed of @code{case_size} +of these 8-byte elements, where @code{case_size} comes from the file +header record (@pxref{File Header Record}). Each element corresponds to +the variable declared in the respective variable record (@pxref{Variable +Record}). Numeric values are given in @code{flt64} format; string +values are literal character strings, padded on the right when +necessary.
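As a rough illustration of how @code{case_size} relates to the variables in a file, the following C sketch counts the 8-byte elements each variable occupies and the size of one uncompressed case. The function names are hypothetical and do not come from the PSPP source; rounding a partial group of 8 characters up to a whole element is an assumption drawn from the description of @code{case_size} in the file header record.

```c
#include <stddef.h>

/* Number of 8-byte elements occupied by one variable.  WIDTH is 0
   for a numeric variable, otherwise the string width in characters. */
size_t
elements_for_variable (int width)
{
  if (width <= 0)
    return 1;                      /* Numeric: a single flt64. */
  return ((size_t) width + 7) / 8; /* String: 8 characters per element. */
}

/* Sums the elements of N_VARS variables, giving the value expected
   in the header's case_size field. */
size_t
case_size_for (const int *widths, size_t n_vars)
{
  size_t total = 0;
  size_t i;

  for (i = 0; i < n_vars; i++)
    total += elements_for_variable (widths[i]);
  return total;
}

/* Size in bytes of one uncompressed case. */
size_t
uncompressed_case_bytes (size_t case_size)
{
  return case_size * 8;
}
```

For instance, a numeric variable, an 8-character string, and a 20-character string together would occupy 1 + 1 + 3 elements under these assumptions.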
+ +Compressed data is arranged in the following manner: the first 8-byte +element in the data section is divided into a series of 1-byte command +codes. These codes have meanings as described below: + +@table @asis +@item 0 +Ignored. If the program writing the system file accumulates compressed +data in blocks of fixed length, 0 bytes can be used to pad out extra +bytes remaining at the end of a fixed-size block. + +@item 1 through 251 +These values indicate that the corresponding numeric variable has the +value @code{(@var{code} - @var{bias})} for the case being read, where +@var{code} is the value of the compression code and @var{bias} is the +@code{bias} field from the file header record. For example, +code 105 with bias 100.0 (the normal value) indicates a numeric variable +of value 5. + +@item 252 +End of file. This code may or may not appear at the end of the data +stream. PSPP always outputs this code but its use is not required. + +@item 253 +This value indicates that the numeric or string value is not +compressible. The value is stored in the 8-byte element following the +current block of command bytes. If this value appears twice in a block +of command bytes, then it indicates the second element following the +command bytes, and so on. + +@item 254 +Used to indicate a string value that is all spaces. + +@item 255 +Used to indicate the system-missing value. +@end table + +When the end of the first 8-byte element of command bytes is reached, +any blocks of non-compressible values are skipped, and the next element +of command bytes is read and interpreted, until the end of the file is +reached.
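The command codes above can be summarized in a short C sketch. It is meant only to make the table concrete: the enumerator and function names are hypothetical, not those of the PSPP source, and the sketch classifies command bytes without reading any file data.

```c
#include <stddef.h>

/* What a single command byte asks the reader to do. */
enum command_kind
  {
    CMD_PADDING,      /* 0: ignored. */
    CMD_BIASED,       /* 1 through 251: value is code - bias. */
    CMD_END_OF_FILE,  /* 252: end of file. */
    CMD_RAW_ELEMENT,  /* 253: value is in a following 8-byte element. */
    CMD_ALL_SPACES,   /* 254: string value that is all spaces. */
    CMD_SYSMIS        /* 255: system-missing value. */
  };

/* Maps a command byte to its meaning. */
enum command_kind
classify_command (unsigned char code)
{
  switch (code)
    {
    case 0:   return CMD_PADDING;
    case 252: return CMD_END_OF_FILE;
    case 253: return CMD_RAW_ELEMENT;
    case 254: return CMD_ALL_SPACES;
    case 255: return CMD_SYSMIS;
    default:  return CMD_BIASED;
    }
}

/* Value encoded by a command byte in the range 1 through 251. */
double
biased_value (unsigned char code, double bias)
{
  return (double) code - bias;
}

/* Number of uncompressed 8-byte elements that follow a block of 8
   command bytes: one for each code 253 in the block. */
size_t
count_raw_elements (const unsigned char commands[8])
{
  size_t n = 0;
  size_t i;

  for (i = 0; i < 8; i++)
    if (commands[i] == 253)
      n++;
  return n;
}
```

A reader built on this sketch would consume 8 command bytes, then `count_raw_elements` further elements, and repeat until it meets code 252 or runs out of input.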
+@setfilename ignored diff --git a/doc/data-io.texi b/doc/data-io.texi new file mode 100644 index 00000000..134f90cb --- /dev/null +++ b/doc/data-io.texi @@ -0,0 +1,1000 @@ +@node Data Input and Output, System and Portable Files, Expressions, Top +@chapter Data Input and Output +@cindex input +@cindex output +@cindex data +@cindex cases +@cindex observations + +Data are the focus of the PSPP language. +Each datum belongs to a @dfn{case} (also called an @dfn{observation}). +Each case represents an individual or `experimental unit'. +For example, in the results of a survey, the names of the respondents, +their sex, age, @i{etc}., and their responses are all data, and the data +pertaining to a single respondent is a case. +This chapter examines +the PSPP commands for defining variables and reading and writing data. + +@quotation +@strong{Please note:} Data is not actually read until a procedure is +executed. These commands tell PSPP how to read data, but they +do not @emph{cause} PSPP to read data. +@end quotation + +@menu +* BEGIN DATA:: Embed data within a syntax file. +* CLEAR TRANSFORMATIONS:: Clear pending transformations. +* DATA LIST:: Fundamental data reading command. +* END CASE:: Output the current case. +* END FILE:: Terminate the current input program. +* FILE HANDLE:: Support for fixed-length records. +* INPUT PROGRAM:: Support for complex input programs. +* LIST:: List cases in the active file. +* MATRIX DATA:: Read matrices in text format. +* NEW FILE:: Clear the active file and dictionary. +* PRINT:: Display values in print formats. +* PRINT EJECT:: Eject the current page then print. +* PRINT SPACE:: Print blank lines. +* REREAD:: Take another look at the previous input line. +* REPEATING DATA:: Multiple cases on a single line. +* WRITE:: Display values in write formats.
+@end menu + +@node BEGIN DATA, CLEAR TRANSFORMATIONS, Data Input and Output, Data Input and Output +@section BEGIN DATA +@vindex BEGIN DATA +@vindex END DATA +@cindex Embedding data in syntax files +@cindex Data, embedding in syntax files + +@display +BEGIN DATA. +@dots{} +END DATA. +@end display + +@cmd{BEGIN DATA} and @cmd{END DATA} can be used to embed raw ASCII +data in a PSPP syntax file. @cmd{DATA LIST} or another input +procedure must be used before @cmd{BEGIN DATA} (@pxref{DATA LIST}). +@cmd{BEGIN DATA} and @cmd{END DATA} must be used together. @cmd{END +DATA} must appear by itself on a single line, with no leading +whitespace and exactly one space between the words @code{END} and +@code{DATA}, followed immediately by the terminal dot, like this: + +@example +END DATA. +@end example + +@node CLEAR TRANSFORMATIONS, DATA LIST, BEGIN DATA, Data Input and Output +@section CLEAR TRANSFORMATIONS +@vindex CLEAR TRANSFORMATIONS + +@display +CLEAR TRANSFORMATIONS. +@end display + +@cmd{CLEAR TRANSFORMATIONS} clears out all pending +transformations. It does not cancel the current input program. It is +valid only when PSPP is interactive, not in syntax files. + +@node DATA LIST, END CASE, CLEAR TRANSFORMATIONS, Data Input and Output +@section DATA LIST +@vindex DATA LIST +@cindex reading data from a file +@cindex data, reading from a file +@cindex data, embedding in syntax files +@cindex embedding data in syntax files + +Used to read text or binary data, @cmd{DATA LIST} is the most +fundamental data-reading command. Even the more sophisticated input +methods use @cmd{DATA LIST} commands as a building block. +Understanding @cmd{DATA LIST} is important to understanding how to use +PSPP to read your data files. + +There are two major variants of @cmd{DATA LIST}, which are fixed +format and free format. In addition, free format has a minor variant, +list format, which is discussed in terms of its differences from vanilla +free format. 
+ +Each form of @cmd{DATA LIST} is described in detail below. + +@menu +* DATA LIST FIXED:: Fixed columnar locations for data. +* DATA LIST FREE:: Any spacing you like. +* DATA LIST LIST:: Each case must be on a single line. +@end menu + +@node DATA LIST FIXED, DATA LIST FREE, DATA LIST, DATA LIST +@subsection DATA LIST FIXED +@vindex DATA LIST FIXED +@cindex reading fixed-format data +@cindex fixed-format data, reading +@cindex data, fixed-format, reading +@cindex embedding fixed-format data + +@display +DATA LIST [FIXED] + @{TABLE,NOTABLE@} + FILE='filename' + RECORDS=record_count + END=end_var + /[line_no] var_spec@dots{} + +where each var_spec takes one of the forms + var_list start-end [type_spec] + var_list (fortran_spec) +@end display + +@cmd{DATA LIST FIXED} is used to read data files that have values at fixed +positions on each line of single-line or multiline records. The +keyword FIXED is optional. + +The FILE subcommand must be used if input is to be taken from an +external file. It may be used to specify a filename as a string or a +file handle (@pxref{FILE HANDLE}). If the FILE subcommand is not used, +then input is assumed to be specified within the command file using +@cmd{BEGIN DATA}@dots{}@cmd{END DATA} (@pxref{BEGIN DATA}). + +The optional RECORDS subcommand, which takes a single integer as an +argument, is used to specify the number of lines per record. If RECORDS +is not specified, then the number of lines per record is calculated from +the list of variable specifications later in @cmd{DATA LIST}. + +The END subcommand is only useful in conjunction with @cmd{INPUT +PROGRAM}. @xref{INPUT PROGRAM}, for details. + +@cmd{DATA LIST} can optionally output a table describing how the data file +will be read. The TABLE subcommand enables this output, and NOTABLE +disables it. The default is to output the table. + +The list of variables to be read from the data list must come last. +Each line in the data record is introduced by a slash (@samp{/}). 
+Optionally, a line number may follow the slash. After that, any number +of variable specifications may be present. + +Each variable specification consists of a list of variable names +followed by a description of their location on the input line. Sets of +variables may be specified using the @code{DATA LIST} TO convention +(@pxref{Sets of +Variables}). There are two ways to specify the location of the variable +on the line: PSPP style and FORTRAN style. + +With PSPP style, the starting column and ending column for the field +are specified after the variable name, separated by a dash (@samp{-}). +For instance, the third through fifth columns on a line would be +specified @samp{3-5}. By default, variables are considered to be in +@samp{F} format (@pxref{Input/Output Formats}). (This default can be +changed; see @ref{SET} for more information.) + +When using PSPP style, to use a variable format other than the default, +specify the format type in parentheses after the column numbers. For +instance, for alphanumeric @samp{A} format, use @samp{(A)}. + +In addition, implied decimal places can be specified in parentheses +after the column numbers. As an example, suppose that a data file has a +field in which the characters @samp{1234} should be interpreted as +having the value 12.34. Then this field has two implied decimal places, +and the corresponding specification would be @samp{(2)}. If a field +that has implied decimal places contains a decimal point, then the +implied decimal places are not applied. + +Changing the variable format and adding implied decimal places can be +done together; for instance, @samp{(N,5)}. + +When using PSPP style, the input and output width of each variable is +computed from the field width. The field width must be evenly divisible +by the number of variables specified. + +FORTRAN style is an altogether different approach to specifying field +locations.
With this approach, a list of variable input format +specifications, separated by commas, is placed after the variable names +inside parentheses. Each format specifier advances as many characters +into the input line as it uses. + +In addition to the standard format specifiers (@pxref{Input/Output +Formats}), FORTRAN style defines some extensions: + +@table @asis +@item @code{X} +Advance the current column on this line by one character position. + +@item @code{T}@var{x} +Set the current column on this line to column @var{x}, with column +numbers considered to begin with 1 at the left margin. + +@item @code{NEWREC}@var{x} +Skip forward @var{x} lines in the current record, resetting the active +column to the left margin. + +@item Repeat count +Any format specifier may be preceded by a number. This causes the +action of that format specifier to be repeated the specified number of +times. + +@item (@var{spec1}, @dots{}, @var{specN}) +Group the given specifiers together. This is most useful when preceded +by a repeat count. Groups may be nested arbitrarily. +@end table + +FORTRAN and PSPP styles may be freely intermixed. PSPP style leaves the +active column immediately after the ending column specified. Record +motion using @code{NEWREC} in FORTRAN style also applies to later +FORTRAN and PSPP specifiers. + +@menu +* DATA LIST FIXED Examples:: Examples of DATA LIST FIXED. +@end menu + +@node DATA LIST FIXED Examples, , DATA LIST FIXED, DATA LIST FIXED +@unnumberedsubsubsec Examples + +@enumerate +@item +@example +DATA LIST TABLE /NAME 1-10 (A) INFO1 TO INFO3 12-17 (1). + +BEGIN DATA. +John Smith 102311 +Bob Arnold 122015 +Bill Yates  918 6 +END DATA. +@end example + +Defines the following variables: + +@itemize @bullet +@item +@code{NAME}, a 10-character-wide long string variable, in columns 1 +through 10. + +@item +@code{INFO1}, a numeric variable, in columns 12 through 13. + +@item +@code{INFO2}, a numeric variable, in columns 14 through 15.
+ +@item +@code{INFO3}, a numeric variable, in columns 16 through 17. +@end itemize + +The @code{BEGIN DATA}/@code{END DATA} commands cause three cases to be +defined: + +@example +Case NAME INFO1 INFO2 INFO3 + 1 John Smith 10 23 11 + 2 Bob Arnold 12 20 15 + 3 Bill Yates 9 18 6 +@end example + +The @code{TABLE} keyword causes PSPP to print out a table +describing the four variables defined. + +@item +@example +DAT LIS FIL="survey.dat" + /ID 1-5 NAME 7-36 (A) SURNAME 38-67 (A) MINITIAL 69 (A) + /Q01 TO Q50 7-56 + /. +@end example + +Defines the following variables: + +@itemize @bullet +@item +@code{ID}, a numeric variable, in columns 1-5 of the first record. + +@item +@code{NAME}, a 30-character long string variable, in columns 7-36 of the +first record. + +@item +@code{SURNAME}, a 30-character long string variable, in columns 38-67 of +the first record. + +@item +@code{MINITIAL}, a 1-character short string variable, in column 69 of +the first record. + +@item +Fifty variables @code{Q01}, @code{Q02}, @code{Q03}, @dots{}, @code{Q49}, +@code{Q50}, all numeric, @code{Q01} in column 7, @code{Q02} in column 8, +@dots{}, @code{Q49} in column 55, @code{Q50} in column 56, all in the second +record. +@end itemize + +Cases are separated by a blank record. + +Data is read from file @file{survey.dat} in the current directory. + +This example shows keywords abbreviated to their first 3 letters. + +@end enumerate + +@node DATA LIST FREE, DATA LIST LIST, DATA LIST FIXED, DATA LIST +@subsection DATA LIST FREE +@vindex DATA LIST FREE + +@display +DATA LIST FREE + [(@{TAB,'c'@}, @dots{})] + [@{NOTABLE,TABLE@}] + FILE='filename' + END=end_var + /var_spec@dots{} + +where each var_spec takes one of the forms + var_list [(type_spec)] + var_list * +@end display + +In free format, the input data is, by default, structured as a series +of fields separated by spaces, tabs, commas, or line breaks. 
Each +field's content may be unquoted, or it may be quoted with a pair of +apostrophes (@samp{'}) or double quotes (@samp{"}). Unquoted white +space separates fields but is not part of any field. Any mix of +spaces, tabs, and line breaks is equivalent to a single space for the +purpose of separating fields, but consecutive commas will skip a +field. + +Alternatively, delimiters can be specified explicitly, as a +parenthesized, comma-separated list of single-character strings +immediately following FREE. The word TAB may also be used to specify +a tab character as a delimiter. When delimiters are specified +explicitly, only the given characters, plus line breaks, separate +fields. Furthermore, leading spaces at the beginnings of fields are +not trimmed, consecutive delimiters define empty fields, and no form +of quoting is allowed. + +The NOTABLE and TABLE subcommands are as in @cmd{DATA LIST FIXED} above. +NOTABLE is the default. + +The FILE and END subcommands are as in @cmd{DATA LIST FIXED} above. + +The variables to be parsed are given as a single list of variable names. +This list must be introduced by a single slash (@samp{/}). The set of +variable names may contain format specifications in parentheses +(@pxref{Input/Output Formats}). Format specifications apply to all +variables back to the previous parenthesized format specification. + +In addition, an asterisk may be used to indicate that all variables +preceding it are to have input/output format @samp{F8.0}. + +Specified field widths are ignored on input, although all normal limits +on field width apply, but they are honored on output.
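+
+As a hedged illustration of the default free-format rules described
+above, the following sketch reads three cases from inline data.  The
+variable names and values are invented for this example, and the
+explicit formats are only one reasonable choice:
+
+@example
+DATA LIST FREE /NAME (A10) AGE SCORE (F8.0).
+BEGIN DATA.
+'John Smith' 34 71
+Bob,27,65 "Bill Yates"
+19 80
+END DATA.
+LIST.
+@end example
+
+Under those rules, each quoted name should be read as a single field
+despite the embedded space, the commas on the second data line
+separate fields just as spaces do, and the third case is split across
+two lines because a line break only separates fields.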
+ +@node DATA LIST LIST, , DATA LIST FREE, DATA LIST +@subsection DATA LIST LIST +@vindex DATA LIST LIST + +@display +DATA LIST LIST + [(@{TAB,'c'@}, @dots{})] + [@{NOTABLE,TABLE@}] + FILE='filename' + END=end_var + /var_spec@dots{} + +where each var_spec takes one of the forms + var_list [(type_spec)] + var_list * +@end display + +With one exception, @cmd{DATA LIST LIST} is syntactically and +semantically equivalent to @cmd{DATA LIST FREE}. The exception is +that each input line is expected to correspond to exactly one input +record. If more or fewer fields are found on an input line than +expected, an appropriate diagnostic is issued. + +@node END CASE, END FILE, DATA LIST, Data Input and Output +@section END CASE +@vindex END CASE + +@display +END CASE. +@end display + +@cmd{END CASE} is used only within @cmd{INPUT PROGRAM} to output the +current case. @xref{INPUT PROGRAM}, for details. + +@node END FILE, FILE HANDLE, END CASE, Data Input and Output +@section END FILE +@vindex END FILE + +@display +END FILE. +@end display + +@cmd{END FILE} is used only within @cmd{INPUT PROGRAM} to terminate +the current input program. @xref{INPUT PROGRAM}. + +@node FILE HANDLE, INPUT PROGRAM, END FILE, Data Input and Output +@section FILE HANDLE +@vindex FILE HANDLE + +@display +FILE HANDLE handle_name + /NAME='filename' + /MODE=@{CHARACTER,IMAGE@} + /LRECL=rec_len + /TABWIDTH=tab_width +@end display + +Use @cmd{FILE HANDLE} to associate a file handle name with a file and +its attributes, so that later commands can refer to the file by its +handle name. Because names of text files can be specified directly on +commands that access files, @cmd{FILE HANDLE} is only needed when a +file is not an ordinary file containing lines of text. However, +@cmd{FILE HANDLE} may be used even for text files, and it may be +easier to specify a file's name once and later refer to it by an +abstract handle. + +Specify the file handle name as an identifier. 
Any given identifier may +only appear once in a PSPP run. File handles may not be reassigned to a +different file. The file handle name must immediately follow the @cmd{FILE +HANDLE} command name. + +The NAME subcommand specifies the name of the file associated with the +handle. It is the only required subcommand. + +MODE specifies a file mode. In CHARACTER mode, the default, the data +file is opened in ANSI C text mode, so that local end of line +conventions are followed, and each text line is read as one record. +In CHARACTER mode, most input programs will expand tabs to spaces +(@cmd{DATA LIST FREE} with explicitly specified delimiters is an +exception). By default, each tab is 4 characters wide, but an +alternate width may be specified on TABWIDTH. A tab width of 0 +suppresses tab expansion entirely. + +By contrast, in IMAGE mode, the data file is opened in ANSI C binary +mode and records are a fixed length. In IMAGE mode, LRECL specifies +the record length in bytes, with a default of 1024. Tab characters +are never expanded to spaces in IMAGE mode. + +@node INPUT PROGRAM, LIST, FILE HANDLE, Data Input and Output +@section INPUT PROGRAM +@vindex INPUT PROGRAM + +@display +INPUT PROGRAM. +@dots{} input commands @dots{} +END INPUT PROGRAM. +@end display + +@cmd{INPUT PROGRAM}@dots{}@cmd{END INPUT PROGRAM} specifies a +complex input program. By placing data input commands within @cmd{INPUT +PROGRAM}, PSPP programs can take advantage of more complex file +structures than available with only @cmd{DATA LIST}. + +The first sort of extended input program is to simply put multiple @cmd{DATA +LIST} commands within the @cmd{INPUT PROGRAM}. This will cause all of +the data +files to be read in parallel. Input will stop when end of file is +reached on any of the data files. + +Transformations, such as conditional and looping constructs, can also be +included within @cmd{INPUT PROGRAM}. These can be used to combine input +from several data files in more complex ways.
However, input will still +stop when end of file is reached on any of the data files. + +To prevent @cmd{INPUT PROGRAM} from terminating at the first end of +file, use +the END subcommand on @cmd{DATA LIST}. This subcommand takes a +variable name, +which should be a numeric scratch variable (@pxref{Scratch Variables}). +(It need not be a scratch variable but otherwise the results can be +surprising.) The value of this variable is set to 0 when reading the +data file, or 1 when end of file is encountered. + +Two additional commands are useful in conjunction with @cmd{INPUT PROGRAM}. +@cmd{END CASE} is the first. Normally each loop through the +@cmd{INPUT PROGRAM} +structure produces one case. @cmd{END CASE} controls exactly +when cases are output. When @cmd{END CASE} is used, looping from the end of +@cmd{INPUT PROGRAM} to the beginning does not cause a case to be output. + +@cmd{END FILE} is the second. When the END subcommand is used on @cmd{DATA +LIST}, there is no way for the @cmd{INPUT PROGRAM} construct to stop +looping, +so an infinite loop results. @cmd{END FILE}, when executed, +stops the flow of input data and passes out of the @cmd{INPUT PROGRAM} +structure. + +All this is very confusing. A few examples should help to clarify. + +@example +INPUT PROGRAM. + DATA LIST NOTABLE FILE='a.data'/X 1-10. + DATA LIST NOTABLE FILE='b.data'/Y 1-10. +END INPUT PROGRAM. +LIST. +@end example + +The example above reads variable X from file @file{a.data} and variable +Y from file @file{b.data}. If one file is shorter than the other then +the extra data in the longer file is ignored. + +@example +INPUT PROGRAM. + NUMERIC #A #B. + + DO IF NOT #A. + DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10. + END IF. + DO IF NOT #B. + DATA LIST NOTABLE END=#B FILE='b.data'/Y 1-10. + END IF. + DO IF #A AND #B. + END FILE. + END IF. + END CASE. +END INPUT PROGRAM. +LIST. +@end example + +The above example reads variable X from @file{a.data} and variable Y from +@file{b.data}. 
If one file is shorter than the other then the missing +field is set to the system-missing value alongside the present value for +the remaining length of the longer file. + +@example +INPUT PROGRAM. + NUMERIC #A #B. + + DO IF #A. + DATA LIST NOTABLE END=#B FILE='b.data'/X 1-10. + DO IF #B. + END FILE. + ELSE. + END CASE. + END IF. + ELSE. + DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10. + DO IF NOT #A. + END CASE. + END IF. + END IF. +END INPUT PROGRAM. +LIST. +@end example + +The above example reads data from file @file{a.data}, then from +@file{b.data}, and concatenates them into a single active file. + +@example +INPUT PROGRAM. + NUMERIC #EOF. + + LOOP IF NOT #EOF. + DATA LIST NOTABLE END=#EOF FILE='a.data'/X 1-10. + DO IF NOT #EOF. + END CASE. + END IF. + END LOOP. + + COMPUTE #EOF = 0. + LOOP IF NOT #EOF. + DATA LIST NOTABLE END=#EOF FILE='b.data'/X 1-10. + DO IF NOT #EOF. + END CASE. + END IF. + END LOOP. + + END FILE. +END INPUT PROGRAM. +LIST. +@end example + +The above example does the same thing as the previous example, in a +different way. + +@example +INPUT PROGRAM. + LOOP #I=1 TO 50. + COMPUTE X=UNIFORM(10). + END CASE. + END LOOP. + END FILE. +END INPUT PROGRAM. +LIST/FORMAT=NUMBERED. +@end example + +The above example causes an active file to be created consisting of 50 +random variates between 0 and 10. + +@node LIST, MATRIX DATA, INPUT PROGRAM, Data Input and Output +@section LIST +@vindex LIST + +@display +LIST + /VARIABLES=var_list + /CASES=FROM start_index TO end_index BY incr_index + /FORMAT=@{UNNUMBERED,NUMBERED@} @{WRAP,SINGLE@} + @{NOWEIGHT,WEIGHT@} +@end display + +The @cmd{LIST} procedure prints the values of specified variables to the +listing file. + +The VARIABLES subcommand specifies the variables whose values are to be +printed. Keyword VARIABLES is optional. If the VARIABLES subcommand is not +specified then all variables in the active file are printed. + +The CASES subcommand can be used to specify a subset of cases to be +printed.
Specify FROM and the case number of the first case to print, +TO and the case number of the last case to print, and BY and the number +of cases to advance between printing cases, or any subset of those +settings. If CASES is not specified then all cases are printed. + +The FORMAT subcommand can be used to change the output format. NUMBERED +will print case numbers along with each case; UNNUMBERED, the default, +causes the case numbers to be omitted. The WRAP and SINGLE settings are +currently not used. WEIGHT will cause case weights to be printed along +with variable values; NOWEIGHT, the default, causes case weights to be +omitted from the output. + +Case numbers start from 1. They are counted after all transformations +have been considered. + +@cmd{LIST} attempts to fit all the values on a single line. If needed +to make them fit, variable names are displayed vertically. If values +cannot fit on a single line, then a multi-line format will be used. + +@cmd{LIST} is a procedure. It causes the data to be read. + +@node MATRIX DATA, NEW FILE, LIST, Data Input and Output +@section MATRIX DATA +@vindex MATRIX DATA + +@display +MATRIX DATA + /VARIABLES=var_list + /FILE='filename' + /FORMAT=@{LIST,FREE@} @{LOWER,UPPER,FULL@} @{DIAGONAL,NODIAGONAL@} + /SPLIT=@{new_var,var_list@} + /FACTORS=var_list + /CELLS=n_cells + /N=n + /CONTENTS=@{N_VECTOR,N_SCALAR,N_MATRIX,MEAN,STDDEV,COUNT,MSE, + DFE,MAT,COV,CORR,PROX@} +@end display + +The @cmd{MATRIX DATA} command reads square matrices in one of several textual +formats. @cmd{MATRIX DATA} clears the dictionary, replaces it with a new one, and +reads a +data file. + +Use VARIABLES to specify the variables that form the rows and columns of +the matrices. You may not specify a variable named @code{VARNAME_}. You +should specify VARIABLES first. + +Specify the file to read on FILE, either as a file name string or a file +handle (@pxref{FILE HANDLE}).
If FILE is not specified then matrix data +must immediately follow @cmd{MATRIX DATA} with a @cmd{BEGIN +DATA}@dots{}@cmd{END DATA} +construct (@pxref{BEGIN DATA}). + +The FORMAT subcommand specifies how the matrices are formatted. LIST, +the default, indicates that there is one line per row of matrix data; +FREE allows single matrix rows to be broken across multiple lines. This +is analogous to the difference between @cmd{DATA LIST FREE} and +@cmd{DATA LIST LIST} +(@pxref{DATA LIST}). LOWER, the default, indicates that the lower +triangle of the matrix is given; UPPER indicates the upper triangle; and +FULL indicates that the entire matrix is given. DIAGONAL, the default, +indicates that the diagonal is part of the data; NODIAGONAL indicates +that it is omitted. DIAGONAL/NODIAGONAL have no effect when FULL is +specified. + +The SPLIT subcommand is used to specify @cmd{SPLIT FILE} variables for the +input matrices (@pxref{SPLIT FILE}). Specify either a single variable +not specified on VARIABLES, or one or more variables that are specified +on VARIABLES. In the former case, the SPLIT values are not present in +the data and ROWTYPE_ may not be specified on VARIABLES. In the latter +case, the SPLIT values are present in the data. + +Specify a list of factor variables on FACTORS. Factor variables must +also be listed on VARIABLES. Factor variables are used when there are +some variables where, for each possible combination of their values, +statistics on the matrix variables are included in the data. + +If FACTORS is specified and ROWTYPE_ is not specified on VARIABLES, the +CELLS subcommand is required. Specify the number of factor variable +combinations that are given. For instance, if factor variable A has 2 +values and factor variable B has 3 values, specify 6. + +The N subcommand specifies a population number of observations. When N +is specified, one N record is output for each @cmd{SPLIT FILE}. 
+ +Use CONTENTS to specify what sort of information the matrices include. +Each possible option is described in more detail below. When ROWTYPE_ +is specified on VARIABLES, CONTENTS is optional; otherwise, if CONTENTS +is not specified then /CONTENTS=CORR is assumed. + +@table @asis +@item N +@item N_VECTOR +Number of observations as a vector, one value for each variable. +@item N_SCALAR +Number of observations as a single value. +@item N_MATRIX +Matrix of counts. +@item MEAN +Vector of means. +@item STDDEV +Vector of standard deviations. +@item COUNT +Vector of counts. +@item MSE +Vector of mean squared errors. +@item DFE +Vector of degrees of freedom. +@item MAT +Generic matrix. +@item COV +Covariance matrix. +@item CORR +Correlation matrix. +@item PROX +Proximities matrix. +@end table + +The exact semantics of the matrices read by @cmd{MATRIX DATA} are complex. +Right now @cmd{MATRIX DATA} isn't too useful due to a lack of procedures +accepting or producing related data, so these semantics aren't +documented. Later, they'll be described here in detail. + +@node NEW FILE, PRINT, MATRIX DATA, Data Input and Output +@section NEW FILE +@vindex NEW FILE + +@display +NEW FILE. +@end display + +The @cmd{NEW FILE} command clears the current active file. + +@node PRINT, PRINT EJECT, NEW FILE, Data Input and Output +@section PRINT +@vindex PRINT + +@display +PRINT + OUTFILE='filename' + RECORDS=n_lines + @{NOTABLE,TABLE@} + /[line_no] arg@dots{} + +arg takes one of the following forms: + 'string' [start-end] + var_list start-end [type_spec] + var_list (fortran_spec) + var_list * +@end display + +The @cmd{PRINT} transformation writes variable data to an output file. +@cmd{PRINT} is executed when a procedure causes the data to be read. +Follow @cmd{PRINT} by @cmd{EXECUTE} to print variable data without +invoking a procedure (@pxref{EXECUTE}). + +All @cmd{PRINT} subcommands are optional. + +The OUTFILE subcommand specifies the file to receive the output.
The +file may be a file name as a string or a file handle (@pxref{FILE +HANDLE}). If OUTFILE is not present then output will be sent to PSPP's +output listing file. + +The RECORDS subcommand specifies the number of lines to be output. The +number of lines may optionally be surrounded by parentheses. + +TABLE will cause the PRINT command to output a table to the listing file +that describes what it will print to the output file. NOTABLE, the +default, suppresses this output table. + +Introduce the strings and variables to be printed with a slash +(@samp{/}). Optionally, the slash may be followed by a number +indicating which output line will be specified. In the absence of this +line number, the next line number will be specified. Multiple lines may +be specified using multiple slashes with the intended output for a line +following its respective slash. + +Literal strings may be printed. Specify the string itself. Optionally +the string may be followed by a column number or range of column +numbers, specifying the location on the line for the string to be +printed. Otherwise, the string will be printed at the current position +on the line. + +Variables to be printed can be specified in the same ways as available +for @cmd{DATA LIST FIXED} (@pxref{DATA LIST FIXED}). In addition, a +variable +list may be followed by an asterisk (@samp{*}), which indicates that the +variables should be printed in their dictionary print formats, separated +by spaces. A variable list followed by a slash or the end of command +will be interpreted the same way. + +If a FORTRAN type specification is used to move backwards on the current +line and text is then written at that point, the line is +truncated to that length, although additional text written afterwards +extends the line again.
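+
+The following sketch, which assumes a hypothetical fixed-format input
+file @file{scores.dat} and a hypothetical output file
+@file{report.txt}, combines literal strings with variables positioned
+by column range:
+
+@example
+DATA LIST NOTABLE FILE='scores.dat' /NAME 1-10 (A) SCORE 12-13.
+PRINT OUTFILE='report.txt'
+      /'Name:' 1-5 NAME 7-16 (A) 'Score:' 18-23 SCORE 25-26.
+EXECUTE.
+@end example
+
+Because @cmd{PRINT} is a transformation, nothing is written until
+@cmd{EXECUTE} (or a procedure) causes the data to be read.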
+ +@node PRINT EJECT, PRINT SPACE, PRINT, Data Input and Output +@section PRINT EJECT +@vindex PRINT EJECT + +@display +PRINT EJECT + OUTFILE='filename' + RECORDS=n_lines + @{NOTABLE,TABLE@} + /[line_no] arg@dots{} + +arg takes one of the following forms: + 'string' [start-end] + var_list start-end [type_spec] + var_list (fortran_spec) + var_list * +@end display + +@cmd{PRINT EJECT} writes data to an output file. Before the data is +written, the current page in the listing file is ejected. + +@xref{PRINT}, for more information on syntax and usage. + +@node PRINT SPACE, REREAD, PRINT EJECT, Data Input and Output +@section PRINT SPACE +@vindex PRINT SPACE + +@display +PRINT SPACE OUTFILE='filename' n_lines. +@end display + +@cmd{PRINT SPACE} prints one or more blank lines to an output file. + +The OUTFILE subcommand is optional. It may be used to direct output to +a file specified by file name as a string or file handle (@pxref{FILE +HANDLE}). If OUTFILE is not specified then output will be directed to +the listing file. + +n_lines is also optional. If present, it is an expression +(@pxref{Expressions}) specifying the number of blank lines to be +printed. The expression must evaluate to a nonnegative value. + +@node REREAD, REPEATING DATA, PRINT SPACE, Data Input and Output +@section REREAD +@vindex REREAD + +@display +REREAD FILE=handle COLUMN=column. +@end display + +The @cmd{REREAD} transformation allows the previous input line in a +data file +already processed by @cmd{DATA LIST} or another input command to be re-read +for further processing. + +The FILE subcommand, which is optional, is used to specify the file to +have its line re-read. The file must be specified in the form of a file +handle (@pxref{FILE HANDLE}). If FILE is not specified then the last +file specified on @cmd{DATA LIST} will be assumed (last file specified +lexically, not in terms of flow-of-control). + +By default, the line re-read is re-read in its entirety. 
With the +COLUMN subcommand, a prefix of the line can be exempted from +re-reading. Specify an expression (@pxref{Expressions}) evaluating to +the first column that should be included in the re-read line. Columns +are numbered from 1 at the left margin. + +Issuing @code{REREAD} multiple times will not back up in the data +file. Instead, it will re-read the same line multiple times. + +@node REPEATING DATA, WRITE, REREAD, Data Input and Output +@section REPEATING DATA +@vindex REPEATING DATA + +@display +REPEATING DATA + /STARTS=start-end + /OCCURS=n_occurs + /FILE='filename' + /LENGTH=length + /CONTINUED[=cont_start-cont_end] + /ID=id_start-id_end=id_var + /@{TABLE,NOTABLE@} + /DATA=var_spec@dots{} + +where each var_spec takes one of the forms + var_list start-end [type_spec] + var_list (fortran_spec) +@end display + +@cmd{REPEATING DATA} parses groups of data repeating in +a uniform format, possibly with several groups on a single line. Each +group of data corresponds with one case. @cmd{REPEATING DATA} may only be +used within an @cmd{INPUT PROGRAM} structure (@pxref{INPUT PROGRAM}). +When used with @cmd{DATA LIST}, it +can be used to parse groups of cases that share a subset of variables +but differ in their other data. + +The STARTS subcommand is required. Specify a range of columns, using +literal numbers or numeric variable names. This range specifies the +columns on the first line that are used to contain groups of data. The +ending column is optional. If it is not specified, then the record +width of the input file is used. For the inline file (@pxref{BEGIN +DATA}) this is 80 columns; for a file with fixed record widths it is the +record width; for other files it is 1024 characters by default. + +The OCCURS subcommand is required. It must be a number or the name of a +numeric variable. Its value is the number of groups present in the +current record. + +The DATA subcommand is required. It must be the last subcommand +specified. 
It is used to specify the data present within each repeating +group. Column numbers are specified relative to the beginning of a +group at column 1. Data is specified in the same way as with @cmd{DATA LIST +FIXED} (@pxref{DATA LIST FIXED}). + +All other subcommands are optional. + +FILE specifies the file to read, either a file name as a string or a +file handle (@pxref{FILE HANDLE}). If FILE is not present then the +default is the last file handle used on @cmd{DATA LIST} (lexically, not in +terms of flow of control). + +By default @cmd{REPEATING DATA} will output a table describing how it will +parse the input data. Specifying NOTABLE will disable this behavior; +specifying TABLE will explicitly enable it. + +The LENGTH subcommand specifies the length in characters of each group. +If it is not present then length is inferred from the DATA subcommand. +LENGTH can be a number or a variable name. + +Normally all the data groups are expected to be present on a single +line. Use the CONTINUED command to indicate that data can be continued +onto additional lines. If data on continuation lines starts at the left +margin and continues through the entire field width, no column +specifications are necessary on CONTINUED. Otherwise, specify the +possible range of columns in the same way as on STARTS. + +When data groups are continued from line to line, it is easy +for cases to get out of sync through careless hand editing. The +ID subcommand allows a case identifier to be present on each line of +repeating data groups. @cmd{REPEATING DATA} will check for the same +identifier on each line and report mismatches. Specify the range of +columns that the identifier will occupy, followed by an equals sign +(@samp{=}) and the identifier variable name. The variable must already +have been declared with @cmd{NUMERIC} or another command. + +@cmd{REPEATING DATA} should be the last command given within an +@cmd{INPUT PROGRAM}. 
It should not be enclosed within a @cmd{LOOP} +structure (@pxref{LOOP}). Use @cmd{DATA LIST} before, not after, +@cmd{REPEATING DATA}. + +@node WRITE, , REPEATING DATA, Data Input and Output +@section WRITE +@vindex WRITE + +@display +WRITE + OUTFILE='filename' + RECORDS=n_lines + @{NOTABLE,TABLE@} + /[line_no] arg@dots{} + +arg takes one of the following forms: + 'string' [start-end] + var_list start-end [type_spec] + var_list (fortran_spec) + var_list * +@end display + +@cmd{WRITE} writes text or binary data to an output file. + +@xref{PRINT}, for more information on syntax and usage. The main +difference between @cmd{PRINT} and @cmd{WRITE} is that @cmd{WRITE} +uses write formats by default, whereas @cmd{PRINT} uses print formats. + +The sole additional difference is that if @cmd{WRITE} is used to send output +to a binary file, carriage control characters will not be output. +@xref{FILE HANDLE}, for information on how to declare a file as binary. +@setfilename ignored diff --git a/doc/data-selection.texi b/doc/data-selection.texi new file mode 100644 index 00000000..32914f04 --- /dev/null +++ b/doc/data-selection.texi @@ -0,0 +1,315 @@ +@node Data Selection, Conditionals and Looping, Data Manipulation, Top +@chapter Selecting data for analysis + +This chapter documents PSPP commands that temporarily or permanently +select data records from the active file for analysis. + +@menu +* FILTER:: Exclude cases based on a variable. +* N OF CASES:: Limit the size of the active file. +* PROCESS IF:: Temporarily excluding cases. +* SAMPLE:: Select a specified proportion of cases. +* SELECT IF:: Permanently delete selected cases. +* SPLIT FILE:: Do multiple analyses with one command. +* TEMPORARY:: Make transformations' effects temporary. +* WEIGHT:: Weight cases by a variable. +@end menu + +@node FILTER, N OF CASES, Data Selection, Data Selection +@section FILTER +@vindex FILTER + +@display +FILTER BY var_name. +FILTER OFF.
+@end display + +@cmd{FILTER} allows a boolean-valued variable to be used to select +cases from the data stream for processing. + +To set up filtering, specify BY and a variable name. Keyword +BY is optional but recommended. Cases which have a zero or system- or +user-missing value are excluded from analysis, but not deleted from the +data stream. Cases with other values are analyzed. +To filter based on a different condition, use +transformations such as @cmd{COMPUTE} or @cmd{RECODE} to compute a +filter variable of the required form, then specify that variable on +@cmd{FILTER}. + +@code{FILTER OFF} turns off case filtering. + +Filtering takes place immediately before cases pass to a procedure for +analysis. Only one filter variable may be active at a time. Normally, +case filtering continues until it is explicitly turned off with @code{FILTER +OFF}. However, if @cmd{FILTER} is placed after TEMPORARY, it filters only +the next procedure or procedure-like command. + +@node N OF CASES, PROCESS IF, FILTER, Data Selection +@section N OF CASES +@vindex N OF CASES + +@display +N [OF CASES] num_of_cases [ESTIMATED]. +@end display + +Sometimes you may want to disregard cases of your input. @cmd{N} can +do this. @code{N 100} tells PSPP to disregard all cases after the +first 100. + +If the value specified for @cmd{N} is greater than the number of cases +read in, the value is ignored. + +@cmd{N} does not discard cases or prevent them from being read. It +just causes cases beyond the last one specified to be ignored by data +analysis commands. + +A later @cmd{N} command can increase or decrease the number of cases +selected. (To select all the cases without knowing how many there are, +specify a very high number: 100000 or whatever you think is large enough.) + +Transformation procedures performed after @cmd{N} is executed +@emph{do} cause cases to be discarded. 
+ +@cmd{SAMPLE}, @cmd{PROCESS IF}, and @cmd{SELECT IF} have +precedence over @cmd{N}---the same results are obtained by both of the +following fragments, given the same random number seeds: + +@example +@i{@dots{}set up, read in data@dots{}} +N 100. +SAMPLE .5. +@i{@dots{}analyze data@dots{}} + +@i{@dots{}set up, read in data@dots{}} +SAMPLE .5. +N 100. +@i{@dots{}analyze data@dots{}} +@end example + +Both fragments above first randomly sample approximately half of the +cases, then select the first 100 of those sampled. + +@cmd{N} with the @code{ESTIMATED} keyword gives an +estimated number of cases before @cmd{DATA LIST} or another command to +read in data. @code{ESTIMATED} never limits the number of cases +processed by procedures. PSPP currently does not make use of +case count estimates. + +When @cmd{N} is specified after @cmd{TEMPORARY}, it affects only +the next procedure (@pxref{TEMPORARY}). + +@node PROCESS IF, SAMPLE, N OF CASES, Data Selection +@section PROCESS IF +@vindex PROCESS IF + +@example +PROCESS IF expression. +@end example + +@cmd{PROCESS IF} temporarily eliminates cases from the +data stream. Its effects are active only through the execution of the +next procedure or procedure-like command. + +Specify a boolean expression (@pxref{Expressions}). If the value of the +expression is true for a particular case, the case will be analyzed. If +the expression has a false or missing value, then the case will be +deleted from the data stream for this procedure only. + +Regardless of its placement relative to other commands, @cmd{PROCESS IF} +always takes effect immediately before data passes to the procedure. +Only one @cmd{PROCESS IF} command may be in effect at any given time. + +The effects of @cmd{PROCESS IF} are similar, but not identical, to the +effects of executing @cmd{TEMPORARY}, then @cmd{SELECT IF} +(@pxref{SELECT IF}). + +The filtering performed by @cmd{PROCESS IF} takes place immediately +before cases pass to a procedure for analysis. 
Because @cmd{PROCESS +IF} affects only a single procedure, its placement relative to +@cmd{TEMPORARY} is unimportant. + +@cmd{PROCESS IF} is deprecated. It is included for compatibility with +old command files. New syntax files should use @cmd{SELECT IF} or +@cmd{FILTER} instead. + +@node SAMPLE, SELECT IF, PROCESS IF, Data Selection +@section SAMPLE +@vindex SAMPLE + +@display +SAMPLE num1 [FROM num2]. +@end display + +@cmd{SAMPLE} randomly samples a proportion of the cases in the active +file. Unless it follows @cmd{TEMPORARY}, it operates as a +transformation, permanently removing cases from the active file. + +The proportion to sample can be expressed as a single number between 0 +and 1. If @code{k} is the number specified, and @code{N} is the number +of currently-selected cases in the active file, then after +@code{SAMPLE @var{k}.}, approximately @code{k*N} cases will be +selected. + +The proportion to sample can also be specified in the style @code{SAMPLE +@var{m} FROM @var{N}}. With this style, cases are selected as follows: + +@enumerate +@item +If @var{N} is equal to the number of currently-selected cases in the +active file, exactly @var{m} cases will be selected. + +@item +If @var{N} is greater than the number of currently-selected cases in the +active file, an equivalent proportion of cases will be selected. + +@item +If @var{N} is less than the number of currently-selected cases in the +active, exactly @var{m} cases will be selected @emph{from the first +@var{N} cases in the active file.} +@end enumerate + +@cmd{SAMPLE} and @cmd{SELECT IF} are performed in +the order specified by the syntax file. + +@cmd{SAMPLE} is always performed before @code{N OF CASES}, regardless +of ordering in the syntax file (@pxref{N OF CASES}). + +The same values for @cmd{SAMPLE} may result in different samples. To +obtain the same sample, use the @code{SET} command to set the random +number seed to the same value before each @cmd{SAMPLE}. 
Different +samples may still result when the file is processed on systems with +differing endianness or floating-point formats. By default, the +random number seed is based on the system time. + +@node SELECT IF, SPLIT FILE, SAMPLE, Data Selection +@section SELECT IF +@vindex SELECT IF + +@display +SELECT IF expression. +@end display + +@cmd{SELECT IF} selects cases for analysis based on the value of a +boolean expression. Cases not selected are permanently eliminated +from the active file, unless @cmd{TEMPORARY} is in effect +(@pxref{TEMPORARY}). + +Specify a boolean expression (@pxref{Expressions}). If the value of the +expression is true for a particular case, the case will be analyzed. If +the expression has a false or missing value, then the case will be +deleted from the data stream. + +Place @cmd{SELECT IF} as early in the command file as +possible. Cases that are deleted early can be processed more +efficiently in time and space. + +When @cmd{SELECT IF} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + +@node SPLIT FILE, TEMPORARY, SELECT IF, Data Selection +@section SPLIT FILE +@vindex SPLIT FILE + +@display +Two possible syntaxes: + SPLIT FILE BY var_list. + SPLIT FILE OFF. +@end display + +@cmd{SPLIT FILE} allows multiple sets of data present in one data +file to be analyzed separately using single statistical procedure +commands. + +Specify a list of variable names to analyze multiple sets of +data separately. Groups of cases having the same values for these +variables are analyzed by statistical procedure commands as one group. +An independent analysis is carried out for each group of cases, and the +variable values for the group are printed along with the analysis. + +Specify OFF to disable @cmd{SPLIT FILE} and resume analysis of the +entire active file as a single group of data. 
+ +When @cmd{SPLIT FILE} is specified after @cmd{TEMPORARY}, it affects only +the next procedure (@pxref{TEMPORARY}). + +@node TEMPORARY, WEIGHT, SPLIT FILE, Data Selection +@section TEMPORARY +@vindex TEMPORARY + +@display +TEMPORARY. +@end display + +@cmd{TEMPORARY} is used to make the effects of transformations +following its execution temporary. These transformations will +affect only the execution of the next procedure or procedure-like +command. Their effects will not be saved to the active file. + +The only specification on @cmd{TEMPORARY} is the command name. + +@cmd{TEMPORARY} may not appear within a @cmd{DO IF} or @cmd{LOOP} +construct. It may appear only once between procedures and +procedure-like commands. + +Scratch variables cannot be used following @cmd{TEMPORARY}. + +An example may help to clarify: + +@example +DATA LIST /X 1-2. +BEGIN DATA. + 2 + 4 +10 +15 +20 +24 +END DATA. +COMPUTE X=X/2. +TEMPORARY. +COMPUTE X=X+3. +DESCRIPTIVES X. +DESCRIPTIVES X. +@end example + +The data read by the first @cmd{DESCRIPTIVES} are 4, 5, 8, +10.5, 13, 15. The data read by the second @cmd{DESCRIPTIVES} are 1, 2, +5, 7.5, 10, 12. + +@node WEIGHT, , TEMPORARY, Data Selection +@section WEIGHT +@vindex WEIGHT + +@display +WEIGHT BY var_name. +WEIGHT OFF. +@end display + +@cmd{WEIGHT} assigns cases varying weights, +changing the frequency distribution of the active file. Execution of +@cmd{WEIGHT} is delayed until data have been read. + +If a variable name is specified, @cmd{WEIGHT} causes the values of that +variable to be used as weighting factors for subsequent statistical +procedures. Use of keyword BY is optional but recommended. Weighting +variables must be numeric. Scratch variables may not be used for +weighting (@pxref{Scratch Variables}). + +When OFF is specified, subsequent statistical procedures will weight all +cases equally.
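+
+A minimal sketch, assuming a hypothetical data set in which each case
+carries its own frequency count in a variable named @code{freq} (the
+variable names and data values here are illustrative only):
+
+@example
+DATA LIST LIST /score freq.
+BEGIN DATA.
+1 3
+2 5
+3 2
+END DATA.
+WEIGHT BY freq.
+DESCRIPTIVES score.
+WEIGHT OFF.
+DESCRIPTIVES score.
+@end example
+
+With weighting on, the first @cmd{DESCRIPTIVES} should treat the three
+cases as ten observations (3 + 5 + 2); the second, after
+@code{WEIGHT OFF}, weights each of the three cases equally.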
+ +A positive integer weighting factor @var{w} on a case will yield the +same statistical output as would replicating the case @var{w} times. +A weighting factor of 0 is treated for statistical purposes as if the +case did not exist in the input. Weighting values need not be +integers, but negative and system-missing values for the weighting +variable are interpreted as weighting factors of 0. User-missing +values are not treated specially. + +When @cmd{WEIGHT} is specified after @cmd{TEMPORARY}, it affects only +the next procedure (@pxref{TEMPORARY}). + +@cmd{WEIGHT} does not cause cases in the active file to be replicated in +memory. +@setfilename ignored diff --git a/doc/expressions.texi b/doc/expressions.texi new file mode 100644 index 00000000..87355293 --- /dev/null +++ b/doc/expressions.texi @@ -0,0 +1,1229 @@ +@node Expressions, Data Input and Output, Language, Top +@chapter Mathematical Expressions +@cindex expressions, mathematical +@cindex mathematical expressions + +Some PSPP commands use expressions, which share a common syntax +among all PSPP commands. Expressions are made up of +@dfn{operands}, which can be numbers, strings, or variable names, +separated by @dfn{operators}. There are five types of operators: +grouping, arithmetic, logical, relational, and functions. + +Every operator takes one or more @dfn{arguments} as input and produces +or @dfn{returns} exactly one result as output. Both strings and numeric +values can be used as arguments and are produced as results, but each +operator accepts only specific combinations of numeric and string values +as arguments. With few exceptions, operator arguments may be +full-fledged expressions in themselves. + +@menu +* Boolean Values:: Boolean values. +* Missing Values in Expressions:: Using missing values in expressions. 
+* Grouping Operators:: parentheses +* Arithmetic Operators:: add sub mul div pow +* Logical Operators:: AND NOT OR +* Relational Operators:: EQ GE GT LE LT NE +* Functions:: More-sophisticated operators. +* Order of Operations:: Operator precedence. +@end menu + +@node Boolean Values, Missing Values in Expressions, Expressions, Expressions +@section Boolean Values +@cindex Boolean +@cindex values, Boolean + +Some PSPP operators and expressions work with Boolean values, which +represent true/false conditions. Booleans have only three possible +values: 0 (false), 1 (true), and system-missing (unknown). +System-missing is neither true nor false and indicates that the true +value is unknown. + +Boolean-typed operands or function arguments must take on one of these +three values. Other values are considered false, but cause an error +when the expression is evaluated. + +Strings and Booleans are not compatible, and neither may be used in +place of the other. + +@node Missing Values in Expressions, Grouping Operators, Boolean Values, Expressions +@section Missing Values in Expressions + +String missing values are not treated specially in expressions. Most +numeric operators return system-missing when given system-missing +arguments. Exceptions are listed under particular operator +descriptions. + +User-missing values for numeric variables are always transformed into +the system-missing value, except inside the arguments to the +@code{VALUE} and @code{SYSMIS} functions. + +The missing-value functions can be used to precisely control how missing +values are treated in expressions. @xref{Missing Value Functions}, for +more details. + +@node Grouping Operators, Arithmetic Operators, Missing Values in Expressions, Expressions +@section Grouping Operators +@cindex parentheses +@cindex @samp{( )} +@cindex grouping operators +@cindex operators, grouping + +Parentheses (@samp{()}) are the grouping operators. Surround an +expression with parentheses to force early evaluation. 
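+
+As a brief illustration (the variable names @code{a} and @code{b} are
+hypothetical, and the usual precedence of multiplication over addition
+is assumed), parentheses change the result of an otherwise identical
+expression:
+
+@example
+COMPUTE a = 2 + 3 * 4.
+COMPUTE b = (2 + 3) * 4.
+@end example
+
+Here @code{a} is 14, while @code{b} is 20 because the parenthesized sum
+is evaluated first.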
+ +Parentheses also surround the arguments to functions, but in that +situation they act as punctuators, not as operators. + +@node Arithmetic Operators, Logical Operators, Grouping Operators, Expressions +@section Arithmetic Operators +@cindex operators, arithmetic +@cindex arithmetic operators + +The arithmetic operators take numeric arguments and produce numeric +results. + +@table @code +@cindex @samp{+} +@cindex addition +@item @var{a} + @var{b} +Adds @var{a} and @var{b}, returning the sum. + +@cindex @samp{-} +@cindex subtraction +@item @var{a} - @var{b} +Subtracts @var{b} from @var{a}, returning the difference. + +@cindex @samp{*} +@cindex multiplication +@item @var{a} * @var{b} +Multiplies @var{a} and @var{b}, returning the product. + +@cindex @samp{/} +@cindex division +@item @var{a} / @var{b} +Divides @var{a} by @var{b}, returning the quotient. If @var{b} is +zero, the result is system-missing. + +@cindex @samp{**} +@cindex exponentiation +@item @var{a} ** @var{b} +Returns the result of raising @var{a} to the power @var{b}. If +@var{a} is negative and @var{b} is not an integer, the result is +system-missing. The result of @code{0**0} is system-missing as well. + +@cindex @samp{-} +@cindex negation +@item - @var{a} +Reverses the sign of @var{a}. +@end table + +@node Logical Operators, Relational Operators, Arithmetic Operators, Expressions +@section Logical Operators +@cindex logical operators +@cindex operators, logical + +@cindex true +@cindex false +@cindex Boolean +@cindex values, system-missing +@cindex system-missing +The logical operators take logical arguments and produce logical +results, meaning ``true or false''. PSPP logical operators are +not true Boolean operators because they may also result in a +system-missing value. 
+ +@table @code +@cindex @code{AND} +@cindex @samp{&} +@cindex intersection, logical +@cindex logical intersection +@item @var{a} AND @var{b} +@itemx @var{a} & @var{b} +True if both @var{a} and @var{b} are true, false otherwise. If one +argument is false, the result is false even if the other is missing. If +both arguments are missing, the result is missing. + +@cindex @code{OR} +@cindex @samp{|} +@cindex union, logical +@cindex logical union +@item @var{a} OR @var{b} +@itemx @var{a} | @var{b} +True if at least one of @var{a} and @var{b} is true. If one argument is +true, the result is true even if the other argument is missing. If both +arguments are missing, the result is missing. + +@cindex @code{NOT} +@cindex @samp{~} +@cindex inversion, logical +@cindex logical inversion +@item NOT @var{a} +@itemx ~ @var{a} +True if @var{a} is false. If the argument is missing, then the result +is missing. +@end table + +@node Relational Operators, Functions, Logical Operators, Expressions +@section Relational Operators + +The relational operators take numeric or string arguments and produce Boolean +results. + +Strings cannot be compared to numbers. When strings of different +lengths are compared, the shorter string is right-padded with spaces +to match the length of the longer string. + +The results of string comparisons, other than tests for equality or +inequality, are dependent on the character set in use. String +comparisons are case-sensitive. + +@table @code +@cindex equality, testing +@cindex testing for equality +@cindex @code{EQ} +@cindex @samp{=} +@item @var{a} EQ @var{b} +@itemx @var{a} = @var{b} +True if @var{a} is equal to @var{b}. + +@cindex less than or equal to +@cindex @code{LE} +@cindex @code{<=} +@item @var{a} LE @var{b} +@itemx @var{a} <= @var{b} +True if @var{a} is less than or equal to @var{b}. + +@cindex less than +@cindex @code{LT} +@cindex @code{<} +@item @var{a} LT @var{b} +@itemx @var{a} < @var{b} +True if @var{a} is less than @var{b}. 
+ +@cindex greater than or equal to +@cindex @code{GE} +@cindex @code{>=} +@item @var{a} GE @var{b} +@itemx @var{a} >= @var{b} +True if @var{a} is greater than or equal to @var{b}. + +@cindex greater than +@cindex @code{GT} +@cindex @samp{>} +@item @var{a} GT @var{b} +@itemx @var{a} > @var{b} +True if @var{a} is greater than @var{b}. + +@cindex inequality, testing +@cindex testing for inequality +@cindex @code{NE} +@cindex @code{~=} +@cindex @code{<>} +@item @var{a} NE @var{b} +@itemx @var{a} ~= @var{b} +@itemx @var{a} <> @var{b} +True if @var{a} is not equal to @var{b}. +@end table + +@node Functions, Order of Operations, Relational Operators, Expressions +@section Functions +@cindex functions + +@cindex mathematics +@cindex operators +@cindex parentheses +@cindex @code{(} +@cindex @code{)} +@cindex names, of functions +PSPP functions provide mathematical abilities above and beyond +those possible using simple operators. Functions have a common +syntax: each is composed of a function name followed by a left +parenthesis, one or more arguments, and a right parenthesis. Function +names are @strong{not} reserved; their names are specially treated +only when followed by a left parenthesis: @code{EXP(10)} refers to the +constant value @code{e} raised to the 10th power, but @code{EXP} by +itself refers to the value of variable EXP. + +The sections below describe each function in detail.
+ +@menu +* Advanced Mathematics:: EXP LG10 LN SQRT +* Miscellaneous Mathematics:: ABS MOD MOD10 RND TRUNC +* Trigonometry:: ACOS ARCOS ARSIN ARTAN ASIN ATAN COS SIN TAN +* Missing Value Functions:: MISSING NMISS NVALID SYSMIS VALUE +* Pseudo-Random Numbers:: NORMAL UNIFORM +* Set Membership:: ANY RANGE +* Statistical Functions:: CFVAR MAX MEAN MIN SD SUM VARIANCE +* String Functions:: CONCAT INDEX LENGTH LOWER LPAD LTRIM NUMBER + RINDEX RPAD RTRIM STRING SUBSTR UPCASE +* Time & Date:: CTIME.xxx DATE.xxx TIME.xxx XDATE.xxx +* Miscellaneous Functions:: LAG YRMODA +* Functions Not Implemented:: CDF.xxx CDFNORM IDF.xxx NCDF.xxx PROBIT RV.xxx +@end menu + +@node Advanced Mathematics, Miscellaneous Mathematics, Functions, Functions +@subsection Advanced Mathematical Functions +@cindex mathematics, advanced + +Advanced mathematical functions take numeric arguments and produce +numeric results. + +@deftypefn {Function} {} EXP (@var{exponent}) +Returns @i{e} (approximately 2.71828) raised to power @var{exponent}. +@end deftypefn + +@cindex logarithms +@deftypefn {Function} {} LG10 (@var{number}) +Takes the base-10 logarithm of @var{number}. If @var{number} is +not positive, the result is system-missing. +@end deftypefn + +@deftypefn {Function} {} LN (@var{number}) +Takes the base-@i{e} logarithm of @var{number}. If @var{number} is +not positive, the result is system-missing. +@end deftypefn + +@cindex square roots +@deftypefn {Function} {} SQRT (@var{number}) +Takes the square root of @var{number}. If @var{number} is negative, +the result is system-missing. +@end deftypefn + +@node Miscellaneous Mathematics, Trigonometry, Advanced Mathematics, Functions +@subsection Miscellaneous Mathematical Functions +@cindex mathematics, miscellaneous + +Miscellaneous mathematical functions take numeric arguments and produce +numeric results. + +@cindex absolute value +@deftypefn {Function} {} ABS (@var{number}) +Results in the absolute value of @var{number}. 
+@end deftypefn + +@cindex modulus +@deftypefn {Function} {} MOD (@var{numerator}, @var{denominator}) +Returns the remainder (modulus) of @var{numerator} divided by +@var{denominator}. If @var{denominator} is 0, the result is +system-missing. However, if @var{numerator} is 0 and +@var{denominator} is system-missing, the result is 0. +@end deftypefn + +@cindex modulus, by 10 +@deftypefn {Function} {} MOD10 (@var{number}) +Returns the remainder when @var{number} is divided by 10. If +@var{number} is negative, MOD10(@var{number}) is negative or zero. +@end deftypefn + +@cindex rounding +@deftypefn {Function} {} RND (@var{number}) +Takes the absolute value of @var{number} and rounds it to an integer. +Then, if @var{number} was negative originally, negates the result. +@end deftypefn + +@cindex truncation +@deftypefn {Function} {} TRUNC (@var{number}) +Discards the fractional part of @var{number}; that is, rounds +@var{number} towards zero. +@end deftypefn + +@node Trigonometry, Missing Value Functions, Miscellaneous Mathematics, Functions +@subsection Trigonometric Functions +@cindex trigonometry + +Trigonometric functions take numeric arguments and produce numeric +results. + +@cindex arccosine +@cindex inverse cosine +@deftypefn {Function} {} ARCOS (@var{number}) +Takes the arccosine, in radians, of @var{number}. Results in +system-missing if @var{number} is not between -1 and 1. +@end deftypefn + +@cindex arcsine +@cindex inverse sine +@deftypefn {Function} {} ARSIN (@var{number}) +Takes the arcsine, in radians, of @var{number}. Results in +system-missing if @var{number} is not between -1 and 1 inclusive. +@end deftypefn + +@cindex arctangent +@cindex inverse tangent +@deftypefn {Function} {} ARTAN (@var{number}) +Takes the arctangent, in radians, of @var{number}. +@end deftypefn + +@cindex cosine +@deftypefn {Function} {} COS (@var{angle}) +Takes the cosine of @var{angle} which should be in radians. 
+@end deftypefn + +@cindex sine +@deftypefn {Function} {} SIN (@var{angle}) +Takes the sine of @var{angle} which should be in radians. +@end deftypefn + +@cindex tangent +@deftypefn {Function} {} TAN (@var{angle}) +Takes the tangent of @var{angle} which should be in radians. +Results in system-missing at values +of @var{angle} that are too close to odd multiples of pi/2. +Portability: none. +@end deftypefn + +@node Missing Value Functions, Pseudo-Random Numbers, Trigonometry, Functions +@subsection Missing-Value Functions +@cindex missing values +@cindex values, missing +@cindex functions, missing-value + +Missing-value functions take various numeric arguments and yield +various types of results. Note that the normal rules of evaluation +apply within expression arguments to these functions. In particular, +user-missing values for numeric variables are converted to +system-missing values. + +@deftypefn {Function} {} MISSING (@var{expr}) +Returns 1 if @var{expr} has the system-missing value, 0 otherwise. +@end deftypefn + +@deftypefn {Function} {} NMISS (@var{expr} [, @var{expr}]@dots{}) +Each argument must be a numeric expression. Returns the number of +system-missing values in the list. As a special extension, +the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a +range of variables; see @ref{Sets of Variables}, for more details. +@end deftypefn + +@deftypefn {Function} {} NVALID (@var{expr} [, @var{expr}]@dots{}) +Each argument must be a numeric expression. Returns the number of +values in the list that are not system-missing. As a special extension, +the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a +range of variables; see @ref{Sets of Variables}, for more details. +@end deftypefn + +@deftypefn {Function} {} SYSMIS (@var{expr}) +When @var{expr} is simply the name of a numeric variable, returns 1 if +the variable has the system-missing value, 0 if it is user-missing or +not missing. 
If given @var{expr} takes another form, results in 1 if +the value is system-missing, 0 otherwise. +@end deftypefn + +@deftypefn {Function} {} VALUE (@var{variable}) +Prevents the user-missing values of @var{variable} from being +transformed into system-missing values, and always results in the +actual value of @var{variable}, whether it is user-missing, +system-missing or not missing at all. +@end deftypefn + +@node Pseudo-Random Numbers, Set Membership, Missing Value Functions, Functions +@subsection Pseudo-Random Number Generation Functions +@cindex random numbers +@cindex pseudo-random numbers (see random numbers) + +Pseudo-random number generation functions take numeric arguments and +produce numeric results. + +PSPP uses the alleged RC4 cipher as a pseudo-random number generator +(PRNG). The bytes output by this PRNG are system-independent for a +given random seed, but differences in endianness and floating-point +formats will make PRNG results differ from system to system. RC4 +should produce high-quality random numbers for simulation purposes. +(If you're concerned about the quality of the random number generator, +well, you're using a statistical processing package---analyze it!) + +PSPP's implementation of RC4 has not undergone any security auditing. +Furthermore, various precautions that would be necessary for secure +operation, such as secure seeding and discarding the first several +bytes of output, have not been taken. Therefore, PSPP's +implementation of RC4 should not be used for security purposes. + +@cindex random numbers, normally-distributed +@deftypefn {Function} {} NORMAL (@var{number}) +Results in a random number. Results from @code{NORMAL} are normally +distributed with a mean of 0 and a standard deviation of @var{number}. +@end deftypefn + +@cindex random numbers, uniformly-distributed +@deftypefn {Function} {} UNIFORM (@var{number}) +Results in a random number between 0 and @var{number}. 
Results from +@code{UNIFORM} are evenly distributed across its entire range. There +may be a maximum on the largest random number ever generated---this is +often +@ifinfo +2**31-1 +@end ifinfo +@tex +$2^{31}-1$ +@end tex +(2,147,483,647), but it may be orders of magnitude +higher or lower. +@end deftypefn + +@node Set Membership, Statistical Functions, Pseudo-Random Numbers, Functions +@subsection Set-Membership Functions +@cindex set membership +@cindex membership, of set + +Set membership functions determine whether a value is a member of a set. +They take a set of numeric arguments or a set of string arguments, and +produce Boolean results. + +String comparisons are performed according to the rules given in +@ref{Relational Operators}. + +@deftypefn {Function} {} ANY (@var{value}, @var{set} [, @var{set}]@dots{}) +Results in true if @var{value} is equal to any of the @var{set} +values. Otherwise, results in false. If @var{value} is +system-missing, returns system-missing. System-missing values in +@var{set} do not cause ANY to return system-missing. +@end deftypefn + +@deftypefn {Function} {} RANGE (@var{value}, @var{low}, @var{high} [, @var{low}, @var{high}]@dots{}) +Results in true if @var{value} is in any of the intervals bounded by +@var{low} and @var{high} inclusive. Otherwise, results in false. +Each @var{low} must be less than or equal to its corresponding +@var{high} value. @var{low} and @var{high} must be given in pairs. +If @var{value} is system-missing, returns system-missing. +System-missing values in @var{low} or @var{high} do not cause RANGE to return +system-missing. +@end deftypefn + +@node Statistical Functions, String Functions, Set Membership, Functions +@subsection Statistical Functions +@cindex functions, statistical +@cindex statistics + +Statistical functions compute descriptive statistics on a list of +values. Some statistics can be computed on numeric or string values; +others can only be computed on numeric values.
Their results have the +same type as their arguments. The current case's weighting factor +(@pxref{WEIGHT}) has no effect on statistical functions. + +@cindex arguments, minimum valid +@cindex minimum valid number of arguments +With statistical functions it is possible to specify a minimum number of +non-missing arguments for the function to be evaluated. To do so, +append a dot and the number to the function name. For instance, to +specify a minimum of three valid arguments to the MEAN function, use the +name @code{MEAN.3}. + +@cindex coefficient of variation +@cindex variation, coefficient of +@deftypefn {Function} {} CFVAR (@var{number}, @var{number}[, @dots{}]) +Results in the coefficient of variation of the values of @var{number}. +This function requires at least two valid arguments to give a +non-missing result. (The coefficient of variation is the standard +deviation divided by the mean.) +@end deftypefn + +@cindex maximum +@deftypefn {Function} {} MAX (@var{value}, @var{value}[, @dots{}]) +Results in the value of the greatest @var{value}. The @var{value}s may +be numeric or string. Although at least two arguments must be given, +only one need be valid for MAX to give a non-missing result. +@end deftypefn + +@cindex mean +@deftypefn {Function} {} MEAN (@var{number}, @var{number}[, @dots{}]) +Results in the mean of the values of @var{number}. Although at least +two arguments must be given, only one need be valid for MEAN to give a +non-missing result. +@end deftypefn + +@cindex minimum +@deftypefn {Function} {} MIN (@var{value}, @var{value}[, @dots{}]) +Results in the value of the least @var{value}. The @var{value}s may +be numeric or string. Although at least two arguments must be given, +only one need be valid for MIN to give a non-missing result. +@end deftypefn + +@cindex standard deviation +@cindex deviation, standard +@deftypefn {Function} {} SD (@var{number}, @var{number}[, @dots{}]) +Results in the standard deviation of the values of @var{number}.
+This function requires at least two valid arguments to give a +non-missing result. +@end deftypefn + +@cindex sum +@deftypefn {Function} {} SUM (@var{number}, @var{number}[, @dots{}]) +Results in the sum of the values of @var{number}. Although at least two +arguments must be given, only one need be valid for SUM to give a +non-missing result. +@end deftypefn + +@cindex variance +@deftypefn {Function} {} VARIANCE (@var{number}, @var{number}[, @dots{}]) +Results in the variance of the values of @var{number}. This function +requires at least two valid arguments to give a non-missing result. +@end deftypefn + +@node String Functions, Time & Date, Statistical Functions, Functions +@subsection String Functions +@cindex functions, string +@cindex string functions + +String functions take various arguments and return various results. + +@cindex concatenation +@cindex strings, concatenation of +@deftypefn {Function} {} CONCAT (@var{string}, @var{string}[, @dots{}]) +Returns a string consisting of each @var{string} in sequence. +@code{CONCAT("abc", "def", "ghi")} has a value of @code{"abcdefghi"}. +The resultant string is truncated to a maximum of 255 characters. +@end deftypefn + +@cindex searching strings +@deftypefn {Function} {} INDEX (@var{haystack}, @var{needle}) +Returns a positive integer indicating the position of the first +occurrence of @var{needle} in @var{haystack}. Returns 0 if @var{haystack} +does not contain @var{needle}. Returns system-missing if @var{needle} +is an empty string. +@end deftypefn + +@deftypefn {Function} {} INDEX (@var{haystack}, @var{needle}, @var{divisor}) +Divides @var{needle} into parts, each with length @var{divisor}. +Searches @var{haystack} for the first occurrence of each part, and +returns the smallest value. Returns 0 if @var{haystack} does not +contain any part in @var{needle}. It is an error if @var{divisor} +cannot be evenly divided into the length of @var{needle}. Returns +system-missing if @var{needle} is an empty string.
+@end deftypefn + +@cindex strings, finding length of +@deftypefn {Function} {} LENGTH (@var{string}) +Returns the number of characters in @var{string}. +@end deftypefn + +@cindex strings, case of +@deftypefn {Function} {} LOWER (@var{string}) +Returns a string identical to @var{string} except that all uppercase +letters are changed to lowercase letters. The definitions of +``uppercase'' and ``lowercase'' are system-dependent. +@end deftypefn + +@cindex strings, padding +@deftypefn {Function} {} LPAD (@var{string}, @var{length}) +If @var{string} is at least @var{length} characters in length, returns +@var{string} unchanged. Otherwise, returns @var{string} padded with +spaces on the left side to length @var{length}. Returns an empty string +if @var{length} is system-missing, negative, or greater than 255. +@end deftypefn + +@deftypefn {Function} {} LPAD (@var{string}, @var{length}, @var{padding}) +If @var{string} is at least @var{length} characters in length, returns +@var{string} unchanged. Otherwise, returns @var{string} padded with +@var{padding} on the left side to length @var{length}. Returns an empty +string if @var{length} is system-missing, negative, or greater than 255, or +if @var{padding} does not contain exactly one character. +@end deftypefn + +@cindex strings, trimming +@cindex whitespace, trimming +@deftypefn {Function} {} LTRIM (@var{string}) +Returns @var{string}, after removing leading spaces. Other whitespace, +such as tabs, carriage returns, line feeds, and vertical tabs, is not +removed. +@end deftypefn + +@deftypefn {Function} {} LTRIM (@var{string}, @var{padding}) +Returns @var{string}, after removing leading @var{padding} characters. +If @var{padding} does not contain exactly one character, returns an +empty string. 
+@end deftypefn + +@cindex numbers, converting from strings +@cindex strings, converting to numbers +@deftypefn {Function} {} NUMBER (@var{string}, @var{format}) +Returns the number produced when @var{string} is interpreted according +to format specifier @var{format}. If the format width @var{w} is less +than the length of @var{string}, then only the first @var{w} +characters in @var{string} are used, e.g.@: @code{NUMBER("123", F3.0)} +and @code{NUMBER("1234", F3.0)} both have value 123. If @var{w} is +greater than @var{string}'s length, then it is treated as if it were +right-padded with spaces. If @var{string} is not in the correct +format for @var{format}, system-missing is returned. +@end deftypefn + +@cindex strings, searching backwards +@deftypefn {Function} {} RINDEX (@var{haystack}, @var{needle}) +Returns a positive integer indicating the position of the last +occurrence of @var{needle} in @var{haystack}. Returns 0 if +@var{haystack} does not contain @var{needle}. Returns system-missing if +@var{needle} is an empty string. +@end deftypefn + +@deftypefn {Function} {} RINDEX (@var{haystack}, @var{needle}, @var{divisor}) +Divides @var{needle} into parts, each with length @var{divisor}. +Searches @var{haystack} for the last occurrence of each part, and +returns the largest value. Returns 0 if @var{haystack} does not contain +any part in @var{needle}. It is an error if @var{divisor} cannot be +evenly divided into the length of @var{needle}. Returns system-missing +if @var{needle} is an empty string. +@end deftypefn + +@cindex padding strings +@cindex strings, padding +@deftypefn {Function} {} RPAD (@var{string}, @var{length}) +If @var{string} is at least @var{length} characters in length, returns +@var{string} unchanged. Otherwise, returns @var{string} padded with +spaces on the right to length @var{length}. Returns an empty string if +@var{length} is system-missing, negative, or greater than 255.
+@end deftypefn + +@deftypefn {Function} {} RPAD (@var{string}, @var{length}, @var{padding}) +If @var{string} is at least @var{length} characters in length, returns +@var{string} unchanged. Otherwise, returns @var{string} padded with +@var{padding} on the right to length @var{length}. Returns an empty +string if @var{length} is system-missing, negative, or greater than 255, +or if @var{padding} does not contain exactly one character. +@end deftypefn + +@cindex strings, trimming +@cindex whitespace, trimming +@deftypefn {Function} {} RTRIM (@var{string}) +Returns @var{string}, after removing trailing spaces. Other types of +whitespace are not removed. +@end deftypefn + +@deftypefn {Function} {} RTRIM (@var{string}, @var{padding}) +Returns @var{string}, after removing trailing @var{padding} characters. +If @var{padding} does not contain exactly one character, returns an +empty string. +@end deftypefn + +@cindex strings, converting from numbers +@cindex numbers, converting to strings +@deftypefn {Function} {} STRING (@var{number}, @var{format}) +Returns a string corresponding to @var{number} in the format given by +format specifier @var{format}. For example, @code{STRING(123.56, F5.1)} +has the value @code{"123.6"}. +@end deftypefn + +@cindex substrings +@cindex strings, taking substrings of +@deftypefn {Function} {} SUBSTR (@var{string}, @var{start}) +Returns a string consisting of the value of @var{string} from position +@var{start} onward. Returns an empty string if @var{start} is system-missing +or has a value less than 1 or greater than the number of characters in +@var{string}. +@end deftypefn + +@deftypefn {Function} {} SUBSTR (@var{string}, @var{start}, @var{count}) +Returns a string consisting of the first @var{count} characters from +@var{string} beginning at position @var{start}. 
Returns an empty string +if @var{start} or @var{count} is system-missing, if @var{start} is less +than 1 or greater than the number of characters in @var{string}, or if +@var{count} is less than 1. Returns a string shorter than @var{count} +characters if @var{start} + @var{count} - 1 is greater than the number +of characters in @var{string}. Examples: @code{SUBSTR("abcdefg", 3, 2)} +has value @code{"cd"}; @code{SUBSTR("Ben Pfaff", 5, 10)} has the value +@code{"Pfaff"}. +@end deftypefn + +@cindex case conversion +@cindex strings, case of +@deftypefn {Function} {} UPCASE (@var{string}) +Returns @var{string}, changing lowercase letters to uppercase letters. +@end deftypefn + +@node Time & Date, Miscellaneous Functions, String Functions, Functions +@subsection Time & Date Functions +@cindex functions, time & date +@cindex times +@cindex dates + +@cindex dates, legal range of +The legal range of dates for use in PSPP is 15 Oct 1582 +through 31 Dec 19999. + +@cindex arguments, invalid +@cindex invalid arguments +@quotation +@strong{Please note:} Most time & date extraction functions will accept +invalid arguments: + +@itemize @bullet +@item +Negative numbers in PSPP time format. +@item +Numbers less than 86,400 in PSPP date format. +@end itemize + +However, sensible results are not guaranteed for these invalid values. +The given equivalents for these functions are definitely not guaranteed +for invalid values. +@end quotation + +@quotation +@strong{Please note also:} The time & date construction +functions @strong{do} produce reasonable and useful results for +out-of-range values; these are not considered invalid. 
+@end quotation + +@menu +* Time & Date Concepts:: How times & dates are defined and represented +* Time Construction:: TIME.@{DAYS HMS@} +* Time Extraction:: CTIME.@{DAYS HOURS MINUTES SECONDS@} +* Date Construction:: DATE.@{DMY MDY MOYR QYR WKYR YRDAY@} +* Date Extraction:: XDATE.@{DATE HOUR JDAY MDAY MINUTE MONTH + QUARTER SECOND TDAY TIME WEEK + WKDAY YEAR@} +@end menu + +@node Time & Date Concepts, Time Construction, Time & Date, Time & Date +@subsubsection How times & dates are defined and represented + +@cindex time, concepts +@cindex time, intervals +Times and dates are handled by PSPP as single numbers. A +@dfn{time} is an interval. PSPP measures times in seconds. +Thus, the following intervals correspond with the numeric values given: + +@example + 10 minutes 600 + 1 hour 3,600 + 1 day, 3 hours, 10 seconds 97,210 + 40 days 3,456,000 + 10010 d, 14 min, 24 s 864,864,864 +@end example + +@cindex dates, concepts +@cindex time, instants of +A @dfn{date}, on the other hand, is a particular instant in the past or +the future. PSPP represents a date as a number of seconds after the +midnight that separated 8 Oct 1582 and 9 Oct 1582. (Please note that 15 +Oct 1582 immediately followed 9 Oct 1582.) Thus, the midnights before +the dates given below correspond with the numeric PSPP dates given: + +@example + 15 Oct 1582 86,400 + 4 Jul 1776 6,113,318,400 + 1 Jan 1900 10,010,390,400 + 1 Oct 1978 12,495,427,200 + 24 Aug 1995 13,028,601,600 +@end example + +@cindex time, mathematical properties of +@cindex mathematics, applied to times & dates +@cindex dates, mathematical properties of +@noindent +Please note: + +@itemize @bullet +@item +A time may be added to, or subtracted from, a date, resulting in a date. + +@item +The difference of two dates may be taken, resulting in a time. + +@item +Two times may be added to, or subtracted from, each other, resulting in +a time. +@end itemize + +(Adding two dates does not produce a useful result.) 
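The representation and the arithmetic rules above can be checked outside PSPP. The following is a minimal sketch in Python, not PSPP code. It assumes that, in the proleptic Gregorian calendar used by Python's @code{datetime} module, the epoch that gives 15 Oct 1582 the value 86,400 is the midnight that begins 14 Oct 1582. The names @code{pspp_date} and @code{pspp_time} are hypothetical helpers introduced only for this illustration.

```python
from datetime import date

# Assumption: in Python's proleptic Gregorian calendar, the epoch that makes
# 15 Oct 1582 equal 86,400 is the midnight that begins 14 Oct 1582.
EPOCH_ORDINAL = date(1582, 10, 14).toordinal()
SECONDS_PER_DAY = 86400


def pspp_date(year, month, day):
    """Seconds from the assumed epoch to the midnight that begins the day."""
    return (date(year, month, day).toordinal() - EPOCH_ORDINAL) * SECONDS_PER_DAY


def pspp_time(days=0, hours=0, minutes=0, seconds=0):
    """Length of an interval, in seconds."""
    return days * SECONDS_PER_DAY + hours * 3600 + minutes * 60 + seconds


start = pspp_date(1776, 7, 4)
end = pspp_date(1900, 1, 1)

elapsed = end - start                                # date - date gives a time
later = start + pspp_time(days=40)                   # date + time gives a date
total = pspp_time(hours=1) + pspp_time(minutes=10)   # time + time gives a time

print(elapsed // SECONDS_PER_DAY, "whole days between the two dates")
```

Because both kinds of value are plain numbers of seconds, nothing in the sketch prevents adding two dates; it is simply not meaningful, as noted above.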
+ +Since times and dates are merely numbers, the ordinary addition and +subtraction operators are employed for these purposes. + +@quotation +@strong{Please note:} Many dates and times have extremely large +values---just look at the values above. Thus, it is not a good idea to +take powers of these values; also, the accuracy of some procedures may +be affected. If necessary, convert times or dates in seconds to some +other unit, like days or years, before performing analysis. +@end quotation + +@node Time Construction, Time Extraction, Time & Date Concepts, Time & Date +@subsubsection Functions that Produce Times +@cindex times, constructing +@cindex constructing times + +These functions take numeric arguments and produce numeric results in +PSPP time format. + +@cindex days +@cindex time, in days +@deftypefn {Function} {} TIME.DAYS (@var{ndays}) +Results in a time value corresponding to @var{ndays} days. +(@code{TIME.DAYS(@var{x})} is equivalent to @code{@var{x} * 60 * 60 * +24}.) +@end deftypefn + +@cindex hours-minutes-seconds +@cindex time, in hours-minutes-seconds +@deftypefn {Function} {} TIME.HMS (@var{nhours}, @var{nmins}, @var{nsecs}) +Results in a time value corresponding to @var{nhours} hours, @var{nmins} +minutes, and @var{nsecs} seconds. (@code{TIME.HMS(@var{h}, @var{m}, +@var{s})} is equivalent to @code{@var{h}*60*60 + @var{m}*60 + +@var{s}}.) +@end deftypefn + +@node Time Extraction, Date Construction, Time Construction, Time & Date +@subsubsection Functions that Examine Times +@cindex extraction, of time +@cindex time examination +@cindex examination, of times +@cindex time, lengths of + +These functions take numeric arguments in PSPP time format and +give numeric results. + +@cindex days +@cindex time, in days +@deftypefn {Function} {} CTIME.DAYS (@var{time}) +Results in the number of days and fractional days in @var{time}. +(@code{CTIME.DAYS(@var{x})} is equivalent to @code{@var{x}/60/60/24}.) 
+@end deftypefn + +@cindex hours +@cindex time, in hours +@deftypefn {Function} {} CTIME.HOURS (@var{time}) +Results in the number of hours and fractional hours in @var{time}. +(@code{CTIME.HOURS(@var{x})} is equivalent to @code{@var{x}/60/60}.) +@end deftypefn + +@cindex minutes +@cindex time, in minutes +@deftypefn {Function} {} CTIME.MINUTES (@var{time}) +Results in the number of minutes and fractional minutes in @var{time}. +(@code{CTIME.MINUTES(@var{x})} is equivalent to @code{@var{x}/60}.) +@end deftypefn + +@cindex seconds +@cindex time, in seconds +@deftypefn {Function} {} CTIME.SECONDS (@var{time}) +Results in the number of seconds and fractional seconds in @var{time}. +(@code{CTIME.SECONDS} does nothing; @code{CTIME.SECONDS(@var{x})} is +equivalent to @code{@var{x}}.) +@end deftypefn + +@node Date Construction, Date Extraction, Time Extraction, Time & Date +@subsubsection Functions that Produce Dates +@cindex dates, constructing +@cindex constructing dates + +@cindex arguments, of date construction functions +These functions take numeric arguments and give numeric results in the +PSPP date format. Arguments taken by these functions are: + +@table @var +@item day +Refers to a day of the month between 1 and 31. + +@item month +Refers to a month of the year between 1 and 12. + +@item quarter +Refers to a quarter of the year between 1 and 4. The quarters of the +year begin on the first days of months 1, 4, 7, and 10. + +@item week +Refers to a week of the year between 1 and 53. + +@item yday +Refers to a day of the year between 1 and 366. + +@item year +Refers to a year between 1582 and 19999. +@end table + +@cindex arguments, invalid +If these functions' arguments are out-of-range, they are correctly +normalized before conversion to date format. Non-integers are rounded +toward zero. 
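The argument handling described above can be sketched in Python. This is an illustrative model only, not PSPP's implementation: the text does not spell out how out-of-range values are normalized, so the sketch assumes that surplus months carry into the year and that an out-of-range day is counted as an offset from the first day of the month. The function name @code{date_dmy} and the epoch of 14 Oct 1582 (proleptic Gregorian) are likewise assumptions made for this example.

```python
import math
from datetime import date

# Assumed epoch: makes 15 Oct 1582 equal 86,400 seconds.
EPOCH_ORDINAL = date(1582, 10, 14).toordinal()
SECONDS_PER_DAY = 86400


def date_dmy(day, month, year):
    """Hypothetical model of a day-month-year date constructor."""
    # Non-integers are rounded toward zero.
    day, month, year = (math.trunc(v) for v in (day, month, year))

    # Assumption: an out-of-range month carries into the year,
    # so month 13 is the first month of the next year.
    year += (month - 1) // 12
    month = (month - 1) % 12 + 1

    # Assumption: an out-of-range day is an offset from the first of the
    # month, so day 0 is the last day of the previous month.
    ordinal = date(year, month, 1).toordinal() + (day - 1)
    return (ordinal - EPOCH_ORDINAL) * SECONDS_PER_DAY


print(date_dmy(15, 10, 1582))
```

Under these assumptions, @code{date_dmy(1, 13, 1899)} and @code{date_dmy(1, 1, 1900)} give the same value, and @code{date_dmy(0, 1, 1900)} is exactly one day earlier.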
+ +@cindex day-month-year +@cindex dates, day-month-year +@deftypefn {Function} {} DATE.DMY (@var{day}, @var{month}, @var{year}) +@deftypefnx {Function} {} DATE.MDY (@var{month}, @var{day}, @var{year}) +Results in a date value corresponding to the midnight before day +@var{day} of month @var{month} of year @var{year}. +@end deftypefn + +@cindex month-year +@cindex dates, month-year +@deftypefn {Function} {} DATE.MOYR (@var{month}, @var{year}) +Results in a date value corresponding to the midnight before the first +day of month @var{month} of year @var{year}. +@end deftypefn + +@cindex quarter-year +@cindex dates, quarter-year +@deftypefn {Function} {} DATE.QYR (@var{quarter}, @var{year}) +Results in a date value corresponding to the midnight before the first +day of quarter @var{quarter} of year @var{year}. +@end deftypefn + +@cindex week-year +@cindex dates, week-year +@deftypefn {Function} {} DATE.WKYR (@var{week}, @var{year}) +Results in a date value corresponding to the midnight before the first +day of week @var{week} of year @var{year}. +@end deftypefn + +@cindex year-day +@cindex dates, year-day +@deftypefn {Function} {} DATE.YRDAY (@var{year}, @var{yday}) +Results in a date value corresponding to the midnight before day +@var{yday} of year @var{year}. +@end deftypefn + +@node Date Extraction, , Date Construction, Time & Date +@subsubsection Functions that Examine Dates +@cindex extraction, of dates +@cindex date examination + +@cindex arguments, of date extraction functions +These functions take numeric arguments in PSPP date or time +format and give numeric results. These names are used for arguments: + +@table @var +@item date +A numeric value in PSPP date format. + +@item time +A numeric value in PSPP time format. + +@item time-or-date +A numeric value in PSPP time or date format. 
+@end table + +@cindex days +@cindex dates, in days +@cindex time, in days +@deftypefn {Function} {} XDATE.DATE (@var{time-or-date}) +For a time, results in the time corresponding to the number of whole +days @var{time-or-date} includes. For a date, results in the date +corresponding to the latest midnight at or before @var{time-or-date}; +that is, gives the date that @var{time-or-date} is in. +(XDATE.DATE(@var{x}) is equivalent to TRUNC(@var{x}/86400)*86400.) +Applying this function to a time is a non-portable feature. +@end deftypefn + +@cindex hours +@cindex dates, in hours +@cindex time, in hours +@deftypefn {Function} {} XDATE.HOUR (@var{time-or-date}) +For a time, results in the number of whole hours beyond the number of +whole days represented by @var{time-or-date}. For a date, results in +the hour (as an integer between 0 and 23) corresponding to +@var{time-or-date}. (XDATE.HOUR(@var{x}) is equivalent to +MOD(TRUNC(@var{x}/3600),24).) Applying this function to a time is a +non-portable feature. +@end deftypefn + +@cindex day of the year +@cindex dates, day of the year +@deftypefn {Function} {} XDATE.JDAY (@var{date}) +Results in the day of the year (as an integer between 1 and 366) +corresponding to @var{date}. +@end deftypefn + +@cindex day of the month +@cindex dates, day of the month +@deftypefn {Function} {} XDATE.MDAY (@var{date}) +Results in the day of the month (as an integer between 1 and 31) +corresponding to @var{date}. +@end deftypefn + +@cindex minutes +@cindex dates, in minutes +@cindex time, in minutes +@deftypefn {Function} {} XDATE.MINUTE (@var{time-or-date}) +Results in the number of minutes (as an integer between 0 and 59) after +the last hour in @var{time-or-date}. (XDATE.MINUTE(@var{x}) is +equivalent to MOD(TRUNC(@var{x}/60),60).) Applying this function to a +time is a non-portable feature.
+@end deftypefn + +@cindex months +@cindex dates, in months +@deftypefn {Function} {} XDATE.MONTH (@var{date}) +Results in the month of the year (as an integer between 1 and 12) +corresponding to @var{date}. +@end deftypefn + +@cindex quarters +@cindex dates, in quarters +@deftypefn {Function} {} XDATE.QUARTER (@var{date}) +Results in the quarter of the year (as an integer between 1 and 4) +corresponding to @var{date}. +@end deftypefn + +@cindex seconds +@cindex dates, in seconds +@cindex time, in seconds +@deftypefn {Function} {} XDATE.SECOND (@var{time-or-date}) +Results in the number of whole seconds after the last whole minute (as +an integer between 0 and 59) in @var{time-or-date}. +(XDATE.SECOND(@var{x}) is equivalent to MOD(@var{x}, 60).) Applying +this function to a time is a non-portable feature. +@end deftypefn + +@cindex days +@cindex times, in days +@deftypefn {Function} {} XDATE.TDAY (@var{time}) +Results in the number of whole days (as an integer) in @var{time}. +(XDATE.TDAY(@var{x}) is equivalent to TRUNC(@var{x}/86400).) +@end deftypefn + +@cindex time +@cindex dates, time of day +@deftypefn {Function} {} XDATE.TIME (@var{date}) +Results in the time of day at the instant corresponding to @var{date}, +in PSPP time format. This is the number of seconds since +midnight on the day corresponding to @var{date}. (XDATE.TIME(@var{x}) is +equivalent to MOD(@var{x}, 86400).) +@end deftypefn + +@cindex week +@cindex dates, in weeks +@deftypefn {Function} {} XDATE.WEEK (@var{date}) +Results in the week of the year (as an integer between 1 and 53) +corresponding to @var{date}. +@end deftypefn + +@cindex day of the week +@cindex weekday +@cindex dates, day of the week +@cindex dates, in weekdays +@deftypefn {Function} {} XDATE.WKDAY (@var{date}) +Results in the day of week (as an integer between 1 and 7) corresponding +to @var{date}.
 The days of the week are: + +@table @asis +@item 1 +Sunday +@item 2 +Monday +@item 3 +Tuesday +@item 4 +Wednesday +@item 5 +Thursday +@item 6 +Friday +@item 7 +Saturday +@end table +@end deftypefn + +@cindex years +@cindex dates, in years +@deftypefn {Function} {} XDATE.YEAR (@var{date}) +Returns the year (as an integer between 1582 and 19999) corresponding to +@var{date}. +@end deftypefn + +@node Miscellaneous Functions, Functions Not Implemented, Time & Date, Functions +@subsection Miscellaneous Functions +@cindex functions, miscellaneous + +Miscellaneous functions take various arguments and produce various +results. + +@cindex cross-case function +@cindex function, cross-case +@deftypefn {Function} {} LAG (@var{variable}) +@anchor{LAG} +@var{variable} must be a numeric or string variable name. @code{LAG} +results in the value of that variable for the case before the current +one. In case-selection procedures, @code{LAG} results in the value of +the variable for the last case selected. Results in system-missing (for +numeric variables) or blanks (for string variables) for the first case +or before any cases are selected. +@end deftypefn + +@deftypefn {Function} {} LAG (@var{variable}, @var{ncases}) +@var{variable} must be a numeric or string variable name. @var{ncases} +must be a small positive constant integer, although there is no explicit +limit. (Use of a large value for @var{ncases} will increase memory +consumption, since PSPP must keep @var{ncases} cases in memory.) +@code{LAG (@var{variable}, @var{ncases})} results in the value of +@var{variable} that is @var{ncases} before the case currently being +processed. See @code{LAG (@var{variable})} above for more details. +@end deftypefn + +@cindex date, Julian +@cindex Julian date +@deftypefn {Function} {} YRMODA (@var{year}, @var{month}, @var{day}) +@var{year} is a year between 0 and 199 or 1582 and 19999. @var{month} is +a month between 1 and 12. @var{day} is a day between 1 and 31.
If +@var{month} or @var{day} is out-of-range, it changes the next higher +unit. For instance, a @var{day} of 0 refers to the last day of the +previous month, and a @var{month} of 13 refers to the first month of the +next year. @var{year} must be in range. If @var{year} is between 0 and +199, 1900 is added. @var{year}, @var{month}, and @var{day} must all be +integers. + +@code{YRMODA} results in the number of days between 15 Oct 1582 and +the date specified, plus one. The date passed to @code{YRMODA} must be +on or after 15 Oct 1582. 15 Oct 1582 has a value of 1. +@end deftypefn + +@node Functions Not Implemented, , Miscellaneous Functions, Functions +@subsection Functions Not Implemented +@cindex functions, not implemented +@cindex not implemented +@cindex features, not implemented + +These functions are not yet implemented and thus not yet documented, +since it's a hassle. + +@findex CDF.xxx +@findex CDFNORM +@findex IDF.xxx +@findex NCDF.xxx +@findex PROBIT +@findex RV.xxx + +@itemize @bullet +@item +@code{CDF.xxx} +@item +@code{CDFNORM} +@item +@code{IDF.xxx} +@item +@code{NCDF.xxx} +@item +@code{PROBIT} +@item +@code{RV.xxx} +@end itemize + +@node Order of Operations, , Functions, Expressions +@section Operator Precedence +@cindex operator precedence +@cindex precedence, operator +@cindex order of operations +@cindex operations, order of + +The following table describes operator precedence. Smaller-numbered +levels in the table have higher precedence. Within a level, operations +are performed from left to right, except for level 2 (exponentiation), +where operations are performed from right to left. If an operator +appears in the table in two places (@code{-}), the first occurrence is +unary, the second is binary. 
+ +@enumerate +@item +@code{( )} +@item +@code{**} +@item +@code{-} +@item +@code{* /} +@item +@code{+ -} +@item +@code{EQ GE GT LE LT NE} +@item +@code{AND NOT OR} +@end enumerate +@setfilename ignored diff --git a/doc/files.texi b/doc/files.texi new file mode 100644 index 00000000..3321390e --- /dev/null +++ b/doc/files.texi @@ -0,0 +1,306 @@ +@node System and Portable Files, Variable Attributes, Data Input and Output, Top +@chapter System Files and Portable Files + +The commands in this chapter read, write, and examine system files and +portable files. + +@menu +* APPLY DICTIONARY:: Apply system file dictionary to active file. +* EXPORT:: Write to a portable file. +* GET:: Read from a system file. +* IMPORT:: Read from a portable file. +* MATCH FILES:: Merge system files. +* SAVE:: Write to a system file. +* SYSFILE INFO:: Display system file dictionary. +* XSAVE:: Write to a system file, as a transform. +@end menu + +@node APPLY DICTIONARY, EXPORT, System and Portable Files, System and Portable Files +@section APPLY DICTIONARY +@vindex APPLY DICTIONARY + +@display +APPLY DICTIONARY FROM='filename'. +@end display + +@cmd{APPLY DICTIONARY} applies the variable labels, value labels, +and missing values from variables in a system file to corresponding +variables in the active file. In some cases it also updates the +weighting variable. + +Specify a system file with a file name string or as a file handle +(@pxref{FILE HANDLE}). The dictionary in the system file will be read, +but it will not replace the active file dictionary. The system file's +data will not be read. + +Only variables with names that exist in both the active file and the +system file are considered. Variables with the same name but different +types (numeric, string) will cause an error message. Otherwise, the +system file variables' attributes will replace those in their matching +active file variables, as described below. 
+ +If a system file variable has a variable label, then it will replace the +active file variable's variable label. If the system file variable does +not have a variable label, then the active file variable's variable +label, if any, will be retained. + +If the active file variable is numeric or short string, then value +labels and missing values, if any, will be copied to the active file +variable. If the system file variable does not have value labels or +missing values, then those in the active file variable, if any, will not +be disturbed. + +Finally, weighting of the active file is updated (@pxref{WEIGHT}). If +the active file has a weighting variable, and the system file does not, +or if the weighting variable in the system file does not exist in the +active file, then the active file weighting variable, if any, is +retained. Otherwise, the weighting variable in the system file becomes +the active file weighting variable. + +@cmd{APPLY DICTIONARY} takes effect immediately. It does not read the +active +file. The system file is not modified. + +@node EXPORT, GET, APPLY DICTIONARY, System and Portable Files +@section EXPORT +@vindex EXPORT + +@display +EXPORT + /OUTFILE='filename' + /DROP=var_list + /KEEP=var_list + /RENAME=(src_names=target_names)@dots{} +@end display + +The @cmd{EXPORT} procedure writes the active file dictionary and data to a +specified portable file. + +The OUTFILE subcommand, which is the only required subcommand, specifies +the portable file to be written as a file name string or a file handle +(@pxref{FILE HANDLE}). + +DROP, KEEP, and RENAME follow the same format as the SAVE procedure +(@pxref{SAVE}). + +@cmd{EXPORT} is a procedure. It causes the active file to be read. 
+ +@node GET, IMPORT, EXPORT, System and Portable Files +@section GET +@vindex GET + +@display +GET + /FILE='filename' + /DROP=var_list + /KEEP=var_list + /RENAME=(src_names=target_names)@dots{} +@end display + +@cmd{GET} clears the current dictionary and active file and +replaces them with the dictionary and data from a specified system file. + +The FILE subcommand is the only required subcommand. Specify the system +file to be read as a string file name or a file handle (@pxref{FILE +HANDLE}). + +By default, all the variables in a system file are read. The DROP +subcommand can be used to specify a list of variables that are not to be +read. By contrast, the KEEP subcommand can be used to specify variables +that are to be read, with all other variables not read. + +Normally variables in a system file retain the names that they were +saved under. Use the RENAME subcommand to change these names. Specify, +within parentheses, a list of variable names followed by an equals sign +(@samp{=}) and the names that they should be renamed to. Multiple +parenthesized groups of variable names can be included on a single +RENAME subcommand. Variables' names may be swapped using a RENAME +subcommand of the form @samp{/RENAME=(A B=B A)}. + +Alternate syntax for the RENAME subcommand allows the parentheses to be +eliminated. When this is done, only a single variable may be renamed at +once. For instance, @samp{/RENAME=A=B}. This alternate syntax is +deprecated. + +DROP, KEEP, and RENAME are performed in left-to-right order. They +each may be present any number of times. @cmd{GET} never modifies a +system file on disk. Only the active file read from the system file +is affected by these subcommands. + +@cmd{GET} does not cause the data to be read, only the dictionary. The data +is read later, when a procedure is executed.
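The left-to-right processing of DROP, KEEP, and RENAME can be modelled on a plain list of variable names. The Python sketch below is an illustration of the rules stated above, not PSPP's implementation. The function @code{apply_subcommands} and its tuple encoding of subcommands are hypothetical, and the sketch assumes that KEEP leaves the surviving variables in their original order and that subcommands following a RENAME refer to the new names.

```python
def apply_subcommands(variables, subcommands):
    """Apply DROP, KEEP and RENAME steps in left-to-right order.

    variables   -- variable names as stored in the system file
    subcommands -- (kind, argument) pairs in a hypothetical encoding:
                   ("DROP", [names]), ("KEEP", [names]),
                   ("RENAME", ([old names], [new names]))
    """
    current = list(variables)
    for kind, arg in subcommands:
        if kind == "DROP":
            current = [v for v in current if v not in arg]
        elif kind == "KEEP":
            # Assumption: surviving variables keep their original order.
            current = [v for v in current if v in arg]
        elif kind == "RENAME":
            old, new = arg
            if len(old) != len(new):
                raise ValueError("RENAME needs as many new names as old names")
            # All renames in one group happen at once, so (A B=B A) swaps.
            mapping = dict(zip(old, new))
            current = [mapping.get(v, v) for v in current]
        else:
            raise ValueError("unknown subcommand: " + kind)
    return current


# /RENAME=(A B=B A) swaps the two names.
print(apply_subcommands(["A", "B", "C"], [("RENAME", (["A", "B"], ["B", "A"]))]))
```

The input list is copied, never changed in place, which mirrors the statement that the system file on disk is not modified.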
+ +@node IMPORT, MATCH FILES, GET, System and Portable Files +@section IMPORT +@vindex IMPORT + +@display +IMPORT + /FILE='filename' + /TYPE=@{COMM,TAPE@} + /DROP=var_list + /KEEP=var_list + /RENAME=(src_names=target_names)@dots{} +@end display + +The @cmd{IMPORT} transformation clears the active file dictionary and +data and +replaces them with a dictionary and data from a portable file on disk. + +The FILE subcommand, which is the only required subcommand, specifies +the portable file to be read as a file name string or a file handle +(@pxref{FILE HANDLE}). + +The TYPE subcommand is currently not used. + +DROP, KEEP, and RENAME follow the syntax used by @cmd{GET} (@pxref{GET}). + +@cmd{IMPORT} does not cause the data to be read, only the dictionary. The +data is read later, when a procedure is executed. + +@node MATCH FILES, SAVE, IMPORT, System and Portable Files +@section MATCH FILES +@vindex MATCH FILES + +@display +MATCH FILES + /BY var_list + /@{FILE,TABLE@}=@{*,'filename'@} + /DROP=var_list + /KEEP=var_list + /RENAME=(src_names=target_names)@dots{} + /IN=var_name + /FIRST=var_name + /LAST=var_name + /MAP +@end display + +@cmd{MATCH FILES} merges one or more system files, optionally +including the active file. Records with the same values for BY +variables are combined into a single record. Records with different +values are output in order. Thus, multiple sorted system files are +combined into a single sorted system file based on the value of the BY +variables. The results of the merge become the new active file. + +The BY subcommand specifies a list of variables that are used to match +records from each of the system files. Variables specified must exist +in all the files specified on FILE and TABLE. BY should usually be +specified. If TABLE is used then BY is required. + +Specify FILE with a system file as a file name string or file handle +(@pxref{FILE HANDLE}), or with an asterisk (@samp{*}) to +indicate the current active file. 
 The files specified on FILE are +merged together based on the BY variables, or combined case-by-case if +BY is not specified. Normally at least two FILE subcommands should be +specified. + +Specify TABLE with a system file to use it as a @dfn{table +lookup file}. Records in table lookup files are not used up after +they've been used once. This means that data in table lookup files can +correspond to any number of records in FILE files. Table lookup files +correspond to lookup tables in traditional relational database systems. +It is incorrect to have records with duplicate BY values in table lookup +files. + +Any number of FILE and TABLE subcommands may be specified. Each +instance of FILE or TABLE can be followed by DROP, KEEP, and/or RENAME +subcommands. These take the same form as the corresponding subcommands +of @cmd{GET} (@pxref{GET}), and perform the same functions. + +Variables belonging to files that are not present for the current case +are set to the system-missing value for numeric variables or spaces for +string variables. + +IN, FIRST, LAST, and MAP are currently not used. + +@cmd{MATCH FILES} may not be specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}) if the active file is used as an input source. + +@node SAVE, SYSFILE INFO, MATCH FILES, System and Portable Files +@section SAVE +@vindex SAVE + +@display +SAVE + /OUTFILE='filename' + /@{COMPRESSED,UNCOMPRESSED@} + /DROP=var_list + /KEEP=var_list + /RENAME=(src_names=target_names)@dots{} +@end display + +The @cmd{SAVE} procedure causes the dictionary and data in the active +file to +be written to a system file. + +OUTFILE is the only required subcommand. Specify the system +file to be written as a string file name or a file handle (@pxref{FILE +HANDLE}). + +The COMPRESSED and UNCOMPRESSED subcommands determine whether the saved +system file is compressed. By default, system files are compressed. +This default can be changed with the SET command (@pxref{SET}).
+ +By default, all the variables in the active file dictionary are written +to the system file. The DROP subcommand can be used to specify a list +of variables not to be written. In contrast, KEEP specifies variables +to be written, with all variables not specified not written. + +Normally variables are saved to a system file under the same names they +have in the active file. Use the RENAME subcommand to change these names. +Specify, within parentheses, a list of variable names followed by an +equals sign (@samp{=}) and the names that they should be renamed to. +Multiple parenthesized groups of variable names can be included on a +single RENAME subcommand. Variables' names may be swapped using a +RENAME subcommand of the form @samp{/RENAME=(A B=B A)}. + +Alternate syntax for the RENAME subcommand allows the parentheses to be +eliminated. When this is done, only a single variable may be renamed at +once. For instance, @samp{/RENAME=A=B}. This alternate syntax is +deprecated. + +DROP, KEEP, and RENAME are performed in left-to-right order. They +each may be present any number of times. @cmd{SAVE} never modifies +the active file. DROP, KEEP, and RENAME only affect the system file +written to disk. + +@cmd{SAVE} causes the data to be read. It is a procedure. + +@node SYSFILE INFO, XSAVE, SAVE, System and Portable Files +@section SYSFILE INFO +@vindex SYSFILE INFO + +@display +SYSFILE INFO FILE='filename'. +@end display + +@cmd{SYSFILE INFO} reads the dictionary in a system file and +displays the information in its dictionary. + +Specify a file name or file handle. @cmd{SYSFILE INFO} reads that file as +a system file and displays information on its dictionary. + +@cmd{SYSFILE INFO} does not affect the current active file. 
+ +@node XSAVE, , SYSFILE INFO, System and Portable Files +@section XSAVE +@vindex XSAVE + +@display +XSAVE + /FILE='filename' + /@{COMPRESSED,UNCOMPRESSED@} + /DROP=var_list + /KEEP=var_list + /RENAME=(src_names=target_names)@dots{} +@end display + +The @cmd{XSAVE} transformation writes the active file dictionary and +data to a +system file stored on disk. + +@cmd{XSAVE} is a transformation, not a procedure. It is executed when the +data is read by a procedure or procedure-like command. In all other +respects, @cmd{XSAVE} is identical to @cmd{SAVE}. @xref{SAVE}, for +more information on syntax and usage. +@setfilename ignored diff --git a/doc/flow-control.texi b/doc/flow-control.texi new file mode 100644 index 00000000..21450331 --- /dev/null +++ b/doc/flow-control.texi @@ -0,0 +1,164 @@ +@node Conditionals and Looping, Statistics, Data Selection, Top +@chapter Conditional and Looping Constructs +@cindex conditionals +@cindex loops +@cindex flow of control +@cindex control flow + +This chapter documents PSPP commands used for conditional execution, +looping, and flow of control. + +@menu +* BREAK:: Exit a loop. +* DO IF:: Conditionally execute a block of code. +* DO REPEAT:: Textually repeat a code block. +* LOOP:: Repeat a block of code. +@end menu + +@node BREAK, DO IF, Conditionals and Looping, Conditionals and Looping +@section BREAK +@vindex BREAK + +@display +BREAK. +@end display + +@cmd{BREAK} terminates execution of the innermost currently executing +@cmd{LOOP} construct. + +@cmd{BREAK} is allowed only inside @cmd{LOOP}@dots{}@cmd{END LOOP}. +@xref{LOOP}, for more details. + +@node DO IF, DO REPEAT, BREAK, Conditionals and Looping +@section DO IF +@vindex DO IF + +@display +DO IF condition. + @dots{} +[ELSE IF condition. + @dots{} +]@dots{} +[ELSE. + @dots{}] +END IF. +@end display + +@cmd{DO IF} allows one of several sets of transformations to be +executed, depending on user-specified conditions. 
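+
+As a minimal sketch of this construct, the fragment below assigns one of
+three values depending on the sign of a variable. The variable names
+@code{x} and @code{sign} are hypothetical, and @cmd{COMPUTE} is used
+only as a convenient transformation to place inside each block.
+
+@example
+* x and sign are hypothetical variable names.
+DO IF x LT 0.
+COMPUTE sign = -1.
+ELSE IF x GT 0.
+COMPUTE sign = 1.
+ELSE.
+COMPUTE sign = 0.
+END IF.
+@end example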
+ +If the specified boolean expression evaluates as true, then the block +of code following @cmd{DO IF} is executed. If it evaluates as +missing, then +none of the code blocks is executed. If it is false, then +the boolean expression on the first @cmd{ELSE IF}, if present, is tested in +turn, with the same rules applied. If all expressions evaluate to +false, then the @cmd{ELSE} code block is executed, if it is present. + +When @cmd{DO IF} or @cmd{ELSE IF} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + +@node DO REPEAT, LOOP, DO IF, Conditionals and Looping +@section DO REPEAT +@vindex DO REPEAT + +@display +DO REPEAT repvar_name=expansion@dots{}. + @dots{} +END REPEAT [PRINT]. + +expansion takes one of the following forms: + var_list + num_or_range@dots{} + 'string'@dots{} + +num_or_range takes one of the following forms: + number + num1 TO num2 +@end display + +@cmd{DO REPEAT} repeats a block of code, textually substituting +different variables, numbers, or strings into the block with each +repetition. + +Specify a repeat variable name followed by an equals sign (@samp{=}) and +the list of replacements. Replacements can be a list of variables +(which may be existing variables or new variables or a combination +thereof), of numbers, or of strings. When new variable names are +specified, @cmd{DO REPEAT} creates them as numeric variables. When numbers +are specified, runs of integers may be indicated with TO notation, for +instance @samp{1 TO 5} and @samp{1 2 3 4 5} would be equivalent. There +is no equivalent notation for string values. + +Multiple repeat variables can be specified. When this is done, each +variable must have the same number of replacements. + +The code within @cmd{DO REPEAT} is repeated as many times as there are +replacements for each variable. 
The first time, the first value for +each repeat variable is substituted; the second time, the second value +for each repeat variable is substituted; and so on. + +Repeat variable substitutions work like macros. They take place +anywhere in a line that the repeat variable name occurs as a token, +including command and subcommand names. For this reason it is not a +good idea to select words commonly used in command and subcommand names +as repeat variable identifiers. + +If PRINT is specified on @cmd{END REPEAT}, the commands after substitutions +are made are printed to the listing file, prefixed by a plus sign +(@samp{+}). + +@node LOOP, , DO REPEAT, Conditionals and Looping +@section LOOP +@vindex LOOP + +@display +LOOP [index_var=start TO end [BY incr]] [IF condition]. + @dots{} +END LOOP [IF condition]. +@end display + +@cmd{LOOP} iterates a group of commands. A number of +termination options are offered. + +Specify index_var to make that variable count from one value to +another by a particular increment. index_var must be a pre-existing +numeric variable. start, end, and incr are numeric expressions +(@pxref{Expressions}.) + +During the first iteration, index_var is set to the value of start. +During each successive iteration, index_var is increased by the value of +incr. If end > start, then the loop terminates when index_var > end; +otherwise it terminates when index_var < end. If incr is not specified +then it defaults to +1 or -1 as appropriate. + +If end > start and incr < 0, or if end < start and incr > 0, then the +loop is never executed. index_var is nevertheless set to the value of +start. + +Modifying index_var within the loop is allowed, but it has no effect on +the value of index_var in the next iteration. + +Specify a boolean expression for the condition on @cmd{LOOP} to +cause the loop to be executed only if the condition is true. 
If the +condition is false or missing before the loop contents are executed the +first time, the loop contents are not executed at all. + +If index and condition clauses are both present on @cmd{LOOP}, the index +clause is always evaluated first. + +Specify a boolean expression for the condition on @cmd{END LOOP} to cause +the loop to terminate if the condition is not true after the enclosed +code block is executed. The condition is evaluated at the end of the +loop, not at the beginning. + +If the index clause and both condition clauses are not present, then the +loop is executed MXLOOPS (@pxref{SET}) times. + +@cmd{BREAK} also terminates @cmd{LOOP} execution (@pxref{BREAK}). + +When @cmd{LOOP} or @cmd{END LOOP} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). +@setfilename ignored diff --git a/doc/function-index.texi b/doc/function-index.texi new file mode 100644 index 00000000..5b9d3b6c --- /dev/null +++ b/doc/function-index.texi @@ -0,0 +1,4 @@ +@node Function Index, Command Index, Bugs, Top +@chapter Function Index +@printindex fn +@setfilename ignored diff --git a/doc/installing.texi b/doc/installing.texi new file mode 100644 index 00000000..97f82ec5 --- /dev/null +++ b/doc/installing.texi @@ -0,0 +1,110 @@ +@node Installation, Configuration, Concept Index, Top +@appendix Installing PSPP +@cindex installation +@cindex PSPP, installing + +@cindex GNU C compiler +@cindex gcc +@cindex compiler, recommended +@cindex compiler, gcc +PSPP conforms to the GNU Coding Standards. PSPP is written in, and +requires for proper operation, ANSI/ISO C. You might want to +additionally note the following points: + +@itemize @bullet +@item +The compiler and linker must allow for significance of several +characters in external identifiers. The exact number is unknown but at +least 31 is recommended. + +@item +The @code{int} type must be 32 bits or wider. 
+ +@item +The recommended compiler is gcc 2.7.2.1 or later, but any ANSI compiler +will do if it fits the above criteria. +@end itemize + +Many UNIX variants should work out-of-the-box, as PSPP uses GNU +autoconf to detect differences between environments. Please report any +problems with compilation of PSPP under UNIX and UNIX-like operating +systems---portability is a major concern of the author. + +The pages below give specific instructions for installing PSPP +on each type of system mentioned above. + +@menu +* UNIX installation:: Installing on UNIX-like environments. +@end menu + +@node UNIX installation, , Installation, Installation +@section UNIX installation +@cindex UNIX, installing PSPP under +@cindex installation, under UNIX +@noindent +To install PSPP under a UNIX-like operating system, follow the steps +below in order. Some of the text below was taken directly from various +Free Software Foundation sources. + +@enumerate +@item +@code{cd} to the directory containing the PSPP source. + +@cindex configure, GNU +@cindex GNU configure +@item +Type @samp{./configure} to configure for your particular operating +system and compiler. Running @code{configure} takes a while. While +running, it displays some messages telling which features it is checking +for. + +You can optionally supply some options to @code{configure} to +give it hints about how to do its job. Type @code{./configure --help} +to see a list of options. One of the most useful options is +@samp{--with-checker}, which enables the use of the Checker memory +debugger under supported operating systems. Checker must already be +installed to use this option. Do not use @samp{--with-checker} if you +are not debugging PSPP itself. + +@cindex @file{Makefile} +@cindex @file{config.h} +@cindex @file{pref.h} +@cindex makefile +@item +(optional) Edit @file{Makefile}, @file{config.h}, and @file{pref.h}. +These files are produced by @code{configure}. Note that most PSPP +settings can be changed at runtime. 
+ +@file{pref.h} is only generated by @code{configure} if it does not +already exist. (It's copied from @file{prefh.orig}.) + +@cindex compiling +@item +Type @samp{make} to compile the package. If there are any errors during +compilation, try to fix them. If modifications are necessary to compile +correctly under your configuration, contact the author. +@xref{Bugs,,Submitting Bug Reports}, for details. + +@cindex self-tests, running +@item +Type @samp{make check} to run self-tests on the compiled PSPP package. + +@cindex installation +@cindex PSPP, installing +@cindex @file{/usr/local/share/pspp/} +@cindex @file{/usr/local/bin/} +@cindex @file{/usr/local/info/} +@cindex documentation, installing +@item +Become the superuser and type @samp{make install} to install the +PSPP binaries, by default in @file{/usr/local/bin/}. The +directory @file{/usr/local/share/pspp/} is created and populated with +files needed by PSPP at runtime. This step will also cause the +PSPP documentation to be installed in @file{/usr/local/info/}, +but only if that directory already exists. + +@item +(optional) Type @samp{make clean} to delete the PSPP binaries +from the source tree. +@end enumerate +@setfilename ignored diff --git a/doc/introduction.texi b/doc/introduction.texi new file mode 100644 index 00000000..64bb5b25 --- /dev/null +++ b/doc/introduction.texi @@ -0,0 +1,34 @@ +@node Introduction, License, Top, Top +@chapter Introduction +@cindex introduction + +@cindex PSPP language +@cindex language, PSPP +PSPP is a tool for statistical analysis of sampled data. It reads a +syntax file and a data file, analyzes the data, and writes the results +to a listing file or to standard output. + +The language accepted by PSPP is similar to those accepted by SPSS +statistical products. The details of PSPP's language are given +later in this manual. 
+ +@cindex files, PSPP +@cindex output, PSPP +@cindex PostScript +@cindex graphics +@cindex Ghostscript +@cindex Free Software Foundation +PSPP produces output in two forms: tables and charts. Both of these can +be written in several formats; currently, ASCII, PostScript, and HTML +are supported. In the future, more drivers, such as PCL and X Window +System drivers, may be developed. For now, Ghostscript, available from +the Free Software Foundation, may be used to convert PostScript chart +output to other formats. + +The current version of PSPP, @value{VERSION}, is woefully incomplete in +terms of its statistical procedure support. PSPP is a work in progress. +The author hopes eventually to fully support all features in the products +that PSPP replaces. The author welcomes questions, +comments, donations, and code submissions. @xref{Bugs,,Submitting Bug +Reports}, for instructions on contacting the author. +@setfilename ignored diff --git a/doc/invoking.texi b/doc/invoking.texi new file mode 100644 index 00000000..c70d0cf6 --- /dev/null +++ b/doc/invoking.texi @@ -0,0 +1,279 @@ +@node Invocation, Language, Credits, Top +@chapter Invoking PSPP +@cindex invocation +@cindex PSPP, invoking + +@cindex command line, options +@cindex options, command-line +@example +pspp [ -B @var{dir} | --config-dir=@var{dir} ] [ -o @var{device} | --device=@var{device} ] + [ -d @var{var}[=@var{value}] | --define=@var{var}[=@var{value}] ] [-u @var{var} | --undef=@var{var} ] + [ -f @var{file} | --out-file=@var{file} ] [ -p | --pipe ] [ -I- | --no-include ] + [ -I @var{dir} | --include=@var{dir} ] [ -i | --interactive ] + [ -n | --edit | --dry-run | --just-print | --recon ] + [ -r | --no-statrc ] [ -h | --help ] [ -l | --list ] + [ -c @var{command} | --command @var{command} ] [ -s | --safer ] + [ --testing-mode ] [ -V | --version ] [ -v | --verbose ] + [ @var{key}=@var{value} ] @var{file}@enddots{} +@end example + +@menu +* Non-option Arguments:: Specifying syntax files and output
devices. +* Configuration Options:: Change the configuration for the current run. +* Input and output options:: Controlling input and output files. +* Language control options:: Language variants. +* Informational options:: Helpful information about PSPP. +@end menu + +@node Non-option Arguments, Configuration Options, Invocation, Invocation +@section Non-option Arguments + +Syntax files and output device substitutions can be specified on +PSPP's command line: + +@table @code +@item @var{file} + +A file by itself on the command line will be executed as a syntax file. +PSPP terminates after the syntax file runs, unless the @code{-i} or +@code{--interactive} option is given (@pxref{Language control options}). + +@item @var{file1} @var{file2} + +When two or more filenames are given on the command line, the first +syntax file is executed, then PSPP's dictionary is cleared, then the second +syntax file is executed. + +@item @var{file1} + @var{file2} + +If syntax files' names are delimited by a plus sign (@samp{+}), then the +dictionary is not cleared between their executions, as if they were +concatenated together into a single file. + +@item @var{key}=@var{value} + +Defines an output device macro @var{key} to expand to @var{value}, +overriding any macro having the same @var{key} defined in the device +configuration file. @xref{Macro definitions}. + +@end table + +There is one other way to specify a syntax file, if your operating +system supports it. If you have a syntax file @file{foobar.stat}, put +the notation + +@example +#! /usr/local/bin/pspp +@end example + +at the top, and mark the file as executable with @code{chmod +x +foobar.stat}. (If PSPP is not installed in @file{/usr/local/bin}, +then insert its actual installation directory into the syntax file +instead.) Now you should be able to invoke the syntax file just by +typing its name. You can include any options on the command line as +usual. PSPP entirely ignores any lines beginning with @samp{#!}. 
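+
+The invocations described in this section can be sketched as follows,
+assuming hypothetical syntax files @file{first.stat} and
+@file{second.stat} in the current directory:
+
+@example
+pspp first.stat
+pspp first.stat second.stat
+pspp first.stat + second.stat
+pspp -i first.stat
+@end example
+
+The second command clears the dictionary between the two files, the
+third runs them as if they were one concatenated file, and the last
+leaves PSPP at a command prompt after the syntax file has run.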
+ +@node Configuration Options, Input and output options, Non-option Arguments, Invocation +@section Configuration Options + +Configuration options are used to change PSPP's configuration for the +current run. The configuration options are: + +@table @code +@item -a @{compatible|enhanced@} +@itemx --algorithm=@{compatible|enhanced@} + +If you choose @code{compatible}, then PSPP will use the same algorithms +as used by some proprietary statistical analysis packages. +This is not recommended, as these algorithms are inferior and in some cases +completely broken. +The default setting is @code{enhanced}. +Certain commands have subcommands which allow you to override this setting on +a per-command basis. + +@item -B @var{dir} +@itemx --config-dir=@var{dir} + +Sets the configuration directory to @var{dir}. @xref{File locations}. + +@item -o @var{device} +@itemx --device=@var{device} + +Selects the output device with name @var{device}. If this option is +given more than once, then all devices mentioned are selected. This +option disables all devices besides those mentioned on the command line. + +@item -d @var{var}[=@var{value}] +@itemx --define=@var{var}[=@var{value}] + +Defines an `environment variable' named @var{var} having the optional +value @var{value} specified. @xref{Variable values}. + +@item -u @var{var} +@itemx --undef=@var{var} + +Undefines the `environment variable' named @var{var}. @xref{Variable +values}. +@end table + +@node Input and output options, Language control options, Configuration Options, Invocation +@section Input and output options + +Input and output options affect how PSPP reads input and writes +output. These are the input and output options: + +@table @code +@item -f @var{file} +@itemx --out-file=@var{file} + +This overrides the output file name for devices designated as listing +devices. If a file named @var{file} already exists, it is overwritten.
+ +@item -p +@itemx --pipe + +Allows PSPP to be used as a filter by causing the syntax file to be +read from stdin and output to be written to stdout. Conflicts with the +@code{-f @var{file}} and @code{--out-file=@var{file}} options. + +@item -I- +@itemx --no-include + +Clears all directories from the include path. This includes all +directories put in the include path by default. @xref{Miscellaneous +configuring}. + +@item -I @var{dir} +@itemx --include=@var{dir} + +Appends directory @var{dir} to the path that is searched for include +files in PSPP syntax files. + +@item -c @var{command} +@itemx --command=@var{command} + +Executes literal command @var{command}. The command is executed before +startup syntax files, if any. + +@item --testing-mode + +Invokes heuristics to assist with testing PSPP. For use by @code{make +check} and similar scripts. +@end table + +@node Language control options, Informational options, Input and output options, Invocation +@section Language control options + +Language control options control how PSPP syntax files are parsed and +interpreted. The available language control options are: + +@table @code +@item -i +@itemx --interactive + +When a syntax file is specified on the command line, PSPP normally +terminates after processing it. Giving this option will cause PSPP to +bring up a command prompt after processing the syntax file. + +In addition, this forces syntax files to be interpreted in interactive +mode, rather than the default batch mode. @xref{Tokenizing lines}, for +information on the differences between batch mode and interactive mode +command interpretation. + +@item -n +@itemx --edit +@itemx --dry-run +@itemx --just-print +@itemx --recon + +Only the syntax of any syntax file specified or of commands entered at +the command line is checked. Transformations are not performed and +procedures are not executed. Not yet implemented. + +@item -r +@itemx --no-statrc + +Prevents the execution of the PSPP startup syntax file.
Not yet +implemented, as startup syntax files aren't, either. + +@item -s +@itemx --safer + +Disables certain unsafe operations. This includes the ERASE and +HOST commands, as well as use of pipes as input and output files. +@end table + +@node Informational options, , Language control options, Invocation +@section Informational options + +Informational options cause information about PSPP to be written to +the terminal. Here are the available options: + +@table @code +@item -h +@itemx --help + +Prints a message describing PSPP command-line syntax and the available +device driver classes, then terminates. + +@item -l +@itemx --list + +Lists the available device driver classes, then terminates. + +@item -x @{compatible|enhanced@} +@itemx --syntax=@{compatible|enhanced@} + +If you choose @code{compatible}, then PSPP will only accept command syntax that +is compatible with the proprietary program SPSS. +If you choose @code{enhanced} then additional syntax will be available. +The default is @code{enhanced}. + + +@item -V +@itemx --version + +Prints a brief message listing PSPP's version, warranties you don't +have, copying conditions and copyright, and e-mail address for bug +reports, then terminates. + +@item -v +@itemx --verbose + +Increments PSPP's verbosity level. Higher verbosity levels cause +PSPP to display greater amounts of information about what it is +doing. Often useful for debugging PSPP's configuration. + +This option can be given multiple times; the verbosity level is set to +the number of times it is given. The default verbosity level is 0, in which no informational +messages will be displayed. + +Higher verbosity levels cause messages to be displayed when the +corresponding events take place. + +@table @asis +@item 1 + +Driver and subsystem initializations. + +@item 2 + +Completion of driver initializations. Beginning of driver closings. + +@item 3 + +Completion of driver closings. + +@item 4 + +Files searched for; success of searches.
+ +@item 5 + +Individual directories included in file searches. +@end table + +Each verbosity level also includes messages from lower verbosity levels. + +@end table +@setfilename ignored diff --git a/doc/language.texi b/doc/language.texi new file mode 100644 index 00000000..6a5a19e6 --- /dev/null +++ b/doc/language.texi @@ -0,0 +1,1217 @@ +@node Language, Expressions, Invocation, Top +@chapter The PSPP language +@cindex language, PSPP +@cindex PSPP, language + +@quotation +@strong{Please note:} PSPP is not even close to completion. +Only a few actual statistical procedures are implemented. PSPP +is a work in progress. +@end quotation + +This chapter discusses elements common to many PSPP commands. +Later chapters will describe individual commands in detail. + +@menu +* Tokens:: Characters combine to form tokens. +* Commands:: Tokens combine to form commands. +* Types of Commands:: Commands come in several flavors. +* Order of Commands:: Commands combine to form syntax files. +* Missing Observations:: Handling missing observations. +* Variables:: The unit of data storage. +* Files:: Files used by PSPP. +* BNF:: How command syntax is described. +@end menu + +@node Tokens, Commands, Language, Language +@section Tokens +@cindex language, lexical analysis +@cindex language, tokens +@cindex tokens +@cindex lexical analysis +@cindex lexemes + +PSPP divides most syntax file lines into series of short chunks +called @dfn{tokens}, @dfn{lexical elements}, or @dfn{lexemes}. These +tokens are then grouped to form commands, each of which tells +PSPP to take some action---read in data, write out data, perform +a statistical procedure, etc. The process of dividing input into tokens +is @dfn{tokenization}, or @dfn{lexical analysis}. Each type of token is +described below. + +@cindex delimiters +@cindex whitespace +Tokens must be separated from each other by @dfn{delimiters}. 
+Delimiters include whitespace (spaces, tabs, carriage returns, line +feeds, vertical tabs), punctuation (commas, forward slashes, etc.), and +operators (plus, minus, times, divide, etc.) Note that while whitespace +only separates tokens, other delimiters are tokens in themselves. + +@table @strong +@cindex identifiers +@item Identifiers +Identifiers are names that specify variable names, commands, or command +details. + +@itemize @bullet +@item +The first character in an identifier must be a letter, @samp{#}, or +@samp{@@}. Some system identifiers begin with @samp{$}, but +user-defined variables' names may not begin with @samp{$}. + +@item +The remaining characters in the identifier must be letters, digits, or +one of the following special characters: + +@example +. _ $ # @@ +@end example + +@item +@cindex variable names +@cindex names, variable +Variable names may be any length, but only the first 8 characters are +significant. + +@item +@cindex case-sensitivity +Identifiers are not case-sensitive: @code{foobar}, @code{Foobar}, +@code{FooBar}, @code{FOOBAR}, and @code{FoObaR} are different +representations of the same identifier. + +@item +@cindex keywords +Identifiers other than variable names may be abbreviated to their first +3 characters if this abbreviation is unambiguous. These identifiers are +often called @dfn{keywords}. (Unique abbreviations of 3 or more +characters are also accepted: @samp{FRE}, @samp{FREQ}, and +@samp{FREQUENCIES} are equivalent when the last is a keyword.) + +@item +Whether an identifier is a keyword depends on the context. + +@item +@cindex keywords, reserved +@cindex reserved keywords +Some keywords are reserved. These keywords may not be used in any +context besides those explicitly described in this manual. The reserved +keywords are: + +@example +ALL AND BY EQ GE GT LE LT NE NOT OR TO WITH +@end example + +@item +Since keywords are identifiers, all the rules for identifiers apply. 
+Specifically, they must be delimited as are other identifiers: +@code{WITH} is a reserved keyword, but @code{WITHOUT} is a valid +variable name. +@end itemize + +@cindex @samp{.} +@cindex period +@cindex variable names, ending with period +@strong{Caution:} It is legal to end a variable name with a period, but +@emph{don't do it!} The variable name will be misinterpreted when it is +the final token on a line: @code{FOO.} will be divided into two separate +tokens, @samp{FOO} and @samp{.}, the @dfn{terminal dot}. +@xref{Commands, , Forming commands of tokens}. + +@item Numbers +@cindex numbers +@cindex integers +@cindex reals +Numbers may be specified as integers or reals. Integers are internally +converted into reals. Scientific notation is not supported. Here are +some examples of valid numbers: + +@example +1234 3.14159265359 .707106781185 8945. +@end example + +@strong{Caution:} The last example will be interpreted as two tokens, +@samp{8945} and @samp{.}, if it is the last token on a line. + +@item Strings +@cindex strings +@cindex @samp{'} +@cindex @samp{"} +@cindex case-sensitivity +Strings are literal sequences of characters enclosed in pairs of single +quotes (@samp{'}) or double quotes (@samp{"}). + +@itemize @bullet +@item +Whitespace and case of letters @emph{are} significant inside strings. +@item +Whitespace characters inside a string are not delimiters. +@item +To include single-quote characters in a string, enclose the string in +double quotes. +@item +To include double-quote characters in a string, enclose the string in +single quotes. +@item +It is not possible to put both single- and double-quote characters +inside one string. +@end itemize + +@item Hexstrings +@cindex hexstrings +Hexstrings are string variants that use hex digits to specify +characters. + +@itemize @bullet +@item +A hexstring may be used anywhere that an ordinary string is allowed. 
+ +@item +@cindex @samp{X'} +@cindex @samp{'} +A hexstring begins with @samp{X'} or @samp{x'}, and ends with @samp{'}. + +@cindex whitespace +@item +No whitespace is allowed between the initial @samp{X} and @samp{'}. + +@item +Double quotes @samp{"} may be used in place of single quotes @samp{'} if +done in both places. + +@item +Each pair of hex digits is internally changed into a single character +with the given value. + +@item +If there is an odd number of hex digits, the missing last digit is +assumed to be @samp{0}. + +@item +@cindex portability +@strong{Please note:} Use of hexstrings is nonportable because the same +numeric values are associated with different glyphs by different +operating systems. Therefore, their use should be confined to syntax +files that will not be widely distributed. + +@item +@cindex characters, reserved +@cindex 0 +@cindex whitespace +@strong{Please note also:} The character with value 00 is reserved for +internal use by PSPP. Its use in strings causes an error and +replacement with a blank space (in ASCII, hex 20, decimal 32). +@end itemize + +@item Punctuation +@cindex punctuation +Punctuation separates tokens; punctuators are delimiters. These are the +punctuation characters: + +@example +, / = ( ) +@end example + +@item Operators +@cindex operators +Operators describe mathematical operations. Some operators are delimiters: + +@example +( ) + - * / ** +@end example + +Many of the above operators are also punctuators. Punctuators are +distinguished from operators by context. + +The other operators are all reserved keywords. None of these are +delimiters: + +@example +AND EQ GE GT LE LT NE OR +@end example + +@item Terminal Dot +@cindex terminal dot +@cindex dot, terminal +@cindex period +@cindex @samp{.} +A period (@samp{.}) at the end of a line (except for whitespace) is one +type of a @dfn{terminal dot}, although not every terminal dot is a +period at the end of a line. @xref{Commands, , Forming commands of +tokens}. 
A period is a terminal dot @emph{only} +when it is at the end of a line; otherwise it is part of a +floating-point number. (A period outside a number in the middle of a +line is an error.) + +@quotation +@cindex terminal dot, changing +@cindex dot, terminal, changing +@strong{Please note:} The character used for the @dfn{terminal dot} +can be changed with @cmd{SET}'s ENDCMD subcommand (@pxref{SET}). This +is strongly discouraged, and throughout all the remainder of this +manual it will be assumed that the default setting is in effect. +@end quotation + +@end table + +@node Commands, Types of Commands, Tokens, Language +@section Forming commands of tokens + +@cindex PSPP, command structure +@cindex language, command structure +@cindex commands, structure + +Most PSPP commands share a common structure, diagrammed below: + +@example +@var{cmd}@dots{} [@var{sbc}[=][@var{spec} [[,]@var{spec}]@dots{}]] [[/[=][@var{spec} [[,]@var{spec}]@dots{}]]@dots{}]. +@end example + +@cindex @samp{[ ]} +In the above, rather daunting, expression, pairs of square brackets +(@samp{[ ]}) indicate optional elements, and names such as @var{cmd} +indicate parts of the syntax that vary from command to command. +Ellipses (@samp{...}) indicate that the preceding part may be repeated +an arbitrary number of times. Let's pick apart what it says above: + +@itemize @bullet +@cindex commands, names +@item +A command begins with a command name of one or more keywords, such as +@cmd{FREQUENCIES}, @cmd{DATA LIST}, or @cmd{N OF CASES}. @var{cmd} +may be abbreviated to its first word if that is unambiguous; each word +in @var{cmd} may be abbreviated to a unique prefix of three or more +characters as described above. + +@cindex subcommands +@item +The command name may be followed by one or more @dfn{subcommands}: + +@itemize @minus +@item +Each subcommand begins with a unique keyword, indicated by @var{sbc} +above. This is analogous to the command name. 
+ +@item +The subcommand name is optionally followed by an equals sign (@samp{=}). + +@item +Some subcommands accept a series of one or more specifications +(@var{spec}), optionally separated by commas. + +@item +Each subcommand must be separated from the next (if any) by a forward +slash (@samp{/}). +@end itemize + +@cindex dot, terminal +@cindex terminal dot +@item +Each command must be terminated with a @dfn{terminal dot}. +The terminal dot may be given one of three ways: + +@itemize @minus +@item +(most commonly) A period character at the very end of a line, as +described above. + +@item +(only if NULLINE is on: @xref{SET, , Setting user preferences}, for more +details.) A completely blank line. + +@item +(in batch mode only) Any line that is not indented from the left side of +the page causes a terminal dot to be inserted before that line. +Therefore, each command begins with a line that is flush left, followed +by zero or more lines that are indented one or more characters from the +left margin. + +In batch mode, PSPP will ignore a plus sign, minus sign, or period +(@samp{+}, @samp{@minus{}}, or @samp{.}) as the first character in a +line. Any of these characters as the first character on a line will +begin a new command. This allows for visual indentation of a command +without that command being considered part of the previous command. + +PSPP is in batch mode when it is reading input from a file, rather +than from an interactive user. Note that the other forms of the +terminal dot may also be used in batch mode. + +Sometimes, one encounters syntax files that are intended to be +interpreted in interactive mode rather than batch mode (for instance, +this can happen if a session log file is used directly as a syntax +file). When this occurs, use the @samp{-i} command line option to force +interpretation in interactive mode (@pxref{Language control options}). +@end itemize +@end itemize + +PSPP ignores empty commands when they are generated by the above +rules. 
Note that, as a consequence of these rules, each command must +begin on a new line. + +@node Types of Commands, Order of Commands, Commands, Language +@section Types of Commands + +Commands in PSPP are divided roughly into six categories: + +@table @strong +@item Utility commands +@cindex utility commands +Set or display various global options that affect PSPP operations. +May appear anywhere in a syntax file. @xref{Utilities, , Utility +commands}. + +@item File definition commands +@cindex file definition commands +Give instructions for reading data from text files or from special +binary ``system files''. Most of these commands discard any previous +data or variables to replace it with the new data and +variables. At least one must appear before the first command in any of +the categories below. @xref{Data Input and Output}. + +@item Input program commands +@cindex input program commands +Though rarely used, these provide powerful tools for reading data files +in arbitrary textual or binary formats. @xref{INPUT PROGRAM}. + +@item Transformations +@cindex transformations +Perform operations on data and write data to output files. Transformations +are not carried out until a procedure is executed. + +@item Restricted transformations +@cindex restricted transformations +Same as transformations for most purposes. @xref{Order of Commands}, for a +detailed description of the differences. + +@item Procedures +@cindex procedures +Analyze data, writing results of analyses to the listing file. Cause +transformations specified earlier in the file to be performed. In a +more general sense, a @dfn{procedure} is any command that causes the +active file (the data) to be read. +@end table + +@node Order of Commands, Missing Observations, Types of Commands, Language +@section Order of Commands +@cindex commands, ordering +@cindex order of commands + +PSPP does not place many restrictions on ordering of commands. 
+The main restriction is that variables must be defined with one of the +file-definition commands before they are otherwise referred to. + +Of course, there are specific rules, for those who are interested. +PSPP possesses five internal states, called initial, INPUT PROGRAM, +FILE TYPE, transformation, and procedure states. (Please note the +distinction between the @cmd{INPUT PROGRAM} and @cmd{FILE TYPE} +@emph{commands} and the INPUT PROGRAM and FILE TYPE @emph{states}.) + +PSPP starts up in the initial state. Each successful completion +of a command may cause a state transition. Each type of command has its +own rules for state transitions: + +@table @strong +@item Utility commands +@itemize @bullet +@item +Legal in all states. +@item +Do not cause state transitions. Exception: when @cmd{N OF CASES} +is executed in the procedure state, it causes a transition to the +transformation state. +@end itemize + +@item @cmd{DATA LIST} +@itemize @bullet +@item +Legal in all states. +@item +When executed in the initial or procedure state, causes a transition to +the transformation state. +@item +Clears the active file if executed in the procedure or transformation +state. +@end itemize + +@item @cmd{INPUT PROGRAM} +@itemize @bullet +@item +Invalid in INPUT PROGRAM and FILE TYPE states. +@item +Causes a transition to the INPUT PROGRAM state. +@item +Clears the active file. +@end itemize + +@item @cmd{FILE TYPE} +@itemize @bullet +@item +Invalid in INPUT PROGRAM and FILE TYPE states. +@item +Causes a transition to the FILE TYPE state. +@item +Clears the active file. +@end itemize + +@item Other file definition commands +@itemize @bullet +@item +Invalid in INPUT PROGRAM and FILE TYPE states. +@item +Cause a transition to the transformation state. +@item +Clear the active file, except for @cmd{ADD FILES}, @cmd{MATCH FILES}, +and @cmd{UPDATE}. +@end itemize + +@item Transformations +@itemize @bullet +@item +Invalid in initial and FILE TYPE states. 
+@item +Cause a transition to the transformation state. +@end itemize + +@item Restricted transformations +@itemize @bullet +@item +Invalid in initial, INPUT PROGRAM, and FILE TYPE states. +@item +Cause a transition to the transformation state. +@end itemize + +@item Procedures +@itemize @bullet +@item +Invalid in initial, INPUT PROGRAM, and FILE TYPE states. +@item +Cause a transition to the procedure state. +@end itemize +@end table + +@node Missing Observations, Variables, Order of Commands, Language +@section Handling missing observations +@cindex missing values +@cindex values, missing + +PSPP includes special support for unknown numeric data values. +Missing observations are assigned a special value, called the +@dfn{system-missing value}. This ``value'' actually indicates the +absence of value; it means that the actual value is unknown. Procedures +automatically exclude from analyses those observations or cases that +have missing values. Whether single observations or entire cases are +excluded depends on the procedure. + +The system-missing value exists only for numeric variables. String +variables always have a defined value, even if it is only a string of +spaces. + +Variables, whether numeric or string, can have designated +@dfn{user-missing values}. Every user-missing value is an actual value +for that variable. However, most of the time user-missing values are +treated in the same way as the system-missing value. String variables +that are wider than a certain width, usually 8 characters (depending on +computer architecture), cannot have user-missing values. + +For more information on missing values, see the following sections: +@ref{Variables}, @ref{MISSING VALUES}, @ref{Expressions}. See also the +documentation on individual procedures for information on how they +handle missing values. 
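The exclusion rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PSPP's implementation: the names `SYSMIS`, `is_missing`, and `mean_excluding_missing` are hypothetical, and representing the system-missing value as NaN is an assumption made for this sketch.

```python
import math

# Assumption for this sketch only: NaN stands in for the system-missing value.
SYSMIS = math.nan


def is_missing(value, user_missing=()):
    """Return True for the system-missing value or a designated user-missing value."""
    if isinstance(value, float) and math.isnan(value):
        return True  # system-missing: exists only for numeric data
    return value in user_missing  # user-missing: an actual value flagged as missing


def mean_excluding_missing(values, user_missing=()):
    """Mean of the non-missing observations, or SYSMIS if none remain."""
    valid = [v for v in values if not is_missing(v, user_missing)]
    if not valid:
        return SYSMIS
    return sum(valid) / len(valid)


# 99.0 is designated user-missing here; SYSMIS marks an unknown observation.
ages = [34.0, 99.0, SYSMIS, 28.0]
result = mean_excluding_missing(ages, user_missing=(99.0,))
```

Note that a string of spaces is not treated as missing by `is_missing` unless it is listed in `user_missing`, which mirrors the statement that string variables always have a defined value.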
+ +@node Variables, Files, Missing Observations, Language +@section Variables +@cindex variables +@cindex dictionary + +Variables are the basic unit of data storage in PSPP. All the +variables in a file taken together, apart from any associated data, are +said to form a @dfn{dictionary}. +Some details of variables are described in the sections below. + +@menu +* Attributes:: Attributes of variables. +* System Variables:: Variables automatically defined by PSPP. +* Sets of Variables:: Lists of variable names. +* Input/Output Formats:: Input and output formats. +* Scratch Variables:: Variables deleted by procedures. +@end menu + +@node Attributes, System Variables, Variables, Variables +@subsection Attributes of Variables +@cindex variables, attributes of +@cindex attributes of variables +Each variable has a number of attributes, including: + +@table @strong +@item Name +This is an identifier. Each variable must have a different name. +@xref{Tokens}. + +@cindex variables, type +@cindex type of variables +@item Type +Numeric or string. + +@cindex variables, width +@cindex width of variables +@item Width +(string variables only) String variables with a width of 8 characters or +fewer are called @dfn{short string variables}. Short string variables +can be used in many procedures where @dfn{long string variables} (those +with widths greater than 8) are not allowed. + +@quotation +@strong{Please note:} Certain systems may consider strings longer than 8 +characters to be short strings. Eight characters represents a minimum +figure for the maximum length of a short string. +@end quotation + +@item Position +Variables in the dictionary are arranged in a specific order. +@cmd{DISPLAY} can be used to show this order: see @ref{DISPLAY}. + +@item Initialization +Either reinitialized to 0 or spaces for each case, or left at its +existing value. @xref{LEAVE}. 
+ +@cindex missing values +@cindex values, missing +@item Missing values +Optionally, up to three values, or a range of values, or a specific +value plus a range, can be specified as @dfn{user-missing values}. +There is also a @dfn{system-missing value} that is assigned to an +observation when there is no other obvious value for that observation. +Observations with missing values are automatically excluded from +analyses. User-missing values are actual data values, while the +system-missing value is not a value at all. @xref{Missing Observations}. + +@cindex variable labels +@cindex labels, variable +@item Variable label +A string that describes the variable. @xref{VARIABLE LABELS}. + +@cindex value labels +@cindex labels, value +@item Value label +Optionally, these associate each possible value of the variable with a +string. @xref{VALUE LABELS}. + +@cindex print format +@item Print format +Display width, format, and (for numeric variables) number of decimal +places. This attribute does not affect how data are stored, just how +they are displayed. Example: a width of 8, with 2 decimal places. +@xref{PRINT FORMATS}. + +@cindex write format +@item Write format +Similar to print format, but used by certain commands that are +designed to write to binary files. @xref{WRITE FORMATS}. +@end table + +@node System Variables, Sets of Variables, Attributes, Variables +@subsection Variables Automatically Defined by PSPP +@cindex system variables +@cindex variables, system + +There are seven system variables. These are not like ordinary +variables, as they are not stored in each case. They can only be used +in expressions. These system variables, whose values and output formats +cannot be modified, are described below. + +@table @code +@cindex @code{$CASENUM} +@item $CASENUM +Case number of the case at the moment. This changes as cases are +shuffled around. 
+ +@cindex @code{$DATE} +@item $DATE +Date the PSPP process was started, in format A9, following the +pattern @code{DD MMM YY}. + +@cindex @code{$JDATE} +@item $JDATE +Number of days between 15 Oct 1582 and the time the PSPP process +was started. + +@cindex @code{$LENGTH} +@item $LENGTH +Page length, in lines, in format F11. + +@cindex @code{$SYSMIS} +@item $SYSMIS +System missing value, in format F1. + +@cindex @code{$TIME} +@item $TIME +Number of seconds between midnight 14 Oct 1582 and the time the active file +was read, in format F20. + +@cindex @code{$WIDTH} +@item $WIDTH +Page width, in characters, in format F3. +@end table + +@node Sets of Variables, Input/Output Formats, System Variables, Variables +@subsection Lists of variable names +@cindex TO convention +@cindex convention, TO + +There are several ways to specify a set of variables: + +@enumerate +@item +(Most commonly.) List the variable names one after another, optionally +separating them by commas. + +@cindex @code{TO} +@item +(This method cannot be used on commands that define the dictionary, such +as @cmd{DATA LIST}.) The syntax is the names of two existing variables, +separated by the reserved keyword @code{TO}. The meaning is to include +every variable in the dictionary between and including the variables +specified. For instance, if the dictionary contains six variables with +the names @code{ID}, @code{X1}, @code{X2}, @code{GOAL}, @code{MET}, and +@code{NEXTGOAL}, in that order, then @code{X2 TO MET} would include +variables @code{X2}, @code{GOAL}, and @code{MET}. + +@item +(This method can be used only on commands that define the dictionary, +such as @cmd{DATA LIST}.) It is used to define sequences of variables +that end in consecutive integers. The syntax is two identifiers that +end in numbers. 
This method is best illustrated with examples: + +@itemize @bullet +@item +The syntax @code{X1 TO X5} defines 5 variables: + +@itemize @minus +@item +X1 +@item +X2 +@item +X3 +@item +X4 +@item +X5 +@end itemize + +@item +The syntax @code{ITEM0008 TO ITEM0013} defines 6 variables: + +@itemize @minus +@item +ITEM0008 +@item +ITEM0009 +@item +ITEM0010 +@item +ITEM0011 +@item +ITEM0012 +@item +ITEM0013 +@end itemize + +@item +Each of the syntaxes @code{QUES001 TO QUES9} and @code{QUES6 TO QUES3} +are invalid, although for different reasons, which should be evident. +@end itemize + +Note that after a set of variables has been defined with @cmd{DATA LIST} +or another command with this method, the same set can be referenced on +later commands using the same syntax. + +@item +The above methods can be combined, either one after another or delimited +by commas. For instance, the combined syntax @code{A Q5 TO Q8 X TO Z} +is legal as long as each part @code{A}, @code{Q5 TO Q8}, @code{X TO Z} +is individually legal. +@end enumerate + +@node Input/Output Formats, Scratch Variables, Sets of Variables, Variables +@subsection Input and Output Formats + +Data that PSPP inputs and outputs must have one of a number of formats. +These formats are described, in general, by a format specification of +the form @code{NAMEw.d}, where @var{name} is the +format name and @var{w} is a field width. @var{d} is the optional +desired number of decimal places, if appropriate. If @var{d} is not +included then it is assumed to be 0. Some formats do not allow @var{d} +to be specified. + +When an input format is specified on @cmd{DATA LIST} or another +command, then +it is converted to an output format for the purposes of @cmd{PRINT} +and other +data output commands. For most purposes, input and output formats are +the same; the salient differences are described below. + +Below are listed the input and output formats supported by PSPP. 
If an +input format is mapped to a different output format by default, then +that mapping is indicated with @result{}. Each format has the listed +bounds on input width (iw) and output width (ow). + +The standard numeric input and output formats are given in the following +table: + +@table @asis +@item Fw.d: 1 <= iw,ow <= 40 +Standard decimal format with @var{d} decimal places. If the number is +too large to fit within the field width, it is expressed in scientific +notation (@code{1.2+34}) if w >= 6, with always at least two digits in +the exponent. When used as an input format, scientific notation is +allowed but an E or an F must be used to introduce the exponent. + +The default output format is the same as the input format, except if +@var{d} > 1. In that case the output @var{w} is always made to be at +least 2 + @var{d}. + +@item Ew.d: 1 <= iw <= 40; 6 <= ow <= 40 +For input this is equivalent to F format except that no E or F is +required to introduce the exponent. For output, produces scientific +notation in the form @code{1.2+34}. There are always at least two +digits given in the exponent. + +The default output @var{w} is the largest of the input @var{w}, the +input @var{d} + 7, and 10. The default output @var{d} is the input +@var{d}, but at least 3. + +@item COMMAw.d: 1 <= iw,ow <= 40 +Equivalent to F format, except that groups of three digits are +comma-separated on output. If the number is too large to express in the +field width, then first commas are eliminated, then if there is still +not enough space the number is expressed in scientific notation given +that w >= 6. Commas are allowed and ignored when this is used as an +input format. + +@item DOTw.d: 1 <= iw,ow <= 40 +Equivalent to COMMA format except that the roles of comma and decimal +point are interchanged. However: If SET /DECIMAL=DOT is in effect, then +COMMA uses @samp{,} for a decimal point and DOT uses @samp{.} for a +decimal point. 
+ +@item DOLLARw.d: 1 <= iw <= 40; 2 <= ow <= 40 +Equivalent to COMMA format, except that the number is prefixed by a +dollar sign (@samp{$}) if there is room. On input the value is allowed +to be prefixed by a dollar sign, which is ignored. + +The default output @var{w} is the input @var{w}, but at least 2. + +@item PCTw.d: 2 <= iw,ow <= 40 +Equivalent to F format, except that the number is suffixed by a percent +sign (@samp{%}) if there is room. On input the value is allowed to be +suffixed by a percent sign, which is ignored. + +The default output @var{w} is the input @var{w}, but at least 2. + +@item Nw.d: 1 <= iw,ow <= 40 +Only digits are allowed within the field width. The decimal point is +assumed to be @var{d} digits from the right margin. + +The default output format is F with the same @var{w} and @var{d}, except +if @var{d} > 1. In that case the output @var{w} is always made to be at +least 2 + @var{d}. + +@item Zw.d @result{} F: 1 <= iw,ow <= 40 +Zoned decimal input. If you need to use this then you know how. + +@item IBw.d @result{} F: 1 <= iw,ow <= 8 +Integer binary format. The field is interpreted as a fixed-point +positive or negative binary number in two's-complement notation. The +location of the decimal point is implied. Endianness is the same as the +host machine. + +The default output format is F8.2 if @var{d} is 0. Otherwise it is F, +with output @var{w} as 9 + input @var{d} and output @var{d} as input +@var{d}. + +@item PIBw.d @result{} F: 1 <= iw,ow <= 8 +Positive integer binary format. The field is interpreted as a +fixed-point positive binary number. The location of the decimal point +is implied. Endianness is the same as the host machine. + +The default output format follows the rules for IB format. + +@item Pw.d @result{} F: 1 <= iw,ow <= 16 +Binary coded decimal format. Each byte from left to right, except the +rightmost, represents two digits. The upper nibble of each byte is more +significant. 
The upper nibble of the final byte is the least +significant digit. The lower nibble of the final byte is the sign; a +value of D represents a negative sign and all other values are +considered positive. The decimal point is implied. + +The default output format follows the rules for IB format. + +@item PKw.d @result{} F: 1 <= iw,ow <= 16 +Positive binary coded decimal format. Same as P, but the final byte +holds two digits like the others rather than a digit and a sign. + +The default output format follows the rules for IB format. + +@item RBw @result{} F: 2 <= iw,ow <= 8 + +Binary C architecture-dependent ``double'' format. For a standard +IEEE754 implementation @var{w} should be 8. + +The default output format follows the rules for IB format. + +@item PIBHEXw.d @result{} F: 2 <= iw,ow <= 16 +PIB format encoded as textual hex digit pairs. @var{w} must be even. + +The input width is mapped to a default output width as follows: +2@result{}4, 4@result{}6, 6@result{}9, 8@result{}11, 10@result{}14, +12@result{}16, 14@result{}18, 16@result{}21. No allowances are made for +decimal places. + +@item RBHEXw @result{} F: 4 <= iw,ow <= 16 + +RB format encoded as textual hex digit pairs. @var{w} must be even. + +The default output format is F8.2. + +@item CCAw.d: 1 <= ow <= 40 +@itemx CCBw.d: 1 <= ow <= 40 +@itemx CCCw.d: 1 <= ow <= 40 +@itemx CCDw.d: 1 <= ow <= 40 +@itemx CCEw.d: 1 <= ow <= 40 + +User-defined custom currency formats. May not be used as an input +format. @xref{SET}, for more details. +@end table + +The date and time numeric input and output formats accept a number of +possible formats. Before describing the formats themselves, some +definitions of the elements that make up their formats will be helpful: + +@table @dfn +@item leader +All formats accept an optional whitespace leader. + +@item day +An integer between 1 and 31 representing the day of month. + +@item day-count +An integer representing a number of days. 
+ +@item date-delimiter +One or more characters of whitespace or the following characters: +@code{- / . ,} + +@item month +A month name in one of the following forms: +@itemize @bullet +@item +An integer between 1 and 12. +@item +Roman numerals representing an integer between 1 and 12. +@item +At least the first three characters of an English month name (January, +February, @dots{}). +@end itemize + +@item year +An integer year number between 1582 and 19999, or between 1 and 199. +Years between 1 and 199 will have 1900 added. + +@item julian +A single number with a year number in the first 2, 3, or 4 digits (as +above) and the day number within the year in the last 3 digits. + +@item quarter +An integer between 1 and 4 representing a quarter. + +@item q-delimiter +The letter @samp{Q} or @samp{q}. + +@item week +An integer between 1 and 53 representing a week within a year. + +@item wk-delimiter +The letters @samp{wk} in any case. + +@item time-delimiter +At least one character of whitespace or @samp{:} or @samp{.}. + +@item hour +An integer greater than 0 representing an hour. + +@item minute +An integer between 0 and 59 representing a minute within an hour. + +@item opt-second +Optionally, a time-delimiter followed by a real number representing a +number of seconds. + +@item hour24 +An integer between 0 and 23 representing an hour within a day. + +@item weekday +At least the first two characters of an English day name. + +@item spaces +Any amount or no amount of whitespace. + +@item sign +An optional positive or negative sign. + +@item trailer +All formats accept an optional whitespace trailer. +@end table + +The date input formats are strung together from the above pieces. On +output, the date formats are always printed in a single canonical +manner, based on field width. The date input and output formats are +described below: + +@table @asis +@item DATEw: 9 <= iw,ow <= 40 +Date format. 
Input format: leader + day + date-delimiter + +month + date-delimiter + year + trailer. Output format: DD-MMM-YY for +@var{w} < 11, DD-MMM-YYYY otherwise. + +@item EDATEw: 8 <= iw,ow <= 40 +European date format. Input format same as DATE. Output format: +DD.MM.YY for @var{w} < 10, DD.MM.YYYY otherwise. + +@item SDATEw: 8 <= iw,ow <= 40 +Standard date format. Input format: leader + year + date-delimiter + +month + date-delimiter + day + trailer. Output format: YY/MM/DD for +@var{w} < 10, YYYY/MM/DD otherwise. + +@item ADATEw: 8 <= iw,ow <= 40 +American date format. Input format: leader + month + date-delimiter + +day + date-delimiter + year + trailer. Output format: MM/DD/YY for +@var{w} < 10, MM/DD/YYYY otherwise. + +@item JDATEw: 5 <= iw,ow <= 40 +Julian date format. Input format: leader + julian + trailer. Output +format: YYDDD for @var{w} < 7, YYYYDDD otherwise. + +@item QYRw: 4 <= iw <= 40, 6 <= ow <= 40 +Quarter/year format. Input format: leader + quarter + q-delimiter + +year + trailer. Output format: @samp{Q Q YY}, where the first +@samp{Q} is one of the digits 1, 2, 3, 4, if @var{w} < 8, @code{Q Q +YYYY} otherwise. + +@item MOYRw: 6 <= iw,ow <= 40 +Month/year format. Input format: leader + month + date-delimiter + year ++ trailer. Output format: @samp{MMM YY} for @var{w} < 8, @samp{MMM +YYYY} otherwise. + +@item WKYRw: 6 <= iw <= 40, 8 <= ow <= 40 +Week/year format. Input format: leader + week + wk-delimiter + year + +trailer. Output format: @samp{WW WK YY} for @var{w} < 10, @samp{WW WK +YYYY} otherwise. + +@item DATETIMEw.d: 17 <= iw,ow <= 40 +Date and time format. Input format: leader + day + date-delimiter + +month + date-delimiter + year + time-delimiter + hour24 + time-delimiter ++ minute + opt-second. Output format: @samp{DD-MMM-YYYY HH:MM}. If +@var{w} > 19 then seconds @samp{:SS} are added. If @var{w} > 22 and +@var{d} > 0 then fractional seconds @samp{.SS} are added. + +@item TIMEw.d: 5 <= iw,ow <= 40 +Time format. 
Input format: leader + sign + spaces + hour + +time-delimiter + minute + opt-second. Output format: @samp{HH:MM}. +Seconds and fractional seconds are available with @var{w} of at least 8 +and 10, respectively. + +@item DTIMEw.d: 1 <= iw <= 40, 8 <= ow <= 40 +Time format with day count. Input format: leader + sign + spaces + +day-count + time-delimiter + hour + time-delimiter + minute + +opt-second. Output format: @samp{DD HH:MM}. Seconds and fractional +seconds are available with @var{w} of at least 8 and 10, respectively. + +@item WKDAYw: 2 <= iw,ow <= 40 +A weekday as a number between 1 and 7, where 1 is Sunday. Input format: +leader + weekday + trailer. Output format: as many characters, in all +capital letters, of the English name of the weekday as will fit in the +field width. + +@item MONTHw: 3 <= iw,ow <= 40 +A month as a number between 1 and 12, where 1 is January. Input format: +leader + month + trailer. Output format: as many characters, in all +capital letters, of the English name of the month as will fit in the +field width. +@end table + +There are only two formats that may be used with string variables: + +@table @asis +@item Aw: 1 <= iw <= 255, 1 <= ow <= 254 +The entire field is treated as a string value. + +@item AHEXw @result{} A: 2 <= iw <= 254; 2 <= ow <= 510 +The field is composed of characters in a string encoded as textual hex +digit pairs. + +The default output @var{w} is half the input @var{w}. +@end table + +@node Scratch Variables, , Input/Output Formats, Variables +@subsection Scratch Variables + +Most of the time, variables don't retain their values between cases. +Instead, either they're being read from a data file or the active file, +in which case they assume the value read, or, if created with +@cmd{COMPUTE} or +another transformation, they're initialized to the system-missing value +or to blanks, depending on type. + +However, sometimes it's useful to have a variable that keeps its value +between cases. 
You can do this with @cmd{LEAVE} (@pxref{LEAVE}), or you can +use a @dfn{scratch variable}. Scratch variables are variables whose +names begin with an octothorpe (@samp{#}). + +Scratch variables have the same properties as variables left with +@cmd{LEAVE}: +they retain their values between cases, and for the first case they are +initialized to 0 or blanks. They have the additional property that they +are deleted before the execution of any procedure. For this reason, +scratch variables can't be used for analysis. To obtain the same +effect, use @cmd{COMPUTE} (@pxref{COMPUTE}) to copy the scratch variable's +value into an ordinary variable, then analyze that variable. + +@node Files, BNF, Variables, Language +@section Files Used by PSPP + +PSPP makes use of many files each time it runs. Some of these it +reads, some it writes, some it creates. Here is a table listing the +most important of these files: + +@table @strong +@cindex file, command +@cindex file, syntax file +@cindex command file +@cindex syntax file +@item command file +@itemx syntax file +These names (synonyms) refer to the file that contains instructions to +PSPP that tell it what to do. The syntax file's name is specified on +the PSPP command line. Syntax files can also be pulled in with +@cmd{INCLUDE} (@pxref{INCLUDE}). + +@cindex file, data +@cindex data file +@item data file +Data files contain raw data in ASCII format suitable for being read in +by @cmd{DATA LIST}. Data can be embedded in the syntax +file with @cmd{BEGIN DATA} and @cmd{END DATA}: this makes the +syntax file a data file too. + +@cindex file, output +@cindex output file +@item listing file +One or more output files are created by PSPP each time it is +run. The output files receive the tables and charts produced by +statistical procedures. The output files may be in any number of formats, +depending on how PSPP is configured. 
+ +@cindex active file +@cindex file, active +@item active file +The active file is the ``file'' on which all PSPP procedures +are performed. The active file contains variable definitions and +cases. The active file is not necessarily a disk file: it is stored +in memory if there is room. +@end table + +@node BNF, , Files, Language +@section Backus-Naur Form +@cindex BNF +@cindex Backus-Naur Form +@cindex command syntax, description of +@cindex description of command syntax + +The syntax of some parts of the PSPP language is presented in this +manual using the formalism known as @dfn{Backus-Naur Form}, or BNF. The +following table describes BNF: + +@itemize @bullet +@cindex keywords +@cindex terminals +@item +Words in all-uppercase are PSPP keyword tokens. In BNF, these are +often called @dfn{terminals}. There are some special terminals, which +are actually written in lowercase for clarity: + +@table @asis +@cindex @code{number} +@item @code{number} +A real number. + +@cindex @code{integer} +@item @code{integer} +An integer number. + +@cindex @code{string} +@item @code{string} +A string. + +@cindex @code{var-name} +@item @code{var-name} +A single variable name. + +@cindex operators +@cindex punctuators +@item @code{=}, @code{/}, @code{+}, @code{-}, etc. +Operators and punctuators. + +@cindex @code{.} +@cindex terminal dot +@cindex dot, terminal +@item @code{.} +The terminal dot. This is not necessarily an actual dot in the syntax +file: @xref{Commands}, for more details. +@end table + +@item +@cindex productions +@cindex nonterminals +Other words in all lowercase refer to BNF definitions, called +@dfn{productions}. These productions are also known as +@dfn{nonterminals}. Some nonterminals are very common, so they are +defined here in English for clarity: + +@table @code +@cindex @code{var-list} +@item var-list +A list of one or more variable names or the keyword @code{ALL}. + +@cindex @code{expression} +@item expression +An expression. 
@xref{Expressions}, for details. +@end table + +@item +@cindex @code{::=} +@cindex ``is defined as'' +@cindex productions +@samp{::=} means ``is defined as''. The left side of @samp{::=} gives +the name of the nonterminal being defined. The right side of @samp{::=} +gives the definition of that nonterminal. If the right side is empty, +then one possible expansion of that nonterminal is nothing. A BNF +definition is called a @dfn{production}. + +@item +@cindex terminals and nonterminals, differences +So, the key difference between a terminal and a nonterminal is that a +terminal cannot be broken into smaller parts---in fact, every terminal +is a single token (@pxref{Tokens}). On the other hand, nonterminals are +composed of a (possibly empty) sequence of terminals and nonterminals. +Thus, terminals indicate the deepest level of syntax description. (In +parsing theory, terminals are the leaves of the parse tree; nonterminals +form the branches.) + +@item +@cindex start symbol +@cindex symbol, start +The first nonterminal defined in a set of productions is called the +@dfn{start symbol}. The start symbol defines the entire syntax for +that command. +@end itemize +@setfilename ignored diff --git a/doc/license.texi b/doc/license.texi new file mode 100644 index 00000000..b856a431 --- /dev/null +++ b/doc/license.texi @@ -0,0 +1,42 @@ +@node License, Credits, Introduction, Top +@chapter Your rights and obligations +@cindex license +@cindex your rights and obligations +@cindex rights, your +@cindex obligations, your + +@cindex Free Software Foundation +@cindex GNU General Public License +@cindex General Public License +@cindex GPL +@cindex distribution +@cindex redistribution +Most of PSPP is distributed under the GNU General Public +License. The General Public License says, in effect, that you may +modify and distribute PSPP as you like, as long as you grant the +same rights to others. 
It also states that you must provide source code +when you distribute PSPP, or, if you obtained PSPP +source code from an anonymous ftp site, give out the name of that site. + +The General Public License is given in full in the source distribution +as file @file{COPYING}. In Debian GNU/Linux, this file is also +available as file @file{/usr/share/common-licenses/GPL-2}. + +To quote the GPL itself: + +@quotation +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 2 of the License, or (at your +option) any later version. + +This program is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +You should have received a copy of the GNU General Public License along +with this program; if not, write to the Free Software Foundation, Inc., +59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +@end quotation +@setfilename ignored diff --git a/doc/not-implemented.texi b/doc/not-implemented.texi new file mode 100644 index 00000000..cd668d60 --- /dev/null +++ b/doc/not-implemented.texi @@ -0,0 +1,75 @@ +@node Not Implemented, Bugs, Utilities, Top +@chapter Not Implemented + +This chapter lists parts of the PSPP language that are not yet +implemented. + +The following transformations and utilities are not yet implemented, but +they will be supported in a later release. + +@itemize @bullet +@item +ADD FILES +@item +ANOVA +@item +DEFINE +@item +FILE TYPE +@item +GET SAS +@item +GET TRANSLATE +@item +MCONVERT +@item +PLOT +@item +PRESERVE +@item +PROCEDURE OUTPUT +@item +RESTORE +@item +SAVE TRANSLATE +@item +UPDATE +@end itemize + +The following transformations and utilities are not implemented. There +are no plans to support them in future releases. 
Contributions to +implement them will still be accepted. + +@itemize @bullet +@item +EDIT +@item +GET DATABASE +@item +GET OSIRIS +@item +GET SCSS +@item +GSET +@item +HELP +@item +INFO +@item +INPUT MATRIX +@item +KEYED DATA LIST +@item +NUMBERED and UNNUMBERED +@item +OPTIONS +@item +REVIEW +@item +SAVE SCSS +@item +SPSS MANAGER +@item +STATISTICS +@end itemize +@setfilename ignored diff --git a/doc/portable-file-format.texi b/doc/portable-file-format.texi new file mode 100644 index 00000000..3f351b23 --- /dev/null +++ b/doc/portable-file-format.texi @@ -0,0 +1,438 @@ +@node Portable File Format, Data File Format, Configuration, Top +@appendix Portable File Format + +These days, most computers use the same internal data formats for +integer and floating-point data, if one ignores little differences like +big- versus little-endian byte ordering. However, occasionally it is +necessary to exchange data between systems with incompatible data +formats. This is what portable files are designed to do. + +@strong{Please note:} Although all of the following information is +correct, as far as the author has been able to ascertain, it is gleaned +from examination of ASCII-formatted portable files only, so some of it +may be incorrect in the general case. + +@menu +* Portable File Characters:: +* Portable File Structure:: +* Portable File Header:: +* Version and Date Info Record:: +* Identification Records:: +* Variable Count Record:: +* Case Weight Variable Record:: +* Variable Records:: +* Value Label Records:: +* Portable File Data:: +@end menu + +@node Portable File Characters, Portable File Structure, Portable File Format, Portable File Format +@section Portable File Characters + +Portable files are arranged as a series of lines of exactly 80 +characters each. Each line is terminated by a carriage-return, +line-feed sequence (``new-lines''). New-lines are only used to avoid +line length limits imposed by some OSes; they are not meaningful.
+ +The file must be terminated with a @samp{Z} character. In addition, if +the final line in the file does not have exactly 80 characters, then it +is padded on the right with @samp{Z} characters. (The file contents may +be in any character set; the file contains a description of its own +character set, as explained in the next section. Therefore, the +@samp{Z} character is not necessarily an ASCII @samp{Z}.) + +For the rest of the description of the portable file format, new-lines +and the trailing @samp{Z}s will be ignored, as if they did not exist, +because they are not an important part of understanding the file +contents. + +@node Portable File Structure, Portable File Header, Portable File Characters, Portable File Format +@section Portable File Structure + +Every portable file consists of the following records, in sequence: + +@itemize @bullet + +@item +File header. + +@item +Version and date info. + +@item +Product identification. + +@item +Subproduct identification (optional). + +@item +Variable count. + +@item +Case weight variable (optional). + +@item +Variables. Each variable record may optionally be followed by a +missing value record and a variable label record. + +@item +Value labels (optional). + +@item +Data. +@end itemize + +Most records are identified by a single-character tag code. The file +header and version info record do not have a tag. + +Other than these single-character codes, there are three types of fields +in a portable file: floating-point, integer, and string. Floating-point +fields have the following format: + +@itemize @bullet + +@item +Zero or more leading spaces. + +@item +Optional asterisk (@samp{*}), which indicates a missing value. The +asterisk must be followed by a single character, generally a period +(@samp{.}), but it appears that other characters may also be possible. +This completes the specification of a missing value. + +@item +Optional minus sign (@samp{-}) to indicate a negative number. 
+ +@item +A whole number, consisting of one or more base-30 digits: @samp{0} +through @samp{9} plus capital letters @samp{A} through @samp{T}. + +@item +Optional fraction, consisting of a radix point (@samp{.}) followed by +one or more base-30 digits. + +@item +Optional exponent, consisting of a plus or minus sign (@samp{+} or +@samp{-}) followed by one or more base-30 digits. + +@item +A forward slash (@samp{/}). +@end itemize + +Integer fields take a form identical to floating-point fields, but they +may not contain a fraction. + +String fields take the form of an integer field having value @var{n}, +followed by exactly @var{n} characters, which are the string content. + +@node Portable File Header, Version and Date Info Record, Portable File Structure, Portable File Format +@section Portable File Header + +Every portable file begins with a 464-byte header, consisting of a +200-byte collection of vanity splash strings, followed by a 256-byte +character set translation table, followed by an 8-byte tag string. + +The 200-byte segment is divided into five 40-byte sections, each of +which represents the string @code{@var{charset} SPSS PORT FILE} in a +different character set encoding, where @var{charset} is the name of +the character set used in the file, e.g.@: @code{ASCII} or +@code{EBCDIC}. Each string is padded on the right with spaces in its +respective character set. + +It appears that these strings exist only to inform those who might view +the file on a screen, and that they are not parsed by SPSS products. +Thus, they can be safely ignored. For those interested, the strings are +supposed to be in the following character sets, in the specified order: +EBCDIC, 7-bit ASCII, CDC 6-bit ASCII, 6-bit ASCII, Honeywell 6-bit +ASCII. + +The 256-byte segment describes a mapping from the character set used in +the portable file to an arbitrary character set having characters at the +following positions: + +@table @asis +@item 0--60 + +Control characters.
Not important enough to describe in full here. + +@item 61--63 + +Reserved. + +@item 64--73 + +Digits @samp{0} through @samp{9}. + +@item 74--99 + +Capital letters @samp{A} through @samp{Z}. + +@item 100--125 + +Lowercase letters @samp{a} through @samp{z}. + +@item 126 + +Space. + +@item 127--130 + +Symbols @code{.<(+} + +@item 131 + +Solid vertical pipe. + +@item 132--142 + +Symbols @code{&[]!$*);^-/} + +@item 143 + +Broken vertical pipe. + +@item 144--150 + +Symbols @code{,%_>}?@code{`:} @c @code{?} is an inverted question mark + +@item 151 + +British pound symbol. + +@item 152--155 + +Symbols @code{@@'="}. + +@item 156 + +Less than or equal symbol. + +@item 157 + +Empty box. + +@item 158 + +Plus or minus. + +@item 159 + +Filled box. + +@item 160 + +Degree symbol. + +@item 161 + +Dagger. + +@item 162 + +Symbol @samp{~}. + +@item 163 + +En dash. + +@item 164 + +Lower left corner box draw. + +@item 165 + +Upper left corner box draw. + +@item 166 + +Greater than or equal symbol. + +@item 167--176 + +Superscript @samp{0} through @samp{9}. + +@item 177 + +Lower right corner box draw. + +@item 178 + +Upper right corner box draw. + +@item 179 + +Not equal symbol. + +@item 180 + +Em dash. + +@item 181 + +Superscript @samp{(}. + +@item 182 + +Superscript @samp{)}. + +@item 183 + +Horizontal dagger (?). + +@item 184--186 + +Symbols @samp{@{@}\}. +@item 187 + +Cents symbol. + +@item 188 + +Centered dot, or bullet. + +@item 189--255 + +Reserved. +@end table + +Symbols that are not defined in a particular character set are set to +the same value as symbol 64; i.e., to @samp{0}. + +The 8-byte tag string consists of the exact characters @code{SPSSPORT} +in the portable file's character set, which can be used to verify that +the file is indeed a portable file. + +@node Version and Date Info Record, Identification Records, Portable File Header, Portable File Format +@section Version and Date Info Record + +This record does not have a tag code. 
It has the following structure: + +@itemize @bullet +@item +A single character identifying the file format version. The letter A +represents version 0, and so on. + +@item +An 8-character string field giving the file creation date in the format +YYYYMMDD. + +@item +A 6-character string field giving the file creation time in the format +HHMMSS. +@end itemize + +@node Identification Records, Variable Count Record, Version and Date Info Record, Portable File Format +@section Identification Records + +The product identification record has tag code @samp{1}. It consists of +a single string field giving the name of the product that wrote the +portable file. + +The subproduct identification record has tag code @samp{3}. It +consists of a single string field giving additional information on the +product that wrote the portable file. + +@node Variable Count Record, Case Weight Variable Record, Identification Records, Portable File Format +@section Variable Count Record + +The variable count record has tag code @samp{4}. It consists of two +integer fields. The first contains the number of variables in the file +dictionary. The purpose of the second is unknown; it contains the value +161 in all portable files examined so far. + +@node Case Weight Variable Record, Variable Records, Variable Count Record, Portable File Format +@section Case Weight Variable Record + +The case weight variable record is optional. If it is present, it +indicates the variable used for weighting cases; if it is absent, +cases are unweighted. It has tag code @samp{6}. It consists of a +single string field that names the weighting variable. + +@node Variable Records, Value Label Records, Case Weight Variable Record, Portable File Format +@section Variable Records + +Each variable record represents a single variable. Variable records +have tag code @samp{7}. They have the following structure: + +@itemize @bullet + +@item +Width (integer). 
This is 0 for a numeric variable, and a number between 1 +and 255 for a string variable. + +@item +Name (string). 1--8 characters long. Must be in all capitals. + +@item +Print format. This is a set of three integer fields: + +@itemize @minus + +@item +Format type (@pxref{Variable Record}). + +@item +Format width. 1--40. + +@item +Number of decimal places. 1--40. +@end itemize + +@item +Write format. Same structure as the print format described above. +@end itemize + +Each variable record can optionally be followed by a missing value +record, which has tag code @samp{8}. A missing value record has one +field, the missing value itself (a floating-point or string, as +appropriate). Up to three of these missing value records can be used. + +There is also a record for missing value ranges, which has tag code +@samp{B}. It is followed by two fields representing the range, which +are floating-point or string as appropriate. If a missing value range +is present, it may be followed by a single missing value record. + +Tag codes @samp{9} and @samp{A} represent @code{LO THRU @var{x}} and +@code{@var{x} THRU HI} ranges, respectively. Each is followed by a +single field representing @var{x}. If one of the ranges is present, it +may be followed by a single missing value record. + +In addition, each variable record can optionally be followed by a +variable label record, which has tag code @samp{C}. A variable label +record has one field, the variable label itself (string). + +@node Value Label Records, Portable File Data, Variable Records, Portable File Format +@section Value Label Records + +Value label records have tag code @samp{D}. They have the following +format: + +@itemize @bullet +@item +Variable count (integer). + +@item +List of variables (strings). The variable count specifies the number in +the list. Variables are specified by their names. All variables must +be of the same type (numeric or string). + +@item +Label count (integer). 
+ +@item +List of (value, label) tuples. The label count specifies the number of +tuples. Each tuple consists of a value, which is numeric or string as +appropriate to the variables, followed by a label (string). +@end itemize + +@node Portable File Data, , Value Label Records, Portable File Format +@section Portable File Data + +The data record has tag code @samp{F}. There is only one tag for all +the data; thus, all the data must follow the dictionary. The data is +terminated by the end-of-file marker @samp{Z}, which is not valid as the +beginning of a data element. + +Data elements are output in the same order as the variable records +describing them. String variables are output as string fields, and +numeric variables are output as floating-point fields. +@setfilename ignored diff --git a/doc/pspp.texi b/doc/pspp.texi deleted file mode 100644 index 0ad81eb1..00000000 --- a/doc/pspp.texi +++ /dev/null @@ -1,10182 +0,0 @@ -\input texinfo @c -*- texinfo -*- -@c %**start of header -@setfilename pspp.info -@settitle PSPP -@set TIMESTAMP Time-stamp: Sat Dec 20 20:25:33 WST 2003 jmd -@set EDITION 0.2 -@set VERSION 0.3 -@c For double-sided printing, uncomment: -@c @setchapternewpage odd -@c %**end of header - -@macro cmd{CMDNAME} -\CMDNAME\ -@end macro - -@iftex -@finalout -@end iftex - -@dircategory Math -@direntry -* PSPP: (pspp). Statistical analysis package. -@end direntry - -@ifinfo -PSPP, for statistical analysis of sampled data, by Ben Pfaff. - -This file documents PSPP, a statistical package for analysis of -sampled data that uses a command language compatible with SPSS. - -Copyright (C) 1996-9, 2000 Free Software Foundation, Inc. - -This version of the PSPP documentation is consistent with version 2 of -``texinfo.tex''. - -Permission is granted to make and distribute verbatim copies of this -manual provided the copyright notice and this permission notice are -preserved on all copies. 
- -@ignore -Permission is granted to process this file through TeX and print the -results, provided the printed document carries copying permission notice -identical to this one except for the removal of this paragraph (this -paragraph not being relevant to the printed manual). - -@end ignore -Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided that the -entire resulting derived work is distributed under the terms of a -permission notice identical to this one. - -Permission is granted to copy and distribute translations of this -manual into another language, under the above condition for modified -versions, except that this permission notice may be stated in a -translation approved by the Free Software Foundation. -@end ifinfo - -@titlepage -@title PSPP -@subtitle A System for Statistical Analysis -@subtitle Edition @value{EDITION}, for PSPP version @value{VERSION} -@author by Ben Pfaff - -@page -@vskip 0pt plus 1filll - -PSPP Copyright @copyright{} 1997, 1998 Free Software Foundation, Inc. - -Permission is granted to make and distribute verbatim copies of this -manual provided the copyright notice and this permission notice are -preserved on all copies. - -Permission is granted to copy and distribute modified versions of this -manual under the conditions for verbatim copying, provided that the -entire derived work is distributed under the terms of a permission -notice identical to this one. - -Permission is granted to copy and distribute translations of this manual -into another language, under the above conditions for modified versions, -except that this permission notice may be stated in a translation -approved by the Foundation. -@end titlepage - -@node Top, Introduction, (dir), (dir) -@ifinfo -@top PSPP - -This file documents the PSPP package for statistical analysis of sampled -data. 
This is edition @value{EDITION}, for PSPP version -@value{VERSION}, last modified at @value{TIMESTAMP}. - -@end ifinfo - -@menu -* Introduction:: Description of the package. -* License:: Your rights and obligations. -* Credits:: Acknowledgement of authors. - -* Installation:: How to compile and install PSPP. -* Configuration:: Configuring PSPP. -* Invocation:: Starting and running PSPP. - -* Language:: Basics of the PSPP command language. -* Expressions:: Numeric and string expression syntax. - -* Data Input and Output:: Reading data from user files. -* System and Portable Files:: Dealing with system & portable files. -* Variable Attributes:: Adjusting and examining variables. -* Data Manipulation:: Simple operations on data. -* Data Selection:: Select certain cases for analysis. -* Conditionals and Looping:: Doing things many times or not at all. -* Statistics:: Basic statistical procedures. -* Utilities:: Other commands. -* Not Implemented:: What's not here yet - -* Data File Format:: Format of PSPP system files. -* Portable File Format:: Format of PSPP portable files. -* q2c Input Format:: Format of syntax accepted by q2c. - -* Bugs:: Known problems; submitting bug reports. - -* Function Index:: Index of PSPP functions for expressions. -* Concept Index:: Index of concepts. -* Command Index:: Index of PSPP procedures. - -@end menu - -@node Introduction, License, Top, Top -@chapter Introduction -@cindex introduction - -@cindex PSPP language -@cindex language, PSPP -PSPP is a tool for statistical analysis of sampled data. It reads a -syntax file and a data file, analyzes the data, and writes the results -to a listing file or to standard output. - -The language accepted by PSPP is similar to those accepted by SPSS -statistical products. The details of PSPP's language are given -later in this manual. 
- -@cindex files, PSPP -@cindex output, PSPP -@cindex PostScript -@cindex graphics -@cindex Ghostscript -@cindex Free Software Foundation -PSPP produces output in two forms: tables and charts. Both of these can -be written in several formats; currently, ASCII, PostScript, and HTML -are supported. In the future, more drivers, such as PCL and X Window -System drivers, may be developed. For now, Ghostscript, available from -the Free Software Foundation, may be used to convert PostScript chart -output to other formats. - -The current version of PSPP, @value{VERSION}, is woefully incomplete in -terms of its statistical procedure support. PSPP is a work in progress. -The author hopes to support fully support all features in the products -that PSPP replaces, eventually. The author welcomes questions, -comments, donations, and code submissions. @xref{Bugs,,Submitting Bug -Reports}, for instructions on contacting the author. - -@node License, Credits, Introduction, Top -@chapter Your rights and obligations -@cindex license -@cindex your rights and obligations -@cindex rights, your -@cindex obligations, your - -@cindex Free Software Foundation -@cindex GNU General Public License -@cindex General Public License -@cindex GPL -@cindex distribution -@cindex redistribution -Most of PSPP is distributed under the GNU General Public -License. The General Public License says, in effect, that you may -modify and distribute PSPP as you like, as long as you grant the -same rights to others. It also states that you must provide source code -when you distribute PSPP, or, if you obtained PSPP -source code from an anonymous ftp site, give out the name of that site. - -The General Public License is given in full in the source distribution -as file @file{COPYING}. In Debian GNU/Linux, this file is also -available as file @file{/usr/share/common-licenses/GPL-2}. 
- -To quote the GPL itself: - -@quotation -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 2 of the License, or (at your -option) any later version. - -This program is distributed in the hope that it will be useful, but -WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -General Public License for more details. - -You should have received a copy of the GNU General Public License along -with this program; if not, write to the Free Software Foundation, Inc., -59 Temple Place, Suite 330, Boston, MA 02111-1307 USA -@end quotation - -@node Credits, Installation, License, Top -@chapter Credits -@cindex credits -@cindex authors - -@cindex Pfaff, Ben -Most of PSPP, as well as this manual, -was written by Ben Pfaff. @xref{Contacting the Author}, for -instructions on contacting the author. - -@cindex Covington, Michael A. -@cindex Van Zandt, James -@cindex @file{ftp.cdrom.com} -@cindex @file{/pub/algorithms/c/julcal10} -@cindex @file{julcal.c} -@cindex @file{julcal.h} -The PSPP source code incorporates @code{julcal10} originally -written by Michael A. Covington and translated into C by Jim Van Zandt. -The original package can be found in directory -@url{ftp://ftp.cdrom.com/pub/algorithms/c/julcal10}. The entire -contents of that directory constitute the package. The files actually -used in PSPP are @code{julcal.c} and @code{julcal.h}. - -@node Installation, Configuration, Credits, Top -@chapter Installing PSPP -@cindex installation -@cindex PSPP, installing - -@cindex GNU C compiler -@cindex gcc -@cindex compiler, recommended -@cindex compiler, gcc -PSPP conforms to the GNU Coding Standards. PSPP is written in, and -requires for proper operation, ANSI/ISO C. 
You might want to -additionally note the following points: - -@itemize @bullet -@item -The compiler and linker must allow for significance of several -characters in external identifiers. The exact number is unknown but at -least 31 is recommended. - -@item -The @code{int} type must be 32 bits or wider. - -@item -The recommended compiler is gcc 2.7.2.1 or later, but any ANSI compiler -will do if it fits the above criteria. -@end itemize - -Many UNIX variants should work out-of-the-box, as PSPP uses GNU -autoconf to detect differences between environments. Please report any -problems with compilation of PSPP under UNIX and UNIX-like operating -systems---portability is a major concern of the author. - -The pages below give specific instructions for installing PSPP -on each type of system mentioned above. - -@menu -* UNIX installation:: Installing on UNIX-like environments. -@end menu - -@node UNIX installation, , Installation, Installation -@section UNIX installation -@cindex UNIX, installing PSPP under -@cindex installation, under UNIX -@noindent -To install PSPP under a UNIX-like operating system, follow the steps -below in order. Some of the text below was taken directly from various -Free Software Foundation sources. - -@enumerate -@item -@code{cd} to the directory containing the PSPP source. - -@cindex configure, GNU -@cindex GNU configure -@item -Type @samp{./configure} to configure for your particular operating -system and compiler. Running @code{configure} takes a while. While -running, it displays some messages telling which features it is checking -for. - -You can optionally supply some options to @code{configure} to -give it hints about how to do its job. Type @code{./configure --help} -to see a list of options. One of the most useful options is -@samp{--with-checker}, which enables the use of the Checker memory -debugger under supported operating systems. Checker must already be -installed to use this option. 
Do not use @samp{--with-checker} if you -are not debugging PSPP itself. - -@cindex @file{Makefile} -@cindex @file{config.h} -@cindex @file{pref.h} -@cindex makefile -@item -(optional) Edit @file{Makefile}, @file{config.h}, and @file{pref.h}. -These files are produced by @code{configure}. Note that most PSPP -settings can be changed at runtime. - -@file{pref.h} is only generated by @code{configure} if it does not -already exist. (It's copied from @file{prefh.orig}.) - -@cindex compiling -@item -Type @samp{make} to compile the package. If there are any errors during -compilation, try to fix them. If modifications are necessary to compile -correctly under your configuration, contact the author. -@xref{Bugs,,Submitting Bug Reports}, for details. - -@cindex self-tests, running -@item -Type @samp{make check} to run self-tests on the compiled PSPP package. - -@cindex installation -@cindex PSPP, installing -@cindex @file{/usr/local/share/pspp/} -@cindex @file{/usr/local/bin/} -@cindex @file{/usr/local/info/} -@cindex documentation, installing -@item -Become the superuser and type @samp{make install} to install the -PSPP binaries, by default in @file{/usr/local/bin/}. The -directory @file{/usr/local/share/pspp/} is created and populated with -files needed by PSPP at runtime. This step will also cause the -PSPP documentation to be installed in @file{/usr/local/info/}, -but only if that directory already exists. - -@item -(optional) Type @samp{make clean} to delete the PSPP binaries -from the source tree. -@end enumerate - -@node Configuration, Invocation, Installation, Top -@chapter Configuring PSPP -@cindex configuration -@cindex PSPP, configuring - -PSPP has dozens of configuration possibilities and hundreds of -settings. This is both a bane and a blessing. On one hand, it's -possible to easily accommodate diverse ranges of setups. But, on the -other, the multitude of possibilities can overwhelm the casual user. 
-Fortunately, the configuration mechanisms are profusely described in the -sections below@enddots{} - -@menu -* File locations:: How PSPP finds config files. -* Configuration techniques:: Many different methods of configuration@enddots{} -* Configuration files:: How configuration files are read. -* Environment variables:: All about environment variables. -* Output devices:: Describing your terminal(s) and printer(s). -* PostScript driver class:: Configuration of PostScript devices. -* ASCII driver class:: Configuration of character-code devices. -* HTML driver class:: Configuration for HTML output. -* Miscellaneous configuring:: Even more configuration variables. -* Improving output quality:: Hints for producing ever-more-lovely output. -@end menu - -@node File locations, Configuration techniques, Configuration, Configuration -@section Locating configuration files - -PSPP uses the same method to find most of its configuration files: - -@enumerate -@item -The @dfn{base name} of the file being sought is determined. - -@item -The path to search is determined. - -@item -Each directory in the search path, from left to right, is searched for a -file with the name of the base name. The first occurrence is read -as the configuration file. -@end enumerate - -The first two steps are elaborated below for the sake of our pedantic -friends. - -@enumerate -@item -A @dfn{base name} is a file name lacking an absolute directory -reference. Some examples of base names are: @file{ps-encodings}, -@file{devices}, @file{devps/DESC} (under UNIX), @file{devps\DESC} (under -M$ environments). - -Determining the base name is a two-step process: - -@enumerate a -@item -If the appropriate environment variable is defined, the value of that -variable is used (@pxref{Environment variables}). For instance, when -searching for the output driver initialization file, the variable -examined is @code{STAT_OUTPUT_INIT_FILE}. - -@item -Otherwise, the compiled-in default is used. 
For example, when searching -for the output driver initialization file, the default base name is -@file{devices}. -@end enumerate - -@strong{Please note:} If a user-specified base name does contain an -absolute directory reference, as in a file name like -@file{/home/pfaff/fonts/TR}, no path is searched---the file name is used -exactly as given---and the algorithm terminates. - -@item -The path is the first of the following that is defined: - -@itemize @bullet -@item -A variable definition for the path given in the user environment. This -is a PSPP-specific environment variable name; for instance, -@code{STAT_OUTPUT_INIT_PATH}. - -@item -In some cases, another, less-specific environment variable is checked. -For instance, when searching for font files, the PostScript driver first -checks for a variable with name @code{STAT_GROFF_FONT_PATH}, then for -one with name @code{GROFF_FONT_PATH}. (However, font searching has its -own list of esoteric search rules.) - -@item -The configuration file path, which is itself determined by the -following rules: - -@enumerate a -@item -If the command line contains an option of the form @samp{-B @var{path}} -or @samp{--config-dir=@var{path}}, then the value given on the -rightmost occurrence of such an option is used. - -@item -Otherwise, if the environment variable @code{STAT_CONFIG_PATH} is -defined, the value of that variable is used. - -@item -Otherwise, the compiled-in fallback default is used. On UNIX machines, -the default fallback path is - -@enumerate 1 -@item -@file{~/.pspp} - -@item -@file{/usr/local/lib/pspp} - -@item -@file{/usr/lib/pspp} -@end enumerate - -On DOS machines, the default fallback path is: - -@enumerate 1 -@item -All the paths from the DOS search path in the @samp{PATH} environment -variable, in left-to-right order. - -@item -@file{C:\PSPP}, as a last resort. -@end enumerate - -Note that the installer of PSPP can easily change this default -fallback path; thus the above should not be taken as gospel. 
-@end enumerate -@end itemize -@end enumerate - -As a final note: Under DOS, directories given in paths are delimited by -semicolons (@samp{;}); under UNIX, directories are delimited by colons -(@samp{:}). This corresponds with the standard path delimiter under -these OSes. - -@node Configuration techniques, Configuration files, File locations, Configuration -@section Configuration techniques - -There are many ways that PSPP can be configured. These are -described in the list below. Values given by earlier items take -precedence over those given by later items. - -@enumerate -@item -Syntax commands that modify settings, such as @cmd{SET}. @xref{SET}. - -@item -Command-line options. @xref{Invocation}. - -@item -PSPP-specific environment variable contents. @xref{Environment -variables}. - -@item -General environment variable contents. @xref{Environment variables}. - -@item -Configuration file contents. @xref{Configuration files}. - -@item -Fallback defaults. -@end enumerate - -Some of the above may not apply to a particular setting. For instance, -the current pager (such as @samp{more}, @samp{most}, or @samp{less}) -cannot be determined by configuration file contents because there is no -appropriate configuration file. - -@node Configuration files, Environment variables, Configuration techniques, Configuration -@section Configuration files - -Most configuration files have a common form: - -@itemize @bullet -@item -Each line forms a separate command or directive. This means that lines -cannot be broken up, unless they are spliced together with a trailing -backslash, as described below. - -@item -Before anything else is done, trailing whitespace is removed. - -@item -When a line ends in a backslash (@samp{\}), the backslash is removed, -and the next line is read and appended to the current line. - -@itemize @minus -@item -Whitespace preceding the backslash is retained. - -@item -This rule continues to be applied until the line read does not end in a -backslash. 
- -@item -It is an error if the last line in the file ends in a backslash. -@end itemize - -@item -Comments are introduced by an octothorpe (@samp{#}), and continue until the -end of the line. - -@itemize @minus -@item -An octothorpe inside balanced pairs of double quotation marks (@samp{"}) -or single quotation marks (@samp{'}) does not introduce a comment. - -@item -The backslash character can be used inside balanced quotes of either -type to escape the following character as a literal character. - -(This is distinct from the use of a backslash as a line-splicing -character.) - -@item -Line splicing takes place before comment removal. -@end itemize - -@item -Blank lines, and lines that contain only whitespace, are ignored. -@end itemize - -@node Environment variables, Output devices, Configuration files, Configuration -@section Environment variables - -You may think the concept of environment variables is a fairly simple -one. However, the author of PSPP has found a way to complicate -even something so simple. Environment variables are further described -in the sections below: - -@menu -* Variable values:: Values of variables are determined this way. -* Environment substitutions:: How environment substitutions are made. -* Predefined variables:: A few variables are automatically defined. -@end menu - -@node Variable values, Environment substitutions, Environment variables, Environment variables -@subsection Values of environment variables - -Values for environment variables are obtained by the following means, -which are arranged in order of decreasing precedence: - -@enumerate -@item -Command-line options. @xref{Invocation}. - -@item -The @file{environment} configuration file---more on this below. - -@item -Actual environment variables (defined in the shell or other parent -process). 
-@end enumerate - -The @file{environment} configuration file is located through application -of the usual algorithm for configuration files (@pxref{File locations}), -except that its contents do not affect the search path used to find -@file{environment} itself. Use of @file{environment} is discouraged on -systems that allow an arbitrarily large environment; it is supported for -use on systems like MS-DOS that limit environment size. - -@file{environment} is composed of lines having the form -@samp{@var{key}=@var{value}}, where @var{key} and the equals sign -(@samp{=}) are required, and @var{value} is optional. If @var{value} is -given, variable @var{key} is given that value; if @var{value} is absent, -variable @var{key} is undefined (deleted). Variables may not be defined -with a null value. - -Environment substitutions are performed on each line in the file -(@pxref{Environment substitutions}). - -See @ref{Configuration files}, for more details on formatting of the -environment configuration file. - -@quotation -@strong{Please note:} Support for @file{environment} is not yet -implemented. -@end quotation - -@node Environment substitutions, Predefined variables, Variable values, Environment variables -@subsection Environment substitutions - -Much of the power of environment variables lies in the way that they may -be substituted into configuration files. Variable substitutions are -described below. - -The line is scanned from left to right. In this scan, all characters -other than dollar signs (@samp{$}) are retained unmolested. Dollar -signs, however, introduce an environment variable reference. References -take three forms: - -@table @code -@item $@var{var} -Replaced by the value of environment variable @var{var}, determined as -specified in @ref{Variable values}. @var{var} must be one of the -following: - -@itemize @bullet -@item -One or more letters. - -@item -Exactly one nonalphabetic character. This may not be a left brace -(@samp{@{}). 
-@end itemize - -@item $@{@var{var}@} -Same as above, but @var{var} may contain any character (except -@samp{@}}). - -@item $$ -Replaced by a single dollar sign. -@end table - -Undefined variables expand to an empty value. - -@node Predefined variables, , Environment substitutions, Environment variables -@subsection Predefined environment variables - -There are two environment variables predefined for use in environment -substitutions: - -@table @samp -@item VER -Defined as the version number of PSPP, as a string, in a format -something like @samp{0.9.4}. - -@item ARCH -Defined as the host architecture of PSPP, as a string, in standard -cpu-manufacturer-OS format. For instance, Debian GNU/Linux 1.1 on an -Intel machine defines this as @samp{i586-unknown-linux}. This is -somewhat dependent on the system used to compile PSPP. -@end table - -Nothing prevents these values from being overridden, although it's a -good idea not to do so. - -@node Output devices, PostScript driver class, Environment variables, Configuration -@section Output devices - -Configuring output devices is the most complicated aspect of configuring -PSPP. The output device configuration file is named -@file{devices}. It is searched for using the usual algorithm for -finding configuration files (@pxref{File locations}). Each line in the -file is read in the usual manner for configuration files -(@pxref{Configuration files}). - -Lines in @file{devices} are divided into three categories, described -briefly in the table below: - -@table @i -@item driver category definitions -Define a driver in terms of other drivers. - -@item macro definitions -Define environment variables local to the output driver -configuration file. - -@item device definitions -Describe the configuration of an output device. -@end table - -The following sections further elaborate the contents of the -@file{devices} file. - -@menu -* Driver categories:: How to organize the driver namespace.
-* Macro definitions:: Environment variables local to @file{devices}. -* Device definitions:: Output device descriptions. -* Dimensions:: Lengths, widths, sizes, @enddots{} -* papersize:: Letter, legal, A4, envelope, @enddots{} -* Distinguishing line types:: Details on @file{devices} parsing. -* Tokenizing lines:: Dividing @file{devices} lines into tokens. -@end menu - -@node Driver categories, Macro definitions, Output devices, Output devices -@subsection Driver categories - -Drivers can be divided into categories. Drivers are specified by their -names, or by the names of the categories that they are contained in. -Only certain drivers are enabled each time PSPP is run; by -default, these are the drivers in the category `default'. To enable a -different set of drivers, use the @samp{-o @var{device}} command-line -option (@pxref{Invocation}). - -Categories are specified with a line of the form -@samp{@var{category}=@var{driver1} @var{driver2} @var{driver3} @var{@dots{}} -@var{driver@var{n}}}. This line specifies that the category -@var{category} is composed of drivers named @var{driver1}, -@var{driver2}, and so on. There may be any number of drivers in the -category, from zero on up. - -Categories may also be specified on the command line -(@pxref{Invocation}). - -This is all you need to know about categories. If you're still curious, -read on. - -First of all, the term `categories' is a bit of a misnomer. In fact, -the internal representation is nothing like the hierarchy that the term -seems to imply: a linear list is used to keep track of the enabled -drivers. - -When PSPP first begins reading @file{devices}, this list contains -the name of any drivers or categories specified on the command line, or -the single item `default' if none were specified. - -Each time a category definition is specified, the list is searched for -an item with the value of @var{category}. If a matching item is found, -it is deleted. 
If there was a match, the list of drivers (@var{driver1} -through @var{driver@var{n}}) is then appended to the list. - -Each time a driver definition line is encountered, the list is searched. -If the list contains an item with that driver's name, the driver is -enabled and the item is deleted from the list. Otherwise, the driver -is not enabled. - -It is an error if the list is not empty when the end of @file{devices} -is reached. - -@node Macro definitions, Device definitions, Driver categories, Output devices -@subsection Macro definitions - -Macro definitions take the form @samp{define @var{macroname} -@var{definition}}. In such a macro definition, the environment variable -@var{macroname} is defined to expand to the value @var{definition}. -Before the definition is made, however, any macros used in -@var{definition} are expanded. - -Please note the following nuances of macro usage: - -@itemize @bullet -@item -For the purposes of this section, @dfn{macro} and @dfn{environment -variable} are synonyms. - -@item -Macros may not take arguments. - -@item -Macros may not recurse. - -@item -Macros are just environment variable definitions like other environment -variable definitions, with the exception that they are limited in scope -to the @file{devices} configuration file. - -@item -Macros override all other environment variables of the same name (within -the scope of @file{devices}). - -@item -Earlier macro definitions for a particular @var{macroname} override later -ones. In particular, macro definitions on the command line override -those in the device definition file. @xref{Non-option Arguments}. - -@item -There are two predefined macros, whose values are determined at runtime: - -@table @samp -@item viewwidth -Defined as the width of the console screen, in columns of text. - -@item viewlength -Defined as the length of the console screen, in lines of text.
-@end table -@end itemize - -@node Device definitions, Dimensions, Macro definitions, Output devices -@subsection Driver definitions - -Driver definitions are the ultimate purpose of the @file{devices} -configuration file. These are where the real action is. Driver -definitions tell PSPP where it should send its output. - -Each driver definition line is divided into four fields. These fields -are delimited by colons (@samp{:}). Each line is subjected to -environment variable interpolation before it is processed further -(@pxref{Environment substitutions}). From left to right, the four -fields are, in brief: - -@table @i -@item driver name -A unique identifier, used to determine whether to enable the driver. - -@item class name -One of the predefined driver classes supported by PSPP. The -currently supported driver classes include `postscript' and `ascii'. - -@item device type(s) -Zero or more of the following keywords, delimited by spaces: - -@table @code -@item screen - -Indicates that the device is a screen display. This may reduce the -amount of buffering done by the driver, to make interactive use more -convenient. - -@item printer - -Indicates that the device is a printer. - -@item listing - -Indicates that the device is a listing file. -@end table - -These options are just hints to PSPP and do not cause the output to be -directed to the screen, or to the printer, or to a listing file---those -must be set elsewhere in the options. They are used primarily to decide -which devices should be enabled at any given time. @xref{SET}, for more -information. - -@item options -An optional set of options to pass to the driver itself. The exact -format for the options varies among drivers. 
-@end table - -The driver is enabled if: - -@enumerate -@item -Its driver name is specified on the command line, or - -@item -It's in a category specified on the command line, or - -@item -If no categories or driver names are specified on the command line, it -is in category @code{default}. -@end enumerate - -For more information on driver names, see @ref{Driver categories}. - -The class name must be one of those supported by PSPP. The -classes supported depend on the options with which PSPP was -compiled. See later sections in this chapter for descriptions of the -available driver classes. - -Options are dependent on the driver. See the driver descriptions for -details. - -@node Dimensions, papersize, Device definitions, Output devices -@subsection Dimensions - -Quite often in configuration it is necessary to specify a length or a -size. PSPP uses a common syntax for all such, calling them -collectively by the name @dfn{dimensions}. - -@itemize @bullet -@item -You can specify dimensions in decimal form (@samp{12.5}) or as -fractions, either as mixed numbers (@samp{12-1/2}) or raw fractions -(@samp{25/2}). - -@item -A number of different units are available. These are suffixed to the -numeric part of the dimension. There must be no spaces between the -number and the unit. 
The available units are identical to those offered -by the popular typesetting system @TeX{}: - -@table @code -@item in -inch (1 @code{in} = 2.54 @code{cm}) - -@item " -inch (1 @code{in} = 2.54 @code{cm}) - -@item pt -printer's point (1 @code{in} = 72.27 @code{pt}) - -@item pc -pica (12 @code{pt} = 1 @code{pc}) - -@item bp -PostScript point (1 @code{in} = 72 @code{bp}) - -@item cm -centimeter - -@item mm -millimeter (10 @code{mm} = 1 @code{cm}) - -@item dd -didot point (1157 @code{dd} = 1238 @code{pt}) - -@item cc -cicero (1 @code{cc} = 12 @code{dd}) - -@item sp -scaled point (65536 @code{sp} = 1 @code{pt}) -@end table - -@item -If no explicit unit is given, PSPP attempts to guess the best unit: - -@itemize @minus -@item -Numbers less than 50 are assumed to be in inches. - -@item -Numbers 50 or greater are assumed to be in millimeters. -@end itemize -@end itemize - -@node papersize, Distinguishing line types, Dimensions, Output devices -@subsection Paper sizes - -Output drivers usually deal with some sort of hardcopy media. This -media is called @dfn{paper} by the drivers, though in reality it could -be a transparency or film or thinly veiled sarcasm. To make it easier -for you to deal with paper, PSPP allows you to have (of course!) a -configuration file that gives symbolic names, like ``letter'' or -``legal'' or ``a4'', to paper sizes, rather than forcing you to use -cryptic numbers like ``8-1/2 x 11'' or ``210 by 297''. Surprisingly -enough, this configuration file is named @file{papersize}. -@xref{Configuration files}. - -When PSPP tries to connect a symbolic paper name to a paper size, it -reads and parses each non-comment line in the file, in order. The first -field on each line must be a symbolic paper name in double quotes. -Paper names may not contain double quotes. Paper names are not -case-sensitive: @samp{legal} and @samp{Legal} are equivalent. - -If a match is found for the paper name, the rest of the line is parsed. 
-If it is found to be a pair of dimensions (@pxref{Dimensions}) separated -by either @samp{x} or @samp{by}, then those are taken to be the paper -size, in order of width followed by length. There @emph{must} be at -least one space on each side of @samp{x} or @samp{by}. - -Otherwise the line must be of the form -@samp{"@var{paper-1}"="@var{paper-2}"}. In this case the target of the -search becomes paper name @var{paper-2} and the search through the file -continues. - -@node Distinguishing line types, Tokenizing lines, papersize, Output devices -@subsection How lines are divided into types - -The lines in @file{devices} are distinguished in the following manner: - -@enumerate -@item -Leading whitespace is removed. - -@item -If the resulting line begins with the exact string @code{define}, -followed by one or more whitespace characters, the line is processed as -a macro definition. - -@item -Otherwise, the line is scanned for the first instance of a colon -(@samp{:}) or an equals sign (@samp{=}). - -@item -If a colon is encountered first, the line is processed as a driver -definition. - -@item -Otherwise, if an equals sign is encountered, the line is processed as a -driver category definition. - -@item -Otherwise, the line is ill-formed. -@end enumerate - -@node Tokenizing lines, , Distinguishing line types, Output devices -@subsection How lines are divided into tokens - -Each driver definition line is run through a simple tokenizer. This -tokenizer recognizes two basic types of tokens. - -The first type is an equals sign (@samp{=}). Equals signs are both -delimiters between tokens and tokens in themselves. - -The second type is an identifier or string token. Identifiers and -strings are equivalent after tokenization, though they are written -differently. An identifier is any string of characters other than -whitespace or equals signs.
- -A string is introduced by a single- or double-quote character (@samp{'} -or @samp{"}) and, in general, continues until the next occurrence of -that same character. The following standard C escapes can also be -embedded within strings: - -@table @code -@item \' -A single-quote (@samp{'}). - -@item \" -A double-quote (@samp{"}). - -@item \? -A question mark (@samp{?}). Included for hysterical raisins. - -@item \\ -A backslash (@samp{\}). - -@item \a -Audio bell (ASCII 7). - -@item \b -Backspace (ASCII 8). - -@item \f -Formfeed (ASCII 12). - -@item \n -New-line (ASCII 10). - -@item \r -Carriage return (ASCII 13). - -@item \t -Tab (ASCII 9). - -@item \v -Vertical tab (ASCII 11). - -@item \@var{o}@var{o}@var{o} -Each @samp{o} must be an octal digit. The character is the one having -the octal value specified. Any number of octal digits is read and -interpreted; only the lower 8 bits are used. - -@item \x@var{h}@var{h} -Each @samp{h} must be a hex digit. The character is the one having the -hexadecimal value specified. Any number of hex digits is read and -interpreted; only the lower 8 bits are used. -@end table - -Tokens, outside of quoted strings, are delimited by whitespace or equals -signs. - -@node PostScript driver class, ASCII driver class, Output devices, Configuration -@section The PostScript driver class - -The @code{postscript} driver class is used to produce output that is -acceptable to PostScript printers and to PC-based PostScript -interpreters such as Ghostscript. Continuing a long tradition, -PSPP's PostScript driver is configurable to the point of -absurdity. - -There are actually two PostScript drivers. The first one, -@samp{postscript}, produces ordinary DSC-compliant PostScript output. -The second one, @samp{epsf}, produces an Encapsulated PostScript file. -The two drivers are otherwise identical in configuration and in -operation. - -The PostScript driver is described in further detail below. - -@menu -* PS output options:: Output file options.
-* PS page options:: Paper, margins, scaling & rotation, more! -* PS file options:: Configuration files. -* PS font options:: Default fonts, font options. -* PS line options:: Line widths, options. -* Prologue:: Details on the PostScript prologue. -* Encodings:: Details on PostScript font encodings. -@end menu - -@node PS output options, PS page options, PostScript driver class, PostScript driver class -@subsection PostScript output options - -These options deal with the form of the output and the output file -itself: - -@table @code -@item output-file=@var{filename} - -File to which output should be sent. This can be an ordinary filename -(e.g., @code{"pspp.ps"}), a pipe filename (e.g., @code{"|lpr"}), or -stdout (@code{"-"}). Default: @code{"pspp.ps"}. - -@item color=@var{boolean} - -Most of the time black-and-white PostScript devices are smart enough to -map colors to shades themselves. However, you can cause the PSPP -output driver to do an ugly simulation of this in its own driver by -turning @code{color} off. Default: @code{on}. - -This is a boolean setting, as are many settings in the PostScript -driver. Valid positive boolean values are @samp{on}, @samp{true}, -@samp{yes}, and nonzero integers. Negative boolean values are -@samp{off}, @samp{false}, @samp{no}, and zero. - -@item data=@var{data-type} - -One of @code{clean7bit}, @code{clean8bit}, or @code{binary}. This -controls what characters will be written to the output file. PostScript -produced with @code{clean7bit} can be transmitted over 7-bit -transmission channels that use ASCII control characters for line -control. @code{clean8bit} is similar but allows characters above 127 to -be written to the output file. @code{binary} allows any character in -the output file. Default: @code{clean7bit}. - -@item line-ends=@var{line-end-type} - -One of @code{cr}, @code{lf}, or @code{crlf}. This controls what is used -for new-line in the output file. Default: @code{cr}.
- -@item optimize-line-size=@var{level} - -Either @code{0} or @code{1}. If @var{level} is @code{1}, then short -line segments will be collected and merged into longer ones. This -reduces output file size but requires more time and memory. A -@var{level} of @code{0} has the advantage of being better for -interactive environments. @code{1} is the default unless the -@code{screen} flag is set; in that case, the default is @code{0}. - -@item optimize-text-size=@var{level} - -One of @code{0}, @code{1}, or @code{2}, each higher level representing -correspondingly more aggressive space savings for text in the output -file and requiring correspondingly more time and memory. Unfortunately -the levels presently are all the same. @code{1} is the default unless -the @code{screen} flag is set; in that case, the default is @code{0}. -@end table - -@node PS page options, PS file options, PS output options, PostScript driver class -@subsection PostScript page options - -These options affect page setup: - -@table @code -@item headers=@var{boolean} - -Controls whether the standard headers showing the time and date and -title and subtitle are printed at the top of each page. Default: -@code{on}. - -@item paper-size=@var{paper-size} - -Paper size, either as a symbolic name (e.g., @code{letter} or @code{a4}) -or specific measurements (e.g., @code{8-1/2x11} or @code{"210 x 297"}). -@xref{papersize, , Paper sizes}. Default: @code{letter}. - -@item orientation=@var{orientation} - -Either @code{portrait} or @code{landscape}. Default: @code{portrait}. - -@item left-margin=@var{dimension} -@itemx right-margin=@var{dimension} -@itemx top-margin=@var{dimension} -@itemx bottom-margin=@var{dimension} - -Sets the margins around the page. The headers, if enabled, are not -included in the margins; they are in addition to the margins. For a -description of dimensions, see @ref{Dimensions}. Default: @code{0.5in}.
- -@end table - -@node PS file options, PS font options, PS page options, PostScript driver class -@subsection PostScript file options - -Oh, my. You don't really want to know about the way that the PostScript -driver deals with files, do you? Well I suppose you're entitled, but I -warn you right now: it's not pretty. Here goes@enddots{} - -First let's look at the options that are available: - -@table @code - -@item font-dir=@var{font-directory} - -Sets the font directory. Default: @code{devps}. - -@item prologue-file=@var{prologue-file-name} - -Sets the name of the PostScript prologue file. You can write your own -prologue, though I have no idea why you'd want to: see @ref{Prologue}. -Default: @code{ps-prologue}. - -@item device-file=@var{device-file-name} - -Sets the name of the Groff-format device description file. The -PostScript driver reads this to know about the scaling of fonts -and so on. The format of such files is described in groff_font(5), -included with Groff. Default: @code{DESC}. - -@item encoding-file=@var{encoding-file-name} - -Sets the name of the encoding file. This file contains a list of all -font encodings that will be needed so that the driver can put all of -them at the top of the prologue. @xref{Encodings}. Default: -@code{ps-encodings}. - -If the specified encoding file cannot be found, this error will be -silently ignored, since most people do not need any encodings besides -the ones that can be found using @code{auto-encode}, described below. - -@item auto-encode=@var{boolean} - -When enabled, the font encodings needed by the default proportional- and -fixed-pitch fonts will automatically be dumped to the PostScript -output. Otherwise, it is assumed that the user has an encoding file -and knows how to use it (@pxref{Encodings}). There is probably no good -reason to turn off this convenient feature. Default: @code{on}. - -@end table - -Next I suppose it's time to describe the search algorithm.
When the -PostScript driver needs a file, whether that file be a font, a -PostScript prologue, or what you will, it searches in this manner: - -@enumerate - -@item -Constructs a path by taking the first of the following that is defined: - -@enumerate a - -@item -Environment variable @code{STAT_GROFF_FONT_PATH}. @xref{Environment -variables}. - -@item -Environment variable @code{GROFF_FONT_PATH}. - -@item -The compiled-in fallback default. -@end enumerate - -@item -Constructs a base name from concatenating, in order, the font directory, -a path separator (@samp{/} or @samp{\}), and the file to be found. A -typical base name would be something like @code{devps/ps-encodings}. - -@item -Searches for the base name in the path constructed above. If the file -is found, the algorithm terminates. - -@item -Searches for the base name in the standard configuration path. See -@ref{File locations}, for more details. If the file is found, the -algorithm terminates. - -@item -At this point we remove the font directory and path separator from the -base name. Now the base name is simply the file to be found, i.e., -@code{ps-encodings}. - -@item -Searches for the base name in the path constructed in the first step. -If the file is found, the algorithm terminates. - -@item -Searches for the base name in the standard configuration path. If the -file is found, the algorithm terminates. - -@item -The algorithm terminates unsuccessfully. -@end enumerate - -So, as you see, there are several ways to configure the PostScript -drivers. Careful selection of techniques can make the configuration -very flexible indeed. - -@node PS font options, PS line options, PS file options, PostScript driver class -@subsection PostScript font options - -The list of available font options is short and sweet: - -@table @code -@item prop-font=@var{font-name} - -Sets the default proportional font. The name should be that of a -PostScript font. Default: @code{"Helvetica"}. 
- -@item fixed-font=@var{font-name} - -Sets the default fixed-pitch font. The name should be that of a -PostScript font. Default: @code{"Courier"}. - -@item font-size=@var{font-size} - -Sets the size of the default fonts, in thousandths of a point. Default: -@code{10000}. - -@end table - -@node PS line options, Prologue, PS font options, PostScript driver class -@subsection PostScript line options - -Most tables contain lines, or rules, between cells. Some features of -the way that lines are drawn in PostScript tables are user-definable: - -@table @code - -@item line-style=@var{style} - -Sets the style used for lines used to divide tables into sections. -@var{style} must be either @code{thick}, in which case thick lines are -used, or @code{double}, in which case double lines are used. Default: -@code{thick}. - -@item line-gutter=@var{dimension} - -Sets the line gutter, which is the amount of whitespace on either side -of lines that border text or graphics objects. @xref{Dimensions}. -Default: @code{0.5pt}. - -@item line-spacing=@var{dimension} - -Sets the line spacing, which is the amount of whitespace that separates -lines that are side by side, as in a double line. Default: -@code{0.5pt}. - -@item line-width=@var{dimension} - -Sets the width of a typical line used in tables. Default: @code{0.5pt}. - -@item line-width-thick=@var{dimension} - -Sets the width of a thick line used in tables. Not used if -@code{line-style} is set to @code{thick}. Default: @code{1.5pt}. - -@end table - -@node Prologue, Encodings, PS line options, PostScript driver class -@subsection The PostScript prologue - -Most PostScript files that are generated mechanically by programs -consist of two parts: a prologue and a body. The prologue is generally -a collection of boilerplate. Only the body differs greatly between -two outputs from the same program. - -This is also the strategy used in the PSPP PostScript driver. In -general, the prologue supplied with PSPP will be more than sufficient.
-In this case, you will not need to read the rest of this section. -However, hackers might want to know more. Read on, if you fall into -this category. - -The prologue is dumped into the output stream essentially unmodified. -However, two actions are performed on its lines. First, certain lines -may be omitted as specified in the prologue file itself. Second, -variables are substituted. - -The following lines are omitted: - -@enumerate -@item -All lines that contain three bangs in a row (@code{!!!}). - -@item -Lines that contain @code{!eps}, if the PostScript driver is producing -ordinary PostScript output. Otherwise an EPS file is being produced, -and the line is included in the output, although everything following -@code{!eps} is deleted. - -@item -Lines that contain @code{!ps}, if the PostScript driver is producing EPS -output. Otherwise, ordinary PostScript is being produced, and the line -is included in the output, although everything following @code{!ps} is -deleted. -@end enumerate - -The following are the variables that are substituted. Only the -variables listed are substituted; environment variables are not. -@xref{Environment substitutions}. - -@table @code -@item bounding-box - -The page bounding box, in points, as four space-separated numbers. For -U.S. letter size paper, this is @samp{0 0 612 792}. - -@item creator - -PSPP version as a string: @samp{GNU PSPP 0.1b}, for example. - -@item date - -Date the file was created. Example: @samp{Tue May 21 13:46:22 1991}. - -@item data - -Value of the @code{data} PostScript driver option, as one of the strings -@samp{Clean7Bit}, @samp{Clean8Bit}, or @samp{Binary}. - -@item orientation - -Page orientation, as one of the strings @code{Portrait} or -@code{Landscape}. - -@item user - -Under multiuser OSes, the user's login name, taken either from the -environment variable @code{LOGNAME} or, if that fails, the result of the -C library function @code{getlogin()}. Defaults to @samp{nobody}. 
- -@item host - -System hostname as reported by @code{gethostname()}. Defaults to -@samp{nowhere}. - -@item prop-font - -Name of the default proportional font, prefixed by the word -@samp{font} and a space. Example: @samp{font Times-Roman}. - -@item fixed-font - -Name of the default fixed-pitch font, prefixed by the word @samp{font} -and a space. - -@item scale-factor - -The page scaling factor as a floating-point number. Example: -@code{1.0}. Note that this is also passed as an argument to the BP -macro. - -@item paper-length -@item paper-width - -The paper length and paper width, respectively, in thousandths of a -point. Note that these are also passed as arguments to the BP macro. - -@item left-margin -@item top-margin - -The left margin and top margin, respectively, in thousandths of a -point. Note that these are also passed as arguments to the BP macro. - -@item title - -Document title as a string. This is not the title specified in the -PSPP syntax file. A typical title is the word @samp{PSPP} followed -by the syntax file name in parentheses. Example: @samp{PSPP -()}. - -@item source-file - -PSPP syntax file name. Example: @samp{mary96/first.stat}. - -@end table - -Any other questions about the PostScript prologue can best be answered -by examining the default prologue or the PSPP source. - -@node Encodings, , Prologue, PostScript driver class -@subsection PostScript encodings - -PostScript fonts often contain many more than 256 characters, in order -to accommodate foreign language characters and special symbols. -PostScript uses @dfn{encodings} to map these onto single-byte symbol -sets. Each font can have many different encodings applied to it. - -PSPP's PostScript driver needs to know which encoding to apply to each -font. It can determine this from the information encapsulated in the -Groff font description that it reads. 
However, there is an additional -problem---for efficiency, the PostScript driver needs to have a complete -list of all encodings that will be used in the entire session @emph{when -it opens the output file}. For this reason, it can't use the -information built into the fonts because it doesn't know which fonts -will be used. - -As a stopgap solution, there are two mechanisms for specifying which -encodings will be used. The first mechanism is automatic and it is the -only one that most PSPP users will ever need. The second mechanism is -manual, but it is more flexible. Either mechanism or both may be used -at one time. - -The first mechanism is activated by the @samp{auto-encode} driver option -(@pxref{PS file options}). When enabled, @samp{auto-encode} causes the -PostScript driver to include the encodings used by the default -proportional and fixed-pitch fonts (@pxref{PS font options}). Many -PSPP output files will only need these encodings. - -The second mechanism is the file specified by the @samp{encoding-file} -option (@pxref{PS file options}). If it exists, this file must consist -of lines in PSPP configuration-file format (@pxref{Configuration -files}). Each line that is not a comment should name a PostScript -encoding to include in the output. - -It is not an error if an encoding is included more than once, by either -mechanism. It will appear only once in the output. It is also not an -error if an encoding is included in the output but never used. It -@emph{is} an error if an encoding is used but not included by one of -these mechanisms. In this case, the built-in PostScript encoding -@samp{ISOLatin1Encoding} is substituted. - -@node ASCII driver class, HTML driver class, PostScript driver class, Configuration -@section The ASCII driver class - -The ASCII driver class produces output that can be displayed on a -terminal or output to printers. All of its options are highly -configurable. The ASCII driver has class name @samp{ascii}. 
- -The ASCII driver is described in further detail below. - -@menu -* ASCII output options:: Output file options. -* ASCII page options:: Page size, margins, more. -* ASCII font options:: Box character, bold & italics. -@end menu - -@node ASCII output options, ASCII page options, ASCII driver class, ASCII driver class -@subsection ASCII output options - -@table @code -@item output-file=@var{filename} - -File to which output should be sent. This can be an ordinary filename -(e.g., @code{"pspp.txt"}), a pipe filename (e.g., @code{"|lpr"}), or -stdout (@code{"-"}). Default: @code{"pspp.list"}. - -@item char-set=@var{char-set-type} - -One of @samp{ascii} or @samp{latin1}. This has no effect on output at -the present time. Default: @code{ascii}. - -@item form-feed-string=@var{form-feed-value} - -The string written to the output to cause a formfeed. See also -@code{paginate}, described below, for a related setting. Default: -@code{"\f"}. - -@item newline-string=@var{new-line-value} - -The string written to the output to cause a new-line (carriage return -plus linefeed). The default, which can be specified explicitly with -@code{newline-string=default}, is to use the system-dependent new-line -sequence by opening the output file in text mode. This is usually the -right choice. - -However, @code{newline-string} can be set to any string. When this is -done, the output file is opened in binary mode. - -@item paginate=@var{boolean} - -If set, a formfeed (as set in @code{form-feed-string}, described above) -will be written to the device after every page. Default: @code{on}. - -@item tab-width=@var{tab-width-value} - -The distance between tab stops for this device. If set to 0, tabs will -not be used in the output. Default: @code{8}. - -@item init=@var{initialization-string}. - -String written to the device before anything else, at the beginning of -the output. Default: @code{""} (the empty string). - -@item done=@var{finalization-string}. 
- -String written to the device after everything else, at the end of the -output. Default: @code{""} (the empty string). -@end table - -@node ASCII page options, ASCII font options, ASCII output options, ASCII driver class -@subsection ASCII page options - -These options affect page setup: - -@table @code -@item headers=@var{boolean} - -If enabled, two lines of header information giving title and subtitle, -page number, date and time, and PSPP version are printed at the top of -every page. These two lines are in addition to any top margin -requested. Default: @code{on}. - -@item length=@var{line-count} - -Physical length of a page, in lines. Headers and margins are subtracted -from this value. Default: @code{66}. - -@item width=@var{character-count} - -Physical width of a page, in characters. Margins are subtracted from -this value. Default: @code{130}. - -@item lpi=@var{lines-per-inch} - -Number of lines per vertical inch. Not currently used. Default: @code{6}. - -@item cpi=@var{characters-per-inch} - -Number of characters per horizontal inch. Not currently used. Default: -@code{10}. - -@item left-margin=@var{left-margin-width} - -Width of the left margin, in characters. PSPP subtracts this value -from the page width. Default: @code{0}. - -@item right-margin=@var{right-margin-width} - -Width of the right margin, in characters. PSPP subtracts this value -from the page width. Default: @code{0}. - -@item top-margin=@var{top-margin-lines} - -Length of the top margin, in lines. PSPP subtracts this value from -the page length. Default: @code{2}. - -@item bottom-margin=@var{bottom-margin-lines} - -Length of the bottom margin, in lines. PSPP subtracts this value from -the page length. Default: @code{2}. 
- -@end table - -@node ASCII font options, , ASCII page options, ASCII driver class -@subsection ASCII font options - -These are the ASCII font options: - -@table @code -@item box[@var{line-type}]=@var{box-chars} - -The characters used for lines in tables produced by the ASCII driver can -be changed using this option. @var{line-type} is used to indicate which -type of line to change; @var{box-chars} is the character or string of -characters to use for this type of line. - -@var{line-type} must be a 4-digit number in base 4. The digits are in -the order `right', `bottom', `left', `top'. The four possibilities for -each digit are: - -@table @asis -@item 0 -No line. - -@item 1 -Single line. - -@item 2 -Double line. - -@item 3 -Special device-defined line, if one is available; otherwise, a double -line. -@end table - -Examples: - -@table @code -@item box[0101]="|" - -Sets @samp{|} as the character to use for a single-width line with -bottom and top components. - -@item box[2222]="#" - -Sets @samp{#} as the character to use for the intersection of four -double-width lines, one each from the top, bottom, left and right. - -@item box[1100]="\xda" - -Sets @samp{"\xda"}, which under MS-DOS is a box character suitable for -the top-left corner of a box, as the character for the intersection of -two single-width lines, one each from the right and bottom. 
- -@end table - -Defaults: - -@itemize @bullet -@item -@code{box[0000]=" "} - -@item -@code{box[1000]="-"} -@*@code{box[0010]="-"} -@*@code{box[1010]="-"} - -@item -@code{box[0100]="|"} -@*@code{box[0001]="|"} -@*@code{box[0101]="|"} - -@item -@code{box[2000]="="} -@*@code{box[0020]="="} -@*@code{box[2020]="="} - -@item -@code{box[0200]="#"} -@*@code{box[0002]="#"} -@*@code{box[0202]="#"} - -@item -@code{box[3000]="="} -@*@code{box[0030]="="} -@*@code{box[3030]="="} - -@item -@code{box[0300]="#"} -@*@code{box[0003]="#"} -@*@code{box[0303]="#"} - -@item -For all others, @samp{+} is used unless there are double lines or -special lines, in which case @samp{#} is used. -@end itemize - -@item italic-on=@var{italic-on-string} - -Character sequence written to turn on italics or underline printing. If -this is set to @code{overstrike}, then the driver will simulate -underlining by overstriking with underscore characters (@samp{_}) in the -manner described by @code{overstrike-style} and -@code{carriage-return-style}. Default: @code{overstrike}. - -@item italic-off=@var{italic-off-string} - -Character sequence to turn off italics or underline printing. Default: -@code{""} (the empty string). - -@item bold-on=@var{bold-on-string} - -Character sequence written to turn on bold or emphasized printing. If -set to @code{overstrike}, then the driver will simulate bold printing -by overstriking characters in the manner described by -@code{overstrike-style} and @code{carriage-return-style}. Default: -@code{overstrike}. - -@item bold-off=@var{bold-off-string} - -Character sequence to turn off bold or emphasized printing. Default: -@code{""} (the empty string). - -@item bold-italic-on=@var{bold-italic-on-string} - -Character sequence written to turn on bold-italic printing.
If set to -@code{overstrike}, then the driver will simulate bold-italics by -overstriking twice, once with the character, a second time with an -underscore (@samp{_}) character, in the manner described by -@code{overstrike-style} and @code{carriage-return-style}. Default: -@code{overstrike}. - -@item bold-italic-off=@var{bold-italic-off-string} - -Character sequence to turn off bold-italic printing. Default: @code{""} -(the empty string). - -@item overstrike-style=@var{overstrike-option} - -Either @code{single} or @code{line}: - -@itemize @bullet -@item -If @code{single} is selected, then, to overstrike a line of text, the -output driver will output a character, backspace, overstrike, output a -character, backspace, overstrike, and so on along a line. - -@item -If @code{line} is selected then the output driver will output an entire -line, then backspace or emit a carriage return (as indicated by -@code{carriage-return-style}), then overstrike the entire line at once. -@end itemize - -@code{single} is recommended for use with ttys and programs that -understand overstriking in text files, such as the pager @code{less}. -@code{single} will also work with printer devices but results in rapid -back-and-forth motions of the printhead that can cause the printer to -physically overheat! - -@code{line} is recommended for use with printer devices. Most programs -that understand overstriking in text files will not properly deal with -@code{line} mode. - -Default: @code{single}. - -@item carriage-return-style=@var{carriage-return-type} - -Either @code{bs} or @code{cr}. This option applies only when one or -more of the font commands is set to @code{overstrike} and, at the same -time, @code{overstrike-style} is set to @code{line}. - -@itemize @bullet -@item -If @code{bs} is selected then the driver will return to the beginning of -a line by emitting a sequence of backspace characters (ASCII 8). 
- -@item -If @code{cr} is selected then the driver will return to the beginning of -a line by emitting a single carriage-return character (ASCII 13). -@end itemize - -Although @code{cr} is preferred as being more compact, @code{bs} is more -general since some devices do not interpret carriage returns in the -desired manner. Default: @code{bs}. -@end table - -@node HTML driver class, Miscellaneous configuring, ASCII driver class, Configuration -@section The HTML driver class - -The @code{html} driver class is used to produce output for viewing in -tables-capable web browsers such as Emacs' w3-mode. Its configuration -is very simple. Currently, the output has a very plain format. In the -future, further work may be done on improving the output appearance. - -There are few options for use with the @code{html} driver class: - -@table @code -@item output-file=@var{filename} - -File to which output should be sent. This can be an ordinary filename -(e.g., @code{"pspp.ps"}), a pipe filename (e.g., @code{"|lpr"}), or -stdout (@code{"-"}). Default: @code{"pspp.html"}. - -@item prologue-file=@var{prologue-file-name} - -Sets the name of the HTML prologue file. You can write your own -prologue if you want to customize colors or other settings: see -@ref{HTML Prologue}. Default: @code{html-prologue}. -@end table - -@menu -* HTML Prologue:: Format of the HTML prologue file. -@end menu - -@node HTML Prologue, , HTML driver class, HTML driver class -@subsection The HTML prologue - -HTML files that are generated by PSPP consist of two parts: a prologue -and a body. The prologue is a collection of boilerplate. Only the body -differs greatly between two outputs. You can tune the colors and other -attributes of the output by editing the prologue. - -The prologue is dumped into the output stream essentially unmodified. -However, two actions are performed on its lines. First, certain lines -may be omitted as specified in the prologue file itself. Second, -variables are substituted.
- -The following lines are omitted: - -@enumerate -@item -All lines that contain three bangs in a row (@code{!!!}). - -@item -Lines that contain @code{!title}, if no title is set for the output. If -a title is set, then the characters @code{!title} are removed before the -line is output. - -@item -Lines that contain @code{!subtitle}, if no subtitle is set for the -output. If a subtitle is set, then the characters @code{!subtitle} are -removed before the line is output. -@end enumerate - -The following are the variables that are substituted. Only the -variables listed are substituted; environment variables are not. -@xref{Environment substitutions}. - -@table @code -@item generator - -PSPP version as a string: @samp{GNU PSPP 0.1b}, for example. - -@item date - -Date the file was created. Example: @samp{Tue May 21 13:46:22 1991}. - -@item user - -Under multiuser OSes, the user's login name, taken either from the -environment variable @code{LOGNAME} or, if that fails, the result of the -C library function @code{getlogin()}. Defaults to @samp{nobody}. - -@item host - -System hostname as reported by @code{gethostname()}. Defaults to -@samp{nowhere}. - -@item title - -Document title as a string. This is the title specified in the PSPP -syntax file. - -@item subtitle - -Document subtitle as a string. - -@item source-file - -PSPP syntax file name. Example: @samp{mary96/first.stat}. -@end table - -@node Miscellaneous configuring, Improving output quality, HTML driver class, Configuration -@section Miscellaneous configuration - -The following environment variables can be used to further configure -PSPP: - -@table @code -@item HOME - -Used to determine the user's home directory. No default value. - -@item STAT_INCLUDE_PATH - -Path used to find include files in PSPP syntax files. 
Defaults vary -across operating systems: - -@table @asis -@item UNIX - -@itemize @bullet -@item -@file{.} - -@item -@file{~/.pspp/include} - -@item -@file{/usr/local/lib/pspp/include} - -@item -@file{/usr/lib/pspp/include} - -@item -@file{/usr/local/share/pspp/include} - -@item -@file{/usr/share/pspp/include} -@end itemize - -@item MS-DOS - -@itemize @bullet -@item -@file{.} - -@item -@file{C:\PSPP\INCLUDE} - -@item -@file{$PATH} -@end itemize - -@item Other OSes -No default path. -@end table - -@item STAT_PAGER -@itemx PAGER - -When PSPP invokes an external pager, it uses the first of these that -is defined. There is a default pager only if the person who compiled -PSPP defined one. - -@item TERM - -The terminal type @code{termcap} or @code{ncurses} will use, if such -support was compiled into PSPP. - -@item STAT_OUTPUT_INIT_FILE - -The basename used to search for the driver definition file. -@xref{Output devices}. @xref{File locations}. Default: @code{devices}. - -@item STAT_OUTPUT_PAPERSIZE_FILE - -The basename used to search for the papersize file. @xref{papersize}. -@xref{File locations}. Default: @code{papersize}. - -@item STAT_OUTPUT_INIT_PATH - -The path used to search for the driver definition file and the papersize -file. @xref{File locations}. Default: the standard configuration path. - -@item TMPDIR - -The @code{sort} procedure stores its temporary files in this directory. -Default: (UNIX) @file{/tmp}, (MS-DOS) @file{\}, (other OSes) empty string. - -@item TEMP -@item TMP - -Under MS-DOS only, these variables are consulted after TMPDIR, in this -order. -@end table - -@node Improving output quality, , Miscellaneous configuring, Configuration -@section Improving output quality - -When its drivers are set up properly, PSPP can produce output that -looks very good indeed. The PostScript driver, suitably configured, can -produce presentation-quality output. Here are a few guidelines for -producing better-looking output, regardless of output driver. 
Your -mileage may vary, of course, and everyone has different esthetic -preferences. - -@itemize @bullet -@item -Width is important in PSPP output. Greater output width leads to more -readable output, to a point. Try the following to increase the output -width: - -@itemize @minus -@item -If you're using the ASCII driver with a dot-matrix printer, figure out -what you need to do to put the printer into compressed mode. Put that -string into the @code{init} setting. Try to get 132 columns; 160 -might be better, but you might find that print that tiny is difficult to -read. - -@item -With the PostScript driver, try these ideas: - -@itemize + -@item -Landscape mode. - -@item -Legal-size (8.5" x 14") paper in landscape mode. - -@item -Reducing font sizes. If you're using 12-point fonts, try 10 point; if -you're using 10-point fonts, try 8 point. Some fonts are more readable -than others at small sizes. -@end itemize -@end itemize - -Try to strike a balance between character size and page width. - -@item -Use high-quality fonts. Many public domain fonts are poor in quality. -Recently, URW made some high-quality fonts available under the GPL. -These are probably suitable. - -@item -Be sure you're using the proper font metrics. The font metrics provided -with PSPP may not correspond to the fonts actually being printed. -This can cause bizarre-looking output. - -@item -Make sure that you're using good ink/ribbon/toner. Darker print is -easier to read. - -@item -Use plain fonts with serifs, such as Times-Roman or Palatino. Avoid -choosing italic or bold fonts as document base fonts.
-@end itemize - -@node Invocation, Language, Configuration, Top -@chapter Invoking PSPP -@cindex invocation -@cindex PSPP, invoking - -@cindex command line, options -@cindex options, command-line -@example -pspp [ -B @var{dir} | --config-dir=@var{dir} ] [ -o @var{device} | --device=@var{device} ] - [ -d @var{var}[=@var{value}] | --define=@var{var}[=@var{value}] ] [-u @var{var} | --undef=@var{var} ] - [ -f @var{file} | --out-file=@var{file} ] [ -p | --pipe ] [ -I- | --no-include ] - [ -I @var{dir} | --include=@var{dir} ] [ -i | --interactive ] - [ -n | --edit | --dry-run | --just-print | --recon ] - [ -r | --no-statrc ] [ -h | --help ] [ -l | --list ] - [ -c @var{command} | --command @var{command} ] [ -s | --safer ] - [ --testing-mode ] [ -V | --version ] [ -v | --verbose ] - [ @var{key}=@var{value} ] @var{file}@enddots{} -@end example - -@menu -* Non-option Arguments:: Specifying syntax files and output devices. -* Configuration Options:: Change the configuration for the current run. -* Input and output options:: Controlling input and output files. -* Language control options:: Language variants. -* Informational options:: Helpful information about PSPP. -@end menu - -@node Non-option Arguments, Configuration Options, Invocation, Invocation -@section Non-option Arguments - -Syntax files and output device substitutions can be specified on -PSPP's command line: - -@table @code -@item @var{file} - -A file by itself on the command line will be executed as a syntax file. -PSPP terminates after the syntax file runs, unless the @code{-i} or -@code{--interactive} option is given (@pxref{Language control options}). - -@item @var{file1} @var{file2} - -When two or more filenames are given on the command line, the first -syntax file is executed, then PSPP's dictionary is cleared, then the second -syntax file is executed. 
- -@item @var{file1} + @var{file2} - -If syntax files' names are delimited by a plus sign (@samp{+}), then the -dictionary is not cleared between their executions, as if they were -concatenated together into a single file. - -@item @var{key}=@var{value} - -Defines an output device macro @var{key} to expand to @var{value}, -overriding any macro having the same @var{key} defined in the device -configuration file. @xref{Macro definitions}. - -@end table - -There is one other way to specify a syntax file, if your operating -system supports it. If you have a syntax file @file{foobar.stat}, put -the notation - -@example -#! /usr/local/bin/pspp -@end example - -at the top, and mark the file as executable with @code{chmod +x -foobar.stat}. (If PSPP is not installed in @file{/usr/local/bin}, -then insert its actual installation directory into the syntax file -instead.) Now you should be able to invoke the syntax file just by -typing its name. You can include any options on the command line as -usual. PSPP entirely ignores any lines beginning with @samp{#!}. - -@node Configuration Options, Input and output options, Non-option Arguments, Invocation -@section Configuration Options - -Configuration options are used to change PSPP's configuration for the -current run. The configuration options are: - -@table @code -@item -a @{compatible|enhanced@} -@itemx --algorithm=@{compatible|enhanced@} - -If you choose @code{compatible}, then PSPP will use the same algorithms -as used by some proprietary statistical analysis packages. -This is not recommended, as these algorithms are inferior and in some cases -completely broken. -The default setting is @code{enhanced}. -Certain commands have subcommands which allow you to override this setting on -a per command basis. - -@item -B @var{dir} -@itemx --config-dir=@var{dir} - -Sets the configuration directory to @var{dir}. @xref{File locations}.
- -@item -o @var{device} -@itemx --device=@var{device} - -Selects the output device with name @var{device}. If this option is -given more than once, then all devices mentioned are selected. This -option disables all devices besides those mentioned on the command line. - -@item -d @var{var}[=@var{value}] -@itemx --define=@var{var}[=@var{value}] - -Defines an `environment variable' named @var{var} having the optional -value @var{value} specified. @xref{Variable values}. - -@item -u @var{var} -@itemx --undef=@var{var} - -Undefines the `environment variable' named @var{var}. @xref{Variable -values}. -@end table - -@node Input and output options, Language control options, Configuration Options, Invocation -@section Input and output options - -Input and output options affect how PSPP reads input and writes -output. These are the input and output options: - -@table @code -@item -f @var{file} -@itemx --out-file=@var{file} - -This overrides the output file name for devices designated as listing -devices. If a file named @var{file} already exists, it is overwritten. - -@item -p -@itemx --pipe - -Allows PSPP to be used as a filter by causing the syntax file to be -read from stdin and output to be written to stdout. Conflicts with the -@code{-f @var{file}} and @code{--out-file=@var{file}} options. - -@item -I- -@itemx --no-include - -Clears all directories from the include path. This includes all -directories put in the include path by default. @xref{Miscellaneous -configuring}. - -@item -I @var{dir} -@itemx --include=@var{dir} - -Appends directory @var{dir} to the path that is searched for include -files in PSPP syntax files. - -@item -c @var{command} -@itemx --command=@var{command} - -Execute literal command @var{command}. The command is executed before -startup syntax files, if any. - -@item --testing-mode - -Invoke heuristics to assist with testing PSPP. For use by @code{make -check} and similar scripts.
-@end table - -@node Language control options, Informational options, Input and output options, Invocation -@section Language control options - -Language control options control how PSPP syntax files are parsed and -interpreted. The available language control options are: - -@table @code -@item -i -@itemx --interactive - -When a syntax file is specified on the command line, PSPP normally -terminates after processing it. Giving this option will cause PSPP to -bring up a command prompt after processing the syntax file. - -In addition, this forces syntax files to be interpreted in interactive -mode, rather than the default batch mode. @xref{Tokenizing lines}, for -information on the differences between batch mode and interactive mode -command interpretation. - -@item -n -@itemx --edit -@itemx --dry-run -@itemx --just-print -@itemx --recon - -Only the syntax of any syntax file specified or of commands entered at -the command line is checked. Transformations are not performed and -procedures are not executed. Not yet implemented. - -@item -r -@itemx --no-statrc - -Prevents the execution of the PSPP startup syntax file. Not yet -implemented, as startup syntax files aren't, either. - -@item -s -@itemx --safer - -Disables certain unsafe operations. This includes the ERASE and -HOST commands, as well as use of pipes as input and output files. -@end table - -@node Informational options, , Language control options, Invocation -@section Informational options - -Informational options cause information about PSPP to be written to -the terminal. Here are the available options: - -@table @code -@item -h -@item --help - -Prints a message describing PSPP command-line syntax and the available -device driver classes, then terminates. - -@item -l -@item --list - -Lists the available device driver classes, then terminates. 
- -@item -x @{compatible|enhanced@} -@itemx --syntax=@{compatible|enhanced@} - -If you choose @code{compatible}, then PSPP will only accept command syntax that -is compatible with the proprietary program SPSS. -If you choose @code{enhanced} then additional syntax will be available. -The default is @code{enhanced}. - - -@item -V -@item --version - -Prints a brief message listing PSPP's version, warranties you don't -have, copying conditions and copyright, and e-mail address for bug -reports, then terminates. - -@item -v -@item --verbose - -Increments PSPP's verbosity level. Higher verbosity levels cause -PSPP to display greater amounts of information about what it is -doing. Often useful for debugging PSPP's configuration. - -This option can be given multiple times; the verbosity level is set to -the number of times it is given. The default verbosity level is 0, in which no informational -messages will be displayed. - -Higher verbosity levels cause messages to be displayed when the -corresponding events take place. - -@table @asis -@item 1 - -Driver and subsystem initializations. - -@item 2 - -Completion of driver initializations. Beginning of driver closings. - -@item 3 - -Completion of driver closings. - -@item 4 - -Files searched for; success of searches. - -@item 5 - -Individual directories included in file searches. -@end table - -Each verbosity level also includes messages from lower verbosity levels. - -@end table - -@node Language, Expressions, Invocation, Top -@chapter The PSPP language -@cindex language, PSPP -@cindex PSPP, language - -@quotation -@strong{Please note:} PSPP is not even close to completion. -Only a few actual statistical procedures are implemented. PSPP -is a work in progress. -@end quotation - -This chapter discusses elements common to many PSPP commands. -Later chapters will describe individual commands in detail. - -@menu -* Tokens:: Characters combine to form tokens. -* Commands:: Tokens combine to form commands.
-* Types of Commands:: Commands come in several flavors. -* Order of Commands:: Commands combine to form syntax files. -* Missing Observations:: Handling missing observations. -* Variables:: The unit of data storage. -* Files:: Files used by PSPP. -* BNF:: How command syntax is described. -@end menu - -@node Tokens, Commands, Language, Language -@section Tokens -@cindex language, lexical analysis -@cindex language, tokens -@cindex tokens -@cindex lexical analysis -@cindex lexemes - -PSPP divides most syntax file lines into series of short chunks -called @dfn{tokens}, @dfn{lexical elements}, or @dfn{lexemes}. These -tokens are then grouped to form commands, each of which tells -PSPP to take some action---read in data, write out data, perform -a statistical procedure, etc. The process of dividing input into tokens -is @dfn{tokenization}, or @dfn{lexical analysis}. Each type of token is -described below. - -@cindex delimiters -@cindex whitespace -Tokens must be separated from each other by @dfn{delimiters}. -Delimiters include whitespace (spaces, tabs, carriage returns, line -feeds, vertical tabs), punctuation (commas, forward slashes, etc.), and -operators (plus, minus, times, divide, etc.) Note that while whitespace -only separates tokens, other delimiters are tokens in themselves. - -@table @strong -@cindex identifiers -@item Identifiers -Identifiers are names that specify variable names, commands, or command -details. - -@itemize @bullet -@item -The first character in an identifier must be a letter, @samp{#}, or -@samp{@@}. Some system identifiers begin with @samp{$}, but -user-defined variables' names may not begin with @samp{$}. - -@item -The remaining characters in the identifier must be letters, digits, or -one of the following special characters: - -@example -. _ $ # @@ -@end example - -@item -@cindex variable names -@cindex names, variable -Variable names may be any length, but only the first 8 characters are -significant. 
- -@item -@cindex case-sensitivity -Identifiers are not case-sensitive: @code{foobar}, @code{Foobar}, -@code{FooBar}, @code{FOOBAR}, and @code{FoObaR} are different -representations of the same identifier. - -@item -@cindex keywords -Identifiers other than variable names may be abbreviated to their first -3 characters if this abbreviation is unambiguous. These identifiers are -often called @dfn{keywords}. (Unique abbreviations of 3 or more -characters are also accepted: @samp{FRE}, @samp{FREQ}, and -@samp{FREQUENCIES} are equivalent when the last is a keyword.) - -@item -Whether an identifier is a keyword depends on the context. - -@item -@cindex keywords, reserved -@cindex reserved keywords -Some keywords are reserved. These keywords may not be used in any -context besides those explicitly described in this manual. The reserved -keywords are: - -@example -ALL AND BY EQ GE GT LE LT NE NOT OR TO WITH -@end example - -@item -Since keywords are identifiers, all the rules for identifiers apply. -Specifically, they must be delimited as are other identifiers: -@code{WITH} is a reserved keyword, but @code{WITHOUT} is a valid -variable name. -@end itemize - -@cindex @samp{.} -@cindex period -@cindex variable names, ending with period -@strong{Caution:} It is legal to end a variable name with a period, but -@emph{don't do it!} The variable name will be misinterpreted when it is -the final token on a line: @code{FOO.} will be divided into two separate -tokens, @samp{FOO} and @samp{.}, the @dfn{terminal dot}. -@xref{Commands, , Forming commands of tokens}. - -@item Numbers -@cindex numbers -@cindex integers -@cindex reals -Numbers may be specified as integers or reals. Integers are internally -converted into reals. Scientific notation is not supported. Here are -some examples of valid numbers: - -@example -1234 3.14159265359 .707106781185 8945. 
-@end example - -@strong{Caution:} The last example will be interpreted as two tokens, -@samp{8945} and @samp{.}, if it is the last token on a line. - -@item Strings -@cindex strings -@cindex @samp{'} -@cindex @samp{"} -@cindex case-sensitivity -Strings are literal sequences of characters enclosed in pairs of single -quotes (@samp{'}) or double quotes (@samp{"}). - -@itemize @bullet -@item -Whitespace and case of letters @emph{are} significant inside strings. -@item -Whitespace characters inside a string are not delimiters. -@item -To include single-quote characters in a string, enclose the string in -double quotes. -@item -To include double-quote characters in a string, enclose the string in -single quotes. -@item -It is not possible to put both single- and double-quote characters -inside one string. -@end itemize - -@item Hexstrings -@cindex hexstrings -Hexstrings are string variants that use hex digits to specify -characters. - -@itemize @bullet -@item -A hexstring may be used anywhere that an ordinary string is allowed. - -@item -@cindex @samp{X'} -@cindex @samp{'} -A hexstring begins with @samp{X'} or @samp{x'}, and ends with @samp{'}. - -@cindex whitespace -@item -No whitespace is allowed between the initial @samp{X} and @samp{'}. - -@item -Double quotes @samp{"} may be used in place of single quotes @samp{'} if -done in both places. - -@item -Each pair of hex digits is internally changed into a single character -with the given value. - -@item -If there is an odd number of hex digits, the missing last digit is -assumed to be @samp{0}. - -@item -@cindex portability -@strong{Please note:} Use of hexstrings is nonportable because the same -numeric values are associated with different glyphs by different -operating systems. Therefore, their use should be confined to syntax -files that will not be widely distributed. 
- -@item -@cindex characters, reserved -@cindex 0 -@cindex whitespace -@strong{Please note also:} The character with value 00 is reserved for -internal use by PSPP. Its use in strings causes an error and -replacement with a blank space (in ASCII, hex 20, decimal 32). -@end itemize - -@item Punctuation -@cindex punctuation -Punctuation separates tokens; punctuators are delimiters. These are the -punctuation characters: - -@example -, / = ( ) -@end example - -@item Operators -@cindex operators -Operators describe mathematical operations. Some operators are delimiters: - -@example -( ) + - * / ** -@end example - -Many of the above operators are also punctuators. Punctuators are -distinguished from operators by context. - -The other operators are all reserved keywords. None of these are -delimiters: - -@example -AND EQ GE GT LE LT NE OR -@end example - -@item Terminal Dot -@cindex terminal dot -@cindex dot, terminal -@cindex period -@cindex @samp{.} -A period (@samp{.}) at the end of a line (except for whitespace) is one -type of a @dfn{terminal dot}, although not every terminal dot is a -period at the end of a line. @xref{Commands, , Forming commands of -tokens}. A period is a terminal dot @emph{only} -when it is at the end of a line; otherwise it is part of a -floating-point number. (A period outside a number in the middle of a -line is an error.) - -@quotation -@cindex terminal dot, changing -@cindex dot, terminal, changing -@strong{Please note:} The character used for the @dfn{terminal dot} -can be changed with @cmd{SET}'s ENDCMD subcommand (@pxref{SET}). This -is strongly discouraged, and throughout all the remainder of this -manual it will be assumed that the default setting is in effect. 
-@end quotation - -@end table - -@node Commands, Types of Commands, Tokens, Language -@section Forming commands of tokens - -@cindex PSPP, command structure -@cindex language, command structure -@cindex commands, structure - -Most PSPP commands share a common structure, diagrammed below: - -@example -@var{cmd}@dots{} [@var{sbc}[=][@var{spec} [[,]@var{spec}]@dots{}]] [[/[=][@var{spec} [[,]@var{spec}]@dots{}]]@dots{}]. -@end example - -@cindex @samp{[ ]} -In the above, rather daunting, expression, pairs of square brackets -(@samp{[ ]}) indicate optional elements, and names such as @var{cmd} -indicate parts of the syntax that vary from command to command. -Ellipses (@samp{...}) indicate that the preceding part may be repeated -an arbitrary number of times. Let's pick apart what it says above: - -@itemize @bullet -@cindex commands, names -@item -A command begins with a command name of one or more keywords, such as -@cmd{FREQUENCIES}, @cmd{DATA LIST}, or @cmd{N OF CASES}. @var{cmd} -may be abbreviated to its first word if that is unambiguous; each word -in @var{cmd} may be abbreviated to a unique prefix of three or more -characters as described above. - -@cindex subcommands -@item -The command name may be followed by one or more @dfn{subcommands}: - -@itemize @minus -@item -Each subcommand begins with a unique keyword, indicated by @var{sbc} -above. This is analogous to the command name. - -@item -The subcommand name is optionally followed by an equals sign (@samp{=}). - -@item -Some subcommands accept a series of one or more specifications -(@var{spec}), optionally separated by commas. - -@item -Each subcommand must be separated from the next (if any) by a forward -slash (@samp{/}). -@end itemize - -@cindex dot, terminal -@cindex terminal dot -@item -Each command must be terminated with a @dfn{terminal dot}. -The terminal dot may be given one of three ways: - -@itemize @minus -@item -(most commonly) A period character at the very end of a line, as -described above. 
- -@item -(only if NULLINE is on: @xref{SET, , Setting user preferences}, for more -details.) A completely blank line. - -@item -(in batch mode only) Any line that is not indented from the left side of -the page causes a terminal dot to be inserted before that line. -Therefore, each command begins with a line that is flush left, followed -by zero or more lines that are indented one or more characters from the -left margin. - -In batch mode, PSPP will ignore a plus sign, minus sign, or period -(@samp{+}, @samp{@minus{}}, or @samp{.}) as the first character in a -line. Any of these characters as the first character on a line will -begin a new command. This allows for visual indentation of a command -without that command being considered part of the previous command. - -PSPP is in batch mode when it is reading input from a file, rather -than from an interactive user. Note that the other forms of the -terminal dot may also be used in batch mode. - -Sometimes, one encounters syntax files that are intended to be -interpreted in interactive mode rather than batch mode (for instance, -this can happen if a session log file is used directly as a syntax -file). When this occurs, use the @samp{-i} command line option to force -interpretation in interactive mode (@pxref{Language control options}). -@end itemize -@end itemize - -PSPP ignores empty commands when they are generated by the above -rules. Note that, as a consequence of these rules, each command must -begin on a new line. - -@node Types of Commands, Order of Commands, Commands, Language -@section Types of Commands - -Commands in PSPP are divided roughly into six categories: - -@table @strong -@item Utility commands -@cindex utility commands -Set or display various global options that affect PSPP operations. -May appear anywhere in a syntax file. @xref{Utilities, , Utility -commands}. 
- -@item File definition commands -@cindex file definition commands -Give instructions for reading data from text files or from special -binary ``system files''. Most of these commands discard any previous -data or variables to replace it with the new data and -variables. At least one must appear before the first command in any of -the categories below. @xref{Data Input and Output}. - -@item Input program commands -@cindex input program commands -Though rarely used, these provide powerful tools for reading data files -in arbitrary textual or binary formats. @xref{INPUT PROGRAM}. - -@item Transformations -@cindex transformations -Perform operations on data and write data to output files. Transformations -are not carried out until a procedure is executed. - -@item Restricted transformations -@cindex restricted transformations -Same as transformations for most purposes. @xref{Order of Commands}, for a -detailed description of the differences. - -@item Procedures -@cindex procedures -Analyze data, writing results of analyses to the listing file. Cause -transformations specified earlier in the file to be performed. In a -more general sense, a @dfn{procedure} is any command that causes the -active file (the data) to be read. -@end table - -@node Order of Commands, Missing Observations, Types of Commands, Language -@section Order of Commands -@cindex commands, ordering -@cindex order of commands - -PSPP does not place many restrictions on ordering of commands. -The main restriction is that variables must be defined with one of the -file-definition commands before they are otherwise referred to. - -Of course, there are specific rules, for those who are interested. -PSPP possesses five internal states, called initial, INPUT PROGRAM, -FILE TYPE, transformation, and procedure states. (Please note the -distinction between the @cmd{INPUT PROGRAM} and @cmd{FILE TYPE} -@emph{commands} and the INPUT PROGRAM and FILE TYPE @emph{states}.) - -PSPP starts up in the initial state. 
Each successful completion -of a command may cause a state transition. Each type of command has its -own rules for state transitions: - -@table @strong -@item Utility commands -@itemize @bullet -@item -Legal in all states. -@item -Do not cause state transitions. Exception: when @cmd{N OF CASES} -is executed in the procedure state, it causes a transition to the -transformation state. -@end itemize - -@item @cmd{DATA LIST} -@itemize @bullet -@item -Legal in all states. -@item -When executed in the initial or procedure state, causes a transition to -the transformation state. -@item -Clears the active file if executed in the procedure or transformation -state. -@end itemize - -@item @cmd{INPUT PROGRAM} -@itemize @bullet -@item -Invalid in INPUT PROGRAM and FILE TYPE states. -@item -Causes a transition to the INPUT PROGRAM state. -@item -Clears the active file. -@end itemize - -@item @cmd{FILE TYPE} -@itemize @bullet -@item -Invalid in INPUT PROGRAM and FILE TYPE states. -@item -Causes a transition to the FILE TYPE state. -@item -Clears the active file. -@end itemize - -@item Other file definition commands -@itemize @bullet -@item -Invalid in INPUT PROGRAM and FILE TYPE states. -@item -Cause a transition to the transformation state. -@item -Clear the active file, except for @cmd{ADD FILES}, @cmd{MATCH FILES}, -and @cmd{UPDATE}. -@end itemize - -@item Transformations -@itemize @bullet -@item -Invalid in initial and FILE TYPE states. -@item -Cause a transition to the transformation state. -@end itemize - -@item Restricted transformations -@itemize @bullet -@item -Invalid in initial, INPUT PROGRAM, and FILE TYPE states. -@item -Cause a transition to the transformation state. -@end itemize - -@item Procedures -@itemize @bullet -@item -Invalid in initial, INPUT PROGRAM, and FILE TYPE states. -@item -Cause a transition to the procedure state. 
-@end itemize -@end table - -@node Missing Observations, Variables, Order of Commands, Language -@section Handling missing observations -@cindex missing values -@cindex values, missing - -PSPP includes special support for unknown numeric data values. -Missing observations are assigned a special value, called the -@dfn{system-missing value}. This ``value'' actually indicates the -absence of value; it means that the actual value is unknown. Procedures -automatically exclude from analyses those observations or cases that -have missing values. Whether single observations or entire cases are -excluded depends on the procedure. - -The system-missing value exists only for numeric variables. String -variables always have a defined value, even if it is only a string of -spaces. - -Variables, whether numeric or string, can have designated -@dfn{user-missing values}. Every user-missing value is an actual value -for that variable. However, most of the time user-missing values are -treated in the same way as the system-missing value. String variables -that are wider than a certain width, usually 8 characters (depending on -computer architecture), cannot have user-missing values. - -For more information on missing values, see the following sections: -@ref{Variables}, @ref{MISSING VALUES}, @ref{Expressions}. See also the -documentation on individual procedures for information on how they -handle missing values. - -@node Variables, Files, Missing Observations, Language -@section Variables -@cindex variables -@cindex dictionary - -Variables are the basic unit of data storage in PSPP. All the -variables in a file taken together, apart from any associated data, are -said to form a @dfn{dictionary}. -Some details of variables are described in the sections below. - -@menu -* Attributes:: Attributes of variables. -* System Variables:: Variables automatically defined by PSPP. -* Sets of Variables:: Lists of variable names. -* Input/Output Formats:: Input and output formats. 
-* Scratch Variables:: Variables deleted by procedures. -@end menu - -@node Attributes, System Variables, Variables, Variables -@subsection Attributes of Variables -@cindex variables, attributes of -@cindex attributes of variables -Each variable has a number of attributes, including: - -@table @strong -@item Name -This is an identifier. Each variable must have a different name. -@xref{Tokens}. - -@cindex variables, type -@cindex type of variables -@item Type -Numeric or string. - -@cindex variables, width -@cindex width of variables -@item Width -(string variables only) String variables with a width of 8 characters or -fewer are called @dfn{short string variables}. Short string variables -can be used in many procedures where @dfn{long string variables} (those -with widths greater than 8) are not allowed. - -@quotation -@strong{Please note:} Certain systems may consider strings longer than 8 -characters to be short strings. Eight characters represents a minimum -figure for the maximum length of a short string. -@end quotation - -@item Position -Variables in the dictionary are arranged in a specific order. -@cmd{DISPLAY} can be used to show this order: see @ref{DISPLAY}. - -@item Initialization -Either reinitialized to 0 or spaces for each case, or left at its -existing value. @xref{LEAVE}. - -@cindex missing values -@cindex values, missing -@item Missing values -Optionally, up to three values, or a range of values, or a specific -value plus a range, can be specified as @dfn{user-missing values}. -There is also a @dfn{system-missing value} that is assigned to an -observation when there is no other obvious value for that observation. -Observations with missing values are automatically excluded from -analyses. User-missing values are actual data values, while the -system-missing value is not a value at all. @xref{Missing Observations}. - -@cindex variable labels -@cindex labels, variable -@item Variable label -A string that describes the variable. 
@xref{VARIABLE LABELS}. - -@cindex value labels -@cindex labels, value -@item Value label -Optionally, these associate each possible value of the variable with a -string. @xref{VALUE LABELS}. - -@cindex print format -@item Print format -Display width, format, and (for numeric variables) number of decimal -places. This attribute does not affect how data are stored, just how -they are displayed. Example: a width of 8, with 2 decimal places. -@xref{PRINT FORMATS}. - -@cindex write format -@item Write format -Similar to print format, but used by certain commands that are -designed to write to binary files. @xref{WRITE FORMATS}. -@end table - -@node System Variables, Sets of Variables, Attributes, Variables -@subsection Variables Automatically Defined by PSPP -@cindex system variables -@cindex variables, system - -There are seven system variables. These are not like ordinary -variables, as they are not stored in each case. They can only be used -in expressions. These system variables, whose values and output formats -cannot be modified, are described below. - -@table @code -@cindex @code{$CASENUM} -@item $CASENUM -Case number of the case at the moment. This changes as cases are -shuffled around. - -@cindex @code{$DATE} -@item $DATE -Date the PSPP process was started, in format A9, following the -pattern @code{DD MMM YY}. - -@cindex @code{$JDATE} -@item $JDATE -Number of days between 15 Oct 1582 and the time the PSPP process -was started. - -@cindex @code{$LENGTH} -@item $LENGTH -Page length, in lines, in format F11. - -@cindex @code{$SYSMIS} -@item $SYSMIS -System missing value, in format F1. - -@cindex @code{$TIME} -@item $TIME -Number of seconds between midnight 14 Oct 1582 and the time the active file -was read, in format F20. - -@cindex @code{$WIDTH} -@item $WIDTH -Page width, in characters, in format F3. 
-@end table - -@node Sets of Variables, Input/Output Formats, System Variables, Variables -@subsection Lists of variable names -@cindex TO convention -@cindex convention, TO - -There are several ways to specify a set of variables: - -@enumerate -@item -(Most commonly.) List the variable names one after another, optionally -separating them by commas. - -@cindex @code{TO} -@item -(This method cannot be used on commands that define the dictionary, such -as @cmd{DATA LIST}.) The syntax is the names of two existing variables, -separated by the reserved keyword @code{TO}. The meaning is to include -every variable in the dictionary between and including the variables -specified. For instance, if the dictionary contains six variables with -the names @code{ID}, @code{X1}, @code{X2}, @code{GOAL}, @code{MET}, and -@code{NEXTGOAL}, in that order, then @code{X2 TO MET} would include -variables @code{X2}, @code{GOAL}, and @code{MET}. - -@item -(This method can be used only on commands that define the dictionary, -such as @cmd{DATA LIST}.) It is used to define sequences of variables -that end in consecutive integers. The syntax is two identifiers that -end in numbers. This method is best illustrated with examples: - -@itemize @bullet -@item -The syntax @code{X1 TO X5} defines 5 variables: - -@itemize @minus -@item -X1 -@item -X2 -@item -X3 -@item -X4 -@item -X5 -@end itemize - -@item -The syntax @code{ITEM0008 TO ITEM0013} defines 6 variables: - -@itemize @minus -@item -ITEM0008 -@item -ITEM0009 -@item -ITEM0010 -@item -ITEM0011 -@item -ITEM0012 -@item -ITEM0013 -@end itemize - -@item -Each of the syntaxes @code{QUES001 TO QUES9} and @code{QUES6 TO QUES3} -are invalid, although for different reasons, which should be evident. -@end itemize - -Note that after a set of variables has been defined with @cmd{DATA LIST} -or another command with this method, the same set can be referenced on -later commands using the same syntax. 
- -@item -The above methods can be combined, either one after another or delimited -by commas. For instance, the combined syntax @code{A Q5 TO Q8 X TO Z} -is legal as long as each part @code{A}, @code{Q5 TO Q8}, @code{X TO Z} -is individually legal. -@end enumerate - -@node Input/Output Formats, Scratch Variables, Sets of Variables, Variables -@subsection Input and Output Formats - -Data that PSPP inputs and outputs must have one of a number of formats. -These formats are described, in general, by a format specification of -the form @code{NAMEw.d}, where @var{name} is the -format name and @var{w} is a field width. @var{d} is the optional -desired number of decimal places, if appropriate. If @var{d} is not -included then it is assumed to be 0. Some formats do not allow @var{d} -to be specified. - -When an input format is specified on @cmd{DATA LIST} or another -command, then -it is converted to an output format for the purposes of @cmd{PRINT} -and other -data output commands. For most purposes, input and output formats are -the same; the salient differences are described below. - -Below are listed the input and output formats supported by PSPP. If an -input format is mapped to a different output format by default, then -that mapping is indicated with @result{}. Each format has the listed -bounds on input width (iw) and output width (ow). - -The standard numeric input and output formats are given in the following -table: - -@table @asis -@item Fw.d: 1 <= iw,ow <= 40 -Standard decimal format with @var{d} decimal places. If the number is -too large to fit within the field width, it is expressed in scientific -notation (@code{1.2+34}) if w >= 6, with always at least two digits in -the exponent. When used as an input format, scientific notation is -allowed but an E or an F must be used to introduce the exponent. - -The default output format is the same as the input format, except if -@var{d} > 1. In that case the output @var{w} is always made to be at -least 2 + @var{d}. 
- -@item Ew.d: 1 <= iw <= 40; 6 <= ow <= 40 -For input this is equivalent to F format except that no E or F is -required to introduce the exponent. For output, produces scientific -notation in the form @code{1.2+34}. There are always at least two -digits given in the exponent. - -The default output @var{w} is the largest of the input @var{w}, the -input @var{d} + 7, and 10. The default output @var{d} is the input -@var{d}, but at least 3. - -@item COMMAw.d: 1 <= iw,ow <= 40 -Equivalent to F format, except that groups of three digits are -comma-separated on output. If the number is too large to express in the -field width, then first commas are eliminated, then if there is still -not enough space the number is expressed in scientific notation given -that w >= 6. Commas are allowed and ignored when this is used as an -input format. - -@item DOTw.d: 1 <= iw,ow <= 40 -Equivalent to COMMA format except that the roles of comma and decimal -point are interchanged. However: If SET /DECIMAL=DOT is in effect, then -COMMA uses @samp{,} for a decimal point and DOT uses @samp{.} for a -decimal point. - -@item DOLLARw.d: 1 <= iw <= 40; 2 <= ow <= 40 -Equivalent to COMMA format, except that the number is prefixed by a -dollar sign (@samp{$}) if there is room. On input the value is allowed -to be prefixed by a dollar sign, which is ignored. - -The default output @var{w} is the input @var{w}, but at least 2. - -@item PCTw.d: 2 <= iw,ow <= 40 -Equivalent to F format, except that the number is suffixed by a percent -sign (@samp{%}) if there is room. On input the value is allowed to be -suffixed by a percent sign, which is ignored. - -The default output @var{w} is the input @var{w}, but at least 2. - -@item Nw.d: 1 <= iw,ow <= 40 -Only digits are allowed within the field width. The decimal point is -assumed to be @var{d} digits from the right margin. - -The default output format is F with the same @var{w} and @var{d}, except -if @var{d} > 1.
In that case the output @var{w} is always made to be at -least 2 + @var{d}. - -@item Zw.d @result{} F: 1 <= iw,ow <= 40 -Zoned decimal input. If you need to use this then you know how. - -@item IBw.d @result{} F: 1 <= iw,ow <= 8 -Integer binary format. The field is interpreted as a fixed-point -positive or negative binary number in two's-complement notation. The -location of the decimal point is implied. Endianness is the same as the -host machine. - -The default output format is F8.2 if @var{d} is 0. Otherwise it is F, -with output @var{w} as 9 + input @var{d} and output @var{d} as input -@var{d}. - -@item PIBw.d @result{} F: 1 <= iw,ow <= 8 -Positive integer binary format. The field is interpreted as a -fixed-point positive binary number. The location of the decimal point -is implied. Endianness is the same as the host machine. - -The default output format follows the rules for IB format. - -@item Pw.d @result{} F: 1 <= iw,ow <= 16 -Binary coded decimal format. Each byte from left to right, except the -rightmost, represents two digits. The upper nibble of each byte is more -significant. The upper nibble of the final byte is the least -significant digit. The lower nibble of the final byte is the sign; a -value of D represents a negative sign and all other values are -considered positive. The decimal point is implied. - -The default output format follows the rules for IB format. - -@item PKw.d @result{} F: 1 <= iw,ow <= 16 -Positive binary coded decimal format. Same as P but the last byte is the -same as the others. - -The default output format follows the rules for IB format. - -@item RBw @result{} F: 2 <= iw,ow <= 8 - -Binary C architecture-dependent ``double'' format. For a standard -IEEE754 implementation @var{w} should be 8. - -The default output format follows the rules for IB format. - -@item PIBHEXw.d @result{} F: 2 <= iw,ow <= 16 -PIB format encoded as textual hex digit pairs. @var{w} must be even.
- -The input width is mapped to a default output width as follows: -2@result{}4, 4@result{}6, 6@result{}9, 8@result{}11, 10@result{}14, -12@result{}16, 14@result{}18, 16@result{}21. No allowances are made for -decimal places. - -@item RBHEXw @result{} F: 4 <= iw,ow <= 16 - -RB format encoded as textual hex digit pairs. @var{w} must be even. - -The default output format is F8.2. - -@item CCAw.d: 1 <= ow <= 40 -@itemx CCBw.d: 1 <= ow <= 40 -@itemx CCCw.d: 1 <= ow <= 40 -@itemx CCDw.d: 1 <= ow <= 40 -@itemx CCEw.d: 1 <= ow <= 40 - -User-defined custom currency formats. May not be used as an input -format. @xref{SET}, for more details. -@end table - -The date and time numeric input and output formats accept a number of -possible formats. Before describing the formats themselves, some -definitions of the elements that make up their formats will be helpful: - -@table @dfn -@item leader -All formats accept an optional whitespace leader. - -@item day -An integer between 1 and 31 representing the day of the month. - -@item day-count -An integer representing a number of days. - -@item date-delimiter -One or more characters of whitespace or the following characters: -@code{- / . ,} - -@item month -A month name in one of the following forms: -@itemize @bullet -@item -An integer between 1 and 12. -@item -Roman numerals representing an integer between 1 and 12. -@item -At least the first three characters of an English month name (January, -February, @dots{}). -@end itemize - -@item year -An integer year number between 1582 and 19999, or between 1 and 199. -Years between 1 and 199 will have 1900 added. - -@item julian -A single number with a year number in the first 2, 3, or 4 digits (as -above) and the day number within the year in the last 3 digits. - -@item quarter -An integer between 1 and 4 representing a quarter. - -@item q-delimiter -The letter @samp{Q} or @samp{q}. - -@item week -An integer between 1 and 53 representing a week within a year.
- -@item wk-delimiter -The letters @samp{wk} in any case. - -@item time-delimiter -At least one character of whitespace or @samp{:} or @samp{.}. - -@item hour -An integer greater than 0 representing an hour. - -@item minute -An integer between 0 and 59 representing a minute within an hour. - -@item opt-second -Optionally, a time-delimiter followed by a real number representing a -number of seconds. - -@item hour24 -An integer between 0 and 23 representing an hour within a day. - -@item weekday -At least the first two characters of an English weekday name. - -@item spaces -Any amount or no amount of whitespace. - -@item sign -An optional positive or negative sign. - -@item trailer -All formats accept an optional whitespace trailer. -@end table - -The date input formats are strung together from the above pieces. On -output, the date formats are always printed in a single canonical -manner, based on field width. The date input and output formats are -described below: - -@table @asis -@item DATEw: 9 <= iw,ow <= 40 -Date format. Input format: leader + day + date-delimiter + -month + date-delimiter + year + trailer. Output format: DD-MMM-YY for -@var{w} < 11, DD-MMM-YYYY otherwise. - -@item EDATEw: 8 <= iw,ow <= 40 -European date format. Input format same as DATE. Output format: -DD.MM.YY for @var{w} < 10, DD.MM.YYYY otherwise. - -@item SDATEw: 8 <= iw,ow <= 40 -Standard date format. Input format: leader + year + date-delimiter + -month + date-delimiter + day + trailer. Output format: YY/MM/DD for -@var{w} < 10, YYYY/MM/DD otherwise. - -@item ADATEw: 8 <= iw,ow <= 40 -American date format. Input format: leader + month + date-delimiter + -day + date-delimiter + year + trailer. Output format: MM/DD/YY for -@var{w} < 10, MM/DD/YYYY otherwise. - -@item JDATEw: 5 <= iw,ow <= 40 -Julian date format. Input format: leader + julian + trailer. Output -format: YYDDD for @var{w} < 7, YYYYDDD otherwise. - -@item QYRw: 4 <= iw <= 40, 6 <= ow <= 40 -Quarter/year format.
Input format: leader + quarter + q-delimiter + -year + trailer. Output format: @samp{Q Q YY}, where the first -@samp{Q} is one of the digits 1, 2, 3, 4, if @var{w} < 8, @code{Q Q -YYYY} otherwise. - -@item MOYRw: 6 <= iw,ow <= 40 -Month/year format. Input format: leader + month + date-delimiter + year -+ trailer. Output format: @samp{MMM YY} for @var{w} < 8, @samp{MMM -YYYY} otherwise. - -@item WKYRw: 6 <= iw <= 40, 8 <= ow <= 40 -Week/year format. Input format: leader + week + wk-delimiter + year + -trailer. Output format: @samp{WW WK YY} for @var{w} < 10, @samp{WW WK -YYYY} otherwise. - -@item DATETIMEw.d: 17 <= iw,ow <= 40 -Date and time format. Input format: leader + day + date-delimiter + -month + date-delimiter + year + time-delimiter + hour24 + time-delimiter -+ minute + opt-second. Output format: @samp{DD-MMM-YYYY HH:MM}. If -@var{w} > 19 then seconds @samp{:SS} are added. If @var{w} > 22 and -@var{d} > 0 then fractional seconds @samp{.SS} are added. - -@item TIMEw.d: 5 <= iw,ow <= 40 -Time format. Input format: leader + sign + spaces + hour + -time-delimiter + minute + opt-second. Output format: @samp{HH:MM}. -Seconds and fractional seconds are available with @var{w} of at least 8 -and 10, respectively. - -@item DTIMEw.d: 1 <= iw <= 40, 8 <= ow <= 40 -Time format with day count. Input format: leader + sign + spaces + -day-count + time-delimiter + hour + time-delimiter + minute + -opt-second. Output format: @samp{DD HH:MM}. Seconds and fractional -seconds are available with @var{w} of at least 8 and 10, respectively. - -@item WKDAYw: 2 <= iw,ow <= 40 -A weekday as a number between 1 and 7, where 1 is Sunday. Input format: -leader + weekday + trailer. Output format: as many characters, in all -capital letters, of the English name of the weekday as will fit in the -field width. - -@item MONTHw: 3 <= iw,ow <= 40 -A month as a number between 1 and 12, where 1 is January. Input format: -leader + month + trailer.
Output format: as many characters, in all -capital letters, of the English name of the month as will fit in the -field width. -@end table - -There are only two formats that may be used with string variables: - -@table @asis -@item Aw: 1 <= iw <= 255, 1 <= ow <= 254 -The entire field is treated as a string value. - -@item AHEXw @result{} A: 2 <= iw <= 254; 2 <= ow <= 510 -The field is composed of characters in a string encoded as textual hex -digit pairs. - -The default output @var{w} is half the input @var{w}. -@end table - -@node Scratch Variables, , Input/Output Formats, Variables -@subsection Scratch Variables - -Most of the time, variables don't retain their values between cases. -Instead, either they're being read from a data file or the active file, -in which case they assume the value read, or, if created with -@cmd{COMPUTE} or -another transformation, they're initialized to the system-missing value -or to blanks, depending on type. - -However, sometimes it's useful to have a variable that keeps its value -between cases. You can do this with @cmd{LEAVE} (@pxref{LEAVE}), or you can -use a @dfn{scratch variable}. Scratch variables are variables whose -names begin with an octothorpe (@samp{#}). - -Scratch variables have the same properties as variables left with -@cmd{LEAVE}: -they retain their values between cases, and for the first case they are -initialized to 0 or blanks. They have the additional property that they -are deleted before the execution of any procedure. For this reason, -scratch variables can't be used for analysis. To obtain the same -effect, use @cmd{COMPUTE} (@pxref{COMPUTE}) to copy the scratch variable's -value into an ordinary variable, then analyze that variable. - -@node Files, BNF, Variables, Language -@section Files Used by PSPP - -PSPP makes use of many files each time it runs. Some of these it -reads, some it writes, some it creates.
Here is a table listing the -most important of these files: - -@table @strong -@cindex file, command -@cindex file, syntax file -@cindex command file -@cindex syntax file -@item command file -@itemx syntax file -These names (synonyms) refer to the file that contains instructions to -PSPP that tell it what to do. The syntax file's name is specified on -the PSPP command line. Syntax files can also be pulled in with -@cmd{INCLUDE} (@pxref{INCLUDE}). - -@cindex file, data -@cindex data file -@item data file -Data files contain raw data in ASCII format suitable for being read in -by @cmd{DATA LIST}. Data can be embedded in the syntax -file with @cmd{BEGIN DATA} and @cmd{END DATA}: this makes the -syntax file a data file too. - -@cindex file, output -@cindex output file -@item listing file -One or more output files are created by PSPP each time it is -run. The output files receive the tables and charts produced by -statistical procedures. The output files may be in any number of formats, -depending on how PSPP is configured. - -@cindex active file -@cindex file, active -@item active file -The active file is the ``file'' on which all PSPP procedures -are performed. The active file contains variable definitions and -cases. The active file is not necessarily a disk file: it is stored -in memory if there is room. -@end table - -@node BNF, , Files, Language -@section Backus-Naur Form -@cindex BNF -@cindex Backus-Naur Form -@cindex command syntax, description of -@cindex description of command syntax - -The syntax of some parts of the PSPP language is presented in this -manual using the formalism known as @dfn{Backus-Naur Form}, or BNF. The -following table describes BNF: - -@itemize @bullet -@cindex keywords -@cindex terminals -@item -Words in all-uppercase are PSPP keyword tokens. In BNF, these are -often called @dfn{terminals}. 
There are some special terminals, which -are actually written in lowercase for clarity: - -@table @asis -@cindex @code{number} -@item @code{number} -A real number. - -@cindex @code{integer} -@item @code{integer} -An integer number. - -@cindex @code{string} -@item @code{string} -A string. - -@cindex @code{var-name} -@item @code{var-name} -A single variable name. - -@cindex operators -@cindex punctuators -@item @code{=}, @code{/}, @code{+}, @code{-}, etc. -Operators and punctuators. - -@cindex @code{.} -@cindex terminal dot -@cindex dot, terminal -@item @code{.} -The terminal dot. This is not necessarily an actual dot in the syntax -file: @xref{Commands}, for more details. -@end table - -@item -@cindex productions -@cindex nonterminals -Other words in all lowercase refer to BNF definitions, called -@dfn{productions}. These productions are also known as -@dfn{nonterminals}. Some nonterminals are very common, so they are -defined here in English for clarity: - -@table @code -@cindex @code{var-list} -@item var-list -A list of one or more variable names or the keyword @code{ALL}. - -@cindex @code{expression} -@item expression -An expression. @xref{Expressions}, for details. -@end table - -@item -@cindex @code{::=} -@cindex ``is defined as'' -@cindex productions -@samp{::=} means ``is defined as''. The left side of @samp{::=} gives -the name of the nonterminal being defined. The right side of @samp{::=} -gives the definition of that nonterminal. If the right side is empty, -then one possible expansion of that nonterminal is nothing. A BNF -definition is called a @dfn{production}. - -@item -@cindex terminals and nonterminals, differences -So, the key difference between a terminal and a nonterminal is that a -terminal cannot be broken into smaller parts---in fact, every terminal -is a single token (@pxref{Tokens}). On the other hand, nonterminals are -composed of a (possibly empty) sequence of terminals and nonterminals. 
-Thus, terminals indicate the deepest level of syntax description. (In -parsing theory, terminals are the leaves of the parse tree; nonterminals -form the branches.) - -@item -@cindex start symbol -@cindex symbol, start -The first nonterminal defined in a set of productions is called the -@dfn{start symbol}. The start symbol defines the entire syntax for -that command. -@end itemize - -@node Expressions, Data Input and Output, Language, Top -@chapter Mathematical Expressions -@cindex expressions, mathematical -@cindex mathematical expressions - -Some PSPP commands use expressions, which share a common syntax -among all PSPP commands. Expressions are made up of -@dfn{operands}, which can be numbers, strings, or variable names, -separated by @dfn{operators}. There are five types of operators: -grouping, arithmetic, logical, relational, and functions. - -Every operator takes one or more @dfn{arguments} as input and produces -or @dfn{returns} exactly one result as output. Both strings and numeric -values can be used as arguments and are produced as results, but each -operator accepts only specific combinations of numeric and string values -as arguments. With few exceptions, operator arguments may be -full-fledged expressions in themselves. - -@menu -* Boolean Values:: Boolean values. -* Missing Values in Expressions:: Using missing values in expressions. -* Grouping Operators:: parentheses -* Arithmetic Operators:: add sub mul div pow -* Logical Operators:: AND NOT OR -* Relational Operators:: EQ GE GT LE LT NE -* Functions:: More-sophisticated operators. -* Order of Operations:: Operator precedence. -@end menu - -@node Boolean Values, Missing Values in Expressions, Expressions, Expressions -@section Boolean Values -@cindex Boolean -@cindex values, Boolean - -Some PSPP operators and expressions work with Boolean values, which -represent true/false conditions. Booleans have only three possible -values: 0 (false), 1 (true), and system-missing (unknown). 
-System-missing is neither true nor false and indicates that the true -value is unknown. - -Boolean-typed operands or function arguments must take on one of these -three values. Other values are considered false, but cause an error -when the expression is evaluated. - -Strings and Booleans are not compatible, and neither may be used in -place of the other. - -@node Missing Values in Expressions, Grouping Operators, Boolean Values, Expressions -@section Missing Values in Expressions - -String missing values are not treated specially in expressions. Most -numeric operators return system-missing when given system-missing -arguments. Exceptions are listed under particular operator -descriptions. - -User-missing values for numeric variables are always transformed into -the system-missing value, except inside the arguments to the -@code{VALUE} and @code{SYSMIS} functions. - -The missing-value functions can be used to precisely control how missing -values are treated in expressions. @xref{Missing Value Functions}, for -more details. - -@node Grouping Operators, Arithmetic Operators, Missing Values in Expressions, Expressions -@section Grouping Operators -@cindex parentheses -@cindex @samp{( )} -@cindex grouping operators -@cindex operators, grouping - -Parentheses (@samp{()}) are the grouping operators. Surround an -expression with parentheses to force early evaluation. - -Parentheses also surround the arguments to functions, but in that -situation they act as punctuators, not as operators. - -@node Arithmetic Operators, Logical Operators, Grouping Operators, Expressions -@section Arithmetic Operators -@cindex operators, arithmetic -@cindex arithmetic operators - -The arithmetic operators take numeric arguments and produce numeric -results. - -@table @code -@cindex @samp{+} -@cindex addition -@item @var{a} + @var{b} -Adds @var{a} and @var{b}, returning the sum. 
- -@cindex @samp{-} -@cindex subtraction -@item @var{a} - @var{b} -Subtracts @var{b} from @var{a}, returning the difference. - -@cindex @samp{*} -@cindex multiplication -@item @var{a} * @var{b} -Multiplies @var{a} and @var{b}, returning the product. - -@cindex @samp{/} -@cindex division -@item @var{a} / @var{b} -Divides @var{a} by @var{b}, returning the quotient. If @var{b} is -zero, the result is system-missing. - -@cindex @samp{**} -@cindex exponentiation -@item @var{a} ** @var{b} -Returns the result of raising @var{a} to the power @var{b}. If -@var{a} is negative and @var{b} is not an integer, the result is -system-missing. The result of @code{0**0} is system-missing as well. - -@cindex @samp{-} -@cindex negation -@item - @var{a} -Reverses the sign of @var{a}. -@end table - -@node Logical Operators, Relational Operators, Arithmetic Operators, Expressions -@section Logical Operators -@cindex logical operators -@cindex operators, logical - -@cindex true -@cindex false -@cindex Boolean -@cindex values, system-missing -@cindex system-missing -The logical operators take logical arguments and produce logical -results, meaning ``true or false''. PSPP logical operators are -not true Boolean operators because they may also result in a -system-missing value. - -@table @code -@cindex @code{AND} -@cindex @samp{&} -@cindex intersection, logical -@cindex logical intersection -@item @var{a} AND @var{b} -@itemx @var{a} & @var{b} -True if both @var{a} and @var{b} are true, false otherwise. If one -argument is false, the result is false even if the other is missing. If -both arguments are missing, the result is missing. - -@cindex @code{OR} -@cindex @samp{|} -@cindex union, logical -@cindex logical union -@item @var{a} OR @var{b} -@itemx @var{a} | @var{b} -True if at least one of @var{a} and @var{b} is true. If one argument is -true, the result is true even if the other argument is missing. If both -arguments are missing, the result is missing. 
- -@cindex @code{NOT} -@cindex @samp{~} -@cindex inversion, logical -@cindex logical inversion -@item NOT @var{a} -@itemx ~ @var{a} -True if @var{a} is false. If the argument is missing, then the result -is missing. -@end table - -@node Relational Operators, Functions, Logical Operators, Expressions -@section Relational Operators - -The relational operators take numeric or string arguments and produce Boolean -results. - -Strings cannot be compared to numbers. When strings of different -lengths are compared, the shorter string is right-padded with spaces -to match the length of the longer string. - -The results of string comparisons, other than tests for equality or -inequality, are dependent on the character set in use. String -comparisons are case-sensitive. - -@table @code -@cindex equality, testing -@cindex testing for equality -@cindex @code{EQ} -@cindex @samp{=} -@item @var{a} EQ @var{b} -@itemx @var{a} = @var{b} -True if @var{a} is equal to @var{b}. - -@cindex less than or equal to -@cindex @code{LE} -@cindex @code{<=} -@item @var{a} LE @var{b} -@itemx @var{a} <= @var{b} -True if @var{a} is less than or equal to @var{b}. - -@cindex less than -@cindex @code{LT} -@cindex @code{<} -@item @var{a} LT @var{b} -@itemx @var{a} < @var{b} -True if @var{a} is less than @var{b}. - -@cindex greater than or equal to -@cindex @code{GE} -@cindex @code{>=} -@item @var{a} GE @var{b} -@itemx @var{a} >= @var{b} -True if @var{a} is greater than or equal to @var{b}. - -@cindex greater than -@cindex @code{GT} -@cindex @samp{>} -@item @var{a} GT @var{b} -@itemx @var{a} > @var{b} -True if @var{a} is greater than @var{b}. - -@cindex inequality, testing -@cindex testing for inequality -@cindex @code{NE} -@cindex @code{~=} -@cindex @code{<>} -@item @var{a} NE @var{b} -@itemx @var{a} ~= @var{b} -@itemx @var{a} <> @var{b} -True if @var{a} is not equal to @var{b}.
-@end table - -@node Functions, Order of Operations, Relational Operators, Expressions -@section Functions -@cindex functions - -@cindex mathematics -@cindex operators -@cindex parentheses -@cindex @code{(} -@cindex @code{)} -@cindex names, of functions -PSPP functions provide mathematical abilities above and beyond -those possible using simple operators. Functions have a common -syntax: each is composed of a function name followed by a left -parenthesis, one or more arguments, and a right parenthesis. Function -names are @strong{not} reserved; their names are specially treated -only when followed by a left parenthesis: @code{EXP(10)} refers to the -constant value @code{e} raised to the 10th power, but @code{EXP} by -itself refers to the value of variable EXP. - -The sections below describe each function in detail. - -@menu -* Advanced Mathematics:: EXP LG10 LN SQRT -* Miscellaneous Mathematics:: ABS MOD MOD10 RND TRUNC -* Trigonometry:: ACOS ARCOS ARSIN ARTAN ASIN ATAN COS SIN TAN -* Missing Value Functions:: MISSING NMISS NVALID SYSMIS VALUE -* Pseudo-Random Numbers:: NORMAL UNIFORM -* Set Membership:: ANY RANGE -* Statistical Functions:: CFVAR MAX MEAN MIN SD SUM VARIANCE -* String Functions:: CONCAT INDEX LENGTH LOWER LPAD LTRIM NUMBER - RINDEX RPAD RTRIM STRING SUBSTR UPCASE -* Time & Date:: CTIME.xxx DATE.xxx TIME.xxx XDATE.xxx -* Miscellaneous Functions:: LAG YRMODA -* Functions Not Implemented:: CDF.xxx CDFNORM IDF.xxx NCDF.xxx PROBIT RV.xxx -@end menu - -@node Advanced Mathematics, Miscellaneous Mathematics, Functions, Functions -@subsection Advanced Mathematical Functions -@cindex mathematics, advanced - -Advanced mathematical functions take numeric arguments and produce -numeric results. - -@deftypefn {Function} {} EXP (@var{exponent}) -Returns @i{e} (approximately 2.71828) raised to power @var{exponent}. -@end deftypefn - -@cindex logarithms -@deftypefn {Function} {} LG10 (@var{number}) -Takes the base-10 logarithm of @var{number}. 
If @var{number} is -not positive, the result is system-missing. -@end deftypefn - -@deftypefn {Function} {} LN (@var{number}) -Takes the base-@i{e} logarithm of @var{number}. If @var{number} is -not positive, the result is system-missing. -@end deftypefn - -@cindex square roots -@deftypefn {Function} {} SQRT (@var{number}) -Takes the square root of @var{number}. If @var{number} is negative, -the result is system-missing. -@end deftypefn - -@node Miscellaneous Mathematics, Trigonometry, Advanced Mathematics, Functions -@subsection Miscellaneous Mathematical Functions -@cindex mathematics, miscellaneous - -Miscellaneous mathematical functions take numeric arguments and produce -numeric results. - -@cindex absolute value -@deftypefn {Function} {} ABS (@var{number}) -Results in the absolute value of @var{number}. -@end deftypefn - -@cindex modulus -@deftypefn {Function} {} MOD (@var{numerator}, @var{denominator}) -Returns the remainder (modulus) of @var{numerator} divided by -@var{denominator}. If @var{denominator} is 0, the result is -system-missing. However, if @var{numerator} is 0 and -@var{denominator} is system-missing, the result is 0. -@end deftypefn - -@cindex modulus, by 10 -@deftypefn {Function} {} MOD10 (@var{number}) -Returns the remainder when @var{number} is divided by 10. If -@var{number} is negative, MOD10(@var{number}) is negative or zero. -@end deftypefn - -@cindex rounding -@deftypefn {Function} {} RND (@var{number}) -Takes the absolute value of @var{number} and rounds it to an integer. -Then, if @var{number} was negative originally, negates the result. -@end deftypefn - -@cindex truncation -@deftypefn {Function} {} TRUNC (@var{number}) -Discards the fractional part of @var{number}; that is, rounds -@var{number} towards zero. 
-@end deftypefn - -@node Trigonometry, Missing Value Functions, Miscellaneous Mathematics, Functions -@subsection Trigonometric Functions -@cindex trigonometry - -Trigonometric functions take numeric arguments and produce numeric -results. - -@cindex arccosine -@cindex inverse cosine -@deftypefn {Function} {} ARCOS (@var{number}) -Takes the arccosine, in radians, of @var{number}. Results in -system-missing if @var{number} is not between -1 and 1. -@end deftypefn - -@cindex arcsine -@cindex inverse sine -@deftypefn {Function} {} ARSIN (@var{number}) -Takes the arcsine, in radians, of @var{number}. Results in -system-missing if @var{number} is not between -1 and 1 inclusive. -@end deftypefn - -@cindex arctangent -@cindex inverse tangent -@deftypefn {Function} {} ARTAN (@var{number}) -Takes the arctangent, in radians, of @var{number}. -@end deftypefn - -@cindex cosine -@deftypefn {Function} {} COS (@var{angle}) -Takes the cosine of @var{angle} which should be in radians. -@end deftypefn - -@cindex sine -@deftypefn {Function} {} SIN (@var{angle}) -Takes the sine of @var{angle} which should be in radians. -@end deftypefn - -@cindex tangent -@deftypefn {Function} {} TAN (@var{angle}) -Takes the tangent of @var{angle} which should be in radians. -Results in system-missing at values -of @var{angle} that are too close to odd multiples of pi/2. -Portability: none. -@end deftypefn - -@node Missing Value Functions, Pseudo-Random Numbers, Trigonometry, Functions -@subsection Missing-Value Functions -@cindex missing values -@cindex values, missing -@cindex functions, missing-value - -Missing-value functions take various numeric arguments and yield -various types of results. Note that the normal rules of evaluation -apply within expression arguments to these functions. In particular, -user-missing values for numeric variables are converted to -system-missing values. 
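The conversion rule just described can be sketched in Python. This is only an illustrative model, not PSPP code: `None` stands in for the system-missing value, and the names `value_in_expression` and `add` are hypothetical helpers introduced for this example.

```python
# Illustrative model only: None plays the role of PSPP's system-missing value.
SYSMIS = None


def value_in_expression(raw, user_missing=()):
    """Return the value an expression sees for a numeric variable.

    User-missing values are converted to system-missing, mirroring the
    rule stated in the text above.
    """
    if raw is SYSMIS or raw in user_missing:
        return SYSMIS
    return raw


def add(a, b):
    """Model of a typical numeric operator: system-missing propagates."""
    if a is SYSMIS or b is SYSMIS:
        return SYSMIS
    return a + b


# A variable that declares 99 as a user-missing value.
seen = value_in_expression(99, user_missing={99})
total = add(seen, 1)
```

Under this model the raw value 99 never reaches the operator: the expression sees system-missing, so the sum is system-missing too.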
- -@deftypefn {Function} {} MISSING (@var{expr}) -Returns 1 if @var{expr} has the system-missing value, 0 otherwise. -@end deftypefn - -@deftypefn {Function} {} NMISS (@var{expr} [, @var{expr}]@dots{}) -Each argument must be a numeric expression. Returns the number of -system-missing values in the list. As a special extension, -the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a -range of variables; see @ref{Sets of Variables}, for more details. -@end deftypefn - -@deftypefn {Function} {} NVALID (@var{expr} [, @var{expr}]@dots{}) -Each argument must be a numeric expression. Returns the number of -values in the list that are not system-missing. As a special extension, -the syntax @code{@var{var1} TO @var{var2}} may be used to refer to a -range of variables; see @ref{Sets of Variables}, for more details. -@end deftypefn - -@deftypefn {Function} {} SYSMIS (@var{expr}) -When @var{expr} is simply the name of a numeric variable, returns 1 if -the variable has the system-missing value, 0 if it is user-missing or -not missing. If given @var{expr} takes another form, results in 1 if -the value is system-missing, 0 otherwise. -@end deftypefn - -@deftypefn {Function} {} VALUE (@var{variable}) -Prevents the user-missing values of @var{variable} from being -transformed into system-missing values, and always results in the -actual value of @var{variable}, whether it is user-missing, -system-missing or not missing at all. -@end deftypefn - -@node Pseudo-Random Numbers, Set Membership, Missing Value Functions, Functions -@subsection Pseudo-Random Number Generation Functions -@cindex random numbers -@cindex pseudo-random numbers (see random numbers) - -Pseudo-random number generation functions take numeric arguments and -produce numeric results. - -PSPP uses the alleged RC4 cipher as a pseudo-random number generator -(PRNG). 
The bytes output by this PRNG are system-independent for a -given random seed, but differences in endianness and floating-point -formats will make PRNG results differ from system to system. RC4 -should produce high-quality random numbers for simulation purposes. -(If you're concerned about the quality of the random number generator, -well, you're using a statistical processing package---analyze it!) - -PSPP's implementation of RC4 has not undergone any security auditing. -Furthermore, various precautions that would be necessary for secure -operation, such as secure seeding and discarding the first several -bytes of output, have not been taken. Therefore, PSPP's -implementation of RC4 should not be used for security purposes. - -@cindex random numbers, normally-distributed -@deftypefn {Function} {} NORMAL (@var{number}) -Results in a random number. Results from @code{NORMAL} are normally -distributed with a mean of 0 and a standard deviation of @var{number}. -@end deftypefn - -@cindex random numbers, uniformly-distributed -@deftypefn {Function} {} UNIFORM (@var{number}) -Results in a random number between 0 and @var{number}. Results from -@code{UNIFORM} are evenly distributed across its entire range. There -may be a maximum on the largest random number ever generated---this is -often -@ifinfo -2**31-1 -@end ifinfo -@tex -$2^{31}-1$ -@end tex -(2,147,483,647), but it may be orders of magnitude -higher or lower. -@end deftypefn - -@node Set Membership, Statistical Functions, Pseudo-Random Numbers, Functions -@subsection Set-Membership Functions -@cindex set membership -@cindex membership, of set - -Set membership functions determine whether a value is a member of a set. -They take a set of numeric arguments or a set of string arguments, and -produce Boolean results. - -String comparisons are performed according to the rules given in -@ref{Relational Operators}. 
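The string comparison rule referenced above (the shorter string is right-padded with spaces, and comparison is case-sensitive) can be sketched in Python. This is a hedged model, not PSPP's implementation; `pspp_str_equal` and `is_member` are hypothetical names used only for illustration.

```python
def pspp_str_equal(a: str, b: str) -> bool:
    """Compare two strings after right-padding the shorter one with spaces."""
    width = max(len(a), len(b))
    return a.ljust(width) == b.ljust(width)


def is_member(value: str, *candidates: str) -> bool:
    """Model of a string set-membership test built on the padded comparison."""
    return any(pspp_str_equal(value, c) for c in candidates)


trailing_ok = pspp_str_equal("abc", "abc  ")   # trailing spaces are ignored
case_differs = pspp_str_equal("abc", "ABC")    # comparison is case-sensitive
found = is_member("b ", "a", "b")
leading_space = is_member(" b", "a", "b")      # leading spaces are significant
```

Only trailing spaces are neutralized by the padding; a leading space still makes two strings differ.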
- -@deftypefn {Function} {} ANY (@var{value}, @var{set} [, @var{set}]@dots{}) -Results in true if @var{value} is equal to any of the @var{set} -values. Otherwise, results in false. If @var{value} is -system-missing, returns system-missing. System-missing values in -@var{set} do not cause ANY to return system-missing. -@end deftypefn - -@deftypefn {Function} {} RANGE (@var{value}, @var{low}, @var{high} [, @var{low}, @var{high}]@dots{}) -Results in true if @var{value} is in any of the intervals bounded by -@var{low} and @var{high} inclusive. Otherwise, results in false. -Each @var{low} must be less than or equal to its corresponding -@var{high} value. @var{low} and @var{high} must be given in pairs. -If @var{value} is system-missing, returns system-missing. -System-missing values in @var{low} or @var{high} do not cause RANGE to return -system-missing. -@end deftypefn - -@node Statistical Functions, String Functions, Set Membership, Functions -@subsection Statistical Functions -@cindex functions, statistical -@cindex statistics - -Statistical functions compute descriptive statistics on a list of -values. Some statistics can be computed on numeric or string values; -others can only be computed on numeric values. Their results have the -same type as their arguments. The current case's weighting factor -(@pxref{WEIGHT}) has no effect on statistical functions. - -@cindex arguments, minimum valid -@cindex minimum valid number of arguments -With statistical functions it is possible to specify a minimum number of -non-missing arguments for the function to be evaluated. To do so, -append a dot and the number to the function name. For instance, to -specify a minimum of three valid arguments to the MEAN function, use the -name @code{MEAN.3}. - -@cindex coefficient of variation -@cindex variation, coefficient of -@deftypefn {Function} {} CFVAR (@var{number}, @var{number}[, @dots{}]) -Results in the coefficient of variation of the values of @var{number}.
-This function requires at least two valid arguments to give a -non-missing result. (The coefficient of variation is the standard -deviation divided by the mean.) -@end deftypefn - -@cindex maximum -@deftypefn {Function} {} MAX (@var{value}, @var{value}[, @dots{}]) -Results in the value of the greatest @var{value}. The @var{value}s may -be numeric or string. Although at least two arguments must be given, -only one need be valid for MAX to give a non-missing result. -@end deftypefn - -@cindex mean -@deftypefn {Function} {} MEAN (@var{number}, @var{number}[, @dots{}]) -Results in the mean of the values of @var{number}. Although at least -two arguments must be given, only one need be valid for MEAN to give a -non-missing result. -@end deftypefn - -@cindex minimum -@deftypefn {Function} {} MIN (@var{value}, @var{value}[, @dots{}]) -Results in the value of the least @var{value}. The @var{value}s may -be numeric or string. Although at least two arguments must be given, -only one need be valid for MIN to give a non-missing result. -@end deftypefn - -@cindex standard deviation -@cindex deviation, standard -@deftypefn {Function} {} SD (@var{number}, @var{number}[, @dots{}]) -Results in the standard deviation of the values of @var{number}. -This function requires at least two valid arguments to give a -non-missing result. -@end deftypefn - -@cindex sum -@deftypefn {Function} {} SUM (@var{number}, @var{number}[, @dots{}]) -Results in the sum of the values of @var{number}. Although at least two -arguments must be given, only one need be valid for SUM to give a -non-missing result. -@end deftypefn - -@cindex variance -@deftypefn {Function} {} VARIANCE (@var{number}, @var{number}[, @dots{}]) -Results in the variance of the values of @var{number}. This function -requires at least two valid arguments to give a non-missing result.
-@end deftypefn - -@node String Functions, Time & Date, Statistical Functions, Functions -@subsection String Functions -@cindex functions, string -@cindex string functions - -String functions take various arguments and return various results. - -@cindex concatenation -@cindex strings, concatenation of -@deftypefn {Function} {} CONCAT (@var{string}, @var{string}[, @dots{}]) -Returns a string consisting of each @var{string} in sequence. -@code{CONCAT("abc", "def", "ghi")} has a value of @code{"abcdefghi"}. -The resultant string is truncated to a maximum of 255 characters. -@end deftypefn - -@cindex searching strings -@deftypefn {Function} {} INDEX (@var{haystack}, @var{needle}) -Returns a positive integer indicating the position of the first -occurrence of @var{needle} in @var{haystack}. Returns 0 if @var{haystack} -does not contain @var{needle}. Returns system-missing if @var{needle} -is an empty string. -@end deftypefn - -@deftypefn {Function} {} INDEX (@var{haystack}, @var{needle}, @var{divisor}) -Divides @var{needle} into parts, each with length @var{divisor}. -Searches @var{haystack} for the first occurrence of each part, and -returns the smallest value. Returns 0 if @var{haystack} does not -contain any part in @var{needle}. It is an error if @var{divisor} -cannot be evenly divided into the length of @var{needle}. Returns -system-missing if @var{needle} is an empty string. -@end deftypefn - -@cindex strings, finding length of -@deftypefn {Function} {} LENGTH (@var{string}) -Returns the number of characters in @var{string}. -@end deftypefn - -@cindex strings, case of -@deftypefn {Function} {} LOWER (@var{string}) -Returns a string identical to @var{string} except that all uppercase -letters are changed to lowercase letters. The definitions of -``uppercase'' and ``lowercase'' are system-dependent.
-@end deftypefn - -@cindex strings, padding -@deftypefn {Function} {} LPAD (@var{string}, @var{length}) -If @var{string} is at least @var{length} characters in length, returns -@var{string} unchanged. Otherwise, returns @var{string} padded with -spaces on the left side to length @var{length}. Returns an empty string -if @var{length} is system-missing, negative, or greater than 255. -@end deftypefn - -@deftypefn {Function} {} LPAD (@var{string}, @var{length}, @var{padding}) -If @var{string} is at least @var{length} characters in length, returns -@var{string} unchanged. Otherwise, returns @var{string} padded with -@var{padding} on the left side to length @var{length}. Returns an empty -string if @var{length} is system-missing, negative, or greater than 255, or -if @var{padding} does not contain exactly one character. -@end deftypefn - -@cindex strings, trimming -@cindex whitespace, trimming -@deftypefn {Function} {} LTRIM (@var{string}) -Returns @var{string}, after removing leading spaces. Other whitespace, -such as tabs, carriage returns, line feeds, and vertical tabs, is not -removed. -@end deftypefn - -@deftypefn {Function} {} LTRIM (@var{string}, @var{padding}) -Returns @var{string}, after removing leading @var{padding} characters. -If @var{padding} does not contain exactly one character, returns an -empty string. -@end deftypefn - -@cindex numbers, converting from strings -@cindex strings, converting to numbers -@deftypefn {Function} {} NUMBER (@var{string}, @var{format}) -Returns the number produced when @var{string} is interpreted according -to format specifier @var{format}. If the format width @var{w} is less -than the length of @var{string}, then only the first @var{w} -characters in @var{string} are used, e.g.@: @code{NUMBER("123", F3.0)} -and @code{NUMBER("1234", F3.0)} both have value 123. If @var{w} is -greater than @var{string}'s length, then it is treated as if it were -right-padded with spaces. 
If @var{string} is not in the correct -format for @var{format}, system-missing is returned. -@end deftypefn - -@cindex strings, searching backwards -@deftypefn {Function} {} RINDEX (@var{haystack}, @var{needle}) -Returns a positive integer indicating the position of the last -occurrence of @var{needle} in @var{haystack}. Returns 0 if -@var{haystack} does not contain @var{needle}. Returns system-missing if -@var{needle} is an empty string. -@end deftypefn - -@deftypefn {Function} {} RINDEX (@var{haystack}, @var{needle}, @var{divisor}) -Divides @var{needle} into parts, each with length @var{divisor}. -Searches @var{haystack} for the last occurrence of each part, and -returns the largest value. Returns 0 if @var{haystack} does not contain -any part in @var{needle}. It is an error if @var{divisor} cannot be -evenly divided into the length of @var{needle}. Returns system-missing -if @var{needle} is an empty string. -@end deftypefn - -@cindex padding strings -@cindex strings, padding -@deftypefn {Function} {} RPAD (@var{string}, @var{length}) -If @var{string} is at least @var{length} characters in length, returns -@var{string} unchanged. Otherwise, returns @var{string} padded with -spaces on the right to length @var{length}. Returns an empty string if -@var{length} is system-missing, negative, or greater than 255. -@end deftypefn - -@deftypefn {Function} {} RPAD (@var{string}, @var{length}, @var{padding}) -If @var{string} is at least @var{length} characters in length, returns -@var{string} unchanged. Otherwise, returns @var{string} padded with -@var{padding} on the right to length @var{length}. Returns an empty -string if @var{length} is system-missing, negative, or greater than 255, -or if @var{padding} does not contain exactly one character. -@end deftypefn - -@cindex strings, trimming -@cindex whitespace, trimming -@deftypefn {Function} {} RTRIM (@var{string}) -Returns @var{string}, after removing trailing spaces. Other types of -whitespace are not removed.
-@end deftypefn - -@deftypefn {Function} {} RTRIM (@var{string}, @var{padding}) -Returns @var{string}, after removing trailing @var{padding} characters. -If @var{padding} does not contain exactly one character, returns an -empty string. -@end deftypefn - -@cindex strings, converting from numbers -@cindex numbers, converting to strings -@deftypefn {Function} {} STRING (@var{number}, @var{format}) -Returns a string corresponding to @var{number} in the format given by -format specifier @var{format}. For example, @code{STRING(123.56, F5.1)} -has the value @code{"123.6"}. -@end deftypefn - -@cindex substrings -@cindex strings, taking substrings of -@deftypefn {Function} {} SUBSTR (@var{string}, @var{start}) -Returns a string consisting of the value of @var{string} from position -@var{start} onward. Returns an empty string if @var{start} is system-missing -or has a value less than 1 or greater than the number of characters in -@var{string}. -@end deftypefn - -@deftypefn {Function} {} SUBSTR (@var{string}, @var{start}, @var{count}) -Returns a string consisting of the first @var{count} characters from -@var{string} beginning at position @var{start}. Returns an empty string -if @var{start} or @var{count} is system-missing, if @var{start} is less -than 1 or greater than the number of characters in @var{string}, or if -@var{count} is less than 1. Returns a string shorter than @var{count} -characters if @var{start} + @var{count} - 1 is greater than the number -of characters in @var{string}. Examples: @code{SUBSTR("abcdefg", 3, 2)} -has value @code{"cd"}; @code{SUBSTR("Ben Pfaff", 5, 10)} has the value -@code{"Pfaff"}. -@end deftypefn - -@cindex case conversion -@cindex strings, case of -@deftypefn {Function} {} UPCASE (@var{string}) -Returns @var{string}, changing lowercase letters to uppercase letters. 
-@end deftypefn - -@node Time & Date, Miscellaneous Functions, String Functions, Functions -@subsection Time & Date Functions -@cindex functions, time & date -@cindex times -@cindex dates - -@cindex dates, legal range of -The legal range of dates for use in PSPP is 15 Oct 1582 -through 31 Dec 19999. - -@cindex arguments, invalid -@cindex invalid arguments -@quotation -@strong{Please note:} Most time & date extraction functions will accept -invalid arguments: - -@itemize @bullet -@item -Negative numbers in PSPP time format. -@item -Numbers less than 86,400 in PSPP date format. -@end itemize - -However, sensible results are not guaranteed for these invalid values. -The given equivalents for these functions are definitely not guaranteed -for invalid values. -@end quotation - -@quotation -@strong{Please note also:} The time & date construction -functions @strong{do} produce reasonable and useful results for -out-of-range values; these are not considered invalid. -@end quotation - -@menu -* Time & Date Concepts:: How times & dates are defined and represented -* Time Construction:: TIME.@{DAYS HMS@} -* Time Extraction:: CTIME.@{DAYS HOURS MINUTES SECONDS@} -* Date Construction:: DATE.@{DMY MDY MOYR QYR WKYR YRDAY@} -* Date Extraction:: XDATE.@{DATE HOUR JDAY MDAY MINUTE MONTH - QUARTER SECOND TDAY TIME WEEK - WKDAY YEAR@} -@end menu - -@node Time & Date Concepts, Time Construction, Time & Date, Time & Date -@subsubsection How times & dates are defined and represented - -@cindex time, concepts -@cindex time, intervals -Times and dates are handled by PSPP as single numbers. A -@dfn{time} is an interval. PSPP measures times in seconds. 
-Thus, the following intervals correspond with the numeric values given: - -@example - 10 minutes 600 - 1 hour 3,600 - 1 day, 3 hours, 10 seconds 97,210 - 40 days 3,456,000 - 10010 d, 14 min, 24 s 864,864,864 -@end example - -@cindex dates, concepts -@cindex time, instants of -A @dfn{date}, on the other hand, is a particular instant in the past or -the future. PSPP represents a date as a number of seconds after the -midnight that separated 8 Oct 1582 and 9 Oct 1582. (Please note that 15 -Oct 1582 immediately followed 9 Oct 1582.) Thus, the midnights before -the dates given below correspond with the numeric PSPP dates given: - -@example - 15 Oct 1582 86,400 - 4 Jul 1776 6,113,318,400 - 1 Jan 1900 10,010,390,400 - 1 Oct 1978 12,495,427,200 - 24 Aug 1995 13,028,601,600 -@end example - -@cindex time, mathematical properties of -@cindex mathematics, applied to times & dates -@cindex dates, mathematical properties of -@noindent -Please note: - -@itemize @bullet -@item -A time may be added to, or subtracted from, a date, resulting in a date. - -@item -The difference of two dates may be taken, resulting in a time. - -@item -Two times may be added to, or subtracted from, each other, resulting in -a time. -@end itemize - -(Adding two dates does not produce a useful result.) - -Since times and dates are merely numbers, the ordinary addition and -subtraction operators are employed for these purposes. - -@quotation -@strong{Please note:} Many dates and times have extremely large -values---just look at the values above. Thus, it is not a good idea to -take powers of these values; also, the accuracy of some procedures may -be affected. If necessary, convert times or dates in seconds to some -other unit, like days or years, before performing analysis. 
-@end quotation - -@node Time Construction, Time Extraction, Time & Date Concepts, Time & Date -@subsubsection Functions that Produce Times -@cindex times, constructing -@cindex constructing times - -These functions take numeric arguments and produce numeric results in -PSPP time format. - -@cindex days -@cindex time, in days -@deftypefn {Function} {} TIME.DAYS (@var{ndays}) -Results in a time value corresponding to @var{ndays} days. -(@code{TIME.DAYS(@var{x})} is equivalent to @code{@var{x} * 60 * 60 * -24}.) -@end deftypefn - -@cindex hours-minutes-seconds -@cindex time, in hours-minutes-seconds -@deftypefn {Function} {} TIME.HMS (@var{nhours}, @var{nmins}, @var{nsecs}) -Results in a time value corresponding to @var{nhours} hours, @var{nmins} -minutes, and @var{nsecs} seconds. (@code{TIME.HMS(@var{h}, @var{m}, -@var{s})} is equivalent to @code{@var{h}*60*60 + @var{m}*60 + -@var{s}}.) -@end deftypefn - -@node Time Extraction, Date Construction, Time Construction, Time & Date -@subsubsection Functions that Examine Times -@cindex extraction, of time -@cindex time examination -@cindex examination, of times -@cindex time, lengths of - -These functions take numeric arguments in PSPP time format and -give numeric results. - -@cindex days -@cindex time, in days -@deftypefn {Function} {} CTIME.DAYS (@var{time}) -Results in the number of days and fractional days in @var{time}. -(@code{CTIME.DAYS(@var{x})} is equivalent to @code{@var{x}/60/60/24}.) -@end deftypefn - -@cindex hours -@cindex time, in hours -@deftypefn {Function} {} CTIME.HOURS (@var{time}) -Results in the number of hours and fractional hours in @var{time}. -(@code{CTIME.HOURS(@var{x})} is equivalent to @code{@var{x}/60/60}.) -@end deftypefn - -@cindex minutes -@cindex time, in minutes -@deftypefn {Function} {} CTIME.MINUTES (@var{time}) -Results in the number of minutes and fractional minutes in @var{time}. -(@code{CTIME.MINUTES(@var{x})} is equivalent to @code{@var{x}/60}.) 
-@end deftypefn - -@cindex seconds -@cindex time, in seconds -@deftypefn {Function} {} CTIME.SECONDS (@var{time}) -Results in the number of seconds and fractional seconds in @var{time}. -(@code{CTIME.SECONDS} does nothing; @code{CTIME.SECONDS(@var{x})} is -equivalent to @code{@var{x}}.) -@end deftypefn - -@node Date Construction, Date Extraction, Time Extraction, Time & Date -@subsubsection Functions that Produce Dates -@cindex dates, constructing -@cindex constructing dates - -@cindex arguments, of date construction functions -These functions take numeric arguments and give numeric results in the -PSPP date format. Arguments taken by these functions are: - -@table @var -@item day -Refers to a day of the month between 1 and 31. - -@item month -Refers to a month of the year between 1 and 12. - -@item quarter -Refers to a quarter of the year between 1 and 4. The quarters of the -year begin on the first days of months 1, 4, 7, and 10. - -@item week -Refers to a week of the year between 1 and 53. - -@item yday -Refers to a day of the year between 1 and 366. - -@item year -Refers to a year between 1582 and 19999. -@end table - -@cindex arguments, invalid -If these functions' arguments are out-of-range, they are correctly -normalized before conversion to date format. Non-integers are rounded -toward zero. - -@cindex day-month-year -@cindex dates, day-month-year -@deftypefn {Function} {} DATE.DMY (@var{day}, @var{month}, @var{year}) -@deftypefnx {Function} {} DATE.MDY (@var{month}, @var{day}, @var{year}) -Results in a date value corresponding to the midnight before day -@var{day} of month @var{month} of year @var{year}. -@end deftypefn - -@cindex month-year -@cindex dates, month-year -@deftypefn {Function} {} DATE.MOYR (@var{month}, @var{year}) -Results in a date value corresponding to the midnight before the first -day of month @var{month} of year @var{year}. 
-@end deftypefn - -@cindex quarter-year -@cindex dates, quarter-year -@deftypefn {Function} {} DATE.QYR (@var{quarter}, @var{year}) -Results in a date value corresponding to the midnight before the first -day of quarter @var{quarter} of year @var{year}. -@end deftypefn - -@cindex week-year -@cindex dates, week-year -@deftypefn {Function} {} DATE.WKYR (@var{week}, @var{year}) -Results in a date value corresponding to the midnight before the first -day of week @var{week} of year @var{year}. -@end deftypefn - -@cindex year-day -@cindex dates, year-day -@deftypefn {Function} {} DATE.YRDAY (@var{year}, @var{yday}) -Results in a date value corresponding to the midnight before day -@var{yday} of year @var{year}. -@end deftypefn - -@node Date Extraction, , Date Construction, Time & Date -@subsubsection Functions that Examine Dates -@cindex extraction, of dates -@cindex date examination - -@cindex arguments, of date extraction functions -These functions take numeric arguments in PSPP date or time -format and give numeric results. These names are used for arguments: - -@table @var -@item date -A numeric value in PSPP date format. - -@item time -A numeric value in PSPP time format. - -@item time-or-date -A numeric value in PSPP time or date format. -@end table - -@cindex days -@cindex dates, in days -@cindex time, in days -@deftypefn {Function} {} XDATE.DATE (@var{time-or-date}) -For a time, results in the time corresponding to the number of whole -days @var{time-or-date} includes. For a date, results in the date -corresponding to the latest midnight at or before @var{time-or-date}; -that is, gives the date that @var{time-or-date} is in. -(XDATE.DATE(@var{x}) is equivalent to TRUNC(@var{x}/86400)*86400.) -Applying this function to a time is a non-portable feature.
-@end deftypefn - -@cindex hours -@cindex dates, in hours -@cindex time, in hours -@deftypefn {Function} {} XDATE.HOUR (@var{time-or-date}) -For a time, results in the number of whole hours beyond the number of -whole days represented by @var{time-or-date}. For a date, results in -the hour (as an integer between 0 and 23) corresponding to -@var{time-or-date}. (XDATE.HOUR(@var{x}) is equivalent to -MOD(TRUNC(@var{x}/3600),24).) Applying this function to a time is a -non-portable feature. -@end deftypefn - -@cindex day of the year -@cindex dates, day of the year -@deftypefn {Function} {} XDATE.JDAY (@var{date}) -Results in the day of the year (as an integer between 1 and 366) -corresponding to @var{date}. -@end deftypefn - -@cindex day of the month -@cindex dates, day of the month -@deftypefn {Function} {} XDATE.MDAY (@var{date}) -Results in the day of the month (as an integer between 1 and 31) -corresponding to @var{date}. -@end deftypefn - -@cindex minutes -@cindex dates, in minutes -@cindex time, in minutes -@deftypefn {Function} {} XDATE.MINUTE (@var{time-or-date}) -Results in the number of minutes (as an integer between 0 and 59) after -the last hour in @var{time-or-date}. (XDATE.MINUTE(@var{x}) is -equivalent to MOD(TRUNC(@var{x}/60),60).) Applying this function to a -time is a non-portable feature. -@end deftypefn - -@cindex months -@cindex dates, in months -@deftypefn {Function} {} XDATE.MONTH (@var{date}) -Results in the month of the year (as an integer between 1 and 12) -corresponding to @var{date}. -@end deftypefn - -@cindex quarters -@cindex dates, in quarters -@deftypefn {Function} {} XDATE.QUARTER (@var{date}) -Results in the quarter of the year (as an integer between 1 and 4) -corresponding to @var{date}.
-@end deftypefn - -@cindex seconds -@cindex dates, in seconds -@cindex time, in seconds -@deftypefn {Function} {} XDATE.SECOND (@var{time-or-date}) -Results in the number of whole seconds after the last whole minute (as -an integer between 0 and 59) in @var{time-or-date}. -(XDATE.SECOND(@var{x}) is equivalent to MOD(@var{x}, 60).) Applying -this function to a time is a non-portable feature. -@end deftypefn - -@cindex days -@cindex times, in days -@deftypefn {Function} {} XDATE.TDAY (@var{time}) -Results in the number of whole days (as an integer) in @var{time}. -(XDATE.TDAY(@var{x}) is equivalent to TRUNC(@var{x}/86400).) -@end deftypefn - -@cindex time -@cindex dates, time of day -@deftypefn {Function} {} XDATE.TIME (@var{date}) -Results in the time of day at the instant corresponding to @var{date}, -in PSPP time format. This is the number of seconds since -midnight on the day corresponding to @var{date}. (XDATE.TIME(@var{x}) is -equivalent to MOD(@var{x}, 86400).) -@end deftypefn - -@cindex week -@cindex dates, in weeks -@deftypefn {Function} {} XDATE.WEEK (@var{date}) -Results in the week of the year (as an integer between 1 and 53) -corresponding to @var{date}. -@end deftypefn - -@cindex day of the week -@cindex weekday -@cindex dates, day of the week -@cindex dates, in weekdays -@deftypefn {Function} {} XDATE.WKDAY (@var{date}) -Results in the day of week (as an integer between 1 and 7) corresponding -to @var{date}. The days of the week are: - -@table @asis -@item 1 -Sunday -@item 2 -Monday -@item 3 -Tuesday -@item 4 -Wednesday -@item 5 -Thursday -@item 6 -Friday -@item 7 -Saturday -@end table -@end deftypefn - -@cindex years -@cindex dates, in years -@deftypefn {Function} {} XDATE.YEAR (@var{date}) -Returns the year (as an integer between 1582 and 19999) corresponding to -@var{date}.
-@end deftypefn - -@node Miscellaneous Functions, Functions Not Implemented, Time & Date, Functions -@subsection Miscellaneous Functions -@cindex functions, miscellaneous - -Miscellaneous functions take various arguments and produce various -results. - -@cindex cross-case function -@cindex function, cross-case -@deftypefn {Function} {} LAG (@var{variable}) -@anchor{LAG} -@var{variable} must be a numeric or string variable name. @code{LAG} -results in the value of that variable for the case before the current -one. In case-selection procedures, @code{LAG} results in the value of -the variable for the last case selected. Results in system-missing (for -numeric variables) or blanks (for string variables) for the first case -or before any cases are selected. -@end deftypefn - -@deftypefn {Function} {} LAG (@var{variable}, @var{ncases}) -@var{variable} must be a numeric or string variable name. @var{ncases} -must be a small positive constant integer, although there is no explicit -limit. (Use of a large value for @var{ncases} will increase memory -consumption, since PSPP must keep @var{ncases} cases in memory.) -@code{LAG (@var{variable}, @var{ncases})} results in the value of -@var{variable} that is @var{ncases} before the case currently being -processed. See @code{LAG (@var{variable})} above for more details. -@end deftypefn - -@cindex date, Julian -@cindex Julian date -@deftypefn {Function} {} YRMODA (@var{year}, @var{month}, @var{day}) -@var{year} is a year between 0 and 199 or 1582 and 19999. @var{month} is -a month between 1 and 12. @var{day} is a day between 1 and 31. If -@var{month} or @var{day} is out-of-range, it changes the next higher -unit. For instance, a @var{day} of 0 refers to the last day of the -previous month, and a @var{month} of 13 refers to the first month of the -next year. @var{year} must be in range. If @var{year} is between 0 and -199, 1900 is added. @var{year}, @var{month}, and @var{day} must all be -integers.
- -@code{YRMODA} results in the number of days between 15 Oct 1582 and -the date specified, plus one. The date passed to @code{YRMODA} must be -on or after 15 Oct 1582. 15 Oct 1582 has a value of 1. -@end deftypefn - -@node Functions Not Implemented, , Miscellaneous Functions, Functions -@subsection Functions Not Implemented -@cindex functions, not implemented -@cindex not implemented -@cindex features, not implemented - -These functions are not yet implemented and thus not yet documented, -since it's a hassle. - -@findex CDF.xxx -@findex CDFNORM -@findex IDF.xxx -@findex NCDF.xxx -@findex PROBIT -@findex RV.xxx - -@itemize @bullet -@item -@code{CDF.xxx} -@item -@code{CDFNORM} -@item -@code{IDF.xxx} -@item -@code{NCDF.xxx} -@item -@code{PROBIT} -@item -@code{RV.xxx} -@end itemize - -@node Order of Operations, , Functions, Expressions -@section Operator Precedence -@cindex operator precedence -@cindex precedence, operator -@cindex order of operations -@cindex operations, order of - -The following table describes operator precedence. Smaller-numbered -levels in the table have higher precedence. Within a level, operations -are performed from left to right, except for level 2 (exponentiation), -where operations are performed from right to left. If an operator -appears in the table in two places (@code{-}), the first occurrence is -unary, the second is binary. - -@enumerate -@item -@code{( )} -@item -@code{**} -@item -@code{-} -@item -@code{* /} -@item -@code{+ -} -@item -@code{EQ GE GT LE LT NE} -@item -@code{AND NOT OR} -@end enumerate - -@node Data Input and Output, System and Portable Files, Expressions, Top -@chapter Data Input and Output -@cindex input -@cindex output -@cindex data -@cindex cases -@cindex observations - -Data are the focus of the PSPP language. -Each datum belongs to a @dfn{case} (also called an @dfn{observation}). -Each case represents an individual or `experimental unit'. 
-For example, in the results of a survey, the names of the respondents, -their sex, age @i{etc}. and their responses are all data and the data -pertaining to a single respondent is a case. -This chapter examines -the PSPP commands for defining variables and reading and writing data. - -@quotation -@strong{Please note:} Data is not actually read until a procedure is -executed. These commands tell PSPP how to read data, but they -do not @emph{cause} PSPP to read data. -@end quotation - -@menu -* BEGIN DATA:: Embed data within a syntax file. -* CLEAR TRANSFORMATIONS:: Clear pending transformations. -* DATA LIST:: Fundamental data reading command. -* END CASE:: Output the current case. -* END FILE:: Terminate the current input program. -* FILE HANDLE:: Support for fixed-length records. -* INPUT PROGRAM:: Support for complex input programs. -* LIST:: List cases in the active file. -* MATRIX DATA:: Read matrices in text format. -* NEW FILE:: Clear the active file and dictionary. -* PRINT:: Display values in print formats. -* PRINT EJECT:: Eject the current page then print. -* PRINT SPACE:: Print blank lines. -* REREAD:: Take another look at the previous input line. -* REPEATING DATA:: Multiple cases on a single line. -* WRITE:: Display values in write formats. -@end menu - -@node BEGIN DATA, CLEAR TRANSFORMATIONS, Data Input and Output, Data Input and Output -@section BEGIN DATA -@vindex BEGIN DATA -@vindex END DATA -@cindex Embedding data in syntax files -@cindex Data, embedding in syntax files - -@display -BEGIN DATA. -@dots{} -END DATA. -@end display - -@cmd{BEGIN DATA} and @cmd{END DATA} can be used to embed raw ASCII -data in a PSPP syntax file. @cmd{DATA LIST} or another input -procedure must be used before @cmd{BEGIN DATA} (@pxref{DATA LIST}). -@cmd{BEGIN DATA} and @cmd{END DATA} must be used together.
@cmd{END -DATA} must appear by itself on a single line, with no leading -whitespace and exactly one space between the words @code{END} and -@code{DATA}, followed immediately by the terminal dot, like this: - -@example -END DATA. -@end example - -@node CLEAR TRANSFORMATIONS, DATA LIST, BEGIN DATA, Data Input and Output -@section CLEAR TRANSFORMATIONS -@vindex CLEAR TRANSFORMATIONS - -@display -CLEAR TRANSFORMATIONS. -@end display - -@cmd{CLEAR TRANSFORMATIONS} clears out all pending -transformations. It does not cancel the current input program. It is -valid only when PSPP is interactive, not in syntax files. - -@node DATA LIST, END CASE, CLEAR TRANSFORMATIONS, Data Input and Output -@section DATA LIST -@vindex DATA LIST -@cindex reading data from a file -@cindex data, reading from a file -@cindex data, embedding in syntax files -@cindex embedding data in syntax files - -Used to read text or binary data, @cmd{DATA LIST} is the most -fundamental data-reading command. Even the more sophisticated input -methods use @cmd{DATA LIST} commands as a building block. -Understanding @cmd{DATA LIST} is important to understanding how to use -PSPP to read your data files. - -There are two major variants of @cmd{DATA LIST}, which are fixed -format and free format. In addition, free format has a minor variant, -list format, which is discussed in terms of its differences from vanilla -free format. - -Each form of @cmd{DATA LIST} is described in detail below. - -@menu -* DATA LIST FIXED:: Fixed columnar locations for data. -* DATA LIST FREE:: Any spacing you like. -* DATA LIST LIST:: Each case must be on a single line. 
-@end menu - -@node DATA LIST FIXED, DATA LIST FREE, DATA LIST, DATA LIST -@subsection DATA LIST FIXED -@vindex DATA LIST FIXED -@cindex reading fixed-format data -@cindex fixed-format data, reading -@cindex data, fixed-format, reading -@cindex embedding fixed-format data - -@display -DATA LIST [FIXED] - @{TABLE,NOTABLE@} - FILE='filename' - RECORDS=record_count - END=end_var - /[line_no] var_spec@dots{} - -where each var_spec takes one of the forms - var_list start-end [type_spec] - var_list (fortran_spec) -@end display - -@cmd{DATA LIST FIXED} is used to read data files that have values at fixed -positions on each line of single-line or multiline records. The -keyword FIXED is optional. - -The FILE subcommand must be used if input is to be taken from an -external file. It may be used to specify a filename as a string or a -file handle (@pxref{FILE HANDLE}). If the FILE subcommand is not used, -then input is assumed to be specified within the command file using -@cmd{BEGIN DATA}@dots{}@cmd{END DATA} (@pxref{BEGIN DATA}). - -The optional RECORDS subcommand, which takes a single integer as an -argument, is used to specify the number of lines per record. If RECORDS -is not specified, then the number of lines per record is calculated from -the list of variable specifications later in @cmd{DATA LIST}. - -The END subcommand is only useful in conjunction with @cmd{INPUT -PROGRAM}. @xref{INPUT PROGRAM}, for details. - -@cmd{DATA LIST} can optionally output a table describing how the data file -will be read. The TABLE subcommand enables this output, and NOTABLE -disables it. The default is to output the table. - -The list of variables to be read from the data list must come last. -Each line in the data record is introduced by a slash (@samp{/}). -Optionally, a line number may follow the slash. Following, any number -of variable specifications may be present. 
- -Each variable specification consists of a list of variable names -followed by a description of their location on the input line. Sets of -variables may be specified using the @code{DATA LIST} TO convention -(@pxref{Sets of -Variables}). There are two ways to specify the location of the variable -on the line: PSPP style and FORTRAN style. - -With PSPP style, the starting column and ending column for the field -are specified after the variable name, separated by a dash (@samp{-}). -For instance, the third through fifth columns on a line would be -specified @samp{3-5}. By default, variables are considered to be in -@samp{F} format (@pxref{Input/Output Formats}). (This default can be -changed; see @ref{SET} for more information.) - -When using PSPP style, to use a variable format other than the default, -specify the format type in parentheses after the column numbers. For -instance, for alphanumeric @samp{A} format, use @samp{(A)}. - -In addition, implied decimal places can be specified in parentheses -after the column numbers. As an example, suppose that a data file has a -field in which the characters @samp{1234} should be interpreted as -having the value 12.34. Then this field has two implied decimal places, -and the corresponding specification would be @samp{(2)}. If a field -that has implied decimal places contains a decimal point, then the -implied decimal places are not applied. - -Changing the variable format and adding implied decimal places can be -done together; for instance, @samp{(N,5)}. - -When using PSPP style, the input and output width of each variable is -computed from the field width. The field width must be evenly divisible -into the number of variables specified. - -FORTRAN style is an altogether different approach to specifying field -locations. With this approach, a list of variable input format -specifications, separated by commas, are placed after the variable names -inside parentheses.
Each format specifier advances as many characters -into the input line as it uses. - -In addition to the standard format specifiers (@pxref{Input/Output -Formats}), FORTRAN style defines some extensions: - -@table @asis -@item @code{X} -Advance the current column on this line by one character position. - -@item @code{T}@var{x} -Set the current column on this line to column @var{x}, with column -numbers considered to begin with 1 at the left margin. - -@item @code{NEWREC}@var{x} -Skip forward @var{x} lines in the current record, resetting the active -column to the left margin. - -@item Repeat count -Any format specifier may be preceded by a number. This causes the -action of that format specifier to be repeated the specified number of -times. - -@item (@var{spec1}, @dots{}, @var{specN}) -Group the given specifiers together. This is most useful when preceded -by a repeat count. Groups may be nested arbitrarily. -@end table - -FORTRAN and PSPP styles may be freely intermixed. PSPP style leaves the -active column immediately after the ending column specified. Record -motion using @code{NEWREC} in FORTRAN style also applies to later -FORTRAN and PSPP specifiers. - -@menu -* DATA LIST FIXED Examples:: Examples of DATA LIST FIXED. -@end menu - -@node DATA LIST FIXED Examples, , DATA LIST FIXED, DATA LIST FIXED -@unnumberedsubsubsec Examples - -@enumerate -@item -@example -DATA LIST TABLE /NAME 1-10 (A) INFO1 TO INFO3 12-17 (1). - -BEGIN DATA. -John Smith 102311 -Bob Arnold 122015 -Bill Yates 918 6 -END DATA. -@end example - -Defines the following variables: - -@itemize @bullet -@item -@code{NAME}, a 10-character-wide long string variable, in columns 1 -through 10. - -@item -@code{INFO1}, a numeric variable, in columns 12 through 13. - -@item -@code{INFO2}, a numeric variable, in columns 14 through 15. - -@item -@code{INFO3}, a numeric variable, in columns 16 through 17. 
-@end itemize - -The @code{BEGIN DATA}/@code{END DATA} commands cause three cases to be -defined: - -@example -Case NAME INFO1 INFO2 INFO3 - 1 John Smith 10 23 11 - 2 Bob Arnold 12 20 15 - 3 Bill Yates 9 18 6 -@end example - -The @code{TABLE} keyword causes PSPP to print out a table -describing the four variables defined. - -@item -@example -DAT LIS FIL="survey.dat" - /ID 1-5 NAME 7-36 (A) SURNAME 38-67 (A) MINITIAL 69 (A) - /Q01 TO Q50 7-56 - /. -@end example - -Defines the following variables: - -@itemize @bullet -@item -@code{ID}, a numeric variable, in columns 1-5 of the first record. - -@item -@code{NAME}, a 30-character long string variable, in columns 7-36 of the -first record. - -@item -@code{SURNAME}, a 30-character long string variable, in columns 38-67 of -the first record. - -@item -@code{MINITIAL}, a 1-character short string variable, in column 69 of -the first record. - -@item -Fifty variables @code{Q01}, @code{Q02}, @code{Q03}, @dots{}, @code{Q49}, -@code{Q50}, all numeric, @code{Q01} in column 7, @code{Q02} in column 8, -@dots{}, @code{Q49} in column 55, @code{Q50} in column 56, all in the second -record. -@end itemize - -Cases are separated by a blank record. - -Data is read from file @file{survey.dat} in the current directory. - -This example shows keywords abbreviated to their first 3 letters. - -@end enumerate - -@node DATA LIST FREE, DATA LIST LIST, DATA LIST FIXED, DATA LIST -@subsection DATA LIST FREE -@vindex DATA LIST FREE - -@display -DATA LIST FREE - [(@{TAB,'c'@}, @dots{})] - [@{NOTABLE,TABLE@}] - FILE='filename' - END=end_var - /var_spec@dots{} - -where each var_spec takes one of the forms - var_list [(type_spec)] - var_list * -@end display - -In free format, the input data is, by default, structured as a series -of fields separated by spaces, tabs, commas, or line breaks. Each -field's content may be unquoted, or it may be quoted with a pair of -apostrophes (@samp{'}) or double quotes (@samp{"}).
Unquoted white -space separates fields but is not part of any field. Any mix of -spaces, tabs, and line breaks is equivalent to a single space for the -purpose of separating fields, but consecutive commas will skip a -field. - -Alternatively, delimiters can be specified explicitly, as a -parenthesized, comma-separated list of single-character strings -immediately following FREE. The word TAB may also be used to specify -a tab character as a delimiter. When delimiters are specified -explicitly, only the given characters, plus line breaks, separate -fields. Furthermore, leading spaces at the beginnings of fields are -not trimmed, consecutive delimiters define empty fields, and no form -of quoting is allowed. - -The NOTABLE and TABLE subcommands are as in @cmd{DATA LIST FIXED} above. -NOTABLE is the default. - -The FILE and END subcommands are as in @cmd{DATA LIST FIXED} above. - -The variables to be parsed are given as a single list of variable names. -This list must be introduced by a single slash (@samp{/}). The set of -variable names may contain format specifications in parentheses -(@pxref{Input/Output Formats}). Format specifications apply to all -variables back to the previous parenthesized format specification. - -In addition, an asterisk may be used to indicate that all variables -preceding it are to have input/output format @samp{F8.0}. - -Specified field widths are ignored on input, although all normal limits -on field width apply, but they are honored on output. - -@node DATA LIST LIST, , DATA LIST FREE, DATA LIST -@subsection DATA LIST LIST -@vindex DATA LIST LIST - -@display -DATA LIST LIST - [(@{TAB,'c'@}, @dots{})] - [@{NOTABLE,TABLE@}] - FILE='filename' - END=end_var - /var_spec@dots{} - -where each var_spec takes one of the forms - var_list [(type_spec)] - var_list * -@end display - -With one exception, @cmd{DATA LIST LIST} is syntactically and -semantically equivalent to @cmd{DATA LIST FREE}. 
The exception is -that each input line is expected to correspond to exactly one input -record. If more or fewer fields are found on an input line than -expected, an appropriate diagnostic is issued. - -@node END CASE, END FILE, DATA LIST, Data Input and Output -@section END CASE -@vindex END CASE - -@display -END CASE. -@end display - -@cmd{END CASE} is used only within @cmd{INPUT PROGRAM} to output the -current case. @xref{INPUT PROGRAM}, for details. - -@node END FILE, FILE HANDLE, END CASE, Data Input and Output -@section END FILE -@vindex END FILE - -@display -END FILE. -@end display - -@cmd{END FILE} is used only within @cmd{INPUT PROGRAM} to terminate -the current input program. @xref{INPUT PROGRAM}. - -@node FILE HANDLE, INPUT PROGRAM, END FILE, Data Input and Output -@section FILE HANDLE -@vindex FILE HANDLE - -@display -FILE HANDLE handle_name - /NAME='filename' - /MODE=@{CHARACTER,IMAGE@} - /LRECL=rec_len - /TABWIDTH=tab_width -@end display - -Use @cmd{FILE HANDLE} to associate a file handle name with a file and -its attributes, so that later commands can refer to the file by its -handle name. Because names of text files can be specified directly on -commands that access files, @cmd{FILE HANDLE} is only needed when a -file is not an ordinary file containing lines of text. However, -@cmd{FILE HANDLE} may be used even for text files, and it may be -easier to specify a file's name once and later refer to it by an -abstract handle. - -Specify the file handle name as an identifier. Any given identifier may -only appear once in a PSPP run. File handles may not be reassigned to a -different file. The file handle name must immediately follow the @cmd{FILE -HANDLE} command name. - -The NAME subcommand specifies the name of the file associated with the -handle. It is the only required subcommand. - -MODE specifies a file mode. 
In CHARACTER mode, the default, the data -file is opened in ANSI C text mode, so that local end of line -conventions are followed, and each text line is read as one record. -In CHARACTER mode, most input programs will expand tabs to spaces -(@cmd{DATA LIST FREE} with explicitly specified delimiters is an -exception). By default, each tab is 4 characters wide, but an -alternate width may be specified on TABWIDTH. A tab width of 0 -suppresses tab expansion entirely. - -By contrast, in BINARY mode, the data file is opened in ANSI C binary -mode and records are a fixed length. In BINARY mode, LRECL specifies -the record length in bytes, with a default of 1024. Tab characters -are never expanded to spaces in binary mode. - -@node INPUT PROGRAM, LIST, FILE HANDLE, Data Input and Output -@section INPUT PROGRAM -@vindex INPUT PROGRAM - -@display -INPUT PROGRAM. -@dots{} input commands @dots{} -END INPUT PROGRAM. -@end display - -@cmd{INPUT PROGRAM}@dots{}@cmd{END INPUT PROGRAM} specifies a -complex input program. By placing data input commands within @cmd{INPUT -PROGRAM}, PSPP programs can take advantage of more complex file -structures than available with only @cmd{DATA LIST}. - -The first sort of extended input program is to simply put multiple @cmd{DATA -LIST} commands within the @cmd{INPUT PROGRAM}. This will cause all of -the data -files to be read in parallel. Input will stop when end of file is -reached on any of the data files. - -Transformations, such as conditional and looping constructs, can also be -included within @cmd{INPUT PROGRAM}. These can be used to combine input -from several data files in more complex ways. However, input will still -stop when end of file is reached on any of the data files. - -To prevent @cmd{INPUT PROGRAM} from terminating at the first end of -file, use -the END subcommand on @cmd{DATA LIST}. This subcommand takes a -variable name, -which should be a numeric scratch variable (@pxref{Scratch Variables}). 
-(It need not be a scratch variable but otherwise the results can be -surprising.) The value of this variable is set to 0 when reading the -data file, or 1 when end of file is encountered. - -Two additional commands are useful in conjunction with @cmd{INPUT PROGRAM}. -@cmd{END CASE} is the first. Normally each loop through the -@cmd{INPUT PROGRAM} -structure produces one case. @cmd{END CASE} controls exactly -when cases are output. When @cmd{END CASE} is used, looping from the end of -@cmd{INPUT PROGRAM} to the beginning does not cause a case to be output. - -@cmd{END FILE} is the second. When the END subcommand is used on @cmd{DATA -LIST}, there is no way for the @cmd{INPUT PROGRAM} construct to stop -looping, -so an infinite loop results. @cmd{END FILE}, when executed, -stops the flow of input data and passes out of the @cmd{INPUT PROGRAM} -structure. - -All this is very confusing. A few examples should help to clarify. - -@example -INPUT PROGRAM. - DATA LIST NOTABLE FILE='a.data'/X 1-10. - DATA LIST NOTABLE FILE='b.data'/Y 1-10. -END INPUT PROGRAM. -LIST. -@end example - -The example above reads variable X from file @file{a.data} and variable -Y from file @file{b.data}. If one file is shorter than the other then -the extra data in the longer file is ignored. - -@example -INPUT PROGRAM. - NUMERIC #A #B. - - DO IF NOT #A. - DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10. - END IF. - DO IF NOT #B. - DATA LIST NOTABLE END=#B FILE='b.data'/Y 1-10. - END IF. - DO IF #A AND #B. - END FILE. - END IF. - END CASE. -END INPUT PROGRAM. -LIST. -@end example - -The above example reads variable X from @file{a.data} and variable Y from -@file{b.data}. If one file is shorter than the other then the missing -field is set to the system-missing value alongside the present value for -the remaining length of the longer file. - -@example -INPUT PROGRAM. - NUMERIC #A #B. - - DO IF #A. - DATA LIST NOTABLE END=#B FILE='b.data'/X 1-10. - DO IF #B. - END FILE. - ELSE. - END CASE. - END IF. 
- ELSE. - DATA LIST NOTABLE END=#A FILE='a.data'/X 1-10. - DO IF NOT #A. - END CASE. - END IF. - END IF. -END INPUT PROGRAM. -LIST. -@end example - -The above example reads data from file @file{a.data}, then from -@file{b.data}, and concatenates them into a single active file. - -@example -INPUT PROGRAM. - NUMERIC #EOF. - - LOOP IF NOT #EOF. - DATA LIST NOTABLE END=#EOF FILE='a.data'/X 1-10. - DO IF NOT #EOF. - END CASE. - END IF. - END LOOP. - - COMPUTE #EOF = 0. - LOOP IF NOT #EOF. - DATA LIST NOTABLE END=#EOF FILE='b.data'/X 1-10. - DO IF NOT #EOF. - END CASE. - END IF. - END LOOP. - - END FILE. -END INPUT PROGRAM. -LIST. -@end example - -The above example does the same thing as the previous example, in a -different way. - -@example -INPUT PROGRAM. - LOOP #I=1 TO 50. - COMPUTE X=UNIFORM(10). - END CASE. - END LOOP. - END FILE. -END INPUT PROGRAM. -LIST/FORMAT=NUMBERED. -@end example - -The above example causes an active file to be created consisting of 50 -random variates between 0 and 10. - -@node LIST, MATRIX DATA, INPUT PROGRAM, Data Input and Output -@section LIST -@vindex LIST - -@display -LIST - /VARIABLES=var_list - /CASES=FROM start_index TO end_index BY incr_index - /FORMAT=@{UNNUMBERED,NUMBERED@} @{WRAP,SINGLE@} - @{NOWEIGHT,WEIGHT@} -@end display - -The @cmd{LIST} procedure prints the values of specified variables to the -listing file. - -The VARIABLES subcommand specifies the variables whose values are to be -printed. Keyword VARIABLES is optional. If VARIABLES subcommand is not -specified then all variables in the active file are printed. - -The CASES subcommand can be used to specify a subset of cases to be -printed. Specify FROM and the case number of the first case to print, -TO and the case number of the last case to print, and BY and the number -of cases to advance between printing cases, or any subset of those -settings. If CASES is not specified then all cases are printed. - -The FORMAT subcommand can be used to change the output format. 
NUMBERED -will print case numbers along with each case; UNNUMBERED, the default, -causes the case numbers to be omitted. The WRAP and SINGLE settings are -currently not used. WEIGHT will cause case weights to be printed along -with variable values; NOWEIGHT, the default, causes case weights to be -omitted from the output. - -Case numbers start from 1. They are counted after all transformations -have been considered. - -@cmd{LIST} attempts to fit all the values on a single line. If needed -to make them fit, variable names are displayed vertically. If values -cannot fit on a single line, then a multi-line format will be used. - -@cmd{LIST} is a procedure. It causes the data to be read. - -@node MATRIX DATA, NEW FILE, LIST, Data Input and Output -@section MATRIX DATA -@vindex MATRIX DATA - -@display -MATRIX DATA - /VARIABLES=var_list - /FILE='filename' - /FORMAT=@{LIST,FREE@} @{LOWER,UPPER,FULL@} @{DIAGONAL,NODIAGONAL@} - /SPLIT=@{new_var,var_list@} - /FACTORS=var_list - /CELLS=n_cells - /N=n - /CONTENTS=@{N_VECTOR,N_SCALAR,N_MATRIX,MEAN,STDDEV,COUNT,MSE, - DFE,MAT,COV,CORR,PROX@} -@end display - -@cmd{MATRIX DATA} command reads square matrices in one of several textual -formats. @cmd{MATRIX DATA} clears the dictionary and replaces it and -reads a -data file. - -Use VARIABLES to specify the variables that form the rows and columns of -the matrices. You may not specify a variable named @code{VARNAME_}. You -should specify VARIABLES first. - -Specify the file to read on FILE, either as a file name string or a file -handle (@pxref{FILE HANDLE}). If FILE is not specified then matrix data -must immediately follow @cmd{MATRIX DATA} with a @cmd{BEGIN -DATA}@dots{}@cmd{END DATA} -construct (@pxref{BEGIN DATA}). - -The FORMAT subcommand specifies how the matrices are formatted. LIST, -the default, indicates that there is one line per row of matrix data; -FREE allows single matrix rows to be broken across multiple lines. 
This -is analogous to the difference between @cmd{DATA LIST FREE} and -@cmd{DATA LIST LIST} -(@pxref{DATA LIST}). LOWER, the default, indicates that the lower -triangle of the matrix is given; UPPER indicates the upper triangle; and -FULL indicates that the entire matrix is given. DIAGONAL, the default, -indicates that the diagonal is part of the data; NODIAGONAL indicates -that it is omitted. DIAGONAL/NODIAGONAL have no effect when FULL is -specified. - -The SPLIT subcommand is used to specify @cmd{SPLIT FILE} variables for the -input matrices (@pxref{SPLIT FILE}). Specify either a single variable -not specified on VARIABLES, or one or more variables that are specified -on VARIABLES. In the former case, the SPLIT values are not present in -the data and ROWTYPE_ may not be specified on VARIABLES. In the latter -case, the SPLIT values are present in the data. - -Specify a list of factor variables on FACTORS. Factor variables must -also be listed on VARIABLES. Factor variables are used when there are -some variables where, for each possible combination of their values, -statistics on the matrix variables are included in the data. - -If FACTORS is specified and ROWTYPE_ is not specified on VARIABLES, the -CELLS subcommand is required. Specify the number of factor variable -combinations that are given. For instance, if factor variable A has 2 -values and factor variable B has 3 values, specify 6. - -The N subcommand specifies a population number of observations. When N -is specified, one N record is output for each @cmd{SPLIT FILE}. - -Use CONTENTS to specify what sort of information the matrices include. -Each possible option is described in more detail below. When ROWTYPE_ -is specified on VARIABLES, CONTENTS is optional; otherwise, if CONTENTS -is not specified then /CONTENTS=CORR is assumed. - -@table @asis -@item N -@item N_VECTOR -Number of observations as a vector, one value for each variable. -@item N_SCALAR -Number of observations as a single value. 
-@item N_MATRIX -Matrix of counts. -@item MEAN -Vector of means. -@item STDDEV -Vector of standard deviations. -@item COUNT -Vector of counts. -@item MSE -Vector of mean squared errors. -@item DFE -Vector of degrees of freedom. -@item MAT -Generic matrix. -@item COV -Covariance matrix. -@item CORR -Correlation matrix. -@item PROX -Proximities matrix. -@end table - -The exact semantics of the matrices read by @cmd{MATRIX DATA} are complex. -Right now @cmd{MATRIX DATA} isn't too useful due to a lack of procedures -accepting or producing related data, so these semantics aren't -documented. Later, they'll be described here in detail. - -@node NEW FILE, PRINT, MATRIX DATA, Data Input and Output -@section NEW FILE -@vindex NEW FILE - -@display -NEW FILE. -@end display - -@cmd{NEW FILE} command clears the current active file. - -@node PRINT, PRINT EJECT, NEW FILE, Data Input and Output -@section PRINT -@vindex PRINT - -@display -PRINT - OUTFILE='filename' - RECORDS=n_lines - @{NOTABLE,TABLE@} - /[line_no] arg@dots{} - -arg takes one of the following forms: - 'string' [start-end] - var_list start-end [type_spec] - var_list (fortran_spec) - var_list * -@end display - -The @cmd{PRINT} transformation writes variable data to an output file. -@cmd{PRINT} is executed when a procedure causes the data to be read. -Follow @cmd{PRINT} by @cmd{EXECUTE} to print variable data without -invoking a procedure (@pxref{EXECUTE}). - -All @cmd{PRINT} subcommands are optional. - -The OUTFILE subcommand specifies the file to receive the output. The -file may be a file name as a string or a file handle (@pxref{FILE -HANDLE}). If OUTFILE is not present then output will be sent to PSPP's -output listing file. - -The RECORDS subcommand specifies the number of lines to be output. The -number of lines may optionally be surrounded by parentheses. - -TABLE will cause the PRINT command to output a table to the listing file -that describes what it will print to the output file. 
NOTABLE, the -default, suppresses this output table. - -Introduce the strings and variables to be printed with a slash -(@samp{/}). Optionally, the slash may be followed by a number -indicating which output line will be specified. In the absence of this -line number, the next line number will be specified. Multiple lines may -be specified using multiple slashes with the intended output for a line -following its respective slash. - -Literal strings may be printed. Specify the string itself. Optionally -the string may be followed by a column number or range of column -numbers, specifying the location on the line for the string to be -printed. Otherwise, the string will be printed at the current position -on the line. - -Variables to be printed can be specified in the same ways as available -for @cmd{DATA LIST FIXED} (@pxref{DATA LIST FIXED}). In addition, a -variable -list may be followed by an asterisk (@samp{*}), which indicates that the -variables should be printed in their dictionary print formats, separated -by spaces. A variable list followed by a slash or the end of command -will be interpreted the same way. - -If a FORTRAN type specification is used to move backwards on the current -line and text is then written at that point, the line is truncated to -that length, although additional text will extend the line again. - -@node PRINT EJECT, PRINT SPACE, PRINT, Data Input and Output -@section PRINT EJECT -@vindex PRINT EJECT - -@display -PRINT EJECT - OUTFILE='filename' - RECORDS=n_lines - @{NOTABLE,TABLE@} - /[line_no] arg@dots{} - -arg takes one of the following forms: - 'string' [start-end] - var_list start-end [type_spec] - var_list (fortran_spec) - var_list * -@end display - -@cmd{PRINT EJECT} writes data to an output file. Before the data is -written, the current page in the listing file is ejected. - -@xref{PRINT}, for more information on syntax and usage.
- -@node PRINT SPACE, REREAD, PRINT EJECT, Data Input and Output -@section PRINT SPACE -@vindex PRINT SPACE - -@display -PRINT SPACE OUTFILE='filename' n_lines. -@end display - -@cmd{PRINT SPACE} prints one or more blank lines to an output file. - -The OUTFILE subcommand is optional. It may be used to direct output to -a file specified by file name as a string or file handle (@pxref{FILE -HANDLE}). If OUTFILE is not specified then output will be directed to -the listing file. - -n_lines is also optional. If present, it is an expression -(@pxref{Expressions}) specifying the number of blank lines to be -printed. The expression must evaluate to a nonnegative value. - -@node REREAD, REPEATING DATA, PRINT SPACE, Data Input and Output -@section REREAD -@vindex REREAD - -@display -REREAD FILE=handle COLUMN=column. -@end display - -The @cmd{REREAD} transformation allows the previous input line in a -data file -already processed by @cmd{DATA LIST} or another input command to be re-read -for further processing. - -The FILE subcommand, which is optional, is used to specify the file to -have its line re-read. The file must be specified in the form of a file -handle (@pxref{FILE HANDLE}). If FILE is not specified then the last -file specified on @cmd{DATA LIST} will be assumed (last file specified -lexically, not in terms of flow-of-control). - -By default, the line re-read is re-read in its entirety. With the -COLUMN subcommand, a prefix of the line can be exempted from -re-reading. Specify an expression (@pxref{Expressions}) evaluating to -the first column that should be included in the re-read line. Columns -are numbered from 1 at the left margin. - -Issuing @code{REREAD} multiple times will not back up in the data -file. Instead, it will re-read the same line multiple times. 
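The following sketch illustrates one way @cmd{REREAD} might be used to handle a file that mixes two record layouts. It is only an illustration: the file name @file{mixed.data}, the variable names, and the column positions are hypothetical, and it assumes that @cmd{REREAD} without subcommands re-reads the line last read from the file named on the preceding @cmd{DATA LIST}.

@example
INPUT PROGRAM.
  DATA LIST NOTABLE FILE='mixed.data'/KIND 1 (A).
  DO IF KIND = 'A'.
    REREAD.
    DATA LIST NOTABLE FILE='mixed.data'/X 3-5.
  ELSE.
    REREAD.
    DATA LIST NOTABLE FILE='mixed.data'/Y 3-10.
  END IF.
END INPUT PROGRAM.
LIST.
@end example

In this sketch the first @cmd{DATA LIST} reads only the record type, and the second @cmd{DATA LIST} in each branch parses the same line again using the layout for that type.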
- -@node REPEATING DATA, WRITE, REREAD, Data Input and Output -@section REPEATING DATA -@vindex REPEATING DATA - -@display -REPEATING DATA - /STARTS=start-end - /OCCURS=n_occurs - /FILE='filename' - /LENGTH=length - /CONTINUED[=cont_start-cont_end] - /ID=id_start-id_end=id_var - /@{TABLE,NOTABLE@} - /DATA=var_spec@dots{} - -where each var_spec takes one of the forms - var_list start-end [type_spec] - var_list (fortran_spec) -@end display - -@cmd{REPEATING DATA} parses groups of data repeating in -a uniform format, possibly with several groups on a single line. Each -group of data corresponds with one case. @cmd{REPEATING DATA} may only be -used within an @cmd{INPUT PROGRAM} structure (@pxref{INPUT PROGRAM}). -When used with @cmd{DATA LIST}, it -can be used to parse groups of cases that share a subset of variables -but differ in their other data. - -The STARTS subcommand is required. Specify a range of columns, using -literal numbers or numeric variable names. This range specifies the -columns on the first line that are used to contain groups of data. The -ending column is optional. If it is not specified, then the record -width of the input file is used. For the inline file (@pxref{BEGIN -DATA}) this is 80 columns; for a file with fixed record widths it is the -record width; for other files it is 1024 characters by default. - -The OCCURS subcommand is required. It must be a number or the name of a -numeric variable. Its value is the number of groups present in the -current record. - -The DATA subcommand is required. It must be the last subcommand -specified. It is used to specify the data present within each repeating -group. Column numbers are specified relative to the beginning of a -group at column 1. Data is specified in the same way as with @cmd{DATA LIST -FIXED} (@pxref{DATA LIST FIXED}). - -All other subcommands are optional. - -FILE specifies the file to read, either a file name as a string or a -file handle (@pxref{FILE HANDLE}). 
If FILE is not present then the -default is the last file handle used on @cmd{DATA LIST} (lexically, not in -terms of flow of control). - -By default @cmd{REPEATING DATA} will output a table describing how it will -parse the input data. Specifying NOTABLE will disable this behavior; -specifying TABLE will explicitly enable it. - -The LENGTH subcommand specifies the length in characters of each group. -If it is not present then length is inferred from the DATA subcommand. -LENGTH can be a number or a variable name. - -Normally all the data groups are expected to be present on a single -line. Use the CONTINUED command to indicate that data can be continued -onto additional lines. If data on continuation lines starts at the left -margin and continues through the entire field width, no column -specifications are necessary on CONTINUED. Otherwise, specify the -possible range of columns in the same way as on STARTS. - -When data groups are continued from line to line, it is easy -for cases to get out of sync through careless hand editing. The -ID subcommand allows a case identifier to be present on each line of -repeating data groups. @cmd{REPEATING DATA} will check for the same -identifier on each line and report mismatches. Specify the range of -columns that the identifier will occupy, followed by an equals sign -(@samp{=}) and the identifier variable name. The variable must already -have been declared with @cmd{NUMERIC} or another command. - -@cmd{REPEATING DATA} should be the last command given within an -@cmd{INPUT PROGRAM}. It should not be enclosed within a @cmd{LOOP} -structure (@pxref{LOOP}). Use @cmd{DATA LIST} before, not after, -@cmd{REPEATING DATA}. 
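As a rough sketch of the subcommands described above, the following input program reads a customer record followed by a variable number of item groups on the same line. The file name @file{orders.data}, the variable names, and the column layout are assumptions made for illustration; the group length is left to be inferred from the DATA subcommand.

@example
INPUT PROGRAM.
  DATA LIST NOTABLE FILE='orders.data'/CUSTOMER 1-4 N_ITEMS 6-7.
  REPEATING DATA
    /STARTS=9
    /OCCURS=N_ITEMS
    /NOTABLE
    /DATA=ITEM 1-3 QTY 5-6.
END INPUT PROGRAM.
LIST.
@end example

Here each group occupies six columns beginning at column 9, and each group should produce one case carrying the CUSTOMER value from the enclosing line.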
- -@node WRITE, , REPEATING DATA, Data Input and Output -@section WRITE -@vindex WRITE - -@display -WRITE - OUTFILE='filename' - RECORDS=n_lines - @{NOTABLE,TABLE@} - /[line_no] arg@dots{} - -arg takes one of the following forms: - 'string' [start-end] - var_list start-end [type_spec] - var_list (fortran_spec) - var_list * -@end display - -@code{WRITE} writes text or binary data to an output file. - -@xref{PRINT}, for more information on syntax and usage. The main -difference between @code{PRINT} and @code{WRITE} is that @cmd{WRITE} -uses write formats by default, where PRINT uses print formats. - -The sole additional difference is that if @cmd{WRITE} is used to send output -to a binary file, carriage control characters will not be output. -@xref{FILE HANDLE}, for information on how to declare a file as binary. - -@node System and Portable Files, Variable Attributes, Data Input and Output, Top -@chapter System Files and Portable Files - -The commands in this chapter read, write, and examine system files and -portable files. - -@menu -* APPLY DICTIONARY:: Apply system file dictionary to active file. -* EXPORT:: Write to a portable file. -* GET:: Read from a system file. -* IMPORT:: Read from a portable file. -* MATCH FILES:: Merge system files. -* SAVE:: Write to a system file. -* SYSFILE INFO:: Display system file dictionary. -* XSAVE:: Write to a system file, as a transform. -@end menu - -@node APPLY DICTIONARY, EXPORT, System and Portable Files, System and Portable Files -@section APPLY DICTIONARY -@vindex APPLY DICTIONARY - -@display -APPLY DICTIONARY FROM='filename'. -@end display - -@cmd{APPLY DICTIONARY} applies the variable labels, value labels, -and missing values from variables in a system file to corresponding -variables in the active file. In some cases it also updates the -weighting variable. - -Specify a system file with a file name string or as a file handle -(@pxref{FILE HANDLE}). 
The dictionary in the system file will be read, -but it will not replace the active file dictionary. The system file's -data will not be read. - -Only variables with names that exist in both the active file and the -system file are considered. Variables with the same name but different -types (numeric, string) will cause an error message. Otherwise, the -system file variables' attributes will replace those in their matching -active file variables, as described below. - -If a system file variable has a variable label, then it will replace the -active file variable's variable label. If the system file variable does -not have a variable label, then the active file variable's variable -label, if any, will be retained. - -If the active file variable is numeric or short string, then value -labels and missing values, if any, will be copied to the active file -variable. If the system file variable does not have value labels or -missing values, then those in the active file variable, if any, will not -be disturbed. - -Finally, weighting of the active file is updated (@pxref{WEIGHT}). If -the active file has a weighting variable, and the system file does not, -or if the weighting variable in the system file does not exist in the -active file, then the active file weighting variable, if any, is -retained. Otherwise, the weighting variable in the system file becomes -the active file weighting variable. - -@cmd{APPLY DICTIONARY} takes effect immediately. It does not read the -active -file. The system file is not modified. - -@node EXPORT, GET, APPLY DICTIONARY, System and Portable Files -@section EXPORT -@vindex EXPORT - -@display -EXPORT - /OUTFILE='filename' - /DROP=var_list - /KEEP=var_list - /RENAME=(src_names=target_names)@dots{} -@end display - -The @cmd{EXPORT} procedure writes the active file dictionary and data to a -specified portable file. 
- -The OUTFILE subcommand, which is the only required subcommand, specifies -the portable file to be written as a file name string or a file handle -(@pxref{FILE HANDLE}). - -DROP, KEEP, and RENAME follow the same format as the SAVE procedure -(@pxref{SAVE}). - -@cmd{EXPORT} is a procedure. It causes the active file to be read. - -@node GET, IMPORT, EXPORT, System and Portable Files -@section GET -@vindex GET - -@display -GET - /FILE='filename' - /DROP=var_list - /KEEP=var_list - /RENAME=(src_names=target_names)@dots{} -@end display - -@cmd{GET} clears the current dictionary and active file and -replaces them with the dictionary and data from a specified system file. - -The FILE subcommand is the only required subcommand. Specify the system -file to be read as a string file name or a file handle (@pxref{FILE -HANDLE}). - -By default, all the variables in a system file are read. The DROP -subcommand can be used to specify a list of variables that are not to be -read. By contrast, the KEEP subcommand can be used to specify variables -that are to be read, with all other variables not read. - -Normally variables in a system file retain the names that they were -saved under. Use the RENAME subcommand to change these names. Specify, -within parentheses, a list of variable names followed by an equals sign -(@samp{=}) and the names that they should be renamed to. Multiple -parenthesized groups of variable names can be included on a single -RENAME subcommand. Variables' names may be swapped using a RENAME -subcommand of the form @samp{/RENAME=(A B=B A)}. - -Alternate syntax for the RENAME subcommand allows the parentheses to be -eliminated. When this is done, only a single variable may be renamed at -once. For instance, @samp{/RENAME=A=B}. This alternate syntax is -deprecated. - -DROP, KEEP, and RENAME are performed in left-to-right order. They -each may be present any number of times. @cmd{GET} never modifies a -system file on disk.
Only the active file read from the system file -is affected by these subcommands. - -@cmd{GET} does not cause the data to be read, only the dictionary. The data -is read later, when a procedure is executed. - -@node IMPORT, MATCH FILES, GET, System and Portable Files -@section IMPORT -@vindex IMPORT - -@display -IMPORT - /FILE='filename' - /TYPE=@{COMM,TAPE@} - /DROP=var_list - /KEEP=var_list - /RENAME=(src_names=target_names)@dots{} -@end display - -The @cmd{IMPORT} transformation clears the active file dictionary and -data and -replaces them with a dictionary and data from a portable file on disk. - -The FILE subcommand, which is the only required subcommand, specifies -the portable file to be read as a file name string or a file handle -(@pxref{FILE HANDLE}). - -The TYPE subcommand is currently not used. - -DROP, KEEP, and RENAME follow the syntax used by @cmd{GET} (@pxref{GET}). - -@cmd{IMPORT} does not cause the data to be read, only the dictionary. The -data is read later, when a procedure is executed. - -@node MATCH FILES, SAVE, IMPORT, System and Portable Files -@section MATCH FILES -@vindex MATCH FILES - -@display -MATCH FILES - /BY var_list - /@{FILE,TABLE@}=@{*,'filename'@} - /DROP=var_list - /KEEP=var_list - /RENAME=(src_names=target_names)@dots{} - /IN=var_name - /FIRST=var_name - /LAST=var_name - /MAP -@end display - -@cmd{MATCH FILES} merges one or more system files, optionally -including the active file. Records with the same values for BY -variables are combined into a single record. Records with different -values are output in order. Thus, multiple sorted system files are -combined into a single sorted system file based on the value of the BY -variables. The results of the merge become the new active file. - -The BY subcommand specifies a list of variables that are used to match -records from each of the system files. Variables specified must exist -in all the files specified on FILE and TABLE. BY should usually be -specified. 
If TABLE is used then BY is required. - -Specify FILE with a system file as a file name string or file handle -(@pxref{FILE HANDLE}), or with an asterisk (@samp{*}) to -indicate the current active file. The files specified on FILE are -merged together based on the BY variables, or combined case-by-case if -BY is not specified. Normally at least two FILE subcommands should be -specified. - -Specify TABLE with a system file to use it as a @dfn{table -lookup file}. Records in table lookup files are not used up after -they've been used once. This means that data in table lookup files can -correspond to any number of records in FILE files. Table lookup files -correspond to lookup tables in traditional relational database systems. -It is incorrect to have records with duplicate BY values in table lookup -files. - -Any number of FILE and TABLE subcommands may be specified. Each -instance of FILE or TABLE can be followed by DROP, KEEP, and/or RENAME -subcommands. These take the same form as the corresponding subcommands -of @cmd{GET} (@pxref{GET}), and perform the same functions. - -Variables belonging to files that are not present for the current case -are set to the system-missing value for numeric variables or spaces for -string variables. - -IN, FIRST, LAST, and MAP are currently not used. - -@cmd{MATCH FILES} may not be specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}) if the active file is used as an input source. - -@node SAVE, SYSFILE INFO, MATCH FILES, System and Portable Files -@section SAVE -@vindex SAVE - -@display -SAVE - /OUTFILE='filename' - /@{COMPRESSED,UNCOMPRESSED@} - /DROP=var_list - /KEEP=var_list - /RENAME=(src_names=target_names)@dots{} -@end display - -The @cmd{SAVE} procedure causes the dictionary and data in the active -file to -be written to a system file. - -OUTFILE is the only required subcommand. Specify the system -file to be written as a string file name or a file handle (@pxref{FILE -HANDLE}).
- -The COMPRESSED and UNCOMPRESSED subcommands determine whether the saved -system file is compressed. By default, system files are compressed. -This default can be changed with the SET command (@pxref{SET}). - -By default, all the variables in the active file dictionary are written -to the system file. The DROP subcommand can be used to specify a list -of variables not to be written. In contrast, KEEP specifies variables -to be written, with all variables not specified not written. - -Normally variables are saved to a system file under the same names they -have in the active file. Use the RENAME subcommand to change these names. -Specify, within parentheses, a list of variable names followed by an -equals sign (@samp{=}) and the names that they should be renamed to. -Multiple parenthesized groups of variable names can be included on a -single RENAME subcommand. Variables' names may be swapped using a -RENAME subcommand of the form @samp{/RENAME=(A B=B A)}. - -Alternate syntax for the RENAME subcommand allows the parentheses to be -eliminated. When this is done, only a single variable may be renamed at -once. For instance, @samp{/RENAME=A=B}. This alternate syntax is -deprecated. - -DROP, KEEP, and RENAME are performed in left-to-right order. They -each may be present any number of times. @cmd{SAVE} never modifies -the active file. DROP, KEEP, and RENAME only affect the system file -written to disk. - -@cmd{SAVE} causes the data to be read. It is a procedure. - -@node SYSFILE INFO, XSAVE, SAVE, System and Portable Files -@section SYSFILE INFO -@vindex SYSFILE INFO - -@display -SYSFILE INFO FILE='filename'. -@end display - -@cmd{SYSFILE INFO} reads the dictionary in a system file and -displays the information in its dictionary. - -Specify a file name or file handle. @cmd{SYSFILE INFO} reads that file as -a system file and displays information on its dictionary. - -@cmd{SYSFILE INFO} does not affect the current active file.
- -@node XSAVE, , SYSFILE INFO, System and Portable Files -@section XSAVE -@vindex XSAVE - -@display -XSAVE - /FILE='filename' - /@{COMPRESSED,UNCOMPRESSED@} - /DROP=var_list - /KEEP=var_list - /RENAME=(src_names=target_names)@dots{} -@end display - -The @cmd{XSAVE} transformation writes the active file dictionary and -data to a -system file stored on disk. - -@cmd{XSAVE} is a transformation, not a procedure. It is executed when the -data is read by a procedure or procedure-like command. In all other -respects, @cmd{XSAVE} is identical to @cmd{SAVE}. @xref{SAVE}, for -more information -on syntax and usage. - -@node Variable Attributes, Data Manipulation, System and Portable Files, Top -@chapter Manipulating variables - -The variables in the active file dictionary are important. There are -several utility functions for examining and adjusting them. - -@menu -* ADD VALUE LABELS:: Add value labels to variables. -* DISPLAY:: Display variable names & descriptions. -* DISPLAY VECTORS:: Display a list of vectors. -* FORMATS:: Set print and write formats. -* LEAVE:: Don't clear variables between cases. -* MISSING VALUES:: Set missing values for variables. -* MODIFY VARS:: Rename, reorder, and drop variables. -* NUMERIC:: Create new numeric variables. -* PRINT FORMATS:: Set variable print formats. -* RENAME VARIABLES:: Rename variables. -* VALUE LABELS:: Set value labels for variables. -* STRING:: Create new string variables. -* VARIABLE LABELS:: Set variable labels for variables. -* VECTOR:: Declare an array of variables. -* WRITE FORMATS:: Set variable write formats. 
-@end menu - -@node ADD VALUE LABELS, DISPLAY, Variable Attributes, Variable Attributes -@section ADD VALUE LABELS -@vindex ADD VALUE LABELS - -@display -ADD VALUE LABELS - /var_list value 'label' [value 'label']@dots{} -@end display - -@cmd{ADD VALUE LABELS} has the same syntax and purpose as @cmd{VALUE -LABELS} (@pxref{VALUE LABELS}), but it does not clear value -labels from the variables before adding the ones specified. - -@node DISPLAY, DISPLAY VECTORS, ADD VALUE LABELS, Variable Attributes -@section DISPLAY -@vindex DISPLAY - -@display -DISPLAY @{NAMES,INDEX,LABELS,VARIABLES,DICTIONARY,SCRATCH@} - [SORTED] [var_list] -@end display - -@cmd{DISPLAY} displays requested information on variables. Variables can -optionally be sorted alphabetically. The entire dictionary or just -specified variables can be described. - -One of the following keywords can be present: - -@table @asis -@item NAMES -The variables' names are displayed. - -@item INDEX -The variables' names are displayed along with a value describing their -position within the active file dictionary. - -@item LABELS -Variable names, positions, and variable labels are displayed. - -@item VARIABLES -Variable names, positions, print and write formats, and missing values -are displayed. - -@item DICTIONARY -Variable names, positions, print and write formats, missing values, -variable labels, and value labels are displayed. - -@item SCRATCH -Variable names are displayed, for scratch variables only (@pxref{Scratch -Variables}). -@end table - -If SORTED is specified, then the variables are displayed in ascending -order based on their names; otherwise, they are displayed in the order -that they occur in the active file dictionary. - -@node DISPLAY VECTORS, FORMATS, DISPLAY, Variable Attributes -@section DISPLAY VECTORS -@vindex DISPLAY VECTORS - -@display -DISPLAY VECTORS. -@end display - -@cmd{DISPLAY VECTORS} lists all the currently declared vectors.
- -@node FORMATS, LEAVE, DISPLAY VECTORS, Variable Attributes -@section FORMATS -@vindex FORMATS - -@display -FORMATS var_list (fmt_spec). -@end display - -@cmd{FORMATS} sets both print and write formats for the specified -variables to the specified format specification. @xref{Input/Output -Formats}. - -Specify a list of variables followed by a format specification in -parentheses. The print and write formats of the specified variables -will be changed. - -Additional lists of variables and formats may be included if they are -delimited by a slash (@samp{/}). - -@cmd{FORMATS} takes effect immediately. It is not affected by -conditional and looping structures such as @cmd{DO IF} or @cmd{LOOP}. - -@node LEAVE, MISSING VALUES, FORMATS, Variable Attributes -@section LEAVE -@vindex LEAVE - -@display -LEAVE var_list. -@end display - -@cmd{LEAVE} prevents the specified variables from being -reinitialized whenever a new case is processed. - -Normally, when a data file is processed, every variable in the active -file is initialized to the system-missing value or spaces at the -beginning of processing for each case. When a variable has been -specified on @cmd{LEAVE}, this is not the case. Instead, that variable is -initialized to 0 (not system-missing) or spaces for the first case. -After that, it retains its value between cases. - -This becomes useful for counters. For instance, in the example below -the variable SUM maintains a running total of the values in the ITEM -variable. - -@example -DATA LIST /ITEM 1-3. -COMPUTE SUM=SUM+ITEM. -PRINT /ITEM SUM. -LEAVE SUM. -BEGIN DATA. -123 -404 -555 -999 -END DATA. -@end example - -@noindent Partial output from this example: - -@example -123 123.00 -404 527.00 -555 1082.00 -999 2081.00 -@end example - -It is best to use the @cmd{LEAVE} command immediately before invoking a -procedure command, because the left status of variables is reset by -certain transformations---for instance, @cmd{COMPUTE} and @cmd{IF}.
-Left status is also reset by all procedure invocations. - -@node MISSING VALUES, MODIFY VARS, LEAVE, Variable Attributes -@section MISSING VALUES -@vindex MISSING VALUES - -@display -MISSING VALUES var_list (missing_values). - -missing_values takes one of the following forms: - num1 - num1, num2 - num1, num2, num3 - num1 THRU num2 - num1 THRU num2, num3 - string1 - string1, string2 - string1, string2, string3 -As part of a range, LO or LOWEST may take the place of num1; -HI or HIGHEST may take the place of num2. -@end display - -@cmd{MISSING VALUES} sets user-missing values for numeric and -short string variables. Long string variables may not have missing -values. - -Specify a list of variables, followed by a list of their user-missing -values in parentheses. Up to three discrete values may be given, or, -for numeric variables only, a range of values optionally accompanied by -a single discrete value. Ranges may be open-ended on one end, indicated -through the use of the keyword LO or LOWEST or HI or HIGHEST. - -The @cmd{MISSING VALUES} command takes effect immediately. It is not -affected by conditional and looping constructs such as @cmd{DO IF} or -@cmd{LOOP}. - -@node MODIFY VARS, NUMERIC, MISSING VALUES, Variable Attributes -@section MODIFY VARS -@vindex MODIFY VARS - -@display -MODIFY VARS - /REORDER=@{FORWARD,BACKWARD@} @{POSITIONAL,ALPHA@} (var_list)@dots{} - /RENAME=(old_names=new_names)@dots{} - /@{DROP,KEEP@}=var_list - /MAP -@end display - -@cmd{MODIFY VARS} reorders, renames, and deletes variables in the -active file. - -At least one subcommand must be specified, and no subcommand may be -specified more than once. DROP and KEEP may not both be specified. - -The REORDER subcommand changes the order of variables in the active -file. Specify one or more lists of variable names in parentheses. By -default, each list of variables is rearranged into the specified order. 
-To put the variables into the reverse of the specified order, put -keyword BACKWARD before the parentheses. To put them into alphabetical -order in the dictionary, specify keyword ALPHA before the parentheses. -BACKWARD and ALPHA may also be combined. - -To rename variables in the active file, specify RENAME, an equals sign -(@samp{=}), and lists of the old variable names and new variable names -separated by another equals sign within parentheses. There must be the -same number of old and new variable names. Each old variable is renamed to -the corresponding new variable name. Multiple parenthesized groups of -variables may be specified. - -The DROP subcommand deletes a specified list of variables from the -active file. - -The KEEP subcommand keeps the specified list of variables in the active -file. Any unlisted variables are deleted from the active file. - -MAP is currently ignored. - -If either DROP or KEEP is specified, the data is read; otherwise it is -not. - -@cmd{MODIFY VARS} may not be specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}). - -@node NUMERIC, PRINT FORMATS, MODIFY VARS, Variable Attributes -@section NUMERIC -@vindex NUMERIC - -@display -NUMERIC /var_list [(fmt_spec)]. -@end display - -@cmd{NUMERIC} explicitly declares new numeric variables, optionally -setting their output formats. - -Specify a slash (@samp{/}), followed by the names of the new numeric -variables. If you wish to set their output formats, follow their names -by an output format specification in parentheses (@pxref{Input/Output -Formats}); otherwise, the default is F8.2. - -Variables created with @cmd{NUMERIC} are initialized to the -system-missing value. - -@node PRINT FORMATS, RENAME VARIABLES, NUMERIC, Variable Attributes -@section PRINT FORMATS -@vindex PRINT FORMATS - -@display -PRINT FORMATS var_list (fmt_spec). -@end display - -@cmd{PRINT FORMATS} sets the print formats for the specified -variables to the specified format specification. 
- -Its syntax is identical to that of @cmd{FORMATS} (@pxref{FORMATS}), -but @cmd{PRINT FORMATS} sets only print formats, not write formats. - -@node RENAME VARIABLES, VALUE LABELS, PRINT FORMATS, Variable Attributes -@section RENAME VARIABLES -@vindex RENAME VARIABLES - -@display -RENAME VARIABLES (old_names=new_names)@dots{} . -@end display - -@cmd{RENAME VARIABLES} changes the names of variables in the active -file. Specify lists of the old variable names and new -variable names, separated by an equals sign (@samp{=}), within -parentheses. There must be the same number of old and new variable -names. Each old variable is renamed to the corresponding new variable -name. Multiple parenthesized groups of variables may be specified. - -@cmd{RENAME VARIABLES} takes effect immediately. It does not cause the data -to be read. - -@cmd{RENAME VARIABLES} may not be specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}). - -@node VALUE LABELS, STRING, RENAME VARIABLES, Variable Attributes -@section VALUE LABELS -@vindex VALUE LABELS - -@display -VALUE LABELS - /var_list value 'label' [value 'label']@dots{} -@end display - -@cmd{VALUE LABELS} allows values of numeric and short string -variables to be associated with labels. In this way, a short value can -stand for a long value. - -To set up value labels for a set of variables, specify the -variable names after a slash (@samp{/}), followed by a list of values -and their associated labels, separated by spaces. Long string -variables may not be specified. - -Before @cmd{VALUE LABELS} is executed, any existing value labels -are cleared from the variables specified. Use @cmd{ADD VALUE LABELS} -(@pxref{ADD VALUE LABELS}) to add value labels without clearing those -already present. - -@node STRING, VARIABLE LABELS, VALUE LABELS, Variable Attributes -@section STRING -@vindex STRING - -@display -STRING /var_list (fmt_spec). -@end display - -@cmd{STRING} creates new string variables for use in -transformations. 
- -Specify a slash (@samp{/}), followed by the names of the string -variables to create and the desired output format specification in -parentheses (@pxref{Input/Output Formats}). Variable widths are -implicitly derived from the specified output formats. - -Created variables are initialized to spaces. - -@node VARIABLE LABELS, VECTOR, STRING, Variable Attributes -@section VARIABLE LABELS -@vindex VARIABLE LABELS - -@display -VARIABLE LABELS - /var_list 'var_label'. -@end display - -@cmd{VARIABLE LABELS} associates explanatory names -with variables. This name, called a @dfn{variable label}, is displayed by -statistical procedures. - -To assign a variable label to a group of variables, specify a slash -(@samp{/}), followed by the list of variable names and the variable -label as a string. - -@node VECTOR, WRITE FORMATS, VARIABLE LABELS, Variable Attributes -@section VECTOR -@vindex VECTOR - -@display -Two possible syntaxes: - VECTOR vec_name=var_list. - VECTOR vec_name_list(count). -@end display - -@cmd{VECTOR} allows a group of variables to be accessed as if they -were consecutive members of an array with a vector(index) notation. - -To make a vector out of a set of existing variables, specify a name for -the vector followed by an equals sign (@samp{=}) and the variables that -belong in the vector. - -To make a vector and create variables at the same time, specify one or -more vector names followed by a count in parentheses. This will cause -variables named @code{@var{vec}1} through @code{@var{vec}@var{count}} -to be created as numeric variables with print and write format F8.2. -Variable names including numeric suffixes may not exceed 8 characters -in length, and none of the variables may exist prior to @cmd{VECTOR}. - -All the variables in a vector must be the same type. - -Vectors created with @cmd{VECTOR} disappear after any procedure or -procedure-like command is executed. 
The variables contained in the -vectors remain, unless they are scratch variables (@pxref{Scratch -Variables}). - -Variables within a vector may be referenced in expressions using -@code{vector(index)} syntax. - -@node WRITE FORMATS, , VECTOR, Variable Attributes -@section WRITE FORMATS -@vindex WRITE FORMATS - -@display -WRITE FORMATS var_list (fmt_spec). -@end display - -@cmd{WRITE FORMATS} sets the write formats for the specified variables -to the specified format specification. Its syntax is identical to -that of FORMATS (@pxref{FORMATS}), but @cmd{WRITE FORMATS} sets only -write formats, not print formats. - -@node Data Manipulation, Data Selection, Variable Attributes, Top -@chapter Data transformations -@cindex transformations - -The PSPP procedures examined in this chapter manipulate data and -prepare the active file for later analyses. They do not produce output, -as a rule. - -@menu -* AGGREGATE:: Summarize multiple cases into a single case. -* AUTORECODE:: Automatic recoding of variables. -* COMPUTE:: Assigning a variable a calculated value. -* COUNT:: Counting variables with particular values. -* FLIP:: Exchange variables with cases. -* IF:: Conditionally assigning a calculated value. -* RECODE:: Mapping values from one set to another. -* SORT CASES:: Sort the active file. -@end menu - -@node AGGREGATE, AUTORECODE, Data Manipulation, Data Manipulation -@section AGGREGATE -@vindex AGGREGATE - -@display -AGGREGATE - /BREAK=var_list - /PRESORTED - /OUTFILE=@{*,'filename'@} - /DOCUMENT - /MISSING=COLUMNWISE - /dest_vars=agr_func(src_vars, args@dots{})@dots{} -@end display - -@cmd{AGGREGATE} summarizes groups of cases into single cases. -Cases are divided into groups that have the same values for one or more -variables called @dfn{break variables}. Several functions are available -for summarizing case contents. - -At least one break variable must be specified on BREAK, the only -required subcommand.
The values of these variables are used to divide -the active file into groups to be summarized. In addition, at least -one @var{dest_var} must be specified. - -By default, the active file is sorted based on the break variables -before aggregation takes place. If the active file is already sorted -or otherwise grouped in terms of the break variables, specify -PRESORTED to save time. - -The OUTFILE subcommand specifies a system file by file name string or -file handle (@pxref{FILE HANDLE}). The aggregated cases are written to -this file. If OUTFILE is not specified, or if @samp{*} is specified, -then the aggregated cases replace the active file. - -Specify DOCUMENT to copy the documents from the active file into the -aggregate file (@pxref{DOCUMENT}). Otherwise, the aggregate file will -not contain any documents, even if the aggregate file replaces the -active file. - -One or more sets of aggregation variables must be specified. Each set -comprises a list of aggregation variables, an equals sign (@samp{=}), -the name of an aggregation function (see the list below), and a list -of source variables in parentheses. Some aggregation functions expect -additional arguments following the source variable names. - -Each set must have exactly as many source variables as aggregation -variables. Each aggregation variable receives the results of applying -the specified aggregation function to the corresponding source -variable. Most aggregation functions may be applied to numeric and -short and long string variables. Others, marked below, are restricted -to numeric values. - -The available aggregation functions are as follows: - -@table @asis -@item SUM(var_name) -Sum. Limited to numeric values. -@item MEAN(var_name) -Arithmetic mean. Limited to numeric values. -@item SD(var_name) -Standard deviation of the mean. Limited to numeric values. -@item MAX(var_name) -Maximum value. -@item MIN(var_name) -Minimum value. 
-@item FGT(var_name, value) -@itemx PGT(var_name, value) -Fraction between 0 and 1, or percentage between 0 and 100, respectively, -of values greater than the specified constant. -@item FLT(var_name, value) -@itemx PLT(var_name, value) -Fraction or percentage, respectively, of values less than the specified -constant. -@item FIN(var_name, low, high) -@itemx PIN(var_name, low, high) -Fraction or percentage, respectively, of values within the specified -inclusive range of constants. -@item FOUT(var_name, low, high) -@itemx POUT(var_name, low, high) -Fraction or percentage, respectively, of values strictly outside the -specified range of constants. -@item N(var_name) -Number of non-missing values. -@item N -Number of cases aggregated to form this group. Don't supply a source -variable for this aggregation function. -@item NU(var_name) -Number of non-missing values. Each case is considered to have a weight -of 1, regardless of the current weighting variable (@pxref{WEIGHT}). -@item NU -Number of cases aggregated to form this group. Each case is considered -to have a weight of 1, regardless of the current weighting variable. -@item NMISS(var_name) -Number of missing values. -@item NUMISS(var_name) -Number of missing values. Each case is considered to have a weight of -1, regardless of the current weighting variable. -@item FIRST(var_name) -First value in this group. -@item LAST(var_name) -Last value in this group. -@end table - -Aggregation functions compare string values in terms of internal -character codes. On most modern computers, this is a form of ASCII. - -The aggregation functions listed above exclude all user-missing values -from calculations. To include user-missing values, insert a period -(@samp{.}) between the function name and left parenthesis -(e.g.@: @samp{SUM.}). - -Normally, only a single case (for SD and SD., two cases) need be -non-missing in each group for the aggregate variable to be -non-missing. 
Specifying /MISSING=COLUMNWISE inverts this behavior, so -that the aggregate variable becomes missing if any aggregated value is -missing. - -@cmd{AGGREGATE} both ignores and cancels the current @cmd{SPLIT FILE} -settings (@pxref{SPLIT FILE}). - -@node AUTORECODE, COMPUTE, AGGREGATE, Data Manipulation -@section AUTORECODE -@vindex AUTORECODE - -@display -AUTORECODE VARIABLES=src_vars INTO dest_vars - /DESCENDING - /PRINT -@end display - -The @cmd{AUTORECODE} procedure considers the @var{n} values that a variable -takes on and maps them onto values 1@dots{}@var{n} on a new numeric -variable. - -Subcommand VARIABLES is the only required subcommand and must come -first. Specify VARIABLES, an equals sign (@samp{=}), a list of source -variables, INTO, and a list of target variables. There must be the same -number of source and target variables. The target variables must not -already exist. - -By default, increasing values of a source variable (for a string, this -is based on character code comparisons) are recoded to increasing values -of its target variable. To cause increasing values of a source variable -to be recoded to decreasing values of its target variable (@var{n} down -to 1), specify DESCENDING. - -PRINT is currently ignored. - -@cmd{AUTORECODE} is a procedure. It causes the data to be read. - -@node COMPUTE, COUNT, AUTORECODE, Data Manipulation -@section COMPUTE -@vindex COMPUTE - -@display -COMPUTE variable = expression. - or -COMPUTE vector(index) = expression. -@end display - -@cmd{COMPUTE} assigns the value of an expression to a target -variable. For each case, the expression is evaluated and its value -assigned to the target variable. Numeric and short and long string -variables may be assigned. When a string expression's width differs -from the target variable's width, the string result of the expression -is truncated or padded with spaces on the right as necessary. The -expression and variable types must match.
- -For numeric variables only, the target variable need not already -exist. Numeric variables created by @cmd{COMPUTE} are assigned an -@code{F8.2} output format. String variables must be declared before -they can be used as targets for @cmd{COMPUTE}. - -The target variable may be specified as an element of a vector -(@pxref{VECTOR}). In this case, a vector index expression must be -specified in parentheses following the vector name. The index -expression must evaluate to a numeric value that, after rounding down -to the nearest integer, is a valid index for the named vector. - -Using @cmd{COMPUTE} to assign to a variable specified on @cmd{LEAVE} -(@pxref{LEAVE}) resets the variable's left state. Therefore, -@code{LEAVE} should be specified following @cmd{COMPUTE}, not before. - -@cmd{COMPUTE} is a transformation. It does not cause the active file to be -read. - -When @cmd{COMPUTE} is specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used -(@pxref{LAG}). - -@node COUNT, FLIP, COMPUTE, Data Manipulation -@section COUNT -@vindex COUNT - -@display -COUNT var_name = var@dots{} (value@dots{}). - -Each value takes one of the following forms: - number - string - num1 THRU num2 - MISSING - SYSMIS -In addition, num1 and num2 can be LO or LOWEST, or HI or HIGHEST, -respectively. -@end display - -@cmd{COUNT} creates or replaces a numeric @dfn{target} variable that -counts the occurrence of a @dfn{criterion} value or set of values over -one or more @dfn{test} variables for each case. - -The target variable values are always nonnegative integers. They are -never missing. The target variable is assigned an F8.2 output format. -@xref{Input/Output Formats}. Any variables, including long and short -string variables, may be test variables. - -User-missing values of test variables are treated just like any other -values. They are @strong{not} treated as system-missing values. 
-User-missing values that are criterion values or inside ranges of -criterion values are counted as any other values. However (for numeric -variables), keyword MISSING may be used to refer to all system- -and user-missing values. - -@cmd{COUNT} target variables are assigned values in the order -specified. In the command @code{COUNT A=A B(1) /B=A B(2).}, the -following actions occur: - -@itemize @minus -@item -The number of occurrences of 1 between @code{A} and @code{B} is counted. - -@item -@code{A} is assigned this value. - -@item -The number of occurrences of 1 between @code{B} and the @strong{new} -value of @code{A} is counted. - -@item -@code{B} is assigned this value. -@end itemize - -Despite this ordering, all @cmd{COUNT} criterion variables must exist -before the procedure is executed---they may not be created as target -variables earlier in the command! Break such a command into two -separate commands. - -The examples below may help to clarify. - -@enumerate A -@item -Assuming @code{Q0}, @code{Q2}, @dots{}, @code{Q9} are numeric variables, -the following commands: - -@enumerate -@item -Count the number of times the value 1 occurs through these variables -for each case and assigns the count to variable @code{QCOUNT}. - -@item -Print out the total number of times the value 1 occurs throughout -@emph{all} cases using @cmd{DESCRIPTIVES}. @xref{DESCRIPTIVES}, for -details. -@end enumerate - -@example -COUNT QCOUNT=Q0 TO Q9(1). -DESCRIPTIVES QCOUNT /STATISTICS=SUM. -@end example - -@item -Given these same variables, the following commands: - -@enumerate -@item -Count the number of valid values of these variables for each case and -assigns the count to variable @code{QVALID}. - -@item -Multiplies each value of @code{QVALID} by 10 to obtain a percentage of -valid values, using @cmd{COMPUTE}. @xref{COMPUTE}, for details. - -@item -Print out the percentage of valid values across all cases, using -@cmd{DESCRIPTIVES}. @xref{DESCRIPTIVES}, for details. 
-@end enumerate - -@example -COUNT QVALID=Q0 TO Q9 (LO THRU HI). -COMPUTE QVALID=QVALID*10. -DESCRIPTIVES QVALID /STATISTICS=MEAN. -@end example -@end enumerate - -@node FLIP, IF, COUNT, Data Manipulation -@section FLIP -@vindex FLIP - -@display -FLIP /VARIABLES=var_list /NEWNAMES=var_name. -@end display - -@cmd{FLIP} transposes rows and columns in the active file. It -causes cases to be swapped with variables, and vice versa. - -All variables in the transposed active file are numeric. String -variables take on the system-missing value in the transposed file. - -No subcommands are required. The VARIABLES subcommand specifies -variables that will be transformed into cases. Variables not specified -are discarded. By default, all variables are selected for -transposition. - -The variable specified by NEWNAMES, which must be a string variable, is -used to give names to the variables created by @cmd{FLIP}. If -NEWNAMES is not -specified then the default is a variable named CASE_LBL, if it exists. -If it does not then the variables created by FLIP are named VAR000 -through VAR999, then VAR1000, VAR1001, and so on. - -When a NEWNAMES variable is available, the names must be canonicalized -before becoming variable names. Invalid characters are replaced by -letter @samp{V} in the first position, or by @samp{_} in subsequent -positions. If the name thus generated is not unique, then numeric -extensions are added, starting with 1, until a unique name is found or -there are no remaining possibilities. If the latter occurs then the -FLIP operation aborts. - -The resultant dictionary contains a CASE_LBL variable, which stores the -names of the variables in the dictionary before the transposition. If -the active file is subsequently transposed using @cmd{FLIP}, this -variable can -be used to recreate the original variable names. - -FLIP honors N OF CASES. It ignores TEMPORARY, so that ``temporary'' -transformations become permanent.
- -@node IF, RECODE, FLIP, Data Manipulation -@section IF -@vindex IF - -@display -IF condition variable=expression. - or -IF condition vector(index)=expression. -@end display - -The @cmd{IF} transformation conditionally assigns the value of a target -expression to a target variable, based on the truth of a test -expression. - -Specify a boolean-valued expression (@pxref{Expressions}) to be tested -following the IF keyword. This expression is evaluated for each case. -If the value is true, then the value of the expression is computed and -assigned to the specified variable. If the value is false or missing, -nothing is done. Numeric and short and long string variables may be -assigned. When a string expression's width differs from the target -variable's width, the string result of the expression is truncated or -padded with spaces on the right as necessary. The expression and -variable types must match. - -The target variable may be specified as an element of a vector -(@pxref{VECTOR}). In this case, a vector index expression must be -specified in parentheses following the vector name. The index -expression must evaluate to a numeric value that, after rounding down -to the nearest integer, is a valid index for the named vector. - -Using @cmd{IF} to assign to a variable specified on @cmd{LEAVE} -(@pxref{LEAVE}) resets the variable's left state. Therefore, -@code{LEAVE} should be specified following @cmd{IF}, not before. - -When @cmd{IF} is specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used -(@pxref{LAG}). - -@node RECODE, SORT CASES, IF, Data Manipulation -@section RECODE -@vindex RECODE - -@display -RECODE var_list (src_value@dots{}=dest_value)@dots{} [INTO var_list]. - -src_value may take the following forms: - number - string - num1 THRU num2 - MISSING - SYSMIS - ELSE -Open-ended ranges may be specified using LO or LOWEST for num1 -or HI or HIGHEST for num2. 
- -dest_value may take the following forms: - num - string - SYSMIS - COPY -@end display - -@cmd{RECODE} translates data from one range of values to -another, via flexible user-specified mappings. Data may be remapped -in-place or copied to new variables. Numeric, short string, and long -string data can be recoded. - -Specify the list of source variables, followed by one or more mapping -specifications each enclosed in parentheses. If the data is to be -copied to new variables, specify INTO, then the list of target -variables. String target variables must already have been declared -using @cmd{STRING} or another transformation, but numeric target -variables can -be created on the fly. There must be exactly as many target variables -as source variables. Each source variable is remapped into its -corresponding target variable. - -When INTO is not used, the input and output variables must be of the -same type. Otherwise, string values can be recoded into numeric values, -and vice versa. When this is done and there is no mapping for a -particular value, either a value consisting of all spaces or the -system-missing value is assigned, depending on variable type. - -Mappings are considered from left to right. The first src_value that -matches the value of the source variable causes the target variable to -receive the value indicated by the dest_value. Literal number, string, -and range src_value's should be self-explanatory. MISSING as a -src_value matches any user- or system-missing value. SYSMIS matches the -system-missing value only. ELSE is a catch-all that matches anything. -It should be the last src_value specified. - -Numeric and string dest_value's should also be self-explanatory. COPY -causes the input values to be copied to the output. This is only valid -if the source and target variables are of the same type. SYSMIS -indicates the system-missing value.
- -If the source variables are strings and the target variables are -numeric, then there is one additional mapping available: (CONVERT), -which must be the last specified mapping. CONVERT causes a number -specified as a string to be converted to a numeric value. If the string -cannot be parsed as a number, then the system-missing value is assigned. - -Multiple recodings can be specified on a single @cmd{RECODE} invocation. -Introduce additional recodings with a slash (@samp{/}) to -separate them from the previous recodings. - -@node SORT CASES, , RECODE, Data Manipulation -@section SORT CASES -@vindex SORT CASES - -@display -SORT CASES BY var_list. -@end display - -@cmd{SORT CASES} sorts the active file by the values of one or more -variables. - -Specify BY and a list of variables to sort by. By default, variables -are sorted in ascending order. To override sort order, specify (D) or -(DOWN) after a list of variables to get descending order, or (A) or (UP) -for ascending order. These apply to the entire list of variables -preceding them. - -@cmd{SORT CASES} is a procedure. It causes the data to be read. - -@cmd{SORT CASES} attempts to sort the entire active file in main memory. -If main memory is exhausted, it falls back to a merge sort algorithm that -involves writing and reading numerous temporary files. - -@cmd{SORT CASES} may not be specified following TEMPORARY. - -@node Data Selection, Conditionals and Looping, Data Manipulation, Top -@chapter Selecting data for analysis - -This chapter documents PSPP commands that temporarily or permanently -select data records from the active file for analysis. - -@menu -* FILTER:: Exclude cases based on a variable. -* N OF CASES:: Limit the size of the active file. -* PROCESS IF:: Temporarily excluding cases. -* SAMPLE:: Select a specified proportion of cases. -* SELECT IF:: Permanently delete selected cases. -* SPLIT FILE:: Do multiple analyses with one command. -* TEMPORARY:: Make transformations' effects temporary. 
-* WEIGHT:: Weight cases by a variable. -@end menu - -@node FILTER, N OF CASES, Data Selection, Data Selection -@section FILTER -@vindex FILTER - -@display -FILTER BY var_name. -FILTER OFF. -@end display - -@cmd{FILTER} allows a boolean-valued variable to be used to select -cases from the data stream for processing. - -To set up filtering, specify BY and a variable name. Keyword -BY is optional but recommended. Cases which have a zero or system- or -user-missing value are excluded from analysis, but not deleted from the -data stream. Cases with other values are analyzed. -To filter based on a different condition, use -transformations such as @cmd{COMPUTE} or @cmd{RECODE} to compute a -filter variable of the required form, then specify that variable on -@cmd{FILTER}. - -@code{FILTER OFF} turns off case filtering. - -Filtering takes place immediately before cases pass to a procedure for -analysis. Only one filter variable may be active at a time. Normally, -case filtering continues until it is explicitly turned off with @code{FILTER -OFF}. However, if @cmd{FILTER} is placed after TEMPORARY, it filters only -the next procedure or procedure-like command. - -@node N OF CASES, PROCESS IF, FILTER, Data Selection -@section N OF CASES -@vindex N OF CASES - -@display -N [OF CASES] num_of_cases [ESTIMATED]. -@end display - -Sometimes you may want to disregard cases of your input. @cmd{N} can -do this. @code{N 100} tells PSPP to disregard all cases after the -first 100. - -If the value specified for @cmd{N} is greater than the number of cases -read in, the value is ignored. - -@cmd{N} does not discard cases or prevent them from being read. It -just causes cases beyond the last one specified to be ignored by data -analysis commands. - -A later @cmd{N} command can increase or decrease the number of cases -selected. (To select all the cases without knowing how many there are, -specify a very high number: 100000 or whatever you think is large enough.) 
- -Transformation procedures performed after @cmd{N} is executed -@emph{do} cause cases to be discarded. - -@cmd{SAMPLE}, @cmd{PROCESS IF}, and @cmd{SELECT IF} have -precedence over @cmd{N}---the same results are obtained by both of the -following fragments, given the same random number seeds: - -@example -@i{@dots{}set up, read in data@dots{}} -N 100. -SAMPLE .5. -@i{@dots{}analyze data@dots{}} - -@i{@dots{}set up, read in data@dots{}} -SAMPLE .5. -N 100. -@i{@dots{}analyze data@dots{}} -@end example - -Both fragments above first randomly sample approximately half of the -cases, then select the first 100 of those sampled. - -@cmd{N} with the @code{ESTIMATED} keyword gives an -estimated number of cases before @cmd{DATA LIST} or another command to -read in data. @code{ESTIMATED} never limits the number of cases -processed by procedures. PSPP currently does not make use of -case count estimates. - -When @cmd{N} is specified after @cmd{TEMPORARY}, it affects only -the next procedure (@pxref{TEMPORARY}). - -@node PROCESS IF, SAMPLE, N OF CASES, Data Selection -@section PROCESS IF -@vindex PROCESS IF - -@example -PROCESS IF expression. -@end example - -@cmd{PROCESS IF} temporarily eliminates cases from the -data stream. Its effects are active only through the execution of the -next procedure or procedure-like command. - -Specify a boolean expression (@pxref{Expressions}). If the value of the -expression is true for a particular case, the case will be analyzed. If -the expression has a false or missing value, then the case will be -deleted from the data stream for this procedure only. - -Regardless of its placement relative to other commands, @cmd{PROCESS IF} -always takes effect immediately before data passes to the procedure. -Only one @cmd{PROCESS IF} command may be in effect at any given time. - -The effects of @cmd{PROCESS IF} are similar, but not identical, to the -effects of executing @cmd{TEMPORARY}, then @cmd{SELECT IF} -(@pxref{SELECT IF}). 
- -The filtering performed by @cmd{PROCESS IF} takes place immediately -before cases pass to a procedure for analysis. Because @cmd{PROCESS -IF} affects only a single procedure, its placement relative to -@cmd{TEMPORARY} is unimportant. - -@cmd{PROCESS IF} is deprecated. It is included for compatibility with -old command files. New syntax files should use @cmd{SELECT IF} or -@cmd{FILTER} instead. - -@node SAMPLE, SELECT IF, PROCESS IF, Data Selection -@section SAMPLE -@vindex SAMPLE - -@display -SAMPLE num1 [FROM num2]. -@end display - -@cmd{SAMPLE} randomly samples a proportion of the cases in the active -file. Unless it follows @cmd{TEMPORARY}, it operates as a -transformation, permanently removing cases from the active file. - -The proportion to sample can be expressed as a single number between 0 -and 1. If @code{k} is the number specified, and @code{N} is the number -of currently-selected cases in the active file, then after -@code{SAMPLE @var{k}.}, approximately @code{k*N} cases will be -selected. - -The proportion to sample can also be specified in the style @code{SAMPLE -@var{m} FROM @var{N}}. With this style, cases are selected as follows: - -@enumerate -@item -If @var{N} is equal to the number of currently-selected cases in the -active file, exactly @var{m} cases will be selected. - -@item -If @var{N} is greater than the number of currently-selected cases in the -active file, an equivalent proportion of cases will be selected. - -@item -If @var{N} is less than the number of currently-selected cases in the -active, exactly @var{m} cases will be selected @emph{from the first -@var{N} cases in the active file.} -@end enumerate - -@cmd{SAMPLE} and @cmd{SELECT IF} are performed in -the order specified by the syntax file. - -@cmd{SAMPLE} is always performed before @code{N OF CASES}, regardless -of ordering in the syntax file (@pxref{N OF CASES}). - -The same values for @cmd{SAMPLE} may result in different samples. 
To -obtain the same sample, use the @code{SET} command to set the random -number seed to the same value before each @cmd{SAMPLE}. Different -samples may still result when the file is processed on systems with -differing endianness or floating-point formats. By default, the -random number seed is based on the system time. - -@node SELECT IF, SPLIT FILE, SAMPLE, Data Selection -@section SELECT IF -@vindex SELECT IF - -@display -SELECT IF expression. -@end display - -@cmd{SELECT IF} selects cases for analysis based on the value of a -boolean expression. Cases not selected are permanently eliminated -from the active file, unless @cmd{TEMPORARY} is in effect -(@pxref{TEMPORARY}). - -Specify a boolean expression (@pxref{Expressions}). If the value of the -expression is true for a particular case, the case will be analyzed. If -the expression has a false or missing value, then the case will be -deleted from the data stream. - -Place @cmd{SELECT IF} as early in the command file as -possible. Cases that are deleted early can be processed more -efficiently in time and space. - -When @cmd{SELECT IF} is specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used -(@pxref{LAG}). - -@node SPLIT FILE, TEMPORARY, SELECT IF, Data Selection -@section SPLIT FILE -@vindex SPLIT FILE - -@display -Two possible syntaxes: - SPLIT FILE BY var_list. - SPLIT FILE OFF. -@end display - -@cmd{SPLIT FILE} allows multiple sets of data present in one data -file to be analyzed separately using single statistical procedure -commands. - -Specify a list of variable names to analyze multiple sets of -data separately. Groups of cases having the same values for these -variables are analyzed by statistical procedure commands as one group. -An independent analysis is carried out for each group of cases, and the -variable values for the group are printed along with the analysis. 
- -Specify OFF to disable @cmd{SPLIT FILE} and resume analysis of the -entire active file as a single group of data. - -When @cmd{SPLIT FILE} is specified after @cmd{TEMPORARY}, it affects only -the next procedure (@pxref{TEMPORARY}). - -@node TEMPORARY, WEIGHT, SPLIT FILE, Data Selection -@section TEMPORARY -@vindex TEMPORARY - -@display -TEMPORARY. -@end display - -@cmd{TEMPORARY} is used to make the effects of transformations -following its execution temporary. These transformations will -affect only the execution of the next procedure or procedure-like -command. Their effects will not be saved to the active file. - -The only specification on @cmd{TEMPORARY} is the command name. - -@cmd{TEMPORARY} may not appear within a @cmd{DO IF} or @cmd{LOOP} -construct. It may appear only once between procedures and -procedure-like commands. - -Scratch variables cannot be used following @cmd{TEMPORARY}. - -An example may help to clarify: - -@example -DATA LIST /X 1-2. -BEGIN DATA. - 2 - 4 -10 -15 -20 -24 -END DATA. -COMPUTE X=X/2. -TEMPORARY. -COMPUTE X=X+3. -DESCRIPTIVES X. -DESCRIPTIVES X. -@end example - -The data read by the first @cmd{DESCRIPTIVES} are 4, 5, 8, -10.5, 13, 15. The data read by the second @cmd{DESCRIPTIVES} are 1, 2, -5, 7.5, 10, 12. - -@node WEIGHT, , TEMPORARY, Data Selection -@section WEIGHT -@vindex WEIGHT - -@display -WEIGHT BY var_name. -WEIGHT OFF. -@end display - -@cmd{WEIGHT} assigns cases varying weights, -changing the frequency distribution of the active file. Execution of -@cmd{WEIGHT} is delayed until data have been read. - -If a variable name is specified, @cmd{WEIGHT} causes the values of that -variable to be used as weighting factors for subsequent statistical -procedures. Use of keyword BY is optional but recommended. Weighting -variables must be numeric. Scratch variables may not be used for -weighting (@pxref{Scratch Variables}). - -When OFF is specified, subsequent statistical procedures will weight all -cases equally.
- -A positive integer weighting factor @var{w} on a case will yield the -same statistical output as would replicating the case @var{w} times. -A weighting factor of 0 is treated for statistical purposes as if the -case did not exist in the input. Weighting values need not be -integers, but negative and system-missing values for the weighting -variable are interpreted as weighting factors of 0. User-missing -values are not treated specially. - -When @cmd{WEIGHT} is specified after @cmd{TEMPORARY}, it affects only -the next procedure (@pxref{TEMPORARY}). - -@cmd{WEIGHT} does not cause cases in the active file to be replicated in -memory. - -@node Conditionals and Looping, Statistics, Data Selection, Top -@chapter Conditional and Looping Constructs -@cindex conditionals -@cindex loops -@cindex flow of control -@cindex control flow - -This chapter documents PSPP commands used for conditional execution, -looping, and flow of control. - -@menu -* BREAK:: Exit a loop. -* DO IF:: Conditionally execute a block of code. -* DO REPEAT:: Textually repeat a code block. -* LOOP:: Repeat a block of code. -@end menu - -@node BREAK, DO IF, Conditionals and Looping, Conditionals and Looping -@section BREAK -@vindex BREAK - -@display -BREAK. -@end display - -@cmd{BREAK} terminates execution of the innermost currently executing -@cmd{LOOP} construct. - -@cmd{BREAK} is allowed only inside @cmd{LOOP}@dots{}@cmd{END LOOP}. -@xref{LOOP}, for more details. - -@node DO IF, DO REPEAT, BREAK, Conditionals and Looping -@section DO IF -@vindex DO IF - -@display -DO IF condition. - @dots{} -[ELSE IF condition. - @dots{} -]@dots{} -[ELSE. - @dots{}] -END IF. -@end display - -@cmd{DO IF} allows one of several sets of transformations to be -executed, depending on user-specified conditions. - -If the specified boolean expression evaluates as true, then the block -of code following @cmd{DO IF} is executed. If it evaluates as -missing, then -none of the code blocks is executed. 
If it is false, then -the boolean expression on the first @cmd{ELSE IF}, if present, is tested in -turn, with the same rules applied. If all expressions evaluate to -false, then the @cmd{ELSE} code block is executed, if it is present. - -When @cmd{DO IF} or @cmd{ELSE IF} is specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used -(@pxref{LAG}). - -@node DO REPEAT, LOOP, DO IF, Conditionals and Looping -@section DO REPEAT -@vindex DO REPEAT - -@display -DO REPEAT repvar_name=expansion@dots{}. - @dots{} -END REPEAT [PRINT]. - -expansion takes one of the following forms: - var_list - num_or_range@dots{} - 'string'@dots{} - -num_or_range takes one of the following forms: - number - num1 TO num2 -@end display - -@cmd{DO REPEAT} repeats a block of code, textually substituting -different variables, numbers, or strings into the block with each -repetition. - -Specify a repeat variable name followed by an equals sign (@samp{=}) and -the list of replacements. Replacements can be a list of variables -(which may be existing variables or new variables or a combination -thereof), of numbers, or of strings. When new variable names are -specified, @cmd{DO REPEAT} creates them as numeric variables. When numbers -are specified, runs of integers may be indicated with TO notation, for -instance @samp{1 TO 5} and @samp{1 2 3 4 5} would be equivalent. There -is no equivalent notation for string values. - -Multiple repeat variables can be specified. When this is done, each -variable must have the same number of replacements. - -The code within @cmd{DO REPEAT} is repeated as many times as there are -replacements for each variable. The first time, the first value for -each repeat variable is substituted; the second time, the second value -for each repeat variable is substituted; and so on. - -Repeat variable substitutions work like macros. 
They take place -anywhere in a line that the repeat variable name occurs as a token, -including command and subcommand names. For this reason it is not a -good idea to select words commonly used in command and subcommand names -as repeat variable identifiers. - -If PRINT is specified on @cmd{END REPEAT}, the commands after substitutions -are made are printed to the listing file, prefixed by a plus sign -(@samp{+}). - -@node LOOP, , DO REPEAT, Conditionals and Looping -@section LOOP -@vindex LOOP - -@display -LOOP [index_var=start TO end [BY incr]] [IF condition]. - @dots{} -END LOOP [IF condition]. -@end display - -@cmd{LOOP} iterates a group of commands. A number of -termination options are offered. - -Specify index_var to make that variable count from one value to -another by a particular increment. index_var must be a pre-existing -numeric variable. start, end, and incr are numeric expressions -(@pxref{Expressions}.) - -During the first iteration, index_var is set to the value of start. -During each successive iteration, index_var is increased by the value of -incr. If end > start, then the loop terminates when index_var > end; -otherwise it terminates when index_var < end. If incr is not specified -then it defaults to +1 or -1 as appropriate. - -If end > start and incr < 0, or if end < start and incr > 0, then the -loop is never executed. index_var is nevertheless set to the value of -start. - -Modifying index_var within the loop is allowed, but it has no effect on -the value of index_var in the next iteration. - -Specify a boolean expression for the condition on @cmd{LOOP} to -cause the loop to be executed only if the condition is true. If the -condition is false or missing before the loop contents are executed the -first time, the loop contents are not executed at all. - -If index and condition clauses are both present on @cmd{LOOP}, the index -clause is always evaluated first. 
- -Specify a boolean expression for the condition on @cmd{END LOOP} to cause -the loop to terminate if the condition is not true after the enclosed -code block is executed. The condition is evaluated at the end of the -loop, not at the beginning. - -If the index clause and both condition clauses are not present, then the -loop is executed MXLOOPS (@pxref{SET}) times. - -@cmd{BREAK} also terminates @cmd{LOOP} execution (@pxref{BREAK}). - -When @cmd{LOOP} or @cmd{END LOOP} is specified following @cmd{TEMPORARY} -(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used -(@pxref{LAG}). - -@node Statistics, Utilities, Conditionals and Looping, Top -@chapter Statistics - -This chapter documents the statistical procedures that PSPP supports so -far. - -@menu -* DESCRIPTIVES:: Descriptive statistics. -* FREQUENCIES:: Frequency tables. -* CROSSTABS:: Crosstabulation tables. -* T-TEST:: Test hypotheses about means. -* ONEWAY:: One-way analysis of variance. -@end menu - -@node DESCRIPTIVES, FREQUENCIES, Statistics, Statistics -@section DESCRIPTIVES - -@vindex DESCRIPTIVES -@display -DESCRIPTIVES - /VARIABLES=var_list - /MISSING=@{VARIABLE,LISTWISE@} @{INCLUDE,NOINCLUDE@} - /FORMAT=@{LABELS,NOLABELS@} @{NOINDEX,INDEX@} @{LINE,SERIAL@} - /SAVE - /STATISTICS=@{ALL,MEAN,SEMEAN,STDDEV,VARIANCE,KURTOSIS, - SKEWNESS,RANGE,MINIMUM,MAXIMUM,SUM,DEFAULT, - SESKEWNESS,SEKURTOSIS@} - /SORT=@{NONE,MEAN,SEMEAN,STDDEV,VARIANCE,KURTOSIS,SKEWNESS, - RANGE,MINIMUM,MAXIMUM,SUM,SESKEWNESS,SEKURTOSIS,NAME@} - @{A,D@} -@end display - -The @cmd{DESCRIPTIVES} procedure reads the active file and outputs -descriptive -statistics requested by the user. In addition, it can optionally -compute Z-scores. - -The VARIABLES subcommand, which is required, specifies the list of -variables to be analyzed. Keyword VARIABLES is optional. - -All other subcommands are optional: - -The MISSING subcommand determines the handling of missing values.
If -INCLUDE is set, then user-missing values are included in the -calculations. If NOINCLUDE is set, which is the default, user-missing -values are excluded. If VARIABLE is set, then missing values are -excluded on a variable by variable basis; if LISTWISE is set, then -the entire case is excluded whenever any value in that case has a -system-missing or, if INCLUDE is set, user-missing value. - -The FORMAT subcommand affects the output format. Currently the -LABELS/NOLABELS and NOINDEX/INDEX settings are not used. When SERIAL is -set, both valid and missing number of cases are listed in the output; -when NOSERIAL is set, only valid cases are listed. - -The SAVE subcommand causes @cmd{DESCRIPTIVES} to calculate Z scores for all -the specified variables. The Z scores are saved to new variables. -Variable names are generated by trying first the original variable name -with Z prepended and truncated to a maximum of 8 characters, then the -names ZSC000 through ZSC999, STDZ00 through STDZ09, ZZZZ00 through -ZZZZ09, ZQZQ00 through ZQZQ09, in that sequence. In addition, Z score -variable names can be specified explicitly on VARIABLES in the variable -list by enclosing them in parentheses after each variable. - -The STATISTICS subcommand specifies the statistics to be displayed: - -@table @code -@item ALL -All of the statistics below. -@item MEAN -Arithmetic mean. -@item SEMEAN -Standard error of the mean. -@item STDDEV -Standard deviation. -@item VARIANCE -Variance. -@item KURTOSIS -Kurtosis and standard error of the kurtosis. -@item SKEWNESS -Skewness and standard error of the skewness. -@item RANGE -Range. -@item MINIMUM -Minimum value. -@item MAXIMUM -Maximum value. -@item SUM -Sum. -@item DEFAULT -Mean, standard deviation of the mean, minimum, maximum. -@item SEKURTOSIS -Standard error of the kurtosis. -@item SESKEWNESS -Standard error of the skewness. -@end table - -The SORT subcommand specifies how the statistics should be sorted. 
Most -of the possible values should be self-explanatory. NAME causes the -statistics to be sorted by name. By default, the statistics are listed -in the order that they are specified on the VARIABLES subcommand. The A -and D settings request an ascending or descending sort order, -respectively. - -@node FREQUENCIES, CROSSTABS, DESCRIPTIVES, Statistics -@section FREQUENCIES - -@vindex FREQUENCIES -@display -FREQUENCIES - /VARIABLES=var_list - /FORMAT=@{TABLE,NOTABLE,LIMIT(limit)@} - @{STANDARD,CONDENSE,ONEPAGE[(onepage_limit)]@} - @{LABELS,NOLABELS@} - @{AVALUE,DVALUE,AFREQ,DFREQ@} - @{SINGLE,DOUBLE@} - @{OLDPAGE,NEWPAGE@} - /MISSING=@{EXCLUDE,INCLUDE@} - /STATISTICS=@{DEFAULT,MEAN,SEMEAN,MEDIAN,MODE,STDDEV,VARIANCE, - KURTOSIS,SKEWNESS,RANGE,MINIMUM,MAXIMUM,SUM, - SESKEWNESS,SEKURTOSIS,ALL,NONE@} - /NTILES=ntiles - /PERCENTILES=percent@dots{} - -(These options are not currently implemented.) - /BARCHART=@dots{} - /HISTOGRAM=@dots{} - /HBAR=@dots{} - /GROUPED=@dots{} - -(Integer mode.) - /VARIABLES=var_list (low,high)@dots{} -@end display - -The @cmd{FREQUENCIES} procedure outputs frequency tables for specified -variables. -@cmd{FREQUENCIES} can also calculate and display descriptive statistics -(including median and mode) and percentiles. - -In the future, @cmd{FREQUENCIES} will also support graphical output in the -form of bar charts and histograms. In addition, it will be able to -support percentiles for grouped data. - -The VARIABLES subcommand is the only required subcommand. Specify the -variables to be analyzed. In most cases, this is all that is required. -This is known as @dfn{general mode}. - -Occasionally, one may want to invoke a special mode called @dfn{integer -mode}. Normally, in general mode, PSPP will automatically determine -what values occur in the data. In integer mode, the user specifies the -range of values that the data assumes. To invoke this mode, specify a -range of data values in parentheses, separated by a comma. 
Data values -inside the range are truncated to the nearest integer, then assigned to -that value. If values occur outside this range, they are discarded. - -The FORMAT subcommand controls the output format. It has several -possible settings: - -@itemize @bullet -@item -TABLE, the default, causes a frequency table to be output for every -variable specified. NOTABLE prevents them from being output. LIMIT -with a numeric argument causes them to be output except when there are -more than the specified number of values in the table. - -@item -STANDARD frequency tables contain more complete information, but also -take up more space on the printed page. CONDENSE frequency tables are -less informative but take up less space. ONEPAGE with a numeric -argument will output standard frequency tables if there are the -specified number of values or fewer, condensed tables otherwise. ONEPAGE -without an argument defaults to a threshold of 50 values. - -@item -LABELS causes value labels to be displayed in STANDARD frequency -tables. NOLABELS prevents this. - -@item -Normally frequency tables are sorted in ascending order by value. This -is AVALUE. DVALUE tables are sorted in descending order by value. -AFREQ and DFREQ tables are sorted in ascending and descending order, -respectively, by frequency count. - -@item -SINGLE spaced frequency tables are closely spaced. DOUBLE spaced -frequency tables have wider spacing. - -@item -OLDPAGE and NEWPAGE are not currently used. -@end itemize - -The MISSING subcommand controls the handling of user-missing values. -When EXCLUDE, the default, is set, user-missing values are not included -in frequency tables or statistics. When INCLUDE is set, user-missing -values are included. System-missing values are never included in statistics, -but are listed in frequency tables. - -The available STATISTICS are the same as available in @cmd{DESCRIPTIVES} -(@pxref{DESCRIPTIVES}), with the addition of MEDIAN, the data's median -value, and MODE, the mode.
(If there are multiple modes, the smallest -value is reported.) By default, the mean, standard deviation of the -mean, minimum, and maximum are reported for each variable. - -PERCENTILES causes the specified percentiles to be reported. -The percentiles should be presented as a list of numbers between 0 -and 100 inclusive. -The NTILES subcommand causes the percentiles to be reported at the -boundaries of the data set divided into the specified number of ranges. -For instance, @code{/NTILES=4} would cause quartiles to be reported. - - -@node CROSSTABS, T-TEST, FREQUENCIES, Statistics -@section CROSSTABS - -@vindex CROSSTABS -@display -CROSSTABS - /TABLES=var_list BY var_list [BY var_list]@dots{} - /MISSING=@{TABLE,INCLUDE,REPORT@} - /WRITE=@{NONE,CELLS,ALL@} - /FORMAT=@{TABLES,NOTABLES@} - @{LABELS,NOLABELS,NOVALLABS@} - @{PIVOT,NOPIVOT@} - @{AVALUE,DVALUE@} - @{NOINDEX,INDEX@} - @{BOX,NOBOX@} - /CELLS=@{COUNT,ROW,COLUMN,TOTAL,EXPECTED,RESIDUAL,SRESIDUAL, - ASRESIDUAL,ALL,NONE@} - /STATISTICS=@{CHISQ,PHI,CC,LAMBDA,UC,BTAU,CTAU,RISK,GAMMA,D, - KAPPA,ETA,CORR,ALL,NONE@} - -(Integer mode.) - /VARIABLES=var_list (low,high)@dots{} -@end display - -The @cmd{CROSSTABS} procedure displays crosstabulation -tables requested by the user. It can calculate several statistics for -each cell in the crosstabulation tables. In addition, a number of -statistics can be calculated for each table itself. - -The TABLES subcommand is used to specify the tables to be reported. Any -number of dimensions is permitted, and any number of variables per -dimension is allowed. The TABLES subcommand may be repeated as many -times as needed. This is the only required subcommand in @dfn{general -mode}. - -Occasionally, one may want to invoke a special mode called @dfn{integer -mode}. Normally, in general mode, PSPP automatically determines -what values occur in the data. In integer mode, the user specifies the -range of values that the data assumes.
To invoke this mode, specify the -VARIABLES subcommand, giving a range of data values in parentheses for -each variable to be used on the TABLES subcommand. Data values inside -the range are truncated to the nearest integer, then assigned to that -value. If values occur outside this range, they are discarded. When it -is present, the VARIABLES subcommand must precede the TABLES -subcommand. - -In general mode, numeric and string variables may be specified on -TABLES. Although long string variables are allowed, only their -initial short-string parts are used. In integer mode, only numeric -variables are allowed. - -The MISSING subcommand determines the handling of user-missing values. -When set to TABLE, the default, missing values are dropped on a table by -table basis. When set to INCLUDE, user-missing values are included in -tables and statistics. When set to REPORT, which is allowed only in -integer mode, user-missing values are included in tables but marked with -an @samp{M} (for ``missing'') and excluded from statistical -calculations. - -Currently the WRITE subcommand is ignored. - -The FORMAT subcommand controls the characteristics of the -crosstabulation tables to be displayed. It has a number of possible -settings: - -@itemize @bullet -@item -TABLES, the default, causes crosstabulation tables to be output. -NOTABLES suppresses them. - -@item -LABELS, the default, allows variable labels and value labels to appear -in the output. NOLABELS suppresses them. NOVALLABS displays variable -labels but suppresses value labels. - -@item -PIVOT, the default, causes each TABLES subcommand to be displayed in a -pivot table format. NOPIVOT causes the old-style crosstabulation format -to be used. - -@item -AVALUE, the default, causes values to be sorted in ascending order. -DVALUE asserts a descending sort order. - -@item -INDEX/NOINDEX is currently ignored. - -@item -BOX/NOBOX is currently ignored. 
-@end itemize - -The CELLS subcommand controls the contents of each cell in the displayed -crosstabulation table. The possible settings are: - -@table @asis -@item COUNT -Frequency count. -@item ROW -Row percent. -@item COLUMN -Column percent. -@item TOTAL -Table percent. -@item EXPECTED -Expected value. -@item RESIDUAL -Residual. -@item SRESIDUAL -Standardized residual. -@item ASRESIDUAL -Adjusted standardized residual. -@item ALL -All of the above. -@item NONE -Suppress cells entirely. -@end table - -@samp{/CELLS} without any settings specified requests COUNT, ROW, -COLUMN, and TOTAL. If CELLS is not specified at all then only COUNT -will be selected. - -The STATISTICS subcommand selects statistics for computation: - -@table @asis -@item CHISQ -Pearson chi-square, likelihood ratio, Fisher's exact test, continuity -correction, linear-by-linear association. -@item PHI -Phi. -@item CC -Contingency coefficient. -@item LAMBDA -Lambda. -@item UC -Uncertainty coefficient. -@item BTAU -Tau-b. -@item CTAU -Tau-c. -@item RISK -Risk estimate. -@item GAMMA -Gamma. -@item D -Somers' D. -@item KAPPA -Cohen's Kappa. -@item ETA -Eta. -@item CORR -Spearman correlation, Pearson's r. -@item ALL -All of the above. -@item NONE -No statistics. -@end table - -Selected statistics are only calculated when appropriate for the -statistic. Certain statistics require tables of a particular size, and -some statistics are calculated only in integer mode. - -@samp{/STATISTICS} without any settings selects CHISQ. If the -STATISTICS subcommand is not given, no statistics are calculated. - -@strong{Please note:} Currently the implementation of CROSSTABS has the -following bugs: - -@itemize @bullet -@item -Pearson's R (but not Spearman) is off a little. -@item -T values for Spearman's R and Pearson's R are wrong. -@item -Significance of symmetric and directional measures is not calculated. -@item -Asymmetric ASEs and T values for lambda are wrong.
-@item -ASE of Goodman and Kruskal's tau is not calculated. -@item -ASE of symmetric Somers' d is wrong. -@item -Approximate T of uncertainty coefficient is wrong. -@end itemize - -Fixes for any of these deficiencies would be welcomed. - -@node T-TEST, ONEWAY, CROSSTABS, Statistics -@comment node-name, next, previous, up -@section T-TEST - -@vindex T-TEST -@display -T-TEST - /MISSING=@{ANALYSIS,LISTWISE@} @{EXCLUDE,INCLUDE@} - /CRITERIA=CIN(confidence) - - -(One Sample mode.) - TESTVAL=test_value - /VARIABLES=var_list - - -(Independent Samples mode.) - GROUPS=var(value1 [, value2]) - /VARIABLES=var_list - - -(Paired Samples mode.) - PAIRS=var_list [WITH var_list [(PAIRED)] ] - -@end display - - -The @cmd{T-TEST} procedure outputs tables used in testing hypotheses about -means. -It operates in one of three modes: -@itemize -@item One Sample mode. -@item Independent Samples mode. -@item Paired Samples mode. -@end itemize - -@noindent -Each of these modes is described in more detail below. -There are two optional subcommands which are common to all modes. - -The @cmd{/CRITERIA} subcommand tells PSPP the confidence interval used -in the tests. The default value is 0.95. - - -The @cmd{MISSING} subcommand determines the handling of missing -values. -If INCLUDE is set, then user-missing values are included in the -calculations, but system-missing values are not. -If EXCLUDE is set, which is the default, user-missing -values are excluded as well as system-missing values. -This is the default. - -If LISTWISE is set, then the entire case is excluded from analysis -whenever any variable specified in the @cmd{/VARIABLES}, @cmd{/PAIRS} or -@cmd{/GROUPS} subcommands contains a missing value. -If ANALYSIS is set, then missing values are excluded only in the analysis for -which they would be needed. This is the default.
- - -@menu -* One Sample Mode:: Testing against a hypothesised mean -* Independent Samples Mode:: Testing two independent groups for equal mean -* Paired Samples Mode:: Testing two interdependent groups for equal mean -@end menu - -@node One Sample Mode, Independent Samples Mode, T-TEST, T-TEST -@subsection One Sample Mode - -The @cmd{TESTVAL} subcommand invokes the One Sample mode. -This mode is used to test a population mean against a hypothesised -mean. -The value given to the @cmd{TESTVAL} subcommand is the value against -which you wish to test. -In this mode, you must also use the @cmd{/VARIABLES} subcommand to -tell PSPP which variables you wish to test. - -@node Independent Samples Mode, Paired Samples Mode, One Sample Mode, T-TEST -@comment node-name, next, previous, up -@subsection Independent Samples Mode - -The @cmd{GROUPS} subcommand invokes Independent Samples mode or -`Groups' mode. -This mode is used to test whether two groups of values have the -same population mean. -In this mode, you must also use the @cmd{/VARIABLES} subcommand to -tell PSPP the dependent variables you wish to test. - -The variable given in the @cmd{GROUPS} subcommand is the independent -variable which determines to which group the samples belong. -The values in parentheses are the specific values of the independent -variable for each group. -If the parentheses are omitted and no values are given, the default values -of 1.0 and 2.0 are assumed. - -If the independent variable is numeric, -it is acceptable to specify only one value inside the parentheses. -If you do this, cases where the independent variable is -less than or equal to this value belong to the first group, and cases -greater than this value belong to the second group. -When using this form of the @cmd{GROUPS} subcommand, missing values in -the independent variable are excluded on a listwise basis, regardless -of whether @cmd{/MISSING=LISTWISE} was specified. 
- - -@node Paired Samples Mode, , Independent Samples Mode, T-TEST -@comment node-name, next, previous, up -@subsection Paired Samples Mode - -The @cmd{PAIRS} subcommand introduces Paired Samples mode. -Use this mode when repeated measures have been taken from the same -samples. -If the @code{WITH} keyword is omitted, then tables for all -combinations of variables given in the @cmd{PAIRS} subcommand are -generated. -If the @code{WITH} keyword is given, and the @code{(PAIRED)} keyword -is also given, then the number of variables preceding @code{WITH} -must be the same as the number following it. -In this case, tables for each respective pair of variables are -generated. -In the event that the @code{WITH} keyword is given, but the -@code{(PAIRED)} keyword is omitted, then tables for each combination -of variable preceding @code{WITH} against variable following -@code{WITH} are generated. - - -@node ONEWAY, , T-TEST, Statistics -@comment node-name, next, previous, up -@section Oneway - -@vindex ONEWAY -@cindex analysis of variance -@cindex ANOVA - -@display -ONEWAY - [/VARIABLES = ] var_list BY var - /MISSING=@{ANALYSIS,LISTWISE@} @{EXCLUDE,INCLUDE@} - /CONTRASTS= value1 [, value2] ... [,valueN] - /STATISTICS=@{DESCRIPTIVES,HOMOGENEITY@} - -@end display - -The @cmd{ONEWAY} procedure performs a one-way analysis of variance of -variables factored by a single independent variable. -It is used to compare the means of a population -divided into more than two groups. - -The variables to be analysed should be given in the @code{VARIABLES} -subcommand. -The list of variables must be followed by the @code{BY} keyword and -the name of the independent (or factor) variable. - -You can use the @code{STATISTICS} subcommand to tell PSPP to display -ancillary information. The options accepted are: -@itemize -@item DESCRIPTIVES -Displays descriptive statistics about the groups factored by the independent -variable.
-@item HOMOGENEITY -Displays the Levene test of Homogeneity of Variance for the -variables and their groups. -@end itemize - -The @code{CONTRASTS} subcommand is used when you anticipate certain -differences between the groups. -The subcommand must be followed by a list of numerals which are the -coefficients of the groups to be tested. -The number of coefficients must correspond to the number of distinct -groups (or values of the independent variable). -If the sum of the coefficients is not zero, then PSPP will -display a warning, but will proceed with the analysis. -The @code{CONTRASTS} subcommand may be given up to 10 times in order -to specify different contrast tests. - - - -@node Utilities, Not Implemented, Statistics, Top -@chapter Utilities - -Commands that don't fit any other category are placed here. - -Most of these commands are not affected by commands like @cmd{IF} and -@cmd{LOOP}: -they take effect only once, unconditionally, at the time that they are -encountered in the input. - -@menu -* COMMENT:: Document your syntax file. -* DOCUMENT:: Document the active file. -* DISPLAY DOCUMENTS:: Display active file documents. -* DISPLAY FILE LABEL:: Display the active file label. -* DROP DOCUMENTS:: Remove documents from the active file. -* ERASE:: Erase a file. -* EXECUTE:: Execute pending transformations. -* FILE LABEL:: Set the active file's label. -* FINISH:: Terminate the PSPP session. -* HOST:: Temporarily return to the operating system. -* INCLUDE:: Include a file within the current one. -* QUIT:: Terminate the PSPP session. -* SET:: Adjust PSPP runtime parameters. -* SHOW:: Display runtime parameters. -* SUBTITLE:: Provide a document subtitle. -* TITLE:: Provide a document title. -@end menu - -@node COMMENT, DOCUMENT, Utilities, Utilities -@section COMMENT -@vindex COMMENT -@vindex * - -@display -Two possible syntaxes: - COMMENT comment text @dots{} . - *comment text @dots{} . -@end display - -@cmd{COMMENT} is ignored.
It is used to provide information to -the author and other readers of the PSPP syntax file. - -@cmd{COMMENT} can extend over any number of lines. Don't forget to -terminate it with a dot or a blank line. - -@node DOCUMENT, DISPLAY DOCUMENTS, COMMENT, Utilities -@section DOCUMENT -@vindex DOCUMENT - -@display -DOCUMENT documentary_text. -@end display - -@cmd{DOCUMENT} adds one or more lines of descriptive commentary to the -active file. Documents added in this way are saved to system files. -They can be viewed using @cmd{SYSFILE INFO} or @cmd{DISPLAY -DOCUMENTS}. They can be removed from the active file with @cmd{DROP -DOCUMENTS}. - -Specify the documentary text following the DOCUMENT keyword. You can -extend the documentary text over as many lines as necessary. Lines are -truncated at 80 characters width. Don't forget to terminate -the command with a dot or a blank line. - -@node DISPLAY DOCUMENTS, DISPLAY FILE LABEL, DOCUMENT, Utilities -@section DISPLAY DOCUMENTS -@vindex DISPLAY DOCUMENTS - -@display -DISPLAY DOCUMENTS. -@end display - -@cmd{DISPLAY DOCUMENTS} displays the documents in the active file. Each -document is preceded by a line giving the time and date that it was -added. @xref{DOCUMENT}. - -@node DISPLAY FILE LABEL, DROP DOCUMENTS, DISPLAY DOCUMENTS, Utilities -@section DISPLAY FILE LABEL -@vindex DISPLAY FILE LABEL - -@display -DISPLAY FILE LABEL. -@end display - -@cmd{DISPLAY FILE LABEL} displays the file label contained in the -active file, -if any. @xref{FILE LABEL}. - -@node DROP DOCUMENTS, ERASE, DISPLAY FILE LABEL, Utilities -@section DROP DOCUMENTS -@vindex DROP DOCUMENTS - -@display -DROP DOCUMENTS. -@end display - -@cmd{DROP DOCUMENTS} removes all documents from the active file. -New documents can be added with @cmd{DOCUMENT} (@pxref{DOCUMENT}). - -@cmd{DROP DOCUMENTS} changes only the active file. It does not modify any -system files stored on disk. 
- - -@node ERASE, EXECUTE, DROP DOCUMENTS, Utilities -@comment node-name, next, previous, up -@section ERASE -@vindex ERASE - -@display -ERASE FILE file_name. -@end display - -@cmd{ERASE FILE} deletes a file from the local filesystem. -file_name must be quoted. -This command cannot be used if the SAFER setting is active. - - -@node EXECUTE, FILE LABEL, ERASE, Utilities -@section EXECUTE -@vindex EXECUTE - -@display -EXECUTE. -@end display - -@cmd{EXECUTE} causes the active file to be read and all pending -transformations to be executed. - -@node FILE LABEL, FINISH, EXECUTE, Utilities -@section FILE LABEL -@vindex FILE LABEL - -@display -FILE LABEL file_label. -@end display - -@cmd{FILE LABEL} provides a title for the active file. This -title will be saved into system files and portable files that are -created during this PSPP run. - -file_label need not be quoted. If quotes are -included, they become part of the file label. - -@node FINISH, HOST, FILE LABEL, Utilities -@section FINISH -@vindex FINISH - -@display -FINISH. -@end display - -@cmd{FINISH} terminates the current PSPP session and returns -control to the operating system. - -This command is not valid in interactive mode. - -@node HOST, INCLUDE, FINISH, Utilities -@comment node-name, next, previous, up -@section HOST -@vindex HOST - -@display -HOST. -@end display - -@cmd{HOST} suspends the current PSPP session and temporarily returns control -to the operating system. -This command cannot be used if the SAFER setting is active. - - -@node INCLUDE, QUIT, HOST, Utilities -@section INCLUDE -@vindex INCLUDE -@vindex @@ - -@display -Two possible syntaxes: - INCLUDE 'filename'. - @@filename. -@end display - -@cmd{INCLUDE} causes the PSPP command processor to read an -additional command file as if it were included bodily in the current -command file. - -Include files may be nested to any depth, up to the limit of available -memory. 
- -@node QUIT, SET, INCLUDE, Utilities -@section QUIT -@vindex QUIT - -@display -Two possible syntaxes: - QUIT. - EXIT. -@end display - -@cmd{QUIT} terminates the current PSPP session and returns control -to the operating system. - -This command is not valid within a command file. - -@node SET, SHOW, QUIT, Utilities -@section SET -@vindex SET - -@display -SET - -(data input) - /BLANKS=@{SYSMIS,'.',number@} - /DECIMAL=@{DOT,COMMA@} - /FORMAT=fmt_spec - -(program input) - /ENDCMD='.' - /NULLINE=@{ON,OFF@} - -(interaction) - /CPROMPT='cprompt_string' - /DPROMPT='dprompt_string' - /ERRORBREAK=@{OFF,ON@} - /MXERRS=max_errs - /MXWARNS=max_warnings - /PROMPT='prompt' - /VIEWLENGTH=@{MINIMUM,MEDIAN,MAXIMUM,n_lines@} - /VIEWWIDTH=n_characters - -(program execution) - /MEXPAND=@{ON,OFF@} - /MITERATE=max_iterations - /MNEST=max_nest - /MPRINT=@{ON,OFF@} - /MXLOOPS=max_loops - /SEED=@{RANDOM,seed_value@} - /UNDEFINED=@{WARN,NOWARN@} - -(data output) - /CC@{A,B,C,D,E@}=@{'npre,pre,suf,nsuf','npre.pre.suf.nsuf'@} - /DECIMAL=@{DOT,COMMA@} - /FORMAT=fmt_spec - -(output routing) - /ECHO=@{ON,OFF@} - /ERRORS=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@} - /INCLUDE=@{ON,OFF@} - /MESSAGES=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@} - /PRINTBACK=@{ON,OFF@} - /RESULTS=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@} - -(output activation) - /LISTING=@{ON,OFF@} - /PRINTER=@{ON,OFF@} - /SCREEN=@{ON,OFF@} - -(output driver options) - /HEADERS=@{NO,YES,BLANK@} - /LENGTH=@{NONE,length_in_lines@} - /LISTING=filename - /MORE=@{ON,OFF@} - /PAGER=@{OFF,"pager_name"@} - /WIDTH=@{NARROW,WIDTH,n_characters@} - -(logging) - /JOURNAL=@{ON,OFF@} [filename] - /LOG=@{ON,OFF@} [filename] - -(system files) - /COMPRESSION=@{ON,OFF@} - /SCOMPRESSION=@{ON,OFF@} - -(security) - /SAFER=ON - -(obsolete settings accepted for compatibility, but ignored) - /AUTOMENU=@{ON,OFF@} - /BEEP=@{ON,OFF@} - /BLOCK='c' - /BOXSTRING=@{'xxx','xxxxxxxxxxx'@} - /CASE=@{UPPER,UPLOW@} - /COLOR=@dots{} - /CPI=cpi_value - /DISK=@{ON,OFF@} - 
/EJECT=@{ON,OFF@} - /HELPWINDOWS=@{ON,OFF@} - /HIGHRES=@{ON,OFF@} - /HISTOGRAM='c' - /LOWRES=@{AUTO,ON,OFF@} - /LPI=lpi_value - /MENUS=@{STANDARD,EXTENDED@} - /MXMEMORY=max_memory - /PTRANSLATE=@{ON,OFF@} - /RCOLORS=@dots{} - /RUNREVIEW=@{AUTO,MANUAL@} - /SCRIPTTAB='c' - /TB1=@{'xxx','xxxxxxxxxxx'@} - /TBFONTS='string' - /WORKDEV=drive_letter - /WORKSPACE=workspace_size - /XSORT=@{YES,NO@} -@end display - -@cmd{SET} allows the user to adjust several parameters relating to -PSPP's execution. Since there are many subcommands to this command, its -subcommands will be examined in groups. - -On subcommands that take boolean values, ON and YES are synonyms, -as are OFF and NO. - -The data input subcommands affect the way that data is read from data -files. The data input subcommands are - -@table @asis -@item BLANKS -This is the value assigned to a data item that is empty or -contains only whitespace. An argument of SYSMIS or '.' will cause the -system-missing value to be assigned to null items. This is the -default. Any real value may be assigned. - -@item DECIMAL -The default DOT setting causes the decimal point character to be -@samp{.}. A setting of COMMA causes the decimal point character to be -@samp{,}. - -@item FORMAT -Allows the default numeric input/output format to be specified. The -default is F8.2. @xref{Input/Output Formats}. -@end table - -Program input subcommands affect the way that programs are parsed when -they are typed interactively or run from a script. They are - -@table @asis -@item ENDCMD -This is a single character indicating the end of a command. The default -is @samp{.}. Don't change this. - -@item NULLINE -Whether a blank line is interpreted as ending the current command. The -default is ON. -@end table - -Interaction subcommands affect the way that PSPP interacts with an -online user. The interaction subcommands are - -@table @asis -@item CPROMPT -The command continuation prompt. 
The default is @samp{ > }. - -@item DPROMPT -Prompt used when expecting data input within @cmd{BEGIN DATA} (@pxref{BEGIN -DATA}). The default is @samp{data> }. - -@item ERRORBREAK -Whether an error causes PSPP to stop processing the current command -file after finishing the current command. The default is OFF. - -@item MXERRS -The maximum number of errors before PSPP halts processing of the current -command file. The default is 50. - -@item MXWARNS -The maximum number of warnings + errors before PSPP halts processing the -current command file. The default is 100. - -@item PROMPT -The command prompt. The default is @samp{PSPP> }. - -@item VIEWLENGTH -The length of the screen in lines. MINIMUM means 25 lines, MEDIAN and -MAXIMUM mean 43 lines. Otherwise specify the number of lines. Normally -PSPP should auto-detect your screen size so this shouldn't have to be -used. - -@item VIEWWIDTH -The width of the screen in characters. Normally 80 or 132. -@end table - -Program execution subcommands control the way that PSPP commands -execute. The program execution subcommands are - -@table @asis -@item MEXPAND -@itemx MITERATE -@itemx MNEST -@itemx MPRINT -Currently not used. - -@item MXLOOPS -The maximum number of iterations for an uncontrolled loop (@pxref{LOOP}). - -@item SEED -The initial pseudo-random number seed. Set to a real number or to -RANDOM, which will obtain an initial seed from the current time of day. - -@item UNDEFINED -Currently not used. -@end table - -Data output subcommands affect the format of output data. These -subcommands are - -@table @asis -@item CCA -@itemx CCB -@itemx CCC -@itemx CCD -@itemx CCE -Set up custom currency formats. The argument is a string which must -contain exactly three commas or exactly three periods. If commas, then -the grouping character for the currency format is @samp{,}, and the -decimal point character is @samp{.}; if periods, then the situation is -reversed. 
- -The commas or periods divide the string into four fields, which are, in -order, the negative prefix, prefix, suffix, and negative suffix. When a -value is formatted using the custom currency format, the prefix precedes -the value formatted and the suffix follows it. In addition, if the -value is negative, the negative prefix precedes the prefix and the -negative suffix follows the suffix. - -@item DECIMAL -The default DOT setting causes the decimal point character to be -@samp{.}. A setting of COMMA causes the decimal point character to be -@samp{,}. - -@item FORMAT -Allows the default numeric input/output format to be specified. The -default is F8.2. @xref{Input/Output Formats}. -@end table - -Output routing subcommands affect where the output of transformations -and procedures is sent. These subcommands are - -@table @asis -@item ECHO - -If turned on, commands are written to the listing file as they are read -from command files. The default is OFF. - -@itemx ERRORS -@itemx INCLUDE -@itemx MESSAGES -@item PRINTBACK -@item RESULTS -Currently not used. -@end table - -Output activation subcommands affect whether output devices of -particular types are enabled. These subcommands are - -@table @asis -@item LISTING -Enable or disable listing devices. - -@item PRINTER -Enable or disable printer devices. - -@item SCREEN -Enable or disable screen devices. -@end table - -Output driver option subcommands affect output drivers' settings. These -subcommands are - -@table @asis -@item HEADERS -@itemx LENGTH -@itemx LISTING -@itemx MORE -@itemx PAGER -@itemx WIDTH -Currently not used. -@end table - -Logging subcommands affect logging of commands executed to external -files. These subcommands are - -@table @asis -@item JOURNAL -@item LOG -Not currently used. -@end table - -System file subcommands affect the default format of system files -produced by PSPP. These subcommands are - -@table @asis -@item COMPRESSION -Not currently used. 
- -@item SCOMPRESSION -Whether system files created by @cmd{SAVE} or @cmd{XSAVE} are -compressed by default. The default is ON. -@end table - -Security subcommands affect the operations that commands are allowed to -perform. The security subcommands are - -@table @asis -@item SAFER -When set, this setting cannot ever be reset, for obvious security -reasons. Setting this option disables the following operations: - -@itemize @bullet -@item -The ERASE command. -@item -The HOST command. -@item -Pipe filenames (filenames beginning or ending with @samp{|}). -@end itemize - -Be aware that this setting does not guarantee safety (commands can still -overwrite files, for instance) but it is an improvement. -@end table - -@node SHOW, SUBTITLE, SET, Utilities -@comment node-name, next, previous, up -@section SHOW -@vindex SHOW - -@display -SHOW - /@var{subcommand} - -@end display - -@cmd{SHOW} can be used to display the current state of PSPP's -execution parameters. All of the parameters which can be changed -using @code{SET} @xref{SET}, can be examined using @cmd{SHOW}, by -using a subcommand with the same name. -In addition, @code{SHOW} supports the following subcommands: - -@table @code -@item WARRANTY -Show details of the lack of warranty for PSPP. -@item COPYING -Display the terms of PSPP's copyright licence @ref{License}. -@end table - - - -@node SUBTITLE, TITLE, SHOW, Utilities -@section SUBTITLE -@vindex SUBTITLE - -@display -SUBTITLE 'subtitle_string'. - or -SUBTITLE subtitle_string. -@end display - -@cmd{SUBTITLE} provides a subtitle to a particular PSPP -run. This subtitle appears at the top of each output page below the -title, if headers are enabled on the output device. - -Specify a subtitle as a string in quotes. The alternate syntax that did -not require quotes is now obsolete. If it is used then the subtitle is -converted to all uppercase. - -@node TITLE, , SUBTITLE, Utilities -@section TITLE -@vindex TITLE - -@display -TITLE 'title_string'. 
- or -TITLE title_string. -@end display - -@cmd{TITLE} provides a title to a particular PSPP run. -This title appears at the top of each output page, if headers are enabled -on the output device. - -Specify a title as a string in quotes. The alternate syntax that did -not require quotes is now obsolete. If it is used then the title is -converted to all uppercase. - -@node Not Implemented, Data File Format, Utilities, Top -@chapter Not Implemented - -This chapter lists parts of the PSPP language that are not yet -implemented. - -The following transformations and utilities are not yet implemented, but -they will be supported in a later release. - -@itemize @bullet -@item -ADD FILES -@item -ANOVA -@item -DEFINE -@item -FILE TYPE -@item -GET SAS -@item -GET TRANSLATE -@item -MCONVERT -@item -PLOT -@item -PRESERVE -@item -PROCEDURE OUTPUT -@item -RESTORE -@item -SAVE TRANSLATE -@item -SHOW -@item -UPDATE -@end itemize - -The following transformations and utilities are not implemented. There -are no plans to support them in future releases. Contributions to -implement them will still be accepted. - -@itemize @bullet -@item -EDIT -@item -GET DATABASE -@item -GET OSIRIS -@item -GET SCSS -@item -GSET -@item -HELP -@item -INFO -@item -INPUT MATRIX -@item -KEYED DATA LIST -@item -NUMBERED and UNNUMBERED -@item -OPTIONS -@item -REVIEW -@item -SAVE SCSS -@item -SPSS MANAGER -@item -STATISTICS -@end itemize - -@node Data File Format, Portable File Format, Not Implemented, Top -@chapter Data File Format - -PSPP necessarily uses the same format for system files as do the -products with which it is compatible. This chapter is a description of -that format. - -There are three data types used in system files: 32-bit integers, 64-bit -floating points, and 1-byte characters. In this document these will -simply be referred to as @code{int32}, @code{flt64}, and @code{char}, -the names that are used in the PSPP source code. 
Every field of type -@code{int32} or @code{flt64} is aligned on a 32-bit boundary. - -The endianness of data in PSPP system files is not specified. System -files output on a computer of a particular endianness will have the -endianness of that computer. However, PSPP can read files of either -endianness, regardless of its host computer's endianness. PSPP -translates endianness for both integer and floating point numbers. - -Floating point formats are also not specified. PSPP does not -translate between floating point formats. This is unlikely to be a -problem as all modern computer architectures use IEEE 754 format for -floating point representation. - -The PSPP system-missing value is represented by the largest possible -negative number in the floating point format; in C, this is most likely -@code{-DBL_MAX}. There are two other important values used in missing -values: @code{HIGHEST} and @code{LOWEST}. These are represented by the -largest possible positive number (probably @code{DBL_MAX}) and the -second-largest negative number. The latter must be determined in a -system-dependent manner; in IEEE 754 format it is represented by value -@code{0xffeffffffffffffe}. - -System files are divided into records. Each record begins with an -@code{int32} giving a numeric record type. Individual record types are -described below: - -@menu -* File Header Record:: -* Variable Record:: -* Value Label Record:: -* Value Label Variable Record:: -* Document Record:: -* Machine int32 Info Record:: -* Machine flt64 Info Record:: -* Miscellaneous Informational Records:: -* Dictionary Termination Record:: -* Data Record:: -@end menu - -@node File Header Record, Variable Record, Data File Format, Data File Format -@section File Header Record - -The file header is always the first record in the file. 
- -@example -struct sysfile_header - @{ - char rec_type[4]; - char prod_name[60]; - int32 layout_code; - int32 case_size; - int32 compressed; - int32 weight_index; - int32 ncases; - flt64 bias; - char creation_date[9]; - char creation_time[8]; - char file_label[64]; - char padding[3]; - @}; -@end example - -@table @code -@item char rec_type[4]; -Record type code. Always set to @samp{$FL2}. This is the only record -for which the record type is not of type @code{int32}. - -@item char prod_name[60]; -Product identification string. This always begins with the characters -@samp{@@(#) SPSS DATA FILE}. PSPP uses the remaining characters to -give its version and the operating system name; for example, @samp{GNU -pspp 0.1.4 - sparc-sun-solaris2.5.2}. The string is truncated if it -would be longer than 60 characters; otherwise it is padded on the right -with spaces. - -@item int32 layout_code; -Always set to 2. PSPP reads this value to determine the -file's endianness. - -@item int32 case_size; -Number of data elements per case. This is the number of variables, -except that long string variables add extra data elements (one for every -8 characters after the first 8). - -@item int32 compressed; -Set to 1 if the data in the file is compressed, 0 otherwise. - -@item int32 weight_index; -If one of the variables in the data set is used as a weighting variable, -set to the index of that variable. Otherwise, set to 0. - -@item int32 ncases; -Set to the number of cases in the file if it is known, or -1 otherwise. - -In the general case it is not possible to determine the number of cases -that will be output to a system file at the time that the header is -written. The way that this is dealt with is by writing the entire -system file, including the header, then seeking back to the beginning of -the file and writing just the @code{ncases} field. For `files' in which -this is not valid, the seek operation fails. In this case, -@code{ncases} remains -1. 
- -@item flt64 bias; -Compression bias. Always set to 100. The significance of this value is -that only numbers between @code{(1 - bias)} and @code{(251 - bias)} can -be compressed. - -@item char creation_date[9]; -Set to the date of creation of the system file, in @samp{dd mmm yy} -format, with the month as standard English abbreviations, using an -initial capital letter and following with lowercase. If the date is not -available then this field is arbitrarily set to @samp{01 Jan 70}. - -@item char creation_time[8]; -Set to the time of creation of the system file, in @samp{hh:mm:ss} -format and using 24-hour time. If the time is not available then this -field is arbitrarily set to @samp{00:00:00}. - -@item char file_label[64]; -Set to the file label declared by the user, if any. Padded on the -right with spaces. - -@item char padding[3]; -Ignored padding bytes to make the structure a multiple of 32 bits in -length. Set to zeros. -@end table - -@node Variable Record, Value Label Record, File Header Record, Data File Format -@section Variable Record - -Immediately following the header must come the variable records. There -must be one variable record for every variable and every 8 characters in -a long string beyond the first 8; i.e., there must be exactly as many -variable records as the value specified for @code{case_size} in the file -header record. - -@example -struct sysfile_variable - @{ - int32 rec_type; - int32 type; - int32 has_var_label; - int32 n_missing_values; - int32 print; - int32 write; - char name[8]; - - /* The following two fields are present - only if has_var_label is 1. */ - int32 label_len; - char label[/* variable length */]; - - /* The following field is present only - if n_missing_values is not 0. */ - flt64 missing_values[/* variable length*/]; - @}; -@end example - -@table @code -@item int32 rec_type; -Record type code. Always set to 2. - -@item int32 type; -Variable type code. Set to 0 for a numeric variable. 
For a short -string variable or the first part of a long string variable, this is set -to the width of the string. For the second and subsequent parts of a -long string variable, set to -1, and the remaining fields in the -structure are ignored. - -@item int32 has_var_label; -If this variable has a variable label, set to 1; otherwise, set to 0. - -@item int32 n_missing_values; -If the variable has no missing values, set to 0. If the variable has -one, two, or three discrete missing values, set to 1, 2, or 3, -respectively. If the variable has a range of missing values, set to --2; if the variable has a range of missing values plus a single -discrete value, set to -3. - -@item int32 print; -Print format for this variable. See below. - -@item int32 write; -Write format for this variable. See below. - -@item char name[8]; -Variable name. The variable name must begin with a capital letter or -the at-sign (@samp{@@}). Subsequent characters may also be octothorpes -(@samp{#}), dollar signs (@samp{$}), underscores (@samp{_}), or full -stops (@samp{.}). The variable name is padded on the right with spaces. - -@item int32 label_len; -This field is present only if @code{has_var_label} is set to 1. It is -set to the length, in characters, of the variable label, which must be a -number between 0 and 120. - -@item char label[/* variable length */]; -This field is present only if @code{has_var_label} is set to 1. It has -length @code{label_len}, rounded up to the nearest multiple of 32 bits. -The first @code{label_len} characters are the variable's label. - -@item flt64 missing_values[/* variable length */]; -This field is present only if @code{n_missing_values} is not 0. It has -the same number of elements as the absolute value of -@code{n_missing_values}. For discrete missing values, each element -represents one missing value. 
When a range is present, the first -element denotes the minimum value in the range, and the second element -denotes the maximum value in the range. When a range plus a value are -present, the third element denotes the additional discrete missing -value. HIGHEST and LOWEST are indicated as described in the chapter -introduction. -@end table - -The @code{print} and @code{write} members of @code{sysfile_variable} are output -formats coded into @code{int32} types. The LSB (least-significant byte) -of the @code{int32} represents the number of decimal places, and the -next two bytes in order of increasing significance represent field width -and format type, respectively. The MSB (most-significant byte) is not -used and should be set to zero. - -Format types are defined as follows: -@table @asis -@item 0 -Not used. -@item 1 -@code{A} -@item 2 -@code{AHEX} -@item 3 -@code{COMMA} -@item 4 -@code{DOLLAR} -@item 5 -@code{F} -@item 6 -@code{IB} -@item 7 -@code{PIBHEX} -@item 8 -@code{P} -@item 9 -@code{PIB} -@item 10 -@code{PK} -@item 11 -@code{RB} -@item 12 -@code{RBHEX} -@item 13 -Not used. -@item 14 -Not used. -@item 15 -@code{Z} -@item 16 -@code{N} -@item 17 -@code{E} -@item 18 -Not used. -@item 19 -Not used. -@item 20 -@code{DATE} -@item 21 -@code{TIME} -@item 22 -@code{DATETIME} -@item 23 -@code{ADATE} -@item 24 -@code{JDATE} -@item 25 -@code{DTIME} -@item 26 -@code{WKDAY} -@item 27 -@code{MONTH} -@item 28 -@code{MOYR} -@item 29 -@code{QYR} -@item 30 -@code{WKYR} -@item 31 -@code{PCT} -@item 32 -@code{DOT} -@item 33 -@code{CCA} -@item 34 -@code{CCB} -@item 35 -@code{CCC} -@item 36 -@code{CCD} -@item 37 -@code{CCE} -@item 38 -@code{EDATE} -@item 39 -@code{SDATE} -@end table - -@node Value Label Record, Value Label Variable Record, Variable Record, Data File Format -@section Value Label Record - -Value label records must follow the variable records and must precede -the dictionary termination record. Other than this, they may appear -anywhere in the system file. 
Every value label record must be -immediately followed by a value label variable record, described below. - -Value label records begin with @code{rec_type}, an @code{int32} value -set to the record type of 3. This is followed by @code{count}, an -@code{int32} value set to the number of value labels present in this -record. - -These two fields are followed by a series of @code{count} tuples. Each -tuple is divided into two fields, the value and the label. The first of -these, the value, is composed of a 64-bit value, which is either a -@code{flt64} value or up to 8 characters (padded on the right to 8 -bytes) denoting a short string value. Whether the value is a -@code{flt64} or a character string is not defined inside the value label -record. - -The second field in the tuple, the label, has variable length. The -first @code{char} is a count of the number of characters in the value -label. The remainder of the field is the label itself. The field is -padded on the right to a multiple of 64 bits in length. - -@node Value Label Variable Record, Document Record, Value Label Record, Data File Format -@section Value Label Variable Record - -Every value label variable record must be immediately preceded by a -value label record, described above. - -@example -struct sysfile_value_label_variable - @{ - int32 rec_type; - int32 count; - int32 vars[/* variable length */]; - @}; -@end example - -@table @code -@item int32 rec_type; -Record type. Always set to 4. - -@item int32 count; -Number of variables to which the associated value labels from the value -label record are to be applied. - -@item int32 vars[/* variable length */]; -A list of variables to which to apply the value labels. There are -@code{count} elements. -@end table - -@node Document Record, Machine int32 Info Record, Value Label Variable Record, Data File Format -@section Document Record - -There must be no more than one document record per system file. 
-Document records must follow the variable records and precede the -dictionary termination record. - -@example -struct sysfile_document - @{ - int32 rec_type; - int32 n_lines; - char lines[/* variable length */][80]; - @}; -@end example - -@table @code -@item int32 rec_type; -Record type. Always set to 6. - -@item int32 n_lines; -Number of lines of documents present. - -@item char lines[/* variable length */][80]; -Document lines. The number of elements is defined by @code{n_lines}. -Lines shorter than 80 characters are padded on the right with spaces. -@end table - -@node Machine int32 Info Record, Machine flt64 Info Record, Document Record, Data File Format -@section Machine @code{int32} Info Record - -There must be no more than one machine @code{int32} info record per -system file. Machine @code{int32} info records must follow the variable -records and precede the dictionary termination record. - -@example -struct sysfile_machine_int32_info - @{ - /* Header. */ - int32 rec_type; - int32 subtype; - int32 size; - int32 count; - - /* Data. */ - int32 version_major; - int32 version_minor; - int32 version_revision; - int32 machine_code; - int32 floating_point_rep; - int32 compression_code; - int32 endianness; - int32 character_code; - @}; -@end example - -@table @code -@item int32 rec_type; -Record type. Always set to 7. - -@item int32 subtype; -Record subtype. Always set to 3. - -@item int32 size; -Size of each piece of data in the data part, in bytes. Always set to 4. - -@item int32 count; -Number of pieces of data in the data part. Always set to 8. - -@item int32 version_major; -PSPP major version number. In version @var{x}.@var{y}.@var{z}, this -is @var{x}. - -@item int32 version_minor; -PSPP minor version number. In version @var{x}.@var{y}.@var{z}, this -is @var{y}. - -@item int32 version_revision; -PSPP version revision number. In version @var{x}.@var{y}.@var{z}, -this is @var{z}. - -@item int32 machine_code; -Machine code. 
PSPP always sets this field to -1, but other -values may appear. - -@item int32 floating_point_rep; -Floating point representation code. For IEEE 754 systems this is 1. -IBM 370 sets this to 2, and DEC VAX E to 3. - -@item int32 compression_code; -Compression code. Always set to 1. - -@item int32 endianness; -Machine endianness. 1 indicates big-endian, 2 indicates little-endian. - -@item int32 character_code; -Character code. 1 indicates EBCDIC, 2 indicates 7-bit ASCII, 3 -indicates 8-bit ASCII, 4 indicates DEC Kanji. -@end table - -@node Machine flt64 Info Record, Miscellaneous Informational Records, Machine int32 Info Record, Data File Format -@section Machine @code{flt64} Info Record - -There must be no more than one machine @code{flt64} info record per -system file. Machine @code{flt64} info records must follow the variable -records and precede the dictionary termination record. - -@example -struct sysfile_machine_flt64_info - @{ - /* Header. */ - int32 rec_type; - int32 subtype; - int32 size; - int32 count; - - /* Data. */ - flt64 sysmis; - flt64 highest; - flt64 lowest; - @}; -@end example - -@table @code -@item int32 rec_type; -Record type. Always set to 7. - -@item int32 subtype; -Record subtype. Always set to 4. - -@item int32 size; -Size of each piece of data in the data part, in bytes. Always set to 8. - -@item int32 count; -Number of pieces of data in the data part. Always set to 3. - -@item flt64 sysmis; -The system missing value. - -@item flt64 highest; -The value used for HIGHEST in missing values. - -@item flt64 lowest; -The value used for LOWEST in missing values. -@end table - -@node Miscellaneous Informational Records, Dictionary Termination Record, Machine flt64 Info Record, Data File Format -@section Miscellaneous Informational Records - -Miscellaneous informational records must follow the variable records and -precede the dictionary termination record. 
- -Miscellaneous informational records are ignored by PSPP when reading -system files. They are not written by PSPP when writing system files. - -@example -struct sysfile_misc_info - @{ - /* Header. */ - int32 rec_type; - int32 subtype; - int32 size; - int32 count; - - /* Data. */ - char data[/* variable length */]; - @}; -@end example - -@table @code -@item int32 rec_type; -Record type. Always set to 7. - -@item int32 subtype; -Record subtype. May take any value. According to Aapi -H@"am@"al@"ainen, value 5 indicates a set of grouped variables and 6 -indicates date info (probably related to USE). - -@item int32 size; -Size of each piece of data in the data part. Should have the value 4 or -8, for @code{int32} and @code{flt64}, respectively. - -@item int32 count; -Number of pieces of data in the data part. - -@item char data[/* variable length */]; -Arbitrary data. There must be @code{size} times @code{count} bytes of -data. -@end table - -@node Dictionary Termination Record, Data Record, Miscellaneous Informational Records, Data File Format -@section Dictionary Termination Record - -The dictionary termination record must follow all other records, except -for the actual cases, which it must precede. There must be exactly one -dictionary termination record in every system file. - -@example -struct sysfile_dict_term - @{ - int32 rec_type; - int32 filler; - @}; -@end example - -@table @code -@item int32 rec_type; -Record type. Always set to 999. - -@item int32 filler; -Ignored padding. Should be set to 0. -@end table - -@node Data Record, , Dictionary Termination Record, Data File Format -@section Data Record - -Data records must follow all other records in the data file. There must -be at least one data record in every system file. - -The format of data records varies depending on whether the data is -compressed. Regardless, the data is arranged in a series of 8-byte -elements. 
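As a rough illustration of the 8-byte element layout, the following C sketch counts how many elements one case occupies. It follows the rule given for @code{case_size} in the file header record: one element per numeric or short string variable, plus one extra element for every 8 characters of a long string after the first 8. The helper names @code{elements_for_width} and @code{count_case_elements} are hypothetical and are not taken from the PSPP source; widths use the variable record's type code convention (0 for numeric, otherwise the string width).

```c
/* Hypothetical helper, not part of PSPP: number of 8-byte elements
   occupied by one variable.  WIDTH is 0 for a numeric variable,
   otherwise the string width in characters. */
static int
elements_for_width (int width)
{
  if (width <= 8)
    return 1;                 /* numeric or short string: one element */
  return (width + 7) / 8;     /* long string: round up to whole elements */
}

/* Hypothetical helper: total number of 8-byte elements per case,
   i.e. the value expected in the header's case_size field. */
static int
count_case_elements (const int *widths, int n_vars)
{
  int total = 0;
  for (int i = 0; i < n_vars; i++)
    total += elements_for_width (widths[i]);
  return total;
}
```

For example, a dictionary with one numeric variable and string variables of widths 8, 9, and 20 would occupy 1 + 1 + 2 + 3 = 7 elements per case under this rule.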
- -When data is not compressed, every case is composed of @code{case_size} -of these 8-byte elements, where @code{case_size} comes from the file -header record (@pxref{File Header Record}). Each element corresponds to -the variable declared in the respective variable record (@pxref{Variable -Record}). Numeric values are given in @code{flt64} format; string -values are literal character strings, padded on the right when -necessary. - -Compressed data is arranged in the following manner: the first 8-byte -element in the data section is divided into a series of 1-byte command -codes. These codes have meanings as described below: - -@table @asis -@item 0 -Ignored. If the program writing the system file accumulates compressed -data in blocks of fixed length, 0 bytes can be used to pad out extra -bytes remaining at the end of a fixed-size block. - -@item 1 through 251 -These values indicate that the corresponding numeric variable has the -value @code{(@var{code} - @var{bias})} for the case being read, where -@var{code} is the value of the compression code and @var{bias} is the -@code{bias} field from the file header. For example, -code 105 with bias 100.0 (the normal value) indicates a numeric variable -of value 5. - -@item 252 -End of file. This code may or may not appear at the end of the data -stream. PSPP always outputs this code but its use is not required. - -@item 253 -This value indicates that the numeric or string value is not -compressible. The value is stored in the 8-byte element following the -current block of command bytes. If this value appears twice in a block -of command bytes, then it indicates the second element following the -command bytes, and so on. - -@item 254 -Used to indicate a string value that is all spaces. - -@item 255 -Used to indicate the system-missing value. 
-@end table - -When the end of the first 8-byte element of command bytes is reached, -any blocks of non-compressible values are skipped, and the next element -of command bytes is read and interpreted, until the end of the file is -reached. - -@node Portable File Format, q2c Input Format, Data File Format, Top -@chapter Portable File Format - -These days, most computers use the same internal data formats for -integer and floating-point data, if one ignores little differences like -big- versus little-endian byte ordering. However, occasionally it is -necessary to exchange data between systems with incompatible data -formats. This is what portable files are designed to do. - -@strong{Please note:} Although all of the following information is -correct, as far as the author has been able to ascertain, it is gleaned -from examination of ASCII-formatted portable files only, so some of it -may be incorrect in the general case. - -@menu -* Portable File Characters:: -* Portable File Structure:: -* Portable File Header:: -* Version and Date Info Record:: -* Identification Records:: -* Variable Count Record:: -* Case Weight Variable Record:: -* Variable Records:: -* Value Label Records:: -* Portable File Data:: -@end menu - -@node Portable File Characters, Portable File Structure, Portable File Format, Portable File Format -@section Portable File Characters - -Portable files are arranged as a series of lines of exactly 80 -characters each. Each line is terminated by a carriage-return, -line-feed sequence (``new-lines''). New-lines are only used to avoid -line length limits imposed by some OSes; they are not meaningful. - -The file must be terminated with a @samp{Z} character. In addition, if -the final line in the file does not have exactly 80 characters, then it -is padded on the right with @samp{Z} characters. (The file contents may -be in any character set; the file contains a description of its own -character set, as explained in the next section. 
Therefore, the -@samp{Z} character is not necessarily an ASCII @samp{Z}.) - -For the rest of the description of the portable file format, new-lines -and the trailing @samp{Z}s will be ignored, as if they did not exist, -because they are not an important part of understanding the file -contents. - -@node Portable File Structure, Portable File Header, Portable File Characters, Portable File Format -@section Portable File Structure - -Every portable file consists of the following records, in sequence: - -@itemize @bullet - -@item -File header. - -@item -Version and date info. - -@item -Product identification. - -@item -Subproduct identification (optional). - -@item -Variable count. - -@item -Case weight variable (optional). - -@item -Variables. Each variable record may optionally be followed by a -missing value record and a variable label record. - -@item -Value labels (optional). - -@item -Data. -@end itemize - -Most records are identified by a single-character tag code. The file -header and version info record do not have a tag. - -Other than these single-character codes, there are three types of fields -in a portable file: floating-point, integer, and string. Floating-point -fields have the following format: - -@itemize @bullet - -@item -Zero or more leading spaces. - -@item -Optional asterisk (@samp{*}), which indicates a missing value. The -asterisk must be followed by a single character, generally a period -(@samp{.}), but it appears that other characters may also be possible. -This completes the specification of a missing value. - -@item -Optional minus sign (@samp{-}) to indicate a negative number. - -@item -A whole number, consisting of one or more base-30 digits: @samp{0} -through @samp{9} plus capital letters @samp{A} through @samp{T}. - -@item -Optional fraction, consisting of a radix point (@samp{.}) followed by -one or more base-30 digits. 
- -@item -Optional exponent, consisting of a plus or minus sign (@samp{+} or -@samp{-}) followed by one or more base-30 digits. - -@item -A forward slash (@samp{/}). -@end itemize - -Integer fields take a form identical to floating-point fields, but they -may not contain a fraction. - -String fields take the form of an integer field having value @var{n}, -followed by exactly @var{n} characters, which are the string content. - -@node Portable File Header, Version and Date Info Record, Portable File Structure, Portable File Format -@section Portable File Header - -Every portable file begins with a 464-byte header, consisting of a -200-byte collection of vanity splash strings, followed by a 256-byte -character set translation table, followed by an 8-byte tag string. - -The 200-byte segment is divided into five 40-byte sections, each of -which represents the string @code{@var{charset} SPSS PORT FILE} in a -different character set encoding, where @var{charset} is the name of -the character set used in the file, e.g.@: @code{ASCII} or -@code{EBCDIC}. Each string is padded on the right with spaces in its -respective character set. - -It appears that these strings exist only to inform those who might view -the file on a screen, and that they are not parsed by SPSS products. -Thus, they can be safely ignored. For those interested, the strings are -supposed to be in the following character sets, in the specified order: -EBCDIC, 7-bit ASCII, CDC 6-bit ASCII, 6-bit ASCII, Honeywell 6-bit -ASCII. - -The 256-byte segment describes a mapping from the character set used in -the portable file to an arbitrary character set having characters at the -following positions: - -@table @asis -@item 0--60 - -Control characters. Not important enough to describe in full here. - -@item 61--63 - -Reserved. - -@item 64--73 - -Digits @samp{0} through @samp{9}. - -@item 74--99 - -Capital letters @samp{A} through @samp{Z}. - -@item 100--125 - -Lowercase letters @samp{a} through @samp{z}.
- -@item 126 - -Space. - -@item 127--130 - -Symbols @code{.<(+} - -@item 131 - -Solid vertical pipe. - -@item 132--142 - -Symbols @code{&[]!$*);^-/} - -@item 143 - -Broken vertical pipe. - -@item 144--150 - -Symbols @code{,%_>}?@code{`:} @c @code{?} is an inverted question mark - -@item 151 - -British pound symbol. - -@item 152--155 - -Symbols @code{@@'="}. - -@item 156 - -Less than or equal symbol. - -@item 157 - -Empty box. - -@item 158 - -Plus or minus. - -@item 159 - -Filled box. - -@item 160 - -Degree symbol. - -@item 161 - -Dagger. - -@item 162 - -Symbol @samp{~}. - -@item 163 - -En dash. - -@item 164 - -Lower left corner box draw. - -@item 165 - -Upper left corner box draw. - -@item 166 - -Greater than or equal symbol. - -@item 167--176 - -Superscript @samp{0} through @samp{9}. - -@item 177 - -Lower right corner box draw. - -@item 178 - -Upper right corner box draw. - -@item 179 - -Not equal symbol. - -@item 180 - -Em dash. - -@item 181 - -Superscript @samp{(}. - -@item 182 - -Superscript @samp{)}. - -@item 183 - -Horizontal dagger (?). - -@item 184--186 - -Symbols @samp{@{@}\}. -@item 187 - -Cents symbol. - -@item 188 - -Centered dot, or bullet. - -@item 189--255 - -Reserved. -@end table - -Symbols that are not defined in a particular character set are set to -the same value as symbol 64; i.e., to @samp{0}. - -The 8-byte tag string consists of the exact characters @code{SPSSPORT} -in the portable file's character set, which can be used to verify that -the file is indeed a portable file. - -@node Version and Date Info Record, Identification Records, Portable File Header, Portable File Format -@section Version and Date Info Record - -This record does not have a tag code. It has the following structure: - -@itemize @bullet -@item -A single character identifying the file format version. The letter A -represents version 0, and so on. - -@item -An 8-character string field giving the file creation date in the format -YYYYMMDD. 
- -@item -A 6-character string field giving the file creation time in the format -HHMMSS. -@end itemize - -@node Identification Records, Variable Count Record, Version and Date Info Record, Portable File Format -@section Identification Records - -The product identification record has tag code @samp{1}. It consists of -a single string field giving the name of the product that wrote the -portable file. - -The subproduct identification record has tag code @samp{3}. It -consists of a single string field giving additional information on the -product that wrote the portable file. - -@node Variable Count Record, Case Weight Variable Record, Identification Records, Portable File Format -@section Variable Count Record - -The variable count record has tag code @samp{4}. It consists of two -integer fields. The first contains the number of variables in the file -dictionary. The purpose of the second is unknown; it contains the value -161 in all portable files examined so far. - -@node Case Weight Variable Record, Variable Records, Variable Count Record, Portable File Format -@section Case Weight Variable Record - -The case weight variable record is optional. If it is present, it -indicates the variable used for weighting cases; if it is absent, -cases are unweighted. It has tag code @samp{6}. It consists of a -single string field that names the weighting variable. - -@node Variable Records, Value Label Records, Case Weight Variable Record, Portable File Format -@section Variable Records - -Each variable record represents a single variable. Variable records -have tag code @samp{7}. They have the following structure: - -@itemize @bullet - -@item -Width (integer). This is 0 for a numeric variable, and a number between 1 -and 255 for a string variable. - -@item -Name (string). 1--8 characters long. Must be in all capitals. - -@item -Print format. This is a set of three integer fields: - -@itemize @minus - -@item -Format type (@pxref{Variable Record}). - -@item -Format width. 
1--40. - -@item -Number of decimal places. 1--40. -@end itemize - -@item -Write format. Same structure as the print format described above. -@end itemize - -Each variable record can optionally be followed by a missing value -record, which has tag code @samp{8}. A missing value record has one -field, the missing value itself (a floating-point or string, as -appropriate). Up to three of these missing value records can be used. - -There is also a record for missing value ranges, which has tag code -@samp{B}. It is followed by two fields representing the range, which -are floating-point or string as appropriate. If a missing value range -is present, it may be followed by a single missing value record. - -Tag codes @samp{9} and @samp{A} represent @code{LO THRU @var{x}} and -@code{@var{x} THRU HI} ranges, respectively. Each is followed by a -single field representing @var{x}. If one of the ranges is present, it -may be followed by a single missing value record. - -In addition, each variable record can optionally be followed by a -variable label record, which has tag code @samp{C}. A variable label -record has one field, the variable label itself (string). - -@node Value Label Records, Portable File Data, Variable Records, Portable File Format -@section Value Label Records - -Value label records have tag code @samp{D}. They have the following -format: - -@itemize @bullet -@item -Variable count (integer). - -@item -List of variables (strings). The variable count specifies the number in -the list. Variables are specified by their names. All variables must -be of the same type (numeric or string). - -@item -Label count (integer). - -@item -List of (value, label) tuples. The label count specifies the number of -tuples. Each tuple consists of a value, which is numeric or string as -appropriate to the variables, followed by a label (string). 
-@end itemize - -@node Portable File Data, , Value Label Records, Portable File Format -@section Portable File Data - -The data record has tag code @samp{F}. There is only one tag for all -the data; thus, all the data must follow the dictionary. The data is -terminated by the end-of-file marker @samp{Z}, which is not valid as the -beginning of a data element. - -Data elements are output in the same order as the variable records -describing them. String variables are output as string fields, and -numeric variables are output as floating-point fields. - -@node q2c Input Format, Bugs, Portable File Format, Top -@chapter @code{q2c} Input Format - -PSPP statistical procedures have a bizarre and somewhat irregular -syntax. Despite this, a parser generator has been written that -adequately addresses many of the possibilities and tries to provide -hooks for the exceptional cases. This parser generator is named -@code{q2c}. - -@menu -* Invoking q2c:: q2c command-line syntax. -* q2c Input Structure:: High-level layout of the input file. -* Grammar Rules:: Syntax of the grammar rules. -@end menu - -@node Invoking q2c, q2c Input Structure, q2c Input Format, q2c Input Format -@section Invoking q2c - -@example -q2c @var{input.q} @var{output.c} -@end example - -@code{q2c} translates a @samp{.q} file into a @samp{.c} file. It takes -exactly two command-line arguments, which are the input file name and -output file name, respectively. @code{q2c} does not accept any -command-line options. - -@node q2c Input Structure, Grammar Rules, Invoking q2c, q2c Input Format -@section @code{q2c} Input Structure - -@code{q2c} input files are divided into two sections: the grammar rules -and the supporting code. The @dfn{grammar rules}, which make up the -first part of the input, are used to define the syntax of the -statistical procedure to be parsed. The @dfn{supporting code}, -following the grammar rules, is copied largely unchanged to the output -file, except for certain escapes.
- -The most important lines in the grammar rules are used for defining -procedure syntax. These lines can be prefixed with a dollar sign -(@samp{$}), which prevents Emacs' CC-mode from munging them. Besides -this, a bang (@samp{!}) at the beginning of a line causes the line, -minus the bang, to be written verbatim to the output file (useful for -comments). As a third special case, any line that begins with the exact -characters @code{/* *INDENT} is ignored and not written to the output. -This allows @code{.q} files to be processed through @code{indent} -without being munged. - -The syntax of the grammar rules themselves is given in the following -sections. - -The supporting code is passed into the output file largely unchanged. -However, the following escapes are supported. Each escape must appear -on a line by itself. - -@table @code -@item /* (header) */ - -Expands to a series of C @code{#include} directives which include the -headers that are required for the parser generated by @code{q2c}. - -@item /* (decls @var{scope}) */ - -Expands to C variable and data type declarations for the variables and -@code{enum}s input and output by the @code{q2c} parser. @var{scope} -must be either @code{local} or @code{global}. @code{local} causes the -declarations to be output as function locals. @code{global} causes them -to be declared as @code{static} module variables; thus, @code{global} is -a bit of a misnomer. - -@item /* (parser) */ - -Expands to the entire parser. Must be enclosed within a C function. - -@item /* (free) */ - -Expands to a set of calls to the @code{free} function for variables -declared by the parser. Only needs to be invoked if subcommands of type -@code{string} are used in the grammar rules. -@end table - -@node Grammar Rules, , q2c Input Structure, q2c Input Format -@section Grammar Rules - -The grammar rules describe the format of the syntax that the parser -generated by @code{q2c} will understand. 
The way that the grammar rules -are included in a @code{q2c} input file is described above. - -The grammar rules are divided into tokens of the following types: - -@table @asis -@item Identifier (@code{ID}) - -An identifier token is a sequence of letters, digits, and underscores -(@samp{_}). Identifiers are @emph{not} case-sensitive. - -@item String (@code{STRING}) - -String tokens are initiated by a double-quote character (@samp{"}) and -consist of all the characters between that double quote and the next -double quote, which must be on the same line as the first. Within a -string, a backslash can be used as a ``literal escape''. The only -reasons to use a literal escape are to include a double quote or a -backslash within a string. - -@item Special character - -Other characters, other than whitespace, constitute tokens in -themselves. - -@end table - -The syntax of the grammar rules is as follows: - -@example -grammar-rules ::= ID : subcommands . -subcommands ::= subcommand - ::= subcommands ; subcommand -@end example - -The syntax begins with an ID or STRING token that gives the name of the -procedure to be parsed. The rest of the syntax consists of subcommands -separated by semicolons (@samp{;}) and terminated with a full stop -(@samp{.}). - -@example -subcommand ::= sbc-options ID sbc-defn -sbc-options ::= - ::= sbc-option - ::= sbc-options sbc-options -sbc-option ::= * - ::= + -sbc-defn ::= opt-prefix = specifiers - ::= [ ID ] = array-sbc - ::= opt-prefix = sbc-special-form -opt-prefix ::= - ::= ( ID ) -@end example - -Each subcommand can be prefixed with one or more option characters. An -asterisk (@samp{*}) is used to indicate the default subcommand; the -keyword used for the default subcommand can be omitted in the PSPP -syntax file. A plus sign (@samp{+}) is used to indicate that a -subcommand can appear more than once; if it is not present then that -subcommand can appear no more than once. - -The subcommand name appears after the option characters.
- -There are three forms of subcommands. The first and most common form -simply gives an equals sign (@samp{=}) and a list of specifiers, which -can each be set to a single setting. The second form declares an array, -which is a set of flags that can be individually turned on by the user. -There are also several special forms that do not take a list of -specifiers. - -Arrays require an additional @code{ID} argument. This is used as a -prefix, prepended to the variable names constructed from the -specifiers. The other forms also allow an optional prefix to be -specified. - -@example -array-sbc ::= alternatives - ::= array-sbc , alternatives -alternatives ::= ID - ::= alternatives | ID -@end example - -An array subcommand is a set of Boolean values that can independently be -turned on by the user, listed with commas (@samp{,}) as separators. If a value has more -than one name then these names are separated by pipes (@samp{|}). - -@example -specifiers ::= specifier - ::= specifiers , specifier -specifier ::= opt-id : settings -opt-id ::= - ::= ID -@end example - -Ordinary subcommands (other than arrays and special forms) require a -list of specifiers. Each specifier has an optional name and a list of -settings. If the name is given then a correspondingly named variable -will be used to store the user's choice of setting. If no name is given -then there is no way to tell which setting the user picked; in this case -the settings should probably have values attached. - -@example -settings ::= setting - ::= settings / setting -setting ::= setting-options ID setting-value -setting-options ::= - ::= * - ::= ! - ::= * ! -@end example - -Individual settings are separated by forward slashes (@samp{/}). Each -setting can be as little as an @code{ID} token, but options and values -can optionally be included. The @samp{*} option means that, for this -setting, the @code{ID} can be omitted. The @samp{!} option means that -this option is the default for its specifier.
- -@example -setting-value ::= - ::= ( setting-value-2 ) - ::= setting-value-2 -setting-value-2 ::= setting-value-options setting-value-type : ID - setting-value-restriction -setting-value-options ::= - ::= * -setting-value-type ::= N - ::= D -setting-value-restriction ::= - ::= , STRING -@end example - -Settings may have values. If the value must be enclosed in parentheses, -then enclose the value declaration in parentheses. Declare the setting -type as @samp{n} or @samp{d} for integer or floating point type, -respectively. The given @code{ID} is used to construct a variable name. -If option @samp{*} is given, then the value is optional; otherwise it -must be specified whenever the corresponding setting is specified. A -``restriction'' can also be specified which is a string giving a C -expression limiting the valid range of the value. The special escape -@code{%s} should be used within the restriction to refer to the -setting's value variable. - -@example -sbc-special-form ::= VAR - ::= VARLIST varlist-options - ::= INTEGER opt-list - ::= DOUBLE opt-list - ::= PINT - ::= STRING @r{(the literal word STRING)} string-options - ::= CUSTOM -varlist-options ::= - ::= ( STRING ) -opt-list ::= - ::= LIST -string-options ::= - ::= ( STRING STRING ) -@end example - -The special forms are of the following types: - -@table @code -@item VAR - -A single variable name. - -@item VARLIST - -A list of variables. If given, the string can be used to provide -@code{PV_@var{*}} options to the call to @code{parse_variables}. - -@item INTEGER - -A single integer value. - -@item INTEGER LIST - -A list of integers separated by spaces or commas. - -@item DOUBLE - -A single floating-point value. - -@item DOUBLE LIST - -A list of floating-point values. - -@item PINT - -A single positive integer value. - -@item STRING - -A string value. 
If the options are given then the first string is an -expression giving a restriction on the value of the string; the second -string is an error message to display when the restriction is violated. - -@item CUSTOM - -A custom function is used to parse this subcommand. The function must -have prototype @code{int custom_@var{name} (void)}. It should return 0 -on failure (when it has already issued an appropriate diagnostic), 1 on -success, or 2 if it fails and the calling function should issue a syntax -error on behalf of the custom handler. - -@end table - -@node Bugs, Function Index, q2c Input Format, Top -@chapter Bugs - -@menu -* Known bugs:: Pointers to other files. -* Contacting the Author:: Where to send the bug reports. -@end menu - -@node Known bugs, Contacting the Author, Bugs, Bugs -@section Known bugs - -This is the list of known bugs in PSPP. In addition, @xref{Not -Implemented}, and @xref{Functions Not Implemented}, for lists of bugs -due to features not implemented. For known bugs in individual language -features, see the documentation for that feature. - -@itemize @bullet -@item -Nothing has yet been tested exhaustively. Be cautious using PSPP to -make important decisions. - -@item -@code{make check} fails on some systems that don't like the syntax. I'm -not sure why. If someone could make an attempt to track this down, it -would be appreciated. - -@item -PostScript driver bugs: - -@itemize @minus -@item -Does not support driver arguments `max-fonts-simult' or -`optimize-text-size'. - -@item -Minor problems with font-encodings. - -@item -Fails to align fonts along their baselines. - -@item -Does not support certain bizarre line intersections--should -never crop up in practice. - -@item -Does not gracefully substitute for existing fonts whose -encodings are missing. - -@item -Does not perform italic correction or left italic correction -on font changes. - -@item -Encapsulated PostScript is unimplemented. 
-@end itemize - -@item -ASCII driver bugs: - -@itemize @minus -Does not support `infinite length' or `infinite width' paper. -@end itemize -@end itemize - -See below for information on reporting bugs not listed here. - -@node Contacting the Author, , Known bugs, Bugs -@section Contacting the Author - -The author can be contacted at e-mail address -@ifinfo -. -@end ifinfo -@iftex -@code{}. -@end iftex - -PSPP bug reports should be sent to -@ifinfo -. -@end ifinfo -@iftex -@code{}. -@end iftex - -@node Function Index, Concept Index, Bugs, Top -@chapter Function Index -@printindex fn - -@node Concept Index, Command Index, Function Index, Top -@chapter Concept Index -@printindex cp - -@node Command Index, , Concept Index, Top -@chapter Command Index -@printindex vr - -@contents -@bye - -@c Local Variables: -@c compile-command: "makeinfo pspp.texi" -@c End: diff --git a/doc/pspp.texinfo b/doc/pspp.texinfo new file mode 100644 index 00000000..ebf0381d --- /dev/null +++ b/doc/pspp.texinfo @@ -0,0 +1,164 @@ +\input texinfo @c -*- texinfo -*- +@c %**start of header +@setfilename pspp.info +@settitle PSPP +@set TIMESTAMP Time-stamp: Sat Oct 30 17:30:39 WST 2004 +@set EDITION 0.21 +@set VERSION 0.31 +@c For double-sided printing, uncomment: +@c @setchapternewpage odd +@c %**end of header + + +@macro cmd{CMDNAME} +\CMDNAME\ +@end macro + +@iftex +@finalout +@end iftex + +@dircategory Math +@direntry +* PSPP: (pspp). Statistical analysis package. +@end direntry + +@ifinfo +PSPP, for statistical analysis of sampled data, by Ben Pfaff. + +This file documents PSPP, a statistical package for analysis of +sampled data that uses a command language compatible with SPSS. + +Copyright (C) 1996-9, 2000 Free Software Foundation, Inc. + +This version of the PSPP documentation is consistent with version 2 of +``texinfo.tex''. + +Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. 
+ +@ignore +Permission is granted to process this file through TeX and print the +results, provided the printed document carries copying permission notice +identical to this one except for the removal of this paragraph (this +paragraph not being relevant to the printed manual). + +@end ignore +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided that the +entire resulting derived work is distributed under the terms of a +permission notice identical to this one. + +Permission is granted to copy and distribute translations of this +manual into another language, under the above condition for modified +versions, except that this permission notice may be stated in a +translation approved by the Free Software Foundation. +@end ifinfo + +@titlepage +@title PSPP +@subtitle A System for Statistical Analysis +@subtitle Edition @value{EDITION}, for PSPP version @value{VERSION} +@author by Ben Pfaff + +@page +@vskip 0pt plus 1filll + +PSPP Copyright @copyright{} 1997, 1998 Free Software Foundation, Inc. + +Permission is granted to make and distribute verbatim copies of this +manual provided the copyright notice and this permission notice are +preserved on all copies. + +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided that the +entire derived work is distributed under the terms of a permission +notice identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that this permission notice may be stated in a translation +approved by the Foundation. +@end titlepage + +@contents + +@node Top, Introduction, (dir), (dir) +@ifinfo +@top PSPP + +This file documents the PSPP package for statistical analysis of sampled +data. 
This is edition @value{EDITION}, for PSPP version +@value{VERSION}, last modified at @value{TIMESTAMP}. + +@end ifinfo + +@menu +* Introduction:: Description of the package. +* License:: Your rights and obligations. +* Credits:: Acknowledgement of authors. + +* Invocation:: Starting and running PSPP. +* Language:: Basics of the PSPP command language. +* Expressions:: Numeric and string expression syntax. + +* Data Input and Output:: Reading data from user files. +* System and Portable Files:: Dealing with system & portable files. +* Variable Attributes:: Adjusting and examining variables. +* Data Manipulation:: Simple operations on data. +* Data Selection:: Select certain cases for analysis. +* Conditionals and Looping:: Doing things many times or not at all. +* Statistics:: Basic statistical procedures. +* Utilities:: Other commands. + +* Not Implemented:: What's not here yet +* Bugs:: Known problems; submitting bug reports. + +* Function Index:: Index of PSPP functions for expressions. +* Command Index:: Index of PSPP procedures. +* Concept Index:: Index of concepts. + +* Installation:: How to compile and install PSPP. +* Configuration:: Configuring PSPP. + +* Portable File Format:: Format of PSPP portable files. +* Data File Format:: Format of PSPP system files. +* q2c Input Format:: Format of syntax accepted by q2c. 
+@end menu + +@include introduction.texi +@include license.texi +@include credits.texi + +@include invoking.texi +@include language.texi +@include expressions.texi +@include data-io.texi +@include files.texi +@include variables.texi +@include transformation.texi +@include data-selection.texi +@include flow-control.texi +@include statistics.texi +@include utilities.texi +@include not-implemented.texi +@include bugs.texi + +@include function-index.texi +@include command-index.texi +@include concept-index.texi + +@include installing.texi +@include configuring.texi + +@include portable-file-format.texi +@include data-file-format.texi +@include q2c.texi + +@bye + +@c Local Variables: +@c use (texinfo-multiple-files-update "pspp.texinfo") in emacs to keep these files consistent +@c compile-command: "makeinfo pspp.texi" +@c End: diff --git a/doc/q2c.texi b/doc/q2c.texi new file mode 100644 index 00000000..b24a8ee0 --- /dev/null +++ b/doc/q2c.texi @@ -0,0 +1,290 @@ +@node q2c Input Format, , Data File Format, Top +@appendix @code{q2c} Input Format + +PSPP statistical procedures have a bizarre and somewhat irregular +syntax. Despite this, a parser generator has been written that +adequately addresses many of the possibilities and tries to provide +hooks for the exceptional cases. This parser generator is named +@code{q2c}. + +@menu +* Invoking q2c:: q2c command-line syntax. +* q2c Input Structure:: High-level layout of the input file. +* Grammar Rules:: Syntax of the grammar rules. +@end menu + +@node Invoking q2c, q2c Input Structure, q2c Input Format, q2c Input Format +@section Invoking q2c + +@example +q2c @var{input.q} @var{output.c} +@end example + +@code{q2c} translates a @samp{.q} file into a @samp{.c} file. It takes +exactly two command-line arguments, which are the input file name and +output file name, respectively. @code{q2c} does not accept any +command-line options. 
+ +@node q2c Input Structure, Grammar Rules, Invoking q2c, q2c Input Format +@section @code{q2c} Input Structure + +@code{q2c} input files are divided into two sections: the grammar rules +and the supporting code. The @dfn{grammar rules}, which make up the +first part of the input, are used to define the syntax of the +statistical procedure to be parsed. The @dfn{supporting code}, +following the grammar rules, is copied largely unchanged to the output +file, except for certain escapes. + +The most important lines in the grammar rules are used for defining +procedure syntax. These lines can be prefixed with a dollar sign +(@samp{$}), which prevents Emacs' CC-mode from munging them. Besides +this, a bang (@samp{!}) at the beginning of a line causes the line, +minus the bang, to be written verbatim to the output file (useful for +comments). As a third special case, any line that begins with the exact +characters @code{/* *INDENT} is ignored and not written to the output. +This allows @code{.q} files to be processed through @code{indent} +without being munged. + +The syntax of the grammar rules themselves is given in the following +sections. + +The supporting code is passed into the output file largely unchanged. +However, the following escapes are supported. Each escape must appear +on a line by itself. + +@table @code +@item /* (header) */ + +Expands to a series of C @code{#include} directives which include the +headers that are required for the parser generated by @code{q2c}. + +@item /* (decls @var{scope}) */ + +Expands to C variable and data type declarations for the variables and +@code{enum}s input and output by the @code{q2c} parser. @var{scope} +must be either @code{local} or @code{global}. @code{local} causes the +declarations to be output as function locals. @code{global} causes them +to be declared as @code{static} module variables; thus, @code{global} is +a bit of a misnomer. + +@item /* (parser) */ + +Expands to the entire parser.
Must be enclosed within a C function. + +@item /* (free) */ + +Expands to a set of calls to the @code{free} function for variables +declared by the parser. Only needs to be invoked if subcommands of type +@code{string} are used in the grammar rules. +@end table + +@node Grammar Rules, , q2c Input Structure, q2c Input Format +@section Grammar Rules + +The grammar rules describe the format of the syntax that the parser +generated by @code{q2c} will understand. The way that the grammar rules +are included in a @code{q2c} input file is described above. + +The grammar rules are divided into tokens of the following types: + +@table @asis +@item Identifier (@code{ID}) + +An identifier token is a sequence of letters, digits, and underscores +(@samp{_}). Identifiers are @emph{not} case-sensitive. + +@item String (@code{STRING}) + +String tokens are initiated by a double-quote character (@samp{"}) and +consist of all the characters between that double quote and the next +double quote, which must be on the same line as the first. Within a +string, a backslash can be used as a ``literal escape''. The only +reasons to use a literal escape are to include a double quote or a +backslash within a string. + +@item Special character + +Other characters, other than whitespace, constitute tokens in +themselves. + +@end table + +The syntax of the grammar rules is as follows: + +@example +grammar-rules ::= ID : subcommands . +subcommands ::= subcommand + ::= subcommands ; subcommand +@end example + +The syntax begins with an ID or STRING token that gives the name of the +procedure to be parsed. The rest of the syntax consists of subcommands +separated by semicolons (@samp{;}) and terminated with a full stop +(@samp{.}).
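The three token types described above can be sketched as a small tokenizer. This is only an illustration in Python, not the C code that @code{q2c} itself uses; the function name @code{tokenize}, the token kind labels, the restriction to ASCII letters, and the choice to fold identifiers to upper case (one possible way to model case-insensitivity) are all assumptions made for this sketch.

```python
import string

# Characters allowed in an identifier token: letters, digits, underscores.
# (Restricting "letters" to ASCII is an assumption of this sketch.)
ID_CHARS = set(string.ascii_letters + string.digits + "_")


def tokenize(line):
    """Split one line of grammar rules into (kind, text) pairs.

    Hypothetical helper: kinds are "ID", "STRING" and "CHAR".
    """
    tokens = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch.isspace():
            # Whitespace separates tokens and is never a token itself.
            i += 1
        elif ch in ID_CHARS:
            start = i
            while i < len(line) and line[i] in ID_CHARS:
                i += 1
            # Identifiers are not case-sensitive, so fold them to one case.
            tokens.append(("ID", line[start:i].upper()))
        elif ch == '"':
            i += 1
            chars = []
            while i < len(line) and line[i] != '"':
                # A backslash is a literal escape: take the next character as-is.
                if line[i] == "\\" and i + 1 < len(line):
                    i += 1
                chars.append(line[i])
                i += 1
            if i >= len(line):
                # The closing quote must be on the same line as the opening one.
                raise ValueError("unterminated string")
            i += 1  # skip the closing quote
            tokens.append(("STRING", "".join(chars)))
        else:
            # Every other non-whitespace character is a token in itself.
            tokens.append(("CHAR", ch))
            i += 1
    return tokens


if __name__ == "__main__":
    # A made-up grammar line, used only to exercise the tokenizer.
    print(tokenize("demo: *vars=varlist; +mode=m:!fast/slow."))
```

Working one line at a time keeps the sketch in step with the rule that a string's closing quote must appear on the same line as its opening quote.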
+ +@example +subcommand ::= sbc-options ID sbc-defn +sbc-options ::= + ::= sbc-option + ::= sbc-options sbc-options +sbc-option ::= * + ::= + +sbc-defn ::= opt-prefix = specifiers + ::= [ ID ] = array-sbc + ::= opt-prefix = sbc-special-form +opt-prefix ::= + ::= ( ID ) +@end example + +Each subcommand can be prefixed with one or more option characters. An +asterisk (@samp{*}) is used to indicate the default subcommand; the +keyword used for the default subcommand can be omitted in the PSPP +syntax file. A plus sign (@samp{+}) is used to indicate that a +subcommand can appear more than once; if it is not present then that +subcommand can appear no more than once. + +The subcommand name appears after the option characters. + +There are three forms of subcommands. The first and most common form +simply gives an equals sign (@samp{=}) and a list of specifiers, which +can each be set to a single setting. The second form declares an array, +which is a set of flags that can be individually turned on by the user. +There are also several special forms that do not take a list of +specifiers. + +Arrays require an additional @code{ID} argument. This is used as a +prefix, prepended to the variable names constructed from the +specifiers. The other forms also allow an optional prefix to be +specified. + +@example +array-sbc ::= alternatives + ::= array-sbc , alternatives +alternatives ::= ID + ::= alternatives | ID +@end example + +An array subcommand is a set of Boolean values that can independently be +turned on by the user, listed separated by commas (@samp{,}). If a value has more +than one name then these names are separated by pipes (@samp{|}). + +@example +specifiers ::= specifier + ::= specifiers , specifier +specifier ::= opt-id : settings +opt-id ::= + ::= ID +@end example + +Ordinary subcommands (other than arrays and special forms) require a +list of specifiers. Each specifier has an optional name and a list of +settings. 
If the name is given then a correspondingly named variable +will be used to store the user's choice of setting. If no name is given +then there is no way to tell which setting the user picked; in this case +the settings should probably have values attached. + +@example +settings ::= setting + ::= settings / setting +setting ::= setting-options ID setting-value +setting-options ::= + ::= * + ::= ! + ::= * ! +@end example + +Individual settings are separated by forward slashes (@samp{/}). Each +setting can be as little as an @code{ID} token, but options and values +can optionally be included. The @samp{*} option means that, for this +setting, the @code{ID} can be omitted. The @samp{!} option means that +this option is the default for its specifier. + +@example +setting-value ::= + ::= ( setting-value-2 ) + ::= setting-value-2 +setting-value-2 ::= setting-value-options setting-value-type : ID + setting-value-restriction +setting-value-options ::= + ::= * +setting-value-type ::= N + ::= D +setting-value-restriction ::= + ::= , STRING +@end example + +Settings may have values. If the value must be enclosed in parentheses, +then enclose the value declaration in parentheses. Declare the setting +type as @samp{n} or @samp{d} for integer or floating point type, +respectively. The given @code{ID} is used to construct a variable name. +If option @samp{*} is given, then the value is optional; otherwise it +must be specified whenever the corresponding setting is specified. A +``restriction'' can also be specified which is a string giving a C +expression limiting the valid range of the value. The special escape +@code{%s} should be used within the restriction to refer to the +setting's value variable. 
+ +@example +sbc-special-form ::= VAR + ::= VARLIST varlist-options + ::= INTEGER opt-list + ::= DOUBLE opt-list + ::= PINT + ::= STRING @r{(the literal word STRING)} string-options + ::= CUSTOM +varlist-options ::= + ::= ( STRING ) +opt-list ::= + ::= LIST +string-options ::= + ::= ( STRING STRING ) +@end example + +The special forms are of the following types: + +@table @code +@item VAR + +A single variable name. + +@item VARLIST + +A list of variables. If given, the string can be used to provide +@code{PV_@var{*}} options to the call to @code{parse_variables}. + +@item INTEGER + +A single integer value. + +@item INTEGER LIST + +A list of integers separated by spaces or commas. + +@item DOUBLE + +A single floating-point value. + +@item DOUBLE LIST + +A list of floating-point values. + +@item PINT + +A single positive integer value. + +@item STRING + +A string value. If the options are given then the first string is an +expression giving a restriction on the value of the string; the second +string is an error message to display when the restriction is violated. + +@item CUSTOM + +A custom function is used to parse this subcommand. The function must +have prototype @code{int custom_@var{name} (void)}. It should return 0 +on failure (when it has already issued an appropriate diagnostic), 1 on +success, or 2 if it fails and the calling function should issue a syntax +error on behalf of the custom handler. + +@end table +@setfilename ignored diff --git a/doc/statistics.texi b/doc/statistics.texi new file mode 100644 index 00000000..29e8f064 --- /dev/null +++ b/doc/statistics.texi @@ -0,0 +1,571 @@ +@node Statistics, Utilities, Conditionals and Looping, Top +@chapter Statistics + +This chapter documents the statistical procedures that PSPP supports so +far. + +@menu +* DESCRIPTIVES:: Descriptive statistics. +* FREQUENCIES:: Frequency tables. +* CROSSTABS:: Crosstabulation tables. +* T-TEST:: Test hypotheses about means. +* ONEWAY:: One-way analysis of variance. 
+@end menu + +@node DESCRIPTIVES, FREQUENCIES, Statistics, Statistics +@section DESCRIPTIVES + +@vindex DESCRIPTIVES +@display +DESCRIPTIVES + /VARIABLES=var_list + /MISSING=@{VARIABLE,LISTWISE@} @{INCLUDE,NOINCLUDE@} + /FORMAT=@{LABELS,NOLABELS@} @{NOINDEX,INDEX@} @{LINE,SERIAL@} + /SAVE + /STATISTICS=@{ALL,MEAN,SEMEAN,STDDEV,VARIANCE,KURTOSIS, + SKEWNESS,RANGE,MINIMUM,MAXIMUM,SUM,DEFAULT, + SESKEWNESS,SEKURTOSIS@} + /SORT=@{NONE,MEAN,SEMEAN,STDDEV,VARIANCE,KURTOSIS,SKEWNESS, + RANGE,MINIMUM,MAXIMUM,SUM,SESKEWNESS,SEKURTOSIS,NAME@} + @{A,D@} +@end display + +The @cmd{DESCRIPTIVES} procedure reads the active file and outputs +descriptive +statistics requested by the user. In addition, it can optionally +compute Z-scores. + +The VARIABLES subcommand, which is required, specifies the list of +variables to be analyzed. Keyword VARIABLES is optional. + +All other subcommands are optional: + +The MISSING subcommand determines the handling of missing variables. If +INCLUDE is set, then user-missing values are included in the +calculations. If NOINCLUDE is set, which is the default, user-missing +values are excluded. If VARIABLE is set, then missing values are +excluded on a variable by variable basis; if LISTWISE is set, then +the entire case is excluded whenever any value in that case has a +system-missing or, if INCLUDE is set, user-missing value. + +The FORMAT subcommand affects the output format. Currently the +LABELS/NOLABELS and NOINDEX/INDEX settings are not used. When SERIAL is +set, both valid and missing number of cases are listed in the output; +when NOSERIAL is set, only valid cases are listed. + +The SAVE subcommand causes @cmd{DESCRIPTIVES} to calculate Z scores for all +the specified variables. The Z scores are saved to new variables. 
+Variable names are generated by trying first the original variable name +with Z prepended and truncated to a maximum of 8 characters, then the +names ZSC000 through ZSC999, STDZ00 through STDZ09, ZZZZ00 through +ZZZZ09, ZQZQ00 through ZQZQ09, in that sequence. In addition, Z score +variable names can be specified explicitly on VARIABLES in the variable +list by enclosing them in parentheses after each variable. + +The STATISTICS subcommand specifies the statistics to be displayed: + +@table @code +@item ALL +All of the statistics below. +@item MEAN +Arithmetic mean. +@item SEMEAN +Standard error of the mean. +@item STDDEV +Standard deviation. +@item VARIANCE +Variance. +@item KURTOSIS +Kurtosis and standard error of the kurtosis. +@item SKEWNESS +Skewness and standard error of the skewness. +@item RANGE +Range. +@item MINIMUM +Minimum value. +@item MAXIMUM +Maximum value. +@item SUM +Sum. +@item DEFAULT +Mean, standard deviation of the mean, minimum, maximum. +@item SEKURTOSIS +Standard error of the kurtosis. +@item SESKEWNESS +Standard error of the skewness. +@end table + +The SORT subcommand specifies how the statistics should be sorted. Most +of the possible values should be self-explanatory. NAME causes the +statistics to be sorted by name. By default, the statistics are listed +in the order that they are specified on the VARIABLES subcommand. The A +and D settings request an ascending or descending sort order, +respectively. 
+ +@node FREQUENCIES, CROSSTABS, DESCRIPTIVES, Statistics +@section FREQUENCIES + +@vindex FREQUENCIES +@display +FREQUENCIES + /VARIABLES=var_list + /FORMAT=@{TABLE,NOTABLE,LIMIT(limit)@} + @{STANDARD,CONDENSE,ONEPAGE[(onepage_limit)]@} + @{LABELS,NOLABELS@} + @{AVALUE,DVALUE,AFREQ,DFREQ@} + @{SINGLE,DOUBLE@} + @{OLDPAGE,NEWPAGE@} + /MISSING=@{EXCLUDE,INCLUDE@} + /STATISTICS=@{DEFAULT,MEAN,SEMEAN,MEDIAN,MODE,STDDEV,VARIANCE, + KURTOSIS,SKEWNESS,RANGE,MINIMUM,MAXIMUM,SUM, + SESKEWNESS,SEKURTOSIS,ALL,NONE@} + /NTILES=ntiles + /PERCENTILES=percent@dots{} + +(These options are not currently implemented.) + /BARCHART=@dots{} + /HISTOGRAM=@dots{} + /HBAR=@dots{} + /GROUPED=@dots{} + +(Integer mode.) + /VARIABLES=var_list (low,high)@dots{} +@end display + +The @cmd{FREQUENCIES} procedure outputs frequency tables for specified +variables. +@cmd{FREQUENCIES} can also calculate and display descriptive statistics +(including median and mode) and percentiles. + +In the future, @cmd{FREQUENCIES} will also support graphical output in the +form of bar charts and histograms. In addition, it will be able to +support percentiles for grouped data. + +The VARIABLES subcommand is the only required subcommand. Specify the +variables to be analyzed. In most cases, this is all that is required. +This is known as @dfn{general mode}. + +Occasionally, one may want to invoke a special mode called @dfn{integer +mode}. Normally, in general mode, PSPP will automatically determine +what values occur in the data. In integer mode, the user specifies the +range of values that the data assumes. To invoke this mode, specify a +range of data values in parentheses, separated by a comma. Data values +inside the range are truncated to the nearest integer, then assigned to +that value. If values occur outside this range, they are discarded. + +The FORMAT subcommand controls the output format. 
It has several +possible settings: + +@itemize @bullet +@item +TABLE, the default, causes a frequency table to be output for every +variable specified. NOTABLE prevents them from being output. LIMIT +with a numeric argument causes them to be output except when there are +more than the specified number of values in the table. + +@item +STANDARD frequency tables contain more complete information, but also +take up more space on the printed page. CONDENSE frequency tables are +less informative but take up less space. ONEPAGE with a numeric +argument will output standard frequency tables if there are the +specified number of values or less, condensed tables otherwise. ONEPAGE +without an argument defaults to a threshold of 50 values. + +@item +LABELS causes value labels to be displayed in STANDARD frequency +tables. NOLABELS prevents this. + +@item +Normally frequency tables are sorted in ascending order by value. This +is AVALUE. DVALUE tables are sorted in descending order by value. +AFREQ and DFREQ tables are sorted in ascending and descending order, +respectively, by frequency count. + +@item +SINGLE spaced frequency tables are closely spaced. DOUBLE spaced +frequency tables have wider spacing. + +@item +OLDPAGE and NEWPAGE are not currently used. +@end itemize + +The MISSING subcommand controls the handling of user-missing values. +When EXCLUDE, the default, is set, user-missing values are not included +in frequency tables or statistics. When INCLUDE is set, user-missing +values are included. System-missing values are never included in statistics, +but are listed in frequency tables. + +The available STATISTICS are the same as available in @cmd{DESCRIPTIVES} +(@pxref{DESCRIPTIVES}), with the addition of MEDIAN, the data's median +value, and MODE, the mode. (If there are multiple modes, the smallest +value is reported.) By default, the mean, standard deviation of the +mean, minimum, and maximum are reported for each variable. 
+ +PERCENTILES causes the specified percentiles to be reported. +The percentiles should be presented as a list of numbers between 0 +and 100 inclusive. +The NTILES subcommand causes the percentiles to be reported at the +boundaries of the data set divided into the specified number of ranges. +For instance, @code{/NTILES=4} would cause quartiles to be reported. + + +@node CROSSTABS, T-TEST, FREQUENCIES, Statistics +@section CROSSTABS + +@vindex CROSSTABS +@display +CROSSTABS + /TABLES=var_list BY var_list [BY var_list]@dots{} + /MISSING=@{TABLE,INCLUDE,REPORT@} + /WRITE=@{NONE,CELLS,ALL@} + /FORMAT=@{TABLES,NOTABLES@} + @{LABELS,NOLABELS,NOVALLABS@} + @{PIVOT,NOPIVOT@} + @{AVALUE,DVALUE@} + @{NOINDEX,INDEX@} + @{BOX,NOBOX@} + /CELLS=@{COUNT,ROW,COLUMN,TOTAL,EXPECTED,RESIDUAL,SRESIDUAL, + ASRESIDUAL,ALL,NONE@} + /STATISTICS=@{CHISQ,PHI,CC,LAMBDA,UC,BTAU,CTAU,RISK,GAMMA,D, + KAPPA,ETA,CORR,ALL,NONE@} + +(Integer mode.) + /VARIABLES=var_list (low,high)@dots{} +@end display + +The @cmd{CROSSTABS} procedure displays crosstabulation +tables requested by the user. It can calculate several statistics for +each cell in the crosstabulation tables. In addition, a number of +statistics can be calculated for each table itself. + +The TABLES subcommand is used to specify the tables to be reported. Any +number of dimensions is permitted, and any number of variables per +dimension is allowed. The TABLES subcommand may be repeated as many +times as needed. This is the only required subcommand in @dfn{general +mode}. + +Occasionally, one may want to invoke a special mode called @dfn{integer +mode}. Normally, in general mode, PSPP automatically determines +what values occur in the data. In integer mode, the user specifies the +range of values that the data assumes. To invoke this mode, specify the +VARIABLES subcommand, giving a range of data values in parentheses for +each variable to be used on the TABLES subcommand. 
Data values inside +the range are truncated to the nearest integer, then assigned to that +value. If values occur outside this range, they are discarded. When it +is present, the VARIABLES subcommand must precede the TABLES +subcommand. + +In general mode, numeric and string variables may be specified on +TABLES. Although long string variables are allowed, only their +initial short-string parts are used. In integer mode, only numeric +variables are allowed. + +The MISSING subcommand determines the handling of user-missing values. +When set to TABLE, the default, missing values are dropped on a table by +table basis. When set to INCLUDE, user-missing values are included in +tables and statistics. When set to REPORT, which is allowed only in +integer mode, user-missing values are included in tables but marked with +an @samp{M} (for ``missing'') and excluded from statistical +calculations. + +Currently the WRITE subcommand is ignored. + +The FORMAT subcommand controls the characteristics of the +crosstabulation tables to be displayed. It has a number of possible +settings: + +@itemize @bullet +@item +TABLES, the default, causes crosstabulation tables to be output. +NOTABLES suppresses them. + +@item +LABELS, the default, allows variable labels and value labels to appear +in the output. NOLABELS suppresses them. NOVALLABS displays variable +labels but suppresses value labels. + +@item +PIVOT, the default, causes each TABLES subcommand to be displayed in a +pivot table format. NOPIVOT causes the old-style crosstabulation format +to be used. + +@item +AVALUE, the default, causes values to be sorted in ascending order. +DVALUE asserts a descending sort order. + +@item +INDEX/NOINDEX is currently ignored. + +@item +BOX/NOBOX is currently ignored. +@end itemize + +The CELLS subcommand controls the contents of each cell in the displayed +crosstabulation table. The possible settings are: + +@table @asis +@item COUNT +Frequency count. +@item ROW +Row percent. 
+@item COLUMN +Column percent. +@item TOTAL +Table percent. +@item EXPECTED +Expected value. +@item RESIDUAL +Residual. +@item SRESIDUAL +Standardized residual. +@item ASRESIDUAL +Adjusted standardized residual. +@item ALL +All of the above. +@item NONE +Suppress cells entirely. +@end table + +@samp{/CELLS} without any settings specified requests COUNT, ROW, +COLUMN, and TOTAL. If CELLS is not specified at all then only COUNT +will be selected. + +The STATISTICS subcommand selects statistics for computation: + +@table @asis +@item CHISQ +Pearson chi-square, likelihood ratio, Fisher's exact test, continuity +correction, linear-by-linear association. +@item PHI +Phi. +@item CC +Contingency coefficient. +@item LAMBDA +Lambda. +@item UC +Uncertainty coefficient. +@item BTAU +Tau-b. +@item CTAU +Tau-c. +@item RISK +Risk estimate. +@item GAMMA +Gamma. +@item D +Somers' D. +@item KAPPA +Cohen's Kappa. +@item ETA +Eta. +@item CORR +Spearman correlation, Pearson's r. +@item ALL +All of the above. +@item NONE +No statistics. +@end table + +Selected statistics are only calculated when appropriate for the +statistic. Certain statistics require tables of a particular size, and +some statistics are calculated only in integer mode. + +@samp{/STATISTICS} without any settings selects CHISQ. If the +STATISTICS subcommand is not given, no statistics are calculated. + +@strong{Please note:} Currently the implementation of CROSSTABS has the +following bugs: + +@itemize @bullet +@item +Pearson's R (but not Spearman) is off a little. +@item +T values for Spearman's R and Pearson's R are wrong. +@item +Significance of symmetric and directional measures is not calculated. +@item +Asymmetric ASEs and T values for lambda are wrong. +@item +ASE of Goodman and Kruskal's tau is not calculated. +@item +ASE of symmetric Somers' d is wrong. +@item +Approximate T of uncertainty coefficient is wrong. +@end itemize + +Fixes for any of these deficiencies would be welcomed. 
+ +@node T-TEST, ONEWAY, CROSSTABS, Statistics +@comment node-name, next, previous, up +@section T-TEST + +@vindex T-TEST +@display +T-TEST + /MISSING=@{ANALYSIS,LISTWISE@} @{EXCLUDE,INCLUDE@} + /CRITERIA=CIN(confidence) + + +(One Sample mode.) + TESTVAL=test_value + /VARIABLES=var_list + + +(Independent Samples mode.) + GROUPS=var(value1 [, value2]) + /VARIABLES=var_list + + +(Paired Samples mode.) + PAIRS=var_list [WITH var_list [(PAIRED)] ] + +@end display + + +The @cmd{T-TEST} procedure outputs tables used in testing hypotheses about +means. +It operates in one of three modes: +@itemize +@item One Sample mode. +@item Independent Samples mode. +@item Paired Samples mode. +@end itemize + +@noindent +Each of these modes is described in more detail below. +There are two optional subcommands which are common to all modes. + +The @cmd{/CRITERIA} subcommand tells PSPP the confidence interval used +in the tests. The default value is 0.95. + + +The @cmd{MISSING} subcommand determines the handling of missing +variables. +If INCLUDE is set, then user-missing values are included in the +calculations, but system-missing values are not. +If EXCLUDE is set, which is the default, user-missing +values are excluded as well as system-missing values. +This is the default. + +If LISTWISE is set, then the entire case is excluded from analysis +whenever any variable specified in the @cmd{/VARIABLES}, @cmd{/PAIRS} or +@cmd{/GROUPS} subcommands contains a missing value. +If ANALYSIS is set, then missing values are excluded only in the analysis for +which they would be needed. This is the default. + + +@menu +* One Sample Mode:: Testing against a hypothesised mean +* Independent Samples Mode:: Testing two independent groups for equal mean +* Paired Samples Mode:: Testing two interdependent groups for equal mean +@end menu + +@node One Sample Mode, Independent Samples Mode, T-TEST, T-TEST +@subsection One Sample Mode + +The @cmd{TESTVAL} subcommand invokes the One Sample mode. 
+This mode is used to test a population mean against a hypothesised +mean. +The value given to the @cmd{TESTVAL} subcommand is the value against +which you wish to test. +In this mode, you must also use the @cmd{/VARIABLES} subcommand to +tell PSPP which variables you wish to test. + +@node Independent Samples Mode, Paired Samples Mode, One Sample Mode, T-TEST +@comment node-name, next, previous, up +@subsection Independent Samples Mode + +The @cmd{GROUPS} subcommand invokes Independent Samples mode or +`Groups' mode. +This mode is used to test whether two groups of values have the +same population mean. +In this mode, you must also use the @cmd{/VARIABLES} subcommand to +tell PSPP the dependent variables you wish to test. + +The variable given in the @cmd{GROUPS} subcommand is the independent +variable which determines to which group the samples belong. +The values in parentheses are the specific values of the independent +variable for each group. +If the parentheses are omitted and no values are given, the default values +of 1.0 and 2.0 are assumed. + +If the independent variable is numeric, +it is acceptable to specify only one value inside the parentheses. +If you do this, cases where the independent variable is +less than or equal to this value belong to the first group, and cases +greater than this value belong to the second group. +When using this form of the @cmd{GROUPS} subcommand, missing values in +the independent variable are excluded on a listwise basis, regardless +of whether @cmd{/MISSING=LISTWISE} was specified. + + +@node Paired Samples Mode, , Independent Samples Mode, T-TEST +@comment node-name, next, previous, up +@subsection Paired Samples Mode + +The @cmd{PAIRS} subcommand introduces Paired Samples mode. +Use this mode when repeated measures have been taken from the same +samples. +If the @code{WITH} keyword is omitted, then tables for all +combinations of variables given in the @cmd{PAIRS} subcommand are +generated. 
+If the @code{WITH} keyword is given, and the @code{(PAIRED)} keyword +is also given, then the number of variables preceding @code{WITH} +must be the same as the number following it. +In this case, tables for each respective pair of variables are +generated. +In the event that the @code{WITH} keyword is given, but the +@code{(PAIRED)} keyword is omitted, then tables for each combination +of variable preceding @code{WITH} against variable following +@code{WITH} are generated. + + +@node ONEWAY, , T-TEST, Statistics +@comment node-name, next, previous, up +@section Oneway + +@vindex ONEWAY +@cindex analysis of variance +@cindex ANOVA + +@display +ONEWAY + [/VARIABLES = ] var_list BY var + /MISSING=@{ANALYSIS,LISTWISE@} @{EXCLUDE,INCLUDE@} + /CONTRASTS= value1 [, value2] ... [,valueN] + /STATISTICS=@{DESCRIPTIVES,HOMOGENEITY@} + +@end display + +The @cmd{ONEWAY} procedure performs a one-way analysis of variance of +variables factored by a single independent variable. +It is used to compare the means of a population +divided into more than two groups. + +The variables to be analysed should be given in the @code{VARIABLES} +subcommand. +The list of variables must be followed by the @code{BY} keyword and +the name of the independent (or factor) variable. + +You can use the @code{STATISTICS} subcommand to tell PSPP to display +ancillary information. The options accepted are: +@itemize +@item DESCRIPTIVES +Displays descriptive statistics about the groups factored by the independent +variable. +@item HOMOGENEITY +Displays the Levene test of Homogeneity of Variance for the +variables and their groups. +@end itemize + +The @code{CONTRASTS} subcommand is used when you anticipate certain +differences between the groups. +The subcommand must be followed by a list of numerals which are the +coefficients of the groups to be tested. +The number of coefficients must correspond to the number of distinct +groups (or values of the independent variable). 
+If the total sum of the coefficients is not zero, then PSPP will +display a warning, but will proceed with the analysis. +The @code{CONTRASTS} subcommand may be given up to 10 times in order +to specify different contrast tests. +@setfilename ignored diff --git a/doc/texinfo.tex b/doc/texinfo.tex index d5c8121f..b4b9016f 100644 --- a/doc/texinfo.tex +++ b/doc/texinfo.tex @@ -3,10 +3,11 @@ % Load plain if necessary, i.e., if running under initex. \expandafter\ifx\csname fmtname\endcsname\relax\input plain\fi % -\def\texinfoversion{2003-10-06.08} +\def\texinfoversion{2004-09-06.16} % % Copyright (C) 1985, 1986, 1988, 1990, 1991, 1992, 1993, 1994, 1995, -% 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc. +% 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004 Free Software +% Foundation, Inc. % % This texinfo.tex file is free software; you can redistribute it and/or % modify it under the terms of the GNU General Public License as @@ -23,21 +24,16 @@ % to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, % Boston, MA 02111-1307, USA. % -% In other words, you are welcome to use, share and improve this program. -% You are forbidden to forbid anyone else to use, share and improve -% what you give them. Help stamp out software-hoarding! +% As a special exception, when this file is read by TeX when processing +% a Texinfo source document, you may use the result without +% restriction. (This has been our intent since Texinfo was invented.) % % Please try the latest version of texinfo.tex before submitting bug % reports; you can get the latest version from: -% ftp://ftp.gnu.org/gnu/texinfo/texinfo.tex -% (and all GNU mirrors, see http://www.gnu.org/order/ftp.html) +% http://www.gnu.org/software/texinfo/ (the Texinfo home page), or % ftp://tug.org/tex/texinfo.tex -% (and all CTAN mirrors, see http://www.ctan.org), -% and /home/gd/gnu/doc/texinfo.tex on the GNU machines. 
-% -% The GNU Texinfo home page is http://www.gnu.org/software/texinfo. -% -% The texinfo.tex in any given Texinfo distribution could well be out +% (and all CTAN mirrors, see http://www.ctan.org). +% The texinfo.tex in any given distribution could well be out % of date, so if that's what you're using, please check. % % Send bug reports to bug-texinfo@gnu.org. Please include including a @@ -59,6 +55,9 @@ % It is possible to adapt texinfo.tex for other languages, to some % extent. You can get the existing language-specific files from the % full Texinfo distribution. +% +% The GNU Texinfo home page is http://www.gnu.org/software/texinfo. + \message{Loading texinfo [version \texinfoversion]:} @@ -85,11 +84,13 @@ \let\ptexend=\end \let\ptexequiv=\equiv \let\ptexexclam=\! +\let\ptexfootnote=\footnote \let\ptexgtr=> \let\ptexhat=^ \let\ptexi=\i \let\ptexindent=\indent \let\ptexnoindent=\noindent +\let\ptexinsert=\insert \let\ptexlbrace=\{ \let\ptexless=< \let\ptexplus=+ @@ -102,6 +103,15 @@ % starts a new line in the output. \newlinechar = `^^J +% Use TeX 3.0's \inputlineno to get the line number, for better error +% messages, but if we're using an old version of TeX, don't do anything. +% +\ifx\inputlineno\thisisundefined + \let\linenumber = \empty % Pre-3.0. +\else + \def\linenumber{l.\the\inputlineno:\space} +\fi + % Set up fixed words for English if not already set. \ifx\putwordAppendix\undefined \gdef\putwordAppendix{Appendix}\fi \ifx\putwordChapter\undefined \gdef\putwordChapter{Chapter}\fi @@ -140,43 +150,81 @@ \ifx\putwordDefspec\undefined \gdef\putwordDefspec{Special Form}\fi \ifx\putwordDefvar\undefined \gdef\putwordDefvar{Variable}\fi \ifx\putwordDefopt\undefined \gdef\putwordDefopt{User Option}\fi -\ifx\putwordDeftypevar\undefined\gdef\putwordDeftypevar{Variable}\fi \ifx\putwordDeffunc\undefined \gdef\putwordDeffunc{Function}\fi -\ifx\putwordDeftypefun\undefined\gdef\putwordDeftypefun{Function}\fi % In some macros, we cannot use the `\? 
notation---the left quote is % in some cases the escape char. \chardef\colonChar = `\: \chardef\commaChar = `\, \chardef\dotChar = `\. -\chardef\equalChar = `\= \chardef\exclamChar= `\! \chardef\questChar = `\? \chardef\semiChar = `\; -\chardef\spaceChar = `\ % \chardef\underChar = `\_ +\chardef\spaceChar = `\ % +\chardef\spacecat = 10 +\def\spaceisspace{\catcode\spaceChar=\spacecat} + % Ignore a token. % \def\gobble#1{} -% True if #1 is the empty string, i.e., called like `\ifempty{}'. -% -\def\ifempty#1{\ifemptyx #1\emptymarkA\emptymarkB}% -\def\ifemptyx#1#2\emptymarkB{\ifx #1\emptymarkA}% +% The following is used inside several \edef's. +\def\makecsname#1{\expandafter\noexpand\csname#1\endcsname} % Hyphenation fixes. -\hyphenation{ap-pen-dix} -\hyphenation{eshell} -\hyphenation{mini-buf-fer mini-buf-fers} -\hyphenation{time-stamp} -\hyphenation{white-space} +\hyphenation{ + Flor-i-da Ghost-script Ghost-view Mac-OS Post-Script + ap-pen-dix bit-map bit-maps + data-base data-bases eshell fall-ing half-way long-est man-u-script + man-u-scripts mini-buf-fer mini-buf-fers over-view par-a-digm + par-a-digms rath-er rec-tan-gu-lar ro-bot-ics se-vere-ly set-up spa-ces + spell-ing spell-ings + stand-alone strong-est time-stamp time-stamps which-ever white-space + wide-spread wrap-around +} % Margin to add to right of even pages, to left of odd pages. \newdimen\bindingoffset \newdimen\normaloffset \newdimen\pagewidth \newdimen\pageheight +% For a final copy, take out the rectangles +% that mark overfull boxes (in case you have decided +% that the text looks ok even though it passes the margin). +% +\def\finalout{\overfullrule=0pt} + +% @| inserts a changebar to the left of the current line. It should +% surround any changed text. This approach does *not* work if the +% change spans more than two lines of output. To handle that, we would +% have adopt a much more difficult approach (putting marks into the main +% vertical list for the beginning and end of each change). 
+% +\def\|{% + % \vadjust can only be used in horizontal mode. + \leavevmode + % + % Append this vertical mode material after the current line in the output. + \vadjust{% + % We want to insert a rule with the height and depth of the current + % leading; that is exactly what \strutbox is supposed to record. + \vskip-\baselineskip + % + % \vadjust-items are inserted at the left edge of the type. So + % the \llap here moves out into the left-hand margin. + \llap{% + % + % For a thicker or thinner bar, change the `1pt'. + \vrule height\baselineskip width1pt + % + % This is the space between the bar and the text. + \hskip 12pt + }% + }% +} + % Sometimes it is convenient to have everything in the transcript file % and nothing on the terminal. We don't just call \tracingall here, % since that produces some useless output on the terminal. We also make @@ -201,7 +249,7 @@ \tracingassigns1 \fi \tracingcommands3 % 3 gives us more in etex - \errorcontextlines\maxdimen + \errorcontextlines16 }% % add check for \lastpenalty to plain's definitions. If the last thing @@ -259,7 +307,7 @@ % the page break happens to be in the middle of an example. \shipout\vbox{% % Do this early so pdf references go to the beginning of the page. - \ifpdfmakepagedest \pdfmkdest{\the\pageno}\fi + \ifpdfmakepagedest \pdfdest name{\the\pageno} xyz\fi % \ifcropmarks \vbox to \outervsize\bgroup \hsize = \outerhsize @@ -340,132 +388,162 @@ % the input line (except we remove a trailing comment). #1 should be a % macro which expects an ordinary undelimited TeX argument. % -\def\parsearg#1{% - \let\next = #1% +\def\parsearg{\parseargusing{}} +\def\parseargusing#1#2{% + \def\next{#2}% \begingroup \obeylines - \futurelet\temp\parseargx -} - -% If the next token is an obeyed space (from an @example environment or -% the like), remove it and recurse. Otherwise, we're done. -\def\parseargx{% - % \obeyedspace is defined far below, after the definition of \sepspaces. 
- \ifx\obeyedspace\temp
- \expandafter\parseargdiscardspace
- \else
- \expandafter\parseargline
- \fi
+ \spaceisspace
+ #1%
+ \parseargline\empty% Insert the \empty token, see \finishparsearg below.
 }
-% Remove a single space (as the delimiter token to the macro call).
-{\obeyspaces %
- \gdef\parseargdiscardspace {\futurelet\temp\parseargx}}
-
 {\obeylines %
 \gdef\parseargline#1^^M{%
 \endgroup % End of the group started in \parsearg.
- %
- % First remove any @c comment, then any @comment.
- % Result of each macro is put in \toks0.
- \argremovec #1\c\relax %
- \expandafter\argremovecomment \the\toks0 \comment\relax %
- %
- % Call the caller's macro, saved as \next in \parsearg.
- \expandafter\next\expandafter{\the\toks0}%
+ \argremovecomment #1\comment\ArgTerm%
 }%
 }
-% Since all \c{,omment} does is throw away the argument, we can let TeX
-% do that for us. The \relax here is matched by the \relax in the call
-% in \parseargline; it could be more or less anything, its purpose is
-% just to delimit the argument to the \c.
-\def\argremovec#1\c#2\relax{\toks0 = {#1}}
-\def\argremovecomment#1\comment#2\relax{\toks0 = {#1}}
+% First remove any @comment, then any @c comment.
+\def\argremovecomment#1\comment#2\ArgTerm{\argremovec #1\c\ArgTerm}
+\def\argremovec#1\c#2\ArgTerm{\argcheckspaces#1\^^M\ArgTerm}
-% \argremovec{,omment} might leave us with trailing spaces, though; e.g.,
+% Each occurence of `\^^M' or `\^^M' is replaced by a single space.
+%
+% \argremovec might leave us with trailing space, e.g.,
 % @end itemize @c foo
-% will have two active spaces as part of the argument with the
-% `itemize'. Here we remove all active spaces from #1, and assign the
-% result to \toks0.
-%
-% This loses if there are any *other* active characters besides spaces
-% in the argument -- _ ^ +, for example -- since they get expanded.
-% Fortunately, Texinfo does not define any such commands. (If it ever
-% does, the catcode of the characters in questionwill have to be changed
-% here.)
But this means we cannot call \removeactivespaces as part of
-% \argremovec{,omment}, since @c uses \parsearg, and thus the argument
-% that \parsearg gets might well have any character at all in it.
-%
-\def\removeactivespaces#1{%
- \begingroup
- \ignoreactivespaces
- \edef\temp{#1}%
- \global\toks0 = \expandafter{\temp}%
- \endgroup
+% This space token undergoes the same procedure and is eventually removed
+% by \finishparsearg.
+%
+\def\argcheckspaces#1\^^M{\argcheckspacesX#1\^^M \^^M}
+\def\argcheckspacesX#1 \^^M{\argcheckspacesY#1\^^M}
+\def\argcheckspacesY#1\^^M#2\^^M#3\ArgTerm{%
+ \def\temp{#3}%
+ \ifx\temp\empty
+ % We cannot use \next here, as it holds the macro to run;
+ % thus we reuse \temp.
+ \let\temp\finishparsearg
+ \else
+ \let\temp\argcheckspaces
+ \fi
+ % Put the space token in:
+ \temp#1 #3\ArgTerm
 }
-% Change the active space to expand to nothing.
+% If a _delimited_ argument is enclosed in braces, they get stripped; so
+% to get _exactly_ the rest of the line, we had to prevent such situation.
+% We prepended an \empty token at the very beginning and we expand it now,
+% just before passing the control to \next.
+% (Similarily, we have to think about #3 of \argcheckspacesY above: it is
+% either the null string, or it ends with \^^M---thus there is no danger
+% that a pair of braces would be stripped.
 %
-\begingroup
+% But first, we have to remove the trailing space token.
+%
+\def\finishparsearg#1 \ArgTerm{\expandafter\next\expandafter{#1}}
+
+% \parseargdef\foo{...}
+% is roughly equivalent to
+% \def\foo{\parsearg\Xfoo}
+% \def\Xfoo#1{...}
+%
+% Actually, I use \csname\string\foo\endcsname, ie. \\foo, as it is my
+% favourite TeX trick.
--kasal, 16nov03
+
+\def\parseargdef#1{%
+ \expandafter \doparseargdef \csname\string#1\endcsname #1%
+}
+\def\doparseargdef#1#2{%
+ \def#2{\parsearg#1}%
+ \def#1##1%
+}
+
+% Several utility definitions with active space:
+{
 \obeyspaces
- \gdef\ignoreactivespaces{\obeyspaces\let =\empty}
-\endgroup
+ \gdef\obeyedspace{ }
+
+ % Make each space character in the input produce a normal interword
+ % space in the output. Don't allow a line break at this space, as this
+ % is used only in environments like @example, where each line of input
+ % should produce a line of output anyway.
+ %
+ \gdef\sepspaces{\obeyspaces\let =\tie}
+
+ % If an index command is used in an @example environment, any spaces
+ % therein should become regular spaces in the raw index file, not the
+ % expansion of \tie (\leavevmode \penalty \@M \ ).
+ \gdef\unsepspaces{\let =\space}
+}
 \def\flushcr{\ifx\par\lisppar \def\next##1{}\else \let\next=\relax \fi \next}
-%% These are used to keep @begin/@end levels from running away
-%% Call \inENV within environments (after a \begingroup)
-\newif\ifENV \ENVfalse \def\inENV{\ifENV\relax\else\ENVtrue\fi}
-\def\ENVcheck{%
-\ifENV\errmessage{Still within an environment; press RETURN to continue}
-\endgroup\fi} % This is not perfect, but it should reduce lossage
+% Define the framework for environments in texinfo.tex. It's used like this:
+%
+% \envdef\foo{...}
+% \def\Efoo{...}
+%
+% It's the responsibility of \envdef to insert \begingroup before the
+% actual body; @end closes the group after calling \Efoo. \envdef also
+% defines \thisenv, so the current environment is known; @end checks
+% whether the environment name matches. The \checkenv macro can also be
+% used to check whether the current environment is the one expected.
+%
+% Non-false conditionals (@iftex, @ifset) don't fit into this, so they
+% are not treated as enviroments; they don't open a group. (The
+% implementation of @end takes care not to call \endgroup in this
+% special case.)
-% @begin foo is the same as @foo, for now.
-\newhelp\EMsimple{Press RETURN to continue.}
-\outer\def\begin{\parsearg\beginxxx}
+% At runtime, environments start with this:
+\def\startenvironment#1{\begingroup\def\thisenv{#1}}
+% initialize
+\let\thisenv\empty
-\def\beginxxx #1{%
-\expandafter\ifx\csname #1\endcsname\relax
-{\errhelp=\EMsimple \errmessage{Undefined command @begin #1}}\else
-\csname #1\endcsname\fi}
+% ... but they get defined via ``\envdef\foo{...}'':
+\long\def\envdef#1#2{\def#1{\startenvironment#1#2}}
+\def\envparseargdef#1#2{\parseargdef#1{\startenvironment#1#2}}
-% @end foo executes the definition of \Efoo.
-%
-\def\end{\parsearg\endxxx}
-\def\endxxx #1{%
- \removeactivespaces{#1}%
- \edef\endthing{\the\toks0}%
- %
- \expandafter\ifx\csname E\endthing\endcsname\relax
- \expandafter\ifx\csname \endthing\endcsname\relax
- % There's no \foo, i.e., no ``environment'' foo.
- \errhelp = \EMsimple
- \errmessage{Undefined command `@end \endthing'}%
- \else
- \unmatchedenderror\endthing
- \fi
+% Check whether we're in the right environment:
+\def\checkenv#1{%
+ \def\temp{#1}%
+ \ifx\thisenv\temp
 \else
- % Everything's ok; the right environment has been started.
- \csname E\endthing\endcsname
+ \badenverr
 \fi
 }
-% There is an environment #1, but it hasn't been started. Give an error.
-%
-\def\unmatchedenderror#1{%
+% Evironment mismatch, #1 expected:
+\def\badenverr{%
 \errhelp = \EMsimple
- \errmessage{This `@end #1' doesn't have a matching `@#1'}%
+ \errmessage{This command can appear only \inenvironment\temp,
+ not \inenvironment\thisenv}%
+}
+\def\inenvironment#1{%
+ \ifx#1\empty
+ out of any environment%
+ \else
+ in environment \expandafter\string#1%
+ \fi
 }
-% Define the control sequence \E#1 to give an unmatched @end error.
+% @end foo executes the definition of \Efoo.
+% But first, it executes a specialized version of \checkenv
 %
-\def\defineunmatchedend#1{%
- \expandafter\def\csname E#1\endcsname{\unmatchedenderror{#1}}%
+\parseargdef\end{%
+ \if 1\csname iscond.#1\endcsname
+ \else
+ % The general wording of \badenverr may not be ideal, but... --kasal, 06nov03
+ \expandafter\checkenv\csname#1\endcsname
+ \csname E#1\endcsname
+ \endgroup
+ \fi
 }
+\newhelp\EMsimple{Press RETURN to continue.}
+
 %% Simple single-character @ commands
@@ -497,6 +575,9 @@
 !gdef!rbraceatcmd[@}]%
 !endgroup
+% @comma{} to avoid , parsing problems.
+\let\comma = ,
+
 % Accents: @, @dotaccent @ringaccent @ubaraccent @udotaccent
 % Others are defined by plain TeX: @` @' @" @^ @~ @= @u @v @H.
 \let\, = \c
@@ -506,10 +587,12 @@
 \let\ubaraccent = \b
 \let\udotaccent = \d
-% Other special characters: @questiondown @exclamdown
+% Other special characters: @questiondown @exclamdown @ordf @ordm
 % Plain TeX defines: @AA @AE @O @OE @L (plus lowercase versions) @ss.
 \def\questiondown{?`}
 \def\exclamdown{!`}
+\def\ordf{\leavevmode\raise1ex\hbox{\selectfonts\lllsize \underbar{a}}}
+\def\ordm{\leavevmode\raise1ex\hbox{\selectfonts\lllsize \underbar{o}}}
 % Dotless i and dotless j, used for accents.
 \def\imacro{i}
@@ -522,6 +605,25 @@
 \fi\fi
 }
+% The \TeX{} logo, as in plain, but resetting the spacing so that a
+% period following counts as ending a sentence. (Idea found in latex.)
+%
+\edef\TeX{\TeX \spacefactor=1000 }
+
+% @LaTeX{} logo. Not quite the same results as the definition in
+% latex.ltx, since we use a different font for the raised A; it's most
+% convenient for us to use an explicitly smaller font, rather than using
+% the \scriptstyle font (since we don't reset \scriptstyle and
+% \scriptscriptstyle).
+%
+\def\LaTeX{%
+ L\kern-.36em
+ {\setbox0=\hbox{T}%
+ \vbox to \ht0{\hbox{\selectfonts\lllsize A}\vss}}%
+ \kern-.15em
+ \TeX
+}
+
 % Be sure we're in horizontal mode when doing a tie, since we make space
 % equivalent to this in @example-like environments.
Otherwise, a space
 % at the beginning of a line will start with \penalty -- and
@@ -575,59 +677,14 @@
 \newbox\groupbox
 \def\vfilllimit{0.7}
 %
-\def\group{\begingroup
- \ifnum\catcode13=\active \else
+\envdef\group{%
+ \ifnum\catcode`\^^M=\active \else
 \errhelp = \groupinvalidhelp
 \errmessage{@group invalid in context where filling is enabled}%
 \fi
- %
- % The \vtop we start below produces a box with normal height and large
- % depth; thus, TeX puts \baselineskip glue before it, and (when the
- % next line of text is done) \lineskip glue after it. (See p.82 of
- % the TeXbook.) Thus, space below is not quite equal to space
- % above. But it's pretty close.
- \def\Egroup{%
- \egroup % End the \vtop.
- % \dimen0 is the vertical size of the group's box.
- \dimen0 = \ht\groupbox \advance\dimen0 by \dp\groupbox
- % \dimen2 is how much space is left on the page (more or less).
- \dimen2 = \pageheight \advance\dimen2 by -\pagetotal
- % if the group doesn't fit on the current page, and it's a big big
- % group, force a page break.
- \ifdim \dimen0 > \dimen2
- \ifdim \pagetotal < \vfilllimit\pageheight
- \page
- \fi
- \fi
- \copy\groupbox
- \endgroup % End the \group.
- }%
+ \startsavinginserts
 %
 \setbox\groupbox = \vtop\bgroup
- % We have to put a strut on the last line in case the @group is in
- % the midst of an example, rather than completely enclosing it.
- % Otherwise, the interline space between the last line of the group
- % and the first line afterwards is too small. But we can't put the
- % strut in \Egroup, since there it would be on a line by itself.
- % Hence this just inserts a strut at the beginning of each line.
- \everypar = {\strut}%
- %
- % Since we have a strut on every line, we don't need any of TeX's
- % normal interline spacing.
- \offinterlineskip
- %
- % OK, but now we have to do something about blank
- % lines in the input in @example-like environments, which normally
- % just turn into \lisppar, which will insert no space now that we've
- % turned off the interline space. Simplest is to make them be an
- % empty paragraph.
- \ifx\par\lisppar
- \edef\par{\leavevmode \par}%
- %
- % Reset ^^M's definition to new definition of \par.
- \obeylines
- \fi
- %
 % Do @comment since we are called inside an environment such as
 % @example, where each end-of-line in the input causes an
 % end-of-line in the output. We don't want the end-of-line after
@@ -637,6 +694,32 @@
 \comment
 }
 %
+% The \vtop produces a box with normal height and large depth; thus, TeX puts
+% \baselineskip glue before it, and (when the next line of text is done)
+% \lineskip glue after it. Thus, space below is not quite equal to space
+% above. But it's pretty close.
+\def\Egroup{%
+ % To get correct interline space between the last line of the group
+ % and the first line afterwards, we have to propagate \prevdepth.
+ \endgraf % Not \par, as it may have been set to \lisppar.
+ \global\dimen1 = \prevdepth
+ \egroup % End the \vtop.
+ % \dimen0 is the vertical size of the group's box.
+ \dimen0 = \ht\groupbox \advance\dimen0 by \dp\groupbox
+ % \dimen2 is how much space is left on the page (more or less).
+ \dimen2 = \pageheight \advance\dimen2 by -\pagetotal
+ % if the group doesn't fit on the current page, and it's a big big
+ % group, force a page break.
+ \ifdim \dimen0 > \dimen2
+ \ifdim \pagetotal < \vfilllimit\pageheight
+ \page
+ \fi
+ \fi
+ \box\groupbox
+ \prevdepth = \dimen1
+ \checkinserts
+}
+%
 % TeX puts in an \escapechar (i.e., `@') at the beginning of the help
 % message, so this ends up printing `@group can only ...'.
 %
@@ -649,10 +732,8 @@ where each line of input produces a line of output.}
 \newdimen\mil \mil=0.001in
-\def\need{\parsearg\needx}
-
 % Old definition--didn't work.
-%\def\needx #1{\par %
+%\parseargdef\need{\par %
 %% This method tries to make TeX break the page naturally
 %% if the depth of the box does not fit.
 %{\baselineskip=0pt%
@@ -660,7 +741,7 @@ where each line of input produces a line of output.}
 %\prevdepth=-1000pt
 %}}
-\def\needx#1{%
+\parseargdef\need{%
 % Ensure vertical mode, so we don't make a big box in the middle of a
 % paragraph.
 \par
@@ -699,35 +780,10 @@ where each line of input produces a line of output.}
 \fi
 }
-% @br forces paragraph break
+% @br forces paragraph break (and is undocumented).
 \let\br = \par
-% @dots{} output an ellipsis using the current font.
-% We do .5em per period so that it has the same spacing in a typewriter
-% font as three actual period characters.
-%
-\def\dots{%
- \leavevmode
- \hbox to 1.5em{%
- \hskip 0pt plus 0.25fil minus 0.25fil
- .\hss.\hss.%
- \hskip 0pt plus 0.5fil minus 0.5fil
- }%
-}
-
-% @enddots{} is an end-of-sentence ellipsis.
-%
-\def\enddots{%
- \leavevmode
- \hbox to 2em{%
- \hskip 0pt plus 0.25fil minus 0.25fil
- .\hss.\hss.\hss.%
- \hskip 0pt plus 0.5fil minus 0.5fil
- }%
- \spacefactor=3000
-}
-
 % @page forces the start of a new page.
 %
 \def\page{\par\vfill\supereject}
@@ -740,13 +796,11 @@ where each line of input produces a line of output.}
 \newskip\exdentamount
 % This defn is used inside fill environments such as @defun.
-\def\exdent{\parsearg\exdentyyy}
-\def\exdentyyy #1{{\hfil\break\hbox{\kern -\exdentamount{\rm#1}}\hfil\break}}
+\parseargdef\exdent{\hfil\break\hbox{\kern -\exdentamount{\rm#1}}\hfil\break}
 % This defn is used inside nofill environments such as @example.
-\def\nofillexdent{\parsearg\nofillexdentyyy}
-\def\nofillexdentyyy #1{{\advance \leftskip by -\exdentamount
-\leftline{\hskip\leftskip{\rm#1}}}}
+\parseargdef\nofillexdent{{\advance \leftskip by -\exdentamount
+ \leftline{\hskip\leftskip{\rm#1}}}}
 % @inmargin{WHICH}{TEXT} puts TEXT in the WHICH margin next to the current
 % paragraph.
For more general purposes, use the \margin insertion
@@ -798,8 +852,19 @@ where each line of input produces a line of output.}
 }
 % @include file insert text of that file as input.
-% Allow normal characters that we make active in the argument (a file name).
-\def\include{\begingroup
+%
+\def\include{\parseargusing\filenamecatcodes\includezzz}
+\def\includezzz#1{%
+ \pushthisfilestack
+ \def\thisfile{#1}%
+ {%
+ \makevalueexpandable
+ \def\temp{\input #1 }%
+ \expandafter
+ }\temp
+ \popthisfilestack
+}
+\def\filenamecatcodes{%
 \catcode`\\=\other
 \catcode`~=\other
 \catcode`^=\other
@@ -808,33 +873,50 @@ where each line of input produces a line of output.}
 \catcode`<=\other
 \catcode`>=\other
 \catcode`+=\other
- \parsearg\includezzz}
-% Restore active chars for included file.
-\def\includezzz#1{\endgroup\begingroup
- % Read the included file in a group so nested @include's work.
- \def\thisfile{#1}%
- \let\value=\expandablevalue
- \input\thisfile
-\endgroup}
+ \catcode`-=\other
+}
+
+\def\pushthisfilestack{%
+ \expandafter\pushthisfilestackX\popthisfilestack\StackTerm
+}
+\def\pushthisfilestackX{%
+ \expandafter\pushthisfilestackY\thisfile\StackTerm
+}
+\def\pushthisfilestackY #1\StackTerm #2\StackTerm {%
+ \gdef\popthisfilestack{\gdef\thisfile{#1}\gdef\popthisfilestack{#2}}%
+}
+
+\def\popthisfilestack{\errthisfilestackempty}
+\def\errthisfilestackempty{\errmessage{Internal error:
+ the stack of filenames is empty.}}
 \def\thisfile{}
 % @center line
 % outputs that line, centered.
 %
-\def\center{\parsearg\docenter}
-\def\docenter#1{{%
- \ifhmode \hfil\break \fi
- \advance\hsize by -\leftskip
- \advance\hsize by -\rightskip
- \line{\hfil \ignorespaces#1\unskip \hfil}%
- \ifhmode \break \fi
-}}
+\parseargdef\center{%
+ \ifhmode
+ \let\next\centerH
+ \else
+ \let\next\centerV
+ \fi
+ \next{\hfil \ignorespaces#1\unskip \hfil}%
+}
+\def\centerH#1{%
+ {%
+ \hfil\break
+ \advance\hsize by -\leftskip
+ \advance\hsize by -\rightskip
+ \line{#1}%
+ \break
+ }%
+}
+\def\centerV#1{\line{\kern\leftskip #1\kern\rightskip}}
 % @sp n outputs n lines of vertical space
-\def\sp{\parsearg\spxxx}
-\def\spxxx #1{\vskip #1\baselineskip}
+\parseargdef\sp{\vskip #1\baselineskip}
 % @comment ...line which is ignored...
 % @c is the same as @comment
@@ -855,8 +937,7 @@ where each line of input produces a line of output.}
 \def\asisword{asis} % no translation, these are keywords
 \def\noneword{none}
 %
-\def\paragraphindent{\parsearg\doparagraphindent}
-\def\doparagraphindent#1{%
+\parseargdef\paragraphindent{%
 \def\temp{#1}%
 \ifx\temp\asisword
 \else
@@ -873,8 +954,7 @@ where each line of input produces a line of output.}
 % We'll use ems for NCHARS like @paragraphindent.
 % It seems @exampleindent asis isn't necessary, but
 % I preserve it to make it similar to @paragraphindent.
-\def\exampleindent{\parsearg\doexampleindent}
-\def\doexampleindent#1{%
+\parseargdef\exampleindent{%
 \def\temp{#1}%
 \ifx\temp\asisword
 \else
@@ -897,12 +977,9 @@ where each line of input produces a line of output.}
 % By default, we suppress indentation.
 %
 \def\suppressfirstparagraphindent{\dosuppressfirstparagraphindent}
-\newdimen\currentparindent
-%
 \def\insertword{insert}
 %
-\def\firstparagraphindent{\parsearg\dofirstparagraphindent}
-\def\dofirstparagraphindent#1{%
+\parseargdef\firstparagraphindent{%
 \def\temp{#1}%
 \ifx\temp\noneword
 \let\suppressfirstparagraphindent = \dosuppressfirstparagraphindent
@@ -947,23 +1024,18 @@ where each line of input produces a line of output.}
 \def\asis#1{#1}
 % @math outputs its argument in math mode.
-% We don't use $'s directly in the definition of \math because we need
-% to set catcodes according to plain TeX first, to allow for subscripts,
-% superscripts, special math chars, etc.
-%
-\let\implicitmath = $%$ font-lock fix
 %
 % One complication: _ usually means subscripts, but it could also mean
 % an actual _ character, as in @math{@var{some_variable} + 1}. So make
-% _ within @math be active (mathcode "8000), and distinguish by seeing
-% if the current family is \slfam, which is what @var uses.
-%
-{\catcode\underChar = \active
-\gdef\mathunderscore{%
- \catcode\underChar=\active
- \def_{\ifnum\fam=\slfam \_\else\sb\fi}%
-}}
-%
+% _ active, and distinguish by seeing if the current family is \slfam,
+% which is what @var uses.
+{
+ \catcode\underChar = \active
+ \gdef\mathunderscore{%
+ \catcode\underChar=\active
+ \def_{\ifnum\fam=\slfam \_\else\sb\fi}%
+ }
+}
 % Another complication: we want \\ (and @\) to output a \ character.
 % FYI, plain.tex uses \\ as a temporary control sequence (why?), but
 % this is not advertised and we don't care. Texinfo does not
@@ -974,15 +1046,16 @@ where each line of input produces a line of output.}
 %
 \def\math{%
 \tex
- \mathcode`\_="8000 \mathunderscore
+ \mathunderscore
 \let\\ = \mathbackslash
 \mathactive
- \implicitmath\finishmath}
-\def\finishmath#1{#1\implicitmath\Etex}
+ $\finishmath
+}
+\def\finishmath#1{#1$\endgroup} % Close the group opened by \tex.
 % Some active characters (such as <) are spaced differently in math.
-% We have to reset their definitions in case the @math was an
-% argument to a command which set the catcodes (such as @item or @section).
+% We have to reset their definitions in case the @math was an argument
+% to a command which sets the catcodes (such as @item or @section).
 %
 {
 \catcode`^ = \active
@@ -998,8 +1071,33 @@ where each line of input produces a line of output.}
 }
 % @bullet and @minus need the same treatment as @math, just above.
-\def\bullet{\implicitmath\ptexbullet\implicitmath}
-\def\minus{\implicitmath-\implicitmath}
+\def\bullet{$\ptexbullet$}
+\def\minus{$-$}
+
+% @dots{} outputs an ellipsis using the current font.
+% We do .5em per period so that it has the same spacing in a typewriter
+% font as three actual period characters.
+%
+\def\dots{%
+ \leavevmode
+ \hbox to 1.5em{%
+ \hskip 0pt plus 0.25fil
+ .\hfil.\hfil.%
+ \hskip 0pt plus 0.5fil
+ }%
+}
+
+% @enddots{} is an end-of-sentence ellipsis.
+%
+\def\enddots{%
+ \dots
+ \spacefactor=3000
+}
+
+% @comma{} is so commas can be inserted into text without messing up
+% Texinfo's parsing.
+%
+\let\comma = ,
 % @refill is a no-op.
 \let\refill=\relax
@@ -1015,20 +1113,20 @@ where each line of input produces a line of output.}
 % So open here the files we need to have open while reading the input.
 % This makes it possible to make a .fmt file for texinfo.
 \def\setfilename{%
+ \fixbackslash % Turn off hack to swallow `\input texinfo'.
 \iflinks
- \readauxfile
+ \tryauxfile
+ % Open the new aux file. TeX will close it automatically at exit.
+ \immediate\openout\auxfile=\jobname.aux
 \fi % \openindices needs to do some work in any case.
 \openindices
- \fixbackslash % Turn off hack to swallow `\input texinfo'.
- \global\let\setfilename=\comment % Ignore extra @setfilename cmds.
+ \let\setfilename=\comment % Ignore extra @setfilename cmds.
 %
 % If texinfo.cnf is present on the system, read it.
 % Useful for site-wide @afourpaper, etc.
- % Just to be on the safe side, close the input stream before the \input.
 \openin 1 texinfo.cnf
- \ifeof1 \let\temp=\relax \else \def\temp{\input texinfo.cnf }\fi
- \closein1
- \temp
+ \ifeof 1 \else \input texinfo.cnf \fi
+ \closein 1
 %
 \comment % Ignore the actual filename.
 }
@@ -1064,16 +1162,21 @@ where each line of input produces a line of output.}
 \newif\ifpdf
 \newif\ifpdfmakepagedest
+% when pdftex is run in dvi mode, \pdfoutput is defined (so \pdfoutput=1
+% can be set). So we test for \relax and 0 as well as \undefined,
+% borrowed from ifpdf.sty.
 \ifx\pdfoutput\undefined
- \pdffalse
- \let\pdfmkdest = \gobble
- \let\pdfurl = \gobble
- \let\endlink = \relax
- \let\linkcolor = \relax
- \let\pdfmakeoutlines = \relax
 \else
- \pdftrue
- \pdfoutput = 1
+ \ifx\pdfoutput\relax
+ \else
+ \ifcase\pdfoutput
+ \else
+ \pdftrue
+ \fi
+ \fi
+\fi
+%
+\ifpdf
 \input pdfcolor
 \pdfcatalog{/PageMode /UseOutlines}%
 \def\dopdfimage#1#2#3{%
@@ -1096,7 +1199,13 @@ where each line of input produces a line of output.}
 \ifnum\pdftexversion < 14 \else
 \pdfrefximage \pdflastximage
 \fi}
- \def\pdfmkdest#1{{\normalturnoffactive \pdfdest name{#1} xyz}}
+ \def\pdfmkdest#1{{%
+ % We have to set dummies so commands such as @code in a section title
+ % aren't expanded.
+ \atdummies
+ \normalturnoffactive
+ \pdfdest name{#1} xyz%
+ }}
 \def\pdfmkpgn#1{#1}
 \let\linkcolor = \Blue % was Cyan, but that seems light?
 \def\endlink{\Black\pdfendlink}
@@ -1112,7 +1221,7 @@ where each line of input produces a line of output.}
 % of subentries (or empty, for subsubsections). #3 is the node
 % text, which might be empty if this toc entry had no
 % corresponding node. #4 is the page number.
- %
+ %
 \def\dopdfoutline#1#2#3#4{%
 % Generate a link to the node text if that exists; else, use the
 % page number.
We could generate a destination for the section
@@ -1125,22 +1234,32 @@ where each line of input produces a line of output.}
 }
 %
 \def\pdfmakeoutlines{%
- \openin 1 \jobname.toc
- \ifeof 1\else\begingroup
- \closein 1
+ \begingroup
 % Thanh's hack / proper braces in bookmarks
 \edef\mylbrace{\iftrue \string{\else}\fi}\let\{=\mylbrace
 \edef\myrbrace{\iffalse{\else\string}\fi}\let\}=\myrbrace
 %
 % Read toc silently, to get counts of subentries for \pdfoutline.
- \def\numchapentry##1##2##3##4{\def\thischapnum{##2}}%
+ \def\numchapentry##1##2##3##4{%
+ \def\thischapnum{##2}%
+ \let\thissecnum\empty
+ \let\thissubsecnum\empty
+ }%
 \def\numsecentry##1##2##3##4{%
- \def\thissecnum{##2}%
- \advancenumber{chap\thischapnum}}%
+ \advancenumber{chap\thischapnum}%
+ \def\thissecnum{##2}%
+ \let\thissubsecnum\empty
+ }%
 \def\numsubsecentry##1##2##3##4{%
- \def\thissubsecnum{##2}%
- \advancenumber{sec\thissecnum}}%
- \def\numsubsubsecentry##1##2##3##4{\advancenumber{subsec\thissubsecnum}}%
+ \advancenumber{sec\thissecnum}%
+ \def\thissubsecnum{##2}%
+ }%
+ \def\numsubsubsecentry##1##2##3##4{%
+ \advancenumber{subsec\thissubsecnum}%
+ }%
+ \let\thischapnum\empty
+ \let\thissecnum\empty
+ \let\thissubsecnum\empty
 %
 % use \def rather than \let here because we redefine \chapentry et
 % al. a second time, below.
@@ -1157,7 +1276,7 @@ where each line of input produces a line of output.}
 % Read toc second time, this time actually producing the outlines.
 % The `-' means take the \expnumber as the absolute number of
 % subentries, which we calculated on our first read of the .toc above.
- %
+ %
 % We use the node names as the destinations.
 \def\numchapentry##1##2##3##4{%
 \dopdfoutline{##1}{count-\expnumber{chap##2}}{##3}{##4}}%
@@ -1168,11 +1287,19 @@ where each line of input produces a line of output.}
 \def\numsubsubsecentry##1##2##3##4{% count is always zero
 \dopdfoutline{##1}{}{##3}{##4}}%
 %
- % Make special characters normal for writing to the pdf file.
+ % PDF outlines are displayed using system fonts, instead of
+ % document fonts. Therefore we cannot use special characters,
+ % since the encoding is unknown. For example, the eogonek from
+ % Latin 2 (0xea) gets translated to a | character. Info from
+ % Staszek Wawrykiewicz, 19 Jan 2004 04:09:24 +0100.
+ %
+ % xx to do this right, we have to translate 8-bit characters to
+ % their "best" equivalent, based on the @documentencoding. Right
+ % now, I guess we'll just let the pdf reader have its way.
 \indexnofonts
 \turnoffactive
 \input \jobname.toc
- \endgroup\fi
+ \endgroup
 }
 %
 \def\makelinks #1,{%
@@ -1205,7 +1332,6 @@ where each line of input produces a line of output.}
 \def\ppn#1{\pgn=#1\gobble}
 \def\ppnn{\pgn=\first}
 \def\pdfmklnk#1{\lnkcount=0\makelinks #1,END,}
- \def\addtokens#1#2{\edef\addtoks{\noexpand#1={\the#1#2}}\addtoks}
 \def\skipspaces#1{\def\PP{#1}\def\D{|}%
 \ifx\PP\D\let\nextsp\relax
 \else\let\nextsp\skipspaces
@@ -1223,18 +1349,17 @@ where each line of input produces a line of output.}
 \def\pdfurl#1{%
 \begingroup
 \normalturnoffactive\def\@{@}%
- \let\value=\expandablevalue
+ \makevalueexpandable
 \leavevmode\Red
 \startlink attr{/Border [0 0 0]}%
 user{/Subtype /Link /A << /S /URI /URI (#1) >>}%
- % #1
 \endgroup}
 \def\pdfgettoks#1.{\setbox\boxA=\hbox{\toksA={#1.}\toksB={}\maketoks}}
 \def\addtokens#1#2{\edef\addtoks{\noexpand#1={\the#1#2}}\addtoks}
 \def\adn#1{\addtokens{\toksC}{#1}\global\countA=1\let\next=\maketoks}
 \def\poptoks#1#2|ENDTOKS|{\let\first=#1\toksD={#1}\toksA={#2}}
 \def\maketoks{%
- \expandafter\poptoks\the\toksA|ENDTOKS|
+ \expandafter\poptoks\the\toksA|ENDTOKS|\relax
 \ifx\first0\adn0
 \else\ifx\first1\adn1 \else\ifx\first2\adn2 \else\ifx\first3\adn3
 \else\ifx\first4\adn4 \else\ifx\first5\adn5 \else\ifx\first6\adn6
@@ -1254,20 +1379,44 @@ where each line of input produces a line of output.}
 \startlink attr{/Border [0 0 0]} goto name{\pdfmkpgn{#1}}
 \linkcolor #1\endlink}
 \def\done{\edef\st{\global\noexpand\toksA={\the\toksB}}\st}
-\fi %
\ifx\pdfoutput
+\else
+ \let\pdfmkdest = \gobble
+ \let\pdfurl = \gobble
+ \let\endlink = \relax
+ \let\linkcolor = \relax
+ \let\pdfmakeoutlines = \relax
+\fi % \ifx\pdfoutput
 \message{fonts,}
-% Font-change commands.
+
+% Change the current font style to #1, remembering it in \curfontstyle.
+% For now, we do not accumulate font styles: @b{@i{foo}} prints foo in
+% italics, not bold italics.
+%
+\def\setfontstyle#1{%
+ \def\curfontstyle{#1}% not as a control sequence, because we are \edef'd.
+ \csname ten#1\endcsname % change the current font
+}
+
+% Select #1 fonts with the current style.
+%
+\def\selectfonts#1{\csname #1fonts\endcsname \csname\curfontstyle\endcsname}
+
+\def\rm{\fam=0 \setfontstyle{rm}}
+\def\it{\fam=\itfam \setfontstyle{it}}
+\def\sl{\fam=\slfam \setfontstyle{sl}}
+\def\bf{\fam=\bffam \setfontstyle{bf}}
+\def\tt{\fam=\ttfam \setfontstyle{tt}}
 % Texinfo sort of supports the sans serif font style, which plain TeX does not.
-% So we set up a \sf analogous to plain's \rm, etc.
+% So we set up a \sf.
 \newfam\sffam
-\def\sf{\fam=\sffam \tensf}
+\def\sf{\fam=\sffam \setfontstyle{sf}}
 \let\li = \sf % Sometimes we call it \li, not \sf.
-% We don't need math for this one.
-\def\ttsl{\tenttsl}
+% We don't need math for this font style.
+\def\ttsl{\setfontstyle{ttsl}}
 % Default leading.
 \newdimen\textleading \textleading = 13.2pt
@@ -1318,6 +1467,7 @@ where each line of input produces a line of output.}
 \def\scshape{csc}
 \def\scbshape{csc}
+% Text fonts (11.2pt, magstep1).
 \newcount\mainmagstep
 \ifx\bigger\relax
 % not really supported.
@@ -1329,10 +1479,6 @@ where each line of input produces a line of output.}
 \setfont\textrm\rmshape{10}{\mainmagstep}
 \setfont\texttt\ttshape{10}{\mainmagstep}
 \fi
-% Instead of cmb10, you may want to use cmbx10.
-% cmbx10 is a prettier font on its own, but cmb10
-% looks better when embedded in a line with cmr10
-% (in Bob's opinion).
 \setfont\textbf\bfshape{10}{\mainmagstep}
 \setfont\textit\itshape{10}{\mainmagstep}
 \setfont\textsl\slshape{10}{\mainmagstep}
@@ -1342,10 +1488,11 @@ where each line of input produces a line of output.}
 \font\texti=cmmi10 scaled \mainmagstep
 \font\textsy=cmsy10 scaled \mainmagstep
-% A few fonts for @defun, etc.
-\setfont\defbf\bxshape{10}{\magstep1} %was 1314
+% A few fonts for @defun names and args.
+\setfont\defbf\bfshape{10}{\magstep1}
 \setfont\deftt\ttshape{10}{\magstep1}
-\def\df{\let\tentt=\deftt \let\tenbf = \defbf \bf}
+\setfont\defttsl\ttslshape{10}{\magstep1}
+\def\df{\let\tentt=\deftt \let\tenbf = \defbf \let\tenttsl=\defttsl \bf}
 % Fonts for indices, footnotes, small examples (9pt).
 \setfont\smallrm\rmshape{9}{1000}
@@ -1371,7 +1518,7 @@ where each line of input produces a line of output.}
 \font\smalleri=cmmi8
 \font\smallersy=cmsy8
-% Fonts for title page:
+% Fonts for title page (20.4pt):
 \setfont\titlerm\rmbshape{12}{\magstep3}
 \setfont\titleit\itbshape{10}{\magstep4}
 \setfont\titlesl\slbshape{10}{\magstep4}
@@ -1417,11 +1564,21 @@ where each line of input produces a line of output.}
 \setfont\ssecttsl\ttslshape{10}{1315}
 \setfont\ssecsf\sfbshape{12}{\magstephalf}
 \let\ssecbf\ssecrm
-\setfont\ssecsc\scbshape{10}{\magstep1}
+\setfont\ssecsc\scbshape{10}{1315}
 \font\sseci=cmmi12 scaled \magstephalf
 \font\ssecsy=cmsy10 scaled 1315
-% The smallcaps and symbol fonts should actually be scaled \magstep1.5,
-% but that is not a standard magnification.
+
+% Reduced fonts for @acro in text (10pt).
+\setfont\reducedrm\rmshape{10}{1000}
+\setfont\reducedtt\ttshape{10}{1000}
+\setfont\reducedbf\bfshape{10}{1000}
+\setfont\reducedit\itshape{10}{1000}
+\setfont\reducedsl\slshape{10}{1000}
+\setfont\reducedsf\sfshape{10}{1000}
+\setfont\reducedsc\scshape{10}{1000}
+\setfont\reducedttsl\ttslshape{10}{1000}
+\font\reducedi=cmmi10
+\font\reducedsy=cmsy10
 % In order for the font changes to affect most math symbols and letters,
 % we have to define the \textfont of the standard families. Since
@@ -1436,50 +1593,72 @@ where each line of input produces a line of output.}
 }
 % The font-changing commands redefine the meanings of \tenSTYLE, instead
-% of just \STYLE. We do this so that font changes will continue to work
-% in math mode, where it is the current \fam that is relevant in most
-% cases, not the current font. Plain TeX does \def\bf{\fam=\bffam
-% \tenbf}, for example. By redefining \tenbf, we obviate the need to
-% redefine \bf itself.
+% of just \STYLE. We do this because \STYLE needs to also set the
+% current \fam for math mode. Our \STYLE (e.g., \rm) commands hardwire
+% \tenSTYLE to set the current font.
+%
+% Each font-changing command also sets the names \lsize (one size lower)
+% and \lllsize (three sizes lower). These relative commands are used in
+% the LaTeX logo and acronyms.
+%
+% This all needs generalizing, badly.
+% \def\textfonts{% \let\tenrm=\textrm \let\tenit=\textit \let\tensl=\textsl \let\tenbf=\textbf \let\tentt=\texttt \let\smallcaps=\textsc - \let\tensf=\textsf \let\teni=\texti \let\tensy=\textsy \let\tenttsl=\textttsl + \let\tensf=\textsf \let\teni=\texti \let\tensy=\textsy + \let\tenttsl=\textttsl + \def\lsize{reduced}\def\lllsize{smaller}% \resetmathfonts \setleading{\textleading}} \def\titlefonts{% \let\tenrm=\titlerm \let\tenit=\titleit \let\tensl=\titlesl \let\tenbf=\titlebf \let\tentt=\titlett \let\smallcaps=\titlesc \let\tensf=\titlesf \let\teni=\titlei \let\tensy=\titlesy \let\tenttsl=\titlettsl + \def\lsize{chap}\def\lllsize{subsec}% \resetmathfonts \setleading{25pt}} \def\titlefont#1{{\titlefonts\rm #1}} \def\chapfonts{% \let\tenrm=\chaprm \let\tenit=\chapit \let\tensl=\chapsl \let\tenbf=\chapbf \let\tentt=\chaptt \let\smallcaps=\chapsc \let\tensf=\chapsf \let\teni=\chapi \let\tensy=\chapsy \let\tenttsl=\chapttsl + \def\lsize{sec}\def\lllsize{text}% \resetmathfonts \setleading{19pt}} \def\secfonts{% \let\tenrm=\secrm \let\tenit=\secit \let\tensl=\secsl \let\tenbf=\secbf \let\tentt=\sectt \let\smallcaps=\secsc - \let\tensf=\secsf \let\teni=\seci \let\tensy=\secsy \let\tenttsl=\secttsl + \let\tensf=\secsf \let\teni=\seci \let\tensy=\secsy + \let\tenttsl=\secttsl + \def\lsize{subsec}\def\lllsize{reduced}% \resetmathfonts \setleading{16pt}} \def\subsecfonts{% \let\tenrm=\ssecrm \let\tenit=\ssecit \let\tensl=\ssecsl \let\tenbf=\ssecbf \let\tentt=\ssectt \let\smallcaps=\ssecsc - \let\tensf=\ssecsf \let\teni=\sseci \let\tensy=\ssecsy \let\tenttsl=\ssecttsl + \let\tensf=\ssecsf \let\teni=\sseci \let\tensy=\ssecsy + \let\tenttsl=\ssecttsl + \def\lsize{text}\def\lllsize{small}% \resetmathfonts \setleading{15pt}} -\let\subsubsecfonts = \subsecfonts % Maybe make sssec fonts scaled magstephalf? 
+\let\subsubsecfonts = \subsecfonts +\def\reducedfonts{% + \let\tenrm=\reducedrm \let\tenit=\reducedit \let\tensl=\reducedsl + \let\tenbf=\reducedbf \let\tentt=\reducedtt \let\reducedcaps=\reducedsc + \let\tensf=\reducedsf \let\teni=\reducedi \let\tensy=\reducedsy + \let\tenttsl=\reducedttsl + \def\lsize{small}\def\lllsize{smaller}% + \resetmathfonts \setleading{10.5pt}} \def\smallfonts{% \let\tenrm=\smallrm \let\tenit=\smallit \let\tensl=\smallsl \let\tenbf=\smallbf \let\tentt=\smalltt \let\smallcaps=\smallsc \let\tensf=\smallsf \let\teni=\smalli \let\tensy=\smallsy \let\tenttsl=\smallttsl + \def\lsize{smaller}\def\lllsize{smaller}% \resetmathfonts \setleading{10.5pt}} \def\smallerfonts{% \let\tenrm=\smallerrm \let\tenit=\smallerit \let\tensl=\smallersl \let\tenbf=\smallerbf \let\tentt=\smallertt \let\smallcaps=\smallersc \let\tensf=\smallersf \let\teni=\smalleri \let\tensy=\smallersy \let\tenttsl=\smallerttsl + \def\lsize{smaller}\def\lllsize{smaller}% \resetmathfonts \setleading{9.5pt}} % Set the fonts to use with the @small... environments. @@ -1488,7 +1667,7 @@ where each line of input produces a line of output.} % About \smallexamplefonts. If we use \smallfonts (9pt), @smallexample % can fit this many characters: % 8.5x11=86 smallbook=72 a4=90 a5=69 -% If we use \smallerfonts (8pt), then we can fit this many characters: +% If we use \scriptfonts (8pt), then we can fit this many characters: % 8.5x11=90+ smallbook=80 a4=90+ a5=77 % For me, subjectively, the few extra characters that fit aren't worth % the additional smallness of 8pt. So I'm making the default 9pt. @@ -1496,14 +1675,13 @@ where each line of input produces a line of output.} % By the way, for comparison, here's what fits with @example (10pt): % 8.5x11=71 smallbook=60 a4=75 a5=58 % -% I wish we used A4 paper on this side of the Atlantic. -% +% I wish the USA used A4 paper. % --karl, 24jan03. % Set up the default fonts, so we can use them for creating boxes. 
% -\textfonts +\textfonts \rm % Define these so they can be easily changed for other fonts. \def\angleleft{$\langle$} @@ -1514,7 +1692,7 @@ where each line of input produces a line of output.} % Fonts for short table of contents. \setfont\shortcontrm\rmshape{12}{1000} -\setfont\shortcontbf\bxshape{12}{1000} +\setfont\shortcontbf\bfshape{10}{\magstep1} % no cmb12 \setfont\shortcontsl\slshape{12}{1000} \setfont\shortconttt\ttshape{12}{1000} @@ -1528,11 +1706,19 @@ where each line of input produces a line of output.} \def\smartslanted#1{{\ifusingtt\ttsl\sl #1}\futurelet\next\smartitalicx} \def\smartitalic#1{{\ifusingtt\ttsl\it #1}\futurelet\next\smartitalicx} +% like \smartslanted except unconditionally uses \ttsl. +% @var is set to this for defun arguments. +\def\ttslanted#1{{\ttsl #1}\futurelet\next\smartitalicx} + +% like \smartslanted except unconditionally use \sl. We never want +% ttsl for book titles, do we? +\def\cite#1{{\sl #1}\futurelet\next\smartitalicx} + \let\i=\smartitalic +\let\slanted=\smartslanted \let\var=\smartslanted \let\dfn=\smartslanted \let\emph=\smartitalic -\let\cite=\smartslanted \def\b#1{{\bf #1}} \let\strong=\b @@ -1559,7 +1745,6 @@ where each line of input produces a line of output.} {\tt \rawbackslash \frenchspacing #1}% \null } -\let\ttfont=\t \def\samp#1{`\tclose{#1}'\null} \setfont\keyrm\rmshape{8}{1000} \font\keysy=cmsy9 @@ -1600,7 +1785,7 @@ where each line of input produces a line of output.} \null } -% We *must* turn on hyphenation at `-' and `_' in \code. +% We *must* turn on hyphenation at `-' and `_' in @code. % Otherwise, it is too hard to avoid overfull hboxes % in the Emacs manual, the Library manual, etc. @@ -1618,10 +1803,6 @@ where each line of input produces a line of output.} \catcode`\_=\active \let_\codeunder \codex } - % - % If we end up with any active - characters when handling the index, - % just treat them as a normal -. 
- \global\def\indexbreaks{\catcode`\-=\active \let-\realdash} } \def\realdash{-} @@ -1645,8 +1826,7 @@ where each line of input produces a line of output.} % @kbdinputstyle -- arg is `distinct' (@kbd uses slanted tty font always), % `example' (@kbd uses ttsl only inside of @example and friends), % or `code' (@kbd uses normal tty font always). -\def\kbdinputstyle{\parsearg\kbdinputstylexxx} -\def\kbdinputstylexxx#1{% +\parseargdef\kbdinputstyle{% \def\arg{#1}% \ifx\arg\worddistinct \gdef\kbdexamplefont{\ttsl}\gdef\kbdfont{\ttsl}% @@ -1672,8 +1852,8 @@ where each line of input produces a line of output.} \else{\tclose{\kbdfont\look}}\fi \else{\tclose{\kbdfont\look}}\fi} -% For @url, @env, @command quotes seem unnecessary, so use \code. -\let\url=\code +% For @indicateurl, @env, @command quotes seem unnecessary, so use \code. +\let\indicateurl=\code \let\env=\code \let\command=\code @@ -1705,6 +1885,10 @@ where each line of input produces a line of output.} \endlink \endgroup} +% @url synonym for @uref, since that's how everyone uses it. +% +\let\url=\uref + % rms does not like angle brackets --karl, 17may97. % So now @email is just like @uref, unless we are pdf. % @@ -1746,22 +1930,53 @@ where each line of input produces a line of output.} \def\sc#1{{\smallcaps#1}} % smallcaps font \def\ii#1{{\it #1}} % italic font -% @acronym downcases the argument and prints in smallcaps. -\def\acronym#1{{\smallcaps \lowercase{#1}}} +% @acronym for "FBI", "NATO", and the like. +% We print this one point size smaller, since it's intended for +% all-uppercase. +% +\def\acronym#1{\doacronym #1,,\finish} +\def\doacronym#1,#2,#3\finish{% + {\selectfonts\lsize #1}% + \def\temp{#2}% + \ifx\temp\empty \else + \space ({\unsepspaces \ignorespaces \temp \unskip})% + \fi +} + +% @abbr for "Comput. J." and the like. +% No font change, but don't do end-of-sentence spacing. 
+% +\def\abbr#1{\doabbr #1,,\finish} +\def\doabbr#1,#2,#3\finish{% + {\frenchspacing #1}% + \def\temp{#2}% + \ifx\temp\empty \else + \space ({\unsepspaces \ignorespaces \temp \unskip})% + \fi +} -% @pounds{} is a sterling sign. +% @pounds{} is a sterling sign, which Knuth put in the CM italic font. +% \def\pounds{{\it\$}} -% @registeredsymbol - R in a circle. For now, only works in text size; -% we'd have to redo the font mechanism to change the \scriptstyle and -% \scriptscriptstyle font sizes to make it look right in headings. +% @registeredsymbol - R in a circle. The font for the R should really +% be smaller yet, but lllsize is the best we can do for now. % Adapted from the plain.tex definition of \copyright. % \def\registeredsymbol{% - $^{{\ooalign{\hfil\raise.07ex\hbox{$\scriptstyle\rm R$}\hfil\crcr\Orb}}% + $^{{\ooalign{\hfil\raise.07ex\hbox{\selectfonts\lllsize R}% + \hfil\crcr\Orb}}% }$% } +% Laurent Siebenmann reports \Orb undefined with: +% Textures 1.7.7 (preloaded format=plain 93.10.14) (68K) 16 APR 2004 02:38 +% so we'll define it if necessary. +% +\ifx\Orb\undefined +\def\Orb{\mathhexbox20D} +\fi + \message{page headings,} @@ -1780,87 +1995,103 @@ where each line of input produces a line of output.} \newif\ifsetshortcontentsaftertitlepage \let\setshortcontentsaftertitlepage = \setshortcontentsaftertitlepagetrue -\def\shorttitlepage{\parsearg\shorttitlepagezzz} -\def\shorttitlepagezzz #1{\begingroup\hbox{}\vskip 1.5in \chaprm \centerline{#1}% +\parseargdef\shorttitlepage{\begingroup\hbox{}\vskip 1.5in \chaprm \centerline{#1}% \endgroup\page\hbox{}\page} -\def\titlepage{\begingroup \parindent=0pt \textfonts - \let\subtitlerm=\tenrm - \def\subtitlefont{\subtitlerm \normalbaselineskip = 13pt \normalbaselines}% - % - \def\authorfont{\authorrm \normalbaselineskip = 16pt \normalbaselines - \let\tt=\authortt}% - % - % Leave some space at the very top of the page. - \vglue\titlepagetopglue - % - % Now you can print the title using @title. 
- \def\title{\parsearg\titlezzz}% - \def\titlezzz##1{\leftline{\titlefonts\rm ##1} - % print a rule at the page bottom also. - \finishedtitlepagefalse - \vskip4pt \hrule height 4pt width \hsize \vskip4pt}% - % No rule at page bottom unless we print one at the top with @title. - \finishedtitlepagetrue - % - % Now you can put text using @subtitle. - \def\subtitle{\parsearg\subtitlezzz}% - \def\subtitlezzz##1{{\subtitlefont \rightline{##1}}}% - % - % @author should come last, but may come many times. - \def\author{\parsearg\authorzzz}% - \def\authorzzz##1{\ifseenauthor\else\vskip 0pt plus 1filll\seenauthortrue\fi - {\authorfont \leftline{##1}}}% - % - % Most title ``pages'' are actually two pages long, with space - % at the top of the second. We don't want the ragged left on the second. - \let\oldpage = \page - \def\page{% +\envdef\titlepage{% + % Open one extra group, as we want to close it in the middle of \Etitlepage. + \begingroup + \parindent=0pt \textfonts + % Leave some space at the very top of the page. + \vglue\titlepagetopglue + % No rule at page bottom unless we print one at the top with @title. + \finishedtitlepagetrue + % + % Most title ``pages'' are actually two pages long, with space + % at the top of the second. We don't want the ragged left on the second. + \let\oldpage = \page + \def\page{% \iffinishedtitlepage\else - \finishtitlepage + \finishtitlepage \fi - \oldpage \let\page = \oldpage - \hbox{}}% -% \def\page{\oldpage \hbox{}} + \page + \null + }% } \def\Etitlepage{% - \iffinishedtitlepage\else - \finishtitlepage - \fi - % It is important to do the page break before ending the group, - % because the headline and footline are only empty inside the group. - % If we use the new definition of \page, we always get a blank page - % after the title page, which we certainly don't want. - \oldpage - \endgroup - % - % Need this before the \...aftertitlepage checks so that if they are - % in effect the toc pages will come out with page numbers. 
- \HEADINGSon - % - % If they want short, they certainly want long too. - \ifsetshortcontentsaftertitlepage - \shortcontents - \contents - \global\let\shortcontents = \relax - \global\let\contents = \relax - \fi - % - \ifsetcontentsaftertitlepage - \contents - \global\let\contents = \relax - \global\let\shortcontents = \relax - \fi + \iffinishedtitlepage\else + \finishtitlepage + \fi + % It is important to do the page break before ending the group, + % because the headline and footline are only empty inside the group. + % If we use the new definition of \page, we always get a blank page + % after the title page, which we certainly don't want. + \oldpage + \endgroup + % + % Need this before the \...aftertitlepage checks so that if they are + % in effect the toc pages will come out with page numbers. + \HEADINGSon + % + % If they want short, they certainly want long too. + \ifsetshortcontentsaftertitlepage + \shortcontents + \contents + \global\let\shortcontents = \relax + \global\let\contents = \relax + \fi + % + \ifsetcontentsaftertitlepage + \contents + \global\let\contents = \relax + \global\let\shortcontents = \relax + \fi } \def\finishtitlepage{% - \vskip4pt \hrule height 2pt width \hsize - \vskip\titlepagebottomglue - \finishedtitlepagetrue + \vskip4pt \hrule height 2pt width \hsize + \vskip\titlepagebottomglue + \finishedtitlepagetrue +} + +%%% Macros to be used within @titlepage: + +\let\subtitlerm=\tenrm +\def\subtitlefont{\subtitlerm \normalbaselineskip = 13pt \normalbaselines} + +\def\authorfont{\authorrm \normalbaselineskip = 16pt \normalbaselines + \let\tt=\authortt} + +\parseargdef\title{% + \checkenv\titlepage + \leftline{\titlefonts\rm #1} + % print a rule at the page bottom also. + \finishedtitlepagefalse + \vskip4pt \hrule height 4pt width \hsize \vskip4pt +} + +\parseargdef\subtitle{% + \checkenv\titlepage + {\subtitlefont \rightline{#1}}% +} + +% @author should come last, but may come many times. +% It can also be used inside @quotation. 
+% +\parseargdef\author{% + \def\temp{\quotation}% + \ifx\thisenv\temp + \def\quotationauthor{#1}% printed in \Equotation. + \else + \checkenv\titlepage + \ifseenauthor\else \vskip 0pt plus 1filll \seenauthortrue \fi + {\authorfont \leftline{#1}}% + \fi } + %%% Set up page headings and footings. \let\thispage=\folio @@ -1870,7 +2101,7 @@ where each line of input produces a line of output.} \newtoks\evenfootline % footline on even pages \newtoks\oddfootline % footline on odd pages -% Now make Tex use those variables +% Now make TeX use those variables \headline={{\textfonts\rm \ifodd\pageno \the\oddheadline \else \the\evenheadline \fi}} \footline={{\textfonts\rm \ifodd\pageno \the\oddfootline @@ -1884,32 +2115,27 @@ where each line of input produces a line of output.} % @evenfooting @thisfile|| % @oddfooting ||@thisfile -\def\evenheading{\parsearg\evenheadingxxx} -\def\oddheading{\parsearg\oddheadingxxx} -\def\everyheading{\parsearg\everyheadingxxx} - -\def\evenfooting{\parsearg\evenfootingxxx} -\def\oddfooting{\parsearg\oddfootingxxx} -\def\everyfooting{\parsearg\everyfootingxxx} -{\catcode`\@=0 % - -\gdef\evenheadingxxx #1{\evenheadingyyy #1@|@|@|@|\finish} -\gdef\evenheadingyyy #1@|#2@|#3@|#4\finish{% +\def\evenheading{\parsearg\evenheadingxxx} +\def\evenheadingxxx #1{\evenheadingyyy #1\|\|\|\|\finish} +\def\evenheadingyyy #1\|#2\|#3\|#4\finish{% \global\evenheadline={\rlap{\centerline{#2}}\line{#1\hfil#3}}} -\gdef\oddheadingxxx #1{\oddheadingyyy #1@|@|@|@|\finish} -\gdef\oddheadingyyy #1@|#2@|#3@|#4\finish{% +\def\oddheading{\parsearg\oddheadingxxx} +\def\oddheadingxxx #1{\oddheadingyyy #1\|\|\|\|\finish} +\def\oddheadingyyy #1\|#2\|#3\|#4\finish{% \global\oddheadline={\rlap{\centerline{#2}}\line{#1\hfil#3}}} -\gdef\everyheadingxxx#1{\oddheadingxxx{#1}\evenheadingxxx{#1}}% +\parseargdef\everyheading{\oddheadingxxx{#1}\evenheadingxxx{#1}}% -\gdef\evenfootingxxx #1{\evenfootingyyy #1@|@|@|@|\finish} -\gdef\evenfootingyyy #1@|#2@|#3@|#4\finish{% 
+\def\evenfooting{\parsearg\evenfootingxxx} +\def\evenfootingxxx #1{\evenfootingyyy #1\|\|\|\|\finish} +\def\evenfootingyyy #1\|#2\|#3\|#4\finish{% \global\evenfootline={\rlap{\centerline{#2}}\line{#1\hfil#3}}} -\gdef\oddfootingxxx #1{\oddfootingyyy #1@|@|@|@|\finish} -\gdef\oddfootingyyy #1@|#2@|#3@|#4\finish{% +\def\oddfooting{\parsearg\oddfootingxxx} +\def\oddfootingxxx #1{\oddfootingyyy #1\|\|\|\|\finish} +\def\oddfootingyyy #1\|#2\|#3\|#4\finish{% \global\oddfootline = {\rlap{\centerline{#2}}\line{#1\hfil#3}}% % % Leave some space for the footline. Hopefully ok to assume @@ -1918,9 +2144,8 @@ where each line of input produces a line of output.} \global\advance\vsize by -\baselineskip } -\gdef\everyfootingxxx#1{\oddfootingxxx{#1}\evenfootingxxx{#1}} -% -}% unbind the catcode of @. +\parseargdef\everyfooting{\oddfootingxxx{#1}\evenfootingxxx{#1}} + % @headings double turns headings on for double-sided printing. % @headings single turns headings on for single-sided printing. @@ -1934,7 +2159,7 @@ where each line of input produces a line of output.} \def\headings #1 {\csname HEADINGS#1\endcsname} -\def\HEADINGSoff{ +\def\HEADINGSoff{% \global\evenheadline={\hfil} \global\evenfootline={\hfil} \global\oddheadline={\hfil} \global\oddfootline={\hfil}} \HEADINGSoff @@ -1943,7 +2168,7 @@ where each line of input produces a line of output.} % chapter name on inside top of right hand pages, document % title on inside top of left hand pages, and page numbers on outside top % edge of all pages. -\def\HEADINGSdouble{ +\def\HEADINGSdouble{% \global\pageno=1 \global\evenfootline={\hfil} \global\oddfootline={\hfil} @@ -1955,7 +2180,7 @@ where each line of input produces a line of output.} % For single-sided printing, chapter title goes across top left of page, % page number on top right. 
-\def\HEADINGSsingle{ +\def\HEADINGSsingle{% \global\pageno=1 \global\evenfootline={\hfil} \global\oddfootline={\hfil} @@ -2002,12 +2227,11 @@ where each line of input produces a line of output.} % @settitle line... specifies the title of the document, for headings. % It generates no output of its own. \def\thistitle{\putwordNoTitle} -\def\settitle{\parsearg\settitlezzz} -\def\settitlezzz #1{\gdef\thistitle{#1}} +\def\settitle{\parsearg{\gdef\thistitle}} \message{tables,} -% Tables -- @table, @ftable, @vtable, @item(x), @kitem(x), @xitem(x). +% Tables -- @table, @ftable, @vtable, @item(x). % default indentation of table text \newdimen\tableindent \tableindent=.8in @@ -2019,7 +2243,7 @@ where each line of input produces a line of output.} % used internally for \itemindent minus \itemmargin \newdimen\itemmax -% Note @table, @vtable, and @vtable define @item, @itemx, etc., with +% Note @table, @ftable, and @vtable define @item, @itemx, etc., with % these defs. % They also define \itemindex % to index the item name in whatever manner is desired (perhaps none). @@ -2031,22 +2255,10 @@ where each line of input produces a line of output.} \def\internalBitem{\smallbreak \parsearg\itemzzz} \def\internalBitemx{\itemxpar \parsearg\itemzzz} -\def\internalBxitem "#1"{\def\xitemsubtopix{#1} \smallbreak \parsearg\xitemzzz} -\def\internalBxitemx "#1"{\def\xitemsubtopix{#1} \itemxpar \parsearg\xitemzzz} - -\def\internalBkitem{\smallbreak \parsearg\kitemzzz} -\def\internalBkitemx{\itemxpar \parsearg\kitemzzz} - -\def\kitemzzz #1{\dosubind {kw}{\code{#1}}{for {\bf \lastfunction}}% - \itemzzz {#1}} - -\def\xitemzzz #1{\dosubind {kw}{\code{#1}}{for {\bf \xitemsubtopic}}% - \itemzzz {#1}} - \def\itemzzz #1{\begingroup % \advance\hsize by -\rightskip \advance\hsize by -\tableindent - \setbox0=\hbox{\itemfont{#1}}% + \setbox0=\hbox{\itemindicate{#1}}% \itemindex{#1}% \nobreak % This prevents a break before @itemx. 
% @@ -2070,17 +2282,13 @@ where each line of input produces a line of output.} % \parskip glue -- logically it's part of the @item we just started. \nobreak \vskip-\parskip % - % Stop a page break at the \parskip glue coming up. (Unfortunately - % we can't prevent a possible page break at the following - % \baselineskip glue.) However, if what follows is an environment - % such as @example, there will be no \parskip glue; then - % the negative vskip we just would cause the example and the item to - % crash together. So we use this bizarre value of 10001 as a signal - % to \aboveenvbreak to insert \parskip glue after all. - % (Possibly there are other commands that could be followed by - % @example which need the same treatment, but not section titles; or - % maybe section titles are the only special case and they should be - % penalty 10001...) + % Stop a page break at the \parskip glue coming up. However, if + % what follows is an environment such as @example, there will be no + % \parskip glue; then the negative vskip we just inserted would + % cause the example and the item to crash together. So we use this + % bizarre value of 10001 as a signal to \aboveenvbreak to insert + % \parskip glue after all. Section titles are handled this way also. + % \penalty 10001 \endgroup \itemxneedsnegativevskipfalse @@ -2100,81 +2308,72 @@ where each line of input produces a line of output.} \fi } -\def\item{\errmessage{@item while not in a table}} -\def\itemx{\errmessage{@itemx while not in a table}} -\def\kitem{\errmessage{@kitem while not in a table}} -\def\kitemx{\errmessage{@kitemx while not in a table}} -\def\xitem{\errmessage{@xitem while not in a table}} -\def\xitemx{\errmessage{@xitemx while not in a table}} - -% Contains a kludge to get @end[description] to work. -\def\description{\tablez{\dontindex}{1}{}{}{}{}} +\def\item{\errmessage{@item while not in a list environment}} +\def\itemx{\errmessage{@itemx while not in a list environment}} % @table, @ftable, @vtable. 
-\def\table{\begingroup\inENV\obeylines\obeyspaces\tablex} -{\obeylines\obeyspaces% -\gdef\tablex #1^^M{% -\tabley\dontindex#1 \endtabley}} - -\def\ftable{\begingroup\inENV\obeylines\obeyspaces\ftablex} -{\obeylines\obeyspaces% -\gdef\ftablex #1^^M{% -\tabley\fnitemindex#1 \endtabley -\def\Eftable{\endgraf\afterenvbreak\endgroup}% -\let\Etable=\relax}} - -\def\vtable{\begingroup\inENV\obeylines\obeyspaces\vtablex} -{\obeylines\obeyspaces% -\gdef\vtablex #1^^M{% -\tabley\vritemindex#1 \endtabley -\def\Evtable{\endgraf\afterenvbreak\endgroup}% -\let\Etable=\relax}} - -\def\dontindex #1{} -\def\fnitemindex #1{\doind {fn}{\code{#1}}}% -\def\vritemindex #1{\doind {vr}{\code{#1}}}% - -{\obeyspaces % -\gdef\tabley#1#2 #3 #4 #5 #6 #7\endtabley{\endgroup% -\tablez{#1}{#2}{#3}{#4}{#5}{#6}}} - -\def\tablez #1#2#3#4#5#6{% -\aboveenvbreak % -\begingroup % -\def\Edescription{\Etable}% Necessary kludge. -\let\itemindex=#1% -\ifnum 0#3>0 \advance \leftskip by #3\mil \fi % -\ifnum 0#4>0 \tableindent=#4\mil \fi % -\ifnum 0#5>0 \advance \rightskip by #5\mil \fi % -\def\itemfont{#2}% -\itemmax=\tableindent % -\advance \itemmax by -\itemmargin % -\advance \leftskip by \tableindent % -\exdentamount=\tableindent -\parindent = 0pt -\parskip = \smallskipamount -\ifdim \parskip=0pt \parskip=2pt \fi% -\def\Etable{\endgraf\afterenvbreak\endgroup}% -\let\item = \internalBitem % -\let\itemx = \internalBitemx % -\let\kitem = \internalBkitem % -\let\kitemx = \internalBkitemx % -\let\xitem = \internalBxitem % -\let\xitemx = \internalBxitemx % +\envdef\table{% + \let\itemindex\gobble + \tablecheck{table}% +} +\envdef\ftable{% + \def\itemindex ##1{\doind {fn}{\code{##1}}}% + \tablecheck{ftable}% +} +\envdef\vtable{% + \def\itemindex ##1{\doind {vr}{\code{##1}}}% + \tablecheck{vtable}% +} +\def\tablecheck#1{% + \ifnum \the\catcode`\^^M=\active + \endgroup + \errmessage{This command won't work in this context; perhaps the problem is + that we are \inenvironment\thisenv}% + \def\next{\doignore{#1}}% + 
\else + \let\next\tablex + \fi + \next +} +\def\tablex#1{% + \def\itemindicate{#1}% + \parsearg\tabley +} +\def\tabley#1{% + {% + \makevalueexpandable + \edef\temp{\noexpand\tablez #1\space\space\space}% + \expandafter + }\temp \endtablez } +\def\tablez #1 #2 #3 #4\endtablez{% + \aboveenvbreak + \ifnum 0#1>0 \advance \leftskip by #1\mil \fi + \ifnum 0#2>0 \tableindent=#2\mil \fi + \ifnum 0#3>0 \advance \rightskip by #3\mil \fi + \itemmax=\tableindent + \advance \itemmax by -\itemmargin + \advance \leftskip by \tableindent + \exdentamount=\tableindent + \parindent = 0pt + \parskip = \smallskipamount + \ifdim \parskip=0pt \parskip=2pt \fi + \let\item = \internalBitem + \let\itemx = \internalBitemx +} +\def\Etable{\endgraf\afterenvbreak} +\let\Eftable\Etable +\let\Evtable\Etable +\let\Eitemize\Etable +\let\Eenumerate\Etable % This is the counter used by @enumerate, which is really @itemize \newcount \itemno -\def\itemize{\parsearg\itemizezzz} - -\def\itemizezzz #1{% - \begingroup % ended by the @end itemize - \itemizey {#1}{\Eitemize} -} +\envdef\itemize{\parsearg\doitemize} -\def\itemizey#1#2{% +\def\doitemize#1{% \aboveenvbreak \itemmax=\itemindent \advance\itemmax by -\itemmargin @@ -2183,13 +2382,33 @@ where each line of input produces a line of output.} \parindent=0pt \parskip=\smallskipamount \ifdim\parskip=0pt \parskip=2pt \fi - \def#2{\endgraf\afterenvbreak\endgroup}% \def\itemcontents{#1}% % @itemize with no arg is equivalent to @itemize @bullet. \ifx\itemcontents\empty\def\itemcontents{\bullet}\fi \let\item=\itemizeitem } +% Definition of @item while inside @itemize and @enumerate. +% +\def\itemizeitem{% + \advance\itemno by 1 % for enumerations + {\let\par=\endgraf \smallbreak}% reasonable place to break + {% + % If the document has an @itemize directly after a section title, a + % \nobreak will be last on the list, and \sectionheading will have + % done a \vskip-\parskip. 
In that case, we don't want to zero + % parskip, or the item text will crash with the heading. On the + % other hand, when there is normal text preceding the item (as there + % usually is), we do want to zero parskip, or there would be too much + % space. In that case, we won't have a \nobreak before. At least + % that's the theory. + \ifnum\lastpenalty<10000 \parskip=0in \fi + \noindent + \hbox to 0pt{\hss \itemcontents \kern\itemmargin}% + \vadjust{\penalty 1200}}% not good to break after first line of item. + \flushcr +} + % \splitoff TOKENS\endmark defines \first to be the first token in % TOKENS, and \rest to be the remainder. % @@ -2199,11 +2418,8 @@ where each line of input produces a line of output.} % or number, to specify the first label in the enumerated list. No % argument is the same as `1'. % -\def\enumerate{\parsearg\enumeratezzz} -\def\enumeratezzz #1{\enumeratey #1 \endenumeratey} +\envparseargdef\enumerate{\enumeratey #1 \endenumeratey} \def\enumeratey #1 #2\endenumeratey{% - \begingroup % ended by the @end enumerate - % % If we were given no argument, pretend we were given `1'. \def\thearg{#1}% \ifx\thearg\empty \def\thearg{1}\fi @@ -2274,13 +2490,13 @@ where each line of input produces a line of output.} }% } -% Call itemizey, adding a period to the first argument and supplying the +% Call \doitemize, adding a period to the first argument and supplying the % common last two arguments. Also subtract one from the initial value in % \itemno, since @item increments \itemno. % \def\startenumeration#1{% \advance\itemno by -1 - \itemizey{#1.}\Eenumerate\flushcr + \doitemize{#1.}\flushcr } % @alphaenumerate and @capsenumerate are abbreviations for giving an arg @@ -2291,16 +2507,6 @@ where each line of input produces a line of output.} \def\Ealphaenumerate{\Eenumerate} \def\Ecapsenumerate{\Eenumerate} -% Definition of @item while inside @itemize. 
- -\def\itemizeitem{% -\advance\itemno by 1 -{\let\par=\endgraf \smallbreak}% -\ifhmode \errmessage{In hmode at itemizeitem}\fi -{\parskip=0in \hskip 0pt -\hbox to 0pt{\hss \itemcontents\hskip \itemmargin}% -\vadjust{\penalty 1200}}% -\flushcr} % @multitable macros % Amy Hendrickson, 8/18/94, 3/6/96 @@ -2327,24 +2533,14 @@ where each line of input produces a line of output.} % @multitable {Column 1 template} {Column 2 template} {Column 3 template} % @item ... % using the widest term desired in each column. -% -% For those who want to use more than one line's worth of words in -% the preamble, break the line within one argument and it -% will parse correctly, i.e., -% -% @multitable {Column 1 template} {Column 2 template} {Column 3 -% template} -% Not: -% @multitable {Column 1 template} {Column 2 template} -% {Column 3 template} % Each new table line starts with @item, each subsequent new column % starts with @tab. Empty columns may be produced by supplying @tab's % with nothing between them for as many times as empty columns are needed, % ie, @tab@tab@tab will produce two empty columns. -% @item, @tab, @multitable or @end multitable do not need to be on their -% own lines, but it will not hurt if they are. +% @item, @tab do not need to be on their own lines, but it will not hurt +% if they are. % Sample multitable: @@ -2388,13 +2584,12 @@ where each line of input produces a line of output.} \def\xcolumnfractions{\columnfractions} \newif\ifsetpercent -% #1 is the part of the @columnfraction before the decimal point, which -% is presumably either 0 or the empty string (but we don't check, we -% just throw it away). #2 is the decimal part, which we use as the -% percent of \hsize for this column. -\def\pickupwholefraction#1.#2 {% +% #1 is the @columnfraction, usually a decimal number like .5, but might +% be just 1. We just use it, whatever it is. 
+% +\def\pickupwholefraction#1 {% \global\advance\colcount by 1 - \expandafter\xdef\csname col\the\colcount\endcsname{.#2\hsize}% + \expandafter\xdef\csname col\the\colcount\endcsname{#1\hsize}% \setuptable } @@ -2427,18 +2622,33 @@ where each line of input produces a line of output.} \go } +% multitable-only commands. +% +% @headitem starts a heading row, which we typeset in bold. +% Assignments have to be global since we are inside the implicit group +% of an alignment entry. Note that \everycr resets \everytab. +\def\headitem{\checkenv\multitable \crcr \global\everytab={\bf}\the\everytab}% +% +% A \tab used to include \hskip1sp. But then the space in a template +% line is not enough. That is bad. So let's go back to just `&' until +% we encounter the problem it was intended to solve again. +% --karl, nathan@acm.org, 20apr99. +\def\tab{\checkenv\multitable &\the\everytab}% + % @multitable ... @end multitable definitions: % -\def\multitable{\parsearg\dotable} -\def\dotable#1{\bgroup +\newtoks\everytab % insert after every tab. +% +\envdef\multitable{% \vskip\parskip - \let\item=\crcrwithfootnotes - % A \tab used to include \hskip1sp. But then the space in a template - % line is not enough. That is bad. So let's go back to just & until - % we encounter the problem it was intended to solve again. --karl, - % nathan@acm.org, 20apr99. - \let\tab=&% - \let\startfootins=\startsavedfootnote + \startsavinginserts + % + % @item within a multitable starts a normal row. + % We use \def instead of \let so that if one of the multitable entries + % contains an @itemize, we don't choke on the \item (seen as \crcr aka + % \endtemplate) expanding \doitemize. 
+ \def\item{\crcr}% + % \tolerance=9500 \hbadness=9500 \setmultitablespacing @@ -2446,70 +2656,80 @@ where each line of input produces a line of output.} \parindent=\multitableparindent \overfullrule=0pt \global\colcount=0 - \def\Emultitable{% - \global\setpercentfalse - \crcrwithfootnotes\crcr - \egroup\egroup + % + \everycr = {% + \noalign{% + \global\everytab={}% + \global\colcount=0 % Reset the column counter. + % Check for saved footnotes, etc. + \checkinserts + % Keeps underfull box messages off when table breaks over pages. + %\filbreak + % Maybe so, but it also creates really weird page breaks when the + % table breaks over pages. Wouldn't \vfil be better? Wait until the + % problem manifests itself, so it can be fixed for real --karl. + }% }% % + \parsearg\domultitable +} +\def\domultitable#1{% % To parse everything between @multitable and @item: \setuptable#1 \endsetuptable % - % \everycr will reset column counter, \colcount, at the end of - % each line. Every column entry will cause \colcount to advance by one. - % The table preamble - % looks at the current \colcount to find the correct column width. - \everycr{\noalign{% - % - % \filbreak%% keeps underfull box messages off when table breaks over pages. - % Maybe so, but it also creates really weird page breaks when the table - % breaks over pages. Wouldn't \vfil be better? Wait until the problem - % manifests itself, so it can be fixed for real --karl. - \global\colcount=0\relax}}% - % % This preamble sets up a generic column definition, which will % be used as many times as user calls for columns. % \vtop will set a single line and will also let text wrap and % continue for many paragraphs if desired. - \halign\bgroup&\global\advance\colcount by 1\relax - \multistrut\vtop{\hsize=\expandafter\csname col\the\colcount\endcsname - % - % In order to keep entries from bumping into each other - % we will add a \leftskip of \multitablecolspace to all columns after - % the first one. 
- % - % If a template has been used, we will add \multitablecolspace - % to the width of each template entry. - % - % If the user has set preamble in terms of percent of \hsize we will - % use that dimension as the width of the column, and the \leftskip - % will keep entries from bumping into each other. Table will start at - % left margin and final column will justify at right margin. - % - % Make sure we don't inherit \rightskip from the outer environment. - \rightskip=0pt - \ifnum\colcount=1 - % The first column will be indented with the surrounding text. - \advance\hsize by\leftskip - \else - \ifsetpercent \else - % If user has not set preamble in terms of percent of \hsize - % we will advance \hsize by \multitablecolspace. - \advance\hsize by \multitablecolspace - \fi - % In either case we will make \leftskip=\multitablecolspace: - \leftskip=\multitablecolspace - \fi - % Ignoring space at the beginning and end avoids an occasional spurious - % blank line, when TeX decides to break the line at the space before the - % box from the multistrut, so the strut ends up on a line by itself. - % For example: - % @multitable @columnfractions .11 .89 - % @item @code{#} - % @tab Legal holiday which is valid in major parts of the whole country. - % Is automatically provided with highlighting sequences respectively marking - % characters. - \noindent\ignorespaces##\unskip\multistrut}\cr + \halign\bgroup &% + \global\advance\colcount by 1 + \multistrut + \vtop{% + % Use the current \colcount to find the correct column width: + \hsize=\expandafter\csname col\the\colcount\endcsname + % + % In order to keep entries from bumping into each other + % we will add a \leftskip of \multitablecolspace to all columns after + % the first one. + % + % If a template has been used, we will add \multitablecolspace + % to the width of each template entry. 
+ % + % If the user has set preamble in terms of percent of \hsize we will + % use that dimension as the width of the column, and the \leftskip + % will keep entries from bumping into each other. Table will start at + % left margin and final column will justify at right margin. + % + % Make sure we don't inherit \rightskip from the outer environment. + \rightskip=0pt + \ifnum\colcount=1 + % The first column will be indented with the surrounding text. + \advance\hsize by\leftskip + \else + \ifsetpercent \else + % If user has not set preamble in terms of percent of \hsize + % we will advance \hsize by \multitablecolspace. + \advance\hsize by \multitablecolspace + \fi + % In either case we will make \leftskip=\multitablecolspace: + \leftskip=\multitablecolspace + \fi + % Ignoring space at the beginning and end avoids an occasional spurious + % blank line, when TeX decides to break the line at the space before the + % box from the multistrut, so the strut ends up on a line by itself. + % For example: + % @multitable @columnfractions .11 .89 + % @item @code{#} + % @tab Legal holiday which is valid in major parts of the whole country. + % Is automatically provided with highlighting sequences respectively + % marking characters. + \noindent\ignorespaces##\unskip\multistrut + }\cr +} +\def\Emultitable{% + \crcr + \egroup % end the \halign + \global\setpercentfalse } \def\setmultitablespacing{% test to see if user has set \multitablelinespace. @@ -2539,65 +2759,33 @@ width0pt\relax} \fi %% than skip between lines in the table. \fi} -% In case a @footnote appears inside an alignment, save the footnote -% text to a box and make the \insert when a row of the table is -% finished. Otherwise, the insertion is lost, it never migrates to the -% main vertical list. --kasal, 22jan03. -% -\newbox\savedfootnotes -% -% \dotable \let's \startfootins to this, so that \dofootnote will call -% it instead of starting the insertion right away. 
-\def\startsavedfootnote{% - \global\setbox\savedfootnotes = \vbox\bgroup - \unvbox\savedfootnotes -} -\def\crcrwithfootnotes{% - \crcr - \ifvoid\savedfootnotes \else - \noalign{\insert\footins{\box\savedfootnotes}}% - \fi -} \message{conditionals,} -% Prevent errors for section commands. -% Used in @ignore and in failing conditionals. -\def\ignoresections{% - \let\appendix=\relax - \let\appendixsec=\relax - \let\appendixsection=\relax - \let\appendixsubsec=\relax - \let\appendixsubsection=\relax - \let\appendixsubsubsec=\relax - \let\appendixsubsubsection=\relax - %\let\begin=\relax - %\let\bye=\relax - \let\centerchap=\relax - \let\chapter=\relax - \let\contents=\relax - \let\section=\relax - \let\smallbook=\relax - \let\subsec=\relax - \let\subsection=\relax - \let\subsubsec=\relax - \let\subsubsection=\relax - \let\titlepage=\relax - \let\top=\relax - \let\unnumbered=\relax - \let\unnumberedsec=\relax - \let\unnumberedsection=\relax - \let\unnumberedsubsec=\relax - \let\unnumberedsubsection=\relax - \let\unnumberedsubsubsec=\relax - \let\unnumberedsubsubsection=\relax + +% @iftex, @ifnotdocbook, @ifnothtml, @ifnotinfo, @ifnotplaintext, +% @ifnotxml always succeed. They currently do nothing; we don't +% attempt to check whether the conditionals are properly nested. But we +% have to remember that they are conditionals, so that @end doesn't +% attempt to close an environment group. +% +\def\makecond#1{% + \expandafter\let\csname #1\endcsname = \relax + \expandafter\let\csname iscond.#1\endcsname = 1 } +\makecond{iftex} +\makecond{ifnotdocbook} +\makecond{ifnothtml} +\makecond{ifnotinfo} +\makecond{ifnotplaintext} +\makecond{ifnotxml} % Ignore @ignore, @ifhtml, @ifinfo, and the like. 
% \def\direntry{\doignore{direntry}} -\def\documentdescriptionword{documentdescription} \def\documentdescription{\doignore{documentdescription}} +\def\docbook{\doignore{docbook}} \def\html{\doignore{html}} +\def\ifdocbook{\doignore{ifdocbook}} \def\ifhtml{\doignore{ifhtml}} \def\ifinfo{\doignore{ifinfo}} \def\ifnottex{\doignore{ifnottex}} @@ -2607,47 +2795,40 @@ width0pt\relax} \fi \def\menu{\doignore{menu}} \def\xml{\doignore{xml}} -% @dircategory CATEGORY -- specify a category of the dir file -% which this file should belong to. Ignore this in TeX. -\let\dircategory = \comment - % Ignore text until a line `@end #1', keeping track of nested conditionals. % % A count to remember the depth of nesting. \newcount\doignorecount \def\doignore#1{\begingroup - % Don't complain about control sequences we have declared \outer. - \ignoresections + % Scan in ``verbatim'' mode: + \catcode`\@ = \other + \catcode`\{ = \other + \catcode`\} = \other % % Make sure that spaces turn into tokens that match what \doignoretext wants. - \catcode\spaceChar = 10 - % - % Ignore braces, so mismatched braces don't cause trouble. - \catcode`\{ = 9 - \catcode`\} = 9 + \spaceisspace % % Count number of #1's that we've seen. \doignorecount = 0 % % Swallow text until we reach the matching `@end #1'. - \expandafter \dodoignore \csname#1\endcsname {#1}% + \dodoignore{#1}% } -{ \catcode`@=11 % We want to use \ST@P which cannot appear in texinfo source. +{ \catcode`_=11 % We want to use \_STOP_ which cannot appear in texinfo source. \obeylines % % - \gdef\dodoignore#1#2{% - % #1 contains, e.g., \ifinfo, a.k.a. @ifinfo. - % #2 contains the string `ifinfo'. + \gdef\dodoignore#1{% + % #1 contains the command name as a string, e.g., `ifinfo'. % - % Define a command to find the next `@end #2', which must be on a line + % Define a command to find the next `@end #1', which must be on a line % by itself. 
- \long\def\doignoretext##1^^M\end #2{\doignoretextyyy##1^^M#1\ST@P}% + \long\def\doignoretext##1^^M@end #1{\doignoretextyyy##1^^M@#1\_STOP_}% % And this command to find another #1 command, at the beginning of a % line. (Otherwise, we would consider a line `@c @ifset', for % example, to count as an @ifset for nesting.) - \long\def\doignoretextyyy##1^^M#1##2\ST@P{\doignoreyyy{##2}\ST@P}% + \long\def\doignoretextyyy##1^^M@#1##2\_STOP_{\doignoreyyy{##2}\_STOP_}% % % And now expand that command. \obeylines % @@ -2664,11 +2845,11 @@ width0pt\relax} \fi \let\next\doignoretextyyy % ..., look for another. % If we're here, #1 ends with ^^M\ifinfo (for example). \fi - \next #1% the token \ST@P is present just after this macro. + \next #1% the token \_STOP_ is present just after this macro. } -% We have to swallow the remaining "\ST@P". -% +% We have to swallow the remaining "\_STOP_". +% \def\doignoretextzzz#1{% \ifnum\doignorecount = 0 % We have just found the outermost @end. \let\next\enddoignore @@ -2689,53 +2870,58 @@ width0pt\relax} \fi % Since we want to separate VAR from REST-OF-LINE (which might be % empty), we can't just use \parsearg; we have to insert a space of our % own to delimit the rest of the line, and then take it out again if we -% didn't need it. Make sure the catcode of space is correct to avoid -% losing inside @example, for instance. +% didn't need it. +% We rely on the fact that \parsearg sets \catcode`\ =10. % -\def\set{\begingroup\catcode` =10 - \catcode`\-=12 \catcode`\_=12 % Allow - and _ in VAR. - \parsearg\setxxx} -\def\setxxx#1{\setyyy#1 \endsetyyy} +\parseargdef\set{\setyyy#1 \endsetyyy} \def\setyyy#1 #2\endsetyyy{% - \def\temp{#2}% - \ifx\temp\empty \global\expandafter\let\csname SET#1\endcsname = \empty - \else \setzzz{#1}#2\endsetzzz % Remove the trailing space \setxxx inserted. 
- \fi - \endgroup + {% + \makevalueexpandable + \def\temp{#2}% + \edef\next{\gdef\makecsname{SET#1}}% + \ifx\temp\empty + \next{}% + \else + \setzzz#2\endsetzzz + \fi + }% } -% Can't use \xdef to pre-expand #2 and save some time, since \temp or -% \next or other control sequences that we've defined might get us into -% an infinite loop. Consider `@set foo @cite{bar}'. -\def\setzzz#1#2 \endsetzzz{\expandafter\gdef\csname SET#1\endcsname{#2}} +% Remove the trailing space \setxxx inserted. +\def\setzzz#1 \endsetzzz{\next{#1}} % @clear VAR clears (i.e., unsets) the variable VAR. % -\def\clear{\parsearg\clearxxx} -\def\clearxxx#1{\global\expandafter\let\csname SET#1\endcsname=\relax} +\parseargdef\clear{% + {% + \makevalueexpandable + \global\expandafter\let\csname SET#1\endcsname=\relax + }% +} % @value{foo} gets the text saved in variable foo. +\def\value{\begingroup\makevalueexpandable\valuexxx} +\def\valuexxx#1{\expandablevalue{#1}\endgroup} { - \catcode`\_ = \active + \catcode`\- = \active \catcode`\_ = \active % - % We might end up with active _ or - characters in the argument if - % we're called from @code, as @code{@value{foo-bar_}}. So \let any - % such active characters to their normal equivalents. - \gdef\value{\begingroup + \gdef\makevalueexpandable{% + \let\value = \expandablevalue + % We don't want these characters active, ... \catcode`\-=\other \catcode`\_=\other - \indexbreaks \let_\normalunderscore - \valuexxx} + % ..., but we might end up with active ones in the argument if + % we're called from @code, as @code{@value{foo-bar_}}, though. + % So \let them to their normal equivalents. + \let-\realdash \let_\normalunderscore + } } -\def\valuexxx#1{\expandablevalue{#1}\endgroup} % We have this subroutine so that we can handle at least some @value's -% properly in indexes (we \let\value to this in \indexdummies). Ones -% whose names contain - or _ still won't work, but we can't do anything -% about that. 
The command has to be fully expandable (if the variable -% is set), since the result winds up in the index file. This means that -% if the variable's value contains other Texinfo commands, it's almost -% certain it will fail (although perhaps we could fix that with -% sufficient work to do a one-level expansion on the result, instead of -% complete). +% properly in indexes (we call \makevalueexpandable in \indexdummies). +% The command has to be fully expandable (if the variable is set), since +% the result winds up in the index file. This means that if the +% variable's value contains other Texinfo commands, it's almost certain +% it will fail (although perhaps we could fix that with sufficient work +% to do a one-level expansion on the result, instead of complete). % \def\expandablevalue#1{% \expandafter\ifx\csname SET#1\endcsname\relax @@ -2749,55 +2935,36 @@ width0pt\relax} \fi % @ifset VAR ... @end ifset reads the `...' iff VAR has been defined % with @set. % -\def\ifset{\parsearg\doifset} -\def\doifset#1{% - \expandafter\ifx\csname SET#1\endcsname\relax - \let\next=\ifsetfail - \else - \let\next=\ifsetsucceed - \fi - \next +% To get special treatment of `@end ifset,' call \makecond and then redefine. +% +\makecond{ifset} +\def\ifset{\parsearg{\doifset{\let\next=\ifsetfail}}} +\def\doifset#1#2{% + {% + \makevalueexpandable + \let\next=\empty + \expandafter\ifx\csname SET#2\endcsname\relax + #1% If not set, redefine \next. + \fi + \expandafter + }\next } -\def\ifsetsucceed{\conditionalsucceed{ifset}} \def\ifsetfail{\doignore{ifset}} -\defineunmatchedend{ifset} % @ifclear VAR ... @end ifclear reads the `...' iff VAR has never been % defined with @set, or has been undefined with @clear.
% -\def\ifclear{\parsearg\doifclear} -\def\doifclear#1{% - \expandafter\ifx\csname SET#1\endcsname\relax - \let\next=\ifclearsucceed - \else - \let\next=\ifclearfail - \fi - \next -} -\def\ifclearsucceed{\conditionalsucceed{ifclear}} -\def\ifclearfail{\doignore{ifclear}} -\defineunmatchedend{ifclear} - -% @iftex, @ifnothtml, @ifnotinfo, @ifnotplaintext always succeed; we -% read the text following, through the first @end iftex (etc.). Make -% `@end iftex' (etc.) valid only after an @iftex. +% The `\else' inside the `\doifset' parameter is a trick to reuse the +% above code: if the variable is not set, do nothing, if it is set, +% then redefine \next to \ifclearfail. % -\def\iftex{\conditionalsucceed{iftex}} -\def\ifnothtml{\conditionalsucceed{ifnothtml}} -\def\ifnotinfo{\conditionalsucceed{ifnotinfo}} -\def\ifnotplaintext{\conditionalsucceed{ifnotplaintext}} -\defineunmatchedend{iftex} -\defineunmatchedend{ifnothtml} -\defineunmatchedend{ifnotinfo} -\defineunmatchedend{ifnotplaintext} +\makecond{ifclear} +\def\ifclear{\parsearg{\doifset{\else \let\next=\ifclearfail}}} +\def\ifclearfail{\doignore{ifclear}} -% True conditional. Since \set globally defines its variables, we can -% just start and end a group (to keep the @end definition undefined at -% the outer level). -% -\def\conditionalsucceed#1{\begingroup - \expandafter\def\csname E#1\endcsname{\endgroup}% -} +% @dircategory CATEGORY -- specify a category of the dir file +% which this file should belong to. Ignore this in TeX. +\let\dircategory=\comment % @defininfoenclose. \let\definfoenclose=\comment @@ -2922,6 +3089,7 @@ width0pt\relax} \fi \def\definedummyletter##1{% \expandafter\def\csname ##1\endcsname{\realbackslash ##1}% }% + \let\definedummyaccent\definedummyletter % % Do the redefinitions. \commondummies @@ -2944,6 +3112,7 @@ width0pt\relax} \fi \def\definedummyletter##1{% \expandafter\def\csname ##1\endcsname{@##1}% }% + \let\definedummyaccent\definedummyletter % % Do the redefinitions. 
\commondummies @@ -2956,26 +3125,11 @@ width0pt\relax} \fi % \normalturnoffactive % - % Control letters and accents. + \commondummiesnofonts + % \definedummyletter{_}% - \definedummyletter{,}% - \definedummyletter{"}% - \definedummyletter{`}% - \definedummyletter{'}% - \definedummyletter{^}% - \definedummyletter{~}% - \definedummyletter{=}% - \definedummyword{u}% - \definedummyword{v}% - \definedummyword{H}% - \definedummyword{dotaccent}% - \definedummyword{ringaccent}% - \definedummyword{tieaccent}% - \definedummyword{ubaraccent}% - \definedummyword{udotaccent}% - \definedummyword{dotless}% - % - % Other non-English letters. + % + % Non-English letters. \definedummyword{AA}% \definedummyword{AE}% \definedummyword{L}% @@ -2987,6 +3141,10 @@ width0pt\relax} \fi \definedummyword{oe}% \definedummyword{o}% \definedummyword{ss}% + \definedummyword{exclamdown}% + \definedummyword{questiondown}% + \definedummyword{ordf}% + \definedummyword{ordm}% % % Although these internal commands shouldn't show up, sometimes they do. \definedummyword{bf}% @@ -2998,38 +3156,14 @@ width0pt\relax} \fi \definedummyword{tclose}% \definedummyword{tt}% % - % Texinfo font commands. - \definedummyword{b}% - \definedummyword{i}% - \definedummyword{r}% - \definedummyword{sc}% - \definedummyword{t}% - % + \definedummyword{LaTeX}% \definedummyword{TeX}% - \definedummyword{acronym}% - \definedummyword{cite}% - \definedummyword{code}% - \definedummyword{command}% - \definedummyword{dfn}% - \definedummyword{dots}% - \definedummyword{emph}% - \definedummyword{env}% - \definedummyword{file}% - \definedummyword{kbd}% - \definedummyword{key}% - \definedummyword{math}% - \definedummyword{option}% - \definedummyword{samp}% - \definedummyword{strong}% - \definedummyword{uref}% - \definedummyword{url}% - \definedummyword{var}% - \definedummyword{verb}% - \definedummyword{w}% % % Assorted special characters. 
\definedummyword{bullet}% + \definedummyword{comma}% \definedummyword{copyright}% + \definedummyword{registeredsymbol}% \definedummyword{dots}% \definedummyword{enddots}% \definedummyword{equiv}% @@ -3041,10 +3175,9 @@ width0pt\relax} \fi \definedummyword{print}% \definedummyword{result}% % - % Handle some cases of @value -- where the variable name does not - % contain - or _, and the value does not contain any + % Handle some cases of @value -- where it does not contain any % (non-fully-expandable) commands. - \let\value = \expandablevalue + \makevalueexpandable % % Normal spaces, not active ones. \unsepspaces @@ -3053,45 +3186,97 @@ width0pt\relax} \fi \turnoffmacros } -% If an index command is used in an @example environment, any spaces -% therein should become regular spaces in the raw index file, not the -% expansion of \tie (\leavevmode \penalty \@M \ ). -{\obeyspaces - \gdef\unsepspaces{\obeyspaces\let =\space}} - +% \commondummiesnofonts: common to \commondummies and \indexnofonts. +% +% Better have this without active chars. +{ + \catcode`\~=\other + \gdef\commondummiesnofonts{% + % Control letters and accents. + \definedummyletter{!}% + \definedummyaccent{"}% + \definedummyaccent{'}% + \definedummyletter{*}% + \definedummyaccent{,}% + \definedummyletter{.}% + \definedummyletter{/}% + \definedummyletter{:}% + \definedummyaccent{=}% + \definedummyletter{?}% + \definedummyaccent{^}% + \definedummyaccent{`}% + \definedummyaccent{~}% + \definedummyword{u}% + \definedummyword{v}% + \definedummyword{H}% + \definedummyword{dotaccent}% + \definedummyword{ringaccent}% + \definedummyword{tieaccent}% + \definedummyword{ubaraccent}% + \definedummyword{udotaccent}% + \definedummyword{dotless}% + % + % Texinfo font commands. + \definedummyword{b}% + \definedummyword{i}% + \definedummyword{r}% + \definedummyword{sc}% + \definedummyword{t}% + % + % Commands that take arguments. 
+ \definedummyword{acronym}% + \definedummyword{cite}% + \definedummyword{code}% + \definedummyword{command}% + \definedummyword{dfn}% + \definedummyword{emph}% + \definedummyword{env}% + \definedummyword{file}% + \definedummyword{kbd}% + \definedummyword{key}% + \definedummyword{math}% + \definedummyword{option}% + \definedummyword{samp}% + \definedummyword{strong}% + \definedummyword{tie}% + \definedummyword{uref}% + \definedummyword{url}% + \definedummyword{var}% + \definedummyword{verb}% + \definedummyword{w}% + } +} % \indexnofonts is used when outputting the strings to sort the index % by, and when constructing control sequence names. It eliminates all % control sequences and just writes whatever the best ASCII sort string % would be for a given command (usually its argument). % -\def\indexdummytex{TeX} -\def\indexdummydots{...} -% \def\indexnofonts{% + % Accent commands should become @asis. + \def\definedummyaccent##1{% + \expandafter\let\csname ##1\endcsname\asis + }% + % We can just ignore other control letters. + \def\definedummyletter##1{% + \expandafter\def\csname ##1\endcsname{}% + }% + % Hopefully, all control words can become @asis. + \let\definedummyword\definedummyaccent + % + \commondummiesnofonts + % + % Don't no-op \tt, since it isn't a user-level command + % and is used in the definitions of the active chars like <, >, |, etc. + % Likewise with the other plain tex font commands. + %\let\tt=\asis + % \def\ { }% \def\@{@}% % how to handle braces? \def\_{\normalunderscore}% % - \let\,=\asis - \let\"=\asis - \let\`=\asis - \let\'=\asis - \let\^=\asis - \let\~=\asis - \let\==\asis - \let\u=\asis - \let\v=\asis - \let\H=\asis - \let\dotaccent=\asis - \let\ringaccent=\asis - \let\tieaccent=\asis - \let\ubaraccent=\asis - \let\udotaccent=\asis - \let\dotless=\asis - % - % Other non-English letters. + % Non-English letters. 
\def\AA{AA}% \def\AE{AE}% \def\L{L}% @@ -3105,60 +3290,51 @@ width0pt\relax} \fi \def\ss{ss}% \def\exclamdown{!}% \def\questiondown{?}% + \def\ordf{a}% + \def\ordm{o}% % - % Don't no-op \tt, since it isn't a user-level command - % and is used in the definitions of the active chars like <, >, |, etc. - % Likewise with the other plain tex font commands. - %\let\tt=\asis + \def\LaTeX{LaTeX}% + \def\TeX{TeX}% % - % Texinfo font commands. - \let\b=\asis - \let\i=\asis - \let\r=\asis - \let\sc=\asis - \let\t=\asis - % - \let\TeX=\indexdummytex - \let\acronym=\asis - \let\cite=\asis - \let\code=\asis - \let\command=\asis - \let\dfn=\asis - \let\dots=\indexdummydots - \let\emph=\asis - \let\env=\asis - \let\file=\asis - \let\kbd=\asis - \let\key=\asis - \let\math=\asis - \let\option=\asis - \let\samp=\asis - \let\strong=\asis - \let\uref=\asis - \let\url=\asis - \let\var=\asis - \let\verb=\asis - \let\w=\asis + % Assorted special characters. + % (The following {} will end up in the sort string, but that's ok.) + \def\bullet{bullet}% + \def\comma{,}% + \def\copyright{copyright}% + \def\registeredsymbol{R}% + \def\dots{...}% + \def\enddots{...}% + \def\equiv{==}% + \def\error{error}% + \def\expansion{==>}% + \def\minus{-}% + \def\pounds{pounds}% + \def\point{.}% + \def\print{-|}% + \def\result{=>}% + % + % Don't write macro names. + \emptyusermacros } \let\indexbackslash=0 %overridden during \printindex. \let\SETmarginindex=\relax % put index entries in margin (undocumented)? % Most index entries go through here, but \dosubind is the general case. -% +% #1 is the index name, #2 is the entry text. \def\doind#1#2{\dosubind{#1}{#2}{}} % Workhorse for all \fooindexes. % #1 is name of index, #2 is stuff to put there, #3 is subentry -- -% \empty if called from \doind, as we usually are. The main exception -% is with defuns, which call us directly. +% empty if called from \doind, as we usually are (the main exception +% is with most defuns, which call us directly). 
% \def\dosubind#1#2#3{% \iflinks {% % Store the main index entry text (including the third arg). \toks0 = {#2}% - % If third arg is present, precede it with space. + % If third arg is present, precede it with a space. \def\thirdarg{#3}% \ifx\thirdarg\empty \else \toks0 = \expandafter{\the\toks0 \space #3}% @@ -3175,7 +3351,7 @@ width0pt\relax} \fi \fi } -% Write the entry to the index file: +% Write the entry in \toks0 to the index file: % \def\dosubindwrite{% % Put the index entry in the margin if desired. @@ -3186,7 +3362,7 @@ width0pt\relax} \fi % Remember, we are within a group. \indexdummies % Must do this here, since \bf, etc expand at this stage \escapechar=`\\ - \def\rawbackslashxx{\indexbackslash}% \indexbackslash isn't defined now + \def\backslashcurfont{\indexbackslash}% \indexbackslash isn't defined now % so it will be output as is; and it will print as backslash. % % Process the index entry with all font commands turned off, to @@ -3208,7 +3384,7 @@ width0pt\relax} \fi \temp } -% Take care of unwanted page breaks: +% Take care of unwanted page breaks: % % If a skip is the last thing on the list now, preserve it % by backing up by \lastskip, doing the \write, then inserting @@ -3227,9 +3403,23 @@ width0pt\relax} \fi % % Avoid page breaks due to these extra skips, too. % +% But wait, there is a catch there: +% We'll have to check whether \lastskip is zero skip. \ifdim is not +% sufficient for this purpose, as it ignores stretch and shrink parts +% of the skip. The only way seems to be to check the textual +% representation of the skip. +% +% The following is almost like \def\zeroskipmacro{0.0pt} except that +% the ``p'' and ``t'' characters have catcode \other, not 11 (letter). +% +\edef\zeroskipmacro{\expandafter\the\csname z@skip\endcsname} +% +% ..., ready, GO: +% \def\dosubindsanitize{% % \lastskip and \lastpenalty cannot both be nonzero simultaneously. 
\skip0 = \lastskip + \edef\lastskipmacro{\the\lastskip}% \count255 = \lastpenalty % % If \lastskip is nonzero, that means the last item was a @@ -3237,22 +3427,26 @@ width0pt\relax} \fi % -\skip0 glue we're inserting is preceded by a % non-discardable item, therefore it is not a potential % breakpoint, therefore no \nobreak needed. - \ifdim\lastskip = 0pt \else \vskip-\skip0 \fi + \ifx\lastskipmacro\zeroskipmacro + \else + \vskip-\skip0 + \fi % \dosubindwrite % - \ifdim\skip0 = 0pt - % if \lastskip was zero, perhaps the last item was a - % penalty, and perhaps it was >=10000, e.g., a \nobreak. - % In that case, we want to re-insert the penalty; since we - % just inserted a non-discardable item, any following glue - % (such as a \parskip) would be a breakpoint. For example: + \ifx\lastskipmacro\zeroskipmacro + % If \lastskip was zero, perhaps the last item was a penalty, and + % perhaps it was >=10000, e.g., a \nobreak. In that case, we want + % to re-insert the same penalty (values >10000 are used for various + % signals); since we just inserted a non-discardable item, any + % following glue (such as a \parskip) would be a breakpoint. For example: + % % @deffn deffn-whatever % @vindex index-whatever % Description. % would allow a break between the index-whatever whatsit % and the "Description." paragraph. - \ifnum\count255>9999 \nobreak \fi + \ifnum\count255>9999 \penalty\count255 \fi \else % On the other hand, if we had a nonzero \lastskip, % this make-up glue would be preceded by a non-discardable item @@ -3296,14 +3490,12 @@ width0pt\relax} \fi % @printindex causes a particular index (the ??s file) to get printed. % It does not print any chapter heading (usually an @unnumbered). % -\def\printindex{\parsearg\doprintindex} -\def\doprintindex#1{\begingroup +\parseargdef\printindex{\begingroup \dobreak \chapheadingskip{10000}% % \smallfonts \rm \tolerance = 9500 \everypar = {}% don't want the \kern\-parindent from indentation suppression. 
- \indexbreaks % % See if the index file exists and is nonempty. % Change catcode of @ here so that if the index file contains @@ -3330,7 +3522,7 @@ width0pt\relax} \fi % Index files are almost Texinfo source, but we use \ as the escape % character. It would be better to use @, but that's too big a change % to make right now. - \def\indexbackslash{\rawbackslashxx}% + \def\indexbackslash{\backslashcurfont}% \catcode`\\ = 0 \escapechar = `\\ \begindoublecolumns @@ -3352,7 +3544,10 @@ width0pt\relax} \fi \removelastskip % % We like breaks before the index initials, so insert a bonus. - \penalty -300 + \nobreak + \vskip 0pt plus 3\baselineskip + \penalty 0 + \vskip 0pt plus -3\baselineskip % % Typeset the initial. Making this add up to a whole number of % baselineskips increases the chance of the dots lining up from column @@ -3362,80 +3557,100 @@ width0pt\relax} \fi % No shrink because it confuses \balancecolumns. \vskip 1.67\baselineskip plus .5\baselineskip \leftline{\secbf #1}% - \vskip .33\baselineskip plus .1\baselineskip - % % Do our best not to break after the initial. \nobreak + \vskip .33\baselineskip plus .1\baselineskip }} -% This typesets a paragraph consisting of #1, dot leaders, and then #2 -% flush to the right margin. It is used for index and table of contents -% entries. The paragraph is indented by \leftskip. +% \entry typesets a paragraph consisting of the text (#1), dot leaders, and +% then page number (#2) flushed to the right margin. It is used for index +% and table of contents entries. The paragraph is indented by \leftskip. % -\def\entry#1#2{\begingroup - % - % Start a new paragraph if necessary, so our assignments below can't - % affect previous text. - \par - % - % Do not fill out the last line with white space. - \parfillskip = 0in - % - % No extra space above this paragraph. - \parskip = 0in - % - % Do not prefer a separate line ending with a hyphen to fewer lines. 
- \finalhyphendemerits = 0 - % - % \hangindent is only relevant when the entry text and page number - % don't both fit on one line. In that case, bob suggests starting the - % dots pretty far over on the line. Unfortunately, a large - % indentation looks wrong when the entry text itself is broken across - % lines. So we use a small indentation and put up with long leaders. - % - % \hangafter is reset to 1 (which is the value we want) at the start - % of each paragraph, so we need not do anything with that. - \hangindent = 2em - % - % When the entry text needs to be broken, just fill out the first line - % with blank space. - \rightskip = 0pt plus1fil - % - % A bit of stretch before each entry for the benefit of balancing columns. - \vskip 0pt plus1pt - % - % Start a ``paragraph'' for the index entry so the line breaking - % parameters we've set above will have an effect. - \noindent - % - % Insert the text of the index entry. TeX will do line-breaking on it. - #1% - % The following is kludged to not output a line of dots in the index if - % there are no page numbers. The next person who breaks this will be - % cursed by a Unix daemon. - \def\tempa{{\rm }}% - \def\tempb{#2}% - \edef\tempc{\tempa}% - \edef\tempd{\tempb}% - \ifx\tempc\tempd\ \else% +% A straightforward implementation would start like this: +% \def\entry#1#2{... +% But this freezes the catcodes in the argument, and can cause problems to +% @code, which sets - active. This problem was fixed by a kludge--- +% ``-'' was active throughout whole index, but this isn't really right. +% +% The right solution is to prevent \entry from swallowing the whole text. +% --kasal, 21nov03 +\def\entry{% + \begingroup + % + % Start a new paragraph if necessary, so our assignments below can't + % affect previous text. + \par + % + % Do not fill out the last line with white space. + \parfillskip = 0in + % + % No extra space above this paragraph.
+ \parskip = 0in + % + % Do not prefer a separate line ending with a hyphen to fewer lines. + \finalhyphendemerits = 0 + % + % \hangindent is only relevant when the entry text and page number + % don't both fit on one line. In that case, bob suggests starting the + % dots pretty far over on the line. Unfortunately, a large + % indentation looks wrong when the entry text itself is broken across + % lines. So we use a small indentation and put up with long leaders. + % + % \hangafter is reset to 1 (which is the value we want) at the start + % of each paragraph, so we need not do anything with that. + \hangindent = 2em + % + % When the entry text needs to be broken, just fill out the first line + % with blank space. + \rightskip = 0pt plus1fil % - % If we must, put the page number on a line of its own, and fill out - % this line with blank space. (The \hfil is overwhelmed with the - % fill leaders glue in \indexdotfill if the page number does fit.) - \hfil\penalty50 - \null\nobreak\indexdotfill % Have leaders before the page number. + % A bit of stretch before each entry for the benefit of balancing + % columns. + \vskip 0pt plus1pt % - % The `\ ' here is removed by the implicit \unskip that TeX does as - % part of (the primitive) \par. Without it, a spurious underfull - % \hbox ensues. - \ifpdf - \pdfgettoks#2.\ \the\toksA % The page number ends the paragraph. + % Swallow the left brace of the text (first parameter): + \afterassignment\doentry + \let\temp = +} +\def\doentry{% + \bgroup % Instead of the swallowed brace. + \noindent + \aftergroup\finishentry + % And now comes the text of the entry. +} +\def\finishentry#1{% + % #1 is the page number. + % + % The following is kludged to not output a line of dots in the index if + % there are no page numbers. The next person who breaks this will be + % cursed by a Unix daemon. 
+ \def\tempa{{\rm }}% + \def\tempb{#1}% + \edef\tempc{\tempa}% + \edef\tempd{\tempb}% + \ifx\tempc\tempd + \ % \else - \ #2% The page number ends the paragraph. + % + % If we must, put the page number on a line of its own, and fill out + % this line with blank space. (The \hfil is overwhelmed with the + % fill leaders glue in \indexdotfill if the page number does fit.) + \hfil\penalty50 + \null\nobreak\indexdotfill % Have leaders before the page number. + % + % The `\ ' here is removed by the implicit \unskip that TeX does as + % part of (the primitive) \par. Without it, a spurious underfull + % \hbox ensues. + \ifpdf + \pdfgettoks#1.% + \ \the\toksA + \else + \ #1% + \fi \fi - \fi% - \par -\endgroup} + \par + \endgroup +} % Like \dotfill except takes at least 1 em. \def\indexdotfill{\cleaders @@ -3622,7 +3837,7 @@ width0pt\relax} \fi % We do the following ugly conditional instead of the above simple % construct for the sake of pdftex, which needs the actual % letter in the expansion, not just typeset. -% +% \def\appendixletter{% \ifnum\appendixno=`A A% \else\ifnum\appendixno=`B B% @@ -3675,59 +3890,106 @@ width0pt\relax} \fi \def\lowersections{\global\advance\secbase by 1} \let\down=\lowersections % original BFox name -% Choose a numbered-heading macro -% #1 is heading level if unmodified by @raisesections or @lowersections -% #2 is text for heading -\def\numhead#1#2{\absseclevel=\secbase\advance\absseclevel by #1 -\ifcase\absseclevel - \chapterzzz{#2}% - \or \seczzz{#2}% - \or \numberedsubseczzz{#2}% - \or \numberedsubsubseczzz{#2}% +% we only have subsub. +\chardef\maxseclevel = 3 +% +% A numbered section within an unnumbered changes to unnumbered too. +% To achieve this, remember the "biggest" unnum. sec. we are currently in: +\chardef\unmlevel = \maxseclevel +% +% Trace whether the current chapter is an appendix or not: +% \chapheadtype is "N" or "A", unnumbered chapters are ignored. 
+\def\chapheadtype{N} + +% Choose a heading macro +% #1 is heading type +% #2 is heading level +% #3 is text for heading +\def\genhead#1#2#3{% + % Compute the abs. sec. level: + \absseclevel=#2 + \advance\absseclevel by \secbase + % Make sure \absseclevel doesn't fall outside the range: + \ifnum \absseclevel < 0 + \absseclevel = 0 \else - \ifnum \absseclevel<0 \chapterzzz{#2}% - \else \numberedsubsubseczzz{#2}% + \ifnum \absseclevel > 3 + \absseclevel = 3 \fi \fi - \suppressfirstparagraphindent -} - -% like \numhead, but chooses appendix heading levels -\def\apphead#1#2{\absseclevel=\secbase\advance\absseclevel by #1 -\ifcase\absseclevel - \appendixzzz{#2}% - \or \appendixsectionzzz{#2}% - \or \appendixsubseczzz{#2}% - \or \appendixsubsubseczzz{#2}% + % The heading type: + \def\headtype{#1}% + \if \headtype U% + \ifnum \absseclevel < \unmlevel + \chardef\unmlevel = \absseclevel + \fi \else - \ifnum \absseclevel<0 \appendixzzz{#2}% - \else \appendixsubsubseczzz{#2}% + % Check for appendix sections: + \ifnum \absseclevel = 0 + \edef\chapheadtype{\headtype}% + \else + \if \headtype A\if \chapheadtype N% + \errmessage{@appendix... 
within a non-appendix chapter}% + \fi\fi + \fi + % Check for numbered within unnumbered: + \ifnum \absseclevel > \unmlevel + \def\headtype{U}% + \else + \chardef\unmlevel = 3 \fi \fi - \suppressfirstparagraphindent -} - -% like \numhead, but chooses numberless heading levels -\def\unnmhead#1#2{\absseclevel=\secbase\advance\absseclevel by #1 - \ifcase\absseclevel - \unnumberedzzz{#2}% - \or \unnumberedseczzz{#2}% - \or \unnumberedsubseczzz{#2}% - \or \unnumberedsubsubseczzz{#2}% + % Now print the heading: + \if \headtype U% + \ifcase\absseclevel + \unnumberedzzz{#3}% + \or \unnumberedseczzz{#3}% + \or \unnumberedsubseczzz{#3}% + \or \unnumberedsubsubseczzz{#3}% + \fi \else - \ifnum \absseclevel<0 \unnumberedzzz{#2}% - \else \unnumberedsubsubseczzz{#2}% + \if \headtype A% + \ifcase\absseclevel + \appendixzzz{#3}% + \or \appendixsectionzzz{#3}% + \or \appendixsubseczzz{#3}% + \or \appendixsubsubseczzz{#3}% + \fi + \else + \ifcase\absseclevel + \chapterzzz{#3}% + \or \seczzz{#3}% + \or \numberedsubseczzz{#3}% + \or \numberedsubsubseczzz{#3}% + \fi \fi \fi \suppressfirstparagraphindent } -% @chapter, @appendix, @unnumbered. +% an interface: +\def\numhead{\genhead N} +\def\apphead{\genhead A} +\def\unnmhead{\genhead U} + +% @chapter, @appendix, @unnumbered. Increment top-level counter, reset +% all lower-level sectioning counters to zero. +% +% Also set \chaplevelprefix, which we prepend to @float sequence numbers +% (e.g., figures), q.v. By default (before any chapter), that is empty. +\let\chaplevelprefix = \empty % -\outer\def\chapter{\parsearg\chapteryyy} -\def\chapteryyy#1{\numhead0{#1}} % normally numhead0 calls chapterzzz +\outer\parseargdef\chapter{\numhead0{#1}} % normally numhead0 calls chapterzzz \def\chapterzzz#1{% - \secno=0 \subsecno=0 \subsubsecno=0 \advance\chapno by 1 + % section resetting is \global in case the chapter is in a group, such + % as an @include file. 
+ \global\secno=0 \global\subsecno=0 \global\subsubsecno=0 + \global\advance\chapno by 1 + % + % Used for \float. + \gdef\chaplevelprefix{\the\chapno.}% + \resetallfloatnos + % \message{\putwordChapter\space \the\chapno}% % % Write the actual heading. @@ -3739,29 +4001,31 @@ width0pt\relax} \fi \global\let\subsubsection = \numberedsubsubsec } -\outer\def\appendix{\parsearg\appendixyyy} -\def\appendixyyy#1{\apphead0{#1}} % normally apphead0 calls appendixzzz +\outer\parseargdef\appendix{\apphead0{#1}} % normally apphead0 calls appendixzzz \def\appendixzzz#1{% - \secno=0 \subsecno=0 \subsubsecno=0 \advance\appendixno by 1 + \global\secno=0 \global\subsecno=0 \global\subsubsecno=0 + \global\advance\appendixno by 1 + \gdef\chaplevelprefix{\appendixletter.}% + \resetallfloatnos + % \def\appendixnum{\putwordAppendix\space \appendixletter}% \message{\appendixnum}% + % \chapmacro{#1}{Yappendix}{\appendixletter}% + % \global\let\section = \appendixsec \global\let\subsection = \appendixsubsec \global\let\subsubsection = \appendixsubsubsec } -% @centerchap is like @unnumbered, but the heading is centered. -\outer\def\centerchap{\parsearg\centerchapyyy} -\def\centerchapyyy#1{{\unnumberedyyy{#1}}} - -% @top is like @unnumbered. -\outer\def\top{\parsearg\unnumberedyyy} - -\outer\def\unnumbered{\parsearg\unnumberedyyy} -\def\unnumberedyyy#1{\unnmhead0{#1}} % normally unnmhead0 calls unnumberedzzz +\outer\parseargdef\unnumbered{\unnmhead0{#1}} % normally unnmhead0 calls unnumberedzzz \def\unnumberedzzz#1{% - \secno=0 \subsecno=0 \subsubsecno=0 \advance\unnumberedno by 1 + \global\secno=0 \global\subsecno=0 \global\subsubsecno=0 + \global\advance\unnumberedno by 1 + % + % Since an unnumbered has no number, no prefix for figures. + \global\let\chaplevelprefix = \empty + \resetallfloatnos % % This used to be simply \message{#1}, but TeX fully expands the % argument to \message. 
Therefore, if #1 contained @-commands, TeX @@ -3774,7 +4038,8 @@ width0pt\relax} \fi % \the to achieve this: TeX expands \the only once, % simply yielding the contents of . (We also do this for % the toc entries.) - \toks0 = {#1}\message{(\the\toks0)}% + \toks0 = {#1}% + \message{(\the\toks0)}% % \chapmacro{#1}{Ynothing}{\the\unnumberedno}% % @@ -3783,96 +4048,82 @@ width0pt\relax} \fi \global\let\subsubsection = \unnumberedsubsubsec } +% @centerchap is like @unnumbered, but the heading is centered. +\outer\parseargdef\centerchap{% + % Well, we could do the following in a group, but that would break + % an assumption that \chapmacro is called at the outermost level. + % Thus we are safer this way: --kasal, 24feb04 + \let\centerparametersmaybe = \centerparameters + \unnmhead0{#1}% + \let\centerparametersmaybe = \relax +} + +% @top is like @unnumbered. +\let\top\unnumbered + % Sections. -\outer\def\numberedsec{\parsearg\secyyy} -\def\secyyy#1{\numhead1{#1}} % normally calls seczzz +\outer\parseargdef\numberedsec{\numhead1{#1}} % normally calls seczzz \def\seczzz#1{% - \subsecno=0 \subsubsecno=0 \advance\secno by 1 + \global\subsecno=0 \global\subsubsecno=0 \global\advance\secno by 1 \sectionheading{#1}{sec}{Ynumbered}{\the\chapno.\the\secno}% } -\outer\def\appendixsection{\parsearg\appendixsecyyy} -\outer\def\appendixsec{\parsearg\appendixsecyyy} -\def\appendixsecyyy#1{\apphead1{#1}} % normally calls appendixsectionzzz +\outer\parseargdef\appendixsection{\apphead1{#1}} % normally calls appendixsectionzzz \def\appendixsectionzzz#1{% - \subsecno=0 \subsubsecno=0 \advance\secno by 1 + \global\subsecno=0 \global\subsubsecno=0 \global\advance\secno by 1 \sectionheading{#1}{sec}{Yappendix}{\appendixletter.\the\secno}% } +\let\appendixsec\appendixsection -\outer\def\unnumberedsec{\parsearg\unnumberedsecyyy} -\def\unnumberedsecyyy#1{\unnmhead1{#1}} % normally calls unnumberedseczzz +\outer\parseargdef\unnumberedsec{\unnmhead1{#1}} % normally calls unnumberedseczzz 
\def\unnumberedseczzz#1{% - \subsecno=0 \subsubsecno=0 \advance\secno by 1 + \global\subsecno=0 \global\subsubsecno=0 \global\advance\secno by 1 \sectionheading{#1}{sec}{Ynothing}{\the\unnumberedno.\the\secno}% } % Subsections. -\outer\def\numberedsubsec{\parsearg\numberedsubsecyyy} -\def\numberedsubsecyyy#1{\numhead2{#1}} % normally calls numberedsubseczzz +\outer\parseargdef\numberedsubsec{\numhead2{#1}} % normally calls numberedsubseczzz \def\numberedsubseczzz#1{% - \subsubsecno=0 \advance\subsecno by 1 + \global\subsubsecno=0 \global\advance\subsecno by 1 \sectionheading{#1}{subsec}{Ynumbered}{\the\chapno.\the\secno.\the\subsecno}% } -\outer\def\appendixsubsec{\parsearg\appendixsubsecyyy} -\def\appendixsubsecyyy#1{\apphead2{#1}} % normally calls appendixsubseczzz +\outer\parseargdef\appendixsubsec{\apphead2{#1}} % normally calls appendixsubseczzz \def\appendixsubseczzz#1{% - \subsubsecno=0 \advance\subsecno by 1 + \global\subsubsecno=0 \global\advance\subsecno by 1 \sectionheading{#1}{subsec}{Yappendix}% {\appendixletter.\the\secno.\the\subsecno}% } -\outer\def\unnumberedsubsec{\parsearg\unnumberedsubsecyyy} -\def\unnumberedsubsecyyy#1{\unnmhead2{#1}} %normally calls unnumberedsubseczzz +\outer\parseargdef\unnumberedsubsec{\unnmhead2{#1}} %normally calls unnumberedsubseczzz \def\unnumberedsubseczzz#1{% - \subsubsecno=0 \advance\subsecno by 1 + \global\subsubsecno=0 \global\advance\subsecno by 1 \sectionheading{#1}{subsec}{Ynothing}% {\the\unnumberedno.\the\secno.\the\subsecno}% } % Subsubsections. 
-\outer\def\numberedsubsubsec{\parsearg\numberedsubsubsecyyy} -\def\numberedsubsubsecyyy#1{\numhead3{#1}} % normally numberedsubsubseczzz +\outer\parseargdef\numberedsubsubsec{\numhead3{#1}} % normally numberedsubsubseczzz \def\numberedsubsubseczzz#1{% - \advance\subsubsecno by 1 + \global\advance\subsubsecno by 1 \sectionheading{#1}{subsubsec}{Ynumbered}% {\the\chapno.\the\secno.\the\subsecno.\the\subsubsecno}% } -\outer\def\appendixsubsubsec{\parsearg\appendixsubsubsecyyy} -\def\appendixsubsubsecyyy#1{\apphead3{#1}} % normally appendixsubsubseczzz +\outer\parseargdef\appendixsubsubsec{\apphead3{#1}} % normally appendixsubsubseczzz \def\appendixsubsubseczzz#1{% - \advance\subsubsecno by 1 + \global\advance\subsubsecno by 1 \sectionheading{#1}{subsubsec}{Yappendix}% {\appendixletter.\the\secno.\the\subsecno.\the\subsubsecno}% } -\outer\def\unnumberedsubsubsec{\parsearg\unnumberedsubsubsecyyy} -\def\unnumberedsubsubsecyyy#1{\unnmhead3{#1}} %normally unnumberedsubsubseczzz +\outer\parseargdef\unnumberedsubsubsec{\unnmhead3{#1}} %normally unnumberedsubsubseczzz \def\unnumberedsubsubseczzz#1{% - \advance\subsubsecno by 1 + \global\advance\subsubsecno by 1 \sectionheading{#1}{subsubsec}{Ynothing}% {\the\unnumberedno.\the\secno.\the\subsecno.\the\subsubsecno}% } -% These are variants which are not "outer", so they can appear in @ifinfo. -% Actually, they are now be obsolete; ordinary section commands should work. 
-\def\infotop{\parsearg\unnumberedzzz} -\def\infounnumbered{\parsearg\unnumberedzzz} -\def\infounnumberedsec{\parsearg\unnumberedseczzz} -\def\infounnumberedsubsec{\parsearg\unnumberedsubseczzz} -\def\infounnumberedsubsubsec{\parsearg\unnumberedsubsubseczzz} - -\def\infoappendix{\parsearg\appendixzzz} -\def\infoappendixsec{\parsearg\appendixseczzz} -\def\infoappendixsubsec{\parsearg\appendixsubseczzz} -\def\infoappendixsubsubsec{\parsearg\appendixsubsubseczzz} - -\def\infochapter{\parsearg\chapterzzz} -\def\infosection{\parsearg\sectionzzz} -\def\infosubsection{\parsearg\subsectionzzz} -\def\infosubsubsection{\parsearg\subsubsectionzzz} - % These macros control what the section commands do, according % to what kind of chapter we are in (ordinary, appendix, or unnumbered). % Define them by default for a numbered chapter. @@ -3906,14 +4157,11 @@ width0pt\relax} \fi } % @heading, @subheading, @subsubheading. -\def\heading{\parsearg\doheading} -\def\subheading{\parsearg\dosubheading} -\def\subsubheading{\parsearg\dosubsubheading} -\def\doheading#1{\sectionheading{#1}{sec}{Yomitfromtoc}{} +\parseargdef\heading{\sectionheading{#1}{sec}{Yomitfromtoc}{} \suppressfirstparagraphindent} -\def\dosubheading#1{\sectionheading{#1}{subsec}{Yomitfromtoc}{} +\parseargdef\subheading{\sectionheading{#1}{subsec}{Yomitfromtoc}{} \suppressfirstparagraphindent} -\def\dosubsubheading#1{\sectionheading{#1}{subsubsec}{Yomitfromtoc}{} +\parseargdef\subsubheading{\sectionheading{#1}{subsubsec}{Yomitfromtoc}{} \suppressfirstparagraphindent} % These macros generate a chapter, section, etc. 
heading only @@ -3923,8 +4171,6 @@ width0pt\relax} \fi %%% Args are the skip and penalty (usually negative) \def\dobreak#1#2{\par\ifdim\lastskip<#1\removelastskip\penalty#2\vskip#1\fi} -\def\setchapterstyle #1 {\csname CHAPF#1\endcsname} - %%% Define plain chapter starts, and page on/off switching for it % Parameter controlling skip before chapter headings (if needed) @@ -3955,29 +4201,24 @@ width0pt\relax} \fi \CHAPPAGon -\def\CHAPFplain{% -\global\let\chapmacro=\chfplain -\global\let\centerchapmacro=\centerchfplain} - -% Normal chapter opening. -% +% Chapter opening. +% % #1 is the text, #2 is the section type (Ynumbered, Ynothing, % Yappendix, Yomitfromtoc), #3 the chapter number. -% +% % To test against our argument. \def\Ynothingkeyword{Ynothing} \def\Yomitfromtockeyword{Yomitfromtoc} \def\Yappendixkeyword{Yappendix} % -\def\chfplain#1#2#3{% +\def\chapmacro#1#2#3{% \pchapsepmacro {% \chapfonts \rm % % Have to define \thissection before calling \donoderef, because the - % xref code eventually uses it, as \Ytitle. On the other hand, it - % has to be called after \pchapsepmacro, or the headline will change - % too soon. + % xref code eventually uses it. On the other hand, it has to be called + % after \pchapsepmacro, or the headline will change too soon. \gdef\thissection{#1}% \gdef\thischaptername{#1}% % @@ -4031,45 +4272,40 @@ width0pt\relax} \fi % @centerchap -- centered and unnumbered. \let\centerparametersmaybe = \relax -\def\centerchfplain#1{{% - \def\centerparametersmaybe{% - \advance\rightskip by 3\rightskip - \leftskip = \rightskip - \parfillskip = 0pt - }% - \chfplain{#1}{Ynothing}{}% -}} +\def\centerparameters{% + \advance\rightskip by 3\rightskip + \leftskip = \rightskip + \parfillskip = 0pt +} -\CHAPFplain % The default % I don't think this chapter style is supported any more, so I'm not % updating it with the new noderef stuff. We'll see. --karl, 11aug03. 
-% +% +\def\setchapterstyle #1 {\csname CHAPF#1\endcsname} +% \def\unnchfopen #1{% \chapoddpage {\chapfonts \vbox{\hyphenpenalty=10000\tolerance=5000 \parindent=0pt\raggedright \rm #1\hfill}}\bigskip \par\nobreak } - \def\chfopen #1#2{\chapoddpage {\chapfonts \vbox to 3in{\vfil \hbox to\hsize{\hfil #2} \hbox to\hsize{\hfil #1} \vfil}}% \par\penalty 5000 % } - \def\centerchfopen #1{% \chapoddpage {\chapfonts \vbox{\hyphenpenalty=10000\tolerance=5000 \parindent=0pt \hfill {\rm #1}\hfill}}\bigskip \par\nobreak } - \def\CHAPFopen{% -\global\let\chapmacro=\chfopen -\global\let\centerchapmacro=\centerchfopen} + \global\let\chapmacro=\chfopen + \global\let\centerchapmacro=\centerchfopen} % Section titles. These macros combine the section number parts and % call the generic \sectionheading to do the printing. -% +% \newskip\secheadingskip \def\secheadingbreak{\dobreak \secheadingskip{-1000}} @@ -4083,11 +4319,11 @@ width0pt\relax} \fi % Print any size, any type, section title. -% +% % #1 is the text, #2 is the section level (sec/subsec/subsubsec), #3 is % the section type for xrefs (Ynumbered, Ynothing, Yappendix), #4 is the % section number. -% +% \def\sectionheading#1#2#3#4{% {% % Switch to the right set of fonts. @@ -4144,14 +4380,14 @@ width0pt\relax} \fi % glue accumulate. (Not a breakpoint because it's preceded by a % discardable item.) \vskip-\parskip - % - % This \nobreak is purely so the last item on the list is a \penalty - % of 10000. This is so other code, for instance \parsebodycommon, can - % check for and avoid allowing breakpoints. Otherwise, it would - % insert a valid breakpoint between: + % + % This is purely so the last item on the list is a known \penalty > + % 10000. This is so \startdefun can avoid allowing breakpoints after + % section headings. 
Otherwise, it would insert a valid breakpoint between: + % % @section sec-whatever % @deffn def-whatever - \nobreak + \penalty 10001 } @@ -4160,14 +4396,14 @@ width0pt\relax} \fi \newwrite\tocfile % Write an entry to the toc file, opening it if necessary. -% Called from @chapter, etc. -% +% Called from @chapter, etc. +% % Example usage: \writetocentry{sec}{Section Name}{\the\chapno.\the\secno} % We append the current node name (if any) and page number as additional % arguments for the \{chap,sec,...}entry macros which will eventually % read this. The node name is used in the pdf outlines as the % destination to jump to. -% +% % We open the .toc file for writing here instead of at @setfilename (or % any other fixed time) so that @contents can be anywhere in the document. % But if #1 is `omit', then we don't do anything. This is used for the @@ -4209,81 +4445,83 @@ width0pt\relax} \fi % Prepare to read what we've written to \tocfile. % \def\startcontents#1{% - % If @setchapternewpage on, and @headings double, the contents should - % start on an odd page, unlike chapters. Thus, we maintain - % \contentsalignmacro in parallel with \pagealignmacro. - % From: Torbjorn Granlund - \contentsalignmacro - \immediate\closeout\tocfile - % - % Don't need to put `Contents' or `Short Contents' in the headline. - % It is abundantly clear what they are. - \def\thischapter{}% - \chapmacro{#1}{Yomitfromtoc}{}% - % - \savepageno = \pageno - \begingroup % Set up to handle contents files properly. - \catcode`\\=0 \catcode`\{=1 \catcode`\}=2 \catcode`\@=11 - % We can't do this, because then an actual ^ in a section - % title fails, e.g., @chapter ^ -- exponentiation. --karl, 9jul97. - %\catcode`\^=7 % to see ^^e4 as \"a etc. juha@piuha.ydi.vtt.fi - \raggedbottom % Worry more about breakpoints than the bottom. - \advance\hsize by -\contentsrightmargin % Don't use the full line length. - % - % Roman numerals for page numbers. 
- \ifnum \pageno>0 \global\pageno = \lastnegativepageno \fi + % If @setchapternewpage on, and @headings double, the contents should + % start on an odd page, unlike chapters. Thus, we maintain + % \contentsalignmacro in parallel with \pagealignmacro. + % From: Torbjorn Granlund + \contentsalignmacro + \immediate\closeout\tocfile + % + % Don't need to put `Contents' or `Short Contents' in the headline. + % It is abundantly clear what they are. + \def\thischapter{}% + \chapmacro{#1}{Yomitfromtoc}{}% + % + \savepageno = \pageno + \begingroup % Set up to handle contents files properly. + \catcode`\\=0 \catcode`\{=1 \catcode`\}=2 \catcode`\@=11 + % We can't do this, because then an actual ^ in a section + % title fails, e.g., @chapter ^ -- exponentiation. --karl, 9jul97. + %\catcode`\^=7 % to see ^^e4 as \"a etc. juha@piuha.ydi.vtt.fi + \raggedbottom % Worry more about breakpoints than the bottom. + \advance\hsize by -\contentsrightmargin % Don't use the full line length. + % + % Roman numerals for page numbers. + \ifnum \pageno>0 \global\pageno = \lastnegativepageno \fi } % Normal (long) toc. \def\contents{% - \startcontents{\putwordTOC}% - \openin 1 \jobname.toc - \ifeof 1 \else - \closein 1 - \input \jobname.toc - \fi - \vfill \eject - \contentsalignmacro % in case @setchapternewpage odd is in effect - \pdfmakeoutlines - \endgroup - \lastnegativepageno = \pageno - \global\pageno = \savepageno + \startcontents{\putwordTOC}% + \openin 1 \jobname.toc + \ifeof 1 \else + \input \jobname.toc + \fi + \vfill \eject + \contentsalignmacro % in case @setchapternewpage odd is in effect + \ifeof 1 \else + \pdfmakeoutlines + \fi + \closein 1 + \endgroup + \lastnegativepageno = \pageno + \global\pageno = \savepageno } % And just the chapters. \def\summarycontents{% - \startcontents{\putwordShortTOC}% - % - \let\numchapentry = \shortchapentry - \let\appentry = \shortchapentry - \let\unnchapentry = \shortunnchapentry - % We want a true roman here for the page numbers. 
- \secfonts - \let\rm=\shortcontrm \let\bf=\shortcontbf - \let\sl=\shortcontsl \let\tt=\shortconttt - \rm - \hyphenpenalty = 10000 - \advance\baselineskip by 1pt % Open it up a little. - \def\numsecentry##1##2##3##4{} - \let\appsecentry = \numsecentry - \let\unnsecentry = \numsecentry - \let\numsubsecentry = \numsecentry - \let\appsubsecentry = \numsecentry - \let\unnsubsecentry = \numsecentry - \let\numsubsubsecentry = \numsecentry - \let\appsubsubsecentry = \numsecentry - \let\unnsubsubsecentry = \numsecentry - \openin 1 \jobname.toc - \ifeof 1 \else - \closein 1 - \input \jobname.toc - \fi - \vfill \eject - \contentsalignmacro % in case @setchapternewpage odd is in effect - \endgroup - \lastnegativepageno = \pageno - \global\pageno = \savepageno + \startcontents{\putwordShortTOC}% + % + \let\numchapentry = \shortchapentry + \let\appentry = \shortchapentry + \let\unnchapentry = \shortunnchapentry + % We want a true roman here for the page numbers. + \secfonts + \let\rm=\shortcontrm \let\bf=\shortcontbf + \let\sl=\shortcontsl \let\tt=\shortconttt + \rm + \hyphenpenalty = 10000 + \advance\baselineskip by 1pt % Open it up a little. + \def\numsecentry##1##2##3##4{} + \let\appsecentry = \numsecentry + \let\unnsecentry = \numsecentry + \let\numsubsecentry = \numsecentry + \let\appsubsecentry = \numsecentry + \let\unnsubsecentry = \numsecentry + \let\numsubsubsecentry = \numsecentry + \let\appsubsubsecentry = \numsecentry + \let\unnsubsubsecentry = \numsecentry + \openin 1 \jobname.toc + \ifeof 1 \else + \input \jobname.toc + \fi + \closein 1 + \vfill \eject + \contentsalignmacro % in case @setchapternewpage odd is in effect + \endgroup + \lastnegativepageno = \pageno + \global\pageno = \savepageno } \let\shortcontents = \summarycontents @@ -4296,7 +4534,7 @@ width0pt\relax} \fi % But use \hss just in case. % (This space doesn't include the extra space that gets added after % the label; that gets put in by \shortchapentry above.) 
- % + % % We'd like to right-justify chapter numbers, but that looks strange % with appendix letters. And right-justifying numbers and % left-justifying letters looks strange when there is less than 10 @@ -4321,7 +4559,7 @@ width0pt\relax} \fi % Appendices, in the main contents. % Need the word Appendix, and a fixed-size box. -% +% \def\appendixbox#1{% % We use M since it's probably the widest letter. \setbox0 = \hbox{\putwordAppendix{} M}% @@ -4349,7 +4587,8 @@ width0pt\relax} \fi \def\unnsubsubsecentry#1#2#3#4{\dosubsubsecentry{#1}{#4}} % This parameter controls the indentation of the various levels. -\newdimen\tocindent \tocindent = 2pc +% Same as \defaultparindent. +\newdimen\tocindent \tocindent = 15pt % Now for the actual typesetting. In all these, #1 is the text and #2 is the % page number. @@ -4380,17 +4619,8 @@ width0pt\relax} \fi \tocentry{#1}{\dopageno\bgroup#2\egroup}% \endgroup} -% Final typesetting of a toc entry; we use the same \entry macro as for -% the index entries, but we want to suppress hyphenation here. (We -% can't do that in the \entry macro, since index entries might consist -% of hyphenated-identifiers-that-do-not-fit-on-a-line-and-nothing-else.) -\def\tocentry#1#2{\begingroup - \vskip 0pt plus1pt % allow a little stretch for the sake of nice page breaks - % Do not use \turnoffactive in these arguments. Since the toc is - % typeset in cmr, characters such as _ would come out wrong; we - % have to do the usual translation tricks. - \entry{#1}{#2}% -\endgroup} +% We use the same \entry macro as for the index entries. +\let\tocentry = \entry % Space between chapter (or whatever) number and the title. \def\labelspace{\hskip1em \relax} @@ -4428,10 +4658,10 @@ width0pt\relax} \fi % The text. (`r' is open on the right, `e' somewhat less so on the left.) 
\setbox0 = \hbox{\kern-.75pt \tensf error\kern-1.5pt} % -\global\setbox\errorbox=\hbox to \dimen0{\hfil +\setbox\errorbox=\hbox to \dimen0{\hfil \hsize = \dimen0 \advance\hsize by -5.8pt % Space to left+right. \advance\hsize by -2\dimen2 % Rules. - \vbox{ + \vbox{% \hrule height\dimen2 \hbox{\vrule width\dimen2 \kern3pt % Space to left of text. \vtop{\kern2.4pt \box0 \kern2.4pt}% Space above/below. @@ -4445,14 +4675,13 @@ width0pt\relax} \fi % One exception: @ is still an escape character, so that @end tex works. % But \@ or @@ will get a plain tex @ character. -\def\tex{\begingroup +\envdef\tex{% \catcode `\\=0 \catcode `\{=1 \catcode `\}=2 \catcode `\$=3 \catcode `\&=4 \catcode `\#=6 \catcode `\^=7 \catcode `\_=8 \catcode `\~=\active \let~=\tie \catcode `\%=14 \catcode `\+=\other \catcode `\"=\other - \catcode `\==\other \catcode `\|=\other \catcode `\<=\other \catcode `\>=\other @@ -4479,10 +4708,11 @@ width0pt\relax} \fi \def\endldots{\mathinner{\ldots\ldots\ldots\ldots}}% \def\enddots{\relax\ifmmode\endldots\else$\mathsurround=0pt \endldots\,$\fi}% \def\@{@}% -\let\Etex=\endgroup} +} +% There is no need to define \Etex. % Define @lisp ... @end lisp. -% @lisp does a \begingroup so it can rebind things, +% @lisp environment forms a group so it can rebind things, % including the definition of @end lisp (which normally is erroneous). % Amount to narrow the margins by for @lisp. @@ -4493,19 +4723,6 @@ width0pt\relax} \fi % have any width. \def\lisppar{\null\endgraf} -% Make each space character in the input produce a normal interword -% space in the output. Don't allow a line break at this space, as this -% is used only in environments like @example, where each line of input -% should produce a line of output anyway. -% -{\obeyspaces % -\gdef\sepspaces{\obeyspaces\let =\tie}} - -% Define \obeyedspace to be our active space, whatever it is. This is -% for use in \parsearg. 
-{\sepspaces% -\global\let\obeyedspace= } - % This space is always present above and below environments. \newskip\envskipamount \envskipamount = 0pt @@ -4515,7 +4732,8 @@ width0pt\relax} \fi % start of the next paragraph will insert \parskip. % \def\aboveenvbreak{{% - % =10000 instead of <10000 because of a special case in \itemzzz, q.v. + % =10000 instead of <10000 because of a special case in \itemzzz and + % \sectionheading, q.v. \ifnum \lastpenalty=10000 \else \advance\envskipamount by \parskip \endgraf @@ -4523,7 +4741,7 @@ width0pt\relax} \fi \removelastskip % it's not a good place to break if the last penalty was \nobreak % or better ... - \ifnum\lastpenalty>10000 \else \penalty-50 \fi + \ifnum\lastpenalty<10000 \penalty-50 \fi \vskip\envskipamount \fi \fi @@ -4555,52 +4773,52 @@ width0pt\relax} \fi % \newskip\lskip\newskip\rskip -\def\cartouche{% -\par % can't be in the midst of a paragraph. -\begingroup - \lskip=\leftskip \rskip=\rightskip - \leftskip=0pt\rightskip=0pt %we want these *outside*. - \cartinner=\hsize \advance\cartinner by-\lskip - \advance\cartinner by-\rskip - \cartouter=\hsize - \advance\cartouter by 18.4pt % allow for 3pt kerns on either -% side, and for 6pt waste from -% each corner char, and rule thickness - \normbskip=\baselineskip \normpskip=\parskip \normlskip=\lineskip - % Flag to tell @lisp, etc., not to narrow margin. - \let\nonarrowing=\comment - \vbox\bgroup - \baselineskip=0pt\parskip=0pt\lineskip=0pt - \carttop - \hbox\bgroup - \hskip\lskip - \vrule\kern3pt - \vbox\bgroup - \hsize=\cartinner - \kern3pt - \begingroup - \baselineskip=\normbskip - \lineskip=\normlskip - \parskip=\normpskip - \vskip -\parskip +\envdef\cartouche{% + \ifhmode\par\fi % can't be in the midst of a paragraph. + \startsavinginserts + \lskip=\leftskip \rskip=\rightskip + \leftskip=0pt\rightskip=0pt % we want these *outside*. 
+ \cartinner=\hsize \advance\cartinner by-\lskip + \advance\cartinner by-\rskip + \cartouter=\hsize + \advance\cartouter by 18.4pt % allow for 3pt kerns on either + % side, and for 6pt waste from + % each corner char, and rule thickness + \normbskip=\baselineskip \normpskip=\parskip \normlskip=\lineskip + % Flag to tell @lisp, etc., not to narrow margin. + \let\nonarrowing=\comment + \vbox\bgroup + \baselineskip=0pt\parskip=0pt\lineskip=0pt + \carttop + \hbox\bgroup + \hskip\lskip + \vrule\kern3pt + \vbox\bgroup + \kern3pt + \hsize=\cartinner + \baselineskip=\normbskip + \lineskip=\normlskip + \parskip=\normpskip + \vskip -\parskip + \comment % For explanation, see the end of \def\group. +} \def\Ecartouche{% - \endgroup - \kern3pt - \egroup - \kern3pt\vrule - \hskip\rskip - \egroup - \cartbot - \egroup -\endgroup -}} + \ifhmode\par\fi + \kern3pt + \egroup + \kern3pt\vrule + \hskip\rskip + \egroup + \cartbot + \egroup + \checkinserts +} % This macro is called at the beginning of all the @example variants, % inside a group. \def\nonfillstart{% \aboveenvbreak - \inENV % This group ends at the end of the body \hfuzz = 12pt % Don't be fussy \sepspaces % Make spaces be word-separators rather than space tokens. \let\par = \lisppar % don't ignore blank lines @@ -4613,103 +4831,99 @@ width0pt\relax} \fi \ifx\nonarrowing\relax \advance \leftskip by \lispnarrowing \exdentamount=\lispnarrowing - \let\exdent=\nofillexdent - \let\nonarrowing=\relax \fi + \let\exdent=\nofillexdent } -% Define the \E... control sequence only if we are inside the particular -% environment, so the error checking in \end will work. +% If you want all examples etc. small: @set dispenvsize small. +% If you want even small examples the full size: @set dispenvsize nosmall. +% This affects the following displayed environments: +% @example, @display, @format, @lisp % -% To end an @example-like environment, we first end the paragraph (via -% \afterenvbreak's vertical glue), and then the group. 
That way we keep -% the zero \parskip that the environments set -- \parskip glue will be -% inserted at the beginning of the next paragraph in the document, after -% the environment. -% -\def\nonfillfinish{\afterenvbreak\endgroup} +\def\smallword{small} +\def\nosmallword{nosmall} +\let\SETdispenvsize\relax +\def\setnormaldispenv{% + \ifx\SETdispenvsize\smallword + \smallexamplefonts \rm + \fi +} +\def\setsmalldispenv{% + \ifx\SETdispenvsize\nosmallword + \else + \smallexamplefonts \rm + \fi +} -% @lisp: indented, narrowed, typewriter font. -\def\lisp{\begingroup - \nonfillstart - \let\Elisp = \nonfillfinish - \tt - \let\kbdfont = \kbdexamplefont % Allow @kbd to do something special. - \gobble % eat return +% We often define two environments, @foo and @smallfoo. +% Let's do it by one command: +\def\makedispenv #1#2{ + \expandafter\envdef\csname#1\endcsname {\setnormaldispenv #2} + \expandafter\envdef\csname small#1\endcsname {\setsmalldispenv #2} + \expandafter\let\csname E#1\endcsname \afterenvbreak + \expandafter\let\csname Esmall#1\endcsname \afterenvbreak } -% @example: Same as @lisp. -\def\example{\begingroup \def\Eexample{\nonfillfinish\endgroup}\lisp} +% Define two synonyms: +\def\maketwodispenvs #1#2#3{ + \makedispenv{#1}{#3} + \makedispenv{#2}{#3} +} +% @lisp: indented, narrowed, typewriter font; @example: same as @lisp. +% % @smallexample and @smalllisp: use smaller fonts. % Originally contributed by Pavel@xerox. -\def\smalllisp{\begingroup - \def\Esmalllisp{\nonfillfinish\endgroup}% - \def\Esmallexample{\nonfillfinish\endgroup}% - \smallexamplefonts - \lisp +% +\maketwodispenvs {lisp}{example}{% + \nonfillstart + \tt + \let\kbdfont = \kbdexamplefont % Allow @kbd to do something special. + \gobble % eat return } -\let\smallexample = \smalllisp - -% @display: same as @lisp except keep current font. +% @display/@smalldisplay: same as @lisp except keep current font. 
% -\def\display{\begingroup +\makedispenv {display}{% \nonfillstart - \let\Edisplay = \nonfillfinish \gobble } -% -% @smalldisplay: @display plus smaller fonts. -% -\def\smalldisplay{\begingroup - \def\Esmalldisplay{\nonfillfinish\endgroup}% - \smallexamplefonts \rm - \display -} -% @format: same as @display except don't narrow margins. +% @format/@smallformat: same as @display except don't narrow margins. % -\def\format{\begingroup - \let\nonarrowing = t +\makedispenv{format}{% + \let\nonarrowing = t% \nonfillstart - \let\Eformat = \nonfillfinish \gobble } -% -% @smallformat: @format plus smaller fonts. -% -\def\smallformat{\begingroup - \def\Esmallformat{\nonfillfinish\endgroup}% - \smallexamplefonts \rm - \format -} -% @flushleft (same as @format). -% -\def\flushleft{\begingroup \def\Eflushleft{\nonfillfinish\endgroup}\format} +% @flushleft: same as @format, but doesn't obey \SETdispenvsize. +\envdef\flushleft{% + \let\nonarrowing = t% + \nonfillstart + \gobble +} +\let\Eflushleft = \afterenvbreak % @flushright. % -\def\flushright{\begingroup - \let\nonarrowing = t +\envdef\flushright{% + \let\nonarrowing = t% \nonfillstart - \let\Eflushright = \nonfillfinish \advance\leftskip by 0pt plus 1fill \gobble } +\let\Eflushright = \afterenvbreak % @quotation does normal linebreaking (hence we can't use \nonfillstart) -% and narrows the margins. +% and narrows the margins. We keep \parskip nonzero in general, since +% we're doing normal filling. So, when using \aboveenvbreak and +% \afterenvbreak, temporarily make \parskip 0. % -\def\quotation{% - \begingroup\inENV %This group ends at the end of the @quotation body +\envdef\quotation{% {\parskip=0pt \aboveenvbreak}% because \aboveenvbreak inserts \parskip \parindent=0pt - % We have retained a nonzero parskip for the environment, since we're - % doing normal filling. So to avoid extra space below the environment... 
- \def\Equotation{\parskip = 0pt \nonfillfinish}% % % @cartouche defines \nonarrowing to inhibit narrowing at next level down. \ifx\nonarrowing\relax @@ -4718,6 +4932,27 @@ width0pt\relax} \fi \exdentamount = \lispnarrowing \let\nonarrowing = \relax \fi + \parsearg\quotationlabel +} + +% We have retained a nonzero parskip for the environment, since we're +% doing normal filling. +% +\def\Equotation{% + \par + \ifx\quotationauthor\undefined\else + % indent a bit. + \leftline{\kern 2\leftskip \sl ---\quotationauthor}% + \fi + {\parskip=0pt \afterenvbreak}% +} + +% If we're given an argument, typeset it in bold with a colon after. +\def\quotationlabel#1{% + \def\temp{#1}% + \ifx\temp\empty \else + {\bf #1: }% + \fi } @@ -4739,7 +4974,7 @@ width0pt\relax} \fi % % [Knuth] p. 380 \def\uncatcodespecials{% - \def\do##1{\catcode`##1=12}\dospecials} + \def\do##1{\catcode`##1=\other}\dospecials} % % [Knuth] pp. 380,381,391 % Disable Spanish ligatures ?` and !` of \tt font @@ -4787,6 +5022,8 @@ width0pt\relax} \fi } \endgroup \def\setupverbatim{% + \nonfillstart + \advance\leftskip by -\defbodyindent % Easiest (and conventionally used) font for verbatim \tt \def\par{\leavevmode\egroup\box0\endgraf}% @@ -4808,7 +5045,7 @@ width0pt\relax} \fi % % [Knuth] p. 382; only eat outer {} \begingroup - \catcode`[=1\catcode`]=2\catcode`\{=12\catcode`\}=12 + \catcode`[=1\catcode`]=2\catcode`\{=\other\catcode`\}=\other \gdef\doverb{#1[\def\next##1#1}[##1\endgroup]\next] \endgroup % @@ -4825,13 +5062,6 @@ width0pt\relax} \fi % we need not redefine '\', '{' and '}'. 
% % Inspired by LaTeX's verbatim command set [latex.ltx] -%% Include LaTeX hack for completeness -- never know -%% \begingroup -%% \catcode`|=0 \catcode`[=1 -%% \catcode`]=2\catcode`\{=12\catcode`\}=12\catcode`\ =\active -%% \catcode`\\=12|gdef|doverbatim#1@end verbatim[ -%% #1|endgroup|def|Everbatim[]|end[verbatim]] -%% |endgroup % \begingroup \catcode`\ =\active @@ -4839,54 +5069,32 @@ width0pt\relax} \fi % ignore everything up to the first ^^M, that's the newline at the end % of the @verbatim input line itself. Otherwise we get an extra blank % line in the output. - \gdef\doverbatim#1^^M#2@end verbatim{#2\end{verbatim}}% + \xdef\doverbatim#1^^M#2@end verbatim{#2\noexpand\end\gobble verbatim}% + % We really want {...\end verbatim} in the body of the macro, but + % without the active space; thus we have to use \xdef and \gobble. \endgroup % -\def\verbatim{% - \def\Everbatim{\nonfillfinish\endgroup}% - \begingroup - \nonfillstart - \advance\leftskip by -\defbodyindent - \begingroup\setupverbatim\doverbatim +\envdef\verbatim{% + \setupverbatim\doverbatim } +\let\Everbatim = \afterenvbreak + % @verbatiminclude FILE - insert text of file in verbatim environment. % -% Allow normal characters that we make active in the argument (a file name). -\def\verbatiminclude{% - \begingroup - \catcode`\\=\other - \catcode`~=\other - \catcode`^=\other - \catcode`_=\other - \catcode`|=\other - \catcode`<=\other - \catcode`>=\other - \catcode`+=\other - \parsearg\doverbatiminclude -} -\def\setupverbatiminclude{% - \begingroup - \nonfillstart - \advance\leftskip by -\defbodyindent - \begingroup\setupverbatim -} +\def\verbatiminclude{\parseargusing\filenamecatcodes\doverbatiminclude} % \def\doverbatiminclude#1{% - % Restore active chars for included file. 
- \endgroup - \begingroup - \let\value=\expandablevalue - \def\thisfile{#1}% - \expandafter\expandafter\setupverbatiminclude\input\thisfile - \endgroup - \nonfillfinish - \endgroup + {% + \makevalueexpandable + \setupverbatim + \input #1 + \afterenvbreak + }% } % @copying ... @end copying. -% Save the text away for @insertcopying later. Many commands won't be -% allowed in this context, but that's ok. +% Save the text away for @insertcopying later. % % We save the uninterpreted tokens, rather than creating a box. % Saving the text in a box would be much easier, but then all the @@ -4895,62 +5103,14 @@ width0pt\relax} \fi % file; b) letting users define the frontmatter in as flexible order as % possible is very desirable. % -\def\copying{\begingroup - % Define a command to swallow text until we reach `@end copying'. - % \ is the escape char in this texinfo.tex file, so it is the - % delimiter for the command; @ will be the escape char when we read - % it, but that doesn't matter. - \long\def\docopying##1\end copying{\gdef\copyingtext{##1}\enddocopying}% - % - % We must preserve ^^M's in the input file; see \insertcopying below. - \catcode`\^^M = \active - \docopying -} - -% What we do to finish off the copying text. -% -\def\enddocopying{\endgroup\ignorespaces} - -% @insertcopying. Here we must play games with ^^M's. On the one hand, -% we need them to delimit commands such as `@end quotation', so they -% must be active. On the other hand, we certainly don't want every -% end-of-line to be a \par, as would happen with the normal active -% definition of ^^M. On the third hand, two ^^M's in a row should still -% generate a \par. -% -% Our approach is to make ^^M insert a space and a penalty1 normally; -% then it can also check if \lastpenalty=1. If it does, then manually -% do \par. -% -% This messes up the normal definitions of @c[omment], so we redefine -% it. Similarly for @ignore. (These commands are used in the gcc -% manual for man page generation.) 
-% -% Seems pretty fragile, most line-oriented commands will presumably -% fail, but for the limited use of getting the copying text (which -% should be quite simple) inserted, we can hope it's ok. -% -{\catcode`\^^M=\active % -\gdef\insertcopying{\begingroup % - \parindent = 0pt % looks wrong on title page - \def^^M{% - \ifnum \lastpenalty=1 % - \par % - \else % - \space \penalty 1 % - \fi % - }% - % - % Fix @c[omment] for catcode 13 ^^M's. - \def\c##1^^M{\ignorespaces}% - \let\comment = \c % - % - % Don't bother jumping through all the hoops that \doignore does, it - % would be very hard since the catcodes are already set. - \long\def\ignore##1\end ignore{\ignorespaces}% - % - \copyingtext % -\endgroup}% +\def\copying{\checkenv{}\begingroup\scanargctxt\docopying} +\def\docopying#1@end copying{\endgroup\def\copyingtext{#1}} +% +\def\insertcopying{% + \begingroup + \parindent = 0pt % paragraph indentation looks wrong on title page + \scanexp\copyingtext + \endgroup } \message{defuns,} @@ -4960,578 +5120,332 @@ width0pt\relax} \fi \newskip\defargsindent \defargsindent=50pt \newskip\deflastargmargin \deflastargmargin=18pt -\newcount\parencount - -% We want ()&[] to print specially on the defun line. -% -\def\activeparens{% - \catcode`\(=\active \catcode`\)=\active - \catcode`\&=\active - \catcode`\[=\active \catcode`\]=\active -} - -% Make control sequences which act like normal parenthesis chars. -\let\lparen = ( \let\rparen = ) - -{\activeparens % Now, smart parens don't turn on until &foo (see \amprm) - -% Be sure that we always have a definition for `(', etc. For example, -% if the fn name has parens in it, \boldbrax will not be in effect yet, -% so TeX would otherwise complain about undefined control sequence. 
-\global\let(=\lparen \global\let)=\rparen -\global\let[=\lbrack \global\let]=\rbrack - -\gdef\functionparens{\boldbrax\let&=\amprm\parencount=0 } -\gdef\boldbrax{\let(=\opnr\let)=\clnr\let[=\lbrb\let]=\rbrb} -% This is used to turn on special parens -% but make & act ordinary (given that it's active). -\gdef\boldbraxnoamp{\let(=\opnr\let)=\clnr\let[=\lbrb\let]=\rbrb\let&=\ampnr} - -% Definitions of (, ) and & used in args for functions. -% This is the definition of ( outside of all parentheses. -\gdef\oprm#1 {{\rm\char`\(}#1 \bf \let(=\opnested - \global\advance\parencount by 1 -} -% -% This is the definition of ( when already inside a level of parens. -\gdef\opnested{\char`\(\global\advance\parencount by 1 } -% -\gdef\clrm{% Print a paren in roman if it is taking us back to depth of 0. - % also in that case restore the outer-level definition of (. - \ifnum \parencount=1 {\rm \char `\)}\sl \let(=\oprm \else \char `\) \fi - \global\advance \parencount by -1 } -% If we encounter &foo, then turn on ()-hacking afterwards -\gdef\amprm#1 {{\rm\&#1}\let(=\oprm \let)=\clrm\ } -% -\gdef\normalparens{\boldbrax\let&=\ampnr} -} % End of definition inside \activeparens -%% These parens (in \boldbrax) actually are a little bolder than the -%% contained text. This is especially needed for [ and ] -\def\opnr{{\sf\char`\(}\global\advance\parencount by 1 } -\def\clnr{{\sf\char`\)}\global\advance\parencount by -1 } -\let\ampnr = \& -\def\lbrb{{\bf\char`\[}} -\def\rbrb{{\bf\char`\]}} - -% Active &'s sneak into the index arguments, so make sure it's defined. -{ - \catcode`& = \active - \global\let& = \ampnr -} - -% \defname, which formats the name of the @def (not the args). -% #1 is the function name. -% #2 is the type of definition, such as "Function". -% -\def\defname#1#2{% - % How we'll output the type name. Putting it in brackets helps - % distinguish it from the body text that may end up on the next line - % just below it.
- \ifempty{#2}% - \def\defnametype{}% +% Start the processing of @deffn: +\def\startdefun{% + \ifnum\lastpenalty<10000 + \medbreak \else - \def\defnametype{[\rm #2]}% - \fi - % - % Get the values of \leftskip and \rightskip as they were outside the @def... - \dimen2=\leftskip - \advance\dimen2 by -\defbodyindent - % - % Figure out values for the paragraph shape. - \setbox0=\hbox{\hskip \deflastargmargin{\defnametype}}% - \dimen0=\hsize \advance \dimen0 by -\wd0 % compute size for first line - \dimen1=\hsize \advance \dimen1 by -\defargsindent % size for continuations - \parshape 2 0in \dimen0 \defargsindent \dimen1 - % - % Output arg 2 ("Function" or some such) but stuck inside a box of - % width 0 so it does not interfere with linebreaking. - \noindent - % - {% Adjust \hsize to exclude the ambient margins, - % so that \rightline will obey them. - \advance \hsize by -\dimen2 - \dimen3 = 0pt % was -1.25pc - \rlap{\rightline{\defnametype\kern\dimen3}}% - }% - % - % Allow all lines to be underfull without complaint: - \tolerance=10000 \hbadness=10000 - \advance\leftskip by -\defbodyindent - \exdentamount=\defbodyindent - {\df #1}\enskip % output function name - % \defunargs will be called next to output the arguments, if any. -} - -% Common pieces to start any @def... -% #1 is the \E... control sequence to end the definition (which we define). -% #2 is the \...x control sequence (which our caller defines). -% #3 is the control sequence to process the header, such as \defunheader. -% -\def\parsebodycommon#1#2#3{% - \begingroup\inENV - % If there are two @def commands in a row, we'll have a \nobreak, - % which is there to keep the function description together with its - % header. But if there's nothing but headers, we need to allow a - % break somewhere. Check for penalty 10002 (inserted by - % \defargscommonending) instead of 10000, since the sectioning - % commands insert a \penalty10000, and we don't want to allow a break - % between a section heading and a defun. 
- \ifnum\lastpenalty=10002 \penalty2000 \fi - % - % Similarly, after a section heading, do not allow a break. - % But do insert the glue. - \ifnum\lastpenalty<10000 \medbreak - \else \medskip % preceded by discardable penalty, so not a breakpoint + % If there are two @def commands in a row, we'll have a \nobreak, + % which is there to keep the function description together with its + % header. But if there's nothing but headers, we need to allow a + % break somewhere. Check specifically for penalty 10002, inserted + % by \defargscommonending, instead of 10000, since the sectioning + % commands also insert a nobreak penalty, and we don't want to allow + % a break between a section heading and a defun. + % + \ifnum\lastpenalty=10002 \penalty2000 \fi + % + % Similarly, after a section heading, do not allow a break. + % But do insert the glue. + \medskip % preceded by discardable penalty, so not a breakpoint \fi % - % Define the \E... end token that this defining construct specifies - % so that it will exit this group. - \def#1{\endgraf\endgroup\medbreak}% - % \parindent=0in \advance\leftskip by \defbodyindent \exdentamount=\defbodyindent } -% Common part of the \...x definitions. -% -\def\defxbodycommon{% - % As with \parsebodycommon above, allow line break if we have multiple - % x headers in a row. It's not a great place, though. - \ifnum\lastpenalty=10002 \penalty2000 \fi +\def\dodefunx#1{% + % First, check whether we are in the right environment: + \checkenv#1% + % + % As above, allow line break if we have multiple x headers in a row. + % It's not a great place, though. + \ifnum\lastpenalty=10002 \penalty3000 \fi % - \begingroup\obeylines + % And now, it's time to reuse the body of the original defun: + \expandafter\gobbledefun#1% } +\def\gobbledefun#1\startdefun{} -% Process body of @defun, @deffn, @defmac, etc. 
+% \printdefunline \deffnheader{text} % -\def\defparsebody#1#2#3{% - \parsebodycommon{#1}{#2}{#3}% - \def#2{\defxbodycommon \activeparens \spacesplit#3}% - \catcode\equalChar=\active - \begingroup\obeylines\activeparens - \spacesplit#3% +\def\printdefunline#1#2{% + \begingroup + % call \deffnheader: + #1#2 \endheader + % common ending: + \interlinepenalty = 10000 + \advance\rightskip by 0pt plus 1fil + \endgraf + \nobreak\vskip -\parskip + \penalty 10002 % signal to \startdefun and \dodefunx + % Some of the @defun-type tags do not enable magic parentheses, + % rendering the following check redundant. But we don't optimize. + \checkparencounts + \endgroup } -% #1, #2, #3 are the common arguments (see \parsebodycommon above). -% #4, delimited by the space, is the class name. -% -\def\defmethparsebody#1#2#3#4 {% - \parsebodycommon{#1}{#2}{#3}% - \def#2##1 {\defxbodycommon \activeparens \spacesplit{#3{##1}}}% - \begingroup\obeylines\activeparens - % The \empty here prevents misinterpretation of a construct such as - % @deffn {whatever} {Enharmonic comma} - % See comments at \deftpparsebody, although in our case we don't have - % to remove the \empty afterwards, since it is empty. - \spacesplit{#3{#4}}\empty -} +\def\Edefun{\endgraf\medbreak} -% Used for @deftypemethod and @deftypeivar. -% #1, #2, #3 are the common arguments (see \defparsebody). -% #4, delimited by a space, is the class name. -% #5 is the method's return type. +% \makedefun{deffn} creates \deffn, \deffnx and \Edeffn; +% the only thing remaining is to define \deffnheader. % -\def\deftypemethparsebody#1#2#3#4 #5 {% - \parsebodycommon{#1}{#2}{#3}% - \def#2##1 ##2 {\defxbodycommon \activeparens \spacesplit{#3{##1}{##2}}}% - \begingroup\obeylines\activeparens - \spacesplit{#3{#4}{#5}}% +\def\makedefun#1{% + \expandafter\let\csname E#1\endcsname = \Edefun + \edef\temp{\noexpand\domakedefun + \makecsname{#1}\makecsname{#1x}\makecsname{#1header}}% + \temp } -% Used for @deftypeop.
The change from \deftypemethparsebody is an -% extra argument at the beginning which is the `category', instead of it -% being the hardwired string `Method' or `Instance Variable'. We have -% to account for this both in the \...x definition and in parsing the -% input at hand. Thus also need a control sequence (passed as #5) for -% the \E... definition to assign the category name to. +% \domakedefun \deffn \deffnx \deffnheader % -\def\deftypeopparsebody#1#2#3#4#5 #6 {% - \parsebodycommon{#1}{#2}{#3}% - \def#2##1 ##2 ##3 {\def#4{##1}% - \defxbodycommon \activeparens \spacesplit{#3{##2}{##3}}}% - \begingroup\obeylines\activeparens - \spacesplit{#3{#5}{#6}}% -} - -% For @defop. -\def\defopparsebody #1#2#3#4#5 {% - \parsebodycommon{#1}{#2}{#3}% - \def#2##1 ##2 {\def#4{##1}% - \defxbodycommon \activeparens \spacesplit{#3{##2}}}% - \begingroup\obeylines\activeparens - \spacesplit{#3{#5}}% -} - -% These parsing functions are similar to the preceding ones -% except that they do not make parens into active characters. -% These are used for "variables" since they have no arguments. +% Define \deffn and \deffnx, without parameters. +% \deffnheader has to be defined explicitly. % -\def\defvarparsebody #1#2#3{% - \parsebodycommon{#1}{#2}{#3}% - \def#2{\defxbodycommon \spacesplit#3}% - \catcode\equalChar=\active - \begingroup\obeylines - \spacesplit#3% -} - -% @defopvar. 
-\def\defopvarparsebody #1#2#3#4#5 {% - \parsebodycommon{#1}{#2}{#3}% - \def#2##1 ##2 {\def#4{##1}% - \defxbodycommon \spacesplit{#3{##2}}}% - \begingroup\obeylines - \spacesplit{#3{#5}}% -} - -\def\defvrparsebody#1#2#3#4 {% - \parsebodycommon{#1}{#2}{#3}% - \def#2##1 {\defxbodycommon \spacesplit{#3{##1}}}% - \begingroup\obeylines - \spacesplit{#3{#4}}% +\def\domakedefun#1#2#3{% + \envdef#1{% + \startdefun + \parseargusing\activeparens{\printdefunline#3}% + }% + \def#2{\dodefunx#1}% + \def#3% } -% This loses on `@deftp {Data Type} {struct termios}' -- it thinks the -% type is just `struct', because we lose the braces in `{struct -% termios}' when \spacesplit reads its undelimited argument. Sigh. -% \let\deftpparsebody=\defvrparsebody -% -% So, to get around this, we put \empty in with the type name. That -% way, TeX won't find exactly `{...}' as an undelimited argument, and -% won't strip off the braces. -% -\def\deftpparsebody #1#2#3#4 {% - \parsebodycommon{#1}{#2}{#3}% - \def#2##1 {\defxbodycommon \spacesplit{#3{##1}}}% - \begingroup\obeylines - \spacesplit{\parsetpheaderline{#3{#4}}}\empty -} - -% Fine, but then we have to eventually remove the \empty *and* the -% braces (if any). That's what this does. -% -\def\removeemptybraces\empty#1\relax{#1} - -% After \spacesplit has done its work, this is called -- #1 is the final -% thing to call, #2 the type name (which starts with \empty), and #3 -% (which might be empty) the arguments. -% -\def\parsetpheaderline#1#2#3{% - #1{\removeemptybraces#2\relax}{#3}% -}% +%%% Untyped functions: -% Split up #2 (the rest of the input line) at the first space token. -% call #1 with two arguments: -% the first is all of #2 before the space token, -% the second is all of #2 after that space token. -% If #2 contains no space token, all of it is passed as the first arg -% and the second is passed as empty. 
-% -{\obeylines % - \gdef\spacesplit#1#2^^M{\endgroup\spacesplitx{#1}#2 \relax\spacesplitx}% - \long\gdef\spacesplitx#1#2 #3#4\spacesplitx{% - \ifx\relax #3% - #1{#2}{}% - \else % - #1{#2}{#3#4}% - \fi}% -} +% @deffn category name args +\makedefun{deffn}{\deffngeneral{}} -% Define @defun. +% @defop category class name args +\makedefun{defop}#1 {\defopon{#1\ \putwordon}} -% This is called to end the arguments processing for all the @def... commands. -% -\def\defargscommonending{% - \interlinepenalty = 10000 - \advance\rightskip by 0pt plus 1fil - \endgraf - \nobreak\vskip -\parskip - \penalty 10002 % signal to \parsebodycommon and \defxbodycommon. -} +% \defopon {category on}class name args +\def\defopon#1#2 {\deffngeneral{\putwordon\ \code{#2}}{#1\ \code{#2}} } -% This expands the args and terminates the paragraph they comprise. +% \deffngeneral {subind}category name args % -\def\defunargs#1{\functionparens \sl -% Expand, preventing hyphenation at `-' chars. -% Note that groups don't affect changes in \hyphenchar. -% Set the font temporarily and use \font in case \setfont made \tensl a macro. -{\tensl\hyphenchar\font=0}% -#1% -{\tensl\hyphenchar\font=45}% -\ifnum\parencount=0 \else \errmessage{Unbalanced parentheses in @def}\fi% - \defargscommonending +\def\deffngeneral#1#2 #3 #4\endheader{% + % Remember that \dosubind{fn}{foo}{} is equivalent to \doind{fn}{foo}. + \dosubind{fn}{\code{#3}}{#1}% + \defname{#2}{}{#3}\magicamp\defunargs{#4\unskip}% } -\def\deftypefunargs #1{% -% Expand, preventing hyphenation at `-' chars. -% Note that groups don't affect changes in \hyphenchar. -% Use \boldbraxnoamp, not \functionparens, so that & is not special. -\boldbraxnoamp -\tclose{#1}% avoid \code because of side effects on active chars - \defargscommonending -} +%%% Typed functions: -% Do complete processing of one @defun or @defunx line already parsed.
+% @deftypefn category type name args +\makedefun{deftypefn}{\deftypefngeneral{}} -% @deffn Command forward-char nchars +% @deftypeop category class type name args +\makedefun{deftypeop}#1 {\deftypeopon{#1\ \putwordon}} -\def\deffn{\defmethparsebody\Edeffn\deffnx\deffnheader} +% \deftypeopon {category on}class type name args +\def\deftypeopon#1#2 {\deftypefngeneral{\putwordon\ \code{#2}}{#1\ \code{#2}} } -\def\deffnheader #1#2#3{\doind {fn}{\code{#2}}% -\begingroup\defname {#2}{#1}\defunargs{#3}\endgroup % -\catcode\equalChar=\other % Turn off change made in \defparsebody +% \deftypefngeneral {subind}category type name args +% +\def\deftypefngeneral#1#2 #3 #4 #5\endheader{% + \dosubind{fn}{\code{#4}}{#1}% + \defname{#2}{#3}{#4}\defunargs{#5\unskip}% } -% @defun == @deffn Function +%%% Typed variables: -\def\defun{\defparsebody\Edefun\defunx\defunheader} - -\def\defunheader #1#2{\doind {fn}{\code{#1}}% Make entry in function index -\begingroup\defname {#1}{\putwordDeffunc}% -\defunargs {#2}\endgroup % -\catcode\equalChar=\other % Turn off change made in \defparsebody -} +% @deftypevr category type var args +\makedefun{deftypevr}{\deftypecvgeneral{}} -% @deftypefun int foobar (int @var{foo}, float @var{bar}) +% @deftypecv category class type var args +\makedefun{deftypecv}#1 {\deftypecvof{#1\ \putwordof}} -\def\deftypefun{\defparsebody\Edeftypefun\deftypefunx\deftypefunheader} +% \deftypecvof {category of}class type var args +\def\deftypecvof#1#2 {\deftypecvgeneral{\putwordof\ \code{#2}}{#1\ \code{#2}} } -% #1 is the data type. #2 is the name and args. -\def\deftypefunheader #1#2{\deftypefunheaderx{#1}#2 \relax} -% #1 is the data type, #2 the name, #3 the args. 
-\def\deftypefunheaderx #1#2 #3\relax{% -\doind {fn}{\code{#2}}% Make entry in function index -\begingroup\defname {\defheaderxcond#1\relax$.$#2}{\putwordDeftypefun}% -\deftypefunargs {#3}\endgroup % -\catcode\equalChar=\other % Turn off change made in \defparsebody +% \deftypecvgeneral {subind}category type var args +% +\def\deftypecvgeneral#1#2 #3 #4 #5\endheader{% + \dosubind{vr}{\code{#4}}{#1}% + \defname{#2}{#3}{#4}\defunargs{#5\unskip}% } -% @deftypefn {Library Function} int foobar (int @var{foo}, float @var{bar}) - -\def\deftypefn{\defmethparsebody\Edeftypefn\deftypefnx\deftypefnheader} - -% \defheaderxcond#1\relax$.$ -% puts #1 in @code, followed by a space, but does nothing if #1 is null. -\def\defheaderxcond#1#2$.${\ifx#1\relax\else\code{#1#2} \fi} +%%% Untyped variables: -% #1 is the classification. #2 is the data type. #3 is the name and args. -\def\deftypefnheader #1#2#3{\deftypefnheaderx{#1}{#2}#3 \relax} -% #1 is the classification, #2 the data type, #3 the name, #4 the args. 
-\def\deftypefnheaderx #1#2#3 #4\relax{% -\doind {fn}{\code{#3}}% Make entry in function index -\begingroup -\normalparens % notably, turn off `&' magic, which prevents -% at least some C++ text from working -\defname {\defheaderxcond#2\relax$.$#3}{#1}% -\deftypefunargs {#4}\endgroup % -\catcode\equalChar=\other % Turn off change made in \defparsebody -} +% @defvr category var args +\makedefun{defvr}#1 {\deftypevrheader{#1} {} } -% @defmac == @deffn Macro +% @defcv category class var args +\makedefun{defcv}#1 {\defcvof{#1\ \putwordof}} -\def\defmac{\defparsebody\Edefmac\defmacx\defmacheader} +% \defcvof {category of}class var args +\def\defcvof#1#2 {\deftypecvof{#1}#2 {} } -\def\defmacheader #1#2{\doind {fn}{\code{#1}}% Make entry in function index -\begingroup\defname {#1}{\putwordDefmac}% -\defunargs {#2}\endgroup % -\catcode\equalChar=\other % Turn off change made in \defparsebody +%%% Type: +% @deftp category name args +\makedefun{deftp}#1 #2 #3\endheader{% + \doind{tp}{\code{#2}}% + \defname{#1}{}{#2}\defunargs{#3\unskip}% } -% @defspec == @deffn Special Form +% Remaining @defun-like shortcuts: +\makedefun{defun}{\deffnheader{\putwordDeffunc} } +\makedefun{defmac}{\deffnheader{\putwordDefmac} } +\makedefun{defspec}{\deffnheader{\putwordDefspec} } +\makedefun{deftypefun}{\deftypefnheader{\putwordDeffunc} } +\makedefun{defvar}{\defvrheader{\putwordDefvar} } +\makedefun{defopt}{\defvrheader{\putwordDefopt} } +\makedefun{deftypevar}{\deftypevrheader{\putwordDefvar} } +\makedefun{defmethod}{\defopon\putwordMethodon} +\makedefun{deftypemethod}{\deftypeopon\putwordMethodon} +\makedefun{defivar}{\defcvof\putwordInstanceVariableof} +\makedefun{deftypeivar}{\deftypecvof\putwordInstanceVariableof} -\def\defspec{\defparsebody\Edefspec\defspecx\defspecheader} - -\def\defspecheader #1#2{\doind {fn}{\code{#1}}% Make entry in function index -\begingroup\defname {#1}{\putwordDefspec}% -\defunargs {#2}\endgroup % -\catcode\equalChar=\other % Turn off change made in 
\defparsebody -} - -% @defop CATEGORY CLASS OPERATION ARG... +% \defname, which formats the name of the @def (not the args). +% #1 is the category, such as "Function". +% #2 is the return type, if any. +% #3 is the function name. % -\def\defop #1 {\def\defoptype{#1}% -\defopparsebody\Edefop\defopx\defopheader\defoptype} +% We are followed by (but not passed) the arguments, if any. % -\def\defopheader#1#2#3{% - \dosubind{fn}{\code{#2}}{\putwordon\ \code{#1}}% function index entry - \begingroup - \defname{#2}{\defoptype\ \putwordon\ #1}% - \defunargs{#3}% - \endgroup +\def\defname#1#2#3{% + % Get the values of \leftskip and \rightskip as they were outside the @def... + \advance\leftskip by -\defbodyindent + % + % How we'll format the type name. Putting it in brackets helps + % distinguish it from the body text that may end up on the next line + % just below it. + \def\temp{#1}% + \setbox0=\hbox{\kern\deflastargmargin \ifx\temp\empty\else [\rm\temp]\fi} + % + % Figure out line sizes for the paragraph shape. + % The first line needs space for \box0; but if \rightskip is nonzero, + % we need only space for the part of \box0 which exceeds it: + \dimen0=\hsize \advance\dimen0 by -\wd0 \advance\dimen0 by \rightskip + % The continuations: + \dimen2=\hsize \advance\dimen2 by -\defargsindent + % (plain.tex says that \dimen1 should be used only as global.) + \parshape 2 0in \dimen0 \defargsindent \dimen2 + % + % Put the type name to the right margin. + \noindent + \hbox to 0pt{% + \hfil\box0 \kern-\hsize + % \hsize has to be shortened this way: + \kern\leftskip + % Intentionally do not respect \rightskip, since we need the space. + }% + % + % Allow all lines to be underfull without complaint: + \tolerance=10000 \hbadness=10000 + \exdentamount=\defbodyindent + {% + % defun fonts. We use typewriter by default (used to be bold) because: + % . we're printing identifiers, they should be in tt in principle. + % . 
in languages with many accents, such as Czech or French, it's + % common to leave accents off identifiers. The result looks ok in + % tt, but exceedingly strange in rm. + % . we don't want -- and --- to be treated as ligatures. + % . this still does not fix the ?` and !` ligatures, but so far no + % one has made identifiers using them :). + \df \tt + \def\temp{#2}% return value type + \ifx\temp\empty\else \tclose{\temp} \fi + #3% output function name + }% + {\rm\enskip}% hskip 0.5 em of \tenrm + % + \boldbrax + % arguments will be output next, if any. } -% @deftypeop CATEGORY CLASS TYPE OPERATION ARG... -% -\def\deftypeop #1 {\def\deftypeopcategory{#1}% - \deftypeopparsebody\Edeftypeop\deftypeopx\deftypeopheader - \deftypeopcategory} +% Print arguments in slanted roman (not ttsl), inconsistently with using +% tt for the name. This is because literal text is sometimes needed in +% the argument list (groff manual), and ttsl and tt are not very +% distinguishable. Prevent hyphenation at `-' chars. % -% #1 is the class name, #2 the data type, #3 the operation name, #4 the args. -\def\deftypeopheader#1#2#3#4{% - \dosubind{fn}{\code{#3}}{\putwordon\ \code{#1}}% entry in function index - \begingroup - \defname{\defheaderxcond#2\relax$.$#3} - {\deftypeopcategory\ \putwordon\ \code{#1}}% - \deftypefunargs{#4}% - \endgroup +\def\defunargs#1{% + % use sl by default (not ttsl), + % tt for the names. + \df \sl \hyphenchar\font=0 + % + % On the other hand, if an argument has two dashes (for instance), we + % want a way to get ttsl. Let's try @var for that. + \let\var=\ttslanted + #1% + \sl\hyphenchar\font=45 } -% @deftypemethod CLASS TYPE METHOD ARG... -% -\def\deftypemethod{% - \deftypemethparsebody\Edeftypemethod\deftypemethodx\deftypemethodheader} +% We want ()&[] to print specially on the defun line. % -% #1 is the class name, #2 the data type, #3 the method name, #4 the args. 
-\def\deftypemethodheader#1#2#3#4{% - \dosubind{fn}{\code{#3}}{\putwordon\ \code{#1}}% entry in function index - \begingroup - \defname{\defheaderxcond#2\relax$.$#3}{\putwordMethodon\ \code{#1}}% - \deftypefunargs{#4}% - \endgroup +\def\activeparens{% + \catcode`\(=\active \catcode`\)=\active + \catcode`\[=\active \catcode`\]=\active + \catcode`\&=\active } -% @deftypeivar CLASS TYPE VARNAME -% -\def\deftypeivar{% - \deftypemethparsebody\Edeftypeivar\deftypeivarx\deftypeivarheader} -% -% #1 is the class name, #2 the data type, #3 the variable name. -\def\deftypeivarheader#1#2#3{% - \dosubind{vr}{\code{#3}}{\putwordof\ \code{#1}}% entry in variable index - \begingroup - \defname{\defheaderxcond#2\relax$.$#3} - {\putwordInstanceVariableof\ \code{#1}}% - \defvarargs{#3}% - \endgroup -} +% Make control sequences which act like normal parenthesis chars. +\let\lparen = ( \let\rparen = ) -% @defmethod == @defop Method -% -\def\defmethod{\defmethparsebody\Edefmethod\defmethodx\defmethodheader} -% -% #1 is the class name, #2 the method name, #3 the args. -\def\defmethodheader#1#2#3{% - \dosubind{fn}{\code{#2}}{\putwordon\ \code{#1}}% entry in function index - \begingroup - \defname{#2}{\putwordMethodon\ \code{#1}}% - \defunargs{#3}% - \endgroup -} +% Be sure that we always have a definition for `(', etc. For example, +% if the fn name has parens in it, \boldbrax will not be in effect yet, +% so TeX would otherwise complain about undefined control sequence. 
+{ + \activeparens + \global\let(=\lparen \global\let)=\rparen + \global\let[=\lbrack \global\let]=\rbrack + \global\let& = \& -% @defcv {Class Option} foo-class foo-flag + \gdef\boldbrax{\let(=\opnr\let)=\clnr\let[=\lbrb\let]=\rbrb} + \gdef\magicamp{\let&=\amprm} +} -\def\defcv #1 {\def\defcvtype{#1}% -\defopvarparsebody\Edefcv\defcvx\defcvarheader\defcvtype} +\newcount\parencount -\def\defcvarheader #1#2#3{% - \dosubind{vr}{\code{#2}}{\putwordof\ \code{#1}}% variable index entry - \begingroup - \defname{#2}{\defcvtype\ \putwordof\ #1}% - \defvarargs{#3}% - \endgroup +% If we encounter &foo, then turn on ()-hacking afterwards +\newif\ifampseen +\def\amprm#1 {\ampseentrue{\bf\&#1 }} + +\def\parenfont{% + \ifampseen + % At the first level, print parens in roman, + % otherwise use the default font. + \ifnum \parencount=1 \rm \fi + \else + % The \sf parens (in \boldbrax) actually are a little bolder than + % the contained text. This is especially needed for [ and ] . + \sf + \fi } - -% @defivar CLASS VARNAME == @defcv {Instance Variable} CLASS VARNAME -% -\def\defivar{\defvrparsebody\Edefivar\defivarx\defivarheader} -% -\def\defivarheader#1#2#3{% - \dosubind{vr}{\code{#2}}{\putwordof\ \code{#1}}% entry in var index - \begingroup - \defname{#2}{\putwordInstanceVariableof\ #1}% - \defvarargs{#3}% - \endgroup +\def\infirstlevel#1{% + \ifampseen + \ifnum\parencount=1 + #1% + \fi + \fi } +\def\bfafterword#1 {#1 \bf} -% @defvar -% First, define the processing that is wanted for arguments of @defvar. -% This is actually simple: just print them in roman.
-% This must expand the args and terminate the paragraph they make up -\def\defvarargs #1{\normalparens #1% - \defargscommonending +\def\opnr{% + \global\advance\parencount by 1 + {\parenfont(}% + \infirstlevel \bfafterword } - -% @defvr Counter foo-count - -\def\defvr{\defvrparsebody\Edefvr\defvrx\defvrheader} - -\def\defvrheader #1#2#3{\doind {vr}{\code{#2}}% -\begingroup\defname {#2}{#1}\defvarargs{#3}\endgroup} - -% @defvar == @defvr Variable - -\def\defvar{\defvarparsebody\Edefvar\defvarx\defvarheader} - -\def\defvarheader #1#2{\doind {vr}{\code{#1}}% Make entry in var index -\begingroup\defname {#1}{\putwordDefvar}% -\defvarargs {#2}\endgroup % +\def\clnr{% + {\parenfont)}% + \infirstlevel \sl + \global\advance\parencount by -1 } -% @defopt == @defvr {User Option} - -\def\defopt{\defvarparsebody\Edefopt\defoptx\defoptheader} - -\def\defoptheader #1#2{\doind {vr}{\code{#1}}% Make entry in var index -\begingroup\defname {#1}{\putwordDefopt}% -\defvarargs {#2}\endgroup % +\newcount\brackcount +\def\lbrb{% + \global\advance\brackcount by 1 + {\bf[}% +} +\def\rbrb{% + {\bf]}% + \global\advance\brackcount by -1 } -% @deftypevar int foobar - -\def\deftypevar{\defvarparsebody\Edeftypevar\deftypevarx\deftypevarheader} - -% #1 is the data type. #2 is the name, perhaps followed by text that -% is actually part of the data type, which should not be put into the index. -\def\deftypevarheader #1#2{% -\dovarind#2 \relax% Make entry in variables index -\begingroup\defname {\defheaderxcond#1\relax$.$#2}{\putwordDeftypevar}% - \defargscommonending -\endgroup} -\def\dovarind#1 #2\relax{\doind{vr}{\code{#1}}} - -% @deftypevr {Global Flag} int enable - -\def\deftypevr{\defvrparsebody\Edeftypevr\deftypevrx\deftypevrheader} - -\def\deftypevrheader #1#2#3{\dovarind#3 \relax% -\begingroup\defname {\defheaderxcond#2\relax$.$#3}{#1} - \defargscommonending -\endgroup} - -% Now define @deftp -% Args are printed in bold, a slight difference from @defvar. 
- -\def\deftpargs #1{\bf \defvarargs{#1}} - -% @deftp Class window height width ... - -\def\deftp{\deftpparsebody\Edeftp\deftpx\deftpheader} - -\def\deftpheader #1#2#3{\doind {tp}{\code{#2}}% -\begingroup\defname {#2}{#1}\deftpargs{#3}\endgroup} - -% These definitions are used if you use @defunx (etc.) -% anywhere other than immediately after a @defun or @defunx. -% -\def\defcvx#1 {\errmessage{@defcvx in invalid context}} -\def\deffnx#1 {\errmessage{@deffnx in invalid context}} -\def\defivarx#1 {\errmessage{@defivarx in invalid context}} -\def\defmacx#1 {\errmessage{@defmacx in invalid context}} -\def\defmethodx#1 {\errmessage{@defmethodx in invalid context}} -\def\defoptx #1 {\errmessage{@defoptx in invalid context}} -\def\defopx#1 {\errmessage{@defopx in invalid context}} -\def\defspecx#1 {\errmessage{@defspecx in invalid context}} -\def\deftpx#1 {\errmessage{@deftpx in invalid context}} -\def\deftypefnx#1 {\errmessage{@deftypefnx in invalid context}} -\def\deftypefunx#1 {\errmessage{@deftypefunx in invalid context}} -\def\deftypeivarx#1 {\errmessage{@deftypeivarx in invalid context}} -\def\deftypemethodx#1 {\errmessage{@deftypemethodx in invalid context}} -\def\deftypeopx#1 {\errmessage{@deftypeopx in invalid context}} -\def\deftypevarx#1 {\errmessage{@deftypevarx in invalid context}} -\def\deftypevrx#1 {\errmessage{@deftypevrx in invalid context}} -\def\defunx#1 {\errmessage{@defunx in invalid context}} -\def\defvarx#1 {\errmessage{@defvarx in invalid context}} -\def\defvrx#1 {\errmessage{@defvrx in invalid context}} +\def\checkparencounts{% + \ifnum\parencount=0 \else \badparencount \fi + \ifnum\brackcount=0 \else \badbrackcount \fi +} +\def\badparencount{% + \errmessage{Unbalanced parentheses in @def}% + \global\parencount=0 +} +\def\badbrackcount{% + \errmessage{Unbalanced square braces in @def}% + \global\brackcount=0 +} \message{macros,} @@ -5540,27 +5454,41 @@ width0pt\relax} \fi % To do this right we need a feature of e-TeX, \scantokens, % which we 
arrange to emulate with a temporary file in ordinary TeX. \ifx\eTeXversion\undefined - \newwrite\macscribble - \def\scanmacro#1{% - \begingroup \newlinechar`\^^M - % Undo catcode changes of \startcontents and \doprintindex - \catcode`\@=0 \catcode`\\=\other \escapechar=`\@ - % Append \endinput to make sure that TeX does not see the ending newline. - \toks0={#1\endinput}% - \immediate\openout\macscribble=\jobname.tmp - \immediate\write\macscribble{\the\toks0}% - \immediate\closeout\macscribble - \let\xeatspaces\eatspaces - \input \jobname.tmp - \endgroup + \newwrite\macscribble + \def\scantokens#1{% + \toks0={#1}% + \immediate\openout\macscribble=\jobname.tmp + \immediate\write\macscribble{\the\toks0}% + \immediate\closeout\macscribble + \input \jobname.tmp + } +\fi + +\def\scanmacro#1{% + \begingroup + \newlinechar`\^^M + \let\xeatspaces\eatspaces + % Undo catcode changes of \startcontents and \doprintindex + % When called from @insertcopying or (short)caption, we need active + % backslash to get it printed correctly. Previously, we had + % \catcode`\\=\other instead. We'll see whether a problem appears + % with macro expansion. --kasal, 19aug04 + \catcode`\@=0 \catcode`\\=\active \escapechar=`\@ + % ... and \example + \spaceisspace + % + % Append \endinput to make sure that TeX does not see the ending newline. + % + % I've verified that it is necessary both for e-TeX and for ordinary TeX + % --kasal, 29nov03 + \scantokens{#1\endinput}% + \endgroup +} + +\def\scanexp#1{% + \edef\temp{\noexpand\scanmacro{#1}}% + \temp } -\else -\def\scanmacro#1{% -\begingroup \newlinechar`\^^M -% Undo catcode changes of \startcontents and \doprintindex -\catcode`\@=0 \catcode`\\=\other \escapechar=`\@ -\let\xeatspaces\eatspaces\scantokens{#1\endinput}\endgroup} -\fi \newcount\paramno % Count of parameters \newtoks\macname % Macro name @@ -5569,13 +5497,15 @@ width0pt\relax} \fi % \do\macro1\do\macro2... % Utility routines. -% Thisdoes \let #1 = #2, except with \csnames. 
+% This does \let #1 = #2, with \csnames; that is, +% \let \csname#1\endcsname = \csname#2\endcsname +% (except of course we have to play expansion games). +% \def\cslet#1#2{% -\expandafter\expandafter -\expandafter\let -\expandafter\expandafter -\csname#1\endcsname -\csname#2\endcsname} + \expandafter\let + \csname#1\expandafter\endcsname + \csname#2\endcsname +} % Trim leading and trailing spaces off a string. % Concepts from aro-bend problem 15 (see CTAN). @@ -5602,30 +5532,36 @@ width0pt\relax} \fi % done by making ^^M (\endlinechar) catcode 12 when reading the macro % body, and then making it the \newlinechar in \scanmacro. -\def\macrobodyctxt{% - \catcode`\~=\other +\def\scanctxt{% + \catcode`\"=\other + \catcode`\+=\other + \catcode`\<=\other + \catcode`\>=\other + \catcode`\@=\other \catcode`\^=\other \catcode`\_=\other \catcode`\|=\other - \catcode`\<=\other - \catcode`\>=\other - \catcode`\+=\other + \catcode`\~=\other +} + +\def\scanargctxt{% + \scanctxt + \catcode`\\=\other + \catcode`\^^M=\other +} + +\def\macrobodyctxt{% + \scanctxt \catcode`\{=\other \catcode`\}=\other - \catcode`\@=\other \catcode`\^^M=\other - \usembodybackslash} + \usembodybackslash +} \def\macroargctxt{% - \catcode`\~=\other - \catcode`\^=\other - \catcode`\_=\other - \catcode`\|=\other - \catcode`\<=\other - \catcode`\>=\other - \catcode`\+=\other - \catcode`\@=\other - \catcode`\\=\other} + \scanctxt + \catcode`\\=\other +} % \mbodybackslash is the definition of \ in @macro bodies. 
% It maps \foo\ => \csname macarg.foo\endcsname => #N @@ -5666,8 +5602,7 @@ width0pt\relax} \fi \else \expandafter\parsemacbody \fi} -\def\unmacro{\parsearg\dounmacro} -\def\dounmacro#1{% +\parseargdef\unmacro{% \if1\csname ismacro.#1\endcsname \global\cslet{#1}{macsave.#1}% \global\expandafter\let \csname ismacro.#1\endcsname=0% @@ -5808,21 +5743,38 @@ width0pt\relax} \fi \expandafter\parsearg \fi \next} -% We mant to disable all macros during \shipout so that they are not +% We want to disable all macros during \shipout so that they are not % expanded by \write. \def\turnoffmacros{\begingroup \def\do##1{\let\noexpand##1=\relax}% \edef\next{\macrolist}\expandafter\endgroup\next} +% For \indexnofonts, we need to get rid of all macros, leaving only the +% arguments (if present). Of course this is not nearly correct, but it +% is the best we can do for now. makeinfo does not expand macros in the +% argument to @deffn, which ends up writing an index entry, and texindex +% isn't prepared for an index sort entry that starts with \. +% +% Since macro invocations are followed by braces, we can just redefine them +% to take a single TeX argument. The case of a macro invocation that +% goes to end-of-line is not handled. +% +\def\emptyusermacros{\begingroup + \def\do##1{\let\noexpand##1=\noexpand\asis}% + \edef\next{\macrolist}\expandafter\endgroup\next} + % @alias. % We need some trickery to remove the optional spaces around the equal % sign. Just make them active and then expand them all to nothing. 
-\def\alias{\begingroup\obeyspaces\parsearg\aliasxxx} +\def\alias{\parseargusing\obeyspaces\aliasxxx} \def\aliasxxx #1{\aliasyyy#1\relax} -\def\aliasyyy #1=#2\relax{\ignoreactivespaces -\edef\next{\global\let\expandafter\noexpand\csname#1\endcsname=% - \expandafter\noexpand\csname#2\endcsname}% -\expandafter\endgroup\next} +\def\aliasyyy #1=#2\relax{% + {% + \expandafter\let\obeyedspace=\empty + \xdef\next{\global\let\makecsname{#1}=\makecsname{#2}}% + }% + \next +} \message{cross references,} @@ -5838,19 +5790,27 @@ width0pt\relax} \fi node \samp{\ignorespaces#1{}}} % @node's only job in TeX is to define \lastnode, which is used in -% cross-references. -\def\node{\ENVcheck\parsearg\nodezzz} -\def\nodezzz#1{\nodexxx #1,\finishnodeparse} -\def\nodexxx#1,#2\finishnodeparse{\gdef\lastnode{#1}} +% cross-references. The @node line might or might not have commas, and +% might or might not have spaces before the first comma, like: +% @node foo , bar , ... +% We don't want such trailing spaces in the node name. +% +\parseargdef\node{\checkenv{}\donode #1 ,\finishnodeparse} +% +% also remove a trailing comma, in case of something like this: +% @node Help-Cross, , , Cross-refs +\def\donode#1 ,#2\finishnodeparse{\dodonode #1,\finishnodeparse} +\def\dodonode#1,#2\finishnodeparse{\gdef\lastnode{#1}} + \let\nwnode=\node \let\lastnode=\empty % Write a cross-reference definition for the current node. #1 is the % type (Ynumbered, Yappendix, Ynothing). 
-% +% \def\donoderef#1{% \ifx\lastnode\empty\else - \expandafter\expandafter\expandafter\setref{\lastnode}{#1}% + \setref{\lastnode}{#1}% \global\let\lastnode=\empty \fi } @@ -5859,33 +5819,40 @@ width0pt\relax} \fi % \newcount\savesfregister % -\gdef\savesf{\relax \ifhmode \savesfregister=\spacefactor \fi} -\gdef\restoresf{\relax \ifhmode \spacefactor=\savesfregister \fi} -\gdef\anchor#1{\savesf \setref{#1}{Ynothing}\restoresf \ignorespaces} +\def\savesf{\relax \ifhmode \savesfregister=\spacefactor \fi} +\def\restoresf{\relax \ifhmode \spacefactor=\savesfregister \fi} +\def\anchor#1{\savesf \setref{#1}{Ynothing}\restoresf \ignorespaces} % \setref{NAME}{SNT} defines a cross-reference point NAME (a node or an -% anchor), namely NAME-title (the corresponding @chapter/etc. name), -% NAME-pg (the page number), and NAME-snt (section number and type). -% Called from \foonoderef. -% -% We have to set dummies so commands such as @code in a section title -% aren't expanded. It would be nicer not to expand the titles in the -% first place, but that is hard to do. -% -% Likewise, use \turnoffactive so that punctuation chars such as underscore -% and backslash work in node names. -% -\def\setref#1#2{{% - \atdummies +% anchor), which consists of three parts: +% 1) NAME-title - the current sectioning name taken from \thissection, +% or the anchor name. +% 2) NAME-snt - section number and type, passed as the SNT arg, or +% empty for anchors. +% 3) NAME-pg - the page number. +% +% This is called from \donoderef, \anchor, and \dofloat. In the case of +% floats, there is an additional part, which is not written here: +% 4) NAME-lof - the text as it should appear in a @listoffloats. 
+% +\def\setref#1#2{% \pdfmkdest{#1}% - % \iflinks - \turnoffactive - \dosetq{#1-title}{Ytitle}% - \dosetq{#1-pg}{Ypagenumber}% - \dosetq{#1-snt}{#2}% + {% + \atdummies % preserve commands, but don't expand them + \turnoffactive + \otherbackslash + \edef\writexrdef##1##2{% + \write\auxfile{@xrdef{#1-% #1 of \setref, expanded by the \edef + ##1}{##2}}% these are parameters of \writexrdef + }% + \toks0 = \expandafter{\thissection}% + \immediate \writexrdef{title}{\the\toks0 }% + \immediate \writexrdef{snt}{\csname #2\endcsname}% \Ynumbered etc. + \writexrdef{pg}{\folio}% will be written later, during \shipout + }% \fi -}} +} % @xref, @pxref, and @ref generate cross-references. For \xrefX, #1 is % the node name, #2 the name of the Info cross-reference, #3 the printed @@ -5898,38 +5865,33 @@ width0pt\relax} \fi \def\xrefX[#1,#2,#3,#4,#5,#6]{\begingroup \unsepspaces \def\printedmanual{\ignorespaces #5}% - \def\printednodename{\ignorespaces #3}% - \setbox1=\hbox{\printedmanual}% - \setbox0=\hbox{\printednodename}% + \def\printedrefname{\ignorespaces #3}% + \setbox1=\hbox{\printedmanual\unskip}% + \setbox0=\hbox{\printedrefname\unskip}% \ifdim \wd0 = 0pt % No printed node name was explicitly given. \expandafter\ifx\csname SETxref-automatic-section-title\endcsname\relax % Use the node name inside the square brackets. - \def\printednodename{\ignorespaces #1}% + \def\printedrefname{\ignorespaces #1}% \else % Use the actual chapter/section title appear inside % the square brackets. Use the real section title if we have it. \ifdim \wd1 > 0pt % It is in another manual, so we don't have it. - \def\printednodename{\ignorespaces #1}% + \def\printedrefname{\ignorespaces #1}% \else \ifhavexrefs % We know the real title if we have the xref values. - \def\printednodename{\refx{#1-title}{}}% + \def\printedrefname{\refx{#1-title}{}}% \else % Otherwise just copy the Info node name. 
- \def\printednodename{\ignorespaces #1}% + \def\printedrefname{\ignorespaces #1}% \fi% \fi \fi \fi % - % If we use \unhbox0 and \unhbox1 to print the node names, TeX does not - % insert empty discretionaries after hyphens, which means that it will - % not find a line break at a hyphen in a node names. Since some manuals - % are best written with fairly long node names, containing hyphens, this - % is a loss. Therefore, we give the text of the node name again, so it - % is as if TeX is seeing it for the first time. + % Make link in pdf output. \ifpdf \leavevmode \getfilename{#4}% @@ -5945,54 +5907,77 @@ width0pt\relax} \fi \linkcolor \fi % - \ifdim \wd1 > 0pt - \putwordsection{} ``\printednodename'' \putwordin{} \cite{\printedmanual}% - \else - % _ (for example) has to be the character _ for the purposes of the - % control sequence corresponding to the node, but it has to expand - % into the usual \leavevmode...\vrule stuff for purposes of - % printing. So we \turnoffactive for the \refx-snt, back on for the - % printing, back off for the \refx-pg. - {\turnoffactive \otherbackslash - % Only output a following space if the -snt ref is nonempty; for - % @unnumbered and @anchor, it won't be. - \setbox2 = \hbox{\ignorespaces \refx{#1-snt}{}}% - \ifdim \wd2 > 0pt \refx{#1-snt}\space\fi - }% - % output the `[mynode]' via a macro. - \xrefprintnodename\printednodename + % Float references are printed completely differently: "Figure 1.2" + % instead of "[somenode], p.3". We distinguish them by the + % LABEL-title being set to a magic string. + {% + % Have to otherify everything special to allow the \csname to + % include an _ in the xref name, etc. + \indexnofonts + \turnoffactive + \otherbackslash + \expandafter\global\expandafter\let\expandafter\Xthisreftitle + \csname XR#1-title\endcsname + }% + \iffloat\Xthisreftitle + % If the user specified the print name (third arg) to the ref, + % print it instead of our usual "Figure 1.2". 
+ \ifdim\wd0 = 0pt + \refx{#1-snt}% + \else + \printedrefname + \fi % - % But we always want a comma and a space: - ,\space + % if the user also gave the printed manual name (fifth arg), append + % "in MANUALNAME". + \ifdim \wd1 > 0pt + \space \putwordin{} \cite{\printedmanual}% + \fi + \else + % node/anchor (non-float) references. % - % output the `page 3'. - \turnoffactive \otherbackslash \putwordpage\tie\refx{#1-pg}{}% + % If we use \unhbox0 and \unhbox1 to print the node names, TeX does not + % insert empty discretionaries after hyphens, which means that it will + % not find a line break at a hyphen in a node names. Since some manuals + % are best written with fairly long node names, containing hyphens, this + % is a loss. Therefore, we give the text of the node name again, so it + % is as if TeX is seeing it for the first time. + \ifdim \wd1 > 0pt + \putwordsection{} ``\printedrefname'' \putwordin{} \cite{\printedmanual}% + \else + % _ (for example) has to be the character _ for the purposes of the + % control sequence corresponding to the node, but it has to expand + % into the usual \leavevmode...\vrule stuff for purposes of + % printing. So we \turnoffactive for the \refx-snt, back on for the + % printing, back off for the \refx-pg. + {\turnoffactive \otherbackslash + % Only output a following space if the -snt ref is nonempty; for + % @unnumbered and @anchor, it won't be. + \setbox2 = \hbox{\ignorespaces \refx{#1-snt}{}}% + \ifdim \wd2 > 0pt \refx{#1-snt}\space\fi + }% + % output the `[mynode]' via a macro so it can be overridden. + \xrefprintnodename\printedrefname + % + % But we always want a comma and a space: + ,\space + % + % output the `page 3'. + \turnoffactive \otherbackslash \putwordpage\tie\refx{#1-pg}{}% + \fi \fi \endlink \endgroup} % This macro is called from \xrefX for the `[nodename]' part of xref % output. It's a separate macro only so it can be changed more easily, -% since not square brackets don't work in some documents. 
Particularly +% since square brackets don't work well in some documents. Particularly % one that Bob is working on :). % \def\xrefprintnodename#1{[#1]} -% \dosetq is called from \setref to do the actual \write (\iflinks). -% -\def\dosetq#1#2{% - \edef\next{\write\auxfile{\internalsetq{#1}{#2}}}% - \next -} - -% \internalsetq{foo}{page} expands into -% CHARACTERS @xrdef{foo}{...expansion of \page...} -\def\internalsetq#1#2{@xrdef{#1}{\csname #2\endcsname}} - -% Things to be expanded by \internalsetq. +% Things referred to by \setref. % -\def\Ypagenumber{\noexpand\folio} -\def\Ytitle{\thissection} \def\Ynothing{} \def\Yomitfromtoc{} \def\Ynumbered{% @@ -6019,15 +6004,6 @@ width0pt\relax} \fi \fi\fi\fi } -% Use TeX 3.0's \inputlineno to get the line number, for better error -% messages, but if we're using an old version of TeX, don't do anything. -% -\ifx\inputlineno\thisisundefined - \let\linenumber = \empty % Pre-3.0. -\else - \def\linenumber{\the\inputlineno:\space} -\fi - % Define \refx{NAME}{SUFFIX} to reference a cross-reference string named NAME. % If its value is nonempty, SUFFIX is output afterward. % @@ -6036,7 +6012,7 @@ width0pt\relax} \fi \indexnofonts \otherbackslash \expandafter\global\expandafter\let\expandafter\thisrefX - \csname X#1\endcsname + \csname XR#1\endcsname }% \ifx\thisrefX\relax % If not defined, say something at least. @@ -6058,11 +6034,44 @@ width0pt\relax} \fi #2% Output the suffix in any case. } -% This is the macro invoked by entries in the aux file. +% This is the macro invoked by entries in the aux file. Usually it's +% just a \def (we prepend XR to the control sequence name to avoid +% collisions). But if this is a float type, we have more work to do. % -\def\xrdef#1{\expandafter\gdef\csname X#1\endcsname} +\def\xrdef#1#2{% + \expandafter\gdef\csname XR#1\endcsname{#2}% remember this xref value. + % + % Was that xref control sequence that we just defined for a float? 
+ \expandafter\iffloat\csname XR#1\endcsname + % it was a float, and we have the (safe) float type in \iffloattype. + \expandafter\let\expandafter\floatlist + \csname floatlist\iffloattype\endcsname + % + % Is this the first time we've seen this float type? + \expandafter\ifx\floatlist\relax + \toks0 = {\do}% yes, so just \do + \else + % had it before, so preserve previous elements in list. + \toks0 = \expandafter{\floatlist\do}% + \fi + % + % Remember this xref in the control sequence \floatlistFLOATTYPE, + % for later use in \listoffloats. + \expandafter\xdef\csname floatlist\iffloattype\endcsname{\the\toks0{#1}}% + \fi +} % Read the last existing aux file, if any. No error if none exists. +% +\def\tryauxfile{% + \openin 1 \jobname.aux + \ifeof 1 \else + \readauxfile + \global\havexrefstrue + \fi + \closein 1 +} + \def\readauxfile{\begingroup \catcode`\^^@=\other \catcode`\^^A=\other @@ -6121,7 +6130,16 @@ width0pt\relax} \fi \catcode`\%=\other \catcode`+=\other % avoid \+ for paranoia even though we've turned it off % - % Make the characters 128-255 be printing characters + % This is to support \ in node names and titles, since the \ + % characters end up in a \csname. It's easier than + % leaving it active and making its active definition an actual \ + % character. What I don't understand is why it works in the *value* + % of the xrdef. Seems like it should be a catcode12 \, and that + % should not typeset properly. But it works, so I'm moving on for + % now. --karl, 15jan04. + \catcode`\\=\other + % + % Make the characters 128-255 be printing characters. {% \count 1=128 \def\loop{% @@ -6131,30 +6149,17 @@ width0pt\relax} \fi }% }% % - % Turn off \ as an escape so we do not lose on - % entries which were dumped with control sequences in their names. - % For example, @xrdef{$\leq $-fun}{page ...} made by @defun ^^ - % Reference to such entries still does not work the way one would wish, - % but at least they do not bomb out when the aux file is read in. 
- \catcode`\\=\other - % - % @ is our escape character in .aux files. + % @ is our escape character in .aux files, and we need braces. \catcode`\{=1 \catcode`\}=2 \catcode`\@=0 % - \openin 1 \jobname.aux - \ifeof 1 \else - \closein 1 - \input \jobname.aux - \global\havexrefstrue - \fi - % Open the new aux file. TeX will close it automatically at exit. - \openout\auxfile=\jobname.aux + \input \jobname.aux \endgroup} -% Footnotes. +\message{insertions,} +% including footnotes. \newcount \footnoteno @@ -6168,8 +6173,6 @@ width0pt\relax} \fi % @footnotestyle is meaningful for info output only. \let\footnotestyle=\comment -\let\ptexfootnote=\footnote - {\catcode `\@=11 % % Auto-number footnotes. Otherwise like plain. @@ -6193,17 +6196,12 @@ width0pt\relax} \fi % Don't bother with the trickery in plain.tex to not require the % footnote text as a parameter. Our footnotes don't need to be so general. % -% Oh yes, they do; otherwise, @ifset and anything else that uses -% \parseargline fail inside footnotes because the tokens are fixed when +% Oh yes, they do; otherwise, @ifset (and anything else that uses +% \parseargline) fails inside footnotes because the tokens are fixed when % the footnote is read. --karl, 16nov96. % -% The start of the footnote looks usually like this: -\gdef\startfootins{\insert\footins\bgroup} -% -% ... but this macro is redefined inside @multitable. -% \gdef\dofootnote{% - \startfootins + \insert\footins\bgroup % We want to typeset this text as a normal paragraph, even if the % footnote reference occurs in (for example) a display environment. % So reset some parameters. @@ -6239,40 +6237,66 @@ width0pt\relax} \fi } }%end \catcode `\@=11 -% @| inserts a changebar to the left of the current line. It should -% surround any changed text. This approach does *not* work if the -% change spans more than two lines of output. 
To handle that, we would -% have adopt a much more difficult approach (putting marks into the main -% vertical list for the beginning and end of each change). +% In case a @footnote appears in a vbox, save the footnote text and create +% the real \insert just after the vbox finished. Otherwise, the insertion +% would be lost. +% Similarily, if a @footnote appears inside an alignment, save the footnote +% text to a box and make the \insert when a row of the table is finished. +% And the same can be done for other insert classes. --kasal, 16nov03. + +% Replace the \insert primitive by a cheating macro. +% Deeper inside, just make sure that the saved insertions are not spilled +% out prematurely. % -\def\|{% - % \vadjust can only be used in horizontal mode. - \leavevmode - % - % Append this vertical mode material after the current line in the output. - \vadjust{% - % We want to insert a rule with the height and depth of the current - % leading; that is exactly what \strutbox is supposed to record. - \vskip-\baselineskip - % - % \vadjust-items are inserted at the left edge of the type. So - % the \llap here moves out into the left-hand margin. - \llap{% - % - % For a thicker or thinner bar, change the `1pt'. - \vrule height\baselineskip width1pt - % - % This is the space between the bar and the text. - \hskip 12pt - }% - }% +\def\startsavinginserts{% + \ifx \insert\ptexinsert + \let\insert\saveinsert + \else + \let\checkinserts\relax + \fi } -% For a final copy, take out the rectangles -% that mark overfull boxes (in case you have decided -% that the text looks ok even though it passes the margin). +% This \insert replacement works for both \insert\footins{foo} and +% \insert\footins\bgroup foo\egroup, but it doesn't work for \insert27{foo}. 
% -\def\finalout{\overfullrule=0pt} +\def\saveinsert#1{% + \edef\next{\noexpand\savetobox \makeSAVEname#1}% + \afterassignment\next + % swallow the left brace + \let\temp = +} +\def\makeSAVEname#1{\makecsname{SAVE\expandafter\gobble\string#1}} +\def\savetobox#1{\global\setbox#1 = \vbox\bgroup \unvbox#1} + +\def\checksaveins#1{\ifvoid#1\else \placesaveins#1\fi} + +\def\placesaveins#1{% + \ptexinsert \csname\expandafter\gobblesave\string#1\endcsname + {\box#1}% +} + +% eat @SAVE -- beware, all of them have catcode \other: +{ + \def\dospecials{\do S\do A\do V\do E} \uncatcodespecials % ;-) + \gdef\gobblesave @SAVE{} +} + +% initialization: +\def\newsaveins #1{% + \edef\next{\noexpand\newsaveinsX \makeSAVEname#1}% + \next +} +\def\newsaveinsX #1{% + \csname newbox\endcsname #1% + \expandafter\def\expandafter\checkinserts\expandafter{\checkinserts + \checksaveins #1}% +} + +% initialize: +\let\checkinserts\empty +\newsaveins\footins +\newsaveins\margin + % @image. We use the macros from epsf.tex to support this. % If epsf.tex is not installed and @image is used, we complain. @@ -6282,12 +6306,12 @@ width0pt\relax} \fi % undone and the next image would fail. \openin 1 = epsf.tex \ifeof 1 \else - \closein 1 % Do not bother showing banner with epsf.tex v2.7k (available in % doc/epsf.tex and on ctan). \def\epsfannounce{\toks0 = }% \input epsf.tex \fi +\closein 1 % % We will only complain once about lack of epsf.tex. \newif\ifwarnednoepsf @@ -6343,6 +6367,269 @@ width0pt\relax} \fi \endgroup} +% @float FLOATTYPE,LABEL,LOC ... @end float for displayed figures, tables, +% etc. We don't actually implement floating yet, we always include the +% float "here". But it seemed the best name for the future. +% +\envparseargdef\float{\eatcommaspace\eatcommaspace\dofloat#1, , ,\finish} + +% There may be a space before second and/or third parameter; delete it. 
+\def\eatcommaspace#1, {#1,} + +% #1 is the optional FLOATTYPE, the text label for this float, typically +% "Figure", "Table", "Example", etc. Can't contain commas. If omitted, +% this float will not be numbered and cannot be referred to. +% +% #2 is the optional xref label. Also must be present for the float to +% be referable. +% +% #3 is the optional positioning argument; for now, it is ignored. It +% will somehow specify the positions allowed to float to (here, top, bottom). +% +% We keep a separate counter for each FLOATTYPE, which we reset at each +% chapter-level command. +\let\resetallfloatnos=\empty +% +\def\dofloat#1,#2,#3,#4\finish{% + \let\thiscaption=\empty + \let\thisshortcaption=\empty + % + % don't lose footnotes inside @float. + % + % BEWARE: when the floats start float, we have to issue warning whenever an + % insert appears inside a float which could possibly float. --kasal, 26may04 + % + \startsavinginserts + % + % We can't be used inside a paragraph. + \par + % + \vtop\bgroup + \def\floattype{#1}% + \def\floatlabel{#2}% + \def\floatloc{#3}% we do nothing with this yet. + % + \ifx\floattype\empty + \let\safefloattype=\empty + \else + {% + % the floattype might have accents or other special characters, + % but we need to use it in a control sequence name. + \indexnofonts + \turnoffactive + \xdef\safefloattype{\floattype}% + }% + \fi + % + % If label is given but no type, we handle that as the empty type. + \ifx\floatlabel\empty \else + % We want each FLOATTYPE to be numbered separately (Figure 1, + % Table 1, Figure 2, ...). (And if no label, no number.) + % + \expandafter\getfloatno\csname\safefloattype floatno\endcsname + \global\advance\floatno by 1 + % + {% + % This magic value for \thissection is output by \setref as the + % XREFLABEL-title value. \xrefX uses it to distinguish float + % labels (which have a completely different output format) from + % node and anchor labels. And \xrdef uses it to construct the + % lists of floats. 
+ % + \edef\thissection{\floatmagic=\safefloattype}% + \setref{\floatlabel}{Yfloat}% + }% + \fi + % + % start with \parskip glue, I guess. + \vskip\parskip + % + % Don't suppress indentation if a float happens to start a section. + \restorefirstparagraphindent +} + +% we have these possibilities: +% @float Foo,lbl & @caption{Cap}: Foo 1.1: Cap +% @float Foo,lbl & no caption: Foo 1.1 +% @float Foo & @caption{Cap}: Foo: Cap +% @float Foo & no caption: Foo +% @float ,lbl & Caption{Cap}: 1.1: Cap +% @float ,lbl & no caption: 1.1 +% @float & @caption{Cap}: Cap +% @float & no caption: +% +\def\Efloat{% + \let\floatident = \empty + % + % In all cases, if we have a float type, it comes first. + \ifx\floattype\empty \else \def\floatident{\floattype}\fi + % + % If we have an xref label, the number comes next. + \ifx\floatlabel\empty \else + \ifx\floattype\empty \else % if also had float type, need tie first. + \appendtomacro\floatident{\tie}% + \fi + % the number. + \appendtomacro\floatident{\chaplevelprefix\the\floatno}% + \fi + % + % Start the printed caption with what we've constructed in + % \floatident, but keep it separate; we need \floatident again. + \let\captionline = \floatident + % + \ifx\thiscaption\empty \else + \ifx\floatident\empty \else + \appendtomacro\captionline{: }% had ident, so need a colon between + \fi + % + % caption text. + \appendtomacro\captionline{\scanexp\thiscaption}% + \fi + % + % If we have anything to print, print it, with space before. + % Eventually this needs to become an \insert. + \ifx\captionline\empty \else + \vskip.5\parskip + \captionline + % + % Space below caption. + \vskip\parskip + \fi + % + % If have an xref label, write the list of floats info. Do this + % after the caption, to avoid chance of it being a breakpoint. + \ifx\floatlabel\empty \else + % Write the text that goes in the lof to the aux file as + % \floatlabel-lof. 
Besides \floatident, we include the short + % caption if specified, else the full caption if specified, else nothing. + {% + \atdummies \turnoffactive \otherbackslash + % since we read the caption text in the macro world, where ^^M + % is turned into a normal character, we have to scan it back, so + % we don't write the literal three characters "^^M" into the aux file. + \scanexp{% + \xdef\noexpand\gtemp{% + \ifx\thisshortcaption\empty + \thiscaption + \else + \thisshortcaption + \fi + }% + }% + \immediate\write\auxfile{@xrdef{\floatlabel-lof}{\floatident + \ifx\gtemp\empty \else : \gtemp \fi}}% + }% + \fi + \egroup % end of \vtop + % + % place the captured inserts + % + % BEWARE: when the floats start float, we have to issue warning whenever an + % insert appears inside a float which could possibly float. --kasal, 26may04 + % + \checkinserts +} + +% Append the tokens #2 to the definition of macro #1, not expanding either. +% +\def\appendtomacro#1#2{% + \expandafter\def\expandafter#1\expandafter{#1#2}% +} + +% @caption, @shortcaption +% +\def\caption{\docaption\thiscaption} +\def\shortcaption{\docaption\thisshortcaption} +\def\docaption{\checkenv\float \bgroup\scanargctxt\defcaption} +\def\defcaption#1#2{\egroup \def#1{#2}} + +% The parameter is the control sequence identifying the counter we are +% going to use. Create it if it doesn't exist and assign it to \floatno. +\def\getfloatno#1{% + \ifx#1\relax + % Haven't seen this figure type before. + \csname newcount\endcsname #1% + % + % Remember to reset this floatno at the next chap. + \expandafter\gdef\expandafter\resetallfloatnos + \expandafter{\resetallfloatnos #1=0 }% + \fi + \let\floatno#1% +} + +% \setref calls this to get the XREFLABEL-snt value. We want an @xref +% to the FLOATLABEL to expand to "Figure 3.1". We call \setref when we +% first read the @float command. 
+% +\def\Yfloat{\floattype@tie \chaplevelprefix\the\floatno}% + +% Magic string used for the XREFLABEL-title value, so \xrefX can +% distinguish floats from other xref types. +\def\floatmagic{!!float!!} + +% #1 is the control sequence we are passed; we expand into a conditional +% which is true if #1 represents a float ref. That is, the magic +% \thissection value which we \setref above. +% +\def\iffloat#1{\expandafter\doiffloat#1==\finish} +% +% #1 is (maybe) the \floatmagic string. If so, #2 will be the +% (safe) float type for this float. We set \iffloattype to #2. +% +\def\doiffloat#1=#2=#3\finish{% + \def\temp{#1}% + \def\iffloattype{#2}% + \ifx\temp\floatmagic +} + +% @listoffloats FLOATTYPE - print a list of floats like a table of contents. +% +\parseargdef\listoffloats{% + \def\floattype{#1}% floattype + {% + % the floattype might have accents or other special characters, + % but we need to use it in a control sequence name. + \indexnofonts + \turnoffactive + \xdef\safefloattype{\floattype}% + }% + % + % \xrdef saves the floats as a \do-list in \floatlistSAFEFLOATTYPE. + \expandafter\ifx\csname floatlist\safefloattype\endcsname \relax + \ifhavexrefs + % if the user said @listoffloats foo but never @float foo. + \message{\linenumber No `\safefloattype' floats to list.}% + \fi + \else + \begingroup + \leftskip=\tocindent % indent these entries like a toc + \let\do=\listoffloatsdo + \csname floatlist\safefloattype\endcsname + \endgroup + \fi +} + +% This is called on each entry in a list of floats. We're passed the +% xref label, in the form LABEL-title, which is how we save it in the +% aux file. We strip off the -title and look up \XRLABEL-lof, which +% has the text we're supposed to typeset here. +% +% Figures without xref labels will not be included in the list (since +% they won't appear in the aux file). 
+% +\def\listoffloatsdo#1{\listoffloatsdoentry#1\finish} +\def\listoffloatsdoentry#1-title\finish{{% + % Can't fully expand XR#1-lof because it can contain anything. Just + % pass the control sequence. On the other hand, XR#1-pg is just the + % page number, and we want to fully expand that so we can get a link + % in pdf output. + \toksA = \expandafter{\csname XR#1-lof\endcsname}% + % + % use the same \entry macro we use to generate the TOC and index. + \edef\writeentry{\noexpand\entry{\the\toksA}{\csname XR#1-pg\endcsname}}% + \writeentry +}} + \message{localization,} % and i18n. @@ -6351,19 +6638,17 @@ width0pt\relax} \fi % properly. Single argument is the language abbreviation. % It would be nice if we could set up a hyphenation file here. % -\def\documentlanguage{\parsearg\dodocumentlanguage} -\def\dodocumentlanguage#1{% +\parseargdef\documentlanguage{% \tex % read txi-??.tex file in plain TeX. - % Read the file if it exists. - \openin 1 txi-#1.tex - \ifeof1 - \errhelp = \nolanghelp - \errmessage{Cannot read language file txi-#1.tex}% - \let\temp = \relax - \else - \def\temp{\input txi-#1.tex }% - \fi - \temp + % Read the file if it exists. + \openin 1 txi-#1.tex + \ifeof 1 + \errhelp = \nolanghelp + \errmessage{Cannot read language file txi-#1.tex}% + \else + \input txi-#1.tex + \fi + \closein 1 \endgroup } \newhelp\nolanghelp{The given language definition file cannot be found or @@ -6546,8 +6831,7 @@ should work if nowhere else does.} % Perhaps we should allow setting the margins, \topskip, \parskip, % and/or leading, also. Or perhaps we should compute them somehow. 
% -\def\pagesizes{\parsearg\pagesizesxxx} -\def\pagesizesxxx#1{\pagesizesyyy #1,,\finish} +\parseargdef\pagesizes{\pagesizesyyy #1,,\finish} \def\pagesizesyyy#1,#2,#3\finish{{% \setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \hsize=#2\relax \fi \globaldefs = 1 @@ -6594,8 +6878,8 @@ should work if nowhere else does.} \def\normalplus{+} \def\normaldollar{$}%$ font-lock fix -% This macro is used to make a character print one way in ttfont -% where it can probably just be output, and another way in other fonts, +% This macro is used to make a character print one way in \tt +% (where it can probably be output as-is), and another way in other fonts, % where something hairier probably needs to be done. % % #1 is what to print if we are indeed using \tt; #2 is what to print @@ -6643,13 +6927,6 @@ should work if nowhere else does.} \catcode`\$=\active \def${\ifusingit{{\sl\$}}\normaldollar}%$ font-lock fix -% Set up an active definition for =, but don't enable it most of the time. -{\catcode`\==\active -\global\def={{\tt \char 61}}} - -\catcode`+=\active -\catcode`\_=\active - % If a .fmt file is being used, characters that might appear in a file % name cannot be active until we have parsed the command line. % So turn them off again, and have \everyjob (or @setfilename) turn them on. @@ -6658,15 +6935,16 @@ should work if nowhere else does.} \catcode`\@=0 -% \rawbackslashxx outputs one backslash character in current font, +% \backslashcurfont outputs one backslash character in current font, % as in \char`\\. -\global\chardef\rawbackslashxx=`\\ +\global\chardef\backslashcurfont=`\\ +\global\let\rawbackslashxx=\backslashcurfont % let existing .??s files work -% \rawbackslash defines an active \ to do \rawbackslashxx. +% \rawbackslash defines an active \ to do \backslashcurfont. % \otherbackslash defines an active \ to be a literal `\' character with % catcode other. 
{\catcode`\\=\active - @gdef@rawbackslash{@let\=@rawbackslashxx} + @gdef@rawbackslash{@let\=@backslashcurfont} @gdef@otherbackslash{@let\=@realbackslash} } @@ -6674,7 +6952,7 @@ should work if nowhere else does.} {\catcode`\\=\other @gdef@realbackslash{\}} % \normalbackslash outputs one backslash in fixed width font. -\def\normalbackslash{{\tt\rawbackslashxx}} +\def\normalbackslash{{\tt\backslashcurfont}} \catcode`\\=\active @@ -6691,6 +6969,7 @@ should work if nowhere else does.} @let>=@normalgreater @let+=@normalplus @let$=@normaldollar %$ font-lock fix + @unsepspaces } % Same as @turnoffactive except outputs \ as {\tt\char`\\} instead of @@ -6730,10 +7009,6 @@ should work if nowhere else does.} @catcode`@# = @other @catcode`@% = @other -@c Set initial fonts. -@textfonts -@rm - @c Local variables: @c eval: (add-hook 'write-file-hooks 'time-stamp) @@ -6742,3 +7017,9 @@ should work if nowhere else does.} @c time-stamp-format: "%:y-%02m-%02d.%02H" @c time-stamp-end: "}" @c End: + +@c vim:sw=2: + +@ignore + arch-tag: e1b36e32-c96e-4135-a41a-0b2efa2ea115 +@end ignore diff --git a/doc/transformation.texi b/doc/transformation.texi new file mode 100644 index 00000000..6973c519 --- /dev/null +++ b/doc/transformation.texi @@ -0,0 +1,487 @@ +@node Data Manipulation, Data Selection, Variable Attributes, Top +@chapter Data transformations +@cindex transformations + +The PSPP procedures examined in this chapter manipulate data and +prepare the active file for later analyses. They do not produce output, +as a rule. + +@menu +* AGGREGATE:: Summarize multiple cases into a single case. +* AUTORECODE:: Automatic recoding of variables. +* COMPUTE:: Assigning a variable a calculated value. +* COUNT:: Counting variables with particular values. +* FLIP:: Exchange variables with cases. +* IF:: Conditionally assigning a calculated value. +* RECODE:: Mapping values from one set to another. +* SORT CASES:: Sort the active file. 
+@end menu + +@node AGGREGATE, AUTORECODE, Data Manipulation, Data Manipulation +@section AGGREGATE +@vindex AGGREGATE + +@display +AGGREGATE + /BREAK=var_list + /PRESORTED + /OUTFILE=@{*,'filename'@} + /DOCUMENT + /MISSING=COLUMNWISE + /dest_vars=agr_func(src_vars, args@dots{})@dots{} +@end display + +@cmd{AGGREGATE} summarizes groups of cases into single cases. +Cases are divided into groups that have the same values for one or more +variables called @dfn{break variables}. Several functions are available +for summarizing case contents. + +At least one break variable must be specified on BREAK, the only +required subcommand. The values of these variables are used to divide +the active file into groups to be summarized. In addition, at least +one @var{dest_var} must be specified. + +By default, the active file is sorted based on the break variables +before aggregation takes place. If the active file is already sorted +or otherwise grouped in terms of the break variables, specify +PRESORTED to save time. + +The OUTFILE subcommand specifies a system file by file name string or +file handle (@pxref{FILE HANDLE}). The aggregated cases are written to +this file. If OUTFILE is not specified, or if @samp{*} is specified, +then the aggregated cases replace the active file. + +Specify DOCUMENT to copy the documents from the active file into the +aggregate file (@pxref{DOCUMENT}). Otherwise, the aggregate file will +not contain any documents, even if the aggregate file replaces the +active file. + +One or more sets of aggregation variables must be specified. Each set +comprises a list of aggregation variables, an equals sign (@samp{=}), +the name of an aggregation function (see the list below), and a list +of source variables in parentheses. Some aggregation functions expect +additional arguments following the source variable names. + +Each set must have exactly as many source variables as aggregation +variables. 
Each aggregation variable receives the results of applying +the specified aggregation function to the corresponding source +variable. Most aggregation functions may be applied to numeric and +short and long string variables. Others, marked below, are restricted +to numeric values. + +The available aggregation functions are as follows: + +@table @asis +@item SUM(var_name) +Sum. Limited to numeric values. +@item MEAN(var_name) +Arithmetic mean. Limited to numeric values. +@item SD(var_name) +Standard deviation of the mean. Limited to numeric values. +@item MAX(var_name) +Maximum value. +@item MIN(var_name) +Minimum value. +@item FGT(var_name, value) +@itemx PGT(var_name, value) +Fraction between 0 and 1, or percentage between 0 and 100, respectively, +of values greater than the specified constant. +@item FLT(var_name, value) +@itemx PLT(var_name, value) +Fraction or percentage, respectively, of values less than the specified +constant. +@item FIN(var_name, low, high) +@itemx PIN(var_name, low, high) +Fraction or percentage, respectively, of values within the specified +inclusive range of constants. +@item FOUT(var_name, low, high) +@itemx POUT(var_name, low, high) +Fraction or percentage, respectively, of values strictly outside the +specified range of constants. +@item N(var_name) +Number of non-missing values. +@item N +Number of cases aggregated to form this group. Don't supply a source +variable for this aggregation function. +@item NU(var_name) +Number of non-missing values. Each case is considered to have a weight +of 1, regardless of the current weighting variable (@pxref{WEIGHT}). +@item NU +Number of cases aggregated to form this group. Each case is considered +to have a weight of 1, regardless of the current weighting variable. +@item NMISS(var_name) +Number of missing values. +@item NUMISS(var_name) +Number of missing values. Each case is considered to have a weight of +1, regardless of the current weighting variable. 
+@item FIRST(var_name) +First value in this group. +@item LAST(var_name) +Last value in this group. +@end table + +Aggregation functions compare string values in terms of internal +character codes. On most modern computers, this is a form of ASCII. + +The aggregation functions listed above exclude all user-missing values +from calculations. To include user-missing values, insert a period +(@samp{.}) between the function name and left parenthesis +(e.g.@: @samp{SUM.}). + +Normally, only a single case (for SD and SD., two cases) need be +non-missing in each group for the aggregate variable to be +non-missing. Specifying /MISSING=COLUMNWISE inverts this behavior, so +that the aggregate variable becomes missing if any aggregated value is +missing. + +@cmd{AGGREGATE} both ignores and cancels the current @cmd{SPLIT FILE} +settings (@pxref{SPLIT FILE}). + +@node AUTORECODE, COMPUTE, AGGREGATE, Data Manipulation +@section AUTORECODE +@vindex AUTORECODE + +@display +AUTORECODE VARIABLES=src_vars INTO dest_vars + /DESCENDING + /PRINT +@end display + +The @cmd{AUTORECODE} procedure considers the @var{n} values that a variable +takes on and maps them onto values 1@dots{}@var{n} on a new numeric +variable. + +Subcommand VARIABLES is the only required subcommand and must come +first. Specify VARIABLES, an equals sign (@samp{=}), a list of source +variables, INTO, and a list of target variables. There must be the same +number of source and target variables. The target variables must not +already exist. + +By default, increasing values of a source variable (for a string, this +is based on character code comparisons) are recoded to increasing values +of its target variable. To cause increasing values of a source variable +to be recoded to decreasing values of its target variable (@var{n} down +to 1), specify DESCENDING. + +PRINT is currently ignored. + +@cmd{AUTORECODE} is a procedure. It causes the data to be read.
+ +@node COMPUTE, COUNT, AUTORECODE, Data Manipulation +@section COMPUTE +@vindex COMPUTE + +@display +COMPUTE variable = expression. + or +COMPUTE vector(index) = expression. +@end display + +@cmd{COMPUTE} assigns the value of an expression to a target +variable. For each case, the expression is evaluated and its value +assigned to the target variable. Numeric and short and long string +variables may be assigned. When a string expression's width differs +from the target variable's width, the string result of the expression +is truncated or padded with spaces on the right as necessary. The +expression and variable types must match. + +For numeric variables only, the target variable need not already +exist. Numeric variables created by @cmd{COMPUTE} are assigned an +@code{F8.2} output format. String variables must be declared before +they can be used as targets for @cmd{COMPUTE}. + +The target variable may be specified as an element of a vector +(@pxref{VECTOR}). In this case, a vector index expression must be +specified in parentheses following the vector name. The index +expression must evaluate to a numeric value that, after rounding down +to the nearest integer, is a valid index for the named vector. + +Using @cmd{COMPUTE} to assign to a variable specified on @cmd{LEAVE} +(@pxref{LEAVE}) resets the variable's left state. Therefore, +@code{LEAVE} should be specified following @cmd{COMPUTE}, not before. + +@cmd{COMPUTE} is a transformation. It does not cause the active file to be +read. + +When @cmd{COMPUTE} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + +@node COUNT, FLIP, COMPUTE, Data Manipulation +@section COUNT +@vindex COUNT + +@display +COUNT var_name = var@dots{} (value@dots{}). + +Each value takes one of the following forms: + number + string + num1 THRU num2 + MISSING + SYSMIS +In addition, num1 and num2 can be LO or LOWEST, or HI or HIGHEST, +respectively. 
+@end display + +@cmd{COUNT} creates or replaces a numeric @dfn{target} variable that +counts the occurrence of a @dfn{criterion} value or set of values over +one or more @dfn{test} variables for each case. + +The target variable values are always nonnegative integers. They are +never missing. The target variable is assigned an F8.2 output format. +@xref{Input/Output Formats}. Any variables, including long and short +string variables, may be test variables. + +User-missing values of test variables are treated just like any other +values. They are @strong{not} treated as system-missing values. +User-missing values that are criterion values or inside ranges of +criterion values are counted as any other values. However (for numeric +variables), keyword MISSING may be used to refer to all system- +and user-missing values. + +@cmd{COUNT} target variables are assigned values in the order +specified. In the command @code{COUNT A=A B(1) /B=A B(2).}, the +following actions occur: + +@itemize @minus +@item +The number of occurrences of 1 between @code{A} and @code{B} is counted. + +@item +@code{A} is assigned this value. + +@item +The number of occurrences of 2 between @code{B} and the @strong{new} +value of @code{A} is counted. + +@item +@code{B} is assigned this value. +@end itemize + +Despite this ordering, all @cmd{COUNT} criterion variables must exist +before the procedure is executed---they may not be created as target +variables earlier in the command! Break such a command into two +separate commands. + +The examples below may help to clarify. + +@enumerate A +@item +Assuming @code{Q0}, @code{Q2}, @dots{}, @code{Q9} are numeric variables, +the following commands: + +@enumerate +@item +Count the number of times the value 1 occurs through these variables +for each case and assign the count to variable @code{QCOUNT}. + +@item +Print out the total number of times the value 1 occurs throughout +@emph{all} cases using @cmd{DESCRIPTIVES}. @xref{DESCRIPTIVES}, for +details.
+@end enumerate + +@example +COUNT QCOUNT=Q0 TO Q9(1). +DESCRIPTIVES QCOUNT /STATISTICS=SUM. +@end example + +@item +Given these same variables, the following commands: + +@enumerate +@item +Count the number of valid values of these variables for each case and +assign the count to variable @code{QVALID}. + +@item +Multiply each value of @code{QVALID} by 10 to obtain a percentage of +valid values, using @cmd{COMPUTE}. @xref{COMPUTE}, for details. + +@item +Print out the percentage of valid values across all cases, using +@cmd{DESCRIPTIVES}. @xref{DESCRIPTIVES}, for details. +@end enumerate + +@example +COUNT QVALID=Q0 TO Q9 (LO THRU HI). +COMPUTE QVALID=QVALID*10. +DESCRIPTIVES QVALID /STATISTICS=MEAN. +@end example +@end enumerate + +@node FLIP, IF, COUNT, Data Manipulation +@section FLIP +@vindex FLIP + +@display +FLIP /VARIABLES=var_list /NEWNAMES=var_name. +@end display + +@cmd{FLIP} transposes rows and columns in the active file. It +causes cases to be swapped with variables, and vice versa. + +All variables in the transposed active file are numeric. String +variables take on the system-missing value in the transposed file. + +No subcommands are required. The VARIABLES subcommand specifies +variables that will be transformed into cases. Variables not specified +are discarded. By default, all variables are selected for +transposition. + +The variable specified by NEWNAMES, which must be a string variable, is +used to give names to the variables created by @cmd{FLIP}. If +NEWNAMES is not +specified then the default is a variable named CASE_LBL, if it exists. +If it does not then the variables created by FLIP are named VAR000 +through VAR999, then VAR1000, VAR1001, and so on. + +When a NEWNAMES variable is available, the names must be canonicalized +before becoming variable names. Invalid characters are replaced by +letter @samp{V} in the first position, or by @samp{_} in subsequent +positions.
If the name thus generated is not unique, then numeric +extensions are added, starting with 1, until a unique name is found or +there are no remaining possibilities. If the latter occurs then the +FLIP operation aborts. + +The resultant dictionary contains a CASE_LBL variable, which stores the +names of the variables in the dictionary before the transposition. If +the active file is subsequently transposed using @cmd{FLIP}, this +variable can +be used to recreate the original variable names. + +FLIP honors N OF CASES. It ignores TEMPORARY, so that ``temporary'' +transformations become permanent. + +@node IF, RECODE, FLIP, Data Manipulation +@section IF +@vindex IF + +@display +IF condition variable=expression. + or +IF condition vector(index)=expression. +@end display + +The @cmd{IF} transformation conditionally assigns the value of a target +expression to a target variable, based on the truth of a test +expression. + +Specify a boolean-valued expression (@pxref{Expressions}) to be tested +following the IF keyword. This expression is evaluated for each case. +If the value is true, then the value of the expression is computed and +assigned to the specified variable. If the value is false or missing, +nothing is done. Numeric and short and long string variables may be +assigned. When a string expression's width differs from the target +variable's width, the string result of the expression is truncated or +padded with spaces on the right as necessary. The expression and +variable types must match. + +The target variable may be specified as an element of a vector +(@pxref{VECTOR}). In this case, a vector index expression must be +specified in parentheses following the vector name. The index +expression must evaluate to a numeric value that, after rounding down +to the nearest integer, is a valid index for the named vector. + +Using @cmd{IF} to assign to a variable specified on @cmd{LEAVE} +(@pxref{LEAVE}) resets the variable's left state. 
Therefore, +@code{LEAVE} should be specified following @cmd{IF}, not before. + +When @cmd{IF} is specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}), the @cmd{LAG} function may not be used +(@pxref{LAG}). + +@node RECODE, SORT CASES, IF, Data Manipulation +@section RECODE +@vindex RECODE + +@display +RECODE var_list (src_value@dots{}=dest_value)@dots{} [INTO var_list]. + +src_value may take the following forms: + number + string + num1 THRU num2 + MISSING + SYSMIS + ELSE +Open-ended ranges may be specified using LO or LOWEST for num1 +or HI or HIGHEST for num2. + +dest_value may take the following forms: + num + string + SYSMIS + COPY +@end display + +@cmd{RECODE} translates data from one range of values to +another, via flexible user-specified mappings. Data may be remapped +in-place or copied to new variables. Numeric, short string, and long +string data can be recoded. + +Specify the list of source variables, followed by one or more mapping +specifications each enclosed in parentheses. If the data is to be +copied to new variables, specify INTO, then the list of target +variables. String target variables must already have been declared +using @cmd{STRING} or another transformation, but numeric target +variables can +be created on the fly. There must be exactly as many target variables +as source variables. Each source variable is remapped into its +corresponding target variable. + +When INTO is not used, the input and output variables must be of the +same type. Otherwise, string values can be recoded into numeric values, +and vice versa. When this is done and there is no mapping for a +particular value, either a value consisting of all spaces or the +system-missing value is assigned, depending on variable type. + +Mappings are considered from left to right. The first src_value that +matches the value of the source variable causes the target variable to +receive the value indicated by the dest_value. 
Literal number, string, +and range src_value's should be self-explanatory. MISSING as a +src_value matches any user- or system-missing value. SYSMIS matches the +system-missing value only. ELSE is a catch-all that matches anything. +It should be the last src_value specified. + +Numeric and string dest_value's should also be self-explanatory. COPY +causes the input values to be copied to the output. This is only valid +if the source and target variables are of the same type. SYSMIS +indicates the system-missing value. + +If the source variables are strings and the target variables are +numeric, then there is one additional mapping available: (CONVERT), +which must be the last specified mapping. CONVERT causes a number +specified as a string to be converted to a numeric value. If the string +cannot be parsed as a number, then the system-missing value is assigned. + +Multiple recodings can be specified on a single @cmd{RECODE} invocation. +Introduce additional recodings with a slash (@samp{/}) to +separate them from the previous recodings. + +@node SORT CASES, , RECODE, Data Manipulation +@section SORT CASES +@vindex SORT CASES + +@display +SORT CASES BY var_list. +@end display + +@cmd{SORT CASES} sorts the active file by the values of one or more +variables. + +Specify BY and a list of variables to sort by. By default, variables +are sorted in ascending order. To override sort order, specify (D) or +(DOWN) after a list of variables to get descending order, or (A) or (UP) +for ascending order. These apply to the entire list of variables +preceding them. + +@cmd{SORT CASES} is a procedure. It causes the data to be read. + +@cmd{SORT CASES} attempts to sort the entire active file in main memory. +If main memory is exhausted, it falls back to a merge sort algorithm that +involves writing and reading numerous temporary files. + +@cmd{SORT CASES} may not be specified following TEMPORARY.
+@setfilename ignored diff --git a/doc/utilities.texi b/doc/utilities.texi new file mode 100644 index 00000000..db7daa36 --- /dev/null +++ b/doc/utilities.texi @@ -0,0 +1,581 @@ +@node Utilities, Not Implemented, Statistics, Top +@chapter Utilities + +Commands that don't fit any other category are placed here. + +Most of these commands are not affected by commands like @cmd{IF} and +@cmd{LOOP}: +they take effect only once, unconditionally, at the time that they are +encountered in the input. + +@menu +* COMMENT:: Document your syntax file. +* DOCUMENT:: Document the active file. +* DISPLAY DOCUMENTS:: Display active file documents. +* DISPLAY FILE LABEL:: Display the active file label. +* DROP DOCUMENTS:: Remove documents from the active file. +* ERASE:: Erase a file. +* EXECUTE:: Execute pending transformations. +* FILE LABEL:: Set the active file's label. +* FINISH:: Terminate the PSPP session. +* HOST:: Temporarily return to the operating system. +* INCLUDE:: Include a file within the current one. +* QUIT:: Terminate the PSPP session. +* SET:: Adjust PSPP runtime parameters. +* SHOW:: Display runtime parameters. +* SUBTITLE:: Provide a document subtitle. +* TITLE:: Provide a document title. +@end menu + +@node COMMENT, DOCUMENT, Utilities, Utilities +@section COMMENT +@vindex COMMENT +@vindex * + +@display +Two possible syntaxes: + COMMENT comment text @dots{} . + *comment text @dots{} . +@end display + +@cmd{COMMENT} is ignored. It is used to provide information to +the author and other readers of the PSPP syntax file. + +@cmd{COMMENT} can extend over any number of lines. Don't forget to +terminate it with a dot or a blank line. + +@node DOCUMENT, DISPLAY DOCUMENTS, COMMENT, Utilities +@section DOCUMENT +@vindex DOCUMENT + +@display +DOCUMENT documentary_text. +@end display + +@cmd{DOCUMENT} adds one or more lines of descriptive commentary to the +active file. Documents added in this way are saved to system files.
+They can be viewed using @cmd{SYSFILE INFO} or @cmd{DISPLAY +DOCUMENTS}. They can be removed from the active file with @cmd{DROP +DOCUMENTS}. + +Specify the documentary text following the DOCUMENT keyword. You can +extend the documentary text over as many lines as necessary. Lines are +truncated at 80 characters width. Don't forget to terminate +the command with a dot or a blank line. + +@node DISPLAY DOCUMENTS, DISPLAY FILE LABEL, DOCUMENT, Utilities +@section DISPLAY DOCUMENTS +@vindex DISPLAY DOCUMENTS + +@display +DISPLAY DOCUMENTS. +@end display + +@cmd{DISPLAY DOCUMENTS} displays the documents in the active file. Each +document is preceded by a line giving the time and date that it was +added. @xref{DOCUMENT}. + +@node DISPLAY FILE LABEL, DROP DOCUMENTS, DISPLAY DOCUMENTS, Utilities +@section DISPLAY FILE LABEL +@vindex DISPLAY FILE LABEL + +@display +DISPLAY FILE LABEL. +@end display + +@cmd{DISPLAY FILE LABEL} displays the file label contained in the +active file, +if any. @xref{FILE LABEL}. + +@node DROP DOCUMENTS, ERASE, DISPLAY FILE LABEL, Utilities +@section DROP DOCUMENTS +@vindex DROP DOCUMENTS + +@display +DROP DOCUMENTS. +@end display + +@cmd{DROP DOCUMENTS} removes all documents from the active file. +New documents can be added with @cmd{DOCUMENT} (@pxref{DOCUMENT}). + +@cmd{DROP DOCUMENTS} changes only the active file. It does not modify any +system files stored on disk. + + +@node ERASE, EXECUTE, DROP DOCUMENTS, Utilities +@comment node-name, next, previous, up +@section ERASE +@vindex ERASE + +@display +ERASE FILE file_name. +@end display + +@cmd{ERASE FILE} deletes a file from the local filesystem. +file_name must be quoted. +This command cannot be used if the SAFER setting is active. + + +@node EXECUTE, FILE LABEL, ERASE, Utilities +@section EXECUTE +@vindex EXECUTE + +@display +EXECUTE. +@end display + +@cmd{EXECUTE} causes the active file to be read and all pending +transformations to be executed. 
+ +@node FILE LABEL, FINISH, EXECUTE, Utilities +@section FILE LABEL +@vindex FILE LABEL + +@display +FILE LABEL file_label. +@end display + +@cmd{FILE LABEL} provides a title for the active file. This +title will be saved into system files and portable files that are +created during this PSPP run. + +file_label need not be quoted. If quotes are +included, they become part of the file label. + +@node FINISH, HOST, FILE LABEL, Utilities +@section FINISH +@vindex FINISH + +@display +FINISH. +@end display + +@cmd{FINISH} terminates the current PSPP session and returns +control to the operating system. + +This command is not valid in interactive mode. + +@node HOST, INCLUDE, FINISH, Utilities +@comment node-name, next, previous, up +@section HOST +@vindex HOST + +@display +HOST. +@end display + +@cmd{HOST} suspends the current PSPP session and temporarily returns control +to the operating system. +This command cannot be used if the SAFER setting is active. + + +@node INCLUDE, QUIT, HOST, Utilities +@section INCLUDE +@vindex INCLUDE +@vindex @@ + +@display +Two possible syntaxes: + INCLUDE 'filename'. + @@filename. +@end display + +@cmd{INCLUDE} causes the PSPP command processor to read an +additional command file as if it were included bodily in the current +command file. + +Include files may be nested to any depth, up to the limit of available +memory. + +@node QUIT, SET, INCLUDE, Utilities +@section QUIT +@vindex QUIT + +@display +Two possible syntaxes: + QUIT. + EXIT. +@end display + +@cmd{QUIT} terminates the current PSPP session and returns control +to the operating system. + +This command is not valid within a command file. + +@node SET, SHOW, QUIT, Utilities +@section SET +@vindex SET + +@display +SET + +(data input) + /BLANKS=@{SYSMIS,'.',number@} + /DECIMAL=@{DOT,COMMA@} + /FORMAT=fmt_spec + +(program input) + /ENDCMD='.' 
+ /NULLINE=@{ON,OFF@} + +(interaction) + /CPROMPT='cprompt_string' + /DPROMPT='dprompt_string' + /ERRORBREAK=@{OFF,ON@} + /MXERRS=max_errs + /MXWARNS=max_warnings + /PROMPT='prompt' + /VIEWLENGTH=@{MINIMUM,MEDIAN,MAXIMUM,n_lines@} + /VIEWWIDTH=n_characters + +(program execution) + /MEXPAND=@{ON,OFF@} + /MITERATE=max_iterations + /MNEST=max_nest + /MPRINT=@{ON,OFF@} + /MXLOOPS=max_loops + /SEED=@{RANDOM,seed_value@} + /UNDEFINED=@{WARN,NOWARN@} + +(data output) + /CC@{A,B,C,D,E@}=@{'npre,pre,suf,nsuf','npre.pre.suf.nsuf'@} + /DECIMAL=@{DOT,COMMA@} + /FORMAT=fmt_spec + +(output routing) + /ECHO=@{ON,OFF@} + /ERRORS=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@} + /INCLUDE=@{ON,OFF@} + /MESSAGES=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@} + /PRINTBACK=@{ON,OFF@} + /RESULTS=@{ON,OFF,TERMINAL,LISTING,BOTH,NONE@} + +(output activation) + /LISTING=@{ON,OFF@} + /PRINTER=@{ON,OFF@} + /SCREEN=@{ON,OFF@} + +(output driver options) + /HEADERS=@{NO,YES,BLANK@} + /LENGTH=@{NONE,length_in_lines@} + /LISTING=filename + /MORE=@{ON,OFF@} + /PAGER=@{OFF,"pager_name"@} + /WIDTH=@{NARROW,WIDTH,n_characters@} + +(logging) + /JOURNAL=@{ON,OFF@} [filename] + /LOG=@{ON,OFF@} [filename] + +(system files) + /COMPRESSION=@{ON,OFF@} + /SCOMPRESSION=@{ON,OFF@} + +(security) + /SAFER=ON + +(obsolete settings accepted for compatibility, but ignored) + /AUTOMENU=@{ON,OFF@} + /BEEP=@{ON,OFF@} + /BLOCK='c' + /BOXSTRING=@{'xxx','xxxxxxxxxxx'@} + /CASE=@{UPPER,UPLOW@} + /COLOR=@dots{} + /CPI=cpi_value + /DISK=@{ON,OFF@} + /EJECT=@{ON,OFF@} + /HELPWINDOWS=@{ON,OFF@} + /HIGHRES=@{ON,OFF@} + /HISTOGRAM='c' + /LOWRES=@{AUTO,ON,OFF@} + /LPI=lpi_value + /MENUS=@{STANDARD,EXTENDED@} + /MXMEMORY=max_memory + /PTRANSLATE=@{ON,OFF@} + /RCOLORS=@dots{} + /RUNREVIEW=@{AUTO,MANUAL@} + /SCRIPTTAB='c' + /TB1=@{'xxx','xxxxxxxxxxx'@} + /TBFONTS='string' + /WORKDEV=drive_letter + /WORKSPACE=workspace_size + /XSORT=@{YES,NO@} +@end display + +@cmd{SET} allows the user to adjust several parameters relating to +PSPP's execution. 
Since there are many subcommands to this command, its +subcommands will be examined in groups. + +On subcommands that take boolean values, ON and YES are synonyms, +as are OFF and NO. + +The data input subcommands affect the way that data is read from data +files. The data input subcommands are + +@table @asis +@item BLANKS +This is the value assigned to a data item that is empty or +contains only whitespace. An argument of SYSMIS or '.' will cause the +system-missing value to be assigned to null items. This is the +default. Any real value may be assigned. + +@item DECIMAL +The default DOT setting causes the decimal point character to be +@samp{.}. A setting of COMMA causes the decimal point character to be +@samp{,}. + +@item FORMAT +Allows the default numeric input/output format to be specified. The +default is F8.2. @xref{Input/Output Formats}. +@end table + +Program input subcommands affect the way that programs are parsed when +they are typed interactively or run from a script. They are + +@table @asis +@item ENDCMD +This is a single character indicating the end of a command. The default +is @samp{.}. Don't change this. + +@item NULLINE +Whether a blank line is interpreted as ending the current command. The +default is ON. +@end table + +Interaction subcommands affect the way that PSPP interacts with an +online user. The interaction subcommands are + +@table @asis +@item CPROMPT +The command continuation prompt. The default is @samp{ > }. + +@item DPROMPT +Prompt used when expecting data input within @cmd{BEGIN DATA} (@pxref{BEGIN +DATA}). The default is @samp{data> }. + +@item ERRORBREAK +Whether an error causes PSPP to stop processing the current command +file after finishing the current command. The default is OFF. + +@item MXERRS +The maximum number of errors before PSPP halts processing of the current +command file. The default is 50. 
+ +@item MXWARNS +The maximum number of warnings + errors before PSPP halts processing the +current command file. The default is 100. + +@item PROMPT +The command prompt. The default is @samp{PSPP> }. + +@item VIEWLENGTH +The length of the screen in lines. MINIMUM means 25 lines, MEDIAN and +MAXIMUM mean 43 lines. Otherwise specify the number of lines. Normally +PSPP should auto-detect your screen size so this shouldn't have to be +used. + +@item VIEWWIDTH +The width of the screen in characters. Normally 80 or 132. +@end table + +Program execution subcommands control the way that PSPP commands +execute. The program execution subcommands are + +@table @asis +@item MEXPAND +@itemx MITERATE +@itemx MNEST +@itemx MPRINT +Currently not used. + +@item MXLOOPS +The maximum number of iterations for an uncontrolled loop (@pxref{LOOP}). + +@item SEED +The initial pseudo-random number seed. Set to a real number or to +RANDOM, which will obtain an initial seed from the current time of day. + +@item UNDEFINED +Currently not used. +@end table + +Data output subcommands affect the format of output data. These +subcommands are + +@table @asis +@item CCA +@itemx CCB +@itemx CCC +@itemx CCD +@itemx CCE +Set up custom currency formats. The argument is a string which must +contain exactly three commas or exactly three periods. If commas, then +the grouping character for the currency format is @samp{,}, and the +decimal point character is @samp{.}; if periods, then the situation is +reversed. + +The commas or periods divide the string into four fields, which are, in +order, the negative prefix, prefix, suffix, and negative suffix. When a +value is formatted using the custom currency format, the prefix precedes +the value formatted and the suffix follows it. In addition, if the +value is negative, the negative prefix precedes the prefix and the +negative suffix follows the suffix. + +@item DECIMAL +The default DOT setting causes the decimal point character to be +@samp{.}. 
A setting of COMMA causes the decimal point character to be +@samp{,}. + +@item FORMAT +Allows the default numeric input/output format to be specified. The +default is F8.2. @xref{Input/Output Formats}. +@end table + +Output routing subcommands affect where the output of transformations +and procedures is sent. These subcommands are + +@table @asis +@item ECHO +If turned on, commands are written to the listing file as they are read +from command files. The default is OFF. + +@item ERRORS +@itemx INCLUDE +@itemx MESSAGES +@itemx PRINTBACK +@itemx RESULTS +Currently not used. +@end table + +Output activation subcommands affect whether output devices of +particular types are enabled. These subcommands are + +@table @asis +@item LISTING +Enable or disable listing devices. + +@item PRINTER +Enable or disable printer devices. + +@item SCREEN +Enable or disable screen devices. +@end table + +Output driver option subcommands affect output drivers' settings. These +subcommands are + +@table @asis +@item HEADERS +@itemx LENGTH +@itemx LISTING +@itemx MORE +@itemx PAGER +@itemx WIDTH +Currently not used. +@end table + +Logging subcommands affect logging of commands executed to external +files. These subcommands are + +@table @asis +@item JOURNAL +@itemx LOG +Currently not used. +@end table + +System file subcommands affect the default format of system files +produced by PSPP. These subcommands are + +@table @asis +@item COMPRESSION +Currently not used. + +@item SCOMPRESSION +Whether system files created by @cmd{SAVE} or @cmd{XSAVE} are +compressed by default. The default is ON. +@end table + +Security subcommands affect the operations that commands are allowed to +perform. The security subcommands are + +@table @asis +@item SAFER +When set, this setting cannot ever be reset, for obvious security +reasons. Setting this option disables the following operations: + +@itemize @bullet +@item +The ERASE command. +@item +The HOST command. 
+@item +Pipe filenames (filenames beginning or ending with @samp{|}). +@end itemize + +Be aware that this setting does not guarantee safety (commands can still +overwrite files, for instance) but it is an improvement. +@end table + +@node SHOW, SUBTITLE, SET, Utilities +@comment node-name, next, previous, up +@section SHOW +@vindex SHOW + +@display +SHOW + /@var{subcommand} + +@end display + +@cmd{SHOW} can be used to display the current state of PSPP's +execution parameters. All of the parameters that can be changed +using @code{SET} (@pxref{SET}) can be examined using @cmd{SHOW}, by +using a subcommand with the same name. +In addition, @code{SHOW} supports the following subcommands: + +@table @code +@item WARRANTY +Show details of the lack of warranty for PSPP. +@item COPYING +Display the terms of PSPP's copyright licence (@pxref{License}). +@end table + + + +@node SUBTITLE, TITLE, SHOW, Utilities +@section SUBTITLE +@vindex SUBTITLE + +@display +SUBTITLE 'subtitle_string'. + or +SUBTITLE subtitle_string. +@end display + +@cmd{SUBTITLE} provides a subtitle to a particular PSPP +run. This subtitle appears at the top of each output page below the +title, if headers are enabled on the output device. + +Specify a subtitle as a string in quotes. The alternate syntax that did +not require quotes is now obsolete. If it is used then the subtitle is +converted to all uppercase. + +@node TITLE, , SUBTITLE, Utilities +@section TITLE +@vindex TITLE + +@display +TITLE 'title_string'. + or +TITLE title_string. +@end display + +@cmd{TITLE} provides a title to a particular PSPP run. +This title appears at the top of each output page, if headers are enabled +on the output device. + +Specify a title as a string in quotes. The alternate syntax that did +not require quotes is now obsolete. If it is used then the title is +converted to all uppercase. 
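+
+As a brief illustration of the quoted form, the following sketch sets a
+title and a subtitle for a run.  The strings are hypothetical and
+chosen only for this example.
+
+@example
+TITLE 'Customer survey'.
+SUBTITLE 'Preliminary tabulation'.
+@end example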
+@setfilename ignored diff --git a/doc/variables.texi b/doc/variables.texi new file mode 100644 index 00000000..17ddb52f --- /dev/null +++ b/doc/variables.texi @@ -0,0 +1,401 @@ +@node Variable Attributes, Data Manipulation, System and Portable Files, Top +@chapter Manipulating variables + +The variables in the active file dictionary are important. There are +several utility functions for examining and adjusting them. + +@menu +* ADD VALUE LABELS:: Add value labels to variables. +* DISPLAY:: Display variable names & descriptions. +* DISPLAY VECTORS:: Display a list of vectors. +* FORMATS:: Set print and write formats. +* LEAVE:: Don't clear variables between cases. +* MISSING VALUES:: Set missing values for variables. +* MODIFY VARS:: Rename, reorder, and drop variables. +* NUMERIC:: Create new numeric variables. +* PRINT FORMATS:: Set variable print formats. +* RENAME VARIABLES:: Rename variables. +* VALUE LABELS:: Set value labels for variables. +* STRING:: Create new string variables. +* VARIABLE LABELS:: Set variable labels for variables. +* VECTOR:: Declare an array of variables. +* WRITE FORMATS:: Set variable write formats. +@end menu + +@node ADD VALUE LABELS, DISPLAY, Variable Attributes, Variable Attributes +@section ADD VALUE LABELS +@vindex ADD VALUE LABELS + +@display +ADD VALUE LABELS + /var_list value 'label' [value 'label']@dots{} +@end display + +@cmd{ADD VALUE LABELS} has the same syntax and purpose as @cmd{VALUE +LABELS} (@pxref{VALUE LABELS}), but it does not clear value +labels from the variables before adding the ones specified. + +@node DISPLAY, DISPLAY VECTORS, ADD VALUE LABELS, Variable Attributes +@section DISPLAY +@vindex DISPLAY + +@display +DISPLAY @{NAMES,INDEX,LABELS,VARIABLES,DICTIONARY,SCRATCH@} + [SORTED] [var_list] +@end display + +@cmd{DISPLAY} displays requested information on variables. Variables can +optionally be sorted alphabetically. The entire dictionary or just +specified variables can be described. 
+ +One of the following keywords can be present: + +@table @asis +@item NAMES +The variables' names are displayed. + +@item INDEX +The variables' names are displayed along with a value describing their +position within the active file dictionary. + +@item LABELS +Variable names, positions, and variable labels are displayed. + +@item VARIABLES +Variable names, positions, print and write formats, and missing values +are displayed. + +@item DICTIONARY +Variable names, positions, print and write formats, missing values, +variable labels, and value labels are displayed. + +@item SCRATCH +Variable names are displayed, for scratch variables only (@pxref{Scratch +Variables}). +@end table + +If SORTED is specified, then the variables are displayed in ascending +order based on their names; otherwise, they are displayed in the order +that they occur in the active file dictionary. + +@node DISPLAY VECTORS, FORMATS, DISPLAY, Variable Attributes +@section DISPLAY VECTORS +@vindex DISPLAY VECTORS + +@display +DISPLAY VECTORS. +@end display + +@cmd{DISPLAY VECTORS} lists all the currently declared vectors. + +@node FORMATS, LEAVE, DISPLAY VECTORS, Variable Attributes +@section FORMATS +@vindex FORMATS + +@display +FORMATS var_list (fmt_spec). +@end display + +@cmd{FORMATS} sets both print and write formats for the specified +variables to the specified format specification. @xref{Input/Output +Formats}. + +Specify a list of variables followed by a format specification in +parentheses. The print and write formats of the specified variables +will be changed. + +Additional lists of variables and formats may be included if they are +delimited by a slash (@samp{/}). + +@cmd{FORMATS} takes effect immediately. It is not affected by +conditional and looping structures such as @cmd{DO IF} or @cmd{LOOP}. + +@node LEAVE, MISSING VALUES, FORMATS, Variable Attributes +@section LEAVE +@vindex LEAVE + +@display +LEAVE var_list. 
+@end display + +@cmd{LEAVE} prevents the specified variables from being +reinitialized whenever a new case is processed. + +Normally, when a data file is processed, every variable in the active +file is initialized to the system-missing value or spaces at the +beginning of processing for each case. When a variable has been +specified on @cmd{LEAVE}, this is not the case. Instead, that variable is +initialized to 0 (not system-missing) or spaces for the first case. +After that, it retains its value between cases. + +This becomes useful for counters. For instance, in the example below +the variable SUM maintains a running total of the values in the ITEM +variable. + +@example +DATA LIST /ITEM 1-3. +COMPUTE SUM=SUM+ITEM. +PRINT /ITEM SUM. +LEAVE SUM. +BEGIN DATA. +123 +404 +555 +999 +END DATA. +@end example + +@noindent Partial output from this example: + +@example +123 123.00 +404 527.00 +555 1082.00 +999 2081.00 +@end example + +It is best to use the @cmd{LEAVE} command immediately before invoking a +procedure command, because the left status of variables is reset by +certain transformations, for instance @cmd{COMPUTE} and @cmd{IF}. +Left status is also reset by all procedure invocations. + +@node MISSING VALUES, MODIFY VARS, LEAVE, Variable Attributes +@section MISSING VALUES +@vindex MISSING VALUES + +@display +MISSING VALUES var_list (missing_values). + +missing_values takes one of the following forms: + num1 + num1, num2 + num1, num2, num3 + num1 THRU num2 + num1 THRU num2, num3 + string1 + string1, string2 + string1, string2, string3 +As part of a range, LO or LOWEST may take the place of num1; +HI or HIGHEST may take the place of num2. +@end display + +@cmd{MISSING VALUES} sets user-missing values for numeric and +short string variables. Long string variables may not have missing +values. + +Specify a list of variables, followed by a list of their user-missing +values in parentheses. 
Up to three discrete values may be given, or, +for numeric variables only, a range of values optionally accompanied by +a single discrete value. Ranges may be open-ended on one end, indicated +through the use of the keyword LO or LOWEST or HI or HIGHEST. + +The @cmd{MISSING VALUES} command takes effect immediately. It is not +affected by conditional and looping constructs such as @cmd{DO IF} or +@cmd{LOOP}. + +@node MODIFY VARS, NUMERIC, MISSING VALUES, Variable Attributes +@section MODIFY VARS +@vindex MODIFY VARS + +@display +MODIFY VARS + /REORDER=@{FORWARD,BACKWARD@} @{POSITIONAL,ALPHA@} (var_list)@dots{} + /RENAME=(old_names=new_names)@dots{} + /@{DROP,KEEP@}=var_list + /MAP +@end display + +@cmd{MODIFY VARS} reorders, renames, and deletes variables in the +active file. + +At least one subcommand must be specified, and no subcommand may be +specified more than once. DROP and KEEP may not both be specified. + +The REORDER subcommand changes the order of variables in the active +file. Specify one or more lists of variable names in parentheses. By +default, each list of variables is rearranged into the specified order. +To put the variables into the reverse of the specified order, put +keyword BACKWARD before the parentheses. To put them into alphabetical +order in the dictionary, specify keyword ALPHA before the parentheses. +BACKWARD and ALPHA may also be combined. + +To rename variables in the active file, specify RENAME, an equals sign +(@samp{=}), and lists of the old variable names and new variable names +separated by another equals sign within parentheses. There must be the +same number of old and new variable names. Each old variable is renamed to +the corresponding new variable name. Multiple parenthesized groups of +variables may be specified. + +The DROP subcommand deletes a specified list of variables from the +active file. + +The KEEP subcommand keeps the specified list of variables in the active +file. 
Any unlisted variables are deleted from the active file. + +MAP is currently ignored. + +If either DROP or KEEP is specified, the data is read; otherwise it is +not. + +@cmd{MODIFY VARS} may not be specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}). + +@node NUMERIC, PRINT FORMATS, MODIFY VARS, Variable Attributes +@section NUMERIC +@vindex NUMERIC + +@display +NUMERIC /var_list [(fmt_spec)]. +@end display + +@cmd{NUMERIC} explicitly declares new numeric variables, optionally +setting their output formats. + +Specify a slash (@samp{/}), followed by the names of the new numeric +variables. If you wish to set their output formats, follow their names +by an output format specification in parentheses (@pxref{Input/Output +Formats}); otherwise, the default is F8.2. + +Variables created with @cmd{NUMERIC} are initialized to the +system-missing value. + +@node PRINT FORMATS, RENAME VARIABLES, NUMERIC, Variable Attributes +@section PRINT FORMATS +@vindex PRINT FORMATS + +@display +PRINT FORMATS var_list (fmt_spec). +@end display + +@cmd{PRINT FORMATS} sets the print formats for the specified +variables to the specified format specification. + +Its syntax is identical to that of @cmd{FORMATS} (@pxref{FORMATS}), +but @cmd{PRINT FORMATS} sets only print formats, not write formats. + +@node RENAME VARIABLES, VALUE LABELS, PRINT FORMATS, Variable Attributes +@section RENAME VARIABLES +@vindex RENAME VARIABLES + +@display +RENAME VARIABLES (old_names=new_names)@dots{} . +@end display + +@cmd{RENAME VARIABLES} changes the names of variables in the active +file. Specify lists of the old variable names and new +variable names, separated by an equals sign (@samp{=}), within +parentheses. There must be the same number of old and new variable +names. Each old variable is renamed to the corresponding new variable +name. Multiple parenthesized groups of variables may be specified. + +@cmd{RENAME VARIABLES} takes effect immediately. It does not cause the data +to be read. 
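+
+A minimal sketch of the @cmd{RENAME VARIABLES} syntax described above,
+assuming the active file already contains variables named V1, V2, and
+V3 (hypothetical names used only for illustration):
+
+@example
+RENAME VARIABLES (V1 V2=HEIGHT WEIGHT) (V3=AGE).
+@end example
+
+@noindent Under that assumption, V1 is renamed to HEIGHT, V2 to WEIGHT,
+and V3 to AGE.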
+ +@cmd{RENAME VARIABLES} may not be specified following @cmd{TEMPORARY} +(@pxref{TEMPORARY}). + +@node VALUE LABELS, STRING, RENAME VARIABLES, Variable Attributes +@section VALUE LABELS +@vindex VALUE LABELS + +@display +VALUE LABELS + /var_list value 'label' [value 'label']@dots{} +@end display + +@cmd{VALUE LABELS} allows values of numeric and short string +variables to be associated with labels. In this way, a short value can +stand for a long value. + +To set up value labels for a set of variables, specify the +variable names after a slash (@samp{/}), followed by a list of values +and their associated labels, separated by spaces. Long string +variables may not be specified. + +Before @cmd{VALUE LABELS} is executed, any existing value labels +are cleared from the variables specified. Use @cmd{ADD VALUE LABELS} +(@pxref{ADD VALUE LABELS}) to add value labels without clearing those +already present. + +@node STRING, VARIABLE LABELS, VALUE LABELS, Variable Attributes +@section STRING +@vindex STRING + +@display +STRING /var_list (fmt_spec). +@end display + +@cmd{STRING} creates new string variables for use in +transformations. + +Specify a slash (@samp{/}), followed by the names of the string +variables to create and the desired output format specification in +parentheses (@pxref{Input/Output Formats}). Variable widths are +implicitly derived from the specified output formats. + +Created variables are initialized to spaces. + +@node VARIABLE LABELS, VECTOR, STRING, Variable Attributes +@section VARIABLE LABELS +@vindex VARIABLE LABELS + +@display +VARIABLE LABELS + /var_list 'var_label'. +@end display + +@cmd{VARIABLE LABELS} associates explanatory names +with variables. This name, called a @dfn{variable label}, is displayed by +statistical procedures. + +To assign a variable label to a group of variables, specify a slash +(@samp{/}), followed by the list of variable names and the variable +label as a string. 
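+
+For instance, assuming a variable named AGE exists in the active file
+(a hypothetical name), a label could be attached along these lines:
+
+@example
+VARIABLE LABELS /AGE 'Age of respondent in years'.
+@end example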
+ +@node VECTOR, WRITE FORMATS, VARIABLE LABELS, Variable Attributes +@section VECTOR +@vindex VECTOR + +@display +Two possible syntaxes: + VECTOR vec_name=var_list. + VECTOR vec_name_list(count). +@end display + +@cmd{VECTOR} allows a group of variables to be accessed as if they +were consecutive members of an array, using @code{vector(index)} notation. + +To make a vector out of a set of existing variables, specify a name for +the vector followed by an equals sign (@samp{=}) and the variables that +belong in the vector. + +To make a vector and create variables at the same time, specify one or +more vector names followed by a count in parentheses. This will cause +variables named @code{@var{vec}1} through @code{@var{vec}@var{count}} +to be created as numeric variables with print and write format F8.2. +Variable names including numeric suffixes may not exceed 8 characters +in length, and none of the variables may exist prior to @cmd{VECTOR}. + +All the variables in a vector must be the same type. + +Vectors created with @cmd{VECTOR} disappear after any procedure or +procedure-like command is executed. The variables contained in the +vectors remain, unless they are scratch variables (@pxref{Scratch +Variables}). + +Variables within a vector may be referenced in expressions using +@code{vector(index)} syntax. + +@node WRITE FORMATS, , VECTOR, Variable Attributes +@section WRITE FORMATS +@vindex WRITE FORMATS + +@display +WRITE FORMATS var_list (fmt_spec). +@end display + +@cmd{WRITE FORMATS} sets the write formats for the specified variables +to the specified format specification. Its syntax is identical to +that of @cmd{FORMATS} (@pxref{FORMATS}), but @cmd{WRITE FORMATS} sets only +write formats, not print formats. 
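+
+As a sketch, assuming a numeric variable named ITEM exists (a
+hypothetical name), the following changes only its write format and
+leaves its print format as it was:
+
+@example
+WRITE FORMATS ITEM (F3.0).
+@end example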
+@setfilename ignored diff --git a/po/en_GB.po b/po/en_GB.po index 1dd1270c..041fd622 100644 --- a/po/en_GB.po +++ b/po/en_GB.po @@ -7,7 +7,7 @@ msgid "" msgstr "" "Project-Id-Version: PSPP 0.3.1\n" "Report-Msgid-Bugs-To: pspp-dev@gnu.org\n" -"POT-Creation-Date: 2004-10-26 18:37+0800\n" +"POT-Creation-Date: 2004-10-30 17:37+0800\n" "PO-Revision-Date: 2004-01-23 13:04+0800\n" "Last-Translator: John Darrington \n" "Language-Team: John Darrington \n" @@ -957,8 +957,8 @@ msgstr "" msgid "Only USE ALL is currently implemented." msgstr "" -#: src/descript.c:98 src/frequencies.q:96 src/oneway.q:367 src/t-test.q:634 -#: src/t-test.q:657 src/t-test.q:746 src/t-test.q:1061 +#: src/descript.c:98 src/frequencies.q:100 src/oneway.q:375 src/t-test.q:681 +#: src/t-test.q:704 src/t-test.q:827 src/t-test.q:1165 msgid "Mean" msgstr "" @@ -966,15 +966,15 @@ msgstr "" msgid "S E Mean" msgstr "" -#: src/descript.c:100 src/frequencies.q:100 +#: src/descript.c:100 src/frequencies.q:104 msgid "Std Dev" msgstr "" -#: src/descript.c:101 src/frequencies.q:101 +#: src/descript.c:101 src/frequencies.q:105 msgid "Variance" msgstr "" -#: src/descript.c:102 src/frequencies.q:102 +#: src/descript.c:102 src/frequencies.q:106 msgid "Kurtosis" msgstr "" @@ -982,7 +982,7 @@ msgstr "" msgid "S E Kurt" msgstr "" -#: src/descript.c:104 src/frequencies.q:104 +#: src/descript.c:104 src/frequencies.q:108 msgid "Skewness" msgstr "" @@ -990,19 +990,19 @@ msgstr "" msgid "S E Skew" msgstr "" -#: src/descript.c:106 src/frequencies.q:106 +#: src/descript.c:106 src/frequencies.q:110 msgid "Range" msgstr "" -#: src/descript.c:107 src/frequencies.q:107 src/oneway.q:379 +#: src/descript.c:107 src/frequencies.q:111 src/oneway.q:387 msgid "Minimum" msgstr "" -#: src/descript.c:108 src/frequencies.q:108 src/oneway.q:380 +#: src/descript.c:108 src/frequencies.q:112 src/oneway.q:388 msgid "Maximum" msgstr "" -#: src/descript.c:109 src/frequencies.q:109 +#: src/descript.c:109 src/frequencies.q:113 msgid "Sum" msgstr "" @@ 
-3808,7 +3808,7 @@ msgstr "" #: src/sysfile-info.c:529 src/vfm.c:875 src/crosstabs.q:1068 #: src/crosstabs.q:1095 src/crosstabs.q:1115 src/crosstabs.q:1137 -#: src/frequencies.q:1023 src/frequencies.q:1140 +#: src/frequencies.q:1068 src/frequencies.q:1186 msgid "Value" msgstr "" @@ -4033,26 +4033,26 @@ msgstr "" msgid "Cases" msgstr "" -#: src/crosstabs.q:772 src/frequencies.q:1021 src/frequencies.q:1387 +#: src/crosstabs.q:772 src/frequencies.q:1066 src/frequencies.q:1433 msgid "Valid" msgstr "" -#: src/crosstabs.q:773 src/frequencies.q:1088 src/frequencies.q:1388 +#: src/crosstabs.q:773 src/frequencies.q:1133 src/frequencies.q:1434 msgid "Missing" msgstr "" #: src/crosstabs.q:774 src/crosstabs.q:977 src/crosstabs.q:1690 -#: src/frequencies.q:1097 src/oneway.q:279 src/oneway.q:456 +#: src/frequencies.q:1142 src/oneway.q:287 src/oneway.q:464 msgid "Total" msgstr "" -#: src/crosstabs.q:784 src/frequencies.q:1386 src/oneway.q:366 -#: src/t-test.q:633 src/t-test.q:656 src/t-test.q:747 src/t-test.q:1261 +#: src/crosstabs.q:784 src/frequencies.q:1432 src/oneway.q:374 +#: src/t-test.q:680 src/t-test.q:703 src/t-test.q:828 src/t-test.q:1365 msgid "N" msgstr "" -#: src/crosstabs.q:785 src/frequencies.q:1025 src/frequencies.q:1026 -#: src/frequencies.q:1027 +#: src/crosstabs.q:785 src/frequencies.q:1070 src/frequencies.q:1071 +#: src/frequencies.q:1072 msgid "Percent" msgstr "" @@ -4093,8 +4093,8 @@ msgstr "" msgid "Statistic" msgstr "" -#: src/crosstabs.q:1069 src/oneway.q:249 src/oneway.q:674 src/t-test.q:898 -#: src/t-test.q:1067 src/t-test.q:1159 +#: src/crosstabs.q:1069 src/oneway.q:257 src/oneway.q:686 src/t-test.q:979 +#: src/t-test.q:1171 src/t-test.q:1263 msgid "df" msgstr "" @@ -4131,11 +4131,11 @@ msgstr "" msgid " 95%% Confidence Interval" msgstr "" -#: src/crosstabs.q:1116 src/t-test.q:902 src/t-test.q:1064 src/t-test.q:1162 +#: src/crosstabs.q:1116 src/t-test.q:983 src/t-test.q:1168 src/t-test.q:1266 msgid "Lower" msgstr "" -#: src/crosstabs.q:1117 
src/t-test.q:903 src/t-test.q:1065 src/t-test.q:1163 +#: src/crosstabs.q:1117 src/t-test.q:984 src/t-test.q:1169 src/t-test.q:1267 msgid "Upper" msgstr "" @@ -4304,105 +4304,105 @@ msgstr "" msgid "expecting a file name or handle name" msgstr "" -#: src/frequencies.q:97 +#: src/frequencies.q:101 msgid "S.E. Mean" msgstr "" -#: src/frequencies.q:98 +#: src/frequencies.q:102 msgid "Median" msgstr "" -#: src/frequencies.q:99 +#: src/frequencies.q:103 msgid "Mode" msgstr "" -#: src/frequencies.q:103 +#: src/frequencies.q:107 msgid "S.E. Kurt" msgstr "" -#: src/frequencies.q:105 +#: src/frequencies.q:109 msgid "S.E. Skew" msgstr "" -#: src/frequencies.q:280 +#: src/frequencies.q:289 msgid "" "At most one of BARCHART, HISTOGRAM, or HBAR should be given. HBAR will be " "assumed. Argument values will be given precedence increasing along the " "order given." msgstr "" -#: src/frequencies.q:361 +#: src/frequencies.q:372 #, c-format msgid "" "MAX must be greater than or equal to MIN, if both are specified. However, " "MIN was specified as %g and MAX as %g. MIN and MAX will be ignored." msgstr "" -#: src/frequencies.q:651 +#: src/frequencies.q:696 msgid "" "Upper limit of integer mode value range must be greater than lower limit." msgstr "" -#: src/frequencies.q:663 +#: src/frequencies.q:708 #, c-format msgid "Variable %s specified multiple times on VARIABLES subcommand." msgstr "" -#: src/frequencies.q:676 +#: src/frequencies.q:721 #, c-format msgid "Integer mode specified, but %s is not a numeric variable." msgstr "" -#: src/frequencies.q:738 +#: src/frequencies.q:783 msgid "`)' expected after GROUPED interval list." msgstr "" -#: src/frequencies.q:751 +#: src/frequencies.q:796 #, c-format msgid "Variables %s specified on GROUPED but not on VARIABLES." msgstr "" -#: src/frequencies.q:754 +#: src/frequencies.q:799 #, c-format msgid "Variables %s specified multiple times on GROUPED subcommand." 
msgstr "" -#: src/frequencies.q:810 +#: src/frequencies.q:855 msgid "Percentile list expected after PERCENTILES." msgstr "" -#: src/frequencies.q:818 +#: src/frequencies.q:863 #, fuzzy msgid "Percentiles must be between 0 and 100." msgstr "Frame colour must be between 0 and 6." -#: src/frequencies.q:1022 src/frequencies.q:1112 src/frequencies.q:1113 -#: src/frequencies.q:1143 +#: src/frequencies.q:1067 src/frequencies.q:1158 src/frequencies.q:1159 +#: src/frequencies.q:1189 msgid "Cum" msgstr "" -#: src/frequencies.q:1024 +#: src/frequencies.q:1069 msgid "Frequency" msgstr "" -#: src/frequencies.q:1043 +#: src/frequencies.q:1088 msgid "Value Label" msgstr "" -#: src/frequencies.q:1141 +#: src/frequencies.q:1187 msgid "Freq" msgstr "" -#: src/frequencies.q:1142 src/frequencies.q:1144 +#: src/frequencies.q:1188 src/frequencies.q:1190 msgid "Pct" msgstr "" -#: src/frequencies.q:1360 +#: src/frequencies.q:1406 #, c-format msgid "No valid data for variable %s; statistics not displayed." msgstr "" -#: src/frequencies.q:1399 +#: src/frequencies.q:1445 msgid "Percentiles" msgstr "" @@ -4473,119 +4473,119 @@ msgstr "" msgid "Upper value (%g) is less than lower value (%g) on VARIABLES subcommand." 
msgstr "" -#: src/oneway.q:125 +#: src/oneway.q:148 msgid "Number of contrast coefficients must equal the number of groups" msgstr "" -#: src/oneway.q:133 +#: src/oneway.q:156 #, c-format msgid "Coefficients for contrast %d do not total zero" msgstr "" -#: src/oneway.q:213 src/t-test.q:329 src/t-test.q:406 +#: src/oneway.q:221 src/t-test.q:364 src/t-test.q:449 #, c-format msgid "`%s' is not a variable name" msgstr "" -#: src/oneway.q:248 +#: src/oneway.q:256 msgid "Sum of Squares" msgstr "" -#: src/oneway.q:250 +#: src/oneway.q:258 msgid "Mean Square" msgstr "" -#: src/oneway.q:251 src/t-test.q:895 +#: src/oneway.q:259 src/t-test.q:976 msgid "F" msgstr "" -#: src/oneway.q:252 src/oneway.q:522 +#: src/oneway.q:260 src/oneway.q:530 msgid "Significance" msgstr "" -#: src/oneway.q:277 +#: src/oneway.q:285 msgid "Between Groups" msgstr "" -#: src/oneway.q:278 +#: src/oneway.q:286 msgid "Within Groups" msgstr "" -#: src/oneway.q:324 +#: src/oneway.q:332 msgid "ANOVA" msgstr "" -#: src/oneway.q:368 src/t-test.q:635 src/t-test.q:658 src/t-test.q:748 -#: src/t-test.q:1062 +#: src/oneway.q:376 src/t-test.q:682 src/t-test.q:705 src/t-test.q:829 +#: src/t-test.q:1166 msgid "Std. Deviation" msgstr "" -#: src/oneway.q:369 src/oneway.q:672 +#: src/oneway.q:377 src/oneway.q:684 msgid "Std. 
Error" msgstr "" -#: src/oneway.q:374 +#: src/oneway.q:382 #, c-format msgid "%g%% Confidence Interval for Mean" msgstr "" -#: src/oneway.q:376 +#: src/oneway.q:384 msgid "Lower Bound" msgstr "" -#: src/oneway.q:377 +#: src/oneway.q:385 msgid "Upper Bound" msgstr "" -#: src/oneway.q:383 +#: src/oneway.q:391 msgid "Descriptives" msgstr "" -#: src/oneway.q:519 +#: src/oneway.q:527 msgid "Levene Statistic" msgstr "" -#: src/oneway.q:520 +#: src/oneway.q:528 msgid "df1" msgstr "" -#: src/oneway.q:521 +#: src/oneway.q:529 msgid "df2" msgstr "" -#: src/oneway.q:525 +#: src/oneway.q:533 msgid "Test of Homogeneity of Variances" msgstr "" -#: src/oneway.q:597 +#: src/oneway.q:609 msgid "Contrast Coefficients" msgstr "" -#: src/oneway.q:599 src/oneway.q:670 +#: src/oneway.q:611 src/oneway.q:682 msgid "Contrast" msgstr "" -#: src/oneway.q:668 +#: src/oneway.q:680 msgid "Contrast Tests" msgstr "" -#: src/oneway.q:671 +#: src/oneway.q:683 msgid "Value of Contrast" msgstr "" -#: src/oneway.q:673 src/t-test.q:897 src/t-test.q:1066 src/t-test.q:1158 +#: src/oneway.q:685 src/t-test.q:978 src/t-test.q:1170 src/t-test.q:1262 msgid "t" msgstr "" -#: src/oneway.q:675 src/t-test.q:899 src/t-test.q:1068 src/t-test.q:1160 +#: src/oneway.q:687 src/t-test.q:980 src/t-test.q:1172 src/t-test.q:1264 msgid "Sig. (2-tailed)" msgstr "" -#: src/oneway.q:723 +#: src/oneway.q:735 msgid "Assume equal variances" msgstr "" -#: src/oneway.q:727 +#: src/oneway.q:739 msgid "Does not assume equal" msgstr "" @@ -4721,128 +4721,128 @@ msgstr "" msgid "data> " msgstr "" -#: src/t-test.q:233 +#: src/t-test.q:266 msgid "TESTVAL, GROUPS and PAIRS subcommands are mutually exclusive." msgstr "" -#: src/t-test.q:250 +#: src/t-test.q:283 msgid "VARIABLES subcommand is not appropriate with PAIRS" msgstr "" -#: src/t-test.q:287 +#: src/t-test.q:320 msgid "One or more VARIABLES must be specified." msgstr "" -#: src/t-test.q:342 +#: src/t-test.q:377 #, c-format msgid "Long string variable %s is not valid here." 
 msgstr ""

-#: src/t-test.q:359
+#: src/t-test.q:397
 msgid ""
 "When applying GROUPS to a string variable, at least one value must be "
 "specified."
 msgstr ""

-#: src/t-test.q:441
+#: src/t-test.q:484
 #, c-format
 msgid ""
 "PAIRED was specified but the number of variables preceding WITH (%d) did not "
 "match the number following (%d)."
 msgstr ""

-#: src/t-test.q:458
+#: src/t-test.q:501
 msgid "At least two variables must be specified on PAIRS."
 msgstr ""

-#: src/t-test.q:631
+#: src/t-test.q:678
 msgid "One-Sample Statistics"
 msgstr ""

-#: src/t-test.q:636 src/t-test.q:659 src/t-test.q:749
+#: src/t-test.q:683 src/t-test.q:706 src/t-test.q:830
 msgid "SE. Mean"
 msgstr ""

-#: src/t-test.q:653
+#: src/t-test.q:700
 msgid "Group Statistics"
 msgstr ""

-#: src/t-test.q:743
+#: src/t-test.q:824
 msgid "Paired Sample Statistics"
 msgstr ""

-#: src/t-test.q:765 src/t-test.q:1087 src/t-test.q:1278
+#: src/t-test.q:846 src/t-test.q:1191 src/t-test.q:1382
 #, c-format
 msgid "Pair %d"
 msgstr ""

-#: src/t-test.q:883
+#: src/t-test.q:964
 msgid "Independent Samples Test"
 msgstr ""

-#: src/t-test.q:891
+#: src/t-test.q:972
 msgid "Levene's Test for Equality of Variances"
 msgstr ""

-#: src/t-test.q:893
+#: src/t-test.q:974
 msgid "t-test for Equality of Means"
 msgstr ""

-#: src/t-test.q:896 src/t-test.q:1263
+#: src/t-test.q:977 src/t-test.q:1367
 msgid "Sig."
 msgstr ""

-#: src/t-test.q:900 src/t-test.q:1161
+#: src/t-test.q:981 src/t-test.q:1265
 msgid "Mean Difference"
 msgstr ""

-#: src/t-test.q:901
+#: src/t-test.q:982
 msgid "Std. Error Difference"
 msgstr ""

-#: src/t-test.q:906 src/t-test.q:1058 src/t-test.q:1153
+#: src/t-test.q:987 src/t-test.q:1162 src/t-test.q:1257
 #, c-format
 msgid "%g%% Confidence Interval of the Difference"
 msgstr ""

-#: src/t-test.q:937
+#: src/t-test.q:1041
 msgid "Equal variances assumed"
 msgstr ""

-#: src/t-test.q:990
+#: src/t-test.q:1094
 msgid "Equal variances not assumed"
 msgstr ""

-#: src/t-test.q:1048
+#: src/t-test.q:1152
 msgid "Paired Samples Test"
 msgstr ""

-#: src/t-test.q:1051
+#: src/t-test.q:1155
 msgid "Paired Differences"
 msgstr ""

-#: src/t-test.q:1063
+#: src/t-test.q:1167
 msgid "Std. Error Mean"
 msgstr ""

-#: src/t-test.q:1142
+#: src/t-test.q:1246
 msgid "One-Sample Test"
 msgstr ""

-#: src/t-test.q:1147
+#: src/t-test.q:1251
 #, c-format
 msgid "Test Value = %f"
 msgstr ""

-#: src/t-test.q:1258
+#: src/t-test.q:1362
 msgid "Paired Samples Correlations"
 msgstr ""

-#: src/t-test.q:1262
+#: src/t-test.q:1366
 msgid "Correlation"
 msgstr ""

-#: src/t-test.q:1281
+#: src/t-test.q:1385
 #, c-format
 msgid "%s & %s"
 msgstr ""
diff --git a/po/pspp.pot b/po/pspp.pot
index 86af7879..86a951e7 100644
--- a/po/pspp.pot
+++ b/po/pspp.pot
@@ -8,7 +8,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: PACKAGE VERSION\n"
 "Report-Msgid-Bugs-To: pspp-dev@gnu.org\n"
-"POT-Creation-Date: 2004-10-26 18:37+0800\n"
+"POT-Creation-Date: 2004-10-30 17:37+0800\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME \n"
 "Language-Team: LANGUAGE \n"
@@ -958,8 +958,8 @@ msgstr ""
 msgid "Only USE ALL is currently implemented."
 msgstr ""

-#: src/descript.c:98 src/frequencies.q:96 src/oneway.q:367 src/t-test.q:634
-#: src/t-test.q:657 src/t-test.q:746 src/t-test.q:1061
+#: src/descript.c:98 src/frequencies.q:100 src/oneway.q:375 src/t-test.q:681
+#: src/t-test.q:704 src/t-test.q:827 src/t-test.q:1165
 msgid "Mean"
 msgstr ""

@@ -967,15 +967,15 @@ msgstr ""
 msgid "S E Mean"
 msgstr ""

-#: src/descript.c:100 src/frequencies.q:100
+#: src/descript.c:100 src/frequencies.q:104
 msgid "Std Dev"
 msgstr ""

-#: src/descript.c:101 src/frequencies.q:101
+#: src/descript.c:101 src/frequencies.q:105
 msgid "Variance"
 msgstr ""

-#: src/descript.c:102 src/frequencies.q:102
+#: src/descript.c:102 src/frequencies.q:106
 msgid "Kurtosis"
 msgstr ""

@@ -983,7 +983,7 @@ msgstr ""
 msgid "S E Kurt"
 msgstr ""

-#: src/descript.c:104 src/frequencies.q:104
+#: src/descript.c:104 src/frequencies.q:108
 msgid "Skewness"
 msgstr ""

@@ -991,19 +991,19 @@ msgstr ""
 msgid "S E Skew"
 msgstr ""

-#: src/descript.c:106 src/frequencies.q:106
+#: src/descript.c:106 src/frequencies.q:110
 msgid "Range"
 msgstr ""

-#: src/descript.c:107 src/frequencies.q:107 src/oneway.q:379
+#: src/descript.c:107 src/frequencies.q:111 src/oneway.q:387
 msgid "Minimum"
 msgstr ""

-#: src/descript.c:108 src/frequencies.q:108 src/oneway.q:380
+#: src/descript.c:108 src/frequencies.q:112 src/oneway.q:388
 msgid "Maximum"
 msgstr ""

-#: src/descript.c:109 src/frequencies.q:109
+#: src/descript.c:109 src/frequencies.q:113
 msgid "Sum"
 msgstr ""

@@ -3808,7 +3808,7 @@ msgstr ""

 #: src/sysfile-info.c:529 src/vfm.c:875 src/crosstabs.q:1068
 #: src/crosstabs.q:1095 src/crosstabs.q:1115 src/crosstabs.q:1137
-#: src/frequencies.q:1023 src/frequencies.q:1140
+#: src/frequencies.q:1068 src/frequencies.q:1186
 msgid "Value"
 msgstr ""

@@ -4033,26 +4033,26 @@ msgstr ""
 msgid "Cases"
 msgstr ""

-#: src/crosstabs.q:772 src/frequencies.q:1021 src/frequencies.q:1387
+#: src/crosstabs.q:772 src/frequencies.q:1066 src/frequencies.q:1433
 msgid "Valid"
 msgstr ""

-#: src/crosstabs.q:773 src/frequencies.q:1088 src/frequencies.q:1388
+#: src/crosstabs.q:773 src/frequencies.q:1133 src/frequencies.q:1434
 msgid "Missing"
 msgstr ""

 #: src/crosstabs.q:774 src/crosstabs.q:977 src/crosstabs.q:1690
-#: src/frequencies.q:1097 src/oneway.q:279 src/oneway.q:456
+#: src/frequencies.q:1142 src/oneway.q:287 src/oneway.q:464
 msgid "Total"
 msgstr ""

-#: src/crosstabs.q:784 src/frequencies.q:1386 src/oneway.q:366
-#: src/t-test.q:633 src/t-test.q:656 src/t-test.q:747 src/t-test.q:1261
+#: src/crosstabs.q:784 src/frequencies.q:1432 src/oneway.q:374
+#: src/t-test.q:680 src/t-test.q:703 src/t-test.q:828 src/t-test.q:1365
 msgid "N"
 msgstr ""

-#: src/crosstabs.q:785 src/frequencies.q:1025 src/frequencies.q:1026
-#: src/frequencies.q:1027
+#: src/crosstabs.q:785 src/frequencies.q:1070 src/frequencies.q:1071
+#: src/frequencies.q:1072
 msgid "Percent"
 msgstr ""

@@ -4093,8 +4093,8 @@ msgstr ""
 msgid "Statistic"
 msgstr ""

-#: src/crosstabs.q:1069 src/oneway.q:249 src/oneway.q:674 src/t-test.q:898
-#: src/t-test.q:1067 src/t-test.q:1159
+#: src/crosstabs.q:1069 src/oneway.q:257 src/oneway.q:686 src/t-test.q:979
+#: src/t-test.q:1171 src/t-test.q:1263
 msgid "df"
 msgstr ""

@@ -4131,11 +4131,11 @@ msgstr ""
 msgid " 95%% Confidence Interval"
 msgstr ""

-#: src/crosstabs.q:1116 src/t-test.q:902 src/t-test.q:1064 src/t-test.q:1162
+#: src/crosstabs.q:1116 src/t-test.q:983 src/t-test.q:1168 src/t-test.q:1266
 msgid "Lower"
 msgstr ""

-#: src/crosstabs.q:1117 src/t-test.q:903 src/t-test.q:1065 src/t-test.q:1163
+#: src/crosstabs.q:1117 src/t-test.q:984 src/t-test.q:1169 src/t-test.q:1267
 msgid "Upper"
 msgstr ""

@@ -4304,104 +4304,104 @@ msgstr ""
 msgid "expecting a file name or handle name"
 msgstr ""

-#: src/frequencies.q:97
+#: src/frequencies.q:101
 msgid "S.E. Mean"
 msgstr ""

-#: src/frequencies.q:98
+#: src/frequencies.q:102
 msgid "Median"
 msgstr ""

-#: src/frequencies.q:99
+#: src/frequencies.q:103
 msgid "Mode"
 msgstr ""

-#: src/frequencies.q:103
+#: src/frequencies.q:107
 msgid "S.E. Kurt"
 msgstr ""

-#: src/frequencies.q:105
+#: src/frequencies.q:109
 msgid "S.E. Skew"
 msgstr ""

-#: src/frequencies.q:280
+#: src/frequencies.q:289
 msgid ""
 "At most one of BARCHART, HISTOGRAM, or HBAR should be given. HBAR will be "
 "assumed. Argument values will be given precedence increasing along the "
 "order given."
 msgstr ""

-#: src/frequencies.q:361
+#: src/frequencies.q:372
 #, c-format
 msgid ""
 "MAX must be greater than or equal to MIN, if both are specified. However, "
 "MIN was specified as %g and MAX as %g. MIN and MAX will be ignored."
 msgstr ""

-#: src/frequencies.q:651
+#: src/frequencies.q:696
 msgid ""
 "Upper limit of integer mode value range must be greater than lower limit."
 msgstr ""

-#: src/frequencies.q:663
+#: src/frequencies.q:708
 #, c-format
 msgid "Variable %s specified multiple times on VARIABLES subcommand."
 msgstr ""

-#: src/frequencies.q:676
+#: src/frequencies.q:721
 #, c-format
 msgid "Integer mode specified, but %s is not a numeric variable."
 msgstr ""

-#: src/frequencies.q:738
+#: src/frequencies.q:783
 msgid "`)' expected after GROUPED interval list."
 msgstr ""

-#: src/frequencies.q:751
+#: src/frequencies.q:796
 #, c-format
 msgid "Variables %s specified on GROUPED but not on VARIABLES."
 msgstr ""

-#: src/frequencies.q:754
+#: src/frequencies.q:799
 #, c-format
 msgid "Variables %s specified multiple times on GROUPED subcommand."
 msgstr ""

-#: src/frequencies.q:810
+#: src/frequencies.q:855
 msgid "Percentile list expected after PERCENTILES."
 msgstr ""

-#: src/frequencies.q:818
+#: src/frequencies.q:863
 msgid "Percentiles must be between 0 and 100."
 msgstr ""

-#: src/frequencies.q:1022 src/frequencies.q:1112 src/frequencies.q:1113
-#: src/frequencies.q:1143
+#: src/frequencies.q:1067 src/frequencies.q:1158 src/frequencies.q:1159
+#: src/frequencies.q:1189
 msgid "Cum"
 msgstr ""

-#: src/frequencies.q:1024
+#: src/frequencies.q:1069
 msgid "Frequency"
 msgstr ""

-#: src/frequencies.q:1043
+#: src/frequencies.q:1088
 msgid "Value Label"
 msgstr ""

-#: src/frequencies.q:1141
+#: src/frequencies.q:1187
 msgid "Freq"
 msgstr ""

-#: src/frequencies.q:1142 src/frequencies.q:1144
+#: src/frequencies.q:1188 src/frequencies.q:1190
 msgid "Pct"
 msgstr ""

-#: src/frequencies.q:1360
+#: src/frequencies.q:1406
 #, c-format
 msgid "No valid data for variable %s; statistics not displayed."
 msgstr ""

-#: src/frequencies.q:1399
+#: src/frequencies.q:1445
 msgid "Percentiles"
 msgstr ""

@@ -4472,119 +4472,119 @@ msgstr ""
 msgid "Upper value (%g) is less than lower value (%g) on VARIABLES subcommand."
 msgstr ""

-#: src/oneway.q:125
+#: src/oneway.q:148
 msgid "Number of contrast coefficients must equal the number of groups"
 msgstr ""

-#: src/oneway.q:133
+#: src/oneway.q:156
 #, c-format
 msgid "Coefficients for contrast %d do not total zero"
 msgstr ""

-#: src/oneway.q:213 src/t-test.q:329 src/t-test.q:406
+#: src/oneway.q:221 src/t-test.q:364 src/t-test.q:449
 #, c-format
 msgid "`%s' is not a variable name"
 msgstr ""

-#: src/oneway.q:248
+#: src/oneway.q:256
 msgid "Sum of Squares"
 msgstr ""

-#: src/oneway.q:250
+#: src/oneway.q:258
 msgid "Mean Square"
 msgstr ""

-#: src/oneway.q:251 src/t-test.q:895
+#: src/oneway.q:259 src/t-test.q:976
 msgid "F"
 msgstr ""

-#: src/oneway.q:252 src/oneway.q:522
+#: src/oneway.q:260 src/oneway.q:530
 msgid "Significance"
 msgstr ""

-#: src/oneway.q:277
+#: src/oneway.q:285
 msgid "Between Groups"
 msgstr ""

-#: src/oneway.q:278
+#: src/oneway.q:286
 msgid "Within Groups"
 msgstr ""

-#: src/oneway.q:324
+#: src/oneway.q:332
 msgid "ANOVA"
 msgstr ""

-#: src/oneway.q:368 src/t-test.q:635 src/t-test.q:658 src/t-test.q:748
-#: src/t-test.q:1062
+#: src/oneway.q:376 src/t-test.q:682 src/t-test.q:705 src/t-test.q:829
+#: src/t-test.q:1166
 msgid "Std. Deviation"
 msgstr ""

-#: src/oneway.q:369 src/oneway.q:672
+#: src/oneway.q:377 src/oneway.q:684
 msgid "Std. Error"
 msgstr ""

-#: src/oneway.q:374
+#: src/oneway.q:382
 #, c-format
 msgid "%g%% Confidence Interval for Mean"
 msgstr ""

-#: src/oneway.q:376
+#: src/oneway.q:384
 msgid "Lower Bound"
 msgstr ""

-#: src/oneway.q:377
+#: src/oneway.q:385
 msgid "Upper Bound"
 msgstr ""

-#: src/oneway.q:383
+#: src/oneway.q:391
 msgid "Descriptives"
 msgstr ""

-#: src/oneway.q:519
+#: src/oneway.q:527
 msgid "Levene Statistic"
 msgstr ""

-#: src/oneway.q:520
+#: src/oneway.q:528
 msgid "df1"
 msgstr ""

-#: src/oneway.q:521
+#: src/oneway.q:529
 msgid "df2"
 msgstr ""

-#: src/oneway.q:525
+#: src/oneway.q:533
 msgid "Test of Homogeneity of Variances"
 msgstr ""

-#: src/oneway.q:597
+#: src/oneway.q:609
 msgid "Contrast Coefficients"
 msgstr ""

-#: src/oneway.q:599 src/oneway.q:670
+#: src/oneway.q:611 src/oneway.q:682
 msgid "Contrast"
 msgstr ""

-#: src/oneway.q:668
+#: src/oneway.q:680
 msgid "Contrast Tests"
 msgstr ""

-#: src/oneway.q:671
+#: src/oneway.q:683
 msgid "Value of Contrast"
 msgstr ""

-#: src/oneway.q:673 src/t-test.q:897 src/t-test.q:1066 src/t-test.q:1158
+#: src/oneway.q:685 src/t-test.q:978 src/t-test.q:1170 src/t-test.q:1262
 msgid "t"
 msgstr ""

-#: src/oneway.q:675 src/t-test.q:899 src/t-test.q:1068 src/t-test.q:1160
+#: src/oneway.q:687 src/t-test.q:980 src/t-test.q:1172 src/t-test.q:1264
 msgid "Sig. (2-tailed)"
 msgstr ""

-#: src/oneway.q:723
+#: src/oneway.q:735
 msgid "Assume equal variances"
 msgstr ""

-#: src/oneway.q:727
+#: src/oneway.q:739
 msgid "Does not assume equal"
 msgstr ""

@@ -4720,128 +4720,128 @@ msgstr ""
 msgid "data> "
 msgstr ""

-#: src/t-test.q:233
+#: src/t-test.q:266
 msgid "TESTVAL, GROUPS and PAIRS subcommands are mutually exclusive."
 msgstr ""

-#: src/t-test.q:250
+#: src/t-test.q:283
 msgid "VARIABLES subcommand is not appropriate with PAIRS"
 msgstr ""

-#: src/t-test.q:287
+#: src/t-test.q:320
 msgid "One or more VARIABLES must be specified."
 msgstr ""

-#: src/t-test.q:342
+#: src/t-test.q:377
 #, c-format
 msgid "Long string variable %s is not valid here."
 msgstr ""

-#: src/t-test.q:359
+#: src/t-test.q:397
 msgid ""
 "When applying GROUPS to a string variable, at least one value must be "
 "specified."
 msgstr ""

-#: src/t-test.q:441
+#: src/t-test.q:484
 #, c-format
 msgid ""
 "PAIRED was specified but the number of variables preceding WITH (%d) did not "
 "match the number following (%d)."
 msgstr ""

-#: src/t-test.q:458
+#: src/t-test.q:501
 msgid "At least two variables must be specified on PAIRS."
 msgstr ""

-#: src/t-test.q:631
+#: src/t-test.q:678
 msgid "One-Sample Statistics"
 msgstr ""

-#: src/t-test.q:636 src/t-test.q:659 src/t-test.q:749
+#: src/t-test.q:683 src/t-test.q:706 src/t-test.q:830
 msgid "SE. Mean"
 msgstr ""

-#: src/t-test.q:653
+#: src/t-test.q:700
 msgid "Group Statistics"
 msgstr ""

-#: src/t-test.q:743
+#: src/t-test.q:824
 msgid "Paired Sample Statistics"
 msgstr ""

-#: src/t-test.q:765 src/t-test.q:1087 src/t-test.q:1278
+#: src/t-test.q:846 src/t-test.q:1191 src/t-test.q:1382
 #, c-format
 msgid "Pair %d"
 msgstr ""

-#: src/t-test.q:883
+#: src/t-test.q:964
 msgid "Independent Samples Test"
 msgstr ""

-#: src/t-test.q:891
+#: src/t-test.q:972
 msgid "Levene's Test for Equality of Variances"
 msgstr ""

-#: src/t-test.q:893
+#: src/t-test.q:974
 msgid "t-test for Equality of Means"
 msgstr ""

-#: src/t-test.q:896 src/t-test.q:1263
+#: src/t-test.q:977 src/t-test.q:1367
 msgid "Sig."
 msgstr ""

-#: src/t-test.q:900 src/t-test.q:1161
+#: src/t-test.q:981 src/t-test.q:1265
 msgid "Mean Difference"
 msgstr ""

-#: src/t-test.q:901
+#: src/t-test.q:982
 msgid "Std. Error Difference"
 msgstr ""

-#: src/t-test.q:906 src/t-test.q:1058 src/t-test.q:1153
+#: src/t-test.q:987 src/t-test.q:1162 src/t-test.q:1257
 #, c-format
 msgid "%g%% Confidence Interval of the Difference"
 msgstr ""

-#: src/t-test.q:937
+#: src/t-test.q:1041
 msgid "Equal variances assumed"
 msgstr ""

-#: src/t-test.q:990
+#: src/t-test.q:1094
 msgid "Equal variances not assumed"
 msgstr ""

-#: src/t-test.q:1048
+#: src/t-test.q:1152
 msgid "Paired Samples Test"
 msgstr ""

-#: src/t-test.q:1051
+#: src/t-test.q:1155
 msgid "Paired Differences"
 msgstr ""

-#: src/t-test.q:1063
+#: src/t-test.q:1167
 msgid "Std. Error Mean"
 msgstr ""

-#: src/t-test.q:1142
+#: src/t-test.q:1246
 msgid "One-Sample Test"
 msgstr ""

-#: src/t-test.q:1147
+#: src/t-test.q:1251
 #, c-format
 msgid "Test Value = %f"
 msgstr ""

-#: src/t-test.q:1258
+#: src/t-test.q:1362
 msgid "Paired Samples Correlations"
 msgstr ""

-#: src/t-test.q:1262
+#: src/t-test.q:1366
 msgid "Correlation"
 msgstr ""

-#: src/t-test.q:1281
+#: src/t-test.q:1385
 #, c-format
 msgid "%s & %s"
 msgstr ""