X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Flanguage.texi;h=1c9a4f0105245a5550f7ad71f577be5db2be0c53;hb=refs%2Fheads%2Fctables7;hp=10fab4eca6391ef8494e18dc8859fb39fb2beb98;hpb=640c9e545186880ad74c123b0ad5df347a9a3163;p=pspp diff --git a/doc/language.texi b/doc/language.texi index 10fab4eca6..1c9a4f0105 100644 --- a/doc/language.texi +++ b/doc/language.texi @@ -1,5 +1,5 @@ @c PSPP - a program for statistical analysis. -@c Copyright (C) 2017 Free Software Foundation, Inc. +@c Copyright (C) 2017, 2020 Free Software Foundation, Inc. @c Permission is granted to copy, distribute and/or modify this document @c under the terms of the GNU Free Documentation License, Version 1.3 @c or any later version published by the Free Software Foundation; @@ -13,7 +13,7 @@ @cindex @pspp{}, language This chapter discusses elements common to many @pspp{} commands. -Later chapters will describe individual commands in detail. +Later chapters describe individual commands in detail. @menu * Tokens:: Characters combine to form tokens. @@ -104,7 +104,7 @@ number. No white space is allowed within a number token, except for horizontal white space between @samp{-} and the rest of the number. -The last example above, @samp{8945.} will be interpreted as two +The last example above, @samp{8945.} is interpreted as two tokens, @samp{8945} and @samp{.}, if it is the last token on a line. @xref{Commands, , Forming commands of tokens}. @@ -115,7 +115,7 @@ tokens, @samp{8945} and @samp{.}, if it is the last token on a line. @cindex case-sensitivity Strings are literal sequences of characters enclosed in pairs of single quotes (@samp{'}) or double quotes (@samp{"}). To include the -character used for quoting in the string, double it, e.g.@: +character used for quoting in the string, double it, @i{e.g.}@: @samp{'it''s an apostrophe'}. White space and case of letters are significant inside strings. 
@@ -158,7 +158,7 @@ Most of these appear within the syntax of commands, but the period punctuator only as the last character on a line (except white space). When it is the last non-space character on a line, a period is not treated as part of another token, even if it would otherwise be part -of, e.g.@:, an identifier or a floating-point number. +of, @i{e.g.}@:, an identifier or a floating-point number. @end table @node Commands @@ -252,7 +252,7 @@ in arbitrary textual or binary formats. @xref{INPUT PROGRAM}. @item Transformations @cindex transformations Perform operations on data and write data to output files. Transformations -are not carried out until a procedure is executed. +are not carried out until a procedure is executed. @item Restricted transformations @cindex restricted transformations @@ -303,7 +303,7 @@ transformation state. Valid in any state. @item When executed in the initial or procedure state, causes a transition to -the transformation state. +the transformation state. @item Clears the active dataset if executed in the procedure or transformation state. @@ -314,7 +314,7 @@ state. @item Invalid in input-program and file-type states. @item -Causes a transition to the intput-program state. +Causes a transition to the input-program state. @item Clears the active dataset. @end itemize @@ -440,7 +440,7 @@ variables' names may not begin with @samp{$}. @cindex variable names, ending with period The final character in a variable name should not be @samp{.}, because such an identifier will be misinterpreted when it is the final token -on a line: @code{FOO.} will be divided into two separate tokens, +on a line: @code{FOO.} is divided into two separate tokens, @samp{FOO} and @samp{.}, indicating end-of-command. @xref{Tokens}. @cindex @samp{_} @@ -507,6 +507,35 @@ they are displayed. Example: a width of 8, with 2 decimal places. Similar to print format, but used by the @cmd{WRITE} command (@pxref{WRITE}). 
+@cindex measurement level +@item Measurement level +One of the following: + +@table @asis +@item Nominal +Each value of a nominal variable represents a distinct category. The +possible categories are finite and often have value labels. The order +of categories is not significant. Political parties, US states, and +yes/no choices are nominal. Numeric and string variables can be +nominal. + +@item Ordinal +Ordinal variables also represent distinct categories, but their values +are arranged according to some natural order. Likert scales, e.g.@: +from strongly disagree to strongly agree, are ordinal. Data grouped +into ranges, e.g.@: age groups or income groups, are ordinal. Both +numeric and string variables can be ordinal. String values are +ordered alphabetically, so letter grades from A to F will work as +expected, but @code{poor}, @code{satisfactory}, @code{excellent} will +not. + +@item Scale +Scale variables are ones for which differences and ratios are +meaningful. These are often values which have a natural unit +attached, such as age in years, income in dollars, or distance in +miles. Only numeric variables can be scale variables. +@end table + @cindex custom attributes @item Custom attributes User-defined associations between names and values. @xref{VARIABLE @@ -537,7 +566,12 @@ shuffled around. @cindex @code{$DATE} @item $DATE Date the @pspp{} process was started, in format A9, following the -pattern @code{DD MMM YY}. +pattern @code{DD-MMM-YY}. + +@cindex @code{$DATE11} +@item $DATE11 +Date the @pspp{} process was started, in format A11, following the +pattern @code{DD-MMM-YYYY}. @cindex @code{$JDATE} @item $JDATE @@ -619,7 +653,7 @@ created variables have identical print and write formats, and most of the time, the distinction between print and write formats is unimportant. 
-Input and output formats are specified to @pspp{} with +Input and output formats are specified to @pspp{} with a @dfn{format specification} of the form @subcmd{@var{TYPE}@var{w}} or @code{TYPE@var{w}.@var{d}}, where @var{TYPE} is one of the format types described later, @var{w} is a @@ -631,13 +665,13 @@ The following sections describe the input and output formats supported by @pspp{}. @menu -* Basic Numeric Formats:: -* Custom Currency Formats:: -* Legacy Numeric Formats:: -* Binary and Hexadecimal Numeric Formats:: -* Time and Date Formats:: -* Date Component Formats:: -* String Formats:: +* Basic Numeric Formats:: +* Custom Currency Formats:: +* Legacy Numeric Formats:: +* Binary and Hexadecimal Numeric Formats:: +* Time and Date Formats:: +* Date Component Formats:: +* String Formats:: @end menu @node Basic Numeric Formats @@ -813,7 +847,7 @@ In scientific notation, the exponent is output as @samp{E} followed by @samp{+} or @samp{-} and exactly three digits. Numbers with magnitude less than 10**-999 or larger than 10**999 are not supported by most computers, but if they are supported then their output is considered -to overflow the field and will be output as asterisks. +to overflow the field and they are output as asterisks. @item On most computers, no more than 15 decimal digits are significant in @@ -826,7 +860,7 @@ calculations may also reduce precision of output. Special values such as infinities and ``not a number'' values are usually converted to the system-missing value before printing. In a few circumstances, these values are output directly. In fields of width 3 -or greater, special values are output as however many characters will +or greater, special values are output as however many characters fit from @code{+Infinity} or @code{-Infinity} for infinities, from @code{NaN} for ``not a number,'' or from @code{Unknown} for other values (if any are supported by the system). 
In fields under 3 columns wide, @@ -843,15 +877,15 @@ SET command configures custom currency formats, using the syntax @display SET CC@var{x}=@t{"}@var{string}@t{"}. @end display -@noindent +@noindent where @var{x} is A, B, C, D, or E, and @var{string} is no more than 16 characters long. @var{string} must contain exactly three commas or exactly three periods (but not both), except that a single quote character may be used to ``escape'' a following comma, period, or single quote. If three commas -are used, commas will be used for grouping in output, and a period will -be used as the decimal point. Uses of periods reverses these roles. +are used, commas are used for grouping in output, and a period +is used as the decimal point. Use of periods reverses these roles. The commas or periods divide @var{string} into four fields, called the @dfn{negative prefix}, @dfn{prefix}, @dfn{suffix}, and @dfn{negative @@ -1053,7 +1087,7 @@ WRB}). The recommended field width depends on the floating-point format. NATIVE (the default format), IDL, IDB, VD, VG, and ZL formats should use a field width of 8. ISL, ISB, VF, and ZS formats should use a field -width of 4. Other field widths will not produce useful results. The +width of 4. Other field widths do not produce useful results. The maximum field width is 8. No decimal places may be specified. The default output format is F8.2. @@ -1196,7 +1230,7 @@ below: @float @multitable {DATETIME} {Min. Input Width} {Min. Output Width} {4-digit year} -@headitem Format @tab Min. Input Width @tab Min. Output Width @tab Option +@headitem Format @tab Min. Input Width @tab Min. 
Output Width @tab Option @item DATE @tab 8 @tab 9 @tab 4-digit year @item ADATE @tab 8 @tab 8 @tab 4-digit year @item EDATE @tab 8 @tab 8 @tab 4-digit year @@ -1212,16 +1246,16 @@ below: @item DTIME @tab 8 @tab 8 @tab seconds @end multitable @end float -@noindent +@noindent In the table, ``Option'' describes what increased output width enables: @table @asis @item 4-digit year -A field 2 columns wider than minimum will include a 4-digit year. +A field 2 columns wider than the minimum includes a 4-digit year. (DATETIME and YMDHMS formats always include a 4-digit year.) @item seconds -A field 3 columns wider than minimum will include seconds as well as +A field 3 columns wider than the minimum includes seconds as well as minutes. A field 5 columns wider than minimum, or more, can also include a decimal point and fractional seconds (but no more than allowed by the format's decimal places). @@ -1235,14 +1269,14 @@ Time or dates narrower than the field width are right-justified within the field. When a time or date exceeds the field width, characters are trimmed from -the end until it fits. This can occur in an unusual situation, e.g.@: +the end until it fits. This can occur in an unusual situation, @i{e.g.}@: with a year greater than 9999 (which adds an extra digit), or for a negative value on MTIME, TIME, or DTIME (which adds a leading minus sign). @c What about out-of-range values? The system-missing value is output as a period at the right end of the -field. +field. @node Date Component Formats @subsubsection Date Component Formats @@ -1296,7 +1330,7 @@ or to blanks, depending on type. However, sometimes it's useful to have a variable that keeps its value between cases. You can do this with @cmd{LEAVE} (@pxref{LEAVE}), or you can use a @dfn{scratch variable}. Scratch variables are variables whose -names begin with an octothorpe (@samp{#}). +names begin with an octothorpe (@samp{#}). 
Scratch variables have the same properties as variables left with @cmd{LEAVE}: they retain their values between cases, and for the first @@ -1358,7 +1392,7 @@ portable files. @section File Handles @cindex file handles -A @dfn{file handle} is a reference to a data file, system file, or +A @dfn{file handle} is a reference to a data file, system file, or portable file. Most often, a file handle is specified as the name of a file as a string, that is, enclosed within @samp{'} or @samp{"}. @@ -1366,9 +1400,9 @@ name of a file as a string, that is, enclosed within @samp{'} or A file name string that begins or ends with @samp{|} is treated as the name of a command to pipe data to or from. You can use this feature to read data over the network using a program such as @samp{curl} -(e.g.@: @code{GET '|curl -s -S http://example.com/mydata.sav'}), to +(@i{e.g.}@: @code{GET '|curl -s -S http://example.com/mydata.sav'}), to read compressed data from a file using a program such as @samp{zcat} -(e.g.@: @code{GET '|zcat mydata.sav.gz'}), and for many other +(@i{e.g.}@: @code{GET '|zcat mydata.sav.gz'}), and for many other purposes. @pspp{} also supports declaring named file handles with the @cmd{FILE @@ -1382,9 +1416,9 @@ for more information. In some circumstances, @pspp{} must distinguish whether a file handle refers to a system file or a portable file. When this is necessary to -read a file, e.g.@: as an input file for @cmd{GET} or @cmd{MATCH FILES}, +read a file, @i{e.g.}@: as an input file for @cmd{GET} or @cmd{MATCH FILES}, @pspp{} uses the file's contents to decide. In the context of writing a -file, e.g.@: as an output file for @cmd{SAVE} or @cmd{AGGREGATE}, @pspp{} +file, @i{e.g.}@: as an output file for @cmd{SAVE} or @cmd{AGGREGATE}, @pspp{} decides based on the file's name: if it ends in @samp{.por} (with any capitalization), then @pspp{} writes a portable file; otherwise, @pspp{} writes a system file. @@ -1442,7 +1476,7 @@ Operators and punctuators. 
@cindex @code{.} @item @code{.} The end of the command. This is not necessarily an actual dot in the -syntax file: @xref{Commands}, for more details. +syntax file (@pxref{Commands}). @end table @item