X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Flanguage.texi;h=7b750eb7ae75ac3acf8053184aaacded62fc97a2;hb=3640237a5fc890a84cb814fbe8bf6fd9299624e4;hp=9fcfde677bd7b67a81ba8052d23a6e1d1a8d8b2d;hpb=4056e461fd8f8d9ba7feca63c73d2f50a2048b63;p=pspp
diff --git a/doc/language.texi b/doc/language.texi
index 9fcfde677b..7b750eb7ae 100644
--- a/doc/language.texi
+++ b/doc/language.texi
@@ -1,5 +1,5 @@
 @c PSPP - a program for statistical analysis.
-@c Copyright (C) 2017 Free Software Foundation, Inc.
+@c Copyright (C) 2017, 2020 Free Software Foundation, Inc.
 @c Permission is granted to copy, distribute and/or modify this document
 @c under the terms of the GNU Free Documentation License, Version 1.3
 @c or any later version published by the Free Software Foundation;
@@ -13,7 +13,7 @@
 @cindex @pspp{}, language
 This chapter discusses elements common to many @pspp{} commands.
-Later chapters will describe individual commands in detail.
+Later chapters describe individual commands in detail.
 @menu
 * Tokens:: Characters combine to form tokens.
@@ -104,7 +104,7 @@ number.
 No white space is allowed within a number token, except for horizontal white space between @samp{-} and the rest of the number.
-The last example above, @samp{8945.} will be interpreted as two
+The last example above, @samp{8945.} is interpreted as two
 tokens, @samp{8945} and @samp{.}, if it is the last token on a line. @xref{Commands, , Forming commands of tokens}.
@@ -440,7 +440,7 @@ variables' names may not begin with @samp{$}.
 @cindex variable names, ending with period
 The final character in a variable name should not be @samp{.}, because such an identifier will be misinterpreted when it is the final token
-on a line: @code{FOO.} will be divided into two separate tokens,
+on a line: @code{FOO.} is divided into two separate tokens,
 @samp{FOO} and @samp{.}, indicating end-of-command. @xref{Tokens}.
 @cindex @samp{_}
@@ -507,6 +507,75 @@ they are displayed. Example: a width of 8, with 2 decimal places.
 Similar to print format, but used by the @cmd{WRITE} command (@pxref{WRITE}).
+@cindex measurement level
+@item Measurement level
+@anchor{Measurement Level}
+One of the following:
+
+@table @asis
+@item Nominal
+Each value of a nominal variable represents a distinct category. The
+possible categories are finite and often have value labels. The order
+of categories is not significant. Political parties, US states, and
+yes/no choices are nominal. Numeric and string variables can be
+nominal.
+
+@item Ordinal
+Ordinal variables also represent distinct categories, but their values
+are arranged according to some natural order. Likert scales, e.g.@:
+from strongly disagree to strongly agree, are ordinal. Data grouped
+into ranges, e.g.@: age groups or income groups, are ordinal. Both
+numeric and string variables can be ordinal. String values are
+ordered alphabetically, so letter grades from A to F will work as
+expected, but @code{poor}, @code{satisfactory}, @code{excellent} will
+not.
+
+@item Scale
+Scale variables are ones for which differences and ratios are
+meaningful. These are often values which have a natural unit
+attached, such as age in years, income in dollars, or distance in
+miles. Only numeric variables are scalar.
+@end table
+
+Variables created by @cmd{COMPUTE} and similar transformations,
+obtained from external sources, etc., initially have an unknown
+measurement level. Any procedure that reads the data will then assign
+a default measurement level. @pspp{} can assign some defaults without
+reading the data:
+
+@itemize @bullet
+@item
+Nominal, if it's a string variable.
+
+@item
+Nominal, if the variable has a WKDAY or MONTH print format.
+
+@item
+Scale, if the variable has a DOLLAR, CCA through CCE, or time or date
+print format.
+@end itemize
+
+Otherwise, @pspp{} reads the data and decides based on its
+distribution:
+
+@itemize @bullet
+@item
+Nominal, if all observations are missing.
+
+@item
+Scale, if one or more valid observations are noninteger or negative.
+
+@item
+Scale, if no valid observation is less than 10.
+
+@item
+Scale, if the variable has 24 or more unique valid values. The value
+24 is the default and can be adjusted (@pxref{SET SCALEMIN}).
+@end itemize
+
+Finally, if none of the above is true, @pspp{} assigns the variable a
+nominal measurement level.
+
 @cindex custom attributes
 @item Custom attributes
 User-defined associations between names and values. @xref{VARIABLE
@@ -537,7 +606,12 @@ shuffled around.
 @cindex @code{$DATE}
 @item $DATE
 Date the @pspp{} process was started, in format A9, following the
-pattern @code{DD MMM YY}.
+pattern @code{DD-MMM-YY}.
+
+@cindex @code{$DATE11}
+@item $DATE11
+Date the @pspp{} process was started, in format A11, following the
+pattern @code{DD-MMM-YYYY}.
 @cindex @code{$JDATE}
 @item $JDATE
@@ -791,8 +865,10 @@ would not fit at all without it. Scientific notation with @samp{$} or
 @item
 Except in scientific notation, a decimal point is included only when it is followed by a digit. If the integer part of the number being
-output is 0, and a decimal point is included, then the zero before the
-decimal point is dropped.
+output is 0, and a decimal point is included, then @pspp{} ordinarily
+drops the zero before the decimal point. However, in @code{F},
+@code{COMMA}, or @code{DOT} formats, @pspp{} keeps the zero if
+@code{SET LEADZERO} is set to @code{ON} (@pxref{SET LEADZERO}).
 In scientific notation, the number always includes a decimal point, even if it is not followed by a digit.
@@ -813,7 +889,7 @@ In scientific notation, the exponent is output as @samp{E} followed by
 @samp{+} or @samp{-} and exactly three digits. Numbers with magnitude less than 10**-999 or larger than 10**999 are not supported by most computers, but if they are supported then their output is considered
-to overflow the field and will be output as asterisks.
+to overflow the field and they are output as asterisks.
 @item
 On most computers, no more than 15 decimal digits are significant in
@@ -826,7 +902,7 @@ calculations may also reduce precision of output.
 Special values such as infinities and ``not a number'' values are usually converted to the system-missing value before printing. In a few circumstances, these values are output directly. In fields of width 3
-or greater, special values are output as however many characters will
+or greater, special values are output as however many characters
 fit from @code{+Infinity} or @code{-Infinity} for infinities, from @code{NaN} for ``not a number,'' or from @code{Unknown} for other values (if any are supported by the system). In fields under 3 columns wide,
@@ -850,8 +926,8 @@ characters long.
 @var{string} must contain exactly three commas or exactly three periods (but not both), except that a single quote character may be used to ``escape'' a following comma, period, or single quote. If three commas
-are used, commas will be used for grouping in output, and a period will
-be used as the decimal point. Uses of periods reverses these roles.
+are used, commas are used for grouping in output, and a period
+is used as the decimal point. Uses of periods reverses these roles.
 The commas or periods divide @var{string} into four fields, called the @dfn{negative prefix}, @dfn{prefix}, @dfn{suffix}, and @dfn{negative
@@ -1053,7 +1129,7 @@ WRB}).
 The recommended field width depends on the floating-point format. NATIVE (the default format), IDL, IDB, VD, VG, and ZL formats should use a field width of 8. ISL, ISB, VF, and ZS formats should use a field
-width of 4. Other field widths will not produce useful results. The
+width of 4. Other field widths do not produce useful results. The
 maximum field width is 8. No decimal places may be specified. The default output format is F8.2.
@@ -1217,11 +1293,11 @@ In the table, ``Option'' describes what increased output width enables:
 @table @asis
 @item 4-digit year
-A field 2 columns wider than minimum will include a 4-digit year.
+A field 2 columns wider than the minimum includes a 4-digit year.
 (DATETIME and YMDHMS formats always include a 4-digit year.)
 @item seconds
-A field 3 columns wider than minimum will include seconds as well as
+A field 3 columns wider than the minimum includes seconds as well as
 minutes. A field 5 columns wider than minimum, or more, can also include a decimal point and fractional seconds (but no more than allowed by the format's decimal places).
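
The default measurement-level rules added in the hunk at -507,6 +507,75 read as an ordered decision procedure. The Python sketch below restates them for illustration only; it is not PSPP's implementation. The function name `default_measure`, its parameters, the use of `None` for a missing value, and the list of format names treated as "time or date" formats are all assumptions of this sketch.

```python
from typing import Iterable, Optional

# Format names from the patch text. Treating exactly these names as the
# "time or date" formats is an assumption of this sketch.
NOMINAL_FORMATS = {"WKDAY", "MONTH"}
SCALE_FORMATS = {
    "DOLLAR", "CCA", "CCB", "CCC", "CCD", "CCE",
    "DATE", "ADATE", "EDATE", "JDATE", "SDATE", "QYR", "MOYR", "WKYR",
    "DATETIME", "YMDHMS", "MTIME", "TIME", "DTIME",
}


def default_measure(
    values: Iterable[Optional[float]],
    *,
    is_string: bool = False,
    print_format: str = "F",
    scalemin: int = 24,
) -> str:
    """Return "nominal" or "scale" following the rules listed in the patch.

    `values` holds numeric observations, with None standing in for a
    missing value. `scalemin` mirrors the adjustable default of 24.
    """
    fmt = print_format.upper()

    # Rules that need no pass over the data.
    if is_string:
        return "nominal"
    if fmt in NOMINAL_FORMATS:
        return "nominal"
    if fmt in SCALE_FORMATS:
        return "scale"

    # Rules based on the distribution of valid (non-missing) observations.
    valid = [v for v in values if v is not None]
    if not valid:
        return "nominal"  # all observations are missing
    if any(v < 0 or not float(v).is_integer() for v in valid):
        return "scale"    # some value is noninteger or negative
    if min(valid) >= 10:
        return "scale"    # no valid observation is less than 10
    if len(set(valid)) >= scalemin:
        return "scale"    # enough unique valid values
    return "nominal"      # none of the above applied
```

The checks run in the order the patch lists them, so an early rule (for example, a string variable) wins over any later distribution-based rule.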