X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fstatistics.texi;h=4c65898b8eb6be7f70e2f61b586fd2fb2f8c5ecd;hb=d98583b9425b8a053dc21b539203406bac74adc5;hp=0cfceb225afa5d59988a7a426f5042866d0291a2;hpb=2088d7438791ad96dda2037a6ac7e9b0f3998c8b;p=pspp

diff --git a/doc/statistics.texi b/doc/statistics.texi
index 0cfceb225a..4c65898b8e 100644
--- a/doc/statistics.texi
+++ b/doc/statistics.texi
@@ -1781,7 +1781,15 @@ categories, they include all the values listed individually and all
 values with value labels that are covered by ranges or @code{MISSING}
 or @code{OTHERNM}.
 
-@c TODO
+The following example syntax and output show the effect of
+@code{EMPTY=EXCLUDE} for the @code{qns1} variable, in which 0 is labeled
+``None'' but no cases exist with that value:
+
+@example
+CTABLES /TABLE=qns1.
+CTABLES /TABLE=qns1 /CATEGORIES VARIABLES=qns1 EMPTY=EXCLUDE.
+@end example
+@psppoutput {ctables31}
 
 @node CTABLES Titles
 @subsection Titles
@@ -1820,8 +1828,6 @@ The expression specified on the @code{TABLE} command. Summary and
 measurement level specifications are omitted, and variable labels are
 used in place of variable names.
 @end table
-@c TODO example
-
 @node CTABLES Table Formatting
 @subsection Table Formatting
 
@@ -1891,8 +1897,6 @@ Show variable name and label.
 Show nothing.
 @end table
 
-@c TODO example
-
 @node CTABLES Missing Value Treatment
 @subsection Missing Value Treatment
 
@@ -1900,26 +1904,40 @@ The @code{TABLE} subcommand on @code{CTABLES} specifies two different
 kinds of variables: variables that divide tables into cells (which are
 always categorical) and variables being summarized (which may be
 categorical or scale). @pspp{} treats missing values differently in
-each kind of variable:
+each kind of variable, as described in the sections below.
+
+@node CTABLES Missing Values for Cell-Defining Variables
+@subsubsection Missing Values for Cell-Defining Variables
 
-@itemize @bullet
-@item
 For variables that divide tables into cells, per-variable category
-options determine which data is analyzed. If any of the categories
-for such a variable would exclude a case, then that case is not
-included.
+options, as described in @ref{CTABLES Per-Variable Category Options},
+determine which data is analyzed. If any of the categories for such a
+variable would exclude a case, then that case is not included.
 
-@item
-The treatment of missing values in variables being summarized varies
-between scale and scale and categorical variables. The following
-section describes their treatment in detail.
+As an example, consider the following entirely artificial dataset, in
+which @samp{x} and @samp{y} are categorical variables with missing
+value 9, and @samp{z} is scale:
 
-By default, each summarized variable is considered separately for
-missing value treatment. A section below describes how to consider
-missing values listwise for summarizing scale variables.
-@end itemize
+@psppoutput{ctables32}
+
+Using @samp{x} and @samp{y} to define cells, and summarizing @samp{z},
+by default @pspp{} omits all the cases that have @samp{x} or @samp{y} (or both)
+missing:
+
+@example
+CTABLES /TABLE x > y > z [SUM].
+@end example
+@psppoutput{ctables33}
+
+If, however, we add @code{CATEGORIES} specifications to include
+missing values for @samp{y} or for @samp{x} and @samp{y}, the output
+table includes them, like so:
 
-@c TODO example
+@example
+CTABLES /TABLE x > y > z [SUM] /CATEGORIES VARIABLES=y MISSING=INCLUDE.
+CTABLES /TABLE x > y > z [SUM] /CATEGORIES VARIABLES=x y MISSING=INCLUDE.
+@end example
+@psppoutput{ctables34}
 
 @node CTABLES Missing Values for Summary Variables
 @subsubsection Missing Values for Summary Variables
@@ -2022,19 +2040,30 @@ CTABLES /SMISSING LISTWISE /TABLE (y > x) + (z > x).
 
 @display
 @t{/PCOMPUTE} @t{&}@i{postcompute}@t{=EXPR(}@i{expression}@t{)}
+@t{/PPROPERTIES} @t{&}@i{postcompute}@dots{}
+    [@t{LABEL=}@i{string}]
+    [@t{FORMAT=}[@i{summary} @i{format}]@dots{}]
+    [@t{HIDESOURCECATS=}@{@t{NO} @math{|} @t{YES}@}]
 @end display
 
 @dfn{Computed categories}, also called @dfn{postcomputes}, are
 categories created using arithmetic on categories obtained from the
-data. The @code{PCOMPUTE} subcommand defines computed categories,
-which can then be used in two places: on @code{CATEGORIES} within an
-explicit category list (@pxref{CTABLES Explicit Category List}), and on
-the @code{PPROPERTIES} subcommand to define further properties for a
-given postcompute.
+data. The @code{PCOMPUTE} subcommand creates a postcompute, which may
+then be used on @code{CATEGORIES} within an explicit category list
+(@pxref{CTABLES Explicit Category List}). Optionally,
+@code{PPROPERTIES} refines how a postcompute is displayed. The
+following sections provide the details.
+
+@node CTABLES PCOMPUTE
+@subsubsection PCOMPUTE
+
+@display
+@t{/PCOMPUTE} @t{&}@i{postcompute}@t{=EXPR(}@i{expression}@t{)}
+@end display
 
-@code{PCOMPUTE} must precede the first @code{TABLE} command. It is
-optional and it may be used any number of times to define multiple
-postcomputes.
+The @code{PCOMPUTE} subcommand, which must precede the first
+@code{TABLE} command, defines computed categories. It is optional and
+may be used any number of times to define multiple postcomputes.
 
 Each @code{PCOMPUTE} defines one postcompute. Its syntax consists of
 a name to identify the postcompute as a @pspp{} identifier prefixed by
@@ -2101,10 +2130,30 @@ Normally a named postcompute is defined only once, but if a later
 @code{PCOMPUTE} redefines a postcompute with the same name as an
 earlier one, the later one take precedence.
 
-@c TODO example
+The following syntax and output show how @code{PCOMPUTE} can compute
+a total over subtotals, summing the ``Frequent Drivers'' and
+``Infrequent Drivers'' subtotals to form an ``All Drivers''
+postcompute. It also shows how to calculate and display a percentage,
+in this case the percentage of valid responses that report never
+driving. It uses @code{PPROPERTIES} (@pxref{CTABLES PPROPERTIES}) to
+display the latter in @code{PCT} format.
+
+@example
+CTABLES
+    /PCOMPUTE &all_drivers=EXPR([1 THRU 2] + [3 THRU 4])
+    /PPROPERTIES &all_drivers LABEL='All Drivers'
+    /PCOMPUTE &pct_never=EXPR([5] / ([1 THRU 2] + [3 THRU 4] + [5]) * 100)
+    /PPROPERTIES &pct_never LABEL='% Not Drivers' FORMAT=COUNT PCT40.1
+    /TABLE=qn1 BY qns3a
+    /CATEGORIES VARIABLES=qn1 [1 THRU 2, SUBTOTAL='Frequent Drivers',
+                               3 THRU 4, SUBTOTAL='Infrequent Drivers',
+                               &all_drivers, 5, &pct_never,
+                               MISSING, SUBTOTAL='Not Drivers or Missing'].
+@end example
+@psppoutput{ctables35}
 
-@node CTABLES Computed Category Properties
-@subsection Computed Category Properties
+@node CTABLES PPROPERTIES
+@subsubsection PPROPERTIES
 
 @display
 @t{/PPROPERTIES} @t{&}@i{postcompute}@dots{}
@@ -2133,7 +2182,7 @@ By default, or with @code{HIDESOURCECATS=NO}, categories referred to
 by computed categories are displayed like other categories. Use
 @code{HIDESOURCECATS=YES} to hide them.
 
-@c TODO example
+The previous section provides an example for @code{PPROPERTIES}.
 
 @node CTABLES Effective Weight
 @subsection Effective Weight
@@ -2164,12 +2213,18 @@ with zero, missing, or negative effective weights.
 @end display
 
 The @code{HIDESMALLCOUNTS} subcommand is optional. If it specified,
-then count values in output tables less than the value of @i{count}
-are shown as @code{<@i{count}} instead of their true values. The
-value of @i{count} must be an integer and must be at least 2. Case
-weights are considered for deciding whether to hide a count.
+then @code{COUNT}, @code{ECOUNT}, and @code{UCOUNT} values in output
+tables less than the value of @i{count} are shown as @code{<@i{count}}
+instead of their true values. The value of @i{count} must be an
+integer and must be at least 2.
+
+The following syntax and output show how to use
+@code{HIDESMALLCOUNTS}:
 
-@c TODO example
+@example
+CTABLES /HIDESMALLCOUNTS COUNT=10 /TABLE qn37.
+@end example
+@psppoutput{ctables36}
 
 @node FACTOR
 @section FACTOR