X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fstatistics.texi;h=4c65898b8eb6be7f70e2f61b586fd2fb2f8c5ecd;hb=d98583b9425b8a053dc21b539203406bac74adc5;hp=2e4c96c6a17a566d502eca34fe3a9a2601f33c7b;hpb=fb429d05d61e959fcd4a0116e9b4327967845f9d;p=pspp diff --git a/doc/statistics.texi b/doc/statistics.texi index 2e4c96c6a1..4c65898b8e 100644 --- a/doc/statistics.texi +++ b/doc/statistics.texi @@ -1658,7 +1658,14 @@ A subtotal (@pxref{CTABLES Totals and Subtotals}). If multiple elements of the list cover a given category, the last one in the list takes precedence. -@c TODO example +The following example syntax and output show how an explicit category +can limit the displayed categories: + +@example +CTABLES /TABLE qn1. +CTABLES /TABLE qn1 /CATEGORIES VARIABLES=qn1 [1, 2, 3]. +@end example +@psppoutput {ctables27} @node CTABLES Implicit Categories @subsubsection Implicit Categories @@ -1687,7 +1694,15 @@ User-missing values are excluded by default, or with @code{MISSING=EXCLUDE}. Specify @code{MISSING=INCLUDE} to include user-missing values. The system-missing value is always excluded. -@c TODO example +The following example syntax and output show how +@code{MISSING=INCLUDE} causes missing values to be included in a +category list. + +@example +CTABLES /TABLE qn1. +CTABLES /TABLE qn1 /CATEGORIES VARIABLES=qn1 MISSING=INCLUDE. +@end example +@psppoutput {ctables28} @node CTABLES Totals and Subtotals @subsubsection Totals and Subtotals @@ -1706,21 +1721,35 @@ subtotal. Either way, the default label is ``Subtotal'', use @code{SUBTOTAL="@i{label}"} or @code{HSUBTOTAL="@i{label}"} to specify a custom label. -@c TODO +The following example syntax and output show how to use +@code{TOTAL=YES} and @code{SUBTOTAL}: + +@example +CTABLES + /TABLE qn1 + /CATEGORIES VARIABLES=qn1 [OTHERNM, SUBTOTAL='Valid Total', + MISSING, SUBTOTAL='Missing Total'] + TOTAL=YES LABEL='Overall Total'. 
+@end example +@psppoutput {ctables29} By default, or with @code{POSITION=AFTER}, totals are displayed in the output after the last category and subtotals apply to categories that precede them. With @code{POSITION=BEFORE}, totals come before the first category and subtotals apply to categories that follow them. -@c TODO - Only categorical variables may have totals and subtotals. Scalar variables may be ``totaled'' indirectly by enabling totals and -subtotals on a categorical variable within which the scalar variable is -summarized. +subtotals on a categorical variable within which the scalar variable +is summarized. For example, the following syntax produces a mean, +count, and valid count across all data by adding a total on the +categorical @code{region} variable, as shown: -@c TODO +@example +CTABLES /TABLE=region > qn20 [MEAN, VALIDN] + /CATEGORIES VARIABLES=region TOTAL=YES LABEL='All regions'. +@end example +@psppoutput {ctables30} By default, @pspp{} uses the same summary functions for totals and subtotals as other categories. To summarize totals and subtotals @@ -1752,7 +1781,15 @@ categories, they include all the values listed individually and all values with value labels that are covered by ranges or @code{MISSING} or @code{OTHERNM}. -@c TODO +The following example syntax and output show the effect of +@code{EMPTY=EXCLUDE} for the @code{qns1} variable, in which 0 is labeled +``None'' but no cases exist with that value: + +@example +CTABLES /TABLE=qns1. +CTABLES /TABLE=qns1 /CATEGORIES VARIABLES=qns1 EMPTY=EXCLUDE. +@end example +@psppoutput {ctables31} @node CTABLES Titles @subsection Titles @@ -1791,8 +1828,6 @@ The expression specified on the @code{TABLE} command. Summary and measurement level specifications are omitted, and variable labels are used in place of variable names. @end table -@c TODO example - @node CTABLES Table Formatting @subsection Table Formatting @@ -1862,8 +1897,6 @@ Show variable name and label. Show nothing. 
@end table -@c TODO example - @node CTABLES Missing Value Treatment @subsection Missing Value Treatment @@ -1871,26 +1904,40 @@ The @code{TABLE} subcommand on @code{CTABLES} specifies two different kinds of variables: variables that divide tables into cells (which are always categorical) and variables being summarized (which may be categorical or scale). @pspp{} treats missing values differently in -each kind of variable: +each kind of variable, as described in the sections below. + +@node CTABLES Missing Values for Cell-Defining Variables +@subsubsection Missing Values for Cell-Defining Variables -@itemize @bullet -@item For variables that divide tables into cells, per-variable category -options determine which data is analyzed. If any of the categories -for such a variable would exclude a case, then that case is not -included. +options, as described in @ref{CTABLES Per-Variable Category Options}, +determine which data is analyzed. If any of the categories for such a +variable would exclude a case, then that case is not included. -@item -The treatment of missing values in variables being summarized varies -between scale and scale and categorical variables. The following -section describes their treatment in detail. +As an example, consider the following entirely artificial dataset, in +which @samp{x} and @samp{y} are categorical variables with missing +value 9, and @samp{z} is scale: -By default, each summarized variable is considered separately for -missing value treatment. A section below describes how to consider -missing values listwise for summarizing scale variables. -@end itemize +@psppoutput{ctables32} -@c TODO example +Using @samp{x} and @samp{y} to define cells, and summarizing @samp{z}, +by default @pspp{} omits all the cases that have @samp{x} or @samp{y} (or both) +missing: + +@example +CTABLES /TABLE x > y > z [SUM]. 
+@end example +@psppoutput{ctables33} + +If, however, we add @code{CATEGORIES} specifications to include +missing values for @samp{y} or for @samp{x} and @samp{y}, the output +table includes them, like so: + +@example +CTABLES /TABLE x > y > z [SUM] /CATEGORIES VARIABLES=y MISSING=INCLUDE. +CTABLES /TABLE x > y > z [SUM] /CATEGORIES VARIABLES=x y MISSING=INCLUDE. +@end example +@psppoutput{ctables34} @node CTABLES Missing Values for Summary Variables @subsubsection Missing Values for Summary Variables @@ -1993,19 +2040,30 @@ CTABLES /SMISSING LISTWISE /TABLE (y > x) + (z > x). @display @t{/PCOMPUTE} @t{&}@i{postcompute}@t{=EXPR(}@i{expression}@t{)} +@t{/PPROPERTIES} @t{&}@i{postcompute}@dots{} + [@t{LABEL=}@i{string}] + [@t{FORMAT=}[@i{summary} @i{format}]@dots{}] + [@t{HIDESOURCECATS=}@{@t{NO} @math{|} @t{YES}@} @end display @dfn{Computed categories}, also called @dfn{postcomputes}, are categories created using arithmetic on categories obtained from the -data. The @code{PCOMPUTE} subcommand defines computed categories, -which can then be used in two places: on @code{CATEGORIES} within an -explicit category list (@pxref{CTABLES Explicit Category List}), and on -the @code{PPROPERTIES} subcommand to define further properties for a -given postcompute. +data. The @code{PCOMPUTE} subcommand creates a postcompute, which may +then be used on @code{CATEGORIES} within an explicit category list +(@pxref{CTABLES Explicit Category List}). Optionally, +@code{PPROPERTIES} refines how a postcompute is displayed. The +following sections provide the details. -@code{PCOMPUTE} must precede the first @code{TABLE} command. It is -optional and it may be used any number of times to define multiple -postcomputes. +@node CTABLES PCOMPUTE +@subsubsection PCOMPUTE + +@display +@t{/PCOMPUTE} @t{&}@i{postcompute}@t{=EXPR(}@i{expression}@t{)} +@end display + +The @code{PCOMPUTE} subcommand, which must precede the first +@code{TABLE} command, defines computed categories. 
It is optional and +may be used any number of times to define multiple postcomputes. Each @code{PCOMPUTE} defines one postcompute. Its syntax consists of a name to identify the postcompute as a @pspp{} identifier prefixed by @@ -2072,10 +2130,30 @@ Normally a named postcompute is defined only once, but if a later @code{PCOMPUTE} redefines a postcompute with the same name as an earlier one, the later one take precedence. -@c TODO example +The following syntax and output show how @code{PCOMPUTE} can compute +a total over subtotals, summing the ``Frequent Drivers'' and +``Infrequent Drivers'' subtotals to form an ``All Drivers'' +postcompute. It also shows how to calculate and display a percentage, +in this case the percentage of valid responses that report never +driving. It uses @code{PPROPERTIES} (@pxref{CTABLES PPROPERTIES}) to +display the latter in @code{PCT} format. -@node CTABLES Computed Category Properties -@subsection Computed Category Properties +@example +CTABLES + /PCOMPUTE &all_drivers=EXPR([1 THRU 2] + [3 THRU 4]) + /PPROPERTIES &all_drivers LABEL='All Drivers' + /PCOMPUTE &pct_never=EXPR([5] / ([1 THRU 2] + [3 THRU 4] + [5]) * 100) + /PPROPERTIES &pct_never LABEL='% Not Drivers' FORMAT=COUNT PCT40.1 + /TABLE=qn1 BY qns3a + /CATEGORIES VARIABLES=qn1 [1 THRU 2, SUBTOTAL='Frequent Drivers', + 3 THRU 4, SUBTOTAL='Infrequent Drivers', + &all_drivers, 5, &pct_never, + MISSING, SUBTOTAL='Not Drivers or Missing']. +@end example +@psppoutput{ctables35} + +@node CTABLES PPROPERTIES +@subsubsection PPROPERTIES @display @t{/PPROPERTIES} @t{&}@i{postcompute}@dots{} @@ -2104,7 +2182,7 @@ By default, or with @code{HIDESOURCECATS=NO}, categories referred to by computed categories are displayed like other categories. Use @code{HIDESOURCECATS=YES} to hide them. -@c TODO example +The previous section provides an example for @code{PPROPERTIES}.
@node CTABLES Effective Weight @subsection Effective Weight @@ -2135,12 +2213,18 @@ with zero, missing, or negative effective weights. @end display The @code{HIDESMALLCOUNTS} subcommand is optional. If it specified, -then count values in output tables less than the value of @i{count} -are shown as @code{<@i{count}} instead of their true values. The -value of @i{count} must be an integer and must be at least 2. Case -weights are considered for deciding whether to hide a count. +then @code{COUNT}, @code{ECOUNT}, and @code{UCOUNT} values in output +tables less than the value of @i{count} are shown as @code{<@i{count}} +instead of their true values. The value of @i{count} must be an +integer and must be at least 2. + +The following syntax and output show how to use +@code{HIDESMALLCOUNTS}: -@c TODO example +@example +CTABLES /HIDESMALLCOUNTS COUNT=10 /TABLE qn37. +@end example +@psppoutput{ctables36} @node FACTOR @section FACTOR