X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fstatistics.texi;h=8c4a2dde445ec6c67b802ca40665d7aa19a56c6c;hb=refs%2Fheads%2Fctables10;hp=f199069c94868e81d5eef8cde06ba88710841a28;hpb=4e9cac1017c866edad3e5573e4a181eeb69c2703;p=pspp diff --git a/doc/statistics.texi b/doc/statistics.texi index f199069c94..8c4a2dde44 100644 --- a/doc/statistics.texi +++ b/doc/statistics.texi @@ -940,7 +940,9 @@ optional: @t{/VLABELS} @t{VARIABLES=}@i{variables} @t{DISPLAY}=@{@t{DEFAULT} @math{|} @t{NAME} @math{|} @t{LABEL} @math{|} @t{BOTH} @math{|} @t{NONE}@} +@ignore @c not yet implemented @t{/MRSETS COUNTDUPLICATES=}@{@t{YES} @math{|} @t{NO}@} +@end ignore @t{/SMISSING} @{@t{VARIABLE} @math{|} @t{LISTWISE}@} @t{/PCOMPUTE} @t{&}@i{category}@t{=EXPR(}@i{expression}@t{)} @t{/PPROPERTIES} @t{&}@i{category}@dots{} @@ -959,7 +961,6 @@ previous @code{TABLE}. All of these subcommands are optional: [@t{POSITION=}@{@t{COLUMN} @math{|} @t{ROW} @math{|} @t{LAYER}@}] [@t{VISIBLE=}@{@t{YES} @math{|} @t{NO}@}] @t{/CLABELS} @{@t{AUTO} @math{|} @{@t{ROWLABELS}@math{|}@t{COLLABELS}@}@t{=}@{@t{OPPOSITE}@math{|}@t{LAYER}@}@} -@t{/CRITERIA CILEVEL=}@i{percentage} @t{/CATEGORIES} @t{VARIABLES=}@i{variables} @{@t{[}@i{value}@t{,} @i{value}@dots{}@t{]} @math{|} [@t{ORDER=}@{@t{A} @math{|} @t{D}@}] @@ -971,6 +972,8 @@ previous @code{TABLE}. All of these subcommands are optional: [@t{TITLE=}@i{string}@dots{}] [@t{CAPTION=}@i{string}@dots{}] [@t{CORNER=}@i{string}@dots{}] +@ignore @c not yet implemented +@t{/CRITERIA CILEVEL=}@i{percentage} @t{/SIGTEST TYPE=CHISQUARE} [@t{ALPHA=}@i{siglevel}] [@t{INCLUDEMRSETS=}@{@t{YES} @math{|} @t{NO}@}] @@ -984,6 +987,7 @@ previous @code{TABLE}. 
All of these subcommands are optional: [@t{MERGE=}@{@t{NO} @math{|} @t{YES}@}] [@t{STYLE=}@{@t{APA} @math{|} @t{SIMPLE}@}] [@t{SHOWSIG=}@{@t{NO} @math{|} @t{YES}@}] +@end ignore @end display The @code{CTABLES} (aka ``custom tables'') command produces @@ -997,11 +1001,6 @@ available at @url{https://data.transportation.gov}. @pspp{} includes this data set, with a slightly modified dictionary, as @file{examples/nhtsa.sav}. -@menu -* CTABLES Basics:: -* CTABLES Data Summarization:: -@end menu - @node CTABLES Basics @subsection Basics @@ -1019,7 +1018,6 @@ variables. At least one must specify an axis expression. * CTABLES Categorical Variable Basics:: * CTABLES Scalar Variable Basics:: * CTABLES Overriding Measurement Level:: -* CTABLES Multiple Response Sets:: @end menu @node CTABLES Categorical Variable Basics @@ -1080,7 +1078,7 @@ CTABLES /TABLE (qn26 + qn27) > qns3a. @subsubsection Scalar Variables For a categorical variable, @code{CTABLES} divides the table into a -cell per category. For a scalar variables, @code{CTABLES} instead +cell per category. For a scalar variable, @code{CTABLES} instead calculates a summary measure, by default the mean, of the values that fall into a cell. For example, if the only variable specified is a scalar variable, then the output is a single cell that holds the mean @@ -1135,11 +1133,13 @@ CTABLES /TABLE qn20 [C] BY qns3a. @end example @psppoutput {ctables9} +@ignore @node CTABLES Multiple Response Sets @subsubheading Multiple Response Sets The @code{CTABLES} command does not yet support multiple response sets. +@end ignore @node CTABLES Data Summarization @subsection Data Summarization @@ -1252,7 +1252,7 @@ A percentage of valid values within the specified @var{area}. A percentage of total values within the specified @var{area}. 
@end table -The following summary functions apply only to scale variables: +The following summary functions apply only to scalar variables: @table @asis @item @code{MAXIMUM} (``Maximum'') @@ -1422,12 +1422,13 @@ CTABLES /TABLE=AgeGroup [TABLEPCT] /SLABELS VISIBLE=NO. @t{/CLABELS} @{@t{AUTO} @math{|} @{@t{ROWLABELS}@math{|}@t{COLLABELS}@}@t{=}@{@t{OPPOSITE}@math{|}@t{LAYER}@}@} @end display -The @code{CLABELS} subcommand controls the position of category -labels. By default, category labels for a given variable nest inside -the variable's label on the same axis. For example, the command below -results in age categories nesting within the age group variable on the -rows axis and gender categories within the gender variable on the -columns axis: +The @code{CLABELS} subcommand controls the position of category labels +for the @code{TABLE} subcommand that it follows. By default, or if +@t{AUTO} is specified, category labels for a given variable nest +inside the variable's label on the same axis. For example, the +command below results in age categories nesting within the age group +variable on the rows axis and gender categories within the gender +variable on the columns axis: @example CTABLES /TABLE AgeGroup BY qns3a. @@ -1454,9 +1455,262 @@ to the layer axis. 
@node CTABLES Per-Variable Category Options @subsection Per-Variable Category Options +@display +@t{/CATEGORIES} @t{VARIABLES=}@i{variables} + @{@t{[}@i{value}@t{,} @i{value}@dots{}@t{]} + @math{|} [@t{ORDER=}@{@t{A} @math{|} @t{D}@}] + [@t{KEY=}@{@t{VALUE} @math{|} @t{LABEL} @math{|} @i{summary}@t{(}@i{variable}@t{)}@}] + [@t{MISSING=}@{@t{EXCLUDE} @math{|} @t{INCLUDE}@}]@} + [@t{TOTAL=}@{@t{NO} @math{|} @t{YES}@} [@t{LABEL=}@i{string}] [@t{POSITION=}@{@t{AFTER} @math{|} @t{BEFORE}@}]] + [@t{EMPTY=}@{@t{INCLUDE} @math{|} @t{EXCLUDE}@}] +@end display + +The @code{CATEGORIES} subcommand specifies, for one or more +categorical variables, the categories to include and exclude, the sort +order for included categories, and treatment of missing values. It +also controls the totals and subtotals to display. It may be +specified any number of times, each time for a different set of +variables. @code{CATEGORIES} applies to the table produced by the +@code{TABLE} subcommand that it follows. + +@code{CATEGORIES} does not apply to scalar variables. + +@t{VARIABLES} is required. List the variables for the subcommand +to affect. + +There are two ways to specify the categories to include and their sort +order: + +@table @asis +@item Explicit categories. +@anchor{CTABLE Explicit Category List} +To explicitly specify categories to include, list the categories +within square brackets in the desired sort order. Use spaces or +commas to separate values. Categories not covered by the list are +excluded from analysis. + +Each element of the list takes one of the following forms: + +@table @t +@item @i{number} +@itemx '@i{string}' +A numeric or string category value, for variables that have the +corresponding type. + +@item '@i{date}' +@itemx '@i{time}' +A date or time category value, for variables that have a date or time +print format. 
+ +@item @i{min} THRU @i{max} +@itemx LO THRU @i{max} +@itemx @i{min} THRU HI +A range of category values, where @var{min} and @var{max} each takes +one of the forms above, in increasing order. + +@item MISSING +All user-missing values. (To match individual user-missing values, +specify their category values.) + +@item OTHERNM +Any non-missing value not covered by any other element of the list +(regardless of where @t{OTHERNM} is placed in the list). + +@item &@i{pcompute} +A computed category name (@pxref{CTABLES Computed Categories}). +@end table + +Additional forms, described later, allow for subtotals. +If multiple elements of the list cover a given category, the last one +in the list is considered to be a match. + +@item Implicit categories. +Without an explicit list of categories, @pspp{} sorts +categories automatically. + +The @code{KEY} setting specifies the sort key. By default, or with +@code{KEY=VALUE}, categories are sorted by value. Categories may +also be sorted by value label, with @code{KEY=LABEL}, or by the value +of a summary function, e.g.@: @code{KEY=COUNT}. For summary +functions, a variable name may be specified in parentheses, e.g.@: +@code{KEY=MAXIMUM(qnd1)}, and this is required for functions that apply +only to scalar variables. The @code{PTILE} function also requires a +percentage argument, e.g.@: @code{KEY=PTILE(qnd1, 90)}. Only summary +functions used in the table may be used, except that @code{COUNT} is +always allowed. + +By default, or with @code{ORDER=A}, categories are sorted in ascending +order. Specify @code{ORDER=D} to sort in descending order. + +User-missing values are excluded by default, or with +@code{MISSING=EXCLUDE}. Specify @code{MISSING=INCLUDE} to include +user-missing values. The system-missing value is always excluded. +@end table + +@subsubheading Totals and Subtotals + +@code{CATEGORIES} also controls display of totals and subtotals. +Totals are not displayed by default, or with @code{TOTAL=NO}. 
Specify +@code{TOTAL=YES} to display a total. By default, the total is labeled +``Total''; use @code{LABEL="@i{label}"} to override it. + +Subtotals are also not displayed by default. To add one or more +subtotals, use an explicit category list and insert @code{SUBTOTAL} or +@code{HSUBTOTAL} in the position or positions where the subtotal +should appear. With @code{SUBTOTAL}, the subtotal becomes an extra +row or column or layer; @code{HSUBTOTAL} additionally hides the +categories that make up the subtotal. Either way, the default label +is ``Subtotal''; use @code{SUBTOTAL="@i{label}"} or +@code{HSUBTOTAL="@i{label}"} to specify a custom label. + +By default, or with @code{POSITION=AFTER}, totals come after the last +category and subtotals apply to categories that precede them. With +@code{POSITION=BEFORE}, totals come before the first category and +subtotals apply to categories that follow them. + +Only categorical variables may have totals and subtotals. Scalar +variables may be ``totaled'' indirectly by enabling totals and +subtotals on a categorical variable within which the scalar variable is +summarized. + +@subsubheading Categories Without Values + +Some categories might not be included in the data set being analyzed. +For example, our example data set has no cases in the ``15 or +younger'' age group. By default, or with @code{EMPTY=INCLUDE}, +@pspp{} includes these empty categories in output tables. To exclude +them, specify @code{EMPTY=EXCLUDE}. + +For implicit categories, empty categories potentially include all the +values with labels for a given variable; for explicit categories, they +include all the values listed individually and all labeled values +covered by ranges or @code{MISSING} or @code{OTHERNM}. 
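+
+As a rough illustration of the @code{CATEGORIES} settings (a sketch,
+not output checked against @pspp{}), the first command below should
+sort age groups by descending count, append a total with a custom
+label, and drop empty categories.  The second uses an explicit
+category list to show all non-missing values, then a subtotal, then
+user-missing values.  The labels @samp{All ages} and @samp{Valid} are
+arbitrary, and the variable names are borrowed from the earlier
+examples:
+
+@example
+CTABLES /TABLE AgeGroup BY qns3a
+    /CATEGORIES VARIABLES=AgeGroup ORDER=D KEY=COUNT
+                TOTAL=YES LABEL='All ages' EMPTY=EXCLUDE.
+CTABLES /TABLE AgeGroup BY qns3a
+    /CATEGORIES VARIABLES=qns3a [OTHERNM, SUBTOTAL='Valid', MISSING].
+@end example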
+ @node CTABLES Titles @subsection Titles +@display +@t{/TITLES} + [@t{TITLE=}@i{string}@dots{}] + [@t{CAPTION=}@i{string}@dots{}] + [@t{CORNER=}@i{string}@dots{}] +@end display + +The @code{TITLES} subcommand sets the title, caption, and corner text +for the table output for the previous @code{TABLE} subcommand. The +title appears above the table, the caption below the table, and the +corner text appears in the table's upper left corner. By default, the +title is ``Custom Tables'' and the caption and corner text are empty. + +@node CTABLES Table Formatting +@subsection Table Formatting + +@display +@t{/FORMAT} + [@t{MINCOLWIDTH=}@{@t{DEFAULT} @math{|} @i{width}@}] + [@t{MAXCOLWIDTH=}@{@t{DEFAULT} @math{|} @i{width}@}] + [@t{UNITS=}@{@t{POINTS} @math{|} @t{INCHES} @math{|} @t{CM}@}] + [@t{EMPTY=}@{@t{ZERO} @math{|} @t{BLANK} @math{|} @i{string}@}] + [@t{MISSING=}@i{string}] +@end display + +The @code{FORMAT} subcommand, which must precede the first +@code{TABLE} subcommand, controls formatting for all the output +tables. @code{FORMAT} and all of its settings are optional. + +Use @code{MINCOLWIDTH} and @code{MAXCOLWIDTH} to control the minimum +or maximum width of columns in output tables. By default, or with +@code{DEFAULT}, column width varies based on content. Otherwise, +specify a number for either or both of these settings. If both are +specified, @code{MAXCOLWIDTH} must be bigger than @code{MINCOLWIDTH}. +The default unit, or with @code{UNITS=POINTS}, is points (1/72 inch), +but specify @code{UNITS=INCHES} to use inches or @code{UNITS=CM} for +centimeters. + +By default, or with @code{EMPTY=ZERO}, zero values are displayed in +their usual format. Use @code{EMPTY=BLANK} to use an empty cell +instead, or @code{EMPTY="@i{string}"} to use the specified string. + +By default, missing values are displayed as @samp{.}, the same as in +other tables. Specify @code{MISSING="@i{string}"} to instead use a +custom string. 
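+
+A minimal sketch that combines the @code{FORMAT} and @code{TITLES}
+subcommands, assuming the variables from the earlier examples; the
+widths and strings are arbitrary placeholders.  @code{FORMAT} comes
+before @code{TABLE} because it applies to every table, whereas
+@code{TITLES} follows the @code{TABLE} that it annotates:
+
+@example
+CTABLES
+    /FORMAT MINCOLWIDTH=36 MAXCOLWIDTH=72 UNITS=POINTS
+            EMPTY=BLANK MISSING='n/a'
+    /TABLE AgeGroup BY qns3a
+    /TITLES TITLE='Age Group by Gender'
+            CAPTION='Data from examples/nhtsa.sav'.
+@end example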
+ +@node CTABLES Display of Variable Labels +@subsection Display of Variable Labels + +@display +@t{/VLABELS} + @t{VARIABLES=}@i{variables} + @t{DISPLAY}=@{@t{DEFAULT} @math{|} @t{NAME} @math{|} @t{LABEL} @math{|} @t{BOTH} @math{|} @t{NONE}@} +@end display + +The @code{VLABELS} subcommand, which must precede the first +@code{TABLE} subcommand, controls display of variable labels in all +the output tables. @code{VLABELS} is optional. It may appear +multiple times to adjust settings for different variables. + +@code{VARIABLES} and @code{DISPLAY} are required. The value of +@code{DISPLAY} controls how variable labels are displayed for the +variables listed on @code{VARIABLES}. The supported values are: + +@table @code +@item DEFAULT +Uses the setting from @ref{SET TVARS}. + +@item NAME +Show only a variable name. + +@item LABEL +Show only a variable label. + +@item BOTH +Show variable name and label. + +@item NONE +Show nothing. +@end table + +@node CTABLES Missing Value Treatment +@subsection Missing Value Treatment + +@display +@t{/SMISSING} @{@t{VARIABLE} @math{|} @t{LISTWISE}@} +@end display + +The @code{SMISSING} subcommand, which must precede the first +@code{TABLE} subcommand, controls treatment of missing values for +scalar variables in producing all the output tables. @code{SMISSING} +is optional. + +With @code{SMISSING=VARIABLE}, which is the default, missing values +are excluded on a variable-by-variable basis. With +@code{SMISSING=LISTWISE}, when scalar variables are stacked, a missing +value for any of the scalar variables causes the case to be excluded +for all of them. 
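+
+For instance, the following sketch displays both names and labels for
+two variables and applies listwise exclusion to them when they are
+stacked.  It assumes that @code{qnd1} and @code{qn20}, which appear in
+earlier examples, are both scalar variables in the example data set:
+
+@example
+CTABLES
+    /VLABELS VARIABLES=qnd1 qn20 DISPLAY=BOTH
+    /SMISSING LISTWISE
+    /TABLE qnd1 + qn20.
+@end example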
+ +@node CTABLES Computed Categories +@subsection Computed Categories + +@display +@t{/PCOMPUTE} @t{&}@i{category}@t{=EXPR(}@i{expression}@t{)} +@t{/PPROPERTIES} @t{&}@i{category}@dots{} + [@t{LABEL=}@i{string}] + [@t{FORMAT=}[@i{summary} @i{format}]@dots{}] + [@t{HIDESOURCECATS=}@{@t{NO} @math{|} @t{YES}@}] +@end display + +@dfn{Computed categories}, also called @dfn{postcomputes}, are +categories created using arithmetic on categories obtained from the +data. The @code{PCOMPUTE} subcommand defines computed categories, +which can then be used in two places: on @code{CATEGORIES} within an +explicit category list (@pxref{CTABLE Explicit Category List}), and on +the @code{PPROPERTIES} subcommand to define further properties for a +given postcompute. + +@code{PCOMPUTE} must precede the first @code{TABLE} subcommand. It is +optional and it may be used multiple times to define multiple +postcomputes. + @node FACTOR @section FACTOR