X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fstatistics.texi;fp=doc%2Fstatistics.texi;h=6957b3836eae3a2c07f8bad2f64d30d781db1249;hb=4eac481a68b1be28ebcbdef2b38fc45dddd55c43;hp=c7abe4d88d4cb94a88a9e098913bdaa07656f486;hpb=f6491a00fd9755850740aa45530498b6ae5d265a;p=pspp
diff --git a/doc/statistics.texi b/doc/statistics.texi
index c7abe4d88d..6957b3836e 100644
--- a/doc/statistics.texi
+++ b/doc/statistics.texi
@@ -30,7 +30,6 @@ far.
 * ONEWAY:: One way analysis of variance.
 * QUICK CLUSTER:: K-Means clustering.
 * RANK:: Compute rank scores.
-* REGRESSION:: Linear regression.
 * RELIABILITY:: Reliability analysis.
 * ROC:: Receiver Operating Characteristic.
 @end menu
@@ -514,9 +513,9 @@ The @cmd{GRAPH} command produces graphical plots of data. Only one of the subcom
 can be produced per call of @cmd{GRAPH}. The @subcmd{MISSING} is optional.
 @menu
-* SCATTERPLOT:: Cartesian Plots
-* HISTOGRAM:: Histograms
-* BAR CHART:: Bar Charts
+* SCATTERPLOT:: Cartesian Plots
+* HISTOGRAM:: Histograms
+* BAR CHART:: Bar Charts
 @end menu
 @node SCATTERPLOT
@@ -987,16 +986,81 @@ previous @code{TABLE}. All of these subcommands are optional:
 [@t{SHOWSIG=}@{@t{NO} @math{|} @t{YES}@}]
 @end display
-The @code{CTABLES} (aka ``custom tables'') command outputs
-multi-dimensional tables, offering many options for table
-summarization and formatting.
+The @code{CTABLES} (aka ``custom tables'') command produces
+multi-dimensional tables from categorical and scale data. It offers
+many options for data summarization and formatting.
-@code{TABLE}, the only required subcommand, specifies the variables to
-include on each dimension, using the syntax @t{/TABLE} @i{rows} @t{BY}
-@i{columns} @t{BY} @i{layers}, in which @i{rows}, @i{columns}, and
-@i{layers} is each empty or an @i{axis}. The simplest form of
-@i{axis} is just a variable name.
+This section's examples use data from the 2008 (USA) National Survey
+of Drinking and Driving Attitudes and Behaviors, a public domain data
+set from the (USA) National Highway Traffic Administration and
+available at @url{https://data.transportation.gov}. @pspp{} includes
+this data set, with a slightly modified dictionary, as
+@file{examples/nhtsa.sav}.
+@menu
+* CTABLES Basics::
+@end menu
+
+@node CTABLES Basics
+@subsection Basics
+
+The only required subcommand is @code{TABLE}, which specifies the
+variables to include along each axis:
+@display
+@t{/TABLE} @i{rows} [@t{BY} @i{columns} [@t{BY} @i{layers}]]
+@end display
+@noindent
+In @code{TABLE}, each of @var{rows}, @var{columns}, and @var{layers}
+is either empty or an axis expression that specifies one or more
+variables. An axis expression that names a categorical variable
+divides the data into cells according to the values of that variable.
+When all the variables named on @code{TABLE} are categorical, by
+default each cell displays the number of cases that it contains, so
+specifying a single variable yields a frequency table:
+
+@example
+CTABLES /TABLE=AgeGroup.
+@end example
+@psppoutput {ctables1}
+
+@noindent
+Specifying a row and a column categorical variable yields a
+crosstabulation:
+
+@example
+CTABLES /TABLE=AgeGroup BY qns3a.
+@end example
+@psppoutput {ctables2}
+
+@noindent
+The @samp{>} operator nests multiple variables on a single axis, e.g.:
+
+@example
+CTABLES /TABLE qn105ba BY AgeGroup > qns3a.
+@end example
+@psppoutput {ctables3}
+
+@noindent
+The @samp{+} operator allows a single output table to include multiple
+data analyses. With @samp{+}, @code{CTABLES} divides the output table
+into multiple sections, each of which includes an analysis of the full
+data set. For example, the following command separately tabulates age
+group and driving frequency by gender:
+
+@example
+CTABLES /TABLE AgeGroup + qn1 BY qns3a.
+@end example
+@psppoutput {ctables4}
+
+@noindent
+If @samp{+} and @samp{>} are used together, @samp{>} binds more
+tightly. Use parentheses to override operator precedence. Thus:
+
+@example
+CTABLES /TABLE qn26 + qn27 > qns3a.
+CTABLES /TABLE (qn26 + qn27) > qns3a.
+@end example
+@psppoutput {ctables5}
 @node FACTOR
 @section FACTOR
@@ -1503,19 +1567,19 @@ is used.
 @menu
-* BINOMIAL:: Binomial Test
-* CHISQUARE:: Chi-square Test
-* COCHRAN:: Cochran Q Test
-* FRIEDMAN:: Friedman Test
-* KENDALL:: Kendall's W Test
-* KOLMOGOROV-SMIRNOV:: Kolmogorov Smirnov Test
-* KRUSKAL-WALLIS:: Kruskal-Wallis Test
-* MANN-WHITNEY:: Mann Whitney U Test
-* MCNEMAR:: McNemar Test
-* MEDIAN:: Median Test
-* RUNS:: Runs Test
-* SIGN:: The Sign Test
-* WILCOXON:: Wilcoxon Signed Ranks Test
+* BINOMIAL:: Binomial Test
+* CHISQUARE:: Chi-square Test
+* COCHRAN:: Cochran Q Test
+* FRIEDMAN:: Friedman Test
+* KENDALL:: Kendall's W Test
+* KOLMOGOROV-SMIRNOV:: Kolmogorov Smirnov Test
+* KRUSKAL-WALLIS:: Kruskal-Wallis Test
+* MANN-WHITNEY:: Mann Whitney U Test
+* MCNEMAR:: McNemar Test
+* MEDIAN:: Median Test
+* RUNS:: Runs Test
+* SIGN:: The Sign Test
+* WILCOXON:: Wilcoxon Signed Ranks Test
 @end menu