X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fstatistics.texi;h=01976e27c950fa4ef21087dc40e0fadbd668a3ee;hb=25e030629aed1fda2a75115a7e35b7a0797b7458;hp=c0e6a1c084d190fdad0a055bec2517d0c034228f;hpb=0e958ac80f5add8d0581c218badbdf9bddcde9bc;p=pspp

diff --git a/doc/statistics.texi b/doc/statistics.texi
index c0e6a1c084..01976e27c9 100644
--- a/doc/statistics.texi
+++ b/doc/statistics.texi
@@ -678,12 +678,8 @@ The keyword @subcmd{ALL} is the union of @subcmd{DESCRIPTIVES} and @subcmd{XPROD
 CROSSTABS
         /TABLES=@var{var_list} BY @var{var_list} [BY @var{var_list}]@dots{}
         /MISSING=@{TABLE,INCLUDE,REPORT@}
-        /WRITE=@{NONE,CELLS,ALL@}
         /FORMAT=@{TABLES,NOTABLES@}
-                @{PIVOT,NOPIVOT@}
                 @{AVALUE,DVALUE@}
-                @{NOINDEX,INDEX@}
-                @{BOX,NOBOX@}
         /CELLS=@{COUNT,ROW,COLUMN,TOTAL,EXPECTED,RESIDUAL,SRESIDUAL,
                 ASRESIDUAL,ALL,NONE@}
         /COUNT=@{ASIS,CASE,CELL@}
@@ -728,8 +724,6 @@ tables and statistics. When set to @subcmd{REPORT}, which is allowed only in
 integer mode, user-missing values are included in tables but marked with
 a footnote and excluded from statistical calculations.
 
-Currently the @subcmd{WRITE} subcommand is ignored.
-
 The @subcmd{FORMAT} subcommand controls the characteristics of the
 crosstabulation tables to be displayed. It has a number of possible
 settings:
@@ -737,22 +731,11 @@ settings:
 @itemize @w{}
 @item
 @subcmd{TABLES}, the default, causes crosstabulation tables to be output.
-@subcmd{NOTABLES} suppresses them.
-
-@item
-@subcmd{PIVOT}, the default, causes each @subcmd{TABLES} subcommand to be displayed in a
-pivot table format. @subcmd{NOPIVOT} causes the old-style crosstabulation format
-to be used.
+@subcmd{NOTABLES}, which is equivalent to @code{CELLS=NONE}, suppresses them.
 
 @item
 @subcmd{AVALUE}, the default, causes values to be sorted in ascending order.
 @subcmd{DVALUE} asserts a descending sort order.
-
-@item
-@subcmd{INDEX} and @subcmd{NOINDEX} are currently ignored.
-
-@item
-@subcmd{BOX} and @subcmd{NOBOX} is currently ignored.
 @end itemize
 
 The @subcmd{CELLS} subcommand controls the contents of each cell in the displayed
@@ -862,6 +845,59 @@ Approximate T is not calculated for symmetric uncertainty coefficient.
 
 Fixes for any of these deficiencies would be welcomed.
 
+@subsection Crosstabs Example
+
+@cindex chi-square test of independence
+
+A researcher wishes to know if, in an industry, a person's sex is related to
+the person's occupation. To investigate this, she has determined that the
+@file{personnel.sav} is a representative, randomly selected sample of persons.
+The researcher's null hypothesis is that a person's sex has no relation to a
+person's occupation. She uses a chi-squared test of independence to investigate
+the hypothesis.
+
+@float Example, crosstabs:ex
+@psppsyntax {crosstabs.sps}
+@caption {Running crosstabs on the @exvar{sex} and @exvar{occupation} variables}
+@end float
+
+The syntax in @ref{crosstabs:ex} conducts a chi-squared test of independence.
+The line @code{/tables = occupation by sex} indicates that @exvar{occupation}
+and @exvar{sex} are the variables to be tabulated. To do this using the @gui{}
+you must place these variable names respectively in the @samp{Row} and
+@samp{Column} fields as shown in @ref{crosstabs:scr}.
+
+@float Screenshot, crosstabs:scr
+@psppimage {crosstabs}
+@caption {The Crosstabs dialog box with the @exvar{sex} and @exvar{occupation} variables selected}
+@end float
+
+Similarly, the @samp{Cells} button shows a dialog box to select the @code{count}
+and @code{expected} options. All other cell options can be deselected for this
+test.
+
+You would use the @samp{Format} and @samp{Statistics} buttons to select options
+for the @subcmd{FORMAT} and @subcmd{STATISTICS} subcommands. In this example,
+the @samp{Statistics} requires only the @samp{Chisq} option to be checked. All
+other options should be unchecked. No special settings are required from the
+@samp{Format} dialog.
+
+As shown in @ref{crosstabs:res} @cmd{CROSSTABS} generates a contingency table
+containing the observed count and the expected count of each sex and each
+occupation. The expected count is the count which would be observed if the
+null hypothesis were true.
+
+The significance of the Pearson Chi-Square value is very much larger than the
+normally accepted value of 0.05 and so one cannot reject the null hypothesis.
+Thus the researcher must conclude that a person's sex has no relation to the
+person's occupation.
+
+@float Results, crosstabs:res
+@psppoutput {crosstabs}
+@caption {The results of a test of independence between @exvar{sex} and @exvar{occupation}}
+@end float
+
+
 @node FACTOR
 @section FACTOR
 
@@ -1598,9 +1634,10 @@ arbitrary number of populations. It does not assume normality.
 The data to be compared are specified by @var{var_list}.
 The categorical variable determining the groups to which the
 data belongs is given by @var{var}. The limits @var{lower} and
-@var{upper} specify the valid range of @var{var}. Any cases for
-which @var{var} falls outside [@var{lower}, @var{upper}] are
-ignored.
+@var{upper} specify the valid range of @var{var}.
+If @var{upper} is smaller than @var{lower}, the PSPP will assume their values
+to be reversed. Any cases for which @var{var} falls outside
+[@var{lower}, @var{upper}] are ignored.
 
 The mean rank of each group as well as the chi-squared value and significance
 of the test are printed.
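
The last hunk documents how the limits of the grouping variable are handled: reversed limits are treated as swapped, and cases outside the closed range are ignored. The following Python sketch illustrates that rule only. It is not PSPP's implementation; the function names `normalize_range` and `filter_cases` and the sample data are hypothetical, chosen for illustration.

```python
def normalize_range(lower, upper):
    """Return (lo, hi) with lo <= hi, swapping the limits if given reversed."""
    # Assumption: mirrors the documented rule that reversed limits are swapped.
    return (upper, lower) if upper < lower else (lower, upper)


def filter_cases(cases, lower, upper):
    """Keep (group, value) pairs whose group lies in the closed range [lo, hi]."""
    lo, hi = normalize_range(lower, upper)
    return [(group, value) for group, value in cases if lo <= group <= hi]


# Hypothetical data: (grouping variable, measured value).
cases = [(1, 4.2), (2, 3.9), (3, 5.1), (4, 4.7)]

# Limits given in reverse order (3, 1) select the same cases as (1, 3).
kept = filter_cases(cases, 3, 1)
```

Under this sketch the case with group 4 is dropped regardless of the order in which the limits are written, which matches the behaviour the added text describes.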