X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Fstatistics.texi;h=a4f03ca6dd52fcce3a1b6b8babd747c263a4633b;hb=68d837b6f9edf4151a8df34f65123b10b35612ea;hp=4452c1a631766bb1fb4b8209605d943fc1242c3f;hpb=1c817faf0b4f8f7e53d032c805f775e239c6a9f2;p=pspp

diff --git a/doc/statistics.texi b/doc/statistics.texi
index 4452c1a631..a4f03ca6dd 100644
--- a/doc/statistics.texi
+++ b/doc/statistics.texi
@@ -10,7 +10,8 @@ far.
 * EXAMINE:: Testing data for normality.
 * CORRELATIONS:: Correlation tables.
 * CROSSTABS:: Crosstabulation tables.
-* FACTOR:: Factor analysis and Principal Components analysis
+* FACTOR:: Factor analysis and Principal Components analysis.
+* LOGISTIC REGRESSION:: Bivariate Logistic Regression.
 * MEANS:: Average values and other statistics.
 * NPAR TESTS:: Nonparametric tests.
 * T-TEST:: Test hypotheses about means.
@@ -76,21 +77,21 @@ list by enclosing them in parentheses after each variable.
 The @subcmd{STATISTICS} subcommand specifies the statistics to be displayed:
 
 @table @code
-@item ALL
+@item @subcmd{ALL}
 All of the statistics below.
-@item MEAN
+@item @subcmd{MEAN}
 Arithmetic mean.
-@item SEMEAN
+@item @subcmd{SEMEAN}
 Standard error of the mean.
-@item STDDEV
+@item @subcmd{STDDEV}
 Standard deviation.
-@item VARIANCE
+@item @subcmd{VARIANCE}
 Variance.
-@item KURTOSIS
+@item @subcmd{KURTOSIS}
 Kurtosis and standard error of the kurtosis.
-@item SKEWNESS
+@item @subcmd{SKEWNESS}
 Skewness and standard error of the skewness.
-@item RANGE
+@item @subcmd{RANGE}
 Range.
 @item MINIMUM
 Minimum value.
@@ -666,28 +667,28 @@ to be analysed.  By default, the correlation matrix is analysed.
 The @subcmd{/PRINT} subcommand may be used to select which features of the analysis are reported:
 
-@itemize @subcmd{}
-@item UNIVARIATE
+@itemize
+@item @subcmd{UNIVARIATE}
 A table of mean values, standard deviations and total weights are printed.
-@item INITIAL
+@item @subcmd{INITIAL}
 Initial communalities and eigenvalues are printed.
-@item EXTRACTION
+@item @subcmd{EXTRACTION}
 Extracted communalities and eigenvalues are printed.
-@item ROTATION
+@item @subcmd{ROTATION}
 Rotated communalities and eigenvalues are printed.
-@item CORRELATION
+@item @subcmd{CORRELATION}
 The correlation matrix is printed.
-@item COVARIANCE
+@item @subcmd{COVARIANCE}
 The covariance matrix is printed.
-@item DET
+@item @subcmd{DET}
 The determinant of the correlation or covariance matrix is printed.
-@item KMO
+@item @subcmd{KMO}
 The Kaiser-Meyer-Olkin measure of sampling adequacy and the Bartlett test of sphericity is printed.
-@item SIG
+@item @subcmd{SIG}
 The significance of the elements of correlation matrix is printed.
-@item ALL
+@item @subcmd{ALL}
 All of the above are printed.
-@item DEFAULT
+@item @subcmd{DEFAULT}
 Identical to @subcmd{INITIAL} and @subcmd{EXTRACTION}.
 @end itemize
 
@@ -723,6 +724,92 @@ If @subcmd{PAIRWISE} is set, then a case is considered missing only if either of
 values for the particular coefficient are missing.
 The default is @subcmd{LISTWISE}.
 
+@node LOGISTIC REGRESSION
+@section LOGISTIC REGRESSION
+
+@vindex LOGISTIC REGRESSION
+@cindex logistic regression
+@cindex bivariate logistic regression
+
+@display
+LOGISTIC REGRESSION [VARIABLES =] @var{dependent_var} WITH @var{predictors}
+
+     [/CATEGORICAL = @var{categorical_predictors}]
+
+     [@{/NOCONST | /ORIGIN | /NOORIGIN @}]
+
+     [/PRINT = [SUMMARY] [DEFAULT] [CI(@var{confidence})] [ALL]]
+
+     [/CRITERIA = [BCON(@var{min_delta})] [ITERATE(@var{max_iterations})]
+                  [LCON(@var{min_likelihood_delta})] [EPS(@var{min_epsilon})]
+                  [CUT(@var{cut_point})]]
+
+     [/MISSING = @{INCLUDE|EXCLUDE@}]
+@end display
+
+Bivariate Logistic Regression is used when you want to explain a dichotomous dependent
+variable in terms of one or more predictor variables.
+
+The minimum command is
+@example
+LOGISTIC REGRESSION @var{y} WITH @var{x1} @var{x2} @dots{} @var{xn}.
+@end example
+Here, @var{y} is the dependent variable, which must be dichotomous, and @var{x1} @dots{} @var{xn}
+are the predictor variables whose coefficients the procedure estimates.
+
+By default, a constant term is included in the model.
+Hence, the full model is
+@math{
+{\bf y}
+= b_0 + b_1 {\bf x_1}
++ b_2 {\bf x_2}
++ \dots
++ b_n {\bf x_n}
+}
+
+Predictor variables which are categorical in nature should be listed on the @subcmd{/CATEGORICAL} subcommand.
+Simple variables as well as interactions between variables may be listed here.
+
+If you want a model without the constant term @math{b_0}, use the keyword @subcmd{/ORIGIN}.
+@subcmd{/NOCONST} is a synonym for @subcmd{/ORIGIN}.
+
+An iterative Newton-Raphson procedure is used to fit the model.
+The @subcmd{/CRITERIA} subcommand is used to specify the stopping criteria of the procedure,
+and other parameters.
+The value of @var{cut_point} is used in the classification table.  It is the
+threshold above which predicted values are considered to be 1.  Values
+of @var{cut_point} must lie in the range [0,1].
+During iterations, if any one of the stopping criteria is satisfied, the procedure is
+considered complete.
+The stopping criteria are:
+@itemize
+@item The number of iterations exceeds @var{max_iterations}.
+      The default value of @var{max_iterations} is 20.
+@item The changes in all coefficient estimates are less than @var{min_delta}.
+The default value of @var{min_delta} is 0.001.
+@item The magnitude of change in the likelihood estimate is less than @var{min_likelihood_delta}.
+The default value of @var{min_likelihood_delta} is zero.
+This means that this criterion is disabled.
+@item The differential of the estimated probability for all cases is less than @var{min_epsilon}.
+In other words, the probabilities are close to zero or one.
+The default value of @var{min_epsilon} is 0.00000001.
+@end itemize
+
+
+The @subcmd{PRINT} subcommand controls the display of optional statistics.
+Currently there is one such option, @subcmd{CI}, which indicates that the
+confidence interval of the odds ratio should be displayed as well as its value.
+@subcmd{CI} should be followed by an integer in parentheses, to indicate the
+confidence level of the desired confidence interval.
+
+The @subcmd{MISSING} subcommand determines the handling of missing
+values.
+If @subcmd{INCLUDE} is set, then user-missing values are included in the
+calculations, but system-missing values are not.
+If @subcmd{EXCLUDE} is set, user-missing
+values are excluded as well as system-missing values.
+This is the default.
+
 @node MEANS
 @section MEANS
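
The stopping criteria documented in the LOGISTIC REGRESSION hunk can be illustrated with a small, self-contained sketch. This is not PSPP's implementation: the function names (`fit_logistic`, `classify`, `solve`, `sigmoid`, `log_likelihood`) are hypothetical, the keyword arguments merely mirror the documented names and defaults of ITERATE, BCON, LCON and EPS, and reading the EPS criterion as "p(1 - p) is below min_epsilon for every case" is an assumption. The sketch also caps the loop at max_iterations rather than testing whether the count has been exceeded.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def log_likelihood(rows, ys, b):
    """Sum of y*z - log(1 + exp(z)), written to avoid overflow."""
    total = 0.0
    for r, y in zip(rows, ys):
        z = sum(bj * xj for bj, xj in zip(b, r))
        total += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total

def fit_logistic(xs, ys, max_iterations=20, min_delta=0.001,
                 min_likelihood_delta=0.0, min_epsilon=1e-8):
    """Newton-Raphson fit; returns (coefficients, stop reason, iterations)."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1.0 is the constant term b_0
    k = len(rows[0])
    b = [0.0] * k
    prev_ll = log_likelihood(rows, ys, b)
    for iteration in range(1, max_iterations + 1):
        p = [sigmoid(sum(bj * xj for bj, xj in zip(b, r))) for r in rows]
        w = [pi * (1.0 - pi) for pi in p]
        # EPS (assumed reading): every probability is close to zero or one.
        if all(wi < min_epsilon for wi in w):
            return b, "EPS", iteration
        grad = [sum((y - pi) * r[j] for y, pi, r in zip(ys, p, rows))
                for j in range(k)]
        hess = [[sum(wi * r[i] * r[j] for wi, r in zip(w, rows))
                 for j in range(k)] for i in range(k)]
        step = solve(hess, grad)
        b = [bj + sj for bj, sj in zip(b, step)]
        ll = log_likelihood(rows, ys, b)
        # BCON: all coefficient changes are below min_delta.
        if all(abs(sj) < min_delta for sj in step):
            return b, "BCON", iteration
        # LCON: disabled when min_likelihood_delta is zero (the default).
        if min_likelihood_delta > 0 and abs(ll - prev_ll) < min_likelihood_delta:
            return b, "LCON", iteration
        prev_ll = ll
    # ITERATE: the iteration budget is used up.
    return b, "ITERATE", max_iterations

def classify(b, x, cut_point):
    """Predict 1 when the estimated probability is above cut_point."""
    z = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
    return 1 if sigmoid(z) > cut_point else 0

# Made-up illustrative data: one predictor, dichotomous outcome.
xs = [(0.5,), (1.0,), (1.5,), (2.0,), (2.5,), (3.0,), (3.5,), (4.0,)]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
coeffs, reason, iterations = fit_logistic(xs, ys)
```

The returned `reason` names which criterion ended the fit, which makes it easy to see how tightening `min_delta` or enabling `min_likelihood_delta` changes where the iteration stops. `classify` takes `cut_point` as a required argument because the documentation above does not state a default for it.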