@node REGRESSION, , ONEWAY, Statistics
The @cmd{REGRESSION} procedure fits linear models to data via least-squares
estimation. The procedure is appropriate for data that satisfy the
assumptions typical of linear regression:
@itemize @bullet
@item The data set contains @math{n} observations of a dependent variable, say
@math{Y_1,...,Y_n}, and @math{n} observations of one or more explanatory
variables. Let @math{X_{11}, X_{12}, ..., X_{1n}} denote the @math{n}
observations of the first explanatory variable;
@math{X_{21},...,X_{2n}} denote the @math{n} observations of the
second explanatory variable; @math{X_{k1},...,X_{kn}} denote the @math{n}
observations of the @math{k}th explanatory variable.

@item The dependent variable @math{Y} has the following relationship to the
explanatory variables:
@math{Y_i = b_0 + b_1 X_{1i} + ... + b_k X_{ki} + Z_i}
where @math{b_0, b_1, ..., b_k} are unknown
coefficients, and @math{Z_1,...,Z_n} are independent, normally
distributed ``noise'' terms with mean zero and common variance. The noise, or
``error'', terms are unobserved. This relationship is called the
linear model.
@end itemize
The @cmd{REGRESSION} procedure estimates the coefficients
@math{b_0,...,b_k} and produces output relevant to inferences for the
linear model.
@c If you add any new commands, then don't forget to remove the entry in
@c not-implemented.texi
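To make the least-squares idea concrete, the following is a minimal sketch for the special case of a single explanatory variable, where the estimates of @math{b_0} and @math{b_1} have a closed form. The function name @code{fit_simple_ols} and its interface are assumptions made for this illustration only; this is not the code that PSPP itself uses.

```c
#include <stddef.h>

/* Illustrative sketch only (hypothetical name, not part of PSPP).
   Fits y = b0 + b1 * x by ordinary least squares for ONE explanatory
   variable.  Returns 0 on success, or -1 if there are fewer than two
   observations or if x has no variation. */
int
fit_simple_ols (const double *x, const double *y, size_t n,
                double *b0, double *b1)
{
  double mean_x = 0.0, mean_y = 0.0;
  double sxx = 0.0, sxy = 0.0;
  size_t i;

  if (n < 2)
    return -1;

  /* Sample means of the explanatory and dependent variables. */
  for (i = 0; i < n; i++)
    {
      mean_x += x[i];
      mean_y += y[i];
    }
  mean_x /= (double) n;
  mean_y /= (double) n;

  /* Centered sum of squares of x and cross-products of x and y. */
  for (i = 0; i < n; i++)
    {
      sxx += (x[i] - mean_x) * (x[i] - mean_x);
      sxy += (x[i] - mean_x) * (y[i] - mean_y);
    }
  if (sxx == 0.0)
    return -1;

  /* Closed-form least-squares estimates. */
  *b1 = sxy / sxx;
  *b0 = mean_y - *b1 * mean_x;
  return 0;
}
```

For data lying exactly on a line, such as x = 1, 2, 3, 4 and y = 3, 5, 7, 9, this sketch recovers the intercept 1 and the slope 2. Models with several explanatory variables require solving a system of linear equations instead.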
@menu
* Syntax::                      Syntax definition.
* Examples::                    Using the REGRESSION procedure.
@end menu
@node Syntax, Examples, , REGRESSION
/STATISTICS=@{ALL, DEFAULTS, R, COEFF, ANOVA, BCOV@}
The @cmd{REGRESSION} procedure reads the active file and outputs
statistics relevant to the linear model specified by the user.
The VARIABLES subcommand, which is required, specifies the list of
variables to be analyzed. The keyword VARIABLES is required. The
DEPENDENT subcommand specifies the dependent variable of the linear
model. The DEPENDENT subcommand is required. All variables listed in
the VARIABLES subcommand, but not listed in the DEPENDENT subcommand,
are treated as explanatory variables in the linear model.
All other subcommands are optional:

The STATISTICS subcommand specifies the statistics to be displayed:
@table @code
@item ALL
All of the statistics below.
@item R
The ratio of the sum of squares due to the model to the total sum of
squares for the dependent variable.
@item COEFF
A table containing the estimated model coefficients and their standard errors.
@item ANOVA
Analysis of variance table for the model.
@item BCOV
The covariance matrix for the estimated model coefficients.
@end table
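The ratio of sums of squares described above can be sketched in a few lines of C. The function name @code{r_squared} and its arguments are hypothetical, chosen for this illustration; the sketch assumes the fitted values have already been computed from a least-squares fit and does not reproduce PSPP's own computation.

```c
#include <stddef.h>

/* Illustrative sketch only (hypothetical name, not part of PSPP).
   y[i] is the observed dependent variable and fitted[i] is the value
   the fitted model predicts for observation i.  Returns the ratio of
   the model sum of squares to the total sum of squares, or -1.0 if
   the ratio is undefined (no data, or y is constant). */
double
r_squared (const double *y, const double *fitted, size_t n)
{
  double mean_y = 0.0, ss_model = 0.0, ss_total = 0.0;
  size_t i;

  if (n == 0)
    return -1.0;

  for (i = 0; i < n; i++)
    mean_y += y[i];
  mean_y /= (double) n;

  for (i = 0; i < n; i++)
    {
      /* Variation explained by the model. */
      ss_model += (fitted[i] - mean_y) * (fitted[i] - mean_y);
      /* Total variation of the dependent variable. */
      ss_total += (y[i] - mean_y) * (y[i] - mean_y);
    }
  if (ss_total == 0.0)
    return -1.0;

  return ss_model / ss_total;
}
```

A value near 1 indicates that the model accounts for most of the variation in the dependent variable; a value near 0 indicates that it accounts for very little.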
The EXPORT subcommand causes PSPP to write a C program containing
functions related to the model. One such function accepts values of
explanatory variables as arguments, and returns an estimate of the
value of the dependent variable. The generated program also contains
functions that return prediction and confidence intervals for
those estimates. PSPP writes the program to the
@var{filename} given by the user, and writes declarations of the functions
to a file called @file{pspp_model_reg.h}. The user can then compile the C
program and use it as part of another program. This subcommand is a
@node Examples, , Syntax, REGRESSION
The following PSPP code will generate the default output and save the
linear model in a C program called @file{model.c}.
@example
title 'Demonstrate REGRESSION procedure'.
data list / v0 1-2 (A) v1 v2 3-22 (10).
regression /variables=v0 v1 v2 /statistics defaults /dependent=v2 /export (model.c) /method=enter.
@end example
The file @file{pspp_model_reg.h} contains these declarations:
@example
double pspp_reg_estimate (const double *, const char *[]);
double pspp_reg_variance (const double *var_vals, const char *[]);
double pspp_reg_confidence_interval_U (const double *var_vals, const char *var_names[], double p);
double pspp_reg_confidence_interval_L (const double *var_vals, const char *var_names[], double p);
double pspp_reg_prediction_interval_U (const double *var_vals, const char *var_names[], double p);
double pspp_reg_prediction_interval_L (const double *var_vals, const char *var_names[], double p);
@end example
The file @file{model.c} contains the definitions of the functions.
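As a rough illustration of how another program might call the exported estimate function, consider the sketch below. Only the declaration of @code{pspp_reg_estimate} comes from the header shown above. Everything else is an assumption made so that the sketch compiles on its own: the stand-in definition, its coefficients, the variable names @code{x1} and @code{x2}, the helper @code{estimate_new_observation}, and the convention that each entry of the name array labels the value at the same position.

```c
#include <stddef.h>
#include <string.h>

/* Declaration as it appears in pspp_model_reg.h. */
double pspp_reg_estimate (const double *, const char *[]);

/* HYPOTHETICAL stand-in for the definition that PSPP would generate
   in model.c.  The intercept, coefficients, and variable names are
   invented for this sketch.  A real build would delete this function
   and link against the generated model.c instead. */
double
pspp_reg_estimate (const double *var_vals, const char *var_names[])
{
  static const char *model_names[] = { "x1", "x2" };
  static const double coefs[] = { 0.5, 2.0 };
  double estimate = 1.0;        /* invented intercept */
  size_t i, j;

  /* ASSUMPTION: var_names[i] names the variable whose value is
     var_vals[i], and both arrays have one entry per explanatory
     variable in the model. */
  for (i = 0; i < 2; i++)
    for (j = 0; j < 2; j++)
      if (strcmp (var_names[i], model_names[j]) == 0)
        estimate += coefs[j] * var_vals[i];
  return estimate;
}

/* Hypothetical caller: estimate the dependent variable for one new
   observation of the two explanatory variables. */
double
estimate_new_observation (double x1, double x2)
{
  const double var_vals[] = { x1, x2 };
  const char *var_names[] = { "x1", "x2" };

  return pspp_reg_estimate (var_vals, var_names);
}
```

With the invented coefficients above, calling @code{estimate_new_observation (2.0, 3.0)} yields 8. The interval functions declared in the header take an additional argument @code{p} and would presumably be called in the same manner.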