work on PRINT encoding

diff --git a/doc/regression.texi b/doc/regression.texi

index 22b9f58ff664c196a574565aeac548983f452f1f..127ec7164567bce7b2f2d2b6233d5e97ca526c03 100644
--- a/doc/regression.texi
+++ b/doc/regression.texi
@@ -1,68 +1,69 @@
-@node REGRESSION, , ONEWAY, Statistics
+@node REGRESSION
  @section REGRESSION
  
-The REGRESSION procedure fits linear models to data via least-squares
+@cindex regression
+@cindex linear regression
+The @cmd{REGRESSION} procedure fits linear models to data via least-squares
  estimation. The procedure is appropriate for data which satisfy those
  assumptions typical in linear regression:
  
  @itemize @bullet
-@item The data set contains n observations of a dependent variable, say
-Y_1,...,Y_n, and n observations of one or more explanatory
-variables. Let X_11, X_12, ..., X_1n denote the n observations of the
-first explanatory variable; X_21,...,X_2n denote the n observations of the
-second explanatory variable; X_k1,...,X_kn denote the n observations of the kth
-explanatory variable.
-
-@item The dependent variable Y has the following relationship to the 
+@item The data set contains @math{n} observations of a dependent variable, say
+@math{Y_1,@dots{},Y_n}, and @math{n} observations of one or more explanatory
+variables.
+Let @math{X_{11}, X_{12}}, @dots{}, @math{X_{1n}} denote the @math{n} observations
+of the first explanatory variable;
+@math{X_{21}},@dots{},@math{X_{2n}} denote the @math{n} observations of the second
+explanatory variable;
+@math{X_{k1}},@dots{},@math{X_{kn}} denote the @math{n} observations of 
+the @math{k}th explanatory variable.
+
+@item The dependent variable @math{Y} has the following relationship to the 
  explanatory variables:
-@math{Y_i = b_0 + b_1 X_1i + ... + b_k X_ki + Z_i} 
-where @math{b_0, b_1, ..., b_k} are unknown
-coefficients, and @math{Z_1,...,Z_n} are independent, normally
-distributed ``noise'' terms with common variance. The noise, or
-``error'' terms are unobserved. This relationship is called the
-``linear model.''
+@math{Y_i = b_0 + b_1 X_{1i} + @dots{} + b_k X_{ki} + Z_i} 
+where @math{b_0, b_1, @dots{}, b_k} are unknown
+coefficients, and @math{Z_1,@dots{},Z_n} are independent, normally
+distributed @dfn{noise} terms with mean zero and common variance.
+The noise, or @dfn{error}, terms are unobserved.
+This relationship is called the @dfn{linear model}.
  @end itemize
  
-The REGRESSION procedure estimates the coefficients
-@math{b_0,...,b_k} and produces output relevant to inferences for the
+The @cmd{REGRESSION} procedure estimates the coefficients
+@math{b_0,@dots{},b_k} and produces output relevant to inferences for the
  linear model. 
  
-@c If you add any new commands, then don't forget to remove the entry in 
-@c not-implemented.texi
-
  @menu
  * Syntax::                      Syntax definition.
  * Examples::                    Using the REGRESSION procedure.
  @end menu
  
-@node Syntax, Examples, , REGRESSION
+@node Syntax
  @subsection Syntax
  
  @vindex REGRESSION
  @display
  REGRESSION
-        /VARIABLES=var_list
-        /DEPENDENT=var_list
+        /VARIABLES=@var{var_list}
+        /DEPENDENT=@var{var_list}
          /STATISTICS=@{ALL, DEFAULTS, R, COEFF, ANOVA, BCOV@}
-        /EXPORT ('file-name')
-        /SAVE
+        /SAVE=@{PRED, RESID@}
  @end display
  
-The @cmd{REGRESSION} procedure reads the active file and outputs
+The @cmd{REGRESSION} procedure reads the active dataset and outputs
  statistics relevant to the linear model specified by the user.
  
-The VARIABLES subcommand, which is required, specifies the list of
-variables to be analyzed.  Keyword VARIABLES is required. The
-DEPENDENT subcommand specifies the dependent variable of the linear
-model. The DEPENDENT subcommond is required. All variables listed in
-the VARIABLES subcommand, but not listed in the DEPENDENT subcommand,
+The @subcmd{VARIABLES} subcommand, which is required, specifies the list of
+variables to be analyzed.  Keyword @subcmd{VARIABLES} is required. The
+@subcmd{DEPENDENT} subcommand specifies the dependent variable of the linear
+model. The @subcmd{DEPENDENT} subcommand is required. All variables listed in
+the @subcmd{VARIABLES} subcommand, but not listed in the @subcmd{DEPENDENT} subcommand,
  are treated as explanatory variables in the linear model.
  
  All other subcommands are optional:
  
-The STATISTICS subcommand specifies the statistics to be displayed:
+The @subcmd{STATISTICS} subcommand specifies the statistics to be displayed:
  
-@table @code
+@table @subcmd
  @item ALL
  All of the statistics below.
  @item R
@@ -76,27 +77,20 @@ Analysis of variance table for the model.
  The covariance matrix for the estimated model coefficients.
  @end table
  
-The SAVE subcommand causes PSPP to save the residuals from the fitted
-model to the active file. PSPP will store the residuals in a variable
-called RES1 if no such variable exists, RES2 if RES1 already exists,
-RES3 if RES1 and RES2 already exist, etc.
-
-The EXPORT subcommand causes PSPP to write a C program containing
-functions related to the model. One such function accepts values of
-explanatory variables as arguments, and returns an estimate of the
-corresponding new
-value of the dependent variable. The generated program will also contain
-functions that return prediction and confidence intervals related to
-those new estimates. PSPP will write the program to the
-'file-name' given by the user, and write declarations of functions
-to a file called pspp_model_reg.h. The user can then compile the C
-program and use it as part of another program. This subcommand is a
-PSPP extension.
-
-@node Examples, , Syntax, REGRESSION
+The @subcmd{SAVE} subcommand causes @pspp{} to save the residuals, the
+predicted values, or both, from the fitted
+model to the active dataset. @pspp{} stores the residuals in a variable
+called @samp{RES1} if no such variable exists, @samp{RES2} if @samp{RES1}
+already exists,
+@samp{RES3} if @samp{RES1} and @samp{RES2} already exist, and so on.
+It chooses the name of
+the variable for the predicted values similarly, but with @samp{PRED} as
+the prefix.
+
+@node Examples
  @subsection Examples
-The following PSPP code will generate the default output, and save the
-linear model in a program called ``model.c.''
+The following @pspp{} syntax will generate the default output and save the
+predicted values and residuals to the active dataset.
  
  @example
  title 'Demonstrate REGRESSION procedure'.
@@ -114,19 +108,6 @@ a  8.838262 -29.25689
  b  6.200189 -18.58219
  end data.
  list.
-regression /variables=v0 v1 v2 /statistics defaults /dependent=v2 /export (model.c) /method=enter.
+regression /variables=v0 v1 v2 /statistics defaults /dependent=v2 
+           /save pred resid /method=enter.
  @end example
-
-The file pspp_model_reg.h contains these declarations:
-
-@example
-double pspp_reg_estimate (const double *, const char *[]);
-double pspp_reg_variance (const double *var_vals, const char *[]);
-double pspp_reg_confidence_interval_U (const double *var_vals, const char *var_names[], double p);
-double pspp_reg_confidence_interval_L (const double *var_vals, const char *var_names[], double p);
-double pspp_reg_prediction_interval_U (const double *var_vals, const char *var_names[], double p);
-double pspp_reg_prediction_interval_L (const double *var_vals, const char *var_names[], double p);
-@end example
-
-The file model.c contains the definitions of the functions. 
-@setfilename ignored
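
Note for readers of this patch: the linear model described in the documentation, and the two quantities that `/SAVE=PRED RESID` refers to, can be illustrated outside PSPP. The Python sketch below is not PSPP code and makes no claim about how PSPP computes its estimates. It solves the least-squares normal equations directly for a small, well-conditioned example. The function name `fit_linear_model` and the sample data are hypothetical, chosen only for illustration.

```python
def fit_linear_model(xs, ys):
    """Least-squares fit of Y_i = b_0 + b_1 X_1i + ... + b_k X_ki + Z_i.

    xs: list of k explanatory variables, each a list of n observations.
    ys: list of n observations of the dependent variable.
    Returns (coefficients, predicted values, residuals).
    Illustrative only: assumes the normal equations are non-singular.
    """
    n = len(ys)
    # Design matrix rows: [1, X_1i, ..., X_ki]; the leading 1 carries b_0.
    rows = [[1.0] + [x[i] for x in xs] for i in range(n)]
    p = len(rows[0])

    # Augmented normal equations [A^T A | A^T y].
    aug = [
        [sum(r[a] * r[c] for r in rows) for c in range(p)]
        + [sum(r[a] * y for r, y in zip(rows, ys))]
        for a in range(p)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]

    coeffs = [aug[r][p] / aug[r][r] for r in range(p)]
    # Predicted values and residuals: the quantities PRED and RESID name.
    pred = [sum(b * v for b, v in zip(coeffs, row)) for row in rows]
    resid = [y - y_hat for y, y_hat in zip(ys, pred)]
    return coeffs, pred, resid


# Hypothetical data lying exactly on Y = 1 + 2 X, so the residuals vanish.
coeffs, pred, resid = fit_linear_model([[0.0, 1.0, 2.0, 3.0]], [1.0, 3.0, 5.0, 7.0])
```

With noisy data the residuals would be nonzero, and they are the values that the documentation says are stored in `RES1`, `RES2`, and so on.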