X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Ftransformation.texi;h=27bbb2da8d039204be8c27f9b996f867b9ba6511;hb=9b94efd7513afdb12a6023024e00e50801532fee;hp=9225ca80009fb325b71206114aff0c12ea2b962c;hpb=6ccbd384363db2e304ffe8cc51fcd2eac0a5349a;p=pspp-builds.git

diff --git a/doc/transformation.texi b/doc/transformation.texi
index 9225ca80..27bbb2da 100644
--- a/doc/transformation.texi
+++ b/doc/transformation.texi
@@ -1,4 +1,4 @@
-@node Data Manipulation, Data Selection, Variable Attributes, Top
+@node Data Manipulation
 @chapter Data transformations
 @cindex transformations

@@ -17,13 +17,13 @@ as a rule.
 * SORT CASES:: Sort the active file.
 @end menu

-@node AGGREGATE, AUTORECODE, Data Manipulation, Data Manipulation
+@node AGGREGATE
 @section AGGREGATE
 @vindex AGGREGATE

 @display
 AGGREGATE
-        OUTFILE=@{*,'filename'@}
+        OUTFILE=@{*,'file-name',file_handle@}
         /PRESORTED
         /DOCUMENT
         /MISSING=COLUMNWISE
@@ -37,9 +37,11 @@ variables called @dfn{break variables}. Several functions are available
 for summarizing case contents.

 The OUTFILE subcommand is required and must appear first. Specify a
-system file by file name string or file handle (@pxref{FILE HANDLE}).
+system file, portable file, or scratch file by file name or file
+handle (@pxref{File Handles}).
 The aggregated cases are written to this file. If @samp{*} is
-specified, then the aggregated cases replace the active file.
+specified, then the aggregated cases replace the active file. Use of
+OUTFILE to write a portable file or scratch file is a PSPP extension.

 By default, the active file will be sorted based on the break variables
 before aggregation takes place. If the active file is already sorted
@@ -103,6 +105,9 @@ format is F5.3.
 @item FIRST(var_name)
 First non-missing value in break group. The aggregation variable
 receives the complete dictionary information from the source variable.
+The sort performed by AGGREGATE (and by SORT CASES) is stable, so that
+the first case with particular values for the break variables before
+sorting will also be the first case in that break group after sorting.

 @item FOUT(var_name, low, high)
 Fraction of values strictly outside the specified range of constants.
@@ -111,6 +116,9 @@ The default format is F5.3.
 @item LAST(var_name)
 Last non-missing value in break group. The aggregation variable
 receives the complete dictionary information from the source variable.
+The sort performed by AGGREGATE (and by SORT CASES) is stable, so that
+the last case with particular values for the break variables before
+sorting will also be the last case in that break group after sorting.

 @item MAX(var_name)
 Maximum value. The aggregation variable receives the complete
@@ -186,7 +194,7 @@ will cause the period to be interpreted as the end of the command.)
 @cmd{AGGREGATE} both ignores and cancels the current @cmd{SPLIT FILE}
 settings (@pxref{SPLIT FILE}).

-@node AUTORECODE, COMPUTE, AGGREGATE, Data Manipulation
+@node AUTORECODE
 @section AUTORECODE
 @vindex AUTORECODE

@@ -216,7 +224,7 @@ PRINT is currently ignored.

 @cmd{AUTORECODE} is a procedure. It causes the data to be read.

-@node COMPUTE, COUNT, AUTORECODE, Data Manipulation
+@node COMPUTE
 @section COMPUTE
 @vindex COMPUTE

@@ -256,7 +264,7 @@ When @cmd{COMPUTE} is specified following @cmd{TEMPORARY}
 (@pxref{TEMPORARY}), the @cmd{LAG} function may not be used
 (@pxref{LAG}).

-@node COUNT, FLIP, COMPUTE, Data Manipulation
+@node COUNT
 @section COUNT
 @vindex COUNT

@@ -279,7 +287,7 @@ one or more @dfn{test} variables for each case.

 The target variable values are always nonnegative integers. They are
 never missing. The target variable is assigned an F8.2 output format.
-@xref{Input/Output Formats}. Any variables, including long and short
+@xref{Input and Output Formats}. Any variables, including long and short
 string variables, may be test variables.

 User-missing values of test variables are treated just like any other
@@ -360,7 +368,7 @@ DESCRIPTIVES QVALID /STATISTICS=MEAN.
 @end example
 @end enumerate

-@node FLIP, IF, COUNT, Data Manipulation
+@node FLIP
 @section FLIP
 @vindex FLIP

@@ -380,7 +388,8 @@ specified are discarded. If the VARIABLES subcommand is omitted, all
 variables are selected for transposition.

 The variables specified by NEWNAMES, which must be a string variable, is
-used to give names to the variables created by @cmd{FLIP}. If
+used to give names to the variables created by @cmd{FLIP}. Only the
+first 8 characters of the variable are used. If
 NEWNAMES is not specified then the default is a variable named CASE_LBL,
 if it exists. If it does not then the variables created by FLIP are
 named VAR000
@@ -394,17 +403,18 @@ extensions are added, starting with 1, until a unique name is found or
 there are no remaining possibilities. If the latter occurs then the FLIP
 operation aborts.

-The resultant dictionary contains a CASE_LBL variable, which stores the
-names of the variables in the dictionary before the transposition. If
-the active file is subsequently transposed using @cmd{FLIP}, this
-variable can
-be used to recreate the original variable names.
+The resultant dictionary contains a CASE_LBL variable, a string
+variable of width 8, which stores the names of the variables in the
+dictionary before the transposition. Variables names longer than 8
+characters are truncated. If the active file is subsequently
+transposed using @cmd{FLIP}, this variable can be used to recreate the
+original variable names.

 FLIP honors @cmd{N OF CASES} (@pxref{N OF CASES}). It ignores
 @cmd{TEMPORARY} (@pxref{TEMPORARY}), so that ``temporary''
 transformations become permanent.

-@node IF, RECODE, FLIP, Data Manipulation
+@node IF
 @section IF
 @vindex IF

@@ -442,7 +452,7 @@ When @cmd{IF} is specified following @cmd{TEMPORARY}
 (@pxref{TEMPORARY}), the @cmd{LAG} function may not be used
 (@pxref{LAG}).

-@node RECODE, SORT CASES, IF, Data Manipulation
+@node RECODE
 @section RECODE
 @vindex RECODE

@@ -495,8 +505,8 @@ src_value matches any user- or system-missing value. SYSMIS matches the
 system missing value only. ELSE is a catch-all that matches anything. It
 should be the last src_value specified.

-Numeric and string dest_value's should also be self-explanatory. COPY
-causes the input values to be copied to the output. This is only value
+Numeric and string dest_value's should be self-explanatory. COPY
+causes the input values to be copied to the output. This is only valid
 if the source and target variables are of the same type. SYSMIS
 indicates the system-missing value.

@@ -510,12 +520,12 @@ Multiple recodings can be specified on a single @cmd{RECODE} invocation.
 Introduce additional recodings with a slash (@samp{/}) to
 separate them from the previous recodings.

-@node SORT CASES, , RECODE, Data Manipulation
+@node SORT CASES
 @section SORT CASES
 @vindex SORT CASES

 @display
-SORT CASES BY var_list.
+SORT CASES BY var_list[(@{D|A@}] [ var_list[(@{D|A@}] ] ...
 @end display

 @cmd{SORT CASES} sorts the active file by the values of one or more
@@ -524,8 +534,8 @@ variables.
 Specify BY and a list of variables to sort by. By default, variables are
 sorted in ascending order. To override sort order, specify (D) or (DOWN)
 after a list of variables to get descending order, or (A) or (UP)
-for ascending order. These apply to the entire list of variables
-preceding them.
+for ascending order. These apply to all the listed variables
+up until the preceding (A), (D), (UP) or (DOWN).

 The sort algorithms used by @cmd{SORT CASES} are stable. That is,
 records that have equal values of the sort variables will have the