From: Ben Pfaff Date: Wed, 7 May 2025 00:22:33 +0000 (-0700) Subject: work on manual X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=e9a50fd6b52ec53d976530b390cb7b0b34bbbf3c;p=pspp work on manual --- diff --git a/rust/doc/src/SUMMARY.md b/rust/doc/src/SUMMARY.md index c92ef172c2..3c2020fd9a 100644 --- a/rust/doc/src/SUMMARY.md +++ b/rust/doc/src/SUMMARY.md @@ -40,7 +40,7 @@ # Command Syntax -- [Data Input and Output](commands/data-io/index.md) +- [Working with Text Files](commands/data-io/index.md) - [BEGIN DATA](commands/data-io/begin-data.md) - [CLOSE FILE HANDLE](commands/data-io/close-file-handle.md) - [DATAFILE ATTRIBUTE](commands/data-io/datafile-attribute.md) @@ -58,6 +58,18 @@ - [REREAD](commands/data-io/reread.md) - [REPEATING DATA](commands/data-io/repeating-data.md) - [WRITE](commands/data-io/write.md) +- [Working with Data Files](commands/spss-io/index.md) + - [APPLY DICTIONARY](commands/spss-io/apply-dictionary.md) + - [EXPORT](commands/spss-io/export.md) + - [GET](commands/spss-io/get.md) + - [GET DATA](commands/spss-io/get-data.md) + - [IMPORT](commands/spss-io/import.md) + - [SAVE](commands/spss-io/save.md) + - [SAVE DATA COLLECTION](commands/spss-io/save-data-collection.md) + - [SAVE TRANSLATE](commands/spss-io/save-translate.md) + - [SYSFILE INFO](commands/spss-io/sysfile-info.md) + - [XEXPORT](commands/spss-io/xexport.md) + - [XSAVE](commands/spss-io/xsave.md) # Developer Documentation diff --git a/rust/doc/src/commands/spss-io/apply-dictionary.md b/rust/doc/src/commands/spss-io/apply-dictionary.md new file mode 100644 index 0000000000..3e4e8df6de --- /dev/null +++ b/rust/doc/src/commands/spss-io/apply-dictionary.md @@ -0,0 +1,57 @@ +# APPLY DICTIONARY + +``` +APPLY DICTIONARY FROM={'FILE_NAME',FILE_HANDLE}. +``` + +`APPLY DICTIONARY` applies the variable labels, value labels, and +missing values taken from a file to corresponding variables in the +active dataset. In some cases it also updates the weighting variable. 
+ +The `FROM` clause is mandatory. Use it to specify a system file or +portable file's name in single quotes, or a [file handle +name](../../language/files/file-handles.md). The dictionary in the +file is read, but it does not replace the active dataset's dictionary. +The file's data is not read. + +Only variables with names that exist in both the active dataset and +the system file are considered. Variables with the same name but +different types (numeric, string) cause an error message. Otherwise, +the system file variables' attributes replace those in their matching +active dataset variables: + +- If a system file variable has a variable label, then it replaces + the variable label of the active dataset variable. If the system + file variable does not have a variable label, then the active + dataset variable's variable label, if any, is retained. + +- If the system file variable has custom attributes (*note VARIABLE + ATTRIBUTE::), then those attributes replace the active dataset + variable's custom attributes. If the system file variable does not + have custom attributes, then the active dataset variable's custom + attributes, if any, are retained. + +- If the active dataset variable is numeric or short string, then + value labels and missing values, if any, are copied to the active + dataset variable. If the system file variable does not have value + labels or missing values, then those in the active dataset + variable, if any, are not disturbed. + +In addition to properties of variables, some properties of the active +dataset's dictionary as a whole are updated: + +- If the system file has custom attributes (see [DATAFILE + ATTRIBUTE](../../commands/data-io/datafile-attribute.html)), then + those attributes replace the active dataset's custom + attributes. 
+ +- If the active dataset has a weighting variable (*note WEIGHT::), + and the system file does not, or if the weighting variable in the + system file does not exist in the active dataset, then the active + dataset weighting variable, if any, is retained. Otherwise, the + weighting variable in the system file becomes the active dataset + weighting variable. + +`APPLY DICTIONARY` takes effect immediately. It does not read the +active dataset. The system file is not modified. + diff --git a/rust/doc/src/commands/spss-io/export.md b/rust/doc/src/commands/spss-io/export.md new file mode 100644 index 0000000000..6c29a27213 --- /dev/null +++ b/rust/doc/src/commands/spss-io/export.md @@ -0,0 +1,45 @@ +# EXPORT + +``` +EXPORT + /OUTFILE='FILE_NAME' + /UNSELECTED={RETAIN,DELETE} + /DIGITS=N + /DROP=VAR_LIST + /KEEP=VAR_LIST + /RENAME=(SRC_NAMES=TARGET_NAMES)... + /TYPE={COMM,TAPE} + /MAP +``` + + The `EXPORT` procedure writes the active dataset's dictionary and +data to a specified portable file. + + `UNSELECTED` controls whether cases excluded with `FILTER` (*note +FILTER::) are written to the file. These can be excluded by +specifying `DELETE` on the `UNSELECTED` subcommand. The default is +`RETAIN`. + + Portable files express real numbers in base 30. Integers are +always expressed to the maximum precision needed to make them exact. +Non-integers are, by default, expressed to the machine's maximum +natural precision (approximately 15 decimal digits on many machines). +If many numbers require this many digits, the portable file may +significantly increase in size. As an alternative, the `DIGITS` +subcommand may be used to specify the number of decimal digits of +precision to write. `DIGITS` applies only to non-integers. + + The `OUTFILE` subcommand, which is the only required subcommand, +specifies the portable file to be written as a file name string or a +[file handle](../../language/files/file-handles.md). 
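As a minimal sketch of the `EXPORT` subcommands described above, the following writes a portable file, leaves out cases excluded by `FILTER`, and limits non-integers to six decimal digits of precision. The file name `survey.por` and the choice of six digits are illustrative assumptions, not values taken from the manual.

```
* Hypothetical file name; DIGITS=6 is an arbitrary illustrative choice.
EXPORT
  /OUTFILE='survey.por'
  /UNSELECTED=DELETE
  /DIGITS=6.
```

Because `DIGITS` applies only to non-integers, integer values are still written exactly.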
+ +`DROP`, `KEEP`, and `RENAME` have the same syntax and meaning as for +the [`SAVE`](save.md) command. + + The `TYPE` subcommand specifies the character set for use in the +portable file. Its value is currently not used. + + The `MAP` subcommand is currently ignored. + + `EXPORT` is a procedure. It causes the active dataset to be read. + diff --git a/rust/doc/src/commands/spss-io/get-data.md b/rust/doc/src/commands/spss-io/get-data.md new file mode 100644 index 0000000000..b6601e17d1 --- /dev/null +++ b/rust/doc/src/commands/spss-io/get-data.md @@ -0,0 +1,386 @@ +# GET DATA + +``` +GET DATA + /TYPE={GNM,ODS,PSQL,TXT} + ...additional subcommands depending on TYPE... +``` + + The `GET DATA` command is used to read files and other data sources +created by other applications. When this command is executed, the +current dictionary and active dataset are replaced with variables and +data read from the specified source. + + The `TYPE` subcommand is mandatory and must be the first subcommand +specified. It determines the type of the file or source to read. +PSPP currently supports the following `TYPE`s: + +* `GNM` + Spreadsheet files created by Gnumeric (). + +* `ODS` + Spreadsheet files in OpenDocument format + (). + +* `PSQL` + Relations from PostgreSQL databases (). + +* `TXT` + Textual data files in columnar and delimited formats. + +Each supported file type has additional subcommands, explained in +separate sections below. + +## Spreadsheet Files + +``` +GET DATA /TYPE={GNM, ODS} + /FILE={'FILE_NAME'} + /SHEET={NAME 'SHEET_NAME', INDEX N} + /CELLRANGE={RANGE 'RANGE', FULL} + /READNAMES={ON, OFF} + /ASSUMEDSTRWIDTH=N. +``` + +`GET DATA` can read Gnumeric spreadsheets (), and +spreadsheets in OpenDocument format +(). Use the +`TYPE` subcommand to indicate the file's format. `/TYPE=GNM` +indicates Gnumeric files, `/TYPE=ODS` indicates OpenDocument. The +`FILE` subcommand is mandatory. Use it to specify the name of the file to be +read. All other subcommands are optional. 
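For instance, assuming a hypothetical Gnumeric workbook named `inventory.gnumeric` and a hypothetical OpenDocument workbook named `inventory.ods`, the shortest valid forms of the spreadsheet syntax give only the two mandatory subcommands:

```
* Hypothetical file names.
GET DATA /TYPE=GNM /FILE='inventory.gnumeric'.
GET DATA /TYPE=ODS /FILE='inventory.ods'.
```

Each command replaces the active dataset, so in practice only one of the two would be used at a time.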
+ + The format of each variable is determined by the format of the +spreadsheet cell containing the first datum for the variable. If this +cell is of string (text) format, then the width of the variable is +determined from the length of the string it contains, unless the +`ASSUMEDSTRWIDTH` subcommand is given. + + The `SHEET` subcommand specifies the sheet within the spreadsheet +file to read. There are two forms of the `SHEET` subcommand. In the +first form, `/SHEET=name SHEET_NAME`, the string SHEET_NAME is the name +of the sheet to read. In the second form, `/SHEET=index IDX`, IDX is an +integer which is the index of the sheet to read. The first sheet has +the index 1. If the `SHEET` subcommand is omitted, then the command +reads the first sheet in the file. + + The `CELLRANGE` subcommand specifies the range of cells within the +sheet to read. If the subcommand is given as `/CELLRANGE=FULL`, then +the entire sheet is read. To read only part of a sheet, use the form +`/CELLRANGE=range 'TOP_LEFT_CELL:BOTTOM_RIGHT_CELL'`. For example, +the subcommand `/CELLRANGE=range 'C3:P19'` reads columns C-P and rows +3-19, inclusive. Without the `CELLRANGE` subcommand, the entire sheet +is read. + + If `/READNAMES=ON` is specified, then the contents of cells of the +first row are used as the names of the variables in which to store the +data from subsequent rows. This is the default. If `/READNAMES=OFF` is +used, then the variables receive automatically assigned names. + + The `ASSUMEDSTRWIDTH` subcommand specifies the maximum width of +string variables read from the file. If omitted, the default value is +determined from the length of the string in the first spreadsheet cell +for each variable. + +## Postgres Database Queries + +``` +GET DATA /TYPE=PSQL + /CONNECT={CONNECTION INFO} + /SQL={QUERY} + [/ASSUMEDSTRWIDTH=W] + [/UNENCRYPTED] + [/BSIZE=N]. +``` + + `GET DATA /TYPE=PSQL` imports data from a local or remote Postgres +database server. 
It automatically creates variables based on the table +column names or the names specified in the SQL query. PSPP cannot +support the full precision of some Postgres data types, so data of those +types will lose some precision when PSPP imports them. PSPP does not +support all Postgres data types. If PSPP cannot support a datum, `GET +DATA` issues a warning and substitutes the system-missing value. + + The `CONNECT` subcommand must be a string for the parameters of the +database server from which the data should be fetched. The format of +the string is given [in the Postgres +manual](http://www.postgresql.org/docs/8.0/static/libpq.html#LIBPQ-CONNECT). + + The `SQL` subcommand must be a valid SQL statement to retrieve data +from the database. + + The `ASSUMEDSTRWIDTH` subcommand specifies the maximum width of +string variables read from the database. If omitted, the default value +is determined from the length of the string in the first value read for +each variable. + + The `UNENCRYPTED` subcommand allows data to be retrieved over an +insecure connection. If the connection is not encrypted, and the +`UNENCRYPTED` subcommand is not given, then an error occurs. Whether or +not the connection is encrypted depends upon the underlying psql library +and the capabilities of the database server. + + The `BSIZE` subcommand serves only to optimise the speed of data +transfer. It specifies an upper limit on the number of cases to fetch from +the database at once. The default value is 4096. If your SQL statement +fetches a large number of cases but only a small number of variables, +then the data transfer may be faster if you increase this value. +Conversely, if the number of variables is large, or if the machine on +which PSPP is running has only a small amount of memory, then a smaller +value is probably better. + +### Example + +``` +GET DATA /TYPE=PSQL + /CONNECT='host=example.com port=5432 dbname=product user=fred password=xxxx' + /SQL='select * from manufacturer'. 
+``` + +## Textual Data Files + +``` +GET DATA /TYPE=TXT + /FILE={'FILE_NAME',FILE_HANDLE} + [/ENCODING='ENCODING'] + [/ARRANGEMENT={DELIMITED,FIXED}] + [/FIRSTCASE={FIRST_CASE}] + [/IMPORTCASES=...] + ...additional subcommands depending on ARRANGEMENT... +``` + + When `TYPE=TXT` is specified, `GET DATA` reads data in a delimited +or fixed columnar format, much like [`DATA +LIST`](../../commands/data-io/data-list.md). + + The `FILE` subcommand must specify the file to be read as a string +file name or (for textual data only) a [file +handle](../../language/files/file-handles.md). + + The `ENCODING` subcommand specifies the character encoding of the +file to be read. *Note INSERT::, for information on supported +encodings. + + The `ARRANGEMENT` subcommand determines the file's basic format. +`DELIMITED`, the default setting, specifies that fields in the input data +are separated by spaces, tabs, or other user-specified delimiters. +`FIXED` specifies that fields in the input data appear at particular fixed +column positions within records of a case. + + By default, cases are read from the input file starting from the +first line. To skip lines at the beginning of an input file, set +`FIRSTCASE` to the number of the first line to read: 2 to skip the +first line, 3 to skip the first two lines, and so on. + + `IMPORTCASES` is ignored, for compatibility. Use `N OF CASES` to +limit the number of cases read from a file (*note N OF CASES::), or +`SAMPLE` to obtain a random sample of cases (*note SAMPLE::). + + The remaining subcommands apply only to one of the two file +arrangements, described below. + +### Delimited Data + +``` +GET DATA /TYPE=TXT + /FILE={'FILE_NAME',FILE_HANDLE} + [/ARRANGEMENT={DELIMITED,FIXED}] + [/FIRSTCASE={FIRST_CASE}] + [/IMPORTCASE={ALL,FIRST MAX_CASES,PERCENT PERCENT}] + + /DELIMITERS="DELIMITERS" + [/QUALIFIER="QUOTES"] + [/DELCASE={LINE,VARIABLES N_VARIABLES}] + /VARIABLES=DEL_VAR1 [DEL_VAR2]... 
+where each DEL_VAR takes the form: + variable format +``` + + The `GET DATA` command with `TYPE=TXT` and `ARRANGEMENT=DELIMITED` +reads input data from text files in delimited format, where fields are +separated by a set of user-specified delimiters. Its capabilities are +similar to those of [`DATA LIST +FREE`](../../commands/data-io/data-list.md#data-list-free), with a few +enhancements. + + The required `FILE` subcommand and optional `FIRSTCASE` and +`IMPORTCASE` subcommands are described above (*note GET DATA +/TYPE=TXT::). + + `DELIMITERS`, which is required, specifies the set of characters that +may separate fields. Each character in the string specified on +`DELIMITERS` separates one field from the next. The end of a line also +separates fields, regardless of `DELIMITERS`. Two consecutive +delimiters in the input yield an empty field, as does a delimiter at the +end of a line. A space character as a delimiter is an exception: +consecutive spaces do not yield an empty field and neither does any +number of spaces at the end of a line. + + To use a tab as a delimiter, specify `\t` at the beginning of the +`DELIMITERS` string. To use a backslash as a delimiter, specify `\\` as +the first delimiter or, if a tab should also be a delimiter, immediately +following `\t`. To read a data file in which each field appears on a +separate line, specify the empty string for `DELIMITERS`. + + The optional `QUALIFIER` subcommand names one or more characters that +can be used to quote values within fields in the input. A field that +begins with one of the specified quote characters ends at the next +matching quote. Intervening delimiters become part of the field, +instead of terminating it. The ability to specify more than one quote +character is a PSPP extension. + + The character specified on `QUALIFIER` can be embedded within a field +that it quotes by doubling the qualifier. 
For example, if `'` is +specified on `QUALIFIER`, then `'a''b'` specifies a field that contains +`a'b`. + + The `DELCASE` subcommand controls how data may be broken across +lines in the data file. With `LINE`, the default setting, each line +must contain all the data for exactly one case. For additional +flexibility, to allow a single case to be split among lines or +multiple cases to be contained on a single line, specify `VARIABLES +n_variables`, where `n_variables` is the number of variables per case. + + The `VARIABLES` subcommand is required and must be the last +subcommand. Specify the name of each variable and its [input +format](../../language/datasets/formats/index.md), in the order they +should be read from the input file. + +#### Example 1 + +On a Unix-like system, the `/etc/passwd` file has a format similar to +this: + +``` +root:$1$nyeSP5gD$pDq/:0:0:,,,:/root:/bin/bash +blp:$1$BrP/pFg4$g7OG:1000:1000:Ben Pfaff,,,:/home/blp:/bin/bash +john:$1$JBuq/Fioq$g4A:1001:1001:John Darrington,,,:/home/john:/bin/bash +jhs:$1$D3li4hPL$88X1:1002:1002:Jason Stover,,,:/home/jhs:/bin/csh +``` + +The following syntax reads a file in the format used by `/etc/passwd`: + +``` +GET DATA /TYPE=TXT /FILE='/etc/passwd' /DELIMITERS=':' + /VARIABLES=username A20 + password A40 + uid F10 + gid F10 + gecos A40 + home A40 + shell A40. +``` + +#### Example 2 + +Consider the following data on used cars: + +``` +model year mileage price type age +Civic 2002 29883 15900 Si 2 +Civic 2003 13415 15900 EX 1 +Civic 1992 107000 3800 n/a 12 +Accord 2002 26613 17900 EX 1 +``` + +The following syntax can be used to read the used car data: + +``` +GET DATA /TYPE=TXT /FILE='cars.data' /DELIMITERS=' ' /FIRSTCASE=2 + /VARIABLES=model A8 + year F4 + mileage F6 + price F5 + type A4 + age F2. 
+``` + +#### Example 3 + +Consider the following information on animals in a pet store: + +``` +'Pet''s Name', "Age", "Color", "Date Received", "Price", "Height", "Type" +, (Years), , , (Dollars), , +"Rover", 4.5, Brown, "12 Feb 2004", 80, '1''4"', "Dog" +"Charlie", , Gold, "5 Apr 2007", 12.3, "3""", "Fish" +"Molly", 2, Black, "12 Dec 2006", 25, '5"', "Cat" +"Gilly", , White, "10 Apr 2007", 10, "3""", "Guinea Pig" +``` + +The following syntax can be used to read the pet store data: + +``` +GET DATA /TYPE=TXT /FILE='pets.data' /DELIMITERS=', ' /QUALIFIER='''"' /ESCAPE + /FIRSTCASE=3 + /VARIABLES=name A10 + age F3.1 + color A5 + received EDATE10 + price F5.2 + height a5 + type a10. +``` + +### Fixed Columnar Data + +``` +GET DATA /TYPE=TXT + /FILE={'file_name',FILE_HANDLE} + [/ARRANGEMENT={DELIMITED,FIXED}] + [/FIRSTCASE={FIRST_CASE}] + [/IMPORTCASE={ALL,FIRST MAX_CASES,PERCENT PERCENT}] + + [/FIXCASE=N] + /VARIABLES FIXED_VAR [FIXED_VAR]... + [/rec# FIXED_VAR [FIXED_VAR]...]... +where each FIXED_VAR takes the form: + VARIABLE START-END FORMAT +``` + + The `GET DATA` command with `TYPE=TXT` and `ARRANGEMENT=FIXED` +reads input data from text files in fixed format, where each field is +located in particular fixed column positions within records of a case. +Its capabilities are similar to those of [`DATA LIST +FIXED`](../../commands/data-io/data-list.md#data-list-fixed), with a +few enhancements. + + The required `FILE` subcommand and optional `FIRSTCASE` and +`IMPORTCASE` subcommands are described [above](#textual-data-files). + + The optional `FIXCASE` subcommand may be used to specify the positive +integer number of input lines that make up each case. The default value +is 1. + + The `VARIABLES` subcommand, which is required, specifies the +positions at which each variable can be found. For each variable, +specify its name, followed by its start and end column separated by `-` +(e.g. `0-9`), followed by an input format type (e.g. 
`F`) or a full +format specification (e.g. `DOLLAR12.2`). For this command, columns are +numbered starting from 0 at the left column. Introduce the variables in +the second and later lines of a case by a slash followed by the number +of the line within the case, e.g. `/2` for the second line. + +#### Example + +Consider the following data on used cars: + +``` +model year mileage price type age +Civic 2002 29883 15900 Si 2 +Civic 2003 13415 15900 EX 1 +Civic 1992 107000 3800 n/a 12 +Accord 2002 26613 17900 EX 1 +``` + +The following syntax can be used to read the used car data: + +``` +GET DATA /TYPE=TXT /FILE='cars.data' /ARRANGEMENT=FIXED /FIRSTCASE=2 + /VARIABLES=model 0-7 A + year 8-15 F + mileage 16-23 F + price 24-31 F + type 32-39 A + age 40-47 F. +``` diff --git a/rust/doc/src/commands/spss-io/get.md b/rust/doc/src/commands/spss-io/get.md new file mode 100644 index 0000000000..3777bcc54e --- /dev/null +++ b/rust/doc/src/commands/spss-io/get.md @@ -0,0 +1,55 @@ +# GET + +``` +GET + /FILE={'FILE_NAME',FILE_HANDLE} + /DROP=VAR_LIST + /KEEP=VAR_LIST + /RENAME=(SRC_NAMES=TARGET_NAMES)... + /ENCODING='ENCODING' +``` + + `GET` clears the current dictionary and active dataset and replaces +them with the dictionary and data from a specified file. + + The `FILE` subcommand is the only required subcommand. Specify the +SPSS system file, SPSS/PC+ system file, or SPSS portable file to be +read as a string file name or a [file +handle](../../language/files/file-handles.md). + + By default, all the variables in a file are read. The `DROP` +subcommand can be used to specify a list of variables that are not to +be read. By contrast, the `KEEP` subcommand can be used to specify +variables that are to be read, with all other variables not read. + + Normally variables in a file retain the names that they were saved +under. Use the `RENAME` subcommand to change these names. 
Specify, +within parentheses, a list of variable names followed by an equals sign +(`=`) and the names that they should be renamed to. Multiple +parenthesized groups of variable names can be included on a single +`RENAME` subcommand. Variables' names may be swapped using a `RENAME` +subcommand of the form `/RENAME=(A B=B A)`. + + Alternate syntax for the `RENAME` subcommand allows the parentheses +to be omitted. When this is done, only a single variable may be +renamed at once. For instance, `/RENAME=A=B`. This alternate syntax +is discouraged. + + `DROP`, `KEEP`, and `RENAME` are executed in left-to-right order. +Each may be present any number of times. `GET` never modifies a file on +disk. Only the active dataset read from the file is affected by these +subcommands. + + PSPP automatically detects the encoding of string data in the file, +when possible. The character encoding of old SPSS system files cannot +always be guessed correctly, and SPSS/PC+ system files do not include +any indication of their encoding. Specify the `ENCODING` subcommand +with an IANA character set name as its string argument to override the +default. Use `SYSFILE INFO` to analyze the encodings that might be +valid for a system file. The `ENCODING` subcommand is a PSPP extension. + + `GET` does not cause the data to be read, only the dictionary. The +data is read later, when a procedure is executed. + + Use of `GET` to read a portable file is a PSPP extension. + diff --git a/rust/doc/src/commands/spss-io/import.md b/rust/doc/src/commands/spss-io/import.md new file mode 100644 index 0000000000..60365f45ee --- /dev/null +++ b/rust/doc/src/commands/spss-io/import.md @@ -0,0 +1,29 @@ +# IMPORT + +``` +IMPORT + /FILE='FILE_NAME' + /TYPE={COMM,TAPE} + /DROP=VAR_LIST + /KEEP=VAR_LIST + /RENAME=(SRC_NAMES=TARGET_NAMES)... +``` + +The `IMPORT` transformation clears the active dataset dictionary and +data and replaces them with a dictionary and data from a system file or +portable file. 
+ +The `FILE` subcommand, which is the only required subcommand, +specifies the portable file to be read as a file name string or a +[file handle](../../language/files/file-handles.md). + +The `TYPE` subcommand is currently not used. + +`DROP`, `KEEP`, and `RENAME` follow the syntax used by +[`GET`](get.md). + +`IMPORT` does not cause the data to be read; only the dictionary. +The data is read later, when a procedure is executed. + +Use of `IMPORT` to read a system file is a PSPP extension. + diff --git a/rust/doc/src/commands/spss-io/index.md b/rust/doc/src/commands/spss-io/index.md new file mode 100644 index 0000000000..a7817a6900 --- /dev/null +++ b/rust/doc/src/commands/spss-io/index.md @@ -0,0 +1,4 @@ +# Working with SPSS Data Files + +These commands read and write data files in SPSS and other proprietary +or specialized data formats. diff --git a/rust/doc/src/commands/spss-io/save-data-collection.md b/rust/doc/src/commands/spss-io/save-data-collection.md new file mode 100644 index 0000000000..02d1b9ff5c --- /dev/null +++ b/rust/doc/src/commands/spss-io/save-data-collection.md @@ -0,0 +1,37 @@ +# SAVE DATA COLLECTION + +``` +SAVE DATA COLLECTION + /OUTFILE={'FILE_NAME',FILE_HANDLE} + /METADATA={'FILE_NAME',FILE_HANDLE} + /{UNCOMPRESSED,COMPRESSED,ZCOMPRESSED} + /PERMISSIONS={WRITEABLE,READONLY} + /DROP=VAR_LIST + /KEEP=VAR_LIST + /VERSION=VERSION + /RENAME=(SRC_NAMES=TARGET_NAMES)... + /NAMES + /MAP +``` + +Like `SAVE`, `SAVE DATA COLLECTION` writes the dictionary and data in +the active dataset to a system file. In addition, it writes metadata to +an additional XML metadata file. + +`OUTFILE` is required. Specify the system file to be written as a +string file name or a [file +handle](../../language/files/file-handles.md). + +`METADATA` is also required. Specify the metadata file to be written +as a string file name or a file handle. Metadata files customarily use +a `.mdd` extension. + +The current implementation of this command is experimental. 
It only +outputs an approximation of the metadata file format. Please report +bugs. + +Other subcommands are optional. They have the same meanings as in +the `SAVE` command. + +`SAVE DATA COLLECTION` causes the data to be read. It is a procedure. + diff --git a/rust/doc/src/commands/spss-io/save-translate.md b/rust/doc/src/commands/spss-io/save-translate.md new file mode 100644 index 0000000000..350d4fcc8a --- /dev/null +++ b/rust/doc/src/commands/spss-io/save-translate.md @@ -0,0 +1,127 @@ +# SAVE TRANSLATE + +``` +SAVE TRANSLATE + /OUTFILE={'FILE_NAME',FILE_HANDLE} + /TYPE={CSV,TAB} + [/REPLACE] + [/MISSING={IGNORE,RECODE}] + + [/DROP=VAR_LIST] + [/KEEP=VAR_LIST] + [/RENAME=(SRC_NAMES=TARGET_NAMES)...] + [/UNSELECTED={RETAIN,DELETE}] + [/MAP] + + ...additional subcommands depending on TYPE... +``` + +The `SAVE TRANSLATE` command is used to save data into various +formats understood by other applications. + +The `OUTFILE` and `TYPE` subcommands are mandatory. `OUTFILE` +specifies the file to be written, as a string file name or a file handle +(*note File Handles::). `TYPE` determines the format of the file +to write. It must be one of the following: + +* `CSV` + Comma-separated value format. + +* `TAB` + Tab-delimited format. + +By default, `SAVE TRANSLATE` does not overwrite an existing file. +Use `REPLACE` to force an existing file to be overwritten. + +With `MISSING=IGNORE`, the default, `SAVE TRANSLATE` treats +user-missing values as if they were not missing. Specify +`MISSING=RECODE` to output numeric user-missing values like +system-missing values and string user-missing values as all spaces. + +By default, all the variables in the active dataset dictionary are +saved to the output file, but `DROP` or `KEEP` can select a subset of +variables to save. 
The `RENAME` subcommand can also be used to change +the names under which variables are saved; because they are used only +in the output, these names do not have to conform to the usual PSPP +variable naming rules. `UNSELECTED` determines whether cases filtered +out by the `FILTER` command are written to the output file. These +subcommands have the same syntax and meaning as on the +[`SAVE`](save.md) command. + +Each supported file type has additional subcommands, explained in +separate sections below. + +`SAVE TRANSLATE` causes the data to be read. It is a procedure. + +## Comma- and Tab-Separated Data Files + +``` +SAVE TRANSLATE + /OUTFILE={'FILE_NAME',FILE_HANDLE} + /TYPE=CSV + [/REPLACE] + [/MISSING={IGNORE,RECODE}] + + [/DROP=VAR_LIST] + [/KEEP=VAR_LIST] + [/RENAME=(SRC_NAMES=TARGET_NAMES)...] + [/UNSELECTED={RETAIN,DELETE}] + + [/FIELDNAMES] + [/CELLS={VALUES,LABELS}] + [/TEXTOPTIONS DELIMITER='DELIMITER'] + [/TEXTOPTIONS QUALIFIER='QUALIFIER'] + [/TEXTOPTIONS DECIMAL={DOT,COMMA}] + [/TEXTOPTIONS FORMAT={PLAIN,VARIABLE}] +``` + +The `SAVE TRANSLATE` command with `TYPE=CSV` or `TYPE=TAB` writes data in a +comma- or tab-separated value format similar to that described by +RFC 4180. Each variable becomes one output column, and each case +becomes one line of output. If `FIELDNAMES` is specified, an additional +line at the top of the output file lists variable names. + +The `CELLS` and `TEXTOPTIONS FORMAT` settings determine how values are +written to the output file: + +* `CELLS=VALUES FORMAT=PLAIN` (the default settings) + Writes variables to the output in "plain" formats that ignore the + details of variable formats. Numeric values are written as plain + decimal numbers with enough digits to indicate their exact values + in machine representation. Numeric values include `e` followed by + an exponent if the exponent value would be less than -4 or greater + than 16. Dates are written in MM/DD/YYYY format and times in + HH:MM:SS format. 
`WKDAY` and `MONTH` values are written as decimal + numbers. + + Numeric values use, by default, the decimal point character set + with `SET DECIMAL` (*note SET DECIMAL::). Use `DECIMAL=DOT` or + `DECIMAL=COMMA` to force a particular decimal point character. + +* `CELLS=VALUES FORMAT=VARIABLE` + Writes variables using their print formats. Leading and trailing + spaces are removed from numeric values, and trailing spaces are + removed from string values. + +* `CELLS=LABELS FORMAT=PLAIN` + `CELLS=LABELS FORMAT=VARIABLE` + Writes value labels where they exist, and otherwise writes the + values themselves as described above. + + Regardless of `CELLS` and `TEXTOPTIONS FORMAT`, numeric system-missing +values are output as a single space. + + For `TYPE=TAB`, tab characters delimit values. For `TYPE=CSV`, the +`TEXTOPTIONS DELIMITER` and `DECIMAL` settings determine the character +that separates values within a line. If `DELIMITER` is specified, then +the specified string separates values. If `DELIMITER` is not +specified, then the default is a comma with `DECIMAL=DOT` or a +semicolon with `DECIMAL=COMMA`. If `DECIMAL` is not given either, it +is inferred from the decimal point character set with `SET DECIMAL` +(*note SET DECIMAL::). + + The `TEXTOPTIONS QUALIFIER` setting specifies a character that is +output before and after a value that contains the delimiter character or +the qualifier character. The default is a double quote (`"`). A +qualifier character that appears within a value is doubled. + diff --git a/rust/doc/src/commands/spss-io/save.md b/rust/doc/src/commands/spss-io/save.md new file mode 100644 index 0000000000..4f426d931b --- /dev/null +++ b/rust/doc/src/commands/spss-io/save.md @@ -0,0 +1,89 @@ +# SAVE + +``` +SAVE + /OUTFILE={'FILE_NAME',FILE_HANDLE} + /UNSELECTED={RETAIN,DELETE} + /{UNCOMPRESSED,COMPRESSED,ZCOMPRESSED} + /PERMISSIONS={WRITEABLE,READONLY} + /DROP=VAR_LIST + /KEEP=VAR_LIST + /VERSION=VERSION + /RENAME=(SRC_NAMES=TARGET_NAMES)... 
+ /NAMES + /MAP +``` + + The `SAVE` procedure causes the dictionary and data in the active +dataset to be written to a system file. + + `OUTFILE` is the only required subcommand. Specify the system file +to be written as a string file name or a [file +handle](../../language/files/file-handles.md). + + By default, cases excluded with `FILTER` are written to the system +file. These can be excluded by specifying `DELETE` on the `UNSELECTED` +subcommand. Specifying `RETAIN` makes the default explicit. + + The `UNCOMPRESSED`, `COMPRESSED`, and `ZCOMPRESSED` subcommands +determine the system file's compression level: + +* `UNCOMPRESSED` + Data is not compressed. Each numeric value uses 8 bytes of disk + space. Each string value uses one byte per column width, rounded + up to a multiple of 8 bytes. + +* `COMPRESSED` + Data is compressed in a simple way. Each integer numeric value + between −99 and 151, inclusive, or system missing value uses one + byte of disk space. Each 8-byte segment of a string that consists + only of spaces uses 1 byte. Any other numeric value or 8-byte + string segment uses 9 bytes of disk space. + +* `ZCOMPRESSED` + Data is compressed with the "deflate" compression algorithm + specified in RFC 1951 (the same algorithm used by `gzip`). Files + written with this compression level cannot be read by PSPP 0.8.1 or + earlier or by SPSS 20 or earlier. + +`COMPRESSED` is the default compression level. The SET command (*note +SET::) can change this default. + +The `PERMISSIONS` subcommand specifies operating system permissions +for the new system file. `WRITEABLE`, the default, creates the file +with read and write permission. `READONLY` creates the file for +read-only access. + +By default, all the variables in the active dataset dictionary are +written to the system file. The `DROP` subcommand can be used to +specify a list of variables not to be written. In contrast, `KEEP` +specifies variables to be written, with all variables not specified +not written. 
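A minimal sketch combining the `SAVE` subcommands described so far might look like the following. The file name `subset.sav` and the variable names `id`, `age`, and `income` are hypothetical and stand in for whatever the active dataset contains.

```
* Hypothetical file and variable names.
SAVE
  /OUTFILE='subset.sav'
  /UNSELECTED=DELETE
  /ZCOMPRESSED
  /PERMISSIONS=READONLY
  /KEEP=id age income.
```

Under these assumptions, only `id`, `age`, and `income` are written, cases excluded by `FILTER` are left out, and the resulting file is deflate-compressed and read-only.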
+ +Normally variables are saved to a system file under the same names +they have in the active dataset. Use the `RENAME` subcommand to change +these names. Specify, within parentheses, a list of variable names +followed by an equals sign (`=`) and the names that they should be +renamed to. Multiple parenthesized groups of variable names can be +included on a single `RENAME` subcommand. Variables' names may be +swapped using a `RENAME` subcommand of the form `/RENAME=(A B=B A)`. + +Alternate syntax for the `RENAME` subcommand allows the parentheses to +be eliminated. When this is done, only a single variable may be +renamed at once. For instance, `/RENAME=A=B`. This alternate syntax +is discouraged. + +`DROP`, `KEEP`, and `RENAME` are performed in left-to-right order. +They each may be present any number of times. `SAVE` never modifies +the active dataset. `DROP`, `KEEP`, and `RENAME` only affect the +system file written to disk. + +The `VERSION` subcommand specifies the version of the file format. +Valid versions are 2 and 3. The default version is 3. In version 2 +system files, variable names longer than 8 bytes are truncated. The +two versions are otherwise identical. + +The `NAMES` and `MAP` subcommands are currently ignored. + +`SAVE` causes the data to be read. It is a procedure. + diff --git a/rust/doc/src/commands/spss-io/sysfile-info.md b/rust/doc/src/commands/spss-io/sysfile-info.md new file mode 100644 index 0000000000..5f837fa2e1 --- /dev/null +++ b/rust/doc/src/commands/spss-io/sysfile-info.md @@ -0,0 +1,24 @@ +# SYSFILE INFO + +``` +SYSFILE INFO FILE='FILE_NAME' [ENCODING='ENCODING']. +``` + +`SYSFILE INFO` reads the dictionary in an SPSS system file, SPSS/PC+ +system file, or SPSS portable file, and displays the information in +its dictionary. + +Specify a file name or file handle. `SYSFILE INFO` reads that file +and displays information on its dictionary. + +PSPP automatically detects the encoding of string data in the file, +when possible. 
The character encoding of old SPSS system files cannot +always be guessed correctly, and SPSS/PC+ system files do not include +any indication of their encoding. Specify the `ENCODING` subcommand +with an IANA character set name as its string argument to override the +default, or specify `ENCODING='DETECT'` to analyze and report possibly +valid encodings for the system file. The `ENCODING` subcommand is a +PSPP extension. + +`SYSFILE INFO` does not affect the current active dataset. + diff --git a/rust/doc/src/commands/spss-io/xexport.md b/rust/doc/src/commands/spss-io/xexport.md new file mode 100644 index 0000000000..d791aa05c0 --- /dev/null +++ b/rust/doc/src/commands/spss-io/xexport.md @@ -0,0 +1,27 @@ +# XEXPORT + +``` +XEXPORT + /OUTFILE='FILE_NAME' + /DIGITS=N + /DROP=VAR_LIST + /KEEP=VAR_LIST + /RENAME=(SRC_NAMES=TARGET_NAMES)... + /TYPE={COMM,TAPE} + /MAP +``` + +The `XEXPORT` transformation writes the active dataset dictionary and +data to a specified portable file. + +This transformation is a PSPP extension. + +It is similar to the `EXPORT` procedure, with two differences: + +- `XEXPORT` is a transformation, not a procedure. It is executed when + the data is read by a procedure or procedure-like command. + +- `XEXPORT` does not support the `UNSELECTED` subcommand. + +See [`EXPORT`](export.md) for more information. + diff --git a/rust/doc/src/commands/spss-io/xsave.md b/rust/doc/src/commands/spss-io/xsave.md new file mode 100644 index 0000000000..0462500fb8 --- /dev/null +++ b/rust/doc/src/commands/spss-io/xsave.md @@ -0,0 +1,26 @@ +# XSAVE + +``` +XSAVE + /OUTFILE='FILE_NAME' + /{UNCOMPRESSED,COMPRESSED,ZCOMPRESSED} + /PERMISSIONS={WRITEABLE,READONLY} + /DROP=VAR_LIST + /KEEP=VAR_LIST + /VERSION=VERSION + /RENAME=(SRC_NAMES=TARGET_NAMES)... + /NAMES + /MAP +``` + +The `XSAVE` transformation writes the active dataset's dictionary and +data to a system file. 
It is similar to the `SAVE` procedure, with +two differences: + +- `XSAVE` is a transformation, not a procedure. It is executed when + the data is read by a procedure or procedure-like command. + +- `XSAVE` does not support the `UNSELECTED` subcommand. + +See [`SAVE`](save.md) for more information. +
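`XSAVE` takes the same `DROP`, `KEEP`, and `RENAME` subcommands as `SAVE`, and the `SAVE` page states that they are performed in left-to-right order. The Python sketch below models that ordering on a plain list of variable names. It is only an illustration, not PSPP code: the function name and the tuple-based representation of subcommands are hypothetical, names are compared case-sensitively, and the sketch leaves the relative order of surviving variables unchanged, which is a simplification that the text does not specify.

```python
def apply_subcommands(names, subcommands):
    """Apply DROP, KEEP, and RENAME to a list of names, left to right.

    `subcommands` is a list of (kind, argument) pairs:
      ("DROP", [names]), ("KEEP", [names]),
      ("RENAME", ([source names], [target names])).
    The input list is not modified, mirroring the statement that SAVE
    never modifies the active dataset.
    """
    current = list(names)
    for kind, arg in subcommands:
        if kind == "DROP":
            current = [n for n in current if n not in arg]
        elif kind == "KEEP":
            current = [n for n in current if n in arg]
        elif kind == "RENAME":
            sources, targets = arg
            if len(sources) != len(targets):
                raise ValueError("RENAME needs equally long name lists")
            # Renaming within one group is simultaneous, so (A B=B A) swaps.
            mapping = dict(zip(sources, targets))
            current = [mapping.get(n, n) for n in current]
            if len(set(current)) != len(current):
                raise ValueError("RENAME produced duplicate names")
        else:
            raise ValueError(f"unknown subcommand: {kind}")
    return current


# Hypothetical equivalent of /DROP=c /RENAME=(a b=b a) /KEEP=id a
active = ["id", "a", "b", "c"]
written = apply_subcommands(
    active,
    [("DROP", ["c"]), ("RENAME", (["a", "b"], ["b", "a"])), ("KEEP", ["id", "a"])],
)
print(written)
```

Because the steps run in order, the `a` that `KEEP` retains here is the variable originally named `b`: the swap has already taken effect by the time `KEEP` is evaluated.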