From: Ben Pfaff
Date: Mon, 25 Aug 2025 03:15:38 +0000 (-0700)
Subject: rust: Work on chapter in manual on invoking PSPP.
X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=5815480116de34f96f5647343cce022429c436ca;p=pspp

rust: Work on chapter in manual on invoking PSPP.
---

diff --git a/rust/doc/src/SUMMARY.md b/rust/doc/src/SUMMARY.md
index 8469a7e9a3..bc4f8640bd 100644
--- a/rust/doc/src/SUMMARY.md
+++ b/rust/doc/src/SUMMARY.md
@@ -3,12 +3,10 @@
 [Introduction](introduction.md)
 [License](license.md)
 
-# Running PSPP
-
-- [Invoking `pspp`](invoking/pspp.md)
-- [Invoking `pspp-convert`](invoking/pspp-convert.md)
-- [Invoking `pspp-output`](invoking/pspp-output.md)
-- [Invoking `pspp-dump-sav`](invoking/pspp-dump-sav.md)
+- [Running PSPP](invoking/index.md)
+  - [Converting data](invoking/pspp-convert.md)
+  - [Inspecting data](invoking/pspp-show.md)
+  - [Decrypting files](invoking/pspp-decrypt.md)
 
 # Language Overview
 
diff --git a/rust/doc/src/invoking/index.md b/rust/doc/src/invoking/index.md
new file mode 100644
index 0000000000..82ebc4adef
--- /dev/null
+++ b/rust/doc/src/invoking/index.md
@@ -0,0 +1,8 @@
+# Running `pspp`
+
+This chapter describes how to run `pspp`, PSPP's main command-line
+user interface. The `pspp` program has a number of commands, each of
+which is documented in its own section.
+
+To see a list of commands, run `pspp --help`. For help with a
+particular command, run `pspp --help`.
diff --git a/rust/doc/src/invoking/pspp-convert.md b/rust/doc/src/invoking/pspp-convert.md
index 988570eb80..42055240a2 100644
--- a/rust/doc/src/invoking/pspp-convert.md
+++ b/rust/doc/src/invoking/pspp-convert.md
@@ -1,26 +1,11 @@
-# Invoking `pspp-convert`
+# Converting data files with `pspp convert`
 
-`pspp-convert` is a command-line utility accompanying PSPP. It reads an
-SPSS or SPSS/PC+ system file or SPSS portable file or encrypted SPSS
-syntax file INPUT and writes a copy of it to another OUTPUT in a
-different format. Synopsis:
+`pspp convert [OUTPUT]` reads an SPSS data file from ``
+and writes a copy of it to `[OUTPUT]` (or to the terminal, if
+`[OUTPUT]` is omitted).
 
-```
-pspp-convert [OPTIONS] INPUT OUTPUT
-
-pspp-convert --help
-
-pspp-convert --version
-```
-
-The format of INPUT is automatically detected, when possible. The
-character encoding of old SPSS system files cannot always be guessed
-correctly, and SPSS/PC+ system files do not include any indication of
-their encoding. Use `-e ENCODING` to specify the encoding in this
-case.
-
-By default, the intended format for OUTPUT is inferred based on its
-extension:
+If `[OUTPUT]` is specified, then `pspp convert` tries to guess the
+output format based on its extension:
 
 * `csv`
   `txt`
@@ -32,139 +17,101 @@ extension:
   `sys`
   SPSS system file.
 
-* `por`
-  SPSS portable file.
-
-* `sps`
-  SPSS syntax file. (Only encrypted syntax files may be converted to
-  this format.)
-
-`pspp-convert` can convert most input formats to most output formats.
-Encrypted SPSS file formats are exceptions: if the input file is in an
-encrypted format, then the output file will be the same format
-(decrypted). To decrypt such a file, specify the encrypted file as
-INPUT. The output will be the equivalent plaintext file. Options for
-the output format are ignored in this case.
-
-The password for encrypted files can be specified a few different
-ways. If the password is known, use the `-p` option (documented below)
-or allow `pspp-convert` to prompt for it. If the password is unknown,
-use the `-a` and `-l` options to specify how to search for it, or
-`--password-list` to specify a file of passwords to try.
-
-Use `-O FORMAT` to override the inferred format or to specify the
-format for unrecognized extensions.
-
-`pspp-convert` accepts the following general options:
-
-* `-O FORMAT`
-  `--output-format=FORMAT`
-  Sets the output format, where FORMAT is one of the extensions
-  listed above, e.g.: `-O csv`. Use `--help` to list the supported
-  output formats.
-
-* `-c MAXCASES`
-  `--cases=MAXCASES`
-  By default, all cases are copied from INPUT to OUTPUT. Specifying
-  this option to limit the number of cases written to OUTPUT to
-  MAXCASES.
-
-* `-e CHARSET`
-  `--encoding=CHARSET`
-  Overrides the encoding in which character strings in INPUT are
-  interpreted. This option is necessary because old SPSS system
-  files, and SPSS/PC+ system files, do not self-identify their
-  encoding.
-
-* `-k VARIABLE...`
-  `--keep=VARIABLE...`
-  By default, `pspp-convert` includes all the variables from the
-  input file. Use this option to list specific variables to include;
-  any variables not listed will be dropped. The variables in the
-  output file will also be reordered into the given order. The
-  variable list may use `TO` in the same way as in PSPP syntax, e.g.
-  if the dictionary contains consecutive variables `a`, `b`, `c`, and
-  `d`, then `--keep='a to d'` will include all of them (and no
-  others).
-
-* `-d VARIABLE...`
-  `--drop=VARIABLE...`
-  Drops the specified variables from the output.
-
-  When `--keep` and `--drop` are used together, `--keep` is processed
-  first.
-
-* `-h`, `--help`
-  Prints a usage message on stdout and exits.
-
-* `-v`, `--version`
-  Prints version information on stdout and exits.
-
-The following options affect CSV output:
+Without an output file name, the default output format is CSV. Use
+`-O ` to override the default or to specify the format
+for unrecognized extensions.
+
+## Options
+
+`pspp convert` accepts the following general options:
+
+* `-O csv`
+  `-O sys`
+  Sets the output format.
+
+* `-e `
+  `--encoding=`
+  Sets the character encoding used to read text strings in the input
+  file. This is not needed for new enough SPSS data files, but older
+  data files do not identify their encoding, and PSPP cannot always
+  guess correctly.
+
+  `` must be one of the labels for encodings in the
+  [Encoding Standard]. PSPP does not support UTF-16 or EBCDIC
+  encodings data files.
+
+  `pspp show encodings` can help figure out the correct encoding for a
+  system file.
+
+  [Encoding Standard]: https://encoding.spec.whatwg.org/#names-and-labels
+
+* `-c `
+  `--cases=`
+  By default, all cases in the input are copied to the output.
+  Specify this option to limit the number of copied cases.
+
+* `-p `
+  `--password=`
+  Specifies the password for reading an encrypted SPSS system file.
+
+  `pspp convert` reads, but does not write, encrypted system files.
+
+  > ⚠️ The password (and other command-line options) may be visible to
+  other users on multiuser systems.
+
+## System File Output Options
+
+These options only affect output to SPSS system files.
+
+* `--unicode`
+  Writes system file output with Unicode (UTF-8) encoding. If the
+  input was not already in Unicode, then this causes string variables
+  to be tripled in width.
+
+* `--compression `
+  Writes data in the system file with the specified format of
+  compression:
+
+  - `simple`: A simple form of compression that saves space writing
+    small integer values and string segments that are all spaces. All
+    versions of SPSS support simple compression.
+
+  - `zlib`: More advanced compression that saves space in more general
+    cases. Only SPSS 21 and later can read files written with `zlib`
+    compression.
+
+## CSV Output Options
+
+These options only affect output to CSV files.
+
+* `--no-var-names`
+  By default, `pspp convert` writes the variable names as the first
+  line of output. With this option, `pspp convert` omits this line.
 
 * `--recode`
-  By default, `pspp-convert` writes user-missing values to CSV output
-  files as their regular values. With this option, `pspp-convert`
+  By default, `pspp convert` writes user-missing values to CSV output
+  files as their regular values. With this option, `pspp convert`
   recodes them to system-missing values (which are written as a
   single space).
 
-* `--no-var-names`
-  By default, `pspp-convert` writes the variable names as the first
-  line of output. With this option, `pspp-convert` omits this line.
-
 * `--labels`
-  By default, `pspp-convert` writes variables' values to CSV output
-  files. With this option, `pspp-convert` writes value labels.
+  By default, `pspp convert` writes variables' values to CSV output
+  files. With this option, `pspp convert` writes value labels.
 
 * `--print-formats`
-  By default, `pspp-convert` writes numeric variables as plain
-  numbers. This option makes `pspp-convert` honor variables' print
+  By default, `pspp convert` writes numeric variables as plain
+  numbers. This option makes `pspp convert` honor variables' print
   formats.
 
 * `--decimal=DECIMAL`
   This option sets the character used as a decimal point in output.
-  The default is `.`.
+  The default is `.`. Only ASCII characters may be used.
 
 * `--delimiter=DELIMITER`
   This option sets the character used to separate fields in output.
   The default is `,`, unless the decimal point is `,`, in which case
-  `;` is used.
+  `;` is used. Only ASCII characters may be used.
 
 * `--qualifier=QUALIFIER`
   The option sets the character used to quote fields that contain the
-  delimiter. The default is `"`.
-
-The following options specify how to obtain the password for
-encrypted files:
-
-* `-p PASSWORD`
-  `--password=PASSWORD`
-  Specifies the password to use to decrypt an encrypted SPSS system
-  file or syntax file. If this option is not specified,
-  `pspp-convert` will prompt interactively for the password as
-  necessary.
-
-  > ⚠️ Passwords (and other command-line options) may be visible to
-  other users on multiuser systems.
-
-  When used with `-a` (or `--password-alphabet`) and `-l` (or
-  `--password-length`), this option specifies the starting point for
-  the search. This can be used to restart a search that was
-  interrupted.
-
-* `-a ALPHABET`
-  `--password-alphabet=ALPHABET`
-  Specifies the alphabet of symbols over which to search for an
-  encrypted file's password. ALPHABET may include individual
-  characters and ranges delimited by `-`. For example, `-a a-z`
-  searches lowercase letters, `-a A-Z0-9` searches uppercase letters
-  and digits, and `-a ' -~'` searches all printable ASCII characters.
-
-* `-l MAX-LENGTH`
-  `--password-length=MAX-LENGTH`
-  Specifies the maximum length of the passwords to try.
-
-* `--password-list=FILE`
-  Specifies a file to read containing a list of passwords to try, one
-  per line. If FILE is `-`, reads from stdin.
-
+  delimiter. The default is `"`. Only ASCII characters may be used.
diff --git a/rust/doc/src/invoking/pspp-decrypt.md b/rust/doc/src/invoking/pspp-decrypt.md
new file mode 100644
index 0000000000..1f8b5447df
--- /dev/null
+++ b/rust/doc/src/invoking/pspp-decrypt.md
@@ -0,0 +1 @@
+# Decrypting SPSS files with `pspp decrypt`
diff --git a/rust/doc/src/invoking/pspp-dump-sav.md b/rust/doc/src/invoking/pspp-dump-sav.md
deleted file mode 100644
index ef409b0481..0000000000
--- a/rust/doc/src/invoking/pspp-dump-sav.md
+++ /dev/null
@@ -1,40 +0,0 @@
-# Invoking `pspp-dump-sav`
-
-`pspp-dump-sav` is a command-line utility accompanying PSPP. It is not
-installed by default, so it may be missing from your PSPP installation.
-It reads one or more SPSS system files and prints their contents. The
-output format is useful for debugging system file readers and writers
-and for discovering how to interpret unknown or poorly understood
-records. End users may find the output useful for providing the PSPP
-developers information about system files that PSPP does not accurately
-read.
-
-Synopsis:
-
-```
-pspp-dump-sav [-d[MAXCASES] | --data[=MAXCASES]] FILE...
-
-pspp-dump-sav --help | -h
-
-pspp-dump-sav --version | -v
-```
-
-The following options are accepted:
-
-* `-d[MAXCASES]`
-  `--data[=MAXCASES]`
-  By default, `pspp-dump-sav` does not print any of the data in a
-  system file, only the file headers. Specify this option to print
-  the data as well. If MAXCASES is specified, then it limits the
-  number of cases printed.
-
-* `-h`, `--help`
-  Prints a usage message on stdout and exits.
-
-* `-v`, `--version`
-  Prints version information on stdout and exits.
-
-Some errors that prevent files from being interpreted successfully
-cause `pspp-dump-sav` to exit without reading any additional files
-given on the command line.
-
diff --git a/rust/doc/src/invoking/pspp-output.md b/rust/doc/src/invoking/pspp-output.md
deleted file mode 100644
index c2258e6c69..0000000000
--- a/rust/doc/src/invoking/pspp-output.md
+++ /dev/null
@@ -1,211 +0,0 @@
-# Invoking `pspp-output`
-
-`pspp-output` is a command-line utility accompanying PSPP. It supports
-multiple operations on SPSS viewer or `.spv` files, here called SPV
-files. SPSS 16 and later writes SPV files to represent the contents of
-its output editor.
-
-SPSS 15 and earlier versions instead use `.spo` files. `pspp-output`
-does not support this format.
-
-`pspp-options` may be invoked in the following ways:
-
-```
-pspp-output detect FILE
-
-pspp-output [OPTIONS] dir FILE
-
-pspp-output [OPTIONS] convert SOURCE DESTINATION
-
-pspp-output [OPTIONS] get-table-look SOURCE DESTINATION
-
-pspp-output [OPTIONS] convert-table-look SOURCE DESTINATION
-
-pspp-output --help
-
-pspp-output --version
-```
-
-Each of these forms is documented separately below. `pspp-output`
-also has several undocumented command forms that developers may find
-useful for debugging.
-
-## The `detect` Command
-
-```
-pspp-output detect FILE
-```
-
-When FILE is an SPV file, `pspp-output` exits successfully without
-outputting anything. When FILE is not an SPV file or some other error
-occurs, `pspp-output` prints an error message and exits with a failure
-indication.
-
-## The `dir` Command
-
-```
-pspp-output [OPTIONS] dir FILE
-```
-
-Prints on stdout a table of contents for SPV file FILE. By default,
-this table lists every object in the file, except for hidden objects.
-See [Input Selection Options](#input-selection-options), for
-information on the options available to select a subset of objects.
-
-The following additional option for `dir` is intended mainly for use
-by PSPP developers:
-
-* `--member-names`
-  Also show the names of the Zip members associated with each object.
-
-## The `convert` Command
-
-```
-pspp-output [OPTIONS] convert SOURCE DESTINATION
-```
-
-Reads SPV file SOURCE and converts it to another format, writing the
-output to DESTINATION.
-
-By default, the intended format for DESTINATION is inferred based on
-its extension, in the same way that the `pspp` program does for its
-output files. See [Invoking `pspp`](pspp.md), for details.
-
-See [Input Selection Options](#input-selection-options), for
-information on the options available to select a subset of objects to
-include in the output. The following additional options are accepted:
-
-* `-O format=FORMAT`
-  Overrides the format inferred from the output file's extension. Use
-  `--help` to list the available formats. See [Invoking
-  `pspp`](pspp.md) for details of the available output formats.
-
-* `-O OPTION=VALUE`
-  Sets an option for the output file format. See [Invoking
-  `pspp`](pspp.md) for details of the available output options.
-
-* `-F`, `--force`
-  By default, if the source is corrupt or otherwise cannot be
-  processed, the destination is not written. With `-F` or `--force`,
-  the destination is written as best it can, even with errors.
-
-* `--table-look=FILE`
-  Reads a table style from FILE and applies it to all of the output
-  tables. The file should be a TableLook `.stt` or `.tlo` file.
-
-* `--use-page-setup`
-  By default, the `convert` command uses the default page setup (for
-  example, page size and margins) for DESTINATION, or the one
-  specified with `-O` options, if any. Specify this option to ignore
-  these sources of page setup in favor of the one embedded in the
-  SPV, if any.
-
-## The `get-table-look` Command
-
-```
-pspp-output [OPTIONS] get-table-look SOURCE DESTINATION
-```
-
-Reads SPV file SOURCE, applies any [input selection
-options](#input-selection-options), picks the first table from the
-selected object, extracts the TableLook from that table, and writes it
-to DESTINATION (typically with an `.stt` extension) in the TableLook
-XML format.
-
-Use `-` for SOURCE to instead write the default look to DESTINATION.
-
-The user may use the TableLook file to change the style of tables in
-other files, by passing it to the `--table-look` option on the `convert`
-command.
-
-## The `convert-table-look` Command
-
-```
-pspp-output [OPTIONS] convert-table-look SOURCE DESTINATION
-```
-
-Reads `.stt` or `.tlo` file SOURCE, and writes it back to DESTINATION
-(typically with an `.stt` extension) in the TableLook XML format. This
-is useful for converting a TableLook `.tlo` file from SPSS 15 or earlier
-into the newer `.stt` format.
-
-## Input Selection Options
-
-The `dir` and `convert` commands, by default, operate on all of the
-objects in the source SPV file, except for objects that are not visible
-in the output viewer window. The user may specify these options to
-select a subset of the input objects. When multiple options are used,
-only objects that satisfy all of them are selected:
-
-* `--select=[^]CLASS...`
-  Include only objects of the given CLASS; with leading `^`, include
-  only objects not in the class. Use commas to separate multiple
-  classes. The supported classes are `charts`, `headings`, `logs`,
-  `models`, `tables`, `texts`, `trees`, `warnings`, `outlineheaders`,
-  `pagetitle`, `notes`, `unknown`, and `other`.
-
-  Use `--select=help` to print this list of classes.
-
-* `--commands=[^]COMMAND...`
-  `--subtypes=[^]SUBTYPE...`
-  `--labels=[^]LABEL...`
-  Include only objects with the specified COMMAND, SUBTYPE, or LABEL.
-  With a leading `^`, include only the objects that do not match.
-  Multiple values may be specified separated by commas. An asterisk
-  at the end of a value acts as a wildcard.
-
-  The `--command` option matches command identifiers, case
-  insensitively. All of the objects produced by a single command use
-  the same, unique command identifier. Command identifiers are
-  always in English regardless of the language used for output. They
-  often differ from the command name in PSPP syntax. Use the
-  `pspp-output` program's `dir` command to print command identifiers
-  in particular output.
-
-  The `--subtypes` option matches particular tables within a command,
-  case insensitively. Subtypes are not necessarily unique: two
-  commands that produce similar output tables may use the same
-  subtype. Subtypes are always in English and `dir` will print them.
-
-  The `--labels` option matches the labels in table output (that is,
-  the table titles). Labels are affected by the output language,
-  variable names and labels, split file settings, and other factors.
-
-* `--nth-commands=N...`
-  Include only objects from the Nth command that matches `--command`
-  (or the Nth command overall if `--command` is not specified), where
-  N is 1 for the first command, 2 for the second, and so on.
-
-* `--instances=INSTANCE...`
-  Include the specified INSTANCE of an object that matches the other
-  criteria within a single command. The INSTANCE may be a number (1
-  for the first instance, 2 for the second, and so on) or `last` for
-  the last instance.
-
-* `--show-hidden`
-  Include hidden output objects in the output. By default, they are
-  excluded.
-
-* `--or`
-  Separates two sets of selection options. Objects selected by
-  either set of options are included in the output.
-
-The following additional input selection options are intended mainly
-for use by PSPP developers:
-
-* `--errors`
-  Include only objects that cause an error when read. With the
-  `convert` command, this is most useful in conjunction with the
-  `--force` option.
-
-* `--members=MEMBER...`
-  Include only the objects that include a listed Zip file MEMBER.
-  More than one name may be included, comma-separated. The members
-  in an SPV file may be listed with the `dir` command by adding the
-  `--show-members` option or with the `zipinfo` program included with
-  many operating systems. Error messages that `pspp-output` prints
-  when it reads SPV files also often include member names.
-
-* `--member-names`
-  Displays the name of the Zip member or members associated with each
-  object just above the object itself.
diff --git a/rust/doc/src/invoking/pspp-show.md b/rust/doc/src/invoking/pspp-show.md
new file mode 100644
index 0000000000..65319beb06
--- /dev/null
+++ b/rust/doc/src/invoking/pspp-show.md
@@ -0,0 +1 @@
+# Inspecting data files with `pspp show`
diff --git a/rust/doc/src/invoking/pspp.md b/rust/doc/src/invoking/pspp.md
deleted file mode 100644
index 59bacb2bf9..0000000000
--- a/rust/doc/src/invoking/pspp.md
+++ /dev/null
@@ -1,126 +0,0 @@
-# Invoking `pspp`
-
-This chapter describes how to invoke `pspp`, PSPP's main command-line
-user interface. The `pspp` program has a number of commands, which
-are documented separately.
-
-To see a list of commands, run `pspp --help`. For help with a
-particular command, run `pspp --help`.
-
-## Converting data files with `pspp convert`
-
-`pspp convert [OUTPUT]` reads an SPSS data file from ``
-and writes a copy of it to `[OUTPUT]` (or to the terminal, if
-`[OUTPUT]` is omitted).
-
-If `[OUTPUT]` is specified, then `pspp convert` tries to guess the
-output format based on its extension:
-
-* `csv`
-  `txt`
-  Comma-separated value. Each value is formatted according to its
-  variable's print format. The first line in the file contains
-  variable names.
-
-* `sav`
-  `sys`
-  SPSS system file.
-
-Without an output file name, the default output format is CSV. Use
-`-O ` to override the default or to specify the format
-for unrecognized extensions.
-
-### Options
-
-`pspp convert` accepts the following general options:
-
-* `-O csv`
-  `-O sys`
-  Sets the output format.
-
-* `-e `
-  `--encoding=`
-  Sets the character encoding used to read text strings in the input
-  file. This is not needed for new enough SPSS data files, but older
-  data files do not identify their encoding, and PSPP cannot always
-  guess correctly.
-
-  `` must be one of the labels for encodings in the
-  [Encoding Standard]. PSPP does not support UTF-16 or EBCDIC
-  encodings data files.
-
-  `pspp show encodings` can help figure out the correct encoding for a
-  system file.
-
-  [Encoding Standard]: https://encoding.spec.whatwg.org/#names-and-labels
-
-* `-c `
-  `--cases=`
-  By default, all cases in the input are copied to the output.
-  Specify this option to limit the number of copied cases.
-
-* `-p `
-  `--password=`
-  Specifies the password for reading an encrypted SPSS system file.
-
-  `pspp convert` reads, but does not write, encrypted system files.
-
-  > ⚠️ The password (and other command-line options) may be visible to
-  other users on multiuser systems.
-
-### System File Output Options
-
-These options only affect output to SPSS system files.
-
-* `--unicode`
-  Writes system file output with Unicode (UTF-8) encoding. If the
-  input was not already in Unicode, then this causes string variables
-  to be tripled in width.
-
-* `--compression `
-  Writes data in the system file with the specified format of
-  compression:
-
-  - `simple`: A simple form of compression that saves space writing
-    small integer values and string segments that are all spaces. All
-    versions of SPSS support simple compression.
-
-  - `zlib`: More advanced compression that saves space in more general
-    cases. Only SPSS 21 and later can read files written with `zlib`
-    compression.
-
-### CSV Output Options
-
-These options only affect output to CSV files.
-
-* `--no-var-names`
-  By default, `pspp convert` writes the variable names as the first
-  line of output. With this option, `pspp convert` omits this line.
-
-* `--recode`
-  By default, `pspp convert` writes user-missing values to CSV output
-  files as their regular values. With this option, `pspp convert`
-  recodes them to system-missing values (which are written as a
-  single space).
-
-* `--labels`
-  By default, `pspp convert` writes variables' values to CSV output
-  files. With this option, `pspp convert` writes value labels.
-
-* `--print-formats`
-  By default, `pspp convert` writes numeric variables as plain
-  numbers. This option makes `pspp convert` honor variables' print
-  formats.
-
-* `--decimal=DECIMAL`
-  This option sets the character used as a decimal point in output.
-  The default is `.`. Only ASCII characters may be used.
-
-* `--delimiter=DELIMITER`
-  This option sets the character used to separate fields in output.
-  The default is `,`, unless the decimal point is `,`, in which case
-  `;` is used. Only ASCII characters may be used.
-
-* `--qualifier=QUALIFIER`
-  The option sets the character used to quote fields that contain the
-  delimiter. The default is `"`. Only ASCII characters may be used.
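
The format-selection and delimiter rules that the new `pspp convert` text documents can be sketched in a few lines. This is only an illustration of the documented behavior, not PSPP's implementation: the names `EXTENSION_FORMATS`, `infer_output_format`, and `default_delimiter` are hypothetical, and matching extensions case-sensitively is an assumption, since the text does not say how case is handled.

```python
from pathlib import PurePath
from typing import Optional

# Extension-to-format table as listed in the documentation text:
# csv/txt select CSV output, sav/sys select an SPSS system file.
EXTENSION_FORMATS = {"csv": "csv", "txt": "csv", "sav": "sys", "sys": "sys"}


def infer_output_format(output: Optional[str],
                        override: Optional[str] = None) -> Optional[str]:
    """Return "csv" or "sys", or None if the extension is unrecognized.

    `override` stands in for an explicit -O option (hypothetical parameter).
    """
    if override is not None:
        # An explicit -O choice wins over any inference.
        return override
    if output is None:
        # No output file name: the documented default is CSV.
        return "csv"
    # Assumption: the extension is compared case-sensitively.
    ext = PurePath(output).suffix.lstrip(".")
    return EXTENSION_FORMATS.get(ext)


def default_delimiter(decimal: str = ".") -> str:
    """Default CSV field delimiter: `,` unless the decimal point is `,`."""
    return ";" if decimal == "," else ","
```

Under these assumptions, `infer_output_format("out.xyz")` returns `None`, which corresponds to the case where the text says `-O` is needed to specify the format for an unrecognized extension.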