1 @node System and Portable File IO
2 @chapter System and Portable File I/O
4 The commands in this chapter read, write, and examine system files and portable files.
8 * APPLY DICTIONARY:: Apply system file dictionary to active dataset.
9 * EXPORT:: Write to a portable file.
10 * GET:: Read from a system file.
11 * GET DATA:: Read from foreign files.
12 * IMPORT:: Read from a portable file.
13 * SAVE:: Write to a system file.
14 * SAVE TRANSLATE:: Write data in foreign file formats.
15 * SYSFILE INFO:: Display system file dictionary.
16 * XEXPORT:: Write to a portable file, as a transformation.
17 * XSAVE:: Write to a system file, as a transformation.
20 @node APPLY DICTIONARY
21 @section APPLY DICTIONARY
22 @vindex APPLY DICTIONARY
25 APPLY DICTIONARY FROM=@{'file-name',file_handle@}.
28 @cmd{APPLY DICTIONARY} applies the variable labels, value labels,
29 and missing values taken from a file to corresponding
30 variables in the active dataset. In some cases it also updates the weighting variable.
33 Specify a system file or portable file's name, a data set name
34 (@pxref{Datasets}), or a file handle name (@pxref{File Handles}). The
35 dictionary in the file will be read, but it will not replace the
36 active dataset's dictionary. The file's data will not be read.
38 Only variables with names that exist in both the active dataset and the
39 system file are considered. Variables with the same name but different
40 types (numeric, string) will cause an error message. Otherwise, the
41 system file variables' attributes will replace those in their matching
42 active dataset variables:
46 If a system file variable has a variable label, then it will replace
47 the variable label of the active dataset variable. If the system
48 file variable does not have a variable label, then the active dataset
49 variable's variable label, if any, will be retained.
52 If the system file variable has custom attributes (@pxref{VARIABLE
53 ATTRIBUTE}), then those attributes replace the active dataset variable's
54 custom attributes. If the system file variable does not have custom
55 attributes, then the active dataset variable's custom attributes, if any, are retained.
59 If the active dataset variable is numeric or short string, then value
60 labels and missing values, if any, will be copied to the active dataset
61 variable. If the system file variable does not have value labels or
62 missing values, then those in the active dataset variable, if any, will not be changed.
66 In addition to properties of variables, some properties of the active
67 file dictionary as a whole are updated:
71 If the system file has custom attributes (@pxref{DATAFILE ATTRIBUTE}),
72 then those attributes replace the active dataset's custom attributes.
76 If the active dataset has a weighting variable (@pxref{WEIGHT}), and the
77 system file does not, or if the weighting variable in the system file
78 does not exist in the active dataset, then the active dataset weighting
79 variable, if any, is retained. Otherwise, the weighting variable in
80 the system file becomes the active dataset weighting variable.
83 @cmd{APPLY DICTIONARY} takes effect immediately. It does not read the
84 active dataset. The system file is not modified.
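
As a minimal sketch of typical use, the following syntax loads one system
file and then copies variable labels, value labels, and missing values from
another file whose variables share the same names. The file names
@samp{survey-2024.sav} and @samp{survey-2023.sav} are hypothetical.

@example
GET FILE='survey-2024.sav'.
APPLY DICTIONARY FROM='survey-2023.sav'.
@end example
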
93 /UNSELECTED=@{RETAIN,DELETE@}
97 /RENAME=(src_names=target_names)@dots{}
102 The @cmd{EXPORT} procedure writes the active dataset's dictionary and
103 data to a specified portable file.
105 By default, cases excluded with FILTER are written to the
106 file. These can be excluded by specifying DELETE on the UNSELECTED
107 subcommand. Specifying RETAIN makes the default explicit.
109 Portable files express real numbers in base 30. Integers are always
110 expressed to the maximum precision needed to make them exact.
111 Non-integers are, by default, expressed to the machine's maximum
112 natural precision (approximately 15 decimal digits on many machines).
113 If many numbers require this many digits, the portable file may
114 significantly increase in size. As an alternative, the DIGITS
115 subcommand may be used to specify the number of decimal digits of
116 precision to write. DIGITS applies only to non-integers.
118 The OUTFILE subcommand, which is the only required subcommand, specifies
119 the portable file to be written as a file name string or
120 a file handle (@pxref{File Handles}).
122 DROP, KEEP, and RENAME follow the same format as the SAVE procedure (@pxref{SAVE}).
125 The TYPE subcommand specifies the character set for use in the
126 portable file. Its value is currently not used.
128 The MAP subcommand is currently ignored.
130 @cmd{EXPORT} is a procedure. It causes the active dataset to be read.
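
The subcommands described above can be combined as in the following sketch,
which omits cases excluded by FILTER and limits non-integers to 6 decimal
digits of precision. The file name @samp{results.por} and the choice of 6
digits are assumptions made for illustration.

@example
EXPORT OUTFILE='results.por'
    /UNSELECTED=DELETE
    /DIGITS=6.
@end example
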
138 /FILE=@{'file-name',file_handle@}
141 /RENAME=(src_names=target_names)@dots{}
145 @cmd{GET} clears the current dictionary and active dataset and
146 replaces them with the dictionary and data from a specified file.
148 The FILE subcommand is the only required subcommand. Specify the system
149 file or portable file to be read as a string file name or
150 a file handle (@pxref{File Handles}).
152 By default, all the variables in a file are read. The DROP
153 subcommand can be used to specify a list of variables that are not to be
154 read. By contrast, the KEEP subcommand can be used to specify the variables
155 that are to be read; all other variables are skipped.
157 Normally variables in a file retain the names that they were
158 saved under. Use the RENAME subcommand to change these names. Specify,
159 within parentheses, a list of variable names followed by an equals sign
160 (@samp{=}) and the names that they should be renamed to. Multiple
161 parenthesized groups of variable names can be included on a single
162 RENAME subcommand. Variables' names may be swapped using a RENAME
163 subcommand of the form @samp{/RENAME=(A B=B A)}.
165 Alternate syntax for the RENAME subcommand allows the parentheses to be
166 eliminated. When this is done, only a single variable may be renamed at
167 once. For instance, @samp{/RENAME=A=B}. This alternate syntax is deprecated.
170 DROP, KEEP, and RENAME are executed in left-to-right order.
171 Each may be present any number of times. @cmd{GET} never modifies a
172 file on disk. Only the active dataset read from the file
173 is affected by these subcommands.
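
Because the subcommands are executed from left to right, a KEEP that follows
a RENAME is expected to refer to the new names. The following is a sketch
only: it assumes a hypothetical file @samp{survey.sav} that contains
variables named @samp{id}, @samp{q1}, and @samp{q2}.

@example
GET FILE='survey.sav'
    /RENAME=(q1 q2=age income)
    /KEEP=id age income.
@end example
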
175 PSPP tries to automatically detect the encoding of string data in the
176 file. Sometimes, however, this does not work well,
177 especially for files written by old versions of SPSS or PSPP. Specify
178 the ENCODING subcommand with an IANA character set name as its string
179 argument to override the default. The ENCODING subcommand is a PSPP extension.
182 @cmd{GET} does not cause the data to be read, only the dictionary. The data
183 is read later, when a procedure is executed.
185 Use of @cmd{GET} to read a portable file is a PSPP extension.
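
When automatic detection picks the wrong encoding, the ENCODING subcommand
can name one explicitly. In this sketch both the file name
@samp{legacy.sav} and the choice of @samp{windows-1252} are assumptions;
substitute the character set that the file was actually written in.

@example
GET FILE='legacy.sav'
    /ENCODING='windows-1252'.
@end example
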
193 /TYPE=@{GNM,ODS,PSQL,TXT@}
194 @dots{}additional subcommands depending on TYPE@dots{}
197 The @cmd{GET DATA} command is used to read files and other data
198 sources created by other applications. When this command is executed,
199 the current dictionary and active dataset are replaced with variables
200 and data read from the specified source.
202 The TYPE subcommand is mandatory and must be the first subcommand
203 specified. It determines the type of the file or source to read.
204 PSPP currently supports the following file types:
208 Spreadsheet files created by Gnumeric (@url{http://gnumeric.org}).
211 Spreadsheet files in OpenDocument format.
214 Relations from PostgreSQL databases (@url{http://postgresql.org}).
217 Textual data files in columnar and delimited formats.
220 Each supported file type has additional subcommands, explained in
221 separate sections below.
224 * GET DATA /TYPE=GNM/ODS:: Spreadsheets
225 * GET DATA /TYPE=PSQL:: Databases
226 * GET DATA /TYPE=TXT:: Delimited Text Files
229 @node GET DATA /TYPE=GNM/ODS
230 @subsection Spreadsheet Files
233 GET DATA /TYPE=@{GNM, ODS@}
234 /FILE=@{'file-name'@}
235 /SHEET=@{NAME 'sheet-name', INDEX n@}
236 /CELLRANGE=@{RANGE 'range', FULL@}
237 /READNAMES=@{ON, OFF@}
243 @cindex spreadsheet files
245 Gnumeric spreadsheets (@url{http://gnumeric.org}), and spreadsheets
246 in OpenDocument format
247 (@url{http://libreplanet.org/wiki/Group:OpenDocument/Software})
248 can be read using the GET DATA command.
249 Use the TYPE subcommand to indicate the file's format.
250 /TYPE=GNM indicates Gnumeric files,
251 /TYPE=ODS indicates OpenDocument.
252 The FILE subcommand is mandatory.
253 Use it to specify the name of the file to be read.
254 All other subcommands are optional.
256 The format of each variable is determined by the format of the spreadsheet
257 cell containing the first datum for the variable.
258 If this cell is of string (text) format, then the width of the variable is
259 determined from the length of the string it contains, unless the
260 ASSUMEDVARWIDTH subcommand is given.
262 The SHEET subcommand specifies the sheet within the spreadsheet file to read.
263 There are two forms of the SHEET subcommand.
265 In the first form, @samp{/SHEET=name @var{sheet-name}}, the string @var{sheet-name} is the
266 name of the sheet to read.
267 In the second form, @samp{/SHEET=index @var{idx}}, @var{idx} is an
268 integer which is the index of the sheet to read.
269 The first sheet has the index 1.
270 If the SHEET subcommand is omitted, then the command will read the
271 first sheet in the file.
273 The CELLRANGE subcommand specifies the range of cells within the sheet to read.
274 If the subcommand is given as @samp{/CELLRANGE=FULL}, then the entire sheet is read.
276 To read only part of a sheet, use the form
277 @samp{/CELLRANGE=range '@var{top-left-cell}:@var{bottom-right-cell}'}.
278 For example, the subcommand @samp{/CELLRANGE=range 'C3:P19'} reads
279 columns C--P, and rows 3--19 inclusive.
280 If no CELLRANGE subcommand is given, then the entire sheet is read.
282 If @samp{/READNAMES=ON} is specified, then the contents of cells of
283 the first row are used as the names of the variables in which to store
284 the data from subsequent rows. This is the default.
285 If @samp{/READNAMES=OFF} is
286 used, then the variables receive automatically assigned names.
288 The ASSUMEDVARWIDTH subcommand specifies the maximum width of string
289 variables read from the file.
290 If omitted, the default value is determined from the length of the
291 string in the first spreadsheet cell for each variable.
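
The subcommands above can be combined as in the following sketch, which reads
cells C3 through P19 of a named sheet, takes variable names from the first
row of that range, and caps string variables at 40 bytes. The file name
@samp{budget.ods}, the sheet name @samp{Q1}, and the width of 40 are
hypothetical.

@example
GET DATA /TYPE=ODS
    /FILE='budget.ods'
    /SHEET=NAME 'Q1'
    /CELLRANGE=RANGE 'C3:P19'
    /READNAMES=ON
    /ASSUMEDVARWIDTH=40.
@end example
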
294 @node GET DATA /TYPE=PSQL
295 @subsection Postgres Database Queries
299 /CONNECT=@{connection info@}
309 The PSQL type is used to import data from a postgres database server.
310 The server may be located locally or remotely.
311 Variables are automatically created based on the table column names
312 or the names specified in the SQL query.
313 Postgres data types of high precision will lose precision when imported into PSPP.
315 Not all postgres data types can be represented in PSPP.
316 If a datum cannot be represented, a warning will be issued and that
317 datum will be set to SYSMIS.
319 The CONNECT subcommand is mandatory.
320 It is a string specifying the parameters of the database server from
321 which the data should be fetched.
322 The format of the string is given in the postgres manual
323 @url{http://www.postgresql.org/docs/8.0/static/libpq.html#LIBPQ-CONNECT}.
325 The SQL subcommand is mandatory.
326 It must be a valid SQL string to retrieve data from the database.
328 The ASSUMEDVARWIDTH subcommand specifies the maximum width of string
329 variables read from the database.
330 If omitted, the default value is determined from the length of the
331 string in the first value read for each variable.
333 The UNENCRYPTED subcommand allows data to be retrieved over an insecure connection.
335 If the connection is not encrypted, and the UNENCRYPTED subcommand is not
336 given, then an error will occur.
337 Whether or not the connection is
338 encrypted depends upon the underlying psql library and the
339 capabilities of the database server.
341 The BSIZE subcommand serves only to optimise the speed of data transfer.
342 It specifies an upper limit on the
343 number of cases to fetch from the database at once.
344 The default value is 4096.
345 If your SQL statement fetches a large number of cases but only a small number of
346 variables, then the data transfer may be faster if you increase this value.
347 Conversely, if the number of variables is large, or if the machine on which
348 PSPP is running has only a
349 small amount of memory, then a smaller value will be better.
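
As a hedged sketch of the optional subcommands, the following query fetches
only two columns, permits an unencrypted connection, and raises the batch
size above the default. The host, database, user, table, and column names
are all hypothetical, and the value 20000 is an arbitrary choice rather than
a recommendation.

@example
GET DATA /TYPE=PSQL
    /CONNECT='host=db.example.org dbname=sales user=report'
    /SQL='select id, region from orders'
    /UNENCRYPTED
    /BSIZE=20000.
@end example
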
352 The following syntax is an example:
355 /CONNECT='host=example.com port=5432 dbname=product user=fred password=xxxx'
356 /SQL='select * from manufacturer'.
360 @node GET DATA /TYPE=TXT
361 @subsection Textual Data Files
365 /FILE=@{'file-name',file_handle@}
366 [/ARRANGEMENT=@{DELIMITED,FIXED@}]
367 [/FIRSTCASE=@{first_case@}]
368 [/IMPORTCASE=@{ALL,FIRST max_cases,PERCENT percent@}]
369 @dots{}additional subcommands depending on ARRANGEMENT@dots{}
374 When TYPE=TXT is specified, GET DATA reads data in a delimited or
375 fixed columnar format, much like DATA LIST (@pxref{DATA LIST}).
377 The FILE subcommand is mandatory. Specify the file to be read as
378 a string file name or (for textual data
379 only) a file handle (@pxref{File Handles}).
381 The ARRANGEMENT subcommand determines the file's basic format.
382 DELIMITED, the default setting, specifies that fields in the input
383 data are separated by spaces, tabs, or other user-specified
384 delimiters. FIXED specifies that fields in the input data appear at
385 particular fixed column positions within records of a case.
387 By default, cases are read from the input file starting from the first
388 line. To skip lines at the beginning of an input file, set FIRSTCASE
389 to the number of the first line to read: 2 to skip the first line, 3
390 to skip the first two lines, and so on.
392 IMPORTCASE can be used to limit the number of cases read from the
393 input file. With the default setting, ALL, all cases in the file are
394 read. Specify FIRST @i{max_cases} to read at most @i{max_cases} cases
395 from the file. Use PERCENT @i{percent} to read only @i{percent}
396 percent, approximately, of the cases contained in the file. (The
397 percentage is approximate, because there is no way to accurately count
398 the number of cases in the file without reading the entire file. The
399 number of cases in some kinds of unusual files cannot be estimated;
400 PSPP will read all cases in such files.)
402 FIRSTCASE and IMPORTCASE may be used with delimited and fixed-format
403 data. The remaining subcommands, which apply only to one of the two file
404 arrangements, are described below.
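
For instance, the following sketch skips a one-line header and reads at most
100 cases from a comma-delimited file. The file name @samp{readings.csv}
and the variables @samp{station} and @samp{temp} are assumptions; DELIMITERS
and VARIABLES are explained with the delimited arrangement.

@example
GET DATA /TYPE=TXT /FILE='readings.csv'
    /FIRSTCASE=2
    /IMPORTCASE=FIRST 100
    /DELIMITERS=','
    /VARIABLES=station A8
               temp F5.1.
@end example
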
407 * GET DATA /TYPE=TXT /ARRANGEMENT=DELIMITED::
408 * GET DATA /TYPE=TXT /ARRANGEMENT=FIXED::
411 @node GET DATA /TYPE=TXT /ARRANGEMENT=DELIMITED
412 @subsubsection Reading Delimited Data
416 /FILE=@{'file-name',file_handle@}
417 [/ARRANGEMENT=@{DELIMITED,FIXED@}]
418 [/FIRSTCASE=@{first_case@}]
419 [/IMPORTCASE=@{ALL,FIRST max_cases,PERCENT percent@}]
421 /DELIMITERS="delimiters"
422 [/QUALIFIER="quotes" [/ESCAPE]]
423 [/DELCASE=@{LINE,VARIABLES n_variables@}]
424 /VARIABLES=del_var [del_var]@dots{}
425 where each del_var takes the form:
429 The GET DATA command with TYPE=TXT and ARRANGEMENT=DELIMITED reads
430 input data from text files in delimited format, where fields are
431 separated by a set of user-specified delimiters. Its capabilities are
432 similar to those of DATA LIST FREE (@pxref{DATA LIST FREE}), with a few enhancements.
435 The required FILE subcommand and optional FIRSTCASE and IMPORTCASE
436 subcommands are described above (@pxref{GET DATA /TYPE=TXT}).
438 DELIMITERS, which is required, specifies the set of characters that
439 may separate fields. Each character in the string specified on
440 DELIMITERS separates one field from the next. The end of a line also
441 separates fields, regardless of DELIMITERS. Two consecutive
442 delimiters in the input yield an empty field, as does a delimiter at
443 the end of a line. A space character as a delimiter is an exception:
444 consecutive spaces do not yield an empty field and neither does any
445 number of spaces at the end of a line.
447 To use a tab as a delimiter, specify @samp{\t} at the beginning of the
448 DELIMITERS string. To use a backslash as a delimiter, specify
449 @samp{\\} as the first delimiter or, if a tab should also be a
450 delimiter, immediately following @samp{\t}. To read a data file in
451 which each field appears on a separate line, specify the empty string for DELIMITERS.
454 The optional QUALIFIER subcommand names one or more characters that
455 can be used to quote values within fields in the input. A field that
456 begins with one of the specified quote characters ends at the next
457 matching quote. Intervening delimiters become part of the field,
458 instead of terminating it. The ability to specify more than one quote
459 character is a PSPP extension.
461 By default, a character specified on QUALIFIER cannot itself be
462 embedded within a field that it quotes, because the quote character
463 always terminates the quoted field. With ESCAPE, however, a doubled
464 quote character within a quoted field inserts a single instance of the
465 quote into the field. For example, if @samp{'} is specified on
466 QUALIFIER, then without ESCAPE @code{'a''b'} specifies a pair of
467 fields that contain @samp{a} and @samp{b}, but with ESCAPE it
468 specifies a single field that contains @samp{a'b}. ESCAPE is a PSPP extension.
471 The DELCASE subcommand controls how data may be broken across lines in
472 the data file. With LINE, the default setting, each line must contain
473 all the data for exactly one case. For additional flexibility, to
474 allow a single case to be split among lines or multiple cases to be
475 contained on a single line, specify VARIABLES @i{n_variables}, where
476 @i{n_variables} is the number of variables per case.
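
As an illustration, suppose a file holds a stream of space-separated numbers
in which every three consecutive values form one case, with line breaks
falling at arbitrary points. A sketch of syntax for reading it follows; the
file name @samp{stream.txt} and the variable names are hypothetical.

@example
GET DATA /TYPE=TXT /FILE='stream.txt'
    /DELIMITERS=' '
    /DELCASE=VARIABLES 3
    /VARIABLES=x F8.2
               y F8.2
               z F8.2.
@end example
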
478 The VARIABLES subcommand is required and must be the last subcommand.
479 Specify the name of each variable and its input format (@pxref{Input
480 and Output Formats}) in the order they should be read from the input file.
483 @subsubheading Examples
486 On a Unix-like system, the @samp{/etc/passwd} file has a format similar to this:
490 root:$1$nyeSP5gD$pDq/:0:0:,,,:/root:/bin/bash
491 blp:$1$BrP/pFg4$g7OG:1000:1000:Ben Pfaff,,,:/home/blp:/bin/bash
492 john:$1$JBuq/Fioq$g4A:1001:1001:John Darrington,,,:/home/john:/bin/bash
493 jhs:$1$D3li4hPL$88X1:1002:1002:Jason Stover,,,:/home/jhs:/bin/csh
497 The following syntax reads a file in the format used by @samp{/etc/passwd}:
500 @c If you change this example, change the regression test in
501 @c tests/language/data-io/get-data.at to match.
503 GET DATA /TYPE=TXT /FILE='/etc/passwd' /DELIMITERS=':'
504 /VARIABLES=username A20
514 Consider the following data on used cars:
517 model year mileage price type age
518 Civic 2002 29883 15900 Si 2
519 Civic 2003 13415 15900 EX 1
520 Civic 1992 107000 3800 n/a 12
521 Accord 2002 26613 17900 EX 1
525 The following syntax can be used to read the used car data:
527 @c If you change this example, change the regression test in
528 @c tests/language/data-io/get-data.at to match.
530 GET DATA /TYPE=TXT /FILE='cars.data' /DELIMITERS=' ' /FIRSTCASE=2
540 Consider the following information on animals in a pet store:
543 'Pet''s Name', "Age", "Color", "Date Received", "Price", "Height", "Type"
544 , (Years), , , (Dollars), ,
545 "Rover", 4.5, Brown, "12 Feb 2004", 80, '1''4"', "Dog"
546 "Charlie", , Gold, "5 Apr 2007", 12.3, "3""", "Fish"
547 "Molly", 2, Black, "12 Dec 2006", 25, '5"', "Cat"
548 "Gilly", , White, "10 Apr 2007", 10, "3""", "Guinea Pig"
552 The following syntax can be used to read the pet store data:
554 @c If you change this example, change the regression test in
555 @c tests/language/data-io/get-data.at to match.
557 GET DATA /TYPE=TXT /FILE='pets.data' /DELIMITERS=', ' /QUALIFIER='''"' /ESCAPE
568 @node GET DATA /TYPE=TXT /ARRANGEMENT=FIXED
569 @subsubsection Reading Fixed Columnar Data
573 /FILE=@{'file-name',file_handle@}
574 [/ARRANGEMENT=@{DELIMITED,FIXED@}]
575 [/FIRSTCASE=@{first_case@}]
576 [/IMPORTCASE=@{ALL,FIRST max_cases,PERCENT percent@}]
579 /VARIABLES fixed_var [fixed_var]@dots{}
580 [/rec# fixed_var [fixed_var]@dots{}]@dots{}
581 where each fixed_var takes the form:
582 variable start-end format
585 The GET DATA command with TYPE=TXT and ARRANGEMENT=FIXED reads input
586 data from text files in fixed format, where each field is located in
587 particular fixed column positions within records of a case. Its
588 capabilities are similar to those of DATA LIST FIXED (@pxref{DATA LIST
589 FIXED}), with a few enhancements.
591 The required FILE subcommand and optional FIRSTCASE and IMPORTCASE
592 subcommands are described above (@pxref{GET DATA /TYPE=TXT}).
594 The optional FIXCASE subcommand may be used to specify the positive
595 integer number of input lines that make up each case. The default
598 The VARIABLES subcommand, which is required, specifies the positions
599 at which each variable can be found. For each variable, specify its
600 name, followed by its start and end column separated by @samp{-}
601 (e.g.@: @samp{0-9}), followed by an input format type (e.g.@:
602 @samp{F}) or a full format specification (e.g.@: @samp{DOLLAR12.2}).
603 For this command, columns are numbered starting from 0 at
604 the left column. Introduce the variables in the second and later
605 lines of a case by a slash followed by the number of the line within
606 the case, e.g.@: @samp{/2} for the second line.
608 @subsubheading Examples
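
As a first sketch, consider a hypothetical file @samp{patients.txt} in which
each case occupies two lines: an identifier and name on the first, and weight
and height on the second. The column positions below are assumptions chosen
for illustration; note that the first column is numbered 0 and that
@samp{/2} introduces the variables on the second line of each case.

@example
GET DATA /TYPE=TXT /FILE='patients.txt' /ARRANGEMENT=FIXED
    /FIXCASE=2
    /VARIABLES=id 0-3 F
               name 5-24 A
               /2 weight 0-5 F
                  height 7-11 F.
@end example
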
611 Consider the following data on used cars:
614 model year mileage price type age
615 Civic 2002 29883 15900 Si 2
616 Civic 2003 13415 15900 EX 1
617 Civic 1992 107000 3800 n/a 12
618 Accord 2002 26613 17900 EX 1
622 The following syntax can be used to read the used car data:
624 @c If you change this example, change the regression test in
625 @c tests/language/data-io/get-data.at to match.
627 GET DATA /TYPE=TXT /FILE='cars.data' /ARRANGEMENT=FIXED /FIRSTCASE=2
628 /VARIABLES=model 0-7 A
646 /RENAME=(src_names=target_names)@dots{}
649 The @cmd{IMPORT} transformation clears the active dataset dictionary and data and
651 replaces them with a dictionary and data from a system file or portable file.
654 The FILE subcommand, which is the only required subcommand, specifies
655 the portable file to be read as a file name string or a file handle
656 (@pxref{File Handles}).
658 The TYPE subcommand is currently not used.
660 DROP, KEEP, and RENAME follow the syntax used by @cmd{GET} (@pxref{GET}).
662 @cmd{IMPORT} does not cause the data to be read, only the dictionary. The
663 data is read later, when a procedure is executed.
665 Use of @cmd{IMPORT} to read a system file is a PSPP extension.
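
A minimal sketch, assuming a hypothetical portable file
@samp{survey.por} that contains variables named @samp{id}, @samp{age}, and
@samp{income}:

@example
IMPORT FILE='survey.por'
    /KEEP=id age income.
@end example
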
673 /OUTFILE=@{'file-name',file_handle@}
674 /UNSELECTED=@{RETAIN,DELETE@}
675 /@{COMPRESSED,UNCOMPRESSED@}
676 /PERMISSIONS=@{WRITEABLE,READONLY@}
680 /RENAME=(src_names=target_names)@dots{}
685 The @cmd{SAVE} procedure causes the dictionary and data in the active dataset to
687 be written to a system file.
689 OUTFILE is the only required subcommand. Specify the system file
690 to be written as a string file name or a file handle
691 (@pxref{File Handles}).
693 By default, cases excluded with FILTER are written to the system file.
694 These can be excluded by specifying DELETE on the UNSELECTED
695 subcommand. Specifying RETAIN makes the default explicit.
697 The COMPRESSED and UNCOMPRESSED subcommands determine whether the saved
698 system file is compressed. By default, system files are compressed.
699 This default can be changed with the SET command (@pxref{SET}).
701 The PERMISSIONS subcommand specifies permissions for the new system
702 file. WRITEABLE, the default, creates the file with read and write
703 permission. READONLY creates the file for read-only access.
705 By default, all the variables in the active dataset dictionary are written
706 to the system file. The DROP subcommand can be used to specify a list
707 of variables not to be written. In contrast, KEEP specifies variables
708 to be written; all other variables are omitted.
710 Normally variables are saved to a system file under the same names they
711 have in the active dataset. Use the RENAME subcommand to change these names.
712 Specify, within parentheses, a list of variable names followed by an
713 equals sign (@samp{=}) and the names that they should be renamed to.
714 Multiple parenthesized groups of variable names can be included on a
715 single RENAME subcommand. Variables' names may be swapped using a
716 RENAME subcommand of the form @samp{/RENAME=(A B=B A)}.
718 Alternate syntax for the RENAME subcommand allows the parentheses to be
719 eliminated. When this is done, only a single variable may be renamed at
720 once. For instance, @samp{/RENAME=A=B}. This alternate syntax is deprecated.
723 DROP, KEEP, and RENAME are performed in left-to-right order. They
724 each may be present any number of times. @cmd{SAVE} never modifies
725 the active dataset. DROP, KEEP, and RENAME only affect the system file written to disk.
728 The VERSION subcommand specifies the version of the file format. Valid
729 versions are 2 and 3. The default version is 3. In version 2 system
730 files, variable names longer than 8 bytes will be truncated. The two
731 versions are otherwise identical.
733 The NAMES and MAP subcommands are currently ignored.
735 @cmd{SAVE} causes the data to be read. It is a procedure.
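
The following sketch combines several of the subcommands described above. It
writes an uncompressed, read-only system file, leaves out two scratch
variables, and stores @samp{q1} under the name @samp{age}. The file name
@samp{archive.sav} and all variable names are hypothetical.

@example
SAVE OUTFILE='archive.sav'
    /UNCOMPRESSED
    /PERMISSIONS=READONLY
    /DROP=scratch1 scratch2
    /RENAME=(q1=age)
    /VERSION=3.
@end example
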
738 @section SAVE TRANSLATE
739 @vindex SAVE TRANSLATE
743 /OUTFILE=@{'file-name',file_handle@}
746 [/MISSING=@{IGNORE,RECODE@}]
750 [/RENAME=(src_names=target_names)@dots{}]
751 [/UNSELECTED=@{RETAIN,DELETE@}]
754 @dots{}additional subcommands depending on TYPE@dots{}
757 The @cmd{SAVE TRANSLATE} command is used to save data into various
758 formats understood by other applications.
760 The OUTFILE and TYPE subcommands are mandatory. OUTFILE specifies the
761 file to be written, as a string file name or a file handle
762 (@pxref{File Handles}). TYPE determines the type of the file
763 to write. It must be one of the following:
767 Comma-separated value format.
770 Tab-delimited format.
773 By default, SAVE TRANSLATE will not overwrite an existing file. Use
774 REPLACE to force an existing file to be overwritten.
776 With MISSING=IGNORE, the default, SAVE TRANSLATE treats user-missing
777 values as if they were not missing. Specify MISSING=RECODE to output
778 numeric user-missing values like system-missing values and string
779 user-missing values as all spaces.
781 By default, all the variables in the active dataset dictionary are saved
782 to the output file, but DROP or KEEP can select a subset of variables
783 to save. The RENAME subcommand can also be used to change the names
784 under which variables are saved. UNSELECTED determines whether cases
785 filtered out by the FILTER command are written to the output file.
786 These subcommands have the same syntax and meaning as on the
787 @cmd{SAVE} command (@pxref{SAVE}).
789 Each supported file type has additional subcommands, explained in
790 separate sections below.
792 @cmd{SAVE TRANSLATE} causes the data to be read. It is a procedure.
795 * SAVE TRANSLATE /TYPE=CSV and TYPE=TAB::
798 @node SAVE TRANSLATE /TYPE=CSV and TYPE=TAB
799 @subsection Writing Comma- and Tab-Separated Data Files
803 /OUTFILE=@{'file-name',file_handle@}
806 [/MISSING=@{IGNORE,RECODE@}]
810 [/RENAME=(src_names=target_names)@dots{}]
811 [/UNSELECTED=@{RETAIN,DELETE@}]
814 [/CELLS=@{VALUES,LABELS@}]
815 [/TEXTOPTIONS DELIMITER='delimiter']
816 [/TEXTOPTIONS QUALIFIER='qualifier']
817 [/TEXTOPTIONS DECIMAL=@{DOT,COMMA@}]
818 [/TEXTOPTIONS FORMAT=@{PLAIN,VARIABLE@}]
821 The SAVE TRANSLATE command with TYPE=CSV or TYPE=TAB writes data in a
822 comma- or tab-separated value format similar to that described by
823 RFC@tie{}4180. Each variable becomes one output column, and each case
824 becomes one line of output. If FIELDNAMES is specified, an additional
825 line at the top of the output file lists variable names.
827 The CELLS and TEXTOPTIONS FORMAT settings determine how values are
828 written to the output file:
831 @item CELLS=VALUES FORMAT=PLAIN (the default settings)
832 Writes variables to the output in ``plain'' formats that ignore the
833 details of variable formats. Numeric values are written as plain
834 decimal numbers with enough digits to indicate their exact values in
835 machine representation. Numeric values include @samp{e} followed by
836 an exponent if the exponent value would be less than -4 or greater
837 than 16. Dates are written in MM/DD/YYYY format and times in HH:MM:SS
838 format. WKDAY and MONTH values are written as decimal numbers.
840 Numeric values use, by default, the decimal point character set with
841 SET DECIMAL (@pxref{SET DECIMAL}). Use DECIMAL=DOT or DECIMAL=COMMA
842 to force a particular decimal point character.
844 @item CELLS=VALUES FORMAT=VARIABLE
845 Writes variables using their print formats. Leading and trailing
846 spaces are removed from numeric values, and trailing spaces are
847 removed from string values.
849 @item CELLS=LABELS FORMAT=PLAIN
850 @itemx CELLS=LABELS FORMAT=VARIABLE
851 Writes value labels where they exist, and otherwise writes the values
852 themselves as described above.
855 Regardless of CELLS and TEXTOPTIONS FORMAT, numeric system-missing
856 values are output as a single space.
858 For TYPE=TAB, tab characters delimit values. For TYPE=CSV, the
859 TEXTOPTIONS DELIMITER and DECIMAL settings determine the character
860 that separates values within a line. If DELIMITER is specified, then
861 the specified string separates values. If DELIMITER is not specified,
862 then the default is a comma with DECIMAL=DOT or a semicolon with
863 DECIMAL=COMMA. If DECIMAL is not given either, it is implied by the
864 decimal point character set with SET DECIMAL (@pxref{SET DECIMAL}).
866 The TEXTOPTIONS QUALIFIER setting specifies a character that is output
867 before and after a value that contains the delimiter character or the
868 qualifier character. The default is a double quote (@samp{"}). A
869 qualifier character that appears within a value is doubled.
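
As a hedged sketch of these settings, the following syntax overwrites any
existing output file, writes a header line of variable names, substitutes
value labels where they exist, and produces semicolon-separated values with a
comma as the decimal point. The file name @samp{cars.csv} is hypothetical.

@example
SAVE TRANSLATE /OUTFILE='cars.csv' /TYPE=CSV
    /REPLACE
    /MISSING=RECODE
    /FIELDNAMES
    /CELLS=LABELS
    /TEXTOPTIONS DELIMITER=';'
    /TEXTOPTIONS DECIMAL=COMMA.
@end example
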
872 @section SYSFILE INFO
876 SYSFILE INFO FILE='file-name'.
879 @cmd{SYSFILE INFO} reads the dictionary in a system file and
880 displays the information that it contains.
882 Specify a file name or file handle. @cmd{SYSFILE INFO} reads that file as
883 a system file and displays information on its dictionary.
885 @cmd{SYSFILE INFO} does not affect the current active dataset.
897 /RENAME=(src_names=target_names)@dots{}
902 The @cmd{XEXPORT} transformation writes the active dataset dictionary and
903 data to a specified portable file.
905 This transformation is a PSPP extension.
907 It is similar to the @cmd{EXPORT} procedure, with two differences:
911 @cmd{XEXPORT} is a transformation, not a procedure. It is executed when
912 the data is read by a procedure or procedure-like command.
915 @cmd{XEXPORT} does not support the UNSELECTED subcommand.
918 @xref{EXPORT}, for more information.
927 /@{COMPRESSED,UNCOMPRESSED@}
928 /PERMISSIONS=@{WRITEABLE,READONLY@}
932 /RENAME=(src_names=target_names)@dots{}
937 The @cmd{XSAVE} transformation writes the active dataset's dictionary and
938 data to a system file. It is similar to the @cmd{SAVE}
939 procedure, with two differences:
943 @cmd{XSAVE} is a transformation, not a procedure. It is executed when
944 the data is read by a procedure or procedure-like command.
947 @cmd{XSAVE} does not support the UNSELECTED subcommand.
950 @xref{SAVE}, for more information.
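
Because @cmd{XSAVE} is a transformation, it can be placed between other
transformations so that one pass over the data writes more than one file.
The following is a sketch under stated assumptions: the active dataset has a
numeric variable named @samp{income}, both file names are hypothetical, and
EXECUTE stands in for any procedure that causes the data to be read.

@example
XSAVE OUTFILE='before.sav'.
COMPUTE income = income * 1.05.
XSAVE OUTFILE='after.sav'.
EXECUTE.
@end example

In this arrangement, @samp{before.sav} would be expected to hold the original
values of @samp{income} and @samp{after.sav} the adjusted ones.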