file handle (@pxref{File Handles}). If the @subcmd{FILE} subcommand is not used,
then input is assumed to be specified within the command file using
@cmd{BEGIN DATA}@dots{}@cmd{END DATA} (@pxref{BEGIN DATA}).
-The @subcmd{ENCODING} subcommand may only be used if the @subcmd{FILE} subcommand is also used.
-It specifies the character encoding of the file.
+The @subcmd{ENCODING} subcommand may only be used if the @subcmd{FILE}
+subcommand is also used. It specifies the character encoding of the
+file. @xref{INSERT}, for information on supported encodings.
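+
+As a minimal sketch, assuming a hypothetical UTF-8 text file named
+@file{survey.txt} and made-up variable names and column positions,
+the following reads fixed-format data with an explicit encoding:
+
+@example
+DATA LIST FIXED FILE='survey.txt' ENCODING='UTF-8' NOTABLE
+  /id 1-4 name 6-25 (A).
+@end example
+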
The optional @subcmd{RECORDS} subcommand, which takes a single integer as an
argument, is used to specify the number of lines per record.
The @subcmd{NOTABLE} and @subcmd{TABLE} subcommands are as in @cmd{DATA LIST FIXED} above.
@subcmd{NOTABLE} is the default.
-The @subcmd{FILE} and @subcmd{SKIP} subcommands are as in @cmd{DATA LIST FIXED} above.
+The @subcmd{FILE}, @subcmd{SKIP}, and @subcmd{ENCODING} subcommands
+are as in @cmd{DATA LIST FIXED} above.
The variables to be parsed are given as a single list of variable names.
This list must be introduced by a single slash (@samp{/}). The set of
DATA LIST LIST
[(@{TAB,'@var{c}'@}, @dots{})]
[@{NOTABLE,TABLE@}]
- [FILE='@var{file_name'} [ENCODING='@var{encoding}']]
+ [FILE='@var{file_name}' [ENCODING='@var{encoding}']]
[SKIP=@var{record_count}]
/@var{var_spec}@dots{}
FILE HANDLE @var{handle_name}
/NAME='@var{file_name}'
[/MODE=CHARACTER]
+ [/ENDS=@{CR,CRLF@}]
/TABWIDTH=@var{tab_width}
+ [ENCODING='@var{encoding}']
For binary files in native encoding with fixed-length records:
FILE HANDLE @var{handle_name}
/NAME='@var{file_name}'
/MODE=IMAGE
[/LRECL=@var{rec_len}]
+ [ENCODING='@var{encoding}']
For binary files in native encoding with variable-length records:
FILE HANDLE @var{handle_name}
/NAME='@var{file_name}'
/MODE=BINARY
[/LRECL=@var{rec_len}]
+ [ENCODING='@var{encoding}']
For binary files encoded in EBCDIC:
FILE HANDLE @var{handle_name}
/MODE=360
/RECFORM=@{FIXED,VARIABLE,SPANNED@}
[/LRECL=@var{rec_len}]
+ [ENCODING='@var{encoding}']
@end display
Use @cmd{FILE HANDLE} to associate a file handle name with a file and
@itemize
@item
-In CHARACTER mode, the default, the data file is read as a text file,
-according to the local system's conventions, and each text line is
-read as one record.
+In CHARACTER mode, the default, the data file is read as a text file.
+Each text line is read as one record.
In CHARACTER mode only, tabs are expanded to spaces by input programs,
except by @cmd{DATA LIST FREE} with explicitly specified delimiters.
extension) may be used to specify an alternate width. Use a TABWIDTH
of 0 to suppress tab expansion.
+By default, a file written in CHARACTER mode uses the line ends of the
+system on which PSPP is running: CR LF on Windows, and LF only on
+other systems.  Specify ENDS as CR or CRLF to override the default.
+When reading, PSPP accepts either convention on any kind of system,
+regardless of ENDS.
+
@item
In IMAGE mode, the data file is treated as a series of fixed-length
binary records. LRECL should be used to specify the record length in
handle. It is required in all modes but SCRATCH mode, in which its
use is forbidden.
+The ENCODING subcommand specifies the encoding of text in the file.
+For reading text files in CHARACTER mode, all of the forms described
+for ENCODING on the INSERT command are supported (@pxref{INSERT}).
+For reading in other file-based modes, encoding autodetection is not
+supported; if the specified encoding requests autodetection then the
+default encoding will be used. This is also true when a file handle
+is used for writing a file in any mode.
+
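+As a sketch, assuming a hypothetical handle name and file name, and
+assuming that ENCODING may be introduced by a slash like the other
+subcommands, the following declares a UTF-8 text file that is written
+with CR LF line ends on any system:
+
+@example
+FILE HANDLE outf /NAME='report.txt' /MODE=CHARACTER
+  /ENDS=CRLF /ENCODING='UTF-8'.
+@end example
+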
@node INPUT PROGRAM
@section INPUT PROGRAM
@vindex INPUT PROGRAM
stops the flow of input data and passes out of the @cmd{INPUT PROGRAM}
structure.
+@cmd{INPUT PROGRAM} must contain at least one @cmd{DATA LIST} or
+@cmd{END FILE} command.
+
All this is very confusing. A few examples should help to clarify.
@c If you change this example, change the regression test1 in
@display
PRINT
- OUTFILE='@var{file_name}'
- RECORDS=@var{n_lines}
- @{NOTABLE,TABLE@}
+ [OUTFILE='@var{file_name}']
+ [RECORDS=@var{n_lines}]
+ [@{NOTABLE,TABLE@}]
+ [ENCODING='@var{encoding}']
[/[@var{line_no}] @var{arg}@dots{}]
@var{arg} takes one of the following forms:
- '@var{string}' [@var{start}-@var{end}]
+ '@var{string}' [@var{start}]
@var{var_list} @var{start}-@var{end} [@var{type_spec}]
@var{var_list} (@var{fortran_spec})
@var{var_list} *
inserted at beginning of each output line, even lines that otherwise
would be blank.
+The @subcmd{ENCODING} subcommand may only be used if the
+@subcmd{OUTFILE} subcommand is also used. It specifies the character
+encoding of the file. @xref{INSERT}, for information on supported
+encodings.
+
The @subcmd{RECORDS} subcommand specifies the number of lines to be output. The
number of lines may optionally be surrounded by parentheses.
be specified using multiple slashes with the intended output for a line
following its respective slash.
-
-Literal strings may be printed. Specify the string itself. Optionally
-the string may be followed by a column number or range of column
-numbers, specifying the location on the line for the string to be
-printed. Otherwise, the string will be printed at the current position
-on the line.
+Literal strings may be printed. Specify the string itself.
+Optionally the string may be followed by a column number, specifying
+the column on the line where the string should start. Otherwise, the
+string will be printed at the current position on the line.
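+
+For instance, as a sketch that assumes a numeric variable @code{id}
+already exists and uses a hypothetical output file name, the following
+prints a literal string starting at column 1, followed by the variable
+in columns 5 through 8:
+
+@example
+PRINT OUTFILE='report.txt' ENCODING='UTF-8'
+  /'ID:' 1 id 5-8.
+@end example
+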
Variables to be printed can be specified in the same ways as available
for @cmd{DATA LIST FIXED} (@pxref{DATA LIST FIXED}). In addition, a
@vindex PRINT SPACE
@display
-PRINT SPACE OUTFILE='file_name' n_lines.
+PRINT SPACE [OUTFILE='file_name'] [ENCODING='@var{encoding}'] [n_lines].
@end display
@cmd{PRINT SPACE} prints one or more blank lines to an output file.
Handles}). If OUTFILE is not specified then output will be directed to
the listing file.
+The @subcmd{ENCODING} subcommand may only be used if @subcmd{OUTFILE}
+is also used. It specifies the character encoding of the file.
+@xref{INSERT}, for information on supported encodings.
+
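+As a sketch with a hypothetical file name, the following writes two
+blank lines to an output file encoded in UTF-8:
+
+@example
+PRINT SPACE OUTFILE='report.txt' ENCODING='UTF-8' 2.
+@end example
+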
n_lines is also optional. If present, it is an expression
(@pxref{Expressions}) specifying the number of blank lines to be
printed. The expression must evaluate to a nonnegative value.
@vindex REREAD
@display
-REREAD FILE=handle COLUMN=column.
+REREAD [FILE=handle] [COLUMN=column] [ENCODING='@var{encoding}'].
@end display
The @cmd{REREAD} transformation allows the previous input line in a
the first column that should be included in the re-read line. Columns
are numbered from 1 at the left margin.
+The @subcmd{ENCODING} subcommand may only be used if the @subcmd{FILE}
+subcommand is also used. It specifies the character encoding of the
+file. @xref{INSERT}, for information on supported encodings.
+
Issuing @code{REREAD} multiple times will not back up in the data
file. Instead, it will re-read the same line multiple times.