+@c PSPP - a program for statistical analysis.
+@c Copyright (C) 2017, 2020 Free Software Foundation, Inc.
+@c Permission is granted to copy, distribute and/or modify this document
+@c under the terms of the GNU Free Documentation License, Version 1.3
+@c or any later version published by the Free Software Foundation;
+@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+@c A copy of the license is included in the section entitled "GNU
+@c Free Documentation License".
+@c
@node Combining Data Files
@chapter Combining Data Files
@cmd{ADD FILES}, @cmd{MATCH FILES}, and @cmd{UPDATE} commands. The
following sections describe details specific to each command.
-Each of these commands reads two or more input files and combines
-them. The command's output becomes the new active dataset. The input
-files are not changed on disk.
+Each of these commands reads two or more input files and combines them.
+The command's output becomes the new active dataset.
+None of the commands actually change the input files.
+Therefore, if you want the changes to become permanent, you must explicitly
+save them using an appropriate procedure or transformation (@pxref{System and Portable File IO}).
The syntax of each command begins with a specification of the files to
be read as input. For each input file, specify FILE with a system
subcommands that specify a parenthesized group or groups of variable
names as they appear in the input file, followed by those variables'
new names, separated by an equals sign (@subcmd{=}),
-e.g. @subcmd{/RENAME=(OLD1=NEW1)(OLD2=NEW2)}. To rename a single
+@i{e.g.} @subcmd{/RENAME=(OLD1=NEW1)(OLD2=NEW2)}. To rename a single
variable, the parentheses may be omitted: @subcmd{/RENAME=@var{old}=@var{new}}.
Within a parenthesized group, variables are renamed simultaneously, so
that @subcmd{/RENAME=(@var{A} @var{B}=@var{B} @var{A})} exchanges the
@item
The file label of the new active dataset (@pxref{FILE LABEL}) is that of the
-first specified FILE that has a file label.
+first specified @subcmd{FILE} that has a file label.
@item
The documents in the new active dataset (@pxref{DOCUMENT}) are the
The remaining subcommands apply to the output file as a whole, rather
than to individual input files. They must be specified at the end of
-the command specification, following all of the FILE and related
+the command specification, following all of the @subcmd{FILE} and related
subcommands. The most important of these subcommands is @subcmd{BY}, which
specifies a set of one or more variables that may be used to find
corresponding cases in each of the input files. The variables
system-missing value for numeric variables or spaces for string
variables.
+These commands may combine any number of files, limited only by the
+machine's memory.
+
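+As a rough illustration of how the common subcommands fit together,
+the following sketch combines two system files with @cmd{ADD FILES}
+and then saves the result.  The file names and the variable names
+@code{id}, @code{score}, @code{points}, and @code{from_b} are made up
+for this example, and both input files are assumed to be sorted
+already on @code{id}.
+
+@example
+ADD FILES
+  /FILE='a.sav'
+  /FILE='b.sav' /RENAME=(score=points) /IN=from_b
+  /BY id.
+SAVE /OUTFILE='combined.sav'.
+@end example
+
+The input files are left untouched; only @file{combined.sav} is written.
+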
@node ADD FILES
@section ADD FILES
@vindex ADD FILES
@item
If @subcmd{BY} is used, @cmd{MATCH FILES} combines cases from each input file that
-have identical values for the BY variables.
+have identical values for the @subcmd{BY} variables.
When @subcmd{BY} is used, @subcmd{TABLE} subcommands may be used to introduce a @dfn{table
lookup file}. @subcmd{TABLE} has the same syntax as @subcmd{FILE}, and the @subcmd{RENAME}, @subcmd{IN}, and
@end display
@cmd{UPDATE} updates a @dfn{master file} by applying modifications
-from one or more @dfn{transaction files}.
+from one or more @dfn{transaction files}.
@cmd{UPDATE} shares the bulk of its syntax with other @pspp{} commands for
combining multiple data files. @xref{Combining Files Common Syntax},
@itemize @bullet
@item
When a match is found, then the values of the variables present in the
-transaction file replace those variable's values in the new active
+transaction file replace those variables' values in the new active
file. If there are matching cases in more than one transaction file,
@pspp{} applies the replacements from the first transaction file, then
from the second transaction file, and so on. Similarly, if a single
transaction file has cases with duplicate @subcmd{BY} values, then those are
applied in order to the master file.
-When a variable in a transaction file has a missing value or a string
+When a variable in a transaction file has a missing value or when a string
variable's value is all blanks, that value is never used to update the
master file.