lexer: Reimplement for better testability and internationalization.

author Ben Pfaff <blp@cs.stanford.edu>

Sun, 20 Mar 2011 00:05:47 +0000 (17:05 -0700)

committer Ben Pfaff <blp@cs.stanford.edu>

Sun, 20 Mar 2011 16:43:45 +0000 (09:43 -0700)
diff --git a/NEWS b/NEWS

index a96068484d3ae8aaa1ebe87619d64ebf5875c6e6..b4bb63f7a8f6a3753a02cdb872ade889e72fa334 100644
--- a/NEWS
+++ b/NEWS
@@ -1,12 +1,54 @@
  PSPP NEWS -- history of user-visible changes.
-Time-stamp: <2010-11-21 11:58:30 blp>
-Copyright (C) 1996-9, 2000, 2008, 2009, 2010 Free Software Foundation, Inc.
+Time-stamp: <2011-03-19 16:39:28 blp>
+Copyright (C) 1996-9, 2000, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.
  See the end for copying conditions.
  
  Please send PSPP bug reports to bug-gnu-pspp@gnu.org.
  
  Changes from 0.7.3 to 0.7.6:
  
+ * The "pspp" program has a new option --batch (or -b) that selects
+   "batch" syntax mode.  In previous versions of PSPP this syntax mode
+   was the default.  Now a new "auto" syntax mode is the default.  In
+   "auto" mode, PSPP interprets most syntax files correctly regardless
+   of their intended syntax mode.
+
+   See the "Syntax Variants" section in the PSPP manual for more
+   information.
+
+ * The "pspp" program has a new option --syntax-encoding that
+   specifies the encoding for syntax files listed on the command line,
+   as well as the default encoding for syntax files included with
+   INCLUDE or INSERT.  The default is to accept the system locale
+   encoding, UTF-8, UTF-16, or UTF-32, automatically detecting which
+   one the syntax file uses.
+
+   See the documentation for the INSERT command in the PSPP manual for
+   more information.
+
+ * The INCLUDE and INSERT commands now support the ENCODING subcommand
+   to specify the encoding for the included syntax file.
+
+ * Strings may now include arbitrary Unicode code points specified in
+   hexadecimal, using the syntax U'hhhh'.  For example, Unicode code
+   point U+1D11E, the musical G clef character, may be expressed as
+   U'1D11E'.
+
+   See the "Tokens" section in the PSPP manual for more information.
+
+ * In previous versions of PSPP, in a string expressed in hexadecimal
+   with X'hh' syntax, the hexadecimal digits expressed bytes in the
+   locale encoding.  In this version of PSPP, X'hh' syntax always
+   expresses bytes in UTF-8 encoding.
+
+   See the "Tokens" section in the PSPP manual for more information.
+
+ * The DO REPEAT command has been reimplemented.  The most prominent
+   change is that when a DO REPEAT block contains an INCLUDE or INSERT
+   command, substitutions are not applied to the included file.
+
+   See the "DO REPEAT" section in the PSPP manual for more information.
+
   * NPAR TESTS now supports the /KRUSKAL-WALLIS and /RUNS subcommands.
  
   * AUTORECODE now supports the /GROUP subcommand.
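
The U'hhhh' entry in the NEWS text above can be illustrated with a short sketch. This is not PSPP's lexer code: `encode_utf8` and `decode_u_string` are hypothetical helpers, and the 8-digit limit is an assumption of the sketch. It parses the hexadecimal digits as a Unicode code point, rejects values above U+10FFFF or in the surrogate range U+D800 to U+DFFF (the validity rule stated in the language.texi hunk of this commit), and writes the UTF-8 encoding.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of PSPP: encodes code point UC as UTF-8 into
   BUF.  Returns the number of bytes written, or 0 if UC is not a valid code
   point (above U+10FFFF or a surrogate). */
static size_t
encode_utf8 (uint32_t uc, unsigned char buf[4])
{
  if (uc > 0x10FFFF || (uc >= 0xD800 && uc <= 0xDFFF))
    return 0;
  if (uc < 0x80)
    {
      buf[0] = (unsigned char) uc;
      return 1;
    }
  if (uc < 0x800)
    {
      buf[0] = (unsigned char) (0xC0 | (uc >> 6));
      buf[1] = (unsigned char) (0x80 | (uc & 0x3F));
      return 2;
    }
  if (uc < 0x10000)
    {
      buf[0] = (unsigned char) (0xE0 | (uc >> 12));
      buf[1] = (unsigned char) (0x80 | ((uc >> 6) & 0x3F));
      buf[2] = (unsigned char) (0x80 | (uc & 0x3F));
      return 3;
    }
  buf[0] = (unsigned char) (0xF0 | (uc >> 18));
  buf[1] = (unsigned char) (0x80 | ((uc >> 12) & 0x3F));
  buf[2] = (unsigned char) (0x80 | ((uc >> 6) & 0x3F));
  buf[3] = (unsigned char) (0x80 | (uc & 0x3F));
  return 4;
}

/* Hypothetical helper: HEX holds the digits between the quotes of a U'hhhh'
   string.  Returns the UTF-8 length written to BUF, or 0 on invalid input. */
static size_t
decode_u_string (const char *hex, unsigned char buf[4])
{
  assert (hex != NULL);

  size_t n = strlen (hex);
  if (n == 0 || n > 8)          /* Assumed limit: at most 8 hex digits. */
    return 0;
  for (size_t i = 0; i < n; i++)
    if (!isxdigit ((unsigned char) hex[i]))
      return 0;

  return encode_utf8 ((uint32_t) strtoul (hex, NULL, 16), buf);
}
```

Calling `decode_u_string ("1D11E", buf)` fills `buf` with the four-byte UTF-8 sequence for the G clef; a surrogate such as `"D800"` is rejected with a return value of 0.
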
diff --git a/Smake b/Smake

index 683a8e382871f190c1a824d66f263a7a65a602c1..a1a81651309b0cea35e0109cec80803e0d0ebd79 100644
--- a/Smake
+++ b/Smake
@@ -49,6 +49,7 @@ GNULIB_MODULES = \
         printf-posix \
         printf-safe \
         progname \
+       rawmemchr \
         read-file \
         regex \
         relocatable-prog \
@@ -80,6 +81,7 @@ GNULIB_MODULES = \
         unistr/u8-cpy \
         unistr/u8-mbtouc \
         unistr/u8-strlen \
+       unistr/u8-strmbtouc \
         unistr/u8-strncat \
         uniwidth/u8-strwidth \
         unitypes \
diff --git a/doc/dev/concepts.texi b/doc/dev/concepts.texi

index 24c1654150fdb84f473b1acd09bb789fc0662ab1..053d2521c1298045b0ca774509f0d0379aada084 100644
--- a/doc/dev/concepts.texi
+++ b/doc/dev/concepts.texi
@@ -1220,12 +1220,12 @@ The following sections describe variable-related functions and macros.
  @node Variable Name
  @subsection Variable Name
  
-A variable name is a string between 1 and @code{VAR_NAME_LEN} bytes
+A variable name is a string between 1 and @code{ID_MAX_LEN} bytes
  long that satisfies the rules for PSPP identifiers
  (@pxref{Tokens,,,pspp, PSPP Users Guide}).  Variable names are
  mixed-case and treated case-insensitively.
  
-@deftypefn Macro int VAR_NAME_LEN
+@deftypefn Macro int ID_MAX_LEN
  Maximum length of a variable name, in bytes, currently 64.
  @end deftypefn
  
@@ -1248,23 +1248,6 @@ dictionary.  Use @func{dict_rename_var} instead (@pxref{Dictionary
  Renaming Variables}).
  @end deftypefun
  
-@anchor{var_is_plausible_name}
-@deftypefun {bool} var_is_valid_name (const char *@var{name}, bool @var{issue_error})
-@deftypefunx {bool} var_is_plausible_name (const char *@var{name}, bool @var{issue_error})
-Tests @var{name} for validity or ``plausibility.''  Returns true if
-the name is acceptable, false otherwise.  If the name is not
-acceptable and @var{issue_error} is true, also issues an error message
-explaining the violation.
-
-A valid name is one that fully satisfies all of the requirements for
-variable names (@pxref{Tokens,,,pspp, PSPP Users Guide}).  A
-``plausible'' name is simply a string whose length is in the valid
-range and that is not a reserved word.  PSPP accepts plausible but
-invalid names as variable names in some contexts where the character
-encoding scheme is ambiguous, as when reading variable names from
-system files.
-@end deftypefun
-
  @deftypefun {enum dict_class} var_get_dict_class (const struct variable *@var{var})
  Returns the dictionary class of @var{var}'s name (@pxref{Dictionary
  Class}).
@@ -1764,7 +1747,7 @@ To delete a variable from a dictionary and destroy it, use
  @node Variable Short Names
  @subsection Variable Short Names
  
-PSPP variable names may be up to 64 (@code{VAR_NAME_LEN}) bytes long.
+PSPP variable names may be up to 64 (@code{ID_MAX_LEN}) bytes long.
  The system and portable file formats, however, were designed when
  variable names were limited to 8 bytes in length.  Since then, the
  system file format has been augmented with an extension record that
@@ -1829,7 +1812,7 @@ been assigned a short name.
  Sets @var{var}'s short name to @var{short_name}, or removes
  @var{var}'s short name if @var{short_name} is a null pointer.  If it
  is non-null, then @var{short_name} must be a plausible name for a
-variable (@pxref{var_is_plausible_name}).  The name will be truncated
+variable.  The name will be truncated
  to 8 bytes in length and converted to all-uppercase.
  @end deftypefun
  
diff --git a/doc/flow-control.texi b/doc/flow-control.texi

index 868143b35a5a0ba8b73c5f628834f9b7a39cc87a..892887e211f9cb5b9f68f4f79a562b7796b33d55 100644
--- a/doc/flow-control.texi
+++ b/doc/flow-control.texi
@@ -72,6 +72,7 @@ expansion takes one of the following forms:
          var_list
          num_or_range@dots{}
          'string'@dots{}
+        ALL
  
  num_or_range takes one of the following forms:
          number
@@ -82,13 +83,11 @@ num_or_range takes one of the following forms:
  different variables, numbers, or strings into the block with each
  repetition.
  
-Specify a dummy variable name followed by an equals sign (@samp{=}) and
-the list of replacements.  Replacements can be a list of variables
-(which may be existing variables or new variables or some combination),
-numbers, or strings.  When new variable names are
-specified, @cmd{DO REPEAT} creates them as numeric variables.  When numbers
-are specified, runs of increasing integers may be indicated as
-@code{@var{num1} TO @var{num2}}, so that
+Specify a dummy variable name followed by an equals sign (@samp{=})
+and the list of replacements.  Replacements can be a list of existing
+or new variables, numbers, strings, or @code{ALL} to specify all
+existing variables.  When numbers are specified, runs of increasing
+integers may be indicated as @code{@var{num1} TO @var{num2}}, so that
  @samp{1 TO 5} is short for @samp{1 2 3 4 5}.
  
  Multiple dummy variables can be specified.  Each
@@ -100,10 +99,22 @@ each dummy variable is substituted; the second time, the second value
  for each dummy variable is substituted; and so on.
  
  Dummy variable substitutions work like macros.  They take place
-anywhere in a line that the dummy variable name occurs as a token,
-including command and subcommand names.  For this reason,
-words commonly used in command and subcommand names should not be used
-as dummy variable identifiers.
+anywhere in a line that the dummy variable name occurs.  This includes
+command and subcommand names, so command and subcommand names that
+appear in the code block should not be used as dummy variable
+identifiers.  Dummy variable substitutions do not occur inside quoted
+strings, comments, unquoted strings (such as the text on the
+@cmd{TITLE} or @cmd{DOCUMENT} command), or inside @cmd{BEGIN
+DATA}@dots{}@cmd{END DATA}.
+
+New variable names used as replacements are not automatically created
+as variables.  They are created only if used in the code block in a
+context that would create them, e.g.@: on a @cmd{NUMERIC} or
+@cmd{STRING} command or on the left side of a @cmd{COMPUTE} assignment.
+
+Any command may appear within DO REPEAT, including nested DO REPEAT
+commands.  If @cmd{INCLUDE} or @cmd{INSERT} appears within DO REPEAT,
+the substitutions do not apply to the included file.
  
  If PRINT is specified on @cmd{END REPEAT}, the commands after substitutions
  are made are printed to the listing file, prefixed by a plus sign
diff --git a/doc/invoking.texi b/doc/invoking.texi

index 826498a26ce54ac6a90d5e462c3089d2b165d68c..4c4f74aa5392c6692457fc1ab2be6b0cd3515599 100644
--- a/doc/invoking.texi
+++ b/doc/invoking.texi
@@ -49,10 +49,12 @@ corresponding short options.
  @example
  -I, --include=@var{dir}
  -I-, --no-include
+-b, --batch
  -i, --interactive
  -r, --no-statrc
  -a, --algorithm=@{compatible|enhanced@}
  -x, --syntax=@{compatible|enhanced@}
+--syntax-encoding=@var{encoding}
  @end example
  
  @item Informational options
@@ -135,11 +137,13 @@ inserted in the include path by default.  The default include path is
  user's home directory, followed by PSPP's system configuration
  directory (usually @file{/etc/pspp} or @file{/usr/local/etc/pspp}).
  
+@item -b
+@item --batch
  @item -i
  @itemx --interactive
-This option forces syntax files to be interpreted in interactive
-mode, rather than the default batch mode.  @xref{Syntax Variants}, for
-a description of the differences.
+These options force syntax files to be interpreted in batch mode or
+interactive mode, respectively, rather than the default ``auto'' mode.
+@xref{Syntax Variants}, for a description of the differences.
  
  @item -r
  @itemx --no-statrc
@@ -161,8 +165,14 @@ With @code{enhanced}, the default, PSPP accepts its own extensions
  beyond those compatible with the proprietary program SPSS.  With
  @code{compatible}, PSPP rejects syntax that uses these extensions.
  
-@item -?
-@itemx --help
+@item --syntax-encoding=@var{encoding}
+Specifies @var{encoding} as the encoding for syntax files named on the
+command line.  The @var{encoding} also becomes the default encoding
+for other syntax files read during the PSPP session by the
+@cmd{INCLUDE} and @cmd{INSERT} commands.  @xref{INSERT}, for the
+accepted forms of @var{encoding}.
+
+@item --help
  Prints a message describing PSPP command-line syntax and the available
  device formats, then exits.
  
diff --git a/doc/language.texi b/doc/language.texi

index 78d38acdc6f0813fe2e1666fa7ddfa8b5bb69c31..5381668c705ad38675c924d2147858a8b2790c82 100644
--- a/doc/language.texi
+++ b/doc/language.texi
@@ -111,26 +111,29 @@ character used for quoting in the string, double it, e.g.@:
  significant inside strings.
  
  Strings can be concatenated using @samp{+}, so that @samp{"a" + 'b' +
-'c'} is equivalent to @samp{'abc'}.  Concatenation is useful for
-splitting a single string across multiple source lines.
-
-Strings may also be expressed as hexadecimal, octal, or binary
-character values by prefixing the initial quote character by @samp{X},
-@samp{O}, or @samp{B} or their lowercase equivalents.  Each pair,
-triplet, or octet of characters, according to the radix, is
-transformed into a single character with the given value.  If there is
-an incomplete group of characters, the missing final digits are
-assumed to be @samp{0}.  These forms of strings are nonportable
-because numeric values are associated with different characters by
-different operating systems.  Therefore, their use should be confined
-to syntax files that will not be widely distributed.
-
-@cindex characters, reserved
-@cindex 0
-@cindex white space
-The character with value 00 is reserved for
-internal use by PSPP.  Its use in strings causes an error and
-replacement by a space character.
+'c'} is equivalent to @samp{'abc'}.  So that a long string may be
+broken across lines, a line break may precede or follow, or both
+precede and follow, the @samp{+}.  (However, an entirely blank line
+preceding or following the @samp{+} is interpreted as ending the
+current command.)
+
+Strings may also be expressed as hexadecimal character values by
+prefixing the initial quote character by @samp{x} or @samp{X}.
+Regardless of the syntax file or active dataset's encoding, the
+hexadecimal digits in the string are interpreted as Unicode characters
+in UTF-8 encoding.
+
+Individual Unicode code points may also be expressed by specifying the
+hexadecimal code point number in single or double quotes preceded by
+@samp{u} or @samp{U}.  For example, Unicode code point U+1D11E, the
+musical G clef character, could be expressed as @code{U'1D11E'}.
+Invalid Unicode code points (above U+10FFFF or in between U+D800 and
+U+DFFF) are not allowed.
+
+When strings are concatenated with @samp{+}, each segment's prefix is
+considered individually.  For example, @code{'The G clef symbol is:' +
+u"1d11e" + "."} inserts a G clef symbol in the middle of an otherwise
+plain text string.
  
  @item Punctuators and Operators
  @cindex punctuators
@@ -177,33 +180,40 @@ described in the previous section (@pxref{Tokens}).  A blank line, or
  one that consists only of white space or comments, also ends a command.
  
  @node Syntax Variants
-@section Variants of syntax.
+@section Syntax Variants
  
  @cindex Batch syntax
  @cindex Interactive syntax
  
-There are two variants of command syntax, @i{viz}: @dfn{batch} mode and
-@dfn{interactive} mode.
-Batch mode is the default when reading commands from a file.
-Interactive mode is the default when commands are typed at a prompt
-by a user.
-Certain commands, such as @cmd{INSERT} (@pxref{INSERT}), may explicitly
-change the syntax mode. 
-
-In batch mode, any line that contains a non-space character
-in the leftmost column begins a new command. 
-Thus, each command consists of a flush-left line followed by any
-number of lines indented from the left margin. 
-In this mode, a plus or minus sign (@samp{+}, @samp{@minus{}}) as the
-first character in a line is ignored and causes that line to begin a
-new command, which allows for visual indentation of a command without
-that command being considered part of the previous command. 
-The period terminating the end of a command is optional but recommended.
-
-In interactive mode, each command must be terminated with a period
-or by a blank line.
-The use of @samp{+} and @samp{@minus{}} as continuation characters is not
-permitted.
+There are three variants of command syntax, which vary only in how
+they detect the end of one command and the start of the next.
+
+In @dfn{interactive mode}, which is the default for syntax typed at a
+command prompt, a period as the last non-blank character on a line
+ends a command.  A blank line also ends a command.
+
+In @dfn{batch mode}, an end-of-line period or a blank line also ends a
+command.  Additionally, it treats any line that has a non-blank
+character in the leftmost column as beginning a new command.  Thus, in
+batch mode the second and subsequent lines in a command must be
+indented.
+
+Regardless of the syntax mode, a plus sign, minus sign, or period in
+the leftmost column of a line is ignored and causes that line to begin
+a new command.  This is most useful in batch mode, in which the first
+line of a new command could not otherwise be indented, but it is
+accepted regardless of syntax mode.
+
+The default mode for reading commands from a file is @dfn{auto mode}.
+It is the same as batch mode, except that a line with a non-blank in
+the leftmost column only starts a new command if that line begins with
+the name of a PSPP command.  This correctly interprets most valid PSPP
+syntax files regardless of the syntax mode for which they are
+intended.
+
+The @option{--interactive} (or @option{-i}) or @option{--batch} (or
+@option{-b}) options set the syntax mode for files listed on the PSPP
+command line.  @xref{Main Options}, for more details.
  
  @node Types of Commands
  @section Types of Commands
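
The command-boundary rules in the new "Syntax Variants" text can be sketched as two small predicates. This is only an illustration under stated assumptions, not the lexer added by this commit: `enum syntax_mode`, `line_starts_command`, `line_ends_command`, and the four-entry `command_names` table are all hypothetical, and a real implementation would consult the full list of PSPP commands.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum syntax_mode { MODE_INTERACTIVE, MODE_BATCH, MODE_AUTO };

/* Tiny illustrative table; the real command list is much longer. */
static const char *const command_names[] =
  { "COMPUTE", "DATA LIST", "FREQUENCIES", "LIST" };

/* Returns true if LINE begins with one of COMMAND_NAMES, compared
   case-insensitively and followed by a non-alphanumeric character. */
static bool
starts_with_command_name (const char *line)
{
  for (size_t i = 0; i < sizeof command_names / sizeof *command_names; i++)
    {
      const char *name = command_names[i];
      size_t j;

      for (j = 0; name[j] != '\0'; j++)
        if (toupper ((unsigned char) line[j]) != name[j])
          break;
      if (name[j] == '\0' && !isalnum ((unsigned char) line[j]))
        return true;
    }
  return false;
}

/* Returns true if LINE, by its leftmost column alone, begins a new command
   under MODE. */
static bool
line_starts_command (enum syntax_mode mode, const char *line)
{
  assert (line != NULL);

  char first = line[0];
  if (first == '+' || first == '-' || first == '.')
    return true;                /* Accepted in every mode. */
  if (first == '\0' || isspace ((unsigned char) first))
    return false;               /* Blank or indented line. */

  switch (mode)
    {
    case MODE_BATCH:
      return true;
    case MODE_AUTO:
      return starts_with_command_name (line);
    case MODE_INTERACTIVE:
    default:
      return false;
    }
}

/* Returns true if LINE ends the current command in any mode: it is blank, or
   its last non-blank character is a period. */
static bool
line_ends_command (const char *line)
{
  size_t len = strlen (line);

  while (len > 0 && isspace ((unsigned char) line[len - 1]))
    len--;
  return len == 0 || line[len - 1] == '.';
}
```

Under these assumptions a flush-left line such as `x = 1` starts a new command in batch mode but is treated as a continuation in auto mode, because `x` is not a command name. That difference is what lets auto mode read files written for either convention.
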
diff --git a/doc/utilities.texi b/doc/utilities.texi

index b729b88bb14cb16b9df9f29b59095c4fea51f425..2cf95a3107839634d4b360ea1ce4431b236bf5ea 100644
--- a/doc/utilities.texi
+++ b/doc/utilities.texi
@@ -242,7 +242,7 @@ subshell.
  @vindex INCLUDE
  
  @display
-        INCLUDE [FILE=]'file-name'.
+        INCLUDE [FILE=]'file-name' [ENCODING='encoding'].
  @end display
  
  @cmd{INCLUDE} causes the PSPP command processor to read an
@@ -253,19 +253,11 @@ stop and no more commands will be processed.
  Include files may be nested to any depth, up to the limit of available
  memory.
  
+The @cmd{INSERT} command (@pxref{INSERT}) is a more flexible
+alternative to @cmd{INCLUDE}.  An INCLUDE command acts the same as
+INSERT with ERROR=STOP CD=NO SYNTAX=BATCH specified.
  
-The @cmd{INSERT} command (@pxref{INSERT}) may be used instead of
-@cmd{INCLUDE} if you require more flexible options.
-The syntax 
-@example
-INCLUDE FILE=@var{file-name}.
-@end example
-@noindent 
-functions identically to 
-@example
-INSERT FILE=@var{file-name} ERROR=STOP CD=NO SYNTAX=BATCH.
-@end example
-
+The optional ENCODING subcommand has the same meaning as on INSERT.
  
  @node INSERT
  @section INSERT
@@ -275,7 +267,8 @@ INSERT FILE=@var{file-name} ERROR=STOP CD=NO SYNTAX=BATCH.
       INSERT [FILE=]'file-name'
          [CD=@{NO,YES@}]
          [ERROR=@{CONTINUE,STOP@}]
-        [SYNTAX=@{BATCH,INTERACTIVE@}].
+        [SYNTAX=@{BATCH,INTERACTIVE@}]
+        [ENCODING='encoding'].
  @end display
  
  @cmd{INSERT} is similar to @cmd{INCLUDE} (@pxref{INCLUDE}) 
@@ -303,6 +296,37 @@ the included file must conform to interactive syntax
  conventions. @xref{Syntax Variants}.
  The default setting is @samp{SYNTAX=BATCH}.
  
+ENCODING optionally specifies the character set used by the included
+file.  Its argument, which is not case-sensitive, must be in one of
+the following forms:
+
+@table @asis
+@item @code{Locale}
+The encoding used by the system locale, or as overridden by the SET
+LOCALE command (@pxref{SET}).  On Unix systems, environment variables,
+e.g.@: @env{LANG} or @env{LC_ALL}, determine the system locale.
+
+@item IANA character set name
+One of the character set names listed by IANA at
+@uref{http://www.iana.org/assignments/character-sets}.  Some examples
+are @code{ASCII} (United States), @code{ISO-8859-1} (western Europe),
+@code{EUC-JP} (Japan), and @code{windows-1252} (Windows).  Not all
+systems support all character sets.
+
+@item @code{Auto}
+@itemx @code{Auto,@var{encoding}}
+Automatically detects whether a syntax file is encoded in
+@var{encoding} or in a Unicode encoding such as UTF-8, UTF-16, or
+UTF-32.  The @var{encoding} may be an IANA character set name or
+@code{Locale} (the default).  Only ASCII compatible encodings can
+automatically be distinguished from UTF-8 (the most common locale
+encodings are all ASCII-compatible).
+@end table
+
+When ENCODING is not specified, the default is taken from the
+@option{--syntax-encoding} command option, if it was specified, and
+otherwise it is @code{Auto}.
+
  @node PERMISSIONS
  @section PERMISSIONS
  @vindex PERMISSIONS
@@ -363,7 +387,8 @@ SET
          /MXWARNS=max_warnings
          /WORKSPACE=workspace_size
  
-(program execution)
+(syntax execution)
+        /LOCALE='locale'
          /MEXPAND=@{ON,OFF@}
          /MITERATE=max_iterations
          /MNEST=max_nest
@@ -540,10 +565,20 @@ that warnings will not be given.
  The default value is 100.
  @end table
  
-Program execution subcommands control the way that PSPP commands
-execute.  The program execution subcommands are
+Syntax execution subcommands control the way that PSPP commands
+execute.  The syntax execution subcommands are
  
  @table @asis
+@item LOCALE
+Overrides the system locale for the purpose of reading and writing
+syntax and data files.  The argument should be a locale name in the
+general form @code{language_country.encoding}, where @code{language}
+and @code{country} are 2-character language and country abbreviations,
+respectively, and @code{encoding} is an IANA character set name.
+Example locales are @code{en_US.UTF-8} (UTF-8 encoded English as
+spoken in the United States) and @code{ja_JP.EUC-JP} (EUC-JP encoded
+Japanese as spoken in Japan).
+
  @item MEXPAND
  @itemx MITERATE
  @itemx MNEST
diff --git a/perl-module/PSPP.xs b/perl-module/PSPP.xs

index 25effb9216b77364c9641b0a479b397cd25dea37..58eac5b762b58080d7ec3dbb9abf79e3ed3b50ca 100644
--- a/perl-module/PSPP.xs
+++ b/perl-module/PSPP.xs
@@ -1,5 +1,5 @@
  /* PSPP - computes sample statistics.
-   Copyright (C) 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
@@ -88,7 +88,7 @@ struct sysreader_info
  
  /*  A message handler which writes messages to PSPP::errstr */
  static void
-message_handler (const struct msg *m)
+message_handler (const struct msg *m, void *aux)
  {
   SV *errstr = get_sv("PSPP::errstr", TRUE);
   sv_setpv (errstr, m->text);
@@ -179,7 +179,7 @@ CODE:
   assert (0 == strncmp (ver, bare_version, strlen (ver)));
  
   i18n_init ();
- msg_init (NULL, message_handler);
+ msg_set_handler (message_handler, NULL);
   settings_init (0, 0);
   fh_init ();
  
@@ -255,7 +255,7 @@ set_documents (dict, docs)
   struct dictionary *dict
   char *docs
  CODE:
- dict_set_documents (dict, docs);
+ dict_set_documents_string (dict, docs);
  
  
  void
@@ -263,7 +263,7 @@ add_document (dict, doc)
   struct dictionary *dict
   char *doc
  CODE:
- dict_add_document_line (dict, doc);
+ dict_add_document_line (dict, doc, false);
  
  
  void
@@ -326,7 +326,7 @@ pxs_dict_create_var (dict, name, ip_fmt)
  INIT:
   SV *errstr = get_sv("PSPP::errstr", TRUE);
   sv_setpv (errstr, "");
- if ( ! var_is_plausible_name (name, false))
+ if ( ! id_is_plausible (name, false))
    {
      sv_setpv (errstr, "The variable name is not valid.");
      XSRETURN_UNDEF;
@@ -376,7 +376,7 @@ set_label (var, label)
   struct variable *var;
   char *label
  CODE:
-  var_set_label (var, label);
+  var_set_label (var, label, NULL, false);
  
  
  void
diff --git a/perl-module/t/Pspp.t b/perl-module/t/Pspp.t

index fce5b74dd2eede1a4d472563cbb4413b359772fa..a1ff5051a66fdde6d3d677009c99a06dcb8c7103 100644
--- a/perl-module/t/Pspp.t
+++ b/perl-module/t/Pspp.t
@@ -72,7 +72,7 @@ sub run_pspp_syntax_cmp
    ok ($d->get_var_cnt () == 0);
  
    $d->set_label ("My Dictionary");
-  $d->set_documents ("These Documents");
+  $d->add_document ("These Documents");
  
    # Tests for variable creation
  
@@ -130,7 +130,7 @@ sub run_pspp_syntax_cmp
                           )
                          );
  
-  $d->set_documents ("This should not appear");
+  $d->add_document ("This should not appear");
    $d->clear_documents ();
    $d->add_document ("This is a document line");
  
diff --git a/src/data/automake.mk b/src/data/automake.mk

index 609c59b8692808b9745461e37f493586e089734d..ae0000d2bcddad239d4151d17e146f4a7ba9a138 100644
--- a/src/data/automake.mk
+++ b/src/data/automake.mk
@@ -66,6 +66,7 @@ src_data_libdata_la_SOURCES = \
         src/data/gnumeric-reader.c \
         src/data/gnumeric-reader.h \
         src/data/identifier.c \
+       src/data/identifier2.c \
         src/data/identifier.h \
         src/data/lazy-casereader.c \
         src/data/lazy-casereader.h \
diff --git a/src/data/dictionary.c b/src/data/dictionary.c

index 467f347efd9f5a5e290435265cbd1077032bfb0b..79d36374fc42767660234911ed533c5732270b92 100644
--- a/src/data/dictionary.c
+++ b/src/data/dictionary.c
@@ -21,6 +21,7 @@
  #include <stdint.h>
  #include <stdlib.h>
  #include <ctype.h>
+#include <unistr.h>
  
  #include "data/attributes.h"
  #include "data/case.h"
@@ -36,14 +37,17 @@
  #include "libpspp/compiler.h"
  #include "libpspp/hash-functions.h"
  #include "libpspp/hmap.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/misc.h"
  #include "libpspp/pool.h"
  #include "libpspp/str.h"
+#include "libpspp/string-array.h"
  
  #include "gl/intprops.h"
  #include "gl/minmax.h"
  #include "gl/xalloc.h"
+#include "gl/xmemdup0.h"
  
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
@@ -63,7 +67,7 @@ struct dictionary
      struct variable *filter;    /* FILTER variable. */
      casenumber case_limit;      /* Current case limit (N command). */
      char *label;               /* File label. */
-    struct string documents;    /* Documents, as a string. */
+    struct string_array documents; /* Documents. */
      struct vector **vector;     /* Vectors of variables. */
      size_t vector_cnt;          /* Number of vectors. */
      struct attrset attributes;  /* Custom attributes. */
@@ -99,6 +103,15 @@ dict_get_encoding (const struct dictionary *d)
    return d->encoding ;
  }
  
+/* Returns true if UTF-8 string ID is an acceptable identifier in DICT's
+   encoding, false otherwise.  If ISSUE_ERROR is true, issues an explanatory
+   error message on failure. */
+bool
+dict_id_is_valid (const struct dictionary *dict, const char *id,
+                  bool issue_error)
+{
+  return id_is_valid (id, dict->encoding, issue_error);
+}
  
  void
  dict_set_change_callback (struct dictionary *d,
@@ -268,7 +281,7 @@ dict_clear (struct dictionary *d)
    d->case_limit = 0;
    free (d->label);
    d->label = NULL;
-  ds_destroy (&d->documents);
+  string_array_clear (&d->documents);
    dict_clear_vectors (d);
    attrset_clear (&d->attributes);
  }
@@ -845,54 +858,67 @@ var_name_is_insertable (const struct dictionary *dict, const char *name)
  static char *
  make_hinted_name (const struct dictionary *dict, const char *hint)
  {
-  char name[VAR_NAME_LEN + 1];
+  size_t hint_len = strlen (hint);
    bool dropped = false;
-  char *cp;
-
-  for (cp = name; *hint && cp < name + VAR_NAME_LEN; hint++)
+  char *root, *rp;
+  size_t ofs;
+  int mblen;
+
+  /* The allocation size here is OK: characters that are copied directly fit
+     OK, and characters that are not copied directly are replaced by a single
+     '_' byte.  If u8_mbtouc() replaces bad input by 0xfffd, then that will get
+     replaced by '_' too.  */
+  root = rp = xmalloc (hint_len + 1);
+  for (ofs = 0; ofs < hint_len; ofs += mblen)
      {
-      if (cp == name
-          ? lex_is_id1 (*hint) && *hint != '$'
-          : lex_is_idn (*hint))
+      ucs4_t uc;
+
+      mblen = u8_mbtouc (&uc, CHAR_CAST (const uint8_t *, hint + ofs),
+                         hint_len - ofs);
+      if (rp == root
+          ? lex_uc_is_id1 (uc) && uc != '$'
+          : lex_uc_is_idn (uc))
          {
            if (dropped)
              {
-              *cp++ = '_';
+              *rp++ = '_';
                dropped = false;
              }
-          if (cp < name + VAR_NAME_LEN)
-            *cp++ = *hint;
+          rp += u8_uctomb (CHAR_CAST (uint8_t *, rp), uc, 6);
          }
-      else if (cp > name)
+      else if (rp != root)
          dropped = true;
      }
-  *cp = '\0';
+  *rp = '\0';
  
-  if (name[0] != '\0')
+  if (root[0] != '\0')
      {
-      size_t len = strlen (name);
        unsigned long int i;
  
-      if (var_name_is_insertable (dict, name))
-        return xstrdup (name);
+      if (var_name_is_insertable (dict, root))
+        return root;
  
        for (i = 0; i < ULONG_MAX; i++)
          {
            char suffix[INT_BUFSIZE_BOUND (i) + 1];
-          int ofs;
+          char *name;
  
            suffix[0] = '_';
            if (!str_format_26adic (i + 1, &suffix[1], sizeof suffix - 1))
              NOT_REACHED ();
  
-          ofs = MIN (VAR_NAME_LEN - strlen (suffix), len);
-          strcpy (&name[ofs], suffix);
-
+          name = utf8_encoding_concat (root, suffix, dict->encoding, 64);
            if (var_name_is_insertable (dict, name))
-            return xstrdup (name);
+            {
+              free (root);
+              return name;
+            }
+          free (name);
          }
      }
  
+  free (root);
+
    return NULL;
  }
  
@@ -1238,74 +1264,94 @@ dict_set_label (struct dictionary *d, const char *label)
    d->label = label != NULL && label[0] != '\0' ? xstrndup (label, 60) : NULL;
  }
  
-/* Returns the documents for D, or a null pointer if D has no
-   documents.  If the return value is nonnull, then the string
-   will be an exact multiple of DOC_LINE_LENGTH bytes in length,
-   with each segment corresponding to one line. */
-const char *
+/* Returns the documents for D, as a UTF-8 encoded string_array.  The
+   return value is always nonnull; if there are no documents then the
+   string_array is empty. */
+const struct string_array *
  dict_get_documents (const struct dictionary *d)
  {
-  return ds_is_empty (&d->documents) ? NULL : ds_cstr (&d->documents);
+  return &d->documents;
  }
  
-/* Sets the documents for D to DOCUMENTS, or removes D's
-   documents if DOCUMENT is a null pointer.  If DOCUMENTS is
-   nonnull, then it should be an exact multiple of
-   DOC_LINE_LENGTH bytes in length, with each segment
-   corresponding to one line. */
+/* Replaces the documents for D by NEW_DOCS, a UTF-8 encoded string_array. */
  void
-dict_set_documents (struct dictionary *d, const char *documents)
+dict_set_documents (struct dictionary *d, const struct string_array *new_docs)
  {
-  size_t remainder;
+  size_t i;
  
-  ds_assign_cstr (&d->documents, documents != NULL ? documents : "");
+  dict_clear_documents (d);
  
-  /* In case the caller didn't get it quite right, pad out the
-     final line with spaces. */
-  remainder = ds_length (&d->documents) % DOC_LINE_LENGTH;
-  if (remainder != 0)
-    ds_put_byte_multiple (&d->documents, ' ', DOC_LINE_LENGTH - remainder);
+  for (i = 0; i < new_docs->n; i++)
+    dict_add_document_line (d, new_docs->strings[i], false);
+}
+
+/* Replaces the documents for D by UTF-8 encoded string NEW_DOCS, dividing it
+   into individual lines at new-line characters.  Each line is truncated to at
+   most DOC_LINE_LENGTH bytes in D's encoding. */
+void
+dict_set_documents_string (struct dictionary *d, const char *new_docs)
+{
+  const char *s;
+
+  dict_clear_documents (d);
+  for (s = new_docs; *s != '\0'; )
+    {
+      size_t len = strcspn (s, "\n");
+      char *line = xmemdup0 (s, len);
+      dict_add_document_line (d, line, false);
+      free (line);
+
+      s += len;
+      if (*s == '\n')
+        s++;
+    }
  }
  
  /* Drops the documents from dictionary D. */
  void
  dict_clear_documents (struct dictionary *d)
  {
-  ds_clear (&d->documents);
+  string_array_clear (&d->documents);
  }
  
-/* Appends LINE to the documents in D.  LINE will be truncated or
-   padded on the right with spaces to make it exactly
-   DOC_LINE_LENGTH bytes long. */
-void
-dict_add_document_line (struct dictionary *d, const char *line)
+/* Appends the UTF-8 encoded LINE to the documents in D.  LINE will be
+   truncated so that it is no more than 80 bytes in the dictionary's
+   encoding.  If this causes some text to be lost, and ISSUE_WARNING is true,
+   then a warning will be issued. */
+bool
+dict_add_document_line (struct dictionary *d, const char *line,
+                        bool issue_warning)
  {
-  if (strlen (line) > DOC_LINE_LENGTH)
+  size_t trunc_len;
+  bool truncated;
+
+  trunc_len = utf8_encoding_trunc_len (line, d->encoding, DOC_LINE_LENGTH);
+  truncated = line[trunc_len] != '\0';
+  if (truncated && issue_warning)
      {
        /* Note to translators: "bytes" is correct, not characters */
        msg (SW, _("Truncating document line to %d bytes."), DOC_LINE_LENGTH);
      }
-  buf_copy_str_rpad (ds_put_uninit (&d->documents, DOC_LINE_LENGTH),
-                     DOC_LINE_LENGTH, line, ' ');
+
+  string_array_append_nocopy (&d->documents, xmemdup0 (line, trunc_len));
+
+  return !truncated;
  }
  
  /* Returns the number of document lines in dictionary D. */
  size_t
  dict_get_document_line_cnt (const struct dictionary *d)
  {
-  return ds_length (&d->documents) / DOC_LINE_LENGTH;
+  return d->documents.n;
  }
  
-/* Copies document line number IDX from dictionary D into
-   LINE, trimming off any trailing white space. */
-void
-dict_get_document_line (const struct dictionary *d,
-                        size_t idx, struct string *line)
+/* Returns document line number IDX in dictionary D.  The caller must not
+   modify or free the returned string. */
+const char *
+dict_get_document_line (const struct dictionary *d, size_t idx)
  {
-  assert (idx < dict_get_document_line_cnt (d));
-  ds_assign_substring (line, ds_substr (&d->documents, idx * DOC_LINE_LENGTH,
-                                        DOC_LINE_LENGTH));
-  ds_rtrim (line, ss_cstr (CC_SPACES));
+  assert (idx < d->documents.n);
+  return d->documents.strings[idx];
  }
  
  /* Creates in D a vector named NAME that contains the CNT
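Reviewer's sketch (not part of the commit): the line-splitting loop in dict_set_documents_string() above can be tried in isolation. The helper names for_each_line and sum_lengths are hypothetical, and plain malloc() stands in for gnulib's xmemdup0(). As in the diff, a trailing new-line does not produce an extra empty line, but an empty line between two new-lines is kept.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of PSPP: calls VISIT once for each
   new-line-delimited line in TEXT, passing a freshly allocated,
   null-terminated copy that is freed afterward.  Returns the number
   of lines visited. */
static size_t
for_each_line (const char *text,
               void (*visit) (const char *line, void *aux), void *aux)
{
  size_t n = 0;
  const char *s;

  assert (text != NULL);
  for (s = text; *s != '\0'; )
    {
      /* Length of the current line, excluding any new-line. */
      size_t len = strcspn (s, "\n");
      char *line = malloc (len + 1);
      if (line == NULL)
        abort ();
      memcpy (line, s, len);
      line[len] = '\0';

      visit (line, aux);
      free (line);
      n++;

      /* Skip the line and the new-line that ends it, if any. */
      s += len;
      if (*s == '\n')
        s++;
    }
  return n;
}

/* Example visitor: adds the length of LINE to the size_t that AUX
   points to. */
static void
sum_lengths (const char *line, void *aux)
{
  size_t *total = aux;
  *total += strlen (line);
}
```

Passing a callback keeps the sketch independent of struct dictionary; in the commit itself each line goes straight to dict_add_document_line().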
diff --git a/src/data/dictionary.h b/src/data/dictionary.h
index dd220e4de15d5475fe2fb3dca58ae7f069c22b96..fa5d0ddeb8fc355d4f48a28e69b24b5e883f0645 100644
--- a/src/data/dictionary.h
+++ b/src/data/dictionary.h
@@ -127,14 +127,15 @@ void dict_set_label (struct dictionary *, const char *);
  /* Documents. */
  #define DOC_LINE_LENGTH 80 /* Fixed length of document lines. */
  
-const char *dict_get_documents (const struct dictionary *);
-void dict_set_documents (struct dictionary *, const char *);
+const struct string_array *dict_get_documents (const struct dictionary *);
+void dict_set_documents (struct dictionary *, const struct string_array *);
+void dict_set_documents_string (struct dictionary *, const char *);
  void dict_clear_documents (struct dictionary *);
  
-void dict_add_document_line (struct dictionary *, const char *);
+bool dict_add_document_line (struct dictionary *, const char *,
+                             bool issue_warning);
  size_t dict_get_document_line_cnt (const struct dictionary *);
-void dict_get_document_line (const struct dictionary *,
-                             size_t, struct string *);
+const char *dict_get_document_line (const struct dictionary *, size_t);
  
  /* Vectors. */
  bool dict_create_vector (struct dictionary *, const char *name,
@@ -166,6 +167,9 @@ bool dict_has_attributes (const struct dictionary *);
  void dict_set_encoding (struct dictionary *d, const char *enc);
  const char *dict_get_encoding (const struct dictionary *d);
  
+bool dict_id_is_valid (const struct dictionary *, const char *id,
+                       bool issue_error);
+
  /* Internal variables. */
  struct variable *dict_create_internal_var (int case_idx, int width);
  void dict_destroy_internal_var (struct variable *);
diff --git a/src/data/file-handle-def.c b/src/data/file-handle-def.c
index 95a92f57f4f6b2d41fe8aa4c45023025917b4ce2..2b8e40ce347b7935b68589083dcda03030c07a74 100644
--- a/src/data/file-handle-def.c
+++ b/src/data/file-handle-def.c
@@ -220,10 +220,10 @@ fh_inline_file (void)
    return inline_file;
  }
  
-/* Creates and returns a new file handle with the given ID, which
-   may be null.  If it is non-null, it must be unique among
-   existing file identifiers.  The new handle is associated with
-   file FILE_NAME and the given PROPERTIES. */
+/* Creates and returns a new file handle with the given ID, which may be null.
+   If it is non-null, it must be a UTF-8 encoded string that is unique among
+   existing file identifiers.  The new handle is associated with file FILE_NAME
+   and the given PROPERTIES. */
  struct file_handle *
  fh_create_file (const char *id, const char *file_name,
                  const struct fh_properties *properties)
diff --git a/src/data/gnumeric-reader.h b/src/data/gnumeric-reader.h
index 6bb5a6b7d9324c97ba2f4e6ff86bc67b8b7aece1..b313fc78768cf446975156c6d68647f277a72d4b 100644
--- a/src/data/gnumeric-reader.h
+++ b/src/data/gnumeric-reader.h
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 2007 Free Software Foundation, Inc.
+   Copyright (C) 2007, 2010 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -24,9 +24,9 @@ struct casereader;
  
  struct gnumeric_read_info
  {
-  char *sheet_name ;
-  char *file_name ;
-  char *cell_range ;
+  char *sheet_name ;            /* In UTF-8. */
+  char *file_name ;             /* In filename encoding. */
+  char *cell_range ;            /* In UTF-8. */
    int sheet_index ;
    bool read_names ;
    int asw ;
diff --git a/src/data/identifier.c b/src/data/identifier.c
index 4b613bb480edb5555cb1532176d745182345c0c4..f1c22ef1b567223579586ca8ac6f24fd3f4e3722 100644
--- a/src/data/identifier.c
+++ b/src/data/identifier.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2005, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2005, 2009, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -23,15 +23,10 @@
  
  #include "data/identifier.h"
  
-#include <assert.h>
  #include <string.h>
  #include <unictype.h>
-#include <unistr.h>
  
  #include "libpspp/assertion.h"
-#include "libpspp/cast.h"
-#include "libpspp/i18n.h"
-#include "libpspp/message.h"
  
  #include "gl/c-ctype.h"
  
@@ -319,20 +314,3 @@ lex_id_to_token (struct substring id)
  
    return T_ID;
  }
-
-/* Returns the name for the given keyword token type. */
-const char *
-lex_id_name (enum token_type token)
-{
-  const struct keyword *kw;
-
-  for (kw = keywords; kw < &keywords[keyword_cnt]; kw++)
-    if (kw->token == token)
-      {
-        /* A "struct substring" is not guaranteed to be
-           null-terminated, as our caller expects, but in this
-           case it always will be. */
-        return ss_data (kw->identifier);
-      }
-  NOT_REACHED ();
-}
diff --git a/src/data/identifier.h b/src/data/identifier.h
index bf20f9cc912dad0c8405b6a6bbd77a7e6b8bb520..7f2f904239167f1c5c4200a570621dd7d83cf66a 100644
--- a/src/data/identifier.h
+++ b/src/data/identifier.h
@@ -74,6 +74,12 @@ const char *token_type_to_string (enum token_type);
  /* Tokens. */
  bool lex_is_keyword (enum token_type);
  
+/* Validating identifiers. */
+#define ID_MAX_LEN 64          /* Maximum length of identifier, in bytes. */
+
+bool id_is_valid (const char *id, const char *dict_encoding, bool issue_error);
+bool id_is_plausible (const char *id, bool issue_error);
+
  /* Recognizing identifiers. */
  bool lex_is_id1 (char);
  bool lex_is_idn (char);
@@ -88,7 +94,4 @@ bool lex_id_match_n (struct substring keyword, struct substring token,
                       size_t n);
  int lex_id_to_token (struct substring);
  
-/* Identifier names. */
-const char *lex_id_name (enum token_type);
-
  #endif /* !data/identifier.h */
diff --git a/src/data/identifier2.c b/src/data/identifier2.c
new file mode 100644
index 0000000..3b6458f
--- /dev/null
+++ b/src/data/identifier2.c
@@ -0,0 +1,133 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 1997-9, 2000, 2005, 2009, 2010, 2011 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+/* This file implements parts of identifier.h that call the msg() function.
+   This allows test programs that do not use those functions to avoid linking
+   additional object files. */
+
+#include <config.h>
+
+#include "data/identifier.h"
+
+#include <string.h>
+#include <unistr.h>
+
+#include "libpspp/cast.h"
+#include "libpspp/i18n.h"
+#include "libpspp/message.h"
+
+#include "gl/c-ctype.h"
+
+#include "gettext.h"
+#define _(msgid) gettext (msgid)
+
+/* Returns true if UTF-8 string ID is an acceptable identifier in encoding
+   DICT_ENCODING (UTF-8 if null), false otherwise.  If ISSUE_ERROR is true,
+   issues an explanatory error message on failure. */
+bool
+id_is_valid (const char *id, const char *dict_encoding, bool issue_error)
+{
+  size_t dict_len;
+
+  if (!id_is_plausible (id, issue_error))
+    return false;
+
+  if (dict_encoding != NULL)
+    {
+      /* XXX need to reject recoded strings that contain the fallback
+         character. */
+      dict_len = recode_string_len (dict_encoding, "UTF-8", id, -1);
+    }
+  else
+    dict_len = strlen (id);
+
+  if (dict_len > ID_MAX_LEN)
+    {
+      if (issue_error)
+        msg (SE, _("Identifier `%s' exceeds %d-byte limit."),
+             id, ID_MAX_LEN);
+      return false;
+    }
+
+  return true;
+}
+
+/* Returns true if UTF-8 string ID is a plausible identifier, false
+   otherwise.  If ISSUE_ERROR is true, issues an explanatory error message on
+   failure.  */
+bool
+id_is_plausible (const char *id, bool issue_error)
+{
+  const uint8_t *bad_unit;
+  const uint8_t *s;
+  char ucname[16];
+  int mblen;
+  ucs4_t uc;
+
+  /* ID cannot be the empty string. */
+  if (id[0] == '\0')
+    {
+      if (issue_error)
+        msg (SE, _("Identifier cannot be empty string."));
+      return false;
+    }
+
+  /* ID cannot be a reserved word. */
+  if (lex_id_to_token (ss_cstr (id)) != T_ID)
+    {
+      if (issue_error)
+        msg (SE, _("`%s' may not be used as an identifier because it "
+                   "is a reserved word."), id);
+      return false;
+    }
+
+  bad_unit = u8_check (CHAR_CAST (const uint8_t *, id), strlen (id));
+  if (bad_unit != NULL)
+    {
+      /* If this message ever appears, it probably indicates a PSPP bug since
+         it shouldn't be possible to get invalid UTF-8 this far. */
+      if (issue_error)
+        msg (SE, _("`%s' may not be used as an identifier because it "
+                   "contains ill-formed UTF-8 at byte offset %tu."),
+             id, CHAR_CAST (const char *, bad_unit) - id);
+      return false;
+    }
+
+  /* Check that it is a valid identifier. */
+  mblen = u8_strmbtouc (&uc, CHAR_CAST (uint8_t *, id));
+  if (!lex_uc_is_id1 (uc))
+    {
+      if (issue_error)
+        msg (SE, _("Character %s (in `%s') may not appear "
+                   "as the first character in an identifier."),
+             uc_name (uc, ucname), id);
+      return false;
+    }
+
+  for (s = CHAR_CAST (uint8_t *, id + mblen);
+       (mblen = u8_strmbtouc (&uc, s)) != 0;
+        s += mblen)
+    if (!lex_uc_is_idn (uc))
+      {
+        if (issue_error)
+          msg (SE, _("Character %s (in `%s') may not appear in an "
+                     "identifier."),
+               uc_name (uc, ucname), id);
+        return false;
+      }
+
+  return true;
+}
diff --git a/src/data/mrset.c b/src/data/mrset.c
index d1807b96f301000f807ccc87fb696d735562530e..38b0ab2d5266e88192342166df828cd7a98e2bd9 100644
--- a/src/data/mrset.c
+++ b/src/data/mrset.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 2010 Free Software Foundation, Inc.
+   Copyright (C) 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -21,11 +21,16 @@
  #include <stdlib.h>
  
  #include "data/dictionary.h"
+#include "data/identifier.h"
  #include "data/val-type.h"
  #include "data/variable.h"
+#include "libpspp/message.h"
  
  #include "gl/xalloc.h"
  
+#include "gettext.h"
+#define _(msgid) gettext (msgid)
+
  /* Creates and returns a clone of OLD.  The caller is responsible for freeing
     the new multiple response set (using mrset_destroy()). */
  struct mrset *
@@ -62,9 +67,31 @@ mrset_destroy (struct mrset *mrset)
      }
  }
  
+/* Returns true if the UTF-8 encoded NAME is a valid name for a multiple
+   response set in a dictionary encoded in DICT_ENCODING, false otherwise.  If
+   ISSUE_ERROR is true, issues an explanatory error message on failure. */
+bool
+mrset_is_valid_name (const char *name, const char *dict_encoding,
+                     bool issue_error)
+{
+  if (!id_is_valid (name, dict_encoding, issue_error))
+    return false;
+
+  if (name[0] != '$')
+    {
+      if (issue_error)
+        msg (SE, _("%s is not a valid name for a multiple response "
+                   "set.  Multiple response set names must begin with "
+                   "`$'."), name);
+      return false;
+    }
+
+  return true;
+}
+
  /* Checks various constraints on MRSET:
  
-   - MRSET has a valid name for a multiple response set (beginning with '$').
+   - MRSET's name begins with '$' and is valid as an identifier in DICT.
  
     - MRSET has a valid type.
  
@@ -85,7 +112,7 @@ mrset_ok (const struct mrset *mrset, const struct dictionary *dict)
    size_t i;
  
    if (mrset->name == NULL
-      || mrset->name[0] != '$'
+      || !mrset_is_valid_name (mrset->name, dict_get_encoding (dict), false)
        || (mrset->type != MRSET_MD && mrset->type != MRSET_MC)
        || mrset->vars == NULL
        || mrset->n_vars < 2)
diff --git a/src/data/mrset.h b/src/data/mrset.h
index c531db7a016791227c6bafba44e99e1ba9240a25..9971924e0d39d2d9c1adb044e2c033d39b9ab77f 100644
--- a/src/data/mrset.h
+++ b/src/data/mrset.h
@@ -61,8 +61,8 @@ enum mrset_md_cat_source
  /* A multiple response set. */
  struct mrset
    {
-    char *name;                 /* Name for syntax.  Always begins with "$". */
-    char *label;                /* Human-readable label for group. */
+    char *name;                 /* UTF-8 encoded name beginning with "$". */
+    char *label;                /* Human-readable UTF-8 label for group. */
      enum mrset_type type;       /* Group type. */
      struct variable **vars;     /* Constituent variables. */
      size_t n_vars;              /* Number of constituent variables. */
@@ -77,6 +77,9 @@ struct mrset
  struct mrset *mrset_clone (const struct mrset *);
  void mrset_destroy (struct mrset *);
  
+bool mrset_is_valid_name (const char *name, const char *dict_encoding,
+                          bool issue_error);
+
  bool mrset_ok (const struct mrset *, const struct dictionary *);
  
  #endif /* data/mrset.h */
diff --git a/src/data/por-file-reader.c b/src/data/por-file-reader.c
index 3f8ee3c9fee47601c651850ba2ce6e3195028da0..372d7682136f746110e4cc05d19c04ac30265824 100644
--- a/src/data/por-file-reader.c
+++ b/src/data/por-file-reader.c
@@ -105,10 +105,11 @@ error (struct pfm_reader *r, const char *msg, ...)
  
    m.category = MSG_C_GENERAL;
    m.severity = MSG_S_ERROR;
-  m.where.file_name = NULL;
-  m.where.line_number = 0;
-  m.where.first_column = 0;
-  m.where.last_column = 0;
+  m.file_name = NULL;
+  m.first_line = 0;
+  m.last_line = 0;
+  m.first_column = 0;
+  m.last_column = 0;
    m.text = ds_cstr (&text);
  
    msg_emit (&m);
@@ -136,10 +137,11 @@ warning (struct pfm_reader *r, const char *msg, ...)
  
    m.category = MSG_C_GENERAL;
    m.severity = MSG_S_WARNING;
-  m.where.file_name = NULL;
-  m.where.line_number = 0;
-  m.where.first_column = 0;
-  m.where.last_column = 0;
+  m.file_name = NULL;
+  m.first_line = 0;
+  m.last_line = 0;
+  m.first_column = 0;
+  m.last_column = 0;
    m.text = ds_cstr (&text);
  
    msg_emit (&m);
@@ -682,7 +684,8 @@ read_variables (struct pfm_reader *r, struct dictionary *dict)
        for (j = 0; j < 6; j++)
          fmt[j] = read_int (r);
  
-      if (!var_is_valid_name (name, false) || *name == '#' || *name == '$')
+      if (!dict_id_is_valid (dict, name, false)
+          || *name == '#' || *name == '$')
          error (r, _("Invalid variable name `%s' in position %d."), name, i);
        str_uppercase (name);
  
@@ -742,7 +745,7 @@ read_variables (struct pfm_reader *r, struct dictionary *dict)
          {
            char label[256];
            read_string (r, label);
-          var_set_label (v, label);
+          var_set_label (v, label, NULL, false); /* XXX */
          }
      }
  
@@ -832,7 +835,7 @@ read_documents (struct pfm_reader *r, struct dictionary *dict)
      {
        char line[256];
        read_string (r, line);
-      dict_add_document_line (dict, line);
+      dict_add_document_line (dict, line, false);
      }
  }
  
diff --git a/src/data/por-file-writer.c b/src/data/por-file-writer.c
index ee84e7c9c29aec8c44096d70aa64b790c21e5d6c..ea0f9dc88bfe98bebbbd2422881c664d2b76fbaa 100644
--- a/src/data/por-file-writer.c
+++ b/src/data/por-file-writer.c
@@ -436,10 +436,7 @@ write_documents (struct pfm_writer *w, const struct dictionary *dict)
    buf_write (w, "E", 1);
    write_int (w, line_cnt);
    for (i = 0; i < line_cnt; i++)
-    {
-      dict_get_document_line (dict, i, &line);
-      write_string (w, ds_cstr (&line));
-    }
+    write_string (w, dict_get_document_line (dict, i));
    ds_destroy (&line);
  }
  
diff --git a/src/data/procedure.c b/src/data/procedure.c
index fa52806f60f6795154234af340b8226f95f5172d..a45f497afd1268012a77064a5d49b33839c974f8 100644
--- a/src/data/procedure.c
+++ b/src/data/procedure.c
@@ -101,6 +101,8 @@ struct dataset {
    void (*callback) (void *); /* Callback for when the dataset changes */
    void *cb_data;
  
+  /* Default encoding for reading syntax files. */
+  char *syntax_encoding;
  }; /* struct dataset */
  
  
@@ -125,6 +127,18 @@ dataset_set_callback (struct dataset *ds, void (*cb) (void *), void *cb_data)
    ds->cb_data = cb_data;
  }
  
+void
+dataset_set_default_syntax_encoding (struct dataset *ds, const char *encoding)
+{
+  free (ds->syntax_encoding);
+  ds->syntax_encoding = xstrdup (encoding);
+}
+
+const char *
+dataset_get_default_syntax_encoding (const struct dataset *ds)
+{
+  return ds->syntax_encoding;
+}
  
  /* Returns the last time the data was read. */
  time_t
@@ -597,6 +611,9 @@ create_dataset (void)
  
    ds->caseinit = caseinit_create ();
    proc_cancel_all_transformations (ds);
+
+  ds->syntax_encoding = xstrdup ("Auto");
+
    return ds;
  }
  
@@ -621,6 +638,8 @@ destroy_dataset (struct dataset *ds)
  
    if ( ds->xform_callback)
      ds->xform_callback (false, ds->xform_callback_aux);
+
+  free (ds->syntax_encoding);
    free (ds);
  }
  
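Reviewer's sketch (not part of the commit): the syntax_encoding field added to struct dataset above follows the usual "owned string" pattern: the structure keeps its own heap copy, the setter replaces it, and the destructor frees it. The standalone version below uses hypothetical names (struct settings_sketch, sketch_*) and malloc() in place of gnulib's xstrdup(). As a deliberate difference from the diff, the setter duplicates the new value before freeing the old one, so that passing the getter's result back to the setter stays safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of struct dataset. */
struct settings_sketch
  {
    char *syntax_encoding;      /* Owned, never null after init. */
  };

/* Returns a heap-allocated copy of S, aborting on allocation failure
   (similar in spirit to gnulib's xstrdup()). */
static char *
dup_string (const char *s)
{
  size_t size = strlen (s) + 1;
  char *copy = malloc (size);
  if (copy == NULL)
    abort ();
  memcpy (copy, s, size);
  return copy;
}

/* Initializes ST with the same default, "Auto", that the diff uses. */
static void
sketch_init (struct settings_sketch *st)
{
  st->syntax_encoding = dup_string ("Auto");
}

/* Replaces ST's encoding by a copy of ENCODING.  Copying before
   freeing tolerates ENCODING pointing at the current value. */
static void
sketch_set_encoding (struct settings_sketch *st, const char *encoding)
{
  char *copy;

  assert (encoding != NULL);
  copy = dup_string (encoding);
  free (st->syntax_encoding);
  st->syntax_encoding = copy;
}

/* Returns ST's encoding.  The caller must not modify or free it. */
static const char *
sketch_get_encoding (const struct settings_sketch *st)
{
  return st->syntax_encoding;
}

/* Frees the storage owned by ST. */
static void
sketch_destroy (struct settings_sketch *st)
{
  free (st->syntax_encoding);
  st->syntax_encoding = NULL;
}
```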
diff --git a/src/data/procedure.h b/src/data/procedure.h
index fd3af6042140fee4bdfd30740e944a3d5db8adb3..ad45a53720052ba9d784d6eb7b1aea27ec175f47 100644
--- a/src/data/procedure.h
+++ b/src/data/procedure.h
@@ -82,10 +82,12 @@ bool dataset_end_of_command (struct dataset *);
  struct dictionary *dataset_dict (const struct dataset *ds);
  const struct casereader *dataset_source (const struct dataset *ds);
  
-
  const struct ccase *lagged_case (const struct dataset *ds, int n_before);
  void dataset_need_lag (struct dataset *ds, int n_before);
  
  void dataset_set_callback (struct dataset *ds, void (*cb) (void *), void *);
  
+void dataset_set_default_syntax_encoding (struct dataset *, const char *);
+const char *dataset_get_default_syntax_encoding (const struct dataset *);
+
  #endif /* procedure.h */
diff --git a/src/data/sys-file-reader.c b/src/data/sys-file-reader.c
index ceb4e04fb0484950bba4da5d0bb6df1ba7f0eabf..6643b85da2848551cac05a352395e0c7bc9907c5 100644
--- a/src/data/sys-file-reader.c
+++ b/src/data/sys-file-reader.c
@@ -33,6 +33,7 @@
  #include "data/file-handle-def.h"
  #include "data/file-name.h"
  #include "data/format.h"
+#include "data/identifier.h"
  #include "data/missing-values.h"
  #include "data/mrset.h"
  #include "data/short-names.h"
@@ -953,7 +954,8 @@ parse_variable_records (struct sfm_reader *r, struct dictionary *dict,
                                   rec->name, 8, r->pool);
        name[strcspn (name, " ")] = '\0';
  
-      if (!var_is_valid_name (name, false) || name[0] == '$' || name[0] == '#')
+      if (!dict_id_is_valid (dict, name, false)
+          || name[0] == '$' || name[0] == '#')
          sys_error (r, rec->pos, _("Invalid variable name `%s'."), name);
  
        if (rec->width < 0 || rec->width > 255)
@@ -974,7 +976,7 @@ parse_variable_records (struct sfm_reader *r, struct dictionary *dict,
  
            utf8_label = recode_string_pool ("UTF-8", dict_encoding,
                                             rec->label, -1, r->pool);
-          var_set_label (var, utf8_label);
+          var_set_label (var, utf8_label, NULL, false);
          }
  
        /* Set missing values. */
@@ -1099,7 +1101,7 @@ parse_document (struct dictionary *dict, struct sfm_document_record *record)
        ss_rtrim (&line, ss_cstr (" "));
        line.string[line.length] = '\0';
  
-      dict_add_document_line (dict, line.string);
+      dict_add_document_line (dict, line.string, false);
  
        ss_dealloc (&line);
      }
@@ -1539,7 +1541,8 @@ parse_long_var_name_map (struct sfm_reader *r,
    while (read_variable_to_value_pair (r, dict, text, &var, &long_name))
      {
        /* Validate long name. */
-      if (!var_is_valid_name (long_name, false))
+      /* XXX need to reencode name to UTF-8 */
+      if (!dict_id_is_valid (dict, long_name, false))
          {
            sys_warn (r, record->pos,
                      _("Long variable mapping from %s to invalid "
@@ -2467,10 +2470,11 @@ sys_msg (struct sfm_reader *r, off_t offset,
  
    m.category = msg_class_to_category (class);
    m.severity = msg_class_to_severity (class);
-  m.where.file_name = NULL;
-  m.where.line_number = 0;
-  m.where.first_column = 0;
-  m.where.last_column = 0;
+  m.file_name = NULL;
+  m.first_line = 0;
+  m.last_line = 0;
+  m.first_column = 0;
+  m.last_column = 0;
    m.text = ds_cstr (&text);
  
    msg_emit (&m);
diff --git a/src/data/sys-file-writer.c b/src/data/sys-file-writer.c
index 1a65e3c067efc5688d912c0184d3e73be1fa59ed..b1cb7c22c0dcd5e7a1343041b66d75fcac28cd36 100644
--- a/src/data/sys-file-writer.c
+++ b/src/data/sys-file-writer.c
@@ -47,6 +47,7 @@
  #include "libpspp/message.h"
  #include "libpspp/misc.h"
  #include "libpspp/str.h"
+#include "libpspp/string-array.h"
  #include "libpspp/version.h"
  
  #include "gl/xmemdup0.h"
@@ -238,7 +239,7 @@ sfm_open_writer (struct file_handle *fh, struct dictionary *d,
        idx += sfm_width_to_octs (var_get_width (v));
      }
  
-  if (dict_get_documents (d) != NULL)
+  if (dict_get_document_line_cnt (d) > 0)
      write_documents (w, d);
  
    write_integer_info_record (w);
@@ -552,11 +553,22 @@ write_value_labels (struct sfm_writer *w, struct variable *v, int idx)
  static void
  write_documents (struct sfm_writer *w, const struct dictionary *d)
  {
-  size_t line_cnt = dict_get_document_line_cnt (d);
+  const struct string_array *docs = dict_get_documents (d);
+  const char *enc = dict_get_encoding (d);
+  size_t i;
  
    write_int (w, 6);             /* Record type. */
-  write_int (w, line_cnt);
-  write_bytes (w, dict_get_documents (d), line_cnt * DOC_LINE_LENGTH);
+  write_int (w, docs->n);
+  for (i = 0; i < docs->n; i++)
+    {
+      char *s = recode_string (enc, "UTF-8", docs->strings[i], -1);
+      size_t s_len = strlen (s);
+      size_t write_len = MIN (s_len, DOC_LINE_LENGTH);
+
+      write_bytes (w, s, write_len);
+      write_spaces (w, DOC_LINE_LENGTH - write_len);
+      free (s);
+    }
  }
  
  static void
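Reviewer's sketch (not part of the commit): the new write_documents() above emits each document line as a fixed-width record, writing at most DOC_LINE_LENGTH bytes of text followed by spaces. The same copy-then-pad step is shown below on an in-memory buffer. RECORD_LEN and fill_record are hypothetical names, and the recoding to the dictionary encoding that the diff performs first is left out.

```c
#include <assert.h>
#include <string.h>

#define RECORD_LEN 80           /* Mirrors DOC_LINE_LENGTH in the diff. */

/* Hypothetical helper, not part of PSPP: copies at most RECORD_LEN
   bytes of S into RECORD and fills the remainder with spaces.
   RECORD is a fixed-width field and is not null-terminated. */
static void
fill_record (char record[RECORD_LEN], const char *s)
{
  size_t s_len;
  size_t copy_len;

  assert (record != NULL && s != NULL);
  s_len = strlen (s);
  copy_len = s_len < RECORD_LEN ? s_len : RECORD_LEN;

  memcpy (record, s, copy_len);
  memset (record + copy_len, ' ', RECORD_LEN - copy_len);
}
```

Because the cut is made at a byte count, a multibyte character could be split if the input were longer than the record. In the commit the lines were already truncated on a character boundary when they were added to the dictionary.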
diff --git a/src/data/variable.c b/src/data/variable.c
index 2ceeb0d0e9610a6396d58246db942ed8c0302ac1..029e3f49dd74739f6c6a3f57b98f7300bdbc7f4b 100644
--- a/src/data/variable.c
+++ b/src/data/variable.c
@@ -31,6 +31,7 @@
  #include "libpspp/assertion.h"
  #include "libpspp/compiler.h"
  #include "libpspp/hash-functions.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/misc.h"
  #include "libpspp/str.h"
@@ -132,7 +133,7 @@ var_clone (const struct variable *old_var)
    var_set_print_format (new_var, var_get_print_format (old_var));
    var_set_write_format (new_var, var_get_write_format (old_var));
    var_set_value_labels (new_var, var_get_value_labels (old_var));
-  var_set_label (new_var, var_get_label (old_var));
+  var_set_label (new_var, var_get_label (old_var), NULL, false);
    var_set_measure (new_var, var_get_measure (old_var));
    var_set_display_width (new_var, var_get_display_width (old_var));
    var_set_alignment (new_var, var_get_alignment (old_var));
@@ -163,109 +164,27 @@ var_destroy (struct variable *v)
  \f
  /* Variable names. */
  
-/* Return variable V's name. */
+/* Return variable V's name, as a UTF-8 encoded string. */
  const char *
  var_get_name (const struct variable *v)
  {
    return v->name;
  }
  
-/* Sets V's name to NAME.
+/* Sets V's name to NAME, a UTF-8 encoded string.
     Do not use this function for a variable in a dictionary.  Use
     dict_rename_var instead. */
  void
  var_set_name (struct variable *v, const char *name)
  {
    assert (!var_has_vardict (v));
-  assert (var_is_plausible_name (name, false));
+  assert (id_is_plausible (name, false));
  
    free (v->name);
    v->name = xstrdup (name);
    dict_var_changed (v);
  }
  
-/* Returns true if NAME is an acceptable name for a variable,
-   false otherwise.  If ISSUE_ERROR is true, issues an
-   explanatory error message on failure. */
-bool
-var_is_valid_name (const char *name, bool issue_error)
-{
-  bool plausible;
-  size_t length, i;
-
-  /* Note that strlen returns number of BYTES, not the number of
-     CHARACTERS */
-  length = strlen (name);
-
-  plausible = var_is_plausible_name(name, issue_error);
-
-  if ( ! plausible )
-    return false;
-
-
-  if (!lex_is_id1 (name[0]))
-    {
-      if (issue_error)
-        msg (SE, _("Character `%c' (in %s) may not appear "
-                   "as the first character in a variable name."),
-             name[0], name);
-      return false;
-    }
-
-
-  for (i = 0; i < length; i++)
-    {
-    if (!lex_is_idn (name[i]))
-      {
-        if (issue_error)
-          msg (SE, _("Character `%c' (in %s) may not appear in "
-                     "a variable name."),
-               name[i], name);
-        return false;
-      }
-    }
-
-  return true;
-}
-
-/* Returns true if NAME is an plausible name for a variable,
-   false otherwise.  If ISSUE_ERROR is true, issues an
-   explanatory error message on failure.
-   This function makes no use of LC_CTYPE.
-*/
-bool
-var_is_plausible_name (const char *name, bool issue_error)
-{
-  size_t length;
-
-  /* Note that strlen returns number of BYTES, not the number of
-     CHARACTERS */
-  length = strlen (name);
-  if (length < 1)
-    {
-      if (issue_error)
-        msg (SE, _("Variable name cannot be empty string."));
-      return false;
-    }
-  else if (length > VAR_NAME_LEN)
-    {
-      if (issue_error)
-        msg (SE, _("Variable name %s exceeds %d-character limit."),
-             name, (int) VAR_NAME_LEN);
-      return false;
-    }
-
-  if (lex_id_to_token (ss_cstr (name)) != T_ID)
-    {
-      if (issue_error)
-        msg (SE, _("`%s' may not be used as a variable name because it "
-                   "is a reserved word."), name);
-      return false;
-    }
-
-  return true;
-}
-
  /* Returns VAR's dictionary class. */
  enum dict_class
  var_get_dict_class (const struct variable *var)
@@ -644,33 +563,61 @@ var_get_label (const struct variable *v)
    return v->label;
  }
  
-/* Sets V's variable label to LABEL, stripping off leading and
-   trailing white space and truncating to 255 characters.
-   If LABEL is a null pointer or if LABEL is an empty string
-   (after stripping white space), then V's variable label (if
-   any) is removed. */
-void
-var_set_label (struct variable *v, const char *label)
+/* Sets V's variable label to UTF-8 encoded string LABEL, stripping off leading
+   and trailing white space.  If LABEL is a null pointer or if LABEL is an
+   empty string (after stripping white space), then V's variable label (if any)
+   is removed.
+
+   Variable labels are limited to 255 bytes in the dictionary encoding, which
+   should be specified as DICT_ENCODING.  If LABEL fits within this limit, this
+   function returns true.  Otherwise, the variable label is set to a truncated
+   value, this function returns false and, if ISSUE_WARNING is true, issues a
+   warning.  */
+bool
+var_set_label (struct variable *v, const char *label,
+               const char *dict_encoding, bool issue_warning)
  {
+  bool truncated = false;
+
    free (v->label);
    v->label = NULL;
  
    if (label != NULL)
      {
        struct substring s = ss_cstr (label);
+      size_t trunc_len;
+
+      if (dict_encoding != NULL)
+        {
+          enum { MAX_LABEL_LEN = 255 };
+
+          trunc_len = utf8_encoding_trunc_len (label, dict_encoding,
+                                               MAX_LABEL_LEN);
+          if (ss_length (s) > trunc_len)
+            {
+              if (issue_warning)
+                msg (SW, _("Truncating variable label for variable `%s' to %d "
+                           "bytes."), var_get_name (v), MAX_LABEL_LEN);
+              ss_truncate (&s, trunc_len);
+              truncated = true;
+            }
+        }
+
        ss_trim (&s, ss_cstr (CC_SPACES));
-      ss_truncate (&s, 255);
        if (!ss_is_empty (s))
          v->label = ss_xstrdup (s);
      }
+
    dict_var_changed (v);
+
+  return !truncated;
  }
  
  /* Removes any variable label from V. */
  void
  var_clear_label (struct variable *v)
  {
-  var_set_label (v, NULL);
+  var_set_label (v, NULL, NULL, false);
  }
  
  /* Returns true if V has a variable V,
@@ -839,7 +786,7 @@ var_get_short_name (const struct variable *var, size_t idx)
  void
  var_set_short_name (struct variable *var, size_t idx, const char *short_name)
  {
-  assert (short_name == NULL || var_is_plausible_name (short_name, false));
+  assert (short_name == NULL || id_is_plausible (short_name, false));
  
    /* Clear old short name numbered IDX, if any. */
    if (idx < var->short_name_cnt) 
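Reviewer's sketch (not part of the commit): var_set_label() and dict_add_document_line() both rely on utf8_encoding_trunc_len() to find how much of a UTF-8 string fits within a byte limit in the dictionary encoding. The function below illustrates only the simplest case, where the target encoding is itself UTF-8, and it assumes well-formed input. The name utf8_prefix_len is hypothetical; the real PSPP function also has to account for recoding into other encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of PSPP: returns the length in bytes
   of the longest prefix of well-formed UTF-8 string S that is at most
   MAX_BYTES long and does not end in the middle of a character. */
static size_t
utf8_prefix_len (const char *s, size_t max_bytes)
{
  size_t len;

  assert (s != NULL);
  len = strlen (s);
  if (len <= max_bytes)
    return len;

  /* S[LEN] is the first byte that would be cut off.  If it is a
     continuation byte (10xxxxxx), the cut would split a character,
     so back up until it falls on a character boundary. */
  len = max_bytes;
  while (len > 0 && ((unsigned char) s[len] & 0xC0) == 0x80)
    len--;
  return len;
}
```

A caller can detect truncation the same way the diff does, by checking whether the byte at the returned offset is the terminating null.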
diff --git a/src/data/variable.h b/src/data/variable.h
index 03257502a9b28437f802b61d28e74f24cd23f360..9cffafc9069d58026925b4ecedc256d85b53907b 100644
--- a/src/data/variable.h
+++ b/src/data/variable.h
@@ -34,12 +34,8 @@ struct variable *var_clone (const struct variable *);
  void var_destroy (struct variable *);
  
  /* Variable names. */
-#define VAR_NAME_LEN 64 /* Maximum length of variable name, in bytes. */
-
  const char *var_get_name (const struct variable *);
  void var_set_name (struct variable *, const char *);
-bool var_is_valid_name (const char *, bool issue_error);
-bool var_is_plausible_name (const char *name, bool issue_error);
  enum dict_class var_get_dict_class (const struct variable *);
  
  int compare_vars_by_name (const void *, const void *, const void *);
@@ -102,7 +98,8 @@ struct fmt_spec var_default_formats (int width);
  /* Variable labels. */
  const char *var_to_string (const struct variable *);
  const char *var_get_label (const struct variable *);
-void var_set_label (struct variable *, const char *);
+bool var_set_label (struct variable *, const char *label,
+                    const char *dict_encoding, bool issue_warning);
  void var_clear_label (struct variable *);
  bool var_has_label (const struct variable *);
  
diff --git a/src/data/vector.c b/src/data/vector.c

index 5da797c0a0678fdf6abb3fc1683be6e502ea0a1b..87046ad42b8127853f7a67b75bb2f07dc4583121 100644
--- a/src/data/vector.c
+++ b/src/data/vector.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 2006, 2011  Free Software Foundation, Inc.
+   Copyright (C) 2006, 2010, 2011  Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -21,6 +21,7 @@
  #include <stdlib.h>
  
  #include "data/dictionary.h"
+#include "data/identifier.h"
  #include "libpspp/assertion.h"
  #include "libpspp/str.h"
  
@@ -46,19 +47,18 @@ check_widths (const struct vector *vector)
      assert (width == var_get_width (vector->vars[i]));
  }
  
-/* Creates and returns a new vector with the given NAME
+/* Creates and returns a new vector with the given UTF-8 encoded NAME
     that contains the VAR_CNT variables in VARS.
     All variables in VARS must have the same type and width. */
  struct vector *
-vector_create (const char *name,
-               struct variable **vars, size_t var_cnt)
+vector_create (const char *name, struct variable **vars, size_t var_cnt)
  {
    struct vector *vector = xmalloc (sizeof *vector);
  
    assert (var_cnt > 0);
-  assert (var_is_plausible_name (name, false));
-  vector->name = xstrdup (name);
+  assert (id_is_plausible (name, false));
  
+  vector->name = xstrdup (name);
    vector->vars = xmemdup (vars, var_cnt * sizeof *vector->vars);
    vector->var_cnt = var_cnt;
    check_widths (vector);
@@ -80,7 +80,6 @@ vector_clone (const struct vector *old,
    size_t i;
  
    new->name = xstrdup (old->name);
-
    new->vars = xnmalloc (old->var_cnt, sizeof *new->vars);
    new->var_cnt = old->var_cnt;
    for (i = 0; i < new->var_cnt; i++)
@@ -103,7 +102,7 @@ vector_destroy (struct vector *vector)
    free (vector);
  }
  
-/* Returns VECTOR's name. */
+/* Returns VECTOR's name, as a UTF-8 encoded string. */
  const char *
  vector_get_name (const struct vector *vector)
  {
diff --git a/src/data/vector.h b/src/data/vector.h

index fc2bf9518d25bfb271095a2ba4935de4c9ccd1c3..f8fe0888497cc543a8a87e91b8c705ddcfd71cb1 100644
--- a/src/data/vector.h
+++ b/src/data/vector.h
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 2006, 2011  Free Software Foundation, Inc.
+   Copyright (C) 2006, 2010, 2011  Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -34,6 +34,8 @@ enum val_type vector_get_type (const struct vector *);
  struct variable *vector_get_var (const struct vector *, size_t idx);
  size_t vector_get_var_cnt (const struct vector *);
  
+bool vector_is_valid_name (const char *name, bool issue_error);
+
  int compare_vector_ptrs_by_name (const void *a_, const void *b_);
  
  #endif /* data/vector.h */
diff --git a/src/language/automake.mk b/src/language/automake.mk

index 3052b52353ca54416ffbb53f66c4354787bf6924..6c0eaa90a05927d81c4bd2db4ff47dfa6b76341d 100644
--- a/src/language/automake.mk
+++ b/src/language/automake.mk
@@ -14,12 +14,6 @@ noinst_LTLIBRARIES +=  src/language/liblanguage.la
  
  
  src_language_liblanguage_la_SOURCES = \
-       src/language/syntax-file.c \
-       src/language/syntax-file.h \
-       src/language/syntax-string-source.c \
-       src/language/syntax-string-source.h \
-       src/language/prompt.c \
-       src/language/prompt.h \
         src/language/command.c \
         src/language/command.h \
         src/language/command.def \
diff --git a/src/language/command.c b/src/language/command.c

index 3bf9571fa5a4065537ce1225a01fb42a1ff49559..8ba8a08e4bb9458d3f9e9497826abdd3b16bbd4a 100644
--- a/src/language/command.c
+++ b/src/language/command.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2009, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -30,12 +30,11 @@
  #include "data/variable.h"
  #include "language/lexer/command-name.h"
  #include "language/lexer/lexer.h"
-#include "language/prompt.h"
  #include "libpspp/assertion.h"
  #include "libpspp/compiler.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/str.h"
-#include "libpspp/getl.h"
  #include "output/text-item.h"
  
  #include "xalloc.h"
@@ -89,7 +88,6 @@ enum flags
    {
      F_ENHANCED = 0x10,        /* Allowed only in enhanced syntax mode. */
      F_TESTING = 0x20,         /* Allowed only in testing mode. */
-    F_KEEP_FINAL_TOKEN = 0x40,/* Don't skip final token in command name. */
      F_ABBREV = 0x80           /* Not a candidate for name completion. */
    };
  
@@ -120,7 +118,8 @@ static void set_completion_state (enum cmd_state);
  \f
  /* Command parser. */
  
-static const struct command *parse_command_name (struct lexer *lexer);
+static const struct command *parse_command_name (struct lexer *,
+                                                 int *n_tokens);
  static enum cmd_result do_parse_command (struct lexer *, struct dataset *, enum cmd_state);
  
  /* Parses an entire command, from command name to terminating
@@ -163,11 +162,10 @@ do_parse_command (struct lexer *lexer,
    const struct command *command = NULL;
    enum cmd_result result;
    bool opened = false;
+  int n_tokens;
  
    /* Read the command's first token. */
-  prompt_set_style (PROMPT_FIRST);
    set_completion_state (state);
-  lex_get (lexer);
    if (lex_token (lexer) == T_STOP)
      {
        result = CMD_EOF;
@@ -180,10 +178,8 @@ do_parse_command (struct lexer *lexer,
        goto finish;
      }
  
-  prompt_set_style (PROMPT_LATER);
-
    /* Parse the command name. */
-  command = parse_command_name (lexer);
+  command = parse_command_name (lexer, &n_tokens);
    if (command == NULL)
      {
        result = CMD_FAILURE;
@@ -216,22 +212,24 @@ do_parse_command (struct lexer *lexer,
    else
      {
        /* Execute command. */
+      int i;
+
+      for (i = 0; i < n_tokens; i++)
+        lex_get (lexer);
        result = command->function (lexer, ds);
      }
  
    assert (cmd_result_is_valid (result));
  
- finish:
+finish:
    if (cmd_result_is_failure (result))
-    {
-      lex_discard_rest_of_command (lexer);
-      if (source_stream_current_error_mode (
-            lex_get_source_stream (lexer)) == ERRMODE_STOP )
-       {
-         msg (MW, _("Error encountered while ERROR=STOP is effective."));
-         result = CMD_CASCADING_FAILURE;
-       }
-    }
+    lex_interactive_reset (lexer);
+  else if (result == CMD_SUCCESS)
+    result = lex_end_of_command (lexer);
+
+  lex_discard_rest_of_command (lexer);
+  while (lex_token (lexer) == T_ENDCMD)
+    lex_get (lexer);
  
    if (opened)
      text_item_submit (text_item_create (TEXT_ITEM_COMMAND_CLOSE,
@@ -259,51 +257,65 @@ find_best_match (struct substring s, const struct command **matchp)
    return missing_words;
  }
  
-/* Parse the command name and return a pointer to the corresponding
-   struct command if successful.
-   If not successful, return a null pointer. */
+static bool
+parse_command_word (struct lexer *lexer, struct string *s, int n)
+{
+  bool need_space = ds_last (s) != EOF && ds_last (s) != '-';
+
+  switch (lex_next_token (lexer, n))
+    {
+    case T_DASH:
+      ds_put_byte (s, '-');
+      return true;
+
+    case T_ID:
+      if (need_space)
+        ds_put_byte (s, ' ');
+      ds_put_cstr (s, lex_next_tokcstr (lexer, n));
+      return true;
+
+    case T_POS_NUM:
+      if (lex_next_is_integer (lexer, n))
+        {
+          long int integer = lex_next_integer (lexer, n);
+          if (integer >= 0)
+            {
+              if (need_space)
+                ds_put_byte (s, ' ');
+              ds_put_format (s, "%ld", integer);
+              return true;
+            }
+        }
+      return false;
+
+    default:
+      return false;
+    }
+}
+
+/* Parses the command name.  On success returns a pointer to the corresponding
+   struct command and stores the number of tokens in the command name into
+   *N_TOKENS.  On failure, returns a null pointer and stores the number of
+   tokens required to determine that no command name was present into
+   *N_TOKENS. */
  static const struct command *
-parse_command_name (struct lexer *lexer)
+parse_command_name (struct lexer *lexer, int *n_tokens)
  {
    const struct command *command;
    int missing_words;
    struct string s;
-
-  if (lex_token (lexer) == T_EXP
-      || lex_token (lexer) == T_ASTERISK
-      || lex_token (lexer) == T_LBRACK)
-    {
-      static const struct command c = { S_ANY, 0, "COMMENT", cmd_comment };
-      return &c;
-    }
+  int word;
  
    command = NULL;
    missing_words = 0;
    ds_init_empty (&s);
-  for (;;)
+  word = 0;
+  while (parse_command_word (lexer, &s, word))
      {
-      if (lex_token (lexer) == T_DASH)
-        ds_put_byte (&s, '-');
-      else if (lex_token (lexer) == T_ID)
-        {
-          if (!ds_is_empty (&s) && ds_last (&s) != '-')
-            ds_put_byte (&s, ' ');
-          ds_put_cstr (&s, lex_tokcstr (lexer));
-        }
-      else if (lex_is_integer (lexer) && lex_integer (lexer) >= 0)
-        {
-          if (!ds_is_empty (&s) && ds_last (&s) != '-')
-            ds_put_byte (&s, ' ');
-          ds_put_format (&s, "%ld", lex_integer (lexer));
-        }
-      else
-        break;
-
        missing_words = find_best_match (ds_ss (&s), &command);
        if (missing_words <= 0)
          break;
-
-      lex_get (lexer);
+      word++;
      }
  
    if (command == NULL && missing_words > 0)
@@ -320,18 +332,10 @@ parse_command_name (struct lexer *lexer)
        else
          msg (SE, _("Unknown command `%s'."), ds_cstr (&s));
      }
-  else if (missing_words == 0)
-    {
-      if (!(command->flags & F_KEEP_FINAL_TOKEN))
-        lex_get (lexer);
-    }
-  else if (missing_words < 0)
-    {
-      assert (missing_words == -1);
-      assert (!(command->flags & F_KEEP_FINAL_TOKEN));
-    }
  
    ds_destroy (&s);
+
+  *n_tokens = (word + 1) + missing_words;
    return command;
  }
  
@@ -423,7 +427,8 @@ report_state_mismatch (const struct command *command, enum cmd_state state)
          }
      }
    else if (state == CMD_STATE_INPUT_PROGRAM)
-    msg (SE, _("%s is not allowed inside %s."), command->name, "INPUT PROGRAM" );
+    msg (SE, _("%s is not allowed inside %s."),
+         command->name, "INPUT PROGRAM");
    else if (state == CMD_STATE_FILE_TYPE)
      msg (SE, _("%s is not allowed inside %s."), command->name, "FILE TYPE");
  
@@ -485,23 +490,26 @@ cmd_n_of_cases (struct lexer *lexer, struct dataset *ds)
    if (!lex_match_id (lexer, "ESTIMATED"))
      dict_set_case_limit (dataset_dict (ds), x);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Parses, performs the EXECUTE procedure. */
  int
-cmd_execute (struct lexer *lexer, struct dataset *ds)
+cmd_execute (struct lexer *lexer UNUSED, struct dataset *ds)
  {
    bool ok = casereader_destroy (proc_open (ds));
    if (!proc_commit (ds) || !ok)
      return CMD_CASCADING_FAILURE;
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Parses, performs the ERASE command. */
  int
  cmd_erase (struct lexer *lexer, struct dataset *ds UNUSED)
  {
+  char *filename;
+  int retval;
+
    if (settings_get_safer_mode ())
      {
        msg (SE, _("This command not allowed when the SAFER option is set."));
@@ -514,29 +522,25 @@ cmd_erase (struct lexer *lexer, struct dataset *ds UNUSED)
    if (!lex_force_string (lexer))
      return CMD_FAILURE;
  
-  if (remove (lex_tokcstr (lexer)) == -1)
+  filename = utf8_to_filename (lex_tokcstr (lexer));
+  retval = remove (filename);
+  free (filename);
+
+  if (retval == -1)
      {
        msg (SW, _("Error removing `%s': %s."),
             lex_tokcstr (lexer), strerror (errno));
        return CMD_FAILURE;
      }
+  lex_get (lexer);
  
    return CMD_SUCCESS;
  }
  
  /* Parses, performs the NEW FILE command. */
  int
-cmd_new_file (struct lexer *lexer, struct dataset *ds)
+cmd_new_file (struct lexer *lexer UNUSED, struct dataset *ds)
  {
    proc_discard_active_file (ds);
-
-  return lex_end_of_command (lexer);
-}
-
-/* Parses a comment. */
-int
-cmd_comment (struct lexer *lexer, struct dataset *ds UNUSED)
-{
-  lex_skip_comment (lexer);
    return CMD_SUCCESS;
  }
diff --git a/src/language/command.def b/src/language/command.def

index ece18a8c8f69d409b037a262261f9eba41412392..3610b3d346159178b3a5cc6dbd3d74ecd4d6ec12 100644
--- a/src/language/command.def
+++ b/src/language/command.def
@@ -16,7 +16,6 @@
  
  /* Utility commands acceptable anywhere. */
  DEF_CMD (S_ANY, F_ENHANCED, "CLOSE FILE HANDLE", cmd_close_file_handle)
-DEF_CMD (S_ANY, F_KEEP_FINAL_TOKEN, "COMMENT", cmd_comment)
  DEF_CMD (S_ANY, 0, "CACHE", cmd_cache)
  DEF_CMD (S_ANY, 0, "CD", cmd_cd)
  DEF_CMD (S_ANY, 0, "DO REPEAT", cmd_do_repeat)
@@ -25,7 +24,7 @@ DEF_CMD (S_ANY, 0, "ECHO", cmd_echo)
  DEF_CMD (S_ANY, 0, "ERASE", cmd_erase)
  DEF_CMD (S_ANY, 0, "EXIT", cmd_finish)
  DEF_CMD (S_ANY, 0, "FILE HANDLE", cmd_file_handle)
-DEF_CMD (S_ANY, F_KEEP_FINAL_TOKEN, "FILE LABEL", cmd_file_label)
+DEF_CMD (S_ANY, 0, "FILE LABEL", cmd_file_label)
  DEF_CMD (S_ANY, 0, "FINISH", cmd_finish)
  DEF_CMD (S_ANY, 0, "HOST", cmd_host)
  DEF_CMD (S_ANY, 0, "INCLUDE", cmd_include)
@@ -40,9 +39,9 @@ DEF_CMD (S_ANY, 0, "QUIT", cmd_finish)
  DEF_CMD (S_ANY, 0, "RESTORE", cmd_restore)
  DEF_CMD (S_ANY, 0, "SET", cmd_set)
  DEF_CMD (S_ANY, 0, "SHOW", cmd_show)
-DEF_CMD (S_ANY, F_KEEP_FINAL_TOKEN, "SUBTITLE", cmd_subtitle)
+DEF_CMD (S_ANY, 0, "SUBTITLE", cmd_subtitle)
  DEF_CMD (S_ANY, 0, "SYSFILE INFO", cmd_sysfile_info)
-DEF_CMD (S_ANY, F_KEEP_FINAL_TOKEN, "TITLE", cmd_title)
+DEF_CMD (S_ANY, 0, "TITLE", cmd_title)
  
  /* Commands that define (or replace) the active file. */
  DEF_CMD (S_INITIAL | S_DATA, 0, "ADD FILES", cmd_add_files)
@@ -63,7 +62,7 @@ DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "BREAK", cmd_break)
  DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "COMPUTE", cmd_compute)
  DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "DATAFILE ATTRIBUTE", cmd_datafile_attribute)
  DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "DISPLAY", cmd_display)
-DEF_CMD (S_DATA | S_INPUT_PROGRAM, F_KEEP_FINAL_TOKEN, "DOCUMENT", cmd_document)
+DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "DOCUMENT", cmd_document)
  DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "DO IF", cmd_do_if)
  DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "DROP DOCUMENTS", cmd_drop_documents)
  DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "ELSE IF", cmd_else_if)
@@ -101,7 +100,7 @@ DEF_CMD (S_DATA | S_INPUT_PROGRAM, 0, "XSAVE", cmd_xsave)
  /* Commands that may appear after active file definition. */
  DEF_CMD (S_DATA, 0, "AGGREGATE", cmd_aggregate)
  DEF_CMD (S_DATA, 0, "AUTORECODE", cmd_autorecode)
-DEF_CMD (S_DATA, F_KEEP_FINAL_TOKEN, "BEGIN DATA", cmd_begin_data)
+DEF_CMD (S_DATA, 0, "BEGIN DATA", cmd_begin_data)
  DEF_CMD (S_DATA, 0, "COUNT", cmd_count)
  DEF_CMD (S_DATA, 0, "CROSSTABS", cmd_crosstabs)
  DEF_CMD (S_DATA, 0, "CORRELATIONS", cmd_correlation)
diff --git a/src/language/control/automake.mk b/src/language/control/automake.mk

index e12813abdbb68eea3a2779f70f4e8a3df28b25f7..3e87e5a0b63d51c8b90aa78dcecb482c7303f587 100644
--- a/src/language/control/automake.mk
+++ b/src/language/control/automake.mk
@@ -6,8 +6,7 @@ language_control_sources = \
         src/language/control/control-stack.h \
         src/language/control/do-if.c \
         src/language/control/loop.c \
-       src/language/control/temporary.c \
         src/language/control/repeat.c \
-       src/language/control/repeat.h
+       src/language/control/temporary.c
  
  EXTRA_DIST += src/language/control/OChangeLog
diff --git a/src/language/control/do-if.c b/src/language/control/do-if.c

index a80dae31c9811936ea24908901c4518d5b9bebba..612b08565cf2664fa24b57eee1c97d00d187facc 100644
--- a/src/language/control/do-if.c
+++ b/src/language/control/do-if.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2009, 2011 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2009-2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -121,19 +121,19 @@ cmd_else_if (struct lexer *lexer, struct dataset *ds)
  
  /* Parse ELSE. */
  int
-cmd_else (struct lexer *lexer, struct dataset *ds)
+cmd_else (struct lexer *lexer UNUSED, struct dataset *ds)
  {
    struct do_if_trns *do_if = ctl_stack_top (&do_if_class);
    assert (ds == do_if->ds);
    if (do_if == NULL || !must_not_have_else (do_if))
      return CMD_CASCADING_FAILURE;
    add_else (do_if);
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Parse END IF. */
  int
-cmd_end_if (struct lexer *lexer, struct dataset *ds)
+cmd_end_if (struct lexer *lexer UNUSED, struct dataset *ds)
  {
    struct do_if_trns *do_if = ctl_stack_top (&do_if_class);
    assert (ds == do_if->ds);
@@ -143,7 +143,7 @@ cmd_end_if (struct lexer *lexer, struct dataset *ds)
  
    ctl_stack_pop (do_if);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Closes out DO_IF, by adding a sentinel ELSE clause if
@@ -204,7 +204,7 @@ parse_clause (struct lexer *lexer, struct do_if_trns *do_if, struct dataset *ds)
  
    add_clause (do_if, condition);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Adds a clause to DO_IF that tests for the given CONDITION and,
diff --git a/src/language/control/loop.c b/src/language/control/loop.c

index 362f199327e00c33826d90976dd746a33ee90c9e..8f4ff8251648a8cf93958347de0ce69b5d6970af 100644
--- a/src/language/control/loop.c
+++ b/src/language/control/loop.c
@@ -154,7 +154,7 @@ cmd_end_loop (struct lexer *lexer, struct dataset *ds)
  
  /* Parses BREAK. */
  int
-cmd_break (struct lexer *lexer, struct dataset *ds)
+cmd_break (struct lexer *lexer UNUSED, struct dataset *ds)
  {
    struct ctl_stmt *loop = ctl_stack_search (&loop_class);
    if (loop == NULL)
@@ -162,7 +162,7 @@ cmd_break (struct lexer *lexer, struct dataset *ds)
  
    add_transformation (ds, break_trns_proc, NULL, loop);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Closes a LOOP construct by emitting the END LOOP
diff --git a/src/language/control/repeat.c b/src/language/control/repeat.c

index ecff0577bf3ca64624e139cbd31b633ffe4e8ab1..c0fa8fe0d8dfe165deb6f55968f17ed512d9b46f 100644
--- a/src/language/control/repeat.c
+++ b/src/language/control/repeat.c
@@ -16,483 +16,412 @@
  
  #include <config.h>
  
-#include "language/control/repeat.h"
-
-#include <ctype.h>
-#include <math.h>
  #include <stdlib.h>
  
  #include "data/dictionary.h"
  #include "data/procedure.h"
-#include "data/settings.h"
-#include "libpspp/getl.h"
  #include "language/command.h"
  #include "language/lexer/lexer.h"
+#include "language/lexer/segment.h"
+#include "language/lexer/token.h"
  #include "language/lexer/variable-parser.h"
  #include "libpspp/cast.h"
-#include "libpspp/ll.h"
+#include "libpspp/hash-functions.h"
+#include "libpspp/hmap.h"
  #include "libpspp/message.h"
-#include "libpspp/misc.h"
-#include "libpspp/pool.h"
  #include "libpspp/str.h"
-#include "data/variable.h"
  
-#include "gl/intprops.h"
+#include "gl/ftoastr.h"
+#include "gl/minmax.h"
  #include "gl/xalloc.h"
  
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
  
-/* A line repeated by DO REPEAT. */
-struct repeat_line
-  {
-    struct ll ll;               /* In struct repeat_block line_list. */
-    const char *file_name;      /* File name. */
-    int line_number;            /* Line number. */
-    struct substring text;     /* Contents. */
-  };
-
-/* The type of substitution made for a DO REPEAT macro. */
-enum repeat_macro_type
-  {
-    VAR_NAMES,
-    OTHER
-  };
-
-/* Describes one DO REPEAT macro. */
-struct repeat_macro
+struct dummy_var
    {
-    struct ll ll;                       /* In struct repeat_block macros. */
-    enum repeat_macro_type type;        /* Types of replacements. */
-    struct substring name;              /* Macro name. */
-    struct substring *replacements;     /* Macro replacement. */
+    struct hmap_node hmap_node;
+    char *name;
+    char **values;
+    size_t n_values;
    };
  
-/* A DO REPEAT...END REPEAT block. */
-struct repeat_block
-  {
-    struct getl_interface parent;
-
-    struct pool *pool;                  /* Pool used for storage. */
-    struct dataset *ds;                 /* The dataset for this block */
-
-    struct ll_list lines;               /* Lines in buffer. */
-    struct ll *cur_line;                /* Last line output. */
-    int loop_cnt;                       /* Number of loops. */
-    int loop_idx;                       /* Number of loops so far. */
+static bool parse_specification (struct lexer *, struct dictionary *,
+                                 struct hmap *dummies);
+static bool parse_commands (struct lexer *, struct hmap *dummies);
+static void destroy_dummies (struct hmap *dummies);
  
-    struct ll_list macros;              /* Table of macros. */
+static bool parse_ids (struct lexer *, const struct dictionary *,
+                       struct dummy_var *);
+static bool parse_numbers (struct lexer *, struct dummy_var *);
+static bool parse_strings (struct lexer *, struct dummy_var *);
  
-    bool print;                         /* Print lines as executed? */
-  };
-
-static bool parse_specification (struct lexer *, struct repeat_block *);
-static bool parse_lines (struct lexer *, struct repeat_block *);
-static void create_vars (struct repeat_block *);
+int
+cmd_do_repeat (struct lexer *lexer, struct dataset *ds)
+{
+  struct hmap dummies;
+  bool ok;
  
-static struct repeat_macro *find_macro (struct repeat_block *,
-                                        struct substring name);
+  if (!parse_specification (lexer, dataset_dict (ds), &dummies))
+    return CMD_CASCADING_FAILURE;
  
-static int parse_ids (struct lexer *, const struct dictionary *dict,
-                     struct repeat_macro *, struct pool *);
+  ok = parse_commands (lexer, &dummies);
  
-static int parse_numbers (struct lexer *, struct repeat_macro *,
-                         struct pool *);
+  destroy_dummies (&dummies);
  
-static int parse_strings (struct lexer *, struct repeat_macro *,
-                         struct pool *);
+  return ok ? CMD_SUCCESS : CMD_CASCADING_FAILURE;
+}
  
-static void do_repeat_filter (struct getl_interface *,
-                              struct string *);
-static bool do_repeat_read (struct getl_interface *,
-                            struct string *);
-static void do_repeat_close (struct getl_interface *);
-static bool always_false (const struct getl_interface *);
-static const char *do_repeat_name (const struct getl_interface *);
-static int do_repeat_location (const struct getl_interface *);
+static unsigned int
+hash_dummy (const char *name, size_t name_len)
+{
+  return hash_case_bytes (name, name_len, 0);
+}
  
-int
-cmd_do_repeat (struct lexer *lexer, struct dataset *ds)
+static const struct dummy_var *
+find_dummy_var (struct hmap *hmap, const char *name, size_t name_len)
  {
-  struct repeat_block *block;
-
-  block = pool_create_container (struct repeat_block, pool);
-  block->ds = ds;
-  ll_init (&block->lines);
-  block->cur_line = ll_null (&block->lines);
-  block->loop_idx = 0;
-  ll_init (&block->macros);
-
-  if (!parse_specification (lexer, block) || !parse_lines (lexer, block))
-    goto error;
-
-  create_vars (block);
-
-  block->parent.read = do_repeat_read;
-  block->parent.close = do_repeat_close;
-  block->parent.filter = do_repeat_filter;
-  block->parent.interactive = always_false;
-  block->parent.name = do_repeat_name;
-  block->parent.location = do_repeat_location;
-
-  if (!ll_is_empty (&block->lines))
-    getl_include_source (lex_get_source_stream (lexer),
-                        &block->parent,
-                        lex_current_syntax_mode (lexer),
-                        lex_current_error_mode (lexer)
-                        );
-  else
-    pool_destroy (block->pool);
+  const struct dummy_var *dv;
  
-  return CMD_SUCCESS;
+  HMAP_FOR_EACH_WITH_HASH (dv, struct dummy_var, hmap_node,
+                           hash_dummy (name, name_len), hmap)
+    if (!strncasecmp (dv->name, name, name_len) && dv->name[name_len] == '\0')
+      return dv;
  
- error:
-  pool_destroy (block->pool);
-  return CMD_CASCADING_FAILURE;
+  return NULL;
  }
  
  /* Parses the whole DO REPEAT command specification.
     Returns success. */
  static bool
-parse_specification (struct lexer *lexer, struct repeat_block *block)
+parse_specification (struct lexer *lexer, struct dictionary *dict,
+                     struct hmap *dummies)
  {
-  struct substring first_name;
+  struct dummy_var *first_dv = NULL;
  
-  block->loop_cnt = 0;
+  hmap_init (dummies);
    do
      {
-      struct repeat_macro *macro;
-      struct dictionary *dict = dataset_dict (block->ds);
-      int count;
+      struct dummy_var *dv;
+      const char *name;
+      bool ok;
  
        /* Get a stand-in variable name and make sure it's unique. */
        if (!lex_force_id (lexer))
-       return false;
-      if (dict_lookup_var (dict, lex_tokcstr (lexer)))
+       goto error;
+      name = lex_tokcstr (lexer);
+      if (dict_lookup_var (dict, name))
          msg (SW, _("Dummy variable name `%s' hides dictionary variable `%s'."),
-             lex_tokcstr (lexer), lex_tokcstr (lexer));
-      if (find_macro (block, lex_tokss (lexer)))
-         {
-           msg (SE, _("Dummy variable name `%s' is given twice."),
-                lex_tokcstr (lexer));
-           return false;
-         }
+             name, name);
+      if (find_dummy_var (dummies, name, strlen (name)))
+        {
+          msg (SE, _("Dummy variable name `%s' is given twice."), name);
+          goto error;
+        }
  
        /* Make a new macro. */
-      macro = pool_alloc (block->pool, sizeof *macro);
-      ss_alloc_substring_pool (&macro->name, lex_tokss (lexer), block->pool);
-      ll_push_tail (&block->macros, &macro->ll);
+      dv = xmalloc (sizeof *dv);
+      dv->name = xstrdup (name);
+      dv->values = NULL;
+      dv->n_values = 0;
+      hmap_insert (dummies, &dv->hmap_node, hash_dummy (name, strlen (name)));
  
        /* Skip equals sign. */
        lex_get (lexer);
        if (!lex_force_match (lexer, T_EQUALS))
-       return false;
+       goto error;
  
        /* Get the details of the variable's possible values. */
-      if (lex_token (lexer) == T_ID)
-       count = parse_ids (lexer, dict, macro, block->pool);
+      if (lex_token (lexer) == T_ID || lex_token (lexer) == T_ALL)
+       ok = parse_ids (lexer, dict, dv);
        else if (lex_is_number (lexer))
-       count = parse_numbers (lexer, macro, block->pool);
+       ok = parse_numbers (lexer, dv);
        else if (lex_is_string (lexer))
-       count = parse_strings (lexer, macro, block->pool);
+       ok = parse_strings (lexer, dv);
        else
         {
           lex_error (lexer, NULL);
-         return false;
+         goto error;
         }
-      if (count == 0)
-       return false;
+      if (!ok)
+       goto error;
+      assert (dv->n_values > 0);
        if (lex_token (lexer) != T_SLASH && lex_token (lexer) != T_ENDCMD)
          {
            lex_error (lexer, NULL);
-          return false;
+          goto error;
          }
  
-      /* If this is the first variable then it defines how many
-        replacements there must be; otherwise enforce this number of
-        replacements. */
-      if (block->loop_cnt == 0)
+      /* If this is the first variable then it defines how many replacements
+        there must be; otherwise enforce this number of replacements. */
+      if (first_dv == NULL)
+        first_dv = dv;
+      else if (first_dv->n_values != dv->n_values)
         {
-         block->loop_cnt = count;
-         first_name = macro->name;
-       }
-      else if (block->loop_cnt != count)
-       {
-         msg (SE, _("Dummy variable `%.*s' had %d "
-                     "substitutions, so `%.*s' must also, but %d "
-                     "were specified."),
-              (int) ss_length (first_name), ss_data (first_name),
-               block->loop_cnt,
-               (int) ss_length (macro->name), ss_data (macro->name),
-               count);
-         return false;
+         msg (SE, _("Dummy variable `%s' had %zu substitutions, so `%s' must "
+                     "also, but %zu were specified."),
+               first_dv->name, first_dv->n_values,
+               dv->name, dv->n_values);
+         goto error;
         }
  
        lex_match (lexer, T_SLASH);
      }
-  while (lex_token (lexer) != T_ENDCMD);
+  while (!lex_match (lexer, T_ENDCMD));
  
-  return true;
-}
+  while (lex_match (lexer, T_ENDCMD))
+    continue;
  
-/* Finds and returns a DO REPEAT macro with the given NAME, or
-   NULL if there is none */
-static struct repeat_macro *
-find_macro (struct repeat_block *block, struct substring name)
-{
-  struct repeat_macro *macro;
-
-  ll_for_each (macro, struct repeat_macro, ll, &block->macros)
-    if (ss_equals (macro->name, name))
-      return macro;
+  return true;
  
-  return NULL;
+error:
+  destroy_dummies (dummies);
+  return false;
  }
  
-/* Advances LINE past white space and an identifier, if present.
-   Returns true if KEYWORD matches the identifer, false
-   otherwise. */
-static bool
-recognize_keyword (struct substring *line, const char *keyword)
+static size_t
+count_values (struct hmap *dummies)
  {
-  struct substring id;
-  ss_ltrim (line, ss_cstr (CC_SPACES));
-  ss_get_bytes (line, lex_id_get_length (*line), &id);
-  return lex_id_match (ss_cstr (keyword), id);
+  const struct dummy_var *dv;
+  dv = HMAP_FIRST (struct dummy_var, hmap_node, dummies);
+  return dv->n_values;
  }
  
-/* Returns true if LINE contains a DO REPEAT command, false
-   otherwise. */
-static bool
-recognize_do_repeat (struct substring line)
+static void
+do_parse_commands (struct substring s, enum lex_syntax_mode syntax_mode,
+                   struct hmap *dummies,
+                   struct string *outputs, size_t n_outputs)
  {
-  return (recognize_keyword (&line, "do")
-          && recognize_keyword (&line, "repeat"));
-}
+  struct segmenter segmenter;
  
-/* Returns true if LINE contains an END REPEAT command, false
-   otherwise.  Sets *PRINT to true for END REPEAT PRINT, false
-   otherwise. */
-static bool
-recognize_end_repeat (struct substring line, bool *print)
-{
-  if (!recognize_keyword (&line, "end")
-      || !recognize_keyword (&line, "repeat"))
-    return false;
+  segmenter_init (&segmenter, syntax_mode);
  
-  *print = recognize_keyword (&line, "print");
-  return true;
-}
+  while (!ss_is_empty (s))
+    {
+      enum segment_type type;
+      int n;
  
-/* Read all the lines we are going to substitute, inside the DO
-   REPEAT...END REPEAT block. */
-static bool
-parse_lines (struct lexer *lexer, struct repeat_block *block)
-{
-  char *previous_file_name;
-  int nesting_level;
+      n = segmenter_push (&segmenter, s.string, s.length, &type);
+      assert (n >= 0);
  
-  previous_file_name = NULL;
-  nesting_level = 0;
+      if (type == SEG_DO_REPEAT_COMMAND)
+        {
+          for (;;)
+            {
+              int k;
  
-  for (;;)
-    {
-      const char *cur_file_name;
-      struct repeat_line *line;
-      struct string text;
-      bool command_ends_before_line, command_ends_after_line;
+              k = segmenter_push (&segmenter, s.string + n, s.length - n,
+                                  &type);
+              if (type != SEG_NEWLINE && type != SEG_DO_REPEAT_COMMAND)
+                break;
  
-      /* Retrieve an input line and make a copy of it. */
-      if (!lex_get_line_raw (lexer))
-        {
-          msg (SE, _("DO REPEAT without END REPEAT."));
-          return false;
-        }
-      ds_init_string (&text, lex_entire_line_ds (lexer));
-
-      /* Record file name. */
-      cur_file_name = getl_source_name (lex_get_source_stream (lexer));
-      if (cur_file_name != NULL &&
-         (previous_file_name == NULL
-           || !strcmp (cur_file_name, previous_file_name)))
-        previous_file_name = pool_strdup (block->pool, cur_file_name);
-
-      /* Create a line structure. */
-      line = pool_alloc (block->pool, sizeof *line);
-      line->file_name = previous_file_name;
-      line->line_number = getl_source_location (lex_get_source_stream (lexer));
-      ss_alloc_substring_pool (&line->text, ds_ss (&text), block->pool);
-
-
-      /* Check whether the line contains a DO REPEAT or END
-         REPEAT command. */
-      lex_preprocess_line (&text,
-                          lex_current_syntax_mode (lexer),
-                           &command_ends_before_line,
-                           &command_ends_after_line);
-      if (recognize_do_repeat (ds_ss (&text)))
-        {
-          if (settings_get_syntax () == COMPATIBLE)
-            msg (SE, _("DO REPEAT may not nest in compatibility mode."));
-          else
-            nesting_level++;
+              n += k;
+            }
+
+          do_parse_commands (ss_head (s, n), syntax_mode, dummies,
+                             outputs, n_outputs);
          }
-      else if (recognize_end_repeat (ds_ss (&text), &block->print)
-               && nesting_level-- == 0)
+      else if (type != SEG_END)
          {
-          lex_discard_line (lexer);
-         ds_destroy (&text);
-          return true;
+          const struct dummy_var *dv;
+          size_t i;
+
+          dv = (type == SEG_IDENTIFIER
+                ? find_dummy_var (dummies, s.string, n)
+                : NULL);
+          for (i = 0; i < n_outputs; i++)
+            if (dv != NULL)
+              ds_put_cstr (&outputs[i], dv->values[i]);
+            else
+              ds_put_substring (&outputs[i], ss_head (s, n));
          }
-      ds_destroy (&text);
  
-      /* Add the line to the list. */
-      ll_push_tail (&block->lines, &line->ll);
+      ss_advance (&s, n);
      }
  }
  
-/* Creates variables for the given DO REPEAT. */
+static bool
+parse_commands (struct lexer *lexer, struct hmap *dummies)
+{
+  struct string *outputs;
+  struct string input;
+  size_t input_len;
+  size_t n_values;
+  char *file_name;
+  int line_number;
+  bool ok;
+  size_t i;
+
+  if (lex_get_file_name (lexer) != NULL)
+    file_name = xstrdup (lex_get_file_name (lexer));
+  else
+    file_name = NULL;
+  line_number = lex_get_first_line_number (lexer, 0);
+
+  ds_init_empty (&input);
+  while (lex_is_string (lexer))
+    {
+      ds_put_substring (&input, lex_tokss (lexer));
+      ds_put_byte (&input, '\n');
+      lex_get (lexer);
+    }
+  if (ds_is_empty (&input))
+    ds_put_byte (&input, '\n');
+  ds_put_byte (&input, '\0');
+  input_len = ds_length (&input);
+
+  n_values = count_values (dummies);
+  outputs = xmalloc (n_values * sizeof *outputs);
+  for (i = 0; i < n_values; i++)
+    ds_init_empty (&outputs[i]);
+
+  do_parse_commands (ds_ss (&input), lex_get_syntax_mode (lexer),
+                     dummies, outputs, n_values);
+
+  ds_destroy (&input);
+
+  while (lex_match (lexer, T_ENDCMD))
+    continue;
+
+  ok = (lex_force_match_id (lexer, "END")
+        && lex_force_match_id (lexer, "REPEAT"));
+  if (ok)
+    lex_match_id (lexer, "PRINT"); /* XXX */
+
+  lex_discard_rest_of_command (lexer);
+
+  for (i = 0; i < n_values; i++)
+    {
+      struct string *output = &outputs[n_values - i - 1];
+      struct lex_reader *reader;
+
+      reader = lex_reader_for_substring_nocopy (ds_ss (output));
+      lex_reader_set_file_name (reader, file_name);
+      reader->line_number = line_number;
+      lex_include (lexer, reader);
+    }
+  free (file_name);
+
+  return ok;
+}
+
  static void
-create_vars (struct repeat_block *block)
+destroy_dummies (struct hmap *dummies)
  {
-  struct repeat_macro *macro;
-
-  ll_for_each (macro, struct repeat_macro, ll, &block->macros)
-    if (macro->type == VAR_NAMES)
-      {
-        int i;
-
-        for (i = 0; i < block->loop_cnt; i++)
-          {
-            /* Ignore return value: if the variable already
-               exists there is no harm done. */
-            char *var_name = ss_xstrdup (macro->replacements[i]);
-            dict_create_var (dataset_dict (block->ds), var_name, 0);
-            free (var_name);
-          }
-      }
+  struct dummy_var *dv, *next;
+
+  HMAP_FOR_EACH_SAFE (dv, next, struct dummy_var, hmap_node, dummies)
+    {
+      size_t i;
+
+      hmap_delete (dummies, &dv->hmap_node);
+
+      free (dv->name);
+      for (i = 0; i < dv->n_values; i++)
+        free (dv->values[i]);
+      free (dv->values);
+      free (dv);
+    }
+  hmap_destroy (dummies);
  }
  
  /* Parses a set of ids for DO REPEAT. */
-static int
+static bool
  parse_ids (struct lexer *lexer, const struct dictionary *dict,
-          struct repeat_macro *macro, struct pool *pool)
+          struct dummy_var *dv)
  {
-  char **replacements;
-  size_t n, i;
-
-  macro->type = VAR_NAMES;
-  if (!parse_mixed_vars_pool (lexer, dict, pool, &replacements, &n, PV_NONE))
-    return 0;
-
-  macro->replacements = pool_nalloc (pool, n, sizeof *macro->replacements);
-  for (i = 0; i < n; i++)
-    macro->replacements[i] = ss_cstr (replacements[i]);
-  return n;
+  return parse_mixed_vars (lexer, dict, &dv->values, &dv->n_values, PV_NONE);
  }
  
  /* Adds REPLACEMENT to MACRO's list of replacements, which has
     *USED elements and has room for *ALLOCATED.  Allocates memory
     from POOL. */
  static void
-add_replacement (struct substring replacement,
-                 struct repeat_macro *macro, struct pool *pool,
-                 size_t *used, size_t *allocated)
+add_replacement (struct dummy_var *dv, char *value, size_t *allocated)
  {
-  if (*used == *allocated)
-    macro->replacements = pool_2nrealloc (pool, macro->replacements, allocated,
-                                          sizeof *macro->replacements);
-  macro->replacements[(*used)++] = replacement;
+  if (dv->n_values == *allocated)
+    dv->values = x2nrealloc (dv->values, allocated, sizeof *dv->values);
+  dv->values[dv->n_values++] = value;
  }
  
  /* Parses a list or range of numbers for DO REPEAT. */
-static int
-parse_numbers (struct lexer *lexer, struct repeat_macro *macro,
-              struct pool *pool)
+static bool
+parse_numbers (struct lexer *lexer, struct dummy_var *dv)
  {
-  size_t used = 0;
    size_t allocated = 0;
  
-  macro->type = OTHER;
-  macro->replacements = NULL;
-
    do
      {
-      bool integer_value_seen;
-      double a, b, i;
-
-      /* Parse A TO B into a, b. */
        if (!lex_force_num (lexer))
-       return 0;
+       return false;
  
-      if ( (integer_value_seen = lex_is_integer (lexer) ) )
-       a = lex_integer (lexer);
-      else
-       a = lex_number (lexer);
+      if (lex_next_token (lexer, 1) == T_TO)
+        {
+          long int a, b;
+          long int i;
  
-      lex_get (lexer);
-      if (lex_token (lexer) == T_TO)
-       {
-         if ( !integer_value_seen )
+          if (!lex_is_integer (lexer))
             {
-             msg (SE, _("Ranges may only have integer bounds"));
-             return 0;
+             msg (SE, _("Ranges may only have integer bounds."));
+             return false;
             }
-         lex_get (lexer);
-         if (!lex_force_int (lexer))
-           return 0;
+
+          a = lex_integer (lexer);
+          lex_get (lexer);
+          lex_get (lexer);
+
+          if (!lex_force_int (lexer))
+            return false;
+
           b = lex_integer (lexer);
            if (b < a)
              {
-              msg (SE, _("%g TO %g is an invalid range."), a, b);
-              return 0;
+              msg (SE, _("%ld TO %ld is an invalid range."), a, b);
+              return false;
              }
           lex_get (lexer);
-       }
+
+          for (i = a; i <= b; i++)
+            add_replacement (dv, xasprintf ("%ld", i), &allocated);
+        }
        else
-        b = a;
+        {
+          char s[DBL_BUFSIZE_BOUND];
  
-      for (i = a; i <= b; i++)
-        add_replacement (ss_cstr (pool_asprintf (pool, "%g", i)),
-                         macro, pool, &used, &allocated);
+          dtoastr (s, sizeof s, 0, 0, lex_number (lexer));
+          add_replacement (dv, xstrdup (s), &allocated);
+          lex_get (lexer);
+        }
  
        lex_match (lexer, T_COMMA);
      }
    while (lex_token (lexer) != T_SLASH && lex_token (lexer) != T_ENDCMD);
  
-  return used;
+  return true;
  }
  
  /* Parses a list of strings for DO REPEAT. */
-int
-parse_strings (struct lexer *lexer, struct repeat_macro *macro, struct pool *pool)
+static bool
+parse_strings (struct lexer *lexer, struct dummy_var *dv)
  {
-  size_t used = 0;
    size_t allocated = 0;
  
-  macro->type = OTHER;
-  macro->replacements = NULL;
-
    do
      {
-      char *string;
-
        if (!lex_force_string (lexer))
         {
           msg (SE, _("String expected."));
-         return 0;
+         return false;
         }
  
-      string = lex_token_representation (lexer);
-      pool_register (pool, free, string);
-      add_replacement (ss_cstr (string), macro, pool, &used, &allocated);
+      add_replacement (dv, token_to_string (lex_next (lexer, 0)), &allocated);
  
        lex_get (lexer);
        lex_match (lexer, T_COMMA);
      }
    while (lex_token (lexer) != T_SLASH && lex_token (lexer) != T_ENDCMD);
  
-  return used;
+  return true;
  }
  \f
  int
@@ -501,128 +430,3 @@ cmd_end_repeat (struct lexer *lexer UNUSED, struct dataset *ds UNUSED)
    msg (SE, _("No matching DO REPEAT."));
    return CMD_CASCADING_FAILURE;
  }
-\f
-/* Finds a DO REPEAT macro with the given NAME and returns the
-   appropriate substitution if found, or NAME otherwise. */
-static struct substring
-find_substitution (struct repeat_block *block, struct substring name)
-{
-  struct repeat_macro *macro = find_macro (block, name);
-  return macro ? macro->replacements[block->loop_idx] : name;
-}
-
-/* Makes appropriate DO REPEAT macro substitutions within the
-   repeated lines. */
-static void
-do_repeat_filter (struct getl_interface *interface, struct string *line)
-{
-  struct repeat_block *block
-    = UP_CAST (interface, struct repeat_block, parent);
-  bool in_apos, in_quote, dot;
-  struct substring input;
-  struct string output;
-  int c;
-
-  ds_init_empty (&output);
-
-  /* Strip trailing whitespace, check for & remove terminal dot. */
-  ds_rtrim (line, ss_cstr (CC_SPACES));
-  dot = ds_chomp_byte (line, '.');
-  input = ds_ss (line);
-  in_apos = in_quote = false;
-  while ((c = ss_first (input)) != EOF)
-    {
-      if (c == '\'' && !in_quote)
-       in_apos = !in_apos;
-      else if (c == '"' && !in_apos)
-       in_quote = !in_quote;
-
-      if (in_quote || in_apos || !lex_is_id1 (c))
-        {
-          ds_put_byte (&output, c);
-          ss_advance (&input, 1);
-        }
-      else
-        {
-          struct substring id;
-          ss_get_bytes (&input, lex_id_get_length (input), &id);
-          ds_put_substring (&output, find_substitution (block, id));
-        }
-    }
-  if (dot)
-    ds_put_byte (&output, '.');
-
-  ds_swap (line, &output);
-  ds_destroy (&output);
-}
-
-static struct repeat_line *
-current_line (const struct getl_interface *interface)
-{
-  struct repeat_block *block
-    = UP_CAST (interface, struct repeat_block, parent);
-  return (block->cur_line != ll_null (&block->lines)
-          ? ll_data (block->cur_line, struct repeat_line, ll)
-          : NULL);
-}
-
-/* Function called by getl to read a line.  Puts the line in
-   OUTPUT and its syntax mode in *SYNTAX.  Returns true if a line
-   was obtained, false if the source is exhausted. */
-static bool
-do_repeat_read  (struct getl_interface *interface,
-                 struct string *output)
-{
-  struct repeat_block *block
-    = UP_CAST (interface, struct repeat_block, parent);
-  struct repeat_line *line;
-
-  block->cur_line = ll_next (block->cur_line);
-  if (block->cur_line == ll_null (&block->lines))
-    {
-      block->loop_idx++;
-      if (block->loop_idx >= block->loop_cnt)
-        return false;
-
-      block->cur_line = ll_head (&block->lines);
-    }
-
-  line = current_line (interface);
-  ds_assign_substring (output, line->text);
-  return true;
-}
-
-/* Frees a DO REPEAT block.
-   Called by getl to close out the DO REPEAT block. */
-static void
-do_repeat_close (struct getl_interface *interface)
-{
-  struct repeat_block *block
-    = UP_CAST (interface, struct repeat_block, parent);
-  pool_destroy (block->pool);
-}
-
-
-static bool
-always_false (const struct getl_interface *i UNUSED)
-{
-  return false;
-}
-
-/* Returns the name of the source file from which the previous
-   line was originally obtained, or a null pointer if none. */
-static const char *
-do_repeat_name (const struct getl_interface *interface)
-{
-  struct repeat_line *line = current_line (interface);
-  return line ? line->file_name : NULL;
-}
-
-/* Returns the line number in the source file from which the
-   previous line was originally obtained, or 0 if none. */
-static int
-do_repeat_location (const struct getl_interface *interface)
-{
-  struct repeat_line *line = current_line (interface);
-  return line ? line->line_number : 0;
-}
diff --git a/src/language/control/repeat.h b/src/language/control/repeat.h
deleted file mode 100644
index 700bf64..0000000
--- a/src/language/control/repeat.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#if !INCLUDED_REPEAT_H
-#define INCLUDED_REPEAT_H 1
-
-void perform_DO_REPEAT_substitutions (void);
-
-#endif /* repeat.h */
diff --git a/src/language/control/temporary.c b/src/language/control/temporary.c
index eda939bba3ed95d6261619b26adc8927fe3e324e..b6a5cc9878143a267ffb0cdf764b86bf1349a6b9 100644
--- a/src/language/control/temporary.c
+++ b/src/language/control/temporary.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2011 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -37,12 +37,12 @@
  
  /* Parses the TEMPORARY command. */
  int
-cmd_temporary (struct lexer *lexer, struct dataset *ds)
+cmd_temporary (struct lexer *lexer UNUSED, struct dataset *ds)
  {
    if (!proc_in_temporary_transformations (ds))
      proc_start_temporary_transformations (ds);
    else
      msg (SE, _("This command may only appear once between "
                 "procedures and procedure-like commands."));
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/language/data-io/combine-files.c b/src/language/data-io/combine-files.c
index 82c36945a2fd803de74a0ea3239c3918440c4ed3..58693c62d21d568dd152eb91fd8b0edc86639f46 100644
--- a/src/language/data-io/combine-files.c
+++ b/src/language/data-io/combine-files.c
@@ -36,6 +36,7 @@
  #include "language/stats/sort-criteria.h"
  #include "libpspp/assertion.h"
  #include "libpspp/message.h"
+#include "libpspp/string-array.h"
  #include "libpspp/taint.h"
  #include "math/sort.h"
  
@@ -491,7 +492,7 @@ static bool
  merge_dictionary (struct dictionary *const m, struct comb_file *f)
  {
    struct dictionary *d = f->dict;
-  const char *d_docs, *m_docs;
+  const struct string_array *d_docs, *m_docs;
    int i;
    const char *file_encoding;
  
@@ -525,9 +526,19 @@ merge_dictionary (struct dictionary *const m, struct comb_file *f)
          dict_set_documents (m, d_docs);
        else
          {
-          char *new_docs = xasprintf ("%s%s", m_docs, d_docs);
-          dict_set_documents (m, new_docs);
-          free (new_docs);
+          struct string_array new_docs;
+          size_t i;
+
+          new_docs.n = m_docs->n + d_docs->n;
+          new_docs.strings = xmalloc (new_docs.n * sizeof *new_docs.strings);
+          for (i = 0; i < m_docs->n; i++)
+            new_docs.strings[i] = m_docs->strings[i];
+          for (i = 0; i < d_docs->n; i++)
+            new_docs.strings[m_docs->n + i] = d_docs->strings[i];
+
+          dict_set_documents (m, &new_docs);
+
+          free (new_docs.strings);
          }
      }
  
@@ -577,7 +588,7 @@ merge_dictionary (struct dictionary *const m, struct comb_file *f)
            if (var_has_missing_values (dv) && !var_has_missing_values (mv))
              var_set_missing_values (mv, var_get_missing_values (dv));
            if (var_get_label (dv) && !var_get_label (mv))
-            var_set_label (mv, var_get_label (dv));
+            var_set_label (mv, var_get_label (dv), file_encoding, false);
          }
        else
          mv = dict_clone_var_assert (m, dv);
diff --git a/src/language/data-io/data-list.c b/src/language/data-io/data-list.c
index 043b424db77a6e3303fd845438ef8ff2911f164c..6d21c8439e7cb9e3a7e5c3ebc31744a4dd92c437 100644
--- a/src/language/data-io/data-list.c
+++ b/src/language/data-io/data-list.c
@@ -204,6 +204,7 @@ cmd_data_list (struct lexer *lexer, struct dataset *ds)
                          }
                        else
                          {
+                          /* XXX should support multibyte UTF-8 characters */
                            lex_error (lexer, NULL);
                            ds_destroy (&delims);
                            goto error;
@@ -330,7 +331,7 @@ parse_fixed (struct lexer *lexer, struct dictionary *dict,
  
        /* Parse everything. */
        if (!parse_record_placement (lexer, &record, &column)
-          || !parse_DATA_LIST_vars_pool (lexer, tmp_pool,
+          || !parse_DATA_LIST_vars_pool (lexer, dict, tmp_pool,
                                          &names, &name_cnt, PV_NONE)
            || !parse_var_placements (lexer, tmp_pool, name_cnt, true,
                                      &formats, &format_cnt))
@@ -422,7 +423,7 @@ parse_free (struct lexer *lexer, struct dictionary *dict,
        size_t name_cnt;
        size_t i;
  
-      if (!parse_DATA_LIST_vars_pool (lexer, tmp_pool,
+      if (!parse_DATA_LIST_vars_pool (lexer, dict, tmp_pool,
                                       &name, &name_cnt, PV_NONE))
         return false;
  
diff --git a/src/language/data-io/data-parser.c b/src/language/data-io/data-parser.c
index 630363af53f6ce2a077028474dd1d0386bde0fd6..0c52e07bf9d843605a3580b5bad13b5c2f4b2601 100644
--- a/src/language/data-io/data-parser.c
+++ b/src/language/data-io/data-parser.c
@@ -508,10 +508,11 @@ parse_error (const struct dfm_reader *reader, const struct field *field,
  
    m.category = MSG_C_DATA;
    m.severity = MSG_S_WARNING;
-  m.where.file_name = CONST_CAST (char *, dfm_get_file_name (reader));
-  m.where.line_number = dfm_get_line_number (reader);
-  m.where.first_column = first_column;
-  m.where.last_column = last_column;
+  m.file_name = CONST_CAST (char *, dfm_get_file_name (reader));
+  m.first_line = dfm_get_line_number (reader);
+  m.last_line = m.first_line + 1;
+  m.first_column = first_column;
+  m.last_column = last_column;
    m.text = xasprintf (_("Data for variable %s is not valid as format %s: %s"),
                        field->name, fmt_name (field->format.type), error);
    msg_emit (&m);
diff --git a/src/language/data-io/data-reader.c b/src/language/data-io/data-reader.c
index 87fa8c92efa7f8da06814633f7bd1f017a7e13f2..e701a936d241f5ad749c40322fa57f5164ba16b5 100644
--- a/src/language/data-io/data-reader.c
+++ b/src/language/data-io/data-reader.c
@@ -32,7 +32,6 @@
  #include "language/command.h"
  #include "language/data-io/file-handle.h"
  #include "language/lexer/lexer.h"
-#include "language/prompt.h"
  #include "libpspp/assertion.h"
  #include "libpspp/cast.h"
  #include "libpspp/integer-format.h"
@@ -53,6 +52,7 @@ enum dfm_reader_flags
      DFM_SAW_BEGIN_DATA = 004,   /* For inline_file only, whether we've
                                     already read a BEGIN DATA line. */
      DFM_TABS_EXPANDED = 010,    /* Tabs have been expanded. */
+    DFM_CONSUME = 020           /* read_inline_record() should get a token? */
    };
  
  /* Data file reader. */
@@ -60,7 +60,7 @@ struct dfm_reader
    {
      struct file_handle *fh;     /* File handle. */
      struct fh_lock *lock;       /* Mutual exclusion lock for file. */
-    struct msg_locator where;   /* Current location in data file. */
+    int line_number;            /* Current line or record number. */
      struct string line;         /* Current line. */
      struct string scratch;      /* Extra line buffer. */
      enum dfm_reader_flags flags; /* Zero or more of DFM_*. */
@@ -141,8 +141,7 @@ dfm_open_reader (struct file_handle *fh, struct lexer *lexer)
    if (fh_get_referent (fh) != FH_REF_INLINE)
      {
        struct stat s;
-      r->where.file_name = CONST_CAST (char *, fh_get_file_name (fh));
-      r->where.line_number = 0;
+      r->line_number = 0;
        r->file = fn_open (fh_get_file_name (fh), "rb");
        if (r->file == NULL)
          {
@@ -177,33 +176,37 @@ read_inline_record (struct dfm_reader *r)
    if ((r->flags & DFM_SAW_BEGIN_DATA) == 0)
      {
        r->flags |= DFM_SAW_BEGIN_DATA;
+      r->flags &= ~DFM_CONSUME;
  
        while (lex_token (r->lexer) == T_ENDCMD)
          lex_get (r->lexer);
-      if (!lex_force_match_id (r->lexer, "BEGIN") || !lex_force_match_id (r->lexer, "DATA"))
+
+      if (!lex_force_match_id (r->lexer, "BEGIN")
+          || !lex_force_match_id (r->lexer, "DATA"))
          return false;
-      prompt_set_style (PROMPT_DATA);
-    }
  
-  if (!lex_get_line_raw (r->lexer))
-    {
-      lex_discard_line (r->lexer);
-      msg (SE, _("Unexpected end-of-file while reading data in BEGIN "
-                 "DATA.  This probably indicates "
-                 "a missing or incorrectly formatted END DATA command.  "
-                 "END DATA must appear by itself on a single line "
-                 "with exactly one space between words."));
-      return false;
+      lex_match (r->lexer, T_ENDCMD);
      }
  
-  if (ds_length (lex_entire_line_ds (r->lexer) ) >= 8
-      && !strncasecmp (lex_entire_line (r->lexer), "end data", 8))
+  if (r->flags & DFM_CONSUME)
+    lex_get (r->lexer);
+
+  if (!lex_is_string (r->lexer))
      {
-      lex_discard_line (r->lexer);
+      if (!lex_match_id (r->lexer, "END") || !lex_match_id (r->lexer, "DATA"))
+        {
+          msg (SE, _("Missing END DATA while reading inline data.  "
+                     "This probably indicates a missing or incorrectly "
+                     "formatted END DATA command.  END DATA must appear "
+                     "by itself on a single line with exactly one space "
+                     "between words."));
+          lex_discard_rest_of_command (r->lexer);
+        }
        return false;
      }
  
-  ds_assign_string (&r->line, lex_entire_line_ds (r->lexer) );
+  ds_assign_substring (&r->line, lex_tokss (r->lexer));
+  r->flags |= DFM_CONSUME;
  
    return true;
  }
@@ -480,7 +483,7 @@ read_record (struct dfm_reader *r)
      {
        bool ok = read_file_record (r);
        if (ok)
-        r->where.line_number++;
+        r->line_number++;
        return ok;
      }
    else
@@ -678,13 +681,15 @@ dfm_get_column (const struct dfm_reader *r, const char *p)
  const char *
  dfm_get_file_name (const struct dfm_reader *r)
  {
-  return fh_get_referent (r->fh) == FH_REF_FILE ? r->where.file_name : NULL;
+  return (fh_get_referent (r->fh) == FH_REF_FILE
+          ? fh_get_file_name (r->fh)
+          : NULL);
  }
  
  int
  dfm_get_line_number (const struct dfm_reader *r)
  {
-  return fh_get_referent (r->fh) == FH_REF_FILE ? r->where.line_number : -1;
+  return fh_get_referent (r->fh) == FH_REF_FILE ? r->line_number : -1;
  }
  \f
  /* BEGIN DATA...END DATA procedure. */
@@ -702,13 +707,14 @@ cmd_begin_data (struct lexer *lexer, struct dataset *ds)
                   "input program does not access the inline file."));
        return CMD_CASCADING_FAILURE;
      }
+  lex_match (lexer, T_ENDCMD);
  
    /* Open inline file. */
    r = dfm_open_reader (fh_inline_file (), lexer);
    r->flags |= DFM_SAW_BEGIN_DATA;
+  r->flags &= ~DFM_CONSUME;
  
    /* Input procedure reads from inline file. */
-  prompt_set_style (PROMPT_DATA);
    casereader_destroy (proc_open (ds));
    ok = proc_commit (ds);
    dfm_close_reader (r);
diff --git a/src/language/data-io/file-handle.q b/src/language/data-io/file-handle.q
index e847e00f40367fdea283bd482ebeb73441677e7e..7e7cdcdf9b839969b883aa43d5db54c2c20d5ffc 100644
--- a/src/language/data-io/file-handle.q
+++ b/src/language/data-io/file-handle.q
@@ -54,30 +54,32 @@ cmd_file_handle (struct lexer *lexer, struct dataset *ds)
  {
    struct cmd_file_handle cmd;
    struct file_handle *handle;
+  enum cmd_result result;
    char *handle_name;
  
+  result = CMD_CASCADING_FAILURE;
    if (!lex_force_id (lexer))
-    goto error;
-  handle_name = xstrdup (lex_tokcstr (lexer));
+    goto exit;
  
+  handle_name = xstrdup (lex_tokcstr (lexer));
    handle = fh_from_id (handle_name);
    if (handle != NULL)
      {
        msg (SE, _("File handle %s is already defined.  "
                   "Use CLOSE FILE HANDLE before redefining a file handle."),
            handle_name);
-      goto error;
+      goto exit_free_handle_name;
      }
  
    lex_get (lexer);
    if (!lex_force_match (lexer, T_SLASH))
-    goto error_free_handle_name;
+    goto exit_free_handle_name;
  
    if (!parse_file_handle (lexer, ds, &cmd, NULL))
-    goto error_free_handle_name;
+    goto exit_free_handle_name;
  
    if (lex_end_of_command (lexer) != CMD_SUCCESS)
-    goto error_free_cmd;
+    goto exit_free_cmd;
  
    if (cmd.mode != FH_SCRATCH)
      {
@@ -86,7 +88,7 @@ cmd_file_handle (struct lexer *lexer, struct dataset *ds)
        if (cmd.s_name == NULL)
          {
            lex_sbc_missing (lexer, "NAME");
-          goto error_free_cmd;
+          goto exit_free_cmd;
          }
  
        switch (cmd.mode)
@@ -119,7 +121,7 @@ cmd_file_handle (struct lexer *lexer, struct dataset *ds)
            else
              {
                msg (SE, _("RECFORM must be specified with MODE=360."));
-              goto error_free_cmd;
+              goto exit_free_cmd;
              }
            break;
          default:
@@ -145,15 +147,14 @@ cmd_file_handle (struct lexer *lexer, struct dataset *ds)
    else
      fh_create_scratch (handle_name);
  
-  free_file_handle (&cmd);
-  return CMD_SUCCESS;
+  result = CMD_SUCCESS;
  
-error_free_cmd:
+exit_free_cmd:
    free_file_handle (&cmd);
-error_free_handle_name:
+exit_free_handle_name:
    free (handle_name);
-error:
-  return CMD_CASCADING_FAILURE;
+exit:
+  return result;
  }
  
  int
diff --git a/src/language/data-io/get-data.c b/src/language/data-io/get-data.c
index 7e75b413ea49cf82fcb87a81daea8ba570e4b905..f65e8ac782547971f63f0bb3210eaae6040340d5 100644
--- a/src/language/data-io/get-data.c
+++ b/src/language/data-io/get-data.c
@@ -31,6 +31,7 @@
  #include "language/data-io/placement-parser.h"
  #include "language/lexer/format-parser.h"
  #include "language/lexer/lexer.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  
  #include "gl/xalloc.h"
@@ -152,7 +153,7 @@ parse_get_gnm (struct lexer *lexer, struct dataset *ds)
    if (!lex_force_string (lexer))
      goto error;
  
-  gri.file_name = ss_xstrdup (lex_tokss (lexer));
+  gri.file_name = utf8_to_filename (lex_tokcstr (lexer));
  
    lex_get (lexer);
  
@@ -418,6 +419,7 @@ parse_get_txt (struct lexer *lexer, struct dataset *ds)
            if (!lex_force_string (lexer))
              goto error;
  
+          /* XXX should support multibyte UTF-8 characters */
            s = lex_tokss (lexer);
            if (ss_match_string (&s, ss_cstr ("\\t")))
              ds_put_cstr (&hard_seps, "\t");
@@ -443,6 +445,7 @@ parse_get_txt (struct lexer *lexer, struct dataset *ds)
            if (!lex_force_string (lexer))
              goto error;
  
+          /* XXX should support multibyte UTF-8 characters */
            if (settings_get_syntax () == COMPATIBLE
                && ss_length (lex_tokss (lexer)) != 1)
              {
@@ -500,7 +503,8 @@ parse_get_txt (struct lexer *lexer, struct dataset *ds)
            lex_get (lexer);
          }
  
-      if (!lex_force_id (lexer))
+      if (!lex_force_id (lexer)
+          || !dict_id_is_valid (dict, lex_tokcstr (lexer), true))
          goto error;
        name = xstrdup (lex_tokcstr (lexer));
        lex_get (lexer);
diff --git a/src/language/data-io/inpt-pgm.c b/src/language/data-io/inpt-pgm.c
index 415c48c7be46ad2e0918f69bbee781d55fc80d57..72d5a2e7b88e302df07e4ca54e59c206cbdebae5 100644
--- a/src/language/data-io/inpt-pgm.c
+++ b/src/language/data-io/inpt-pgm.c
@@ -16,7 +16,6 @@
  
  #include <config.h>
  
-
  #include <float.h>
  #include <stdlib.h>
  
@@ -47,8 +46,7 @@
  /* Private result codes for use within INPUT PROGRAM. */
  enum cmd_result_extensions
    {
-    CMD_END_INPUT_PROGRAM = CMD_PRIVATE_FIRST,
-    CMD_END_CASE
+    CMD_END_CASE = CMD_PRIVATE_FIRST
    };
  
  /* Indicates how a `union value' should be initialized. */
@@ -95,7 +93,7 @@ cmd_input_program (struct lexer *lexer, struct dataset *ds)
    bool saw_END_CASE = false;
  
    proc_discard_active_file (ds);
-  if (lex_token (lexer) != T_ENDCMD)
+  if (!lex_match (lexer, T_ENDCMD))
      return lex_end_of_command (lexer);
  
    inp = xmalloc (sizeof *inp);
@@ -104,12 +102,12 @@ cmd_input_program (struct lexer *lexer, struct dataset *ds)
    inp->proto = NULL;
  
    inside_input_program = true;
-  for (;;)
+  while (!lex_match_phrase (lexer, "END INPUT PROGRAM"))
      {
-      enum cmd_result result = cmd_parse_in_state (lexer, ds, CMD_STATE_INPUT_PROGRAM);
-      if (result == CMD_END_INPUT_PROGRAM)
-        break;
-      else if (result == CMD_END_CASE)
+      enum cmd_result result;
+
+      result = cmd_parse_in_state (lexer, ds, CMD_STATE_INPUT_PROGRAM);
+      if (result == CMD_END_CASE)
          {
            emit_END_CASE (ds, inp);
            saw_END_CASE = true;
@@ -156,8 +154,12 @@ cmd_input_program (struct lexer *lexer, struct dataset *ds)
  int
  cmd_end_input_program (struct lexer *lexer UNUSED, struct dataset *ds UNUSED)
  {
-  assert (in_input_program ());
-  return CMD_END_INPUT_PROGRAM;
+  /* Inside INPUT PROGRAM, this should get caught at the top of the loop in
+     cmd_input_program().
+
+     Outside of INPUT PROGRAM, the command parser should reject this
+     command. */
+  NOT_REACHED ();
  }
  
  /* Returns true if STATE is valid given the transformations that
@@ -237,7 +239,7 @@ cmd_end_case (struct lexer *lexer, struct dataset *ds UNUSED)
    assert (in_input_program ());
    if (lex_token (lexer) == T_ENDCMD)
      return CMD_END_CASE;
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Outputs the current case */
@@ -348,13 +350,13 @@ reread_trns_free (void *t_)
  
  /* Parses END FILE command. */
  int
-cmd_end_file (struct lexer *lexer, struct dataset *ds)
+cmd_end_file (struct lexer *lexer UNUSED, struct dataset *ds)
  {
    assert (in_input_program ());
  
    add_transformation (ds, end_file_trns_proc, NULL, NULL);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Executes an END FILE transformation. */
diff --git a/src/language/data-io/save-translate.c b/src/language/data-io/save-translate.c
index 213f0bf735e40fe6b9ead6aac5aa66e3a1d698ad..d4c67b02b69bdd59a23587ee2eded16688a37f05 100644
--- a/src/language/data-io/save-translate.c
+++ b/src/language/data-io/save-translate.c
@@ -159,6 +159,7 @@ cmd_save_translate (struct lexer *lexer, struct dataset *ds)
                    lex_match (lexer, T_EQUALS);
                    if (!lex_force_string (lexer))
                      goto error;
+                  /* XXX should support multibyte UTF-8 delimiters */
                    if (ss_length (lex_tokss (lexer)) != 1)
                      {
                        msg (SE, _("The %s string must contain exactly one "
@@ -173,6 +174,7 @@ cmd_save_translate (struct lexer *lexer, struct dataset *ds)
                    lex_match (lexer, T_EQUALS);
                    if (!lex_force_string (lexer))
                      goto error;
+                  /* XXX should support multibyte UTF-8 qualifiers */
                    if (ss_length (lex_tokss (lexer)) != 1)
                      {
                        msg (SE, _("The %s string must contain exactly one "
diff --git a/src/language/data-io/trim.c b/src/language/data-io/trim.c
index ac8bf272f8528f2ebc5878e0ce0c64d8e4a1c9e3..63041f25f10465dde398a80b863a9d46bf186340 100644
--- a/src/language/data-io/trim.c
+++ b/src/language/data-io/trim.c
@@ -81,7 +81,8 @@ parse_dict_rename (struct lexer *lexer, struct dictionary *dict)
        if (v == NULL)
         return 0;
        if (!lex_force_match (lexer, T_EQUALS)
-         || !lex_force_id (lexer))
+         || !lex_force_id (lexer)
+          || !dict_id_is_valid (dict, lex_tokcstr (lexer), true))
         return 0;
        if (dict_lookup_var (dict, lex_tokcstr (lexer)) != NULL)
         {
@@ -114,7 +115,7 @@ parse_dict_rename (struct lexer *lexer, struct dictionary *dict)
           msg (SE, _("`=' expected after variable list."));
           goto done;
         }
-      if (!parse_DATA_LIST_vars (lexer, &new_names, &nn,
+      if (!parse_DATA_LIST_vars (lexer, dict, &new_names, &nn,
                                   PV_APPEND | PV_NO_SCRATCH | PV_NO_DUPLICATE))
         goto done;
        if (nn != nv)
diff --git a/src/language/dictionary/apply-dictionary.c b/src/language/dictionary/apply-dictionary.c
index 7fdbdd472f8e463b7f5677ef23d15ba38b410360..36877302afc80dfa878685b211b9a0dcaa4f70f5 100644
--- a/src/language/dictionary/apply-dictionary.c
+++ b/src/language/dictionary/apply-dictionary.c
@@ -79,12 +79,9 @@ cmd_apply_dictionary (struct lexer *lexer, struct dataset *ds)
           continue;
         }
  
-      if (var_get_label (s))
-        {
-          const char *label = var_get_label (s);
-          if (strcspn (label, " ") != strlen (label))
-            var_set_label (t, label);
-        }
+      if (var_has_label (s))
+        var_set_label (t, var_get_label (s),
+                       dict_get_encoding (dataset_dict (ds)), false);
  
        if (var_has_value_labels (s))
          {
@@ -129,5 +126,5 @@ cmd_apply_dictionary (struct lexer *lexer, struct dataset *ds)
          dict_set_weight (dataset_dict (ds), new_weight);
      }
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/language/dictionary/attributes.c b/src/language/dictionary/attributes.c
index b0c9ddfd47361e38640432fa9c1a5a9e01069095..135207306f528b1f16b027762d355f99aba53065 100644
--- a/src/language/dictionary/attributes.c
+++ b/src/language/dictionary/attributes.c
@@ -32,21 +32,26 @@
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
  
-static enum cmd_result parse_attributes (struct lexer *, struct attrset **,
-                                         size_t n);
+static enum cmd_result parse_attributes (struct lexer *,
+                                         const char *dict_encoding,
+                                         struct attrset **, size_t n);
  
  /* Parses the DATAFILE ATTRIBUTE command. */
  int
  cmd_datafile_attribute (struct lexer *lexer, struct dataset *ds)
  {
-  struct attrset *set = dict_get_attributes (dataset_dict (ds));
-  return parse_attributes (lexer, &set, 1);
+  struct dictionary *dict = dataset_dict (ds);
+  struct attrset *set = dict_get_attributes (dict);
+  return parse_attributes (lexer, dict_get_encoding (dict), &set, 1);
  }
  
  /* Parses the VARIABLE ATTRIBUTE command. */
  int
  cmd_variable_attribute (struct lexer *lexer, struct dataset *ds)
  {
+  struct dictionary *dict = dataset_dict (ds);
+  const char *dict_encoding = dict_get_encoding (dict);
+
    do 
      {
        struct variable **vars;
@@ -56,15 +61,14 @@ cmd_variable_attribute (struct lexer *lexer, struct dataset *ds)
  
        if (!lex_force_match_id (lexer, "VARIABLES")
            || !lex_force_match (lexer, T_EQUALS)
-          || !parse_variables (lexer, dataset_dict (ds), &vars, &n_vars,
-                               PV_NONE))
+          || !parse_variables (lexer, dict, &vars, &n_vars, PV_NONE))
          return CMD_FAILURE;
  
        sets = xmalloc (n_vars * sizeof *sets);
        for (i = 0; i < n_vars; i++)
          sets[i] = var_get_attributes (vars[i]);
  
-      ok = parse_attributes (lexer, sets, n_vars);
+      ok = parse_attributes (lexer, dict_encoding, sets, n_vars);
        free (vars);
        free (sets);
        if (!ok)
@@ -72,33 +76,21 @@ cmd_variable_attribute (struct lexer *lexer, struct dataset *ds)
      }
    while (lex_match (lexer, T_SLASH));
  
-  return lex_end_of_command (lexer);
-}
-
-static bool
-match_subcommand (struct lexer *lexer, const char *keyword) 
-{
-  if (lex_token (lexer) == T_ID
-      && lex_id_match (lex_tokss (lexer), ss_cstr (keyword))
-      && lex_look_ahead (lexer) == T_EQUALS)
-    {
-      lex_get (lexer);          /* Skip keyword. */
-      lex_get (lexer);          /* Skip '='. */
-      return true;
-    }
-  else
-    return false;
+  return CMD_SUCCESS;
  }
  
-/* Parses an attribute name optionally followed by an index inside square
-   brackets.  Returns the attribute name or NULL if there was a parse error.
-   Stores the index into *INDEX. */
+/* Parses an attribute name and verifies that it is valid in DICT_ENCODING,
+   optionally followed by an index inside square brackets.  Returns the
+   attribute name or NULL if there was a parse error.  Stores the index into
+   *INDEX. */
  static char *
-parse_attribute_name (struct lexer *lexer, size_t *index)
+parse_attribute_name (struct lexer *lexer, const char *dict_encoding,
+                      size_t *index)
  {
    char *name;
  
-  if (!lex_force_id (lexer))
+  if (!lex_force_id (lexer)
+      || !id_is_valid (lex_tokcstr (lexer), dict_encoding, true))
      return NULL;
    name = xstrdup (lex_tokcstr (lexer));
    lex_get (lexer);
@@ -127,13 +119,14 @@ error:
  }
  
  static bool
-add_attribute (struct lexer *lexer, struct attrset **sets, size_t n) 
+add_attribute (struct lexer *lexer, const char *dict_encoding,
+               struct attrset **sets, size_t n) 
  {
    const char *value;
    size_t index, i;
    char *name;
  
-  name = parse_attribute_name (lexer, &index);
+  name = parse_attribute_name (lexer, dict_encoding, &index);
    if (name == NULL)
      return false;
    if (!lex_force_match (lexer, T_LPAREN) || !lex_force_string (lexer))
@@ -160,12 +153,13 @@ add_attribute (struct lexer *lexer, struct attrset **sets, size_t n)
  }
  
  static bool
-delete_attribute (struct lexer *lexer, struct attrset **sets, size_t n) 
+delete_attribute (struct lexer *lexer, const char *dict_encoding,
+                  struct attrset **sets, size_t n) 
  {
    size_t index, i;
    char *name;
  
-  name = parse_attribute_name (lexer, &index);
+  name = parse_attribute_name (lexer, dict_encoding, &index);
    if (name == NULL)
      return false;
  
@@ -191,14 +185,15 @@ delete_attribute (struct lexer *lexer, struct attrset **sets, size_t n)
  }
  
  static enum cmd_result
-parse_attributes (struct lexer *lexer, struct attrset **sets, size_t n) 
+parse_attributes (struct lexer *lexer, const char *dict_encoding,
+                  struct attrset **sets, size_t n) 
  {
    enum { UNKNOWN, ADD, DELETE } command = UNKNOWN;
    do 
      {
-      if (match_subcommand (lexer, "ATTRIBUTE"))
+      if (lex_match_phrase (lexer, "ATTRIBUTE="))
          command = ADD;
-      else if (match_subcommand (lexer, "DELETE"))
+      else if (lex_match_phrase (lexer, "DELETE="))
          command = DELETE;
        else if (command == UNKNOWN)
          {
@@ -207,8 +202,8 @@ parse_attributes (struct lexer *lexer, struct attrset **sets, size_t n)
          }
  
        if (!(command == ADD
-            ? add_attribute (lexer, sets, n)
-            : delete_attribute (lexer, sets, n)))
+            ? add_attribute (lexer, dict_encoding, sets, n)
+            : delete_attribute (lexer, dict_encoding, sets, n)))
          return CMD_FAILURE;
      }
    while (lex_token (lexer) != T_SLASH && lex_token (lexer) != T_ENDCMD);
diff --git a/src/language/dictionary/missing-values.c b/src/language/dictionary/missing-values.c
index ace9948412f6fdf3a232fae8ee15cf31bf669afb..3ff4c426ff691771c0051118e064a7056cd71029 100644
--- a/src/language/dictionary/missing-values.c
+++ b/src/language/dictionary/missing-values.c
@@ -19,6 +19,7 @@
  #include <stdlib.h>
  
  #include "data/data-in.h"
+#include "data/dictionary.h"
  #include "data/format.h"
  #include "data/missing-values.h"
  #include "data/procedure.h"
@@ -28,6 +29,7 @@
  #include "language/lexer/lexer.h"
  #include "language/lexer/value-parser.h"
  #include "language/lexer/variable-parser.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/str.h"
  
@@ -37,21 +39,21 @@
  int
  cmd_missing_values (struct lexer *lexer, struct dataset *ds)
  {
+  struct dictionary *dict = dataset_dict (ds);
    struct variable **v = NULL;
    size_t nv;
  
-  int retval = CMD_FAILURE;
-  bool deferred_errors = false;
+  bool ok = true;
  
    while (lex_token (lexer) != T_ENDCMD)
      {
        size_t i;
  
-      if (!parse_variables (lexer, dataset_dict (ds), &v, &nv, PV_NONE))
-        goto done;
+      if (!parse_variables (lexer, dict, &v, &nv, PV_NONE))
+        goto error;
  
        if (!lex_force_match (lexer, T_LPAREN))
-        goto done;
+        goto error;
  
        for (i = 0; i < nv; i++)
          var_clear_missing_values (v[i]);
@@ -68,7 +70,7 @@ cmd_missing_values (struct lexer *lexer, struct dataset *ds)
                  msg (SE, _("Cannot mix numeric variables (e.g. %s) and "
                             "string variables (e.g. %s) within a single list."),
                       var_get_name (n), var_get_name (s));
-                goto done;
+                goto error;
                }
  
            if (var_is_numeric (v[0]))
@@ -81,13 +83,13 @@ cmd_missing_values (struct lexer *lexer, struct dataset *ds)
                    bool ok;
  
                    if (!parse_num_range (lexer, &x, &y, &type))
-                    goto done;
+                    goto error;
  
                    ok = (x == y
                          ? mv_add_num (&mv, x)
                          : mv_add_range (&mv, x, y));
                    if (!ok)
-                    deferred_errors = true;
+                    ok = false;
  
                    lex_match (lexer, T_COMMA);
                  }
@@ -98,27 +100,33 @@ cmd_missing_values (struct lexer *lexer, struct dataset *ds)
                while (!lex_match (lexer, T_RPAREN))
                  {
                    uint8_t value[MV_MAX_STRING];
+                  char *dict_mv;
                    size_t length;
  
                    if (!lex_force_string (lexer))
                      {
-                      deferred_errors = true;
+                      ok = false;
                        break;
                      }
  
-                  length = ss_length (lex_tokss (lexer));
+                  dict_mv = recode_string (dict_get_encoding (dict), "UTF-8",
+                                           lex_tokcstr (lexer),
+                                           ss_length (lex_tokss (lexer)));
+                  length = strlen (dict_mv);
                    if (length > MV_MAX_STRING)
                      {
+                      /* XXX truncate graphemes not bytes */
                        msg (SE, _("Truncating missing value to maximum "
                                   "acceptable length (%d bytes)."),
                             MV_MAX_STRING);
                        length = MV_MAX_STRING;
                      }
                    memset (value, ' ', MV_MAX_STRING);
-                  memcpy (value, ss_data (lex_tokss (lexer)), length);
+                  memcpy (value, dict_mv, length);
+                  free (dict_mv);
  
                    if (!mv_add_str (&mv, value))
-                    deferred_errors = true;
+                    ok = false;
  
                    lex_get (lexer);
                    lex_match (lexer, T_COMMA);
@@ -134,7 +142,7 @@ cmd_missing_values (struct lexer *lexer, struct dataset *ds)
                    msg (SE, _("Missing values provided are too long to assign "
                               "to variable of width %d."),
                         var_get_width (v[i]));
-                  deferred_errors = true;
+                  ok = false;
                  }
              }
  
@@ -145,12 +153,12 @@ cmd_missing_values (struct lexer *lexer, struct dataset *ds)
        free (v);
        v = NULL;
      }
-  retval = lex_end_of_command (lexer);
  
- done:
    free (v);
-  if (deferred_errors)
-    retval = CMD_FAILURE;
-  return retval;
+  return ok ? CMD_SUCCESS : CMD_FAILURE;
+
+error:
+  free (v);
+  return CMD_FAILURE;
  }
  
diff --git a/src/language/dictionary/modify-variables.c b/src/language/dictionary/modify-variables.c
index afb9089986b95b23babeb63e1e4947a1caed4205..f3c319001644b77a857c73a433aca7a598087f81 100644
--- a/src/language/dictionary/modify-variables.c
+++ b/src/language/dictionary/modify-variables.c
@@ -200,8 +200,8 @@ cmd_modify_vars (struct lexer *lexer, struct dataset *ds)
                        "names on RENAME subcommand."));
                   goto done;
                 }
-             if (!parse_DATA_LIST_vars (lexer, &vm.new_names,
-                                        &prev_nv_1, PV_APPEND))
+             if (!parse_DATA_LIST_vars (lexer, dataset_dict (ds),
+                                         &vm.new_names, &prev_nv_1, PV_APPEND))
                 goto done;
               if (prev_nv_1 != vm.rename_cnt)
                 {
diff --git a/src/language/dictionary/mrsets.c b/src/language/dictionary/mrsets.c
index c775f49784eab659d715cc0bb2408a7c20ad21aa..3af5d033370e9b05703580eecc4a2a64cf20e63e 100644
--- a/src/language/dictionary/mrsets.c
+++ b/src/language/dictionary/mrsets.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 2010 Free Software Foundation, Inc.
+   Copyright (C) 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -27,6 +27,7 @@
  #include "language/lexer/variable-parser.h"
  #include "libpspp/assertion.h"
  #include "libpspp/hmap.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/str.h"
  #include "libpspp/stringi-map.h"
@@ -69,7 +70,7 @@ cmd_mrsets (struct lexer *lexer, struct dataset *ds)
          return CMD_FAILURE;
      }
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  static bool
@@ -91,15 +92,10 @@ parse_group (struct lexer *lexer, struct dictionary *dict,
      {
        if (lex_match_id (lexer, "NAME"))
          {
-          if (!lex_force_match (lexer, T_EQUALS) || !lex_force_id (lexer))
+          if (!lex_force_match (lexer, T_EQUALS) || !lex_force_id (lexer)
+              || !mrset_is_valid_name (lex_tokcstr (lexer),
+                                       dict_get_encoding (dict), true))
              goto error;
-          if (lex_tokcstr (lexer)[0] != '$')
-            {
-              msg (SE, _("%s is not a valid name for a multiple response "
-                         "set.  Multiple response set names must begin with "
-                         "`$'."), lex_tokcstr (lexer));
-              goto error;
-            }
  
            free (mrset->name);
            mrset->name = xstrdup (lex_tokcstr (lexer));
@@ -159,12 +155,15 @@ parse_group (struct lexer *lexer, struct dictionary *dict,
              }
            else if (lex_is_string (lexer))
              {
-              const char *s = lex_tokcstr (lexer);
-              int width;
+              size_t width;
+              char *s;
+
+              s = recode_string (dict_get_encoding (dict), "UTF-8",
+                                 lex_tokcstr (lexer), -1);
+              width = strlen (s);
  
                /* Trim off trailing spaces, but don't trim the string until
                   it's empty because a width of 0 is a numeric type. */
-              width = strlen (s);
                while (width > 1 && s[width - 1] == ' ')
                  width--;
  
@@ -172,6 +171,8 @@ parse_group (struct lexer *lexer, struct dictionary *dict,
                value_init (&mrset->counted, width);
                memcpy (value_str_rw (&mrset->counted, width), s, width);
                mrset->width = width;
+
+              free (s);
              }
            else
              {
diff --git a/src/language/dictionary/numeric.c b/src/language/dictionary/numeric.c
index 25b2ecf8f06719fb43cceb3ba0467448fc17049e..c88d514345c9d2986364553dfcabc4aff19c9b43 100644
--- a/src/language/dictionary/numeric.c
+++ b/src/language/dictionary/numeric.c
@@ -49,7 +49,8 @@ cmd_numeric (struct lexer *lexer, struct dataset *ds)
          be used. */
        struct fmt_spec f;
  
-      if (!parse_DATA_LIST_vars (lexer, &v, &nv, PV_NO_DUPLICATE))
+      if (!parse_DATA_LIST_vars (lexer, dataset_dict (ds),
+                                 &v, &nv, PV_NO_DUPLICATE))
         return CMD_FAILURE;
  
        /* Get the optional format specification. */
@@ -98,7 +99,7 @@ cmd_numeric (struct lexer *lexer, struct dataset *ds)
      }
    while (lex_match (lexer, T_SLASH));
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  
    /* If we have an error at a point where cleanup is required,
       flow-of-control comes here. */
@@ -127,7 +128,8 @@ cmd_string (struct lexer *lexer, struct dataset *ds)
  
    do
      {
-      if (!parse_DATA_LIST_vars (lexer, &v, &nv, PV_NO_DUPLICATE))
+      if (!parse_DATA_LIST_vars (lexer, dataset_dict (ds),
+                                 &v, &nv, PV_NO_DUPLICATE))
         return CMD_FAILURE;
  
        if (!lex_force_match (lexer, T_LPAREN)
@@ -164,7 +166,7 @@ cmd_string (struct lexer *lexer, struct dataset *ds)
      }
    while (lex_match (lexer, T_SLASH));
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  
    /* If we have an error at a point where cleanup is required,
       flow-of-control comes here. */
@@ -190,5 +192,5 @@ cmd_leave (struct lexer *lexer, struct dataset *ds)
      var_set_leave (v[i], true);
    free (v);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/language/dictionary/rename-variables.c b/src/language/dictionary/rename-variables.c
index 437450451075797d0dc022ef808231c726243691..c1d31996bacfa453b289cfeaa73b1488ceab2bf5 100644
--- a/src/language/dictionary/rename-variables.c
+++ b/src/language/dictionary/rename-variables.c
@@ -66,7 +66,8 @@ cmd_rename_variables (struct lexer *lexer, struct dataset *ds)
           msg (SE, _("`=' expected between lists of new and old variable names."));
           goto lossage;
         }
-      if (!parse_DATA_LIST_vars (lexer, &rename_new_names, &prev_nv_1,
+      if (!parse_DATA_LIST_vars (lexer, dataset_dict (ds),
+                                 &rename_new_names, &prev_nv_1,
                                   PV_APPEND | PV_NO_DUPLICATE))
         goto lossage;
        if (prev_nv_1 != rename_cnt)
diff --git a/src/language/dictionary/split-file.c b/src/language/dictionary/split-file.c
index 02e7696b4b1f339d2543242b862cf0c59db0af51..56e3eeef2182bc24d3efe5ad7f7b0beb3d48f813 100644
--- a/src/language/dictionary/split-file.c
+++ b/src/language/dictionary/split-file.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2009, 2011 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2009, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -59,7 +59,7 @@ cmd_split_file (struct lexer *lexer, struct dataset *ds)
        free (v);
      }
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Dumps out the values of all the split variables for the case C. */
diff --git a/src/language/dictionary/sys-file-info.c b/src/language/dictionary/sys-file-info.c
index 952930b5bad39332d0b1f70d5e76e706568ba930..51b3cae4952933cf7baddaf5d4203a65e3db3cd6 100644
--- a/src/language/dictionary/sys-file-info.c
+++ b/src/language/dictionary/sys-file-info.c
@@ -37,6 +37,7 @@
  #include "libpspp/array.h"
  #include "libpspp/message.h"
  #include "libpspp/misc.h"
+#include "libpspp/string-array.h"
  #include "output/tab.h"
  
  #include "gl/minmax.h"
@@ -163,7 +164,7 @@ cmd_sysfile_info (struct lexer *lexer, struct dataset *ds UNUSED)
    dict_destroy (d);
  
    fh_unref (h);
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  \f
  /* DISPLAY utility. */
@@ -210,7 +211,7 @@ cmd_display (struct lexer *lexer, struct dataset *ds)
        if (lex_match_id (lexer, "VECTORS"))
         {
           display_vectors (dataset_dict(ds), sorted);
-         return lex_end_of_command (lexer);
+         return CMD_SUCCESS;
         }
        else if (lex_match_id (lexer, "SCRATCH")) 
          {
@@ -280,7 +281,7 @@ cmd_display (struct lexer *lexer, struct dataset *ds)
                                        flags);
      }
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  static void
@@ -292,24 +293,19 @@ display_macros (void)
  static void
  display_documents (const struct dictionary *dict)
  {
-  const char *documents = dict_get_documents (dict);
+  const struct string_array *documents = dict_get_documents (dict);
  
-  if (documents == NULL)
+  if (string_array_is_empty (documents))
      tab_output_text (TAB_LEFT, _("The active file dictionary does not "
                                   "contain any documents."));
    else
      {
-      struct string line = DS_EMPTY_INITIALIZER;
        size_t i;
  
        tab_output_text (TAB_LEFT | TAT_TITLE,
                        _("Documents in the active file:"));
        for (i = 0; i < dict_get_document_line_cnt (dict); i++)
-        {
-          dict_get_document_line (dict, i, &line);
-          tab_output_text (TAB_LEFT | TAB_FIX, ds_cstr (&line));
-        }
-      ds_destroy (&line);
+        tab_output_text (TAB_LEFT | TAB_FIX, dict_get_document_line (dict, i));
      }
  }
  
diff --git a/src/language/dictionary/value-labels.c b/src/language/dictionary/value-labels.c
index 9068290b7dda666495f24c1fe4499e7f7f28d9cf..18e890ddb87f87a1b206facc4f771fcc69ae361b 100644
--- a/src/language/dictionary/value-labels.c
+++ b/src/language/dictionary/value-labels.c
@@ -19,6 +19,7 @@
  #include <stdio.h>
  #include <stdlib.h>
  
+#include "data/dictionary.h"
  #include "data/procedure.h"
  #include "data/value-labels.h"
  #include "data/variable.h"
@@ -26,6 +27,7 @@
  #include "language/lexer/lexer.h"
  #include "language/lexer/value-parser.h"
  #include "language/lexer/variable-parser.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/str.h"
  
@@ -39,7 +41,8 @@
  static int do_value_labels (struct lexer *,
                             const struct dictionary *dict, bool);
  static void erase_labels (struct variable **vars, size_t var_cnt);
-static int get_label (struct lexer *, struct variable **vars, size_t var_cnt);
+static int get_label (struct lexer *, struct variable **vars, size_t var_cnt,
+                      const char *dict_encoding);
  \f
  /* Stubs. */
  
@@ -78,7 +81,7 @@ do_value_labels (struct lexer *lexer, const struct dictionary *dict, bool erase)
        if (erase)
          erase_labels (vars, var_cnt);
        while (lex_token (lexer) != T_SLASH && lex_token (lexer) != T_ENDCMD)
-       if (!get_label (lexer, vars, var_cnt))
+       if (!get_label (lexer, vars, var_cnt, dict_get_encoding (dict)))
            goto lossage;
  
        if (lex_token (lexer) != T_SLASH)
@@ -92,10 +95,7 @@ do_value_labels (struct lexer *lexer, const struct dictionary *dict, bool erase)
        free (vars);
      }
  
-  if (parse_err)
-    return CMD_FAILURE;
-
-  return lex_end_of_command (lexer);
+  return parse_err ? CMD_FAILURE : CMD_SUCCESS;
  
   lossage:
    free (vars);
@@ -116,7 +116,8 @@ erase_labels (struct variable **vars, size_t var_cnt)
  /* Parse all the labels for the VAR_CNT variables in VARS and add
     the specified labels to those variables.  */
  static int
-get_label (struct lexer *lexer, struct variable **vars, size_t var_cnt)
+get_label (struct lexer *lexer, struct variable **vars, size_t var_cnt,
+           const char *dict_encoding)
  {
    /* Parse all the labels and add them to the variables. */
    do
@@ -125,6 +126,7 @@ get_label (struct lexer *lexer, struct variable **vars, size_t var_cnt)
        int width = var_get_width (vars[0]);
        union value value;
        struct string label;
+      size_t trunc_len;
        size_t i;
  
        /* Set value. */
@@ -145,10 +147,12 @@ get_label (struct lexer *lexer, struct variable **vars, size_t var_cnt)
  
        ds_init_substring (&label, lex_tokss (lexer));
  
-      if (ds_length (&label) > MAX_LABEL_LEN)
+      trunc_len = utf8_encoding_trunc_len (ds_cstr (&label), dict_encoding,
+                                           MAX_LABEL_LEN);
+      if (ds_length (&label) > trunc_len)
         {
           msg (SW, _("Truncating value label to %d bytes."), MAX_LABEL_LEN);
-         ds_truncate (&label, MAX_LABEL_LEN);
+         ds_truncate (&label, trunc_len);
         }
  
        for (i = 0; i < var_cnt; i++)
diff --git a/src/language/dictionary/variable-label.c b/src/language/dictionary/variable-label.c
index 4735047c1e6d20b7f28f2f01d37340cb8af209e8..c0f80fbdbeac19a9fbcfac9f915ee57a52dd39bf 100644
--- a/src/language/dictionary/variable-label.c
+++ b/src/language/dictionary/variable-label.c
@@ -19,13 +19,13 @@
  #include <stdio.h>
  #include <stdlib.h>
  
+#include "data/dictionary.h"
  #include "data/procedure.h"
  #include "data/variable.h"
  #include "language/command.h"
  #include "language/lexer/lexer.h"
  #include "language/lexer/variable-parser.h"
  #include "libpspp/message.h"
-#include "libpspp/str.h"
  
  #include "gl/xalloc.h"
  
@@ -35,15 +35,17 @@
  int
  cmd_variable_labels (struct lexer *lexer, struct dataset *ds)
  {
+  struct dictionary *dict = dataset_dict (ds);
+  const char *dict_encoding = dict_get_encoding (dict);
+
    do
      {
        struct variable **v;
-      struct string label;
        size_t nv;
  
        size_t i;
  
-      if (!parse_variables (lexer, dataset_dict (ds), &v, &nv, PV_NONE))
+      if (!parse_variables (lexer, dict, &v, &nv, PV_NONE))
          return CMD_FAILURE;
  
        if (!lex_force_string (lexer))
@@ -52,15 +54,8 @@ cmd_variable_labels (struct lexer *lexer, struct dataset *ds)
           return CMD_FAILURE;
         }
  
-      ds_init_substring (&label, lex_tokss (lexer));
-      if (ds_length (&label) > 255)
-       {
-         msg (SW, _("Truncating variable label to 255 characters."));
-         ds_truncate (&label, 255);
-       }
        for (i = 0; i < nv; i++)
-        var_set_label (v[i], ds_cstr (&label));
-      ds_destroy (&label);
+        var_set_label (v[i], lex_tokcstr (lexer), dict_encoding, i == 0);
  
        lex_get (lexer);
        while (lex_token (lexer) == T_SLASH)
diff --git a/src/language/dictionary/vector.c b/src/language/dictionary/vector.c
index bf257194613f9a1840e6f1660e224a2249ef0aaf..a5e66df8c44a592fd8a0c5cb7c767c73943c0d05 100644
--- a/src/language/dictionary/vector.c
+++ b/src/language/dictionary/vector.c
@@ -50,7 +50,8 @@ cmd_vector (struct lexer *lexer, struct dataset *ds)
        size_t vector_cnt, vector_cap;
  
        /* Get the name(s) of the new vector(s). */
-      if (!lex_force_id (lexer))
+      if (!lex_force_id (lexer)
+          || !dict_id_is_valid (dict, lex_tokcstr (lexer), true))
         return CMD_CASCADING_FAILURE;
  
        vectors = NULL;
@@ -151,19 +152,17 @@ cmd_vector (struct lexer *lexer, struct dataset *ds)
                goto fail;
              }
  
-         /* Check that none of the variables exist and that
-             their names are no more than VAR_NAME_LEN bytes
-             long. */
+         /* Check that none of the variables exist and that their names are
+             not excessively long. */
            for (i = 0; i < vector_cnt; i++)
             {
                int j;
               for (j = 0; j < var_cnt; j++)
                 {
                    char *name = xasprintf ("%s%d", vectors[i], j + 1);
-                  if (strlen (name) > VAR_NAME_LEN)
+                  if (!dict_id_is_valid (dict, name, true))
                      {
                        free (name);
-                      msg (SE, _("%s is too long for a variable name."), name);
                        goto fail;
                      }
                    if (dict_lookup_var (dict, name))
@@ -200,7 +199,7 @@ cmd_vector (struct lexer *lexer, struct dataset *ds)
    while (lex_match (lexer, T_SLASH));
  
    pool_destroy (pool);
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  
  fail:
    pool_destroy (pool);
diff --git a/src/language/dictionary/weight.c b/src/language/dictionary/weight.c
index cca9aebe49844ab0290a299358b93bc12893fc18..ad06ad107af3ca1092802bfa8998c670eed20a37 100644
--- a/src/language/dictionary/weight.c
+++ b/src/language/dictionary/weight.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2011 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -58,5 +58,5 @@ cmd_weight (struct lexer *lexer, struct dataset *ds)
        dict_set_weight (dict, v);
      }
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/language/expressions/parse.c b/src/language/expressions/parse.c
index a9c324bcb55e4b944f32efe0cd7caa7dde8f2620..ed5a07093c712b3b034b82f71d6af1fbdc4db0fd 100644
--- a/src/language/expressions/parse.c
+++ b/src/language/expressions/parse.c
@@ -33,6 +33,7 @@
  #include "language/lexer/variable-parser.h"
  #include "libpspp/array.h"
  #include "libpspp/assertion.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/misc.h"
  #include "libpspp/pool.h"
@@ -785,12 +786,14 @@ parse_sysvar (struct lexer *lexer, struct expression *e)
        time_t last_proc_time = time_of_last_procedure (e->ds);
        struct tm *time;
        char temp_buf[10];
+      struct substring s;
  
        time = localtime (&last_proc_time);
        sprintf (temp_buf, "%02d %s %02d", abs (time->tm_mday) % 100,
                 months[abs (time->tm_mon) % 12], abs (time->tm_year) % 100);
  
-      return expr_allocate_string_buffer (e, temp_buf, strlen (temp_buf));
+      ss_alloc_substring (&s, ss_cstr (temp_buf));
+      return expr_allocate_string (e, s);
      }
    else if (lex_match_id (lexer, "$TRUE"))
      return expr_allocate_boolean (e, 1.0);
@@ -836,7 +839,7 @@ parse_primary (struct lexer *lexer, struct expression *e)
    switch (lex_token (lexer))
      {
      case T_ID:
-      if (lex_look_ahead (lexer) == T_LPAREN)
+      if (lex_next_token (lexer, 1) == T_LPAREN)
          {
            /* An identifier followed by a left parenthesis may be
               a vector element reference.  If not, it's a function
@@ -887,8 +890,17 @@ parse_primary (struct lexer *lexer, struct expression *e)
  
      case T_STRING:
        {
-        union any_node *node = expr_allocate_string_buffer (
-          e, lex_tokcstr (lexer), ss_length (lex_tokss (lexer)));
+        const char *dict_encoding;
+        union any_node *node;
+        char *s;
+
+        dict_encoding = (e->ds != NULL
+                         ? dict_get_encoding (dataset_dict (e->ds))
+                         : "UTF-8");
+        s = recode_string (dict_encoding, "UTF-8", lex_tokcstr (lexer),
+                           ss_length (lex_tokss (lexer)));
+        node = expr_allocate_string (e, ss_cstr (s));
+
         lex_get (lexer);
         return node;
        }
@@ -1231,7 +1243,7 @@ parse_function (struct lexer *lexer, struct expression *e)
      for (;;)
        {
          if (lex_token (lexer) == T_ID
-            && toupper (lex_look_ahead (lexer)) == 'T')
+            && lex_next_token (lexer, 1) == T_TO)
            {
              const struct variable **vars;
              size_t var_cnt;
@@ -1473,18 +1485,6 @@ expr_allocate_vector (struct expression *e, const struct vector *vector)
    return n;
  }
  
-union any_node *
-expr_allocate_string_buffer (struct expression *e,
-                             const char *string, size_t length)
-{
-  union any_node *n = pool_alloc (e->expr_pool, sizeof n->string);
-  n->type = OP_string;
-  if (length > MAX_STRING)
-    length = MAX_STRING;
-  n->string.s = copy_string (e, string, length);
-  return n;
-}
-
  union any_node *
  expr_allocate_string (struct expression *e, struct substring s)
  {
diff --git a/src/language/expressions/private.h b/src/language/expressions/private.h
index 1a485bb4f7f0801b234a4c41cdde1f777fa06fd0..062d6f765185be6de3145397e4c30e7ae212394c 100644
--- a/src/language/expressions/private.h
+++ b/src/language/expressions/private.h
@@ -187,10 +187,7 @@ union any_node *expr_allocate_number (struct expression *e, double);
  union any_node *expr_allocate_boolean (struct expression *e, double);
  union any_node *expr_allocate_integer (struct expression *e, int);
  union any_node *expr_allocate_pos_int (struct expression *e, int);
-union any_node *expr_allocate_string_buffer (struct expression *e,
-                                             const char *string, size_t length);
-union any_node *expr_allocate_string (struct expression *e,
-                                      struct substring);
+union any_node *expr_allocate_string (struct expression *e, struct substring);
  union any_node *expr_allocate_variable (struct expression *e,
                                          const struct variable *);
  union any_node *expr_allocate_format (struct expression *e,
diff --git a/src/language/lexer/automake.mk b/src/language/lexer/automake.mk
index be48873e8d3e8fdb701109b31f9603112bf73205..11771a248d5b7036cf3ea54b6ac435ca39852aee 100644
--- a/src/language/lexer/automake.mk
+++ b/src/language/lexer/automake.mk
@@ -4,6 +4,8 @@
  language_lexer_sources = \
         src/language/lexer/command-name.c \
         src/language/lexer/command-name.h \
+       src/language/lexer/include-path.c \
+       src/language/lexer/include-path.h \
         src/language/lexer/lexer.c \
         src/language/lexer/lexer.h \
         src/language/lexer/subcommand-list.c  \
diff --git a/src/language/lexer/include-path.c b/src/language/lexer/include-path.c
new file mode 100644
index 0000000..bc20122
--- /dev/null
+++ b/src/language/lexer/include-path.c
@@ -0,0 +1,89 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "language/lexer/include-path.h"
+
+#include <stdlib.h>
+
+#include "data/file-name.h"
+#include "libpspp/string-array.h"
+
+#include "gl/configmake.h"
+#include "gl/relocatable.h"
+#include "gl/xvasprintf.h"
+
+static struct string_array the_include_path;
+static struct string_array default_include_path;
+
+static void include_path_init__ (void);
+
+void
+include_path_clear (void)
+{
+  include_path_init__ ();
+  string_array_clear (&the_include_path);
+}
+
+void
+include_path_add (const char *dir)
+{
+  include_path_init__ ();
+  string_array_append (&the_include_path, dir);
+}
+
+char *
+include_path_search (const char *base_name)
+{
+  return fn_search_path (base_name, include_path ());
+}
+
+const struct string_array *
+include_path_default (void)
+{
+  include_path_init__ ();
+  return &default_include_path;
+}
+
+char **
+include_path (void)
+{
+  include_path_init__ ();
+  string_array_terminate_null (&the_include_path);
+  return the_include_path.strings;
+}
+
+static void
+include_path_init__ (void)
+{
+  static bool inited;
+  char *home;
+
+  if (inited)
+    return;
+  inited = true;
+
+  string_array_init (&the_include_path);
+  string_array_append (&the_include_path, ".");
+  home = getenv ("HOME");
+  if (home != NULL)
+    string_array_append_nocopy (&the_include_path,
+                                xasprintf ("%s/.pspp", home));
+  string_array_append (&the_include_path, relocate (PKGDATADIR));
+
+  string_array_clone (&default_include_path, &the_include_path);
+}
diff --git a/src/language/lexer/include-path.h b/src/language/lexer/include-path.h
new file mode 100644
index 0000000..447b9a6
--- /dev/null
+++ b/src/language/lexer/include-path.h
@@ -0,0 +1,29 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef INCLUDE_PATH_H
+#define INCLUDE_PATH_H 1
+
+struct string_array;
+
+void include_path_clear (void);
+void include_path_add (const char *dir);
+char *include_path_search (const char *base_name);
+
+const struct string_array *include_path_default (void);
+char **include_path (void);
+
+#endif /* include-path.h */
diff --git a/src/language/lexer/lexer.c b/src/language/lexer/lexer.c
index 938d26675066fa488f250f959c19ab5835e27521..9a27d867b98cb87e15420671819ab24ccd1ca1fa 100644
--- a/src/language/lexer/lexer.c
+++ b/src/language/lexer/lexer.c
@@ -18,405 +18,254 @@
  
  #include "language/lexer/lexer.h"
  
-#include <c-ctype.h>
-#include <c-strtod.h>
  #include <errno.h>
+#include <fcntl.h>
  #include <limits.h>
  #include <math.h>
  #include <stdarg.h>
-#include <stdint.h>
  #include <stdlib.h>
+#include <string.h>
+#include <unictype.h>
+#include <unistd.h>
+#include <unistr.h>
+#include <uniwidth.h>
  
-#include "data/settings.h"
+#include "data/file-name.h"
  #include "language/command.h"
+#include "language/lexer/scan.h"
+#include "language/lexer/segment.h"
+#include "language/lexer/token.h"
  #include "libpspp/assertion.h"
-#include "libpspp/getl.h"
+#include "libpspp/cast.h"
+#include "libpspp/deque.h"
+#include "libpspp/i18n.h"
+#include "libpspp/ll.h"
  #include "libpspp/message.h"
+#include "libpspp/misc.h"
  #include "libpspp/str.h"
+#include "libpspp/u8-istream.h"
  #include "output/journal.h"
  #include "output/text-item.h"
  
+#include "gl/c-ctype.h"
+#include "gl/minmax.h"
  #include "gl/xalloc.h"
+#include "gl/xmemdup0.h"
  
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
  #define N_(msgid) msgid
  
-struct lexer
-{
-  struct string line_buffer;
-
-  struct source_stream *ss;
-
-  int token;      /* Current token. */
-  double tokval;  /* T_POS_NUM, T_NEG_NUM: the token's value. */
-
-  struct string tokstr;   /* T_ID, T_STRING: token string value. */
-
-  char *prog; /* Pointer to next token in line_buffer. */
-  bool dot;   /* True only if this line ends with a terminal dot. */
-
-  int put_token ; /* If nonzero, next token returned by lex_get().
-                   Used only in exceptional circumstances. */
+/* A token within a lex_source. */
+struct lex_token
+  {
+    /* The regular token information. */
+    struct token token;
+
+    /* Location of token in terms of the lex_source's buffer.
+       src->tail <= line_pos <= token_pos <= src->head. */
+    size_t token_pos;           /* Start of token. */
+    size_t token_len;           /* Length of source for token in bytes. */
+    size_t line_pos;            /* Start of line containing token_pos. */
+    int first_line;             /* Line number at token_pos. */
+  };
  
-  struct string put_tokstr;
-  double put_tokval;
-};
+/* A source of tokens, corresponding to a syntax file.
  
+   This is conceptually a lex_reader wrapped with everything needed to convert
+   its UTF-8 bytes into tokens. */
+struct lex_source
+  {
+    struct ll ll;               /* In lexer's list of sources. */
+    struct lex_reader *reader;
+    struct segmenter segmenter;
+    bool eof;                   /* True if T_STOP was read from 'reader'. */
+
+    /* Buffer of UTF-8 bytes. */
+    char *buffer;
+    size_t allocated;           /* Number of bytes allocated. */
+    size_t tail;                /* &buffer[0] offset into UTF-8 source. */
+    size_t head;                /* &buffer[head - tail] offset into source. */
+
+    /* Positions in source file, tail <= pos <= head for each member here. */
+    size_t journal_pos;         /* First byte not yet output to journal. */
+    size_t seg_pos;             /* First byte not yet scanned as token. */
+    size_t line_pos;            /* First byte of line containing seg_pos. */
+
+    int n_newlines;             /* Number of new-lines up to seg_pos. */
+    bool suppress_next_newline;
+
+    /* Tokens. */
+    struct deque deque;         /* Indexes into 'tokens'. */
+    struct lex_token *tokens;   /* Lookahead tokens for parser. */
+  };
  
-static int parse_id (struct lexer *);
+static struct lex_source *lex_source_create (struct lex_reader *);
+static void lex_source_destroy (struct lex_source *);
  
-/* How a string represents its contents. */
-enum string_type
+/* Lexer. */
+struct lexer
    {
-    CHARACTER_STRING,   /* Characters. */
-    BINARY_STRING,      /* Binary digits. */
-    OCTAL_STRING,       /* Octal digits. */
-    HEX_STRING          /* Hexadecimal digits. */
+    struct ll_list sources;     /* Contains "struct lex_source"s. */
    };
  
-static int parse_string (struct lexer *, enum string_type);
+static struct lex_source *lex_source__ (const struct lexer *);
+static const struct lex_token *lex_next__ (const struct lexer *, int n);
+static void lex_source_push_endcmd__ (struct lex_source *);
+
+static void lex_source_pop__ (struct lex_source *);
+static bool lex_source_get__ (const struct lex_source *);
+static void lex_source_error_valist (struct lex_source *, int n0, int n1,
+                                     const char *format, va_list)
+   PRINTF_FORMAT (4, 0);
+static const struct lex_token *lex_source_next__ (const struct lex_source *,
+                                                  int n);
  \f
-/* Initialization. */
-
-/* Initializes the lexer. */
-struct lexer *
-lex_create (struct source_stream *ss)
-{
-  struct lexer *lexer = xzalloc (sizeof (*lexer));
-
-  ds_init_empty (&lexer->tokstr);
-  ds_init_empty (&lexer->put_tokstr);
-  ds_init_empty (&lexer->line_buffer);
-  lexer->ss = ss;
-
-  return lexer;
-}
-
-struct source_stream *
-lex_get_source_stream (const struct lexer *lex)
+/* Initializes READER with the specified CLASS and otherwise some reasonable
+   defaults.  The caller should fill in the other members as desired. */
+void
+lex_reader_init (struct lex_reader *reader,
+                 const struct lex_reader_class *class)
  {
-  return lex->ss;
+  reader->class = class;
+  reader->syntax = LEX_SYNTAX_AUTO;
+  reader->error = LEX_ERROR_INTERACTIVE;
+  reader->file_name = NULL;
+  reader->line_number = 0;
  }
  
-enum syntax_mode
-lex_current_syntax_mode (const struct lexer *lex)
+/* Frees any file name already in READER and replaces it by a copy of
+   FILE_NAME, or if FILE_NAME is null then clears any existing name. */
+void
+lex_reader_set_file_name (struct lex_reader *reader, const char *file_name)
  {
-  return source_stream_current_syntax_mode (lex->ss);
+  free (reader->file_name);
+  reader->file_name = file_name != NULL ? xstrdup (file_name) : NULL;
  }
-
-enum error_mode
-lex_current_error_mode (const struct lexer *lex)
+\f
+/* Creates and returns a new lexer. */
+struct lexer *
+lex_create (void)
  {
-  return source_stream_current_error_mode (lex->ss);
+  struct lexer *lexer = xzalloc (sizeof *lexer);
+  ll_init (&lexer->sources);
+  return lexer;
  }
  
-
+/* Destroys LEXER. */
  void
  lex_destroy (struct lexer *lexer)
  {
-  if ( NULL != lexer )
+  if (lexer != NULL)
      {
-      ds_destroy (&lexer->put_tokstr);
-      ds_destroy (&lexer->tokstr);
-      ds_destroy (&lexer->line_buffer);
+      struct lex_source *source, *next;
  
+      ll_for_each_safe (source, next, struct lex_source, ll, &lexer->sources)
+        lex_source_destroy (source);
        free (lexer);
      }
  }
  
+/* Inserts READER into LEXER so that the next token read by LEXER comes from
+   READER.  Before the call, LEXER must either be empty or at a T_ENDCMD
+   token. */
+void
+lex_include (struct lexer *lexer, struct lex_reader *reader)
+{
+  assert (ll_is_empty (&lexer->sources) || lex_token (lexer) == T_ENDCMD);
+  ll_push_head (&lexer->sources, &lex_source_create (reader)->ll);
+}
+
+/* Appends READER to LEXER, so that it will be read after all other current
+   readers have already been read. */
+void
+lex_append (struct lexer *lexer, struct lex_reader *reader)
+{
+  ll_push_tail (&lexer->sources, &lex_source_create (reader)->ll);
+}
  \f
-/* Common functions. */
+/* Advancing. */
+
+static struct lex_token *
+lex_push_token__ (struct lex_source *src)
+{
+  struct lex_token *token;
+
+  if (deque_is_full (&src->deque))
+    src->tokens = deque_expand (&src->deque, src->tokens, sizeof *src->tokens);
+
+  token = &src->tokens[deque_push_front (&src->deque)];
+  token_init (&token->token);
+  return token;
+}
  
-/* Copies put_token, lexer->put_tokstr, put_tokval into token, tokstr,
-   tokval, respectively, and sets tokid appropriately. */
  static void
-restore_token (struct lexer *lexer)
+lex_source_pop__ (struct lex_source *src)
  {
-  assert (lexer->put_token != 0);
-  lexer->token = lexer->put_token;
-  ds_assign_string (&lexer->tokstr, &lexer->put_tokstr);
-  lexer->tokval = lexer->put_tokval;
-  lexer->put_token = 0;
+  token_destroy (&src->tokens[deque_pop_back (&src->deque)].token);
  }
  
-/* Copies token, tokstr, lexer->tokval into lexer->put_token, put_tokstr,
-   put_lexer->tokval respectively. */
  static void
-save_token (struct lexer *lexer)
+lex_source_pop_front (struct lex_source *src)
  {
-  lexer->put_token = lexer->token;
-  ds_assign_string (&lexer->put_tokstr, &lexer->tokstr);
-  lexer->put_tokval = lexer->tokval;
+  token_destroy (&src->tokens[deque_pop_front (&src->deque)].token);
  }
  
-/* Parses a single token, setting appropriate global variables to
-   indicate the token's attributes. */
+/* Advances LEXER to the next token, consuming the current token. */
  void
  lex_get (struct lexer *lexer)
  {
-  /* Find a token. */
-  for (;;)
-    {
-      if (NULL == lexer->prog && ! lex_get_line (lexer) )
-       {
-         lexer->token = T_STOP;
-         return;
-       }
-
-      /* If a token was pushed ahead, return it. */
-      if (lexer->put_token)
-        {
-          restore_token (lexer);
-          return;
-        }
+  struct lex_source *src;
  
-      for (;;)
-        {
-          /* Skip whitespace. */
-         while (c_isspace ((unsigned char) *lexer->prog))
-           lexer->prog++;
-
-         if (*lexer->prog)
-           break;
-
-         if (lexer->dot)
-           {
-             lexer->dot = 0;
-             lexer->token = T_ENDCMD;
-             return;
-           }
-         else if (!lex_get_line (lexer))
-           {
-             lexer->prog = NULL;
-             lexer->token = T_STOP;
-             return;
-           }
-
-         if (lexer->put_token)
-           {
-              restore_token (lexer);
-             return;
-           }
-       }
-
-
-      /* Actually parse the token. */
-      ds_clear (&lexer->tokstr);
-
-      switch (*lexer->prog)
-       {
-       case '-': case '.':
-       case '0': case '1': case '2': case '3': case '4':
-       case '5': case '6': case '7': case '8': case '9':
-         {
-           char *tail;
-
-           /* `-' can introduce a negative number, or it can be a token by
-              itself. */
-           if (*lexer->prog == '-')
-             {
-               ds_put_byte (&lexer->tokstr, *lexer->prog++);
-               while (c_isspace ((unsigned char) *lexer->prog))
-                 lexer->prog++;
-
-               if (!c_isdigit ((unsigned char) *lexer->prog) && *lexer->prog != '.')
-                 {
-                   lexer->token = T_DASH;
-                   break;
-                 }
-                lexer->token = T_NEG_NUM;
-             }
-            else
-              lexer->token = T_POS_NUM;
-
-           /* Parse the number, copying it into tokstr. */
-           while (c_isdigit ((unsigned char) *lexer->prog))
-             ds_put_byte (&lexer->tokstr, *lexer->prog++);
-           if (*lexer->prog == '.')
-             {
-               ds_put_byte (&lexer->tokstr, *lexer->prog++);
-               while (c_isdigit ((unsigned char) *lexer->prog))
-                 ds_put_byte (&lexer->tokstr, *lexer->prog++);
-             }
-           if (*lexer->prog == 'e' || *lexer->prog == 'E')
-             {
-               ds_put_byte (&lexer->tokstr, *lexer->prog++);
-               if (*lexer->prog == '+' || *lexer->prog == '-')
-                 ds_put_byte (&lexer->tokstr, *lexer->prog++);
-               while (c_isdigit ((unsigned char) *lexer->prog))
-                 ds_put_byte (&lexer->tokstr, *lexer->prog++);
-             }
-
-           /* Parse as floating point. */
-           lexer->tokval = c_strtod (ds_cstr (&lexer->tokstr), &tail);
-           if (*tail)
-             {
-               msg (SE, _("%s does not form a valid number."),
-                    ds_cstr (&lexer->tokstr));
-               lexer->tokval = 0.0;
-
-               ds_clear (&lexer->tokstr);
-               ds_put_byte (&lexer->tokstr, '0');
-             }
-
-           break;
-         }
-
-       case '\'': case '"':
-         lexer->token = parse_string (lexer, CHARACTER_STRING);
-         break;
-
-        case '+':
-          lexer->token = T_PLUS;
-          lexer->prog++;
-          break;
-
-        case '/':
-          lexer->token = T_SLASH;
-          lexer->prog++;
-          break;
-
-        case '=':
-          lexer->token = T_EQUALS;
-          lexer->prog++;
-          break;
+  src = lex_source__ (lexer);
+  if (src == NULL)
+    return;
  
-       case '(':
-          lexer->token = T_LPAREN;
-          lexer->prog++;
-          break;
-
-       case ')':
-          lexer->token = T_RPAREN;
-          lexer->prog++;
-          break;
-
-       case '[':
-          lexer->token = T_LBRACK;
-          lexer->prog++;
-          break;
-
-       case ']':
-          lexer->token = T_RBRACK;
-          lexer->prog++;
-          break;
+  if (!deque_is_empty (&src->deque))
+    lex_source_pop__ (src);
  
-        case ',':
-          lexer->token = T_COMMA;
-          lexer->prog++;
-          break;
-
-       case '*':
-         if (*++lexer->prog == '*')
-           {
-             lexer->prog++;
-             lexer->token = T_EXP;
-           }
-         else
-           lexer->token = T_ASTERISK;
-         break;
-
-       case '<':
-         if (*++lexer->prog == '=')
-           {
-             lexer->prog++;
-             lexer->token = T_LE;
-           }
-         else if (*lexer->prog == '>')
-           {
-             lexer->prog++;
-             lexer->token = T_NE;
-           }
-         else
-           lexer->token = T_LT;
-         break;
-
-       case '>':
-         if (*++lexer->prog == '=')
-           {
-             lexer->prog++;
-             lexer->token = T_GE;
-           }
-         else
-           lexer->token = T_GT;
-         break;
-
-       case '~':
-         if (*++lexer->prog == '=')
-           {
-             lexer->prog++;
-             lexer->token = T_NE;
-           }
-         else
-           lexer->token = T_NOT;
-         break;
-
-       case '&':
-         lexer->prog++;
-         lexer->token = T_AND;
-         break;
-
-       case '|':
-         lexer->prog++;
-         lexer->token = T_OR;
-         break;
-
-        case 'b': case 'B':
-          if (lexer->prog[1] == '\'' || lexer->prog[1] == '"')
-            lexer->token = parse_string (lexer, BINARY_STRING);
-          else
-            lexer->token = parse_id (lexer);
-          break;
+  while (deque_is_empty (&src->deque))
+    if (!lex_source_get__ (src))
+      {
+        lex_source_destroy (src);
+        src = lex_source__ (lexer);
+        if (src == NULL)
+          return;
+      }
+}
+\f
+/* Issuing errors. */
  
-        case 'o': case 'O':
-          if (lexer->prog[1] == '\'' || lexer->prog[1] == '"')
-            lexer->token = parse_string (lexer, OCTAL_STRING);
-          else
-            lexer->token = parse_id (lexer);
-          break;
+/* Prints a syntax error message containing the current token and
+   the message formatted from FORMAT (if non-null). */
+void
+lex_error (struct lexer *lexer, const char *format, ...)
+{
+  va_list args;
  
-        case 'x': case 'X':
-          if (lexer->prog[1] == '\'' || lexer->prog[1] == '"')
-            lexer->token = parse_string (lexer, HEX_STRING);
-          else
-            lexer->token = parse_id (lexer);
-          break;
+  va_start (args, format);
+  lex_next_error_valist (lexer, 0, 0, format, args);
+  va_end (args);
+}
  
-       default:
-          if (lex_is_id1 (*lexer->prog))
-            {
-              lexer->token = parse_id (lexer);
-              break;
-            }
-          else
-            {
-              unsigned char c = *lexer->prog++;
-              char *c_name = xasprintf (c_isgraph (c) ? "%c" : "\\%o", c);
-              msg (SE, _("Bad character in input: `%s'."), c_name);
-              free (c_name);
-              continue;
-            }
-        }
-      break;
-    }
+/* Prints a syntax error message containing the current token and
+   the message formatted from FORMAT and ARGS (if FORMAT is non-null). */
+void
+lex_error_valist (struct lexer *lexer, const char *format, va_list args)
+{
+  lex_next_error_valist (lexer, 0, 0, format, args);
  }
  
-/* Parses an identifier at the current position into tokstr.
-   Returns the correct token type. */
-static int
-parse_id (struct lexer *lexer)
+/* Prints a syntax error message covering tokens N0 through N1 ahead of the
+   current token, plus the message formatted from FORMAT (if non-null). */
+void
+lex_next_error (struct lexer *lexer, int n0, int n1, const char *format, ...)
  {
-  struct substring rest_of_line
-    = ss_substr (ds_ss (&lexer->line_buffer),
-                 ds_pointer_to_position (&lexer->line_buffer, lexer->prog),
-                 SIZE_MAX);
-  struct substring id = ss_head (rest_of_line,
-                                 lex_id_get_length (rest_of_line));
-  lexer->prog += ss_length (id);
+  va_list args;
  
-  ds_assign_substring (&lexer->tokstr, id);
-  return lex_id_to_token (id);
+  va_start (args, format);
+  lex_next_error_valist (lexer, n0, n1, format, args);
+  va_end (args);
  }
  
  /* Reports an error to the effect that subcommand SBC may only be
@@ -438,36 +287,28 @@ lex_sbc_missing (struct lexer *lexer, const char *sbc)
  /* Prints a syntax error message containing the current token and
     given message MESSAGE (if non-null). */
  void
-lex_error (struct lexer *lexer, const char *message, ...)
+lex_next_error_valist (struct lexer *lexer, int n0, int n1,
+                       const char *format, va_list args)
  {
-  struct string s;
-
-  ds_init_empty (&s);
+  struct lex_source *src = lex_source__ (lexer);
  
-  if (lexer->token == T_STOP)
-    ds_put_cstr (&s, _("Syntax error at end of file"));
-  else if (lexer->token == T_ENDCMD)
-    ds_put_cstr (&s, _("Syntax error at end of command"));
+  if (src != NULL)
+    lex_source_error_valist (src, n0, n1, format, args);
    else
      {
-      char *token_rep = lex_token_representation (lexer);
-      ds_put_format (&s, _("Syntax error at `%s'"), token_rep);
-      free (token_rep);
-    }
-
-  if (message)
-    {
-      va_list args;
+      struct string s;
  
-      ds_put_cstr (&s, ": ");
-
-      va_start (args, message);
-      ds_put_vformat (&s, message, args);
-      va_end (args);
+      ds_init_empty (&s);
+      ds_put_format (&s, _("Syntax error at end of input"));
+      if (format != NULL)
+        {
+          ds_put_cstr (&s, ": ");
+          ds_put_vformat (&s, format, args);
+        }
+      ds_put_byte (&s, '.');
+      msg (SE, "%s", ds_cstr (&s));
+      ds_destroy (&s);
      }
-
-  msg (SE, "%s.", ds_cstr (&s));
-  ds_destroy (&s);
  }
  
  /* Checks that we're at end of command.
@@ -477,7 +318,7 @@ lex_error (struct lexer *lexer, const char *message, ...)
  int
  lex_end_of_command (struct lexer *lexer)
  {
-  if (lexer->token != T_ENDCMD)
+  if (lex_token (lexer) != T_ENDCMD && lex_token (lexer) != T_STOP)
      {
        lex_error (lexer, _("expecting end of command"));
        return CMD_FAILURE;
@@ -492,35 +333,29 @@ lex_end_of_command (struct lexer *lexer)
  bool
  lex_is_number (struct lexer *lexer)
  {
-  return lexer->token == T_POS_NUM || lexer->token == T_NEG_NUM;
+  return lex_next_is_number (lexer, 0);
  }
  
-
  /* Returns true if the current token is a string. */
  bool
  lex_is_string (struct lexer *lexer)
  {
-  return lexer->token == T_STRING;
+  return lex_next_is_string (lexer, 0);
  }
  
-
  /* Returns the value of the current token, which must be a
     floating point number. */
  double
  lex_number (struct lexer *lexer)
  {
-  assert (lex_is_number (lexer));
-  return lexer->tokval;
+  return lex_next_number (lexer, 0);
  }
  
  /* Returns true iff the current token is an integer. */
  bool
  lex_is_integer (struct lexer *lexer)
  {
-  return (lex_is_number (lexer)
-         && lexer->tokval > LONG_MIN
-         && lexer->tokval <= LONG_MAX
-         && floor (lexer->tokval) == lexer->tokval);
+  return lex_next_is_integer (lexer, 0);
  }
  
  /* Returns the value of the current token, which must be an
@@ -528,18 +363,70 @@ lex_is_integer (struct lexer *lexer)
  long
  lex_integer (struct lexer *lexer)
  {
-  assert (lex_is_integer (lexer));
-  return lexer->tokval;
+  return lex_next_integer (lexer, 0);
+}
+\f
+/* Token testing functions with lookahead.
+
+   A value of 0 for N as an argument to any of these functions refers to the
+   current token.  Lookahead is limited to the current command.  Any N greater
+   than the number of tokens remaining in the current command will be treated
+   as referring to a T_ENDCMD token. */
+
+/* Returns true if the token N ahead of the current token is a number. */
+bool
+lex_next_is_number (struct lexer *lexer, int n)
+{
+  enum token_type next_token = lex_next_token (lexer, n);
+  return next_token == T_POS_NUM || next_token == T_NEG_NUM;
+}
+
+/* Returns true if the token N ahead of the current token is a string. */
+bool
+lex_next_is_string (struct lexer *lexer, int n)
+{
+  return lex_next_token (lexer, n) == T_STRING;
+}
+
+/* Returns the value of the token N ahead of the current token, which must be a
+   floating point number. */
+double
+lex_next_number (struct lexer *lexer, int n)
+{
+  assert (lex_next_is_number (lexer, n));
+  return lex_next_tokval (lexer, n);
+}
+
+/* Returns true if the token N ahead of the current token is an integer. */
+bool
+lex_next_is_integer (struct lexer *lexer, int n)
+{
+  double value;
+
+  if (!lex_next_is_number (lexer, n))
+    return false;
+
+  value = lex_next_tokval (lexer, n);
+  return value > LONG_MIN && value <= LONG_MAX && floor (value) == value;
+}
+
+/* Returns the value of the token N ahead of the current token, which must be
+   an integer. */
+long
+lex_next_integer (struct lexer *lexer, int n)
+{
+  assert (lex_next_is_integer (lexer, n));
+  return lex_next_tokval (lexer, n);
  }
  \f
  /* Token matching functions. */
  
-/* If TOK is the current token, skips it and returns true
+/* If the current token has the specified TYPE, skips it and returns true.
     Otherwise, returns false. */
  bool
-lex_match (struct lexer *lexer, enum token_type t)
+lex_match (struct lexer *lexer, enum token_type type)
  {
-  if (lexer->token == t)
+  if (lex_token (lexer) == type)
      {
        lex_get (lexer);
        return true;
@@ -548,25 +435,26 @@ lex_match (struct lexer *lexer, enum token_type t)
      return false;
  }
  
-/* If the current token is the identifier S, skips it and returns
-   true.  The identifier may be abbreviated to its first three
-   letters.
-   Otherwise, returns false. */
+/* If the current token matches IDENTIFIER, skips it and returns true.
+   IDENTIFIER may be abbreviated to its first three letters.  Otherwise,
+   returns false.
+
+   IDENTIFIER must be an ASCII string. */
  bool
-lex_match_id (struct lexer *lexer, const char *s)
+lex_match_id (struct lexer *lexer, const char *identifier)
  {
-  return lex_match_id_n (lexer, s, 3);
+  return lex_match_id_n (lexer, identifier, 3);
  }
  
-/* If the current token is the identifier S, skips it and returns
-   true.  The identifier may be abbreviated to its first N
-   letters.
-   Otherwise, returns false. */
+/* If the current token is IDENTIFIER, skips it and returns true.  IDENTIFIER
+   may be abbreviated to its first N letters.  Otherwise, returns false.
+
+   IDENTIFIER must be an ASCII string. */
  bool
-lex_match_id_n (struct lexer *lexer, const char *s, size_t n)
+lex_match_id_n (struct lexer *lexer, const char *identifier, size_t n)
  {
-  if (lexer->token == T_ID
-      && lex_id_match_n (ss_cstr (s), lex_tokss (lexer), n))
+  if (lex_token (lexer) == T_ID
+      && lex_id_match_n (ss_cstr (identifier), lex_tokss (lexer), n))
      {
        lex_get (lexer);
        return true;
@@ -575,8 +463,8 @@ lex_match_id_n (struct lexer *lexer, const char *s, size_t n)
      return false;
  }
  
-/* If the current token is integer N, skips it and returns true.
-   Otherwise, returns false. */
+/* If the current token is integer X, skips it and returns true.  Otherwise,
+   returns false. */
  bool
  lex_match_int (struct lexer *lexer, int x)
  {
@@ -591,39 +479,41 @@ lex_match_int (struct lexer *lexer, int x)
  \f
  /* Forced matches. */
  
-/* If this token is identifier S, fetches the next token and returns
-   nonzero.
-   Otherwise, reports an error and returns zero. */
+/* If this token is IDENTIFIER, skips it and returns true.  IDENTIFIER may be
+   abbreviated to its first 3 letters.  Otherwise, reports an error and returns
+   false.
+
+   IDENTIFIER must be an ASCII string. */
  bool
-lex_force_match_id (struct lexer *lexer, const char *s)
+lex_force_match_id (struct lexer *lexer, const char *identifier)
  {
-  if (lex_match_id (lexer, s))
+  if (lex_match_id (lexer, identifier))
      return true;
    else
      {
-      lex_error (lexer, _("expecting `%s'"), s);
+      lex_error (lexer, _("expecting `%s'"), identifier);
        return false;
      }
  }
  
-/* If the current token is T, skips the token.  Otherwise, reports an
-   error and returns from the current function with return value false. */
+/* If the current token has the specified TYPE, skips it and returns true.
+   Otherwise, reports an error and returns false. */
  bool
-lex_force_match (struct lexer *lexer, enum token_type t)
+lex_force_match (struct lexer *lexer, enum token_type type)
  {
-  if (lexer->token == t)
+  if (lex_token (lexer) == type)
      {
        lex_get (lexer);
        return true;
      }
    else
      {
-      lex_error (lexer, _("expecting `%s'"), lex_token_name (t));
+      lex_error (lexer, _("expecting `%s'"), token_type_to_string (type));
        return false;
      }
  }
  
-/* If this token is a string, does nothing and returns true.
+/* If the current token is a string, does nothing and returns true.
     Otherwise, reports an error and returns false. */
  bool
  lex_force_string (struct lexer *lexer)
@@ -637,7 +527,7 @@ lex_force_string (struct lexer *lexer)
      }
  }
  
-/* If this token is an integer, does nothing and returns true.
+/* If the current token is an integer, does nothing and returns true.
     Otherwise, reports an error and returns false. */
  bool
  lex_force_int (struct lexer *lexer)
@@ -651,7 +541,7 @@ lex_force_int (struct lexer *lexer)
      }
  }
  
-/* If this token is a number, does nothing and returns true.
+/* If the current token is a number, does nothing and returns true.
     Otherwise, reports an error and returns false. */
  bool
  lex_force_num (struct lexer *lexer)
@@ -663,710 +553,1081 @@ lex_force_num (struct lexer *lexer)
    return false;
  }
  
-/* If this token is an identifier, does nothing and returns true.
+/* If the current token is an identifier, does nothing and returns true.
     Otherwise, reports an error and returns false. */
  bool
  lex_force_id (struct lexer *lexer)
  {
-  if (lexer->token == T_ID)
+  if (lex_token (lexer) == T_ID)
      return true;
  
    lex_error (lexer, _("expecting identifier"));
    return false;
  }
+\f
+/* Token accessors. */
  
-/* Weird token functions. */
-
-/* Returns the likely type of the next token, or 0 if it's hard to tell. */
+/* Returns the type of LEXER's current token. */
  enum token_type
-lex_look_ahead (struct lexer *lexer)
+lex_token (const struct lexer *lexer)
  {
-  if (lexer->put_token)
-    return lexer->put_token;
+  return lex_next_token (lexer, 0);
+}
  
-  for (;;)
+/* Returns the number in LEXER's current token.
+
+   Only T_NEG_NUM and T_POS_NUM tokens have meaningful values.  For other
+   tokens this function will always return zero. */
+double
+lex_tokval (const struct lexer *lexer)
+{
+  return lex_next_tokval (lexer, 0);
+}
+
+/* Returns the null-terminated string in LEXER's current token, UTF-8 encoded.
+
+   Only T_ID and T_STRING tokens have meaningful strings.  For other tokens
+   this function will always return NULL.
+
+   The UTF-8 encoding of the returned string is correct for variable names and
+   other identifiers.  Use filename_to_utf8() to use it as a filename.  Use
+   data_in() to use it in a "union value".  */
+const char *
+lex_tokcstr (const struct lexer *lexer)
+{
+  return lex_next_tokcstr (lexer, 0);
+}
+
+/* Returns the string in LEXER's current token, UTF-8 encoded.  The string is
+   null-terminated (but the null terminator is not included in the returned
+   substring's 'length').
+
+   Only T_ID and T_STRING tokens have meaningful strings.  For other tokens
+   this function will always return NULL.
+
+   The UTF-8 encoding of the returned string is correct for variable names and
+   other identifiers.  Use filename_to_utf8() to use it as a filename.  Use
+   data_in() to use it in a "union value".  */
+struct substring
+lex_tokss (const struct lexer *lexer)
+{
+  return lex_next_tokss (lexer, 0);
+}
+\f
+/* Looking ahead.
+
+   A value of 0 for N as an argument to any of these functions refers to the
+   current token.  Lookahead is limited to the current command.  Any N greater
+   than the number of tokens remaining in the current command will be treated
+   as referring to a T_ENDCMD token. */
+
+static const struct lex_token *
+lex_next__ (const struct lexer *lexer_, int n)
+{
+  struct lexer *lexer = CONST_CAST (struct lexer *, lexer_);
+  struct lex_source *src = lex_source__ (lexer);
+
+  if (src != NULL)
+    return lex_source_next__ (src, n);
+  else
+    {
+      static const struct lex_token stop_token =
+        { TOKEN_INITIALIZER (T_STOP, 0.0, ""), 0, 0, 0, 0 };
+
+      return &stop_token;
+    }
+}
+
+static const struct lex_token *
+lex_source_next__ (const struct lex_source *src, int n)
+{
+  while (deque_count (&src->deque) <= n)
      {
-      if (NULL == lexer->prog && ! lex_get_line (lexer) )
-        return 0;
-
-      for (;;)
-       {
-         while (c_isspace ((unsigned char) *lexer->prog))
-           lexer->prog++;
-         if (*lexer->prog)
-           break;
-
-         if (lexer->dot)
-           return T_ENDCMD;
-         else if (!lex_get_line (lexer))
-            return 0;
-
-         if (lexer->put_token)
-           return lexer->put_token;
-       }
-
-      switch (toupper ((unsigned char) *lexer->prog))
+      if (!deque_is_empty (&src->deque))
          {
-        case 'X': case 'B': case 'O':
-          if (lexer->prog[1] == '\'' || lexer->prog[1] == '"')
-            return T_STRING;
-          /* Fall through */
+          struct lex_token *front;
  
-       case '-':
-          return T_DASH;
+          front = &src->tokens[deque_front (&src->deque, 0)];
+          if (front->token.type == T_STOP || front->token.type == T_ENDCMD)
+            return front;
+        }
+
+      lex_source_get__ (src);
+    }
+
+  return &src->tokens[deque_back (&src->deque, n)];
+}
+
+/* Returns the "struct token" of the token N after the current one in LEXER.
+   The returned pointer can be invalidated by pretty much any succeeding call
+   into the lexer, although the string pointer within the returned token is
+   only invalidated by consuming the token (e.g. with lex_get()). */
+const struct token *
+lex_next (const struct lexer *lexer, int n)
+{
+  return &lex_next__ (lexer, n)->token;
+}
+
+/* Returns the type of the token N after the current one in LEXER. */
+enum token_type
+lex_next_token (const struct lexer *lexer, int n)
+{
+  return lex_next (lexer, n)->type;
+}
+
+/* Returns the number in the token N after the current one in LEXER.
+
+   Only T_NEG_NUM and T_POS_NUM tokens have meaningful values.  For other
+   tokens this function will always return zero. */
+double
+lex_next_tokval (const struct lexer *lexer, int n)
+{
+  const struct token *token = lex_next (lexer, n);
+  return token->number;
+}
+
+/* Returns the null-terminated string in the token N after the current one, in
+   UTF-8 encoding.
+
+   Only T_ID and T_STRING tokens have meaningful strings.  For other tokens
+   this function will always return NULL.
+
+   The UTF-8 encoding of the returned string is correct for variable names and
+   other identifiers.  Use filename_to_utf8() to use it as a filename.  Use
+   data_in() to use it in a "union value".  */
+const char *
+lex_next_tokcstr (const struct lexer *lexer, int n)
+{
+  return lex_next_tokss (lexer, n).string;
+}
+
+/* Returns the string in the token N after the current one, in UTF-8 encoding.
+   The string is null-terminated (but the null terminator is not included in
+   the returned substring's 'length').
+
+   Only T_ID and T_STRING tokens have meaningful strings.  For other tokens
+   this function will always return NULL.
+
+   The UTF-8 encoding of the returned string is correct for variable names and
+   other identifiers.  Use filename_to_utf8() to use it as a filename.  Use
+   data_in() to use it in a "union value".  */
+struct substring
+lex_next_tokss (const struct lexer *lexer, int n)
+{
+  return lex_next (lexer, n)->string;
+}
  
-        case '.':
-       case '0': case '1': case '2': case '3': case '4':
-       case '5': case '6': case '7': case '8': case '9':
-          return T_POS_NUM;
+/* If LEXER is positioned at the (pseudo)identifier S, skips it and returns
+   true.  Otherwise, returns false.
  
-       case '\'': case '"':
-          return T_STRING;
+   S may consist of an arbitrary number of identifiers, integers, and
+   punctuation, e.g. "KRUSKAL-WALLIS", "2SLS", or "END INPUT PROGRAM".
+   Identifiers may be abbreviated to their first three letters.  Currently only
+   hyphens, slashes, and equals signs are supported as punctuation (but it
+   would be easy to add more).
  
-        case '+':
-          return T_PLUS;
+   S must be an ASCII string. */
+bool
+lex_match_phrase (struct lexer *lexer, const char *s)
+{
+  int tok_idx;
+
+  for (tok_idx = 0; ; tok_idx++)
+    {
+      enum token_type token;
+      unsigned char c;
+
+      while (c_isspace (*s))
+        s++;
+
+      c = *s;
+      if (c == '\0')
+        {
+          int i;
+
+          for (i = 0; i < tok_idx; i++)
+            lex_get (lexer);
+          return true;
+        }
+
+      token = lex_next_token (lexer, tok_idx);
+      switch (c)
+        {
+        case '-':
+          if (token != T_DASH)
+            return false;
+          s++;
+          break;
  
          case '/':
-          return T_SLASH;
+          if (token != T_SLASH)
+            return false;
+          s++;
+          break;
  
          case '=':
-          return T_EQUALS;
+          if (token != T_EQUALS)
+            return false;
+          s++;
+          break;
  
-       case '(':
-          return T_LPAREN;
+        case '0': case '1': case '2': case '3': case '4':
+        case '5': case '6': case '7': case '8': case '9':
+          {
+            unsigned int value;
  
-       case ')':
-          return T_RPAREN;
+            if (token != T_POS_NUM)
+              return false;
  
-       case '[':
-          return T_LBRACK;
+            value = 0;
+            do
+              {
+                value = value * 10 + (*s++ - '0');
+              }
+            while (c_isdigit (*s));
  
-       case ']':
-          return T_RBRACK;
+            if (lex_next_tokval (lexer, tok_idx) != value)
+              return false;
+          }
+          break;
  
-        case ',':
-          return T_COMMA;
+        default:
+          if (lex_is_id1 (c))
+            {
+              int len;
  
-       case '*':
-         return lexer->prog[1] == '*' ? T_EXP : T_ASTERISK;
+              if (token != T_ID)
+                return false;
  
-       case '<':
-          return (lexer->prog[1] == '=' ? T_LE
-                  : lexer->prog[1] == '>' ? T_NE
-                  : T_LT);
+              len = lex_id_get_length (ss_cstr (s));
+              if (!lex_id_match (ss_buffer (s, len),
+                                 lex_next_tokss (lexer, tok_idx)))
+                return false;
  
-       case '>':
-          return lexer->prog[1] == '=' ? T_GE : T_GT;
+              s += len;
+            }
+          else
+            NOT_REACHED ();
+        }
+    }
+}
  
-       case '~':
-          return lexer->prog[1] == '=' ? T_NE : T_NOT;
+static int
+lex_source_get_first_line_number (const struct lex_source *src, int n)
+{
+  return lex_source_next__ (src, n)->first_line;
+}
  
-       case '&':
-         return T_AND;
+static int
+count_newlines (char *s, size_t length)
+{
+  int n_newlines = 0;
+  char *newline;
  
-       case '|':
-         return T_OR;
+  while ((newline = memchr (s, '\n', length)) != NULL)
+    {
+      n_newlines++;
+      length -= (newline + 1) - s;
+      s = newline + 1;
+    }
  
-        default:
-          if (lex_is_id1 (*lexer->prog))
-            return T_ID;
-          return 0;
+  return n_newlines;
+}
+
+static int
+lex_source_get_last_line_number (const struct lex_source *src, int n)
+{
+  const struct lex_token *token = lex_source_next__ (src, n);
+
+  if (token->first_line == 0)
+    return 0;
+  else
+    {
+      char *token_str = &src->buffer[token->token_pos - src->tail];
+      return token->first_line + count_newlines (token_str, token->token_len) + 1;
+    }
+}
+
+static int
+count_columns (const char *s_, size_t length)
+{
+  const uint8_t *s = CHAR_CAST (const uint8_t *, s_);
+  int columns;
+  size_t ofs;
+  int mblen;
+
+  columns = 0;
+  for (ofs = 0; ofs < length; ofs += mblen)
+    {
+      ucs4_t uc;
+
+      mblen = u8_mbtouc (&uc, s + ofs, length - ofs);
+      if (uc != '\t')
+        {
+          int width = uc_width (uc, "UTF-8");
+          if (width > 0)
+            columns += width;
          }
+      else
+        columns = ROUND_UP (columns + 1, 8);
      }
+
+  return columns + 1;
  }
  
-/* Makes the current token become the next token to be read; the
-   current token is set to T. */
-void
-lex_put_back (struct lexer *lexer, enum token_type t)
+static int
+lex_source_get_first_column (const struct lex_source *src, int n)
  {
-  save_token (lexer);
-  lexer->token = t;
+  const struct lex_token *token = lex_source_next__ (src, n);
+  return count_columns (&src->buffer[token->line_pos - src->tail],
+                        token->token_pos - token->line_pos);
  }
-\f
-/* Weird line processing functions. */
  
-/* Returns the entire contents of the current line. */
-const char *
-lex_entire_line (const struct lexer *lexer)
+static int
+lex_source_get_last_column (const struct lex_source *src, int n)
  {
-  return ds_cstr (&lexer->line_buffer);
+  const struct lex_token *token = lex_source_next__ (src, n);
+  char *start, *end, *newline;
+
+  start = &src->buffer[token->line_pos - src->tail];
+  end = &src->buffer[(token->token_pos + token->token_len) - src->tail];
+  newline = memrchr (start, '\n', end - start);
+  if (newline != NULL)
+    start = newline + 1;
+  return count_columns (start, end - start);
  }
  
-const struct string *
-lex_entire_line_ds (const struct lexer *lexer)
+/* Returns the 1-based line number of the start of the syntax that represents
+   the token N after the current one in LEXER.  Returns 0 for a T_STOP token or
+   if the token is drawn from a source that does not have line numbers. */
+int
+lex_get_first_line_number (const struct lexer *lexer, int n)
  {
-  return &lexer->line_buffer;
+  const struct lex_source *src = lex_source__ (lexer);
+  return src != NULL ? lex_source_get_first_line_number (src, n) : 0;
  }
  
-/* As lex_entire_line(), but only returns the part of the current line
-   that hasn't already been tokenized. */
+/* Returns the 1-based line number of the end of the syntax that represents the
+   token N after the current one in LEXER, plus 1.  Returns 0 for a T_STOP
+   token or if the token is drawn from a source that does not have line
+   numbers.
+
+   Most of the time, a single token is wholly within a single line of syntax,
+   but there are two exceptions: a T_STRING token can be made up of multiple
+   segments on adjacent lines connected with "+" punctuators, and a T_NEG_NUM
+   token can consist of a "-" on one line followed by the number on the next.
+ */
+int
+lex_get_last_line_number (const struct lexer *lexer, int n)
+{
+  const struct lex_source *src = lex_source__ (lexer);
+  return src != NULL ? lex_source_get_last_line_number (src, n) : 0;
+}
+
+/* Returns the 1-based column number of the start of the syntax that represents
+   the token N after the current one in LEXER.  Returns 0 for a T_STOP
+   token.
+
+   Column numbers are measured according to the width of characters as shown in
+   a typical fixed-width font, in which CJK characters have width 2 and
+   combining characters have width 0.  */
+int
+lex_get_first_column (const struct lexer *lexer, int n)
+{
+  const struct lex_source *src = lex_source__ (lexer);
+  return src != NULL ? lex_source_get_first_column (src, n) : 0;
+}
+
+/* Returns the 1-based column number of the end of the syntax that represents
+   the token N after the current one in LEXER, plus 1.  Returns 0 for a T_STOP
+   token.
+
+   Column numbers are measured according to the width of characters as shown in
+   a typical fixed-width font, in which CJK characters have width 2 and
+   combining characters have width 0.  */
+int
+lex_get_last_column (const struct lexer *lexer, int n)
+{
+  const struct lex_source *src = lex_source__ (lexer);
+  return src != NULL ? lex_source_get_last_column (src, n) : 0;
+}
+
+/* Returns the name of the syntax file from which the current command is drawn.
+   Returns NULL for a T_STOP token or if the command's source does not have
+   line numbers.
+
+   There is no version of this function that takes an N argument because
+   lookahead only works to the end of a command and any given command is always
+   within a single syntax file. */
  const char *
-lex_rest_of_line (const struct lexer *lexer)
+lex_get_file_name (const struct lexer *lexer)
  {
-  return lexer->prog;
+  struct lex_source *src = lex_source__ (lexer);
+  return src == NULL ? NULL : src->reader->file_name;
  }
  
-/* Returns true if the current line ends in a terminal dot,
-   false otherwise. */
-bool
-lex_end_dot (const struct lexer *lexer)
+/* Returns the syntax mode for the syntax file from which the current command
+   is drawn.  Returns LEX_SYNTAX_AUTO for a T_STOP token or if the command's
+   source does not have line numbers.
+
+   There is no version of this function that takes an N argument because
+   lookahead only works to the end of a command and any given command is always
+   within a single syntax file. */
+enum lex_syntax_mode
+lex_get_syntax_mode (const struct lexer *lexer)
  {
-  return lexer->dot;
+  struct lex_source *src = lex_source__ (lexer);
+  return src == NULL ? LEX_SYNTAX_AUTO : src->reader->syntax;
  }
  
-/* Causes the rest of the current input line to be ignored for
-   tokenization purposes. */
-void
-lex_discard_line (struct lexer *lexer)
+/* Returns the error mode for the syntax file from which the current command
+   is drawn.  Returns LEX_ERROR_INTERACTIVE for a T_STOP token or if the command's
+   source does not have line numbers.
+
+   There is no version of this function that takes an N argument because
+   lookahead only works to the end of a command and any given command is always
+   within a single syntax file. */
+enum lex_error_mode
+lex_get_error_mode (const struct lexer *lexer)
  {
-  ds_cstr (&lexer->line_buffer);  /* Ensures ds_end points to something valid */
-  lexer->prog = ds_end (&lexer->line_buffer);
-  lexer->dot = false;
-  lexer->put_token = 0;
+  struct lex_source *src = lex_source__ (lexer);
+  return src == NULL ? LEX_ERROR_INTERACTIVE : src->reader->error;
  }
  
+/* If the source that LEXER is currently reading has error mode
+   LEX_ERROR_INTERACTIVE, discards all buffered input and tokens, so that the
+   next token to be read comes directly from whatever is next read from the
+   stream.
  
-/* Discards the rest of the current command.
-   When we're reading commands from a file, we skip tokens until
-   a terminal dot or EOF.
-   When we're reading commands interactively from the user,
-   that's just discarding the current line, because presumably
-   the user doesn't want to finish typing a command that will be
-   ignored anyway. */
+   It makes sense to call this function after encountering an error in a
+   command entered on the console, because usually the user would prefer not to
+   have cascading errors. */
+void
+lex_interactive_reset (struct lexer *lexer)
+{
+  struct lex_source *src = lex_source__ (lexer);
+  if (src != NULL && src->reader->error == LEX_ERROR_INTERACTIVE)
+    {
+      src->head = src->tail = 0;
+      src->journal_pos = src->seg_pos = src->line_pos = 0;
+      src->n_newlines = 0;
+      src->suppress_next_newline = false;
+      segmenter_init (&src->segmenter, segmenter_get_mode (&src->segmenter));
+      while (!deque_is_empty (&src->deque))
+        lex_source_pop__ (src);
+      lex_source_push_endcmd__ (src);
+    }
+}
+
+/* Advances past any tokens in LEXER up to a T_ENDCMD or T_STOP. */
  void
  lex_discard_rest_of_command (struct lexer *lexer)
  {
-  if (!getl_is_interactive (lexer->ss))
+  while (lex_token (lexer) != T_STOP && lex_token (lexer) != T_ENDCMD)
+    lex_get (lexer);
+}
+
+/* Discards all lookahead tokens in LEXER, then discards all input sources
+   until it encounters one with error mode LEX_ERROR_INTERACTIVE or until it
+   runs out of input sources. */
+void
+lex_discard_noninteractive (struct lexer *lexer)
+{
+  struct lex_source *src = lex_source__ (lexer);
+
+  if (src != NULL)
      {
-      while (lexer->token != T_STOP && lexer->token != T_ENDCMD)
-       lex_get (lexer);
+      while (!deque_is_empty (&src->deque))
+        lex_source_pop__ (src);
+
+      for (; src != NULL && src->reader->error != LEX_ERROR_INTERACTIVE;
+           src = lex_source__ (lexer))
+        lex_source_destroy (src);
      }
-  else
-    lex_discard_line (lexer);
  }
  \f
-/* Weird line reading functions. */
+static size_t
+lex_source_max_tail__ (const struct lex_source *src)
+{
+  const struct lex_token *token;
+  size_t max_tail;
+
+  assert (src->seg_pos >= src->line_pos);
+  max_tail = MIN (src->journal_pos, src->line_pos);
+
+  /* Use the oldest token also.  (We know that src->deque cannot be empty
+     because we are in the process of adding a new token, which is already
+     initialized enough to use here.) */
+  token = &src->tokens[deque_back (&src->deque, 0)];
+  assert (token->token_pos >= token->line_pos);
+  max_tail = MIN (max_tail, token->line_pos);
+
+  return max_tail;
+}
  
-/* Remove C-style comments in STRING, begun by slash-star and
-   terminated by star-slash or newline. */
  static void
-strip_comments (struct string *string)
+lex_source_expand__ (struct lex_source *src)
  {
-  char *cp;
-  int quote;
-  bool in_comment;
-
-  in_comment = false;
-  quote = EOF;
-  for (cp = ds_cstr (string); *cp; )
+  if (src->head - src->tail >= src->allocated)
      {
-      /* If we're not in a comment, check for quote marks. */
-      if (!in_comment)
+      size_t max_tail = lex_source_max_tail__ (src);
+      if (max_tail > src->tail)
          {
-          if (*cp == quote)
-            quote = EOF;
-          else if (*cp == '\'' || *cp == '"')
-            quote = *cp;
+          /* Advance the tail, freeing up room at the head. */
+          memmove (src->buffer, src->buffer + (max_tail - src->tail),
+                   src->head - max_tail);
+          src->tail = max_tail;
          }
-
-      /* If we're not inside a quotation, check for comment. */
-      if (quote == EOF)
+      else
          {
-          if (cp[0] == '/' && cp[1] == '*')
-            {
-              in_comment = true;
-              *cp++ = ' ';
-              *cp++ = ' ';
-              continue;
-            }
-          else if (in_comment && cp[0] == '*' && cp[1] == '/')
-            {
-              in_comment = false;
-              *cp++ = ' ';
-              *cp++ = ' ';
-              continue;
-            }
+          /* Buffer is completely full.  Expand it. */
+          src->buffer = x2realloc (src->buffer, &src->allocated);
          }
-
-      /* Check commenting. */
-      if (in_comment)
-        *cp = ' ';
-      cp++;
      }
-}
-
-/* Prepares LINE, which is subject to the given SYNTAX rules, for
-   tokenization by stripping comments and determining whether it
-   is the beginning or end of a command and storing into
-   *LINE_STARTS_COMMAND and *LINE_ENDS_COMMAND appropriately. */
-void
-lex_preprocess_line (struct string *line,
-                     enum syntax_mode syntax,
-                     bool *line_starts_command,
-                     bool *line_ends_command)
-{
-  strip_comments (line);
-  ds_rtrim (line, ss_cstr (CC_SPACES));
-  *line_ends_command = ds_chomp_byte (line, '.') || ds_is_empty (line);
-  *line_starts_command = false;
-  if (syntax == GETL_BATCH)
+  else
      {
-      int first = ds_first (line);
-      *line_starts_command = !c_isspace (first);
-      if (first == '+' || first == '-')
-        *ds_data (line) = ' ';
+      /* There's space available at the head of the buffer.  Nothing to do. */
      }
  }
  
-/* Reads a line, without performing any preprocessing. */
-bool
-lex_get_line_raw (struct lexer *lexer)
+static void
+lex_source_read__ (struct lex_source *src)
  {
-  bool ok = getl_read_line (lexer->ss, &lexer->line_buffer);
-  if (ok)
+  do
      {
-      const char *line = ds_cstr (&lexer->line_buffer);
-      text_item_submit (text_item_create (TEXT_ITEM_SYNTAX, line));
+      size_t head_ofs;
+      size_t n;
+
+      lex_source_expand__ (src);
+
+      head_ofs = src->head - src->tail;
+      n = src->reader->class->read (src->reader, &src->buffer[head_ofs],
+                                    src->allocated - head_ofs,
+                                    segmenter_get_prompt (&src->segmenter));
+      if (n == 0)
+        {
+          /* End of input.
+
+             Ensure that the input always ends in a new-line followed by a null
+             byte, as required by the segmenter library. */
+
+          if (src->head == src->tail
+              || src->buffer[src->head - src->tail - 1] != '\n')
+            src->buffer[src->head++ - src->tail] = '\n';
+
+          lex_source_expand__ (src);
+          src->buffer[src->head++ - src->tail] = '\0';
+
+          return;
+        }
+
+      src->head += n;
      }
-  else
-    lexer->prog = NULL;
-  return ok;
+  while (!memchr (&src->buffer[src->seg_pos - src->tail], '\n',
+                  src->head - src->seg_pos));
  }
  
-/* Reads a line for use by the tokenizer, and preprocesses it by
-   removing comments, stripping trailing whitespace and the
-   terminal dot, and removing leading indentors. */
-bool
-lex_get_line (struct lexer *lexer)
+static struct lex_source *
+lex_source__ (const struct lexer *lexer)
+{
+  return (ll_is_empty (&lexer->sources) ? NULL
+          : ll_data (ll_head (&lexer->sources), struct lex_source, ll));
+}
+
+static struct substring
+lex_source_get_syntax__ (const struct lex_source *src, int n0, int n1)
  {
-  bool line_starts_command;
+  const struct lex_token *token0 = lex_source_next__ (src, n0);
+  const struct lex_token *token1 = lex_source_next__ (src, MAX (n0, n1));
+  size_t start = token0->token_pos;
+  size_t end = token1->token_pos + token1->token_len;
  
-  if (!lex_get_line_raw (lexer))
-    return false;
+  return ss_buffer (&src->buffer[start - src->tail], end - start);
+}
  
-  lex_preprocess_line (&lexer->line_buffer,
-                      lex_current_syntax_mode (lexer),
-                       &line_starts_command, &lexer->dot);
+static void
+lex_ellipsize__ (struct substring in, char *out, size_t out_size)
+{
+  size_t out_maxlen;
+  size_t out_len;
+  int mblen;
  
-  if (line_starts_command)
-    lexer->put_token = T_ENDCMD;
+  assert (out_size >= 16);
+  out_maxlen = out_size - (in.length >= out_size ? 3 : 0) - 1;
+  for (out_len = 0; out_len < in.length; out_len += mblen)
+    {
+      if (in.string[out_len] == '\n'
+          || (in.string[out_len] == '\r'
+              && out_len + 1 < in.length
+              && in.string[out_len + 1] == '\n'))
+        break;
+
+      mblen = u8_mblen (CHAR_CAST (const uint8_t *, in.string + out_len),
+                        in.length - out_len);
+      if (out_len + mblen > out_maxlen)
+        break;
+    }
  
-  lexer->prog = ds_cstr (&lexer->line_buffer);
-  return true;
+  memcpy (out, in.string, out_len);
+  strcpy (&out[out_len], out_len < in.length ? "..." : "");
  }
-\f
-/* Token names. */
  
-/* Returns the name of a token. */
-const char *
-lex_token_name (enum token_type token)
+static void
+lex_source_error_valist (struct lex_source *src, int n0, int n1,
+                         const char *format, va_list args)
  {
-  switch (token)
-    {
-    case T_ID:
-    case T_POS_NUM:
-    case T_NEG_NUM:
-    case T_STRING:
-    case TOKEN_N_TYPES:
-      NOT_REACHED ();
+  const struct lex_token *token;
+  struct string s;
+  struct msg m;
  
-    case T_STOP:
-      return "";
+  ds_init_empty (&s);
  
-    case T_ENDCMD:
-      return ".";
+  token = lex_source_next__ (src, n0);
+  if (token->token.type == T_ENDCMD)
+    ds_put_cstr (&s, _("Syntax error at end of command"));
+  else
+    {
+      struct substring syntax = lex_source_get_syntax__ (src, n0, n1);
+      if (!ss_is_empty (syntax))
+        {
+          char syntax_cstr[64];
  
-    case T_PLUS:
-      return "+";
+          lex_ellipsize__ (syntax, syntax_cstr, sizeof syntax_cstr);
+          ds_put_format (&s, _("Syntax error at `%s'"), syntax_cstr);
+        }
+      else
+        ds_put_cstr (&s, _("Syntax error"));
+    }
  
-    case T_DASH:
-      return "-";
+  if (format)
+    {
+      ds_put_cstr (&s, ": ");
+      ds_put_vformat (&s, format, args);
+    }
+  ds_put_byte (&s, '.');
+
+  m.category = MSG_C_SYNTAX;
+  m.severity = MSG_S_ERROR;
+  m.file_name = src->reader->file_name;
+  m.first_line = lex_source_get_first_line_number (src, n0);
+  m.last_line = lex_source_get_last_line_number (src, n1);
+  m.first_column = lex_source_get_first_column (src, n0);
+  m.last_column = lex_source_get_last_column (src, n1);
+  m.text = ds_steal_cstr (&s);
+  msg_emit (&m);
+}
  
-    case T_ASTERISK:
-      return "*";
+static void PRINTF_FORMAT (2, 3)
+lex_get_error (struct lex_source *src, const char *format, ...)
+{
+  va_list args;
+  int n;
  
-    case T_SLASH:
-      return "/";
+  va_start (args, format);
  
-    case T_EQUALS:
-      return "=";
+  n = deque_count (&src->deque) - 1;
+  lex_source_error_valist (src, n, n, format, args);
+  lex_source_pop_front (src);
  
-    case T_LPAREN:
-      return "(";
+  va_end (args);
+}
  
-    case T_RPAREN:
-      return ")";
+static bool
+lex_source_get__ (const struct lex_source *src_)
+{
+  struct lex_source *src = CONST_CAST (struct lex_source *, src_);
  
-    case T_LBRACK:
-      return "[";
+  struct state
+    {
+      struct segmenter segmenter;
+      enum segment_type last_segment;
+      int newlines;
+      size_t line_pos;
+      size_t seg_pos;
+    };
+
+  struct state state, saved;
+  enum scan_result result;
+  struct scanner scanner;
+  struct lex_token *token;
+  int n_lines;
+  int i;
+
+  if (src->eof)
+    return false;
  
-    case T_RBRACK:
-      return "]";
+  state.segmenter = src->segmenter;
+  state.newlines = 0;
+  state.seg_pos = src->seg_pos;
+  state.line_pos = src->line_pos;
+  saved = state;
+
+  token = lex_push_token__ (src);
+  scanner_init (&scanner, &token->token);
+  token->line_pos = src->line_pos;
+  token->token_pos = src->seg_pos;
+  if (src->reader->line_number > 0)
+    token->first_line = src->reader->line_number + src->n_newlines;
+  else
+    token->first_line = 0;
  
-    case T_COMMA:
-      return ",";
+  for (;;)
+    {
+      enum segment_type type;
+      const char *segment;
+      size_t seg_maxlen;
+      int seg_len;
+
+      segment = &src->buffer[state.seg_pos - src->tail];
+      seg_maxlen = src->head - state.seg_pos;
+      seg_len = segmenter_push (&state.segmenter, segment, seg_maxlen, &type);
+      if (seg_len < 0)
+        {
+          lex_source_read__ (src);
+          continue;
+        }
  
-    case T_AND:
-      return "AND";
+      state.last_segment = type;
+      state.seg_pos += seg_len;
+      if (type == SEG_NEWLINE)
+        {
+          state.newlines++;
+          state.line_pos = state.seg_pos;
+        }
  
-    case T_OR:
-      return "OR";
+      result = scanner_push (&scanner, type, ss_buffer (segment, seg_len),
+                             &token->token);
+      if (result == SCAN_SAVE)
+        saved = state;
+      else if (result == SCAN_BACK)
+        {
+          state = saved;
+          break;
+        }
+      else if (result == SCAN_DONE)
+        break;
+    }
  
-    case T_NOT:
-      return "NOT";
+  n_lines = state.newlines;
+  if (state.last_segment == SEG_END_COMMAND && !src->suppress_next_newline)
+    {
+      n_lines++;
+      src->suppress_next_newline = true;
+    }
+  else if (n_lines > 0 && src->suppress_next_newline)
+    {
+      n_lines--;
+      src->suppress_next_newline = false;
+    }
+  for (i = 0; i < n_lines; i++)
+    {
+      const char *newline;
+      const char *line;
+      size_t line_len;
  
-    case T_EQ:
-      return "EQ";
+      line = &src->buffer[src->journal_pos - src->tail];
+      newline = rawmemchr (line, '\n');
+      line_len = newline - line;
+      if (line_len > 0 && line[line_len - 1] == '\r')
+        line_len--;
  
-    case T_GE:
-      return ">=";
+      text_item_submit (text_item_create_nocopy (TEXT_ITEM_SYNTAX,
+                                                 xmemdup0 (line, line_len)));
  
-    case T_GT:
-      return ">";
+      src->journal_pos += newline - line + 1;
+    }
  
-    case T_LE:
-      return "<=";
+  token->token_len = state.seg_pos - src->seg_pos;
  
-    case T_LT:
-      return "<";
+  src->segmenter = state.segmenter;
+  src->seg_pos = state.seg_pos;
+  src->line_pos = state.line_pos;
+  src->n_newlines += state.newlines;
  
-    case T_NE:
-      return "~=";
+  switch (token->token.type)
+    {
+    default:
+      break;
  
-    case T_ALL:
-      return "ALL";
+    case T_STOP:
+      token->token.type = T_ENDCMD;
+      src->eof = true;
+      break;
  
-    case T_BY:
-      return "BY";
+    case SCAN_BAD_HEX_LENGTH:
+      lex_get_error (src, _("String of hex digits has %d characters, which "
+                            "is not a multiple of 2"),
+                     (int) token->token.number);
+      break;
  
-    case T_TO:
-      return "TO";
+    case SCAN_BAD_HEX_DIGIT:
+    case SCAN_BAD_UNICODE_DIGIT:
+      lex_get_error (src, _("`%c' is not a valid hex digit"),
+                     (int) token->token.number);
+      break;
  
-    case T_WITH:
-      return "WITH";
+    case SCAN_BAD_UNICODE_LENGTH:
+      lex_get_error (src, _("Unicode string contains %d bytes, which is "
+                            "not in the valid range of 1 to 8 bytes"),
+                     (int) token->token.number);
+      break;
  
-    case T_EXP:
-      return "**";
-    }
+    case SCAN_BAD_UNICODE_CODE_POINT:
+      lex_get_error (src, _("U+%04X is not a valid Unicode code point"),
+                     (int) token->token.number);
+      break;
  
-  NOT_REACHED ();
-}
+    case SCAN_EXPECTED_QUOTE:
+      lex_get_error (src, _("Unterminated string constant"));
+      break;
  
-/* Returns an ASCII representation of the current token as a
-   malloc()'d string. */
-char *
-lex_token_representation (struct lexer *lexer)
-{
-  char *token_rep;
+    case SCAN_EXPECTED_EXPONENT:
+      lex_get_error (src, _("Missing exponent following `%s'"),
+                     token->token.string.string);
+      break;
  
-  switch (lexer->token)
-    {
-    case T_ID:
-    case T_POS_NUM:
-    case T_NEG_NUM:
-      return ss_xstrdup (lex_tokss (lexer));
+    case SCAN_UNEXPECTED_DOT:
+      lex_get_error (src, _("Unexpected `.' in middle of command"));
+      break;
  
-    case T_STRING:
+    case SCAN_UNEXPECTED_CHAR:
        {
-        struct substring ss;
-       int hexstring = 0;
-       char *sp, *dp;
-
-        ss = lex_tokss (lexer);
-       for (sp = ss_data (ss); sp < ss_end (ss); sp++)
-         if (!c_isprint ((unsigned char) *sp))
-           {
-             hexstring = 1;
-             break;
-           }
-
-       token_rep = xmalloc (2 + ss_length (ss) * 2 + 1 + 1);
-
-       dp = token_rep;
-       if (hexstring)
-         *dp++ = 'X';
-       *dp++ = '\'';
-
-        for (sp = ss_data (ss); sp < ss_end (ss); sp++)
-          if (!hexstring)
-           {
-             if (*sp == '\'')
-               *dp++ = '\'';
-             *dp++ = (unsigned char) *sp;
-           }
-          else
-           {
-             *dp++ = (((unsigned char) *sp) >> 4)["0123456789ABCDEF"];
-             *dp++ = (((unsigned char) *sp) & 15)["0123456789ABCDEF"];
-           }
-       *dp++ = '\'';
-       *dp = '\0';
-
-       return token_rep;
+        char c_name[16];
+        lex_get_error (src, _("Bad character %s in input"),
+                       uc_name (token->token.number, c_name));
        }
+      break;
  
-    default:
-      return xstrdup (lex_token_name (lexer->token));
+    case SCAN_SKIP:
+      lex_source_pop_front (src);
+      break;
      }
+
+  return true;
  }
  \f
-/* Really weird functions. */
+static void
+lex_source_push_endcmd__ (struct lex_source *src)
+{
+  struct lex_token *token = lex_push_token__ (src);
+  token->token.type = T_ENDCMD;
+  token->token_pos = 0;
+  token->token_len = 0;
+  token->line_pos = 0;
+  token->first_line = 0;
+}
  
-/* Skip a COMMENT command. */
-void
-lex_skip_comment (struct lexer *lexer)
+static struct lex_source *
+lex_source_create (struct lex_reader *reader)
  {
-  for (;;)
-    {
-      if (!lex_get_line (lexer))
-        {
-          lexer->put_token = T_STOP;
-         lexer->prog = NULL;
-          return;
-        }
+  struct lex_source *src;
+  enum segmenter_mode mode;
+
+  src = xzalloc (sizeof *src);
+  src->reader = reader;
+
+  if (reader->syntax == LEX_SYNTAX_AUTO)
+    mode = SEG_MODE_AUTO;
+  else if (reader->syntax == LEX_SYNTAX_INTERACTIVE)
+    mode = SEG_MODE_INTERACTIVE;
+  else if (reader->syntax == LEX_SYNTAX_BATCH)
+    mode = SEG_MODE_BATCH;
+  else
+    NOT_REACHED ();
+  segmenter_init (&src->segmenter, mode);
  
-      if (lexer->put_token == T_ENDCMD)
-       break;
+  src->tokens = deque_init (&src->deque, 4, sizeof *src->tokens);
  
-      ds_cstr (&lexer->line_buffer); /* Ensures ds_end will point to a valid char */
-      lexer->prog = ds_end (&lexer->line_buffer);
-      if (lexer->dot)
-       break;
-    }
+  lex_source_push_endcmd__ (src);
+
+  return src;
  }
-\f
-/* Private functions. */
  
-/* When invoked, tokstr contains a string of binary, octal, or
-   hex digits, according to TYPE.  The string is converted to
-   characters having the specified values. */
  static void
-convert_numeric_string_to_char_string (struct lexer *lexer,
-                                      enum string_type type)
+lex_source_destroy (struct lex_source *src)
+{
+  char *file_name = src->reader->file_name;
+  if (src->reader->class->close != NULL)
+    src->reader->class->close (src->reader);
+  free (file_name);
+  free (src->buffer);
+  while (!deque_is_empty (&src->deque))
+    lex_source_pop__ (src);
+  free (src->tokens);
+  ll_remove (&src->ll);
+  free (src);
+}
+\f
+struct lex_file_reader
+  {
+    struct lex_reader reader;
+    struct u8_istream *istream;
+    char *file_name;
+  };
+
+static struct lex_reader_class lex_file_reader_class;
+
+/* Creates and returns a new lex_reader that will read from file FILE_NAME (or
+   from stdin if FILE_NAME is "-").  The file is expected to be encoded with
+   ENCODING, which should take one of the forms accepted by
+   u8_istream_for_file().  SYNTAX and ERROR become the syntax mode and error
+   mode of the new reader, respectively.
+
+   Returns a null pointer if FILE_NAME cannot be opened. */
+struct lex_reader *
+lex_reader_for_file (const char *file_name, const char *encoding,
+                     enum lex_syntax_mode syntax,
+                     enum lex_error_mode error)
  {
-  const char *base_name;
-  int base;
-  int chars_per_byte;
-  size_t byte_cnt;
-  size_t i;
-  char *p;
+  struct lex_file_reader *r;
+  struct u8_istream *istream;
  
-  switch (type)
+  istream = (!strcmp (file_name, "-")
+             ? u8_istream_for_fd (encoding, STDIN_FILENO)
+             : u8_istream_for_file (encoding, file_name, O_RDONLY));
+  if (istream == NULL)
      {
-    case BINARY_STRING:
-      base_name = _("binary");
-      base = 2;
-      chars_per_byte = 8;
-      break;
-    case OCTAL_STRING:
-      base_name = _("octal");
-      base = 8;
-      chars_per_byte = 3;
-      break;
-    case HEX_STRING:
-      base_name = _("hex");
-      base = 16;
-      chars_per_byte = 2;
-      break;
-    default:
-      NOT_REACHED ();
+      msg (ME, _("Opening `%s': %s."), file_name, strerror (errno));
+      return NULL;
      }
  
-  byte_cnt = ds_length (&lexer->tokstr) / chars_per_byte;
-  if (ds_length (&lexer->tokstr) % chars_per_byte)
-    msg (SE, _("String of %s digits has %zu characters, which is not a "
-              "multiple of %d."),
-        base_name, ds_length (&lexer->tokstr), chars_per_byte);
+  r = xmalloc (sizeof *r);
+  lex_reader_init (&r->reader, &lex_file_reader_class);
+  r->reader.syntax = syntax;
+  r->reader.error = error;
+  r->reader.file_name = xstrdup (file_name);
+  r->reader.line_number = 1;
+  r->istream = istream;
+  r->file_name = xstrdup (file_name);
  
-  p = ds_cstr (&lexer->tokstr);
-  for (i = 0; i < byte_cnt; i++)
+  return &r->reader;
+}
+
+static struct lex_file_reader *
+lex_file_reader_cast (struct lex_reader *r)
+{
+  return UP_CAST (r, struct lex_file_reader, reader);
+}
+
+static size_t
+lex_file_read (struct lex_reader *r_, char *buf, size_t n,
+               enum prompt_style prompt_style UNUSED)
+{
+  struct lex_file_reader *r = lex_file_reader_cast (r_);
+  ssize_t n_read = u8_istream_read (r->istream, buf, n);
+  if (n_read < 0)
      {
-      int value;
-      int j;
-
-      value = 0;
-      for (j = 0; j < chars_per_byte; j++, p++)
-       {
-         int v;
-
-         if (*p >= '0' && *p <= '9')
-           v = *p - '0';
-         else
-           {
-             static const char alpha[] = "abcdef";
-             const char *q = strchr (alpha, tolower ((unsigned char) *p));
-
-             if (q)
-               v = q - alpha + 10;
-             else
-               v = base;
-           }
-
-         if (v >= base)
-           msg (SE, _("`%c' is not a valid %s digit."), *p, base_name);
-
-         value = value * base + v;
-       }
-
-      ds_cstr (&lexer->tokstr)[i] = (unsigned char) value;
+      msg (ME, _("Error reading `%s': %s."), r->file_name, strerror (errno));
+      return 0;
      }
-
-  ds_truncate (&lexer->tokstr, byte_cnt);
+  return n_read;
  }
  
-/* Parses a string from the input buffer into tokstr.  The input
-   buffer pointer lexer->prog must point to the initial single or double
-   quote.  TYPE indicates the type of string to be parsed.
-   Returns token type. */
-static int
-parse_string (struct lexer *lexer, enum string_type type)
+static void
+lex_file_close (struct lex_reader *r_)
  {
-  if (type != CHARACTER_STRING)
-    lexer->prog++;
+  struct lex_file_reader *r = lex_file_reader_cast (r_);
  
-  /* Accumulate the entire string, joining sections indicated by +
-     signs. */
-  for (;;)
+  if (u8_istream_fileno (r->istream) != STDIN_FILENO)
      {
-      /* Single or double quote. */
-      int c = *lexer->prog++;
-
-      /* Accumulate section. */
-      for (;;)
-       {
-         /* Check end of line. */
-         if (*lexer->prog == '\0')
-           {
-             msg (SE, _("Unterminated string constant."));
-             goto finish;
-           }
-
-         /* Double quote characters to embed them in strings. */
-         if (*lexer->prog == c)
-           {
-             if (lexer->prog[1] == c)
-               lexer->prog++;
-             else
-               break;
-           }
-
-         ds_put_byte (&lexer->tokstr, *lexer->prog++);
-       }
-      lexer->prog++;
-
-      /* Skip whitespace after final quote mark. */
-      if (lexer->prog == NULL)
-       break;
-      for (;;)
-       {
-         while (c_isspace ((unsigned char) *lexer->prog))
-           lexer->prog++;
-         if (*lexer->prog)
-           break;
-
-         if (lexer->dot)
-           goto finish;
-
-         if (!lex_get_line (lexer))
-            goto finish;
-       }
-
-      /* Skip plus sign. */
-      if (*lexer->prog != '+')
-       break;
-      lexer->prog++;
-
-      /* Skip whitespace after plus sign. */
-      if (lexer->prog == NULL)
-       break;
-      for (;;)
-       {
-         while (c_isspace ((unsigned char) *lexer->prog))
-           lexer->prog++;
-         if (*lexer->prog)
-           break;
-
-         if (lexer->dot)
-           goto finish;
-
-         if (!lex_get_line (lexer))
-            {
-              msg (SE, _("Unexpected end of file in string concatenation."));
-              goto finish;
-            }
-       }
-
-      /* Ensure that a valid string follows. */
-      if (*lexer->prog != '\'' && *lexer->prog != '"')
-       {
-         msg (SE, _("String expected following `+'."));
-         goto finish;
-       }
+      if (u8_istream_close (r->istream) != 0)
+        msg (ME, _("Error closing `%s': %s."), r->file_name, strerror (errno));
      }
+  else
+    u8_istream_free (r->istream);
  
-  /* We come here when we've finished concatenating all the string sections
-     into one large string. */
-finish:
-  if (type != CHARACTER_STRING)
-    convert_numeric_string_to_char_string (lexer, type);
-
-  return T_STRING;
+  free (r->file_name);
+  free (r);
  }
+
+static struct lex_reader_class lex_file_reader_class =
+  {
+    lex_file_read,
+    lex_file_close
+  };
  \f
-/* Token Accessor Functions */
+struct lex_string_reader
+  {
+    struct lex_reader reader;
+    struct substring s;
+    size_t offset;
+  };
  
-enum token_type
-lex_token (const struct lexer *lexer)
+static struct lex_reader_class lex_string_reader_class;
+
+/* Creates and returns a new lex_reader for the contents of S, which must be
+   encoded in UTF-8.  The new reader takes ownership of S and will free it
+   with ss_dealloc() when it is closed. */
+struct lex_reader *
+lex_reader_for_substring_nocopy (struct substring s)
  {
-  return lexer->token;
+  struct lex_string_reader *r;
+
+  r = xmalloc (sizeof *r);
+  lex_reader_init (&r->reader, &lex_string_reader_class);
+  r->reader.syntax = LEX_SYNTAX_INTERACTIVE;
+  r->s = s;
+  r->offset = 0;
+
+  return &r->reader;
  }
  
-double
-lex_tokval (const struct lexer *lexer)
+/* Creates and returns a new lex_reader for a copy of null-terminated string S,
+   which must be encoded in UTF-8.  The caller retains ownership of S. */
+struct lex_reader *
+lex_reader_for_string (const char *s)
  {
-  return lexer->tokval;
+  struct substring ss;
+  ss_alloc_substring (&ss, ss_cstr (s));
+  return lex_reader_for_substring_nocopy (ss);
  }
  
-/* Returns the null-terminated string value associated with LEXER's current
-   token.  For a T_ID token, this is the identifier, and for a T_STRING token,
-   this is the string.  For other tokens the value is undefined. */
-const char *
-lex_tokcstr (const struct lexer *lexer)
+/* Formats FORMAT as a printf()-like format string and creates and returns a
+   new lex_reader for the formatted result.  */
+struct lex_reader *
+lex_reader_for_format (const char *format, ...)
  {
-  return ds_cstr (&lexer->tokstr);
+  struct lex_reader *r;
+  va_list args;
+
+  va_start (args, format);
+  r = lex_reader_for_substring_nocopy (ss_cstr (xvasprintf (format, args)));
+  va_end (args);
+
+  return r;
  }
  
-/* Returns the string value associated with LEXER's current token.  For a T_ID
-   token, this is the identifier, and for a T_STRING token, this is the string.
-   For other tokens the value is undefined. */
-struct substring
-lex_tokss (const struct lexer *lexer)
+static struct lex_string_reader *
+lex_string_reader_cast (struct lex_reader *r)
  {
-  return ds_ss (&lexer->tokstr);
+  return UP_CAST (r, struct lex_string_reader, reader);
  }
  
-/* If the lexer is positioned at the (pseudo)identifier S, which
-   may contain a hyphen ('-'), skips it and returns true.  Each
-   half of the identifier may be abbreviated to its first three
-   letters.
-   Otherwise, returns false. */
-bool
-lex_match_hyphenated_word (struct lexer *lexer, const char *s)
-{
-  const char *hyphen = strchr (s, '-');
-  if (hyphen == NULL)
-    return lex_match_id (lexer, s);
-  else if (lexer->token != T_ID
-          || !lex_id_match (ss_buffer (s, hyphen - s), lex_tokss (lexer))
-          || lex_look_ahead (lexer) != T_DASH)
-    return false;
-  else
-    {
-      lex_get (lexer);
-      lex_force_match (lexer, T_DASH);
-      lex_force_match_id (lexer, hyphen + 1);
-      return true;
-    }
+static size_t
+lex_string_read (struct lex_reader *r_, char *buf, size_t n,
+                 enum prompt_style prompt_style UNUSED)
+{
+  struct lex_string_reader *r = lex_string_reader_cast (r_);
+  size_t chunk;
+
+  chunk = MIN (n, r->s.length - r->offset);
+  memcpy (buf, r->s.string + r->offset, chunk);
+  r->offset += chunk;
+
+  return chunk;
  }
  
+static void
+lex_string_close (struct lex_reader *r_)
+{
+  struct lex_string_reader *r = lex_string_reader_cast (r_);
+
+  ss_dealloc (&r->s);
+  free (r);
+}
+
+static struct lex_reader_class lex_string_reader_class =
+  {
+    lex_string_read,
+    lex_string_close
+  };
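
The token loop in lex_source_get__() above copies its position into `saved` whenever the scanner reports SCAN_SAVE and restores it on SCAN_BACK. The following is a minimal standalone sketch of that save-and-roll-back idea, not PSPP code. The function `scan_number` and its DIGITS[e[+-]DIGITS] grammar are hypothetical, invented only to show a scanner that reads ahead tentatively and then backs up to the last position known to end a valid token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of PSPP: returns the length of the longest
   prefix of S that has the form DIGITS[e[+-]DIGITS].  Returns 0 if S does
   not start with a digit. */
size_t
scan_number (const char *s)
{
  size_t pos = 0;
  size_t saved;

  assert (s != NULL);

  while (isdigit ((unsigned char) s[pos]))
    pos++;
  if (pos == 0)
    return 0;

  /* Everything so far is a valid token, so remember this position
     (the analogue of SCAN_SAVE). */
  saved = pos;

  /* Tentatively consume an exponent. */
  if (s[pos] == 'e' || s[pos] == 'E')
    {
      pos++;
      if (s[pos] == '+' || s[pos] == '-')
        pos++;
      if (isdigit ((unsigned char) s[pos]))
        {
          while (isdigit ((unsigned char) s[pos]))
            pos++;
          saved = pos;          /* The longer token is valid too. */
        }
      else
        pos = saved;            /* Dead end: roll back (like SCAN_BACK). */
    }

  return pos;
}
```

In this sketch only an index is saved. lex_source_get__() saves a whole `struct state` because the segmenter state, newline count, and line position all have to be rewound together.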
diff --git a/src/language/lexer/lexer.h b/src/language/lexer/lexer.h
index f00ca77d2f52deeaa43d4601a6626538b7d3dcfb..b9e936bf994c67cdb7d6952950196f57170b3cf3 100644
--- a/src/language/lexer/lexer.h
+++ b/src/language/lexer/lexer.h
@@ -14,34 +14,91 @@
     You should have received a copy of the GNU General Public License
     along with this program.  If not, see <http://www.gnu.org/licenses/>. */
  
-#if !lexer_h
-#define lexer_h 1
+#ifndef LEXER_H
+#define LEXER_H 1
  
-#include <ctype.h>
  #include <stdbool.h>
  #include <stddef.h>
  
  #include "data/identifier.h"
  #include "data/variable.h"
-#include "libpspp/getl.h"
+#include "libpspp/compiler.h"
+#include "libpspp/prompt.h"
  
  struct lexer;
  
+/* The syntax mode for which a syntax file is intended. */
+enum lex_syntax_mode
+  {
+    LEX_SYNTAX_AUTO,            /* Try to guess intent. */
+    LEX_SYNTAX_INTERACTIVE,     /* Interactive mode. */
+    LEX_SYNTAX_BATCH            /* Batch mode. */
+  };
+
+/* Handling of errors. */
+enum lex_error_mode
+  {
+    LEX_ERROR_INTERACTIVE,     /* Always continue to next command. */
+    LEX_ERROR_CONTINUE,        /* Continue to next command, except for
+                                  cascading failures. */
+    LEX_ERROR_STOP             /* Stop processing. */
+  };
+
+/* Reads a single syntax file as a stream of bytes encoded in UTF-8.
+
+   Not opaque. */
+struct lex_reader
+  {
+    const struct lex_reader_class *class;
+    enum lex_syntax_mode syntax;
+    enum lex_error_mode error;
+    char *file_name;            /* NULL if not associated with a file. */
+    int line_number;            /* 1-based initial line number, 0 if none. */
+  };
+
+/* An implementation of a lex_reader. */
+struct lex_reader_class
+  {
+    /* Reads up to N bytes of data from READER into BUF.  Returns the positive
+       number of bytes read if successful, or zero at end of input or on
+       error.
+
+       STYLE provides a hint to interactive readers as to what kind of syntax
+       is being read right now. */
+    size_t (*read) (struct lex_reader *reader, char *buf, size_t n,
+                    enum prompt_style style);
+
+    /* Closes and destroys READER, releasing any allocated storage.
+
+       The caller will free the 'file_name' member of READER, so the
+       implementation should not do so. */
+    void (*close) (struct lex_reader *reader);
+  };
+
+/* Helper functions for lex_reader. */
+void lex_reader_init (struct lex_reader *, const struct lex_reader_class *);
+void lex_reader_set_file_name (struct lex_reader *, const char *file_name);
+
+/* Creating various kinds of lex_readers. */
+struct lex_reader *lex_reader_for_file (const char *file_name,
+                                        const char *encoding,
+                                        enum lex_syntax_mode syntax,
+                                        enum lex_error_mode error);
+struct lex_reader *lex_reader_for_string (const char *);
+struct lex_reader *lex_reader_for_format (const char *, ...)
+  PRINTF_FORMAT (1, 2);
+struct lex_reader *lex_reader_for_substring_nocopy (struct substring);
+
  /* Initialization. */
-struct lexer * lex_create (struct source_stream *);
+struct lexer *lex_create (void);
  void lex_destroy (struct lexer *);
  
-/* State accessors */
-struct source_stream * lex_get_source_stream (const struct lexer *);
-enum syntax_mode lex_current_syntax_mode (const struct lexer *);
-enum error_mode lex_current_error_mode (const struct lexer *);
+/* Files. */
+void lex_include (struct lexer *, struct lex_reader *);
+void lex_append (struct lexer *, struct lex_reader *);
  
-/* Common functions. */
+/* Advancing. */
  void lex_get (struct lexer *);
-void lex_error (struct lexer *, const char *, ...);
-void lex_sbc_only_once (const char *);
-void lex_sbc_missing (struct lexer *, const char *);
-int lex_end_of_command (struct lexer *);
  
  /* Token testing functions. */
  bool lex_is_number (struct lexer *);
@@ -50,14 +107,19 @@ bool lex_is_integer (struct lexer *);
  long lex_integer (struct lexer *);
  bool lex_is_string (struct lexer *);
  
+/* Token testing functions with lookahead. */
+bool lex_next_is_number (struct lexer *, int n);
+double lex_next_number (struct lexer *, int n);
+bool lex_next_is_integer (struct lexer *, int n);
+long lex_next_integer (struct lexer *, int n);
+bool lex_next_is_string (struct lexer *, int n);
  
  /* Token matching functions. */
  bool lex_match (struct lexer *, enum token_type);
  bool lex_match_id (struct lexer *, const char *);
  bool lex_match_id_n (struct lexer *, const char *, size_t n);
  bool lex_match_int (struct lexer *, int);
-bool lex_match_hyphenated_word (struct lexer *lexer, const char *s);
-
+bool lex_match_phrase (struct lexer *, const char *s);
  
  /* Forcible matching functions. */
  bool lex_force_match (struct lexer *, enum token_type);
@@ -67,36 +129,46 @@ bool lex_force_num (struct lexer *);
  bool lex_force_id (struct lexer *);
  bool lex_force_string (struct lexer *);
  
-/* Weird token functions. */
-enum token_type lex_look_ahead (struct lexer *);
-void lex_put_back (struct lexer *, enum token_type);
-
-/* Weird line processing functions. */
-const char *lex_entire_line (const struct lexer *);
-const struct string *lex_entire_line_ds (const struct lexer *);
-const char *lex_rest_of_line (const struct lexer *);
-bool lex_end_dot (const struct lexer *);
-void lex_preprocess_line (struct string *, enum syntax_mode,
-                          bool *line_starts_command,
-                          bool *line_ends_command);
-void lex_discard_line (struct lexer *);
-void lex_discard_rest_of_command (struct lexer *);
-
-/* Weird line reading functions. */
-bool lex_get_line (struct lexer *);
-bool lex_get_line_raw (struct lexer *);
-
-/* Token names. */
-const char *lex_token_name (enum token_type);
-char *lex_token_representation (struct lexer *);
-
-/* Token accessors */
+/* Token accessors. */
  enum token_type lex_token (const struct lexer *);
  double lex_tokval (const struct lexer *);
  const char *lex_tokcstr (const struct lexer *);
  struct substring lex_tokss (const struct lexer *);
  
-/* Really weird functions. */
-void lex_skip_comment (struct lexer *);
+/* Looking ahead. */
+const struct token *lex_next (const struct lexer *, int n);
+enum token_type lex_next_token (const struct lexer *, int n);
+const char *lex_next_tokcstr (const struct lexer *, int n);
+double lex_next_tokval (const struct lexer *, int n);
+struct substring lex_next_tokss (const struct lexer *, int n);
+
+/* Current position. */
+int lex_get_first_line_number (const struct lexer *, int n);
+int lex_get_last_line_number (const struct lexer *, int n);
+int lex_get_first_column (const struct lexer *, int n);
+int lex_get_last_column (const struct lexer *, int n);
+const char *lex_get_file_name (const struct lexer *);
+
+/* Issuing errors. */
+void lex_error (struct lexer *, const char *, ...) PRINTF_FORMAT (2, 3);
+void lex_next_error (struct lexer *, int n0, int n1, const char *, ...)
+  PRINTF_FORMAT (4, 5);
+int lex_end_of_command (struct lexer *);
+
+void lex_sbc_only_once (const char *);
+void lex_sbc_missing (struct lexer *, const char *);
+
+void lex_error_valist (struct lexer *, const char *, va_list)
+  PRINTF_FORMAT (2, 0);
+void lex_next_error_valist (struct lexer *lexer, int n0, int n1,
+                            const char *format, va_list)
+  PRINTF_FORMAT (4, 0);
+
+/* Error handling. */
+enum lex_syntax_mode lex_get_syntax_mode (const struct lexer *);
+enum lex_error_mode lex_get_error_mode (const struct lexer *);
+void lex_discard_rest_of_command (struct lexer *);
+void lex_interactive_reset (struct lexer *);
+void lex_discard_noninteractive (struct lexer *);
  
-#endif /* !lexer_h */
+#endif /* lexer.h */
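
struct lex_reader_class above pairs a read callback with a close callback, and the file and string readers in lexer.c embed struct lex_reader as a member and recover their own structure with UP_CAST. The following is a rough standalone sketch of that pattern. All names in it (`struct reader`, `struct reader_class`, `string_reader_create`, the `cls` member) are hypothetical stand-ins rather than PSPP's API. It uses plain offsetof arithmetic where PSPP uses UP_CAST, and malloc where PSPP uses xmalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct reader;

/* Hypothetical analogue of struct lex_reader_class. */
struct reader_class
  {
    size_t (*read) (struct reader *, char *buf, size_t n);
    void (*close) (struct reader *);
  };

/* Hypothetical analogue of struct lex_reader. */
struct reader
  {
    const struct reader_class *cls;
  };

/* A reader that serves bytes from a private copy of a string. */
struct string_reader
  {
    struct reader reader;       /* Embedded base structure. */
    char *data;
    size_t length;
    size_t offset;
  };

/* Converts a pointer to the embedded base back into its container. */
static struct string_reader *
string_reader_cast (struct reader *r)
{
  return (struct string_reader *) ((char *) r
                                   - offsetof (struct string_reader, reader));
}

static size_t
string_read (struct reader *r_, char *buf, size_t n)
{
  struct string_reader *r = string_reader_cast (r_);
  size_t left = r->length - r->offset;
  size_t chunk = n < left ? n : left;

  memcpy (buf, r->data + r->offset, chunk);
  r->offset += chunk;
  return chunk;                 /* Zero signals end of input. */
}

static void
string_close (struct reader *r_)
{
  struct string_reader *r = string_reader_cast (r_);
  free (r->data);
  free (r);
}

static const struct reader_class string_reader_class =
  {
    string_read,
    string_close
  };

/* Creates a reader for a copy of S.  Returns NULL if memory runs out. */
struct reader *
string_reader_create (const char *s)
{
  struct string_reader *r;

  assert (s != NULL);
  r = malloc (sizeof *r);
  if (r == NULL)
    return NULL;
  r->length = strlen (s);
  r->data = malloc (r->length + 1);
  if (r->data == NULL)
    {
      free (r);
      return NULL;
    }
  memcpy (r->data, s, r->length + 1);
  r->reader.cls = &string_reader_class;
  r->offset = 0;
  return &r->reader;
}
```

A caller only ever sees `struct reader *` and calls through `r->cls->read` and `r->cls->close`, so a new input source can be added without changing the consumer.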
diff --git a/src/language/lexer/q2c.c b/src/language/lexer/q2c.c
index d578dbc2919fee8bcc81264bac22c9ed512ddfed..f53ccfc33c181ad48cba48705f35e23d5cb2d552 100644
--- a/src/language/lexer/q2c.c
+++ b/src/language/lexer/q2c.c
@@ -1425,7 +1425,7 @@ make_match (const char *t)
    else if (strchr (t, hyphen_proxy))
      {
        char *c = unmunge (t);
-      sprintf (s, "lex_match_hyphenated_word (lexer, \"%s\")", c);
+      sprintf (s, "lex_match_phrase (lexer, \"%s\")", c);
        free (c);
      }
    else
@@ -1836,12 +1836,12 @@ dump_parser (int persistent)
        if (def->type == SBC_VARLIST)
         dump (1, "if (lex_token (lexer) == T_ID "
                "&& dict_lookup_var (dataset_dict (ds), lex_tokcstr (lexer)) != NULL "
-             "&& lex_look_ahead (lexer) != '=')");
+             "&& lex_next_token (lexer, 1) != T_EQUALS)");
        else
         {
           dump (0, "if ((lex_token (lexer) == T_ID "
                  "&& dict_lookup_var (dataset_dict (ds), lex_tokcstr (lexer)) "
-               "&& lex_look_ahead () != '=')");
+               "&& lex_next_token (lexer, 1) != T_EQUALS)");
           dump (1, "     || token == T_ALL)");
         }
        dump (1, "{");
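
The generated parser code above now asks for the token type one position ahead with lex_next_token (lexer, 1), where it previously called the single-token lex_look_ahead(). The following is a rough sketch of indexed lookahead, not PSPP's implementation. It assumes a hypothetical fixed-capacity ring buffer (`struct token_queue`) that is filled on demand from an array of integer token codes, whereas PSPP's lexer uses its own deque fed by the scanner. `LOOKAHEAD_CAP` and `TOKEN_END` are likewise invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define LOOKAHEAD_CAP 8         /* Hypothetical limit on lookahead depth. */
#define TOKEN_END (-1)          /* Hypothetical end-of-input token. */

struct token_queue
  {
    const int *input;           /* Token source: an array of token codes. */
    size_t n_input;             /* Number of tokens in INPUT. */
    size_t next_input;          /* Index of next token to pull from INPUT. */

    int ring[LOOKAHEAD_CAP];    /* Tokens pulled but not yet consumed. */
    size_t front;               /* Index in RING of the current token. */
    size_t count;               /* Number of tokens held in RING. */
  };

void
token_queue_init (struct token_queue *q, const int *input, size_t n_input)
{
  q->input = input;
  q->n_input = n_input;
  q->next_input = 0;
  q->front = 0;
  q->count = 0;
}

/* Returns the token N positions ahead, where N == 0 is the current token.
   Pulls tokens from the input only as far as needed; past the end of the
   input it yields TOKEN_END. */
int
token_queue_peek (struct token_queue *q, size_t n)
{
  assert (n < LOOKAHEAD_CAP);
  while (q->count <= n)
    {
      int token = (q->next_input < q->n_input
                   ? q->input[q->next_input++]
                   : TOKEN_END);
      q->ring[(q->front + q->count) % LOOKAHEAD_CAP] = token;
      q->count++;
    }
  return q->ring[(q->front + n) % LOOKAHEAD_CAP];
}

/* Consumes the current token, making the next one current. */
void
token_queue_advance (struct token_queue *q)
{
  token_queue_peek (q, 0);      /* Make sure there is a current token. */
  q->front = (q->front + 1) % LOOKAHEAD_CAP;
  q->count--;
}
```

With this shape, a check such as "current token is an identifier and the next one is not an equals sign" becomes two peeks at offsets 0 and 1, with no need to put a token back.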
diff --git a/src/language/lexer/value-parser.c b/src/language/lexer/value-parser.c
index 649bbf2498c248c57afc17a55edda014ab296e8f..ff2701e8dadfb586b64a791367de731177b00563 100644
--- a/src/language/lexer/value-parser.c
+++ b/src/language/lexer/value-parser.c
@@ -107,7 +107,7 @@ parse_number (struct lexer *lexer, double *x, const enum fmt_type *format)
  
        assert (fmt_get_category (*format) != FMT_CAT_STRING);
  
-      if (!data_in_msg (lex_tokss (lexer), C_ENCODING, *format, &v, 0, NULL))
+      if (!data_in_msg (lex_tokss (lexer), "UTF-8", *format, &v, 0, NULL))
          return false;
  
        lex_get (lexer);
diff --git a/src/language/lexer/variable-parser.c b/src/language/lexer/variable-parser.c
index 84cf972563545becd34344a8b1b64a787c89ce0e..fbcc170d84b34b33f790a852a64853744f27e12c 100644
--- a/src/language/lexer/variable-parser.c
+++ b/src/language/lexer/variable-parser.c
@@ -415,8 +415,8 @@ add_var_name (char *name,
  /* Parses a list of variable names according to the DATA LIST version
     of the TO convention.  */
  bool
-parse_DATA_LIST_vars (struct lexer *lexer, char ***namesp,
-                      size_t *n_varsp, int pv_opts)
+parse_DATA_LIST_vars (struct lexer *lexer, const struct dictionary *dict,
+                      char ***namesp, size_t *n_varsp, int pv_opts)
  {
    char **names;
    size_t n_vars;
@@ -453,7 +453,8 @@ parse_DATA_LIST_vars (struct lexer *lexer, char ***namesp,
  
    do
      {
-      if (lex_token (lexer) != T_ID)
+      if (lex_token (lexer) != T_ID
+          || !dict_id_is_valid (dict, lex_tokcstr (lexer), true))
         {
           lex_error (lexer, "expecting variable name");
           goto exit;
@@ -474,7 +475,8 @@ parse_DATA_LIST_vars (struct lexer *lexer, char ***namesp,
            unsigned long int number;
  
           lex_get (lexer);
-         if (lex_token (lexer) != T_ID)
+         if (lex_token (lexer) != T_ID
+              || !dict_id_is_valid (dict, lex_tokcstr (lexer), true))
             {
               lex_error (lexer, "expecting variable name");
               goto exit;
@@ -574,7 +576,8 @@ register_vars_pool (struct pool *pool, char **names, size_t nnames)
     parse_DATA_LIST_vars(), except that all allocations are taken
     from the given POOL. */
  bool
-parse_DATA_LIST_vars_pool (struct lexer *lexer, struct pool *pool,
+parse_DATA_LIST_vars_pool (struct lexer *lexer, const struct dictionary *dict,
+                           struct pool *pool,
                             char ***names, size_t *nnames, int pv_opts)
  {
    int retval;
@@ -585,7 +588,7 @@ parse_DATA_LIST_vars_pool (struct lexer *lexer, struct pool *pool,
       re-free it later. */
    assert (!(pv_opts & PV_APPEND));
  
-  retval = parse_DATA_LIST_vars (lexer, names, nnames, pv_opts);
+  retval = parse_DATA_LIST_vars (lexer, dict, names, nnames, pv_opts);
    if (retval)
      register_vars_pool (pool, *names, *nnames);
    return retval;
@@ -624,7 +627,7 @@ parse_mixed_vars (struct lexer *lexer, const struct dictionary *dict,
           free (v);
           *nnames += nv;
         }
-      else if (!parse_DATA_LIST_vars (lexer, names, nnames, PV_APPEND))
+      else if (!parse_DATA_LIST_vars (lexer, dict, names, nnames, PV_APPEND))
         goto fail;
      }
    return 1;
diff --git a/src/language/lexer/variable-parser.h b/src/language/lexer/variable-parser.h
index b0ab8c51920fc170acfc72fba19224da6cea7302..7abea0a14beade2d73626e24f16e77f300367a72 100644
--- a/src/language/lexer/variable-parser.h
+++ b/src/language/lexer/variable-parser.h
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006, 2007 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2006, 2007, 2010 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -59,9 +59,11 @@ bool parse_variables_pool (struct lexer *, struct pool *, const struct dictionar
                            struct variable ***, size_t *, int opts);
  bool parse_var_set_vars (struct lexer *, const struct var_set *, struct variable ***, size_t *,
                          int opts);
-bool parse_DATA_LIST_vars (struct lexer *, char ***names, size_t *cnt, int opts);
-bool parse_DATA_LIST_vars_pool (struct lexer *, struct pool *,
-                               char ***names, size_t *cnt, int opts);
+bool parse_DATA_LIST_vars (struct lexer *, const struct dictionary *,
+                           char ***names, size_t *cnt, int opts);
+bool parse_DATA_LIST_vars_pool (struct lexer *, const struct dictionary *,
+                                struct pool *,
+                                char ***names, size_t *cnt, int opts);
  bool parse_mixed_vars (struct lexer *, const struct dictionary *dict,
                        char ***names, size_t *cnt, int opts);
  bool parse_mixed_vars_pool (struct lexer *, const struct dictionary *dict,
diff --git a/src/language/prompt.c b/src/language/prompt.c
deleted file mode 100644
index 614796e..0000000
--- a/src/language/prompt.c
+++ /dev/null
@@ -1,75 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2010, 2011 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#include <config.h>
-
-#include <stdio.h>
-#include <errno.h>
-#include <stdlib.h>
-
-#include "language/prompt.h"
-
-#include "data/file-name.h"
-#include "data/settings.h"
-#include "data/variable.h"
-#include "language/command.h"
-#include "language/lexer/lexer.h"
-#include "libpspp/assertion.h"
-#include "libpspp/message.h"
-#include "libpspp/str.h"
-#include "libpspp/version.h"
-#include "output/tab.h"
-
-#include "gl/xalloc.h"
-
-/* Current prompting style. */
-static enum prompt_style current_style;
-
-/* Gets the command prompt for the given STYLE. */
-const char *
-prompt_get (enum prompt_style style)
-{
-  switch (style)
-    {
-    case PROMPT_FIRST:
-      return "PSPP> ";
-
-    case PROMPT_LATER:
-      return "    > ";
-
-    case PROMPT_DATA:
-      return "data> ";
-
-    case PROMPT_CNT:
-      NOT_REACHED ();
-    }
-  NOT_REACHED ();
-}
-
-/* Sets STYLE as the current prompt style. */
-void
-prompt_set_style (enum prompt_style style)
-{
-  assert (style < PROMPT_CNT);
-  current_style = style;
-}
-
-/* Returns the current prompt. */
-enum prompt_style
-prompt_get_style (void)
-{
-  return current_style;
-}
diff --git a/src/language/prompt.h b/src/language/prompt.h
deleted file mode 100644
index aa5733a..0000000
--- a/src/language/prompt.h
+++ /dev/null
@@ -1,35 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2010 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#ifndef PROMPT_H
-#define PROMPT_H 1
-
-#include <stdbool.h>
-
-enum prompt_style
-  {
-    PROMPT_FIRST,              /* First line of command. */
-    PROMPT_LATER,          /* Second or later line of command. */
-    PROMPT_DATA,               /* Between BEGIN DATA and END DATA. */
-    PROMPT_CNT
-  };
-
-enum prompt_style prompt_get_style (void);
-void prompt_set_style (enum prompt_style);
-
-const char *prompt_get (enum prompt_style);
-
-#endif /* PROMPT_H */
diff --git a/src/language/stats/aggregate.c b/src/language/stats/aggregate.c
index 241ad7c13c81e95e725672f54a0c31e68aab27fc..2a4eda87ad042092f81d3eccd93c4bb46c2f4025 100644
--- a/src/language/stats/aggregate.c
+++ b/src/language/stats/aggregate.c
@@ -39,14 +39,15 @@
  #include "language/lexer/variable-parser.h"
  #include "language/stats/sort-criteria.h"
  #include "libpspp/assertion.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/misc.h"
  #include "libpspp/pool.h"
  #include "libpspp/str.h"
  #include "math/moments.h"
+#include "math/percentiles.h"
  #include "math/sort.h"
  #include "math/statistic.h"
-#include "math/percentiles.h"
  
  #include "gl/minmax.h"
  #include "gl/xalloc.h"
@@ -416,7 +417,7 @@ parse_aggregate_functions (struct lexer *lexer, const struct dictionary *dict,
         {
           size_t n_dest_prev = n_dest;
  
-         if (!parse_DATA_LIST_vars (lexer, &dest, &n_dest,
+         if (!parse_DATA_LIST_vars (lexer, dict, &dest, &n_dest,
                                       (PV_APPEND | PV_SINGLE | PV_NO_SCRATCH
                                        | PV_NO_DUPLICATE)))
             goto error;
@@ -434,14 +435,8 @@ parse_aggregate_functions (struct lexer *lexer, const struct dictionary *dict,
  
           if (lex_is_string (lexer))
             {
-              /* XXX check re-encoded length */
-             struct string label;
-             ds_init_substring (&label, lex_tokss (lexer));
-
-             ds_truncate (&label, 255);
-             dest_label[n_dest - 1] = ds_xstrdup (&label);
+             dest_label[n_dest - 1] = xstrdup (lex_tokcstr (lexer));
               lex_get (lexer);
-             ds_destroy (&label);
             }
         }
  
@@ -502,7 +497,9 @@ parse_aggregate_functions (struct lexer *lexer, const struct dictionary *dict,
                 lex_match (lexer, T_COMMA);
                 if (lex_is_string (lexer))
                   {
-                   arg[i].c = ss_xstrdup (lex_tokss (lexer));
+                   arg[i].c = recode_string (dict_get_encoding (agr->dict),
+                                              "UTF-8", lex_tokcstr (lexer),
+                                              -1);
                     type = VAL_STRING;
                   }
                 else if (lex_is_number (lexer))
@@ -640,7 +637,8 @@ parse_aggregate_functions (struct lexer *lexer, const struct dictionary *dict,
  
             free (dest[i]);
             if (dest_label[i])
-              var_set_label (destvar, dest_label[i]);
+              var_set_label (destvar, dest_label[i],
+                             dict_get_encoding (agr->dict), true);
  
             v->dest = destvar;
           }
@@ -811,6 +809,7 @@ accumulate_aggregate_info (struct agr_proc *agr, const struct ccase *input)
             iter->int1 = 1;
             break;
           case MAX | FSTRING:
+            /* Need to do some kind of Unicode collation thingy here */
             if (memcmp (iter->string, value_str (v, src_width), src_width) < 0)
               memcpy (iter->string, value_str (v, src_width), src_width);
             iter->int1 = 1;
diff --git a/src/language/stats/autorecode.c b/src/language/stats/autorecode.c
index 77385783549029af27c4e3ced65ba07c30e03482..fdf3dddc2ea83cec64daff9f394f34c40978b4e6 100644
--- a/src/language/stats/autorecode.c
+++ b/src/language/stats/autorecode.c
@@ -120,7 +120,8 @@ cmd_autorecode (struct lexer *lexer, struct dataset *ds)
    if (!lex_force_match_id (lexer, "INTO"))
      goto error;
    lex_match (lexer, T_EQUALS);
-  if (!parse_DATA_LIST_vars (lexer, &dst_names, &n_dsts, PV_NO_DUPLICATE))
+  if (!parse_DATA_LIST_vars (lexer, dict, &dst_names, &n_dsts,
+                             PV_NO_DUPLICATE))
      goto error;
    if (n_dsts != n_srcs)
      {
diff --git a/src/language/stats/descriptives.c b/src/language/stats/descriptives.c
index 50d52d3cff03f68d2bac2d9d0a84097b961ae032..adedd5e67f600aeb651eafa68316d615d5d73ef0 100644
--- a/src/language/stats/descriptives.c
+++ b/src/language/stats/descriptives.c
@@ -31,9 +31,10 @@
  #include "language/lexer/lexer.h"
  #include "language/lexer/variable-parser.h"
  #include "libpspp/array.h"
+#include "libpspp/assertion.h"
  #include "libpspp/compiler.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
-#include "libpspp/assertion.h"
  #include "math/moments.h"
  #include "output/tab.h"
  
@@ -303,7 +304,7 @@ cmd_descriptives (struct lexer *lexer, struct dataset *ds)
          }
        else if (var_cnt == 0)
          {
-          if (lex_look_ahead (lexer) == T_EQUALS)
+          if (lex_next_token (lexer, 1) == T_EQUALS)
              {
                lex_match_id (lexer, "VARIABLES");
                lex_match (lexer, T_EQUALS);
@@ -507,17 +508,22 @@ static char *
  generate_z_varname (const struct dictionary *dict, struct dsc_proc *dsc,
                      const char *var_name, int *z_cnt)
  {
-  char name[VAR_NAME_LEN + 1];
+  char *z_name, *trunc_name;
  
    /* Try a name based on the original variable name. */
-  name[0] = 'Z';
-  str_copy_trunc (name + 1, sizeof name - 1, var_name);
-  if (try_name (dict, dsc, name))
-    return xstrdup (name);
+  z_name = xasprintf ("Z%s", var_name);
+  trunc_name = utf8_encoding_trunc (z_name, dict_get_encoding (dict),
+                                    ID_MAX_LEN);
+  free (z_name);
+  if (try_name (dict, dsc, trunc_name))
+    return trunc_name;
+  free (trunc_name);
  
    /* Generate a synthetic name. */
    for (;;)
      {
+      char name[8];
+
        (*z_cnt)++;
  
        if (*z_cnt <= 99)
@@ -675,7 +681,8 @@ setup_z_trns (struct dsc_proc *dsc, struct dataset *ds)
  
           dst_var = dict_create_var_assert (dataset_dict (ds), dv->z_name, 0);
            var_set_label (dst_var, xasprintf (_("Z-score of %s"),
-                                             var_to_string (dv->v)));
+                                             var_to_string (dv->v)),
+                         dict_get_encoding (dataset_dict (ds)), false);
  
            z = &t->z_scores[cnt++];
            z->src_var = dv->v;
diff --git a/src/language/stats/flip.c b/src/language/stats/flip.c
index 534efb4919c05c64bbb31d1c0f9d50cd3f3f7635..23544c8b1ae2e7b822c193fe01fe928e1838219c 100644
--- a/src/language/stats/flip.c
+++ b/src/language/stats/flip.c
@@ -220,7 +220,7 @@ cmd_flip (struct lexer *lexer, struct dataset *ds)
                                           flip->n_vars,
                                           &flip_casereader_class, flip);
    proc_set_active_file_data (ds, reader);
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  
   error:
    destroy_flip_pgm (flip);
@@ -249,7 +249,7 @@ make_new_var (struct dictionary *dict, const char *name_)
      *--cp = '\0';
  
    /* Fix invalid characters. */
-  for (cp = name; *cp && cp < name + VAR_NAME_LEN; cp++)
+  for (cp = name; *cp && cp < name + ID_MAX_LEN; cp++)
      if (cp == name)
        {
          if (!lex_is_id1 (*cp) || *cp == '$')
@@ -270,8 +270,8 @@ make_new_var (struct dictionary *dict, const char *name_)
        int i;
        for (i = 1; ; i++)
          {
-          char n[VAR_NAME_LEN + 1];
-          int ofs = MIN (VAR_NAME_LEN - 1 - intlog10 (i), len);
+          char n[ID_MAX_LEN + 1];
+          int ofs = MIN (ID_MAX_LEN - 1 - intlog10 (i), len);
            strncpy (n, name, ofs);
            sprintf (&n[ofs], "%d", i);
  
diff --git a/src/language/stats/frequencies.q b/src/language/stats/frequencies.q
index adc4f16b359876ff350f2c5b36c2ecdaa37a6ee6..ef4b7f9589d3a770fc9650a1a93b5581e9a2e33e 100644
--- a/src/language/stats/frequencies.q
+++ b/src/language/stats/frequencies.q
@@ -738,14 +738,23 @@ frq_custom_grouped (struct lexer *lexer, struct dataset *ds, struct cmd_frequenc
            }
  
         free (v);
-       if (!lex_match (lexer, T_SLASH))
-         break;
-       if ((lex_token (lexer) != T_ID || dict_lookup_var (dataset_dict (ds), lex_tokcstr (lexer)) != NULL)
-            && lex_token (lexer) != T_ALL)
-         {
-           lex_put_back (lexer, T_SLASH);
-           break;
-         }
+        if (lex_token (lexer) != T_SLASH)
+          break;
+
+        if ((lex_next_token (lexer, 1) == T_ID
+             && dict_lookup_var (dataset_dict (ds),
+                                 lex_next_tokcstr (lexer, 1)))
+            || lex_next_token (lexer, 1) == T_ALL)
+          {
+            /* The token after the slash is a variable name.  Keep parsing. */
+            lex_get (lexer);
+          }
+        else
+          {
+            /* The token after the slash must be the start of a new
+               subcommand.  Let the caller see the slash. */
+            break;
+          }
        }
  
    return 1;
diff --git a/src/language/stats/npar.c b/src/language/stats/npar.c
index 3a178b233dddbcfae06d8559d01587c4a589231b..a572e09f56cbbcccc2b0bba90dc150db5f239cf7 100644
--- a/src/language/stats/npar.c
+++ b/src/language/stats/npar.c
@@ -258,8 +258,8 @@ parse_npar_tests (struct lexer *lexer, struct dataset *ds, struct cmd_npar_tests
                NOT_REACHED ();
              }
          }
-      else if (lex_match_hyphenated_word (lexer, "K-W") ||
-              lex_match_hyphenated_word (lexer, "KRUSKAL-WALLIS"))
+      else if (lex_match_phrase (lexer, "K-W") ||
+              lex_match_phrase (lexer, "KRUSKAL-WALLIS"))
          {
            lex_match (lexer, T_EQUALS);
            npt->kruskal_wallis++;
@@ -276,8 +276,8 @@ parse_npar_tests (struct lexer *lexer, struct dataset *ds, struct cmd_npar_tests
                NOT_REACHED ();
              }
          }
-      else if (lex_match_hyphenated_word (lexer, "M-W") ||
-              lex_match_hyphenated_word (lexer, "MANN-WHITNEY"))
+      else if (lex_match_phrase (lexer, "M-W") ||
+              lex_match_phrase (lexer, "MANN-WHITNEY"))
          {
            lex_match (lexer, T_EQUALS);
            npt->mann_whitney++;
@@ -759,44 +759,39 @@ npar_chisquare (struct lexer *lexer, struct dataset *ds,
  
    cstp->n_expected = 0;
    cstp->expected = NULL;
-  if ( lex_match (lexer, T_SLASH) )
+  if (lex_match_phrase (lexer, "/EXPECTED"))
      {
-      if ( lex_match_id (lexer, "EXPECTED") )
-       {
-         lex_force_match (lexer, T_EQUALS);
-         if ( ! lex_match_id (lexer, "EQUAL") )
-           {
-             double f;
-             int n;
-             while ( lex_is_number (lexer) )
-               {
-                 int i;
-                 n = 1;
-                 f = lex_number (lexer);
-                 lex_get (lexer);
-                 if ( lex_match (lexer, T_ASTERISK))
-                   {
-                     n = f;
-                     f = lex_number (lexer);
-                     lex_get (lexer);
-                   }
-                 lex_match (lexer, T_COMMA);
-
-                 cstp->n_expected += n;
-                 cstp->expected = pool_realloc (specs->pool,
-                                                cstp->expected,
-                                                sizeof (double) *
-                                                cstp->n_expected);
-                 for ( i = cstp->n_expected - n ;
-                       i < cstp->n_expected;
-                       ++i )
-                   cstp->expected[i] = f;
+      lex_force_match (lexer, T_EQUALS);
+      if ( ! lex_match_id (lexer, "EQUAL") )
+        {
+          double f;
+          int n;
+          while ( lex_is_number (lexer) )
+            {
+              int i;
+              n = 1;
+              f = lex_number (lexer);
+              lex_get (lexer);
+              if ( lex_match (lexer, T_ASTERISK))
+                {
+                  n = f;
+                  f = lex_number (lexer);
+                  lex_get (lexer);
+                }
+              lex_match (lexer, T_COMMA);
  
-               }
-           }
-       }
-      else
-        retval = 3;
+              cstp->n_expected += n;
+              cstp->expected = pool_realloc (specs->pool,
+                                             cstp->expected,
+                                             sizeof (double) *
+                                             cstp->n_expected);
+              for ( i = cstp->n_expected - n ;
+                    i < cstp->n_expected;
+                    ++i )
+                cstp->expected[i] = f;
+
+            }
+        }
      }
  
    if ( cstp->ranged && cstp->n_expected > 0 &&
@@ -828,7 +823,7 @@ npar_binomial (struct lexer *lexer, struct dataset *ds,
    struct binomial_test *btp = pool_alloc (specs->pool, sizeof (*btp));
    struct one_sample_test *tp = &btp->parent;
    struct npar_test *nt = &tp->parent;
-  bool equals;
+  bool equals = false;
  
    nt->execute = binomial_execute;
    nt->insert_variables = one_sample_insert_variables;
diff --git a/src/language/stats/rank.q b/src/language/stats/rank.q
index 49a040e3ae0c7facbb87258cf21c8f108e4973b0..97c98c32f0613573c04cf23af8b30be329f4398b 100644
--- a/src/language/stats/rank.q
+++ b/src/language/stats/rank.q
@@ -198,7 +198,8 @@ fraction_name(void)
  /* Create a label on DEST_VAR, describing its derivation from SRC_VAR and F */
  static void
  create_var_label (struct variable *dest_var,
-                 const struct variable *src_var, enum RANK_FUNC f)
+                 const struct variable *src_var, enum RANK_FUNC f,
+                  const char *dict_encoding)
  {
    struct string label;
    ds_init_empty (&label);
@@ -224,7 +225,7 @@ create_var_label (struct variable *dest_var,
      ds_put_format (&label, _("%s of %s"),
                     function_name[f], var_get_name (src_var));
  
-  var_set_label (dest_var, ds_cstr (&label));
+  var_set_label (dest_var, ds_cstr (&label), dict_encoding, false);
  
    ds_destroy (&label);
  }
@@ -673,15 +674,18 @@ cmd_rank (struct lexer *lexer, struct dataset *ds)
        int v;
        for ( v = 0 ; v < n_src_vars ;  v ++ )
         {
+          struct dictionary *dict = dataset_dict (ds);
+
           if ( rank_specs[i].destvars[v] == NULL )
             {
               rank_specs[i].destvars[v] =
-               create_rank_variable (dataset_dict(ds), rank_specs[i].rfunc, src_vars[v], NULL);
+               create_rank_variable (dict, rank_specs[i].rfunc, src_vars[v], NULL);
             }
  
           create_var_label ( rank_specs[i].destvars[v],
                              src_vars[v],
-                            rank_specs[i].rfunc);
+                            rank_specs[i].rfunc,
+                             dict_get_encoding (dict));
         }
      }
  
diff --git a/src/language/stats/sort-cases.c b/src/language/stats/sort-cases.c
index 1134874da5b5034fc54b932cb153ef5818d4aeeb..ff0f305e9fa911003fede236e5d4d712c5dbca7d 100644
--- a/src/language/stats/sort-cases.c
+++ b/src/language/stats/sort-cases.c
@@ -78,6 +78,6 @@ cmd_sort_cases (struct lexer *lexer, struct dataset *ds)
    max_buffers = INT_MAX;
  
    subcase_destroy (&ordering);
-  return ok ? lex_end_of_command (lexer) : CMD_CASCADING_FAILURE;
+  return ok ? CMD_SUCCESS : CMD_CASCADING_FAILURE;
  }
  
diff --git a/src/language/syntax-file.c b/src/language/syntax-file.c
deleted file mode 100644
index 286ce1e..0000000
--- a/src/language/syntax-file.c
+++ /dev/null
@@ -1,144 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2009, 2010, 2011 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#include <config.h>
-
-#include "language/syntax-file.h"
-
-#include <stdio.h>
-#include <errno.h>
-#include <stdint.h>
-#include <stdlib.h>
-
-#include "data/file-name.h"
-#include "data/settings.h"
-#include "data/variable.h"
-#include "language/command.h"
-#include "language/lexer/lexer.h"
-#include "language/prompt.h"
-#include "libpspp/assertion.h"
-#include "libpspp/cast.h"
-#include "libpspp/getl.h"
-#include "libpspp/ll.h"
-#include "libpspp/message.h"
-#include "libpspp/str.h"
-#include "libpspp/version.h"
-#include "output/tab.h"
-
-#include "gl/xalloc.h"
-
-#include "gettext.h"
-#define _(msgid) gettext (msgid)
-
-struct syntax_file_source
-  {
-    struct getl_interface parent ;
-
-    FILE *syntax_file;
-
-    /* Current location. */
-    char *fn;                          /* File name. */
-    int ln;                            /* Line number. */
-  };
-
-static const char *
-name (const struct getl_interface *s)
-{
-  const struct syntax_file_source *sfs = UP_CAST (s, struct syntax_file_source,
-                                                  parent);
-  return sfs->fn;
-}
-
-static int
-line_number (const struct getl_interface *s)
-{
-  const struct syntax_file_source *sfs = UP_CAST (s, struct syntax_file_source,
-                                                  parent);
-  return sfs->ln;
-}
-
-
-/* Reads a line from syntax file source S into LINE.
-   Returns true if successful, false at end of file. */
-static bool
-read_syntax_file (struct getl_interface *s,
-                  struct string *line)
-{
-  struct syntax_file_source *sfs = UP_CAST (s, struct syntax_file_source,
-                                            parent);
-
-  if (sfs->syntax_file == NULL)
-    return false;
-
-  /* Read line from file and remove new-line.
-     Skip initial "#! /usr/bin/pspp" line. */
-  do
-    {
-      sfs->ln++;
-      ds_clear (line);
-      if (!ds_read_line (line, sfs->syntax_file, SIZE_MAX))
-        {
-          if (ferror (sfs->syntax_file))
-            msg (ME, _("Reading `%s': %s."), sfs->fn, strerror (errno));
-          return false;
-        }
-      ds_chomp_byte (line, '\n');
-    }
-  while (sfs->ln == 1 && !memcmp (ds_cstr (line), "#!", 2));
-
-  return true;
-}
-
-static void
-syntax_close (struct getl_interface *s)
-{
-  struct syntax_file_source *sfs = UP_CAST (s, struct syntax_file_source,
-                                            parent);
-
-  if (sfs->syntax_file && EOF == fn_close (sfs->fn, sfs->syntax_file))
-    msg (MW, _("Closing `%s': %s."), sfs->fn, strerror (errno));
-  free (sfs->fn);
-  free (sfs);
-}
-
-static bool
-always_false (const struct getl_interface *s UNUSED)
-{
-  return false;
-}
-
-
-/* Creates a syntax file source with file name FN. */
-struct getl_interface *
-create_syntax_file_source (const char *fn)
-{
-  struct syntax_file_source *ss = xzalloc (sizeof (*ss));
-
-  ss->fn = xstrdup (fn);
-  ss->syntax_file = fn_open (ss->fn, "r");
-  if (ss->syntax_file == NULL)
-    msg (ME, _("Opening `%s': %s."), ss->fn, strerror (errno));
-
-  ss->parent.interactive = always_false;
-  ss->parent.read = read_syntax_file ;
-  ss->parent.filter = NULL;
-  ss->parent.close = syntax_close ;
-  ss->parent.name = name ;
-  ss->parent.location = line_number;
-
-  return &ss->parent;
-}
-
diff --git a/src/language/syntax-file.h b/src/language/syntax-file.h
deleted file mode 100644
index 8044f3c..0000000
--- a/src/language/syntax-file.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#if !SYNTAX_FILE
-#define SYNTAX_FILE 1
-
-struct getl_interface;
-
-/* Creates a syntax file source with file name FN. */
-struct getl_interface * create_syntax_file_source (const char *) ;
-
-#endif
diff --git a/src/language/syntax-string-source.c b/src/language/syntax-string-source.c
deleted file mode 100644
index 1d3d4d6..0000000
--- a/src/language/syntax-string-source.c
+++ /dev/null
@@ -1,151 +0,0 @@
-/* PSPPIRE - a graphical interface for PSPP.
-   Copyright (C) 2007, 2009, 2010, 2011 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-
-#include <config.h>
-
-#include "language/syntax-string-source.h"
-
-#include <stdlib.h>
-
-#include "libpspp/cast.h"
-#include "libpspp/getl.h"
-#include "libpspp/compiler.h"
-#include "libpspp/str.h"
-
-#include "gl/xalloc.h"
-
-struct syntax_string_source
-  {
-    struct getl_interface parent;
-    struct string buffer;
-    size_t posn;
-  };
-
-
-static bool
-always_false (const struct getl_interface *i UNUSED)
-{
-  return false;
-}
-
-/* Returns the name of the source */
-static const char *
-name (const struct getl_interface *i UNUSED)
-{
-  return NULL;
-}
-
-
-/* Returns the location within the source */
-static int
-location (const struct getl_interface *i UNUSED)
-{
-  return 0;
-}
-
-
-static void
-do_close (struct getl_interface *i )
-{
-  struct syntax_string_source *sss = UP_CAST (i, struct syntax_string_source,
-                                              parent);
-
-  ds_destroy (&sss->buffer);
-
-  free (sss);
-}
-
-
-
-static bool
-read_single_line (struct getl_interface *i,
-                 struct string *line)
-{
-  struct syntax_string_source *sss = UP_CAST (i, struct syntax_string_source,
-                                              parent);
-
-  size_t next;
-
-  if ( sss->posn == -1)
-    return false;
-
-  next = ss_find_byte (ds_substr (&sss->buffer,
-                                 sss->posn, -1), '\n');
-
-  ds_assign_substring (line,
-                      ds_substr (&sss->buffer,
-                                 sss->posn,
-                                 next)
-                      );
-
-  if ( next != -1 )
-    sss->posn += next + 1; /* + 1 to skip newline */
-  else
-    sss->posn = -1; /* End of file encountered */
-
-  return true;
-}
-
-static struct syntax_string_source *
-create_syntax_string_source__ (void)
-{
-  struct syntax_string_source *sss = xzalloc (sizeof *sss);
-
-  sss->posn = 0;
-
-  sss->parent.interactive = always_false;
-  sss->parent.close = do_close;
-  sss->parent.read = read_single_line;
-
-  sss->parent.name = name;
-  sss->parent.location = location;
-
-  return sss;
-}
-
-struct getl_interface *
-create_syntax_string_source (const char *s)
-{
-  struct syntax_string_source *sss = create_syntax_string_source__ ();
-  ds_init_cstr (&sss->buffer, s);
-  return &sss->parent;
-}
-
-struct getl_interface *
-create_syntax_format_source (const char *format, ...)
-{
-  struct syntax_string_source *sss;
-  va_list args;
-
-  sss = create_syntax_string_source__ ();
-
-  ds_init_empty (&sss->buffer);
-
-  va_start (args, format);
-  ds_put_vformat (&sss->buffer, format, args);
-  va_end (args);
-
-  return &sss->parent;
-}
-
-/* Return the syntax currently contained in S.
-   Primarily usefull for debugging */
-const char *
-syntax_string_source_get_syntax (const struct syntax_string_source *s)
-{
-  return ds_cstr (&s->buffer);
-}
diff --git a/src/language/syntax-string-source.h b/src/language/syntax-string-source.h
deleted file mode 100644
index d2e1a9b..0000000
--- a/src/language/syntax-string-source.h
+++ /dev/null
@@ -1,33 +0,0 @@
-/* PSPPIRE - a graphical interface for PSPP.
-   Copyright (C) 2007, 2010 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#ifndef SYNTAX_STRING_SOURCE_H
-#define SYNTAX_STRING_SOURCE_H
-
-#include "libpspp/compiler.h"
-
-struct getl_interface;
-
-struct syntax_string_source;
-
-struct getl_interface *create_syntax_string_source (const char *);
-struct getl_interface *create_syntax_format_source (const char *, ...)
-  PRINTF_FORMAT (1, 2);
-
-const char * syntax_string_source_get_syntax (const struct syntax_string_source *s);
-
-
-#endif
diff --git a/src/language/tests/format-guesser-test.c b/src/language/tests/format-guesser-test.c
index b5dbe6d623ade6860ee4df8248cc4801d2b8ef92..cd7ca52475591aed781542582ef6de17c55aa3fa 100644
--- a/src/language/tests/format-guesser-test.c
+++ b/src/language/tests/format-guesser-test.c
@@ -53,5 +53,5 @@ cmd_debug_format_guesser (struct lexer *lexer, struct dataset *ds UNUSED)
    msg_enable ();
    putc ('\n', stderr);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/language/tests/moments-test.c b/src/language/tests/moments-test.c
index f6499cb81ccddd0c027000d50c0c9d7161e4eba1..af328928abddbff93e5cb6ac59ef955ef4367e0f 100644
--- a/src/language/tests/moments-test.c
+++ b/src/language/tests/moments-test.c
@@ -135,7 +135,7 @@ cmd_debug_moments (struct lexer *lexer, struct dataset *ds UNUSED)
      }
    fprintf (stderr, "\n");
  
-  retval = lex_end_of_command (lexer);
+  retval = CMD_SUCCESS;
  
   done:
    free (values);
diff --git a/src/language/tests/paper-size.c b/src/language/tests/paper-size.c
index 660fe9c7a8c5e4e08373cd6b38518c9390664381..0322f5728b80209a165a55693afdc42da4364e63 100644
--- a/src/language/tests/paper-size.c
+++ b/src/language/tests/paper-size.c
@@ -44,5 +44,5 @@ cmd_debug_paper_size (struct lexer *lexer, struct dataset *ds UNUSED)
      printf ("error\n");
    lex_get (lexer);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/language/utilities/cache.c b/src/language/utilities/cache.c
index 2d818af82b20691c4e2b9861931132a7943467ef..dda055dbcadd07bbfb9a10f64ed190a05db17368 100644
--- a/src/language/utilities/cache.c
+++ b/src/language/utilities/cache.c
@@ -27,8 +27,8 @@
  
  /* Parses the CACHE command. */
  int
-cmd_cache (struct lexer *lexer, struct dataset *ds UNUSED)
+cmd_cache (struct lexer *lexer UNUSED, struct dataset *ds UNUSED)
  {
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
diff --git a/src/language/utilities/cd.c b/src/language/utilities/cd.c
index 90a1bcf8db6a8ca71ac63167a8d898207ec4cb70..cae84eb3c46b43e6f06bf2b3bb43e941a517bdd4 100644
--- a/src/language/utilities/cd.c
+++ b/src/language/utilities/cd.c
@@ -16,12 +16,14 @@
  
  #include <config.h>
  
+#include "language/command.h"
+
  #include <errno.h>
  #include <unistd.h>
  
-#include "language/command.h"
-#include "libpspp/message.h"
  #include "language/lexer/lexer.h"
+#include "libpspp/i18n.h"
+#include "libpspp/message.h"
  
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
@@ -35,7 +37,7 @@ cmd_cd (struct lexer *lexer, struct dataset *ds UNUSED)
    if ( ! lex_force_string (lexer))
      goto error;
  
-  path = ss_xstrdup (lex_tokss (lexer));
+  path = utf8_to_filename (lex_tokcstr (lexer));
  
    if ( -1 == chdir (path) )
      {
diff --git a/src/language/utilities/date.c b/src/language/utilities/date.c
index 80a09c21f5b4d8aaf1e2437f8b1ea8c2a0ba5614..3b754b4980efa902ce155bc4708ce93c7358242f 100644
--- a/src/language/utilities/date.c
+++ b/src/language/utilities/date.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 2004, 2011 Free Software Foundation, Inc.
+   Copyright (C) 2004, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -28,7 +28,7 @@ int
  cmd_use (struct lexer *lexer, struct dataset *ds UNUSED)
  {
    if (lex_match (lexer, T_ALL))
-    return lex_end_of_command (lexer);
+    return CMD_SUCCESS;
  
    msg (SW, _("Only USE ALL is currently implemented."));
    return CMD_FAILURE;
diff --git a/src/language/utilities/host.c b/src/language/utilities/host.c
index f9b10480d2507c95d20dc677767c41c98639565f..fbc9d208be7a08a1b4c518894f0e317d1413be5b 100644
--- a/src/language/utilities/host.c
+++ b/src/language/utilities/host.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2009, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -30,9 +30,11 @@
  #include "language/lexer/lexer.h"
  #include "libpspp/assertion.h"
  #include "libpspp/compiler.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/str.h"
  
+#include "gl/localcharset.h"
  #include "gl/xalloc.h"
  #include "gl/xmalloca.h"
  
@@ -134,6 +136,7 @@ cmd_host (struct lexer *lexer, struct dataset *ds UNUSED)
    else if (lex_match_id (lexer, "COMMAND"))
      {
        struct string command;
+      char *locale_command;
        bool ok;
  
        lex_match (lexer, T_EQUALS);
@@ -154,10 +157,15 @@ cmd_host (struct lexer *lexer, struct dataset *ds UNUSED)
            return CMD_FAILURE;
          }
  
-      ok = run_command (ds_cstr (&command));
+      locale_command = recode_string (locale_charset (), "UTF-8",
+                                      ds_cstr (&command),
+                                      ds_length (&command));
        ds_destroy (&command);
  
-      return ok ? lex_end_of_command (lexer) : CMD_FAILURE;
+      ok = run_command (locale_command);
+      free (locale_command);
+
+      return ok ? CMD_SUCCESS : CMD_FAILURE;
      }
    else
      {
diff --git a/src/language/utilities/include.c b/src/language/utilities/include.c
index e326134de01e3a11de7a154e4cff1f31c42eac10..e802abecc49ed4b3129c38ede93024eaaa76430c 100644
--- a/src/language/utilities/include.c
+++ b/src/language/utilities/include.c
@@ -23,10 +23,11 @@
  #include <unistd.h>
  
  #include "data/file-name.h"
+#include "data/procedure.h"
  #include "language/command.h"
+#include "language/lexer/include-path.h"
  #include "language/lexer/lexer.h"
-#include "language/syntax-file.h"
-#include "libpspp/getl.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/str.h"
  
@@ -36,67 +37,79 @@
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
  
-static int parse_insert (struct lexer *lexer, char **filename);
+enum variant
+  {
+    INSERT,
+    INCLUDE
+  };
  
-
-int
-cmd_include (struct lexer *lexer, struct dataset *ds UNUSED)
+static int
+do_insert (struct lexer *lexer, struct dataset *ds, enum variant variant)
  {
-  char *filename = NULL;
-  int status = parse_insert (lexer, &filename);
-
-  if ( CMD_SUCCESS != status)
-    return status;
+  enum lex_syntax_mode syntax_mode;
+  enum lex_error_mode error_mode;
+  char *relative_name;
+  char *filename;
+  char *encoding;
+  int status;
+  bool cd;
  
-  lex_get (lexer);
-
-  status = lex_end_of_command (lexer);
+  /* Skip optional FILE=. */
+  if (lex_match_id (lexer, "FILE"))
+    lex_match (lexer, T_EQUALS);
  
-  if ( status == CMD_SUCCESS)
+  /* File name can be identifier or string. */
+  if (lex_token (lexer) != T_ID && !lex_is_string (lexer))
      {
-      struct source_stream *ss = lex_get_source_stream (lexer);
-
-      assert (filename);
-      getl_include_source (ss, create_syntax_file_source (filename),
-                          GETL_BATCH, ERRMODE_STOP);
-      free (filename);
+      lex_error (lexer, _("expecting file name"));
+      return CMD_FAILURE;
      }
  
-  return status;
-}
-
-
-int
-cmd_insert (struct lexer *lexer, struct dataset *ds UNUSED)
-{
-  enum syntax_mode syntax_mode = GETL_INTERACTIVE;
-  enum error_mode error_mode = ERRMODE_CONTINUE;
-  char *filename = NULL;
-  int status = parse_insert (lexer, &filename);
-  bool cd = false;
-
-  if ( CMD_SUCCESS != status)
-    return status;
+  relative_name = utf8_to_filename (lex_tokcstr (lexer)); 
+  filename = include_path_search (relative_name);
+  free (relative_name);
  
+  if ( ! filename)
+    {
+      msg (SE, _("Can't find `%s' in include file search path."),
+           lex_tokcstr (lexer));
+      return CMD_FAILURE;
+    }
    lex_get (lexer);
  
+  syntax_mode = LEX_SYNTAX_INTERACTIVE;
+  error_mode = LEX_ERROR_CONTINUE;
+  cd = false;
+  status = CMD_FAILURE;
+  encoding = xstrdup (dataset_get_default_syntax_encoding (ds));
    while ( T_ENDCMD != lex_token (lexer))
      {
-      if (lex_match_id (lexer, "SYNTAX"))
+      if (lex_match_id (lexer, "ENCODING"))
+        {
+          lex_match (lexer, T_EQUALS);
+          if (!lex_force_string (lexer))
+            goto exit;
+
+          free (encoding);
+          encoding = xstrdup (lex_tokcstr (lexer));
+        }
+      else if (variant == INSERT && lex_match_id (lexer, "SYNTAX"))
         {
           lex_match (lexer, T_EQUALS);
           if ( lex_match_id (lexer, "INTERACTIVE") )
-           syntax_mode = GETL_INTERACTIVE;
+           syntax_mode = LEX_SYNTAX_INTERACTIVE;
           else if ( lex_match_id (lexer, "BATCH"))
-           syntax_mode = GETL_BATCH;
+           syntax_mode = LEX_SYNTAX_BATCH;
+         else if ( lex_match_id (lexer, "AUTO"))
+           syntax_mode = LEX_SYNTAX_AUTO;
           else
             {
-             lex_error (lexer, _("expecting %s or %s after %s"),
-                         "BATCH", "INTERACTIVE", "SYNTAX");
-             return CMD_FAILURE;
+             lex_error (lexer, _("expecting %s, %s, or %s after %s"),
+                         "BATCH", "INTERACTIVE", "AUTO", "SYNTAX");
+             goto exit;
             }
         }
-      else if (lex_match_id (lexer, "CD"))
+      else if (variant == INSERT && lex_match_id (lexer, "CD"))
         {
           lex_match (lexer, T_EQUALS);
           if ( lex_match_id (lexer, "YES") )
@@ -111,100 +124,71 @@ cmd_insert (struct lexer *lexer, struct dataset *ds UNUSED)
             {
               lex_error (lexer, _("expecting %s or %s after %s"),
                           "YES", "NO", "CD");
-             return CMD_FAILURE;
+             goto exit;
             }
         }
-      else if (lex_match_id (lexer, "ERROR"))
+      else if (variant == INSERT && lex_match_id (lexer, "ERROR"))
         {
           lex_match (lexer, T_EQUALS);
           if ( lex_match_id (lexer, "CONTINUE") )
             {
-             error_mode = ERRMODE_CONTINUE;
+             error_mode = LEX_ERROR_CONTINUE;
             }
           else if ( lex_match_id (lexer, "STOP"))
             {
-             error_mode = ERRMODE_STOP;
+             error_mode = LEX_ERROR_STOP;
             }
           else
             {
               lex_error (lexer, _("expecting %s or %s after %s"),
                           "CONTINUE", "STOP", "ERROR");
-             return CMD_FAILURE;
+             goto exit;
             }
         }
  
        else
         {
-         lex_error (lexer, _("Unexpected token: `%s'."),
-                    lex_token_representation (lexer));
-
-         return CMD_FAILURE;
+         lex_error (lexer, NULL);
+         goto exit;
         }
      }
-
    status = lex_end_of_command (lexer);
  
    if ( status == CMD_SUCCESS)
      {
-      struct source_stream *ss = lex_get_source_stream (lexer);
-
-      assert (filename);
-      getl_include_source (ss, create_syntax_file_source (filename),
-                          syntax_mode,
-                          error_mode);
-
-      if ( cd )
-       {
-         char *directory = dir_name (filename);
-         chdir (directory);
-         free (directory);
-       }
-
-      free (filename);
+      struct lex_reader *reader;
+
+      reader = lex_reader_for_file (filename, encoding,
+                                    syntax_mode, error_mode);
+      if (reader != NULL)
+        {
+          lex_discard_rest_of_command (lexer);
+          lex_include (lexer, reader);
+
+          if ( cd )
+            {
+              char *directory = dir_name (filename);
+              chdir (directory);
+              free (directory);
+            }
+        }
      }
  
+exit:
+  free (encoding);
+  free (filename);
    return status;
  }
  
-
-static int
-parse_insert (struct lexer *lexer, char **filename)
+int
+cmd_include (struct lexer *lexer, struct dataset *ds)
  {
-  const char *target_fn;
-  char *relative_filename;
-
-  /* Skip optional FILE=. */
-  if (lex_match_id (lexer, "FILE"))
-    lex_match (lexer, T_EQUALS);
-
-  /* File name can be identifier or string. */
-  if (lex_token (lexer) != T_ID && !lex_is_string (lexer))
-    {
-      lex_error (lexer, _("expecting file name"));
-      return CMD_FAILURE;
-    }
-
-  target_fn = lex_tokcstr (lexer);
-
-  relative_filename =
-    fn_search_path (target_fn,
-                   getl_include_path (lex_get_source_stream (lexer)));
-
-  if ( ! relative_filename)
-    {
-      msg (SE, _("Can't find `%s' in include file search path."),
-        target_fn);
-      return CMD_FAILURE;
-    }
-
-  *filename = relative_filename;
-  if (*filename == NULL) 
-    {
-      msg (SE, _("Unable to open `%s': %s."),
-           relative_filename, strerror (errno));
-      free (relative_filename);
-      return CMD_FAILURE;
-    }
+  return do_insert (lexer, ds, INCLUDE);
+}
  
-  return CMD_SUCCESS;
+int
+cmd_insert (struct lexer *lexer, struct dataset *ds)
+{
+  return do_insert (lexer, ds, INSERT);
  }
+
diff --git a/src/language/utilities/permissions.c b/src/language/utilities/permissions.c
index 83fa820c562d3be6473ef70afa49dd95a5b6ecc6..8b0e3f0c31926862e1135464460a30fccee3ca50 100644
--- a/src/language/utilities/permissions.c
+++ b/src/language/utilities/permissions.c
@@ -25,6 +25,7 @@
  #include "data/settings.h"
  #include "language/command.h"
  #include "language/lexer/lexer.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/misc.h"
  #include "libpspp/str.h"
@@ -94,20 +95,23 @@ cmd_permissions (struct lexer *lexer, struct dataset *ds UNUSED)
  int
  change_permissions (const char *file_name, enum PER per)
  {
+  char *locale_file_name;
    struct stat buf;
    mode_t mode;
  
    if (settings_get_safer_mode ())
      {
        msg (SE, _("This command not allowed when the SAFER option is set."));
-      return CMD_FAILURE;
+      return 0;
      }
  
  
-  if ( -1 == stat(file_name, &buf) )
+  locale_file_name = utf8_to_filename (file_name);
+  if ( -1 == stat(locale_file_name, &buf) )
      {
        const int errnum = errno;
        msg (SE, _("Cannot stat %s: %s"), file_name, strerror(errnum));
+      free (locale_file_name);
        return 0;
      }
  
@@ -116,13 +120,16 @@ change_permissions (const char *file_name, enum PER per)
    else
      mode = buf.st_mode & ~0222;
  
-  if ( -1 == chmod(file_name, mode))
+  if ( -1 == chmod(locale_file_name, mode))
  
      {
        const int errnum = errno;
        msg (SE, _("Cannot change mode of %s: %s"), file_name, strerror(errnum));
+      free (locale_file_name);
        return 0;
      }
  
+  free (locale_file_name);
+
    return 1;
  }
diff --git a/src/language/utilities/set.q b/src/language/utilities/set.q
index 9837d2cc6bbe7d4d57c701bcfc6121df5fba054d..3da12a6153bcbcc12f3ccf201324972268a5ef78 100644
--- a/src/language/utilities/set.q
+++ b/src/language/utilities/set.q
@@ -493,7 +493,10 @@ stc_custom_journal (struct lexer *lexer, struct dataset *ds UNUSED, struct cmd_s
      journal_disable ();
    else if (lex_is_string (lexer) || lex_token (lexer) == T_ID)
      {
-      journal_set_file_name (lex_tokcstr (lexer));
+      char *filename = utf8_to_filename (lex_tokcstr (lexer));
+      journal_set_file_name (filename);
+      free (filename);
+
        lex_get (lexer);
      }
    else
@@ -905,12 +908,12 @@ static struct settings *saved_settings[MAX_SAVED_SETTINGS];
  static int n_saved_settings;
  
  int
-cmd_preserve (struct lexer *lexer, struct dataset *ds UNUSED)
+cmd_preserve (struct lexer *lexer UNUSED, struct dataset *ds UNUSED)
  {
    if (n_saved_settings < MAX_SAVED_SETTINGS)
      {
        saved_settings[n_saved_settings++] = settings_get ();
-      return lex_end_of_command (lexer);
+      return CMD_SUCCESS;
      }
    else
      {
@@ -922,14 +925,14 @@ cmd_preserve (struct lexer *lexer, struct dataset *ds UNUSED)
  }
  
  int
-cmd_restore (struct lexer *lexer, struct dataset *ds UNUSED)
+cmd_restore (struct lexer *lexer UNUSED, struct dataset *ds UNUSED)
  {
    if (n_saved_settings > 0)
      {
        struct settings *s = saved_settings[--n_saved_settings];
        settings_set (s);
        settings_destroy (s);
-      return lex_end_of_command (lexer);
+      return CMD_SUCCESS;
      }
    else
      {
diff --git a/src/language/utilities/title.c b/src/language/utilities/title.c
index 9d5b8261619c0a39c3d318bf8134933c4008422c..398288b812c9d966c6e6de13a76edeffbb784f06 100644
--- a/src/language/utilities/title.c
+++ b/src/language/utilities/title.c
@@ -52,20 +52,10 @@ cmd_subtitle (struct lexer *lexer, struct dataset *ds UNUSED)
  static int
  parse_title (struct lexer *lexer, enum text_item_type type)
  {
-  if (lex_look_ahead (lexer) == T_STRING)
-    {
-      lex_get (lexer);
-      if (!lex_force_string (lexer))
-       return CMD_FAILURE;
-      set_title (lex_tokcstr (lexer), type);
-      lex_get (lexer);
-      return lex_end_of_command (lexer);
-    }
-  else
-    {
-      set_title (lex_rest_of_line (lexer), type);
-      lex_discard_line (lexer);
-    }
+  if (!lex_force_string (lexer))
+    return CMD_FAILURE;
+  set_title (lex_tokcstr (lexer), type);
+  lex_get (lexer);
    return CMD_SUCCESS;
  }
  
@@ -79,81 +69,49 @@ set_title (const char *title, enum text_item_type type)
  int
  cmd_file_label (struct lexer *lexer, struct dataset *ds)
  {
-  const char *label;
-
-  label = lex_rest_of_line (lexer);
-  lex_discard_line (lexer);
-  while (isspace ((unsigned char) *label))
-    label++;
+  if (!lex_force_string (lexer))
+    return CMD_FAILURE;
  
-  dict_set_label (dataset_dict (ds), label);
+  dict_set_label (dataset_dict (ds), lex_tokcstr (lexer));
+  lex_get (lexer);
  
    return CMD_SUCCESS;
  }
  
-/* Add entry date line to DICT's documents. */
-static void
-add_document_trailer (struct dictionary *dict)
-{
-  char buf[64];
-
-  sprintf (buf, _("   (Entered %s)"), get_start_date ());
-  dict_add_document_line (dict, buf);
-}
-
  /* Performs the DOCUMENT command. */
  int
  cmd_document (struct lexer *lexer, struct dataset *ds)
  {
    struct dictionary *dict = dataset_dict (ds);
-  struct string line = DS_EMPTY_INITIALIZER;
-  bool end_dot;
+  char *trailer;
  
-  do
+  if (!lex_force_string (lexer))
+    return CMD_FAILURE;
+
+  while (lex_is_string (lexer))
      {
-      end_dot = lex_end_dot (lexer);
-      ds_assign_string (&line, lex_entire_line_ds (lexer));
-      if (end_dot)
-        ds_put_byte (&line, '.');
-      dict_add_document_line (dict, ds_cstr (&line));
-
-      lex_discard_line (lexer);
-      lex_get_line (lexer);
+      dict_add_document_line (dict, lex_tokcstr (lexer), true);
+      lex_get (lexer);
      }
-  while (!end_dot);
  
-  add_document_trailer (dict);
-  ds_destroy (&line);
+  trailer = xasprintf (_("   (Entered %s)"), get_start_date ());
+  dict_add_document_line (dict, trailer, true);
+  free (trailer);
  
    return CMD_SUCCESS;
  }
  
-/* Performs the DROP DOCUMENTS command. */
+/* Performs the ADD DOCUMENTS command. */
  int
-cmd_drop_documents (struct lexer *lexer, struct dataset *ds)
+cmd_add_documents (struct lexer *lexer, struct dataset *ds)
  {
-  dict_clear_documents (dataset_dict (ds));
-
-  return lex_end_of_command (lexer);
+  return cmd_document (lexer, ds);
  }
  
-
-/* Performs the ADD DOCUMENTS command. */
+/* Performs the DROP DOCUMENTS command. */
  int
-cmd_add_documents (struct lexer *lexer, struct dataset *ds)
+cmd_drop_documents (struct lexer *lexer UNUSED, struct dataset *ds)
  {
-  struct dictionary *dict = dataset_dict (ds);
-
-  if ( ! lex_force_string (lexer) )
-    return CMD_FAILURE;
-
-  while ( lex_is_string (lexer))
-    {
-      dict_add_document_line (dict, lex_tokcstr (lexer));
-      lex_get (lexer);
-    }
-
-  add_document_trailer (dict);
-
-  return lex_end_of_command (lexer) ;
+  dict_clear_documents (dataset_dict (ds));
+  return CMD_SUCCESS;
  }
diff --git a/src/language/xforms/compute.c b/src/language/xforms/compute.c
index 5089d80ded1b13abd40b6e94041ae9d7f4a0873d..82a1121fbf4b9452190dca43d7f3b4e65e3dd12e 100644
--- a/src/language/xforms/compute.c
+++ b/src/language/xforms/compute.c
@@ -100,7 +100,7 @@ cmd_compute (struct lexer *lexer, struct dataset *ds)
  
    lvalue_finalize (lvalue, compute, dict);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  
   fail:
    lvalue_destroy (lvalue, dict);
@@ -256,7 +256,7 @@ cmd_if (struct lexer *lexer, struct dataset *ds)
  
    lvalue_finalize (lvalue, compute, dict);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  
   fail:
    lvalue_destroy (lvalue, dict);
@@ -346,7 +346,7 @@ lvalue_parse (struct lexer *lexer, struct dataset *ds)
    if (!lex_force_id (lexer))
      goto lossage;
  
-  if (lex_look_ahead (lexer) == T_LPAREN)
+  if (lex_next_token (lexer, 1) == T_LPAREN)
      {
        /* Vector. */
        lvalue->vector = dict_lookup_vector (dict, lex_tokcstr (lexer));
diff --git a/src/language/xforms/count.c b/src/language/xforms/count.c
index 172a5e2c6d19238f692beb29c0d6388304a42eed..d0045fee5235f7dacd9c713dc1b0629ab32b2607 100644
--- a/src/language/xforms/count.c
+++ b/src/language/xforms/count.c
@@ -28,6 +28,7 @@
  #include "language/lexer/value-parser.h"
  #include "language/lexer/variable-parser.h"
  #include "libpspp/compiler.h"
+#include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/pool.h"
  #include "libpspp/str.h"
@@ -91,7 +92,9 @@ static trns_proc_func count_trns_proc;
  static trns_free_func count_trns_free;
  
  static bool parse_numeric_criteria (struct lexer *, struct pool *, struct criteria *);
-static bool parse_string_criteria (struct lexer *, struct pool *, struct criteria *);
+static bool parse_string_criteria (struct lexer *, struct pool *,
+                                   struct criteria *,
+                                   const char *dict_encoding);
  \f
  int
  cmd_count (struct lexer *lexer, struct dataset *ds)
@@ -133,13 +136,14 @@ cmd_count (struct lexer *lexer, struct dataset *ds)
        crit = dv->crit = pool_alloc (trns->pool, sizeof *crit);
        for (;;)
         {
+          struct dictionary *dict = dataset_dict (ds);
            bool ok;
  
           crit->next = NULL;
           crit->vars = NULL;
-         if (!parse_variables_const (lexer, dataset_dict (ds), &crit->vars,
+         if (!parse_variables_const (lexer, dict, &crit->vars,
                                       &crit->var_cnt,
-                                PV_DUPLICATE | PV_SAME_TYPE))
+                                      PV_DUPLICATE | PV_SAME_TYPE))
             goto fail;
            pool_register (trns->pool, free, crit->vars);
  
@@ -150,7 +154,8 @@ cmd_count (struct lexer *lexer, struct dataset *ds)
            if (var_is_numeric (crit->vars[0]))
              ok = parse_numeric_criteria (lexer, trns->pool, crit);
            else
-            ok = parse_string_criteria (lexer, trns->pool, crit);
+            ok = parse_string_criteria (lexer, trns->pool, crit,
+                                        dict_get_encoding (dict));
           if (!ok)
             goto fail;
  
@@ -230,7 +235,8 @@ parse_numeric_criteria (struct lexer *lexer, struct pool *pool, struct criteria
  
  /* Parses a set of string criteria values.  Returns success. */
  static bool
-parse_string_criteria (struct lexer *lexer, struct pool *pool, struct criteria *crit)
+parse_string_criteria (struct lexer *lexer, struct pool *pool,
+                       struct criteria *crit, const char *dict_encoding)
  {
    int len = 0;
    size_t allocated = 0;
@@ -244,6 +250,8 @@ parse_string_criteria (struct lexer *lexer, struct pool *pool, struct criteria *
    for (;;)
      {
        char **cur;
+      char *s;
+
        if (crit->value_cnt >= allocated)
          crit->values.str = pool_2nrealloc (pool, crit->values.str,
                                             &allocated,
@@ -251,11 +259,17 @@ parse_string_criteria (struct lexer *lexer, struct pool *pool, struct criteria *
  
        if (!lex_force_string (lexer))
         return false;
+
+      s = recode_string (dict_encoding, "UTF-8", lex_tokcstr (lexer),
+                         ss_length (lex_tokss (lexer)));
+
        cur = &crit->values.str[crit->value_cnt++];
        *cur = pool_alloc (pool, len + 1);
-      str_copy_rpad (*cur, len + 1, lex_tokcstr (lexer));
+      str_copy_rpad (*cur, len + 1, s);
        lex_get (lexer);
  
+      free (s);
+
        lex_match (lexer, T_COMMA);
        if (lex_match (lexer, T_RPAREN))
         break;
diff --git a/src/language/xforms/fail.c b/src/language/xforms/fail.c
index 3ca945243e767199d70190332dc64863ee8c6392..feedb7808348746d6842398ebc31966b39d1470f 100644
--- a/src/language/xforms/fail.c
+++ b/src/language/xforms/fail.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 2007, 2009 Free Software Foundation, Inc.
+   Copyright (C) 2007, 2009, 2010 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -38,10 +38,8 @@ trns_fail (void *x UNUSED, struct ccase **c UNUSED,
  }
  
  int
-cmd_debug_xform_fail (struct lexer *lexer, struct dataset *ds)
+cmd_debug_xform_fail (struct lexer *lexer UNUSED, struct dataset *ds)
  {
-
    add_transformation (ds, trns_fail, NULL, NULL);
-
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/language/xforms/recode.c b/src/language/xforms/recode.c
index 62cf387eee9faf30b6afa4cff210c17124e76a26..77543ca7a8caa5c55055456266220d0173708061 100644
--- a/src/language/xforms/recode.c
+++ b/src/language/xforms/recode.c
@@ -85,8 +85,6 @@ struct recode_trns
    {
      struct pool *pool;
  
-
-
      /* Variable types, for convenience. */
      enum val_type src_type;     /* src_vars[*] type. */
      enum val_type dst_type;     /* dst_vars[*] type. */
@@ -106,18 +104,21 @@ struct recode_trns
    };
  
  static bool parse_src_vars (struct lexer *, struct recode_trns *, const struct dictionary *dict);
-static bool parse_mappings (struct lexer *, struct recode_trns *);
+static bool parse_mappings (struct lexer *, struct recode_trns *,
+                            const char *dict_encoding);
  static bool parse_dst_vars (struct lexer *, struct recode_trns *, const struct dictionary *dict);
  
  static void add_mapping (struct recode_trns *,
                           size_t *map_allocated, const struct map_in *);
  
  static bool parse_map_in (struct lexer *lexer, struct map_in *, struct pool *,
-                          enum val_type src_type, size_t max_src_width);
+                          enum val_type src_type, size_t max_src_width,
+                          const char *dict_encoding);
  static void set_map_in_generic (struct map_in *, enum map_in_type);
  static void set_map_in_num (struct map_in *, enum map_in_type, double, double);
  static void set_map_in_str (struct map_in *, struct pool *,
-                            struct substring, size_t width);
+                            struct substring, size_t width,
+                            const char *dict_encoding);
  
  static bool parse_map_out (struct lexer *lexer, struct pool *, struct map_out *);
  static void set_map_out_num (struct map_out *, double);
@@ -138,15 +139,16 @@ cmd_recode (struct lexer *lexer, struct dataset *ds)
  {
    do
      {
+      struct dictionary *dict = dataset_dict (ds);
        struct recode_trns *trns
          = pool_create_container (struct recode_trns, pool);
  
        /* Parse source variable names,
           then input to output mappings,
           then destintation variable names. */
-      if (!parse_src_vars (lexer, trns, dataset_dict (ds) )
-          || !parse_mappings (lexer, trns)
-          || !parse_dst_vars (lexer, trns, dataset_dict (ds)))
+      if (!parse_src_vars (lexer, trns, dict)
+          || !parse_mappings (lexer, trns, dict_get_encoding (dict))
+          || !parse_dst_vars (lexer, trns, dict))
          {
            recode_trns_free (trns);
            return CMD_FAILURE;
@@ -160,9 +162,9 @@ cmd_recode (struct lexer *lexer, struct dataset *ds)
        /* Create destination variables, if needed.
           This must be the final step; otherwise we'd have to
           delete destination variables on failure. */
-      trns->dst_dict = dataset_dict (ds);
+      trns->dst_dict = dict;
        if (trns->src_vars != trns->dst_vars)
-       create_dst_vars (trns, dataset_dict (ds));
+       create_dst_vars (trns, dict);
  
        /* Done. */
        add_transformation (ds,
@@ -170,7 +172,7 @@ cmd_recode (struct lexer *lexer, struct dataset *ds)
      }
    while (lex_match (lexer, T_SLASH));
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Parses a set of variables to recode into TRNS->src_vars and
@@ -192,7 +194,8 @@ parse_src_vars (struct lexer *lexer,
     into TRNS->mappings and TRNS->map_cnt.  Sets TRNS->dst_type.
     Returns true if successful, false on parse error. */
  static bool
-parse_mappings (struct lexer *lexer, struct recode_trns *trns)
+parse_mappings (struct lexer *lexer, struct recode_trns *trns,
+                const char *dict_encoding)
  {
    size_t map_allocated;
    bool have_dst_type;
@@ -232,7 +235,8 @@ parse_mappings (struct lexer *lexer, struct recode_trns *trns)
                struct map_in in;
  
                if (!parse_map_in (lexer, &in, trns->pool,
-                                 trns->src_type, trns->max_src_width))
+                                 trns->src_type, trns->max_src_width,
+                                 dict_encoding))
                  return false;
                add_mapping (trns, &map_allocated, &in);
                lex_match (lexer, T_COMMA);
@@ -292,7 +296,8 @@ parse_mappings (struct lexer *lexer, struct recode_trns *trns)
     false on parse error. */
  static bool
  parse_map_in (struct lexer *lexer, struct map_in *in, struct pool *pool,
-              enum val_type src_type, size_t max_src_width)
+              enum val_type src_type, size_t max_src_width,
+              const char *dict_encoding)
  {
  
    if (lex_match_id (lexer, "ELSE"))
@@ -319,7 +324,8 @@ parse_map_in (struct lexer *lexer, struct map_in *in, struct pool *pool,
          return false;
        else 
         {
-         set_map_in_str (in, pool, lex_tokss (lexer), max_src_width);
+         set_map_in_str (in, pool, lex_tokss (lexer), max_src_width,
+                          dict_encoding);
           lex_get (lexer);
           if (lex_token (lexer) == T_ID
               && lex_id_match (ss_cstr ("THRU"), lex_tokss (lexer)))
@@ -371,13 +377,16 @@ set_map_in_num (struct map_in *in, enum map_in_type type, double x, double y)
     right to WIDTH characters long. */
  static void
  set_map_in_str (struct map_in *in, struct pool *pool,
-                struct substring string, size_t width)
+                struct substring string, size_t width,
+                const char *dict_encoding)
  {
+  char *s = recode_string (dict_encoding, "UTF-8",
+                           ss_data (string), ss_length (string));
    in->type = MAP_SINGLE;
    value_init_pool (pool, &in->x, width);
    value_copy_buf_rpad (&in->x, width,
-                       CHAR_CAST_BUG (uint8_t *, ss_data (string)),
-                       ss_length (string), ' ');
+                       CHAR_CAST (uint8_t *, s), strlen (s), ' ');
+  free (s);
  }
  
  /* Parses a mapping output value into OUT, allocating memory from
diff --git a/src/language/xforms/sample.c b/src/language/xforms/sample.c
index 693f84477e51476f41770299d4207cd1a8b5478a..f2a30a2a58fdbfc5e94958eec4bd65fbdc710464 100644
--- a/src/language/xforms/sample.c
+++ b/src/language/xforms/sample.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2009, 2011 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2009-2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -111,7 +111,7 @@ cmd_sample (struct lexer *lexer, struct dataset *ds)
    trns->frac = frac;
    add_transformation (ds, sample_trns_proc, sample_trns_free, trns);
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
  
  /* Executes a SAMPLE transformation. */
diff --git a/src/language/xforms/select-if.c b/src/language/xforms/select-if.c
index 4240f63b47b94e2f79a972c4954e2275c457ab38..9df4eba5e3a2c603b10ba59b17d58431f4991e69 100644
--- a/src/language/xforms/select-if.c
+++ b/src/language/xforms/select-if.c
@@ -125,5 +125,5 @@ cmd_filter (struct lexer *lexer, struct dataset *ds)
        dict_set_filter (dict, v);
      }
  
-  return lex_end_of_command (lexer);
+  return CMD_SUCCESS;
  }
diff --git a/src/libpspp/automake.mk b/src/libpspp/automake.mk
index fcb281408ad29dda634727a21e7889ea6b37b0a2..e4948406d034bde5762a55288766d5e0fb4d7eaf 100644
--- a/src/libpspp/automake.mk
+++ b/src/libpspp/automake.mk
@@ -28,8 +28,6 @@ src_libpspp_libpspp_la_SOURCES = \
         src/libpspp/float-format.h \
         src/libpspp/freaderror.c \
         src/libpspp/freaderror.h \
-       src/libpspp/getl.c \
-       src/libpspp/getl.h \
         src/libpspp/hash-functions.c \
         src/libpspp/hash-functions.h \
         src/libpspp/hash.c \
@@ -56,8 +54,6 @@ src_libpspp_libpspp_la_SOURCES = \
         src/libpspp/misc.h \
         src/libpspp/model-checker.c \
         src/libpspp/model-checker.h \
-       src/libpspp/msg-locator.c \
-       src/libpspp/msg-locator.h \
         src/libpspp/pool.c \
         src/libpspp/pool.h \
         src/libpspp/prompt.c \
diff --git a/src/libpspp/getl.c b/src/libpspp/getl.c
deleted file mode 100644
index 9db6c3a..0000000
--- a/src/libpspp/getl.c
+++ /dev/null
@@ -1,271 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006, 2009, 2010 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#include <config.h>
-
-#include "libpspp/getl.h"
-
-#include <stdlib.h>
-
-#include "libpspp/ll.h"
-#include "libpspp/str.h"
-#include "libpspp/string-array.h"
-
-#include "gl/configmake.h"
-#include "gl/relocatable.h"
-#include "gl/xalloc.h"
-
-struct getl_source
-  {
-    struct getl_source *included_from; /* File that this is nested inside. */
-    struct getl_source *includes;      /* File nested inside this file. */
-
-    struct ll  ll;   /* Element in the sources list */
-
-    struct getl_interface *interface;
-    enum syntax_mode syntax_mode;
-    enum error_mode error_mode;
-  };
-
-struct source_stream
-  {
-    struct ll_list sources ;  /* List of source files. */
-    struct string_array include_path;
-  };
-
-char **
-getl_include_path (const struct source_stream *ss_)
-{
-  struct source_stream *ss = CONST_CAST (struct source_stream *, ss_);
-  string_array_terminate_null (&ss->include_path);
-  return ss->include_path.strings;
-}
-
-static struct getl_source *
-current_source (const struct source_stream *ss)
-{
-  const struct ll *ll = ll_head (&ss->sources);
-  return ll_data (ll, struct getl_source, ll );
-}
-
-enum syntax_mode
-source_stream_current_syntax_mode (const struct source_stream *ss)
-{
-  struct getl_source *cs = current_source (ss);
-
-  return cs->syntax_mode;
-}
-
-
-
-enum error_mode
-source_stream_current_error_mode (const struct source_stream *ss)
-{
-  struct getl_source *cs = current_source (ss);
-
-  return cs->error_mode;
-}
-
-
-
-/* Initialize getl. */
-struct source_stream *
-create_source_stream (void)
-{
-  struct source_stream *ss;
-
-  ss = xzalloc (sizeof (*ss));
-  ll_init (&ss->sources);
-
-  string_array_init (&ss->include_path);
-  string_array_append (&ss->include_path, ".");
-  if (getenv ("HOME") != NULL)
-    string_array_append_nocopy (&ss->include_path,
-                                xasprintf ("%s/.pspp", getenv ("HOME")));
-  string_array_append (&ss->include_path, relocate (PKGDATADIR));
-
-  return ss;
-}
-
-/* Delete everything from the include path. */
-void
-getl_clear_include_path (struct source_stream *ss)
-{
-  string_array_clear (&ss->include_path);
-}
-
-/* Add to the include path. */
-void
-getl_add_include_dir (struct source_stream *ss, const char *path)
-{
-  string_array_append (&ss->include_path, path);
-}
-
-/* Appends source S to the list of source files. */
-void
-getl_append_source (struct source_stream *ss,
-                   struct getl_interface *i,
-                   enum syntax_mode syntax_mode,
-                   enum error_mode err_mode)
-{
-  struct getl_source *s = xzalloc (sizeof ( struct getl_source ));
-
-  s->interface = i ;
-  s->syntax_mode = syntax_mode;
-  s->error_mode = err_mode;
-
-  ll_push_tail (&ss->sources, &s->ll);
-}
-
-/* Nests source S within the current source file. */
-void
-getl_include_source (struct source_stream *ss,
-                    struct getl_interface *i,
-                    enum syntax_mode syntax_mode,
-                    enum error_mode err_mode)
-{
-  struct getl_source *current = current_source (ss);
-  struct getl_source *s = xzalloc (sizeof ( struct getl_source ));
-
-  s->interface = i;
-
-  s->included_from = current ;
-  s->includes  = NULL;
-  s->syntax_mode  = syntax_mode;
-  s->error_mode = err_mode;
-  current->includes = s;
-
-  ll_push_head (&ss->sources, &s->ll);
-}
-
-/* Closes the current source, and move  the current source to the
-   next file in the chain. */
-static void
-close_source (struct source_stream *ss)
-{
-  struct getl_source *s = current_source (ss);
-
-  if ( s->interface->close )
-    s->interface->close (s->interface);
-
-  ll_pop_head (&ss->sources);
-
-  if (s->included_from != NULL)
-    current_source (ss)->includes = NULL;
-
-  free (s);
-}
-
-/* Closes all sources until an interactive source is
-   encountered. */
-void
-getl_abort_noninteractive (struct source_stream *ss)
-{
-  while ( ! ll_is_empty (&ss->sources))
-    {
-      const struct getl_source *s = current_source (ss);
-
-      if ( !s->interface->interactive (s->interface) )
-       close_source (ss);
-    }
-}
-
-/* Returns true if the current source is interactive,
-   false otherwise. */
-bool
-getl_is_interactive (const struct source_stream *ss)
-{
-  const struct getl_source *s = current_source (ss);
-
-  if (ll_is_empty (&ss->sources) )
-    return false;
-
-  return s->interface->interactive (s->interface);
-}
-
-/* Returns the name of the current source, or NULL if there is no
-   current source */
-const char *
-getl_source_name (const struct source_stream *ss)
-{
-  const struct getl_source *s = current_source (ss);
-
-  if ( ll_is_empty (&ss->sources) )
-    return NULL;
-
-  if ( ! s->interface->name )
-    return NULL;
-
-  return s->interface->name (s->interface);
-}
-
-/* Returns the line number within the current source, or 0 if there is no
-   current source. */
-int
-getl_source_location (const struct source_stream *ss)
-{
-  const struct getl_source *s = current_source (ss);
-
-  if ( ll_is_empty (&ss->sources) )
-    return 0;
-
-  if ( !s->interface->location )
-    return 0;
-
-  return s->interface->location (s->interface);
-}
-
-
-/* Close getl. */
-void
-destroy_source_stream (struct source_stream *ss)
-{
-  while ( !ll_is_empty (&ss->sources))
-    close_source (ss);
-  string_array_destroy (&ss->include_path);
-
-  free (ss);
-}
-
-
-/* Reads a single line into LINE.
-   Returns true when a line has been read, false at end of input.
-*/
-bool
-getl_read_line (struct source_stream *ss, struct string *line)
-{
-  assert (ss != NULL);
-  while (!ll_is_empty (&ss->sources))
-    {
-      struct getl_source *s = current_source (ss);
-
-      ds_clear (line);
-      if (s->interface->read (s->interface, line))
-        {
-          while (s)
-           {
-             if (s->interface->filter)
-               s->interface->filter (s->interface, line);
-             s = s->included_from;
-           }
-
-          return true;
-        }
-      close_source (ss);
-    }
-
-  return false;
-}
diff --git a/src/libpspp/getl.h b/src/libpspp/getl.h
deleted file mode 100644
index c7d0967..0000000
--- a/src/libpspp/getl.h
+++ /dev/null
@@ -1,113 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006, 2010, 2011 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#ifndef GETL_H
-#define GETL_H 1
-
-#include <stdbool.h>
-#include "libpspp/ll.h"
-
-struct string;
-
-struct getl_source;
-
-/* Syntax rules that apply to a given source line. */
-enum syntax_mode
-  {
-    /* Each line that begins in column 1 starts a new command.  A
-       `+' or `-' in column 1 is ignored to allow visual
-       indentation of new commands.  Continuation lines must be
-       indented from the left margin.  A period at the end of a
-       line does end a command, but it is optional. */
-    GETL_BATCH,
-
-    /* Each command must end in a period or in a blank line. */
-    GETL_INTERACTIVE
-  };
-
-enum error_mode
-  {
-    /* When errors are encountered, report the error and continue to
-       the next command. */
-    ERRMODE_CONTINUE,
-
-    /* When errors are encountered, abort the current stream. */
-    ERRMODE_STOP
-  };
-
-/* An abstract base class for objects which act as line buffers for the
-   PSPP.  Ie anything which might contain content for the lexer */
-struct getl_interface
-  {
-    /* Returns true if the interface is interactive, that is, if
-       it prompts a human user.  This property is independent of
-       the syntax mode returned by the read member function. */
-    bool  (*interactive) (const struct getl_interface *);
-
-    /* Read a line the intended syntax mode from the interface.
-       Returns true if succesful, false on failure or at end of
-       input. */
-    bool  (*read)  (struct getl_interface *,
-                    struct string *);
-
-    /* Close and destroy the interface */
-    void  (*close) (struct getl_interface *);
-
-    /* Filter for current and all included sources, which may
-       modify the line.  Usually null.  */
-    void  (*filter) (struct getl_interface *,
-                     struct string *line);
-
-    /* Returns the name of the source */
-    const char * (*name) (const struct getl_interface *);
-
-    /* Returns the current location within the source */
-    int (*location) (const struct getl_interface *);
-  };
-
-struct source_stream;
-
-struct source_stream *create_source_stream (void);
-
-enum syntax_mode source_stream_current_syntax_mode
-   (const struct source_stream *);
-
-
-enum error_mode source_stream_current_error_mode
-   (const struct source_stream *);
-
-
-void destroy_source_stream (struct source_stream *);
-
-void getl_clear_include_path (struct source_stream *);
-void getl_add_include_dir (struct source_stream *, const char *);
-char **getl_include_path (const struct source_stream *);
-
-void getl_abort_noninteractive (struct source_stream *);
-bool getl_is_interactive (const struct source_stream *);
-
-bool getl_read_line (struct source_stream *, struct string *);
-
-void getl_append_source (struct source_stream *, struct getl_interface *s,
-                        enum syntax_mode, enum error_mode) ;
-
-void getl_include_source (struct source_stream *, struct getl_interface *s,
-                         enum syntax_mode, enum error_mode) ;
-
-const char * getl_source_name (const struct source_stream *);
-int getl_source_location (const struct source_stream *);
-
-#endif /* line-buffer.h */
diff --git a/src/libpspp/message.c b/src/libpspp/message.c
index c7b56e0a8d3a1b680571244a835b680355eace7f..80475b35455e4b3c5e6e3abcdeff258dc6c71855 100644
--- a/src/libpspp/message.c
+++ b/src/libpspp/message.c
@@ -17,7 +17,6 @@
  #include <config.h>
  
  #include "libpspp/message.h"
-#include "libpspp/msg-locator.h"
  
  #include <assert.h>
  #include <stdarg.h>
@@ -26,10 +25,12 @@
  #include <string.h>
  #include <unistd.h>
  
-#include "data/settings.h"
+#include "libpspp/cast.h"
  #include "libpspp/str.h"
  #include "libpspp/version.h"
+#include "data/settings.h"
  
+#include "gl/minmax.h"
  #include "gl/progname.h"
  #include "gl/xalloc.h"
  #include "gl/xvasprintf.h"
@@ -37,8 +38,9 @@
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
  
-/* Message handler as set by msg_init(). */
-static void (*msg_handler)  (const struct msg *);
+/* Message handler as set by msg_set_handler(). */
+static void (*msg_handler)  (const struct msg *, void *aux);
+static void *msg_aux;
  
  /* Disables emitting messages if positive. */
  static int messages_disabled;
@@ -57,27 +59,19 @@ msg (enum msg_class class, const char *format, ...)
    m.severity = msg_class_to_severity (class);
    va_start (args, format);
    m.text = xvasprintf (format, args);
-  m.where.file_name = NULL;
-  m.where.line_number = 0;
-  m.where.first_column = 0;
-  m.where.last_column = 0;
+  m.file_name = NULL;
+  m.first_line = m.last_line = 0;
+  m.first_column = m.last_column = 0;
    va_end (args);
  
    msg_emit (&m);
  }
  
-static struct source_stream *s_stream;
-
  void
-msg_init (struct source_stream *ss,  void (*handler) (const struct msg *) )
+msg_set_handler (void (*handler) (const struct msg *, void *aux), void *aux)
  {
-  s_stream = ss;
    msg_handler = handler;
-}
-
-void
-msg_done (void)
-{
+  msg_aux = aux;
  }
  \f
  /* Working with messages. */
@@ -89,8 +83,8 @@ msg_dup (const struct msg *m)
    struct msg *new_msg;
  
    new_msg = xmemdup (m, sizeof *m);
-  if (m->where.file_name != NULL)
-    new_msg->where.file_name = xstrdup (m->where.file_name);
+  if (m->file_name != NULL)
+    new_msg->file_name = xstrdup (m->file_name);
    new_msg->text = xstrdup (m->text);
  
    return new_msg;
@@ -98,13 +92,13 @@ msg_dup (const struct msg *m)
  
  /* Frees a message created by msg_dup().
  
-   (Messages not created by msg_dup(), as well as their where.file_name
+   (Messages not created by msg_dup(), as well as their file_name
     members, are typically not dynamically allocated, so this function should
     not be used to destroy them.) */
  void
  msg_destroy (struct msg *m)
  {
-  free (m->where.file_name);
+  free (m->file_name);
    free (m->text);
    free (m);
  }
@@ -118,23 +112,56 @@ msg_to_string (const struct msg *m, const char *command_name)
    ds_init_empty (&s);
  
    if (m->category != MSG_C_GENERAL
-      && (m->where.file_name
-          || m->where.line_number > 0
-          || m->where.first_column > 0))
+      && (m->file_name || m->first_line > 0 || m->first_column > 0))
      {
-      if (m->where.file_name)
-        ds_put_format (&s, "%s", m->where.file_name);
-      if (m->where.line_number > 0)
+      int l1 = m->first_line;
+      int l2 = MAX (m->first_line, m->last_line - 1);
+      int c1 = m->first_column;
+      int c2 = MAX (m->first_column, m->last_column - 1);
+
+      if (m->file_name)
+        ds_put_format (&s, "%s", m->file_name);
+
+      if (l1 > 0)
          {
            if (!ds_is_empty (&s))
              ds_put_byte (&s, ':');
-          ds_put_format (&s, "%d", m->where.line_number);
+
+          if (l2 > l1)
+            {
+              if (c1 > 0)
+                ds_put_format (&s, "%d.%d-%d.%d", l1, c1, l2, c2);
+              else
+                ds_put_format (&s, "%d-%d", l1, l2);
+            }
+          else
+            {
+              if (c1 > 0)
+                {
+                  if (c2 > c1)
+                    {
+                      /* The GNU coding standards say to use
+                         LINENO-1.COLUMN-1-COLUMN-2 for this case, but GNU
+                         Emacs interprets COLUMN-2 as LINENO-2 if I do that.
+                         I've submitted an Emacs bug report:
+                         http://debbugs.gnu.org/cgi/bugreport.cgi?bug=7725.
+
+                         For now, let's be compatible. */
+                      ds_put_format (&s, "%d.%d-%d.%d", l1, c1, l1, c2);
+                    }
+                  else
+                    ds_put_format (&s, "%d.%d", l1, c1);
+                }
+              else
+                ds_put_format (&s, "%d", l1);
+            }
          }
-      if (m->where.first_column > 0)
+      else if (c1 > 0)
          {
-          ds_put_format (&s, ".%d", m->where.first_column);
-          if (m->where.last_column > m->where.first_column + 1)
-            ds_put_format (&s, "-%d", m->where.last_column - 1);
+          if (c2 > c1)
+            ds_put_format (&s, ".%d-%d", c1, c2);
+          else
+            ds_put_format (&s, ".%d", c1);
          }
        ds_put_cstr (&s, ": ");
      }
@@ -214,12 +241,13 @@ submit_note (char *s)
  
    m.category = MSG_C_GENERAL;
    m.severity = MSG_S_NOTE;
-  m.where.file_name = NULL;
-  m.where.line_number = 0;
-  m.where.first_column = 0;
-  m.where.last_column = 0;
+  m.file_name = NULL;
+  m.first_line = 0;
+  m.last_line = 0;
+  m.first_column = 0;
+  m.last_column = 0;
    m.text = s;
-  msg_handler (&m);
+  msg_handler (&m, msg_aux);
    free (s);
  }
  
@@ -236,7 +264,7 @@ process_msg (const struct msg *m)
        || (warnings_off && m->severity == MSG_S_WARNING) )
      return;
  
-  msg_handler (m);
+  msg_handler (m, msg_aux);
  
    counts[m->severity]++;
    max_msgs = settings_get_max_messages (m->severity);
@@ -271,20 +299,6 @@ process_msg (const struct msg *m)
  void
  msg_emit (struct msg *m)
  {
-  if ( s_stream && m->where.file_name == NULL )
-    {
-      struct msg_locator loc;
-
-      get_msg_location (s_stream, &loc);
-      m->where.file_name = loc.file_name;
-      m->where.line_number = loc.line_number;
-    }
-  else
-    {
-      m->where.file_name = NULL;
-      m->where.line_number = 0;
-    }
-
    if (!messages_disabled)
       process_msg (m);
  
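The message.c change above replaces msg_init() with msg_set_handler(), which stores an opaque aux pointer next to the callback and hands it back on every call. A minimal sketch of that callback-with-context pattern follows. The struct msg here is a cut-down stand-in, and emit(), struct counter, and count_handler() are hypothetical names used only for illustration; the NULL check in emit() is an addition of this sketch and is not in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for PSPP's struct msg; only the text is kept. */
struct msg
  {
    const char *text;
  };

/* Handler and its auxiliary data, stored the same way as in the patch. */
static void (*msg_handler) (const struct msg *, void *aux);
static void *msg_aux;

static void
msg_set_handler (void (*handler) (const struct msg *, void *aux), void *aux)
{
  msg_handler = handler;
  msg_aux = aux;
}

/* Cut-down emit path (hypothetical): the real code also filters and counts
   messages.  The NULL check is an addition of this sketch. */
static void
emit (const struct msg *m)
{
  if (msg_handler != NULL)
    msg_handler (m, msg_aux);
}

/* Hypothetical client state, playing the role of whatever object the
   caller wants passed back to its handler. */
struct counter
  {
    int n_messages;
    const char *last_text;
  };

/* Example handler: recovers its state from AUX instead of a global. */
static void
count_handler (const struct msg *m, void *aux)
{
  struct counter *c = aux;

  c->n_messages++;
  c->last_text = m->text;
}
```

Passing the state through aux lets the handler avoid file-scope globals such as the removed s_stream, which is presumably part of what makes the new code easier to test.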
diff --git a/src/libpspp/message.h b/src/libpspp/message.h
index 7c59847f4aac760e2d7ad32abee7da3f97d5f080..5ced994ad52c7db1c7a70919d8422b0536f5d67d 100644
--- a/src/libpspp/message.h
+++ b/src/libpspp/message.h
@@ -67,30 +67,24 @@ msg_class_from_category_and_severity (enum msg_category category,
    return category * 3 + severity;
  }
  
-/* A file location.  */
-struct msg_locator
-  {
-    char *file_name;           /* File name (NULL if none). */
-    int line_number;           /* Line number (0 if none). */
-    int first_column;          /* 1-based column number (0 if none). */
-    int last_column;           /* 1-based exclusive last column (0 if none). */
-  };
-
  /* A message. */
  struct msg
    {
      enum msg_category category; /* Message category. */
      enum msg_severity severity; /* Message severity. */
-    struct msg_locator where;  /* File location, or (NULL, -1). */
+    char *file_name;            /* Name of file containing error, or NULL. */
+    int first_line;             /* 1-based line number, or 0 if none. */
+    int last_line;             /* 1-based exclusive last line (0=none). */
+    int first_column;           /* 1-based first column, or 0 if none. */
+    int last_column;            /* 1-based exclusive last column (0=none). */
      char *text;                 /* Error text. */
    };
  
  struct source_stream ;
  
  /* Initialization. */
-void msg_init (struct source_stream *, void (*handler) (const struct msg *) );
-
-void msg_done (void);
+void msg_set_handler (void (*handler) (const struct msg *, void *lexer),
+                      void *aux);
  
  /* Working with messages. */
  struct msg *msg_dup (const struct msg *);
@@ -107,9 +101,6 @@ void msg_enable (void);
  void msg_disable (void);
  
  /* Error context. */
-void msg_push_msg_locator (const struct msg_locator *);
-void msg_pop_msg_locator (const struct msg_locator *);
-
  bool msg_ui_too_many_errors (void);
  void msg_ui_reset_counts (void);
  bool msg_ui_any_errors (void);
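The new struct msg fields use exclusive end positions (last_line and last_column point one past the range), and msg_to_string() in message.c converts them to inclusive values before printing. The sketch below restates that formatting logic as a standalone function so the cases can be checked in isolation. format_location() and append() are hypothetical helpers, not PSPP functions, and the sketch omits the category check and the trailing ": " that msg_to_string() adds.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: appends formatted text to the null-terminated
   string in BUF, which has room for SIZE bytes, truncating if needed. */
static void
append (char *buf, size_t size, const char *format, ...)
{
  size_t used = strlen (buf);
  va_list args;

  if (used + 1 >= size)
    return;
  va_start (args, format);
  vsnprintf (buf + used, size - used, format, args);
  va_end (args);
}

/* Hypothetical helper mirroring the location logic of msg_to_string().
   LAST_LINE and LAST_COLUMN are exclusive; 0 means "none".  SIZE must be
   at least 1. */
static void
format_location (char *buf, size_t size, const char *file_name,
                 int first_line, int last_line,
                 int first_column, int last_column)
{
  int l1 = first_line;
  int l2 = MAX (first_line, last_line - 1);     /* Inclusive last line. */
  int c1 = first_column;
  int c2 = MAX (first_column, last_column - 1); /* Inclusive last column. */

  buf[0] = '\0';
  if (file_name != NULL)
    append (buf, size, "%s", file_name);

  if (l1 > 0)
    {
      if (buf[0] != '\0')
        append (buf, size, ":");

      if (l2 > l1)
        {
          if (c1 > 0)
            append (buf, size, "%d.%d-%d.%d", l1, c1, l2, c2);
          else
            append (buf, size, "%d-%d", l1, l2);
        }
      else if (c1 > 0)
        {
          /* A column range on one line repeats the line number, as the
             patch does for compatibility with Emacs. */
          if (c2 > c1)
            append (buf, size, "%d.%d-%d.%d", l1, c1, l1, c2);
          else
            append (buf, size, "%d.%d", l1, c1);
        }
      else
        append (buf, size, "%d", l1);
    }
  else if (c1 > 0)
    {
      if (c2 > c1)
        append (buf, size, ".%d-%d", c1, c2);
      else
        append (buf, size, ".%d", c1);
    }
}
```

For example, a message spanning columns 5 through 9 of line 3 in foo.sps is stored with first_line 3, last_line 4, first_column 5, last_column 10, and is rendered as foo.sps:3.5-3.9.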
diff --git a/src/libpspp/msg-locator.c b/src/libpspp/msg-locator.c
deleted file mode 100644
index d29f35d..0000000
--- a/src/libpspp/msg-locator.c
+++ /dev/null
@@ -1,87 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006, 2011 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#include <config.h>
-
-#include "libpspp/msg-locator.h"
-
-#include <stdlib.h>
-
-#include "libpspp/assertion.h"
-#include "libpspp/cast.h"
-#include "libpspp/message.h"
-#include "libpspp/getl.h"
-
-#include "gl/xalloc.h"
-
-/* File locator stack. */
-static const struct msg_locator **file_loc;
-
-static int nfile_loc, mfile_loc;
-
-void
-msg_locator_done (void)
-{
-  free(file_loc);
-  file_loc = NULL;
-  nfile_loc = mfile_loc = 0;
-}
-
-
-/* File locator stack functions. */
-
-/* Pushes F onto the stack of file locations. */
-void
-msg_push_msg_locator (const struct msg_locator *loc)
-{
-  if (nfile_loc >= mfile_loc)
-    {
-      if (mfile_loc == 0)
-       mfile_loc = 8;
-      else
-       mfile_loc *= 2;
-
-      file_loc = xnrealloc (file_loc, mfile_loc, sizeof *file_loc);
-    }
-
-  file_loc[nfile_loc++] = loc;
-}
-
-/* Pops F off the stack of file locations.
-   Argument F is only used for verification that that is actually the
-   item on top of the stack. */
-void
-msg_pop_msg_locator (const struct msg_locator *loc)
-{
-  assert (nfile_loc >= 0 && file_loc[nfile_loc - 1] == loc);
-  nfile_loc--;
-}
-
-/* Puts the current file and line number into LOC, or NULL and -1 if
-   none. */
-void
-get_msg_location (const struct source_stream *ss, struct msg_locator *loc)
-{
-  if (nfile_loc)
-    {
-      *loc = *file_loc[nfile_loc - 1];
-    }
-  else
-    {
-      loc->file_name = CONST_CAST (char *, getl_source_name (ss));
-      loc->line_number = getl_source_location (ss);
-    }
-}
diff --git a/src/libpspp/msg-locator.h b/src/libpspp/msg-locator.h
deleted file mode 100644
index 1dfc883..0000000
--- a/src/libpspp/msg-locator.h
+++ /dev/null
@@ -1,34 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-struct msg_locator ;
-
-void msg_locator_done (void);
-
-/* File locator stack functions. */
-
-/* Pushes F onto the stack of file locations. */
-void msg_push_msg_locator (const struct msg_locator *loc);
-
-/* Pops F off the stack of file locations.
-   Argument F is only used for verification that that is actually the
-   item on top of the stack. */
-void msg_pop_msg_locator (const struct msg_locator *loc);
-
-struct source_stream ;
-/* Puts the current file and line number into LOC, or NULL and -1 if
-   none. */
-void get_msg_location (const struct source_stream *ss, struct msg_locator *loc);
diff --git a/src/output/driver.c b/src/output/driver.c
index 136b4c4f2705cc0a6fbb67f8a9c1281e14d3f2cc..0da3e6a0f6549b6fb6f3dd99edf19c7c6ffb884b 100644
--- a/src/output/driver.c
+++ b/src/output/driver.c
@@ -50,9 +50,6 @@ static const struct output_driver_factory *factories[];
  /* Drivers currently registered with output_driver_register(). */
  static struct llx_list drivers = LLX_INITIALIZER (drivers);
  
-static struct output_item *deferred_syntax;
-static bool in_command;
-
  void
  output_close (void)
  {
@@ -72,8 +69,10 @@ output_get_supported_formats (struct string_set *formats)
      string_set_insert (formats, (*fp)->extension);
  }
  
-static void
-output_submit__ (struct output_item *item)
+/* Submits ITEM to the configured output drivers, and transfers ownership to
+   the output subsystem. */
+void
+output_submit (struct output_item *item)
  {
    struct llx *llx, *next;
  
@@ -105,53 +104,6 @@ output_submit__ (struct output_item *item)
    output_item_unref (item);
  }
  
-static void
-flush_deferred_syntax (void)
-{
-  if (deferred_syntax != NULL)
-    {
-      output_submit__ (deferred_syntax);
-      deferred_syntax = NULL;
-    }
-}
-
-/* Submits ITEM to the configured output drivers, and transfers ownership to
-   the output subsystem. */
-void
-output_submit (struct output_item *item)
-{
-  if (is_text_item (item))
-    {
-      struct text_item *text = to_text_item (item);
-      switch (text_item_get_type (text))
-        {
-        case TEXT_ITEM_SYNTAX:
-          if (!in_command)
-            {
-              flush_deferred_syntax ();
-              deferred_syntax = item;
-              return;
-            }
-          break;
-
-        case TEXT_ITEM_COMMAND_OPEN:
-          output_submit__ (item);
-          flush_deferred_syntax ();
-          in_command = true;
-          return;
-
-        case TEXT_ITEM_COMMAND_CLOSE:
-          in_command = false;
-          break;
-
-        default:
-          break;
-        }
-    }
-
-  output_submit__ (item);
-}
-
  /* Flushes output to screen devices, so that the user can see
     output that doesn't fill up an entire page. */
  void
diff --git a/src/ui/gui/automake.mk b/src/ui/gui/automake.mk
index 1595fdd00e356bca8d6504e271087c9fe3ad559c..eb37c16cd66a58c5ad123b38f5e6d833ff241874 100644
--- a/src/ui/gui/automake.mk
+++ b/src/ui/gui/automake.mk
@@ -221,8 +221,6 @@ src_ui_gui_psppire_SOURCES = \
         src/ui/gui/sort-cases-dialog.h \
         src/ui/gui/split-file-dialog.c \
         src/ui/gui/split-file-dialog.h \
-       src/ui/gui/syntax-editor-source.c \
-       src/ui/gui/syntax-editor-source.h \
         src/ui/gui/text-data-import-dialog.c \
         src/ui/gui/text-data-import-dialog.h \
         src/ui/gui/transpose-dialog.c \
diff --git a/src/ui/gui/comments-dialog.c b/src/ui/gui/comments-dialog.c
index 78cc74dea61d8500cf978d5cd8fe0e291aefcfda..35e7c368c678c61f90331fb1d7dd741b05008c6f 100644
--- a/src/ui/gui/comments-dialog.c
+++ b/src/ui/gui/comments-dialog.c
@@ -1,5 +1,5 @@
  /* PSPPIRE - a graphical user interface for PSPP.
-   Copyright (C) 2007, 2010  Free Software Foundation
+   Copyright (C) 2007, 2010, 2011  Free Software Foundation
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -195,13 +195,7 @@ refresh (PsppireDialog *dialog, const struct comment_dialog *cd)
    gtk_text_buffer_set_text (buffer, "", 0);
  
    for ( i = 0 ; i < dict_get_document_line_cnt (cd->dict->dict); ++i )
-    {
-      struct string str;
-      ds_init_empty (&str);
-      dict_get_document_line (cd->dict->dict, i, &str);
-      add_line_to_buffer (buffer, ds_cstr (&str));
-      ds_destroy (&str);
-    }
+    add_line_to_buffer (buffer, dict_get_document_line (cd->dict->dict, i));
  }
  
  
@@ -216,11 +210,10 @@ generate_syntax (const struct comment_dialog *cd)
    GtkWidget *tv = get_widget_assert (cd->xml, "comments-textview1");
    GtkWidget *check = get_widget_assert (cd->xml, "comments-checkbutton1");
    GtkTextBuffer *buffer = gtk_text_view_get_buffer (GTK_TEXT_VIEW (tv));
-  const char *existing_docs = dict_get_documents (cd->dict->dict);
  
    str = g_string_new ("\n* Data File Comments.\n\n");
  
-  if ( NULL != existing_docs)
+  if (dict_get_documents (cd->dict->dict) != NULL)
      g_string_append (str, "DROP DOCUMENTS.\n");
  
    g_string_append (str, "ADD DOCUMENT\n");
diff --git a/src/ui/gui/executor.c b/src/ui/gui/executor.c
index 24b80db2443d3163ee5978ba02b8417341fac102..754d09ae5681b44b357f7882db701fa8eb6d631a 100644
--- a/src/ui/gui/executor.c
+++ b/src/ui/gui/executor.c
@@ -22,15 +22,12 @@
  #include "data/procedure.h"
  #include "language/command.h"
  #include "language/lexer/lexer.h"
-#include "language/syntax-string-source.h"
  #include "libpspp/cast.h"
-#include "libpspp/getl.h"
  #include "output/driver.h"
  #include "ui/gui/psppire-data-store.h"
  #include "ui/gui/psppire-output-window.h"
  
  extern struct dataset *the_dataset;
-extern struct source_stream *the_source_stream;
  extern PsppireDataStore *the_data_store;
  
  /* Lazy casereader callback function used by execute_syntax. */
@@ -42,7 +39,7 @@ create_casereader_from_data_store (void *data_store_)
  }
  
  gboolean
-execute_syntax (struct getl_interface *sss)
+execute_syntax (struct lex_reader *lex_reader)
  {
    struct lexer *lexer;
    gboolean retval = TRUE;
@@ -74,9 +71,9 @@ execute_syntax (struct getl_interface *sss)
  
    g_return_val_if_fail (proc_has_active_file (the_dataset), FALSE);
  
-  lexer = lex_create (the_source_stream);
-
-  getl_append_source (the_source_stream, sss, GETL_BATCH, ERRMODE_CONTINUE);
+  lexer = lex_create ();
+  psppire_set_lexer (lexer);
+  lex_append (lexer, lex_reader);
  
    for (;;)
      {
@@ -85,8 +82,7 @@ execute_syntax (struct getl_interface *sss)
        if ( cmd_result_is_failure (result))
         {
           retval = FALSE;
-         if ( source_stream_current_error_mode (the_source_stream)
-              == ERRMODE_STOP )
+         if ( lex_get_error_mode (lexer) == LEX_ERROR_STOP )
             break;
         }
  
@@ -94,9 +90,8 @@ execute_syntax (struct getl_interface *sss)
         break;
      }
  
-  getl_abort_noninteractive (the_source_stream);
-
    lex_destroy (lexer);
+  psppire_set_lexer (NULL);
  
    proc_execute (the_dataset);
  
@@ -125,5 +120,5 @@ execute_syntax_string (gchar *syntax)
  void
  execute_const_syntax_string (const gchar *syntax)
  {
-  execute_syntax (create_syntax_string_source (syntax));
+  execute_syntax (lex_reader_for_string (syntax));
  }
diff --git a/src/ui/gui/executor.h b/src/ui/gui/executor.h
index 81ece2b8223f9ba6e85fefba905600ccb5fe49a3..ae363e960bedaed574df0ecb65df3c945c195892 100644
--- a/src/ui/gui/executor.h
+++ b/src/ui/gui/executor.h
@@ -20,9 +20,9 @@
  
  #include <glib.h>
  
-struct getl_interface;
+struct lex_reader;
  
-gboolean execute_syntax (struct getl_interface *sss);
+gboolean execute_syntax (struct lex_reader *);
  gchar *execute_syntax_string (gchar *syntax);
  void execute_const_syntax_string (const gchar *syntax);
  
diff --git a/src/ui/gui/main.c b/src/ui/gui/main.c
index 0c88204ae1469347daf7b25347cb4873883f8eec..7e9d4ee51a8960f2d52cb365076888e2eeca715e 100644
--- a/src/ui/gui/main.c
+++ b/src/ui/gui/main.c
@@ -21,13 +21,14 @@
  #include <gtk/gtk.h>
  #include <stdlib.h>
  
+#include "language/lexer/include-path.h"
  #include "libpspp/argv-parser.h"
  #include "libpspp/assertion.h"
  #include "libpspp/cast.h"
-#include "libpspp/getl.h"
-#include "libpspp/version.h"
  #include "libpspp/copyleft.h"
  #include "libpspp/str.h"
+#include "libpspp/string-array.h"
+#include "libpspp/version.h"
  #include "ui/source-init-opts.h"
  
  #include "gl/configmake.h"
@@ -58,28 +59,10 @@ static const struct argv_option startup_options[N_STARTUP_OPTIONS] =
      {"no-splash", 'q', no_argument, OPT_NO_SPLASH}
    };
  
-static char *
-get_default_include_path (void)
-{
-  struct source_stream *ss;
-  struct string dst;
-  char **path;
-  size_t i;
-
-  ss = create_source_stream ();
-  path = getl_include_path (ss);
-  ds_init_empty (&dst);
-  for (i = 0; path[i] != NULL; i++)
-    ds_put_format (&dst, " %s", path[i]);
-  destroy_source_stream (ss);
-
-  return ds_steal_cstr (&dst);
-}
-
  static void
  usage (void)
  {
-  char *default_include_path = get_default_include_path ();
+  char *inc_path = string_array_join (include_path_default (), " ");
    GOptionGroup *gtk_options;
    GOptionContext *ctx;
    gchar *gtk_help_base, *gtk_help;
@@ -116,16 +99,16 @@ Language options:\n\
                              set to `compatible' to disable PSPP extensions\n\
    -i, --interactive         interpret syntax in interactive mode\n\
    -s, --safer               don't allow some unsafe operations\n\
-Default search path:%s\n\
+Default search path: %s\n\
  \n\
  Informative output:\n\
    -h, --help                display this help and exit\n\
    -V, --version             output version information and exit\n\
  \n\
  A non-option argument is interpreted as a .sav or .por file to load.\n"),
-          program_name, gtk_help, default_include_path);
+          program_name, gtk_help, inc_path);
  
-  free (default_include_path);
+  free (inc_path);
    g_free (gtk_help_base);
  
    emit_bug_reporting_address ();
@@ -202,7 +185,6 @@ quit_one_loop (gpointer data)
  
  struct initialisation_parameters
  {
-  struct source_stream *ss;
    const char *data_file;
    GtkWidget *splash_window;
  };
@@ -212,7 +194,7 @@ static gboolean
  run_inner_loop (gpointer data)
  {
    struct initialisation_parameters *ip = data;
-  initialize (ip->ss, ip->data_file);
+  initialize (ip->data_file);
  
    g_timeout_add (500, hide_splash_window, ip->splash_window);
  
@@ -240,7 +222,6 @@ main (int argc, char *argv[])
    struct initialisation_parameters init_p;
    gboolean show_splash = TRUE;
    struct argv_parser *parser;
-  struct source_stream *ss;
    const gchar *vers;
  
    set_program_name (argv[0]);
@@ -264,7 +245,6 @@ main (int argc, char *argv[])
      }
  
  
-  ss = create_source_stream ();
    /* Parse our own options. 
       This must come BEFORE gdk_init otherwise options such as 
       --help --version which ought to work without an X server, won't.
@@ -272,7 +252,7 @@ main (int argc, char *argv[])
    parser = argv_parser_create ();
    argv_parser_add_options (parser, startup_options, N_STARTUP_OPTIONS,
                             startup_option_callback, &show_splash);
-  source_init_register_argv_parser (parser, ss);
+  source_init_register_argv_parser (parser);
    if (!argv_parser_run (parser, argc, argv))
      exit (EXIT_FAILURE);
    argv_parser_destroy (parser);
@@ -283,7 +263,6 @@ main (int argc, char *argv[])
    gdk_init (&argc, &argv);
  
    init_p.splash_window = create_splash_window ();
-  init_p.ss = ss;
    init_p.data_file = optind < argc ? argv[optind] : NULL;
  
    if ( show_splash )
diff --git a/src/ui/gui/psppire-data-window.c b/src/ui/gui/psppire-data-window.c
index b01582bb8f0969be2352b2614babcfd132e1ff86..6dcb8c69c10f4ddc1cfe0586c5177ded574eaba3 100644
--- a/src/ui/gui/psppire-data-window.c
+++ b/src/ui/gui/psppire-data-window.c
@@ -22,7 +22,7 @@
  
  #include "data/any-reader.h"
  #include "data/procedure.h"
-#include "language/syntax-string-source.h"
+#include "language/lexer/lexer.h"
  #include "libpspp/message.h"
  #include "ui/gui/help-menu.h"
  #include "ui/gui/binomial-dialog.h"
@@ -352,8 +352,9 @@ static gboolean
  load_file (PsppireWindow *de, const gchar *file_name)
  {
    gchar *native_file_name;
-  struct getl_interface *sss;
    struct string filename;
+  gchar *syntax;
+  bool ok;
  
    ds_init_empty (&filename);
  
@@ -364,15 +365,12 @@ load_file (PsppireWindow *de, const gchar *file_name)
  
    g_free (native_file_name);
  
-  sss = create_syntax_format_source ("GET FILE=%s.",
-                                    ds_cstr (&filename));
-
+  syntax = g_strdup_printf ("GET FILE=%s.", ds_cstr (&filename));
    ds_destroy (&filename);
  
-  if (execute_syntax (sss) )
-    return TRUE;
-
-  return FALSE;
+  ok = execute_syntax (lex_reader_for_string (syntax));
+  g_free (syntax);
+  return ok;
  }
  
  static GtkWidget *
diff --git a/src/ui/gui/psppire-dict.c b/src/ui/gui/psppire-dict.c
index d19fc809edccf68c8cde11f01f0c6dc8172030b6..91bfed2597be89d297be3c00c7c4d36a021945d6 100644
--- a/src/ui/gui/psppire-dict.c
+++ b/src/ui/gui/psppire-dict.c
@@ -1,5 +1,5 @@
  /* PSPPIRE - a graphical user interface for PSPP.
-   Copyright (C) 2004, 2006, 2007, 2009  Free Software Foundation
+   Copyright (C) 2004, 2006, 2007, 2009, 2010, 2011  Free Software Foundation
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -23,6 +23,7 @@
  #include <gtk/gtk.h>
  
  #include "data/dictionary.h"
+#include "data/identifier.h"
  #include "data/missing-values.h"
  #include "data/value-labels.h"
  #include "data/variable.h"
@@ -425,7 +426,7 @@ psppire_dict_set_name (PsppireDict* d, gint idx, const gchar *name)
    g_assert (d);
    g_assert (PSPPIRE_IS_DICT (d));
  
-  if ( ! var_is_valid_name (name, false))
+  if ( ! dict_id_is_valid (d->dict, name, false))
      return FALSE;
  
    if ( idx < dict_get_var_cnt (d->dict))
@@ -527,7 +528,7 @@ gboolean
  psppire_dict_check_name (const PsppireDict *dict,
                          const gchar *name, gboolean report)
  {
-  if ( ! var_is_valid_name (name, report ) )
+  if ( ! dict_id_is_valid (dict->dict, name, report ) )
      return FALSE;
  
    if (psppire_dict_lookup_var (dict, name))
@@ -835,7 +836,7 @@ gboolean
  psppire_dict_rename_var (PsppireDict *dict, struct variable *v,
                          const gchar *name)
  {
-  if ( ! var_is_valid_name (name, false))
+  if ( ! dict_id_is_valid (dict->dict, name, false))
      return FALSE;
  
    /* Make sure no other variable has this name */
diff --git a/src/ui/gui/psppire-syntax-window.c b/src/ui/gui/psppire-syntax-window.c
index 1d78881904f3ef52239c303e419c5eea2a6f7f28..50e7f09d1365a358cbcfcecb1e6adc8a967de197 100644
--- a/src/ui/gui/psppire-syntax-window.c
+++ b/src/ui/gui/psppire-syntax-window.c
@@ -30,7 +30,6 @@
  #include "psppire-data-window.h"
  #include "psppire-window-register.h"
  #include "psppire-syntax-window.h"
-#include "syntax-editor-source.h"
  
  #include "xalloc.h"
  
@@ -156,8 +155,16 @@ editor_execute_syntax (const PsppireSyntaxWindow *sw, GtkTextIter start,
                        GtkTextIter stop)
  {
    PsppireWindow *win = PSPPIRE_WINDOW (sw);
-  const gchar *name = psppire_window_get_filename (win);
-  execute_syntax (create_syntax_editor_source (sw->buffer, start, stop, name));
+  struct lex_reader *reader;
+  gchar *text;
+
+  text = gtk_text_buffer_get_text (sw->buffer, &start, &stop, FALSE);
+  reader = lex_reader_for_string (text);
+  g_free (text);
+
+  lex_reader_set_file_name (reader, psppire_window_get_filename (win));
+
+  execute_syntax (reader);
  }
  
  
@@ -590,8 +597,6 @@ on_modified_changed (GtkTextBuffer *buffer, PsppireWindow *window)
      psppire_window_set_unsaved (window);
  }
  
-extern struct source_stream *the_source_stream ;
-
  static void
  psppire_syntax_window_init (PsppireSyntaxWindow *window)
  {
@@ -616,7 +621,6 @@ psppire_syntax_window_init (PsppireSyntaxWindow *window)
    window->edit_paste = get_action_assert (xml, "edit_paste");
  
    window->buffer = gtk_text_view_get_buffer (GTK_TEXT_VIEW (text_view));
-  window->lexer = lex_create (the_source_stream);
  
    window->sb = get_widget_assert (xml, "statusbar2");
    window->text_context = gtk_statusbar_get_context_id (GTK_STATUSBAR (window->sb), "Text Context");
diff --git a/src/ui/gui/psppire-syntax-window.h b/src/ui/gui/psppire-syntax-window.h
index 08f5a7c5b2190246c9691a43b191b25b8f81484a..2b1fe03f4d00894fc1af62e5b063873759a69f9f 100644
--- a/src/ui/gui/psppire-syntax-window.h
+++ b/src/ui/gui/psppire-syntax-window.h
@@ -48,7 +48,6 @@ struct _PsppireSyntaxWindow
    /* <private> */
  
    GtkTextBuffer *buffer;  /* The buffer which contains the text */
-  struct lexer *lexer;    /* Lexer to parse syntax */
    GtkWidget *sb;
    guint text_context;
  
diff --git a/src/ui/gui/psppire-var-store.c b/src/ui/gui/psppire-var-store.c
index a2e6685443e3dd5a525d5f985f8e7acb1f4dd709..2a915afe149d6288a8af80c2ff201712ee63058a 100644
--- a/src/ui/gui/psppire-var-store.c
+++ b/src/ui/gui/psppire-var-store.c
@@ -1,5 +1,5 @@
  /* PSPPIRE - a graphical user interface for PSPP.
-   Copyright (C) 2006, 2009, 2010  Free Software Foundation
+   Copyright (C) 2006, 2009, 2010, 2011  Free Software Foundation
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -489,7 +489,7 @@ psppire_var_store_clear (PsppireSheetModel *model,  glong row, glong col)
    switch (col)
      {
      case PSPPIRE_VAR_STORE_COL_LABEL:
-      var_set_label (pv, NULL);
+      var_clear_label (pv);
        return TRUE;
        break;
      }
@@ -588,7 +588,8 @@ psppire_var_store_set_string (PsppireSheetModel *model,
        break;
      case PSPPIRE_VAR_STORE_COL_LABEL:
        {
-       var_set_label (pv, text);
+       var_set_label (pv, text,
+                       psppire_dict_encoding (var_store->dictionary), true);
         return TRUE;
        }
        break;
diff --git a/src/ui/gui/psppire.c b/src/ui/gui/psppire.c
index bb6006d111be872bb67aa29a9c0716bc4d2c6a50..0d41932adc0e6cd51036420908ba439418e33250 100644
--- a/src/ui/gui/psppire.c
+++ b/src/ui/gui/psppire.c
@@ -32,9 +32,6 @@
  #include "data/sys-file-reader.h"
  
  #include "language/lexer/lexer.h"
-#include "language/syntax-string-source.h"
-
-#include "libpspp/getl.h"
  #include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/version.h"
@@ -68,12 +65,10 @@ PsppireVarStore *the_var_store = 0;
  
  static void create_icon_factory (void);
  
-struct source_stream *the_source_stream ;
  struct dataset * the_dataset = NULL;
  
  static GtkWidget *the_data_window;
  
-static void handle_msg (const struct msg *);
  static void load_data_file (const char *);
  
  static void
@@ -89,7 +84,7 @@ replace_casereader (struct casereader *s)
  
  
  void
-initialize (struct source_stream *ss, const char *data_file)
+initialize (const char *data_file)
  {
    PsppireDict *dictionary = 0;
  
@@ -102,9 +97,7 @@ initialize (struct source_stream *ss, const char *data_file)
    fh_init ();
  
    the_dataset = create_dataset ();
-
-  the_source_stream = ss;
-  msg_init (ss, handle_msg);
+  psppire_set_lexer (NULL);
  
    dictionary = psppire_dict_new_from_dict (dataset_dict (the_dataset));
  
@@ -143,7 +136,6 @@ initialize (struct source_stream *ss, const char *data_file)
  void
  de_initialize (void)
  {
-  destroy_source_stream (the_source_stream);
    settings_done ();
    output_close ();
    i18n_done ();
@@ -300,7 +292,25 @@ load_data_file (const char *arg)
  }
  
  static void
-handle_msg (const struct msg *m)
+handle_msg (const struct msg *m_, void *lexer_)
+{
+  struct lexer *lexer = lexer_;
+  struct msg m = *m_;
+
+  if (lexer != NULL && m.file_name == NULL)
+    {
+      m.file_name = CONST_CAST (char *, lex_get_file_name (lexer));
+      m.first_line = lex_get_first_line_number (lexer, 0);
+      m.last_line = lex_get_last_line_number (lexer, 0);
+      m.first_column = lex_get_first_column (lexer, 0);
+      m.last_column = lex_get_last_column (lexer, 0);
+    }
+
+  message_item_submit (message_item_create (&m));
+}
+
+void
+psppire_set_lexer (struct lexer *lexer)
  {
-  message_item_submit (message_item_create (m));
+  msg_set_handler (handle_msg, lexer);
  }
diff --git a/src/ui/gui/psppire.h b/src/ui/gui/psppire.h
index ee747ef90606b200b195d9fa497ce239457fe58d..bb7966089c4059508264d03146a146f4fbce20f0 100644
--- a/src/ui/gui/psppire.h
+++ b/src/ui/gui/psppire.h
@@ -17,13 +17,15 @@
  #ifndef PSPPIRE_H
  #define PSPPIRE_H
  
-struct source_stream;
+struct lexer;
  
-void initialize (struct source_stream *, const char *data_file);
+void initialize (const char *data_file);
  void de_initialize (void);
  
  void psppire_quit (void);
  
  const char * output_file_name (void);
  
+void psppire_set_lexer (struct lexer *);
+
  #endif /* PSPPIRE_H */
diff --git a/src/ui/gui/syntax-editor-source.c b/src/ui/gui/syntax-editor-source.c
deleted file mode 100644
index 6ec866c..0000000
--- a/src/ui/gui/syntax-editor-source.c
+++ /dev/null
@@ -1,130 +0,0 @@
-/* PSPPIRE - a graphical user interface for PSPP.
-   Copyright (C) 2006, 2009  Free Software Foundation
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-
-#include <config.h>
-
-#include <libpspp/getl.h>
-#include <libpspp/compiler.h>
-#include <libpspp/cast.h>
-#include <libpspp/str.h>
-
-#include <stdlib.h>
-
-#include <gtk/gtk.h>
-
-#include "syntax-editor-source.h"
-#include "psppire-syntax-window.h"
-
-#include "xalloc.h"
-
-struct syntax_editor_source
-  {
-    struct getl_interface parent;
-    GtkTextBuffer *buffer;
-    GtkTextIter i;
-    GtkTextIter end;
-    const gchar *name;
-  };
-
-
-static bool
-always_false (const struct getl_interface *i UNUSED)
-{
-  return false;
-}
-
-/* Returns the name of the source */
-static const char *
-name (const struct getl_interface *i)
-{
-  const struct syntax_editor_source *ses = (const struct syntax_editor_source *) i;
-  return ses->name;
-}
-
-
-/* Returns the location within the source */
-static int
-location (const struct getl_interface *i)
-{
-  const struct syntax_editor_source *ses = (const struct syntax_editor_source *) i;
-
-  return gtk_text_iter_get_line (&ses->i);
-}
-
-
-static bool
-read_line_from_buffer (struct getl_interface *i,
-                      struct string *line)
-{
-  gchar *text;
-  GtkTextIter next_line;
-
-  struct syntax_editor_source *ses
-    = UP_CAST (i, struct syntax_editor_source, parent);
-
-  if ( gtk_text_iter_compare (&ses->i, &ses->end) >= 0)
-    return false;
-
-  next_line = ses->i;
-  gtk_text_iter_forward_line (&next_line);
-
-  text = gtk_text_buffer_get_text (ses->buffer,
-                                  &ses->i, &next_line,
-                                  FALSE);
-  g_strchomp (text);
-
-  ds_assign_cstr (line, text);
-
-  g_free (text);
-
-  gtk_text_iter_forward_line (&ses->i);
-
-  return true;
-}
-
-
-static void
-do_close (struct getl_interface *i )
-{
-  free (i);
-}
-
-struct getl_interface *
-create_syntax_editor_source (GtkTextBuffer *buffer,
-                            GtkTextIter start,
-                            GtkTextIter stop,
-                            const gchar *nm
-                            )
-{
-  struct syntax_editor_source *ses = xzalloc (sizeof *ses);
-
-  ses->buffer = buffer;
-  ses->i = start;
-  ses->end = stop;
-  ses->name = nm;
-
-
-  ses->parent.interactive = always_false;
-  ses->parent.read = read_line_from_buffer;
-  ses->parent.close = do_close;
-
-  ses->parent.name = name;
-  ses->parent.location = location;
-
-
-  return &ses->parent;
-}
diff --git a/src/ui/gui/syntax-editor-source.h b/src/ui/gui/syntax-editor-source.h
deleted file mode 100644
index f8d08ea..0000000
--- a/src/ui/gui/syntax-editor-source.h
+++ /dev/null
@@ -1,34 +0,0 @@
-/* PSPPIRE - a graphical user interface for PSPP.
-   Copyright (C) 2006  Free Software Foundation
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#ifndef SYNTAX_EDITOR_SOURCE_H
-#define SYNTAX_EDITOR_SOURCE_H
-
-#include <gtk/gtk.h>
-struct getl_interface;
-
-struct syntax_editor;
-
-struct getl_interface *
-create_syntax_editor_source (GtkTextBuffer *buffer,
-                            GtkTextIter start,
-                            GtkTextIter stop,
-                            const gchar *name
-                            );
-
-
-
-#endif
diff --git a/src/ui/source-init-opts.c b/src/ui/source-init-opts.c
index a984b3d099546d59b09b1487e1ec0e832ed3cb67..9627fdf57a5525696f2ba5953b110a0b2f14cf94 100644
--- a/src/ui/source-init-opts.c
+++ b/src/ui/source-init-opts.c
@@ -26,11 +26,10 @@
  #include "data/por-file-reader.h"
  #include "data/settings.h"
  #include "data/sys-file-reader.h"
-#include "language/syntax-file.h"
-#include "language/syntax-string-source.h"
+#include "language/lexer/include-path.h"
+#include "language/lexer/lexer.h"
  #include "libpspp/assertion.h"
  #include "libpspp/argv-parser.h"
-#include "libpspp/getl.h"
  #include "libpspp/llx.h"
  #include "libpspp/message.h"
  #include "ui/syntax-gen.h"
@@ -62,10 +61,8 @@ static const struct argv_option source_init_options[N_SOURCE_INIT_OPTIONS] =
    };
  
  static void
-source_init_option_callback (int id, void *ss_)
+source_init_option_callback (int id, void *aux UNUSED)
  {
-  struct source_stream *ss = ss_;
-
    switch (id)
      {
      case OPT_ALGORITHM:
@@ -82,13 +79,13 @@ source_init_option_callback (int id, void *ss_)
  
      case OPT_INCLUDE:
        if (!strcmp (optarg, "-"))
-       getl_clear_include_path (ss);
+        include_path_clear ();
        else
-       getl_add_include_dir (ss, optarg);
+        include_path_add (optarg);
        break;
  
      case OPT_NO_INCLUDE:
-      getl_clear_include_path (ss);
+      include_path_clear ();
        break;
  
      case OPT_SAFER:
@@ -113,9 +110,8 @@ source_init_option_callback (int id, void *ss_)
  }
  
  void
-source_init_register_argv_parser (struct argv_parser *ap,
-                                  struct source_stream *ss)
+source_init_register_argv_parser (struct argv_parser *ap)
  {
    argv_parser_add_options (ap, source_init_options, N_SOURCE_INIT_OPTIONS,
-                           source_init_option_callback, ss);
+                           source_init_option_callback, NULL);
  }
diff --git a/src/ui/source-init-opts.h b/src/ui/source-init-opts.h
index cfd87f3fa0322266114d346b850b73332dfd42bf..ee1de0319a2af3c7ecd2d7ba096b841c0b85d8ee 100644
--- a/src/ui/source-init-opts.h
+++ b/src/ui/source-init-opts.h
@@ -19,9 +19,7 @@
  #define UI_SOURCE_INIT_OPTS
  
  struct argv_parser;
-struct source_stream;
  
-void source_init_register_argv_parser (struct argv_parser *,
-                                       struct source_stream *);
+void source_init_register_argv_parser (struct argv_parser *);
  
  #endif /* ui/source/source-init-opts.h */
diff --git a/src/ui/terminal/automake.mk b/src/ui/terminal/automake.mk
index 81b896dc878839d44590f9e3205ebc53d5e07fb0..3fa72c41739dc0505384762c67dabe1c1ac198b8 100644
--- a/src/ui/terminal/automake.mk
+++ b/src/ui/terminal/automake.mk
@@ -3,16 +3,13 @@
  noinst_LTLIBRARIES += src/ui/terminal/libui.la
  
  src_ui_terminal_libui_la_SOURCES = \
-       src/ui/terminal/read-line.c \
-       src/ui/terminal/read-line.h \
         src/ui/terminal/main.c \
-       src/ui/terminal/msg-ui.c \
-       src/ui/terminal/msg-ui.h \
-       src/ui/terminal/terminal.c \
-       src/ui/terminal/terminal.h \
         src/ui/terminal/terminal-opts.c \
-       src/ui/terminal/terminal-opts.h 
-
+       src/ui/terminal/terminal-opts.h \
+       src/ui/terminal/terminal-reader.c \
+       src/ui/terminal/terminal-reader.h \
+       src/ui/terminal/terminal.c \
+       src/ui/terminal/terminal.h
  
  src_ui_terminal_libui_la_CFLAGS = $(NCURSES_CFLAGS)
  
diff --git a/src/ui/terminal/main.c b/src/ui/terminal/main.c
index 5fa1604143fe7a7ba136988d85401cafaf0d2b26..a9db6febe4a9db25e72244b206e51bf88e3db724 100644
--- a/src/ui/terminal/main.c
+++ b/src/ui/terminal/main.c
@@ -1,5 +1,5 @@
  /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006, 2007, 2009, 2010 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2006, 2007, 2009, 2010, 2011 Free Software Foundation, Inc.
  
     This program is free software: you can redistribute it and/or modify
     it under the terms of the GNU General Public License as published by
@@ -39,20 +39,19 @@
  #include "gsl/gsl_errno.h"
  #include "language/command.h"
  #include "language/lexer/lexer.h"
-#include "language/syntax-file.h"
+#include "language/lexer/include-path.h"
  #include "libpspp/argv-parser.h"
  #include "libpspp/compiler.h"
-#include "libpspp/getl.h"
  #include "libpspp/i18n.h"
  #include "libpspp/message.h"
  #include "libpspp/version.h"
  #include "math/random.h"
  #include "output/driver.h"
+#include "output/message-item.h"
  #include "ui/debugger.h"
  #include "ui/source-init-opts.h"
-#include "ui/terminal/msg-ui.h"
-#include "ui/terminal/read-line.h"
  #include "ui/terminal/terminal-opts.h"
+#include "ui/terminal/terminal-reader.h"
  #include "ui/terminal/terminal.h"
  
  #include "gl/fatal-signal.h"
@@ -62,15 +61,13 @@
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
  
-static struct dataset * the_dataset = NULL;
+static struct dataset *the_dataset;
  
-static struct lexer *the_lexer;
-static struct source_stream *the_source_stream ;
-
-static void add_syntax_file (struct source_stream *, enum syntax_mode,
-                             const char *file_name);
+static void add_syntax_reader (struct lexer *, const char *file_name,
+                               const char *encoding, enum lex_syntax_mode);
  static void bug_handler(int sig);
  static void fpu_init (void);
+static void output_msg (const struct msg *, void *);
  
  /* Program entry point. */
  int
@@ -78,8 +75,10 @@ main (int argc, char **argv)
  {
    struct terminal_opts *terminal_opts;
    struct argv_parser *parser;
-  enum syntax_mode syntax_mode;
+  enum lex_syntax_mode syntax_mode;
+  char *syntax_encoding;
    bool process_statrc;
+  struct lexer *lexer;
  
    set_program_name (argv[0]);
  
@@ -92,31 +91,32 @@ main (int argc, char **argv)
    gsl_set_error_handler_off ();
  
    fh_init ();
-  the_source_stream = create_source_stream ();
-  readln_initialize ();
    settings_init ();
    terminal_check_size ();
    random_init ();
  
+  lexer = lex_create ();
    the_dataset = create_dataset ();
  
    parser = argv_parser_create ();
-  terminal_opts = terminal_opts_init (parser, &syntax_mode, &process_statrc);
-  source_init_register_argv_parser (parser, the_source_stream);
+  terminal_opts = terminal_opts_init (parser, &syntax_mode, &process_statrc,
+                                      &syntax_encoding);
+  source_init_register_argv_parser (parser);
    if (!argv_parser_run (parser, argc, argv))
      exit (EXIT_FAILURE);
    terminal_opts_done (terminal_opts, argc, argv);
    argv_parser_destroy (parser);
  
-  msg_ui_init (the_source_stream);
+  msg_set_handler (output_msg, lexer);
+  dataset_set_default_syntax_encoding (the_dataset, syntax_encoding);
  
    /* Add syntax files to source stream. */
    if (process_statrc)
      {
-      char *rc = fn_search_path ("rc", getl_include_path (the_source_stream));
+      char *rc = include_path_search ("rc");
        if (rc != NULL)
          {
-          add_syntax_file (the_source_stream, GETL_BATCH, rc);
+          add_syntax_reader (lexer, rc, "Auto", LEX_SYNTAX_AUTO);
            free (rc);
          }
      }
@@ -125,28 +125,37 @@ main (int argc, char **argv)
        int i;
  
        for (i = optind; i < argc; i++)
-        add_syntax_file (the_source_stream, syntax_mode, argv[i]);
+        add_syntax_reader (lexer, argv[i], syntax_encoding, syntax_mode);
      }
    else
-    add_syntax_file (the_source_stream, syntax_mode, "-");
+    add_syntax_reader (lexer, "-", syntax_encoding, syntax_mode);
  
    /* Parse and execute syntax. */
-  the_lexer = lex_create (the_source_stream);
+  lex_get (lexer);
    for (;;)
      {
-      int result = cmd_parse (the_lexer, the_dataset);
+      int result = cmd_parse (lexer, the_dataset);
  
        if (result == CMD_EOF || result == CMD_FINISH)
         break;
-      if (result == CMD_CASCADING_FAILURE &&
-         !getl_is_interactive (the_source_stream))
-       {
-         msg (SE, _("Stopping syntax file processing here to avoid "
-                    "a cascade of dependent command failures."));
-         getl_abort_noninteractive (the_source_stream);
-       }
-      else if (msg_ui_too_many_errors ())
-        getl_abort_noninteractive (the_source_stream);
+      else if (cmd_result_is_failure (result) && lex_token (lexer) != T_STOP)
+        {
+          if (lex_get_error_mode (lexer) == LEX_ERROR_STOP)
+            {
+              msg (MW, _("Error encountered while ERROR=STOP is effective."));
+              lex_discard_noninteractive (lexer);
+            }
+          else if (result == CMD_CASCADING_FAILURE
+                   && lex_get_error_mode (lexer) != LEX_ERROR_INTERACTIVE)
+            {
+              msg (SE, _("Stopping syntax file processing here to avoid "
+                         "a cascade of dependent command failures."));
+              lex_discard_noninteractive (lexer);
+            }
+        }
+
+      if (msg_ui_too_many_errors ())
+        lex_discard_noninteractive (lexer);
      }
  
  
@@ -155,16 +164,13 @@ main (int argc, char **argv)
    random_done ();
    settings_done ();
    fh_done ();
-  lex_destroy (the_lexer);
-  destroy_source_stream (the_source_stream);
-  readln_uninitialize ();
+  lex_destroy (lexer);
    output_close ();
-  msg_ui_done ();
    i18n_done ();
  
    return msg_ui_any_errors ();
  }
-
+\f
  static void
  fpu_init (void)
  {
@@ -210,13 +216,32 @@ bug_handler(int sig)
  }
  
  static void
-add_syntax_file (struct source_stream *ss, enum syntax_mode syntax_mode,
-                 const char *file_name)
+output_msg (const struct msg *m_, void *lexer_)
+{
+  struct lexer *lexer = lexer_;
+  struct msg m = *m_;
+
+  if (m.file_name == NULL)
+    {
+      m.file_name = CONST_CAST (char *, lex_get_file_name (lexer));
+      m.first_line = lex_get_first_line_number (lexer, 0);
+      m.last_line = lex_get_last_line_number (lexer, 0);
+    }
+
+  message_item_submit (message_item_create (&m));
+}
+
+static void
+add_syntax_reader (struct lexer *lexer, const char *file_name,
+                   const char *encoding, enum lex_syntax_mode syntax_mode)
  {
-  struct getl_interface *source;
+  struct lex_reader *reader;
+
+  reader = (!strcmp (file_name, "-") && isatty (STDIN_FILENO)
+            ? terminal_reader_create ()
+            : lex_reader_for_file (file_name, encoding, syntax_mode,
+                                   LEX_ERROR_CONTINUE));
  
-  source = (!strcmp (file_name, "-") && isatty (STDIN_FILENO)
-           ? create_readln_source ()
-           : create_syntax_file_source (file_name));
-  getl_append_source (ss, source, syntax_mode, ERRMODE_CONTINUE);
+  if (reader)
+    lex_append (lexer, reader);
  }
diff --git a/src/ui/terminal/msg-ui.c b/src/ui/terminal/msg-ui.c
deleted file mode 100644
index 63682d1..0000000
--- a/src/ui/terminal/msg-ui.c
+++ /dev/null
@@ -1,41 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006, 2010 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#include <config.h>
-
-#include "msg-ui.h"
-#include "libpspp/message.h"
-#include "libpspp/msg-locator.h"
-#include "output/message-item.h"
-
-static void
-handle_msg (const struct msg *m)
-{
-  message_item_submit (message_item_create (m));
-}
-
-void
-msg_ui_init (struct source_stream *ss)
-{
-  msg_init (ss, handle_msg);
-}
-
-void
-msg_ui_done (void)
-{
-  msg_done ();
-  msg_locator_done ();
-}
diff --git a/src/ui/terminal/msg-ui.h b/src/ui/terminal/msg-ui.h
deleted file mode 100644
index 197d7c0..0000000
--- a/src/ui/terminal/msg-ui.h
+++ /dev/null
@@ -1,29 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#ifndef MSG_UI_H
-#define MSG_UI_H 1
-
-#include <stdbool.h>
-#include <stdio.h>
-
-struct source_stream;
-
-void msg_ui_set_error_file (FILE *);
-void msg_ui_init (struct source_stream *);
-void msg_ui_done (void);
-
-#endif /* msg-ui.h */
diff --git a/src/ui/terminal/read-line.c b/src/ui/terminal/read-line.c
deleted file mode 100644
index 544e18d..0000000
--- a/src/ui/terminal/read-line.c
+++ /dev/null
@@ -1,264 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2007, 2009, 2011 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#include <config.h>
-
-#include "ui/terminal/read-line.h"
-
-#include <stdlib.h>
-#include <stdbool.h>
-#include <assert.h>
-#include <errno.h>
-#if ! HAVE_READLINE
-#include <stdint.h>
-#endif
-
-#include "data/file-name.h"
-#include "data/settings.h"
-#include "language/command.h"
-#include "language/prompt.h"
-#include "libpspp/cast.h"
-#include "libpspp/message.h"
-#include "libpspp/str.h"
-#include "libpspp/version.h"
-#include "output/driver.h"
-#include "output/journal.h"
-#include "ui/terminal/msg-ui.h"
-#include "ui/terminal/terminal.h"
-
-#include "gl/xalloc.h"
-
-#include "gettext.h"
-#define _(msgid) gettext (msgid)
-
-#if HAVE_READLINE
-#include <readline/readline.h>
-#include <readline/history.h>
-
-static char *history_file;
-
-static char **complete_command_name (const char *, int, int);
-static char **dont_complete (const char *, int, int);
-#endif /* HAVE_READLINE */
-
-
-struct readln_source
-{
-  struct getl_interface parent ;
-
-  bool (*interactive_func) (struct string *line,
-                           enum prompt_style) ;
-};
-
-
-static bool initialised = false;
-
-/* Initialize getl. */
-void
-readln_initialize (void)
-{
-  initialised = true;
-
-#if HAVE_READLINE
-  rl_basic_word_break_characters = "\n";
-  using_history ();
-  stifle_history (500);
-  if (history_file == NULL)
-    {
-      const char *home_dir = getenv ("HOME");
-      if (home_dir != NULL)
-        {
-          history_file = xasprintf ("%s/.pspp_history", home_dir);
-          read_history (history_file);
-        }
-    }
-#endif
-}
-
-/* Close getl. */
-void
-readln_uninitialize (void)
-{
-  initialised = false;
-
-#if HAVE_READLINE
-  if (history_file != NULL && false == settings_get_testing_mode () )
-    write_history (history_file);
-  clear_history ();
-  free (history_file);
-#endif
-}
-
-
-static bool
-read_interactive (struct getl_interface *s,
-                  struct string *line)
-{
-  struct readln_source *is  = UP_CAST (s, struct readln_source, parent);
-
-  return is->interactive_func (line, prompt_get_style ());
-}
-
-static bool
-always_true (const struct getl_interface *s UNUSED)
-{
-  return true;
-}
-
-/* Display a welcoming message. */
-static void
-welcome (void)
-{
-  static bool welcomed = false;
-  if (welcomed)
-    return;
-  welcomed = true;
-  fputs ("PSPP is free software and you are welcome to distribute copies of "
-        "it\nunder certain conditions; type \"show copying.\" to see the "
-        "conditions.\nThere is ABSOLUTELY NO WARRANTY for PSPP; type \"show "
-        "warranty.\" for details.\n", stdout);
-  puts (stat_version);
-  readln_initialize ();
-  journal_enable ();
-}
-
-/* Gets a line from the user and stores it into LINE.
-   Prompts the user with PROMPT.
-   Returns true if successful, false at end of file.
-   */
-static bool
-readln_read (struct string *line, enum prompt_style style)
-{
-  const char *prompt = prompt_get (style);
-#if HAVE_READLINE
-  char *string;
-#endif
-  bool eof;
-
-  assert (initialised);
-
-  msg_ui_reset_counts ();
-
-  welcome ();
-
-  output_flush ();
-
-#if HAVE_READLINE
-  rl_attempted_completion_function = (style == PROMPT_FIRST
-                                      ? complete_command_name
-                                      : dont_complete);
-  string = readline (prompt);
-  if (string == NULL)
-    eof = true;
-  else
-    {
-      if (string[0])
-        add_history (string);
-      ds_assign_cstr (line, string);
-      free (string);
-      eof = false;
-    }
-#else
-  fputs (prompt, stdout);
-  fflush (stdout);
-  if (ds_read_line (line, stdin, SIZE_MAX))
-    {
-      ds_chomp (line, '\n');
-      eof = false;
-    }
-  else
-    eof = true;
-#endif
-
-  /* Check whether the size of the window has changed, so that
-     the output drivers can adjust their settings as needed.  We
-     only do this for the first line of a command, as it's
-     possible that the output drivers are actually in use
-     afterward, and we don't want to confuse them in the middle
-     of output. */
-  if (style == PROMPT_FIRST)
-    terminal_check_size ();
-
-  return !eof;
-}
-
-static void
-readln_close (struct getl_interface *i)
-{
-  free (i);
-}
-
-/* Creates a source which uses readln to get its line */
-struct getl_interface *
-create_readln_source (void)
-{
-  struct readln_source *rlns  = xzalloc (sizeof (*rlns));
-
-  rlns->interactive_func = readln_read;
-
-  rlns->parent.interactive = always_true;
-  rlns->parent.read = read_interactive;
-  rlns->parent.close = readln_close;
-
-  return &rlns->parent;
-}
-
-
-#if HAVE_READLINE
-static char *command_generator (const char *text, int state);
-
-/* Returns a set of command name completions for TEXT.
-   This is of the proper form for assigning to
-   rl_attempted_completion_function. */
-static char **
-complete_command_name (const char *text, int start, int end UNUSED)
-{
-  if (start == 0)
-    {
-      /* Complete command name at start of line. */
-      return rl_completion_matches (text, command_generator);
-    }
-  else
-    {
-      /* Otherwise don't do any completion. */
-      rl_attempted_completion_over = 1;
-      return NULL;
-    }
-}
-
-/* Do not do any completion for TEXT. */
-static char **
-dont_complete (const char *text UNUSED, int start UNUSED, int end UNUSED)
-{
-  rl_attempted_completion_over = 1;
-  return NULL;
-}
-
-/* If STATE is 0, returns the first command name matching TEXT.
-   Otherwise, returns the next command name matching TEXT.
-   Returns a null pointer when no matches are left. */
-static char *
-command_generator (const char *text, int state)
-{
-  static const struct command *cmd;
-  const char *name;
-
-  if (state == 0)
-    cmd = NULL;
-  name = cmd_complete (text, &cmd);
-  return name ? xstrdup (name) : NULL;
-}
-#endif /* HAVE_READLINE */
diff --git a/src/ui/terminal/read-line.h b/src/ui/terminal/read-line.h
deleted file mode 100644
index 0eb6709..0000000
--- a/src/ui/terminal/read-line.h
+++ /dev/null
@@ -1,31 +0,0 @@
-/* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2011 Free Software Foundation, Inc.
-
-   This program is free software: you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation, either version 3 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
-
-#ifndef READLN_H
-#define READLN_H
-
-#include "libpspp/str.h"
-#include "libpspp/getl.h"
-
-void readln_initialize (void);
-void readln_uninitialize (void);
-
-struct getl_interface *create_readln_source (void);
-
-
-
-#endif /* READLN_H */
-
diff --git a/src/ui/terminal/terminal-opts.c b/src/ui/terminal/terminal-opts.c
index d55a8a2e9607d5622dcb1f52b192a78263f7f28b..121d89d5712a5bb127ff73f1778d5e498bb7a234 100644
--- a/src/ui/terminal/terminal-opts.c
+++ b/src/ui/terminal/terminal-opts.c
@@ -23,12 +23,11 @@
  
  #include "data/settings.h"
  #include "data/file-name.h"
-#include "language/syntax-file.h"
+#include "language/lexer/include-path.h"
  #include "libpspp/argv-parser.h"
  #include "libpspp/assertion.h"
  #include "libpspp/cast.h"
  #include "libpspp/compiler.h"
-#include "libpspp/getl.h"
  #include "libpspp/llx.h"
  #include "libpspp/str.h"
  #include "libpspp/string-array.h"
@@ -38,8 +37,6 @@
  #include "output/driver.h"
  #include "output/driver-provider.h"
  #include "output/msglog.h"
-#include "ui/terminal/msg-ui.h"
-#include "ui/terminal/read-line.h"
  
  #include "gl/error.h"
  #include "gl/progname.h"
@@ -53,12 +50,13 @@
  
  struct terminal_opts
    {
-    enum syntax_mode *syntax_mode;
      struct string_map options;  /* Output driver options. */
      bool has_output_driver;
      bool has_terminal_driver;
      bool has_error_file;
+    enum lex_syntax_mode *syntax_mode;
      bool *process_statrc;
+    char **syntax_encoding;
    };
  
  enum
@@ -68,7 +66,9 @@ enum
      OPT_OUTPUT,
      OPT_OUTPUT_OPTION,
      OPT_NO_OUTPUT,
+    OPT_BATCH,
      OPT_INTERACTIVE,
+    OPT_SYNTAX_ENCODING,
      OPT_NO_STATRC,
      OPT_HELP,
      OPT_VERSION,
@@ -82,7 +82,9 @@ static struct argv_option terminal_argv_options[N_TERMINAL_OPTIONS] =
      {"output", 'o', required_argument, OPT_OUTPUT},
      {NULL, 'O', required_argument, OPT_OUTPUT_OPTION},
      {"no-output", 0, no_argument, OPT_NO_OUTPUT},
+    {"batch", 'b', no_argument, OPT_BATCH},
      {"interactive", 'i', no_argument, OPT_INTERACTIVE},
+    {"syntax-encoding", 0, required_argument, OPT_SYNTAX_ENCODING},
      {"no-statrc", 'r', no_argument, OPT_NO_STATRC},
      {"help", 'h', no_argument, OPT_HELP},
      {"version", 'V', no_argument, OPT_VERSION},
@@ -160,29 +162,11 @@ get_supported_formats (void)
    return format_string;
  }
  
-static char *
-get_default_include_path (void)
-{
-  struct source_stream *ss;
-  struct string dst;
-  char **path;
-  size_t i;
-
-  ss = create_source_stream ();
-  path = getl_include_path (ss);
-  ds_init_empty (&dst);
-  for (i = 0; path[i] != NULL; i++)
-    ds_put_format (&dst, " %s", path[i]);
-  destroy_source_stream (ss);
-
-  return ds_steal_cstr (&dst);
-}
-
  static void
  usage (void)
  {
    char *supported_formats = get_supported_formats ();
-  char *default_include_path = get_default_include_path ();
+  char *inc_path = string_array_join (include_path_default (), " ");
  
    printf (_("\
  PSPP, a program for statistical analysis of sample data.\n\
@@ -208,19 +192,21 @@ Language options:\n\
                              calculated from broken algorithms\n\
    -x, --syntax={compatible|enhanced}\n\
                              set to `compatible' to disable PSPP extensions\n\
+  -b, --batch               interpret syntax in batch mode\n\
    -i, --interactive         interpret syntax in interactive mode\n\
+  --syntax-encoding=ENCODING  specify encoding for syntax files\n\
    -s, --safer               don't allow some unsafe operations\n\
-Default search path:%s\n\
+Default search path: %s\n\
  \n\
  Informative output:\n\
    -h, --help                display this help and exit\n\
    -V, --version             output version information and exit\n\
  \n\
  Non-option arguments are interpreted as syntax files to execute.\n"),
-          program_name, supported_formats, default_include_path);
+          program_name, supported_formats, inc_path);
  
    free (supported_formats);
-  free (default_include_path);
+  free (inc_path);
  
    emit_bug_reporting_address ();
    exit (EXIT_SUCCESS);
@@ -257,8 +243,16 @@ terminal_option_callback (int id, void *to_)
        to->has_output_driver = true;
        break;
  
+    case OPT_BATCH:
+      *to->syntax_mode = LEX_SYNTAX_BATCH;
+      break;
+
      case OPT_INTERACTIVE:
-      *to->syntax_mode = GETL_INTERACTIVE;
+      *to->syntax_mode = LEX_SYNTAX_INTERACTIVE;
+      break;
+
+    case OPT_SYNTAX_ENCODING:
+      *to->syntax_encoding = optarg;
        break;
  
      case OPT_NO_STATRC:
@@ -282,19 +276,23 @@ terminal_option_callback (int id, void *to_)
  
  struct terminal_opts *
  terminal_opts_init (struct argv_parser *ap,
-                    enum syntax_mode *syntax_mode, bool *process_statrc)
+                    enum lex_syntax_mode *syntax_mode, bool *process_statrc,
+                    char **syntax_encoding)
  {
    struct terminal_opts *to;
  
-  *syntax_mode = GETL_BATCH;
+  *syntax_mode = LEX_SYNTAX_AUTO;
    *process_statrc = true;
+  *syntax_encoding = "Auto";
  
    to = xzalloc (sizeof *to);
    to->syntax_mode = syntax_mode;
    string_map_init (&to->options);
    to->has_output_driver = false;
    to->has_error_file = false;
+  to->syntax_mode = syntax_mode;
    to->process_statrc = process_statrc;
+  to->syntax_encoding = syntax_encoding;
  
    argv_parser_add_options (ap, terminal_argv_options, N_TERMINAL_OPTIONS,
                             terminal_option_callback, to);
diff --git a/src/ui/terminal/terminal-opts.h b/src/ui/terminal/terminal-opts.h
index 50f2319690fb53d32abf074f67cdc80788431ead..64581baea304c81558074e3298d9c8744854ca03 100644
--- a/src/ui/terminal/terminal-opts.h
+++ b/src/ui/terminal/terminal-opts.h
@@ -19,14 +19,16 @@
  #define UI_TERMINAL_TERMINAL_OPTS_H 1
  
  #include <stdbool.h>
-#include "libpspp/getl.h"
+#include "language/lexer/lexer.h"
  
  struct argv_parser;
+struct lexer;
  struct terminal_opts;
  
  struct terminal_opts *terminal_opts_init (struct argv_parser *,
-                                          enum syntax_mode *,
-                                          bool *process_statrc);
+                                          enum lex_syntax_mode *,
+                                          bool *process_statrc,
+                                          char **syntax_encoding);
  void terminal_opts_done (struct terminal_opts *, int argc, char *argv[]);
  
  #endif /* ui/terminal/terminal-opts.h */
diff --git a/src/ui/terminal/terminal-reader.c b/src/ui/terminal/terminal-reader.c
new file mode 100644
index 0000000..7c80d27
--- /dev/null
+++ b/src/ui/terminal/terminal-reader.c
@@ -0,0 +1,308 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 1997-9, 2000, 2007, 2009, 2010, 2011 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#include <config.h>
+
+#include "ui/terminal/terminal-reader.h"
+
+#include <assert.h>
+#include <errno.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include "data/file-name.h"
+#include "data/settings.h"
+#include "language/command.h"
+#include "language/lexer/lexer.h"
+#include "libpspp/assertion.h"
+#include "libpspp/cast.h"
+#include "libpspp/message.h"
+#include "libpspp/prompt.h"
+#include "libpspp/str.h"
+#include "libpspp/version.h"
+#include "output/driver.h"
+#include "output/journal.h"
+#include "ui/terminal/terminal.h"
+
+#include "gl/minmax.h"
+#include "gl/xalloc.h"
+
+#include "gettext.h"
+#define _(msgid) gettext (msgid)
+
+struct terminal_reader
+  {
+    struct lex_reader reader;
+    struct substring s;
+    size_t offset;
+    bool eof;
+  };
+
+static int n_terminal_readers;
+
+static void readline_init (void);
+static void readline_done (void);
+static struct substring readline_read (enum prompt_style);
+
+/* Display a welcoming message. */
+static void
+welcome (void)
+{
+  static bool welcomed = false;
+  if (welcomed)
+    return;
+  welcomed = true;
+  fputs ("PSPP is free software and you are welcome to distribute copies of "
+        "it\nunder certain conditions; type \"show copying.\" to see the "
+        "conditions.\nThere is ABSOLUTELY NO WARRANTY for PSPP; type \"show "
+        "warranty.\" for details.\n", stdout);
+  puts (stat_version);
+  journal_enable ();
+}
+
+static struct terminal_reader *
+terminal_reader_cast (struct lex_reader *r)
+{
+  return UP_CAST (r, struct terminal_reader, reader);
+}
+
+static size_t
+terminal_reader_read (struct lex_reader *r_, char *buf, size_t n,
+                      enum prompt_style prompt_style)
+{
+  struct terminal_reader *r = terminal_reader_cast (r_);
+  size_t chunk;
+
+  if (r->offset >= r->s.length && !r->eof)
+    {
+      welcome ();
+      msg_ui_reset_counts ();
+      output_flush ();
+
+      ss_dealloc (&r->s);
+      r->s = readline_read (prompt_style);
+      r->offset = 0;
+      r->eof = ss_is_empty (r->s);
+
+      /* Check whether the size of the window has changed, so that
+         the output drivers can adjust their settings as needed.  We
+         only do this for the first line of a command, as it's
+         possible that the output drivers are actually in use
+         afterward, and we don't want to confuse them in the middle
+         of output. */
+      if (prompt_style == PROMPT_FIRST)
+        terminal_check_size ();
+    }
+
+  chunk = MIN (n, r->s.length - r->offset);
+  memcpy (buf, r->s.string + r->offset, chunk);
+  r->offset += chunk;
+  return chunk;
+}
+
+static void
+terminal_reader_close (struct lex_reader *r_)
+{
+  struct terminal_reader *r = terminal_reader_cast (r_);
+
+  ss_dealloc (&r->s);
+  free (r->reader.file_name);
+  free (r);
+
+  if (!--n_terminal_readers)
+    readline_done ();
+}
+
+static struct lex_reader_class terminal_reader_class =
+  {
+    terminal_reader_read,
+    terminal_reader_close
+  };
+
+/* Creates a source which uses readln to get its line */
+struct lex_reader *
+terminal_reader_create (void)
+{
+  struct terminal_reader *r;
+
+  if (!n_terminal_readers++)
+    readline_init ();
+
+  r = xzalloc (sizeof *r);
+  r->reader.class = &terminal_reader_class;
+  r->reader.syntax = LEX_SYNTAX_INTERACTIVE;
+  r->reader.error = LEX_ERROR_INTERACTIVE;
+  r->reader.file_name = NULL;
+  r->s = ss_empty ();
+  r->offset = 0;
+  r->eof = false;
+  return &r->reader;
+}
+\f
+#if HAVE_READLINE
+#include <readline/readline.h>
+#include <readline/history.h>
+
+static char *history_file;
+
+static char **complete_command_name (const char *, int, int);
+static char **dont_complete (const char *, int, int);
+static char *command_generator (const char *text, int state);
+
+static void
+readline_init (void)
+{
+  rl_basic_word_break_characters = "\n";
+  using_history ();
+  stifle_history (500);
+  if (history_file == NULL)
+    {
+      const char *home_dir = getenv ("HOME");
+      if (home_dir != NULL)
+        {
+          history_file = xasprintf ("%s/.pspp_history", home_dir);
+          read_history (history_file);
+        }
+    }
+}
+
+static void
+readline_done (void)
+{
+  if (history_file != NULL && false == settings_get_testing_mode () )
+    write_history (history_file);
+  clear_history ();
+  free (history_file);
+}
+
+static const char *
+readline_prompt (enum prompt_style style)
+{
+  switch (style)
+    {
+    case PROMPT_FIRST:
+      return "PSPP> ";
+
+    case PROMPT_LATER:
+      return "    > ";
+
+    case PROMPT_DATA:
+      return "data> ";
+
+    case PROMPT_COMMENT:
+      return "comment> ";
+
+    case PROMPT_DOCUMENT:
+      return "document> ";
+
+    case PROMPT_DO_REPEAT:
+      return "DO REPEAT> ";
+    }
+
+  NOT_REACHED ();
+}
+
+static struct substring
+readline_read (enum prompt_style style)
+{
+  char *string;
+
+  rl_attempted_completion_function = (style == PROMPT_FIRST
+                                      ? complete_command_name
+                                      : dont_complete);
+  string = readline (readline_prompt (style));
+  if (string != NULL)
+    {
+      char *end;
+
+      if (string[0])
+        add_history (string);
+
+      end = strchr (string, '\0');
+      *end = '\n';
+      return ss_buffer (string, end - string + 1);
+    }
+  else
+    return ss_empty ();
+}
+
+/* Returns a set of command name completions for TEXT.
+   This is of the proper form for assigning to
+   rl_attempted_completion_function. */
+static char **
+complete_command_name (const char *text, int start, int end UNUSED)
+{
+  if (start == 0)
+    {
+      /* Complete command name at start of line. */
+      return rl_completion_matches (text, command_generator);
+    }
+  else
+    {
+      /* Otherwise don't do any completion. */
+      rl_attempted_completion_over = 1;
+      return NULL;
+    }
+}
+
+/* Do not do any completion for TEXT. */
+static char **
+dont_complete (const char *text UNUSED, int start UNUSED, int end UNUSED)
+{
+  rl_attempted_completion_over = 1;
+  return NULL;
+}
+
+/* If STATE is 0, returns the first command name matching TEXT.
+   Otherwise, returns the next command name matching TEXT.
+   Returns a null pointer when no matches are left. */
+static char *
+command_generator (const char *text, int state)
+{
+  static const struct command *cmd;
+  const char *name;
+
+  if (state == 0)
+    cmd = NULL;
+  name = cmd_complete (text, &cmd);
+  return name ? xstrdup (name) : NULL;
+}
+#else  /* !HAVE_READLINE */
+static void
+readline_init (void)
+{
+}
+
+static void
+readline_done (void)
+{
+}
+
+static struct substring
+readline_read (enum prompt_style style)
+{
+  const char *prompt = prompt_get (style);
+  struct string line;
+
+  fputs (prompt, stdout);
+  fflush (stdout);
+  ds_init_empty (&line);
+  ds_read_line (&line, stdin, SIZE_MAX);
+
+  return line.ss;
+}
+#endif /* !HAVE_READLINE */
diff --git a/src/ui/terminal/terminal-reader.h b/src/ui/terminal/terminal-reader.h
new file mode 100644
index 0000000..2d51c9b
--- /dev/null
+++ b/src/ui/terminal/terminal-reader.h
@@ -0,0 +1,23 @@
+/* PSPP - a program for statistical analysis.
+   Copyright (C) 1997-9, 2000, 2010 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>. */
+
+#ifndef TERMINAL_READER_H
+#define TERMINAL_READER_H
+
+struct lex_reader *terminal_reader_create (void);
+
+#endif /* terminal-reader.h */
+
diff --git a/tests/data/data-in.at b/tests/data/data-in.at
index 05dc10ff2b2501a9cc0e89d4aa1395cac2df25b1..d5595975ec05e6a26c3074cb0ca43b6e70a6cf0b 100644
--- a/tests/data/data-in.at
+++ b/tests/data/data-in.at
@@ -3500,23 +3500,23 @@ PRINT OUTFILE='wkday.out'/ALL.
  EXECUTE.
  ])
  AT_CHECK([pspp -O format=csv wkday.sps], [0], [dnl
-wkday.sps:20.1-2: warning: Data for variable wkday2 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.2: warning: Data for variable wkday2 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-3: warning: Data for variable wkday3 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.3: warning: Data for variable wkday3 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-4: warning: Data for variable wkday4 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.4: warning: Data for variable wkday4 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-5: warning: Data for variable wkday5 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.5: warning: Data for variable wkday5 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-6: warning: Data for variable wkday6 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.6: warning: Data for variable wkday6 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-7: warning: Data for variable wkday7 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.7: warning: Data for variable wkday7 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-8: warning: Data for variable wkday8 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.8: warning: Data for variable wkday8 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-9: warning: Data for variable wkday9 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.9: warning: Data for variable wkday9 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  
-wkday.sps:20.1-10: warning: Data for variable wkday10 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
+wkday.sps:20.1-20.10: warning: Data for variable wkday10 is not valid as format WKDAY: Unrecognized weekday name.  At least the first two letters of an English weekday name must be specified.
  ])
  AT_CHECK([cat wkday.out], [0], [dnl
    .  .  .  .  .  .  .  .  . @&t@
@@ -3595,51 +3595,51 @@ PRINT OUTFILE='month.out'/ALL.
  EXECUTE.
  ])
  AT_CHECK([pspp -O format=csv month.sps], [0], [dnl
-month.sps:15.1-4: warning: Data for variable month4 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:15.1-15.4: warning: Data for variable month4 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:15.1-5: warning: Data for variable month5 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:15.1-15.5: warning: Data for variable month5 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:15.1-6: warning: Data for variable month6 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:15.1-15.6: warning: Data for variable month6 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:15.1-7: warning: Data for variable month7 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:15.1-15.7: warning: Data for variable month7 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:15.1-8: warning: Data for variable month8 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:15.1-15.8: warning: Data for variable month8 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:15.1-9: warning: Data for variable month9 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:15.1-15.9: warning: Data for variable month9 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:15.1-10: warning: Data for variable month10 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:15.1-15.10: warning: Data for variable month10 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-3: warning: Data for variable month3 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.3: warning: Data for variable month3 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-4: warning: Data for variable month4 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.4: warning: Data for variable month4 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-5: warning: Data for variable month5 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.5: warning: Data for variable month5 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-6: warning: Data for variable month6 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.6: warning: Data for variable month6 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-7: warning: Data for variable month7 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.7: warning: Data for variable month7 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-8: warning: Data for variable month8 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.8: warning: Data for variable month8 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-9: warning: Data for variable month9 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.9: warning: Data for variable month9 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:26.1-10: warning: Data for variable month10 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:26.1-26.10: warning: Data for variable month10 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-3: warning: Data for variable month3 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.3: warning: Data for variable month3 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-4: warning: Data for variable month4 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.4: warning: Data for variable month4 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-5: warning: Data for variable month5 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.5: warning: Data for variable month5 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-6: warning: Data for variable month6 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.6: warning: Data for variable month6 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-7: warning: Data for variable month7 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.7: warning: Data for variable month7 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-8: warning: Data for variable month8 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.8: warning: Data for variable month8 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-9: warning: Data for variable month9 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.9: warning: Data for variable month9 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  
-month.sps:39.1-10: warning: Data for variable month10 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
+month.sps:39.1-39.10: warning: Data for variable month10 is not valid as format MONTH: Unrecognized month format.  Months may be specified as Arabic or Roman numerals or as at least 3 letters of their English names.
  ])
  AT_CHECK([cat month.out], [0], [dnl
     .   .   .   .   .   .   .   . @&t@
diff --git a/tests/data/sys-file-reader.at b/tests/data/sys-file-reader.at

index 3a8c6892cb777b10a6ad421d2c092d8b57f1b590..4a06ebdf1735da1a79d0751d3a2ae9dafe31d7e1 100644 (file)
--- a/tests/data/sys-file-reader.at
+++ b/tests/data/sys-file-reader.at
@@ -1268,8 +1268,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xd4: Misplaced type 4 record.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1298,8 +1296,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xd4: Unrecognized record type 8.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1356,8 +1352,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xb4: Invalid variable name `$UM1'.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1386,8 +1380,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xb4: Invalid variable name `TO'.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1416,8 +1408,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xb4: Bad width 256 for variable VAR1.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1447,8 +1437,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xd4: Duplicate variable name `VAR1'.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1477,8 +1465,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xb4: Variable label indicator field is not 0 or 1.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1507,8 +1493,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     ["error: `sys-file.sav' near offset 0xb4: Numeric missing value indicator field is not -3, -2, 0, 1, 2, or 3."
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1537,8 +1521,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     ["error: `sys-file.sav' near offset 0xb4: String missing value indicator field is not 0, 1, 2, or 3."
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1568,8 +1550,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xb4: Missing string continuation record.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1598,8 +1578,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0xc0: Unknown variable format 255.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1675,8 +1653,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav': Weighting variable must be numeric (not string variable `STR1').
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1708,8 +1684,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0x4c: Variable index 3 not in valid range 1...2.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1742,8 +1716,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], 
     [error: `sys-file.sav' near offset 0x4c: Variable index 3 refers to long string continuation.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1779,8 +1751,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0x12c: Duplicate type 6 (document) record.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1817,8 +1787,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0xd4: Number of document lines (0) must be greater than 0 and less than 26843545.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1848,8 +1816,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0xd8: Record type 7 subtype 3 too large.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -1942,8 +1908,6 @@ do
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0xd8: Floating-point representation indicated by system file (2) differs from expected (1).
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -2718,8 +2682,6 @@ warning: `sys-file.sav' near offset 0xd8: NUM1 listed as string of invalid lengt
  "warning: `sys-file.sav' near offset 0xd8: NUM1 listed in very long string record with width 00255, which requires only one segment."
  
  error: `sys-file.sav' near offset 0xd8: Very long string NUM1 overflows dictionary.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -2757,8 +2719,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0x4f8: Very long string with width 256 has segment 1 of width 9 (expected 4).
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -2789,8 +2749,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0xd4: Invalid number of labels 2147483647.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -2823,8 +2781,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0xe8: Variable index record (type 4) does not immediately follow value label record (type 3) as it should.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -2854,8 +2810,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0xec: Number of variables associated with a value label (0) is not between 1 and the number of variables (1).
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -2888,8 +2842,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  error: `sys-file.sav' near offset 0xf4: Value labels may not be added to long string variables (e.g. STR1) using records types 3 and 4.
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -2922,8 +2874,6 @@ GET FILE='sys-file.sav'.
  ])
    AT_CHECK([pspp -O format=csv sys-file.sps], [1], [dnl
  "error: `sys-file.sav' near offset 0xf4: Variables associated with value label are not all of identical type.  Variable STR1 is string, but variable NUM1 is numeric."
-
-sys-file.sps:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -3148,8 +3098,6 @@ num1,num2
  3,4
  5,6
  7,8
-
-sys-file.sps:2: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -3186,8 +3134,6 @@ LIST.
  Table: Data List
  num1,num2
  1,2
-
-sys-file.sps:2: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -3224,8 +3170,6 @@ LIST.
  Table: Data List
  str14
  one data item @&t@
-
-sys-file.sps:2: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
@@ -3276,8 +3220,6 @@ LIST.
  Table: Data List
  num1,num2,str4,str8,str15
  -99,0,,abcdefgh,0123   @&t@
-
-sys-file.sps:2: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  done
  AT_CLEANUP
diff --git a/tests/dissect-sysfile.c b/tests/dissect-sysfile.c

index fda3b385200c605e2a69c16f331b02f2d9b64276..444c15cc88bbec3544c017422aecb11b7b700bb3 100644 (file)
--- a/tests/dissect-sysfile.c
+++ b/tests/dissect-sysfile.c
@@ -36,7 +36,7 @@
  #include "gettext.h"
  #define _(msgid) gettext (msgid)
  
-#define VAR_NAME_LEN 64
+#define ID_MAX_LEN 64
  
  struct sfm_reader
    {
@@ -925,7 +925,7 @@ read_long_string_value_labels (struct sfm_reader *r, size_t size, size_t count)
    while (ftello (r->file) - start < size * count)
      {
        long long posn = ftello (r->file);
-      char var_name[VAR_NAME_LEN + 1];
+      char var_name[ID_MAX_LEN + 1];
        int var_name_len;
        int n_values;
        int width;
@@ -933,10 +933,10 @@ read_long_string_value_labels (struct sfm_reader *r, size_t size, size_t count)
  
        /* Read variable name. */
        var_name_len = read_int (r);
-      if (var_name_len > VAR_NAME_LEN)
+      if (var_name_len > ID_MAX_LEN)
          sys_error (r, _("Variable name length in long string value label "
                          "record (%d) exceeds %d-byte limit."),
-                   var_name_len, VAR_NAME_LEN);
+                   var_name_len, ID_MAX_LEN);
        read_string (r, var_name, var_name_len + 1);
  
        /* Read width, number of values. */
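Note on the hunk above: the length read from the file is checked against ID_MAX_LEN before it is used to fill the fixed-size var_name buffer. A minimal, self-contained sketch of that kind of check follows. It is not the code in dissect-sysfile.c: the function name read_prefixed_name, the in-memory buffer interface, and the little-endian length are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ID_MAX_LEN 64           /* Same limit as in the patch. */

/* Hypothetical helper: reads a name stored as a 32-bit little-endian
   length followed by that many bytes from DATA, which is SIZE bytes
   long.  Copies the name into NAME and null-terminates it.

   Returns the number of bytes consumed, or 0 if the length exceeds
   ID_MAX_LEN or runs past the end of DATA. */
size_t
read_prefixed_name (const uint8_t *data, size_t size,
                    char name[ID_MAX_LEN + 1])
{
  uint32_t len;

  assert (data != NULL && name != NULL);
  if (size < 4)
    return 0;

  len = ((uint32_t) data[0]
         | (uint32_t) data[1] << 8
         | (uint32_t) data[2] << 16
         | (uint32_t) data[3] << 24);

  /* Reject the length before copying anything. */
  if (len > ID_MAX_LEN || len > size - 4)
    return 0;

  memcpy (name, data + 4, len);
  name[len] = '\0';
  return 4 + len;
}
```

Checking the length first means the copy can never write past the ID_MAX_LEN + 1 bytes reserved for the name.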
diff --git a/tests/language/control/do-repeat.at b/tests/language/control/do-repeat.at

index a0b29d93a10a6ee83d4ab1b00dcb7f5778412e76..4421ba6b715fbf2641b602d004ea4b3f23eca68b 100644 (file)
--- a/tests/language/control/do-repeat.at
+++ b/tests/language/control/do-repeat.at
@@ -1,6 +1,95 @@
  AT_BANNER([DO REPEAT])
  
-AT_SETUP([DO REPEAT -- ordinary])
+AT_SETUP([DO REPEAT -- simple])
+AT_DATA([do-repeat.sps], [dnl
+INPUT PROGRAM.
+STRING y(A1).
+DO REPEAT xval = 1 2 3 / yval = 'a' 'b' 'c' / var = a b c.
+COMPUTE x=xval.
+COMPUTE y=yval.
+COMPUTE var=xval.
+END CASE.
+END REPEAT.
+END FILE.
+END INPUT PROGRAM.
+LIST.
+])
+AT_CHECK([pspp -o pspp.csv do-repeat.sps])
+AT_CHECK([cat pspp.csv], [0], [dnl
+Table: Data List
+y,x,a,b,c
+a,1.00,1.00,.  ,.  @&t@
+b,2.00,.  ,2.00,.  @&t@
+c,3.00,.  ,.  ,3.00
+])
+AT_CLEANUP
+
+AT_SETUP([DO REPEAT -- containing BEGIN DATA])
+AT_DATA([do-repeat.sps], [dnl
+DO REPEAT offset = 1 2 3.
+DATA LIST NOTABLE /x 1-2.
+BEGIN DATA.
+10
+20
+30
+END DATA.
+COMPUTE x = x + offset.
+LIST.
+END REPEAT.
+])
+AT_CHECK([pspp -o pspp.csv do-repeat.sps])
+AT_CHECK([cat pspp.csv], [0], [dnl
+Table: Data List
+x
+11
+21
+31
+
+Table: Data List
+x
+12
+22
+32
+
+Table: Data List
+x
+13
+23
+33
+])
+AT_CLEANUP
+
+AT_SETUP([DO REPEAT -- dummy vars not expanded in include files])
+AT_DATA([include.sps], [dnl
+COMPUTE y = y + x + 10.
+])
+AT_DATA([do-repeat.sps], [dnl
+INPUT PROGRAM.
+COMPUTE x = 0.
+COMPUTE y = 0.
+END CASE.
+END FILE.
+END INPUT PROGRAM.
+
+DO REPEAT x = 1 2 3.
+INCLUDE 'include.sps'.
+END REPEAT.
+
+LIST.
+])
+AT_CHECK([pspp -o pspp.csv do-repeat.sps], [0], [dnl
+do-repeat.sps:8: warning: DO REPEAT: Dummy variable name `x' hides dictionary variable `x'.
+])
+AT_CHECK([cat pspp.csv], [0], [dnl
+do-repeat.sps:8: warning: DO REPEAT: Dummy variable name `x' hides dictionary variable `x'.
+
+Table: Data List
+x,y
+.00,30.00
+])
+AT_CLEANUP
+
+AT_SETUP([DO REPEAT -- nested])
  AT_DATA([do-repeat.sps], [dnl
  DATA LIST NOTABLE /a 1.
  BEGIN DATA.
@@ -55,13 +144,7 @@ AT_DATA([do-repeat.sps], [dnl
  DATA LIST NOTABLE /x 1.
  DO REPEAT y = 1 TO 10.
  ])
-AT_CHECK([pspp -o pspp.csv do-repeat.sps], [1], [dnl
-error: DO REPEAT: DO REPEAT without END REPEAT.
-error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
-])
-AT_CHECK([cat pspp.csv], [0], [dnl
-error: DO REPEAT: DO REPEAT without END REPEAT.
-
-error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
+AT_CHECK([pspp -O format=csv do-repeat.sps], [1], [dnl
+error: DO REPEAT: Syntax error at end of input: expecting `END'.
  ])
  AT_CLEANUP
diff --git a/tests/language/data-io/data-list.at b/tests/language/data-io/data-list.at

index dba2a4a12e182a0d14cd0db555ffcfb4389c54d0..b0582c6c47a7fd141395c41cc85a05a7aa22f958 100644 (file)
--- a/tests/language/data-io/data-list.at
+++ b/tests/language/data-io/data-list.at
@@ -49,7 +49,7 @@ B,F8.0
  C,F8.0
  D,F8.0
  
-data-list.pspp:3.9-13: warning: Data for variable D is not valid as format F: Number followed by garbage.
+data-list.pspp:3.9-3.13: warning: Data for variable D is not valid as format F: Number followed by garbage.
  
  Table: Data List
  A,B,C,D
@@ -160,9 +160,9 @@ end data.
  list.
  ])
  AT_CHECK([pspp -O format=csv data-list.pspp], [0], [dnl
-data-list.pspp:8.1-3: warning: Data for variable count is not valid as format F: Field contents are not numeric.
+data-list.pspp:8.1-8.3: warning: Data for variable count is not valid as format F: Field contents are not numeric.
  
-data-list.pspp:11.1-3: warning: Data for variable count is not valid as format F: Field contents are not numeric.
+data-list.pspp:11.1-11.3: warning: Data for variable count is not valid as format F: Field contents are not numeric.
  
  Table: Data List
  start,end,count
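Note on the expected output above: location prefixes change from "file:line.col-col" to "file:line.col-line.col", and elsewhere in this patch a single position appears as "file:line.col" and a bare line as "file:line". The sketch below shows one way such prefixes could be produced. The struct src_range type and format_src_range function are hypothetical names for illustration and are not taken from the PSPP sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical source range.  Lines and columns are 1-based; a column
   of 0 means that no column information is available. */
struct src_range
  {
    const char *file;
    int first_line, first_col;
    int last_line, last_col;
  };

/* Formats R into BUF (SIZE bytes) as "file:line", "file:line.col", or
   "file:line.col-line.col", and returns BUF. */
char *
format_src_range (const struct src_range *r, char *buf, size_t size)
{
  assert (r->file != NULL && strlen (r->file) > 0);
  assert (buf != NULL && size > 0);

  if (r->first_col <= 0)
    snprintf (buf, size, "%s:%d", r->file, r->first_line);
  else if (r->last_line == r->first_line && r->last_col == r->first_col)
    snprintf (buf, size, "%s:%d.%d", r->file, r->first_line, r->first_col);
  else
    snprintf (buf, size, "%s:%d.%d-%d.%d", r->file,
              r->first_line, r->first_col, r->last_line, r->last_col);
  return buf;
}
```

Repeating the line number on both ends of the range keeps the format unambiguous when a token spans more than one line.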
diff --git a/tests/language/data-io/get.at b/tests/language/data-io/get.at

index a519b67fe3592583790a15e5ef6c929be2f2c5ff..b9a187f292034cb872516de81ea61862e1fb5c6a 100644 (file)
--- a/tests/language/data-io/get.at
+++ b/tests/language/data-io/get.at
@@ -58,8 +58,6 @@ dnl We use stdin here, because the bug seems to manifest itself only in
  dnl interactive mode.
  AT_CHECK([echo "GET /FILE='nonexistent.sav'." | pspp -O format=csv], [1], [dnl
  error: An error occurred while opening `nonexistent.sav': No such file or directory.
-
--:1: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  AT_CLEANUP
  
diff --git a/tests/language/data-io/inpt-pgm.at b/tests/language/data-io/inpt-pgm.at

index f0ce9395279bd5a520c7d2bddae7648195d4a50a..f048d3743fd44f881561354a3951929550c7849b 100644 (file)
--- a/tests/language/data-io/inpt-pgm.at
+++ b/tests/language/data-io/inpt-pgm.at
@@ -13,10 +13,6 @@ END INPUT PROGRAM.
  ])
  AT_CHECK([pspp -O format=csv input-program.sps], [1], [dnl
  input-program.sps:3: error: BEGIN DATA: BEGIN DATA is not allowed inside INPUT PROGRAM.
-
-input-program.sps:4: error: Unknown command `123456789'.
-
-input-program.sps:5: error: Unknown command `END DATA'.
  ])
  AT_CLEANUP
  
@@ -32,6 +28,6 @@ END INPUT PROGRAM.
  DESCRIPTIVES x.
  ])
  AT_CHECK([pspp -O format=csv input-program.sps], [1], [dnl
-error: DESCRIPTIVES: Syntax error at end of file: expecting `BEGIN'.
+error: DESCRIPTIVES: Syntax error at end of input: expecting `BEGIN'.
  ])
  AT_CLEANUP
diff --git a/tests/language/data-io/print.at b/tests/language/data-io/print.at

index 9381d7da4547e7706fde4453d37bf366c3bbc46b..71259e0c68dc8566c8b84824c948f4c8d7647a07 100644 (file)
--- a/tests/language/data-io/print.at
+++ b/tests/language/data-io/print.at
@@ -172,7 +172,7 @@ PRINT F8.2
  LIST.
  ])
  AT_CHECK([pspp -O format=csv print.sps], [1], [dnl
-print.sps:7: error: PRINT: Syntax error at `F8.2': expecting a valid subcommand.
+print.sps:7.7-7.10: error: PRINT: Syntax error at `F8.2': expecting a valid subcommand.
  
  Table: Data List
  a,b
diff --git a/tests/language/dictionary/missing-values.at b/tests/language/dictionary/missing-values.at

index df2aeebe3d4392683c05ad3f4f6d5576c0dd9642..75254462aa0ea3bc39cffec96897b3cd9a7ea15a 100644 (file)
--- a/tests/language/dictionary/missing-values.at
+++ b/tests/language/dictionary/missing-values.at
@@ -63,7 +63,7 @@ missing-values.sps:5: error: MISSING VALUES: Missing values provided are too lon
  
  missing-values.sps:8: error: MISSING VALUES: Truncating missing value to maximum acceptable length (8 bytes).
  
-missing-values.sps:11: error: MISSING VALUES: Syntax error at `THRU': expecting string.
+missing-values.sps:11.26-11.29: error: MISSING VALUES: Syntax error at `THRU': expecting string.
  
  missing-values.sps:11: error: MISSING VALUES: THRU is not a variable name.
  
diff --git a/tests/language/expressions/evaluate.at b/tests/language/expressions/evaluate.at

index efa715296d309fb53ea5d3908d3aedc93b9fca0c..e56a3a457e53c46d4a6905d95c8521c355891556 100644 (file)
--- a/tests/language/expressions/evaluate.at
+++ b/tests/language/expressions/evaluate.at
@@ -10,7 +10,12 @@ DEBUG EVALUATE m4_argn(4, check)/[]m4_car(check).
     AT_CAPTURE_FILE([evaluate.sps])
     m4_pushdef([i], [2])
     AT_CHECK([pspp --testing-mode --error-file=- --no-output evaluate.sps], 
-     [m4_if(m4_bregexp([m4_foreach([check], [m4_shift($@)], [m4_argn(3, check)])], [error:]), [-1], [0], [1])],
+     [m4_if(m4_bregexp([m4_foreach([check], [m4_shift($@)], [m4_argn(3, check)])], [error:]), [-1], [0], [1])], 
+     [stdout])
+   # Use sed to transform "file:line.column:" into plain "file:line:",
+   # because column numbers change between opt and noopt versions.
+   AT_CHECK([[sed 's/\(evaluate.sps:[0-9]\{1,\}\)\.[0-9]\{1,\}:/\1:/' stdout]],
+     [0],
       [m4_foreach([check], [m4_shift($@)],
          [m4_define([i], m4_incr(i))dnl
  m4_if(m4_argn(3, check), [], [], [evaluate.sps:[]i[]: m4_argn(3, check)
@@ -284,7 +289,7 @@ dnl <> token can't be split:
     [error: DEBUG EVALUATE: Syntax error at `>'.]],
  dnl # ~= token can't be split:
    [[1 ~ = 1], [error],
-   [error: DEBUG EVALUATE: Syntax error at `NOT': expecting end of command.]])
+   [error: DEBUG EVALUATE: Syntax error at `~': expecting end of command.]])
  
  CHECK_EXPR_EVAL([exp lg10 ln sqrt abs mod mod10 rnd trunc],
    [[exp(10)], [22026.47]],
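Note on the sed command in the hunk above: it rewrites "file:line.column:" as "file:line:" so that the expected output does not depend on column numbers. The same normalization can be sketched in C as below. The function name strip_column is hypothetical, and unlike the sed expression this sketch only looks at the start of the line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Copies LINE to OUT (at most SIZE bytes, including the terminator),
   rewriting a leading "FILE:LINE.COLUMN:" prefix as "FILE:LINE:".
   Lines that do not start with that pattern, including ranges such as
   "FILE:3.1-3.4:", are copied unchanged. */
void
strip_column (const char *line, const char *file, char *out, size_t size)
{
  size_t file_len = strlen (file);

  assert (out != NULL && size > 0);
  if (strncmp (line, file, file_len) == 0 && line[file_len] == ':')
    {
      const char *num = line + file_len + 1;
      const char *num_end = num;

      /* Skip the line number. */
      while (isdigit ((unsigned char) *num_end))
        num_end++;

      if (num_end > num && *num_end == '.')
        {
          const char *col = num_end + 1;
          const char *col_end = col;

          /* Skip the column number. */
          while (isdigit ((unsigned char) *col_end))
            col_end++;

          if (col_end > col && *col_end == ':')
            {
              /* Keep "FILE:LINE", drop ".COLUMN", keep the rest. */
              snprintf (out, size, "%.*s%s",
                        (int) (num_end - line), line, col_end);
              return;
            }
        }
    }
  snprintf (out, size, "%s", line);
}
```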
diff --git a/tests/language/expressions/parse.at b/tests/language/expressions/parse.at

index 7ebc5dc15dbcf263c20560d0405328b2152060a3..df8192b28ed74d0d7978231b54052c34ddfe78b9 100644 (file)
--- a/tests/language/expressions/parse.at
+++ b/tests/language/expressions/parse.at
@@ -18,6 +18,6 @@ END IF.
  AT_CHECK([pspp -O format=csv parse.sps], [1], [dnl
  parse.sps:10: error: IF: Unknown identifier y.
  
-parse.sps:10: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
+parse.sps:11: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  AT_CLEANUP
diff --git a/tests/language/lexer/lexer.at b/tests/language/lexer/lexer.at

index 2c1dfc93b850d01446692dfa48425292ead089d2..08c146447fbed1964843a9544efec6f63b9292e2 100644 (file)
--- a/tests/language/lexer/lexer.at
+++ b/tests/language/lexer/lexer.at
@@ -18,3 +18,46 @@ a
  2.00
  ])
  AT_CLEANUP
+
+AT_SETUP([lexer properly reports scan errors])
+AT_DATA([lexer.sps], [dnl
+x'123'
+x'1x'
+u''
+u'012345678'
+u'd800'
+u'110000'
+'foo
+'very long unterminated string that be ellipsized in its error message
+1e .x
+`
+�
+])
+AT_CHECK([pspp -O format=csv lexer.sps], [1], [dnl
+"lexer.sps:1.1-1.6: error: Syntax error at `x'123'': String of hex digits has 3 characters, which is not a multiple of 2."
+
+lexer.sps:2.1-2.5: error: Syntax error at `x'1x'': `x' is not a valid hex digit.
+
+"lexer.sps:3.1-3.3: error: Syntax error at `u''': Unicode string contains 0 bytes, which is not in the valid range of 1 to 8 bytes."
+
+"lexer.sps:4.1-4.12: error: Syntax error at `u'012345678'': Unicode string contains 9 bytes, which is not in the valid range of 1 to 8 bytes."
+
+lexer.sps:5.1-5.7: error: Syntax error at `u'd800'': U+D800 is not a valid Unicode code point.
+
+lexer.sps:6.1-6.9: error: Syntax error at `u'110000'': U+110000 is not a valid Unicode code point.
+
+lexer.sps:7.1-7.4: error: Syntax error at `'foo': Unterminated string constant.
+
+lexer.sps:8.1-8.70: error: Syntax error at `'very long unterminated string that be ellipsized in its err...': Unterminated string constant.
+
+lexer.sps:9.1-9.2: error: Syntax error at `1e': Missing exponent following `1e'.
+
+lexer.sps:9.4: error: Syntax error at `.': Unexpected `.' in middle of command.
+
+lexer.sps:9: error: Unknown command `x'.
+
+lexer.sps:10.1: error: Syntax error at ``': Bad character ``' in input.
+
+lexer.sps:11.1: error: Syntax error at `�': Bad character U+FFFD in input.
+])
+AT_CLEANUP
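Note on the test above: the expected messages imply that a u'...' literal must hold 1 to 8 hex digits and that values such as U+D800 and U+110000 are rejected as invalid code points. A minimal sketch of checks with that effect follows. The names uc_status and parse_unicode_literal, and the order of the checks, are assumptions for illustration, not the lexer's actual implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum uc_status
  {
    UC_OK,                      /* Valid code point stored in *CP. */
    UC_BAD_LENGTH,              /* Not 1 to 8 characters long. */
    UC_BAD_DIGIT,               /* Contains a non-hex character. */
    UC_BAD_CODE_POINT           /* Surrogate or beyond U+10FFFF. */
  };

/* Validates HEX, the text between the quotes of a u'...' literal.
   On success, stores the code point in *CP and returns UC_OK.
   Otherwise leaves *CP untouched and returns the reason. */
enum uc_status
parse_unicode_literal (const char *hex, unsigned long *cp)
{
  size_t n = strlen (hex);
  unsigned long value;
  size_t i;

  assert (cp != NULL);
  if (n < 1 || n > 8)
    return UC_BAD_LENGTH;
  for (i = 0; i < n; i++)
    if (!isxdigit ((unsigned char) hex[i]))
      return UC_BAD_DIGIT;

  /* At most 8 hex digits, so the value always fits in 32 bits. */
  value = strtoul (hex, NULL, 16);
  if (value > 0x10ffff || (value >= 0xd800 && value <= 0xdfff))
    return UC_BAD_CODE_POINT;

  *cp = value;
  return UC_OK;
}
```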
diff --git a/tests/language/lexer/q2c.at b/tests/language/lexer/q2c.at

index eeeed8d73f8dd3d12b3ae560e72c616640bd269f..6ba3f7ab126065d6932507efc42fa68a5223dca3 100644 (file)
--- a/tests/language/lexer/q2c.at
+++ b/tests/language/lexer/q2c.at
@@ -16,7 +16,7 @@ CROSSTABS.
  AT_CHECK([pspp -O format=csv q2c.sps], [1], [dnl
  q2c.sps:8: error: EXAMINE: VARIABLES subcommand must be given.
  
-q2c.sps:9: error: ONEWAY: Syntax error at end of command: expecting variable name.
+q2c.sps:9.7: error: ONEWAY: Syntax error at end of command: expecting variable name.
  
  q2c.sps:10: error: CROSSTABS: TABLES subcommand must be given.
  ])
diff --git a/tests/language/stats/aggregate.at b/tests/language/stats/aggregate.at

index da078f84d1ff298bf4355bf774c21216416ce46f..1c3a7e17095ccd0613826486fa4958271c7153a4 100644 (file)
--- a/tests/language/stats/aggregate.at
+++ b/tests/language/stats/aggregate.at
@@ -283,9 +283,7 @@ AGGREGATE OUTFILE=* MODE=ADDVARIABLES
  
  AT_CHECK([pspp -O format=csv dup-variables.sps], [1],
  ["dup-variables.sps:24: error: AGGREGATE: Variable name N_BREAK is not unique within the aggregate file dictionary, which contains the aggregate variables and the break variables."
-
-dup-variables.sps:24: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  
  
-AT_CLEANUP
-\ No newline at end of file
+AT_CLEANUP
diff --git a/tests/language/stats/rank.at b/tests/language/stats/rank.at

index 99a0459fcfa388f93da9efb864d30ac269ccfe56..6cb366849b88da1e6c7664593ee3c2043cb0a93e 100644 (file)
--- a/tests/language/stats/rank.at
+++ b/tests/language/stats/rank.at
@@ -539,8 +539,6 @@ Variables Created By RANK
  x into Rx(RANK of x)
  
  rank.sps:14: error: RANK: DEBUG XFORM FAIL transformation executed
-
-rank.sps:14: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  AT_CLEANUP
  
@@ -578,9 +576,9 @@ RANK x
   /RANK INTO foo  bar wiz.
  ])
  AT_CHECK([pspp -O format=csv rank.sps], [1], [dnl
-rank.sps:15: error: RANK: Syntax error at end of command: expecting `@{:@'.
+rank.sps:15.1: error: RANK: Syntax error at end of command: expecting `@{:@'.
  
-rank.sps:19: error: RANK: Syntax error at `d': expecting integer.
+rank.sps:19.11: error: RANK: Syntax error at `d': expecting integer.
  
  rank.sps:25: error: RANK: Variable x already exists.
  
diff --git a/tests/language/utilities/insert.at b/tests/language/utilities/insert.at

index e119f1033bc4a5794dc54c6f965a11d91ab59a86..34376b1517c31902be8560e4824960005c59967e 100644 (file)
--- a/tests/language/utilities/insert.at
+++ b/tests/language/utilities/insert.at
@@ -3,13 +3,13 @@ AT_BANNER([INSERT])
  dnl Create a file "batch.sps" that is valid syntax only in batch mode.
  m4_define([CREATE_BATCH_SPS], 
    [AT_DATA([batch.sps], [dnl
-input program.
-+  loop #i = 1 to 5.
-+    compute z = #i
-+    end case.
-+  end loop
-end file.
-end input program.
+input program
+loop #i = 1 to 5
++  compute z = #i
++  end case
+end loop
+end file
+end input program
  ])])
  
  AT_SETUP([INSERT SYNTAX=INTERACTIVE])
@@ -17,14 +17,13 @@ CREATE_BATCH_SPS
  AT_DATA([insert.sps], [dnl
  INSERT 
    FILE='batch.sps'
-  SYNTAX=INTERACTIVE.
+  SYNTAX=interactive.
  LIST.
  ])
  AT_CHECK([pspp -o pspp.csv insert.sps], [1], [dnl
-batch.sps:2: error: INPUT PROGRAM: Syntax error at `+': expecting command name.
-batch.sps:3: error: INPUT PROGRAM: Syntax error at `+': expecting command name.
-batch.sps:5: error: INPUT PROGRAM: Syntax error at `+': expecting command name.
-batch.sps:7: error: Input program did not create any variables.
+batch.sps:2.1-2.4: error: INPUT PROGRAM: Syntax error at `loop': expecting end of command.
+batch.sps:3: error: COMPUTE: COMPUTE is allowed only after the active file has been defined or inside INPUT PROGRAM.
+batch.sps:4: error: END CASE: END CASE is allowed only inside INPUT PROGRAM.
  insert.sps:4: error: LIST: LIST is allowed only after the active file has been defined.
  ])
  AT_CLEANUP
@@ -111,24 +110,22 @@ END DATA.
  * The following line is erroneous
  
  DISPLAY AKSDJ.
+LIST.
  ])])
  
  AT_SETUP([INSERT ERROR=STOP])
  CREATE_ERROR_SPS
  AT_DATA([insert.sps], [INSERT FILE='error.sps' ERROR=STOP.
-LIST.
  ])
  AT_CHECK([pspp -o pspp.csv insert.sps], [1], [dnl
  error.sps:10: error: DISPLAY: AKSDJ is not a variable name.
  warning: Error encountered while ERROR=STOP is effective.
-error.sps:10: error: Stopping syntax file processing here to avoid a cascade of dependent command failures.
  ])
  AT_CLEANUP
  
  AT_SETUP([INSERT ERROR=CONTINUE])
  CREATE_ERROR_SPS
  AT_DATA([insert.sps], [INSERT FILE='error.sps' ERROR=CONTINUE.
-LIST.
  ])
  AT_CHECK([pspp -o pspp.csv insert.sps], [1], [dnl
  error.sps:10: error: DISPLAY: AKSDJ is not a variable name.
@@ -156,7 +153,7 @@ INSERT
  LIST.
  ])
  AT_CHECK([pspp -O format=csv insert.sps], [1], [dnl
-insert.sps:3: error: INSERT: Can't find `nonexistent' in include file search path.
+insert.sps:2: error: INSERT: Can't find `nonexistent' in include file search path.
  
  insert.sps:6: error: LIST: LIST is allowed only after the active file has been defined.
  ])
NEWS
Smake
doc/dev/concepts.texi
doc/flow-control.texi
doc/invoking.texi
doc/language.texi
doc/utilities.texi
perl-module/PSPP.xs
perl-module/t/Pspp.t
src/data/automake.mk
src/data/dictionary.c
src/data/dictionary.h
src/data/file-handle-def.c
src/data/gnumeric-reader.h
src/data/identifier.c
src/data/identifier.h
src/data/identifier2.c [new file with mode: 0644]
src/data/mrset.c
src/data/mrset.h
src/data/por-file-reader.c
src/data/por-file-writer.c
src/data/procedure.c
src/data/procedure.h
src/data/sys-file-reader.c
src/data/sys-file-writer.c
src/data/variable.c
src/data/variable.h
src/data/vector.c
src/data/vector.h
src/language/automake.mk
src/language/command.c
src/language/command.def
src/language/control/automake.mk
src/language/control/do-if.c
src/language/control/loop.c
src/language/control/repeat.c
src/language/control/repeat.h [deleted file]
src/language/control/temporary.c
src/language/data-io/combine-files.c
src/language/data-io/data-list.c
src/language/data-io/data-parser.c
src/language/data-io/data-reader.c
src/language/data-io/file-handle.q
src/language/data-io/get-data.c
src/language/data-io/inpt-pgm.c
src/language/data-io/save-translate.c
src/language/data-io/trim.c
src/language/dictionary/apply-dictionary.c
src/language/dictionary/attributes.c
src/language/dictionary/missing-values.c
src/language/dictionary/modify-variables.c
src/language/dictionary/mrsets.c
src/language/dictionary/numeric.c
src/language/dictionary/rename-variables.c
src/language/dictionary/split-file.c
src/language/dictionary/sys-file-info.c
src/language/dictionary/value-labels.c
src/language/dictionary/variable-label.c
src/language/dictionary/vector.c
src/language/dictionary/weight.c		patch \| blob \| history
src/language/expressions/parse.c		patch \| blob \| history
src/language/expressions/private.h		patch \| blob \| history
src/language/lexer/automake.mk		patch \| blob \| history
src/language/lexer/include-path.c	[new file with mode: 0644]	patch \| blob
src/language/lexer/include-path.h	[new file with mode: 0644]	patch \| blob
src/language/lexer/lexer.c		patch \| blob \| history
src/language/lexer/lexer.h		patch \| blob \| history
src/language/lexer/q2c.c		patch \| blob \| history
src/language/lexer/value-parser.c		patch \| blob \| history
src/language/lexer/variable-parser.c		patch \| blob \| history
src/language/lexer/variable-parser.h		patch \| blob \| history
src/language/prompt.c	[deleted file]	patch \| blob \| history
src/language/prompt.h	[deleted file]	patch \| blob \| history
src/language/stats/aggregate.c		patch \| blob \| history
src/language/stats/autorecode.c		patch \| blob \| history
src/language/stats/descriptives.c		patch \| blob \| history
src/language/stats/flip.c		patch \| blob \| history
src/language/stats/frequencies.q		patch \| blob \| history
src/language/stats/npar.c		patch \| blob \| history
src/language/stats/rank.q		patch \| blob \| history
src/language/stats/sort-cases.c		patch \| blob \| history
src/language/syntax-file.c	[deleted file]	patch \| blob \| history
src/language/syntax-file.h	[deleted file]	patch \| blob \| history
src/language/syntax-string-source.c	[deleted file]	patch \| blob \| history
src/language/syntax-string-source.h	[deleted file]	patch \| blob \| history
src/language/tests/format-guesser-test.c		patch \| blob \| history
src/language/tests/moments-test.c		patch \| blob \| history
src/language/tests/paper-size.c		patch \| blob \| history
src/language/utilities/cache.c		patch \| blob \| history
src/language/utilities/cd.c		patch \| blob \| history
src/language/utilities/date.c		patch \| blob \| history
src/language/utilities/host.c		patch \| blob \| history
src/language/utilities/include.c		patch \| blob \| history
src/language/utilities/permissions.c		patch \| blob \| history
src/language/utilities/set.q		patch \| blob \| history
src/language/utilities/title.c		patch \| blob \| history
src/language/xforms/compute.c		patch \| blob \| history
src/language/xforms/count.c		patch \| blob \| history
src/language/xforms/fail.c		patch \| blob \| history
src/language/xforms/recode.c		patch \| blob \| history
src/language/xforms/sample.c		patch \| blob \| history
src/language/xforms/select-if.c		patch \| blob \| history
src/libpspp/automake.mk		patch \| blob \| history
src/libpspp/getl.c	[deleted file]	patch \| blob \| history
src/libpspp/getl.h	[deleted file]	patch \| blob \| history
src/libpspp/message.c		patch \| blob \| history
src/libpspp/message.h		patch \| blob \| history
src/libpspp/msg-locator.c	[deleted file]	patch \| blob \| history
src/libpspp/msg-locator.h	[deleted file]	patch \| blob \| history
src/output/driver.c		patch \| blob \| history
src/ui/gui/automake.mk		patch \| blob \| history
src/ui/gui/comments-dialog.c		patch \| blob \| history
src/ui/gui/executor.c		patch \| blob \| history
src/ui/gui/executor.h		patch \| blob \| history
src/ui/gui/main.c		patch \| blob \| history
src/ui/gui/psppire-data-window.c		patch \| blob \| history
src/ui/gui/psppire-dict.c		patch \| blob \| history
src/ui/gui/psppire-syntax-window.c		patch \| blob \| history
src/ui/gui/psppire-syntax-window.h		patch \| blob \| history
src/ui/gui/psppire-var-store.c		patch \| blob \| history
src/ui/gui/psppire.c		patch \| blob \| history
src/ui/gui/psppire.h		patch \| blob \| history
src/ui/gui/syntax-editor-source.c	[deleted file]	patch \| blob \| history
src/ui/gui/syntax-editor-source.h	[deleted file]	patch \| blob \| history
src/ui/source-init-opts.c		patch \| blob \| history
src/ui/source-init-opts.h		patch \| blob \| history
src/ui/terminal/automake.mk		patch \| blob \| history
src/ui/terminal/main.c		patch \| blob \| history
src/ui/terminal/msg-ui.c	[deleted file]	patch \| blob \| history
src/ui/terminal/msg-ui.h	[deleted file]	patch \| blob \| history
src/ui/terminal/read-line.c	[deleted file]	patch \| blob \| history
src/ui/terminal/read-line.h	[deleted file]	patch \| blob \| history
src/ui/terminal/terminal-opts.c		patch \| blob \| history
src/ui/terminal/terminal-opts.h		patch \| blob \| history
src/ui/terminal/terminal-reader.c	[new file with mode: 0644]	patch \| blob
src/ui/terminal/terminal-reader.h	[new file with mode: 0644]	patch \| blob
tests/data/data-in.at		patch \| blob \| history
tests/data/sys-file-reader.at		patch \| blob \| history
tests/dissect-sysfile.c		patch \| blob \| history
tests/language/control/do-repeat.at		patch \| blob \| history
tests/language/data-io/data-list.at		patch \| blob \| history
tests/language/data-io/get.at		patch \| blob \| history
tests/language/data-io/inpt-pgm.at		patch \| blob \| history
tests/language/data-io/print.at		patch \| blob \| history
tests/language/dictionary/missing-values.at		patch \| blob \| history
tests/language/expressions/evaluate.at		patch \| blob \| history
tests/language/expressions/parse.at		patch \| blob \| history
tests/language/lexer/lexer.at		patch \| blob \| history
tests/language/lexer/q2c.at		patch \| blob \| history
tests/language/stats/aggregate.at		patch \| blob \| history
tests/language/stats/rank.at		patch \| blob \| history
tests/language/utilities/insert.at		patch \| blob \| history