pspp
Ben Pfaff [Mon, 22 Nov 2021 19:45:48 +0000 (11:45 -0800)]
lexer: Keep entire source file in memory.

Previously, the lexer tried to keep only part of each source file in
memory, the part that wasn't fully parsed yet.  With this commit, the
lexer holds the entire file in memory, even parts that are no longer
needed.  This should make it easier to produce better error messages.

Ben Pfaff [Mon, 6 Dec 2021 05:21:19 +0000 (21:21 -0800)]
pivot-table: New function pivot_value_new_variable__().

Ben Pfaff [Mon, 6 Dec 2021 05:20:53 +0000 (21:20 -0800)]
u8-line: Add definition of an initializer.

Ben Pfaff [Mon, 6 Dec 2021 05:20:42 +0000 (21:20 -0800)]
string-array: New functions for comparing string arrays.

Ben Pfaff [Mon, 6 Dec 2021 05:20:00 +0000 (21:20 -0800)]
variable-parser: New functions for parsing syntax without a dictionary.

This will acquire its first user in an upcoming commit.

Ben Pfaff [Mon, 6 Dec 2021 05:19:12 +0000 (21:19 -0800)]
expressions: Simplify function name parsing.

Ben Pfaff [Mon, 6 Dec 2021 05:18:22 +0000 (21:18 -0800)]
distributions: New module for probability distribution functions.

These functions are currently just used in expressions, but in an upcoming
commit they will also be used in matrices, so this commit makes them more
widely available.

Ben Pfaff [Mon, 6 Dec 2021 05:15:02 +0000 (21:15 -0800)]
data-writer: New function dfm_put_record_utf8().

This will have another user in the upcoming support for the matrix
language.

Ben Pfaff [Mon, 6 Dec 2021 05:14:08 +0000 (21:14 -0800)]
lexer: Add tokens for '{', '}', ':', ';' for use in the matrix language.

Ben Pfaff [Mon, 6 Dec 2021 05:13:00 +0000 (21:13 -0800)]
file-handle-def: New function fh_equal().

This will have its first user in an upcoming commit.

Ben Pfaff [Mon, 6 Dec 2021 04:02:37 +0000 (20:02 -0800)]
lexer: Factor out functions for counting columns.

These will have additional upcoming users.

Ben Pfaff [Mon, 6 Dec 2021 04:01:01 +0000 (20:01 -0800)]
lexer: New lex_at_phrase(), lex_get_n() functions.

These will have their first users in upcoming commits.

Ben Pfaff [Mon, 6 Dec 2021 17:02:55 +0000 (09:02 -0800)]
lexer: Issue error message in forgotten case in lex_force_int_range().

Found by inspection.

Ben Pfaff [Mon, 6 Dec 2021 17:02:30 +0000 (09:02 -0800)]
lexer: Be consistent across 32/64 bit in lex_force_int_range().

The exact error message that this function reported varied between 32-
and 64-bit platforms for invalid integers that require more than 32 bits
but no more than 64 bits, since "long int" has a different range between
those platforms.  This commit fixes the problem.

This issue caused the test "testing lexer crash due to overflow" to fail
on 32-bit platforms.

Thanks to Friedrich Beckmann for reporting the problem.
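
The platform dependence described above comes from the width of "long
int".  The following is only a minimal sketch of the general idea, not
the code from this commit: check_int_range() and enum range_result are
hypothetical names, and the sketch parses from a string with strtoll()
purely for illustration.  Because "long long" is at least 64 bits on
every conforming platform, the classification no longer depends on
whether the build is 32- or 64-bit.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical result codes for this sketch (not PSPP identifiers). */
enum range_result { IN_RANGE, OUT_OF_RANGE, NOT_AN_INTEGER };

/* Hypothetical helper: parses S as a decimal integer and reports whether
   it lies in [MIN, MAX].  strtoll() works in "long long", which is at
   least 64 bits everywhere, so a value such as 4294967296 is classified
   the same way on 32- and 64-bit builds.  strtol() would not guarantee
   that, because its range follows "long int". */
static enum range_result
check_int_range (const char *s, long long min, long long max)
{
  assert (s != NULL && min <= max);

  char *end;
  errno = 0;
  long long value = strtoll (s, &end, 10);
  if (end == s || *end != '\0')
    return NOT_AN_INTEGER;
  if (errno == ERANGE || value < min || value > max)
    return OUT_OF_RANGE;
  return IN_RANGE;
}
```

With this shape, a caller can choose one error message per result code,
so the message text cannot vary with the width of "long int".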

Ben Pfaff [Mon, 6 Dec 2021 05:11:40 +0000 (21:11 -0800)]
doc: Typo fixes, wording and formatting improvements.

Ben Pfaff [Tue, 30 Nov 2021 14:54:48 +0000 (06:54 -0800)]
segment: Add comment about zero-length segments.

Ben Pfaff [Sat, 6 Nov 2021 21:15:29 +0000 (14:15 -0700)]
format: Make fmt_check() easier to translate.

Ben Pfaff [Mon, 6 Dec 2021 05:12:17 +0000 (21:12 -0800)]
any-writer: Add comment.

Ben Pfaff [Mon, 6 Dec 2021 05:16:29 +0000 (21:16 -0800)]
driver: New function output_log_nocopy().

Ben Pfaff [Mon, 6 Dec 2021 05:21:41 +0000 (21:21 -0800)]
MATRIX DATA: Add test for factors and splits together.

Ben Pfaff [Mon, 6 Dec 2021 05:18:42 +0000 (21:18 -0800)]
expressions: Fix definitions of IDF.T1G and IDF.T2G.

Ben Pfaff [Sat, 9 Oct 2021 17:20:28 +0000 (10:20 -0700)]
dataset: Fix memory leak destroying a dataset that has a permanent_dict.

Found by Address Sanitizer.

Ben Pfaff [Sat, 9 Oct 2021 17:13:34 +0000 (10:13 -0700)]
pivot-table-test: Fix memory leak when table is not displayed.

Found by Address Sanitizer.

Ben Pfaff [Sat, 9 Oct 2021 17:10:35 +0000 (10:10 -0700)]
lexer: Fix memory leak merging tokens only some of which come from macros.

Found by Address Sanitizer.

Ben Pfaff [Sat, 9 Oct 2021 16:41:01 +0000 (09:41 -0700)]
macro: Fix memory leaks in error cases parsing function arguments.

Found by Address Sanitizer.

Ben Pfaff [Sat, 9 Oct 2021 16:35:41 +0000 (09:35 -0700)]
lexer: Fix memory leak when macro expands as empty.

Found by Address Sanitizer.

Ben Pfaff [Sat, 9 Oct 2021 16:32:02 +0000 (09:32 -0700)]
macro: Fix memory leak expanding !DO loop over list.

Found by Address Sanitizer.

Ben Pfaff [Sat, 9 Oct 2021 16:23:52 +0000 (09:23 -0700)]
macro: Fix memory leak with keyword "enclose" arguments.

The memory for the argument was being allocated in two places, which
caused the first-allocated block to be leaked.

Found by Address Sanitizer.

Ben Pfaff [Sat, 9 Oct 2021 16:13:40 +0000 (09:13 -0700)]
segment: Fix read past end of buffer when input ends in '-'.

Thanks to John Darrington for reporting this.
Found by Address Sanitizer.

Ben Pfaff [Wed, 6 Oct 2021 05:15:28 +0000 (22:15 -0700)]
expressions: Parse multiple sets of parentheses for grouping together.

Fuzzers are fond of driving expression parsers to failure by exhausting
the stack in trivial ways.  This commit defeats the simplest such
attempts, which line up thousands of left parentheses in a row.

I am a bit curious whether the fuzzer will now invent something more
sophisticated, such as nested function calls or non-empty expressions like
1+(1+(1+(1+(1+....

This fixes bug #61286.
Thanks to Irfan Ariq for reporting the bug.

Ben Pfaff [Wed, 6 Oct 2021 04:57:37 +0000 (21:57 -0700)]
DATA LIST: Fix assertion when RECORDS given twice with decreasing value.

Fixes bug #61285.
Thanks to Irfan Ariq for reporting this bug.

Ben Pfaff [Tue, 5 Oct 2021 16:19:26 +0000 (09:19 -0700)]
dictionary: Allow dict_set_documents() argument to reference old documents.

merge_dictionary() in combine-file.c includes the old documents in the new
ones by just copying pointers.  dict_set_documents() didn't handle this
properly.  This fixes the problem.

Fixes bug #61258.
Thanks to Irfan Ariq for reporting the problem.
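
The hazard here is aliasing: the new value points into storage that the
setter is about to free.  A minimal sketch of one safe ordering follows,
assuming documents are held as an array of malloc'd strings.  struct
doc_holder and doc_holder_set() are hypothetical stand-ins, not PSPP's
struct dictionary or dict_set_documents(), and the real fix may be
structured differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object that owns a list of document
   lines.  This is not PSPP's struct dictionary. */
struct doc_holder
  {
    char **lines;               /* Each element is malloc'd and owned. */
    size_t n;
  };

/* Replaces H's documents by copies of the N strings in NEW_LINES.
   NEW_LINES may point at strings (or the array) that H currently owns,
   so every copy is made before anything old is freed. */
static void
doc_holder_set (struct doc_holder *h, char *const *new_lines, size_t n)
{
  assert (h != NULL && (new_lines != NULL || n == 0));

  char **copy = n ? malloc (n * sizeof *copy) : NULL;
  assert (copy != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      size_t size = strlen (new_lines[i]) + 1;
      copy[i] = malloc (size);
      assert (copy[i] != NULL);
      memcpy (copy[i], new_lines[i], size);
    }

  /* Only now is it safe to release the old strings. */
  for (size_t i = 0; i < h->n; i++)
    free (h->lines[i]);
  free (h->lines);

  h->lines = copy;
  h->n = n;
}
```

Freeing first and copying second would read freed memory whenever the
caller passes pointers taken from the old documents.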

Ben Pfaff [Tue, 5 Oct 2021 15:47:46 +0000 (08:47 -0700)]
encoding-guesser: Avoid reading uninitialized data for zero-length files.

Found while investigating bug #61254.
Thanks to Irfan Ariq for reporting this bug.

Ben Pfaff [Tue, 5 Oct 2021 06:14:32 +0000 (23:14 -0700)]
segment: Fix 1-byte read past initialized data when file ends in CR.

Fixes bug #61253.
Thanks to Irfan Ariq for reporting this bug.

John Darrington [Sun, 3 Oct 2021 06:03:44 +0000 (08:03 +0200)]
Remove some unnecessary gettext macro definitions

John Darrington [Sat, 2 Oct 2021 19:16:55 +0000 (21:16 +0200)]
Fix compiler warning

John Darrington [Sat, 2 Oct 2021 16:33:21 +0000 (18:33 +0200)]
Fix memory leak in tokenize_string_segment

John Darrington [Sat, 2 Oct 2021 14:51:37 +0000 (16:51 +0200)]
Fix memory leak in MCONVERT

John Darrington [Sat, 2 Oct 2021 14:35:42 +0000 (16:35 +0200)]
Fix memory leak upon failure to create matrix reader

John Darrington [Sat, 2 Oct 2021 14:31:14 +0000 (16:31 +0200)]
More ASAN_OPTIONS

John Darrington [Sat, 2 Oct 2021 13:32:40 +0000 (15:32 +0200)]
Use XCALLOC / XZALLOC macros where reasonable

John Darrington [Sat, 2 Oct 2021 04:48:57 +0000 (06:48 +0200)]
Fix possible segfault when running RANK with bad syntax

* src/language/stats/rank.c (): Initialise the vars member

Fixes bug #61257

John Darrington [Sat, 2 Oct 2021 04:47:31 +0000 (06:47 +0200)]
Replace numerous instances of xzalloc with XZALLOC

John Darrington [Sun, 26 Sep 2021 18:26:45 +0000 (20:26 +0200)]
Matrix readers - fix memory leaks

* src/language/data-io/matrix-reader.c (matrix_reader_destroy): free members cvar, svars and fvars

John Darrington [Sun, 26 Sep 2021 18:39:38 +0000 (20:39 +0200)]
Ignore sanitizer ODR warnings, during testing with -fsanitize=address

* tests/libpspp/sparse-xarray-test.at: Set detect_odr_violation=0

Ben Pfaff [Mon, 27 Sep 2021 05:35:33 +0000 (22:35 -0700)]
Implement the MCONVERT command.

Ben Pfaff [Sun, 26 Sep 2021 18:06:45 +0000 (11:06 -0700)]
lexer: Fix use-after-free error in lex_source_get_lookahead().

This code used local variable 'out' as if its value stayed the same from
one iteration of the loop to the next, but in fact its scope meant that
it became indeterminate on each new iteration.  This commit fixes the
problem by moving its declaration to an outer scope.

Thanks to John Darrington for reporting the problem.
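
The scoping rule behind this bug can be shown in isolation.  The sketch
below is not the lexer code: count_changes() is a hypothetical function
whose variable 'prev' plays the role that 'out' plays in the description
above.  A variable that must carry a value from one iteration to the
next has to be declared outside the loop body, because an uninitialized
declaration inside the body becomes indeterminate each time it is
reached.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the positions in VALUES[0..N-1] whose value differs from the
   preceding element.

   PREV must keep its value from one iteration to the next, so it is
   declared outside the loop.  Declaring "int prev;" inside the loop body
   instead would give it an indeterminate value at the start of every
   iteration. */
static size_t
count_changes (const int *values, size_t n)
{
  assert (values != NULL || n == 0);

  size_t n_changes = 0;
  int prev = 0;                 /* Outer scope: survives across iterations. */
  for (size_t i = 0; i < n; i++)
    {
      if (i > 0 && values[i] != prev)
        n_changes++;
      prev = values[i];
    }
  return n_changes;
}
```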

John Darrington [Sun, 26 Sep 2021 14:53:38 +0000 (16:53 +0200)]
Avoid numerical problems with missing weights on non-linear cases.

* src/math/order-stats.c (order_stats_accumulate_idx): Ignore cases
  with missing weight values.

John Darrington [Sat, 25 Sep 2021 16:59:52 +0000 (18:59 +0200)]
Fix possible incorrect assertion when creating unique casereaders.

* src/data/casereader-translator.c (uniquify): Force dir to be an element of {0, 1, -1}
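
One common way to force a comparison result into {0, 1, -1} is sketched
below.  This is an assumption about the kind of change involved, not
the code from uniquify(); direction_from_compare() is a hypothetical
name.  Comparison functions are generally only required to return a
negative, zero, or positive value, so an assertion that expects exactly
-1, 0, or 1 can fail unless the value is normalized first.

```c
#include <assert.h>

/* Hypothetical helper: maps any comparison result CMP (negative, zero,
   or positive) to exactly -1, 0, or 1. */
static int
direction_from_compare (int cmp)
{
  int dir = (cmp > 0) - (cmp < 0);
  assert (dir == -1 || dir == 0 || dir == 1);
  return dir;
}
```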

John Darrington [Sat, 25 Sep 2021 09:41:04 +0000 (11:41 +0200)]
Remove unused module src/math/extrema

* src/math/extrema.c: Delete
* src/math/extrema.h: Delete
* src/math/automake.mk: Remove entries for src/math/extrema.c and
  src/math/extrema.h

John Darrington [Sat, 11 Sep 2021 05:39:29 +0000 (07:39 +0200)]
Options dialog: add checkbox for startup tips

Ben Pfaff [Mon, 6 Sep 2021 17:07:41 +0000 (10:07 -0700)]
sys-file-encoding: Automatically generate the .c file at build time.

Suggested by John Darrington.

John Darrington [Sun, 5 Sep 2021 11:16:33 +0000 (13:16 +0200)]
Windows/build-dependencies: Add -fstack-protector flags

For some reason these flags seem to be necessary with the latest
x86_64-w64-mingw32 toolchain.

John Darrington [Sun, 5 Sep 2021 11:01:25 +0000 (13:01 +0200)]
Windows/build-dependencies: Use correct logical OR operator

John Darrington [Sun, 5 Sep 2021 10:59:49 +0000 (12:59 +0200)]
Windows/build-dependencies: New flag --no-clean

John Darrington [Sat, 4 Sep 2021 07:36:55 +0000 (09:36 +0200)]
Add some missing #include directives

* src/language/stats/crosstabs.c: Add missing #include directive.
* src/libpspp/pool.c: Add missing #include directive.

Ben Pfaff [Fri, 3 Sep 2021 05:15:53 +0000 (22:15 -0700)]
MATRIX DATA: Fully implement.

This command had a partial implementation for correlation matrices that
left out some of the language features.  This commit adds those features.

Ben Pfaff [Fri, 3 Sep 2021 02:59:23 +0000 (19:59 -0700)]
case: Introduce new functions for numbers and substrings in cases.

Use the case_num_*() functions everywhere in the tree for clarity and
brevity.

Ben Pfaff [Fri, 3 Sep 2021 01:45:39 +0000 (18:45 -0700)]
sys-file-encoding: Put the buffer-read-only declaration at the very top.

This accidentally got pushed down when license notices were added to a
bunch of files en masse.  It only works if it's at the top.

Ben Pfaff [Sat, 31 Jul 2021 02:48:08 +0000 (19:48 -0700)]
Remove unneeded Emacs declarations that say that a .c file is in C.

These were there because they were formerly .q files.

Ben Pfaff [Thu, 2 Sep 2021 16:23:11 +0000 (09:23 -0700)]
DEFINE: Properly support redefining a macro.

Redefining a macro didn't work in simple cases because the macro name was
being expanded in the DEFINE command.

Thanks to Frans Houweling for reporting the bug.

Ben Pfaff [Wed, 1 Sep 2021 17:02:26 +0000 (10:02 -0700)]
macro: Fix crash for !QUOTE of a quoted string.

This revealed that the tests didn't include anything for !QUOTE and
!UNQUOTE, even though the manual had lots of examples.  I added a test for
these and for !NULL, which had also been forgotten.

Thanks to Frans Houweling for reporting the bug.

John Darrington [Sun, 29 Aug 2021 17:45:38 +0000 (19:45 +0200)]
Avoid GtkCritical on startup

John Darrington [Sat, 28 Aug 2021 08:50:32 +0000 (10:50 +0200)]
TeX tests: Use the shell instead of wc to test for maximum line length.

When testing for maximum line length, use the shell rather than
relying on wc -L, because the -L flag is not present on some systems.

tests/tex.at: Remove dependence on wc
configure.ac: Remove test for wc -L

Fixes bug #59859

John Darrington [Sat, 28 Aug 2021 07:10:34 +0000 (09:10 +0200)]
Fix import of ods files with repeated column data.

Fixes bug #61078

Reported-by: Elias Tsolis

John Darrington [Sat, 28 Aug 2021 07:02:14 +0000 (09:02 +0200)]
Fix crash when double clicking on variable sheet cells when no variable is defined.

Reported-by: Maruthi Pathapati

Ben Pfaff [Thu, 26 Aug 2021 16:32:31 +0000 (09:32 -0700)]
DEFINE: Only expand macro functions when the name is followed by '('.

Frans Houweling reported that PSPP was flagging an error for !EVAL(!len)
when !len was the name of a defined macro.  This was because !len is short
for the !LENGTH macro function.  This commit fixes the problem.

John Darrington [Sat, 24 Jul 2021 13:32:04 +0000 (15:32 +0200)]
Tolerate pathnames with spaces

Reported by: Vivien Kraus

Ben Pfaff [Sat, 24 Jul 2021 06:17:48 +0000 (23:17 -0700)]
macro: Continue expanding macro even in face of errors in call.

In practice, it was more confusing not to expand it than to expand it.

Ben Pfaff [Sat, 24 Jul 2021 06:08:31 +0000 (23:08 -0700)]
macro: Allow positional arguments to be empty.

It wasn't clear before that this was allowed, but it seems that it should
be for compatibility.

Reported by Frans Houweling.

Ben Pfaff [Sat, 24 Jul 2021 05:50:55 +0000 (22:50 -0700)]
macro: Make ARG_CHAREND and ARG_ENCLOSE more uniform in struct macro_param.

A few pieces of code want to find the end of a parameter and it's easier
if the "end" token is always the same member.

Ben Pfaff [Sat, 24 Jul 2021 05:51:34 +0000 (22:51 -0700)]
identifier: Make T_STOP always 0.

This is a more sensible default if a token is zero-initialized.

Ben Pfaff [Thu, 22 Jul 2021 06:00:10 +0000 (23:00 -0700)]
macro: Allow macro A to use its arguments as part of call to macro B.

I overlooked this possibility before.  This implements it.

Thanks to Frans Houweling for reporting the issue.

Ben Pfaff [Thu, 22 Jul 2021 05:48:07 +0000 (22:48 -0700)]
Turn !* into a single token, for macro expansion purposes.

Ben Pfaff [Tue, 20 Jul 2021 14:53:59 +0000 (07:53 -0700)]
DEFINE: Equals sign is optional for both positional and keyword parameters.

Thanks to Frans Houweling for reporting this bug.

Ben Pfaff [Tue, 20 Jul 2021 05:19:41 +0000 (22:19 -0700)]
macro: Properly parse !ENCLOSE keyword arguments.

The opening delimiter was being included in the argument.

Thanks to Frans Houweling for reporting this bug.

Ben Pfaff [Tue, 20 Jul 2021 05:10:27 +0000 (22:10 -0700)]
DEFINE: Don't use PSPP_CHECK_MACRO_EXPANSION macro in tests.

I found that this macro just obscured things, so I expanded it in each
case and removed the macro itself.

Ben Pfaff [Tue, 20 Jul 2021 03:49:32 +0000 (20:49 -0700)]
DEFINE: Allow !DEFAULT to follow the argument type declaration.

Frans Houweling reported that SPSS allows either order.

Ben Pfaff [Sun, 18 Jul 2021 21:21:24 +0000 (14:21 -0700)]
lexer: Change the pipeline to allow more flexible use of macros.

Frans Houweling reported that using macros in the following way:

    DEFINE !dir() "/directory/to/my/work" !ENDDEFINE.
    GET FILE=!dir + "/filename.sav".

did not work properly with the newly implemented PSPP macro facility.
Indeed, PSPP has until now implemented string concatenation early in the
lexical pipeline, before macro expansion, so that the above could
not work.  This commit reworks it so that string concatenation happens
as the last stage in lexical analysis.  It allows the above syntax
to work as expected.
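
The reordering can be pictured as a final pass over the token stream
that runs only after macros have been expanded.  The sketch below is
not the lexer's implementation: struct tok, enum tok_type, and
merge_string_concatenation() are hypothetical names, and tokens are
assumed to own malloc'd text.  It merges STRING + STRING runs in place,
so a string produced by expanding !dir can be joined with the literal
that follows it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token representation for this sketch only. */
enum tok_type { TOK_STRING, TOK_PLUS, TOK_OTHER };

struct tok
  {
    enum tok_type type;
    char *text;                 /* malloc'd and owned by the token, or NULL. */
  };

/* Merges each run STRING + STRING [+ STRING]... in TOKS[0..N-1] into a
   single STRING token, in place, and returns the new number of tokens.
   Intended to run as the last stage, after macro expansion, so that
   strings produced by macros take part in concatenation. */
static size_t
merge_string_concatenation (struct tok *toks, size_t n)
{
  assert (toks != NULL || n == 0);

  size_t out = 0;
  for (size_t in = 0; in < n; in++)
    {
      if (out >= 2
          && toks[in].type == TOK_STRING
          && toks[out - 1].type == TOK_PLUS
          && toks[out - 2].type == TOK_STRING)
        {
          struct tok *left = &toks[out - 2];
          size_t left_len = strlen (left->text);
          size_t right_len = strlen (toks[in].text);
          char *joined = malloc (left_len + right_len + 1);
          assert (joined != NULL);
          memcpy (joined, left->text, left_len);
          memcpy (joined + left_len, toks[in].text, right_len + 1);

          free (left->text);
          free (toks[in].text);
          free (toks[out - 1].text);
          left->text = joined;
          out--;                /* Drop the "+" token. */
        }
      else
        toks[out++] = toks[in];
    }
  return out;
}
```

Running the same merge before macro expansion would see the macro call
where a string token is needed, so it would find nothing to join.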

Ben Pfaff [Sun, 18 Jul 2021 21:20:06 +0000 (14:20 -0700)]
str: New function ss_swap().

Ben Pfaff [Mon, 5 Jul 2021 22:15:45 +0000 (15:15 -0700)]
segment: Make negative numbers into single segments.

Ben Pfaff [Mon, 5 Jul 2021 20:46:32 +0000 (13:46 -0700)]
configure: Enable GCC warnings to report use of C2x features.

One C2x feature in GCC 11 is the reason that the previous commit
afda22462a88 ("Fix broken build due to missing braces") was needed by
some developers.

John Darrington [Sun, 4 Jul 2021 17:44:07 +0000 (19:44 +0200)]
Fix broken build due to missing braces

Ben Pfaff [Sun, 4 Jul 2021 05:05:35 +0000 (22:05 -0700)]
DEFINE: New command.

Ben Pfaff [Sun, 4 Jul 2021 16:43:09 +0000 (09:43 -0700)]
lexer: Move lex_ellipsize() into string module, as str_ellipsize().

Ben Pfaff [Sun, 4 Jul 2021 05:20:55 +0000 (22:20 -0700)]
token: Update functional interface and add token_copy(), token_equal().

These will have users in an upcoming commit.

Ben Pfaff [Sun, 4 Jul 2021 05:10:26 +0000 (22:10 -0700)]
segment: Ignore !ENDDEFINE in /*comments*/ and "strings".

Ben Pfaff [Sun, 4 Jul 2021 05:00:47 +0000 (22:00 -0700)]
segment: Distinguish snippets from full files.

The comment on segmenter_init() explains what this means:

If IS_SNIPPET is false, then the segmenter will parse as if it's being
given a whole file.  This means, for example, that it will interpret -
or + at the beginning of the syntax as a separator between commands
(since - or + at the beginning of a line has this meaning).

If IS_SNIPPET is true, then the segmenter will parse as if it's being
given an isolated piece of syntax.  This means, for example, that
it will interpret - or + at the beginning of the syntax as an operator
token or (if followed by a digit) as part of a number.

Ben Pfaff [Sun, 4 Jul 2021 02:35:32 +0000 (19:35 -0700)]
lexer: Factor out scan error messages into new function.

Ben Pfaff [Sat, 3 Jul 2021 18:46:17 +0000 (11:46 -0700)]
message: Make msg_emit() take full ownership of its argument.

The way it treated the argument before was just confusing.

Ben Pfaff [Sat, 3 Jul 2021 18:28:05 +0000 (11:28 -0700)]
message: Break message location out into a separate struct.

This will make it cleaner to have a stack of locations for use in reporting
macro expansion errors.

Ben Pfaff [Sat, 3 Jul 2021 01:07:24 +0000 (18:07 -0700)]
message: Get rid of 'shipped' member in struct message.

It seemed to me that it wasn't a very clean design, since it required a
message to be modified as part of emitting it.

Ben Pfaff [Sat, 26 Jun 2021 21:54:37 +0000 (14:54 -0700)]
stringi-set: New functions for not necessarily null terminated strings.

Ben Pfaff [Sun, 4 Jul 2021 21:13:45 +0000 (14:13 -0700)]
stringi-map: Add some support for non-null-terminated strings.

Ben Pfaff [Sat, 26 Jun 2021 21:53:23 +0000 (14:53 -0700)]
TITLE and SUBTITLE: Don't treat an unquoted argument as a quoted string.

This will allow the argument to be processed through the macro processor.

Ben Pfaff [Sun, 27 Jun 2021 18:19:07 +0000 (11:19 -0700)]
lexer: New function lex_next_representation().

Ben Pfaff [Sun, 13 Jun 2021 17:33:09 +0000 (10:33 -0700)]
lexer: Factor some token inspectors out into new token functions.

Ben Pfaff [Sat, 12 Jun 2021 21:39:20 +0000 (14:39 -0700)]
doc: Fix operator precedence chart.

Also, improve a few entries.

Ben Pfaff [Sun, 30 May 2021 22:49:42 +0000 (15:49 -0700)]
identifier: Remove TOKEN_N_TYPES from enum token_type.

Ben Pfaff [Sun, 30 May 2021 20:31:39 +0000 (13:31 -0700)]
segment: Refine treatment of start of macro body.

Previously, if the first line of the macro body (the same line as the
closing parenthesis in the DEFINE) was blank, we reported it as a blank
line to the lexer.  The parser for DEFINE could check for that (by seeing
whether the first line of macro body was empty or all-spaces) but it seems
more elegant to do it in the segmenter.  This implements that.