From 69bf3f901b0a949cfa957950f55df78d0c86a765 Mon Sep 17 00:00:00 2001 From: Ben Pfaff Date: Mon, 31 May 2021 22:43:03 -0700 Subject: [PATCH] Start writing documentation. --- doc/flow-control.texi | 491 +++++++++++++++++++++++++++++++ doc/utilities.texi | 1 + src/language/lexer/lexer.c | 7 +- src/language/lexer/macro.c | 7 + tests/language/control/define.at | 9 +- 5 files changed, 508 insertions(+), 7 deletions(-) diff --git a/doc/flow-control.texi b/doc/flow-control.texi index c0931f1bce..77b4a3c42e 100644 --- a/doc/flow-control.texi +++ b/doc/flow-control.texi @@ -20,6 +20,7 @@ looping, and flow of control. @menu * BREAK:: Exit a loop. +* DEFINE:: Define a macro. * DO IF:: Conditionally execute a block of code. * DO REPEAT:: Textually repeat a code block. * LOOP:: Repeat a block of code. @@ -39,6 +40,496 @@ BREAK. @cmd{BREAK} is allowed only inside @cmd{LOOP}@dots{}@cmd{END LOOP}. @xref{LOOP}, for more details. +@node DEFINE +@section DEFINE +@vindex DEFINE +@cindex macro + +@display +DEFINE macro_name([argument[/argument]@dots{}]) +@dots{}body@dots{} +!ENDDEFINE. + +Each argument takes the following form: + @{!arg_name =,!POSITIONAL@} [!DEFAULT(default)] [!NOEXPAND] + @{!TOKENS(count),!CHAREND('token'),!ENCLOSE('start','end'),!CMDEND@} + +The following directives may be used within the body: + !OFFEXPAND + !ONEXPAND + +The following functions may be used within the body: + !BLANKS(count) + !CONCAT(arg@dots{}) + !EVAL(arg) + !HEAD(arg) + !INDEX(haystack, needle) + !LENGTH(arg) + !NULL + !QUOTE(arg) + !SUBSTR(arg, start[, count]) + !TAIL(arg) + !UNQUOTE(arg) + !UPCASE(arg) +@end display + +The DEFINE command defines a macro that can later be called any number +of times within a syntax file. Each time it is called, the macro's +body is @dfn{expanded}, that is, substituted, as if the body had been +written instead of the macro call. 
A macro may accept arguments, +whose values are specified at the point of invocation and expanded in +the body where they are referenced. Macro bodies may also use various +directives and functions, which are also expanded when the macro is +called. + +Many identifiers associated with macros begin with @samp{!}, a +character not normally allowed in identifiers. These identifiers are +reserved only for use with macros, which helps keep them from being +confused with other kinds of identifiers. + +@menu +* Macro Basics:: +* Macro Arguments:: +* Controlling Macro Expansion:: +* Macro Functions:: +* Macro Settings:: +* Macro Notes:: +@end menu + +@node Macro Basics +@subsection Macro Basics + +The simplest macros have no arguments. The following defines a macro +named @code{!vars} that expands to the variable names @code{v1 v2 v3}, +along with a few example uses. The macro's name begins with @samp{!}, +which is optional for macro names. The @code{()} following the macro +name are required: + +@example +DEFINE !vars() +v1 v2 v3 +!ENDDEFINE. + +DESCRIPTIVES !vars. +FREQUENCIES /VARIABLES=!vars. +@end example + +Macros can also expand to entire commands. For example, the following +example performs the same analyses as the last one: + +@example +DEFINE !commands() +DESCRIPTIVES v1 v2 v3. +FREQUENCIES /VARIABLES=v1 v2 v3. +!ENDDEFINE. + +!commands +@end example + +The body of a macro can call another macro. For example, we could +combine the two preceding examples, with @code{!commands} calling +@code{!vars} to obtain the variables to analyze. The following shows +one way that could work: + +@example +DEFINE !commands() +DESCRIPTIVES !vars. +FREQUENCIES /VARIABLES=!vars. +!ENDDEFINE. + +DEFINE !vars() v1 v2 v3 !ENDDEFINE. +!commands + +* We can redefine the variables macro to analyze different variables: +DEFINE !vars() v4 v5 !ENDDEFINE. 
+!commands +@end example + +The @code{!commands} macro would be easier to use if it took the +variables to analyze as an argument rather than through another macro. +The following section shows how to do that. + +@node Macro Arguments +@subsection Macro Arguments + +Macros may take any number of arguments, which are specified within +the parentheses in the DEFINE command. Arguments come in two +varieties based on how their values are specified when the macro is +called: + +@itemize @bullet +@item +A @dfn{positional} argument has a required value that follows the +macro's name. Use the @code{!POSITIONAL} keyword to declare a +positional argument. + +References to a positional argument in a macro body are numbered: +@code{!1} is the first positional argument, @code{!2} the second, and +so on. + +The following example uses a positional argument: + +@example +DEFINE !analyze(!POSITIONAL !CMDEND) +DESCRIPTIVES !1. +FREQUENCIES /VARIABLES=!1. +!ENDDEFINE. + +!analyze v1 v2 v3. +!analyze v4 v5. +@end example + +@item +A @dfn{keyword} argument has a name. In the macro call, its value is +specified with the syntax @code{@var{name}=@var{value}}. Because of +the names, keyword argument values may take any order in a macro call. +If one is omitted, then a default value is used: either the value +specified in @code{!DEFAULT(@var{value})}, or an empty value +otherwise. + +In declaration and calls, a keyword argument's name may not begin with +@samp{!}, but references to it in the macro body do start with a +leading @samp{!}. + +The following example uses a keyword argument that defaults to ALL if +the argument is not assigned a value: + +@example +DEFINE !analyze_kw(vars=!DEFAULT(ALL) !CMDEND) +DESCRIPTIVES !vars. +FREQUENCIES /VARIABLES=!vars. +!ENDDEFINE. + +!analyze_kw vars=v1 v2 v3. /* Analyze specified variables. +!analyze_kw. /* Analyze all variables. +@end example +@end itemize + +@example +DEFINE !analyze_kw(vars=!CMDEND) +DESCRIPTIVES !vars. 
+FREQUENCIES /VARIABLES=!vars. +!ENDDEFINE. + +!analyze_kw vars=v1 v2 v3. +@end example + +If a macro has both positional and keyword arguments, then the +positional arguments must come first in the DEFINE command, and their +values also come first in macro calls. + +Each argument declaration specifies the form of its value: + +@table @code +@item !TOKENS(@var{count}) +Exactly @var{count} tokens, e.g.@: @code{!TOKENS(1)} for a single +token. Each identifier, number, quoted string, operator, or +punctuator is a token. @xref{Tokens}, for a complete definition. + +The following variant of @code{!analyze_kw} accepts only a single +variable name (or @code{ALL}) as its argument: + +@example +DEFINE !analyze_one_var(!POSITIONAL !TOKENS(1)) +DESCRIPTIVES !1. +FREQUENCIES /VARIABLES=!1. +!ENDDEFINE. + +!analyze_one_var v1. +@end example + +@item !CHAREND('@var{token}') +Any number of tokens up to @var{token}, which should be an operator or +punctuator token such as @samp{/} or @samp{+}. The @var{token} does +not become part of the value. + +@item !ENCLOSE('@var{start}','@var{end}') +Any number of tokens enclosed between @var{start} and @var{end}, which +should each be operator or punctuator tokens. For example, use +@code{!ENCLOSE('(',')')} for a value enclosed within parentheses. +(Such a value could never have right parentheses inside it, even +paired with left parentheses.) The start and end tokens are not part +of the value. + +With the following variant of @code{!analyze_kw}, the variables must +be specified within parentheses: + +@example +DEFINE !analyze_parens(vars=!ENCLOSE('(',')')) +DESCRIPTIVES !vars. +FREQUENCIES /VARIABLES=!vars. +!ENDDEFINE. + +!analyze_parens vars=(v1 v2 v3). +@end example + +@item !CMDEND +Any number of tokens up to the end of the command. This should be +used only for the last positional parameter, since it consumes all of +the tokens in the command calling the macro. 
+@end table + +By default, when an argument's value contains a macro call, the call +is expanded each time the argument appears in the macro's body. The +@code{!NOEXPAND} keyword in an argument declaration suppresses this +expansion. + +@node Controlling Macro Expansion +@subsection Controlling Macro Expansion + +Multiple factors control whether macro calls are expanded in different +situations. At the highest level, @code{SET MEXPAND} controls whether +macro calls are expanded. By default, it is enabled. @xref{SET +MEXPAND}, for details. + +A macro body may contain macro calls. By default, these are expanded. +If a macro body contains @code{!OFFEXPAND} or @code{!ONEXPAND} +directives, then @code{!OFFEXPAND} disables expansion of macro calls +until the following @code{!ONEXPAND}. + +A macro argument's value may contain a macro call. By default, these +macro calls are expanded. If the argument was declared with the +@code{!NOEXPAND} keyword, they are not expanded. + +The argument to a macro function is a special context that does not +expand macro calls. For example, if @code{!vars} is the name of a +macro, then @code{!LENGTH(!vars)} expands to 5, as does +@code{!LENGTH(!1)} if positional argument 1 has value @code{!vars}. +In these cases, use the @code{!EVAL} macro function to expand macros, +e.g.@: @code{!LENGTH(!EVAL(!vars))} or @code{!LENGTH(!EVAL(!1))}. +@xref{Macro Functions}, for details. + +These rules apply to macro calls. Uses of macro functions and macro +arguments within a macro body are always expanded. + +@node Macro Functions +@subsection Macro Functions + +Macro bodies may manipulate syntax using macro functions. Macro +functions accept tokens as arguments and expand to sequences of +characters. + +The arguments to macro functions have a restricted form. They may +only be a single token (such as an identifier or a string), a macro +argument, or a call to a macro function. 
Thus, @code{x}, @code{5.0}, +@code{!1}, @code{"5 + 6"}, and @code{!CONCAT(x,y)} are valid +macro function arguments, but @code{x y} and @code{5 + 6} are not. + +Macro functions expand to sequences of characters. When these +character strings are processed further as character strings, e.g.@: +with @code{!LENGTH}, any character string is valid. When they are +interpreted as PSPP syntax, e.g.@: when the expansion becomes part of +a command, they need to be valid for that purpose. For example, +@code{!UNQUOTE("It's")} will yield an error if the expansion +@code{It's} becomes part of a PSPP command, because it contains +unbalanced single quotes, but @code{!LENGTH(!UNQUOTE("It's"))} expands +to 4. + +The following macro functions are available. Each function's +documentation includes examples in the form @code{@var{call} +@expansion{} @var{expansion}}. + +@deffn {Macro Function} !BLANKS (count) +Expands to @var{count} unquoted spaces, where @var{count} is a +nonnegative integer. Outside quotes, any positive number of spaces +are equivalent; for a quoted string of spaces, use +@code{!QUOTE(!BLANKS(@var{count}))}. + +In the examples below, @samp{_} stands in for a space to make the +results visible. + +@example +!BLANKS(0) @expansion{} @r{empty} +!BLANKS(1) @expansion{} _ +!BLANKS(2) @expansion{} __ +!QUOTE(!BLANKS(5)) @expansion{} '_____' +@end example +@end deffn + +@deffn {Macro Function} !CONCAT (arg@dots{}) +Expands to the concatenation of all of the arguments. Before +concatenation, each quoted string argument is unquoted, as if +@code{!UNQUOTE} were applied. + +@example +!CONCAT(x, y) @expansion{} xy +!CONCAT('x', 'y') @expansion{} xy +!CONCAT(12, 34) @expansion{} 1234 +!CONCAT(!NULL, 123) @expansion{} 123 +@end example +@end deffn + +@deffn {Macro Function} !EVAL (arg) +Expands macro calls in @var{arg}. 
This is especially useful if +@var{arg} is the name of a macro or a macro argument that expands to +one, because arguments to macro functions are not expanded by default. + +The following examples assume that @code{!vars} is a macro that +expands to @code{a b c}: + +@example +!vars @expansion{} a b c +!QUOTE(!vars) @expansion{} '!vars' +!EVAL(!vars) @expansion{} a b c +!QUOTE(!EVAL(!vars)) @expansion{} 'a b c' +@end example + +These examples additionally assume that argument @code{!1} has value +@code{!vars}: + +@example +!1 @expansion{} a b c +!QUOTE(!1) @expansion{} '!vars' +!EVAL(!1) @expansion{} a b c +!QUOTE(!EVAL(!1)) @expansion{} 'a b c' +@end example +@end deffn + +@deffn {Macro Function} !HEAD (arg) +@deffnx {Macro Function} !TAIL (arg) +@code{!HEAD} expands to just the first token in an unquoted version of +@var{arg}, and @code{!TAIL} to all the tokens after the first. + +@example +!HEAD('a b c') @expansion{} a +!HEAD('a') @expansion{} a +!HEAD(!NULL) @expansion{} @r{empty} +!HEAD('') @expansion{} @r{empty} + +!TAIL('a b c') @expansion{} b c +!TAIL('a') @expansion{} @r{empty} +!TAIL(!NULL) @expansion{} @r{empty} +!TAIL('') @expansion{} @r{empty} +@end example +@end deffn + +@deffn {Macro Function} !INDEX (haystack, needle) +Looks for @var{needle} in @var{haystack}. If it is present, expands +to the 1-based index of its first occurrence; if not, expands to 0. + +@example +!INDEX(banana, an) @expansion{} 2 +!INDEX(banana, nan) @expansion{} 3 +!INDEX(banana, apple) @expansion{} 0 +!INDEX("banana", nan) @expansion{} 4 +!INDEX("banana", "nan") @expansion{} 0 +!INDEX(!UNQUOTE("banana"), !UNQUOTE("nan")) @expansion{} 3 +@end example +@end deffn + +@deffn {Macro Function} !LENGTH (arg) +Expands to a number token representing the number of characters in +@var{arg}. 
+ +@example +!LENGTH(123) @expansion{} 3 +!LENGTH(123.00) @expansion{} 6 +!LENGTH( 123 ) @expansion{} 3 +!LENGTH("123") @expansion{} 5 +!LENGTH(xyzzy) @expansion{} 5 +!LENGTH("xyzzy") @expansion{} 7 +!LENGTH("xy""zzy") @expansion{} 9 +!LENGTH(!UNQUOTE("xyzzy")) @expansion{} 5 +!LENGTH(!UNQUOTE("xy""zzy")) @expansion{} 6 +!LENGTH(!1) @expansion{} 5 @r{if @t{!1} is @t{a b c}} +!LENGTH(!1) @expansion{} 0 @r{if @t{!1} is empty} +!LENGTH(!NULL) @expansion{} 0 +@end example +@end deffn + +@deffn {Macro Function} !NULL +Expands to an empty character sequence. + +@example +!NULL @expansion{} @r{empty} +!QUOTE(!NULL) @expansion{} '' +@end example +@end deffn + +@deffn {Macro Function} !QUOTE (arg) +@deffnx {Macro Function} !UNQUOTE (arg) +The @code{!QUOTE} function expands to its argument surrounded by +apostrophes, doubling any apostrophes inside the argument to make sure +that it is valid PSPP syntax for a string. If the argument was +already a quoted string, @code{!QUOTE} expands to it unchanged. + +Given a quoted string argument, the @code{!UNQUOTE} function expands +to the string's contents, with the quotes removed and any doubled +quote marks reduced to singletons. If the argument was not a quoted +string, @code{!UNQUOTE} expands to the argument unchanged. 
+ +@example +!QUOTE(123.0) @expansion{} '123.0' +!QUOTE( 123 ) @expansion{} '123' +!QUOTE('a b c') @expansion{} 'a b c' +!QUOTE("a b c") @expansion{} "a b c" +!QUOTE(!1) @expansion{} 'a ''b'' c' @r{if @t{!1} is @t{a 'b' c}} + +!UNQUOTE(123.0) @expansion{} 123.0 +!UNQUOTE( 123 ) @expansion{} 123 +!UNQUOTE('a b c') @expansion{} a b c +!UNQUOTE("a b c") @expansion{} a b c +!UNQUOTE(!1) @expansion{} a 'b' c @r{if @t{!1} is @t{a 'b' c}} + +!QUOTE(!UNQUOTE(123.0)) @expansion{} '123.0' +!QUOTE(!UNQUOTE( 123 )) @expansion{} '123' +!QUOTE(!UNQUOTE('a b c')) @expansion{} 'a b c' +!QUOTE(!UNQUOTE("a b c")) @expansion{} 'a b c' +!QUOTE(!UNQUOTE(!1)) @expansion{} 'a ''b'' c' @r{if @t{!1} is @t{a 'b' c}} +@end example +@end deffn + +@deffn {Macro Function} !SUBSTR (arg, start[, count]) +Expands to a substring of @var{arg} starting from 1-based position +@var{start}. If @var{count} is given, it limits the number of +characters in the expansion; if it is omitted, then the expansion +extends to the end of @var{arg}. + +@example +!SUBSTR(banana, 3) @expansion{} nana +!SUBSTR(banana, 3, 3) @expansion{} nan +!SUBSTR("banana", 3) @expansion{} anana" +!SUBSTR("banana", 3, 3) @expansion{} ana + +!SUBSTR(banana, 3, 0) @expansion{} @r{empty} +!SUBSTR(banana, 3, 10) @expansion{} nana +!SUBSTR(banana, 10, 3) @expansion{} @r{empty} +@end example +@end deffn + +@deffn {Macro Function} !UPCASE (arg) +Expands to an unquoted version of @var{arg} with all letters converted +to uppercase. + +@example +!UPCASE(freckle) @expansion{} FRECKLE +!UPCASE('freckle') @expansion{} FRECKLE +!UPCASE('a b c') @expansion{} A B C +!UPCASE('A B C') @expansion{} A B C +@end example +@end deffn + +@node Macro Settings +@subsection Macro Settings + +MPRINT +MEXPAND +MNEST +MITERATE + +PRESERVE...RESTORE + +SET MEXPAND, etc. doesn't work inside macro bodies. + +@node Macro Notes +@subsection Extra Notes + +@code{!*} expands to all the positional arguments. + +Macros in comments. + +Macros in titles. 
+ @node DO IF @section DO IF @vindex DO IF diff --git a/doc/utilities.texi b/doc/utilities.texi index da5de1ddce..62ca1a8f24 100644 --- a/doc/utilities.texi +++ b/doc/utilities.texi @@ -938,6 +938,7 @@ The following subcommands affect the interpretation of macros. @table @asis @item MEXPAND +@anchor{SET MEXPAND} Controls whether macros are expanded. The default is ON. @item MPRINT diff --git a/src/language/lexer/lexer.c b/src/language/lexer/lexer.c index 534ed077e1..817a07baf5 100644 --- a/src/language/lexer/lexer.c +++ b/src/language/lexer/lexer.c @@ -1758,12 +1758,7 @@ lex_source_get (const struct lex_source *src_) static void lex_source_push_endcmd__ (struct lex_source *src) { - struct lex_token *token = lex_push_token__ (src); - token->token.type = T_ENDCMD; - token->token_pos = 0; - token->token_len = 0; - token->line_pos = 0; - token->first_line = 0; + *lex_push_token__ (src) = (struct lex_token) { .token = { .type = T_ENDCMD } }; } static struct lex_source * diff --git a/src/language/lexer/macro.c b/src/language/lexer/macro.c index d67278c51c..584ee7f2fd 100644 --- a/src/language/lexer/macro.c +++ b/src/language/lexer/macro.c @@ -940,6 +940,13 @@ macro_expand (const struct macro_tokens *mts, const struct macro_expander *me, bool *expand, struct macro_tokens *exp) { + /* Macro expansion: + + - Macro names in macro bodies are not expanded by default. !EVAL() + expands them. + + - Macro names in arguments to macro invocations (outside of macro bodies) + are expanded by default, unless !NOEXPAND. 
*/ if (nesting_countdown <= 0) { printf ("maximum nesting level exceeded\n"); diff --git a/tests/language/control/define.at b/tests/language/control/define.at index 5117f0886e..8f1c8dea8d 100644 --- a/tests/language/control/define.at +++ b/tests/language/control/define.at @@ -344,4 +344,11 @@ note: unexpanded token "x" note: unexpanded token "/" note: unexpanded token "arg2" note: unexpanded token "=" -note: unexpanded token "y"]) \ No newline at end of file +note: unexpanded token "y"]) + +PSPP_CHECK_MACRO_EXPANSION([default keyword arguments], + [DEFINE !k(arg1 = !DEFAULT(a b c) !CMDEND) k(!arg1) !ENDDEFINE], + [!k arg1=x. +!k], + [k(x) +k(a b c)]) \ No newline at end of file -- 2.30.2