X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Flanguage.texi;h=5381668c705ad38675c924d2147858a8b2790c82;hb=9ade26c8349b4434008c46cf09bc7473ec743972;hp=78d38acdc6f0813fe2e1666fa7ddfa8b5bb69c31;hpb=afdf3096926b561f4e6511c10fcf73fc6796b9d2;p=pspp-builds.git

diff --git a/doc/language.texi b/doc/language.texi
index 78d38acd..5381668c 100644
--- a/doc/language.texi
+++ b/doc/language.texi
@@ -111,26 +111,29 @@ character used for quoting in the string, double it, e.g.@:
 significant inside strings.
 
 Strings can be concatenated using @samp{+}, so that @samp{"a" + 'b' +
-'c'} is equivalent to @samp{'abc'}. Concatenation is useful for
-splitting a single string across multiple source lines.
-
-Strings may also be expressed as hexadecimal, octal, or binary
-character values by prefixing the initial quote character by @samp{X},
-@samp{O}, or @samp{B} or their lowercase equivalents. Each pair,
-triplet, or octet of characters, according to the radix, is
-transformed into a single character with the given value. If there is
-an incomplete group of characters, the missing final digits are
-assumed to be @samp{0}. These forms of strings are nonportable
-because numeric values are associated with different characters by
-different operating systems. Therefore, their use should be confined
-to syntax files that will not be widely distributed.
-
-@cindex characters, reserved
-@cindex 0
-@cindex white space
-The character with value 00 is reserved for
-internal use by PSPP. Its use in strings causes an error and
-replacement by a space character.
+'c'} is equivalent to @samp{'abc'}. So that a long string may be
+broken across lines, a line break may precede or follow, or both
+precede and follow, the @samp{+}. (However, an entirely blank line
+preceding or following the @samp{+} is interpreted as ending the
+current command.)
+
+Strings may also be expressed as hexadecimal character values by
+prefixing the initial quote character by @samp{x} or @samp{X}.
+Regardless of the syntax file or active dataset's encoding, the
+hexadecimal digits in the string are interpreted as Unicode characters
+in UTF-8 encoding.
+
+Individual Unicode code points may also be expressed by specifying the
+hexadecimal code point number in single or double quotes preceded by
+@samp{u} or @samp{U}. For example, Unicode code point U+1D11E, the
+musical G clef character, could be expressed as @code{U'1D11E'}.
+Invalid Unicode code points (above U+10FFFF or in between U+D800 and
+U+DFFF) are not allowed.
+
+When strings are concatenated with @samp{+}, each segment's prefix is
+considered individually. For example, @code{'The G clef symbol is:' +
+u"1d11e" + "."} inserts a G clef symbol in the middle of an otherwise
+plain text string.
 
 @item Punctuators and Operators
 @cindex punctuators
@@ -177,33 +180,40 @@ described in the previous section (@pxref{Tokens}). A blank line, or
 one that consists only of white space or comments, also ends a command.
 
 @node Syntax Variants
-@section Variants of syntax.
+@section Syntax Variants
 
 @cindex Batch syntax
 @cindex Interactive syntax
 
-There are two variants of command syntax, @i{viz}: @dfn{batch} mode and
-@dfn{interactive} mode.
-Batch mode is the default when reading commands from a file.
-Interactive mode is the default when commands are typed at a prompt
-by a user.
-Certain commands, such as @cmd{INSERT} (@pxref{INSERT}), may explicitly
-change the syntax mode.
-
-In batch mode, any line that contains a non-space character
-in the leftmost column begins a new command.
-Thus, each command consists of a flush-left line followed by any
-number of lines indented from the left margin.
-In this mode, a plus or minus sign (@samp{+}, @samp{@minus{}}) as the
-first character in a line is ignored and causes that line to begin a
-new command, which allows for visual indentation of a command without
-that command being considered part of the previous command.
-The period terminating the end of a command is optional but recommended.
-
-In interactive mode, each command must be terminated with a period
-or by a blank line.
-The use of @samp{+} and @samp{@minus{}} as continuation characters is not
-permitted.
+There are three variants of command syntax, which vary only in how
+they detect the end of one command and the start of the next.
+
+In @dfn{interactive mode}, which is the default for syntax typed at a
+command prompt, a period as the last non-blank character on a line
+ends a command. A blank line also ends a command.
+
+In @dfn{batch mode}, an end-of-line period or a blank line also ends a
+command. Additionally, it treats any line that has a non-blank
+character in the leftmost column as beginning a new command. Thus, in
+batch mode the second and subsequent lines in a command must be
+indented.
+
+Regardless of the syntax mode, a plus sign, minus sign, or period in
+the leftmost column of a line is ignored and causes that line to begin
+a new command. This is most useful in batch mode, in which the first
+line of a new command could not otherwise be indented, but it is
+accepted regardless of syntax mode.
+
+The default mode for reading commands from a file is @dfn{auto mode}.
+It is the same as batch mode, except that a line with a non-blank in
+the leftmost column only starts a new command if that line begins with
+the name of a PSPP command. This correctly interprets most valid PSPP
+syntax files regardless of the syntax mode for which they are
+intended.
+
+The @option{--interactive} (or @option{-i}) or @option{--batch} (or
+@option{-b}) options set the syntax mode for files listed on the PSPP
+command line. @xref{Main Options}, for more details.
 
 @node Types of Commands
 @section Types of Commands
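
Review note: the string-prefix rules that the first hunk adds (hexadecimal digits read as UTF-8 for an x/X prefix, a single code point for a u/U prefix, and each segment's prefix considered individually under + concatenation) can be illustrated with a small Python sketch. This is not PSPP's lexer; the names decode_segment and concatenate are hypothetical, segments are assumed to arrive already split into (prefix, body) pairs, and doubled quote characters are not handled.

```python
from typing import Iterable, Tuple


def decode_segment(prefix: str, body: str) -> str:
    """Decode one quoted segment (hypothetical helper, not PSPP code).

    prefix is '' for a plain string, 'x'/'X' for hexadecimal digits,
    or 'u'/'U' for a single Unicode code point.
    """
    kind = prefix.lower()
    if kind == "":
        # Plain string: the body is taken as written.
        return body
    if kind == "x":
        # Hex digits are bytes of UTF-8 text, independent of file encoding.
        return bytes.fromhex(body).decode("utf-8")
    if kind == "u":
        code_point = int(body, 16)
        # Reject values above U+10FFFF and the surrogate range U+D800..U+DFFF.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(f"invalid Unicode code point U+{code_point:04X}")
        return chr(code_point)
    raise ValueError(f"unknown string prefix {prefix!r}")


def concatenate(segments: Iterable[Tuple[str, str]]) -> str:
    """Join segments, decoding each one according to its own prefix."""
    return "".join(decode_segment(prefix, body) for prefix, body in segments)


if __name__ == "__main__":
    # Mirrors the example in the patch: 'The G clef symbol is:' + u"1d11e" + "."
    print(concatenate([("", "The G clef symbol is:"), ("u", "1d11e"), ("", ".")]))
```

The x form and the u form can name the same character: U+1D11E is the four UTF-8 bytes F0 9D 84 9E, so both X'F09D849E' and U'1D11E' would decode to the G clef under this sketch.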