From: Ben Pfaff Date: Sun, 5 Nov 2006 05:20:52 +0000 (+0000) Subject: Rewrite and improve formatted output routines. X-Git-Tag: v0.6.0~701 X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=8acca2de53c1852f38726f70fc6516b34732a79f;p=pspp-builds.git Rewrite and improve formatted output routines. Add lots of regression tests. Revise documentation. Thanks to John Darrington for review--see patch #5522. --- diff --git a/ChangeLog b/ChangeLog index ddf1d6cf..e50320c4 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,7 @@ +Sat Nov 4 15:59:31 2006 Ben Pfaff + + * configure.ac: Check for the "round" function added in C99. + Tue Oct 31 19:55:52 2006 Ben Pfaff * Smake (GNULIB_MODULES): Add `mempcpy' module. diff --git a/configure.ac b/configure.ac index dddfc564..0295d7fa 100644 --- a/configure.ac +++ b/configure.ac @@ -73,7 +73,7 @@ AC_DEFINE(FPREP_IEEE754, 1, AC_C_BIGENDIAN AC_FUNC_VPRINTF -AC_CHECK_FUNCS([__setfpucw isinf isnan finite getpid feholdexcept]) +AC_CHECK_FUNCS([__setfpucw isinf isnan finite getpid feholdexcept round]) AC_PROG_LN_S diff --git a/doc/data-io.texi b/doc/data-io.texi index d7e18ae8..efff25ea 100644 --- a/doc/data-io.texi +++ b/doc/data-io.texi @@ -186,7 +186,7 @@ In columnar style, the starting column and ending column for the field are specified after the variable name, separated by a dash (@samp{-}). For instance, the third through fifth columns on a line would be specified @samp{3-5}. By default, variables are considered to be in -@samp{F} format (@pxref{Input/Output Formats}). (This default can be +@samp{F} format (@pxref{Input and Output Formats}). (This default can be changed; see @ref{SET} for more information.) In columnar style, to use a variable format other than the default, @@ -218,7 +218,7 @@ Implied decimal places also exist in FORTRAN style. A format specification with @var{d} decimal places also has @var{d} implied decimal places. 
-In addition to the standard format specifiers (@pxref{Input/Output +In addition to the standard format specifiers (@pxref{Input and Output Formats}), FORTRAN style defines some extensions: @table @asis @@ -381,7 +381,7 @@ The FILE and END subcommands are as in @cmd{DATA LIST FIXED} above. The variables to be parsed are given as a single list of variable names. This list must be introduced by a single slash (@samp{/}). The set of variable names may contain format specifications in parentheses -(@pxref{Input/Output Formats}). Format specifications apply to all +(@pxref{Input and Output Formats}). Format specifications apply to all variables back to the previous parenthesized format specification. In addition, an asterisk may be used to indicate that all variables diff --git a/doc/language.texi b/doc/language.texi index 66e5ba36..3b667618 100644 --- a/doc/language.texi +++ b/doc/language.texi @@ -383,7 +383,7 @@ Some details of variables are described in the sections below. * Attributes:: Attributes of variables. * System Variables:: Variables automatically defined by PSPP. * Sets of Variables:: Lists of variable names. -* Input/Output Formats:: Input and output formats. +* Input and Output Formats:: Input and output formats. * Scratch Variables:: Variables deleted by procedures. @end menu @@ -470,12 +470,12 @@ string. @xref{VALUE LABELS}. Display width, format, and (for numeric variables) number of decimal places. This attribute does not affect how data are stored, just how they are displayed. Example: a width of 8, with 2 decimal places. -@xref{PRINT FORMATS}. +@xref{Input and Output Formats}. @cindex write format @item Write format -Similar to print format, but used by certain commands that are -designed to write to binary files. @xref{WRITE FORMATS}. +Similar to print format, but used by the @cmd{WRITE} command +(@pxref{WRITE}). @end table @node System Variables, Sets of Variables, Attributes, Variables @@ -522,7 +522,7 @@ was read, in format F20. 
Page width, in characters, in format F3. @end table -@node Sets of Variables, Input/Output Formats, System Variables, Variables +@node Sets of Variables, Input and Output Formats, System Variables, Variables @subsection Lists of variable names @cindex TO convention @cindex convention, TO @@ -551,326 +551,680 @@ After a set of variables has been defined with @cmd{DATA LIST} or another command with this method, the same set can be referenced on later commands using the same syntax. -@node Input/Output Formats, Scratch Variables, Sets of Variables, Variables +@node Input and Output Formats, Scratch Variables, Sets of Variables, Variables @subsection Input and Output Formats -Data that PSPP inputs and outputs must have one of a number of formats. -These formats are described, in general, by a format specification of -the form @code{NAMEw.d}, where @var{name} is the -format name and @var{w} is a field width. @var{d} is the optional -desired number of decimal places, if appropriate. If @var{d} is not -included then it is assumed to be 0. Some formats do not allow @var{d} -to be specified. +An @dfn{input format} describes how to interpret the contents of an +input field as a number or a string. It might specify that the field +contains an ordinary decimal number, a time or date, a number in binary +or hexadecimal notation, or one of several other notations. Input +formats are used by commands such as @cmd{DATA LIST} that read data or +syntax files into the PSPP active file. + +Every input format corresponds to a default @dfn{output format} that +specifies the formatting used when the value is output later. It is +always possible to explicitly specify an output format that resembles +the input format. Usually, this is the default, but in cases where the +input format is unfriendly to human readability, such as binary or +hexadecimal formats, the default output format is an easier-to-read +decimal format. 
+ +Every variable has two output formats, called its @dfn{print format} and +@dfn{write format}. Print formats are used in most output contexts; +write formats are used only by @cmd{WRITE} (@pxref{WRITE}). Newly +created variables have identical print and write formats, and +@cmd{FORMATS}, the most commonly used command for changing formats +(@pxref{FORMATS}), sets both of them to the same value as well. Thus, +most of the time, the distinction between print and write formats is +unimportant. + +Input and output formats are specified to PSPP with a @dfn{format +specification} of the form @code{TYPEw} or @code{TYPEw.d}, where +@code{TYPE} is one of the format types described later, @code{w} is a +field width measured in columns, and @code{d} is an optional number of +decimal places. If @code{d} is omitted, a value of 0 is assumed. Some +formats do not allow a nonzero @code{d} to be specified. + +The following sections describe the input and output formats supported +by PSPP. -When @cmd{DATA LIST} or another command specifies an input format, -that format is converted to an output format for the purposes of -@cmd{PRINT} and other data output commands. For most purposes, input -and output formats are the same; the salient differences are described -below. +@menu +* Basic Numeric Formats:: +* Custom Currency Formats:: +* Legacy Numeric Formats:: +* Binary and Hexadecimal Numeric Formats:: +* Time and Date Formats:: +* Date Component Formats:: +* String Formats:: +@end menu -Below are listed the input and output formats supported by PSPP. If an -input format is mapped to a different output format by default, then -that mapping is indicated with @result{}. Each format has the listed -bounds on input width (iw) and output width (ow). +@node Basic Numeric Formats +@subsubsection Basic Numeric Formats + +The basic numeric formats are used for input and output of real numbers +in standard or scientific notation. 
The following table shows an +example of how each format displays positive and negative numbers with +the default decimal point setting: + +@float +@multitable {DOLLAR10.2} {@code{@tie{}$3,141.59}} {@code{-$3,141.59}} +@headitem Format @tab @code{@tie{}}3141.59 @tab -3141.59 +@item F8.2 @tab @code{@tie{}3141.59} @tab @code{-3141.59} +@item COMMA9.2 @tab @code{@tie{}3,141.59} @tab @code{-3,141.59} +@item DOT9.2 @tab @code{@tie{}3.141,59} @tab @code{-3.141,59} +@item DOLLAR10.2 @tab @code{@tie{}$3,141.59} @tab @code{-$3,141.59} +@item PCT9.2 @tab @code{@tie{}3141.59%} @tab @code{-3141.59%} +@item E8.1 @tab @code{@tie{}3.1E+003} @tab @code{-3.1E+003} +@end multitable +@end float + +On output, numbers in F format are expressed in standard decimal +notation with the requested number of decimal places. The other formats +output some variation on this style: -The standard numeric input and output formats are given in the following -table: +@itemize @bullet +@item +Numbers in COMMA format are additionally grouped every three digits by +inserting a grouping character. The grouping character is ordinarily a +comma, but it can be changed to a period (@pxref{SET DECIMAL}). -@table @asis -@item Fw.d: 1 <= iw,ow <= 40 -Standard decimal format with @var{d} decimal places. If the number is -too large to fit within the field width, it is expressed in scientific -notation (@code{1.2+34}) if w >= 6, with always at least two digits in -the exponent. When used as an input format, scientific notation is -allowed but an E or an F must be used to introduce the exponent. - -The default output format is the same as the input format, except if -@var{d} > 1. In that case the output @var{w} is always made to be at -least 2 + @var{d}. +@item +DOT format is like COMMA format, but it interchanges the role of the +decimal point and grouping characters. That is, the current grouping +character is used as a decimal point and vice versa. 
-@item Ew.d: 1 <= iw <= 40; 6 <= ow <= 40 -For input this is equivalent to F format except that no E or F is -require to introduce the exponent. For output, produces scientific -notation in the form @code{1.2+34}. There are always at least two -digits given in the exponent. +@item +DOLLAR format is like COMMA format, but it prefixes the number with +@samp{$}. -The default output @var{w} is the largest of the input @var{w}, the -input @var{d} + 7, and 10. The default output @var{d} is the input -@var{d}, but at least 3. +@item +PCT format is like F format, but adds @samp{%} after the number. -@item COMMAw.d: 1 <= iw,ow <= 40 -Equivalent to F format, except that groups of three digits are -comma-separated on output. If the number is too large to express in the -field width, then first commas are eliminated, then if there is still -not enough space the number is expressed in scientific notation given -that w >= 6. Commas are allowed and ignored when this is used as an -input format. +@item +The E format always produces output in scientific notation. +@end itemize -@item DOTw.d: 1 <= iw,ow <= 40 -Equivalent to COMMA format except that the roles of comma and decimal -point are interchanged. However: If SET /DECIMAL=DOT is in effect, then -COMMA uses @samp{,} for a decimal point and DOT uses @samp{.} for a -decimal point. +On input, the basic numeric formats accept positive and negative numbers in +standard decimal notation or scientific notation. Leading and trailing +spaces are allowed. An empty or all-spaces field, or one that contains +only a single period, is treated as the system-missing value. -@item DOLLARw.d: 1 <= iw <= 40; 2 <= ow <= 40 -Equivalent to COMMA format, except that the number is prefixed by a -dollar sign (@samp{$}) if there is room. On input the value is allowed -to be prefixed by a dollar sign, which is ignored. 
+In scientific notation, the exponent may be introduced by a sign +(@samp{+} or @samp{-}), or by one of the letters @samp{e} or @samp{d} +(in uppercase or lowercase), or by a letter followed by a sign. A +single space may follow the letter or the sign or both. -The default output @var{w} is the input @var{w}, but at least 2. +On fixed-format @cmd{DATA LIST} (@pxref{DATA LIST FIXED}) and in a few +other contexts, decimals are implied when the field does not contain a +decimal point. In F6.5 format, for example, the field @code{314159} is +taken as the value 3.14159 with implied decimals. Decimals are never +implied if an explicit decimal point is present or if scientific +notation is used. -@item PCTw.d: 2 <= iw,ow <= 40 -Equivalent to F format, except that the number is suffixed by a percent -sign (@samp{%}) if there is room. On input the value is allowed to be -suffixed by a percent sign, which is ignored. +E and F formats accept the basic syntax already described. The other +formats allow some additional variations: -The default output @var{w} is the input @var{w}, but at least 2. +@itemize @bullet +@item +COMMA, DOLLAR, and DOT formats ignore grouping characters within the +integer part of the input field. The identity of the grouping +character depends on the format. -@item Nw.d: 1 <= iw,ow <= 40 -Only digits are allowed within the field width. The decimal point is -assumed to be @var{d} digits from the right margin. +@item +DOLLAR format allows a dollar sign to precede the number. In a negative +number, the dollar sign may precede or follow the minus sign. -The default output format is F with the same @var{w} and @var{d}, except -if @var{d} > 1. In that case the output @var{w} is always made to be at -least 2 + @var{d}. +@item +PCT format allows a percent sign to follow the number. +@end itemize -@item Zw.d @result{} F: 1 <= iw,ow <= 40 -Zoned decimal input. If you need to use this then you know how. 
+All of the basic number formats have a maximum field width of 40 and +accept no more than 16 decimal places, on both input and output. Some +additional restrictions apply: -@item IBw.d @result{} F: 1 <= iw,ow <= 8 -Integer binary format. The field is interpreted as a fixed-point -positive or negative binary number in two's-complement notation. The -location of the decimal point is implied. Endianness is the same as the -host machine. +@itemize @bullet +@item +As input formats, the basic numeric formats allow no more decimal places +than the field width. As output formats, the field width must be +greater than the number of decimal places; that is, large enough to +allow for a decimal point and the number of requested decimal places. +DOLLAR and PCT formats must allow an additional column for @samp{$} or +@samp{%}. -The default output format is F8.2 if @var{d} is 0. Otherwise it is F, -with output @var{w} as 9 + input @var{d} and output @var{d} as input -@var{d}. +@item +The default output format for a given input format increases the field +width enough to make room for optional input characters. If an input +format calls for decimal places, the width is increased by 1 to make +room for an implied decimal point. COMMA, DOT, and DOLLAR formats also +increase the output width to make room for grouping characters. DOLLAR +and PCT further increase the output field width by 1 to make room for +@samp{$} or @samp{%}. The increased output width is capped at 40, the +maximum field width. -@item PIB @result{} F: 1 <= iw,ow <= 8 -Positive integer binary format. The field is interpreted as a -fixed-point positive binary number. The location of the decimal point -is implied. Endianness is the same as the host machine. +@item +The E format is exceptional. For output, E format has a minimum width +of 7 plus the number of decimal places. The default output format for +an E input format is an E format with at least 3 decimal places and +thus a minimum width of 10. 
+@end itemize -The default output format follows the rules for IB format. +More details of basic numeric output formatting are given below: -@item Pw.d @result{} F: 1 <= iw,ow <= 16 -Binary coded decimal format. Each byte from left to right, except the -rightmost, represents two digits. The upper nibble of each byte is more -significant. The upper nibble of the final byte is the least -significant digit. The lower nibble of the final byte is the sign; a -value of D represents a negative sign and all other values are -considered positive. The decimal point is implied. +@itemize @bullet +@item +Output rounds to nearest, with ties rounded away from zero. Thus, 2.5 +is output as @code{3} in F1.0 format, and -1.125 as @code{-1.13} in F5.2 +format. -The default output format follows the rules for IB format. +@item +The system-missing value is output as a period in a field of spaces, +placed in the decimal point's position, or in the rightmost column if no +decimal places are requested. A period is used even if the decimal +point character is a comma. -@item PKw.d @result{} F: 1 <= iw,ow <= 16 -Positive binary code decimal format. Same as P but the last byte is the -same as the others. +@item +A number that does not fill its field is right-justified within the +field. -The default output format follows the rules for IB format. +@item +A number that is too large for its field causes decimal places to be dropped +to make room. If dropping decimals does not make enough room, +scientific notation is used if the field is wide enough. If a number +does not fit in the field, even in scientific notation, the overflow is +indicated by filling the field with asterisks (@samp{*}). -@item RBw @result{} F: 2 <= iw,ow <= 8 +@item +COMMA, DOT, and DOLLAR formats insert grouping characters only if space +is available for all of them. Grouping characters are never inserted +when all decimal places must be dropped. 
Thus, 1234.56 in COMMA5.2 +format is output as @samp{@tie{}1235} without a comma, even though there +is room for one, because all decimal places were dropped. -Binary C architecture-dependent ``double'' format. For a standard -IEEE754 implementation @var{w} should be 8. +@item +DOLLAR or PCT format drops the @samp{$} or @samp{%} only if the number +would not fit at all without it. Scientific notation with @samp{$} or +@samp{%} is preferred to ordinary decimal notation without it. -The default output format follows the rules for IB format. +@item +Except in scientific notation, a decimal point is included only when +it is followed by a digit. If the integer part of the number being +output is 0, and a decimal point is included, then the zero before the +decimal point is dropped. -@item PIBHEXw.d @result{} F: 2 <= iw,ow <= 16 -PIB format encoded as textual hex digit pairs. @var{w} must be even. +In scientific notation, the number always includes a decimal point, +even if it is not followed by a digit. -The input width is mapped to a default output width as follows: -2@result{}4, 4@result{}6, 6@result{}9, 8@result{}11, 10@result{}14, -12@result{}16, 14@result{}18, 16@result{}21. No allowances are made for -decimal places. +@item +A negative number includes a minus sign only in the presence of a +nonzero digit: -0.01 is output as @samp{-.01} in F4.2 format but as +@samp{@tie{}@tie{}.0} in F4.1 format. Thus, a ``negative zero'' never +includes a minus sign. -@item RBHEXw @result{} F: 4 <= iw,ow <= 16 +@item +In negative numbers output in DOLLAR format, the dollar sign follows the +negative sign. Thus, -9.99 in DOLLAR6.2 format is output as +@code{-$9.99}. -RB format encoded as textual hex digits pairs. @var{w} must be even. +@item +In scientific notation, the exponent is output as @samp{E} followed by +@samp{+} or @samp{-} and exactly three digits. 
Numbers with magnitude +less than 10**-999 or larger than 10**999 are not supported by most +computers, but if they are supported then their output is considered +to overflow the field and will be output as asterisks. -The default output format is F8.2. +@item +On most computers, no more than 15 decimal digits are significant in +output, even if more are printed. In any case, output precision cannot +be any higher than input precision; few data sets are accurate to 15 +digits of precision. Unavoidable loss of precision in intermediate +calculations may also reduce precision of output. -@item CCAw.d: 1 <= ow <= 40 -@itemx CCBw.d: 1 <= ow <= 40 -@itemx CCCw.d: 1 <= ow <= 40 -@itemx CCDw.d: 1 <= ow <= 40 -@itemx CCEw.d: 1 <= ow <= 40 +@item +Special values such as infinities and ``not a number'' values are +usually converted to the system-missing value before printing. In a few +circumstances, these values are output directly. In fields of width 3 +or greater, special values are output as however many characters will +fit from @code{+Infinity} or @code{-Infinity} for infinities, from +@code{NaN} for ``not a number,'' or from @code{Unknown} for other values +(if any are supported by the system). In fields under 3 columns wide, +special values are output as asterisks. +@end itemize -User-defined custom currency formats. May not be used as an input -format. @xref{SET}, for more details. -@end table +@node Custom Currency Formats +@subsubsection Custom Currency Formats + +The custom currency formats are closely related to the basic numeric +formats, but they allow users to customize the output format. The +SET command configures custom currency formats, using the syntax +@display +SET CC@var{x}=@t{"}@var{string}@t{"}. +@end display +@noindent +where @var{x} is A, B, C, D, or E, and @var{string} is no more than 16 +characters long. 
+ +@var{string} must contain exactly three commas or exactly three periods +(but not both), except that a single quote character may be used to +``escape'' a following comma, period, or single quote. If three commas +are used, commas will be used for grouping in output, and a period will +be used as the decimal point. Using periods instead reverses these roles. + +The commas or periods divide @var{string} into four fields, called the +@dfn{negative prefix}, @dfn{prefix}, @dfn{suffix}, and @dfn{negative +suffix}, respectively. The prefix and suffix are added to output +whenever space is available. The negative prefix and negative suffix +are always added to a negative number when the output includes a nonzero +digit. + +The following syntax shows how custom currency formats could be used to +reproduce basic numeric formats: -The date and time numeric input and output formats accept a number of -possible formats. Before describing the formats themselves, some -definitions of the elements that make up their formats will be helpful: +@example +@group +SET CCA="-,,,". /* Same as COMMA. +SET CCB="-...". /* Same as DOT. +SET CCC="-,$,,". /* Same as DOLLAR. +SET CCD="-,,%,". /* Like PCT, but groups with commas. +@end group +@end example -@table @dfn -@item leader -All formats accept an optional white space leader. +Here are some more examples of custom currency formats. The final +example shows how to use a single quote to escape a delimiter: -@item day -An integer between 1 and 31 representing the day of month. +@example +@group +SET CCA=",EUR,,-". /* Euro. +SET CCB="(,USD ,,)". /* US dollar. +SET CCC="-.R$..". /* Brazilian real. +SET CCD="-,, NIS,". /* Israel shekel. +SET CCE="-.Rp'. ..". /* Indonesia Rupiah. +@end group +@end example -@item day-count -An integer representing a number of days. +@noindent These formats would yield the following output: -@item date-delimiter -One or more characters of white space or the following characters: -@code{- / . 
,} +@float +@multitable {CCD13.2} {@code{@tie{}@tie{}USD 3,145.59}} {@code{(USD 3,145.59)}} +@headitem Format @tab @code{@tie{}}3145.59 @tab -3145.59 +@item CCA12.2 @tab @code{@tie{}EUR3,145.59} @tab @code{EUR3,145.59-} +@item CCB14.2 @tab @code{@tie{}@tie{}USD 3,145.59} @tab @code{(USD 3,145.59)} +@item CCC11.2 @tab @code{@tie{}R$3.145,59} @tab @code{-R$3.145,59} +@item CCD13.2 @tab @code{@tie{}3,145.59 NIS} @tab @code{-3,145.59 NIS} +@item CCE10.0 @tab @code{@tie{}Rp. 3.146} @tab @code{-Rp. 3.146} +@end multitable +@end float + +The default for all the custom currency formats is @samp{-,,,}, +equivalent to COMMA format. + +@node Legacy Numeric Formats +@subsubsection Legacy Numeric Formats + +The N and Z numeric formats provide compatibility with legacy file +formats. They have much in common: -@item month -A month name in one of the following forms: @itemize @bullet @item -An integer between 1 and 12. +Output is rounded to the nearest representable value, with ties rounded +away from zero. + +@item +Numbers too large to display are output as a field filled with asterisks +(@samp{*}). + +@item +The decimal point is always implicitly the specified number of digits +from the right edge of the field, except that Z format input allows an +explicit decimal point. + +@item +Scientific notation may not be used. + +@item +The system-missing value is output as a period in a field of spaces. +The period is placed just to the right of the implied decimal point in +Z format, or at the right end in N format or in Z format if no decimal +places are requested. A period is used even if the decimal point +character is a comma. + @item -Roman numerals representing an integer between 1 and 12. +Field width may range from 1 to 40. Decimal places may range from 0 up +to the field width, to a maximum of 16. + @item -At least the first three characters of an English month name (January, -February, @dots{}). 
+When a legacy numeric format used for input is converted to an output +format, it is changed into the equivalent F format. The field width is +increased by 1 if any decimal places are specified, to make room for a +decimal point. For Z format, the field width is increased by 1 more +column, to make room for a negative sign. The output field width is +capped at 40 columns. @end itemize -@item year -An integer year number between 1582 and 19999, or between 1 and 199. -Years between 1 and 199 will have 1900 added. +@subsubheading N Format -@item julian -A single number with a year number in the first 2, 3, or 4 digits (as -above) and the day number within the year in the last 3 digits. +The N format supports input and output of fields that contain only +digits. On input, leading or trailing spaces, a decimal point, or any +other non-digit character causes the field to be read as the +system-missing value. As a special exception, an N format used on +@cmd{DATA LIST FREE} or @cmd{DATA LIST LIST} is treated as the +equivalent F format. -@item quarter -An integer between 1 and 4 representing a quarter. +On output, N pads the field on the left with zeros. Negative numbers +are output like the system-missing value. -@item q-delimiter -The letter @samp{Q} or @samp{q}. +@subsubheading Z Format -@item week -An integer between 1 and 53 representing a week within a year. +The Z format is a ``zoned decimal'' format used on IBM mainframes. Z +format encodes the sign as part of the final digit, which must be one of +the following: +@example +0123456789 +@{ABCDEFGHI +@}JKLMNOPQR +@end example +@noindent +where the characters in each row represent digits 0 through 9 in order. +Characters in the first two rows indicate a positive sign; those in the +third indicate a negative sign. + +On output, Z fields are padded on the left with spaces. On input, +leading and trailing spaces are ignored. 
Any character in an input +field other than spaces, the digit characters above, and @samp{.} causes +the field to be read as system-missing. + +The decimal point character for input and output is always @samp{.}, +even if the decimal point character is a comma (@pxref{SET DECIMAL}). + +Nonzero, negative values output in Z format are marked as negative even +when no nonzero digits are output. For example, -0.2 is output in Z1.0 +format as @samp{J}. The ``negative zero'' value supported by most +machines is output as positive. + +@node Binary and Hexadecimal Numeric Formats +@subsubsection Binary and Hexadecimal Numeric Formats + +The binary and hexadecimal formats are primarily designed for +compatibility with existing machine formats, not for human readability. +All of them therefore have an F format as default output format. Some of +these formats are only portable between machines with compatible byte +ordering (endianness) or floating-point format. + +Binary formats use byte values that in text files are interpreted as +special control functions, such as carriage return and line feed. Thus, +data in binary formats should not be included in syntax files or read +from data files with variable-length records, such as ordinary text +files. They may be read from or written to data files with fixed-length +records. @xref{FILE HANDLE}, for information on working with +fixed-length records. + +@subsubheading P and PK Formats + +These are binary-coded decimal formats, in which every byte (except the +last, in P format) represents two decimal digits. The most-significant +4 bits of the first byte is the most-significant decimal digit, the +least-significant 4 bits of the first byte is the next decimal digit, +and so on. + +In P format, the most-significant 4 bits of the last byte are the +least-significant decimal digit. The least-significant 4 bits represent +the sign: decimal 15 indicates a negative value, decimal 13 indicates a +positive value. 
+ +Numbers are rounded downward on output. The system-missing value and +numbers outside representable range are output as zero. + +The maximum field width is 16. Decimal places may range from 0 up to +the number of decimal digits represented by the field. + +The default output format is an F format with twice the input field +width, plus one column for a decimal point (if decimal places were +requested). + +@subsubheading IB and PIB Formats + +These are integer binary formats. IB reads and writes 2's complement +binary integers, and PIB reads and writes unsigned binary integers. The +byte ordering is by default the host machine's, but SET RIB may be used +to select a specific byte ordering for reading (@pxref{SET RIB}) and +SET WIB, similarly, for writing (@pxref{SET WIB}). + +The maximum field width is 8. Decimal places may range from 0 up to the +number of decimal digits in the largest value representable in the field +width. + +The default output format is an F format whose width is the number of +decimal digits in the largest value representable in the field width, +plus 1 if the format has decimal places. + +@subsubheading RB Format + +This is a binary format for real numbers. By default it reads and +writes the host machine's floating-point format, but SET RRB may be +used to select an alternate floating-point format for reading +(@pxref{SET RRB}) and SET WRB, similarly, for writing (@pxref{SET +WRB}). + +The recommended field width depends on the floating-point format. +NATIVE (the default format), IDL, IDB, VD, VG, and ZL formats should use +a field width of 8. ISL, ISB, VF, and ZS formats should use a field +width of 4. Other field widths will not produce useful results. The +maximum field width is 8. No decimal places may be specified. -@item wk-delimiter -The letters @samp{wk} in any case. +The default output format is F8.2. -@item time-delimiter -At least one characters of white space or @samp{:} or @samp{.}. 
+@subsubheading PIBHEX and RBHEX Formats + +These are hexadecimal formats, for reading and writing binary formats +where each byte has been recoded as a pair of hexadecimal digits. + +A hexadecimal field consists solely of hexadecimal digits +@samp{0}@dots{}@samp{9} and @samp{A}@dots{}@samp{F}. Uppercase and +lowercase are accepted on input; output is in uppercase. + +Other than the hexadecimal representation, these formats are equivalent +to PIB and RB formats, respectively. However, bytes in PIBHEX format +are always ordered with the most-significant byte first (big-endian +order), regardless of the host machine's native byte order or PSPP +settings. + +Field widths must be even and between 2 and 16. RBHEX format allows no +decimal places; PIBHEX allows as many decimal places as a PIB format +with half the given width. + +@node Time and Date Formats +@subsubsection Time and Date Formats + +In PSPP, a @dfn{time} is an interval. The time formats translate +between human-friendly descriptions of time intervals and PSPP's +internal representation of time intervals, which is simply the number of +seconds in the interval. PSPP has two time formats: + +@float +@multitable {Time Format} {@code{dd-mmm-yyyy HH:MM:SS.ss}} {@code{01-OCT-1978 04:31:17.01}} +@headitem Time Format @tab Template @tab Example +@item TIME @tab @code{hh:MM:SS.ss} @tab @code{04:31:17.01} +@item DTIME @tab @code{DD HH:MM:SS.ss} @tab @code{00 04:31:17.01} +@end multitable +@end float + +A @dfn{date} is a moment in the past or the future. Internally, PSPP +represents a date as the number of seconds since the @dfn{epoch}, +midnight, Oct. 14, 1582. The date formats translate between +human-readable dates and PSPP's numeric representation of dates and +times. 
PSPP has several date formats: + +@float +@multitable {Date Format} {@code{dd-mmm-yyyy HH:MM:SS.ss}} {@code{01-OCT-1978 04:31:17.01}} +@headitem Date Format @tab Template @tab Example +@item DATE @tab @code{dd-mmm-yyyy} @tab @code{01-OCT-1978} +@item ADATE @tab @code{mm/dd/yyyy} @tab @code{10/01/1978} +@item EDATE @tab @code{dd.mm.yyyy} @tab @code{01.10.1978} +@item JDATE @tab @code{yyyyjjj} @tab @code{1978274} +@item SDATE @tab @code{yyyy/mm/dd} @tab @code{1978/10/01} +@item QYR @tab @code{q Q yyyy} @tab @code{3 Q 1978} +@item MOYR @tab @code{mmm yyyy} @tab @code{OCT 1978} +@item WKYR @tab @code{ww WK yyyy} @tab @code{40 WK 1978} +@item DATETIME @tab @code{dd-mmm-yyyy HH:MM:SS.ss} @tab @code{01-OCT-1978 04:31:17.01} +@end multitable +@end float + +The templates in the preceding tables describe how the time and date +formats are input and output: -@item hour -An integer greater than 0 representing an hour. +@table @code +@item dd +Day of month, from 1 to 31. Always output as two digits. + +@item mm +@itemx mmm +Month. In output, @code{mm} is output as two digits, @code{mmm} as the +first three letters of an English month name (January, February, +@dots{}). In input, both of these formats, plus Roman numerals, are +accepted. + +@item yyyy +Year. In output, DATETIME always produces a 4-digit year; other +formats can produce a 2- or 4-digit year. The century assumed for +2-digit years depends on the EPOCH setting (@pxref{SET EPOCH}). In +output, a year outside the epoch causes the whole field to be filled +with asterisks (@samp{*}). + +@item jjj +Day of year (Julian day), from 1 to 366. This is exactly three digits +giving the count of days from the start of the year. January 1 is +considered day 1. + +@item q +Quarter of year, from 1 to 4. Quarters start on January 1, April 1, +July 1, and October 1. + +@item ww +Week of year, from 1 to 53. Output as exactly two digits. January 1 is +the first day of week 1. 
+ +@item DD +Count of days, which may be positive or negative. Output as at least +two digits. + +@item hh +Count of hours, which may be positive or negative. Output as at least +two digits. + +@item HH +Hour of day, from 0 to 23. Output as exactly two digits. + +@item MM +Minute of hour, from 0 to 59. Output as exactly two digits. + +@item SS.ss +Seconds within minute, from 0 to 59. The integer part is output as +exactly two digits. On output, seconds and fractional seconds may or +may not be included, depending on field width and decimal places. On +input, seconds and fractional seconds are optional. The DECIMAL setting +controls the character accepted and displayed as the decimal point +(@pxref{SET DECIMAL}). +@end table -@item minute -An integer between 0 and 59 representing a minute within an hour. +For output, the date and time formats use the delimiters indicated in +the table. For input, date components may be separated by spaces or by +one of the characters @samp{-}, @samp{/}, @samp{.}, or @samp{,}, and +time components may be separated by spaces, @samp{:}, or @samp{.}. On +input, the @samp{Q} separating quarter from year and the @samp{WK} +separating week from year may be uppercase or lowercase, and the spaces +around them are optional. + +On input, all time and date formats accept any amount of leading and +trailing white space. + +The maximum width for time and date formats is 40 columns. Minimum +input and output width for each of the time and date formats is shown +below: +@float +@multitable {DATETIME} {Min. Input Width} {Min. Output Width} {4-digit year} +@headitem Format @tab Min. Input Width @tab Min. 
Output Width @tab Option +@item DATE @tab 8 @tab 9 @tab 4-digit year +@item ADATE @tab 8 @tab 8 @tab 4-digit year +@item EDATE @tab 8 @tab 8 @tab 4-digit year +@item JDATE @tab 5 @tab 5 @tab 4-digit year +@item SDATE @tab 8 @tab 8 @tab 4-digit year +@item QYR @tab 4 @tab 6 @tab 4-digit year +@item MOYR @tab 6 @tab 6 @tab 4-digit year +@item WKYR @tab 6 @tab 8 @tab 4-digit year +@item DATETIME @tab 17 @tab 17 @tab seconds +@item TIME @tab 5 @tab 5 @tab seconds +@item DTIME @tab 8 @tab 8 @tab seconds +@end multitable +@end float +@noindent +In the table, ``Option'' describes what increased output width enables: -@item opt-second -Optionally, a time-delimiter followed by a real number representing a -number of seconds. +@table @asis +@item 4-digit year +A field 2 columns wider than minimum will include a 4-digit year. +(DATETIME format always includes a 4-digit year.) + +@item seconds +A field 3 columns wider than minimum will include seconds as well as +minutes. A field 5 columns wider than minimum, or more, can also +include a decimal point and fractional seconds (but no more than allowed +by the format's decimal places). +@end table -@item hour24 -An integer between 0 and 23 representing an hour within a day. +For the time and date formats, the default output format is the same as +the input format, except that PSPP increases the field width, if +necessary, to the minimum allowed for output. -@item weekday -At least the first two characters of an English day word. +Times or dates narrower than the field width are right-justified within +the field. -@item spaces -Any amount or no amount of white space. +When a time or date exceeds the field width, characters are trimmed from +the end until it fits. This can occur in an unusual situation, e.g.@: +with a year greater than 9999 (which adds an extra digit), or for a +negative value on TIME or DTIME (which adds a leading minus sign). -@item sign -An optional positive or negative sign. +@c What about out-of-range values? 
-@item trailer -All formats accept an optional white space trailer. -@end table +The system-missing value is output as a period at the right end of the +field. -The date input formats are strung together from the above pieces. On -output, the date formats are always printed in a single canonical -manner, based on field width. The date input and output formats are -described below: +@node Date Component Formats +@subsubsection Date Component Formats -@table @asis -@item DATEw: 9 <= iw,ow <= 40 -Date format. Input format: leader + day + date-delimiter + -month + date-delimiter + year + trailer. Output format: DD-MMM-YY for -@var{w} < 11, DD-MMM-YYYY otherwise. - -@item EDATEw: 8 <= iw,ow <= 40 -European date format. Input format same as DATE. Output format: -DD.MM.YY for @var{w} < 10, DD.MM.YYYY otherwise. - -@item SDATEw: 8 <= iw,ow <= 40 -Standard date format. Input format: leader + year + date-delimiter + -month + date-delimiter + day + trailer. Output format: YY/MM/DD for -@var{w} < 10, YYYY/MM/DD otherwise. - -@item ADATEw: 8 <= iw,ow <= 40 -American date format. Input format: leader + month + date-delimiter + -day + date-delimiter + year + trailer. Output format: MM/DD/YY for -@var{w} < 10, MM/DD/YYYY otherwise. - -@item JDATEw: 5 <= iw,ow <= 40 -Julian date format. Input format: leader + julian + trailer. Output -format: YYDDD for @var{w} < 7, YYYYDDD otherwise. - -@item QYRw: 4 <= iw <= 40, 6 <= ow <= 40 -Quarter/year format. Input format: leader + quarter + q-delimiter + -year + trailer. Output format: @samp{Q Q YY}, where the first -@samp{Q} is one of the digits 1, 2, 3, 4, if @var{w} < 8, @code{Q Q -YYYY} otherwise. - -@item MOYRw: 6 <= iw,ow <= 40 -Month/year format. Input format: leader + month + date-delimiter + year -+ trailer. Output format: @samp{MMM YY} for @var{w} < 8, @samp{MMM -YYYY} otherwise. - -@item WKYRw: 6 <= iw <= 40, 8 <= ow <= 40 -Week/year format. Input format: leader + week + wk-delimiter + year + -trailer. 
Output format: @samp{WW WK YY} for @var{w} < 10, @samp{WW WK -YYYY} otherwise. - -@item DATETIMEw.d: 17 <= iw,ow <= 40 -Date and time format. Input format: leader + day + date-delimiter + -month + date-delimiter + year + time-delimiter + hour24 + time-delimiter -+ minute + opt-second. Output format: @samp{DD-MMM-YYYY HH:MM}. If -@var{w} > 19 then seconds @samp{:SS} is added. If @var{w} > 22 and -@var{d} > 0 then fractional seconds @samp{.SS} are added. - -@item TIMEw.d: 5 <= iw,ow <= 40 -Time format. Input format: leader + sign + spaces + hour + -time-delimiter + minute + opt-second. Output format: @samp{HH:MM}. -Seconds and fractional seconds are available with @var{w} of at least 8 -and 10, respectively. - -@item DTIMEw.d: 1 <= iw <= 40, 8 <= ow <= 40 -Time format with day count. Input format: leader + sign + spaces + -day-count + time-delimiter + hour + time-delimiter + minute + -opt-second. Output format: @samp{DD HH:MM}. Seconds and fractional -seconds are available with @var{w} of at least 8 and 10, respectively. - -@item WKDAYw: 2 <= iw,ow <= 40 -A weekday as a number between 1 and 7, where 1 is Sunday. Input format: -leader + weekday + trailer. Output format: as many characters, in all -capital letters, of the English name of the weekday as will fit in the -field width. - -@item MONTHw: 3 <= iw,ow <= 40 -A month as a number between 1 and 12, where 1 is January. Input format: -leader + month + trailer. Output format: as many character, in all -capital letters, of the English name of the month as will fit in the -field width. -@end table +The WKDAY and MONTH formats provide input and output for the names of +weekdays and months, respectively. -There are only two formats that may be used with string variables: +On output, these formats convert a number between 1 and 7, for WKDAY, or +between 1 and 12, for MONTH, into the English name of a day or month, +respectively. If the name is longer than the field, it is trimmed to +fit. 
If the name is shorter than the field, it is padded on the right +with spaces. Values outside the valid range, and the system-missing +value, are output as all spaces. -@table @asis -@item Aw: 1 <= iw <= 255, 1 <= ow <= 254 -The entire field is treated as a string value. +On input, English weekday or month names (in uppercase or lowercase) are +converted back to their corresponding numbers. Weekday and month names +may be abbreviated to their first 2 or 3 letters, respectively. -@item AHEXw @result{} A: 2 <= iw <= 254; 2 <= ow <= 510 -The field is composed of characters in a string encoded as textual hex -digit pairs. +The field width may range from 2 to 40, for WKDAY, or from 3 to 40, for +MONTH. No decimal places are allowed. -The default output @var{w} is half the input @var{w}. -@end table +The default output format is the same as the input format. + +@node String Formats +@subsubsection String Formats + +The A and AHEX formats are the only ones that may be assigned to string +variables. Neither format allows any decimal places. + +In A format, the entire field is treated as a string value. The field +width may range from 1 to 32,767, the maximum string width. The default +output format is the same as the input format. + +In AHEX format, the field is composed of characters in a string encoded +as hex digit pairs. On output, hex digits are output in uppercase; on +input, uppercase and lowercase are both accepted. The default output +format is A format with half the input width. -@node Scratch Variables, , Input/Output Formats, Variables +@node Scratch Variables, , Input and Output Formats, Variables @subsection Scratch Variables Most of the time, variables don't retain their values between cases. diff --git a/doc/transformation.texi b/doc/transformation.texi index 327f6739..d8b08a45 100644 --- a/doc/transformation.texi +++ b/doc/transformation.texi @@ -287,7 +287,7 @@ one or more @dfn{test} variables for each case. 
The target variable values are always nonnegative integers. They are never missing. The target variable is assigned an F8.2 output format. -@xref{Input/Output Formats}. Any variables, including long and short +@xref{Input and Output Formats}. Any variables, including long and short string variables, may be test variables. User-missing values of test variables are treated just like any other diff --git a/doc/utilities.texi b/doc/utilities.texi index 54ae7b3b..31966849 100644 --- a/doc/utilities.texi +++ b/doc/utilities.texi @@ -88,6 +88,8 @@ DISPLAY FILE LABEL. active file, if any. @xref{FILE LABEL}. +This command is a PSPP extension. + @node DROP DOCUMENTS, ECHO, DISPLAY FILE LABEL, Utilities @section DROP DOCUMENTS @vindex DROP DOCUMENTS @@ -232,6 +234,8 @@ SET /DECIMAL=@{DOT,COMMA@} /FORMAT=fmt_spec /EPOCH=@{AUTOMATIC,year@} + /RIB=@{NATIVE,MSBFIRST,LSBFIRST,VAX@} + /RRB=@{NATIVE,ISL,ISB,IDL,IDB,VF,VD,VG,ZS,ZL@} (program input) /ENDCMD='.' @@ -258,6 +262,8 @@ SET /CC@{A,B,C,D,E@}=@{'npre,pre,suf,nsuf','npre.pre.suf.nsuf'@} /DECIMAL=@{DOT,COMMA@} /FORMAT=fmt_spec + /WIB=@{NATIVE,MSBFIRST,LSBFIRST,VAX@} + /WRB=@{NATIVE,ISL,ISB,IDL,IDB,VF,VD,VG,ZS,ZL@} (output routing) /ECHO=@{ON,OFF@} @@ -320,22 +326,86 @@ system-missing value to be assigned to null items. This is the default. Any real value may be assigned. @item DECIMAL +@anchor{SET DECIMAL} The default DOT setting causes the decimal point character to be -@samp{.}. A setting of COMMA causes the decimal point character to be -@samp{,}. +@samp{.} and the grouping character to be @samp{,}. A setting of COMMA +causes the decimal point character to be @samp{,} and the grouping +character to be @samp{.}. @item FORMAT Allows the default numeric input/output format to be specified. The -default is F8.2. @xref{Input/Output Formats}. +default is F8.2. @xref{Input and Output Formats}. 
@item EPOCH @anchor{SET EPOCH} Specifies the range of years used when a 2-digit year is read from a data file or used in a date construction expression (@pxref{Date -Construction}). If a 4-digit year is specified, then 2-digit years -are interpreted starting from that year, known as the epoch. If -AUTOMATIC (the default) is specified, then the epoch begins 69 years -before the current date. +Construction}). If a 4-digit year is specified for the epoch, then +2-digit years are interpreted starting from that year, known as the +epoch. If AUTOMATIC (the default) is specified, then the epoch begins +69 years before the current date. + +@item RIB +@anchor{SET RIB} + +PSPP extension to set the byte ordering (endianness) used for reading +data in IB or PIB format (@pxref{Binary and Hexadecimal Numeric +Formats}). In MSBFIRST ordering, the most-significant byte appears at +the left end of an IB or PIB field. In LSBFIRST ordering, the +least-significant byte appears at the left end. VAX ordering is like +MSBFIRST, except that each pair of bytes is in reverse order. NATIVE, +the default, is equivalent to MSBFIRST or LSBFIRST depending on the +native format of the machine running PSPP. + +@item RRB +@anchor{SET RRB} + +PSPP extension to set the floating-point format used for reading data in +RB format (@pxref{Binary and Hexadecimal Numeric Formats}). The +possibilities are: + +@table @asis +@item NATIVE +The native format of the machine running PSPP. Equivalent to either IDL +or IDB. + +@item ISL +32-bit IEEE 754 single-precision floating point, in little-endian byte +order. + +@item ISB +32-bit IEEE 754 single-precision floating point, in big-endian byte +order. + +@item IDL +64-bit IEEE 754 double-precision floating point, in little-endian byte +order. + +@item IDB +64-bit IEEE 754 double-precision floating point, in big-endian byte +order. + +@item VF +32-bit VAX F format, in VAX-endian byte order. + +@item VD +64-bit VAX D format, in VAX-endian byte order. 
+ +@item VG +64-bit VAX G format, in VAX-endian byte order. + +@item ZS +32-bit IBM Z architecture short format hexadecimal floating point, in +big-endian byte order. + +@item ZL +64-bit IBM Z architecture long format hexadecimal floating point, in +big-endian byte order. + +Z architecture also supports IEEE 754 floating point. The ZS and ZL +formats are only for use with very old input files. +@end table +The default is NATIVE. @end table Program input subcommands affect the way that programs are parsed when @@ -408,18 +478,10 @@ subcommands are @itemx CCC @itemx CCD @itemx CCE -Set up custom currency formats. The argument is a string which must -contain exactly three commas or exactly three periods. If commas, then -the grouping character for the currency format is @samp{,}, and the -decimal point character is @samp{.}; if periods, then the situation is -reversed. - -The commas or periods divide the string into four fields, which are, in -order, the negative prefix, prefix, suffix, and negative suffix. When a -value is formatted using the custom currency format, the prefix precedes -the value formatted and the suffix follows it. In addition, if the -value is negative, the negative prefix precedes the prefix and the -negative suffix follows the suffix. +@anchor{CCx Settings} + +Set up custom currency formats. @xref{Custom Currency Formats}, for +details. @item DECIMAL The default DOT setting causes the decimal point character to be @@ -428,7 +490,26 @@ The default DOT setting causes the decimal point character to be @item FORMAT Allows the default numeric input/output format to be specified. The -default is F8.2. @xref{Input/Output Formats}. +default is F8.2. @xref{Input and Output Formats}. + +@item WIB +@anchor{SET WIB} + +PSPP extension to set the byte ordering (endianness) used for writing +data in IB or PIB format (@pxref{Binary and Hexadecimal Numeric +Formats}). In MSBFIRST ordering, the most-significant byte appears at +the left end of an IB or PIB field. 
In LSBFIRST ordering, the +least-significant byte appears at the left end. VAX ordering is like +MSBFIRST, except that each pair of bytes is in reverse order. NATIVE, +the default, is equivalent to MSBFIRST or LSBFIRST depending on the +native format of the machine running PSPP. + +@item WRB +@anchor{SET WRB} + +PSPP extension to set the floating-point format used for writing data in +RB format (@pxref{Binary and Hexadecimal Numeric Formats}). The choices +are the same as SET RRB. The default is NATIVE. @end table Output routing subcommands affect where the output of transformations diff --git a/doc/variables.texi b/doc/variables.texi index 525c7094..91b07112 100644 --- a/doc/variables.texi +++ b/doc/variables.texi @@ -101,7 +101,7 @@ FORMATS var_list (fmt_spec). @cmd{FORMATS} set both print and write formats for the specified numeric variables to the specified format specification. -@xref{Input/Output Formats}. +@xref{Input and Output Formats}. Specify a list of variables followed by a format specification in parentheses. The print and write formats of the specified variables @@ -256,7 +256,7 @@ setting their output formats. Specify a slash (@samp{/}), followed by the names of the new numeric variables. If you wish to set their output formats, follow their names -by an output format specification in parentheses (@pxref{Input/Output +by an output format specification in parentheses (@pxref{Input and Output Formats}); otherwise, the default is F8.2. Variables created with @cmd{NUMERIC} are initialized to the @@ -333,7 +333,7 @@ transformations. Specify a slash (@samp{/}), followed by the names of the string variables to create and the desired output format specification in -parentheses (@pxref{Input/Output Formats}). Variable widths are +parentheses (@pxref{Input and Output Formats}). Variable widths are implicitly derived from the specified output formats. Created variables are initialized to spaces. 
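Note on the byte orderings documented in the doc/utilities.texi hunks above (SET RIB and SET WIB): the MSBFIRST, LSBFIRST, and VAX layouts can be illustrated with a small C sketch. This is not the code from this commit. The names byte_order and put_integer are hypothetical, chosen only for illustration, and the handling of an odd trailing byte under VAX ordering (left in place) is an assumption, since the documentation only describes swapping within pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not PSPP's actual API. */
enum byte_order { ORDER_MSBFIRST, ORDER_LSBFIRST, ORDER_VAX };

/* Stores the low-order BYTES bytes of VALUE into OUT using ORDER.
   BYTES must be between 1 and 8. */
static void
put_integer (uint64_t value, size_t bytes, enum byte_order order,
             unsigned char *out)
{
  size_t i;

  assert (bytes >= 1 && bytes <= 8);

  /* Start from MSBFIRST: most-significant byte at the left end. */
  for (i = 0; i < bytes; i++)
    out[i] = (unsigned char) (value >> (8 * (bytes - 1 - i)));

  if (order == ORDER_LSBFIRST)
    {
      /* Reverse the whole field. */
      for (i = 0; i < bytes / 2; i++)
        {
          unsigned char tmp = out[i];
          out[i] = out[bytes - 1 - i];
          out[bytes - 1 - i] = tmp;
        }
    }
  else if (order == ORDER_VAX)
    {
      /* Like MSBFIRST, but swap the two bytes of each complete pair.
         Assumption: an odd trailing byte stays where it is. */
      for (i = 0; i + 1 < bytes; i += 2)
        {
          unsigned char tmp = out[i];
          out[i] = out[i + 1];
          out[i + 1] = tmp;
        }
    }
}
```

Under these assumptions, the 4-byte value 0x01020304 is laid out as 01 02 03 04 in MSBFIRST order, 04 03 02 01 in LSBFIRST order, and 02 01 04 03 in VAX order.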
diff --git a/src/data/ChangeLog b/src/data/ChangeLog index 60076a0f..4be64ea9 100644 --- a/src/data/ChangeLog +++ b/src/data/ChangeLog @@ -1,3 +1,27 @@ +Sat Nov 4 15:59:56 2006 Ben Pfaff + + * calendar.c (calendar_offset_to_gregorian) Also return the + year-of-day. Change callers to new interface. + + * data-out.c: Completely rewrite internals to conform to SPSS + output formats as completely as possible. + (data_out) Change interface to put input parameters before output + parameters, for consistency with the style I now prefer. Update + all callers. + (data_out_get_integer_format) New public function. + (data_out_set_integer_format) New public function. + (data_out_get_float_format) New public function. + (data_out_set_float_format) New public function. + + * data-out.h: New file. Move prototype for data_out here, from + format.h. + + * format.c: (fmt_step_width) Use equality comparison instead of + bitwise and, for clarity. + (fmt_is_string) Ditto. + (fmt_input_to_output) Fix categories that are translated to F + format. + Sun Nov 5 08:29:34 WST 2006 John Darrington * casefilter.c casefilter.h (new files), casefile.c casefile.h diff --git a/src/data/calendar.c b/src/data/calendar.c index 0a6498dd..3a38f126 100644 --- a/src/data/calendar.c +++ b/src/data/calendar.c @@ -153,14 +153,15 @@ calendar_offset_to_year (int ofs) } /* Takes a count of days from 14 Oct 1582 and translates it into - a Gregorian calendar date in (*Y,*M,*D). Dates both before - and after the epoch are supported. */ + a Gregorian calendar date in (*Y,*M,*D). Also stores the + year-relative day number into *YD. Dates both before and + after the epoch are supported. 
*/ void -calendar_offset_to_gregorian (int ofs, int *y, int *m, int *d) +calendar_offset_to_gregorian (int ofs, int *y, int *m, int *d, int *yd) { int year = *y = calendar_offset_to_year (ofs); int january1 = raw_gregorian_to_offset (year, 1, 1); - int yday = ofs - january1 + 1; + int yday = *yd = ofs - january1 + 1; int march1 = january1 + cum_month_days (year, 3); int correction = ofs < march1 ? 0 : (is_leap_year (year) ? 1 : 2); int month = *m = (12 * (yday - 1 + correction) + 373) / 367; @@ -195,8 +196,8 @@ calendar_offset_to_wday (int ofs) int calendar_offset_to_month (int ofs) { - int y, m, d; - calendar_offset_to_gregorian (ofs, &y, &m, &d); + int y, m, d, yd; + calendar_offset_to_gregorian (ofs, &y, &m, &d, &yd); return m; } @@ -205,7 +206,7 @@ calendar_offset_to_month (int ofs) int calendar_offset_to_mday (int ofs) { - int y, m, d; - calendar_offset_to_gregorian (ofs, &y, &m, &d); + int y, m, d, yd; + calendar_offset_to_gregorian (ofs, &y, &m, &d, &yd); return d; } diff --git a/src/data/calendar.h b/src/data/calendar.h index 1a70592b..f46e71fe 100644 --- a/src/data/calendar.h +++ b/src/data/calendar.h @@ -5,7 +5,7 @@ typedef void calendar_error_func (void *aux, const char *, ...); double calendar_gregorian_to_offset (int y, int m, int d, calendar_error_func *, void *aux); -void calendar_offset_to_gregorian (int ofs, int *y, int *m, int *d); +void calendar_offset_to_gregorian (int ofs, int *y, int *m, int *d, int *yd); int calendar_offset_to_year (int ofs); int calendar_offset_to_month (int ofs); int calendar_offset_to_mday (int ofs); diff --git a/src/data/data-out.c b/src/data/data-out.c index 74f97188..ffc79ecc 100644 --- a/src/data/data-out.c +++ b/src/data/data-out.c @@ -18,1215 +18,941 @@ 02110-1301, USA. 
*/ #include -#include + +#include "data-out.h" + #include -#include #include +#include +#include #include #include + #include "calendar.h" -#include -#include #include "format.h" +#include "settings.h" +#include "variable.h" + +#include +#include +#include #include +#include #include #include -#include "settings.h" #include -#include "variable.h" + +#include "minmax.h" #include "gettext.h" #define _(msgid) gettext (msgid) -/* Public functions. */ - -typedef int numeric_converter (char *, const struct fmt_spec *, double); -static numeric_converter convert_F, convert_N, convert_E, convert_F_plus; -static numeric_converter convert_Z, convert_IB, convert_P, convert_PIB; -static numeric_converter convert_PIBHEX, convert_PK, convert_RB; -static numeric_converter convert_RBHEX, convert_CCx, convert_date; -static numeric_converter convert_time, convert_WKDAY, convert_MONTH; - -static numeric_converter try_F, convert_infinite; - -typedef int string_converter (char *, const struct fmt_spec *, const char *); -static string_converter convert_A, convert_AHEX; +/* A representation of a number that can be quickly rounded to + any desired number of decimal places (up to a specified + maximum). */ +struct rounder + { + char string[64]; /* Magnitude of number with excess precision. */ + int integer_digits; /* Number of digits before decimal point. */ + int leading_nines; /* Number of `9's or `.'s at start of string. */ + int leading_zeros; /* Number of `0's or `.'s at start of string. */ + bool negative; /* Is the number negative? */ + }; -/* Converts binary value V into printable form in the exactly - FP->W character in buffer S according to format specification - FP. No null terminator is appended to the buffer. 
*/ -bool -data_out (char *s, const struct fmt_spec *fp, const union value *v) +static void rounder_init (struct rounder *, double number, int max_decimals); +static int rounder_width (const struct rounder *, int decimals, + int *integer_digits, bool *negative); +static void rounder_format (const struct rounder *, int decimals, + char *output); + +/* Format of integers in output (SET WIB). */ +static enum integer_format output_integer_format = INTEGER_NATIVE; + +/* Format of reals in output (SET WRB). */ +static enum float_format output_float_format = FLOAT_NATIVE_DOUBLE; + +typedef void data_out_converter_func (const union value *, + const struct fmt_spec *, + char *); +#define FMT(NAME, METHOD, IMIN, OMIN, IO, CATEGORY) \ + static data_out_converter_func output_##METHOD; +#include "format.def" + +static bool output_decimal (const struct rounder *, const struct fmt_spec *, + bool require_affixes, char *); +static bool output_scientific (double, const struct fmt_spec *, + bool require_affixes, char *); + +static double power10 (int) PURE_FUNCTION; +static double power256 (int) PURE_FUNCTION; + +static void output_infinite (double, const struct fmt_spec *, char *); +static void output_missing (const struct fmt_spec *, char *); +static void output_overflow (const struct fmt_spec *, char *); +static bool output_bcd_integer (double, int digits, char *); +static void output_binary_integer (uint64_t, int bytes, enum integer_format, + char *); +static void output_hex (const void *, size_t bytes, char *); + +/* Converts the INPUT value into printable form in the exactly + FORMAT->W characters in OUTPUT according to format + specification FORMAT. No null terminator is appended to the + buffer. 
*/ +void +data_out (const union value *input, const struct fmt_spec *format, + char *output) { - int ok; - - assert (fmt_check_output (fp)); - if (fmt_is_numeric (fp->type)) + static data_out_converter_func *const converters[FMT_NUMBER_OF_FORMATS] = { - enum fmt_category category = fmt_get_category (fp->type); - double number = v->f; - - /* Handle SYSMIS turning into blanks. */ - if (!(category & (FMT_CAT_CUSTOM | FMT_CAT_BINARY | FMT_CAT_HEXADECIMAL)) - && number == SYSMIS) - { - memset (s, ' ', fp->w); - s[fp->w - fp->d - 1] = '.'; - return true; - } - - /* Handle decimal shift. */ - if ((category & (FMT_CAT_LEGACY | FMT_CAT_BINARY)) - && number != SYSMIS - && fp->d) - number *= pow (10.0, fp->d); - - switch (fp->type) - { - case FMT_F: - ok = convert_F (s, fp, number); - break; - - case FMT_N: - ok = convert_N (s, fp, number); - break; - - case FMT_E: - ok = convert_E (s, fp, number); - break; - - case FMT_COMMA: case FMT_DOT: case FMT_DOLLAR: case FMT_PCT: - ok = convert_F_plus (s, fp, number); - break; - - case FMT_Z: - ok = convert_Z (s, fp, number); - break; - - case FMT_A: - NOT_REACHED (); - - case FMT_AHEX: - NOT_REACHED (); - - case FMT_IB: - ok = convert_IB (s, fp, number); - break; - - case FMT_P: - ok = convert_P (s, fp, number); - break; - - case FMT_PIB: - ok = convert_PIB (s, fp, number); - break; - - case FMT_PIBHEX: - ok = convert_PIBHEX (s, fp, number); - break; - - case FMT_PK: - ok = convert_PK (s, fp, number); - break; +#define FMT(NAME, METHOD, IMIN, OMIN, IO, CATEGORY) output_##METHOD, +#include "format.def" + }; - case FMT_RB: - ok = convert_RB (s, fp, number); - break; + assert (fmt_check_output (format)); - case FMT_RBHEX: - ok = convert_RBHEX (s, fp, number); - break; + converters[format->type] (input, format, output); +} - case FMT_CCA: case FMT_CCB: case FMT_CCC: case FMT_CCD: case FMT_CCE: - ok = convert_CCx (s, fp, number); - break; +/* Returns the current output integer format. 
*/ +enum integer_format +data_out_get_integer_format (void) +{ + return output_integer_format; +} - case FMT_DATE: case FMT_EDATE: case FMT_SDATE: case FMT_ADATE: - case FMT_JDATE: case FMT_QYR: case FMT_MOYR: case FMT_WKYR: - case FMT_DATETIME: - ok = convert_date (s, fp, number); - break; +/* Sets the output integer format to INTEGER_FORMAT. */ +void +data_out_set_integer_format (enum integer_format integer_format) +{ + output_integer_format = integer_format; +} - case FMT_TIME: case FMT_DTIME: - ok = convert_time (s, fp, number); - break; +/* Returns the current output float format. */ +enum float_format +data_out_get_float_format (void) +{ + return output_float_format; +} - case FMT_WKDAY: - ok = convert_WKDAY (s, fp, number); - break; +/* Sets the output float format to FLOAT_FORMAT. */ +void +data_out_set_float_format (enum float_format float_format) +{ + output_float_format = float_format; +} + +/* Main conversion functions. */ - case FMT_MONTH: - ok = convert_MONTH (s, fp, number); - break; +/* Outputs F, COMMA, DOT, DOLLAR, PCT, E, CCA, CCB, CCC, CCD, and + CCE formats. */ +static void +output_number (const union value *input, const struct fmt_spec *format, + char *output) +{ + double number = input->f; - default: - NOT_REACHED (); - } - } + if (number == SYSMIS) + output_missing (format, output); + else if (!isfinite (number)) + output_infinite (number, format, output); else { - /* String formatting. */ - const char *string = v->s; - - switch (fp->type) + if (format->type != FMT_E && fabs (number) < 1.5 * power10 (format->w)) { - case FMT_A: - ok = convert_A (s, fp, string); - break; - - case FMT_AHEX: - ok = convert_AHEX (s, fp, string); - break; + struct rounder r; + rounder_init (&r, number, format->d); - default: - NOT_REACHED (); + if (output_decimal (&r, format, true, output) + || output_scientific (number, format, true, output) + || output_decimal (&r, format, false, output)) + return; } - } - - /* Error handling. 
*/ - if (!ok) - strncpy (s, "ERROR", fp->w); - - return ok; -} - -/* Main conversion functions. */ - -static void insert_commas (char *dst, const char *src, - const struct fmt_spec *fp); -static int year4 (int year); -static int try_CCx (char *s, const struct fmt_spec *fp, double v); - -#if FLT_RADIX!=2 -#error Write your own floating-point output routines. -#endif - -/* Converts a number between 0 and 15 inclusive to a `hexit' - [0-9A-F]. */ -#define MAKE_HEXIT(X) ("0123456789ABCDEF"[X]) - -/* Table of powers of 10. */ -static const double power10[] = - { - 0, /* Not used. */ - 1e01, 1e02, 1e03, 1e04, 1e05, 1e06, 1e07, 1e08, 1e09, 1e10, - 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, - 1e21, 1e22, 1e23, 1e24, 1e25, 1e26, 1e27, 1e28, 1e29, 1e30, - 1e31, 1e32, 1e33, 1e34, 1e35, 1e36, 1e37, 1e38, 1e39, 1e40, - }; -/* Handles F format. */ -static int -convert_F (char *dst, const struct fmt_spec *fp, double number) -{ - if (!try_F (dst, fp, number)) - convert_E (dst, fp, number); - return 1; + if (!output_scientific (number, format, false, output)) + output_overflow (format, output); + } } -/* Handles N format. */ -static int -convert_N (char *dst, const struct fmt_spec *fp, double number) +/* Outputs N format. 
*/ +static void +output_N (const union value *input, const struct fmt_spec *format, + char *output) { - double d = floor (number); - - if (d < 0 || d == SYSMIS) - { - msg (ME, _("The N output format cannot be used to output a " - "negative number or the system-missing value.")); - return 0; - } - - if (d < power10[fp->w]) + double number = input->f * power10 (format->d); + if (input->f == SYSMIS || number < 0) + output_missing (format, output); + else { char buf[128]; - sprintf (buf, "%0*.0f", fp->w, number); - memcpy (dst, buf, fp->w); + number = fabs (round (number)); + if (number < power10 (format->w) + && sprintf (buf, "%0*.0f", format->w, number) == format->w) + memcpy (output, buf, format->w); + else + output_overflow (format, output); } - else - memset (dst, '*', fp->w); - - return 1; } -/* Handles E format. Also operates as fallback for some other - formats. */ -static int -convert_E (char *dst, const struct fmt_spec *fp, double number) +/* Outputs Z format. */ +static void +output_Z (const union value *input, const struct fmt_spec *format, + char *output) { - /* Temporary buffer. */ + double number = input->f * power10 (format->d); char buf[128]; - - /* Ranged number of decimal places. */ - int d; - - if (!finite (number)) - return convert_infinite (dst, fp, number); - - /* Check that the format is wide enough. - Although PSPP generally checks this, convert_E() can be called as - a fallback from other formats which do not check. */ - if (fp->w < 6) - { - memset (dst, '*', fp->w); - return 1; - } - - /* Put decimal places in usable range. */ - d = min (fp->d, fp->w - 6); - if (number < 0) - d--; - if (d < 0) - d = 0; - sprintf (buf, "%*.*E", fp->w, d, number); - - /* What we do here is force the exponent part to have four - characters whenever possible. That is, 1.00E+99 is okay (`E+99') - but 1.00E+100 (`E+100') must be coerced to 1.00+100 (`+100'). On - the other hand, 1.00E1000 (`E+100') cannot be canonicalized. 
- Note that ANSI C guarantees at least two digits in the - exponent. */ - if (fabs (number) > 1e99) - { - /* Pointer to the `E' in buf. */ - char *cp; - - cp = strchr (buf, 'E'); - if (cp) - { - /* Exponent better not be bigger than an int. */ - int exp = atoi (cp + 1); - - if (abs (exp) > 99 && abs (exp) < 1000) - { - /* Shift everything left one place: 1.00e+100 -> 1.00+100. */ - cp[0] = cp[1]; - cp[1] = cp[2]; - cp[2] = cp[3]; - cp[3] = cp[4]; - } - else if (abs (exp) >= 1000) - memset (buf, '*', fp->w); - } - } - - /* The C locale always uses a period `.' as a decimal point. - Translate to comma if necessary. */ - if (fmt_decimal_char (fp->type) != '.') + if (input->f == SYSMIS) + output_missing (format, output); + else if (fabs (number) >= power10 (format->w) + || sprintf (buf, "%0*.0f", format->w, + fabs (round (number))) != format->w) + output_overflow (format, output); + else { - char *cp = strchr (buf, '.'); - if (cp) - *cp = fmt_decimal_char (fp->type); + if (number < 0 && strspn (buf, "0") < format->w) + { + char *p = &buf[format->w - 1]; + *p = "}JKLMNOPQR"[*p - '0']; + } + memcpy (output, buf, format->w); } - - memcpy (dst, buf, fp->w); - return 1; } -/* Handles COMMA, DOT, DOLLAR, and PCT formats. */ -static int -convert_F_plus (char *dst, const struct fmt_spec *fp, double number) +/* Outputs P format. */ +static void +output_P (const union value *input, const struct fmt_spec *format, + char *output) { - char buf[40]; - - if (try_F (buf, fp, number)) - insert_commas (dst, buf, fp); + if (output_bcd_integer (fabs (input->f * power10 (format->d)), + format->w * 2 - 1, output) + && input->f < 0.0) + output[format->w - 1] |= 0xd; else - convert_E (dst, fp, number); - - return 1; + output[format->w - 1] |= 0xf; } -static int -convert_Z (char *dst, const struct fmt_spec *fp, double number) +/* Outputs PK format. 
*/ +static void +output_PK (const union value *input, const struct fmt_spec *format, + char *output) { - static bool warned = false; - - if (!warned) - { - msg (MW, - _("Quality of zoned decimal (Z) output format code is " - "suspect. Check your results. Report bugs to %s."), - PACKAGE_BUGREPORT); - warned = 1; - } + output_bcd_integer (input->f * power10 (format->d), format->w * 2, output); +} - if (number == SYSMIS) +/* Outputs IB format. */ +static void +output_IB (const union value *input, const struct fmt_spec *format, + char *output) +{ + double number = round (input->f * power10 (format->d)); + if (input->f == SYSMIS + || number >= power256 (format->w) / 2 - 1 + || number < -power256 (format->w) / 2) + memset (output, 0, format->w); + else { - msg (ME, _("The system-missing value cannot be output as a zoned " - "decimal number.")); - return 0; + uint64_t integer = fabs (number); + if (number < 0) + integer = -integer; + output_binary_integer (integer, format->w, output_integer_format, + output); } - - { - char buf[41]; - double d; - int i; - - d = fabs (floor (number)); - if (d >= power10[fp->w]) - { - msg (ME, _("Number %g too big to fit in field with format Z%d.%d."), - number, fp->w, fp->d); - return 0; - } - - sprintf (buf, "%*.0f", fp->w, number); - for (i = 0; i < fp->w; i++) - dst[i] = (buf[i] - '0') | 0xf0; - if (number < 0) - dst[fp->w - 1] &= 0xdf; - } - - return 1; } -static int -convert_A (char *dst, const struct fmt_spec *fp, const char *string) +/* Outputs PIB format. 
*/ +static void +output_PIB (const union value *input, const struct fmt_spec *format, + char *output) { - memcpy(dst, string, fp->w); - return 1; + double number = round (input->f * power10 (format->d)); + if (input->f == SYSMIS + || number < 0 || number >= power256 (format->w)) + memset (output, 0, format->w); + else + output_binary_integer (number, format->w, output_integer_format, output); } -static int -convert_AHEX (char *dst, const struct fmt_spec *fp, const char *string) +/* Outputs PIBHEX format. */ +static void +output_PIBHEX (const union value *input, const struct fmt_spec *format, + char *output) { - int i; - - for (i = 0; i < fp->w / 2; i++) + double number = round (input->f); + if (input->f == SYSMIS) + output_missing (format, output); + else if (input->f < 0 || number >= power256 (format->w / 2)) + output_overflow (format, output); + else { - *dst++ = MAKE_HEXIT ((string[i]) >> 4); - *dst++ = MAKE_HEXIT ((string[i]) & 0xf); + char tmp[8]; + output_binary_integer (number, format->w / 2, INTEGER_MSB_FIRST, tmp); + output_hex (tmp, format->w / 2, output); } - - return 1; } -static int -convert_IB (char *dst, const struct fmt_spec *fp, double number) +/* Outputs RB format. */ +static void +output_RB (const union value *input, const struct fmt_spec *format, + char *output) { - /* Strategy: Basically the same as convert_PIBHEX() but with - base 256. Then negate the two's-complement result if number - is negative. */ - - /* Used for constructing the two's-complement result. */ - unsigned temp[8]; + double d = input->f; + memcpy (output, &d, format->w); +} - /* Fraction (mantissa). */ - double frac; +/* Outputs RBHEX format. */ +static void +output_RBHEX (const union value *input, const struct fmt_spec *format, + char *output) +{ + double d = input->f; + output_hex (&d, format->w / 2, output); +} - /* Exponent. */ - int exp; +/* Outputs DATE, ADATE, EDATE, JDATE, SDATE, QYR, MOYR, WKYR, + DATETIME, TIME, and DTIME formats. 
*/ +static void +output_date (const union value *input, const struct fmt_spec *format, + char *output) +{ + double number = input->f; + double magnitude = fabs (number); + int year, month, day, yday; - /* Difference between exponent and (-8*fp->w-1). */ - int diff; + const char *template = fmt_date_template (format->type); + size_t template_width = strlen (template); + int excess_width = format->w - template_width; - /* Counter. */ - int i; + char tmp[64]; + char *p = tmp; - /* Make the exponent (-8*fp->w-1). */ - frac = frexp (fabs (number), &exp); - diff = exp - (-8 * fp->w - 1); - exp -= diff; - frac *= ldexp (1.0, diff); + assert (format->w >= template_width); + if (number == SYSMIS) + goto missing; - /* Extract each base-256 digit. */ - for (i = 0; i < fp->w; i++) + if (fmt_get_category (format->type) == FMT_CAT_DATE) { - modf (frac, &frac); - frac *= 256.0; - temp[i] = floor (frac); + if (number <= 0) + goto missing; + calendar_offset_to_gregorian (number / 60. / 60. / 24., + &year, &month, &day, &yday); } + else + year = month = day = yday = 0; - /* Perform two's-complement negation if number is negative. */ - if (number < 0) + while (*template != '\0') { - /* Perform NOT operation. */ - for (i = 0; i < fp->w; i++) - temp[i] = ~temp[i]; - /* Add 1 to the whole number. 
*/ - for (i = fp->w - 1; i >= 0; i--) - { - temp[i]++; - if (temp[i]) - break; - } + int ch = *template; + int count = 1; + while (template[count] == ch) + count++; + template += count; + + switch (ch) + { + case 'd': + if (count < 3) + p += sprintf (p, "%02d", day); + else + p += sprintf (p, "%03d", yday); + break; + case 'm': + if (count < 3) + p += sprintf (p, "%02d", month); + else + { + static const char *months[12] = + { + "JAN", "FEB", "MAR", "APR", "MAY", "JUN", + "JUL", "AUG", "SEP", "OCT", "NOV", "DEC", + }; + p = stpcpy (p, months[month - 1]); + } + break; + case 'y': + if (count >= 4 || excess_width >= 2) + { + if (year <= 9999) + p += sprintf (p, "%04d", year); + else if (format->type == FMT_DATETIME) + p = stpcpy (p, "****"); + else + goto overflow; + } + else + { + int offset = year - get_epoch (); + if (offset < 0 || offset > 99) + goto overflow; + p += sprintf (p, "%02d", abs (year) % 100); + } + break; + case 'q': + p += sprintf (p, "%d", (month - 1) / 3 + 1); + break; + case 'w': + p += sprintf (p, "%2d", (yday - 1) / 7 + 1); + break; + case 'D': + if (number < 0) + *p++ = '-'; + p += sprintf (p, "%.0f", floor (magnitude / 60. / 60. / 24.)); + break; + case 'h': + if (number < 0) + *p++ = '-'; + p += sprintf (p, "%.0f", floor (magnitude / 60. / 60.)); + break; + case 'H': + p += sprintf (p, "%02d", + (int) fmod (floor (magnitude / 60. 
/ 60.), 24.)); + break; + case 'M': + p += sprintf (p, "%02d", + (int) fmod (floor (magnitude / 60.), 60.)); + excess_width = format->w - (p - tmp); + if (excess_width < 0) + goto overflow; + if (excess_width == 3 || excess_width == 4 + || (excess_width >= 5 && format->d == 0)) + p += sprintf (p, ":%02d", (int) fmod (magnitude, 60.)); + else if (excess_width >= 5) + { + int d = MIN (format->d, excess_width - 4); + int w = d + 3; + sprintf (p, ":%0*.*f", w, d, fmod (magnitude, 60.)); + if (fmt_decimal_char (FMT_F) != '.') + { + char *cp = strchr (p, '.'); + if (cp != NULL) + *cp = fmt_decimal_char (FMT_F); + } + p += strlen (p); + } + break; + default: + assert (count == 1); + *p++ = ch; + break; + } } - memcpy (dst, temp, fp->w); -#ifndef WORDS_BIGENDIAN - buf_reverse (dst, fp->w); -#endif - - return 1; -} - -static int -convert_P (char *dst, const struct fmt_spec *fp, double number) -{ - /* Buffer for fp->w*2-1 characters + a decimal point if library is - not quite compliant + a null. */ - char buf[17]; - - /* Counter. */ - int i; - - /* Main extraction. */ - sprintf (buf, "%0*.0f", fp->w * 2 - 1, floor (fabs (number))); - for (i = 0; i < fp->w; i++) - ((unsigned char *) dst)[i] - = ((buf[i * 2] - '0') << 4) + buf[i * 2 + 1] - '0'; + buf_copy_lpad (output, format->w, tmp, p - tmp); + return; - /* Set sign. */ - dst[fp->w - 1] &= 0xf0; - if (number >= 0.0) - dst[fp->w - 1] |= 0xf; - else - dst[fp->w - 1] |= 0xd; + overflow: + output_overflow (format, output); + return; - return 1; + missing: + output_missing (format, output); + return; } -static int -convert_PIB (char *dst, const struct fmt_spec *fp, double number) +/* Outputs WKDAY format. */ +static void +output_WKDAY (const union value *input, const struct fmt_spec *format, + char *output) { - /* Strategy: Basically the same as convert_IB(). */ - - /* Fraction (mantissa). */ - double frac; - - /* Exponent. */ - int exp; - - /* Difference between exponent and (-8*fp->w). */ - int diff; - - /* Counter. 
*/ - int i; - - /* Make the exponent (-8*fp->w). */ - frac = frexp (fabs (number), &exp); - diff = exp - (-8 * fp->w); - exp -= diff; - frac *= ldexp (1.0, diff); + static const char *weekdays[7] = + { + "SUNDAY", "MONDAY", "TUESDAY", "WEDNESDAY", + "THURSDAY", "FRIDAY", "SATURDAY", + }; - /* Extract each base-256 digit. */ - for (i = 0; i < fp->w; i++) + if (input->f >= 1 && input->f < 8) + buf_copy_str_rpad (output, format->w, weekdays[(int) input->f - 1]); + else { - modf (frac, &frac); - frac *= 256.0; - ((unsigned char *) dst)[i] = floor (frac); + if (input->f != SYSMIS) + msg (ME, _("Weekday number %f is not between 1 and 7."), input->f); + output_missing (format, output); } -#ifndef WORDS_BIGENDIAN - buf_reverse (dst, fp->w); -#endif - - return 1; } -static int -convert_PIBHEX (char *dst, const struct fmt_spec *fp, double number) +/* Outputs MONTH format. */ +static void +output_MONTH (const union value *input, const struct fmt_spec *format, + char *output) { - /* Strategy: Use frexp() to create a normalized result (but mostly - to find the base-2 exponent), then change the base-2 exponent to - (-4*fp->w) using multiplication and division by powers of two. - Extract each hexit by multiplying by 16. */ - - /* Fraction (mantissa). */ - double frac; - - /* Exponent. */ - int exp; - - /* Difference between exponent and (-4*fp->w). */ - int diff; - - /* Counter. */ - int i; - - /* Make the exponent (-4*fp->w). */ - frac = frexp (fabs (number), &exp); - diff = exp - (-4 * fp->w); - exp -= diff; - frac *= ldexp (1.0, diff); + static const char *months[12] = + { + "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE", + "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER", + }; - /* Extract each hexit. 
*/ - for (i = 0; i < fp->w; i++) + if (input->f >= 1 && input->f < 13) + buf_copy_str_rpad (output, format->w, months[(int) input->f - 1]); + else { - modf (frac, &frac); - frac *= 16.0; - *dst++ = MAKE_HEXIT ((int) floor (frac)); + if (input->f != SYSMIS) + msg (ME, _("Month number %f is not between 1 and 12."), input->f); + output_missing (format, output); } - - return 1; } -static int -convert_PK (char *dst, const struct fmt_spec *fp, double number) +/* Outputs A format. */ +static void +output_A (const union value *input, const struct fmt_spec *format, + char *output) { - /* Buffer for fp->w*2 characters + a decimal point if library is not - quite compliant + a null. */ - char buf[18]; - - /* Counter. */ - int i; - - /* Main extraction. */ - sprintf (buf, "%0*.0f", fp->w * 2, floor (fabs (number))); - - for (i = 0; i < fp->w; i++) - ((unsigned char *) dst)[i] - = ((buf[i * 2] - '0') << 4) + buf[i * 2 + 1] - '0'; - - return 1; + memcpy (output, input->s, format->w); } -static int -convert_RB (char *dst, const struct fmt_spec *fp, double number) +/* Outputs AHEX format. */ +static void +output_AHEX (const union value *input, const struct fmt_spec *format, + char *output) { - union - { - double d; - char c[8]; - } - u; - - u.d = number; - memcpy (dst, u.c, fp->w); - - return 1; + output_hex (input->s, format->w, output); } + +/* Decimal and scientific formatting. */ -static int -convert_RBHEX (char *dst, const struct fmt_spec *fp, double number) +/* If REQUEST plus the current *WIDTH fits within MAX_WIDTH, + increments *WIDTH by REQUEST and returns true. + Otherwise returns false without changing *WIDTH.
*/ +static bool +allocate_space (int request, int max_width, int *width) { - union - { - double d; - char c[8]; - } - u; - - int i; - - u.d = number; - for (i = 0; i < fp->w / 2; i++) + assert (*width <= max_width); + if (request + *width <= max_width) { - *dst++ = MAKE_HEXIT (u.c[i] >> 4); - *dst++ = MAKE_HEXIT (u.c[i] & 15); + *width += request; + return true; } - - return 1; -} - -static int -convert_CCx (char *dst, const struct fmt_spec *fp, double number) -{ - if (try_CCx (dst, fp, number)) - return 1; else - { - struct fmt_spec f; - - f.type = FMT_COMMA; - f.w = fp->w; - f.d = fp->d; - - return convert_F_plus (dst, &f, number); - } + return false; } -static int -convert_date (char *dst, const struct fmt_spec *fp, double number) +/* Tries to compose the number represented by R, in the style of + FORMAT, into OUTPUT. Returns true if successful, false on + failure, which occurs if FORMAT's width is too narrow. If + REQUIRE_AFFIXES is true, then the prefix and suffix specified + by FORMAT's style must be included; otherwise, they may be + omitted to make the number fit. 
*/ +static bool +output_decimal (const struct rounder *r, const struct fmt_spec *format, + bool require_affixes, char *output) { - static const char *months[12] = - { - "JAN", "FEB", "MAR", "APR", "MAY", "JUN", - "JUL", "AUG", "SEP", "OCT", "NOV", "DEC", - }; + const struct fmt_number_style *style = fmt_get_style (format->type); + int decimals; - char buf[64] = {0}; - int ofs = number / 86400.; - int month, day, year; - - if (ofs < 1) - return 0; - - calendar_offset_to_gregorian (ofs, &year, &month, &day); - switch (fp->type) + for (decimals = format->d; decimals >= 0; decimals--) { - case FMT_DATE: - if (fp->w >= 11) - sprintf (buf, "%02d-%s-%04d", day, months[month - 1], year); - else - sprintf (buf, "%02d-%s-%02d", day, months[month - 1], year % 100); - break; - case FMT_EDATE: - if (fp->w >= 10) - sprintf (buf, "%02d.%02d.%04d", day, month, year); - else - sprintf (buf, "%02d.%02d.%02d", day, month, year % 100); - break; - case FMT_SDATE: - if (fp->w >= 10) - sprintf (buf, "%04d/%02d/%02d", year, month, day); - else - sprintf (buf, "%02d/%02d/%02d", year % 100, month, day); - break; - case FMT_ADATE: - if (fp->w >= 10) - sprintf (buf, "%02d/%02d/%04d", month, day, year); - else - sprintf (buf, "%02d/%02d/%02d", month, day, year % 100); - break; - case FMT_JDATE: - { - int yday = calendar_offset_to_yday (ofs); - - if (fp->w < 7) - sprintf (buf, "%02d%03d", year % 100, yday); - else if (year4 (year)) - sprintf (buf, "%04d%03d", year, yday); - else - break; - } - case FMT_QYR: - if (fp->w >= 8) - sprintf (buf, "%d Q% 04d", (month - 1) / 3 + 1, year); + /* Formatted version of magnitude of NUMBER. */ + char magnitude[64]; + + /* Number of digits in MAGNITUDE's integer and fractional parts. */ + int integer_digits; + + /* Amount of space within the field width already claimed. + Initially this is the width of MAGNITUDE, then it is reduced + in stages as space is allocated to prefixes and suffixes and + grouping characters. 
*/ + int width; + + /* Include various decorations? */ + bool add_neg_prefix; + bool add_affixes; + bool add_grouping; + + /* Position in output. */ + char *p; + + /* Make sure there's room for the number's magnitude, plus + the negative suffix, plus (if negative) the negative + prefix. */ + width = rounder_width (r, decimals, &integer_digits, &add_neg_prefix); + width += ss_length (style->neg_suffix); + if (add_neg_prefix) + width += ss_length (style->neg_prefix); + if (width > format->w) + continue; + + /* If there's room for the prefix and suffix, allocate + space. If the affixes are required, but there's no + space, give up. */ + add_affixes = allocate_space (fmt_affix_width (style), + format->w, &width); + if (!add_affixes && require_affixes) + continue; + + /* Check whether we should include grouping characters. + We need room for a complete set or we don't insert any at all. + We don't include grouping characters if decimal places were + requested but they were all dropped. */ + add_grouping = (style->grouping != 0 + && integer_digits > 3 + && (format->d == 0 || decimals > 0) + && allocate_space ((integer_digits - 1) / 3, + format->w, &width)); + + /* Format the number's magnitude. */ + rounder_format (r, decimals, magnitude); + + /* Assemble number. 
*/ + p = output; + if (format->w > width) + p = mempset (p, ' ', format->w - width); + if (add_neg_prefix) + p = mempcpy (p, ss_data (style->neg_prefix), + ss_length (style->neg_prefix)); + if (add_affixes) + p = mempcpy (p, ss_data (style->prefix), ss_length (style->prefix)); + if (!add_grouping) + p = mempcpy (p, magnitude, integer_digits); else - sprintf (buf, "%d Q% 02d", (month - 1) / 3 + 1, year % 100); - break; - case FMT_MOYR: - if (fp->w >= 8) - sprintf (buf, "%s% 04d", months[month - 1], year); + { + int i; + for (i = 0; i < integer_digits; i++) + { + if (i > 0 && (integer_digits - i) % 3 == 0) + *p++ = style->grouping; + *p++ = magnitude[i]; + } + } + if (decimals > 0) + { + *p++ = style->decimal; + p = mempcpy (p, &magnitude[integer_digits + 1], decimals); + } + if (add_affixes) + p = mempcpy (p, ss_data (style->suffix), ss_length (style->suffix)); + if (add_neg_prefix) + p = mempcpy (p, ss_data (style->neg_suffix), + ss_length (style->neg_suffix)); else - sprintf (buf, "%s% 02d", months[month - 1], year % 100); - break; - case FMT_WKYR: - { - int yday = calendar_offset_to_yday (ofs); - - if (fp->w >= 10) - sprintf (buf, "%02d WK% 04d", (yday - 1) / 7 + 1, year); - else - sprintf (buf, "%02d WK% 02d", (yday - 1) / 7 + 1, year % 100); - } - break; - case FMT_DATETIME: - { - char *cp; - - cp = spprintf (buf, "%02d-%s-%04d %02d:%02d", - day, months[month - 1], year, - (int) fmod (floor (number / 60. 
/ 60.), 24.), - (int) fmod (floor (number / 60.), 60.)); - if (fp->w >= 20) - { - int w, d; - - if (fp->w >= 22 && fp->d > 0) - { - d = min (fp->d, fp->w - 21); - w = 3 + d; - } - else - { - w = 2; - d = 0; - } - - cp = spprintf (cp, ":%0*.*f", w, d, fmod (number, 60.)); - } - } - break; - default: - NOT_REACHED (); - } + p = mempset (p, ' ', ss_length (style->neg_suffix)); + assert (p == output + format->w); - if (buf[0] == 0) - return 0; - buf_copy_str_rpad (dst, fp->w, buf); - return 1; + return true; + } + return false; } -static int -convert_time (char *dst, const struct fmt_spec *fp, double number) +/* Formats NUMBER into OUTPUT in scientific notation according to + the style of the format specified in FORMAT. */ +static bool +output_scientific (double number, const struct fmt_spec *format, + bool require_affixes, char *output) { - char temp_buf[40]; - char *cp; - - double time; + const struct fmt_number_style *style = fmt_get_style (format->type); int width; + int fraction_width; + bool add_affixes; + char buf[64], *p; - if (fabs (number) > 1e20) - { - msg (ME, _("Time value %g too large in magnitude to convert to " - "alphanumeric time."), number); - return 0; - } + /* Allocate minimum required space. */ + width = 6 + ss_length (style->neg_suffix); + if (number < 0) + width += ss_length (style->neg_prefix); + if (width > format->w) + return false; + + /* Check for room for prefix and suffix. */ + add_affixes = allocate_space (fmt_affix_width (style), format->w, &width); + if (require_affixes && !add_affixes) + return false; + + /* Figure out number of characters we can use for the fraction, + if any. (If that turns out to be 1, then we'll output a + decimal point without any digits following; that's what the + # flag does in the call to sprintf, below.) 
*/ + fraction_width = MIN (MIN (format->d + 1, format->w - width), 16); + if (format->type != FMT_E + && (fraction_width == 1 + || format->w - width + (style->grouping == 0 && number < 0) <= 2)) + fraction_width = 0; + width += fraction_width; + + /* Format (except suffix). */ + p = buf; + if (width < format->w) + p = mempset (p, ' ', format->w - width); + if (number < 0) + p = mempcpy (p, ss_data (style->neg_prefix), + ss_length (style->neg_prefix)); + if (add_affixes) + p = mempcpy (p, ss_data (style->prefix), ss_length (style->prefix)); + if (fraction_width > 0) + sprintf (p, "%#.*E", fraction_width - 1, fabs (number)); + else + sprintf (p, "%.0E", fabs (number)); - time = number; - width = fp->w; - cp = temp_buf; - if (time < 0) - *cp++ = '-', time = -time; - if (fp->type == FMT_DTIME) + /* The C locale always uses a period `.' as a decimal point. + Translate to comma if necessary. */ + if (style->decimal != '.') { - double days = floor (time / 60. / 60. / 24.); - cp = spprintf (temp_buf, "%02.0f ", days); - time = time - days * 60. * 60. * 24.; - width -= 3; + char *cp = strchr (p, '.'); + if (cp != NULL) + *cp = style->decimal; } - else - cp = temp_buf; - - cp = spprintf (cp, "%02.0f:%02.0f", - fmod (floor (time / 60. / 60.), 24.), - fmod (floor (time / 60.), 60.)); - if (width >= 8) - { - int w, d; + /* Make exponent have exactly three digits, plus sign. */ + { + char *cp = strchr (p, 'E') + 1; + long int exponent = strtol (cp, NULL, 10); + if (abs (exponent) > 999) + return false; + sprintf (cp, "%+04ld", exponent); + } - if (width >= 10 && fp->d >= 0 && fp->d != 0) - d = min (fp->d, width - 9), w = 3 + d; - else - w = 2, d = 0; + /* Add suffixes. 
*/ + p = strchr (p, '\0'); + if (add_affixes) + p = mempcpy (p, ss_data (style->suffix), ss_length (style->suffix)); + if (number < 0) + p = mempcpy (p, ss_data (style->neg_suffix), + ss_length (style->neg_suffix)); + else + p = mempset (p, ' ', ss_length (style->neg_suffix)); - cp = spprintf (cp, ":%0*.*f", w, d, fmod (time, 60.)); - } - buf_copy_str_rpad (dst, fp->w, temp_buf); + assert (p == buf + format->w); - return 1; + buf_copy_str_lpad (output, format->w, buf); + return true; } - -static int -convert_WKDAY (char *dst, const struct fmt_spec *fp, double wkday) + +#ifndef HAVE_ROUND +/* Return X rounded to the nearest integer, + rounding ties away from zero. */ +static double +round (double x) { - static const char *weekdays[7] = - { - "SUNDAY", "MONDAY", "TUESDAY", "WEDNESDAY", - "THURSDAY", "FRIDAY", "SATURDAY", - }; - - if (wkday < 1 || wkday > 7) - { - msg (ME, _("Weekday index %f does not lie between 1 and 7."), - (double) wkday); - return 0; - } - buf_copy_str_rpad (dst, fp->w, weekdays[(int) wkday - 1]); - - return 1; + return x >= 0.0 ? floor (x + .5) : ceil (x - .5); } +#endif /* !HAVE_ROUND */ -static int -convert_MONTH (char *dst, const struct fmt_spec *fp, double month) +/* Returns true if the magnitude represented by R should be + rounded up when chopped off at DECIMALS decimal places, false + if it should be rounded down. */ +static bool +should_round_up (const struct rounder *r, int decimals) { - static const char *months[12] = - { - "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE", - "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER", - }; - - if (month < 1 || month > 12) - { - msg (ME, _("Month index %f does not lie between 1 and 12."), - month); - return 0; - } - - buf_copy_str_rpad (dst, fp->w, months[(int) month - 1]); - - return 1; + int digit = r->string[r->integer_digits + decimals + 1]; + assert (digit >= '0' && digit <= '9'); + return digit >= '5'; } - -/* Helper functions. 
*/ -/* Copies SRC to DST, inserting commas and dollar signs as appropriate - for format spec *FP. */ +/* Initializes R for formatting the magnitude of NUMBER to no + more than MAX_DECIMALS decimal places. */ static void -insert_commas (char *dst, const char *src, const struct fmt_spec *fp) +rounder_init (struct rounder *r, double number, int max_decimals) { - /* Number of leading spaces in the number. This is the amount of - room we have for inserting commas and dollar signs. */ - int n_spaces; - - /* Number of digits before the decimal point. This is used to - determine the Number of commas to insert. */ - int n_digits; - - /* Number of commas to insert. */ - int n_commas; - - /* Number of items ,%$ to insert. */ - int n_items; - - /* Number of n_items items not to use for commas. */ - int n_reserved; - - /* Digit iterator. */ - int i; - - /* Source pointer. */ - const char *sp; - - /* Count spaces and digits. */ - sp = src; - while (sp < src + fp->w && *sp == ' ') - sp++; - n_spaces = sp - src; - sp = src + n_spaces; - if (*sp == '-') - sp++; - n_digits = 0; - while (sp + n_digits < src + fp->w && isdigit ((unsigned char) sp[n_digits])) - n_digits++; - n_commas = (n_digits - 1) / 3; - n_items = n_commas + (fp->type == FMT_DOLLAR || fp->type == FMT_PCT); - - /* Check whether we have enough space to do insertions. */ - if (!n_spaces || !n_items) + assert (fabs (number) < 1e41); + assert (max_decimals >= 0 && max_decimals <= 16); + if (max_decimals == 0) { - memcpy (dst, src, fp->w); - return; - } - if (n_items > n_spaces) - { - n_items -= n_commas; - if (!n_items) - { - memcpy (dst, src, fp->w); - return; - } - } + /* Fast path. No rounding needed. - /* Put spaces at the beginning if there's extra room. */ - if (n_spaces > n_items) - { - memset (dst, ' ', n_spaces - n_items); - dst += n_spaces - n_items; - } - - /* Insert $ and reserve space for %.
*/ - n_reserved = 0; - if (fp->type == FMT_DOLLAR) - { - *dst++ = '$'; - n_items--; + We append ".00" to the integer representation because + round_up assumes that fractional digits are present. */ + sprintf (r->string, "%.0f.00", fabs (round (number))); } - else if (fp->type == FMT_PCT) - n_reserved = 1; - - /* Copy negative sign and digits, inserting commas. */ - if (sp - src > n_spaces) - *dst++ = '-'; - for (i = n_digits; i; i--) + else { - if (i % 3 == 0 && n_digits > i && n_items > n_reserved) - { - n_items--; - *dst++ = fmt_grouping_char (fp->type); - } - *dst++ = *sp++; + /* Slow path. + + This is more difficult than it really should be because + we have to make sure that numbers that are exactly + halfway between two representations are always rounded + away from zero. This is not what sprintf normally does + (usually it rounds to even), so we have to fake it as + best we can, by formatting with extra precision and then + doing the rounding ourselves. + + We take up to two rounds to format numbers. In the + first round, we obtain 2 digits of precision beyond + those requested by the user. If those digits are + exactly "50", then in a second round we format with as + many digits as are significant in a "double". + + It might be better to directly implement our own + floating-point formatting routine instead of relying on + the system's sprintf implementation. But the classic + Steele and White paper on printing floating-point + numbers does not hint how to do what we want, and it's + not obvious how to change their algorithms to do so. It + would also be a lot of work. 
*/ + sprintf (r->string, "%.*f", max_decimals + 2, fabs (number)); + if (!strcmp (r->string + strlen (r->string) - 2, "50")) + { + int binary_exponent, decimal_exponent, format_decimals; + frexp (number, &binary_exponent); + decimal_exponent = binary_exponent * 3 / 10; + format_decimals = (DBL_DIG + 1) - decimal_exponent; + if (format_decimals > max_decimals + 2) + sprintf (r->string, "%.*f", format_decimals, fabs (number)); + } } + + if (r->string[0] == '0') + memmove (r->string, &r->string[1], strlen (r->string)); - /* Copy decimal places and insert % if necessary. */ - memcpy (dst, sp, fp->w - (sp - src)); - if (fp->type == FMT_PCT && n_items > 0) - dst[fp->w - (sp - src)] = '%'; + r->leading_zeros = strspn (r->string, "0."); + r->leading_nines = strspn (r->string, "9."); + r->integer_digits = strchr (r->string, '.') - r->string; + r->negative = number < 0; } -/* Returns 1 if YEAR (i.e., 1987) can be represented in four digits, 0 - otherwise. */ -static int -year4 (int year) -{ - if (year >= 1 && year <= 9999) - return 1; - msg (ME, _("Year %d cannot be represented in four digits for " - "output formatting purposes."), year); - return 0; -} +/* Returns the number of characters required to format the + magnitude represented by R to DECIMALS decimal places. + The return value includes integer digits and a decimal point + and fractional digits, if any, but it does not include any + negative prefix or suffix or other affixes. + + *INTEGER_DIGITS is set to the number of digits before the + decimal point in the output, between 0 and 40. + If R represents a negative number and its rounded + representation would include at least one nonzero digit, + *NEGATIVE is set to true; otherwise, it is set to false. 
*/ static int -try_CCx (char *dst, const struct fmt_spec *fp, double number) +rounder_width (const struct rounder *r, int decimals, + int *integer_digits, bool *negative) { - const struct fmt_number_style *style = fmt_get_style (fp->type); - - struct fmt_spec f; - - char buf[64]; - char buf2[64]; - char *cp; - - /* Determine length available, decimal character for number - proper. */ - f.type = style->decimal == fmt_decimal_char (FMT_COMMA) ? FMT_COMMA : FMT_DOT; - f.w = fp->w - fmt_affix_width (style); - if (number < 0) - f.w -= fmt_neg_affix_width (style) - 1; - else - /* Convert -0 to +0. */ - number = fabs (number); - f.d = fp->d; - - if (f.w <= 0) - return 0; - - /* There's room for all that currency crap. Let's do the F - conversion first. */ - if (!convert_F (buf, &f, number) || *buf == '*') - return 0; - insert_commas (buf2, buf, &f); - - /* Postprocess back into buf. */ - cp = buf; - if (number < 0) - cp = mempcpy (cp, ss_data (style->neg_prefix), - ss_length (style->neg_prefix)); - cp = mempcpy (cp, ss_data (style->prefix), ss_length (style->prefix)); - { - char *bp = buf2; - while (*bp == ' ') - bp++; - - assert ((number >= 0) ^ (*bp == '-')); - if (number < 0) - bp++; - - memcpy (cp, bp, f.w - (bp - buf2)); - cp += f.w - (bp - buf2); - } - cp = mempcpy (cp, ss_data (style->suffix), ss_length (style->suffix)); - if (number < 0) - cp = mempcpy (cp, ss_data (style->neg_suffix), - ss_length (style->neg_suffix)); - - /* Copy into dst. */ - assert (cp - buf <= fp->w); - if (cp - buf < fp->w) + /* Calculate base measures. */ + int width = r->integer_digits; + if (decimals > 0) + width += decimals + 1; + *integer_digits = r->integer_digits; + *negative = r->negative; + + /* Rounding can cause adjustments. */ + if (should_round_up (r, decimals)) { - memcpy (&dst[fp->w - (cp - buf)], buf, cp - buf); - memset (dst, ' ', fp->w - (cp - buf)); + /* Rounding up leading 9s adds a new digit (a 1). 
*/ + if (r->leading_nines >= width) + { + width++; + ++*integer_digits; + } } else - memcpy (dst, buf, fp->w); - - return 1; + { + /* Rounding down. */ + if (r->leading_zeros >= width) + { + /* All digits that remain after rounding are zeros. + Therefore we drop the negative sign. */ + *negative = false; + if (r->integer_digits == 0 && decimals == 0) + { + /* No digits at all are left. We need to display + at least a single digit (a zero). */ + assert (width == 0); + width++; + *integer_digits = 1; + } + } + } + return width; } -static int -format_and_round (char *dst, double number, const struct fmt_spec *fp, - int decimals); - -/* Tries to format NUMBER into DST as the F format specified in - *FP. Return true if successful, false on failure. */ -static int -try_F (char *dst, const struct fmt_spec *fp, double number) +/* Formats the magnitude represented by R into OUTPUT, rounding + to DECIMALS decimal places. Exactly as many characters as + indicated by rounder_width are written. No terminating null + is appended. */ +static void +rounder_format (const struct rounder *r, int decimals, char *output) { - assert (fp->w <= 40); - if (finite (number)) + int base_width = r->integer_digits + (decimals > 0 ? decimals + 1 : 0); + if (should_round_up (r, decimals)) { - if (fabs (number) < power10[fp->w]) + if (r->leading_nines < base_width) { - /* The value may fit in the field. */ - if (fp->d == 0) + /* Rounding up. This is the common case where rounding + up doesn't add an extra digit. */ + char *p; + memcpy (output, r->string, base_width); + for (p = output + base_width - 1; ; p--) { - /* There are no decimal places, so there's no way - that the value can be shortened. Either it fits - or it doesn't. 
*/ - char buf[41]; - sprintf (buf, "%*.0f", fp->w, number); - if (strlen (buf) <= fp->w) + assert (p >= output); + if (*p == '9') + *p = '0'; + else if (*p >= '0' && *p <= '8') { - buf_copy_str_lpad (dst, fp->w, buf); - return true; + (*p)++; + break; } - else - return false; - } - else - { - /* First try to format it with 2 extra decimal - places. This gives us a good chance of not - needing even more decimal places, but it also - avoids wasting too much time formatting more - decimal places on the first try. */ - int result = format_and_round (dst, number, fp, fp->d + 2); - - if (result >= 0) - return result; - - /* 2 extra decimal places weren't enough to - correctly round. Try again with the maximum - number of places. */ - return format_and_round (dst, number, fp, LDBL_DIG + 1); + else + assert (*p == '.'); } } else { - /* The value is too big to fit in the field. */ - return false; + /* Rounding up leading 9s causes the result to be a 1 + followed by a number of 0s, plus a decimal point. */ + char *p = output; + *p++ = '1'; + p = mempset (p, '0', r->integer_digits); + if (decimals > 0) + { + *p++ = '.'; + p = mempset (p, '0', decimals); + } + assert (p == output + base_width + 1); } } - else - return convert_infinite (dst, fp, number); -} - -/* Tries to compose NUMBER into DST in format FP by first - formatting it with DECIMALS decimal places, then rounding off - to as many decimal places will fit or the number specified in - FP, whichever is fewer. - - Returns 1 if conversion succeeds, 0 if this try at conversion - failed and so will any other tries (because the integer part - of the number is too long), or -1 if this try failed but - another with higher DECIMALS might succeed (because we'd be - able to properly round). */ -static int -format_and_round (char *dst, double number, const struct fmt_spec *fp, - int decimals) -{ - /* Number of characters before the decimal point, - which includes digits and possibly a minus sign. 
*/ - int predot_chars; - - /* Number of digits in the output fraction, - which may be smaller than fp->d if there's not enough room. */ - int fraction_digits; - - /* Points to last digit that will remain in the fraction after - rounding. */ - char *final_frac_dig; - - /* Round up? */ - bool round_up; - - char buf[128]; - - assert (decimals > fp->d); - if (decimals > LDBL_DIG) - decimals = LDBL_DIG + 1; - - sprintf (buf, "%.*f", decimals, number); - - /* Omit integer part if it's 0. */ - if (!memcmp (buf, "0.", 2)) - memmove (buf, buf + 1, strlen (buf)); - else if (!memcmp (buf, "-0.", 3)) - memmove (buf + 1, buf + 2, strlen (buf + 1)); - - predot_chars = strcspn (buf, "."); - if (predot_chars > fp->w) - { - /* Can't possibly fit. */ - return 0; - } - else if (predot_chars == fp->w) - { - /* Exact fit for integer part and sign. */ - memcpy (dst, buf, fp->w); - return 1; - } - else if (predot_chars + 1 == fp->w) - { - /* There's room for the decimal point, but not for any - digits of the fraction. - Right-justify the integer part and sign. */ - dst[0] = ' '; - memcpy (dst + 1, buf, fp->w - 1); - return 1; - } - - /* It looks like we have room for at least one digit of the - fraction. Figure out how many. */ - fraction_digits = fp->w - predot_chars - 1; - if (fraction_digits > fp->d) - fraction_digits = fp->d; - final_frac_dig = buf + predot_chars + fraction_digits; - - /* Decide rounding direction and truncate string. */ - if (final_frac_dig[1] == '5' - && strspn (final_frac_dig + 2, "0") == strlen (final_frac_dig + 2)) + else { - /* Exactly 1/2. */ - if (decimals <= LDBL_DIG) + /* Rounding down. */ + if (r->integer_digits != 0 || decimals != 0) { - /* Don't have enough fractional digits to know which way to - round. We can format with more decimal places, so go - around again. */ - return -1; + /* Common case: just copy the digits. */ + memcpy (output, r->string, base_width); } else { - /* We used up all our fractional digits and still don't - know. Round to even. 
*/ - round_up = (final_frac_dig[0] - '0') % 2 != 0; + /* No digits remain. The output is just a zero. */ + output[0] = '0'; } } - else - round_up = final_frac_dig[1] >= '5'; - final_frac_dig[1] = '\0'; +} + +/* Helper functions. */ - /* Do rounding. */ - if (round_up) +/* Returns 10**X. */ +static double PURE_FUNCTION +power10 (int x) +{ + static const double p[] = { - char *cp = final_frac_dig; - for (;;) - { - if (*cp >= '0' && *cp <= '8') - { - (*cp)++; - break; - } - else if (*cp == '9') - *cp = '0'; - else - assert (*cp == '.'); - - if (cp == buf || *--cp == '-') - { - size_t length; - - /* Tried to go past the leftmost digit. Insert a 1. */ - memmove (cp + 1, cp, strlen (cp) + 1); - *cp = '1'; - - length = strlen (buf); - if (length > fp->w) - { - /* Inserting the `1' overflowed our space. - Drop a decimal place. */ - buf[--length] = '\0'; - - /* If that was the last decimal place, drop the - decimal point too. */ - if (buf[length - 1] == '.') - buf[length - 1] = '\0'; - } - - break; - } - } - } - - /* Omit `-' if value output is zero. */ - if (buf[0] == '-' && buf[strspn (buf, "-.0")] == '\0') - memmove (buf, buf + 1, strlen (buf)); + 1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, + 1e10, 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, + 1e20, 1e21, 1e22, 1e23, 1e24, 1e25, 1e26, 1e27, 1e28, 1e29, + 1e30, 1e31, 1e32, 1e33, 1e34, 1e35, 1e36, 1e37, 1e38, 1e39, + 1e40, + }; + return x >= 0 && x < sizeof p / sizeof *p ? p[x] : pow (10.0, x); +} - buf_copy_str_lpad (dst, fp->w, buf); - return 1; +/* Returns 256**X. */ +static double PURE_FUNCTION +power256 (int x) +{ + static const double p[] = + { + 1.0, + 256.0, + 65536.0, + 16777216.0, + 4294967296.0, + 1099511627776.0, + 281474976710656.0, + 72057594037927936.0, + 18446744073709551616.0 + }; + return x >= 0 && x < sizeof p / sizeof *p ? p[x] : pow (256.0, x); } -/* Formats non-finite NUMBER into DST according to the width - given in FP. 
*/ -static int -convert_infinite (char *dst, const struct fmt_spec *fp, double number) +/* Formats non-finite NUMBER into OUTPUT according to the width + given in FORMAT. */ +static void +output_infinite (double number, const struct fmt_spec *format, char *output) { - assert (!finite (number)); + assert (!isfinite (number)); - if (fp->w >= 3) + if (format->w >= 3) { const char *s; @@ -1237,10 +963,95 @@ convert_infinite (char *dst, const struct fmt_spec *fp, double number) else s = "Unknown"; - buf_copy_str_lpad (dst, fp->w, s); + buf_copy_str_lpad (output, format->w, s); } else - memset (dst, '*', fp->w); + output_overflow (format, output); +} - return true; +/* Formats OUTPUT as a missing value for the given FORMAT. */ +static void +output_missing (const struct fmt_spec *format, char *output) +{ + memset (output, ' ', format->w); + + if (format->type != FMT_N) + { + int dot_ofs = (format->type == FMT_PCT ? 2 + : format->type == FMT_E ? 5 + : 1); + output[MAX (0, format->w - format->d - dot_ofs)] = '.'; + } + else + output[format->w - 1] = '.'; +} + +/* Formats OUTPUT for overflow given FORMAT. */ +static void +output_overflow (const struct fmt_spec *format, char *output) +{ + memset (output, '*', format->w); +} + +/* Converts the integer part of NUMBER to a packed BCD number + with the given number of DIGITS in OUTPUT. If DIGITS is odd, + the least significant nibble of the final byte in OUTPUT is + set to 0. Returns true if successful, false if NUMBER is not + representable. On failure, OUTPUT is cleared to all zero + bytes. */ +static bool +output_bcd_integer (double number, int digits, char *output) +{ + char decimal[64]; + + assert (digits < sizeof decimal); + if (number != SYSMIS + && number >= 0. 
+ && number < power10 (digits) + && sprintf (decimal, "%0*.0f", digits, round (number)) == digits) + { + const char *src = decimal; + int i; + + for (i = 0; i < digits / 2; i++) + { + int d0 = *src++ - '0'; + int d1 = *src++ - '0'; + *output++ = (d0 << 4) + d1; + } + if (digits % 2) + *output = (*src - '0') << 4; + + return true; + } + else + { + memset (output, 0, (digits + 1) / 2); + return false; + } +} + +/* Writes VALUE to OUTPUT as a BYTES-byte binary integer of the + given INTEGER_FORMAT. */ +static void +output_binary_integer (uint64_t value, int bytes, + enum integer_format integer_format, char *output) +{ + integer_put (value, integer_format, output, bytes); +} + +/* Converts the BYTES bytes in DATA to twice as many hexadecimal + digits in OUTPUT. */ +static void +output_hex (const void *data_, size_t bytes, char *output) +{ + const uint8_t *data = data_; + size_t i; + + for (i = 0; i < bytes; i++) + { + static const char hex_digits[] = "0123456789ABCDEF"; + *output++ = hex_digits[data[i] >> 4]; + *output++ = hex_digits[data[i] & 15]; + } } diff --git a/src/data/data-out.h b/src/data/data-out.h new file mode 100644 index 00000000..bf949c8b --- /dev/null +++ b/src/data/data-out.h @@ -0,0 +1,37 @@ +/* PSPP - computes sample statistics. + Copyright (C) 1997-9, 2000, 2006 Free Software Foundation, Inc. + Written by Ben Pfaff . + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details.
+ + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#ifndef DATA_OUT_H +#define DATA_OUT_H 1 + +#include +#include + +struct fmt_spec; +union value; + +void data_out (const union value *, const struct fmt_spec *, char *); + +enum integer_format data_out_get_integer_format (void); +void data_out_set_integer_format (enum integer_format); + +enum float_format data_out_get_float_format (void); +void data_out_set_float_format (enum float_format); + +#endif /* data-out.h */ diff --git a/src/data/format.c b/src/data/format.c index 3ce41ee0..06042d0e 100644 --- a/src/data/format.c +++ b/src/data/format.c @@ -456,7 +456,7 @@ fmt_max_output_decimals (enum fmt_type type, int width) int fmt_step_width (enum fmt_type type) { - return fmt_get_category (type) & FMT_CAT_HEXADECIMAL ? 2 : 1; + return fmt_get_category (type) == FMT_CAT_HEXADECIMAL ? 2 : 1; } /* Returns true if TYPE is used for string fields, @@ -464,7 +464,7 @@ fmt_step_width (enum fmt_type type) bool fmt_is_string (enum fmt_type type) { - return fmt_get_category (type) & FMT_CAT_STRING; + return fmt_get_category (type) == FMT_CAT_STRING; } /* Returns true if TYPE is used for numeric fields, @@ -491,10 +491,19 @@ fmt_get_category (enum fmt_type type) enum fmt_type fmt_input_to_output (enum fmt_type type) { - enum fmt_category category = fmt_get_category (type); - return (category & FMT_CAT_STRING ? FMT_A - : category & (FMT_CAT_BASIC | FMT_CAT_HEXADECIMAL) ? 
FMT_F - : type); + switch (fmt_get_category (type)) + { + case FMT_CAT_STRING: + return FMT_A; + + case FMT_CAT_LEGACY: + case FMT_CAT_BINARY: + case FMT_CAT_HEXADECIMAL: + return FMT_F; + + default: + return type; + } } /* Returns the SPSS format type corresponding to the given PSPP @@ -503,7 +512,7 @@ int fmt_to_io (enum fmt_type type) { return get_fmt_desc (type)->io; -}; +} /* Determines the PSPP format corresponding to the given SPSS format type. If successful, sets *FMT_TYPE to the PSPP format diff --git a/src/data/format.h b/src/data/format.h index 3678632e..919f1c69 100644 --- a/src/data/format.h +++ b/src/data/format.h @@ -163,9 +163,5 @@ enum measure bool measure_is_valid(enum measure m); bool alignment_is_valid(enum alignment a); - -#include - -bool data_out (char *s, const struct fmt_spec *fp, const union value *v); #endif /* format.h */ diff --git a/src/data/value-labels.c b/src/data/value-labels.c index ad7f3b81..96d3dd75 100644 --- a/src/data/value-labels.c +++ b/src/data/value-labels.c @@ -23,6 +23,7 @@ #include +#include #include #include #include @@ -545,7 +546,7 @@ value_to_string (const union value *val, const struct variable *var) if (s == NULL) { static char buf[MAX_STRING + 1]; - data_out (buf, &var->print, val); + data_out (val, &var->print, buf); buf[var->print.w] = '\0'; s = buf; } diff --git a/src/language/data-io/list.q b/src/language/data-io/list.q index b099a270..d8256e32 100644 --- a/src/language/data-io/list.q +++ b/src/language/data-io/list.q @@ -26,6 +26,7 @@ #include "size_max.h" #include #include +#include #include #include #include @@ -666,15 +667,15 @@ list_cases (const struct ccase *c, void *aux UNUSED, const struct dataset *ds UN if (fmt_is_string (v->print.type) || v->fv != -1) { - data_out (ds_put_uninit(&line_buffer, v->print.w), - &v->print, case_data (c, v->fv)); + data_out (case_data (c, v->fv), &v->print, + ds_put_uninit (&line_buffer, v->print.w)); } else { union value case_idx_value; case_idx_value.f = case_idx; - 
data_out (ds_put_uninit(&line_buffer,v->print.w), - &v->print, &case_idx_value); + data_out (&case_idx_value, &v->print, + ds_put_uninit (&line_buffer,v->print.w)); } ds_put_char(&line_buffer, ' '); @@ -702,12 +703,12 @@ list_cases (const struct ccase *c, void *aux UNUSED, const struct dataset *ds UN char buf[256]; if (fmt_is_string (v->print.type) || v->fv != -1) - data_out (buf, &v->print, case_data (c, v->fv)); + data_out (case_data (c, v->fv), &v->print, buf); else { union value case_idx_value; case_idx_value.f = case_idx; - data_out (buf, &v->print, &case_idx_value); + data_out (&case_idx_value, &v->print, buf); } fputs (" ", x->file); diff --git a/src/language/data-io/print.c b/src/language/data-io/print.c index 2bc28a5d..262d5397 100644 --- a/src/language/data-io/print.c +++ b/src/language/data-io/print.c @@ -22,7 +22,7 @@ #include #include -#include +#include #include #include #include @@ -460,7 +460,7 @@ print_trns_proc (void *trns_, struct ccase *c, casenumber case_num UNUSED) const union value *input = case_data (c, spec->var->fv); char *output = ds_put_uninit (&trns->line, spec->format.w); if (!spec->sysmis_as_spaces || input->f != SYSMIS) - data_out (output, &spec->format, input); + data_out (input, &spec->format, output); else memset (output, ' ', spec->format.w); if (spec->add_space) diff --git a/src/language/dictionary/ChangeLog b/src/language/dictionary/ChangeLog index 726c6547..c0b714fc 100644 --- a/src/language/dictionary/ChangeLog +++ b/src/language/dictionary/ChangeLog @@ -1,3 +1,8 @@ +Sat Nov 4 16:04:19 2006 Ben Pfaff + + * numeric.c: (cmd_string) Check that output format is valid. + Simplify parsing. 
+ Wed Nov 1 20:50:54 2006 Ben Pfaff * sys-file-info.c: (cmd_display) Use compare_var_ptr_names to diff --git a/src/language/dictionary/numeric.c b/src/language/dictionary/numeric.c index b7a23315..ae961963 100644 --- a/src/language/dictionary/numeric.c +++ b/src/language/dictionary/numeric.c @@ -128,7 +128,9 @@ cmd_string (struct dataset *ds) if (!parse_DATA_LIST_vars (&v, &nv, PV_NONE)) return CMD_FAILURE; - if (!lex_force_match ('(') || !parse_format_specifier (&f)) + if (!lex_force_match ('(') + || !parse_format_specifier (&f) + || !lex_force_match (')')) goto fail; if (!fmt_is_string (f.type)) { @@ -137,12 +139,8 @@ cmd_string (struct dataset *ds) "variable."), fmt_to_string (&f, str)); goto fail; } - - if (!lex_match (')')) - { - msg (SE, _("`)' expected after output format.")); - goto fail; - } + if (!fmt_check_output (&f)) + goto fail; width = fmt_var_width (&f); diff --git a/src/language/dictionary/split-file.c b/src/language/dictionary/split-file.c index ef34d7e9..8c5a9555 100644 --- a/src/language/dictionary/split-file.c +++ b/src/language/dictionary/split-file.c @@ -22,6 +22,7 @@ #include #include +#include #include #include #include @@ -95,7 +96,7 @@ output_split_file_values (const struct dataset *ds, const struct ccase *c) assert (v->type == NUMERIC || v->type == ALPHA); tab_text (t, 0, i + 1, TAB_LEFT | TAT_PRINTF, "%s", v->name); - data_out (temp_buf, &v->print, case_data (c, v->fv)); + data_out (case_data (c, v->fv), &v->print, temp_buf); temp_buf[v->print.w] = 0; tab_text (t, 1, i + 1, TAT_PRINTF, "%.*s", v->print.w, temp_buf); diff --git a/src/language/expressions/helpers.h b/src/language/expressions/helpers.h index 0f44b35f..124868cc 100644 --- a/src/language/expressions/helpers.h +++ b/src/language/expressions/helpers.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include diff --git a/src/language/expressions/operations.def b/src/language/expressions/operations.def index 303c1d5c..6b4d5221 100644 --- 
a/src/language/expressions/operations.def +++ b/src/language/expressions/operations.def @@ -587,7 +587,7 @@ absorb_miss string function STRING (x, no_format f) v.f = x; dst = alloc_string (e, f->w); assert (!fmt_is_string (f->type)); - data_out (dst.string, f, &v); + data_out (&v, f, dst.string); return dst; } diff --git a/src/language/lexer/format-parser.h b/src/language/lexer/format-parser.h index 046f57ab..d16b2f5b 100644 --- a/src/language/lexer/format-parser.h +++ b/src/language/lexer/format-parser.h @@ -24,8 +24,6 @@ #include -struct fmt_spec; - bool parse_abstract_format_specifier (char type[FMT_TYPE_LEN_MAX + 1], int *width, int *decimals); bool parse_format_specifier (struct fmt_spec *); diff --git a/src/language/stats/crosstabs.q b/src/language/stats/crosstabs.q index cf76b35b..67ee5aab 100644 --- a/src/language/stats/crosstabs.q +++ b/src/language/stats/crosstabs.q @@ -37,6 +37,7 @@ #include #include +#include #include #include #include @@ -1718,7 +1719,7 @@ format_cell_entry (struct tab_table *table, int c, int r, double value, s.length = 10; s.string = tab_alloc (table, 16); v.f = value; - data_out (s.string, &f, &v); + data_out (&v, &f, s.string); while (*s.string == ' ') { s.length--; @@ -3200,8 +3201,8 @@ format_short (char *s, const struct fmt_spec *fp, const union value *v) } /* Format. */ - data_out (s, fp, v); - + data_out (v, fp, s); + /* Null terminate. */ s[fp->w] = '\0'; } diff --git a/src/language/utilities/ChangeLog b/src/language/utilities/ChangeLog index bcae1ebd..e05deb42 100644 --- a/src/language/utilities/ChangeLog +++ b/src/language/utilities/ChangeLog @@ -1,3 +1,16 @@ +Sat Nov 4 16:05:47 2006 Ben Pfaff + + * set.q: Add WIB, WRB settings to control binary formats used by + data_out. + (cmd_set) Implement SET WIB, WRB. + (stc_to_integer_format) New function. + (stc_to_float_format) New function. + (show_integer_format) New function. + (show_float_format) New function. + (show_wib) New function. + (show_wrb) New function. 
+ (static var show_table[]) Add SHOW WIB, WRB. + Sat Nov 4 11:48:23 2006 Ben Pfaff * set.q: Update ERRORS, MESSAGES, RESULTS command syntax. diff --git a/src/language/utilities/set.q b/src/language/utilities/set.q index d55511e0..a05878a3 100644 --- a/src/language/utilities/set.q +++ b/src/language/utilities/set.q @@ -24,6 +24,7 @@ #include #include +#include #include #include #include @@ -36,6 +37,8 @@ #include #include #include +#include +#include #include #include #include @@ -106,6 +109,8 @@ int tgetnum (const char *); tb1=string "x==3 || x==11" "3 or 11 characters long"; tbfonts=string; undefined=undef:warn/nowarn; + wib=wib:msbfirst/lsbfirst/vax/native; + wrb=wrb:native/isl/isb/idl/idb/vf/vd/vg/zs/zl; width=custom; workspace=integer "x>=1024" "%s must be at least 1 MB"; xsort=xsort:yes/no. @@ -118,6 +123,8 @@ int tgetnum (const char *); /* (functions) */ static bool do_cc (const char *cc_string, enum fmt_type); +static enum integer_format stc_to_integer_format (int stc); +static enum float_format stc_to_float_format (int stc); int cmd_set (struct dataset *ds) @@ -173,6 +180,10 @@ cmd_set (struct dataset *ds) set_scompression (cmd.scompress == STC_ON); if (cmd.sbc_undefined) set_undefined (cmd.undef == STC_WARN); + if (cmd.sbc_wib) + data_out_set_integer_format (stc_to_integer_format (cmd.wib)); + if (cmd.sbc_wrb) + data_out_set_float_format (stc_to_float_format (cmd.wrb)); if (cmd.sbc_workspace) set_workspace (cmd.n_workspace[0] * 1024L); @@ -204,6 +215,52 @@ cmd_set (struct dataset *ds) return CMD_SUCCESS; } +/* Returns the integer_format value corresponding to STC, + which should be the value of cmd.rib or cmd.wib. */ +static enum integer_format +stc_to_integer_format (int stc) +{ + return (stc == STC_MSBFIRST ? INTEGER_MSB_FIRST + : stc == STC_LSBFIRST ? INTEGER_LSB_FIRST + : stc == STC_VAX ? INTEGER_VAX + : INTEGER_NATIVE); +} + +/* Returns the float_format value corresponding to STC, + which should be the value of cmd.rrb or cmd.wrb. 
*/ +static enum float_format +stc_to_float_format (int stc) +{ + switch (stc) + { + case STC_NATIVE: + return FLOAT_NATIVE_DOUBLE; + + case STC_ISL: + return FLOAT_IEEE_SINGLE_LE; + case STC_ISB: + return FLOAT_IEEE_SINGLE_BE; + case STC_IDL: + return FLOAT_IEEE_DOUBLE_LE; + case STC_IDB: + return FLOAT_IEEE_DOUBLE_BE; + + case STC_VF: + return FLOAT_VAX_F; + case STC_VD: + return FLOAT_VAX_D; + case STC_VG: + return FLOAT_VAX_G; + + case STC_ZS: + return FLOAT_Z_SHORT; + case STC_ZL: + return FLOAT_Z_LONG; + } + + NOT_REACHED (); +} + /* Find the grouping characters in CC_STRING and set CC's grouping and decimal members appropriately. Returns true if successful, false otherwise. */ @@ -604,6 +661,66 @@ show_mxwarns (const struct dataset *ds UNUSED) msg (SN, _("MXWARNS is %d."), get_mxwarns ()); } +/* Outputs that SETTING has the given INTEGER_FORMAT value. */ +static void +show_integer_format (const char *setting, enum integer_format integer_format) +{ + msg (SN, _("%s is %s (%s)."), + setting, + (integer_format == INTEGER_MSB_FIRST ? "MSBFIRST" + : integer_format == INTEGER_LSB_FIRST ? "LSBFIRST" + : "VAX"), + integer_format == INTEGER_NATIVE ? "NATIVE" : "nonnative"); +} + +/* Outputs that SETTING has the given FLOAT_FORMAT value. 
*/ +static void +show_float_format (const char *setting, enum float_format float_format) +{ + const char *format_name = ""; + + switch (float_format) + { + case FLOAT_IEEE_SINGLE_LE: + format_name = "ISL (32-bit IEEE 754 single, little-endian)"; + break; + case FLOAT_IEEE_SINGLE_BE: + format_name = "ISB (32-bit IEEE 754 single, big-endian)"; + break; + case FLOAT_IEEE_DOUBLE_LE: + format_name = "IDL (64-bit IEEE 754 double, little-endian)"; + break; + case FLOAT_IEEE_DOUBLE_BE: + format_name = "IDB (64-bit IEEE 754 double, big-endian)"; + break; + + case FLOAT_VAX_F: + format_name = "VF (32-bit VAX F, VAX-endian)"; + break; + case FLOAT_VAX_D: + format_name = "VD (64-bit VAX D, VAX-endian)"; + break; + case FLOAT_VAX_G: + format_name = "VG (64-bit VAX G, VAX-endian)"; + break; + + case FLOAT_Z_SHORT: + format_name = "ZS (32-bit IBM Z hexadecimal short, big-endian)"; + break; + case FLOAT_Z_LONG: + format_name = "ZL (64-bit IBM Z hexadecimal long, big-endian)"; + break; + + case FLOAT_FP: + case FLOAT_HEX: + NOT_REACHED (); + } + + msg (SN, _("%s is %s (%s)."), + setting, format_name, + float_format == FLOAT_NATIVE_DOUBLE ? 
"NATIVE" : "nonnative"); +} + static void show_scompression (const struct dataset *ds UNUSED) { @@ -632,6 +749,18 @@ show_weight (const struct dataset *ds) msg (SN, _("WEIGHT is variable %s."), var->name); } +static void +show_wib (const struct dataset *ds UNUSED) +{ + show_integer_format ("WIB", data_out_get_integer_format ()); +} + +static void +show_wrb (const struct dataset *ds UNUSED) +{ + show_float_format ("WRB", data_out_get_float_format ()); +} + static void show_width (const struct dataset *ds UNUSED) { @@ -663,6 +792,8 @@ const struct show_sbc show_table[] = {"SCOMPRESSION", show_scompression}, {"UNDEFINED", show_undefined}, {"WEIGHT", show_weight}, + {"WIB", show_wib}, + {"WRB", show_wrb}, {"WIDTH", show_width}, }; diff --git a/src/libpspp/str.c b/src/libpspp/str.c index cf0069dd..965e3b30 100644 --- a/src/libpspp/str.c +++ b/src/libpspp/str.c @@ -158,6 +158,22 @@ buf_copy_str_lpad (char *dst, size_t dst_size, const char *src) } } +/* Copies buffer SRC, of SRC_SIZE bytes, to DST, of DST_SIZE bytes. + DST is truncated to DST_SIZE bytes or padded on the left with + spaces as needed. */ +void +buf_copy_lpad (char *dst, size_t dst_size, + const char *src, size_t src_size) +{ + if (src_size >= dst_size) + memmove (dst, src, dst_size); + else + { + memset (dst, ' ', dst_size - src_size); + memmove (&dst[dst_size - src_size], src, src_size); + } +} + /* Copies buffer SRC, of SRC_SIZE bytes, to DST, of DST_SIZE bytes. DST is truncated to DST_SIZE bytes or padded on the right with spaces as needed. */ @@ -253,6 +269,15 @@ spprintf (char *dst, const char *format, ...) return dst + count; } + +/* Sets the SIZE bytes starting at BLOCK to C, + and returns the byte following BLOCK. */ +void * +mempset (void *block, int c, size_t size) +{ + memset (block, c, size); + return (char *) block + size; +} /* Substrings. 
*/ diff --git a/src/libpspp/str.h b/src/libpspp/str.h index d8698ba4..eaf12c3b 100644 --- a/src/libpspp/str.h +++ b/src/libpspp/str.h @@ -44,6 +44,7 @@ void buf_reverse (char *, size_t); char *buf_find_reverse (const char *, size_t, const char *, size_t); int buf_compare_case (const char *, const char *, size_t); int buf_compare_rpad (const char *, size_t, const char *, size_t); +void buf_copy_lpad (char *, size_t, const char *, size_t); void buf_copy_rpad (char *, size_t, const char *, size_t); void buf_copy_str_lpad (char *, size_t, const char *); void buf_copy_str_rpad (char *, size_t, const char *); @@ -56,6 +57,8 @@ void str_uppercase (char *); void str_lowercase (char *); char *spprintf (char *dst, const char *format, ...); + +void *mempset (void *, int, size_t); /* Common character classes for use with substring and string functions. */ diff --git a/src/output/table.c b/src/output/table.c index 029fd5d0..ca3e467c 100644 --- a/src/output/table.c +++ b/src/output/table.c @@ -29,6 +29,7 @@ #include "output.h" #include "manager.h" +#include #include #include #include @@ -541,7 +542,7 @@ tab_value (struct tab_table *table, int c, int r, unsigned char opt, table->cc[c + r * table->cf] = ss_buffer (contents, f->w); table->ct[c + r * table->cf] = opt; - data_out (contents, f, v); + data_out (v, f, contents); } /* Sets cell (C,R) in TABLE, with options OPT, to have value VAL @@ -580,7 +581,7 @@ tab_float (struct tab_table *table, int c, int r, unsigned char opt, #endif double_value.f = val; - data_out (buf, &f, &double_value); + data_out (&double_value, &f, buf); cp = buf; while (isspace ((unsigned char) *cp) && cp < &buf[w]) diff --git a/src/ui/gui/helper.c b/src/ui/gui/helper.c index 841bda5a..dea14d54 100644 --- a/src/ui/gui/helper.c +++ b/src/ui/gui/helper.c @@ -1,5 +1,6 @@ #include "helper.h" #include +#include #include #include @@ -16,10 +17,7 @@ value_to_text(union value v, struct fmt_spec format) gchar *s = 0; s = g_new(gchar, format.w + 1); - if ( ! 
data_out(s, &format, &v) ) - { - g_warning("Can't format missing discrete value \n"); - } + data_out(&v, &format, s); s[format.w]='\0'; g_strchug(s); diff --git a/src/ui/gui/psppire-data-store.c b/src/ui/gui/psppire-data-store.c index eeaf31d6..0cb8ef91 100644 --- a/src/ui/gui/psppire-data-store.c +++ b/src/ui/gui/psppire-data-store.c @@ -28,6 +28,7 @@ #include #include +#include #include #include @@ -506,7 +507,7 @@ psppire_data_store_get_string (const GSheetModel *model, gint row, gint column) /* Converts binary value V into printable form in the exactly FP->W character in buffer S according to format specification FP. No null terminator is appended to the buffer. */ - data_out (s->str, fp, v); + data_out (v, fp, s->str); text = pspp_locale_to_utf8 (s->str, fp->w, 0); g_string_free (s, TRUE); diff --git a/tests/ChangeLog b/tests/ChangeLog index 8cc076d6..790a3316 100644 --- a/tests/ChangeLog +++ b/tests/ChangeLog @@ -1,3 +1,26 @@ +Sat Nov 4 16:08:58 2006 Ben Pfaff + + * automake.mk: Add binhex-out.sh, date-out.sh, month-out.sh, + num-out.sh, time-out.sh, wkday-out.sh from formats directory. Add + formats/inexactify as a program needed by tests. + + * command/no_case_size.sh: Update output to conform with updated + formatted output code. + + * expressions/expressions.sh: Ditto. + + * formats/binhex-out.sh: New test. + + * formats/date-out.sh: New test. + + * formats/month-out.sh: New test. + + * formats/num-out.sh: New test. + + * formats/time-out.sh: New test. + + * formats/wkday-out.sh: New test. + Thu Oct 26 20:20:39 2006 Ben Pfaff * automake.mk: Add tests/formats/float-format.sh. 
diff --git a/tests/automake.mk b/tests/automake.mk index 2527c836..dde7b09c 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -57,7 +57,13 @@ TESTS = \ tests/command/use.sh \ tests/command/very-long-strings.sh \ tests/command/weight.sh \ + tests/formats/binhex-out.sh \ + tests/formats/date-out.sh \ tests/formats/float-format.sh \ + tests/formats/month-out.sh \ + tests/formats/num-out.sh \ + tests/formats/time-out.sh \ + tests/formats/wkday-out.sh \ tests/bugs/agg_crash.sh \ tests/bugs/agg-crash-2.sh \ tests/bugs/alpha-freq.sh \ @@ -112,7 +118,8 @@ TESTS = \ tests/libpspp/ll-test \ tests/libpspp/llx-test -check_PROGRAMS += tests/libpspp/ll-test tests/libpspp/llx-test +check_PROGRAMS += tests/libpspp/ll-test tests/libpspp/llx-test \ + tests/formats/inexactify tests_libpspp_ll_test_SOURCES = \ src/libpspp/ll.c \ @@ -126,6 +133,8 @@ tests_libpspp_llx_test_SOURCES = \ src/libpspp/llx.h \ tests/libpspp/llx-test.c +tests_formats_inexactify_SOURCES = tests/formats/inexactify.c + EXTRA_DIST += $(TESTS) tests/weighting.data tests/data-list.data tests/list.data \ tests/no_case_size.sav \ tests/coverage.sh tests/test_template \ diff --git a/tests/command/no_case_size.sh b/tests/command/no_case_size.sh index 25744bdc..3d6f2867 100755 --- a/tests/command/no_case_size.sh +++ b/tests/command/no_case_size.sh @@ -92,12 +92,12 @@ diff -b -w pspp.list - < "123.6" string($sysmis, f5.1) => " . " string("abc", A5) => error string(123, e1) => error # E has a minimum width of 6 on output. 
-string(123, e6.0) => " 1E+02" +string(123, e6.0) => "1E+002" substr('abcdefgh', -5) => "" substr('abcdefgh', 0) => "" diff --git a/tests/formats/binhex-out.expected.gz b/tests/formats/binhex-out.expected.gz new file mode 100644 index 00000000..f8fc6e98 Binary files /dev/null and b/tests/formats/binhex-out.expected.gz differ diff --git a/tests/formats/binhex-out.sh b/tests/formats/binhex-out.sh new file mode 100755 index 00000000..caffb03b --- /dev/null +++ b/tests/formats/binhex-out.sh @@ -0,0 +1,168 @@ +#! /bin/sh + +TEMPDIR=/tmp/pspp-tst-$$ +mkdir -p $TEMPDIR +trap 'cd /; rm -rf $TEMPDIR' 0 + +# ensure that top_builddir are absolute +if [ -z "$top_builddir" ] ; then top_builddir=. ; fi +if [ -z "$top_srcdir" ] ; then top_srcdir=. ; fi +top_builddir=`cd $top_builddir; pwd` +PSPP=$top_builddir/src/ui/terminal/pspp + +# ensure that top_srcdir is absolute +top_srcdir=`cd $top_srcdir; pwd` + +STAT_CONFIG_PATH=$top_srcdir/config +export STAT_CONFIG_PATH + +fail() +{ + echo $activity + echo FAILED + exit 1; +} + + +no_result() +{ + echo $activity + echo NO RESULT; + exit 2; +} + +pass() +{ + exit 0; +} + +cd $TEMPDIR + +activity="write pspp syntax" +cat > binhex-out.pspp < expected.out +if [ $? -ne 0 ] ; then no_result ; fi + +activity="compare output" +cmp expected.out binhex.out +if [ $? -ne 0 ] ; then fail ; fi + +pass diff --git a/tests/formats/date-out.sh b/tests/formats/date-out.sh new file mode 100755 index 00000000..6e28083c --- /dev/null +++ b/tests/formats/date-out.sh @@ -0,0 +1,578 @@ +#! /bin/sh + +TEMPDIR=/tmp/pspp-tst-$$ +mkdir -p $TEMPDIR +trap 'cd /; rm -rf $TEMPDIR' 0 + +# ensure that top_builddir are absolute +if [ -z "$top_builddir" ] ; then top_builddir=. ; fi +if [ -z "$top_srcdir" ] ; then top_srcdir=. 
; fi +top_builddir=`cd $top_builddir; pwd` +PSPP=$top_builddir/src/ui/terminal/pspp + +# ensure that top_srcdir is absolute +top_srcdir=`cd $top_srcdir; pwd` + +STAT_CONFIG_PATH=$top_srcdir/config +export STAT_CONFIG_PATH + +fail() +{ + echo $activity + echo FAILED + exit 1; +} + + +no_result() +{ + echo $activity + echo NO RESULT; + exit 2; +} + +pass() +{ + exit 0; +} + +cd $TEMPDIR + +activity="write pspp syntax" +cat > date-out.pspp < bad-date-out.pspp <. + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA + 02110-1301, USA. */ + +#include +#include +#include +#include +#include + +/* Replaces insignificant digits by # to facilitate textual + comparisons. Not a perfect solution to the general-purpose + comparison problem, because rounding that affects earlier + digits can still cause differences. 
*/ +int +main (void) +{ + bool in_quotes = false; + bool in_exponent = false; + int digits = 0; + + for (;;) + { + int c = getchar (); + if (c == EOF) + break; + else if (c == '\n') + in_quotes = false; + else if (c == '"') + { + in_quotes = !in_quotes; + in_exponent = false; + digits = 0; + } + else if (in_quotes && !in_exponent) + { + if (strchr ("+dDeE", c) != NULL || (c == '-' && digits)) + in_exponent = true; + else if (strchr ("0123456789}JKLMNOPQR", c) != NULL) + { + if (digits || c >= '1') + digits++; + if (digits > 13) + c = isdigit (c) ? '#' : '@'; + } + } + putchar (c); + } + return EXIT_SUCCESS; +} diff --git a/tests/formats/month-out.sh b/tests/formats/month-out.sh new file mode 100755 index 00000000..65d0a42e --- /dev/null +++ b/tests/formats/month-out.sh @@ -0,0 +1,996 @@ +#! /bin/sh + +TEMPDIR=/tmp/pspp-tst-$$ +mkdir -p $TEMPDIR +trap 'cd /; rm -rf $TEMPDIR' 0 + +# ensure that top_builddir are absolute +if [ -z "$top_builddir" ] ; then top_builddir=. ; fi +if [ -z "$top_srcdir" ] ; then top_srcdir=. 
; fi +top_builddir=`cd $top_builddir; pwd` +PSPP=$top_builddir/src/ui/terminal/pspp + +# ensure that top_srcdir is absolute +top_srcdir=`cd $top_srcdir; pwd` + +STAT_CONFIG_PATH=$top_srcdir/config +export STAT_CONFIG_PATH + +fail() +{ + echo $activity + echo FAILED + exit 1; +} + + +no_result() +{ + echo $activity + echo NO RESULT; + exit 2; +} + +pass() +{ + exit 0; +} + +cd $TEMPDIR + +activity="write pspp syntax" +cat > month-out.pspp <) { + s/^ //; + if (scalar (my (@line) = /^([A-Z]+)(\d+)([^"]+")( *)([^%"]*)(%?")$/) == 6) { + if (defined ($prev[0]) + && $line[0] eq $prev[0] + && $line[1] == $prev[1] + 1 + && $line[2] eq $prev[2] + && $line[5] eq $prev[5]) { + if ($line[3] eq " $prev[3]" + && $line[4] eq $prev[4]) { + flush_prefix (); + flush_suffix (); + $n++; + } elsif ($line[3] eq $prev[3] + && length ($line[4]) == length ($prev[4]) + 1 + && $prev[4] eq substr ($line[4], 0, length ($line[4]) - 1)) { + flush_n (); + flush_prefix (); + $suffix .= substr ($line[4], -1); + } elsif ($line[3] eq $prev[3] + && $prev[4] eq substr ($line[4], 1)) { + flush_n (); + flush_suffix (); + $prefix .= substr ($line[4], 0, 1); + } else { + flush (); + print $_; + } + } else { + flush (); + print $_; + } + @prev = @line; + } else { + flush (); + print $_; + @prev = (); + } +} +flush (); + +sub flush_suffix { + if ($suffix ne '') { + print "\$$suffix\n"; + $suffix = ''; + } +} + +sub flush_prefix { + if ($prefix ne '') { + print "^$prefix\n"; + $prefix = ''; + } +} + +sub flush_n { + if ($n) { + print "*$n\n"; + $n = 0; + } +} + +sub flush { + flush_prefix (); + flush_suffix (); + flush_n (); +} diff --git a/tests/formats/num-out-compare.pl b/tests/formats/num-out-compare.pl new file mode 100644 index 00000000..6bf0cb79 --- /dev/null +++ b/tests/formats/num-out-compare.pl @@ -0,0 +1,122 @@ +#! /usr/bin/perl -w + +use strict; +use Getopt::Long; + +my $exact = 0; +my $spss = 0; +my $verbose = 0; +Getopt::Long::Configure ("bundling"); +GetOptions ("e|exact!" 
 => \$exact, + "s|spss!" => \$spss, + "v|verbose+" => \$verbose, + "h|help" => sub { usage (0) }) + or usage (1); + +sub usage { + print "$0: compare expected and actual numeric formatting output\n"; + print "usage: $0 [OPTION...] EXPECTED ACTUAL\n"; + print "where EXPECTED is the file containing expected output\n"; + print "and ACTUAL is the file containing actual output.\n"; + print "Options:\n"; + print " -e, --exact: Require numbers to be exactly equal.\n"; + print " (By default, small differences are permitted.)\n"; + print " -s, --spss: Ignore most SPSS formatting bugs in EXPECTED.\n"; + print " (A few differences are not compensated)\n"; + print " -v, --verbose: Use once to summarize errors and differences.\n"; + print " Use twice for details of differences.\n"; + exit (@_); +} + +open (EXPECTED, '<', $ARGV[0]) or die "$ARGV[0]: open: $!\n"; +open (ACTUAL, '<', $ARGV[1]) or die "$ARGV[1]: open: $!\n"; +my ($expr); +my ($bad_round) = 0; +my ($approximate) = 0; +my ($spss_wtf1) = 0; +my ($spss_wtf2) = 0; +my ($lost_sign) = 0; +my ($errors) = 0; +while (defined (my $a = <EXPECTED>) && defined (my $b = <ACTUAL>)) { + chomp $a; + chomp $b; + if ($a eq $b) { + if ($a !~ /^\s*$/ && $a !~ /:/) { + $expr = $a; + $expr =~ s/\s*$//; + $expr =~ s/^\s*//; + } + } else { + my ($fmt, $a_out) = $a =~ /^ (.*): "(.*)"$/ or die; + my ($b_fmt, $b_out) = $b =~ /^ (.*): "(.*)"$/ or die; + die if $fmt ne $b_fmt; + die if $a_out eq $b_out; + + if (!$exact) { + if (increment ($a_out) eq $b_out || increment ($b_out) eq $a_out) { + $approximate++; + next; + } + } + if ($spss) { + if ($a_out =~ /0.*0/ && $a_out !~ /[1-9]/) { + $bad_round++; + next; + } elsif ($a_out =~ /\*/ && $a_out !~ /^\*+$/) { + $spss_wtf1++; + next; + } elsif ($expr =~ /^-/ + && $a_out =~ /^\*+$/ + && $b_out =~ /-\d(\.\d*#*)?E[-+]\d\d\d/ + && $fmt =~ /^E/) { + $spss_wtf2++; + next; + } elsif ($expr =~ /^-/ + && (($a_out !~ /-/ && $a_out =~ /[1-9]/ && $b_out =~ /-/) + || ($a_out =~ /^[0-9]+$/ && $b_out =~ /^\*+$/))) { + $lost_sign++; 
+ next; + } + } + print "$.: $expr in $fmt: expected \"$a_out\", got \"$b_out\"\n" + if $verbose > 1; + $errors++; + } +} +if ($verbose) { + print "$errors errors\n"; + if (!$exact) { + print "$approximate approximate matches\n"; + } + if ($spss) { + print "$bad_round bad rounds\n"; + print "$spss_wtf1 SPSS WTF 1\n"; + print "$spss_wtf2 SPSS WTF 2\n"; + print "$lost_sign lost signs\n"; + } +} +exit ($errors > 0); + +# Returns the argument value incremented by one unit in its final +# decimal place. +sub increment { + local ($_) = @_; + my ($last_digit, $i); + for ($i = 0; $i < length $_; $i++) { + my ($c) = substr ($_, $i, 1); + last if ($c eq 'E'); + $last_digit = $i if $c =~ /[0-9]/; + } + return $_ if !defined $last_digit; + for ($i = $last_digit; $i >= 0; $i--) { + my ($c) = substr ($_, $i, 1); + if ($c eq '9') { + substr ($_, $i, 1) = '0'; + } elsif ($c =~ /[0-8]/) { + substr ($_, $i, 1) = chr (ord ($c) + 1); + last; + } + } + $_ = "1$_" if $i < 0; + return $_; +} diff --git a/tests/formats/num-out-decmp.pl b/tests/formats/num-out-decmp.pl new file mode 100644 index 00000000..3c066fee --- /dev/null +++ b/tests/formats/num-out-decmp.pl @@ -0,0 +1,28 @@ +use warnings; +use strict; + +my (@line); +while (<>) { + if (my ($n) = /^\*(\d+)$/) { + for (1...$n) { + $line[1]++; + $line[3] = " $line[3]"; + print ' ', join ('', @line), "\n"; + } + } elsif (my ($suffix) = /^\$(.*)$/) { + for my $c (split ('', $suffix)) { + $line[1]++; + $line[4] .= $c; + print ' ', join ('', @line), "\n"; + } + } elsif (my ($prefix) = /^\^(.*)$/) { + for my $c (split ('', $prefix)) { + $line[1]++; + $line[4] = "$c$line[4]"; + print ' ', join ('', @line), "\n"; + } + } else { + @line = /^([A-Z]+)(\d+)([^"]+")( *)([^%"]*)(%?")$/; + print " $_"; + } +} diff --git a/tests/formats/num-out.expected.cmp.gz b/tests/formats/num-out.expected.cmp.gz new file mode 100644 index 00000000..f1f55b15 Binary files /dev/null and b/tests/formats/num-out.expected.cmp.gz differ diff --git 
a/tests/formats/num-out.pl b/tests/formats/num-out.pl new file mode 100644 index 00000000..1bb602b5 --- /dev/null +++ b/tests/formats/num-out.pl @@ -0,0 +1,41 @@ +use warnings; +use strict; + +my @values = qw(0 2 9.5 27 271 999.95 2718 9999.995 27182 271828 +2718281 2**39 2**333 2**-21 -2 -9.5 -27 -271 -999.95 -2718 -9999.995 +-27182 -271828 -2718281 -2**39 -2**333 -2**-21 -0 3.125 31.25 314.125 +3141.5 31415.875 314159.25 3141592.625 31415926.5 271828182.25 +3214567890.5 31415926535.875 -3.125 -31.375 -314.125 -3141.5 +-31415.875 -314159.25 -3141592.625 -31415926.5 -271828182.25 +-3214567890.5 -31415926535.875); + +print "SET CCA=',,,'.\n"; +print "SET CCB='-,[[[,]]],-'.\n"; +print "SET CCC='((,[,],))'.\n"; +print "SET CCD=',XXX,,-'.\n"; +print "SET CCE=',,YYY,-'.\n"; +print "INPUT PROGRAM.\n"; +print "STRING EXPR(A16).\n"; +print map ("COMPUTE NUM=$_.\nCOMPUTE EXPR='$_'.\nEND CASE.\n", @values); +print "END FILE.\n"; +print "END INPUT PROGRAM.\n"; + +print "PRINT OUTFILE='output.txt'/EXPR.\n"; +for my $format qw (F COMMA DOT DOLLAR PCT E CCA CCB CCC CCD CCE N Z) { + for my $d (0...16) { + my ($min_w); + if ($format ne 'E') { + $min_w = $d + 1; + $min_w++ if $format eq 'DOLLAR' || $format eq 'PCT'; + $min_w = 2 if $min_w == 1 && ($format =~ /^CC/); + } else { + $min_w = $d + 7; + } + for my $w ($min_w...40) { + my ($f) = "$format$w.$d"; + print "PRINT OUTFILE='output.txt'/'$f: \"' NUM($f) '\"'.\n"; + } + } + print "PRINT SPACE OUTFILE='output.txt'.\n"; +} +print "EXECUTE.\n"; diff --git a/tests/formats/num-out.sh b/tests/formats/num-out.sh new file mode 100755 index 00000000..37b0d2eb --- /dev/null +++ b/tests/formats/num-out.sh @@ -0,0 +1,78 @@ +#! /bin/sh + +TEMPDIR=/tmp/pspp-tst-$$ +mkdir -p $TEMPDIR +trap 'cd /; rm -rf $TEMPDIR' 0 + +# ensure that top_builddir are absolute +if [ -z "$top_builddir" ] ; then top_builddir=. ; fi +if [ -z "$top_srcdir" ] ; then top_srcdir=. 
; fi +top_builddir=`cd $top_builddir; pwd` +PSPP=$top_builddir/src/ui/terminal/pspp + +# ensure that top_srcdir is absolute +top_srcdir=`cd $top_srcdir; pwd` + +STAT_CONFIG_PATH=$top_srcdir/config +export STAT_CONFIG_PATH + +fail() +{ + echo $activity + echo FAILED + exit 1; +} + + +no_result() +{ + echo $activity + echo NO RESULT; + exit 2; +} + +pass() +{ + exit 0; +} + +cd $TEMPDIR + +activity="generate pspp syntax" +$PERL $top_srcdir/tests/formats/num-out.pl > num-out.pspp +if [ $? -ne 0 ] ; then no_result ; fi +echo -n . + +activity="run program" +$SUPERVISOR $PSPP --testing-mode num-out.pspp +if [ $? -ne 0 ] ; then no_result ; fi +echo -n . + +activity="inexactify results" +$top_builddir/tests/formats/inexactify < output.txt > output.inexact +if [ $? -ne 0 ] ; then no_result ; fi +echo -n . + +activity="gunzip expected results" +gzip -cd < $top_srcdir/tests/formats/num-out.expected.cmp.gz > expected.txt.cmp +if [ $? -ne 0 ] ; then no_result ; fi +echo -n . + +activity="decompress expected results" +$PERL $top_srcdir/tests/formats/num-out-decmp.pl < expected.txt.cmp > expected.txt +if [ $? -ne 0 ] ; then no_result ; fi +echo -n . + +activity="inexactify expected results" +$top_builddir/tests/formats/inexactify < expected.txt > expected.inexact +if [ $? -ne 0 ] ; then no_result ; fi +echo -n . + +activity="compare output" +$PERL $top_srcdir/tests/formats/num-out-compare.pl \ + $PSPP_NUM_OUT_COMPARE_FLAGS expected.inexact output.inexact +if [ $? -ne 0 ] ; then fail ; fi + +echo . + +pass diff --git a/tests/formats/time-out.sh b/tests/formats/time-out.sh new file mode 100755 index 00000000..c110b966 --- /dev/null +++ b/tests/formats/time-out.sh @@ -0,0 +1,13124 @@ +#! /bin/sh + +TEMPDIR=/tmp/pspp-tst-$$ +mkdir -p $TEMPDIR +trap 'cd /; rm -rf $TEMPDIR' 0 + +# ensure that top_builddir are absolute +if [ -z "$top_builddir" ] ; then top_builddir=. ; fi +if [ -z "$top_srcdir" ] ; then top_srcdir=. 
; fi +top_builddir=`cd $top_builddir; pwd` +PSPP=$top_builddir/src/ui/terminal/pspp + +# ensure that top_srcdir is absolute +top_srcdir=`cd $top_srcdir; pwd` + +STAT_CONFIG_PATH=$top_srcdir/config +export STAT_CONFIG_PATH + +fail() +{ + echo $activity + echo FAILED + exit 1; +} + + +no_result() +{ + echo $activity + echo NO RESULT; + exit 2; +} + +pass() +{ + exit 0; +} + +cd $TEMPDIR + +activity="write pspp syntax" +cat > time-out.pspp < wkday-out.pspp <