From: Ben Pfaff
Date: Wed, 16 Jun 2010 03:41:12 +0000 (-0700)
Subject: Do not treat isolated CR in input data as new-line.
X-Git-Tag: v0.7.6~364
X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=adb78c1da5de2792a8f07b241e72b8bd341fd90b;p=pspp-builds.git

Do not treat isolated CR in input data as new-line.

User "tvw" on IRC reported that PSPP failed to parse data that SPSS 18
accepted.  We found that the problem was carriage return (CR)
characters in the middle of a line.  PSPP treated these as new-lines,
but SPSS did not.  This commit adopts the SPSS behavior in PSPP and
adjusts one test that checks this behavior.

This will break reading some old Mac OS files, since Mac OS before
version 10 used CR without LF as new-line.  Time will tell whether
this is a real problem for our users.
---

diff --git a/src/libpspp/str.c b/src/libpspp/str.c
index 79f3c912..71f54474 100644
--- a/src/libpspp/str.c
+++ b/src/libpspp/str.c
@@ -1,5 +1,5 @@
 /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2006, 2009 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2006, 2009, 2010 Free Software Foundation, Inc.
 
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -1240,9 +1240,9 @@ ds_steal_cstr (struct string *st)
    to ST, false if no characters were read before an I/O error or
    end of file (or if MAX_LENGTH was 0).
 
-   This function accepts LF, CR LF, and CR sequences as new-line,
-   and translates each of them to a single '\n' new-line
-   character in ST. */
+   This function treats LF and CR LF sequences as new-line,
+   translating each of them to a single '\n' new-line character
+   in ST. */
 bool
 ds_read_line (struct string *st, FILE *stream, size_t max_length)
 {
@@ -1251,21 +1251,36 @@ ds_read_line (struct string *st, FILE *stream, size_t max_length)
   for (length = 0; length < max_length; length++)
     {
       int c = getc (stream);
-      if (c == EOF)
-        break;
-
-      if (c == '\r')
+      switch (c)
         {
+        case EOF:
+          return length > 0;
+
+        case '\n':
+          ds_put_char (st, c);
+          return true;
+
+        case '\r':
           c = getc (stream);
-          if (c != '\n')
+          if (c == '\n')
             {
+              /* CR followed by LF is special: translate to \n. */
+              ds_put_char (st, '\n');
+              return true;
+            }
+          else
+            {
+              /* CR followed by anything else is just CR. */
+              ds_put_char (st, '\r');
+              if (c == EOF)
+                return true;
               ungetc (c, stream);
-              c = '\n';
             }
+          break;
+
+        default:
+          ds_put_char (st, c);
         }
-      ds_put_char (st, c);
-      if (c == '\n')
-        return true;
     }
 
   return length > 0;
diff --git a/tests/command/line-ends.sh b/tests/command/line-ends.sh
index 2c12e4f6..76499629 100755
--- a/tests/command/line-ends.sh
+++ b/tests/command/line-ends.sh
@@ -67,7 +67,7 @@ if [ $? -ne 0 ] ; then no_result ; fi
 
 
 activity="create input.txt"
-printf '1 2 3\n4 5 6\r\n7 8 9\r10 11 12\n13 14 15 \r\n16 17 18\r' > input.txt
+printf '1 2 3\n4 5 6\r\n7\r8\r9\r\n10 11 12\n13 14 15 \r\n16\r\r17\r18\n' > input.txt
 if [ $? -ne 0 ] ; then no_result ; fi
 
 
@@ -77,7 +77,7 @@ if [ $? -ne 0 ] ; then no_result ; fi
 activity="check input.txt"
 cksum input.txt > input.cksum
 diff input.cksum - <