From: Ben Pfaff
Date: Fri, 2 Mar 2012 06:32:42 +0000 (-0800)
Subject: identifier: Apply isdigit() only to values in valid range.
X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=c785bf16095624e47d9af976aaa751295a66f3d5;p=pspp

identifier: Apply isdigit() only to values in valid range.

Applying isdigit() to a value that is not EOF or in the range
0...UCHAR_MAX yields undefined behavior and in fact caused a segfault
on several Debian architectures for U+FFFD.

Found by "lexer properly reports scan errors" test on Debian buildds.
---

diff --git a/src/data/identifier.c b/src/data/identifier.c
index f1c22ef1b5..a757b31e3a 100644
--- a/src/data/identifier.c
+++ b/src/data/identifier.c
@@ -1,5 +1,5 @@
 /* PSPP - a program for statistical analysis.
-   Copyright (C) 1997-9, 2000, 2005, 2009, 2010, 2011 Free Software Foundation, Inc.
+   Copyright (C) 1997-9, 2000, 2005, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
 
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -196,8 +196,9 @@ lex_uc_is_id1 (ucs4_t uc)
 bool
 lex_uc_is_idn (ucs4_t uc)
 {
-  return (is_ascii_id1 (uc) || isdigit (uc) || uc == '.' || uc == '_'
-          || (uc >= 0x80 && uc_is_property_id_continue (uc)));
+  return (uc < 0x80
+          ? is_ascii_id1 (uc) || isdigit (uc) || uc == '.' || uc == '_'
+          : uc >= 0x80 && uc_is_property_id_continue (uc));
 }
 
 /* Returns true if Unicode code point UC is a space that separates tokens. */
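
The range guard described in the commit message can be illustrated in isolation. The following is a minimal sketch, not code from PSPP: the helper names ascii_isdigit_checked() and check_examples() are hypothetical, and uint32_t stands in for libunistring's ucs4_t on the assumption that code points are unsigned 32-bit values. It shows one way to keep isdigit() from ever seeing a value outside 0...UCHAR_MAX.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of PSPP: reports whether code point UC
   is an ASCII digit without ever passing an out-of-range value to
   isdigit(). */
static bool
ascii_isdigit_checked (uint32_t uc)
{
  /* Only values below 0x80 reach isdigit(), so its argument is always
     representable as unsigned char. */
  return uc < 0x80 && isdigit ((int) uc) != 0;
}

/* A few spot checks, including U+FFFD from the commit message. */
static void
check_examples (void)
{
  assert (ascii_isdigit_checked ('7'));
  assert (!ascii_isdigit_checked ('a'));
  assert (!ascii_isdigit_checked (0xFFFD));
}
```

Because && short-circuits, the call to isdigit() is skipped entirely for U+FFFD and any other code point at or above 0x80, which plays the same role as the `uc < 0x80 ? ... : ...` split in the patch.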