John Darrington [Sun, 19 Jul 2009 12:20:07 +0000 (14:20 +0200)]
Add assertion to check code consistency
John Darrington [Sun, 19 Jul 2009 11:35:35 +0000 (13:35 +0200)]
Fix cleanup of ROC command.
Properly deallocate variables, and use correct
symbols for parser return values. Also, delete
roc.h which is unnecessary. Thanks to Ben Pfaff
for pointing out these problems.
John Darrington [Sun, 19 Jul 2009 11:04:39 +0000 (13:04 +0200)]
Respect the constness of caseproto.
New function caseproto_clone. This means we
can clone a proto, then mutate it as we want.
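A minimal sketch of the clone-then-mutate idea, assuming a much
simplified prototype that is just an array of widths. The names
sketch_proto, sketch_proto_clone and sketch_proto_destroy are
hypothetical and are not the real caseproto interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified prototype: one width per value. */
struct sketch_proto
  {
    size_t n_widths;
    short *widths;
  };

/* Returns a deep copy of OLD.  OLD is only read, so it can stay const;
   the caller owns the copy and may change it freely. */
struct sketch_proto *
sketch_proto_clone (const struct sketch_proto *old)
{
  struct sketch_proto *copy = malloc (sizeof *copy);
  if (copy == NULL)
    abort ();

  copy->n_widths = old->n_widths;
  copy->widths = NULL;
  if (old->n_widths > 0)
    {
      copy->widths = malloc (old->n_widths * sizeof *copy->widths);
      if (copy->widths == NULL)
        abort ();
      memcpy (copy->widths, old->widths,
              old->n_widths * sizeof *copy->widths);
    }
  return copy;
}

/* Frees a prototype returned by sketch_proto_clone. */
void
sketch_proto_destroy (struct sketch_proto *proto)
{
  if (proto != NULL)
    {
      free (proto->widths);
      free (proto);
    }
}
```

A caller that holds only a const pointer can clone it, change a width
in the copy, and leave the original untouched.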
John Darrington [Sun, 19 Jul 2009 09:41:49 +0000 (11:41 +0200)]
Corrected spelling of "consolidate".
Thanks to Ben Pfaff for pointing out my mistake.
John Darrington [Sun, 19 Jul 2009 09:35:38 +0000 (11:35 +0200)]
Added the '=' to the plot subcommand's documentation.
Thanks to Ben Pfaff for pointing this out.
John Darrington [Fri, 17 Jul 2009 15:11:44 +0000 (23:11 +0800)]
Ensure correct behaviour when the state var is missing.
When the state variable is missing, the entire
case is skipped.
John Darrington [Fri, 17 Jul 2009 15:04:22 +0000 (23:04 +0800)]
Update documentation regarding missing values.
Explicitly mention that cases are excluded on a
listwise basis.
John Darrington [Fri, 17 Jul 2009 14:48:29 +0000 (22:48 +0800)]
Fix ROC behaviour in the presence of missing values.
Make sure that the ROC command's behaviour is correct
when missing values appear in the result variable.
John Darrington [Wed, 15 Jul 2009 12:19:49 +0000 (20:19 +0800)]
New function prepare_cutpoints
Move the code which creates the cutpoints into its own
function. This makes for easier reading IMO.
John Darrington [Wed, 15 Jul 2009 12:05:58 +0000 (20:05 +0800)]
Replaced the example with one that is easier to visualise
John Darrington [Thu, 25 Jun 2009 03:08:09 +0000 (11:08 +0800)]
Fix bugs when input data is repeated
John Darrington [Wed, 24 Jun 2009 08:47:38 +0000 (16:47 +0800)]
Added second ROC test
John Darrington [Mon, 15 Jun 2009 23:27:31 +0000 (07:27 +0800)]
Add new functions to define subcase orderings.
Allow subcases to be defined from an index and width,
rather than from a variable. This avoids much of the
need for var_create_internal.
John Darrington [Sat, 13 Jun 2009 05:29:25 +0000 (13:29 +0800)]
Added code to plot the ROC curve
John Darrington [Thu, 11 Jun 2009 06:22:22 +0000 (14:22 +0800)]
Added code to generate the ROC cutpoint tables.
John Darrington [Thu, 11 Jun 2009 04:57:17 +0000 (12:57 +0800)]
Add check that input to casereader_create_distinct is sorted
John Darrington [Wed, 10 Jun 2009 13:50:48 +0000 (21:50 +0800)]
Fix bug when positive and negative groups are of different lengths
John Darrington [Wed, 10 Jun 2009 13:49:39 +0000 (21:49 +0800)]
Add framework for ROC summary table
John Darrington [Wed, 10 Jun 2009 13:25:50 +0000 (21:25 +0800)]
Use the requested method for calculating the ROC AUC standard error
John Darrington [Wed, 10 Jun 2009 13:14:01 +0000 (21:14 +0800)]
Added basic calculation and display of area under the curve
John Darrington [Wed, 10 Jun 2009 13:11:32 +0000 (21:11 +0800)]
Added test for the ROC command
John Darrington [Wed, 10 Jun 2009 03:36:05 +0000 (11:36 +0800)]
Added a new casereader translator to consolidate cases.
This new translator creates a reader which provides
a list of distinct cases in the input, with the weights
consolidated, where applicable.
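The consolidation described above can be sketched on a plain array.
This is an illustration only, assuming a case that is reduced to a
single value plus a weight and an input that is already sorted on that
value. struct weighted_case and consolidate_cases are hypothetical
names, not part of the casereader code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a case: one value plus its weight. */
struct weighted_case
  {
    double value;
    double weight;
  };

/* Collapses runs of equal values in the sorted array CASES[0..N-1] into
   single entries whose weights are summed.  The result is left at the
   front of CASES; returns the number of distinct entries. */
size_t
consolidate_cases (struct weighted_case *cases, size_t n)
{
  size_t n_out = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      /* The input must already be sorted on the value. */
      assert (i == 0 || cases[i - 1].value <= cases[i].value);

      if (n_out > 0 && cases[n_out - 1].value == cases[i].value)
        cases[n_out - 1].weight += cases[i].weight;
      else
        cases[n_out++] = cases[i];
    }
  return n_out;
}
```

Requiring sorted input keeps the merge to a single pass with no extra
storage, because equal values are guaranteed to be adjacent.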
John Darrington [Wed, 10 Jun 2009 01:44:01 +0000 (09:44 +0800)]
Added stub for ROC computation
John Darrington [Tue, 9 Jun 2009 11:16:24 +0000 (19:16 +0800)]
Added documentation for the ROC command
John Darrington [Tue, 9 Jun 2009 11:15:08 +0000 (19:15 +0800)]
Added parser for the ROC command.
John Darrington [Tue, 9 Jun 2009 11:04:25 +0000 (19:04 +0800)]
Support multi-data charts and legend.
Add support for charts to have datasets with separate
colours, and a legend to indicate them.
Ben Pfaff [Mon, 8 Jun 2009 04:57:36 +0000 (21:57 -0700)]
Fix handling of #! at beginning of PSPP syntax file; add regression test.
Fixes bug #26518.
Thanks to John Darrington for testing.
Ben Pfaff [Sun, 7 Jun 2009 20:14:23 +0000 (13:14 -0700)]
Remove spurious Makefile from src/output.
Ben Pfaff [Sun, 7 Jun 2009 04:04:21 +0000 (21:04 -0700)]
crosstabs: Fix chi-square display and add regression test.
Bug #26739.
Ben Pfaff [Sun, 7 Jun 2009 03:53:10 +0000 (20:53 -0700)]
crosstabs: Remove struct that was defined but never used.
Ben Pfaff [Sun, 7 Jun 2009 03:44:49 +0000 (20:44 -0700)]
crosstabs: Remove write-only variable.
Ben Pfaff [Sun, 7 Jun 2009 03:30:14 +0000 (20:30 -0700)]
crosstabs: Fix segfault when chi-square was requested.
Bug #26739.
Ben Pfaff [Wed, 3 Jun 2009 05:21:01 +0000 (22:21 -0700)]
datasheet-test: Add support for testing string backing store columns.
Ben Pfaff [Wed, 3 Jun 2009 04:55:50 +0000 (21:55 -0700)]
crosstabs: Trim unsightly spaces from titles in output.
Unfortunately, none of the tests exercise this code, so it's hard to say
whether it is correct.
Ben Pfaff [Wed, 3 Jun 2009 02:52:18 +0000 (19:52 -0700)]
crosstabs: Fix memory leaks.
Ben Pfaff [Sat, 30 May 2009 04:51:45 +0000 (21:51 -0700)]
argv-parser: Add assertion to find likely bugs in client code.
Ben Pfaff [Sat, 30 May 2009 04:51:19 +0000 (21:51 -0700)]
datasheet: Fix bugs in datasheet_resize_column() found with new test.
Ben Pfaff [Sat, 30 May 2009 04:46:24 +0000 (21:46 -0700)]
datasheet-test: Add test for datasheet_resize_column().
Ben Pfaff [Sat, 30 May 2009 04:43:33 +0000 (21:43 -0700)]
datasheet-test: Fix printing of string values in error messages.
Ben Pfaff [Sat, 30 May 2009 04:26:13 +0000 (21:26 -0700)]
datasheet-test: Check duplicate states before discarding them.
By failing to check states whose hashes already appeared in the model
checker table, the datasheet test was missing some bugs. This commit
changes the datasheet test code to check the state before it checks for
the hash.
Ben Pfaff [Thu, 28 May 2009 05:22:48 +0000 (22:22 -0700)]
datasheet-test: Make column widths to test configurable on command line.
Ben Pfaff [Sat, 30 May 2009 04:45:28 +0000 (21:45 -0700)]
datasheet-test: Don't test null operations.
By not testing null operations (such as inserting or deleting 0 rows or
columns) the duration of the test is cut roughly in half, with little if
any reduction in test coverage.
Ben Pfaff [Sat, 30 May 2009 04:50:12 +0000 (21:50 -0700)]
sparse-xarray-test: Style and comment fixes.
Ben Pfaff [Wed, 27 May 2009 06:04:32 +0000 (23:04 -0700)]
value: New function value_swap.
Ben Pfaff [Wed, 27 May 2009 05:02:48 +0000 (22:02 -0700)]
Move datasheet test out of PSPP into a separate binary.
When it's not difficult to do so, it is better to put tests in separate
binaries instead of in the PSPP binaries, so that the binaries are not
burdened with code that is not of real interest to users and to make the
main PSPP binaries build faster.
Ben Pfaff [Tue, 26 May 2009 03:24:07 +0000 (20:24 -0700)]
Make MAX_SHORT_STRING an implementation detail of the "value" code.
MAX_SHORT_STRING used to be important. It was referenced all over the
source tree. Now, there is little reason for code outside the "value"
code itself to use it.
Ben Pfaff [Tue, 26 May 2009 03:22:01 +0000 (20:22 -0700)]
Use MAX_SHORT_STRING in place of MIN_LONG_STRING.
There is no good reason to have both of these constants, so replace all
uses of MIN_LONG_STRING by MAX_SHORT_STRING.
Ben Pfaff [Tue, 26 May 2009 03:21:08 +0000 (20:21 -0700)]
Fix portable file reader use of long strings.
This code hadn't been converted to the new "union value" representation,
where a single "union value" always represents a whole data item. This
commit fixes that.
Ben Pfaff [Tue, 26 May 2009 03:20:07 +0000 (20:20 -0700)]
Get rid of uses of MAX_SHORT_STRING in Gnumeric and PostgreSQL readers.
MAX_SHORT_STRING is now intended to be an implementation detail of the
value code. There is no real reason that the Gnumeric or PostgreSQL
readers need to use it, so make them use their own constants instead.
Ben Pfaff [Tue, 26 May 2009 03:07:19 +0000 (20:07 -0700)]
Implement missing values for long string variables.
Ben Pfaff [Mon, 25 May 2009 19:36:21 +0000 (12:36 -0700)]
Fix test failure introduced along with parse_value().
Ben Pfaff [Mon, 25 May 2009 02:36:01 +0000 (19:36 -0700)]
sys-file-reader: Fix memory leak.
Ben Pfaff [Mon, 25 May 2009 02:35:40 +0000 (19:35 -0700)]
Add support for value labels on long string variables.
Ben Pfaff [Sun, 24 May 2009 18:26:41 +0000 (11:26 -0700)]
New function parse_value() for parsing a value of specified width.
Occasionally a value of a given width needs to be parsed from syntax.
This commit adds a helper function for doing so and modifies a few pieces
of code to use it. Probably there are other places where it would be
useful that could not easily be found with "grep".
This commit also renames the range-parser code to value-parser and puts
the new function in there, as a natural generalization.
Suggested by John Darrington.
Ben Pfaff [Tue, 12 May 2009 03:08:19 +0000 (20:08 -0700)]
Make value_set_missing(), etc. tolerate values of width -1.
In some circumstances a value of width -1 crops up, e.g. when a case is
made from a dictionary that has had a variable deleted in the middle.
Such a value has no content at all. In the long run it should be possible
to get rid of these values entirely--their presence is a wart--but for now
the case and value code needs to tolerate them.
This fixes a segfault in the GUI when inserting a new case when the
datasheet case has a column with width -1 (due to deletion of a variable),
which was caused by case_set_missing() calling value_set_missing() for
the -1 width variable, which in turn was writing through an invalid
pointer.
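The kind of guard this implies might look like the following sketch.
It assumes a toy value type with only inline storage. sketch_val,
sketch_set_missing and SKETCH_SYSMIS are hypothetical names, and the
choice of -DBL_MAX as the missing marker is an assumption of the
sketch.

```c
#include <assert.h>
#include <float.h>
#include <string.h>

#define SKETCH_MAX_WIDTH 8
#define SKETCH_SYSMIS (-DBL_MAX)  /* Assumed "missing" marker for numbers. */

/* Hypothetical value: a number (width 0) or a short string. */
union sketch_val
  {
    double f;
    char s[SKETCH_MAX_WIDTH];
  };

/* Sets V, of the given WIDTH, to a missing value.  A WIDTH of -1 means
   that the value has no content at all, so nothing is written. */
void
sketch_set_missing (union sketch_val *v, int width)
{
  if (width == -1)
    return;
  else if (width == 0)
    v->f = SKETCH_SYSMIS;
  else
    {
      assert (width > 0 && width <= SKETCH_MAX_WIDTH);
      memset (v->s, ' ', width);
    }
}
```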
John Darrington [Mon, 11 May 2009 23:18:31 +0000 (07:18 +0800)]
Prevent invalid variable widths in variable sheet.
Ben Pfaff [Mon, 11 May 2009 13:33:35 +0000 (06:33 -0700)]
Remove debug printfs that escaped from my local tree.
Ben Pfaff [Mon, 11 May 2009 13:33:15 +0000 (06:33 -0700)]
gui: Fix segfault when pushing Del on a long string variable cell.
Thanks to John Darrington for reporting the problem.
Ben Pfaff [Mon, 11 May 2009 05:23:00 +0000 (22:23 -0700)]
Change "union value" to dynamically allocate long strings.
Until now, a single "union value" could hold a numeric value or a short
string value. A long string value (one longer than MAX_SHORT_STRING)
required a number of contiguous "union value"s. This situation was
inconvenient sometimes, because any occasion where a long string value
might be required (even if it was unlikely) required using dynamic
memory allocation.
With this change, a value of any type, regardless of whether it is numeric
or short or long string, occupies a single "union value". The internal
representation of short and long strings is now different, however: long
strings are now internally represented by a pointer to dynamically
allocated memory. This means that "union value"s must now be initialized
and uninitialized properly, to ensure that memory is properly allocated
and freed behind the scenes.
This change thus has a ripple effect on PSPP code that works with values.
In particular, code that deals with cases is greatly changed, because a
case now needs to know the type of each value that it contains. Thus, a
new concept called a "case prototype", which represents the type and
width of each value within a case, is introduced, and every place in PSPP
that creates a case must now create a corresponding prototype to go with
it. This is why this commit is so big.
As part of writing up this commit, it became clear that some code was poor
enough that it needed to be rewritten entirely. Therefore, CROSSTABS and
T-TEST are almost completely modified by this commit.
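The representation described in this commit can be illustrated with a
small sketch. It is not the actual "union value" definition: the
names sketch_value, sketch_value_init, sketch_value_destroy and
sketch_value_str, and the threshold SKETCH_SHORT_MAX, are assumptions
made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_SHORT_MAX 8  /* Assumed stand-in for MAX_SHORT_STRING. */

/* One value of any type.  The width is not stored here, so every
   function below must be told it by the caller. */
union sketch_value
  {
    double f;                        /* Numeric (width 0). */
    char short_s[SKETCH_SHORT_MAX];  /* Short string, stored inline. */
    char *long_s;                    /* Long string, stored on the heap. */
  };

/* Prepares V to hold a value of the given WIDTH. */
void
sketch_value_init (union sketch_value *v, int width)
{
  if (width > SKETCH_SHORT_MAX)
    {
      v->long_s = malloc (width);
      if (v->long_s == NULL)
        abort ();
    }
}

/* Releases any memory that V, of the given WIDTH, owns. */
void
sketch_value_destroy (union sketch_value *v, int width)
{
  if (width > SKETCH_SHORT_MAX)
    free (v->long_s);
}

/* Returns the string storage of V, which must have string WIDTH. */
char *
sketch_value_str (union sketch_value *v, int width)
{
  assert (width > 0);
  return width > SKETCH_SHORT_MAX ? v->long_s : v->short_s;
}
```

Because the width has to be supplied at every call, whatever owns a
row of such values needs a record of each value's width. That is the
role the commit message gives to the case prototype.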
Ben Pfaff [Thu, 7 May 2009 05:58:01 +0000 (22:58 -0700)]
output: Add auxiliary data parameter to tab_dim.
Until now, the tab_dim function has not provided any way to pass auxiliary
data to the table dimensioning function. This commit adds this ability
and updates all the callers of tab_dim to do so.
Ben Pfaff [Thu, 7 May 2009 05:47:51 +0000 (22:47 -0700)]
New data structure sparse_xarray.
Ben Pfaff [Thu, 7 May 2009 03:34:14 +0000 (20:34 -0700)]
New wrapper for access to temporary files.
Ben Pfaff [Tue, 5 May 2009 12:42:23 +0000 (05:42 -0700)]
model-checker: Add command-line parser for model checking options.
This adds a parser for command-line options to configure a set of
mc_options for running the model checker. It is used by an upcoming test
for the sparse_xarray. It might also make sense to break the datasheet
tests out of PSPP into a separate program using this parser.
Ben Pfaff [Tue, 5 May 2009 12:39:03 +0000 (05:39 -0700)]
Implement new command-line argument parser.
glibc has two option parsers, but neither one of them feels quite
right:
- getopt_long is simple, but not modular, in that there is no
easy way to make it accept multiple collections of options
supported by different modules.
- argp is more sophisticated and more complete, and hence more
complex. It still lacks one important feature for
modularity: there is no straightforward way for option groups
that are implemented independently to have separate auxiliary
data.
The parser implemented in this commit is meant to be simple and
modular. It is based internally on getopt_long.
The initial use for this option parser is for an upcoming commit of a test
program that has some of its own options and some from the model checker,
but it should also be appropriate for PSPP and PSPPIRE if anyone wants to
adapt them to use it.
Ben Pfaff [Tue, 5 May 2009 05:30:02 +0000 (22:30 -0700)]
model-checker: Don't discard error states.
Even if a state with an error is a duplicate, we don't want to discard it,
because then we lose information about bugs.
Ben Pfaff [Tue, 5 May 2009 05:27:05 +0000 (22:27 -0700)]
model-checker: Revise advice on checking for duplicates.
Until now the documentation on the model checker has advised checking for
a duplicate state before checking for consistency, but in fact this can
cause bugs to be missed if only some paths to a given state yield
incorrect results. So revise the advice to check for consistency before
checking for a duplicate state.
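The ordering matters because two different paths can reach states
with the same hash while only one of them is broken. A minimal sketch
of the revised order follows, assuming a toy checker that remembers
hashes in a small array. sketch_state, sketch_checker and
sketch_visit are hypothetical names and not the model checker's
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_SEEN 64

/* A state as the toy checker sees it: a hash identifying it, plus the
   result of the client's own consistency check. */
struct sketch_state
  {
    unsigned int hash;
    bool consistent;
  };

struct sketch_checker
  {
    unsigned int seen[SKETCH_MAX_SEEN];
    size_t n_seen;
    size_t n_errors;
  };

/* Visits S.  Returns true if S is new and should be explored further. */
bool
sketch_visit (struct sketch_checker *mc, const struct sketch_state *s)
{
  size_t i;

  /* Consistency first: an error is counted even if the hash was
     already seen. */
  if (!s->consistent)
    {
      mc->n_errors++;
      return false;
    }

  /* Only then discard duplicates. */
  for (i = 0; i < mc->n_seen; i++)
    if (mc->seen[i] == s->hash)
      return false;

  assert (mc->n_seen < SKETCH_MAX_SEEN);
  mc->seen[mc->n_seen++] = s->hash;
  return true;
}
```

With the two checks swapped, an inconsistent state whose hash had
already been recorded would be dropped silently.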
Ben Pfaff [Tue, 5 May 2009 05:20:42 +0000 (22:20 -0700)]
model-checker: Add more progress functions.
The model checker supports "progress functions" that report the current
status of the model checking run. Until now the implementation only
exported a single progress function that printed a line of dots across
stderr. This commit moves the "fancy" progress function that was
previously part of the PSPP language code into the model checker itself
and adds an even more verbose progress function as well.
Ben Pfaff [Tue, 5 May 2009 05:33:48 +0000 (22:33 -0700)]
model-checker: Move summary printing function into model checker.
There is no reason that the model checker itself should not be able to
print a summary of its results. Until now, this code was buried in the
PSPP language code, but the model checker itself is a better place for it.
Ben Pfaff [Fri, 24 Apr 2009 04:09:12 +0000 (21:09 -0700)]
model-checker: Kill dependencies and move back to libpspp.
Commit
95b074ff3 "Moved the datasheet testing code out of
src/{libspp,data}" moved the model-checker implementation from libpspp
into language/tests because it depended on math/moments.h and
data/val-type.h, which violates the dependency structure of the PSPP
libraries.
However, now I want to use the model checker in a test that should not
need to use anything from language/tests, so this commit eliminates these
dependencies and moves the model checker back to src/libpspp.
Ben Pfaff [Tue, 5 May 2009 04:53:07 +0000 (21:53 -0700)]
sparse-array: Simplify code slightly.
Instead of checking whether the key is in range in each caller of
find_leaf_node, do it in find_leaf_node itself. This also allows checking
the cache before checking whether the key is in range, which might be an
optimization.
Ben Pfaff [Thu, 7 May 2009 03:22:09 +0000 (20:22 -0700)]
sparse-array: Improve iteration interface.
The sparse_array_scan function only supports iteration in the forward
direction and its interface is somewhat awkward. This commit replaces it
by four new functions that allow iteration in both forward and reverse
directions and have a more conventional interface.
Ben Pfaff [Tue, 5 May 2009 12:51:54 +0000 (05:51 -0700)]
sparse-array: Use __builtin_ctzl on GCC 4.0 or later, as an optimization.
This should be a worthwhile optimization in many cases, because
__builtin_ctzl compiles to a single machine instruction on x86, whereas
the generic implementation compiles to several.
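A sketch of this kind of conditional use, with a portable loop as the
fallback. The function name count_trailing_zeros is hypothetical, and
the fallback shown here is not necessarily the generic implementation
used in sparse-array.

```c
#include <assert.h>

/* Returns the number of trailing zero bits in X, which must be
   nonzero (the result of __builtin_ctzl is undefined for 0). */
int
count_trailing_zeros (unsigned long x)
{
  assert (x != 0);
#if defined __GNUC__ && __GNUC__ >= 4
  return __builtin_ctzl (x);
#else
  {
    int n = 0;
    while (!(x & 1))
      {
        x >>= 1;
        n++;
      }
    return n;
  }
#endif
}
```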
Ben Pfaff [Tue, 5 May 2009 02:31:16 +0000 (19:31 -0700)]
range-set: New functions range_set_last and range_set_prev.
These are useful for iterating through a range set in reverse order.
Ben Pfaff [Fri, 24 Apr 2009 03:27:54 +0000 (20:27 -0700)]
range-set: Add new function range_set_scan().
Ben Pfaff [Fri, 24 Apr 2009 03:27:10 +0000 (20:27 -0700)]
range-set: Inline some simple functions.
Some of the range-set functions are very simple and worth inlining, so
move the definitions of those functions into range-set.h. This required
moving the definition of struct range_set and struct range_set_node into
the header. Some of the functions used internally by those functions had
to be moved too, and renamed as well since for internal use in range-set.c
their names did not respect the namespace.
Ben Pfaff [Fri, 24 Apr 2009 03:14:52 +0000 (20:14 -0700)]
range-set: Add test coverage for range_set_destroy(NULL).
"gcov -b" showed that range_set_destroy() was never called with a NULL
argument. There's no reason not to test that too (although of course it
is unlikely to be broken).
Ben Pfaff [Tue, 5 May 2009 14:03:24 +0000 (07:03 -0700)]
range-set: New function range_set_allocate_fully.
Ben Pfaff [Wed, 1 Apr 2009 05:00:44 +0000 (22:00 -0700)]
Delete CORRELATIONS skeletal parser.
This code didn't do anything useful, it just parsed syntax. We can
resurrect it when someone wants to implement CORRELATIONS later.
Ben Pfaff [Tue, 5 May 2009 14:03:02 +0000 (07:03 -0700)]
pool: New function pool_strdup0.
This function is the pool analogue of xmemdup0, except that it is only
appropriate for use with strings.
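The behaviour being described, copying a given number of bytes and
appending a null terminator, can be sketched with plain malloc in
place of a pool. sketch_strdup0 is a hypothetical name. The real
function allocates from a pool, which this sketch does not model.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the first SIZE bytes of STRING,
   followed by a null terminator.  The caller must free the result. */
char *
sketch_strdup0 (const char *string, size_t size)
{
  char *copy = malloc (size + 1);
  if (copy == NULL)
    abort ();
  memcpy (copy, string, size);
  copy[size] = '\0';
  return copy;
}
```

This is handy when the source bytes are a counted substring with no
terminator of their own.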
Ben Pfaff [Tue, 5 May 2009 12:46:01 +0000 (05:46 -0700)]
datasheet: Drop false dependency on md4.h.
datasheet-check.c doesn't use anything from md4.h, so there's no point in
including it.
Ben Pfaff [Thu, 16 Apr 2009 05:31:56 +0000 (22:31 -0700)]
perl-module: Better document "make test" requirements.
Ben Pfaff [Fri, 16 Jan 2009 04:39:18 +0000 (20:39 -0800)]
t-test: Move 'mode' variable from file scope into cmd_t_test().
Ben Pfaff [Fri, 16 Jan 2009 04:37:32 +0000 (20:37 -0800)]
t-test: Move 'cmd' variable from file scope into cmd_t_test().
This variable was only used inside cmd_t_test() anyhow.
Ben Pfaff [Fri, 16 Jan 2009 04:34:43 +0000 (20:34 -0800)]
t-test: Remove write-only variable.
Jason H Stover [Wed, 3 Jun 2009 19:54:19 +0000 (15:54 -0400)]
Moved static is_origin from design_matrix.c to category.c: cat_is_origin.
John Darrington [Sun, 17 May 2009 23:43:14 +0000 (07:43 +0800)]
Fix incorrect word order
John Darrington [Sun, 17 May 2009 23:42:31 +0000 (07:42 +0800)]
Remove whitespace before footnotes
John Darrington [Sun, 17 May 2009 23:39:06 +0000 (07:39 +0800)]
Correct grammar
John Darrington [Sun, 17 May 2009 07:29:54 +0000 (15:29 +0800)]
Change examples using heights to be all in millimetres
John Darrington [Sun, 17 May 2009 07:20:23 +0000 (15:20 +0800)]
Delete '*' from DATA LIST examples
John Darrington [Sun, 17 May 2009 07:16:23 +0000 (15:16 +0800)]
Mention that two juxtaposed LIST keywords are intentional
John Darrington [Sun, 17 May 2009 05:15:41 +0000 (13:15 +0800)]
Fix misaligned menu items
John Darrington [Sun, 17 May 2009 05:06:59 +0000 (13:06 +0800)]
Added a tutorial chapter to the manual.
John Darrington [Sat, 16 May 2009 03:38:11 +0000 (11:38 +0800)]
Fix bug inserting rows and columns and rename state variable.
The code in psppire-data-editor was inspecting the variable
called "state" on the GtkWidget class whereas it should have
been looking at the PsppireSheet class.
To avoid any future confusion, PsppireSheet's "state" variable
has been renamed to select_status.
John Darrington [Sat, 16 May 2009 02:16:57 +0000 (10:16 +0800)]
Remove unused code
John Darrington [Fri, 15 May 2009 23:18:01 +0000 (07:18 +0800)]
Updated Dutch translation. Thanks to unkonwn-1
Also regenerated en_GB.po
John Darrington [Fri, 15 May 2009 22:55:52 +0000 (06:55 +0800)]
Correct typo in command line argument string.
Thanks to Michel Boaventura for reporting this.
John Darrington [Thu, 14 May 2009 08:19:09 +0000 (16:19 +0800)]
Remove gratuitous call to change_active_cell.
This caused data from the previous cell to
be transferred to the new cell. Fixes bug #26568
John Darrington [Thu, 14 May 2009 07:48:14 +0000 (15:48 +0800)]
Remove unneeded object members
John Darrington [Wed, 13 May 2009 22:54:56 +0000 (06:54 +0800)]
Ensure that Windows opens the right file for output.
Thanks to Michel Boaventura for reporting this problem.
Fixes bug #26542