2008-07-24 Jason H Stover * glm.q (run_glm): Dropped weight argument. 2008-07-22 Jason H Stover * glm.q (run_glm): Re-written to form covariance matrix rather than store entire data set in memory. * glm.q (data_pass_one): Renamed prepare_categories() to data_pass_one(). Accumulate mean and variance. 2008-06-21 Jason Stover * regression.q (reg_stats_coeff): Use new accessor function pspp_coeff_get_sd. Fixed bug 23567. No longer call compute_moments in run_regression. Pass entire design_matrix to pspp_linreg (). 2008-05-29 John Darrington * examine.q: Fixed bug where incorrect levels of dereferencing were applied to pointers. 2008-04-09 John Darrington * regression.q: Fix display of degrees of freedom. 2008-04-08 Jason Stover * regression.q (identify_indep_vars): Don't panic unless n_indep_vars is 0. 2008-03-16 Ben Pfaff Bug #22037. Thanks to John Darrington for reporting this bug. * crosstabs.q (calc_general): Only the short string prefix of long string variables are tabulated, so we must not copy or zero out more data than that. 2008-03-10 Jason Stover * regression.q (run_regression): Removed code for EXPORT subcommand. Remove use of coefficient 0 as the intercept. 2008-02-14 John Darrington * examine.q: Fixed counts of missing variables. Thanks to Jason Stover for reporting this problem. 2008-01-02 John Darrington * binomial.c chisquare.c examine.q frequencies.q oneway.q regression.q : updated all users of var_get_value_name to use replacement function var_append_value_name. 2007-12-07 Ben Pfaff Patch #6302. * crosstabs.q (precalc): Initialize data structures even if the first case cannot be read. * frequencies.q (precalc): Ditto. 2007-11-03 Ben Pfaff Allow output files to overwrite input files (bug #21280). * aggregate.c (cmd_aggregate): Manage file handle reference counts. * correlations.q (internal_cmd_frequencies): Ditto. (cor_custom_matrix): Ditto. * regression.q (regression_custom_export): Ditto. (cmd_regression): Ditto. 2007-10-12 Ben Pfaff * flip.c (flip_file): No need to conditionally substitute for "fseeko" and "off_t" manually anymore, as gnulib takes care of it for us. 2007-09-21 Jason Stover * regression.q (run_regression): Partial fix of memory leak, bug 21056. 2007-09-19 Ben Pfaff Fix bug #21108. * aggregate.c (cmd_aggregate): Destroy casereader consistently, even if casereader fails. * examine.q (run_examine): Ditto. * glm.q (run_glm): Ditto. * oneway.q (run_oneway): Ditto. * regression.q (run_regression): Ditto. * t-test.q (calculate): Ditto. * descriptives.c (calc_descriptives): Ditto. Also avoid gratuitous casereader_clone. 2007-09-13 Jason Stover * regression.q (cmd_regression): Move declaration of models in to definition of cmd_regression. * regression.q (run_regression): Free mom to fix memory leak. 2007-09-12 Ben Pfaff * crosstabs.q (postcalc): Free sorted_tab and the structures that it points to, to plug a memory leak. Fixes bug #20910. Thanks to John Darrington for reporting this bug and for review. 2007-09-04 Ben Pfaff * crosstabs.q (cmd_crosstabs): Free xtab and the structures that it points to, to plug a memory leak. Fixes bug #18315. 2007-08-15 Jason Stover * regression.q (identify_indep_vars): Print an error if dependent and independent variables are the same. Fixes bug 19819. 2007-08-12 Ben Pfaff * flip.c: Drop use of dict_get_compacted_dict_index_to_case_index and just use the ordinary case indexes. There seemed to be no reason for the former method. 2007-08-03 Ben Pfaff * rank.q (rank_cmd): Instead of sorting by SPLIT FILE vars, group by them. Fixes bug #17239. Reviewed by John Darrington. 2007-08-01 Ben Pfaff Clean up handling of median, by treating it almost like any other percentile. Fixes bug #17424. Thanks to John Darrington for review. * frequencies.q (internal_cmd_frequencies): Fix handling of bit masks for `stats' variable. If median is selected, turn it off and add a 50th percentile. (add_percentile): Simplify code a little. (calc_stats): Drop special casing of median. (dump_statistics): Ditto, except that we label the 50th percentile as "50 (Median)" to make it clear that it's also the median. 2007-07-31 Ben Pfaff Remove integer mode from FREQUENCIES and incidentally fix bug #17421. Reviewed by John Darrington. * frequencies.q (int_pool): Rename data_pool. (gen_pool): Rename syntax_pool. (enum FRQM_*): Removed. (struct freq_tab): Removed `mode', `vector', `min', `max', `out_of_range', `sysmis' members. (calc): Delete support for integer mode. (precalc): Ditto. (postprocess_freq_tab): Ditto. (cleanup_freq_tab): Ditto. (frq_custom_variables): Ditto. 2007-07-28 John Darrington * t-test.q: Moved the order in which groups are displayed in the independent samples case, where a cut point is given. 2007-07-27 Ben Pfaff * regression.q (run_regression): Move casereader_destroy call so that it always gets called, not just if there was some valid data. Fixes bug #19581. Reviewed by Jason Stover. 2007-07-24 Ben Pfaff * flip.c (struct flip_pgm): Remove `case_size' member (now unused). (cmd_flip): Pass var_cnt as number of cases instead of case_cnt, to fix bug #20494. Don't assign to `case_size' member, which was unused after assignment. (build_dictionary): When NEWNAMES not used, get the number of variables right, to fix bug #20493. 2007-07-10 Jason Stover * glm.q: Initial version of the GLM procedure. 2007-06-06 Ben Pfaff Adapt case sources, sinks, and clients of procedure code to the new infrastructure. * aggregate.c: Simplify greatly since everything is more uniform now. * autorecode.c: Adapt to new procedure code. * binomial.c: Ditto. * chisquare.c: Ditto. * crosstabs.q: Ditto. * descriptives.c: Ditto. * examine.q: Ditto. * npar-summary.c: Ditto. * frequencies.q: Ditto. * npar.q: Ditto. * oneway.q: Ditto. * regression.q: Ditto. * sort-cases.c: Ditto. * t-test.c: Ditto. * sort-criteria.c: Rewrite to output a struct case_ordering. * flip.c: Rewrite to be a casereader. * rank.q: Simplify greatly since casereaders are much more flexible than what we had before. 2007-05-15 Jason Stover * regression.q (run_regression): Tell the user when the data contain no valid cases. 2007-05-08 Jason Stover * regression.q: Partial fix of bug which caused a crash if dependent variable and independent variable were the same. 2007-04-16 John Darrington * t-test.q: Changed the output width of reported counts and degrees of freedom, to avoid truncating these values. Thanks to Seth Woolley for reporting this problem. A proper fix involves re-thinking the output driver. 2007-04-12 Jason Stover * regression.q (run_regression): Added if (n_data >0) to fix bug 19581. 2007-03-29 Jason Stover * regression.q (prepare_data): New function. * regression.q (compute_moments): New function. 2007-03-18 Ben Pfaff * crosstabs.q (static var write): Rename write_style to avoid conflict with POSIX function of same name. 2007-03-16 Jason Stover * regression.q (run_regression): Added support for moments. Sat Feb 17 08:16:00 2007 Ben Pfaff * flip.c (flip_sink_create): Improve error message when temporary file cannot be created. Tue Feb 6 19:58:03 2007 Ben Pfaff * flip.c (flip_file): Give better error message on end-of-file. 2007-02-04 Jason Stover * regression.q: Fixed p-value computation in the test for individual regression coefficients. Mon Jan 15 11:03:20 2007 Ben Pfaff Fix bugs found by valgrind when --enable-debug is used with the new case code. These bugs are hidden when the data set is small enough to find in memory; if a bigger data set that would overflow to disk were used, then data corruption would occur. * chisquare.c (create_freq_hash): Pass free_freq_mutable_hash to hsh_create as free function. Make copy of data put into hash. * oneway.q (free_value): New function. (run_oneway): Use free_value as arg to hsh_create. Make copy of data put into hash. * rank.q (rank_cases): Don't access data in case after we've given away the case. Tue Jan 9 19:16:11 2007 Ben Pfaff Fix bug #18739. * aggregate.c (parse_aggregate_functions) Initialize function_name. Fri Dec 22 14:04:09 2006 Ben Pfaff Simplify missing value handling. * aggregate.c (struct agr_var): Remove `bool include_missing', add `enum mv_class exclude'. Remove `int missing', add `bool saw_missing'. Update users. * descriptives.c (struct dsc_trns): Removed `int include_user_missing', add `enum mv_class exclude'. Update users. (struct dsc_proc): Ditto. * examine.q: (static var value_is_missing): Rename `exclude_values', change type to `enum mv_class'. Update users. * rank.q: Ditto. Fri Dec 22 19:22:18 WST 2006 John Darrington * frequencies.q : Fixed bug #17420, where the table bounds were overun when /FORMAT=nolabels was given. Wed Dec 20 18:45:31 WST 2006 John Darrington * binomial.c binomial.h : New files. Thanks to Jason Stover for assistance with these. * chisquare.c chisquare.h freq.c freq.h npar-summary.c npar-summary.h npar.h npar.q: New files. Implementing NPAR TESTS. * frequencies.q : Moved structure definitions into freq.[ch] Sat Dec 16 22:26:44 2006 Ben Pfaff Make it possible to pull cases from the active file with a function call, instead of requiring indirection through a callback function. * aggregate.c (cmd_aggregate): Take advantage of new procedure interface. (agr_to_active_file): Removed. (presorted_agr_to_sysfile): Removed. * autorecode.c (cmd_autorecode): Take advantage of new procedure interface. (autorecode_proc_func): Removed. * flip.c (struct flip_pgm): New members to allow conformance with new case_source_class interface. (cmd_flip): Adapt to new case_source_class interface. (flip_source_read): Ditto. (flip_source_destroy): Ditto. Sat Dec 16 12:54:27 2006 Ben Pfaff * rank.q (rank_custom_variables): Allow grouping variables to be strings. Fixes bug #18533. Thanks to John Darrington for review. Sat Dec 9 18:47:51 2006 Ben Pfaff * regression.q (is_depvar): Compare variable pointers instead of variable names. Thu Dec 7 15:26:25 WST 2006 John Darrington * examine.q: Allocated the categorical values for the dependent and independent variables, on the heap. Hence they can be of any width. Wed Dec 6 21:14:26 2006 Ben Pfaff * regression.q (reg_inserted): Compare variable pointers instead of variable indexes. Mon Dec 4 22:33:46 2006 Ben Pfaff * crosstabs.q (insert_summary): Use var_to_string for labeling. (output_pivot_table) Ditto. (submit) Ditto. * frequencies.q (setup_z_trns): Ditto. (dump_full) Ditto. (dump_condensed) Ditto. (dump_statistics) Ditto. Sun Nov 5 08:31:42 WST 2006 John Darrington * t-test.q, oneway.q: Changed to use the new casefilter structure. Sat Oct 14 16:52:28 2006 Ben Pfaff * rank.q: (rank_sorted_casefile) Add some missing case_destroy() calls to fix a memory leak. Sun Oct 8 09:45:40 WST 2006 John Darrington * rank.q: Plugged a small memory leak which occurred under error conditions. Sat Oct 7 11:06:01 WST 2006 John Darrington * rank.q: Implemented most of the RANK command. 2006-07-14 Jason Stover * regression.q (run_regression): New function to move knowledge of pspp_linreg_cache out of math/coefficient.[ch]. Sat Jul 1 17:41:46 2006 Ben Pfaff Fix bug #11612, "q2c documentation does not agree with code". * examine.q: Audit use of q2c "+" prefixes that indicate that a command may appear multiple times. * frequencies.q: Ditto. * oneway.q: Ditto. * regression.q: Ditto. * t-test.q: Ditto. Fri Jun 23 14:18:22 2006 Ben Pfaff Support long string variables on FREQUENCIES, as an extension when in enhanced algorithms mode. For Greg Hunt . * frequencies.q: (struct freq) Change `v' member from union value to union value *. Update all references. (struct var_freqs) Add width, print members to represent effective variable width and display format. (calc) Copy entire long string value into the hash table. (frq_custom_variables) Set new width, print members. (hash_value_alpha) Get width from var_freqs. (compare_value_alpha_a) Ditto. (compare_freq_alpha_a) Ditto. (compare_freq_alpha_d) Ditto. (dump_full) Get display format from var_freqs. (dump_condensed) Ditto. Mon Jun 19 22:07:13 2006 Ben Pfaff * frequencies.q: (dump_full) Only put the first MAX_SHORT_STRING bytes of string variables into the output cells, seeing as we only copy that many. (dump_condensed) Ditto. Mon Jun 19 21:52:05 2006 Ben Pfaff Fixes a bug reported by Greg Hunt . * frequencies.q: (hsh_hash_bytes) We only copy the first MAX_SHORT_STRING bytes of string variables, so we must only compare that many bytes, even if the string variable is longer. (compare_value_alpha_a) Ditto. (compare_freq_alpha_a) Ditto. (compare_freq_alpha_d) Ditto. 2006-05-11 Jason Stover * regression.q: Adjusted code to account for cache->coeff being a pspp_linreg_coeff **. Sun May 7 18:31:25 2006 Ben Pfaff Fix memory leak. * aggregate.c (cmd_aggregate): Free default_dict before replacing it. Sun May 7 17:09:19 2006 Ben Pfaff * flip.c (flip_file): Check return value of pool_fclose(). Sat May 6 16:00:13 2006 Ben Pfaff Get rid of `char *c' member in union value, for cleanliness. * aggregate.c: (union agr_argument) New union. (struct agr_var) Change element type of arg[] from union value to union agr_argument. (parse_aggregate_functions) Change local variable types likewise. * autorecode.c: (union arc_value) New union. (struct arc_item) Change "from" from union value to union arc_value. (recode) Change local variable from union value to union arc_value. (autorecode_trns_proc) Ditto. (compare_alpha_value) Ditto. (hash_alpha_value) Ditto. (compare_numeric_value) Ditto. (hash_numeric_value) Ditto. (autorecode_proc_func) Ditto. Sat May 6 10:43:33 2006 Ben Pfaff Continue reforming procedure execution. In this phase, get rid of the output code for SPLIT FILE groups in procedure.c, which really shouldn't be doing any output. Move it into the individual procedures instead. This also adds some flexibility. * crosstabs.q (precalc): Call output_split_file_values(). * descriptives.c (calc_descriptives): Ditto. * examine.q (run_examine): Ditto. * frequencies.q (precalc): Ditto. * oneway.q (run_oneway): Ditto. * regression.q (run_regression): Ditto. * t-test.q (calculate): Ditto. Wed May 3 23:05:31 2006 Ben Pfaff Continue reforming procedure execution. In this phase, get rid of many global variables, consolidating procedure execution in procedure.c. Encapsulate transformations in new "struct trns_chain". Also, change implementation of N OF CASES, FILTER, and PROCESS IF from special cases to transformations. * aggregate.c (cmd_aggregate) Use discard_variables(). 2006-04-28 Jason Stover * regression.q (regression_trns_resid_proc): Pass only the variables used in the model to (*model->residual)(). * regression.q (regression_trns_pred_proc): Pass only the variables used in the model to (*model->pred)(). 2006-04-26 Jason Stover * regression.q: Added support for multiple transformations. * regression.q (regression_trns_resid_proc): New function. * regression.q (regression_trns_pred_proc): New function. * regression.q (subcommand_save): Added support for saving predicted values. * regression.q (regression_trns_free): New function. * regression.q (reg_get_name): New function. * regression.q (reg_save_var): New function. Tue Apr 25 13:18:56 2006 Ben Pfaff * rank.q (parse_rank_function): Use SE instead of ME for parse errors. Tue Apr 25 13:16:28 2006 Ben Pfaff * flip.c (flip_sink_write): Use snprintf() to simplify a bit of code. 2006-04-21 Jason Stover * regression.q (try_name): New function. (Partly copied from try_name in descriptives.c.) * regression.q (subcommand_save): Choose residual variable names correctly. 2006-04-20 Jason Stover * regression.q (cmd_regression): Moved call to subcommand_save() outside multipass_procedure_with_splits(). * regression.q (regression_trns_proc): Fixed value counter n_vals before calling *model->residual(). 2006-04-19 Jason Stover * regression.q (regression_trns_proc): Fixed the look-up of the number of variables. 2006-04-18 Jason Stover * regression.q (regression_trns_proc): Look up the residual variable in the linear regression cache. * regression.q (subcommand_save): Set the residual variable in the linear regression cache. 2006-04-17 Jason Stover * regression.q (regression_trns_proc): Accept case_idx as an int to match the definition of trns_proc_func. 2006-04-17 Jason Stover * regression.q (regression_trns_proc): New function. * regression.q (subcommand_save): Create variable residuals and add a transformation to assign values to them. Also free the linreg_cache if the SAVE command was not called. Removed the casereading loop. Placed actual computation of residuals in regression_trns_proc. * regression.q (run_regression): Moved call to free pspp_linreg_cache to subcommand_save. Sat Apr 15 18:01:03 2006 Ben Pfaff * examine.q (output_examine): Add casts to fix warnings. 2006-04-07 Jason Stover * regression.q (subcommand_save): New function. 2006-04-04 Jason Stover * regression.q: New function reg_has_categorical () to tell whether a pspp_linreg_struct was made with any variables of type ALPHA. * regression.q: (subcommand_export): Call reg_print_categorical_encoding() only if the model uses any categorical variables. Mon Mar 27 16:00:42 2006 Ben Pfaff * crosstabs.q: (output_pivot_table) Drop spurious space from message. 2006-03-15 Jason Stover * regression.q: Added custom syntax parser for VARIABLES subcommand * regression.q: Moved most instructions for run_regression () inside the loop over dependent variables. Thu Mar 2 08:40:33 WST 2006 John Darrington * Moved files from src directory