X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=src%2Fdata%2FChangeLog;h=c404d820e53253a40a987bc55328928b25de3b9a;hb=4e30d33a680cceb0ac2ee3e78c94fdeb46ab2fcd;hp=238601c6d9bd838d6e8076580fb20dcd5c7a9c1e;hpb=b74d09af5e07f954c18e7cdb8aca3af47fa10208;p=pspp-builds.git diff --git a/src/data/ChangeLog b/src/data/ChangeLog index 238601c6..c404d820 100644 --- a/src/data/ChangeLog +++ b/src/data/ChangeLog @@ -1,3 +1,1436 @@ +2007-07-29 Ben Pfaff + + Provisional fix for bug #18692 and bug #20161. Reviewed by John + Darrington. + + * file-name.c (fn_open): Only pass "r" or "w" to popen as mode + argument (never "rb" or "wb") because SUSv3 says that only those + modes are defined, and glibc in fact rejects other modes. + + Open portable files with fn_open so that they can be read from + pipes. Fix missing fh_close call to go along with fh_open. + Report an error if the file close reports an error. + * por-file-reader.c (close_reader): New function. + (por_file_casereader_destroy): Use close_reader. + (pfm_open_reader): Open file with fn_open. + +2007-07-28 Ben Pfaff + + Make PSPP able to read all the portable files I could find on the + web. Thanks to John Darrington for review. Bug #17620. + * por-file-reader.c (struct pfm_reader): New member `line_length'. + (error): Print file offset in hexadecimal. + (warning): New function. + (advance): Treat lines less than 80 bytes long as padded to 80 + bytes with spaces. + (pfm_open_reader): Call read_documents if we find an "E" record. + (convert_format): Convert invalid formats to the default format + instead of aborting reading the file. + (read_variables): Rename duplicate variable names instead of + aborting reading the file. + (read_value_label): Allow string variables of different widths to + be assigned value labels in the same record. Replace duplicate + value labels instead of aborting. + (read_documents): New function. 
+ + * por-file-writer.c (pfm_open_writer): Call write_documents if the + dictionary has documents. + (write_documents): New function. + +2007-07-25 Ben Pfaff + + Fix bugs related to bug #17213. + + * settings.c: Use HAVE_LIBNCURSES instead of HAVE_LIBTERMCAP, + since the former is what config.h has. Include the needed ncurses + headers. + (static var echo) Rename to `do_echo' because the original name is + the same as an ncurses identifier. + (get_termcap_viewport) Use error instead of msg. + + * file-name.c (fn_interp_vars): Fix interpolation of $VARS. + (fn_close): Don't close stdin, stdout, stderr. + +2007-07-26 John Darrington + + * procedure.c procedure.h: Added callbacks which get invoked whenever + a dataset's transformation chain changes. + +2007-07-24 Ben Pfaff + + Fix bug #6113. + * sys-file-writer.c (write_variable_display_parameters): Use new + var_default_display_width function to choose display width of + segments after the first one in a given variable. + * variable.c (var_create): Use var_default_display_width to pick + new variable's display width. + (var_default_display_width): New function. + Reviewed by John Darrington. + +2007-07-24 Ben Pfaff + + Fix bug #20427. + * por-file-writer.c (write_variables): Write weight variable. + Reviewed by John Darrington. + +2007-07-23 Ben Pfaff + + Improvements to system file reader and writer. + + First, move all detailed knowledge of very long strings into + sys-file-private.[ch], so that this nasty stuff can be isolated. + + * sys-file-private.c (REAL_VLS_CHUNK): New macro. + (EFFECTIVE_VLS_CHUNK): New macro. + (min_int): New function. + (max_int): New function. + (sfm_width_to_bytes): Rewrite. + (sfm_width_to_octs): New function. + (sfm_segment_alloc_width): New function. + (sfm_segment_alloc_bytes): New function. + (sfm_segment_used_bytes): New function. + (sfm_segment_offset): New function. + (sfm_segment_effective_offset): New function. + (sfm_dictionary_to_sfm_vars): New function. 
+ + * sys-file-private.h (MIN_VERY_LONG_STRING): Removed. + (EFFECTIVE_LONG_STRING_LENGTH): Removed. + (struct sfm_var): New structure. + + Next, improvements to the system file reader. + + * sys-file-reader.h (struct sfm_read_info): Changed `case_cnt' to + type casenumber. Added `version_major', `version_minor', + `version_revision'. + + * sys-file-reader.c (struct sfm_reader): Replaced `flt64_cnt' by + `oct_cnt'. Rename `vars', `var_cnt' to `sfm_vars', `sfm_var_cnt'. + Change `case_cnt' to type casenumber. Removed `has_vls'. + (struct sfm_var): Removed. + (sfm_open_reader): Don't warn on wrong case size if the file was + written by SPSS 13, which tends to get it wrong. Use + sfm_dictionary_to_sfm_vars. + (read_header): Always output system file info. + (read_variable_record): Simplify code for reading missing values. + (read_machine_int32_info): Save version numbers from system file + into info struct passed as new argument. + (read_long_string_map): Restructured to use new sys-file-private + functions. + (read_value_labels): Use size_overflow_p. + (sys_file_casereader_read): Get rid of distinction between fast + and slow paths. Use information provided by sys-file-private's + struct sfm_var to simplify code. + (skip_whole_strings): New function. + (read_int32): Renamed read_int. Changed return value to int. + Updated all callers. + (read_flt64): Renamed read_float. Changed return value to + double. Updated all callers. + (int32_to_native): Removed. Changed callers to use + integer_convert. + (flt64_to_double): Removed. Changed callers to use float_convert. + + Finally, get rid of int32, flt64 terminology and types in system + file writer. The former wasn't very useful since a POSIX "int" + can hold the whole range of int32 and we generally didn't have a + need for it to be exactly-32-bits, just at-least-32-bits. 
The + latter was inconvenient because we had to assume that it could be + different from double and thereby convert special values SYSMIS, + HIGHEST, LOWEST to and from it in multiple places. Instead, now + we just use "int" and "double" in most places, and do conversions, + if necessary, very close to where we do I/O. This change meant + that the writer code couldn't represent records in the file as C + structs any longer, but that's no great loss. The code actually + seems to be more readable without them. + + Simplify the compression buffering code: only buffer as much as + necessary, which is no more than eight 8-byte units at any given + time. + + * sys-file-writer.c (typedef flt64): Removed. + (macro second_lowest_flt64): Removed. + (struct sysfile_header): Removed. + (struct sysfile_variable): Removed. + (struct sfm_writer): Removed `needs_translation', `has_vls', + `flt64_cnt'. Changed `compress' to type bool and `case_cnt' to + type casenumber. Renamed `vars' to `sfm_vars', `var_cnt' to + `sfm_var_cnt'. Replaced `buf', `end', `ptr', `x', `y' for + compression buffering by `opcodes', `opcode_cnt', `data', + `data_cnt'. Renamed `var_cnt_vls' as `segment_cnt'. + (sfm_open_writer): Use sfm_dictionary_to_sfm_vars. Use simple + data writer functions instead of structures. + (calc_oct_idx): New function. + (write_header): Use simple data writer functions instead of + structures. + (write_format_spec): Renamed write_format. New argument. + (write_variable_continuation_records): New function. + (write_variable): Use simple data writer functions instead of + structures. Use write_variable_continuation_records. Write + entire very long string instead of requiring caller to understand + them. + (write_value_labels): Use simple data writer functions instead of + structures. + (write_documents): Ditto. + (write_variable_display_parameters): Use sys-file-private + functions to simplify. Use simple data writer functions instead + of structures. 
+ (write_vls_length_table): Use simple data writer functions instead + of structures. + (write_longvar_table): Ditto. + (write_rec_7_34): Break into new functions + write_integer_info_record, write_float_info_record. Use simple + data writer functions instead of structures. + (buf_write): Removed. + (append_string_max): Removed. + (ensure_buf_space): Removed. + (sys_file_casewriter_write): Get rid of the distinction between + fast and slow paths, which didn't seem to be too useful. Use new + functions write_case_uncompressed, write_case_compressed. + (put_instruction): Removed. + (put_element): Removed. + (write_compressed_data): Removed. + (close_writer): Use flush_compressed. Only write case count to + system file if it will fit in the field. + (write_case_compressed): New function. + (write_case_uncompressed): New function. + (flush_compressed): New function. + (put_cmp_opcode): New function. + (put_cmp_number): New function. + (write_int): New function. + (convert_double_to_output_format): New function. + (write_float): New function. + (write_value): New function. + (write_string): New function. + (write_bytes): New function. + (write_zeros): New function. + (write_spaces): New function. + + Reviewed by John Darrington. + +2007-07-22 Ben Pfaff + + Don't try to write very long strings to portable files. The + format does not support it. + + * por-file-writer.c (MAX_POR_WIDTH): New macro. + (pfm_open_writer): Limit output width to MAX_POR_WIDTH. + (write_format): Add arg to take width to resize format to. + (write_value): Limit width of value written to MAX_POR_WIDTH. + (write_variables): Limit width of variable and its output formats + to MAX_POR_WIDTH. + Reviewed by John Darrington. + +2007-07-22 Ben Pfaff + + * sys-file-reader.c (read_variable_to_value_map): Use max_warnings + local variable instead of literal 5. + Reviewed by John Darrington. 
+ +2007-07-22 Ben Pfaff + + Fix problems with uniqueness of short names in system files with + very long string variables. Now a variable may have multiple + short names. + + * automake.mk (src_data_libdata_a_SOURCES): Add new files + short-names.c, short-names.h. + + * dictionary.c (dict_clone): Clone all the short names. + (compare_strings): Move into short-names.c. + (hash_strings): Ditto. + (set_var_short_name_suffix): Ditto. + (dict_assign_short_names): Ditto, rename short_names_assign, + change to assign all short names. + + * por-file-writer.c (write_variables): Use short_names_assign + instead of dict_assign_short_names. + + * short-names.c: New file. + + * short-names.h: New file. + + * sys-file-private.c (sfm_width_to_segments): New function. + + * sys-file-reader.c (read_long_var_name_map): Save and restore all + the short names, not just the first one. + + * sys-file-writer.c (cont_var_name): Removed. + (sfm_open_writer): Use short_names_assign instead of + dict_assign_short_names. Use unique short names assigned by + short_names_assign instead of those generated by cont_var_name. + + * variable.c (struct variable): Remove `short_name' member, + replace by `short_names' and `short_name_cnt'. + (var_create) Initialize new members. + (var_get_short_name_cnt): New function. + (var_get_short_name): Now takes an index argument. Changed most + callers to pass 0. + (var_set_short_name): Ditto. + (var_clear_short_name): Renamed var_clear_short_names, changed to + clear all short names. + + Reviewed by John Darrington. + +2007-07-22 Ben Pfaff + + * variable.c (var_set_width): Use new var_set_width function. + + * missing-values.c (mv_n_values): Drop assertion, which was not + needed. + + * format.c (fmt_default_for_width): New function. + (fmt_resize): New function. + + Reviewed by John Darrington. + +2007-07-18 John Darrington + + * datasheet.c (datasheet_delete_columns): Added assertion to check + we're not deleting outside the range of the sheet. 
+ + + * dictionary.c dictionary.h variable.c: Added the ability for string + variables to be resized. + + * vardict.h: Added some prototypes (moved from dictionary.h) as + these should only be called by variable.c + + +2007-07-14 John Darrington + + * sfm-reader.c: Respect case_cnt field in file header. + +2007-07-01 John Darrington + + * transformation.c transformation.h (trns_chain_execute): Changed the + signature (Patch #6057) + +2007-06-10 Ben Pfaff + + * casereader-filter.c (casereader_filter_destroy): Make sure to + write all the remaining excluded cases to the casewriter, if any. + + * caseinit.c (init_list_destroy): Rewrite. + (init_list_clear): Ditto. + + * casegrouper.c (casegrouper_get_next_group): Always set *reader + to null when returning false. + +2007-06-06 Ben Pfaff + + Actually implement the new procedure code and adapt all of its + clients to match. Also adapt all of the other case sources and + sinks in the tree and their clients to use the + casereader/casewriter infrastructure. + + * automake.mk: Add and remove files. + + * any-reader.c: Change into a casereader. + * por-file-reader.c: Ditto. + * scratch-reader.c: Ditto. + * sys-file-reader.c: Ditto. + + * any-writer.c: Change into a casewriter. + * por-file-writer.c: Ditto. + * scratch-writer.c: Ditto. + * sys-file-writer.c: Ditto. + + * procedure.c: Change to use casereader, casewriter, caseinit, and + other new infrastructure. + + * scratch-handle.c: Adapt to new infrastructure. + + * case-sink.c: Removed, now dead code. + * case-sink.h: Ditto. + * case-source.c: Ditto. + * case-source.h: Ditto. + * casefile-factory.c: Ditto. + * casefile-private.h: Ditto. + * casefile.c: Ditto. + * casefile.h: Ditto. + * casefilter.c: Ditto. + * casefilter.h: Ditto. + * fastfile.c: Ditto. + * fastfile.h: Ditto. + * fastfile-factory.c: Ditto. + * fastfile-factory.h: Ditto. + * storage-stream.c: Ditto. + * storage-stream.h: Ditto. + +2007-06-06 Ben Pfaff + + Add datasheet code. 
+ + * automake.mk: Add new files. + + * datasheet.c: New file. + + * datasheet.h: New file. + +2007-06-06 Ben Pfaff + + Until now, the procedure code has provided a case to the + case_source, which has filled in the data values that come from + the active file. "Left" data values that don't come from the + active file naturally stay the same from case to case, because the + procedure code keeps using that same case. + + One of the compromises that comes with the new procedure code is + that the active file allocates and provides its own case, which + the procedure code then has to resize to provide room for any + other variables that should go in the case and then fill in the + values of "left" variables. Then, when we're done with that case, + we have to save the values of "left" variables to copy into the + next case read from the active file. + + The caseinit code helps with this. + + * automake.mk: Add new files. + + * caseinit.c: New file. + + * caseinit.h: New file. + +2007-06-06 Ben Pfaff + + * value.h (value_cnt_from_width): New function. + + * variable.c (var_get_value_cnt): Use new function. + +2007-06-06 Ben Pfaff + + Add casegrouper, to allow cases read from a given casereader to be + broken into groups, each of which has its own casereader. + Generally cases are grouped based on having equal values for some + set of variables. + + * automake.mk: Add new files. + + * casegrouper.c: New file. + + * casegrouper.h: New file. + +2007-06-06 Ben Pfaff + + Add interface to lexicographical ordering of cases. + + * automake.mk: Add new files. + + * case-ordering.c: New file. + + * case-ordering.h: New file. + +2007-06-06 Ben Pfaff + + Add casereaders and casewriters, the basis of the new data processing + implementation. A casereader is a uniform interface to reading cases + from a data source; a casewriter is a uniform interface to writing + cases to a data sink. + + * automake.mk: Add new files. + + * casereader-filter.c: New file. 
+ + * casereader-provider.h: New file. + + * casereader-translator.c: New file. + + * casereader.c: New file. + + * casereader.h: New file. + + * casewriter-provider.h: New file. + + * casewriter-translator.c: New file. + + * casewriter.c: New file. + + * casewriter.h: New file. + +2007-06-06 Ben Pfaff + + "casewindow" data structure that extends the deque (from libpspp) + of cases with the ability to dump cases to disk if we get too many + of them in memory. + + * automake.mk: Add new files. + + * casewindow.c: New file. + + * casewindow.h: New file. + +2007-06-06 Ben Pfaff + + sparse_cases data structure that augments a sparse_array of cases + with the ability to dump cases to disk if we get too many cases in + memory. + + * automake.mk: Add new files. + + * sparse-cases.c: New file. + + * sparse-cases.h: New file. + +2007-06-06 Ben Pfaff + + Adds a low-level on-disk case array data structure. + + * automake.mk: Add new files. + + * case-tmpfile.c: New file. + + * case-tmpfile.h: New file. + +2007-06-06 Ben Pfaff + + In a couple of places we calculate the maximum number of cases to + keep in memory based on the user-defined workspace. Enable + centralizing the calculation through a new function. + + * settings.c (get_workspace_cases): New function. + +2007-06-06 Ben Pfaff + + The casenumber type is defined in transformations.h, but case.h is + a more sensible place. Move it. + + * case.h (CASENUMBER_MAX): New macro. + (typedef casenumber): Move here, from transformations.h. + +2007-06-03 Ben Pfaff + + Slightly generalize case_to_values and case_from_values functions. + + * case.c (case_to_values): Rename case_copy_out, change interface. + (case_from_values): Rename case_copy_in, change interface. + + * fastfile.c (fastfilereader_get_next_case): Update caller. + (write_case_to_disk): Ditto. + +2007-06-02 Ben Pfaff + + Clean up after a forgotten part of patch #5829. + + * casedeque.h: Remove unused file. + + * automake.mk: Remove casedeque.h from sources. 
+ +2007-05-10 Jason Stover + + * category.c: Removed redundant #include + +2007-05-06 Ben Pfaff + + Abstract the documents within a dictionary a little better. + Thanks to John Darrington for suggestion, initial version, and + review. Patch #5917. + + * dictionary.c (struct dictionary): Change `documents' member from + char * to struct string. + (dict_clear): Destroy struct string. + (dict_get_documents): Convert struct string to char *. + (dict_set_documents): Set struct string. Pad to 80-character + multiple. + (dict_clear_documents): New function. + (dict_add_document_line): New function. + (dict_get_document_line_cnt): New function. + (dict_get_document_line): New function. + + * dictionary.h (macro DOC_LINE_LENGTH): New macro. + + * sys-file-reader.c (read_documents): Use new document functions. + +2007-04-19 John Darrington + + * sys-file-reader.c: When reading a system file which has no + long name table, automatically create one where the long names + are the lower case versions of the short names. + +2007-04-22 Ben Pfaff + + * dictionary.c (dict_set_split_vars): dict_destroy expects that + dict_clear will free most data related to the dictionary. + dict_clear does a decent job, except that dict_set_split_vars on + some systems won't actually free the dict's "split" member. + Instead, it'll allocate a 1-byte region. Fix this. + + * value.c (value_copy): New function. + (value_set_missing): Ditto. + +2007-04-22 John Darrington + + * Deleted existing category.h and moved cat-routines.h into + category.h Encapsulated struct cat_vals better. + +2007-04-19 John Darrington + + * sys-file-reader.c: When reading a system file which has no + long name table, automatically create one where the long names + are the lower case versions of the short names. + +2007-04-16 John Darrington + + * sys-file-reader.c: Some versions of Other Software seem to + produce system files with string variables' measure set to + zero. 
We'll assume these are supposed to be nominal variables. + +2007-03-30 Ben Pfaff + + * procedure.c: Adapt to new deque data structure. + +Mon Feb 19 10:53:21 2007 John McCabe-Dansted + Ben Pfaff + + * file-name.c: Mingw compatibility fixes. + (fn_search_path): Use ISSLASH instead of comparing against '/' + directly. + (fn_dir_name): Use dir_name from gnulib. + (fn_is_absolute): Use IS_ABSOLUTE_FILE_NAME from gnulib. + (fn_get_identity): Use GetFullPathName instead of canonicalize + from gnulib, because the latter does not fully support + Windows-style path names. Use this implementation based on the + detected presence of Windows instead of the absence of Unix, since + the new implementation is Windows-specific. + (fn_compare_file_identities): In Windows implementation, compare + names case-insensitively. + +Sun Feb 18 13:28:02 2007 Ben Pfaff + + * make-file.c: Don't include mkstemp.h, because gnulib now causes + to have the same effect. + +Sun Feb 18 11:20:24 2007 Ben Pfaff + + * por-file-reader.c: Add missing _() around messages. + +Sun Feb 11 20:44:13 2007 Ben Pfaff + + * make-file.c: Include "mkstemp.h", without which linking on + mingw32 fails. + +Thu Feb 8 14:59:05 2007 Ben Pfaff + + Reduce platform dependence. + + * file-name.c (fn_tilde_expand): Removed, and removed calls to it. + Everywhere we were using this, we really should have just depended on + the shell to expand tildes. + (fn_search_path): Simplify, given that we don't do tilde expansion + any longer. + (fn_normalize): Removed. Caller changed to use the canonicalize + module from gnulib. + (fn_get_cwd): Removed. Only user was fn_normalize. + (fn_is_absolute): Really only test for absolute names. + (fn_is_special): Use pipe files if HAVE_POPEN, not if we're in + unix. + (fn_readlink): Removed, as it was only used by fn_normalize. + (fn_exists): Assume the stat function is available; gnulib does. + (fn_open): Use pipe files if HAVE_POPEN, not if we're in unix. 
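The pipe-file handling mentioned for fn_open above can be illustrated with a minimal sketch. This is not the code in file-name.c: the `struct stream`, `stream_open` and `stream_close` names are hypothetical, and the naming convention (a leading `|` to write to a command, a trailing `|` to read from one) is an assumption made for the example. The one firm point is that POSIX defines only "r" and "w" as popen modes, so the sketch never passes anything else.

```c
#define _POSIX_C_SOURCE 200809L   /* Ask for the popen/pclose declarations. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper that remembers whether the stream is a pipe,
   so that it can later be closed with the matching function. */
struct stream
  {
    FILE *file;
    bool is_pipe;
  };

/* Opens NAME for reading (DIRECTION 'r') or writing (DIRECTION 'w').
   Assumed convention: "|cmd" writes to CMD, "cmd|" reads from CMD,
   and anything else is an ordinary file.  Returns true on success. */
bool
stream_open (struct stream *s, const char *name, char direction)
{
  const char mode[2] = { direction, '\0' };
  size_t len;

  assert (s != NULL && name != NULL);
  s->file = NULL;
  s->is_pipe = false;
  if (direction != 'r' && direction != 'w')
    return false;               /* Only "r" and "w" are portable. */

  len = strlen (name);
  if (direction == 'w' && name[0] == '|')
    {
      s->is_pipe = true;
      s->file = popen (name + 1, mode);
    }
  else if (direction == 'r' && len > 0 && name[len - 1] == '|')
    {
      /* Copy the command without its trailing '|'. */
      char *command = malloc (len);
      if (command == NULL)
        return false;
      memcpy (command, name, len - 1);
      command[len - 1] = '\0';
      s->is_pipe = true;
      s->file = popen (command, mode);
      free (command);
    }
  else
    s->file = fopen (name, mode);

  return s->file != NULL;
}

/* Closes S with pclose or fclose as appropriate and returns that
   function's result, or 0 if S was not open. */
int
stream_close (struct stream *s)
{
  int status;

  if (s->file == NULL)
    return 0;
  status = s->is_pipe ? pclose (s->file) : fclose (s->file);
  s->file = NULL;
  return status;
}
```

Keeping the pipe flag next to the FILE pointer means the caller cannot mix up fclose and pclose, and the close result can be reported as an error by whoever owns the stream.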
+ +Sat Feb 3 21:52:17 2007 Ben Pfaff + + * dictionary.c (dict_create_vector_assert): New function. + +Wed Feb 7 21:25:15 2007 Ben Pfaff + + * file-name.c (fn_normalize): Correct name of function + fn_is_special. Thanks to John McCabe-Dansted + for pointing this out. + +Thu Feb 1 16:53:37 2007 Ben Pfaff + + We are using a single member in struct file_handle, the "name" + field, for more than one purpose. When it begins with '"', it's a + file name; otherwise, it's a token that can be used to identify + it. When that assertion fires, it's because we searched for the + name case-sensitively as a file name (so that there was no match), + and then we try to insert it case-insensitively as a token, which + fails because duplicates aren't allowed. + + Solution: break the two purposes into two separate fields. This + fixes the problem and likely makes the code easier to read too. + + Fixes bug #18922. Thanks to John Darrington for bug report and + review. + + * file-handle-def.c (struct file_handle): New `id' member. + (fh_from_name): Rename fh_from_id. Update all callers. + (create_handle): New `id' parameter. Update all callers. + (fh_create_file): Ditto. + (fh_get_id): New function. + +Mon Jan 15 16:18:10 2007 Ben Pfaff + + * case.c (case_is_null): Change return type to bool. + +Mon Jan 15 10:57:28 2007 Ben Pfaff + + Add debugging code. + + * case.c (case_clone) [DEBUGGING]: When debugging, don't use + reference counting to share data. This makes it easy for + valgrind, etc. to find accesses to cases that have been destroyed + but have been kept around by another user's ref-count. This often + happens when the data set is small enough to find in memory; if a + bigger data set that would overflow to disk were used, then data + corruption would occur. + +Mon Jan 15 10:55:18 2007 Ben Pfaff + + Simplify code. + + * case.c (case_unshare): Make it check internally whether the + ref_cnt is greater than 1, so that the callers don't have to. + Update callers not to check. 
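The case_unshare change above moves the "is this shared?" test into the function itself. A minimal sketch of that pattern follows, under stated assumptions: `struct shared_buf`, `struct handle` and the `handle_*` functions are hypothetical stand-ins invented for illustration, not PSPP's struct case or its API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted payload. */
struct shared_buf
  {
    size_t ref_cnt;             /* Number of handles sharing this payload. */
    size_t n;                   /* Number of values. */
    double values[];            /* The values themselves. */
  };

/* Hypothetical handle that may share its payload with other handles. */
struct handle
  {
    struct shared_buf *buf;
  };

/* Creates H with room for N zeroed values and a reference count of 1. */
void
handle_create (struct handle *h, size_t n)
{
  h->buf = calloc (1, sizeof *h->buf + n * sizeof (double));
  assert (h->buf != NULL);
  h->buf->ref_cnt = 1;
  h->buf->n = n;
}

/* Makes DST share SRC's payload instead of copying it. */
void
handle_clone (struct handle *dst, const struct handle *src)
{
  dst->buf = src->buf;
  dst->buf->ref_cnt++;
}

/* Gives H a private copy of its payload if, and only if, the payload
   is shared.  The check lives here, so callers just call this
   unconditionally before writing. */
void
handle_unshare (struct handle *h)
{
  struct shared_buf *old = h->buf;
  size_t size;

  if (old->ref_cnt <= 1)
    return;                     /* Already private: nothing to do. */

  size = sizeof *old + old->n * sizeof (double);
  h->buf = malloc (size);
  assert (h->buf != NULL);
  memcpy (h->buf, old, size);
  h->buf->ref_cnt = 1;
  old->ref_cnt--;
}

/* Drops H's reference, freeing the payload with the last one. */
void
handle_destroy (struct handle *h)
{
  if (--h->buf->ref_cnt == 0)
    free (h->buf);
  h->buf = NULL;
}
```

A writer then calls handle_unshare before modifying values, without first inspecting ref_cnt; calling it on an unshared handle is a cheap no-op.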
+ +Mon Jan 15 10:53:01 2007 Ben Pfaff + + Before, I was thinking that I might want to get rid of reference + counting at some point. Now, I'm pretty sure that it's here to + stay. Thus, because we have to store the value_cnt anyway for + reference-counted cases, we might as well expose it to users. + + * case.c (case_get_value_cnt): New function. + (case_resize): Drop OLD_CNT argument. Update all callers. Only + resize case if its size actually changed. + + * casefile.c (casefile_append_xfer): Use case_get_value_cnt + instead of peeking inside struct case directly. + (casefile_append): Ditto. + +Mon Jan 15 10:50:22 2007 Ben Pfaff + + Get rid of the inlines for the case functions, which made the + header file hard to read. (Also, in testing with "-O2 -DNDEBUG", + the inlines didn't speed up "make check" at all, which is not a + perfect benchmark but seems indicative.) + + * case.c: Remove #ifdef DEBUGGING...#endif around many function + definitions. Remove some assertions on nonnull pointers that were + redundant with a pointer dereference soon after in the function. + Also: + (struct case_data): Move definition here from case.h. + (case_data): Ditto. + (case_num): Ditto. + (case_str): Ditto. + (case_data_wr): Ditto. + +Sun Jan 14 21:41:12 2007 Ben Pfaff + + * automake.mk: Add casedeque.h to sources. + + * casedeque.h: New file. + + * procedure.c: (struct dataset) Change lag_count, lag_head, + lag_queue member into single struct casedeque member. Update all + users to use the casedeque instead. + (lag_case) Removed. + +Sun Jan 14 21:43:12 2007 Ben Pfaff + + * procedure.c: Simplify lagged cases interface. Updated all + clients--well, the only client--to use the simplified interface. + (dataset_n_lag) Removed. + (dataset_set_n_lag) Removed. + (dataset_need_lag) New function. 
+ +Tue Jan 9 07:20:05 WST 2007 John Darrington + + * dictionary.c procedure.c: More changes to ensure that callbacks occur + whenever appropriate, but only when the dataset/dictionary is in a + consistent state. + +Sun Jan 7 08:33:04 WST 2007 John Darrington + + * dictionary.c dictionary.h : Added callbacks for change of filter and + split variables. Refactored some code to ensure that callbacks get + invoked when appropriate. + + * procedure.c (proc_cancel_temporary_transformations): Make sure that + replace_dict callback occurs when permanent_dict replaces the current + dictionary. + +Wed Jan 3 11:02:11 WST 2007 John Darrington + + * dictionary.c dictionary.h : Added callback for when the weight + variable of a dictionary changes. + +Mon Jan 1 10:36:26 WST 2007 John Darrington + + * dictionary.c dictionary.h : Added replace_source and replace_dict + callbacks, and functions to deal with them. + +Fri Dec 22 13:56:08 2006 Ben Pfaff + + Simplify missing value handling. + + * missing-values.h (enum mv_class): New type. + (enum mv_type): Moved definition into missing-values.c and renamed + each MV_* to MVT_*, to distinguish them from the exposed mv_class + enums. Updated all uses. + (struct missing_values): Changed type of `type' from `enum + mv_type' to `int' because the definition is no longer exposed. + + * missing-values.c (mv_is_value_missing): Add new enum mv_class + parameter. Update all callers. + (mv_is_num_missing): Ditto. + (mv_is_str_missing): Ditto. + (mv_is_value_user_missing): Removed. Changed callers to use + mv_is_value_missing. + (mv_is_num_user_missing): Removed. Changed callers to use + mv_is_num_missing. + (mv_is_str_user_missing): Removed. Changed callers to use + mv_is_str_missing. + (mv_is_value_system_missing): Removed. Changed callers to use + mv_is_value_missing. + (mv_set_type): Removed. Changed callers to use mv_clear. + (mv_clear): New function. + + * variable.c (var_is_value_missing): Add new enum mv_class + parameter. 
Update all callers. + (var_is_num_missing): Ditto. + (var_is_str_missing): Ditto. + (var_is_value_user_missing): Removed. Changed callers to use + var_is_value_missing. + (var_is_num_user_missing): Removed. Changed callers to use + var_is_num_missing. + (var_is_str_user_missing): Removed. Changed callers to use + var_is_str_missing. + (var_is_value_system_missing): Removed. Changed callers to use + var_is_value_missing. + + * casefilter.c (struct casefilter): Use enum mv_class in place of + bool. + (casefilter_variable_missing): Adapt to new member. + (casefilter_create): Change signature to take enum mv_class, + update callers. + +Fri Dec 22 20:08:38 WST 2006 John Darrington + + * casefile-factory.h fastfile-factory.c fastfile-factory.h: New files. + + * case-sink.c case-sink.h procedure.c procedure.h + storage-stream.c: Now uses the factory. + +Sat Dec 16 22:05:18 2006 Ben Pfaff + + Make it possible to pull cases from the active file with a + function call, instead of requiring indirection through a callback + function. + + * case-source.h (struct case_source_class): Change ->read function + to return a single case, instead of calling a callback function + for each case. Change ->destroy function to return an error + status. + + * case-source.c (free_case_source): Pass along the value returned + by the case_source ->destroy function. + + * procedure.c (struct write_case_data): Removed. + (struct dataset): Added some members to track procedure state. + (procedure): Optimize the trivial case at this level. + (internal_procedure): Re-implement in terms of proc_open, + proc_read, proc_close. + (proc_open) New function. + (proc_read) New function. + (proc_close) New function. + (write_case) Moved into proc_read. + (close_active_file) Moved closing of data source into proc_close. + + * storage-source.c: Rewrote to conform with modified + case_source_class interface. 
+ + * transformations.c (trns_chain_execute): Added argument to allow + starting execution from an arbitrary transformation. Updated + callers. + + * transformations.h (enum TRNS_NEXT_CASE) Renamed TRNS_END_CASE. + +Sat Dec 16 14:09:25 2006 Ben Pfaff + + * sys-file-reader.c (read_display_parameters): Don't assume that + MEASURE_* and ALIGN_* have the same values found in system files. + + * sys-file-writer.c (write_variable_display_parameters): Ditto. + + * variable.h: Change MEASURE_NOMINAL, MEASURE_ORDINAL, + MEASURE_SCALE to be 0-based instead of 1-based. This also fixes + the value of n_MEASURES, which was off by 1 (at least from my + point of view). + +Sat Dec 16 12:17:34 WST 2006 John Darrington + + * dictionary.c dictionary.h vardict.h variable.c: Added optional + callbacks which are invoked when the dictionary or its + variables are changed. + + * missing-values.c missing-values.h value-labels.c: Tidied up + consistency checks, and made some of them return false + instead of assert-failing. + +Wed Dec 13 19:30:11 2006 Ben Pfaff + + * calendar.c (calendar_days_in_month): New function. + +Mon Dec 11 07:53:39 2006 Ben Pfaff + + * value-labels.c (hash_int_val_lab): Only hash as many bytes as + the value label's width. + +Sun Dec 10 14:21:29 2006 Ben Pfaff + + * sfm-private.h: Move contents into sys-file-writer.c, which is + the only remaining user. Removed Borland C++-specific directives. + + * sys-file-reader.c: Clean up and rewrite entire file. The + rewritten version is simpler and better abstracted, and should be + easier to maintain and extend. It avoids using structures to read + file data, which is prone to padding variations among compilers. + It should also handle non-IEEE 754 system files, although I + haven't been able to find any. It has been tested against many + .sav files obtained from the Web and found to produce the same + results as the earlier version of the code, or in some cases + improved results. 
It is more tolerant of format variations found + in the wild. + + * sys-file-reader.h (struct sfm_read_info): Removed `big_endian' + member, putting an enum integer_format in its place. New member + `float_format'. Changed `compressed' member to type bool. + +Sun Dec 10 13:48:53 2006 Ben Pfaff + + * dictionary.c (dict_delete_consecutive_vars): New function. + +Sat Dec 9 20:08:25 2006 Ben Pfaff + + * file-name.c (fn_search_path): Remove prefix arg that was unused + by any caller. Updated all callers. + +Sat Dec 9 20:04:22 2006 Ben Pfaff + + * format.c (fmt_dollar_template): Use user's decimal point + character. Add assertion. + +Sat Dec 9 20:02:25 2006 Ben Pfaff + + * format.c (fmt_dollar_template): New function, based on + dollar_format_template from var-type-dialog.c. + +Sat Dec 9 18:05:59 2006 Ben Pfaff + + * data-out.c (output_scientific): Fix bad assumption that "buf" is + null-terminated. + +Sat Dec 9 17:23:23 2006 Ben Pfaff + + Finish converting struct variable to an opaque type. In this + phase, we add remaining setter and getter functions, convert the + remaining PSPP code to use them, and do a bunch of cleanup. The + resulting changes are pervasive but mostly trivial, and only the + notable changes are logged. + + * automake.mk (src_data_libdata_a_SOURCES): Add the new source + files. + + * case.c (case_data): Renamed case_data_idx. + (case_num): Renamed case_num_idx. + (case_str): Renamed case_str_idx. + (case_data_rw): Renamed case_data_rw_idx. + + * case.h (case_data): New function with old name and an interface + that takes a variable instead of an index, which is easier to + use. Updated all callers to use the new interface, or to use the + new *_idx function (see above). + (case_num): Ditto. + (case_str): Ditto. + (case_data_rw): Ditto. + + * category.c (cat_stored_values_destroy): Changed interface to + take a struct cat_vals * instead of a struct variable *. + + * dictionary.c (dict_clone): Use new vector_clone function. 
+ (dict_clear) Use new var_destroy function. + (add_var) New function. + (dict_create_var) Rewrite in terms of dict_create_var_assert. + (dict_create_var_assert) Rewrite in terms of add_var. + (dict_clone_var) Rewrite in terms of dict_clone_var_assert. + (dict_clone_var_assert) Rewrite in terms of var_clone, add_var. + (dict_lookup_var) Use new var_create, var_destroy functions. + (dict_contains_var) Rewrite in terms of new vardict functionality. + (set_var_dict_index) New function. + (set_var_case_index) New function. + (reindex_vars) New function. + (dict_delete_var) Rewrite in terms of new vardict functionality. + (dict_reorder_var) Ditto. + (dict_reorder_vars) Ditto. + (rename_var) New function. + (dict_rename_var) Use rename_var. + (dict_rename_vars) Use pool to simplify code. Use rename_var. + (dict_get_compacted_idx_to_fv) Rename + dict_get_compacted_dict_index_to_case_index, update callers. + (dict_create_vector) Use new vector_create function. + (dict_clear_vectors) Use new vector_destroy function. + (set_var_short_name_suffix) Move here from variable.c, renamed + from var_set_short_name_suffix, make static, update caller. + + * sys-file-private.c: New file. + (sfm_width_to_bytes) Moved here from variable.c, renamed from + width_to_bytes, update callers. + + * sys-file-private.h: New file. Later it will supplant + sfm-private.h; for now it supplements it. + (macro MIN_VERY_LONG_STRING) New macro. + (macro EFFECTIVE_LONG_STRING_LENGTH) New macro, from value.h. + + * sys-file-reader.c: Use MIN_VERY_LONG_STRING - 1 where + MAX_LONG_STRING was used before. + + * sys-file-writer.c: Ditto. + + * value-labels.c: Change the paradigm here to be that a null + pointer is OK for a struct val_labs * in most cases; it just + represents an empty set of value labels. + (val_labs_copy) A copy of a null set is a null set. + (val_labs_count) A null set has 0 labels. + (val_labs_replace) Change return type to void. Rewrite for + simplicity. 
+ (val_labs_find) A null set does not contain the value. + (value_to_string) Moved to variable.c, renamed var_get_value_name, + transposed argument order, updated all callers. + + * value.c: New file. + (value_dup) Moved here from variable.c. + (compare_values) Ditto. + (hash_value) Ditto. + + * value.h: (macro MAX_SHORT_STRING) Rewrote for simplicity. + (macro MAX_LONG_STRING) Removed, because it was only interesting + for system files, not for general code. + (macro MAX_VERY_LONG_STRING) Ditto. + (macro EFFECTIVE_LONG_STRING_LENGTH) Moved to sys-file-private.h. + (macro MAX_ELEMS_PER_VALUE) Removed, as it was unused. + + * vardict.h: New file, for an interface between variables and + their dictionaries. + + * variable.c: A lot of functions were moved around, for better + organization. + (struct variable) Move definition here, from variable.h. + (var_type_adj) Removed--makes i18n hard. + (var_type_noun) Ditto. + (var_create) New function. + (var_clone) New function. + (var_destroy) New function. + (var_set_name) Assert that variable is not in a dictionary. + (compare_var_names) Rename compare_vars_by_name and fix a couple + of callers who thought the args were strings. + (hash_var_name) Rename hash_var_by_name. + (compare_var_ptr_names) Rename compare_var_ptrs_by_name. + (hash_var_ptr_name) Rename hash_var_ptr_by_name. + (var_is_very_long_string) Removed, because it was only interesting + to system file code. + (var_set_missing_values) Allow the argument to be the wrong width, + as long as we can resize it. Simplify callers who were doing the + resizing themselves. + (var_get_value_labels) New function. + (var_has_value_labels) New function. + (var_set_value_labels) New function. + (alloc_value_labels) New function. + (var_add_value_label) New function. + (var_replace_value_label) New function. + (var_clear_value_labels) New function. + (var_lookup_value_label) New function. 
+ (var_get_value_name) Moved here from value-labels.c, renamed from value_to_string, transposed argument order, updated all + callers. + (var_to_string) Moved here, from variable-label.c. + (var_set_leave) New function. + (var_get_leave) New function. + (var_must_leave) New function. + (var_set_short_name_suffix) Moved to dictionary.c, renamed + set_var_short_name_suffix. + (var_get_dict_index) New function. + (var_get_case_index) New function. + (var_get_obs_vals) New function. + (var_set_obs_vals) New function. + (var_has_obs_vals) New function. + (var_get_vardict) New function. + (var_set_vardict) New function. + (var_has_vardict) New function. + (var_clear_vardict) New function. + (value_dup) Moved to value.c. + (compare_values) Ditto. + (hash_value) Ditto. + + * variable.h: (enum NUMERIC) Rename VAR_NUMERIC, update all users. + (enum ALPHA) Rename VAR_STRING, update all users. + + * vector.c: New file. + (struct vector) Moved here, from variable.h. + (check_widths) New function. + (vector_create) New function. + (vector_clone) New function. + (vector_destroy) New function. + (vector_get_name) New function. + (vector_get_var) New function. + (vector_get_var_cnt) New function. + (compare_vector_ptrs_by_name) New function. + + * vector.h: New file. + +Sun Dec 10 11:32:56 WST 2006 John Darrington + + * casefilter.c (casefilter_variable_missing): Avoided comparison of + string variables to SYSMIS. Thanks to Ben Pfaff for reporting this + problem. + +Sat Dec 9 07:18:03 WST 2006 John Darrington + + * value-labels.c (destroy_atoms): New function. + * value-labels.c (atom_create): Call destroy_atoms in atexit handler. + +Thu Dec 7 17:38:26 2006 Ben Pfaff + + Thanks to Jason Stover for pointing out this problem. + + * data-out.c (output_number): Use gsl_finite from GSL, which is + portable, instead of isfinite, which is not. + (power256) Ditto. + +Thu Dec 7 15:22:38 WST 2006 John Darrington + + * variable.c variable.h (value_dup): New function. 
+ +Mon Dec 4 22:20:17 2006 Ben Pfaff + + Start converting struct variable to an opaque type. In this + phase, we add a bunch of setter and getter functions and convert + most of the PSPP code to use them. The resulting changes are + pervasive but mostly trivial, and only the notable changes are + logged. + + * format.c (fmt_equal): New function. + + * variable.c (var_type_is_valid): New function. + (measure_is_valid) Moved here, from format.c. + (alignment_is_valid) Moved here, from format.c. + (var_get_name) New function. + (var_set_name) New function. + (width_to_type) New function. + (var_get_type) New function. + (var_get_width) New function. + (var_set_width) New function. + (var_is_numeric) New function. + (var_is_alpha) New function. + (var_is_short_string) New function. + (var_is_long_string) New function. + (var_is_very_long_string) New function. + (var_get_missing_values) New function. + (var_set_missing_values) New function. + (var_clear_missing_values) New function. + (var_has_missing_values) New function. + (var_is_value_missing) New function. + (var_is_num_missing) New function. + (var_is_str_missing) New function. + (var_is_value_user_missing) New function. + (var_is_num_user_missing) New function. + (var_is_str_user_missing) New function. + (var_is_value_system_missing) New function. + (var_get_print_format) New function. + (var_set_print_format) New function. + (var_get_write_format) New function. + (var_set_write_format) New function. + (var_set_both_formats) New function. + (var_get_label) New function. + (var_set_label) New function. + (var_clear_label) New function. + (var_has_label) New function. + (var_get_measure) New function. + (var_set_measure) New function. + (var_get_display_width) New function. + (var_set_display_width) New function. + (var_get_alignment) New function. + (var_set_alignment) New function. + (var_get_value_cnt) New function. + (var_get_leave) New function. + (var_get_short_name) New function. 
+ + * variable.h: (struct variable) Removed "type" and "nv" members; + they are now computed from "width" where needed. + +Mon Dec 4 21:38:40 2006 Ben Pfaff + + * missing-values.c (mv_resize): Don't write beyond end of the + allocated buffer when resizing a long string. + +Sat Dec 2 16:28:32 2006 Ben Pfaff + + Clean up identifier code: don't require identifier enumerations to + be in a particular order; make better use of string library; + expose less of the internals. + + * identifier.c: (lex_skip_identifier) Rename lex_id_get_length, + change interface. Updated all callers. + (lex_id_match) Change interface to use struct substring, update + all callers. + (lex_id_match_len) Removed. Update callers to use lex_id_match. + (global array keywords[]) Make static, change form. Update all + users to use lex_id_name instead. + (lex_is_keyword) New function. + (lex_id_to_token) Change interface to use struct substring, update + all callers. + (lex_id_name) New function. + + * identifier.h: (T_FIRST_KEYWORD) Removed. Changed users to call + lex_is_keyword instead. + (T_LAST_KEYWORD) Removed. + (T_N_KEYWORDS) Removed. + +Sat Nov 18 20:46:35 2006 Ben Pfaff + + * format.c: (fmt_date_template) Distinguish characters for which a + space is output and any date delimiter is allowed on input, from + those for which a space is output and only a space is allowed on + input. The former is represented by X, the latter by a space. + Also, drop distinction between h and H, changing the former to the + latter. + + * data-in.c: Completely rewrite internals to conform to SPSS input + formats as closely as possible. + (data_in) Changed external interface by replacing the structure + that was used as a single argument by a set of arguments. Updated + all callers. + (data_in_finite_line) Removed. Converted all callers to use plain + data_in. + (data_in_get_integer_format) New function. + (data_in_set_integer_format) New function. + (data_in_get_float_format) New function. 
+ (data_in_set_float_format) New function. + + * data-in.h: (enums DI_IGNORE_ERROR, DI_IMPLIED_DECIMALS) Removed. + (struct data_in) Removed. + + * data-out.c: (output_date) Drop each component from the input as + it is output, to allow us to drop the distinction between h (a + count of hours) and H (the hour of day) template characters. + Also, handle new X template character. + (output_scientific) Follow more rational rule on when to drop + fraction introduced between SPSS 13 and 15. Updated test case to + match new behavior. + +Sat Nov 11 11:41:26 2006 Ben Pfaff + + Fix buffer overflow reported by John Darrington. + + * data-out.c (output_bcd_integer): In case of SYSMIS, etc., + realize that DIGITS is a count of nibbles, not of bytes. + +Sat Nov 4 15:59:56 2006 Ben Pfaff + + * calendar.c (calendar_offset_to_gregorian) Also return the + year-of-day. Change callers to new interface. + + * data-out.c: Completely rewrite internals to conform to SPSS + output formats as completely as possible. + (data_out) Change interface to put input parameters before output + parameters, for consistency with the style I now prefer. Update + all callers. + (data_out_get_integer_format) New public function. + (data_out_set_integer_format) New public function. + (data_out_get_float_format) New public function. + (data_out_set_float_format) New public function. + + * data-out.h: New file. Move prototype for data_out here, from + format.h. + + * format.c: (fmt_step_width) Use equality comparison instead of + bitwise and, for clarity. + (fmt_is_string) Ditto. + (fmt_input_to_output) Fix categories that are translated to F + format. + +Sun Nov 5 08:29:34 WST 2006 John Darrington + + * casefilter.c casefilter.h (new files), casefile.c casefile.h + casefile-private.h: Added casefilter to assist commands with missing + values. + +Sat Nov 4 11:47:09 2006 Ben Pfaff + + Implement SET ERRORS, SHOW ERRORS. Fixes bug #17609. + + * settings.c: (route_errors_to_terminal) New variable. 
+ (route_errors_to_listing) New variable. + (get_error_routing_to_terminal) New function. + (set_error_routing_to_terminal) New function. + (get_error_routing_to_listing) New function. + (set_error_routing_to_listing) New function. + + * settings.h: (SET_ROUTE_* enums) Removed, because unused. + +Tue Oct 31 19:58:27 2006 Ben Pfaff + + * format.c: Completely rewrite, to achieve better abstraction. + Rewrite all references to formats in other files. + + * format.def: Rewrite and reorganize. + + * settings.c: Move everything related to custom currency formats + into format.[ch], changing them in form, so as to group related + code and definitions better. Changed all references to use the + new functions. + (static var decimal) Removed. + (static var grouping) Removed. + (static var cc) Removed. + (get_decimal) Removed. + (set_decimal) Removed. + (get_grouping) Removed. + (set_grouping) Removed. + (get_cc) Removed. + (set_cc) Removed. + + * settings.h: (macro CC_CNT) Removed. + (macro CC_WIDTH) Removed. + (struct custom_currency) Removed. + +Tue Oct 31 19:56:19 2006 Ben Pfaff + + * data-in.c (data_in): Use switch statement instead of table, to + avoid dependence on the order of the FMT_* enums. + +Tue Oct 31 19:35:36 2006 Ben Pfaff + + * data-out.c: (num_to_string) Removed, because it was dead code. + +Tue Oct 31 18:09:24 2006 Ben Pfaff + + * data-in.c (parse_trailer): Fix error message. + +Sat Oct 28 11:56:50 2006 Ben Pfaff + + * format.c (fmt_is_binary): New function. + +Thu Oct 19 22:59:56 WST 2006 John Darrington + + * procedure.c procedure.h: Encapsulated the static data into a single + struct. + +Sat Oct 14 16:56:44 2006 Ben Pfaff + + * casefile.c (casereader_read_xfer): Always initialize the case, + even on an error condition. + +Wed Sep 27 09:37:49 WST 2006 John Darrington + + * procedure.c (case_limit_trns_proc): Fixed buglet which rendered the + entire function useless. 
+ +Mon Sep 25 17:11:46 WST 2006 John Darrington + + * casefile-private.h casefile.c casefile.h fastfile.c: Created new + casereader method casereader_clone. + + * procedure.c transformations.h: Introduced new type casenum_t + +Thu Sep 21 07:00:30 2006 Ben Pfaff + + * variable.c: (width_to_bytes) Rephrase code for clarity. + +Sun Jul 16 19:52:03 2006 Ben Pfaff + + * format.c: (fmt_type_from_string) New function. + (fmt_to_string) Include decimals in output if the format has + decimals, even if the format type does not. This way, we can + accurately reproduce incorrect formats in user output. + (check_common_specifier) Make the check for a bad format type an + assertion, so we get bug reports if they show up. Fix message. + Check for decimal places with a format type that doesn't allow + them. + (check_input_specifier) Remove check for FMT_X, which has been + deleted. + (check_output_specifier) Ditto. + + * format.def: Remove FMT_T, FMT_X, FMT_DESCEND, FMT_NEWREC. + + * format.h: (macro FMT_TYPE_LEN_MAX) New macro. + (struct fmt_desc) Use FMT_TYPE_LEN_MAX in definition. + (enum fmt_parse_flags) Removed. + +Mon Jul 17 18:26:21 WST 2006 John Darrington + + * casefile.c casefile.h: Converted to an abstract base class. + * casefile-private.h fastfile.c fastfile.h: New files. + * automake.mk procedure.c scratch-writer.c storage-stream.c + +Wed Jul 12 21:02:26 2006 Ben Pfaff + + * procedure.c (internal_procedure): Create sink_case with only as + many values as the compacted dictionary. + +Wed Jul 12 21:01:00 2006 Ben Pfaff + + Remove "debugging" code that caused plenty of false positives and + no true positives. + + * case.h (struct ccase): [DEBUGGING] Remove `this' member. + + * case.c: Remove all references to `this' member. + +Thu Jul 6 19:09:53 2006 Ben Pfaff + + Fix link error noted by Jason Stover. + + * storage-stream.c: Include . 
+ +Tue Jul 4 08:47:35 2006 Ben Pfaff + + Fix bug #15766 (/KEEP subcommand on SAVE doesn't fully support + ALL) and additional underlying system file issues. + + Thanks to John Darrington for review. + + First problem: var_hash points to variables not owned by the + sys-file-reader, which the caller may free or modify. Use an + array of sfm_vars instead, as done earlier (e.g. CVS version + 1.12). + + * sys-file-reader.c (struct sfm_reader): Remove var_hash, svars + members and remove all code that references it. Add vars, var_cnt + members. Remove fix_specials member, which was unused. + (struct sfm_var) Remove name member, which was unused. + (sfm_close_reader) Free vars member instead of var_hash. + (compare_var_shortnames) Removed. + (hash_var_shortname) Removed. + (sfm_open_reader) Fill out vars array. + (compare_var_index) Removed. + (sfm_read_case) Use vars instead of var_hash. + + Second problem: we're confused about when we actually have very + long strings, causing us to choose incorrectly between slow path + and fast path in sfm_read_case. + + * sys-file-reader.c: (sfm_open_reader) Only mark has_vls if we + have very long strings, not when we have long variable names, + which is an unrelated feature. + +Tue Jun 27 12:06:49 2006 Ben Pfaff + + * variable.h: Move var_set and variable parsing declarations to + new header, src/language/lexer/variable-parser.h. Modified lots + of files to include the new header. + +Sun Jun 25 22:39:32 2006 Ben Pfaff + + * value-labels.c (value_to_string): When there's no value label, + format the variable according to its print format, instead of + always effectively using A or F format. + +Mon Jun 19 18:05:42 WST 2006 John Darrington + + * casefile.c (casefile_get_random_reader): Nasty hack to get around + the mode assertion. + + * format.c: Removed tautological assertion. + Fri Jun 9 12:20:09 2006 Ben Pfaff Reform string library.