2007-07-23 Ben Pfaff <blp@gnu.org>
Improvements to system file reader and writer.
First, move all detailed knowledge of very long strings into
sys-file-private.[ch], so that this nasty stuff can be isolated.
* sys-file-private.c (REAL_VLS_CHUNK): New macro.
(EFFECTIVE_VLS_CHUNK): New macro.
(min_int): New function.
(max_int): New function.
(sfm_width_to_bytes): Rewrite.
(sfm_width_to_octs): New function.
(sfm_segment_alloc_width): New function.
(sfm_segment_alloc_bytes): New function.
(sfm_segment_used_bytes): New function.
(sfm_segment_offset): New function.
(sfm_segment_effective_offset): New function.
(sfm_dictionary_to_sfm_vars): New function.
* sys-file-private.h (MIN_VERY_LONG_STRING): Removed.
(EFFECTIVE_LONG_STRING_LENGTH): Removed.
(struct sfm_var): New structure.
Next, improvements to the system file reader.
* sys-file-reader.h (struct sfm_read_info): Changed `case_cnt' to
type casenumber. Added `version_major', `version_minor',
* sys-file-reader.c (struct sfm_reader): Replaced `flt64_cnt' by
`oct_cnt'. Rename `vars', `var_cnt' to `sfm_vars', `sfm_var_cnt'.
Change `case_cnt' to type casenumber. Removed `has_vls'.
(struct sfm_var): Removed.
(sfm_open_reader): Don't warn on wrong case size if the file was
written by SPSS 13, which tends to get it wrong. Use
sfm_dictionary_to_sfm_vars.
(read_header): Always output system file info.
(read_variable_record): Simplify code for reading missing values.
(read_machine_int32_info): Save version numbers from system file
into info struct passed as new argument.
(read_long_string_map): Restructured to use new sys-file-private
(read_value_labels): Use size_overflow_p.
(sys_file_casereader_read): Get rid of distinction between fast
and slow paths. Use information provided by sys-file-private's
struct sfm_var to simplify code.
(skip_whole_strings): New function.
(read_int32): Renamed read_int. Changed return value to int.
(read_flt64): Renamed read_float. Changed return value to
double. Updated all callers.
(int32_to_native): Removed. Changed callers to use
(flt64_to_double): Removed. Changed callers to use float_convert.
Finally, get rid of int32, flt64 terminology and types in system
file writer. The former wasn't very useful since a POSIX "int"
can hold the whole range of int32 and we generally didn't have a
need for it to be exactly-32-bits, just at-least-32-bits. The
latter was inconvenient because we had to assume that it could be
different from double and thereby convert special values SYSMIS,
HIGHEST, LOWEST to and from it in multiple places. Instead, now
we just use "int" and "double" in most places, and do conversions,
if necessary, very close to where we do I/O. This change meant
that the writer code couldn't represent records in the file as C
structs any longer, but that's no great loss. The code actually
seems to be more readable without them.
Simplify the compression buffering code: only buffer as much as
necessary, which is no more than eight 8-byte units at any given
* sys-file-writer.c (typedef flt64): Removed.
(macro second_lowest_flt64): Removed.
(struct sysfile_header): Removed.
(struct sysfile_variable): Removed.
(struct sfm_writer): Removed `needs_translation', `has_vls',
`flt64_cnt'. Changed `compress' to type bool and `case_cnt' to
type casenumber. Renamed `vars' to `sfm_vars', `var_cnt' to
`sfm_var_cnt'. Replaced `buf', `end', `ptr', `x', `y' for
compression buffering by `opcodes', `opcode_cnt', `data',
`data_cnt'. Renamed `var_cnt_vls' as `segment_cnt'.
(sfm_open_writer): Use sfm_dictionary_to_sfm_vars. Use simple
data writer functions instead of structures.
(calc_oct_idx): New function.
(write_header): Use simple data writer functions instead of
(write_format_spec): Renamed write_format. New argument.
(write_variable_continuation_records): New function.
(write_variable): Use simple data writer functions instead of
structures. Use write_variable_continuation_records. Write
entire very long string instead of requiring caller to understand
(write_value_labels): Use simple data writer functions instead of
(write_documents): Ditto.
(write_variable_display_parameters): Use sys-file-private
functions to simplify. Use simple data writer functions instead
(write_vls_length_table): Use simple data writer functions instead
(write_longvar_table): Ditto.
(write_rec_7_34): Break into new functions
write_integer_info_record, write_float_info_record. Use simple
data writer functions instead of structures.
(buf_write): Removed.
(append_string_max): Removed.
(ensure_buf_space): Removed.
(sys_file_casewriter_write): Get rid of the distinction between
fast and slow paths, which didn't seem to be too useful. Use new
functions write_case_uncompressed, write_case_compressed.
(put_instruction): Removed.
(put_element): Removed.
(write_compressed_data): Removed.
(close_writer): Use flush_compressed. Only write case count to
system file if it will fit in the field.
(write_case_compressed): New function.
(write_case_uncompressed): New function.
(flush_compressed): New function.
(put_cmp_opcode): New function.
(put_cmp_number): New function.
(write_int): New function.
(convert_double_to_output_format): New function.
(write_float): New function.
(write_value): New function.
(write_string): New function.
(write_bytes): New function.
(write_zeros): New function.
(write_spaces): New function.
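The simplified compression buffering described in this entry can be pictured with a small sketch. It is not the code in sys-file-writer.c: the struct and the cmp_* names are hypothetical, the in-memory `out' array stands in for the output file, and the opcode values are left to the caller. It only shows why eight opcodes plus at most eight 8-byte data units is all that ever needs to be buffered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer: one block of 8 opcodes plus at most 8 data units. */
struct cmp_buffer
  {
    unsigned char opcodes[8];
    size_t opcode_cnt;
    unsigned char data[8][8];
    size_t data_cnt;
    unsigned char out[256];     /* Stand-in for the output file. */
    size_t out_len;
  };

/* Writes the pending opcode block followed by its data units,
   padding unused opcode slots with zeros. */
static void
cmp_flush (struct cmp_buffer *b)
{
  if (b->opcode_cnt == 0)
    return;
  assert (b->out_len + 8 + 8 * b->data_cnt <= sizeof b->out);
  memset (b->opcodes + b->opcode_cnt, 0, 8 - b->opcode_cnt);
  memcpy (b->out + b->out_len, b->opcodes, 8);
  b->out_len += 8;
  memcpy (b->out + b->out_len, b->data, 8 * b->data_cnt);
  b->out_len += 8 * b->data_cnt;
  b->opcode_cnt = b->data_cnt = 0;
}

/* Adds one opcode, flushing first if the opcode block is full. */
static void
cmp_put_opcode (struct cmp_buffer *b, unsigned char opcode)
{
  if (b->opcode_cnt >= 8)
    cmp_flush (b);
  b->opcodes[b->opcode_cnt++] = opcode;
}

/* Adds an opcode that is followed by one literal 8-byte unit. */
static void
cmp_put_literal (struct cmp_buffer *b, unsigned char opcode,
                 const unsigned char unit[8])
{
  cmp_put_opcode (b, opcode);
  memcpy (b->data[b->data_cnt++], unit, 8);
}
```

Because a data unit is only ever added together with its opcode, data_cnt can never exceed opcode_cnt, so the fixed-size arrays cannot overflow.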
2007-07-22 Ben Pfaff <blp@gnu.org>
Don't try to write very long strings to portable files. The
format does not support it.
* por-file-writer.c (MAX_POR_WIDTH): New macro.
(pfm_open_writer): Limit output width to MAX_POR_WIDTH.
(write_format): Add arg to take width to resize format to.
(write_value): Limit width of value written to MAX_POR_WIDTH.
(write_variables): Limit width of variable and its output formats
2007-07-22 Ben Pfaff <blp@gnu.org>
* sys-file-reader.c (read_variable_to_value_map): Use max_warnings
local variable instead of literal 5.
2007-07-22 Ben Pfaff <blp@gnu.org>
Fix problems with uniqueness of short names in system files with
very long string variables. Now a variable may have multiple
* automake.mk (src_data_libdata_a_SOURCES): Add new files
short-names.c, short-names.h.
* dictionary.c (dict_clone): Clone all the short names.
(compare_strings): Move into short-names.c.
(hash_strings): Ditto.
(set_var_short_name_suffix): Ditto.
(dict_assign_short_names): Ditto, rename short_names_assign,
change to assign all short names.
* por-file-writer.c (write_variables): Use short_names_assign
instead of dict_assign_short_names.
* short-names.c: New file.
* short-names.h: New file.
* sys-file-private.c (sfm_width_to_segments): New function.
* sys-file-reader.c (read_long_var_name_map): Save and restore all
the short names, not just the first one.
* sys-file-writer.c (cont_var_name): Removed.
(sfm_open_writer): Use short_names_assign instead of
dict_assign_short_names. Use unique short names assigned by
short_names_assign instead of those generated by cont_var_name.
* variable.c (struct variable): Remove `short_name' member,
replace by `short_names' and `short_name_cnt'.
(var_create): Initialize new members.
(var_get_short_name_cnt): New function.
(var_get_short_name): Now takes an index argument. Changed most
(var_set_short_name): Ditto.
(var_clear_short_name): Renamed var_clear_short_names, changed to
clear all short names.
2007-07-22 Ben Pfaff <blp@gnu.org>
* variable.c (var_set_width): Use new var_set_width function.
* missing-values.c (mv_n_values): Drop assertion, which was not
* format.c (fmt_default_for_width): New function.
(fmt_resize): New function.
2007-07-18 John Darrington <john@darrington.wattle.id.au>
* datasheet.c (datasheet_delete_columns): Added assertion to check
we're not deleting outside the range of the sheet.
* dictionary.c dictionary.h variable.c: Added the ability for string
variables to be resized.
* vardict.h: Added some prototypes (moved from dictionary.h) as
these should only be called by variable.c
2007-07-14 John Darrington <john@darrington.wattle.id.au>
* sfm-reader.c: Respect case_cnt field in file header.
2007-07-01 John Darrington <john@darrington.wattle.id.au>
* transformation.c transformation.h (trns_chain_execute): Changed the
signature (Patch #6057)
2007-06-10 Ben Pfaff <blp@gnu.org>
* casereader-filter.c (casereader_filter_destroy): Make sure to
write all the remaining excluded cases to the casewriter, if any.
* caseinit.c (init_list_destroy): Rewrite.
(init_list_clear): Ditto.
* casegrouper.c (casegrouper_get_next_group): Always set *reader
to null when returning false.
2007-06-06 Ben Pfaff <blp@gnu.org>
Actually implement the new procedure code and adapt all of its
clients to match. Also adapt all of the other case sources and
sinks in the tree and their clients to use the
casereader/casewriter infrastructure.
* automake.mk: Add and remove files.
* any-reader.c: Change into a casereader.
* por-file-reader.c: Ditto.
* scratch-reader.c: Ditto.
* sys-file-reader.c: Ditto.
* any-writer.c: Change into a casewriter.
* por-file-writer.c: Ditto.
* scratch-writer.c: Ditto.
* sys-file-writer.c: Ditto.
* procedure.c: Change to use casereader, casewriter, caseinit, and
other new infrastructure.
* scratch-handle.c: Adapt to new infrastructure.
* case-sink.c: Removed, now dead code.
* case-sink.h: Ditto.
* case-source.c: Ditto.
* case-source.h: Ditto.
* casefile-factory.c: Ditto.
* casefile-private.h: Ditto.
* casefilter.c: Ditto.
* casefilter.h: Ditto.
* fastfile-factory.c: Ditto.
* fastfile-factory.h: Ditto.
* storage-stream.c: Ditto.
* storage-stream.h: Ditto.
2007-06-06 Ben Pfaff <blp@gnu.org>
* automake.mk: Add new files.
* datasheet.c: New file.
* datasheet.h: New file.
2007-06-06 Ben Pfaff <blp@gnu.org>
Until now, the procedure code has provided a case to the
case_source, which has filled in the data values that come from
the active file. "Left" data values that don't come from the
active file naturally stay the same from case to case, because the
procedure code keeps using that same case.
One of the compromises that comes with the new procedure code is
that the active file allocates and provides its own case, which
the procedure code then has to resize to provide room for any
other variables that should go in the case and then fill in the
values of "left" variables. Then, when we're done with that case,
we have to save the values of "left" variables to copy into the
next case read from the active file.
The caseinit code helps with this.
* automake.mk: Add new files.
* caseinit.c: New file.
* caseinit.h: New file.
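The save-and-restore cycle for "left" variables described above can be sketched as follows. This is only an illustration, not the caseinit.c interface: the left_* names, the MAX_LEFT limit, and the use of a plain array of doubles as a "case" are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEFT 16     /* Hypothetical limit, for the sketch only. */

/* Hypothetical record of which case slots hold "left" values and
   what they held at the end of the previous case. */
struct left_values
  {
    size_t idx[MAX_LEFT];
    double saved[MAX_LEFT];
    size_t cnt;
  };

/* Registers case slot CASE_IDX as a "left" value that starts at INITIAL. */
static void
left_add (struct left_values *lv, size_t case_idx, double initial)
{
  assert (lv->cnt < MAX_LEFT);
  lv->idx[lv->cnt] = case_idx;
  lv->saved[lv->cnt] = initial;
  lv->cnt++;
}

/* Copies the saved values into case C, freshly read from the
   active file and already resized to have room for them. */
static void
left_init_case (const struct left_values *lv, double *c)
{
  for (size_t i = 0; i < lv->cnt; i++)
    c[lv->idx[i]] = lv->saved[i];
}

/* Saves the "left" values of finished case C for the next case. */
static void
left_save_case (struct left_values *lv, const double *c)
{
  for (size_t i = 0; i < lv->cnt; i++)
    lv->saved[i] = c[lv->idx[i]];
}
```

A caller would invoke left_init_case right after reading each case and left_save_case once it is done with that case, so that a running total, for example, carries over from one case to the next.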
2007-06-06 Ben Pfaff <blp@gnu.org>
* value.h (value_cnt_from_width): New function.
* variable.c (var_get_value_cnt): Use new function.
2007-06-06 Ben Pfaff <blp@gnu.org>
Add casegrouper, to allow cases read from a given casereader to be
broken into groups, each of which has its own casereader.
Generally cases are grouped based on having equal values for some
* automake.mk: Add new files.
* casegrouper.c: New file.
* casegrouper.h: New file.
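The grouping idea can be reduced to a very small sketch: given the grouping key of each case in reading order, find where the current run of equal keys ends. The function name and the use of one double per case as the key are assumptions for illustration; the real casegrouper hands out a casereader per group rather than an index range.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index just past the group of consecutive cases
   that share the key of the case at START. */
static size_t
sketch_group_end (const double *keys, size_t n, size_t start)
{
  assert (start < n);
  size_t end = start + 1;
  while (end < n && keys[end] == keys[start])
    end++;
  return end;
}
```

Calling it repeatedly, each time starting at the previous result, walks through the groups one by one.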
2007-06-06 Ben Pfaff <blp@gnu.org>
Add interface to lexicographical ordering of cases.
* automake.mk: Add new files.
* case-ordering.c: New file.
* case-ordering.h: New file.
2007-06-06 Ben Pfaff <blp@gnu.org>
Add casereaders and casewriters, the basis of the new data processing
implementation. A casereader is a uniform interface to reading cases
from a data source; a casewriter is a uniform interface to writing
cases to a data sink.
* automake.mk: Add new files.
* casereader-filter.c: New file.
* casereader-provider.h: New file.
* casereader-translator.c: New file.
* casereader.c: New file.
* casereader.h: New file.
* casewriter-provider.h: New file.
* casewriter-translator.c: New file.
* casewriter.c: New file.
* casewriter.h: New file.
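A "uniform interface" of this kind is commonly built in C from a table of function pointers that each data source fills in. The sketch below shows that general pattern only; the sketch_* and array_* names are hypothetical, a "case" is simplified to a single double, and none of it is taken from casereader-provider.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical provider interface: what a data source must implement. */
struct sketch_reader_class
  {
    bool (*read) (void *aux, double *value);    /* False at end of data. */
    void (*destroy) (void *aux);
  };

/* A reader pairs a provider with the provider's private state. */
struct sketch_reader
  {
    const struct sketch_reader_class *cls;
    void *aux;
  };

/* Clients call these and never see which source is underneath. */
static bool
sketch_reader_read (struct sketch_reader *r, double *value)
{
  return r->cls->read (r->aux, value);
}

static void
sketch_reader_destroy (struct sketch_reader *r)
{
  r->cls->destroy (r->aux);
}

/* One concrete source: an in-memory array. */
struct array_source
  {
    const double *data;
    size_t n, pos;
  };

static bool
array_read (void *aux, double *value)
{
  struct array_source *s = aux;
  if (s->pos >= s->n)
    return false;
  *value = s->data[s->pos++];
  return true;
}

static void
array_destroy (void *aux)
{
  (void) aux;   /* Nothing to free: the caller owns the array. */
}

static const struct sketch_reader_class array_reader_class
  = { array_read, array_destroy };
```

A file-backed source would supply its own pair of functions, and client code written against sketch_reader_read would work with it unchanged.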
2007-06-06 Ben Pfaff <blp@gnu.org>
"casewindow" data structure that extends the deque (from libpspp)
of cases with the ability to dump cases to disk if we get too many
* automake.mk: Add new files.
* casewindow.c: New file.
* casewindow.h: New file.
2007-06-06 Ben Pfaff <blp@gnu.org>
sparse_cases data structure that augments a sparse_array of cases
with the ability to dump cases to disk if we get too many cases in
* automake.mk: Add new files.
* sparse-cases.c: New file.
* sparse-cases.h: New file.
2007-06-06 Ben Pfaff <blp@gnu.org>
Adds a low-level on-disk case array data structure.
* automake.mk: Add new files.
* case-tmpfile.c: New file.
* case-tmpfile.h: New file.
2007-06-06 Ben Pfaff <blp@gnu.org>
In a couple of places we calculate the maximum number of cases to
keep in memory based on the user-defined workspace. Enable
centralizing the calculation through a new function.
* settings.c (get_workspace_cases): New function.
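The kind of calculation being centralized here might look like the sketch below. The function name, its parameters, and the floor of one case are assumptions for illustration; the actual formula in settings.c is not given in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many cases of VALUE_CNT values, each VALUE_SIZE bytes,
   fit in WORKSPACE_BYTES.  Always allows at least one case
   (an assumption of this sketch). */
static size_t
sketch_workspace_cases (size_t workspace_bytes, size_t value_cnt,
                        size_t value_size)
{
  assert (value_size > 0);
  size_t case_size = value_cnt * value_size;
  size_t n = case_size > 0 ? workspace_bytes / case_size : workspace_bytes;
  return n > 0 ? n : 1;
}
```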
2007-06-06 Ben Pfaff <blp@gnu.org>
The casenumber type is defined in transformations.h, but case.h is
a more sensible place. Move it.
* case.h (CASENUMBER_MAX): New macro.
(typedef casenumber): Move here, from transformations.h.
2007-06-03 Ben Pfaff <blp@gnu.org>
Slightly generalize case_to_values and case_from_values functions.
* case.c (case_to_values): Rename case_copy_out, change interface.
(case_from_values): Rename case_copy_in, change interface.
* fastfile.c (fastfilereader_get_next_case): Update caller.
(write_case_to_disk): Ditto.
2007-06-02 Ben Pfaff <blp@gnu.org>
Clean up after a forgotten part of patch #5829.
* casedeque.h: Remove unused file.
* automake.mk: Remove casedeque.h from sources.
2007-05-10 Jason Stover <jhs@math.gcsu.edu>
* category.c: Removed redundant #include
2007-05-06 Ben Pfaff <blp@gnu.org>
Abstract the documents within a dictionary a little better.
Thanks to John Darrington for suggestion, initial version, and
* dictionary.c (struct dictionary): Change `documents' member from
char * to struct string.
(dict_clear): Destroy struct string.
(dict_get_documents): Convert struct string to char *.
(dict_set_documents): Set struct string. Pad to 80-character
(dict_clear_documents): New function.
(dict_add_document_line): New function.
(dict_get_document_line_cnt): New function.
(dict_get_document_line): New function.
* dictionary.h (macro DOC_LINE_LENGTH): New macro.
* sys-file-reader.c (read_documents): Use new document functions.
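Storing documents as fixed-width lines makes line access simple arithmetic. The sketch below illustrates that idea under stated assumptions: the sketch_* names are hypothetical, the width of 80 comes from the "80-character" padding mentioned above, and a plain realloc'd buffer stands in for struct string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_DOC_LINE_LENGTH 80   /* Width assumed from the entry above. */

/* Hypothetical container: LINE_CNT lines, each exactly
   SKETCH_DOC_LINE_LENGTH bytes, stored back to back. */
struct sketch_documents
  {
    char *text;
    size_t line_cnt;
  };

/* Appends LINE, truncated or space-padded to the fixed width. */
static void
sketch_add_document_line (struct sketch_documents *d, const char *line)
{
  size_t len = strlen (line);
  if (len > SKETCH_DOC_LINE_LENGTH)
    len = SKETCH_DOC_LINE_LENGTH;

  char *p = realloc (d->text, (d->line_cnt + 1) * SKETCH_DOC_LINE_LENGTH);
  assert (p != NULL);
  d->text = p;

  char *dst = d->text + d->line_cnt * SKETCH_DOC_LINE_LENGTH;
  memset (dst, ' ', SKETCH_DOC_LINE_LENGTH);
  memcpy (dst, line, len);
  d->line_cnt++;
}

/* Copies line IDX into OUT as a null-terminated string. */
static void
sketch_get_document_line (const struct sketch_documents *d, size_t idx,
                          char out[SKETCH_DOC_LINE_LENGTH + 1])
{
  assert (idx < d->line_cnt);
  memcpy (out, d->text + idx * SKETCH_DOC_LINE_LENGTH,
          SKETCH_DOC_LINE_LENGTH);
  out[SKETCH_DOC_LINE_LENGTH] = '\0';
}
```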
2007-04-19 John Darrington <john@darrington.wattle.id.au>
* sys-file-reader.c: When reading a system file which has no
long name table, automatically create one where the long names
are the lower case versions of the short names.
2007-04-22 Ben Pfaff <blp@gnu.org>
* dictionary.c (dict_set_split_vars): dict_destroy expects that
dict_clear will free most data related to the dictionary.
dict_clear does a decent job, except that dict_set_split_vars on
some systems won't actually free the dict's "split" member.
Instead, it'll allocate a 1-byte region. Fix this.
* value.c (value_copy): New function.
(value_set_missing): Ditto.
2007-04-22 John Darrington <john@darrington.wattle.id.au>
* Deleted existing category.h and moved cat-routines.h into
category.h Encapsulated struct cat_vals better.
2007-04-16 John Darrington <john@darrington.wattle.id.au>
* sys-file-reader.c: Some versions of Other Software seem to
produce system files with string variables' measure set to
zero. We'll assume these are supposed to be nominal variables.
2007-03-30 Ben Pfaff <blp@gnu.org>
* procedure.c: Adapt to new deque data structure.
Mon Feb 19 10:53:21 2007 John McCabe-Dansted <gmatht@gmail.com>
Ben Pfaff <blp@gnu.org>
* file-name.c: Mingw compatibility fixes.
(fn_search_path): Use ISSLASH instead of comparing against '/'
(fn_dir_name): Use dir_name from gnulib.
(fn_is_absolute): Use IS_ABSOLUTE_FILE_NAME from gnulib.
(fn_get_identity): Use GetFullPathName instead of canonicalize
from gnulib, because the latter does not fully support
Windows-style path names. Use this implementation based on the
detected presence of Windows instead of the absence of Unix, since
the new implementation is Windows-specific.
(fn_compare_file_identities): In Windows implementation, compare
names case-insensitively.
Sun Feb 18 13:28:02 2007 Ben Pfaff <blp@gnu.org>
* make-file.c: Don't include mkstemp.h, because gnulib now causes
<stdlib.h> to have the same effect.
Sun Feb 18 11:20:24 2007 Ben Pfaff <blp@gnu.org>
* por-file-reader.c: Add missing _() around messages.
Sun Feb 11 20:44:13 2007 Ben Pfaff <blp@gnu.org>
* make-file.c: Include "mkstemp.h", without which linking on
Thu Feb 8 14:59:05 2007 Ben Pfaff <blp@gnu.org>
Reduce platform dependence.
* file-name.c (fn_tilde_expand): Removed, and removed calls to it.
Everywhere we were using this, we really should have just depended on
the shell to expand tildes.
(fn_search_path): Simplify, given that we don't do tilde expansion
(fn_normalize): Removed. Caller changed to use the canonicalize
(fn_get_cwd): Removed. Only user was fn_normalize.
(fn_is_absolute): Really only test for absolute names.
(fn_is_special): Use pipe files if HAVE_POPEN, not if we're in
(fn_readlink): Removed, as it was only used by fn_normalize.
(fn_exists): Assume the stat function is available; gnulib does.
(fn_open): Use pipe files if HAVE_POPEN, not if we're in unix.
Sat Feb 3 21:52:17 2007 Ben Pfaff <blp@gnu.org>
* dictionary.c (dict_create_vector_assert): New function.
Wed Feb 7 21:25:15 2007 Ben Pfaff <blp@gnu.org>
* file-name.c (fn_normalize): Correct name of function
fn_is_special. Thanks to John McCabe-Dansted <gmatht@gmail.com>
for pointing this out.
Thu Feb 1 16:53:37 2007 Ben Pfaff <blp@gnu.org>
We are using a single member in struct file_handle, the "name"
field, for more than one purpose. When it begins with '"', it's a
file name; otherwise, it's a token that can be used to identify
it. When that assertion fires, it's because we searched for the
name case-sensitively as a file name (so that there was no match),
and then we try to insert it case-insensitively as a token, which
fails because duplicates aren't allowed.
Solution: break the two purposes into two separate fields. This
fixes the problem and likely makes the code easier to read too.
Fixes bug #18922. Thanks to John Darrington for bug report and
* file-handle-def.c (struct file_handle): New `id' member.
(fh_from_name): Rename fh_from_id. Update all callers.
(create_handle): New `id' parameter. Update all callers.
(fh_create_file): Ditto.
(fh_get_id): New function.
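The fix amounts to giving each lookup its own field and its own comparison rule. A minimal sketch of that separation follows; the struct and function names are hypothetical, and only the two comparison rules (identifiers case-insensitive, file names case-sensitive) come from the description above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handle with the two purposes split into two fields. */
struct sketch_handle
  {
    const char *id;             /* Identifier token, or NULL if none. */
    const char *file_name;      /* File name. */
  };

/* Compares two identifiers without regard to case. */
static int
sketch_ids_equal (const char *a, const char *b)
{
  for (; *a != '\0' && *b != '\0'; a++, b++)
    if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
      return 0;
  return *a == *b;
}

/* Looks up a handle by identifier, case-insensitively. */
static const struct sketch_handle *
sketch_find_by_id (const struct sketch_handle *h, size_t n, const char *id)
{
  for (size_t i = 0; i < n; i++)
    if (h[i].id != NULL && sketch_ids_equal (h[i].id, id))
      return &h[i];
  return NULL;
}

/* Looks up a handle by file name, case-sensitively. */
static const struct sketch_handle *
sketch_find_by_file_name (const struct sketch_handle *h, size_t n,
                          const char *file_name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (h[i].file_name, file_name) == 0)
      return &h[i];
  return NULL;
}
```

With separate fields, a string can no longer be searched under one rule and then inserted under the other, which is the mismatch that made the assertion fire.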
Mon Jan 15 16:18:10 2007 Ben Pfaff <blp@gnu.org>
* case.c (case_is_null): Change return type to bool.
Mon Jan 15 10:57:28 2007 Ben Pfaff <blp@gnu.org>
* case.c (case_clone) [DEBUGGING]: When debugging, don't use
reference counting to share data. This makes it easy for
valgrind, etc. to find accesses to cases that have been destroyed
but have been kept around by another user's ref-count. This often
happens when the data set is small enough to find in memory; if a
bigger data set that would overflow to disk were used, then data
corruption would occur.
Mon Jan 15 10:55:18 2007 Ben Pfaff <blp@gnu.org>
* case.c (case_unshare): Make it check internally whether the
ref_cnt is greater than 1, so that the callers don't have to.
Update callers not to check.
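The pattern of an unshare operation that checks the reference count itself can be sketched as below. This is a generic copy-on-write illustration, not the code in case.c: the sketch_* names and the struct layout are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted data block. */
struct sketch_case_data
  {
    size_t ref_cnt;
    size_t value_cnt;
    double values[];
  };

/* Hypothetical handle that may share its data with other handles. */
struct sketch_case
  {
    struct sketch_case_data *data;
  };

/* Creates a case with VALUE_CNT zeroed values and one reference. */
static struct sketch_case
sketch_case_create (size_t value_cnt)
{
  struct sketch_case c;
  c.data = calloc (1, sizeof *c.data + value_cnt * sizeof (double));
  assert (c.data != NULL);
  c.data->ref_cnt = 1;
  c.data->value_cnt = value_cnt;
  return c;
}

/* Returns a new handle that shares ORIG's data. */
static struct sketch_case
sketch_case_clone (const struct sketch_case *orig)
{
  struct sketch_case c = *orig;
  c.data->ref_cnt++;
  return c;
}

/* Ensures C has a private copy of its data.  Does nothing if C is
   already the only user, so callers need not check first. */
static void
sketch_case_unshare (struct sketch_case *c)
{
  struct sketch_case_data *old = c->data;
  if (old->ref_cnt > 1)
    {
      size_t size = sizeof *old + old->value_cnt * sizeof (double);
      struct sketch_case_data *copy = malloc (size);
      assert (copy != NULL);
      memcpy (copy, old, size);
      copy->ref_cnt = 1;
      old->ref_cnt--;
      c->data = copy;
    }
}
```

Callers that are about to modify a case can then call the unshare function unconditionally.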
Mon Jan 15 10:53:01 2007 Ben Pfaff <blp@gnu.org>
Before, I was thinking that I might want to get rid of reference
counting at some point. Now, I'm pretty sure that it's here to
stay. Thus, because we have to store the value_cnt anyway for
reference-counted cases, we might as well expose it to users.
* case.c (case_get_value_cnt): New function.
(case_resize): Drop OLD_CNT argument. Update all callers. Only
resize case if its size actually changed.
* casefile.c (casefile_append_xfer): Use case_get_value_cnt
instead of peeking inside struct case directly.
(casefile_append): Ditto.
Mon Jan 15 10:50:22 2007 Ben Pfaff <blp@gnu.org>
Get rid of the inlines for the case functions, which made the
header file hard to read. (Also, in testing with "-O2 -DNDEBUG",
the inlines didn't speed up "make check" at all, which is not a
perfect benchmark but seems indicative.)
* case.c: Remove #ifdef DEBUGGING...#endif around many function
definitions. Remove some assertions on nonnull pointers that were
redundant with a pointer dereference soon after in the function.
(struct case_data): Move definition here from case.h.
(case_data_wr): Ditto.
Sun Jan 14 21:41:12 2007 Ben Pfaff <blp@gnu.org>
* automake.mk: Add casedeque.h to sources.
* casedeque.h: New file.
* procedure.c: (struct dataset) Change lag_count, lag_head,
lag_queue member into single struct casedeque member. Update all
users to use the casedeque instead.
Sun Jan 14 21:43:12 2007 Ben Pfaff <blp@gnu.org>
* procedure.c: Simplify lagged cases interface. Updated all
clients--well, the only client--to use the simplified interface.
(dataset_n_lag) Removed.
(dataset_set_n_lag) Removed.
(dataset_need_lag) New function.
Tue Jan 9 07:20:05 WST 2007 John Darrington <john@darrington.wattle.id.au>
* dictionary.c procedure.c: More changes to ensure that callbacks occur
whenever appropriate, but only when the dataset/dictionary is in a
Sun Jan 7 08:33:04 WST 2007 John Darrington <john@darrington.wattle.id.au>
* dictionary.c dictionary.h : Added callbacks for change of filter and
split variables. Refactored some code to ensure that callbacks get
invoked when appropriate.
* procedure.c (proc_cancel_temporary_transformations): Make sure that
replace_dict callback occurs when permanent_dict replaces the current
Wed Jan 3 11:02:11 WST 2007 John Darrington <john@darrington.wattle.id.au>
* dictionary.c dictionary.h : Added callback for when the weight
variable of a dictionary changes.
Mon Jan 1 10:36:26 WST 2007 John Darrington <john@darrington.wattle.id.au>
* dictionary.c dictionary.h : Added replace_source and replace_dict
callbacks, and functions to deal with them.
Fri Dec 22 13:56:08 2006 Ben Pfaff <blp@gnu.org>
Simplify missing value handling.
* missing-values.h (enum mv_class): New type.
(enum mv_type): Moved definition into missing-values.c and renamed
each MV_* to MVT_*, to distinguish them from the exposed mv_class
enums. Updated all uses.
(struct missing_values): Changed type of `type' from `enum
mv_type' to `int' because the definition is no longer exposed.
* missing-values.c (mv_is_value_missing): Add new enum mv_class
parameter. Update all callers.
(mv_is_num_missing): Ditto.
(mv_is_str_missing): Ditto.
(mv_is_value_user_missing): Removed. Changed callers to use
(mv_is_num_user_missing): Removed. Changed callers to use
(mv_is_str_user_missing): Removed. Changed callers to use
(mv_is_value_system_missing): Removed. Changed callers to use
(mv_set_type): Removed. Changed callers to use mv_clear.
(mv_clear): New function.
* variable.c (var_is_value_missing): Add new enum mv_class
parameter. Update all callers.
(var_is_num_missing): Ditto.
(var_is_str_missing): Ditto.
(var_is_value_user_missing): Removed. Changed callers to use
var_is_value_missing.
(var_is_num_user_missing): Removed. Changed callers to use
(var_is_str_user_missing): Removed. Changed callers to use
(var_is_value_system_missing): Removed. Changed callers to use
var_is_value_missing.
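Replacing separate *_user_missing and *_system_missing predicates with one predicate plus a class parameter can be sketched with a bit-flag enum. The names below (sketch_class, SKETCH_SYSMIS, and so on) and the flag values are assumptions for illustration, not the definitions in missing-values.h; the sentinel used for system-missing is likewise just a stand-in.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in sentinel for the system-missing value (an assumption). */
#define SKETCH_SYSMIS (-DBL_MAX)

/* Hypothetical classes of missing values, combinable as bit flags. */
enum sketch_class
  {
    SKETCH_USER = 1,
    SKETCH_SYSTEM = 2,
    SKETCH_ANY = SKETCH_USER | SKETCH_SYSTEM
  };

/* Hypothetical set of up to three discrete user-missing values. */
struct sketch_user_missing
  {
    double values[3];
    size_t n;
  };

/* Returns true if X is missing in any of the classes in CLS. */
static bool
sketch_is_num_missing (const struct sketch_user_missing *um, double x,
                       enum sketch_class cls)
{
  if ((cls & SKETCH_SYSTEM) && x == SKETCH_SYSMIS)
    return true;
  if (cls & SKETCH_USER)
    for (size_t i = 0; i < um->n; i++)
      if (um->values[i] == x)
        return true;
  return false;
}
```

One function then covers what previously needed a separate predicate per class, and callers state explicitly which kinds of missingness they care about.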
* casefilter.c (struct casefilter): Use enum mv_class in place of
(casefilter_variable_missing): Adapt to new member.
(casefilter_create): Change signature to take enum mv_class,
Fri Dec 22 20:08:38 WST 2006 John Darrington <john@darrington.wattle.id.au>
* casefile-factory.h fastfile-factory.c fastfile-factory.h: New files.
* case-sink.c case-sink.h procedure.c procedure.h
storage-stream.c: Now uses the factory.
Sat Dec 16 22:05:18 2006 Ben Pfaff <blp@gnu.org>
Make it possible to pull cases from the active file with a
function call, instead of requiring indirection through a callback
* case-source.h (struct case_source_class): Change ->read function
to return a single case, instead of calling a callback function
for each case. Change ->destroy function to return an error
* case-source.c (free_case_source): Pass along the value returned
by the case_source ->destroy function.
* procedure.c (struct write_case_data): Removed.
(struct dataset): Added some members to track procedure state.
(procedure): Optimize the trivial case at this level.
(internal_procedure): Re-implement in terms of proc_open,
proc_read, proc_close.
(proc_open) New function.
(proc_read) New function.
(proc_close) New function.
(write_case) Moved into proc_read.
(close_active_file) Moved closing of data source into proc_close.
* storage-source.c: Rewrote to conform with modified
case_source_class interface.
* transformations.c (trns_chain_execute): Added argument to allow
starting execution from an arbitrary transformation. Updated
* transformations.h (enum TRNS_NEXT_CASE) Renamed TRNS_END_CASE.
Sat Dec 16 14:09:25 2006 Ben Pfaff <blp@gnu.org>
* sys-file-reader.c (read_display_parameters): Don't assume that
MEASURE_* and ALIGN_* have the same values found in system files.
* sys-file-writer.c (write_variable_display_parameters): Ditto.
* variable.h: Change MEASURE_NOMINAL, MEASURE_ORDINAL,
MEASURE_SCALE to be 0-based instead of 1-based. This also fixes
the value of n_MEASURES, which was off by 1 (at least from my
Sat Dec 16 12:17:34 WST 2006 John Darrington <john@darrington.wattle.id.au>
* dictionary.c dictionary.h vardict.h variable.c: Added optional
callbacks which are invoked when the dictionary or its
variables are changed.
* missing-values.c missing-values.h value-labels.c: Tidied up
consistency checks, and made some of them return false
instead of assert-failing.
Wed Dec 13 19:30:11 2006 Ben Pfaff <blp@gnu.org>
* calendar.c (calendar_days_in_month): New function.
Mon Dec 11 07:53:39 2006 Ben Pfaff <blp@gnu.org>
* value-labels.c (hash_int_val_lab): Only hash as many bytes as
the value label's width.
Sun Dec 10 14:21:29 2006 Ben Pfaff <blp@gnu.org>
* sfm-private.h: Move contents into sys-file-writer.c, which is
the only remaining user. Removed Borland C++-specific directives.
* sys-file-reader.c: Clean up and rewrite entire file. The
rewritten version is simpler and better abstracted, and should be
easier to maintain and extend. It avoids using structures to read
file data, which is prone to padding variations among compilers.
It should also handle non-IEEE 754 system files, although I
haven't been able to find any. It has been tested against many
.sav files obtained from the Web and found to produce the same
results as the earlier version of the code, or in some cases
improved results. It is more tolerant of format variations found
* sys-file-reader.h (struct sfm_read_info): Removed `big_endian'
member, putting an enum integer_format in its place. New member
`float_format'. Changed `compressed' member to type bool.
Sun Dec 10 13:48:53 2006 Ben Pfaff <blp@gnu.org>
* dictionary.c (dict_delete_consecutive_vars): New function.
Sat Dec 9 20:08:25 2006 Ben Pfaff <blp@gnu.org>
* file-name.c (fn_search_path): Remove prefix arg that was unused
by any caller. Updated all callers.
Sat Dec 9 20:04:22 2006 Ben Pfaff <blp@gnu.org>
* format.c (fmt_dollar_template): Use user's decimal point
character. Add assertion.
Sat Dec 9 20:02:25 2006 Ben Pfaff <blp@gnu.org>
* format.c (fmt_dollar_template): New function, based on
dollar_format_template from var-type-dialog.c.
Sat Dec 9 18:05:59 2006 Ben Pfaff <blp@gnu.org>
* data-out.c (output_scientific): Fix bad assumption that "buf" is
842 Finish converting struct variable to an opaque type. In this
843 phase, we add remaining setter and getter functions, convert the
844 remaining PSPP code to use them, and do a bunch of cleanup. The
845 resulting changes are pervasive but mostly trivial, and only the
846 notable changes are logged.
848 * automake.mk (src_data_libdata_a_SOURCES): Add the new source
851 * case.c (case_data): Renamed case_data_idx.
852 (case_num): Renamed case_num_idx.
853 (case_str): Renamed case_str_idx.
854 (case_data_rw): Renamed case_data_rw_idx.
856 * case.h (case_data): New function with old name and an interface
857 that takes a variable instead of an index, which is easier to
858 use. Updated all callers to use the new interface, or to use the
859 new *_idx function (see above).
862 (case_data_rw): Ditto.
864 * category.c (cat_stored_values_destroy): Changed interface to
865 take a struct cat_vals * instead of a struct variable *.
867 * dictionary.c (dict_clone): Use new vector_clone function.
868 (dict_clear) Use new var_destroy function.
869 (add_var) New function.
870 (dict_create_var) Rewrite in terms of dict_create_var_assert.
871 (dict_create_var_assert) Rewrite in terms of add_var.
872 (dict_clone_var) Rewrite in terms of dict_clone_var_assert.
873 (dict_clone_var_assert) Rewrite in terms of var_clone, add_var.
874 (dict_lookup_var) Use new var_create, var_destroy functions.
875 (dict_contains_var) Rewrite in terms of new vardict functionality.
876 (set_var_dict_index) New function.
877 (set_var_case_index) New function.
878 (reindex_vars) New function.
879 (dict_delete_var) Rewrite in terms of new vardict functionality.
880 (dict_reorder_var) Ditto.
881 (dict_reorder_vars) Ditto.
882 (rename_var) New function.
883 (dict_rename_var) Use rename_var.
884 (dict_rename_vars) Use pool to simplify code. Use rename_var.
885 (dict_get_compacted_idx_to_fv) Rename
886 dict_get_compacted_dict_index_to_case_index, update callers.
887 (dict_create_vector) Use new vector_create function.
888 (dict_clear_vectors) Use new vector_destroy function.
889 (set_var_short_name_suffix) Move here from variable.c, renamed
890 from var_set_short_name_suffix, make static, update caller.
892 * sys-file-private.c: New file.
893 (sfm_width_to_bytes) Moved here from variable.c, renamed from
894 width_to_bytes, update callers.
896 * sys-file-private.h: New file. Later it will supplant
897 sfm-private.h; for now it supplements it.
898 (macro MIN_VERY_LONG_STRING) New macro.
899 (macro EFFECTIVE_LONG_STRING_LENGTH) New macro, from value.h.
901 * sys-file-reader.c: Use MIN_VERY_LONG_STRING - 1 where
902 MAX_LONG_STRING was used before.
904 * sys-file-writer.c: Ditto.
906 * value-labels.c: Change the paradigm here to be that a null
907 pointer is OK for a struct val_labs * in most cases; it just
908 represents an empty set of value labels.
909 (val_labs_copy) A copy of a null set is a null set.
910 (val_labs_count) A null set has 0 labels.
911 (val_labs_replace) Change return type to void. Rewrite for
913 (val_labs_find) A null set does not contain the value.
914 (value_to_string) Moved to variable.c, renamed var_get_value_name,
915 transposed argument order, updated all callers.
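The null-pointer convention described for value-labels.c above can be sketched as follows. This is only an illustration under stated assumptions: the struct layout is a simplified, hypothetical stand-in (numeric values in a plain array), and the signatures are guesses based on the function names in the entry, not the real PSPP interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a set of value labels:
   numeric values only, stored in a plain array. */
struct val_lab
  {
    double value;
    const char *label;
  };

struct val_labs
  {
    struct val_lab *labs;
    size_t n;
  };

/* A null set has 0 labels. */
size_t
val_labs_count (const struct val_labs *vls)
{
  return vls != NULL ? vls->n : 0;
}

/* A null set does not contain the value, so the lookup returns NULL. */
const char *
val_labs_find (const struct val_labs *vls, double value)
{
  size_t i;

  for (i = 0; i < val_labs_count (vls); i++)
    if (vls->labs[i].value == value)
      return vls->labs[i].label;
  return NULL;
}

/* A copy of a null set is a null set. */
struct val_labs *
val_labs_copy (const struct val_labs *vls)
{
  struct val_labs *copy;

  if (vls == NULL)
    return NULL;

  copy = malloc (sizeof *copy);
  assert (copy != NULL);
  copy->n = vls->n;
  copy->labs = NULL;
  if (vls->n > 0)
    {
      copy->labs = malloc (vls->n * sizeof *copy->labs);
      assert (copy->labs != NULL);
      memcpy (copy->labs, vls->labs, vls->n * sizeof *copy->labs);
    }
  return copy;
}
```

With this convention callers need no special case for variables that have no labels: val_labs_count (NULL) is simply 0.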
918 (value_dup) Moved here from variable.c.
919 (compare_values) Ditto.
922 * value.h: (macro MAX_SHORT_STRING) Rewrote for simplicity.
923 (macro MAX_LONG_STRING) Removed, because it was only interesting
924 for system files, not for general code.
925 (macro MAX_VERY_LONG_STRING) Ditto.
926 (macro EFFECTIVE_LONG_STRING_LENGTH) Moved to sys-file-private.h.
927 (macro MAX_ELEMS_PER_VALUE) Removed, as it was unused.
929 * vardict.h: New file, for an interface between variables and
932 * variable.c: A lot of functions were moved around, for better
934 (struct variable) Move definition here, from variable.h.
935 (var_type_adj) Removed--makes i18n hard.
936 (var_type_noun) Ditto.
937 (var_create) New function.
938 (var_clone) New function.
939 (var_destroy) New function.
940 (var_set_name) Assert that variable is not in a dictionary.
941 (compare_var_names) Rename compare_vars_by_name and fix a couple
942 of callers who thought the args were strings.
943 (hash_var_name) Rename hash_var_by_name.
944 (compare_var_ptr_names) Rename compare_var_ptrs_by_name.
945 (hash_var_ptr_name) Rename hash_var_ptr_by_name.
946 (var_is_very_long_string) Removed, because it was only interesting
948 (var_set_missing_values) Allow the argument to be the wrong width,
949 as long as we can resize it. Simplify callers who were doing the
951 (var_get_value_labels) New function.
952 (var_has_value_labels) New function.
953 (var_set_value_labels) New function.
954 (alloc_value_labels) New function.
955 (var_add_value_label) New function.
956 (var_replace_value_label) New function.
957 (var_clear_value_labels) New function.
958 (var_lookup_value_label) New function.
959 (var_get_value_name) Moved here from value-labels.c, renamed from
960 value_to_string, transposed argument order, updated all
962 (var_to_string) Moved here, from variable-label.c.
963 (var_set_leave) New function.
964 (var_get_leave) New function.
965 (var_must_leave) New function.
966 (var_set_short_name_suffix) Moved to dictionary.c, renamed
967 set_var_short_name_suffix.
968 (var_get_dict_index) New function.
969 (var_get_case_index) New function.
970 (var_get_obs_vals) New function.
971 (var_set_obs_vals) New function.
972 (var_has_obs_vals) New function.
973 (var_get_vardict) New function.
974 (var_set_vardict) New function.
975 (var_has_vardict) New function.
976 (var_clear_vardict) New function.
977 (value_dup) Moved to value.c.
978 (compare_values) Ditto.
981 * variable.h: (enum NUMERIC) Rename VAR_NUMERIC, update all users.
982 (enum ALPHA) Rename VAR_STRING, update all users.
984 * vector.c: New file.
985 (struct vector) Moved here, from variable.h.
986 (check_widths) New function.
987 (vector_create) New function.
988 (vector_clone) New function.
989 (vector_destroy) New function.
990 (vector_get_name) New function.
991 (vector_get_var) New function.
992 (vector_get_var_cnt) New function.
993 (compare_vector_ptrs_by_name) New function.
995 * vector.h: New file.
997 Sun Dec 10 11:32:56 WST 2006 John Darrington <john@darrington.wattle.id.au>
999 * casefilter.c (casefilter_variable_missing): Avoided comparison of
1000 string variables to SYSMIS. Thanks to Ben Pfaff for reporting this
1003 Sat Dec 9 07:18:03 WST 2006 John Darrington <john@darrington.wattle.id.au>
1005 * value-labels.c (destroy_atoms): New function.
1006 * value-labels.c (atom_create): Call destroy_atoms in atexit handler.
1008 Thu Dec 7 17:38:26 2006 Ben Pfaff <blp@gnu.org>
1010 Thanks to Jason Stover for pointing out this problem.
1012 * data-out.c (output_number): Use gsl_finite from GSL, which is
1013 portable, instead of isfinite, which is not.
1016 Thu Dec 7 15:22:38 WST 2006 John Darrington <john@darrington.wattle.id.au>
1018 * variable.c variable.h (value_dup): New function.
1020 Mon Dec 4 22:20:17 2006 Ben Pfaff <blp@gnu.org>
1022 Start converting struct variable to an opaque type. In this
1023 phase, we add a bunch of setter and getter functions and convert
1024 most of the PSPP code to use them. The resulting changes are
1025 pervasive but mostly trivial, and only the notable changes are
1028 * format.c (fmt_equal): New function.
1030 * variable.c (var_type_is_valid): New function.
1031 (measure_is_valid) Moved here, from format.c.
1032 (alignment_is_valid) Moved here, from format.c.
1033 (var_get_name) New function.
1034 (var_set_name) New function.
1035 (width_to_type) New function.
1036 (var_get_type) New function.
1037 (var_get_width) New function.
1038 (var_set_width) New function.
1039 (var_is_numeric) New function.
1040 (var_is_alpha) New function.
1041 (var_is_short_string) New function.
1042 (var_is_long_string) New function.
1043 (var_is_very_long_string) New function.
1044 (var_get_missing_values) New function.
1045 (var_set_missing_values) New function.
1046 (var_clear_missing_values) New function.
1047 (var_has_missing_values) New function.
1048 (var_is_value_missing) New function.
1049 (var_is_num_missing) New function.
1050 (var_is_str_missing) New function.
1051 (var_is_value_user_missing) New function.
1052 (var_is_num_user_missing) New function.
1053 (var_is_str_user_missing) New function.
1054 (var_is_value_system_missing) New function.
1055 (var_get_print_format) New function.
1056 (var_set_print_format) New function.
1057 (var_get_write_format) New function.
1058 (var_set_write_format) New function.
1059 (var_set_both_formats) New function.
1060 (var_get_label) New function.
1061 (var_set_label) New function.
1062 (var_clear_label) New function.
1063 (var_has_label) New function.
1064 (var_get_measure) New function.
1065 (var_set_measure) New function.
1066 (var_get_display_width) New function.
1067 (var_set_display_width) New function.
1068 (var_get_alignment) New function.
1069 (var_set_alignment) New function.
1070 (var_get_value_cnt) New function.
1071 (var_get_leave) New function.
1072 (var_get_short_name) New function.
1074 * variable.h: (struct variable) Removed "type" and "nv" members;
1075 they are now computed from "width" where needed.
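A minimal sketch of the accessor style this entry introduces, with "type" and the value count derived from "width" instead of being stored. The accessor names come from the list above, but everything else is an assumption made for illustration: the signatures, the enum names, the rule that width 0 means numeric, the 8-byte value size, and the var_create_sketch helper are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type names; width 0 is assumed to mean "numeric". */
enum var_type { VAR_NUMERIC, VAR_STRING };

/* Assumption: one value holds up to 8 bytes of string data. */
#define VALUE_BYTES 8

/* In an opaque design this definition would live in the .c file only,
   so that clients must go through the accessors below. */
struct variable
  {
    int width;
  };

/* Hypothetical constructor, for the sake of the example. */
struct variable *
var_create_sketch (int width)
{
  struct variable *v = malloc (sizeof *v);
  assert (v != NULL && width >= 0);
  v->width = width;
  return v;
}

int
var_get_width (const struct variable *v)
{
  return v->width;
}

/* The type is computed from the width rather than stored. */
enum var_type
width_to_type (int width)
{
  return width == 0 ? VAR_NUMERIC : VAR_STRING;
}

enum var_type
var_get_type (const struct variable *v)
{
  return width_to_type (v->width);
}

/* The number of values is also computed from the width. */
size_t
var_get_value_cnt (const struct variable *v)
{
  return v->width == 0 ? 1 : ((size_t) v->width + VALUE_BYTES - 1) / VALUE_BYTES;
}
```

Because nothing but "width" is stored, the derived properties cannot drift out of sync with it.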
1077 Mon Dec 4 21:38:40 2006 Ben Pfaff <blp@gnu.org>
1079 * missing-values.c (mv_resize): Don't write beyond end of the
1080 allocated buffer when resizing a long string.
1082 Sat Dec 2 16:28:32 2006 Ben Pfaff <blp@gnu.org>
1084 Clean up identifier code: don't require identifier enumerations to
1085 be in a particular order; make better use of string library;
1086 expose less of the internals.
1088 * identifier.c: (lex_skip_identifier) Rename lex_id_get_length,
1089 change interface. Updated all callers.
1090 (lex_id_match) Change interface to use struct substring, update
1092 (lex_id_match_len) Removed. Update callers to use lex_id_match.
1093 (global array keywords[]) Make static, change form. Update all
1094 users to use lex_id_name instead.
1095 (lex_is_keyword) New function.
1096 (lex_id_to_token) Change interface to use struct substring, update
1098 (lex_id_name) New function.
1100 * identifier.h: (T_FIRST_KEYWORD) Removed. Changed users to call
1101 lex_is_keyword instead.
1102 (T_LAST_KEYWORD) Removed.
1103 (T_N_KEYWORDS) Removed.
1105 Sat Nov 18 20:46:35 2006 Ben Pfaff <blp@gnu.org>
1107 * format.c: (fmt_date_template) Distinguish characters for which a
1108 space is output and any date delimiter is allowed on input, from
1109 those for which a space is output and only a space is allowed on
1110 input. The former is represented by X, the latter by a space.
1111 Also, drop distinction between h and H, changing the former to the
1114 * data-in.c: Completely rewrite internals to conform to SPSS input
1115 formats as closely as possible.
1116 (data_in) Changed external interface by replacing the structure
1117 that was used as a single argument by a set of arguments. Updated
1119 (data_in_finite_line) Removed. Converted all callers to use plain
1121 (data_in_get_integer_format) New function.
1122 (data_in_set_integer_format) New function.
1123 (data_in_get_float_format) New function.
1124 (data_in_set_float_format) New function.
1126 * data-in.h: (enums DI_IGNORE_ERROR, DI_IMPLIED_DECIMALS) Removed.
1127 (struct data_in) Removed.
1129 * data-out.c: (output_date) Drop each component from the input as
1130 it is output, to allow us to drop the distinction between h (a
1131 count of hours) and H (the hour of day) template characters.
1132 Also, handle new X template character.
1133 (output_scientific) Follow more rational rule on when to drop
1134 fraction introduced between SPSS 13 and 15. Updated test case to
1137 Sat Nov 11 11:41:26 2006 Ben Pfaff <blp@gnu.org>
1139 Fix buffer overflow reported by John Darrington.
1141 * data-out.c (output_bcd_integer): In case of SYSMIS, etc.,
1142 realize that DIGITS is a count of nibbles, not of bytes.
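The distinction between nibbles and bytes can be illustrated with a short sketch. The helper names below are hypothetical and are not taken from data-out.c; the point is only that a field of DIGITS packed BCD digits occupies half as many bytes, rounded up, so clearing DIGITS bytes would overrun the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the number of bytes needed to hold DIGITS packed BCD digits.
   Each byte stores two digits (nibbles), so an odd count rounds up. */
size_t
bcd_byte_cnt (int digits)
{
  assert (digits >= 0);
  return ((size_t) digits + 1) / 2;
}

/* Hypothetical fallback for values that cannot be represented (such as
   SYSMIS): zero-fills exactly the bytes that DIGITS nibbles occupy. */
void
bcd_fill_zero (unsigned char *output, int digits)
{
  memset (output, 0, bcd_byte_cnt (digits));
}
```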
1144 Sat Nov 4 15:59:56 2006 Ben Pfaff <blp@gnu.org>
1146 * calendar.c (calendar_offset_to_gregorian) Also return the
1147 day-of-year. Change callers to new interface.
1149 * data-out.c: Completely rewrite internals to conform to SPSS
1150 output formats as completely as possible.
1151 (data_out) Change interface to put input parameters before output
1152 parameters, for consistency with the style I now prefer. Update
1154 (data_out_get_integer_format) New public function.
1155 (data_out_set_integer_format) New public function.
1156 (data_out_get_float_format) New public function.
1157 (data_out_set_float_format) New public function.
1159 * data-out.h: New file. Move prototype for data_out here, from
1162 * format.c: (fmt_step_width) Use equality comparison instead of
1163 bitwise and, for clarity.
1164 (fmt_is_string) Ditto.
1165 (fmt_input_to_output) Fix categories that are translated to F
1168 Sun Nov 5 08:29:34 WST 2006 John Darrington <john@darrington.wattle.id.au>
1170 * casefilter.c casefilter.h (new files), casefile.c casefile.h
1171 casefile-private.h: Added casefilter to assist commands with missing
1174 Sat Nov 4 11:47:09 2006 Ben Pfaff <blp@gnu.org>
1176 Implement SET ERRORS, SHOW ERRORS. Fixes bug #17609.
1178 * settings.c: (route_errors_to_terminal) New variable.
1179 (route_errors_to_listing) New variable.
1180 (get_error_routing_to_terminal) New function.
1181 (set_error_routing_to_terminal) New function.
1182 (get_error_routing_to_listing) New function.
1183 (set_error_routing_to_listing) New function.
1185 * settings.h: (SET_ROUTE_* enums) Removed, because unused.
1187 Tue Oct 31 19:58:27 2006 Ben Pfaff <blp@gnu.org>
1189 * format.c: Completely rewrite, to achieve better abstraction.
1190 Rewrite all references to formats in other files.
1192 * format.def: Rewrite and reorganize.
1194 * settings.c: Move everything related to custom currency formats
1195 into format.[ch], changing them in form, so as to group related
1196 code and definitions better. Changed all references to use the
1198 (static var decimal) Removed.
1199 (static var grouping) Removed.
1200 (static var cc) Removed.
1201 (get_decimal) Removed.
1202 (set_decimal) Removed.
1203 (get_grouping) Removed.
1204 (set_grouping) Removed.
1208 * settings.h: (macro CC_CNT) Removed.
1209 (macro CC_WIDTH) Removed.
1210 (struct custom_currency) Removed.
1212 Tue Oct 31 19:56:19 2006 Ben Pfaff <blp@gnu.org>
1214 * data-in.c (data_in): Use switch statement instead of table, to
1215 avoid dependence on the order of the FMT_* enums.
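A sketch of the idea, assuming a hypothetical subset of format types and placeholder parser functions (none of these names or signatures are taken from data-in.c). A table indexed by the enum value silently breaks if the enumerators are reordered, whereas a switch names each enumerator explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of format types; their order is irrelevant here. */
enum fmt_type { FMT_F, FMT_A, FMT_DATE };

typedef bool parse_func (const char *input);

/* Placeholder parsers, just enough to make the dispatch observable. */
bool
parse_number (const char *input)
{
  assert (input != NULL);
  return *input != '\0';
}

bool
parse_string (const char *input)
{
  assert (input != NULL);
  return true;
}

bool
parse_date (const char *input)
{
  assert (input != NULL);
  return *input != '\0';
}

/* Selects a parser by naming each enumerator, not by indexing a table
   with the enum's numeric value. */
parse_func *
parser_for_format (enum fmt_type type)
{
  switch (type)
    {
    case FMT_F:
      return parse_number;
    case FMT_A:
      return parse_string;
    case FMT_DATE:
      return parse_date;
    }
  return NULL;
}
```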
1217 Tue Oct 31 19:35:36 2006 Ben Pfaff <blp@gnu.org>
1219 * data-out.c: (num_to_string) Removed, because it was dead code.
1221 Tue Oct 31 18:09:24 2006 Ben Pfaff <blp@gnu.org>
1223 * data-in.c (parse_trailer): Fix error message.
1225 Sat Oct 28 11:56:50 2006 Ben Pfaff <blp@gnu.org>
1227 * format.c (fmt_is_binary): New function.
1229 Thu Oct 19 22:59:56 WST 2006 John Darrington <john@darrington.wattle.id.au>
1231 * procedure.c procedure.h: Encapsulated the static data into a single
1234 Sat Oct 14 16:56:44 2006 Ben Pfaff <blp@gnu.org>
1236 * casefile.c (casereader_read_xfer): Always initialize the case,
1237 even on an error condition.
1239 Wed Sep 27 09:37:49 WST 2006 John Darrington <john@darrington.wattle.id.au>
1241 * procedure.c (case_limit_trns_proc): Fixed buglet which rendered the
1242 entire function useless.
1244 Mon Sep 25 17:11:46 WST 2006 John Darrington <john@darrington.wattle.id.au>
1246 * casefile-private.h casefile.c casefile.h fastfile.c: Created new
1247 casereader method casereader_clone.
1249 * procedure.c transformations.h: Introduced new type casenum_t
1251 Thu Sep 21 07:00:30 2006 Ben Pfaff <blp@gnu.org>
1253 * variable.c: (width_to_bytes) Rephrase code for clarity.
1255 Sun Jul 16 19:52:03 2006 Ben Pfaff <blp@gnu.org>
1257 * format.c: (fmt_type_from_string) New function.
1258 (fmt_to_string) Include decimals in output if the format has
1259 decimals, even if the format type does not. This way, we can
1260 accurately reproduce incorrect formats in user output.
1261 (check_common_specifier) Make the check for a bad format type an
1262 assertion, so we get bug reports if they show up. Fix message.
1263 Check for decimal places with a format type that doesn't allow
1265 (check_input_specifier) Remove check for FMT_X, which has been
1267 (check_output_specifier) Ditto.
1269 * format.def: Remove FMT_T, FMT_X, FMT_DESCEND, FMT_NEWREC.
1271 * format.h: (macro FMT_TYPE_LEN_MAX) New macro.
1272 (struct fmt_desc) Use FMT_TYPE_LEN_MAX in definition.
1273 (enum fmt_parse_flags) Removed.
1275 Mon Jul 17 18:26:21 WST 2006 John Darrington <john@darrington.wattle.id.au>
1277 * casefile.c casefile.h: Converted to an abstract base class.
1278 * casefile-private.h fastfile.c fastfile.h: New files.
1279 * automake.mk procedure.c scratch-writer.c storage-stream.c
1281 Wed Jul 12 21:02:26 2006 Ben Pfaff <blp@gnu.org>
1283 * procedure.c (internal_procedure): Create sink_case with only as
1284 many values as the compacted dictionary.
1286 Wed Jul 12 21:01:00 2006 Ben Pfaff <blp@gnu.org>
1288 Remove "debugging" code that caused plenty of false positives and
1291 * case.h (struct ccase): [DEBUGGING] Remove `this' member.
1293 * case.c: Remove all references to `this' member.
1295 Thu Jul 6 19:09:53 2006 Ben Pfaff <blp@gnu.org>
1297 Fix link error noted by Jason Stover.
1299 * storage-stream.c: Include <assert.h>.
1301 Tue Jul 4 08:47:35 2006 Ben Pfaff <blp@gnu.org>
1303 Fix bug #15766 (/KEEP subcommand on SAVE doesn't fully support
1304 ALL) and additional underlying system file issues.
1306 Thanks to John Darrington for review.
1308 First problem: var_hash points to variables not owned by the
1309 sys-file-reader, which the caller may free or modify. Use an
1310 array of sfm_vars instead, as done earlier (e.g. CVS version
1313 * sys-file-reader.c (struct sfm_reader): Remove var_hash, svars
1314 members and remove all code that references it. Add vars, var_cnt
1315 members. Remove fix_specials member, which was unused.
1316 (struct sfm_var) Remove name member, which was unused.
1317 (sfm_close_reader) Free vars member instead of var_hash.
1318 (compare_var_shortnames) Removed.
1319 (hash_var_shortname) Removed.
1320 (sfm_open_reader) Fill out vars array.
1321 (compare_var_index) Removed.
1322 (sfm_read_case) Use vars instead of var_hash.
1324 Second problem: we're confused about when we actually have very
1325 long strings, causing us to choose incorrectly between slow path
1326 and fast path in sfm_read_case.
1328 * sys-file-reader.c: (sfm_open_reader) Only mark has_vls if we
1329 have very long strings, not when we have long variable names,
1330 which is an unrelated feature.
1332 Tue Jun 27 12:06:49 2006 Ben Pfaff <blp@gnu.org>
1334 * variable.h: Move var_set and variable parsing declarations to
1335 new header, src/language/lexer/variable-parser.h. Modified lots
1336 of files to include the new header.
1338 Sun Jun 25 22:39:32 2006 Ben Pfaff <blp@gnu.org>
1340 * value-labels.c (value_to_string): When there's no value label,
1341 format the variable according to its print format, instead of
1342 always effectively using A or F format.
1344 Mon Jun 19 18:05:42 WST 2006 John Darrington <john@darrington.wattle.id.au>
1346 * casefile.c (casefile_get_random_reader): Nasty hack to get around
1349 * format.c: Removed tautological assertion.
1351 Fri Jun 9 12:20:09 2006 Ben Pfaff <blp@gnu.org>
1353 Reform string library.
1355 * file-name.c (fn_interp_vars): Change interface to take a
1356 substring as input. Updated all users.
1358 Fri Jun 9 12:11:24 2006 Ben Pfaff <blp@gnu.org>
1360 * format.c (measure_is_valid): Really return false when m >=
1363 Tue Jun 6 18:46:26 2006 Ben Pfaff <blp@gnu.org>
1365 Implement random access to casefiles, for use in GUI.
1367 * casefile.c: (struct casereader) Add `random', `file_ofs',
1368 `buffer_ofs' members.
1369 (casefile_get_random_reader) New function.
1370 (read_open_file) Break part into new function
1371 seek_and_fill_buffer().
1372 (fill_buffer) Update buffer_ofs, file_ofs.
1373 (casereader_seek) New function.
1375 Tue May 30 19:52:33 WST 2006 John Darrington <john@darrington.wattle.id.au>
1377 * settings.c: Added call to i18n{done, init}.
1379 Tue May 9 21:09:17 2006 Ben Pfaff <blp@gnu.org>
1381 * procedure.h: Add WARN_UNUSED_RESULT to procedure function
1384 Tue May 9 21:08:05 2006 Ben Pfaff <blp@gnu.org>
1386 * casefile.c: Convert many uses of `int' to `bool'.
1388 Sat May 6 22:49:43 2006 Ben Pfaff <blp@gnu.org>
1390 * transformations.c (trns_chain_destroy): Destroy chain's trns
1391 member, to fix memory leak.
1393 Sat May 6 22:48:30 2006 Ben Pfaff <blp@gnu.org>
1395 * storage-stream.c (storage_source_decapsulate): Destroy case
1396 source to fix memory leak.
1398 Sat May 6 22:46:47 2006 Ben Pfaff <blp@gnu.org>
1400 * scratch-reader.c (scratch_reader_read_case): Copy into existing
1401 case passed as argument instead of initializing the argument as a
1402 case. Fixes memory leak that showed up in
1403 tests/command/aggregate.sh with scratch files.
1405 Sat May 6 22:45:55 2006 Ben Pfaff <blp@gnu.org>
1407 * procedure.c (proc_done): Destroy default_dict, to fix memory
1410 Sat May 6 22:44:44 2006 Ben Pfaff <blp@gnu.org>
1412 Simplify procedure_with_splits().
1414 * procedure.c (struct split_aux_data): Removed case_count member.
1415 (procedure_with_splits) Don't initialize case_count.
1416 (split_procedure_case_func) Check whether prev_case is null
1417 instead of case_count.
1418 (split_procedure_end_func) Ditto.
1420 Sat May 6 22:42:23 2006 Ben Pfaff <blp@gnu.org>
1422 * case.c (case_move): Do nothing if dst and src are the same
1424 (case_try_create) Merge two similar cases.
1425 (case_copy) Unshare only if data must be actually copied.
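Why a self-move needs a guard can be seen in a small sketch. The reference-counted layout below is a hypothetical simplification and not the real struct ccase. It assumes case_move destroys the destination before taking over the source's data, in which case moving a case onto itself without the check would free the very data being moved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted case data. */
struct case_data
  {
    int ref_cnt;
    size_t value_cnt;
    double values[];
  };

struct ccase
  {
    struct case_data *data;
  };

/* Creates a case with VALUE_CNT zeroed values; returns false on failure. */
bool
case_create (struct ccase *c, size_t value_cnt)
{
  c->data = calloc (1, sizeof *c->data + value_cnt * sizeof (double));
  if (c->data == NULL)
    return false;
  c->data->ref_cnt = 1;
  c->data->value_cnt = value_cnt;
  return true;
}

/* Drops C's reference and frees the data if it was the last one. */
void
case_destroy (struct ccase *c)
{
  if (c->data != NULL && --c->data->ref_cnt == 0)
    free (c->data);
  c->data = NULL;
}

/* Replaces DST by SRC and nullifies SRC.
   Does nothing if DST and SRC are the same case. */
void
case_move (struct ccase *dst, struct ccase *src)
{
  assert (dst != NULL && src != NULL);
  if (dst != src)
    {
      case_destroy (dst);
      *dst = *src;
      src->data = NULL;
    }
}
```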
1427 Sun May 7 10:04:06 WST 2006 John Darrington <john@darrington.wattle.id.au>
1429 * data-in.c data-out.c dictionary.c sys-file-reader.c
1430 sys-file-writer.c variable.c variable.h: Reworked very long string
1431 support for better encapsulation.
1433 Sat May 6 19:02:00 2006 Ben Pfaff <blp@gnu.org>
1435 * value-labels.c (val_labs_can_set_width): New function.
1436 (val_labs_set_width) Clear labels if increasing width to long
1438 (val_labs_destroy) Remove unneeded test for null.
1440 Sat May 6 16:14:08 2006 Ben Pfaff <blp@gnu.org>
1442 * value-labels.h: Remove unneeded dependency on variable.h.
1444 Sat May 6 15:58:36 2006 Ben Pfaff <blp@gnu.org>
1446 Get rid of `char *c' member in union value, for cleanliness.
1448 * value.h: (union value) Remove `c' member.
1450 Sat May 6 15:36:59 2006 Ben Pfaff <blp@gnu.org>
1452 Make dictionary compacting functions a little more general.
1454 * sys-file-writer.c (sfm_open_writer): Use
1455 dict_compacting_would_change().
1456 (does_dict_need_translation) Removed.
1458 Sat May 6 15:35:42 2006 Ben Pfaff <blp@gnu.org>
1460 Make dictionary compacting functions a little more general.
1462 * dictionary.c (dict_needs_compaction): Rename
1463 dict_compacting_would_shrink(). Update all callers.
1464 (dict_compacting_would_change) New function.
1466 Sat May 6 14:25:49 2006 Ben Pfaff <blp@gnu.org>
1468 * sys-file-writer.c: (does_dict_need_translation) Fix bug:
1469 inverted return value (!).
1471 Sat May 6 13:37:52 2006 Ben Pfaff <blp@gnu.org>
1473 Continue reforming procedure execution.
1475 * procedure.c: Search and replace "vfm" by "proc". Notably:
1476 (static var vfm_source) Rename proc_source. Update all
1478 (static var vfm_sink) Rename proc_sink. Update all references.
1480 Sat May 6 12:38:55 2006 Ben Pfaff <blp@gnu.org>
1482 Continue reforming procedure execution. In this phase, remove
1483 PROCESS IF, which was deprecated anyway and can be easily
1484 simulated with TEMPORARY followed by SELECT IF.
1486 * procedure.c: (open_active_file) Don't call
1487 add_process_if_trns().
1488 (discard_variables) Get rid of redundant call to
1489 proc_cancel_all_transformations().
1490 (add_process_if_trns) Removed.
1491 (process_if_trns_proc) Removed.
1492 (process_if_trns_free) Removed.
1494 Sat May 6 10:58:05 2006 Ben Pfaff <blp@gnu.org>
1496 Continue reforming procedure execution. In this phase, add
1497 `const' to the case passed to procedure()'s callback.
1499 Updated all users of procedure() as well.
1501 * procedure.c: (struct write_case_data) Add "const" to ccase
1502 parameter for case_func member.
1503 (procedure) Add "const" to ccase parameter for proc_func
1505 (multipass_case_func) Make ccase parameter const.
1506 (internal_procedure) Add "const" to ccase parameter for case_func
1508 (split_procedure_case_func) Make ccase parameter const.
1509 (multipass_split_case_func) Make ccase parameter const.
1511 Sat May 6 10:30:33 2006 Ben Pfaff <blp@gnu.org>
1513 Continue reforming procedure execution. In this phase, get rid of
1514 the output code for SPLIT FILE groups in procedure.c, which really
1515 shouldn't be doing any output. Move it into the individual
1516 procedures instead. This also adds some flexibility.
1518 Updated many users of procedure_with_splits() and
1519 multipass_procedure_with_splits() to call
1520 output_split_file_values() and to deal with increased use of
1523 * procedure.c: (struct split_aux_data) Add "const struct ccase *"
1524 parameter to begin_func member.
1525 (procedure_with_splits) Add "const struct ccase *" parameter to
1526 begin_func parameter. Make ccase parameter const in proc_func
1528 (split_procedure_case_func) Don't dump split file group. Pass
1530 (dump_splits) Moved to language/dictionary/split-file.c as
1531 output_split_file_values().
1532 (struct multipass_split_aux_data) Add "const struct ccase *"
1533 parameter to split_func member.
1534 (multipass_procedure_with_splits) Add "const struct ccase *"
1535 parameter to split_func parameter.
1536 (multipass_split_case_func) Save new SPLIT FILE case before
1538 (multipass_split_output) Pass saved SPLIT FILE case to split_func.
1540 Fri May 5 22:48:50 2006 Ben Pfaff <blp@gnu.org>
1542 Continue reforming procedure execution. Change
1543 internal_procedure() so that it calls open_active_file() and
1544 close_active_file(), which isolates most of the actual procedure
1547 * procedure.c: (struct write_case_data) Rename `proc_func' member
1548 to `case_func' and update all references.
1549 (procedure) Rewrite as one-line wrapper around
1550 internal_procedure().
1551 (struct multipass_aux_data) New.
1552 (multipass_callback) Renamed multipass_case_func(). Use struct
1553 multipass_aux_data as auxiliary data.
1554 (multipass_end_func) New function.
1555 (multipass_procedure) Rewrite as wrapper for internal_procedure()
1556 that uses multipass_case_func, multipass_end_func.
1557 (internal_procedure) Add `end_func' argument. Move optimization
1558 of trivial case in here. Move call to open_active_file() and
1559 close_active_file() in here. Now assert that vfm_source is
1561 (procedure_with_splits_callback) Rename
1562 split_procedure_case_func().
1563 (split_procedure_end_func) New function.
1564 (multipass_split_callback) Rename multipass_split_case_func.
1565 (multipass_split_end_func) New function.
1566 (discard_variables) No need to test for nonnull vfm_source.
1568 Fri May 5 21:34:02 2006 Ben Pfaff <blp@gnu.org>
1570 Continue reforming procedure execution. Get rid of unused member.
1572 * procedure.c: (struct write_case_data) Remove `cases_analyzed'
1574 (write_case) Don't increment cases_analyzed.
1576 Thu May 4 21:50:11 2006 Ben Pfaff <blp@gnu.org>
1578 Continue reforming procedure execution. In this phase, move
1579 procedure.c and procedure.h from src to src/data. Update
1580 makefiles and #includes accordingly.
1582 * procedure.c: Moved here from src/.
1584 * procedure.h: Moved here from src/.
1586 Wed May 3 22:42:12 2006 Ben Pfaff <blp@gnu.org>
1588 Continue reforming procedure execution. In this phase, get rid of
1589 many global variables, consolidating procedure execution in
1590 procedure.c. Encapsulate transformations in new "struct
1591 trns_chain". Also, change implementation of N OF CASES, FILTER,
1592 and PROCESS IF from special cases to transformations.
1594 * automake.mk: (src_data_libdata_a_SOURCES) Add transformations.c,
1597 * dictionary.c: (global variable default_dict) Move to
1600 * variable.h: (TRNS_*) Move to transformations.h.
1601 (struct transformation) Move to transformations.c.
1603 Thu May 4 13:47:06 WST 2006 John Darrington <john@darrington.wattle.id.au>
1605 * sys-file-reader.c: Fixed invalid read problems.
1607 Tue May 2 15:57:10 2006 Ben Pfaff <blp@gnu.org>
1609 * storage-stream.c: Add missing function comments.
1611 Tue May 2 15:50:21 2006 Ben Pfaff <blp@gnu.org>
1613 Continue reforming procedure execution. In this phase, add some
1614 new, needed functionality to storage-stream.
1616 * storage-stream.c: (storage_source_decapsulate) New function.
1618 Tue May 2 15:43:36 2006 Ben Pfaff <blp@gnu.org>
1620 * variable.c (width_to_bytes): Declarations must precede
1621 statements for C90 compliance.
1623 Tue May 2 10:42:05 WST 2006 John Darrington <john@darrington.wattle.id.au>
1625 * data-out.c, data-in.c, variable.c, variable.h: New functions
1626 copy_mangle and copy_demangle for reading/writing cases; emulates the
1627 way SPSS deals with strings > 255 bytes.
1629 * sys-file-reader.c sys-file-writer.c: Added support for Record 7,
1630 subtype 14 needed for strings longer than 255 bytes.
1632 * dictionary.c, format.def, value.c: Updated to use MAX_STRING
1633 instead of literal values. Also fixed some constness issues.
1635 * format.h: Constness
1637 * sfm-private.h: Renamed the case_size identifier, since I discovered
1638 that SPSS's respect for this variable is very nominal.
1640 Mon May 1 15:45:42 2006 Ben Pfaff <blp@gnu.org>
1642 Change case limit type from int to size_t.
1644 * dictionary.c: (struct dictionary) Change type of case_limit
1646 (dict_get_case_limit) Change return type.
1647 (dict_set_case_limit) Change parameter type.
1649 Wed Apr 26 20:01:19 2006 Ben Pfaff <blp@gnu.org>
1651 * variable.h: (struct variable) Rename `reinit' member as `leave'
1652 and invert sense. Fix up all references.
1654 Wed Apr 26 19:39:28 2006 Ben Pfaff <blp@gnu.org>
1656 Continue reforming procedure execution. In this phase, break
1657 procedure.c into multiple files.
1659 * automake.mk: (src_data_libdata_a_SOURCES) Add all the new files.
1661 * case-sink.c: New file.
1663 * case-sink.h: New file.
1665 * case-source.c: New file.
1667 * case-source.h: New file.
1669 * storage-stream.c: New file.
1671 * storage-stream.h: New file.
1673 Wed Apr 26 14:55:19 2006 Ben Pfaff <blp@gnu.org>
1675 * variable.h: (struct variable) Remove `init' member and all
1676 references to it from other files. It was initialized in several
1677 places, but nothing really ever used it for anything worthwhile.
1678 Thanks to Jason Stover for pointing out how confusing this
1681 Sun Apr 23 22:04:45 2006 Ben Pfaff <blp@gnu.org>
1683 Continue reforming error message support. In this phase, get rid
1684 of message "titles" and put the message text in `struct error'.
1685 Now `struct error' encapsulates a message more properly.
1687 * casefile.c: (io_error) Use err_msg() instead of err_vmsg().
1688 Format message ourselves.
1690 * data-in.c: (vdls_error) Ditto.
1692 * por-file-reader.c: (error) Ditto.
1694 * sys-file-reader.c: (corrupt_msg) Ditto.
1696 Sun Apr 16 18:49:51 2006 Ben Pfaff <blp@gnu.org>
1698 GNU standards require "file name" instead of "filename" in
1699 documentation. It's nice for our code to follow the convention
1702 * casefile.c: (struct casefile) Rename `filename' member to
1703 `file_name'. Updated all references.
1705 * file-name.c: [!unix] (struct file_identity) Rename
1706 normalized_filename member to normalized_file_name. Updated all
1709 Sun Apr 16 18:35:33 2006 Ben Pfaff <blp@gnu.org>
1711 We don't really support anything but Unix-like environments well,
1712 so we might as well de-obfuscate by writing directory and path
1713 separators explicitly.
1715 * file-name.h: (macro DIR_SEPARATOR) Removed. Changed all usages
1717 (macro PATH_SEPARATOR) Removed. Changed all usages to just ':'.
1718 (macro DIR_SEPARATOR_STRING) Removed. Changed all usages to just
1720 (macro PATH_SEPARATOR_STRING) Removed. Changed all usages to just
1723 Sun Apr 16 18:28:35 2006 Ben Pfaff <blp@gnu.org>
1725 GNU standards require "file name" instead of "filename" in
1726 documentation. It's nice for our code to follow the convention
1729 * filename.c: Rename to file-name.c.
1731 * filename.h: Rename to file-name.h. Update all inclusions.
1732 Update header guards.
1734 * automake.mk: Update file names.
1736 Sun Apr 16 16:42:47 2006 Ben Pfaff <blp@gnu.org>
1738 * filename.c: (fn_dirname) Renamed fn_dir_name(), all references
1740 (fn_basename) Removed (dead code).
1741 (fn_absolute_p) Renamed fn_is_absolute(), all references updated.
1742 (fn_special_p) Renamed fn_is_special(), all references updated.
1743 (fn_exists_p) Renamed fn_exists(), all references updated.
1745 Sun Apr 16 16:33:58 2006 Ben Pfaff <blp@gnu.org>
1747 * filename.c: (fn_tilde_expand) Rewrite for cleaner code.
1748 Also, now it only tilde-expands file names, not paths.
1749 (fn_search_path) Tilde-expand one directory at a time.
1751 Sun Apr 16 16:28:06 2006 Ben Pfaff <blp@gnu.org>
1753 * filename.c: (fn_search_path) Rewrite for cleaner code. Also,
1754 get rid of non-Unixlike version of the code, which has probably
1756 (fn_prepend_dir) Removed (dead code).
1758 * filename.h: (macro DIR_SEPARATOR_STRING) New.
1759 (macro PATH_SEPARATOR_STRING) New.
1760 Sun Apr 16 16:05:28 2006 Ben Pfaff <blp@gnu.org>
1762 Continue reforming error message support. In this phase, we get
1763 rid of VM() and the other msg() support for "verbosity", replacing
1764 it by a new function verbose_msg().
1766 * filename.c: (fn_search_path) Use verbose_msg() instead of
1769 Sat Apr 15 19:53:19 2006 Ben Pfaff <blp@gnu.org>
1771 * sfm-private.h: Get rid of #defines after #error, which makes no
1774 Sat Apr 15 19:48:57 2006 Ben Pfaff <blp@gnu.org>
1776 Get rid of our own int32 type in favor of the standard int32_t
1779 * sfm-private.h: (int32 macro) Don't define this anymore. Do
1782 * sys-file-reader.c: Use int32_t instead of int32 throughout.
1784 * sys-file-writer.c: Use int32_t instead of int32 throughout.
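As an illustration of the standard type in use, the sketch below decodes four little-endian bytes into an int32_t from <stdint.h>. The helper name is hypothetical and the byte order is an assumption chosen for the example; it is not meant to describe how sys-file-reader.c actually reads integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: interprets BYTES as a little-endian,
   two's-complement 32-bit integer. */
int32_t
read_le_int32 (const unsigned char bytes[4])
{
  uint32_t u;
  int32_t x;

  assert (bytes != NULL);
  u = (uint32_t) bytes[0]
      | (uint32_t) bytes[1] << 8
      | (uint32_t) bytes[2] << 16
      | (uint32_t) bytes[3] << 24;

  /* Copy the bit pattern instead of converting, so that values with the
     high bit set do not rely on implementation-defined behavior. */
  memcpy (&x, &u, sizeof x);
  return x;
}
```

Unlike a home-grown int32 macro, int32_t is exactly 32 bits wide wherever the implementation provides it, so no per-platform configuration is needed.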
1786 Sat Apr 15 19:36:47 2006 Ben Pfaff <blp@gnu.org>
1788 Remove ill-considered file routines that are no longer used.
1790 * filename.c: (fn_open_ext) Removed.
1791 (fn_close_ext) Removed.
1793 * filename.h: (struct file_ext) Removed.
1795 Mon Apr 3 13:22:39 2006 Ben Pfaff <blp@gnu.org>
1797 * variable.c (var_is_valid_name): Move declarations before code
1800 Tue Apr 4 15:28:40 WST 2006 John Darrington <john@darrington.wattle.id.au>
1802 * filename.[ch] (fn_interp_vars): Fixed small buglet.
1804 Tue Mar 28 13:47:16 WST 2006 John Darrington <john@darrington.wattle.id.au>
1806 * filename.[ch] (fn_interp_vars): Changed the signature and semantics
1807 so as to modify the string inline, thus making it easier to
1808 destroy the results when no longer needed.
1810 2006-03-25 Jason Stover <jhs@math.gcsu.edu>
1812 * category.c (cat_stored_values_destroy): Fixed memory leak.
1814 Fri Mar 24 18:15:41 2006 Ben Pfaff <blp@gnu.org>
1816 Add some missing frees. Thanks to John Darrington for reporting
1819 * any-writer.c (any_writer_close): Free writer.
1821 * any-reader.c (any_reader_close): Free reader.
1823 Mon Mar 20 16:33:53 2006 Ben Pfaff <blp@gnu.org>
1825 * por-file-reader.c: (error) Mark as NO_RETURN.
1827 Sat Mar 11 15:06:07 WST 2006 John Darrington <john@darrington.wattle.id.au>
1829 * settings.c: Changed default value of scompress to true.
1831 Sat Mar 4 13:22:51 2006 Ben Pfaff <blp@gnu.org>
1833 * sfm-private.h: Include variable.h, to get SHORT_NAME_LEN.
1835 * value.h: Remove check on MAX_SHORT_STRING, which I don't think
1838 * variable.h: Move definition of SHORT_NAME_LEN, LONG_NAME_LEN
1839 here from pref.h.orig.
1841 Sat Mar 4 12:50:48 WST 2006 John Darrington <john@darrington.wattle.id.au>
1843 * sys-file-reader.c: Fixed bug reading compressed files.
1845 Thu Mar 2 08:40:33 WST 2006 John Darrington <john@darrington.wattle.id.au>
1847 * Numerous renames. See src/ChangeLog for details.
1849 * Moved files from src directory