2008-05-15 Ben Pfaff Patch #6512. * inpt-pgm.c (reread_trns_proc): Use gsl_finite instead of finite, as a stopgap measure for portability until appropriate gnulib modules are available. 2008-02-06 John Darrington * get-data.c: Add a /BSIZE subcommand to PSQL reader. 2008-02-02 John Darrington * get-data.c (cmd_get_data): Support PSQL type. 2007-12-07 Ben Pfaff Patch #6302. * data-parser.c (data_parser_make_active_file): Fix case count argument to casereader_create_sequential, which fixes data reading in the GUI. Provided by John Darrington. 2007-12-04 Ben Pfaff Move DATA LIST parsing into generic infrastructure, and generalize it slightly. Then, use the same infrastructure to implement GET DATA/TYPE=TXT. * data-parser.c: New file. * data-parser.h: New file. * data-list.c (struct dls_var_spec): Removed. (ll_to_dls_var_spec): Removed. (enum dls_type): Removed. (struct data_list_pgm): Rename struct data_list_trns. Remove pool, specs, type, record_cnt, delims, skip_records, value_cnt members. Add new `parser' member. (cmd_data_list): Use data-parser infrastructure. (parse_fixed): Ditto. (parse_free): Ditto. (dump_fixed_table): Removed. (dump_free_table): Removed. (cut_field): Removed. (read_from_data_list): Removed. (read_from_data_list_fixed): Removed. (read_from_data_list_free): Removed. (read_from_data_list_list): Removed. (data_list_trns_free): Rename arguments for clarity. (data_list_trns_proc): Ditto. (data_list_casereader_read): Removed. (data_list_casereader_destroy): Removed. (data_list_casereader_class): Removed. * get-data.c (cmd_get_data): Support TXT type. (set_type): New function. (parse_get_txt): New function. 2007-12-04 Ben Pfaff * placement-parser.c (parse_column): New function. (parse_column_range): Add `base' argument. Update all callers. 2007-12-04 Ben Pfaff Make GET DATA a separate command, instead of something invoked indirectly from GET. * automake.mk: Remove get-data.h from sources. * get-data.h: Removed. * get-data.c (parse_get_data_command): Rename cmd_get_data. * get.c (parse_read_command): No longer any need to check for DATA keyword. 2007-12-04 Ben Pfaff * src/language/data-io/data-reader.c (struct dfm_reader): New `file_size' member to support dfm_get_percent_read. (dfm_open_reader): Initialize file_size. (dfm_get_percent_read): New function. 2007-11-08 Ben Pfaff Patch #6256: add support for binary, 360 file formats. Reviewed by John Darrington. * data-reader.c (struct dfm_reader): New member `block_left'. (dfm_open_reader): Initialize block_left. For FH_MODE_TEXT, open the file in text mode. (read_error): New function. (partial_record): New function. (try_to_read_fully): New function. (enum descriptor_type): New enum. (read_descriptor_word): New function. (corrupt_size): New function. (read_size): New function. (read_file_record): Implement new modes. (read_record): Now take care of tracking line numbers here. (dfm_reader_get_legacy_encoding): New function. * data-writer.c (dfm_put_record): Implement new modes. (dfm_writer_get_legacy_encoding): New function. * file-handle.q: Parse new formats. (cmd_file_handle): Set up new formats. * print.c (struct print_trns): New member `encoding'. (internal_cmd_print): Set encoding. (print_trns_proc): Recode output data if necessary. (flush_records): Recode leader byte. 2007-11-03 Ben Pfaff Allow output files to overwrite input files (bug #21280). * data-list.c (cmd_data_list): Manage file handle reference counts. * data-reader.c (struct dfm_reader): Add `lock' member. (dfm_close_reader): Simplify, as reference counting is now separate from locking. (dfm_open_reader): Lock file. * data-writer.c (struct dfm_writer): Add fh_lock, replace_file members. (dfm_open_writer): Lock file and prepare for replacement. (dfm_close_writer): Unlock file and replace it. * file-handle.q (cmd_close_file_handle): Use fh_unname. (fh_parse): Don't distinguish existing handles for a given file name from new ones. Manage file handle reference counts. * get.c (parse_read_command): Manage file handle reference counts. (parse_write_command): Ditto. (mtf_close_all_files): Ditto. * inpt-pgm.c (cmd_reread): Manage file handle reference counts. * print-space.c (cmd_print_space): Manage file handle reference counts. * print.c (internal_cmd_print): Manage file handle reference counts. 2007-11-03 John Darrington * get.c: Add GET DATA command variant. * get-data.c get-data.h (new files): Added support for GET DATA /TYPE='gnm' command. 2007-09-23 Ben Pfaff Bug #21111. Reviewed by John Darrington. * data-list.c (data_list_trns_proc): Properly set retval when END subcommand is in use. (cmd_data_list): Don't allow END subcommand to be used with DATA LIST FREE or LIST. 2007-09-12 Ben Pfaff * get.c (get_translate_case): Change input case parameter from const struct ccase * to struct ccase *, to match change in casereader and casewriter translators. Destroy input case, to fix memory leak. 2007-08-12 Ben Pfaff * get.c (parse_read_command): Compact the values in the target dictionary, to save space. 2007-08-12 Ben Pfaff * get.c (struct case_map): Move into new file src/data/case-map.c. (start_case_map): Ditto, and rename case_map_prepare_dict. (finish_case_map): Ditto, and rename case_map_from_dict. (map_case): Ditto, and rename case_map_execute. (destroy_case_map): Ditto, and rename case_map_destroy. (case_map_get_value_cnt): Ditto. 2007-08-12 Ben Pfaff * get.c (case_map_get_value_cnt): New function. 2007-07-25 Ben Pfaff Fix bug #17100. * data-list.c (read_from_data_list_fixed): Handle multi-record DATA LIST correctly. 2007-07-11 Ben Pfaff * get.c (map_case): Create destination case instead of leaving it undefined. Fixes bug #20285. Reviewed by John Darrington. 2007-06-06 Ben Pfaff * get.c: Essentially rewrite MATCH FILES to support FIRST and LAST. 2007-06-06 Ben Pfaff Adapt case sources, sinks, and clients of procedure code to the new infrastructure. * data-list.c: Make DATA LIST into a casereader. * get.c: Change GET, IMPORT, SAVE, EXPORT to use casereaders, casewriters. * inpt-pgm.c: Use caseinit code. Turn INPUT PROGRAM into a casereader. * list.q: Adapt to new procedure code. 2007-05-06 Ben Pfaff Abstract the documents within a dictionary a little better. Thanks to John Darrington for suggestion, initial version, and review. Patch #5917. * get.c (mtf_merge_dictionary): Simplify creating merged document. * sys-file-info.c (display_documents): Use new dict_get_document_line_cnt and dict_get_document_line functions. Thu Feb 1 16:56:02 2007 Ben Pfaff * file-handle.q (fh_parse): Update to new fh_create_file prototype. Sat Dec 16 22:16:18 2006 Ben Pfaff Make it possible to pull cases from the active file with a function call, instead of requiring indirection through a callback function. * automake.mk: Removed matrix-data.c. * matrix-data.c: Removed. * data-list.c (data_list_source_read): Conform with new case_source_class interface. (data_list_source_destroy): Ditto. * get.c (case_reader_source_class): Ditto. (case_reader_source_destroy): Ditto. (parse_output_proc): Take advantage of new procedure interface. (output_proc): Removed. (struct mtf_file): Add "struct ccase *" member to allow use of new procedure interface. (cmd_match_files): Take advantage of new procedure interface. (mtf_processing_finish): Removed. (mtf_read_nonactive_records): Renamed mtf_read_records. Now reads from every file, without any exception for the active file. (mtf_compare_BY_values): Simplify for new interface. (mtf_processing): Simplify for new interface. * inpt-pgm.c (is_valid_state): New function. (input_program_source_read): Conform with new case_source_class interface. (input_program_source_destroy): Ditto. (end_case_trns_proc): Now just needs to return TRNS_END_CASE. Sat Dec 9 18:43:34 2006 Ben Pfaff * list.q (cmd_list): Use new var_create, var_destroy functions. Thu Nov 30 21:51:58 2006 Ben Pfaff * inpt-pgm.c (cmd_reread): Always return error code upon detecting syntax error. Fixes bug #18419. Thanks to John Darrington for reporting this bug. Sun Nov 19 09:17:45 2006 Ben Pfaff * data-list.c (parse_free): Follow documented (but odd) rule that N format is treated as F format for free-field input. * data-reader.c (read_file_record): Drop new-line character from input text lines. This is symmetrical with the recently changed dfm_put_record semantics. Thu Nov 2 20:56:03 2006 Ben Pfaff Implement SKIP keyword on DATA LIST. Fixes bug #17099. * data-list.c: (struct data_list_pgm) Add `skip_records' members. (cmd_data_list) Set skip_records based on user input. (data_list_source_read) Skip records requested by user. Tue Oct 31 20:04:06 2006 Ben Pfaff * placement-parser.c: (PRS_TYPE_T) Now that struct fmt_spec uses an enum fmt_type for its type member, we can't depend on the ability to put negative values into that member as out-of-band values, because enum fmt_type might be an unsigned type. So use values around SCHAR_MAX instead, because we know that SCHAR_MAX will fit into any type, signed or unsigned, and there aren't nearly that many format types. (parse_var_placements) Add for_input parameter to specify whether we're parsing input or output formats. Update all callers. (fixed_parse_columns) Ditto. (fixed_parse_fortran) Ditto. Tue Oct 31 18:21:48 2006 Ben Pfaff * print-space.c (print_space_trns_proc): Let dfm_put_record add the new-line character, to match dfm_put_record change below. Sat Oct 28 11:57:19 2006 Ben Pfaff * data-writer.c (struct dfm_writer): Removed `bounce' member, and all references to it. (dfm_put_record) Change semantics so that it adds formatting itself, such as new-line characters, instead of putting that responsibility on the caller. Also, pad binary records with spaces instead of zeros, for compatibility. * print.c (struct prt_out_spec) New member `sysmis_as_spaces'. (struct print_trns) Remove `omit_new_lines' and all references, since dfm_put_record() is taking care of that. Add `include_prefix'. (internal_cmd_print) Allow an empty set of data to print. Set include_prefix. (parse_specs) Allow an empty set of data to print. (parse_variable_argument) Only add space with PRINT command. Set sysmis_as_spaces. (print_trns_proc) Indent records if include_prefix is set, for compatibility. Output SYSMIS as spaces if sysmis_as_spaces is set. Put "1" in first column if PRINT EJECT is used with an external output file. (flush_records) Ditto. Sat Oct 28 16:19:57 WST 2006 John Darrington * data-reader.c: Eliminated references to extern variable getl_buf Sat Aug 5 08:25:07 2006 Ben Pfaff Fix bug #17329 in REREAD parsing, reported by John Darrington. * inpt-pgm.c (cmd_reread): Fix file handle parsing. Mon Jul 31 10:32:31 2006 Ben Pfaff * print.c (parse_specs): Allow a comma between specifications. Sun Jul 16 19:57:10 2006 Ben Pfaff * automake.mk: (src_language_data_io_libdata_io_a_SOURCE) Add print-space.c, placement-parser.c, placement-parser.h. * data-list.c: Basically rewrote the whole thing. Broke out a lot of code into placement-parser.c. Code is much cleaner now. * placement-parser.c: New file. * placement-parser.h: New file. * print.c: Basically rewrote the whole thing. Broke out PRINT SPACE into print-space.c. Code is much cleaner now. * print-space.c: New file. Sat Jul 1 17:39:40 2006 Ben Pfaff Fix bug #11612, "q2c documentation does not agree with code". * list.q: Audit use of q2c "+" prefixes that indicate that a command may appear multiple times. Sat Jul 1 20:44:22 2006 Ben Pfaff Fix bug #15786: System File Creation crashes if directoy is nonexistent. * get.c (parse_write_command): Check that the any_writer open succeeds. Tue Jun 27 22:44:28 2006 Ben Pfaff Fix regression in command name completion reported by John Darrington. Now completion is again state-dependent and occurs only on the first line of a command. * inpt-pgm.c: (cmd_input_program) Reading of first token in command moved into cmd_parse. Fri Jun 9 13:56:00 2006 Ben Pfaff Reform string library. * matrix-data.c (context): Use dynamic string. (another_token) Deal with changed dfm_get_record() interface. (mget_token) Ditto. (force_eol) Ditto. * data-list.c (struct data_list_pgm) Delete delims, delim_cnt members, replacing them by struct string delims. Update all references to use struct string functions. (cut_field) Change interface to avoid needing "end_blank", by getting the data-reader to remember that state for us. Change internals to use substring. Update both callers. * data-reader.c (read_file_record): Use ds_read_stream(). (dfm_get_record) Change interface to return substring. Updated all callers. (dfm_expand_tabs) Use ds_find_char(). Now maintain position relative to end-of-line. Use ds_swap(). (dfm_reread_record) Don't limit position by line length. (dfm_column_start) Make parameter const. (dfm_columns_past_end) New function. (dfm_get_column) New function. Thu May 25 18:26:26 WST 2006 John Darrington * print.c (print_trns_free): Made the code agree with the comment, by not freeing PRT. Has the side effect that the command no longer crashes on invalid syntax. Tue May 9 20:55:46 2006 Ben Pfaff * get.c (cmd_match_files): Fix memory leak replacing default_dict. Sat May 6 22:25:09 2006 Ben Pfaff Fix segfault. * list.q (write_fallback_headers): (write_fallback_headers) Properly record width of leader and pass it to write_varname(). Sat May 6 19:03:13 2006 Ben Pfaff * get.c: (mtf_merge_dictionary) Fix value label memory leak. Sat May 6 13:51:16 2006 Ben Pfaff Use a casefile, instead of a case sink, for MATCH FILES output. It's more straightforward, although it has the same effect. * get.c: (struct mtf_proc) Replace `sink' case sink member by `output' casefile member. (cmd_match_files) Work with casefile instead of sink. (mtf_processing) Add case to casefile instead of sink. Sat May 6 10:43:07 2006 Ben Pfaff Continue reforming procedure execution. In this phase, get rid of the output code for SPLIT FILE groups in procedure.c, which really shouldn't be doing any output. Move it into the individual procedures instead. This also adds some flexibility. * list.q (write_all_headers): Call output_split_file_values(). Wed May 3 23:00:17 2006 Ben Pfaff Continue reforming procedure execution. In this phase, get rid of many global variables, consolidating procedure execution in procedure.c. Encapsulate transformations in new "struct trns_chain". Also, change implementation of N OF CASES, FILTER, and PROCESS IF from special cases to transformations. * data-list.c: (data_list_trns_proc) Return TRNS_END_FILE at end of file. (Why didn't we do this before?) (cmd_match_files) Direct procedure output to null sink. Use discard_variables() instead of indirect version. * inpt-pgm.c: Use transformation chain. (struct input_program_pgm) Add trns_chain member. (cmd_input_program) Initialize trns_chain member and capture transformations with proc_capture_transformations(). (input_program_source_read) Use trns_chain_execute(). (destroy_input_program) Destroy input chain. Tue May 2 10:39:56 WST 2006 John Darrington * list.q Changed from using fixed length char buffers to struct string so that any length variables can be used. Mon May 1 18:21:19 2006 Ben Pfaff Further clean up the CMD_* command result codes. * (enum cmd_result_extensions) New. Add CMD_END_INPUT_PROGRAM and CMD_END_CASE result codes. (struct input_program_pgm) Added case_nr, write_case, wc_data members for use by END CASE transformation. (emit_END_CASE) New function. (cmd_input_program) Interpret CMD_END_CASE by outputting an END CASE transformation. If none is output by the input program itself, add one automatically at the end. Change lack of variables from warning to error. (cmd_end_input_program) Return CMD_END_INPUT_PROGRAM instead of CMD_END_SUBLOOP. (input_program_source_read) No longer any need to special-case END CASE. Handle TRNS_DROP_CASE properly. Initialize new members in inp for use by END CASE transformation. (destroy_input_program) New function. (input_program_source_destroy) Just call destroy_input_program(). (cmd_end_case) Just return CMD_END_CASE. (end_case_trns_proc) No longer a stub handled by input_program_source_read(). Actually output the case and increment the case number. Mon May 1 16:06:30 2006 Ben Pfaff Remove vestiges of REPEATING DATA support. * data-list.c: (struct rpd_num_or_var) Removed. (struct repeating_data_trns) Removed. (cmd_repeating_data) Removed. (find_variable_input_spec) Removed. (parse_num_or_var) Removed. (parse_repeating_data) Removed. (realize_value) Removed. (struct rpd_parse_info) Removed. (rpd_parse_record) Removed. (repeating_data_trns_proc) Removed. (repeating_data_trns_free) Removed. (repeating_data_set_write_case) Removed. (rpd_msg) Removed. * inpt-pgm.c: (input_program_source_read) Don't deal with REPEATING DATA. * data-list.h: Removed. * automake.mk (src_language_data_io_libdata_io_a_SOURCES): Removed data-list.h. Mon May 1 15:58:28 2006 Ben Pfaff Remove vestiges of FILE TYPE support. * data-list.c: (cmd_data_list) Don't check for FILE TYPE. (cmd_repeating_data) Ditto. * automake.mk (src_language_data_io_libdata_io_a_SOURCES): Remove file-type.c, file-type.h. * file-type.c: Removed. * file-type.h: Removed. Wed Apr 26 13:16:28 2006 Ben Pfaff Improve the way we handle the various parsing "states". Until now we've hard-coded the state transitions in the command definition file, but that's error-prone and, worse, it's redundant--we can figure out what state we're in anyhow. We can cleanly handle INPUT PROGRAM and FILE TYPE with a nested command-processing loop. * data-list.c: (cmd_data_list) Use in_file_type() or in_input_program() in place of case_source_is_class() or case_source_is_complex(). * file-type.c: NB: Not really fixed except minimally to compile, because it doesn't work anyway. (in_file_type) New function. (cmd_record_type) No need to check that we're in FILE TYPE. (cmd_end_file_type) Ditto. (var file_type_source_class) Make static. * get.c: (cmd_match_files) Check vfm_source instead of pgm_state. * inpt-pgm.c: (in_input_program) New function. (cmd_input_program) Rewrite to include nested command processing loop. (cmd_end_input_program) Just return CMD_END_SUBLOOP. (var input_program_source_class) Make static. (cmd_end_case) No need to check that we're in INPUT PROGRAM. (cmd_end_file) Ditto. * automake.mk (src_language_data_io_libdata_io_a_SOURCES): Add file-type.h, inpt-pgm.h. * file-type.h: New file. * inpt-pgm.h: New file. Tue Apr 25 13:11:55 2006 Ben Pfaff * print.c: Don't special-case MS-DOS line terminators. (macro LINE_END_WIDTH) Removed. (alloc_line) Line ends are 1 byte. (print_trns_proc) Just output \n for line end. Sun Apr 23 22:05:58 2006 Ben Pfaff Continue reforming error message support. In this phase, get rid of message "titles" and put the message text in `struct error'. Now `struct error' encapsulates a message more properly. * data-list.c: (macro RPD_ERR) Removed. (rpd_msg) New function. Updated all references to tmsg() to call this function instead. Sat Apr 15 19:38:13 2006 Ben Pfaff Remove last users of struct file_ext so we can get rid of it entirely. * data-reader.c: (struct dfm_reader) Change file member from struct file_ext to FILE *. Updated all references. (dfm_close_reader) Close file with fn_close() instead of fn_close_ext(). Also, make a copy of the file name from the file handle before closing it, because we can't extract it after we close the file. (dfm_open_reader) Open file with fn_open() instead of fn_open_ext(). * data-writer.c: (struct dfm_writer) Change file member struct file_ext to FILE *. Updated all references. (dfm_close_writer) Close file with fn_close() instead of fn_close_ext(). Also, make a copy of the file name from the file handle before closing it, because we can't extract it after we close the file. (dfm_open_writer) Open file with fn_open() instead of fn_open_ext(). Sat Apr 15 18:00:32 2006 Ben Pfaff * data-list.c: Add prototype to suppress warning for cmd_repeating_data(). Thu Mar 2 08:40:33 WST 2006 John Darrington * Moved files from src directory