X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?p=pspp-builds.git;a=blobdiff_plain;f=TODO;h=4889f5d0cbfa02952501e880e9a7f5baae6048c9;hp=8a798cf2aa38be74f995ad4044ab9c8e9cc424c7;hb=HEAD;hpb=897a260ef7a8b954d56698cc40241a3197127505

diff --git a/TODO b/TODO
index 8a798cf2..4889f5d0 100644
--- a/TODO
+++ b/TODO
@@ -1,107 +1,25 @@
-Time-stamp: <2005-08-02 10:24:25 blp>
+Time-stamp: <2006-12-17 18:45:35 blp>
 
 Get rid of need for GNU diff in `make check'.
 
-Get rid of need for file name canonicalization.
-
-Use getsubopt()?
-
-Format specifier and missing values code needs to be rewritten for lowered
-crappiness.
-
 CROSSTABS needs to be re-examined.
 
-RANK, which is needed for the Wilcoxon signed-rank statistic, Mann-Whitney U,
-Kruskal-Wallis on NPAR TESTS and for Spearman and the Johnkheere trend test (in
-other procedures).
-
-lex_token_representation() should take a buffer to fill.
-
-Make valgrind --leak-check=yes --show-reachable=yes work.
-
-Add NOT_REACHED() macro.
-
-Add compression to casefiles.
-
-There needs to be another layer onto the lexer, which should probably be
-entirely rewritten anyway.  The lexer needs to read entire *commands* at a
-time, not just a *line* at a time.  It also needs to support arbitrary putback,
-probably by just backing up the "current position" in the command buffer.
-
 Scratch variables should not be available for use following TEMPORARY.
 
-Details of N OF CASES, SAMPLE, FILTER, PROCESS IF, TEMPORARY, etc., need to be
-checked against the documentation.  See notes on these at end of file for a
-start.
-
 Check our results against the NIST StRD benchmark results at
 strd.itl.nist.gov/div898/strd
 
-In debug mode hash table code should verify that collisions are reasonably low.
-
-Use AFM files instead of Groff font files, and include AFMs for our default
-fonts with the distribution.
-
 Storage of value labels on disk is inefficient.  Invent new data structure.
 
-Add an output flag which would cause a page break if a table segment could fit
-vertically on a page but it just happens to be positioned such that it won't.
-
 Fix spanned joint cells, i.e., EDLEVEL on crosstabs.stat.
 
-Cell footnotes.
-
-PostScript driver should emit thin lines, then thick lines, to optimize time
-and space.
-
-New functions?  var_name_or_label(), tab_value_or_label()
-
-Should be able to bottom-justify cells.  It'll be expensive, though, by
-requiring an extra metrics call.
-
-Perhaps instead of the current lines we should define the following line types:
-null, thin, thick, double.  It might look pretty classy.
-
-Perhaps thick table borders that are cut off by a page break should decay to
-thin borders.  (i.e., on a thick bordered table that's longer than one page,
-but narrow, the bottom border would be thin on the first page, and the top and
-bottom borders on middle pages.)
-
-Support multi-line titles on tables. (For the first page only, presumably.)
-
-Rewrite the convert_F() function in data-out.c to be nicer code.
-
-In addition to searching the source directory, we should search the current
-directory (for data files).  (Yuck!)
-
-Fix line-too-long problems in PostScript code, instead of covering them up.
-setlinecap is *not* a proper solution.
-
-Fix som_columns().
-
-Has glob.c been pared down enough?
-
-Improve interactivity of output by allowing a `commit' function for a page.
-This will also allow for infinite-length pages.
-
-Implement thin single lines, should be pretty easy now.
-
 SELECT IF should be moved before other transformations whenever possible.  It
 should only be impossible when one of the variables referred to in SELECT IF is
 created or modified by a previous transformation.
 
-The manual: add text, add index entries, add examples.
-
-The inline file should be improved: There should be *real* detection of whether
-it is used (in dfm.c:cmd_begin_data), not after-the-fact detection.
-
 Figure out a stylesheet for messages displayed by PSPP: i.e., what quotation
 marks around filenames, etc.
 
-New SET subcommand: OUTPUT.  i.e., SET OUTPUT="filename" to send output to that
-file; SET OUTPUT="filename"(APPEND) to append to that file; SET OUTPUT=DEFAULT
-to reset everything.  There might be a better approach, though--think about it.
-
 From Zvi Grauer <z.grauer@csuohio.edu> and <zvi@mail.ohio.net>:
 
    1. design of experiments software, specifically Factorial, response surface
@@ -120,47 +38,6 @@ From Zvi Grauer <z.grauer@csuohio.edu> and <zvi@mail.ohio.net>:
 
    6. Categorical data analsys ?
 
-IDEAS
------
-
-In addition to an "infinite journal", we should keep a number of
-individual-session journals, pspp.jnl-1 through pspp.jnl-X, renaming and
-deleting as needed.  All of the journals should have date/time comments.
-
-Qualifiers for variables giving type--categorical, ordinal, ...
-
-Analysis Wizard
-
-Consider consequences of xmalloc(), fail(), hcf() in interactive
-use:
-a. Can we safely just use setjmp()/longjmp()?
-b. Will that leak memory?
-i. I don't think so: all procedure-created memory is either
-garbage-collected or globally-accessible.
-ii. But you never know... esp. w/o Checker.
-c. Is this too early to worry? too late?
-
-Need to implement a shared buffer for funny functions that require relatively
-large permanent transient buffers (1024 bytes or so), that is, buffers that are
-permanent in the sense that they probably shouldn't be deallocated but are only
-used from time to time, buffers that can't be allocated on the stack because
-they are of variable and unpredictable but usually relatively small (usually
-line buffers).  There are too many of these lurking around; can save a sizeable
-amount of space at very little overhead and with very little effort by merging
-them.
-
-Clever multiplatform GUI idea (due partly to John Williams): write a GUI in
-Java where each statistical procedure dialog box could be downloaded from the
-server independently.  The statistical procedures would run on (the/a) server
-and results would be reported through HTML tables viewed with the user's choice
-of web browsers.  Help could be implemented through the browser as well.
-
-HOWTOs
-------
-
-MORE NOTES/IDEAS/BUGS
----------------------
-
 Sometimes very wide (or very tall) columns can occur in tables.  What is a good
 way to truncate them?  It doesn't seem to cause problems for the ascii or
 postscript drivers, but it's not good in the general case.  Should they be
@@ -168,12 +45,6 @@ split somehow?  (One way that wide columns can occur is through user request,
 for instance through a wide PRINT request--try time-date.stat with a narrow
 ascii page or with the postscript driver on letter size paper.)
 
-NULs in input files break the products we're replacing: although it will input
-them properly and display them properly as AHEX format, it truncates them in A
-format.  Also, string-manipulation functions such as CONCAT truncate their
-results after the first NUL.  This should simplify the result of PSPP design.
-Perhaps those ugly a_string, b_string, ..., can all be eliminated.
-
 From Moshe Braner <mbraner@nessie.vdh.state.vt.us>: An idea regarding MATCH
 FILES, again getting BEYOND the state of SPSS: it always bothered me that if I
 have a large data file and I want to match it to a small lookup table, via
@@ -187,122 +58,6 @@ whatever) for it.  Then read the /FILE and use the index to match to each case.
 OTOH, if the /TABLE is too large, then do it the old way, complaining if either
 file is not sorted on key.
 
-----------------------------------------------------------------------
-Statistical procedures:
-
-For each case we read from the input program:
-
-1. Execute permanent transformations.  If these drop the case, stop.
-2. N OF CASES.  If we have already written N cases, stop.
-3. Write case to replacement active file.
-4. Execute temporary transformations.  If these drop the case, stop.
-5. Post-TEMPORARY N OF CASES.  If we have already analyzed N cases, stop.
-6. FILTER, PROCESS IF.  If these drop the case, stop.
-7. Pass case to procedure.
-
-Ugly cases:
-
-LAG records cases in step 3.
-
-AGGREGATE: When output goes to an external file, this is just an ordinary
-procedure.  When output goes to the active file, step 3 should be skipped,
-because AGGREGATE creates its own case sink and writes to it in step 7.  Also,
-TEMPORARY has no effect and we just cancel it.  Regardless of direction of
-output, we should not implement AGGREGATE through a transformation because that
-will fail to honor FILTER, PROCESS IF, N OF CASES.
-
-ADD FILES: Essentially an input program.  It silently cancels unclosed LOOPs
-and DO IFs.  If the active file is used for input, then runs EXECUTE (if there
-are any transformations) and then steals vfm_source and encapsulates it.  If
-the active file is not used for input, then it cancels all the transformations
-and deletes the original active file.
-
-CASESTOVARS: ???
-
-FLIP:
-
-MATCH FILES: Similar to AGGREGATE.  This is a procedure.  When the active file
-is used for input, it reads the active file; otherwise, it just cancels all the
-transformations and deletes the original active file.  Step 3 should be
-skipped, because MATCH FILES creates its own case sink and writes to it in step
-7.  TEMPORARY is not allowed.
-
-MODIFY VARS:
-
-REPEATING DATA:
-
-SORT CASES:
-
-UPDATE: same as ADD FILES.
-
-VARSTOCASES: ???
-----------------------------------------------------------------------
-N OF CASES
-
-  * Before TEMPORARY, limits number of cases sent to the sink.
-
-  * After TEMPORARY, limits number of cases sent to the procedure.
-
-  * Without TEMPORARY, those are the same cases, so it limits both.
-
-SAMPLE
-
-  * Sample is just a transformation.  It has no special properties.
-
-FILTER
-
-  * Always selects cases sent to the procedure.
-
-  * No effect on cases sent to sink.
-
-  * Before TEMPORARY, selection is permanent.  After TEMPORARY,
-    selection stops after a procedure.
-
-PROCESS IF
-
-  * Always selects cases sent to the procedure.
-
-  * No effect on cases sent to sink.
-
-  * Always stops after a procedure.
-
-SPLIT FILE
-
-  * Ignored by AGGREGATE.  Used when procedures write matrices.
-
-  * Always applies to the procedure.
-
-  * Before TEMPORARY, splitting is permanent.  After TEMPORARY,
-    splitting stops after a procedure.
-
-TEMPORARY
-
-  * TEMPORARY has no effect on AGGREGATE when output goes to the active file.
-
-  * SORT CASES, ADD FILES, RENAME VARIABLES, CASESTOVARS, VARSTOCASES,
-    COMPUTE with a lag function cannot be used after TEMPORARY.
-
-  * Cannot be used in DO IF...END IF or LOOP...END LOOP.
-
-  * FLIP ignores TEMPORARY.  All transformations become permanent.
-
-  * MATCH FILES and UPDATE cannot be used after TEMPORARY if active
-    file is an input source.
-
-  * RENAME VARIABLES is invalid after TEMPORARY.
-
-  * WEIGHT, SPLIT FILE, N OF CASES, FILTER, PROCESS IF apply only to
-    the next procedure when used after TEMPORARY.
-
-WEIGHT
-
-  * Always applies to the procedure.
-
-  * Before TEMPORARY, weighting is permanent.  After TEMPORARY,
-    weighting stops after a procedure.
-
-
--------------------------------------------------------------------------------
 Local Variables:
 mode: text
 fill-column: 79