pintos-os.org Git - pintos-anon/blob - TODO

   1 -*- text -*-
   2
   3 * Reconsider command line arg style--confuses everyone.
   4
   5 * Internal tests.
   6
   7 * Add serial input support.  Also, modify tests to redirect input from
   8   /dev/null, to avoid stray keystrokes getting sent into the VM.
   9
  10 * Make pintos script read the serial output and kill the subprocess if
  11   it panics (after waiting a few seconds) or triple-faults.  Might
  12   want it to be optional, so that interactive users don't get killed.
  13
  14 * Godmar: Introduce memory leak robustness tests - both for the
  15   well-behaved as well as the mis-behaved case - that tests that the
  16   kernel handles low-mem conditions well.
  17
  18 * Godmar: Another area is concurrency. I noticed that I had passed all
  19   tests with bochs 2.2.1 (in reproducibility mode). Then I ran them
  20   with qemu and hit two deadlocks (one of them in rox-*,
  21   incidentally). After fixing those deadlocks, I upgraded to bochs
  22   2.2.5 and hit yet another deadlock in reproducibility mode that
  23   didn't show up in 2.2.1. All in all, a standard grading run would
  24   have missed 3 deadlocks in my code.  I'm not sure how to exploit
  25   that for grading - either run with qemu n times (n=2 or 3), or run
  26   it with bochs and a set of -j parameters.  Some of which could be
  27   known to the students, some not, depending on preference.  (I ported
  28   the -j patch to bochs 2.2.5 -
  29   http://people.cs.vt.edu/~gback/pintos/bochs-2.2.5.jitter.patch but I
  30   have to admit I never tried it so I don't know if it would have
  31   uncovered the deadlocks that qemu and the switch to 2.2.5
  32   uncovered.)
  33
  34 * Godmar: There is also the option to require students to develop test
  35   workloads themselves, for instance, to demonstrate the effectiveness
  36   of a particular algorithm (page eviction & buffer cache replacement
  37   come to mind.) This could involve a problem of the form: develop a
  38   workload that you cover well, and develop a "worst-case" load where
  39   you algorithm performs poorly, and show the results of your
  40   quantitative evaluation in your report - this could then be part of
  41   their test score.
  42
  43 * Threads project:
  44
  45   - Godmar:
  46
  47     >> Describe a potential race in thread_set_priority() and explain how
  48     >> your implementation avoids it.  Can you use a lock to avoid this race?
  49
  50     I'm not sure what you're getting at here:
  51     If changing the priority of a thread involves accessing the ready
  52     list, then of course there's a race with interrupt handlers and locks
  53     can't be used to resolve it.
  54
  55     Changing the priority however also involves a race with respect to
  56     accessing a thread's "priority" field - this race is with respect to
  57     other threads that attempt to donate priority to the thread that's
  58     changing its priority. Since this is a thread-to-thread race, I would
  59     tend to believe that locks could be used, although I'm not certain.  [
  60     I should point out, though, that lock_acquire currently disables
  61     interrupts - the purpose of which I had doubted in an earlier email,
  62     since sema_down() sufficiently establishes mutual exclusion. Taking
  63     priority donation into account, disabling interrupts prevents the race
  64     for the priority field, assuming the priority field of each thread is
  65     always updated with interrupts disabled. ]
  66
  67     What answer are you looking for for this design document question?
  68
  69   - Godmar:
  70
  71     >> Did any ambiguities in the scheduler specification make values in the
  72     >> table uncertain?  If so, what rule did you use to resolve them?  Does
  73     >> this match the behavior of your scheduler?
  74
  75     My guess is that you're referring to the fact the scheduler
  76     specification does not prescribe any order in which the priorities of
  77     all threads are updated, so if multiple threads end up with the same
  78     priority, it doesn't say which one to pick.  ("round-robin" order
  79     would not apply here.)
  80
  81     Is that correct?
  82
  83   - Godmar:
  84
  85     One of my groups implemented priority donation with these data
  86     structures in synch.cc:
  87     ---
  88     struct value
  89     {
  90       struct list_elem elem;      /* List element. */
  91       int value;                  /* Item value. */
  92     };
  93
  94     static struct value values[10];
  95     static int start = 10;
  96     static int numNest = 0;
  97     ---
  98     In their implementation, the "elem" field in their "struct value" is
  99     not even used.
 100
 101     The sad part is that they've passed all tests that are currently in
 102     the Pintos base with this implementation. (They do fail the additional
 103     tests I added priority-donate-sema & priority-donate-multiple2.)
 104
 105     Another group managed to pass all tests with this construct:
 106     ---
 107     struct lock
 108       {
 109         struct thread *holder;      /* Thread holding lock (for debugging). */
 110         struct semaphore semaphore; /* Binary semaphore controlling access. */
 111         //*************************************
 112         int pri_prev;
 113         int pri_delta;              //Used for Priority Donation
 114         /**************************************************/
 115       };
 116     ---
 117     where "pri_delta" keeps track of "priority deltas." They even pass
 118     priority-donate-multiple2.
 119
 120     I think we'll need a test where a larger number of threads & locks
 121     simultaneously exercise priority donation to weed out those
 122     implementations.
 123
 124     It may also be a good idea to use non-constant deltas for the low,
 125     medium, and high priority threads in the tests - otherwise, adding a
 126     "priority delta" might give - by coincidence - the proper priority for
 127     a thread.
 128
 129   - Godmar: Another thing: one group passed all tests even though they
 130     wake up all waiters on a lock_release(), rather than just
 131     one. Since there's never more than one waiter in our tests, they
 132     didn't fail anything. Another possible TODO item - this could be
 133     part a series of "regression tests" that check that they didn't
 134     break basic functionality in project 1. I don't think this would
 135     be insulting to the students.
 136
 137 * Userprog project:
 138
 139   - Get rid of rox--causes more trouble than it's worth
 140
 141   - Extra credit: specifics on how to implement sbrk, malloc.
 142
 143   - Godmar: We're missing tests that pass arguments to system calls
 144     that span multiple pages, where some are mapped and some are not.
 145     An implementation that only checks the first page, rather than all
 146     pages that can be touched during a call to read()/write() passes
 147     all tests.
 148
 149   - Godmar: Need some tests that test that illegal accesses lead to
 150     process termination. I have written some, will add them. In P2,
 151     obviously, this would require that the students break this
 152     functionality since the page directory is initialized for them,
 153     still it would be good to have.
 154
 155   - Godmar: There does not appear to be a test that checks that they
 156     close all fd's on exit.  Idea: add statistics & self-diagnostics
 157     code to palloc.c and malloc.c.  Self-diagnostics code could be
 158     used for debugging.  The statistics code would report how much
 159     kernel memory is free.  Add a system call
 160     "get_kernel_memory_information".  User programs could engage in a
 161     variety of activities and notice leaks by checking the kernel
 162     memory statistics.
 163
 164   - Godmar: is there a test that tests that they properly kill a process that
 165     attempts to access an invalid address in user code, e.g. *(void**)0 =
 166     42;?
 167
 168     It seems all of the robustness tests deal with bad pointers passed to
 169     system calls (at least judging from test/userprog/Rubric.robustness),
 170     but none deals with bad accesses by user code, or I am missing
 171     something.
 172
 173     ps: I found tests/vm/pt-bad-addr, which is in project 3 only, though.
 174
 175     For completeness, we should probably check read/write/jump to unmapped
 176     user virtual address and to mapped kernel address, for a total of 6
 177     cases. I wrote up some tests, see
 178     http://people.cs.vt.edu/~gback/pintos/bad-pointers/
 179
 180   - process_death test needs improvement
 181
 182   - Godmar: In the wait() tests, there's currently no test that tests
 183     that a process can only wait for its own children. There's only
 184     one test that tests that wait() on an invalid pid returns -1 (or
 185     kills the process), but no test where a valid pid is used that is
 186     not a child of the current process.
 187
 188     The current tests also do not ensure that both scenarios (parent waits
 189     first vs. child exits first) are exercised. In this context, I'm
 190     wondering if we should add a sleep() system call that would export
 191     timer_sleep() to user processes; this would allow the construction of
 192     such a test. It would also make it easier to construct a test for the
 193     valid-pid, but not-a-child scenario.
 194
 195     As in Project 4, the baseline implementation of timer_sleep() should
 196     suffice, so this would not necessarily require basing Project 2 on
 197     Project 1. [ A related thought: IMO it would not be entirely
 198     unreasonable to require timer_sleep() and priority scheduling sans
 199     donation from Project 1 working for subsequent projects. ]
 200
 201 * VM project:
 202
 203   - Godmar: Get rid of mmap syscall, add sbrk.
 204
 205   - Godmar: page-linear, page-shuffle VM tests do not use enough
 206     memory to force eviction.  Should increase memory consumption.
 207
 208   - Godmar: fix the page* tests to require swapping
 209
 210   - Godmar: make sure the filesystem fails if not properly
 211     concurrency-protected in project 3.
 212
 213   - Godmar: Another area in which tests could be created are for
 214     project 3: tests that combine mmap with a paging workload to see
 215     their kernel pages properly while mmapping pages - I don't think
 216     the current tests test that, do they?
 217
 218 * Filesys project:
 219
 220   - Need a better way to measure performance improvement of buffer
 221     cache.  Some students reported that their system was slower with
 222     cache--likely, Bochs doesn't simulate a disk with a realistic
 223     speed.
 224
 225     (Perhaps we should count disk reads and writes, not time.)
 226
 227   - Do we check that non-empty directories cannot be removed?
 228
 229   - Need lots more tests.
 230
 231   - Add FS persistence test(s).
 232
 233   - Godmar: I'm not sure if I mentioned that already, but I passed all
 234     tests for the filesys project without having implemented inode
 235     deallocation. A test is needed that checks that blocks are
 236     reclaimed when files are deleted.
 237
 238   - Godmar: I'm in the middle of project 4, I've started by
 239     implementing a buffer cache and plugging it into the existing
 240     filesystem.  Along the way I was wondering how we could test the
 241     cache.
 242
 243     Maybe one could adopt a similar testing strategy as in project 1
 244     for the MLQFS scheduler: add a function that reads
 245     "get_cache_accesses()" and a function "get_cache_hits()".  Then
 246     create a version of pintos that creates access traces for a
 247     to-be-determined workload.  Run an off-line analysis that would
 248     determine how many hits a perfect cache would have (MAX), and how
 249     much say an LRU strategy would give (MIN).  Then add a fudge
 250     factor to account for different index strategies and test that the
 251     reported number of cache hits/accesses is within (MIN, MAX) +/-
 252     fudge factor.
 253
 254     (As an aside - I am curious why you chose to use a clock-style
 255     algorithm rather than the more straightforward LRU for your buffer
 256     cache implementation in your sample solution. Is there a reason
 257     for that?  I was curious to see if it made a difference, so I
 258     implemented LRU for your cache implementation and ran the test
 259     workload of project 4 and printed cache hits/accesses.  I found
 260     that for that workload, the clock-based algorithm performs almost
 261     identical to LRU (within about 1%, but I ran nondeterministally
 262     with QEMU). I then reduced the cache size to 32 blocks and found
 263     again the same performance, which raises the suspicion that the
 264     test workload might not force any cache replacement, so the
 265     eviction strategy doesn't matter.)
 266
 267   - Godmar: I haven't analyzed the tests for project 4 yet, but I'm
 268     wondering if the fairness requirements your specification has for
 269     readers/writers are covered in the tests or not.
 270
 271
 272 * Documentation:
 273
 274   - Add "Digging Deeper" sections that describe the nitty-gritty x86
 275     details for the benefit of those interested.
 276
 277   - Add explanations of what "real" OSes do to give students some
 278     perspective.
 279
 280 * To add partition support:
 281
 282   - Find four partition types that are more or less unused and choose
 283     to use them for Pintos.  (This is implemented.)
 284
 285   - Bootloader reads partition tables of all BIOS devices to find the
 286     first that has the "Pintos kernel" partition type.  (This is
 287     implemented.)  Ideally the bootloader would make sure there is
 288     exactly one such partition, but I didn't implement that yet.
 289
 290   - Bootloader reads kernel into memory at 1 MB using BIOS calls.
 291     (This is implemented.)
 292
 293   - Kernel arguments have to go into a separate sector because the
 294     bootloader is otherwise too big to fit now?  (I don't recall if I
 295     did anything about this.)
 296
 297   - Kernel at boot also scans partition tables of all the disks it can
 298     find to find the ones with the four Pintos partition types
 299     (perhaps not all exist).  After that, it makes them available to
 300     the rest of the kernel (and doesn't allow access to other devices,
 301     for safety).
 302
 303   - "pintos" and "pintos-mkdisk" need to write a partition table to
 304     the disks that they create.  "pintos-mkdisk" will need to take a
 305     new parameter specifying the type.  (I might have partially
 306     implemented this, don't remember.)
 307
 308   - "pintos" should insist on finding a partition header on disks
 309     handed to it, for safety.
 310
 311   - Need some way for "pintos" to assemble multiple disks or
 312     partitions into a single image that can be copied directly to a
 313     USB block device.  (I don't know whether I came up with a good
 314     solution yet or not, or whether I implemented any of it.)
 315
 316 * To add USB support:
 317
 318     - Needs to be able to scan PCI bus for UHCI controller.  (I
 319       implemented this partially.)
 320
 321     - May want to be able to initialize USB controllers over CardBus
 322       bridges.  I don't know whether this requires additional work or
 323       if it's useful enough to warrant extra work.  (It's of special
 324       interest for me because I have a laptop that only has USB via
 325       CardBus.)
 326
 327     - There are many protocol layers involved: SCSI over USB-Mass
 328       Storage over USB over UHCI over PCI.  (I may be forgetting one.)
 329       I don't know yet whether it's best to separate the layers or to
 330       merge (some of) them.  I think that a simple and clean
 331       organization should be a priority.
 332
 333     - VMware can likely be used for testing because it can expose host
 334       USB devices as guest USB devices.  This is safer and more
 335       convenient than using real hardware for testing.
 336
 337     - Should test with a variety of USB keychain devices because there
 338       seems to be wide variation among them, especially in the SCSI
 339       protocols they support.  Should try to use a "lowest-common
 340       denominator" SCSI protocol if any such thing really exists.
 341
 342     - Might want to add a feature whereby kernel arguments can be
 343       given interactively, rather than passed on-disk.  Needs some
 344       though.