-*- text -*-
-* The p1-4.c testcase needs significant tuning. Currently it takes
- too long (especially when SHOW_PROGRESS is turned on) and doesn't
- show significant improvement.
+* Reconsider command line arg style--confuses everyone.
-* The semantics of the join system call should change so that it only
- returns the exit code once.
+* Internal tests.
-* mmap/munmap should use segment IDs like Nachos. Too hard otherwise.
+* Add serial input support. Also, modify tests to redirect input from
+ /dev/null, to avoid stray keystrokes getting sent into the VM.
-* Some confusion--do we really get overlapping ro/rw segment in normal
- link? Student example seemed to show that we don't.
+* Make pintos script read the serial output and kill the subprocess if
+ it panics (after waiting a few seconds) or triple-faults. Might
+ want it to be optional, so that interactive users don't get killed.
-* Finish writing the tour.
+* Godmar: Introduce memory leak robustness tests - for both the
+  well-behaved and the misbehaved case - that test that the kernel
+  handles low-memory conditions well.
-* Come up with a way for us to release some of the tests.
+* Godmar: Another area is concurrency. I noticed that I had passed all
+ tests with bochs 2.2.1 (in reproducibility mode). Then I ran them
+ with qemu and hit two deadlocks (one of them in rox-*,
+ incidentally). After fixing those deadlocks, I upgraded to bochs
+ 2.2.5 and hit yet another deadlock in reproducibility mode that
+ didn't show up in 2.2.1. All in all, a standard grading run would
+ have missed 3 deadlocks in my code. I'm not sure how to exploit
+ that for grading - either run with qemu n times (n=2 or 3), or run
+  it with bochs and a set of -j parameters, some of which could be
+  known to the students and some not, depending on preference. (I ported
+ the -j patch to bochs 2.2.5 -
+ http://people.cs.vt.edu/~gback/pintos/bochs-2.2.5.jitter.patch but I
+ have to admit I never tried it so I don't know if it would have
+ uncovered the deadlocks that qemu and the switch to 2.2.5
+ uncovered.)
-* userprog project should mark read-only pages as actually read-only
- in the page table
+* Godmar: There is also the option to require students to develop test
+ workloads themselves, for instance, to demonstrate the effectiveness
+ of a particular algorithm (page eviction & buffer cache replacement
+ come to mind.) This could involve a problem of the form: develop a
+  workload that your algorithm handles well, and a "worst-case" load
+  where your algorithm performs poorly, and show the results of your
+ quantitative evaluation in your report - this could then be part of
+ their test score.
-* Add src/testcases/vm, src/testcases/filesys and make it clear to use
- them?
+* Threads project:
-* Speed up disk routines: filling an 8 MB disk takes a long time.
+ - Godmar:
+
+ >> Describe a potential race in thread_set_priority() and explain how
+ >> your implementation avoids it. Can you use a lock to avoid this race?
+
+ I'm not sure what you're getting at here:
+ If changing the priority of a thread involves accessing the ready
+ list, then of course there's a race with interrupt handlers and locks
+ can't be used to resolve it.
+
+ Changing the priority however also involves a race with respect to
+ accessing a thread's "priority" field - this race is with respect to
+ other threads that attempt to donate priority to the thread that's
+ changing its priority. Since this is a thread-to-thread race, I would
+ tend to believe that locks could be used, although I'm not certain. [
+ I should point out, though, that lock_acquire currently disables
+ interrupts - the purpose of which I had doubted in an earlier email,
+ since sema_down() sufficiently establishes mutual exclusion. Taking
+ priority donation into account, disabling interrupts prevents the race
+ for the priority field, assuming the priority field of each thread is
+ always updated with interrupts disabled. ]
+
+ What answer are you looking for for this design document question?
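+
+  For reference, a minimal sketch of the interrupts-off approach
+  discussed above.  The "base_priority" and "donations" fields are
+  assumptions about one particular donation design, not part of the
+  base code, and this is not necessarily the intended answer:
+  ---
+  void
+  thread_set_priority (int new_priority)
+  {
+    enum intr_level old_level = intr_disable ();
+    struct thread *cur = thread_current ();
+
+    cur->base_priority = new_priority;          /* Assumed field. */
+    if (list_empty (&cur->donations)            /* Assumed field. */
+        || new_priority > cur->priority)
+      cur->priority = new_priority;
+    intr_set_level (old_level);
+
+    /* Yield so that a now-higher-priority ready thread can run. */
+    thread_yield ();
+  }
+  ---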
+
+ - Godmar:
+
+ >> Did any ambiguities in the scheduler specification make values in the
+ >> table uncertain? If so, what rule did you use to resolve them? Does
+ >> this match the behavior of your scheduler?
+
+  My guess is that you're referring to the fact that the scheduler
+ specification does not prescribe any order in which the priorities of
+ all threads are updated, so if multiple threads end up with the same
+ priority, it doesn't say which one to pick. ("round-robin" order
+ would not apply here.)
+
+ Is that correct?
+
+ - Godmar:
+
+ One of my groups implemented priority donation with these data
+ structures in synch.cc:
+ ---
+ struct value
+ {
+ struct list_elem elem; /* List element. */
+ int value; /* Item value. */
+ };
+
+ static struct value values[10];
+ static int start = 10;
+ static int numNest = 0;
+ ---
+ In their implementation, the "elem" field in their "struct value" is
+ not even used.
+
+ The sad part is that they've passed all tests that are currently in
+ the Pintos base with this implementation. (They do fail the additional
+  tests I added, priority-donate-sema & priority-donate-multiple2.)
+
+ Another group managed to pass all tests with this construct:
+ ---
+ struct lock
+ {
+ struct thread *holder; /* Thread holding lock (for debugging). */
+ struct semaphore semaphore; /* Binary semaphore controlling access. */
+ //*************************************
+ int pri_prev;
+ int pri_delta; //Used for Priority Donation
+ /**************************************************/
+ };
+ ---
+ where "pri_delta" keeps track of "priority deltas." They even pass
+ priority-donate-multiple2.
+
+ I think we'll need a test where a larger number of threads & locks
+ simultaneously exercise priority donation to weed out those
+ implementations.
+
+ It may also be a good idea to use non-constant deltas for the low,
+ medium, and high priority threads in the tests - otherwise, adding a
+ "priority delta" might give - by coincidence - the proper priority for
+ a thread.
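+
+   A rough sketch of what such a test could look like, in the style
+   of the existing tests/threads tests (the test name and CHAIN_DEPTH
+   are made up, and there is no .ck file to go with it):
+   ---
+   #include "tests/threads/tests.h"
+   #include "threads/thread.h"
+   #include "threads/synch.h"
+
+   #define CHAIN_DEPTH 8
+
+   static struct lock locks[CHAIN_DEPTH];
+
+   static void
+   donor (void *aux)
+   {
+     int i = (int) aux;
+
+     lock_acquire (&locks[i]);       /* Hold our own lock... */
+     lock_acquire (&locks[i - 1]);   /* ...then block on the previous one. */
+     lock_release (&locks[i - 1]);
+     lock_release (&locks[i]);
+     msg ("donor %d done at priority %d", i, thread_get_priority ());
+   }
+
+   void
+   test_priority_donate_deep (void)
+   {
+     int i;
+
+     for (i = 0; i < CHAIN_DEPTH; i++)
+       lock_init (&locks[i]);
+
+     lock_acquire (&locks[0]);
+     for (i = 1; i < CHAIN_DEPTH; i++)
+       thread_create ("donor", PRI_DEFAULT + i, donor, (void *) i);
+
+     msg ("main expects priority %d, has %d",
+          PRI_DEFAULT + CHAIN_DEPTH - 1, thread_get_priority ());
+     lock_release (&locks[0]);
+     msg ("main back at priority %d", thread_get_priority ());
+   }
+   ---
+   An implementation that records a single constant "priority delta"
+   or ignores nested donation should report the wrong priorities here.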
+
+ - Godmar: Another thing: one group passed all tests even though they
+ wake up all waiters on a lock_release(), rather than just
+ one. Since there's never more than one waiter in our tests, they
+ didn't fail anything. Another possible TODO item - this could be
+    part of a series of "regression tests" that check that they didn't
+ break basic functionality in project 1. I don't think this would
+ be insulting to the students.
+
+* Userprog project:
+
+ - Get rid of rox--causes more trouble than it's worth
+
+ - Extra credit: specifics on how to implement sbrk, malloc.
+
+ - Godmar: We're missing tests that pass arguments to system calls
+ that span multiple pages, where some are mapped and some are not.
+ An implementation that only checks the first page, rather than all
+    pages that can be touched during a call to read()/write(), passes
+ all tests.
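+
+    For the page-directory-based approach, a sketch of the check such
+    a test should force (validate_user_buffer is a made-up helper
+    name; a page-fault-based implementation needs an equivalent
+    loop):
+    ---
+    /* Uses pg_round_down()/PGSIZE/is_user_vaddr() from
+       threads/vaddr.h and pagedir_get_page() from
+       userprog/pagedir.h. */
+    static bool
+    validate_user_buffer (const void *uaddr, size_t size)
+    {
+      uint32_t *pd = thread_current ()->pagedir;
+      const uint8_t *p, *end;
+
+      if (size == 0)
+        return true;
+      end = (const uint8_t *) uaddr + size;
+      for (p = pg_round_down (uaddr); p < end; p += PGSIZE)
+        if (!is_user_vaddr (p) || pagedir_get_page (pd, p) == NULL)
+          return false;
+      return true;
+    }
+    ---
+    A matching test would place a read()/write() buffer so that it
+    starts in a mapped page and ends in an unmapped (or kernel) page.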
+
+ - Godmar: Need some tests that test that illegal accesses lead to
+ process termination. I have written some, will add them. In P2,
+ obviously, this would require that the students break this
+    functionality since the page directory is initialized for them;
+    still, it would be good to have.
+
+ - Godmar: There does not appear to be a test that checks that they
+ close all fd's on exit. Idea: add statistics & self-diagnostics
+ code to palloc.c and malloc.c. Self-diagnostics code could be
+ used for debugging. The statistics code would report how much
+ kernel memory is free. Add a system call
+ "get_kernel_memory_information". User programs could engage in a
+ variety of activities and notice leaks by checking the kernel
+ memory statistics.
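+
+    A sketch of such a user-level test, assuming the hypothetical
+    "get_kernel_memory_information" call above returns the number of
+    free kernel pages (it would also need a stub in
+    lib/user/syscall.h), and assuming a made-up child program
+    "child-open" that opens a few files and exits without closing
+    them:
+    ---
+    #include <syscall.h>
+    #include "tests/lib.h"
+    #include "tests/main.h"
+
+    void
+    test_main (void)
+    {
+      int before, after, i;
+
+      before = get_kernel_memory_information ();   /* Assumed syscall. */
+      for (i = 0; i < 20; i++)
+        wait (exec ("child-open"));                /* Made-up child. */
+      after = get_kernel_memory_information ();
+
+      if (after < before)
+        fail ("kernel leaked %d pages", before - after);
+      msg ("no kernel memory leaked");
+    }
+    ---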
+
+ - Godmar: is there a test that tests that they properly kill a process that
+ attempts to access an invalid address in user code, e.g. *(void**)0 =
+ 42;?
+
+ It seems all of the robustness tests deal with bad pointers passed to
+ system calls (at least judging from test/userprog/Rubric.robustness),
+    but none deals with bad accesses by user code, unless I am
+    missing something.
+
+ ps: I found tests/vm/pt-bad-addr, which is in project 3 only, though.
+
+ For completeness, we should probably check read/write/jump to unmapped
+ user virtual address and to mapped kernel address, for a total of 6
+ cases. I wrote up some tests, see
+ http://people.cs.vt.edu/~gback/pintos/bad-pointers/
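+
+    Each of the six cases is tiny; the "write to unmapped user
+    address" case might look like this, in the style of the existing
+    robustness tests:
+    ---
+    #include "tests/lib.h"
+    #include "tests/main.h"
+
+    void
+    test_main (void)
+    {
+      *(volatile int *) 0 = 42;   /* Must kill the process with exit code -1. */
+      fail ("should have been killed by the page fault");
+    }
+    ---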
+
+ - process_death test needs improvement
+
+ - Godmar: In the wait() tests, there's currently no test that tests
+ that a process can only wait for its own children. There's only
+ one test that tests that wait() on an invalid pid returns -1 (or
+ kills the process), but no test where a valid pid is used that is
+ not a child of the current process.
+
+ The current tests also do not ensure that both scenarios (parent waits
+ first vs. child exits first) are exercised. In this context, I'm
+ wondering if we should add a sleep() system call that would export
+ timer_sleep() to user processes; this would allow the construction of
+ such a test. It would also make it easier to construct a test for the
+ valid-pid, but not-a-child scenario.
+
+ As in Project 4, the baseline implementation of timer_sleep() should
+ suffice, so this would not necessarily require basing Project 2 on
+ Project 1. [ A related thought: IMO it would not be entirely
+ unreasonable to require timer_sleep() and priority scheduling sans
+ donation from Project 1 working for subsequent projects. ]
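+
+    A sketch of the "child exits first" half of such a pair of tests,
+    assuming the sleep() system call suggested above (it would simply
+    forward to timer_sleep() and does not exist in the base code):
+    ---
+    #include <syscall.h>
+    #include "tests/lib.h"
+    #include "tests/main.h"
+
+    void
+    test_main (void)
+    {
+      pid_t child = exec ("child-simple");
+
+      sleep (200);    /* Assumed syscall; give the child time to exit. */
+      CHECK (wait (child) != -1, "wait() on a child that already exited");
+    }
+    ---
+    The sibling test would call wait() immediately, before the child
+    has had a chance to run to completion.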
+
+* VM project:
+
+ - Godmar: Get rid of mmap syscall, add sbrk.
+
+ - Godmar: page-linear, page-shuffle VM tests do not use enough
+ memory to force eviction. Should increase memory consumption.
+
+ - Godmar: fix the page* tests to require swapping
+
+  - Godmar: make sure the file system fails the tests if it is not
+    properly concurrency-protected in project 3.
+
+  - Godmar: Another area in which tests could be created is project
+    3: tests that combine mmap with a paging workload to see that
+    they manage their kernel pages properly while mmapping pages - I
+    don't think the current tests test that, do they?
+
+* Filesys project:
+
+ - Need a better way to measure performance improvement of buffer
+ cache. Some students reported that their system was slower with
+ cache--likely, Bochs doesn't simulate a disk with a realistic
+ speed.
+
+ (Perhaps we should count disk reads and writes, not time.)
+
+ - Do we check that non-empty directories cannot be removed?
+
+ - Need lots more tests.
+
+ - Add FS persistence test(s).
+
+ - Detect implementations that represent the cwd as a string, by
+ removing a directory that is the cwd of another process, then
+ creating a new directory of the same name and putting some files
+ in it, then checking whether the process that had it as cwd sees
+ them.
+
+ - Godmar: I'm not sure if I mentioned that already, but I passed all
+ tests for the filesys project without having implemented inode
+ deallocation. A test is needed that checks that blocks are
+ reclaimed when files are deleted.
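+
+    A sketch of such a test: repeatedly create and delete a file that
+    is large relative to the disk.  If deleted blocks are not
+    reclaimed, the free map runs out and create() eventually fails
+    (the file name and sizes are arbitrary):
+    ---
+    #include <syscall.h>
+    #include "tests/lib.h"
+    #include "tests/main.h"
+
+    void
+    test_main (void)
+    {
+      int i;
+
+      for (i = 0; i < 50; i++)
+        {
+          CHECK (create ("junk", 512 * 1024), "create 512 kB file, pass %d", i);
+          CHECK (remove ("junk"), "remove it again, pass %d", i);
+        }
+    }
+    ---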
+
+ - Godmar: I'm in the middle of project 4, I've started by
+ implementing a buffer cache and plugging it into the existing
+ filesystem. Along the way I was wondering how we could test the
+ cache.
+
+ Maybe one could adopt a similar testing strategy as in project 1
+    for the MLQFS scheduler: add a function "get_cache_accesses()"
+    and a function "get_cache_hits()". Then
+ create a version of pintos that creates access traces for a
+ to-be-determined workload. Run an off-line analysis that would
+ determine how many hits a perfect cache would have (MAX), and how
+    many, say, an LRU strategy would give (MIN). Then add a fudge
+ factor to account for different index strategies and test that the
+ reported number of cache hits/accesses is within (MIN, MAX) +/-
+ fudge factor.
+
+ (As an aside - I am curious why you chose to use a clock-style
+ algorithm rather than the more straightforward LRU for your buffer
+ cache implementation in your sample solution. Is there a reason
+ for that? I was curious to see if it made a difference, so I
+ implemented LRU for your cache implementation and ran the test
+ workload of project 4 and printed cache hits/accesses. I found
+ that for that workload, the clock-based algorithm performs almost
+    identically to LRU (within about 1%, but I ran nondeterministically
+ with QEMU). I then reduced the cache size to 32 blocks and found
+ again the same performance, which raises the suspicion that the
+ test workload might not force any cache replacement, so the
+ eviction strategy doesn't matter.)
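+
+    A sketch of the instrumentation half of that idea.  The cache
+    itself is student code, so "struct cache_block", cache_lookup()
+    and cache_load() below are made-up names for whatever lookup and
+    eviction routines it has:
+    ---
+    static long long cache_accesses, cache_hits;
+
+    struct cache_block *
+    cache_get_block (disk_sector_t sector)
+    {
+      struct cache_block *b = cache_lookup (sector);   /* Assumed helper. */
+
+      cache_accesses++;
+      if (b != NULL)
+        {
+          cache_hits++;
+          return b;
+        }
+      return cache_load (sector);   /* Assumed helper: evict, then read from disk. */
+    }
+
+    long long get_cache_accesses (void) { return cache_accesses; }
+    long long get_cache_hits (void) { return cache_hits; }
+    ---
+    The two counters could then be exposed to tests either through
+    new system calls or by printing them at shutdown.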
+
+ - Godmar: I haven't analyzed the tests for project 4 yet, but I'm
+ wondering if the fairness requirements your specification has for
+ readers/writers are covered in the tests or not.
+
+
+* Documentation:
+
+ - Add "Digging Deeper" sections that describe the nitty-gritty x86
+ details for the benefit of those interested.
+
+ - Add explanations of what "real" OSes do to give students some
+ perspective.
+
+* To add partition support:
+
+ - Find four partition types that are more or less unused and choose
+ to use them for Pintos. (This is implemented.)
+
+ - Bootloader reads partition tables of all BIOS devices to find the
+ first that has the "Pintos kernel" partition type. (This is
+ implemented.) Ideally the bootloader would make sure there is
+ exactly one such partition, but I didn't implement that yet.
+
+ - Bootloader reads kernel into memory at 1 MB using BIOS calls.
+ (This is implemented.)
+
+ - Kernel arguments have to go into a separate sector because the
+ bootloader is otherwise too big to fit now? (I don't recall if I
+ did anything about this.)
+
+  - Kernel at boot also scans the partition tables of all the disks
+    it can find, looking for the ones with the four Pintos partition
+    types (perhaps not all exist). After that, it makes them
+    available to the rest of the kernel (and doesn't allow access to
+    other devices, for safety). A sketch of such a scan appears at
+    the end of this list.
+
+ - "pintos" and "pintos-mkdisk" need to write a partition table to
+ the disks that they create. "pintos-mkdisk" will need to take a
+ new parameter specifying the type. (I might have partially
+ implemented this, don't remember.)
+
+ - "pintos" should insist on finding a partition header on disks
+ handed to it, for safety.
+
+ - Need some way for "pintos" to assemble multiple disks or
+ partitions into a single image that can be copied directly to a
+ USB block device. (I don't know whether I came up with a good
+ solution yet or not, or whether I implemented any of it.)
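+
+  - A sketch of the kernel-side scan described above.  The struct
+    follows the standard MBR layout; the Pintos partition type values
+    (0x20...0x23 here) and register_pintos_partition() are guesses,
+    not necessarily what was actually chosen or implemented:
+    ---
+    /* Uses disk_read() and DISK_SECTOR_SIZE from devices/disk.h. */
+    struct partition_entry
+      {
+        uint8_t status;            /* 0x80 = bootable. */
+        uint8_t chs_start[3];
+        uint8_t type;              /* Partition type code. */
+        uint8_t chs_end[3];
+        uint32_t lba_start;        /* First sector, as an LBA. */
+        uint32_t sector_cnt;       /* Number of sectors. */
+      }
+    __attribute__ ((packed));
+
+    static void
+    scan_partitions (struct disk *d)
+    {
+      uint8_t mbr[DISK_SECTOR_SIZE];
+      int i;
+
+      disk_read (d, 0, mbr);
+      if (mbr[510] != 0x55 || mbr[511] != 0xaa)
+        return;                    /* No partition table. */
+      for (i = 0; i < 4; i++)
+        {
+          struct partition_entry *e
+            = (struct partition_entry *) &mbr[446 + i * 16];
+          if (e->type >= 0x20 && e->type <= 0x23)
+            register_pintos_partition (e->type, e->lba_start,
+                                       e->sector_cnt);   /* Assumed hook. */
+        }
+    }
+    ---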
+
+* To add USB support:
+
+  - Needs to be able to scan the PCI bus for a UHCI controller. (I
+    implemented this partially.) A sketch of such a scan appears at
+    the end of this list.
+
+ - May want to be able to initialize USB controllers over CardBus
+ bridges. I don't know whether this requires additional work or
+ if it's useful enough to warrant extra work. (It's of special
+ interest for me because I have a laptop that only has USB via
+ CardBus.)
+
+ - There are many protocol layers involved: SCSI over USB-Mass
+ Storage over USB over UHCI over PCI. (I may be forgetting one.)
+ I don't know yet whether it's best to separate the layers or to
+ merge (some of) them. I think that a simple and clean
+ organization should be a priority.
+
+ - VMware can likely be used for testing because it can expose host
+ USB devices as guest USB devices. This is safer and more
+ convenient than using real hardware for testing.
+
+ - Should test with a variety of USB keychain devices because there
+ seems to be wide variation among them, especially in the SCSI
+ protocols they support. Should try to use a "lowest-common
+ denominator" SCSI protocol if any such thing really exists.
+
+ - Might want to add a feature whereby kernel arguments can be
+ given interactively, rather than passed on-disk. Needs some
+    thought.
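+
+  - A sketch of the PCI scan mentioned above, using the legacy
+    0xCF8/0xCFC configuration mechanism and inl()/outl() from
+    threads/io.h.  A UHCI controller has class 0x0C, subclass 0x03,
+    interface 0x00:
+    ---
+    #define PCI_CONFIG_ADDR 0xcf8
+    #define PCI_CONFIG_DATA 0xcfc
+
+    static uint32_t
+    pci_read_cfg (int bus, int dev, int func, int reg)
+    {
+      outl (PCI_CONFIG_ADDR, 0x80000000u | (bus << 16) | (dev << 11)
+                             | (func << 8) | (reg & 0xfc));
+      return inl (PCI_CONFIG_DATA);
+    }
+
+    static bool
+    find_uhci (int *bus_out, int *dev_out, int *func_out)
+    {
+      int bus, dev, func;
+
+      for (bus = 0; bus < 256; bus++)
+        for (dev = 0; dev < 32; dev++)
+          for (func = 0; func < 8; func++)
+            {
+              if ((pci_read_cfg (bus, dev, func, 0) & 0xffff) == 0xffff)
+                continue;          /* No device present. */
+              if ((pci_read_cfg (bus, dev, func, 8) >> 8) == 0x0c0300)
+                {
+                  *bus_out = bus;
+                  *dev_out = dev;
+                  *func_out = func;
+                  return true;     /* Serial bus / USB / UHCI. */
+                }
+            }
+      return false;
+    }
+    ---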