-*- text -*-
-* The tests in tests/ don't apply the grading patches.
+* Bochs is not fully reproducible.
-* We need better and more example programs.
+Godmar says:
- - Need an mmap example program as a replacement for the crappy mmap FAQ
- question.
+- In Project 2, we're missing tests that pass system calls arguments
+that span multiple pages, where some pages are mapped and some are not.
+An implementation that checks only the first page, rather than every
+page that a call to read()/write() can touch, passes all tests.
- - How about `diff' and `cmp' programs?
+- In Project 2, we're missing a test that would fail if students assumed
+that contiguous user-virtual addresses are laid out contiguously
+in physical memory. The loading code should ensure that non-contiguous
+physical pages are allocated for the data segment (at least).
-* Make it clear that the students own their code, because there was some
- confusion on that point.
+- Need some tests that check that illegal accesses lead to process
+termination. I have written some and will add them. In P2, obviously,
+this would require that the students break this functionality, since
+the page directory is initialized for them; still, it would be good
+to have.
-* Threads:
+- There does not appear to be a test that checks that they close all
+fd's on exit. Idea: add statistics & self-diagnostics code to palloc.c
+and malloc.c. Self-diagnostics code could be used for debugging.
+The statistics code would report how much kernel memory is free.
+Add a system call "get_kernel_memory_information". User programs
+could engage in a variety of activities and notice leaks by checking
+the kernel memory statistics.
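
For the first item above (system call arguments that span multiple
pages), the check that such a test would exercise might look roughly
like the following. This is a minimal sketch, not the sample solution:
buffer_is_valid(), page_mapped_fn and only_one_page_mapped() are
hypothetical names, and the "is this page mapped" lookup is passed in
as a callback instead of consulting a real page directory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096u   /* x86 page size, as in Pintos. */

/* Hypothetical predicate: true if the page starting at PAGE is mapped
   for the current process.  A real kernel would consult the page
   directory; the caller supplies it here so the sketch can be tested. */
typedef bool page_mapped_fn (uintptr_t page);

/* Returns true only if every page touched by [uaddr, uaddr + size)
   is mapped.  Checking just the first page is the bug described above. */
bool
buffer_is_valid (uintptr_t uaddr, size_t size, page_mapped_fn *is_mapped)
{
  if (size == 0)
    return true;
  if (uaddr + (size - 1) < uaddr)   /* Address range wraps around. */
    return false;

  uintptr_t first = uaddr & ~(uintptr_t) (PGSIZE - 1);
  uintptr_t last = (uaddr + size - 1) & ~(uintptr_t) (PGSIZE - 1);
  for (uintptr_t page = first;; page += PGSIZE)
    {
      if (!is_mapped (page))
        return false;
      if (page == last)
        break;
    }
  return true;
}

/* Toy address space for illustration: only page 0x8048000 is mapped. */
bool
only_one_page_mapped (uintptr_t page)
{
  return page == 0x8048000u;
}
```

A test along these lines would pass a buffer that starts near the end
of a mapped page and runs into an unmapped one, and expect the process
to be terminated.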
- - join-invalid doesn't compile if tid_t is not scalar type.
+---
- - mlfqs tests suck. They aren't even correct, e.g. the amarv
- submission from win0405 is graded incorrectly.
+From: "Godmar Back" <godmar@gmail.com>
+Subject: priority donation tests
+To: "Ben Pfaff" <blp@cs.stanford.edu>
+Date: Fri, 3 Mar 2006 11:02:08 -0500
-* Userprog project:
+Ben,
- - Don't emphasize that stuff needs to be copied from user space to
- kernel space. Instead, emphasize validation and suggest that
- copying is a common solution and that it will be necessary in
- project 3 and in real OSes. Also revise the grading criteria to
- match.
+it seems the priority donation tests are somewhat incomplete and allow
+incorrect implementations to pass with a perfect score.
- - Move `join' implementation here, from `threads' project, to help
- normalize the project difficulties.
+We are seeing the following wrong implementations pass all tests:
- - The semantics of the join system call should change so that it
- only returns the exit code once.
+- Implementations that assume locks are released in the reverse of the
+order in which they were acquired. The students implement this by
+popping/pushing on the donation list.
- - Mark read-only pages as actually read-only in the page table. Or,
- since this was consistently rated as the easiest project by the
- students, require them to do it.
+- Implementations that assume that the priority of a thread waiting on
+a semaphore or condition variable cannot change between when the
+thread was blocked and when it is unblocked. The students implement
+this by doing an insert into an ordered list on block, rather than
+picking the maximum thread on unblock.
- - Don't provide per-process pagedir implementation but only
- single-process implementation and require students to implement
- the separation? This project was rated as the easiest after all.
- Alternately we could just remove the synchronization on pid
- selection and check that students fix it.
+Neither of these two cases is detected; do you currently check for
+these mistakes manually?
-* VM project:
+I wrote a test that checks for the first case; it is here:
+http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple-2.patch
- - Discuss the perils of mixing dirty bits between kernel and user virtual
- memory.
+[...]
- - Sample solution.
+I also wrote a test case for the second scenario:
+http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.c
+http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.ck
- - Update grading/vm to reflect new mmap, munmap forms.
+I put the other tests up here:
+http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple2.c
+http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple2.ck
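
The second wrong implementation in the message above (ordered insert
on block instead of picking the maximum on unblock) can be contrasted
with a sketch of the correct selection. This is only an illustration
under simplifying assumptions: struct waiter and pick_max_waiter() are
hypothetical stand-ins, and a plain array replaces the kernel's list
of waiters.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for Pintos' struct thread. */
struct waiter
  {
    int priority;     /* Effective priority; may change while blocked. */
  };

/* Picks the waiter to wake by scanning for the maximum priority at
   unblock time.  Returns its index, or -1 if there are no waiters.
   Ties go to the earliest waiter, preserving FIFO order among equals. */
int
pick_max_waiter (const struct waiter *waiters, size_t cnt)
{
  int best = -1;
  for (size_t i = 0; i < cnt; i++)
    if (best < 0 || waiters[i].priority > waiters[best].priority)
      best = (int) i;
  return best;
}
```

Because the scan happens at unblock time, a donation that raises a
waiter's priority while it is blocked is still taken into account; a
list ordered once at block time would miss it.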
-* Filesys project:
+From: "Godmar Back" <godmar@gmail.com>
+Subject: multiple threads waking up at same clock tick
+To: "Ben Pfaff" <blp@cs.stanford.edu>
+Date: Wed, 1 Mar 2006 08:14:47 -0500
+
+Greg Benson points out another potential TODO item for P1.
+
+----
+One thing I recall:
+
+The alarm tests do not test to see if multiple threads are woken up if
+their timers have expired. That is, students can write a solution
+that just wakes up the first thread on the sleep queue rather than
+check for additional threads. Of course, the next thread will be
+woken up on the next tick. Also, this might be hard to test.
+
+---
+Way to test this: (from Godmar Back)
+
+Thread A with high priority spins until 'ticks' changes, then calls
+timer_sleep(X). Thread B with lower priority is then resumed, calls
+set_priority to make its priority equal to that of thread A, then
+calls timer_sleep(X), all of that before the next clock interrupt
+arrives.
+
+On wakeup, each thread records wake-up time and calls yield
+immediately, forcing the scheduler to switch to the other
+equal-priority thread. Both wake-up times must be the same (and match
+the planned wake-up time.)
+
+PS:
+I actually tested it and it's hard to pass with the current ips setting.
+The bounds on how quickly a thread would need to be able to return after
+sleep appear too tight. Need another idea.
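
The behavior the missing alarm test would check can be sketched as
follows. This is a hedged illustration, not Pintos code: struct
sleep_queue and wake_expired() are hypothetical, and the queue is
modeled as a sorted array of wake-up ticks instead of a list of
threads.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sleep queue: wake-up ticks sorted in ascending order. */
struct sleep_queue
  {
    const int64_t *wake_ticks;   /* Sorted wake-up times. */
    size_t cnt;                  /* Number of entries. */
    size_t head;                 /* Index of first still-sleeping entry. */
  };

/* Called once per timer tick.  Wakes every sleeper whose time has
   come, not just the first, and returns how many were woken. */
size_t
wake_expired (struct sleep_queue *q, int64_t now)
{
  size_t woken = 0;
  while (q->head < q->cnt && q->wake_ticks[q->head] <= now)
    {
      q->head++;      /* A real kernel would call thread_unblock() here. */
      woken++;
    }
  return woken;
}
```

The buggy variant replaces the while loop with an if, so sleepers that
share a wake-up tick are released one per tick.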
+
+From: "Godmar Back" <godmar@gmail.com>
+
+For reasons I don't currently understand, some of our students seem
+hesitant to include each thread in a second "all-threads" list and are
+looking for ways to implement the advanced scheduler without one.
+
+Currently, I believe, all tests for the mlfqs are such that all
+threads are either ready or sleeping in timer_sleep(). This allows for
+an incorrect implementation in which recent-cpu and priorities are
+updated only for those threads that are on the alarm list or the ready
+list.
+
+The todo item would be a test where a thread is blocked on a
+semaphore, lock or condition variable long enough for its recent_cpu to
+decay to zero, and that checks that it's scheduled right after the
+unlock/up/signal.
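
To make the decay concrete, here is a small sketch of the once-per-second
recent_cpu update from the Pintos MLFQS specification,
recent_cpu = (2*load_avg)/(2*load_avg + 1) * recent_cpu + nice.
It uses double for brevity, whereas the kernel uses fixed-point
arithmetic, and the function names are hypothetical.

```c
/* One per-second recent_cpu update, per the Pintos MLFQS formula.
   Uses double for brevity; the kernel itself uses fixed-point. */
double
decay_recent_cpu (double recent_cpu, double load_avg, int nice)
{
  return (2 * load_avg) / (2 * load_avg + 1) * recent_cpu + nice;
}

/* Applies SECONDS consecutive updates, as would happen to a thread
   that stays blocked for that long and is included in every update. */
double
recent_cpu_after (double recent_cpu, double load_avg, int nice, int seconds)
{
  while (seconds-- > 0)
    recent_cpu = decay_recent_cpu (recent_cpu, load_avg, nice);
  return recent_cpu;
}
```

With load_avg at 1 and nice at 0, each update multiplies recent_cpu by
2/3, so a value of 90 falls below 0.001 within 30 seconds. An
implementation that skips blocked threads would leave it at 90, and the
thread would be scheduled later than it should be after it is unblocked.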
+
+From: "Godmar Back" <godmar@gmail.com>
+Subject: set_priority & donation - a TODO item
+To: "Ben Pfaff" <blp@cs.stanford.edu>
+Date: Mon, 20 Feb 2006 22:20:26 -0500
+
+Ben,
+
+it seems that there are currently no tests that check the proper
+behavior of thread_set_priority() when called by a thread that is
+running under priority donation. The proper behavior, I assume, is to
+temporarily drop the donation if the set priority is higher, and to
+reassume the donation should the thread subsequently set its own
+priority again to a level that's lower than a still active donation.
+
+ - Godmar
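
The behavior assumed in the message above can be sketched by keeping
the base priority separate from active donations. This is a minimal
illustration under that assumption: struct prio_state,
effective_priority() and set_base_priority() are hypothetical names,
not part of Pintos.

```c
#include <stddef.h>

/* Hypothetical, simplified per-thread priority state. */
struct prio_state
  {
    int base;                 /* Priority last set by the thread itself. */
    const int *donations;     /* Priorities currently donated to it. */
    size_t donation_cnt;      /* Number of active donations. */
  };

/* Effective priority: the maximum of the base priority and all
   donations still active. */
int
effective_priority (const struct prio_state *s)
{
  int max = s->base;
  for (size_t i = 0; i < s->donation_cnt; i++)
    if (s->donations[i] > max)
      max = s->donations[i];
  return max;
}

/* Sketch of what thread_set_priority() would do: only the base
   changes; the donation stays recorded and applies again whenever
   it is higher than the base. */
void
set_base_priority (struct prio_state *s, int new_priority)
{
  s->base = new_priority;
}
```

A test would have a thread holding a donated priority raise its own
priority above the donation, then lower it below the donation, and
check the effective priority after each step.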
+
+From: Godmar Back <godmar@gmail.com>
+Subject: project 4 question/comment regarding caching inode data
+To: Ben Pfaff <blp@cs.stanford.edu>
+Date: Sat, 14 Jan 2006 15:59:33 -0500
+
+Ben,
+
+in section 6.3.3 in the P4 FAQ, you write:
+
+"You can store a pointer to inode data in struct inode, if you want,"
+
+Should you point out that if they indeed do that, they likely wouldn't
+be able to support more than 64 open inodes systemwide at any given
+point in time?
+
+(This seems like a rather strong limitation; do your current tests
+open more than 64 files?
+It would also point to an obvious way to make the projects harder by
+specifically disallowing that inode data be locked in memory during
+the entire time an inode is kept open.)
+
+ - Godmar
+
+From: Godmar Back <godmar@gmail.com>
+Subject: on caching in project 4
+To: Ben Pfaff <blp@cs.stanford.edu>
+Date: Mon, 9 Jan 2006 20:58:01 -0500
+
+here's an idea for future semesters.
+
+I'm in the middle of project 4; I started by implementing a buffer
+cache and plugging it into the existing filesystem. Along the way I
+was wondering how we could test the cache.
+
+Maybe one could adopt a testing strategy similar to the one in project 1
+for the MLFQS scheduler: add a function "get_cache_accesses()"
+and a function "get_cache_hits()". Then create a version of pintos
+that creates access traces for a to-be-determined workload. Run an
+off-line analysis that would determine how many hits a perfect cache
+would have (MAX), and how many, say, an LRU strategy would give (MIN).
+Then add a fudge factor to account for different index strategies and
+test that the reported number of cache hits/accesses is within (MIN,
+MAX) +/- fudge factor.
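
The off-line analysis could start from something like the sketch
below. It is only an illustration: lru_hits() and hits_plausible() are
hypothetical helpers, the trace is assumed to be an array of block
numbers, and the acceptance interval is read here as
[MIN - fudge, MAX + fudge]. Computing MAX for a perfect cache is not
shown.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CACHE 64   /* Upper bound on simulated cache size (assumption). */

/* Counts hits for an LRU cache of CAPACITY blocks over an access
   trace of block numbers.  CAPACITY must be between 1 and MAX_CACHE. */
size_t
lru_hits (const unsigned *trace, size_t n, size_t capacity)
{
  unsigned blocks[MAX_CACHE];   /* blocks[0] is most recently used. */
  size_t used = 0, hits = 0;

  for (size_t i = 0; i < n; i++)
    {
      size_t pos = 0;
      while (pos < used && blocks[pos] != trace[i])
        pos++;
      if (pos < used)
        hits++;                       /* Found: a cache hit. */
      else if (used < capacity)
        pos = used++;                 /* Miss with a free slot. */
      else
        pos = used - 1;               /* Miss: evict least recently used. */
      for (; pos > 0; pos--)          /* Move the block to the front. */
        blocks[pos] = blocks[pos - 1];
      blocks[0] = trace[i];
    }
  return hits;
}

/* True if REPORTED lies within [min - fudge, max + fudge]. */
bool
hits_plausible (size_t reported, size_t min, size_t max, size_t fudge)
{
  size_t lo = min > fudge ? min - fudge : 0;
  return reported >= lo && reported <= max + fudge;
}
```

A grading script would feed the recorded trace to lru_hits() to obtain
MIN and compare the hit count reported by the student's kernel with
hits_plausible().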
+
+(As an aside - I am curious why you chose to use a clock-style
+algorithm rather than the more straightforward LRU for your buffer
+cache implementation in your sample solution. Is there a reason for
+that? I was curious to see if it made a difference, so I implemented
+LRU for your cache implementation and ran the test workload of project
+4 and printed cache hits/accesses.
+I found that for that workload, the clock-based algorithm performs
+almost identically to LRU (within about 1%, but I ran nondeterministically
+with QEMU). I then reduced the cache size to 32 blocks and found again
+the same performance, which raises the suspicion that the test
+workload might not force any cache replacement, so the eviction
+strategy doesn't matter.)
+
+Godmar Back <godmar@gmail.com> writes:
- - Increase maximum disk size from 8 MB to something that actually
- requires doubly indirect nodes. There is a negative pressure here
- from the bitmap object--perhaps we need a specialized bitmap that
- doesn't have to be all in-memory at once.
+> in your sample solution to P4, dir_reopen does not take any locks when
+> changing a directory's open_cnt. This looks like a race condition to
+> me, considering that dir_reopen is called from execute_process without
+> any filesystem locks held.
- Alternatively, shrink the inode size.
+* Get rid of rox--causes more trouble than it's worth
- - Add mkdir and ls example user programs.
+* Reconsider command line arg style--confuses everyone.
- - Add option to disable buffer cache.
+* Finish writing tour.
- - Get rid of "dump" commands--they're not really useful.
+* Introduce a "yield" system call to speed up the syn-* tests.
- - Sample solution.
+via Godmar Back:
+
+* Project 3 solution needs FS lock.
+
+* Get rid of mmap syscall, add sbrk.
+
+* Make backtrace program accept multiple object file arguments,
+ e.g. add -u option to allow backtracing user program also.
+
+* page-linear, page-shuffle VM tests do not use enough memory to force
+ eviction. Should increase memory consumption.
+
+* Add FS persistence test(s).
+
+* process_death test needs improvement
+
+* Internal tests.
+
+* Improve automatic interpretation of exception messages.
+
+* Userprog project:
+
+ - Mark read-only pages as actually read-only in the page table. Or,
+ since this was consistently rated as the easiest project by the
+ students, require them to do it.
+
+ - Don't provide per-process pagedir implementation but only
+ single-process implementation and require students to implement
+ the separation? This project was rated as the easiest after all.
+ Alternately we could just remove the synchronization on pid
+ selection and check that students fix it.
+
+* Filesys project:
- Need a better way to measure performance improvement of buffer
cache. Some students reported that their system was slower with
cache--likely, Bochs doesn't simulate a disk with a realistic
speed.
- - Clarify effect of remove(cwd).
-
* Documentation:
- - Finish writing tour.
-
- Add "Digging Deeper" sections that describe the nitty-gritty x86
details for the benefit of those interested.
. everything needed for getcwd()
- - Add src/testcases/vm, src/testcases/filesys and make it clear to use
- them?
+To add partition support:
+
+- Find four partition types that are more or less unused and choose to
+ use them for Pintos. (This is implemented.)
+
+- Bootloader reads partition tables of all BIOS devices to find the
+ first that has the "Pintos kernel" partition type. (This is
+ implemented.) Ideally the bootloader would make sure there is
+ exactly one such partition, but I didn't implement that yet.
+
+- Bootloader reads kernel into memory at 1 MB using BIOS calls. (This
+ is implemented.)
+
+- Kernel arguments have to go into a separate sector because the
+ bootloader is otherwise too big to fit now? (I don't recall if I
+ did anything about this.)
+
+- Kernel at boot also scans partition tables of all the disks it can
+ find to find the ones with the four Pintos partition types (perhaps
+ not all exist). After that, it makes them available to the rest of
+ the kernel (and doesn't allow access to other devices, for safety).
+
+- "pintos" and "pintos-mkdisk" need to write a partition table to the
+ disks that they create. "pintos-mkdisk" will need to take a new
+ parameter specifying the type. (I might have partially implemented
+ this, don't remember.)
+
+- "pintos" should insist on finding a partition header on disks handed
+ to it, for safety.
-* Tests:
+- Need some way for "pintos" to assemble multiple disks or partitions
+ into a single image that can be copied directly to a USB block
+ device. (I don't know whether I came up with a good solution yet or
+ not, or whether I implemented any of it.)
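
The partition-table scan that both the bootloader and the kernel need
can be sketched in C as follows. This is an illustration of reading a
classic MBR partition table, not the implemented code: the type value
0x20 is a hypothetical placeholder for the "Pintos kernel" type, and
struct partition, read_le32() and find_partition() are made-up names.
Only the four primary entries are examined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512
#define PTABLE_OFFSET 446     /* Start of the 4-entry MBR partition table. */
#define PENTRY_SIZE 16        /* Size of one partition table entry. */
#define PINTOS_KERNEL_TYPE 0x20   /* Hypothetical type ID, for illustration. */

struct partition
  {
    uint8_t type;       /* Partition type byte. */
    uint32_t start;     /* First sector (LBA). */
    uint32_t size;      /* Length in sectors. */
  };

/* Reads a little-endian 32-bit value. */
uint32_t
read_le32 (const uint8_t *p)
{
  return (uint32_t) p[0] | (uint32_t) p[1] << 8
         | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24;
}

/* Finds the first primary partition of type TYPE in the MBR sector.
   Returns false if the boot signature is missing or no entry matches. */
bool
find_partition (const uint8_t mbr[SECTOR_SIZE], uint8_t type,
                struct partition *out)
{
  if (mbr[510] != 0x55 || mbr[511] != 0xaa)
    return false;
  for (int i = 0; i < 4; i++)
    {
      const uint8_t *e = mbr + PTABLE_OFFSET + i * PENTRY_SIZE;
      if (e[4] == type && read_le32 (e + 12) > 0)
        {
          out->type = e[4];
          out->start = read_le32 (e + 8);
          out->size = read_le32 (e + 12);
          return true;
        }
    }
  return false;
}
```

The same routine could serve the "pintos" script's safety check in
spirit: a disk without the 0x55 0xAA signature has no partition header
and would be rejected.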
- - Release some of them.
+To add USB support:
- - The threads, userprog, vm test source files could use
- factorization and cleanup along the lines of fslib in the filesys
- tests.
+- Needs to be able to scan PCI bus for UHCI controller. (I
+ implemented this partially.)
- - The p1-4.c testcase needs significant tuning. Currently it takes
- too long (especially when SHOW_PROGRESS is turned on) and doesn't
- show significant improvement.
+- May want to be able to initialize USB controllers over CardBus
+ bridges. I don't know whether this requires additional work or if
+ it's useful enough to warrant extra work. (It's of special interest
+ for me because I have a laptop that only has USB via CardBus.)
-* Code:
+- There are many protocol layers involved: SCSI over USB-Mass Storage
+ over USB over UHCI over PCI. (I may be forgetting one.) I don't
+ know yet whether it's best to separate the layers or to merge (some
+ of) them. I think that a simple and clean organization should be a
+ priority.
- - Rewrite quick_sort() to use heap sort, for O(1) stack usage.
+- VMware can likely be used for testing because it can expose host USB
+ devices as guest USB devices. This is safer and more convenient
+ than using real hardware for testing.
- - Rewrite list_sort() to use merge sort, for O(1) heap usage.
+- Should test with a variety of USB keychain devices because there
+ seems to be wide variation among them, especially in the SCSI
+ protocols they support. Should try to use a "lowest-common
+ denominator" SCSI protocol if any such thing really exists.
- - Make printf() test actually check its results.
+- Might want to add a feature whereby kernel arguments can be given
+ interactively, rather than passed on-disk. Needs some thought.