-*- text -*-
-* Bochs is not fully reproducible.
+* In grading scripts, warn when a fault is caused by an attempt to
+ write to the kernel text segment. (Among other things we need to
+ explain that "text" means "code".)
-Godmar says:
-
-- In Project 2, we're missing tests that pass arguments to system calls
-that span multiple pages, where some are mapped and some are not.
-An implementation that only checks the first page, rather than all pages
-that can be touched during a call to read()/write() passes all tests.
-
-- In Project 2, we're missing a test that would fail if they assumed
-that contiguous user-virtual addresses are laid out contiguously
-in memory. The loading code should ensure that non-contiguous
-physical pages are allocated for the data segment (at least.)
-
-- Need some tests that test that illegal accesses lead to process
-termination. I have written some, will add them. In P2, obviously,
-this would require that the students break this functionality since
-the page directory is initialized for them, still it would be good
-to have.
-
-- There does not appear to be a test that checks that they close all
-fd's on exit. Idea: add statistics & self-diagnostics code to palloc.c
-and malloc.c. Self-diagnostics code could be used for debugging.
-The statistics code would report how much kernel memory is free.
-Add a system call "get_kernel_memory_information". User programs
-could engage in a variety of activities and notice leaks by checking
-the kernel memory statistics.
-
----
-
-From: "Godmar Back" <godmar@gmail.com>
-Subject: priority donation tests
-To: "Ben Pfaff" <blp@cs.stanford.edu>
-Date: Fri, 3 Mar 2006 11:02:08 -0500
-
-Ben,
-
-it seems the priority donation tests are somewhat incomplete and allow
-incorrect implementations to pass with a perfect score.
-
-We are seeing the following wrong implementations pass all tests:
-
-- Implementations that assume locks are released in the opposite order
-in which they're acquired. The students implement this by
-popping/pushing on the donation list.
-
-- Implementations that assume that the priority of a thread waiting on
-a semaphore or condition variable cannot change between when the
-thread was blocked and when it is unblocked. The students implement
-this by doing an insert into an ordered list on block, rather than
-picking the maximum thread on unblock.
-
-Neither of these two cases is detected; do you currently check for
-these mistakes manually?
-
-I wrote a test that checks for the first case; it is here:
-http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple-2.patch
-
-[...]
-
-I also wrote a test case for the second scenario:
-http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.c
-http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.ck
-
-I put the other tests up here:
-http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple2.c
-http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple2.ck
-
-From: "Godmar Back" <godmar@gmail.com>
-Subject: multiple threads waking up at same clock tick
-To: "Ben Pfaff" <blp@cs.stanford.edu>
-Date: Wed, 1 Mar 2006 08:14:47 -0500
-
-Greg Benson points out another potential TODO item for P1.
-
-----
-One thing I recall:
-
-The alarm tests do not test to see if multiple threads are woken up if
-their timers have expired. That is, students can write a solution
-that just wakes up the first thread on the sleep queue rather than
-check for additional threads. Of course, the next thread will be
-woken up on the next tick. Also, this might be hard to test.
-
----
-Way to test this: (from Godmar Back)
-
-Thread A with high priority spins until 'ticks' changes, then calls to
-timer_sleep(X), Thread B with lower priority is then resumed, calls
-set_priority to make its priority equal to that of thread A, then
-calls timer_sleep(X), all of that before the next clock interrupt
-arrives.
-
-On wakeup, each thread records wake-up time and calls yield
-immediately, forcing the scheduler to switch to the other
-equal-priority thread. Both wake-up times must be the same (and match
-the planned wake-up time.)
-
-PS:
-I actually tested it and it's hard to pass with the current ips setting.
-The bounds on how quickly a thread would need to be able to return after
-sleep appear too tight. Need another idea.
-
----
-From: "Waqar Mohsin" <wmohsin@gmail.com>
-Subject: 3 questions about switch_threads() in switch.S
-To: blp@cs.stanford.edu, joshwise@stanford.edu
-Date: Fri, 3 Mar 2006 17:09:21 -0800
-
-QUESTION 1
-
-In the section
-
- # Save current stack pointer to old thread's stack, if any.
- movl SWITCH_CUR(%esp), %eax
- test %eax, %eax
- jz 1f
- movl %esp, (%eax,%edx,1)
-1:
-
- # Restore stack pointer from new thread's stack.
- movl SWITCH_NEXT(%esp), %ecx
- movl (%ecx,%edx,1), %esp
-
-why are we saving the current stack pointer only if the "cur" thread pointer
-is non-NULL ? Isn't it gauranteed to be non-NULL because switch_threads() is
-only called form schedule(), where we have
-
- struct thread *cur = running_thread ();
-
-which should always be non-NULL (given the way kernel pool is laid out).
-
-QUESTION 2
-
- # This stack frame must match the one set up by thread_create().
- pushl %ebx
- pushl %ebp
- pushl %esi
- pushl %edi
-
-I find the comment confusing. thread_create() is a special case: the set of
-registers popped from switch_threads stack frame for a newly created thread
-are all zero, so their order shouldn't dictate the order above.
-
-I think all that matters is that the order of pops at the end of
-switch_threads() is the opposite of the pushes at the beginning (as shown
-above).
-
-QUESTION 3
-
-Is it true that struct switch_threads_frame does NOT strictly require
-
- struct thread *cur; /* 20: switch_threads()'s CUR argument. */
- struct thread *next; /* 24: switch_threads()'s NEXT argument. */
-at the end ?
-
-When a newly created thread's stack pointer is installed in switch_threads(),
-all we do is pop the saved registers and return to switch_entry() which pops
-off and discards the above two simulated (and not used) arguments to
-switch_threads().
-
-If we remove these two from struct switch_threads_frame and don't do a
-
- # Discard switch_threads() arguments.
- addl $8, %esp
-in switch_entry(), things should still work. Am I right ?
-
-Thanks
-Waqar
-
-From: "Godmar Back" <godmar@gmail.com>
-Subject: thread_yield in irq handler
-To: "Ben Pfaff" <blp@cs.stanford.edu>
-Date: Wed, 22 Feb 2006 22:18:50 -0500
-
-Ben,
-
-you write in your Tour of Pintos:
-
-"Second, an interrupt handler must not call any function that can
-sleep, which rules out thread_yield(), lock_acquire(), and many
-others. This is because external interrupts use space on the stack of
-the kernel thread that was running at the time the interrupt occurred.
-If the interrupt handler tried to sleep and that thread resumed, then
-the two uses of the single stack would interfere, which cannot be
-allowed."
-
-Is the last sentence really true?
-
-I thought the reason that you couldn't sleep is that you would put
-effectively a random thread/process to sleep, but I don't think it
-would cause problems with the kernel stack. After all, it doesn't
-cause this problem if you call thread_yield at the end of
-intr_handler(), so why would it cause this problem earlier.
-
-As for thread_yield(), my understanding is that the reason it's called
-at the end is to ensure it's done after the interrupt is acknowledged,
-which you can't do until the end because Pintos doesn't handle nested
-interrupts.
-
- - Godmar
-
-From: "Godmar Back" <godmar@gmail.com>
-
-For reasons I don't currently understand, some of our students seem
-hesitant to include each thread in a second "all-threads" list and are
-looking for ways to implement the advanced scheduler without one.
-
-Currently, I believe, all tests for the mlfqs are such that all
-threads are either ready or sleeping in timer_sleep(). This allows for
-an incorrect implementation in which recent-cpu and priorities are
-updated only for those threads that are on the alarm list or the ready
-list.
+* Reconsider command line arg style--confuses everyone.
-The todo item would be a test where a thread is blocked on a
-semaphore, lock or condition variable and have its recent_cpu decay to
-zero, and check that it's scheduled right after the unlock/up/signal.
+* Internal tests.
-From: "Godmar Back" <godmar@gmail.com>
-Subject: set_priority & donation - a TODO item
-To: "Ben Pfaff" <blp@cs.stanford.edu>
-Date: Mon, 20 Feb 2006 22:20:26 -0500
+* Godmar: Extend memory leak robustness tests.
+ multi-oom should still pass in project 3/4 because kernel will run out
+ of kernel pool memory before running out of swap space.
+
+* Godmar: Another area is concurrency. I noticed that I had passed all
+ tests with bochs 2.2.1 (in reproducibility mode). Then I ran them
+ with qemu and hit two deadlocks (one of them in rox-*,
+ incidentally). After fixing those deadlocks, I upgraded to bochs
+ 2.2.5 and hit yet another deadlock in reproducibility mode that
+ didn't show up in 2.2.1. All in all, a standard grading run would
+ have missed 3 deadlocks in my code. I'm not sure how to exploit
+ that for grading - either run with qemu n times (n=2 or 3), or run
+ it with bochs and a set of -j parameters. Some of which could be
+ known to the students, some not, depending on preference. (I ported
+ the -j patch to bochs 2.2.5 -
+ http://people.cs.vt.edu/~gback/pintos/bochs-2.2.5.jitter.patch but I
+ have to admit I never tried it so I don't know if it would have
+ uncovered the deadlocks that qemu and the switch to 2.2.5
+ uncovered.)
+
+* Godmar: There is also the option to require students to develop test
+ workloads themselves, for instance, to demonstrate the effectiveness
+ of a particular algorithm (page eviction & buffer cache replacement
+ come to mind.) This could involve a problem of the form: develop a
+ workload that you cover well, and develop a "worst-case" load where
+ you algorithm performs poorly, and show the results of your
+ quantitative evaluation in your report - this could then be part of
+ their test score.
+
+* Godmar: the spec says that illegal syscall arguments can be handled either by
+ terminating the process, or by returning an error code such as -1.
+
+ Looking at http://gback.cs.vt.edu:8080/source/xref/tests/userprog/write-bad-ptr.c
+ and http://gback.cs.vt.edu:8080/source/xref/tests/userprog/write-bad-ptr.ck
+ I'm wondering if write-bad-ptr isn't forcing them to terminate the
+ process(?). Even though write-bad-ptr.ck has a provision to allow
+ continuation after returning -1, wouldn't it still fail since the test
+ executes:
+ fail ("should have exited with -1");
+ ?
+
+* Godmar: mmap-inherit needs a IGNORE_USER_FAULTS since we say to "not output
+ any messages Pintos doesn't already print." - which technically puts
+ the onus on us to ignore the default page fault msg whenever a test is
+ expected to fault.
+
+* Godmar: add _end to user.lds script and construct some tests that fail
+ unless students check a region for validity rather than just the first
+ address of a region. Right now, unfortunately, they pass all p2 tests
+ with just checking the first address. [A possible problem is that the
+ tests may be unable to tell termination due to unintentional fault
+ from willful termination when address check fails. Should we require
+ they return -1/EINVAL on a bad address and disallow termination? Or
+ construct a test that they'll likely fail if they unintentionally
+ terminate, maybe while holding the filesystem lock? Or require that
+ the diagnostic message only be output when fault occurs in user mode?
+ Something to think about.]
+
+* Threads project:
+
+ - Godmar:
+
+ >> Describe a potential race in thread_set_priority() and explain how
+ >> your implementation avoids it. Can you use a lock to avoid this race?
+
+ I'm not sure what you're getting at here:
+ If changing the priority of a thread involves accessing the ready
+ list, then of course there's a race with interrupt handlers and locks
+ can't be used to resolve it.
+
+ Changing the priority however also involves a race with respect to
+ accessing a thread's "priority" field - this race is with respect to
+ other threads that attempt to donate priority to the thread that's
+ changing its priority. Since this is a thread-to-thread race, I would
+ tend to believe that locks could be used, although I'm not certain. [
+ I should point out, though, that lock_acquire currently disables
+ interrupts - the purpose of which I had doubted in an earlier email,
+ since sema_down() sufficiently establishes mutual exclusion. Taking
+ priority donation into account, disabling interrupts prevents the race
+ for the priority field, assuming the priority field of each thread is
+ always updated with interrupts disabled. ]
+
+ What answer are you looking for for this design document question?
+
+ - Godmar: Another thing: one group passed all tests even though they
+ wake up all waiters on a lock_release(), rather than just
+ one. Since there's never more than one waiter in our tests, they
+ didn't fail anything. Another possible TODO item - this could be
+ part a series of "regression tests" that check that they didn't
+ break basic functionality in project 1. I don't think this would
+ be insulting to the students.
-Ben,
+* Userprog project:
-it seems that there are currently no tests that check the proper
-behavior of thread_set_priority() when called by a thread that is
-running under priority donation. The proper behavior, I assume, is to
-temporarily drop the donation if the set priority is higher, and to
-reassume the donation should the thread subsequently set its own
-priority again to a level that's lower than a still active donation.
+ - Get rid of rox--causes more trouble than it's worth
- - Godmar
+ - Extra credit: specifics on how to implement sbrk, malloc.
+ Godmar: I have a sample solution and tests for that! Stay tuned.
-From: Godmar Back <godmar@gmail.com>
-Subject: project 4 question/comment regarding caching inode data
-To: Ben Pfaff <blp@cs.stanford.edu>
-Date: Sat, 14 Jan 2006 15:59:33 -0500
+ - Godmar: We're missing tests that pass arguments to system calls
+ that span multiple pages, where some are mapped and some are not.
+ An implementation that only checks the first page, rather than all
+ pages that can be touched during a call to read()/write() passes
+ all tests.
-Ben,
+ - Godmar: There does not appear to be a test that checks that they
+ close all fd's on exit. Idea: add statistics & self-diagnostics
+ code to palloc.c and malloc.c. Self-diagnostics code could be
+ used for debugging. The statistics code would report how much
+ kernel memory is free. Add a system call
+ "get_kernel_memory_information". User programs could engage in a
+ variety of activities and notice leaks by checking the kernel
+ memory statistics.
+ - note: multi-oom tests that now.
-in section 6.3.3 in the P4 FAQ, you write:
+ - Godmar: In the wait() tests, there's currently no test that tests
+ that a process can only wait for its own children. There's only
+ one test that tests that wait() on an invalid pid returns -1 (or
+ kills the process), but no test where a valid pid is used that is
+ not a child of the current process.
-"You can store a pointer to inode data in struct inode, if you want,"
+ The current tests also do not ensure that both scenarios (parent waits
+ first vs. child exits first) are exercised. In this context, I'm
+ wondering if we should add a sleep() system call that would export
+ timer_sleep() to user processes; this would allow the construction of
+ such a test. It would also make it easier to construct a test for the
+ valid-pid, but not-a-child scenario.
-Should you point out that if they indeed do that, they likely wouldn't
-be able to support more than 64 open inodes systemwide at any given
-point in time.
+ As in Project 4, the baseline implementation of timer_sleep() should
+ suffice, so this would not necessarily require basing Project 2 on
+ Project 1. [ A related thought: IMO it would not be entirely
+ unreasonable to require timer_sleep() and priority scheduling sans
+ donation from Project 1 working for subsequent projects. ]
-(This seems like a rather strong limitation; do your current tests
-open more than 64 files?
-It would also point to an obvious way to make the projects harder by
-specifically disallowing that inode data be locked in memory during
-the entire time an inode is kept open.)
+* VM project:
- - Godmar
-
-From: Godmar Back <godmar@gmail.com>
-Subject: on caching in project 4
-To: Ben Pfaff <blp@cs.stanford.edu>
-Date: Mon, 9 Jan 2006 20:58:01 -0500
+ - Godmar: Get rid of mmap syscall, add sbrk.
-here's an idea for future semesters.
-
-I'm in the middle of project 4, I've started by implementing a buffer
-cache and plugging it into the existing filesystem. Along the way I
-was wondering how we could test the cache.
-
-Maybe one could adopt a similar testing strategy as in project 1 for
-the MLQFS scheduler: add a function that reads "get_cache_accesses()"
-and a function "get_cache_hits()". Then create a version of pintos
-that creates access traces for a to-be-determined workload. Run an
-off-line analysis that would determine how many hits a perfect cache
-would have (MAX), and how much say an LRU strategy would give (MIN).
-Then add a fudge factor to account for different index strategies and
-test that the reported number of cache hits/accesses is within (MIN,
-MAX) +/- fudge factor.
-
-(As an aside - I am curious why you chose to use a clock-style
-algorithm rather than the more straightforward LRU for your buffer
-cache implementation in your sample solution. Is there a reason for
-that? I was curious to see if it made a difference, so I implemented
-LRU for your cache implementation and ran the test workload of project
-4 and printed cache hits/accesses.
-I found that for that workload, the clock-based algorithm performs
-almost identical to LRU (within about 1%, but I ran nondeterministally
-with QEMU). I then reduced the cache size to 32 blocks and found again
-the same performance, which raises the suspicion that the test
-workload might not force any cache replacement, so the eviction
-strategy doesn't matter.)
-
-Godmar Back <godmar@gmail.com> writes:
-
-> in your sample solution to P4, dir_reopen does not take any locks when
-> changing a directory's open_cnt. This looks like a race condition to
-> me, considering that dir_reopen is called from execute_process without
-> any filesystem locks held.
+ - Godmar: page-linear, page-shuffle VM tests do not use enough
+ memory to force eviction. Should increase memory consumption.
-* Get rid of rox--causes more trouble than it's worth
+ - Godmar: fix the page* tests to require swapping
-* Reconsider command line arg style--confuses everyone.
+ - Godmar: make sure the filesystem fails if not properly
+ concurrency-protected in project 3.
-* Finish writing tour.
+ - Godmar: Another area in which tests could be created are for
+ project 3: tests that combine mmap with a paging workload to see
+ their kernel pages properly while mmapping pages - I don't think
+ the current tests test that, do they?
-* Introduce a "yield" system call to speed up the syn-* tests.
+* Filesys project:
-via Godmar Back:
+ - Need a better way to measure performance improvement of buffer
+ cache. Some students reported that their system was slower with
+ cache--likely, Bochs doesn't simulate a disk with a realistic
+ speed.
-* Project 3 solution needs FS lock.
+ (Perhaps we should count disk reads and writes, not time.)
+
+ - Need lots more tests.
+
+ - Detect implementations that represent the cwd as a string, by
+ removing a directory that is the cwd of another process, then
+ creating a new directory of the same name and putting some files
+ in it, then checking whether the process that had it as cwd sees
+ them.
+
+ - dir-rm-cwd should have a related test that uses a separate process
+ to try to pin the directory as its cwd.
+
+ - Godmar: I'm not sure if I mentioned that already, but I passed all
+ tests for the filesys project without having implemented inode
+ deallocation. A test is needed that checks that blocks are
+ reclaimed when files are deleted.
+
+ - Godmar: I'm in the middle of project 4, I've started by
+ implementing a buffer cache and plugging it into the existing
+ filesystem. Along the way I was wondering how we could test the
+ cache.
+
+ Maybe one could adopt a similar testing strategy as in project 1
+ for the MLQFS scheduler: add a function that reads
+ "get_cache_accesses()" and a function "get_cache_hits()". Then
+ create a version of pintos that creates access traces for a
+ to-be-determined workload. Run an off-line analysis that would
+ determine how many hits a perfect cache would have (MAX), and how
+ much say an LRU strategy would give (MIN). Then add a fudge
+ factor to account for different index strategies and test that the
+ reported number of cache hits/accesses is within (MIN, MAX) +/-
+ fudge factor.
+
+ (As an aside - I am curious why you chose to use a clock-style
+ algorithm rather than the more straightforward LRU for your buffer
+ cache implementation in your sample solution. Is there a reason
+ for that? I was curious to see if it made a difference, so I
+ implemented LRU for your cache implementation and ran the test
+ workload of project 4 and printed cache hits/accesses. I found
+ that for that workload, the clock-based algorithm performs almost
+ identical to LRU (within about 1%, but I ran nondeterministally
+ with QEMU). I then reduced the cache size to 32 blocks and found
+ again the same performance, which raises the suspicion that the
+ test workload might not force any cache replacement, so the
+ eviction strategy doesn't matter.)
+
+ - Godmar: I haven't analyzed the tests for project 4 yet, but I'm
+ wondering if the fairness requirements your specification has for
+ readers/writers are covered in the tests or not.
-* Get rid of mmap syscall, add sbrk.
-* Make backtrace program accept multiple object file arguments,
- e.g. add -u option to allow backtracing user program also.
+* Documentation:
-* page-linear, page-shuffle VM tests do not use enough memory to force
- eviction. Should increase memory consumption.
+ - Add "Digging Deeper" sections that describe the nitty-gritty x86
+ details for the benefit of those interested.
-* Add FS persistence test(s).
+ - Add explanations of what "real" OSes do to give students some
+ perspective.
-* lock_acquire(), lock_release() don't need additional intr_dis/enable
- calls, because the semaphore protects lock->holder.
- [ Think this over: is this really true when priority donation is
- implemented? intr_dis/enable prevents the race with thread_set_priority.
- Leaving it there could help the students getting the correct synchronization
- right.
- ]
+* To add partition support:
+ - Find four partition types that are more or less unused and choose
+ to use them for Pintos. (This is implemented.)
+ - Bootloader reads partition tables of all BIOS devices to find the
+ first that has the "Pintos kernel" partition type. (This is
+ implemented.) Ideally the bootloader would make sure there is
+ exactly one such partition, but I didn't implement that yet.
-* process_death test needs improvement
+ - Bootloader reads kernel into memory at 1 MB using BIOS calls.
+ (This is implemented.)
-* Internal tests.
+ - Kernel arguments have to go into a separate sector because the
+ bootloader is otherwise too big to fit now? (I don't recall if I
+ did anything about this.)
-* Improve automatic interpretation of exception messages.
+ - Kernel at boot also scans partition tables of all the disks it can
+ find to find the ones with the four Pintos partition types
+ (perhaps not all exist). After that, it makes them available to
+ the rest of the kernel (and doesn't allow access to other devices,
+ for safety).
-* Userprog project:
+ - "pintos" and "pintos-mkdisk" need to write a partition table to
+ the disks that they create. "pintos-mkdisk" will need to take a
+ new parameter specifying the type. (I might have partially
+ implemented this, don't remember.)
- - Mark read-only pages as actually read-only in the page table. Or,
- since this was consistently rated as the easiest project by the
- students, require them to do it.
+ - "pintos" should insist on finding a partition header on disks
+ handed to it, for safety.
- - Don't provide per-process pagedir implementation but only
- single-process implementation and require students to implement
- the separation? This project was rated as the easiest after all.
- Alternately we could just remove the synchronization on pid
- selection and check that students fix it.
+ - Need some way for "pintos" to assemble multiple disks or
+ partitions into a single image that can be copied directly to a
+ USB block device. (I don't know whether I came up with a good
+ solution yet or not, or whether I implemented any of it.)
-* Filesys project:
+* To add USB support:
- - Need a better way to measure performance improvement of buffer
- cache. Some students reported that their system was slower with
- cache--likely, Bochs doesn't simulate a disk with a realistic
- speed.
+ - Needs to be able to scan PCI bus for UHCI controller. (I
+ implemented this partially.)
-* Documentation:
+ - May want to be able to initialize USB controllers over CardBus
+ bridges. I don't know whether this requires additional work or
+ if it's useful enough to warrant extra work. (It's of special
+ interest for me because I have a laptop that only has USB via
+ CardBus.)
- - Add "Digging Deeper" sections that describe the nitty-gritty x86
- details for the benefit of those interested.
+ - There are many protocol layers involved: SCSI over USB-Mass
+ Storage over USB over UHCI over PCI. (I may be forgetting one.)
+ I don't know yet whether it's best to separate the layers or to
+ merge (some of) them. I think that a simple and clean
+ organization should be a priority.
- - Add explanations of what "real" OSes do to give students some
- perspective.
+ - VMware can likely be used for testing because it can expose host
+ USB devices as guest USB devices. This is safer and more
+ convenient than using real hardware for testing.
-* Assignments:
+ - Should test with a variety of USB keychain devices because there
+ seems to be wide variation among them, especially in the SCSI
+ protocols they support. Should try to use a "lowest-common
+ denominator" SCSI protocol if any such thing really exists.
- - Add extra credit:
+ - Might want to add a feature whereby kernel arguments can be
+ given interactively, rather than passed on-disk. Needs some
+ though.
- . Low-level x86 stuff, like paged page tables.
+==========================================================================
+============================== COMPLETED TASKS ===========================
+==========================================================================
- . Specifics on how to implement sbrk, malloc.
+* Godmar: Introduce memory leak robustness tests - both for the
+ well-behaved as well as the mis-behaved case - that tests that the
+ kernel handles low-mem conditions well.
- . Other good ideas.
+ - handled by new multi-oom.
- . opendir/readdir/closedir
+* Godmar: improved priority inheritance tests (see priority-donate-chain)
- . everything needed for getcwd()