X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=TODO;h=5bf19c1d3c94176bbc9e523b0538e12b893fa4ce;hb=26682996a2bf33a095546a4947e28ffb58dd0588;hp=2d61506b6882bbfdb81b170fb1de9779110dc8d1;hpb=a8fc8d6b82dd68230c443157dea73879f0b95bd4;p=pintos-anon

diff --git a/TODO b/TODO
index 2d61506..5bf19c1 100644
--- a/TODO
+++ b/TODO
@@ -1,16 +1,342 @@
 -*- text -*-
 
-* Grader:
+* Bochs is not fully reproducible.
 
-  - Fix bug where failures are being treated as warnings.
+Godmar says:
 
-* Userprog project:
+- In Project 2, we're missing tests that pass arguments to system calls
+that span multiple pages, where some are mapped and some are not.
+An implementation that only checks the first page, rather than all pages
+that can be touched during a call to read()/write() passes all tests.
+
+- In Project 2, we're missing a test that would fail if they assumed
+that contiguous user-virtual addresses are laid out contiguously
+in memory. The loading code should ensure that non-contiguous
+physical pages are allocated for the data segment (at least.)
+
+- Need some tests that test that illegal accesses lead to process
+termination. I have written some, will add them. In P2, obviously,
+this would require that the students break this functionality since
+the page directory is initialized for them, still it would be good
+to have.
+
+- There does not appear to be a test that checks that they close all
+fd's on exit. Idea: add statistics & self-diagnostics code to palloc.c
+and malloc.c. Self-diagnostics code could be used for debugging.
+The statistics code would report how much kernel memory is free.
+Add a system call "get_kernel_memory_information". User programs
+could engage in a variety of activities and notice leaks by checking
+the kernel memory statistics.
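Editorial sketch, not part of the patch: the first item above describes
implementations that validate only the first page of a user buffer. The C
fragment below shows one way the full check could look. It is a sketch under
stated assumptions: the page size is fixed at 4096 bytes, and
page_is_mapped() is a hypothetical stand-in for a real page-table lookup;
neither the names nor the policy are taken from the Pintos sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096u   /* Page size assumed for this sketch. */

/* Hypothetical stand-in for a page-table lookup: returns true if the
   page containing ADDR is mapped.  In this toy model only page 1
   (addresses 0x1000..0x1fff) is mapped. */
static bool
page_is_mapped (uintptr_t addr)
{
  return addr / PGSIZE == 1;
}

/* Returns true only if every page touched by [START, START + SIZE)
   is mapped.  Checking only the page that contains START is the
   mistake described in the item above. */
static bool
user_range_is_mapped (uintptr_t start, size_t size)
{
  if (size == 0)
    return true;
  if (start > UINTPTR_MAX - (size - 1))
    return false;               /* Range wraps around the address space. */

  uintptr_t last = start + (size - 1);        /* Last byte touched. */
  uintptr_t page = start - start % PGSIZE;    /* Base of first page. */
  for (;;)
    {
      if (!page_is_mapped (page))
        return false;
      if (last - page < PGSIZE)
        return true;            /* LAST lies within the current page. */
      page += PGSIZE;
    }
}
```

With this toy mapping, a 16-byte buffer at 0x1ff0 is accepted, while a
17-byte buffer at the same address is rejected because its last byte falls
on the unmapped page that follows.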
+
+---
+
+From: "Godmar Back"
+Subject: priority donation tests
+To: "Ben Pfaff"
+Date: Fri, 3 Mar 2006 11:02:08 -0500
+
+Ben,
+
+it seems the priority donation tests are somewhat incomplete and allow
+incorrect implementations to pass with a perfect score.
+
+We are seeing the following wrong implementations pass all tests:
+
+- Implementations that assume locks are released in the opposite order
+in which they're acquired. The students implement this by
+popping/pushing on the donation list.
+
+- Implementations that assume that the priority of a thread waiting on
+a semaphore or condition variable cannot change between when the
+thread was blocked and when it is unblocked. The students implement
+this by doing an insert into an ordered list on block, rather than
+picking the maximum thread on unblock.
+
+Neither of these two cases is detected; do you currently check for
+these mistakes manually?
+
+I wrote a test that checks for the first case; it is here:
+http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple-2.patch
+
+[...]
+
+I also wrote a test case for the second scenario:
+http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.c
+http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.ck
+
+I put the other tests up here:
+http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple2.c
+http://people.cs.vt.edu/~gback/pintos/priority-donate-multiple2.ck
+
+From: "Godmar Back"
+Subject: multiple threads waking up at same clock tick
+To: "Ben Pfaff"
+Date: Wed, 1 Mar 2006 08:14:47 -0500
+
+Greg Benson points out another potential TODO item for P1.
+
+----
+One thing I recall:
+
+The alarm tests do not test to see if multiple threads are woken up if
+their timers have expired. That is, students can write a solution
+that just wakes up the first thread on the sleep queue rather than
+check for additional threads. Of course, the next thread will be
+woken up on the next tick. Also, this might be hard to test.
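Editorial sketch, not part of the patch: the alarm problem described above
comes down to using "if" where "while" is needed when draining the sleep
queue. The C fragment below illustrates the difference on a toy queue. The
struct sleeper type and wake_expired() function are hypothetical and exist
only for this illustration; they are not the Pintos timer code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sleeper record; not the Pintos struct thread. */
struct sleeper
  {
    int64_t wake_tick;          /* Tick at which the thread should wake. */
    int woken;                  /* Set to 1 once the thread is "unblocked". */
  };

/* SLEEPERS holds N entries sorted by ascending wake_tick, and *HEAD
   indexes the first entry that is still asleep.  Wakes every entry
   whose wake_tick has been reached, not just the first one, and
   returns the number of entries woken on this tick. */
static size_t
wake_expired (struct sleeper *sleepers, size_t n, size_t *head, int64_t now)
{
  size_t woken = 0;

  /* `while', not `if': several sleepers may share one wake-up tick. */
  while (*head < n && sleepers[*head].wake_tick <= now)
    {
      sleepers[*head].woken = 1;
      (*head)++;
      woken++;
    }
  return woken;
}
```

With two sleepers due at tick 10 and one at tick 12, a call at tick 10 wakes
both of the first two. The single-wake variant would delay the second until
tick 11.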
+
+---
+Way to test this: (from Godmar Back)
+
+Thread A with high priority spins until 'ticks' changes, then calls to
+timer_sleep(X), Thread B with lower priority is then resumed, calls
+set_priority to make its priority equal to that of thread A, then
+calls timer_sleep(X), all of that before the next clock interrupt
+arrives.
+
+On wakeup, each thread records wake-up time and calls yield
+immediately, forcing the scheduler to switch to the other
+equal-priority thread. Both wake-up times must be the same (and match
+the planned wake-up time.)
+
+PS:
+I actually tested it and it's hard to pass with the current ips setting.
+The bounds on how quickly a thread would need to be able to return after
+sleep appear too tight. Need another idea.
+
+---
+From: "Waqar Mohsin"
+Subject: 3 questions about switch_threads() in switch.S
+To: blp@cs.stanford.edu, joshwise@stanford.edu
+Date: Fri, 3 Mar 2006 17:09:21 -0800
+
+QUESTION 1
+
+In the section
+
+        # Save current stack pointer to old thread's stack, if any.
+        movl SWITCH_CUR(%esp), %eax
+        test %eax, %eax
+        jz 1f
+        movl %esp, (%eax,%edx,1)
+1:
+
+        # Restore stack pointer from new thread's stack.
+        movl SWITCH_NEXT(%esp), %ecx
+        movl (%ecx,%edx,1), %esp
+
+why are we saving the current stack pointer only if the "cur" thread pointer
+is non-NULL ? Isn't it guaranteed to be non-NULL because switch_threads() is
+only called from schedule(), where we have
+
+        struct thread *cur = running_thread ();
+
+which should always be non-NULL (given the way kernel pool is laid out).
+
+QUESTION 2
+
+        # This stack frame must match the one set up by thread_create().
+        pushl %ebx
+        pushl %ebp
+        pushl %esi
+        pushl %edi
+
+I find the comment confusing. thread_create() is a special case: the set of
+registers popped from switch_threads stack frame for a newly created thread
+are all zero, so their order shouldn't dictate the order above.
+
+I think all that matters is that the order of pops at the end of
+switch_threads() is the opposite of the pushes at the beginning (as shown
+above).
+
+QUESTION 3
+
+Is it true that struct switch_threads_frame does NOT strictly require
+
+    struct thread *cur;         /* 20: switch_threads()'s CUR argument. */
+    struct thread *next;        /* 24: switch_threads()'s NEXT argument. */
+at the end ?
+
+When a newly created thread's stack pointer is installed in switch_threads(),
+all we do is pop the saved registers and return to switch_entry() which pops
+off and discards the above two simulated (and not used) arguments to
+switch_threads().
+
+If we remove these two from struct switch_threads_frame and don't do a
+
+        # Discard switch_threads() arguments.
+        addl $8, %esp
+in switch_entry(), things should still work. Am I right ?
+
+Thanks
+Waqar
+
+From: "Godmar Back"
+Subject: thread_yield in irq handler
+To: "Ben Pfaff"
+Date: Wed, 22 Feb 2006 22:18:50 -0500
+
+Ben,
+
+you write in your Tour of Pintos:
+
+"Second, an interrupt handler must not call any function that can
+sleep, which rules out thread_yield(), lock_acquire(), and many
+others. This is because external interrupts use space on the stack of
+the kernel thread that was running at the time the interrupt occurred.
+If the interrupt handler tried to sleep and that thread resumed, then
+the two uses of the single stack would interfere, which cannot be
+allowed."
+
+Is the last sentence really true?
+
+I thought the reason that you couldn't sleep is that you would put
+effectively a random thread/process to sleep, but I don't think it
+would cause problems with the kernel stack. After all, it doesn't
+cause this problem if you call thread_yield at the end of
+intr_handler(), so why would it cause this problem earlier.
+
+As for thread_yield(), my understanding is that the reason it's called
+at the end is to ensure it's done after the interrupt is acknowledged,
+which you can't do until the end because Pintos doesn't handle nested
+interrupts.
 
-  - Move `join' implementation here, from `threads' project, to help
-    normalize the project difficulties.
+ - Godmar
 
-  - The semantics of the join system call should change so that it
-    only returns the exit code once.
+From: "Godmar Back"
+
+For reasons I don't currently understand, some of our students seem
+hesitant to include each thread in a second "all-threads" list and are
+looking for ways to implement the advanced scheduler without one.
+
+Currently, I believe, all tests for the mlfqs are such that all
+threads are either ready or sleeping in timer_sleep(). This allows for
+an incorrect implementation in which recent-cpu and priorities are
+updated only for those threads that are on the alarm list or the ready
+list.
+
+The todo item would be a test where a thread is blocked on a
+semaphore, lock or condition variable and have its recent_cpu decay to
+zero, and check that it's scheduled right after the unlock/up/signal.
+
+From: "Godmar Back"
+Subject: set_priority & donation - a TODO item
+To: "Ben Pfaff"
+Date: Mon, 20 Feb 2006 22:20:26 -0500
+
+Ben,
+
+it seems that there are currently no tests that check the proper
+behavior of thread_set_priority() when called by a thread that is
+running under priority donation. The proper behavior, I assume, is to
+temporarily drop the donation if the set priority is higher, and to
+reassume the donation should the thread subsequently set its own
+priority again to a level that's lower than a still active donation.
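Editorial sketch, not part of the patch: one reading of the behavior assumed
in the paragraph above is that a thread's effective priority is the maximum
of its own (base) priority and any still-active donation, and that setting
the priority changes only the base value. The C fragment below models that
reading. The struct toy_thread type and both functions are hypothetical and
are not the Pintos thread API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DONATIONS 8         /* Arbitrary bound for this sketch. */

/* Hypothetical toy model; not the Pintos struct thread. */
struct toy_thread
  {
    int base_priority;              /* Value last set by the thread itself. */
    int donated[MAX_DONATIONS];     /* Priorities currently donated to it. */
    size_t n_donated;               /* Number of active donations. */
  };

/* Effective priority: the larger of the base priority and any
   still-active donation. */
static int
toy_effective_priority (const struct toy_thread *t)
{
  int best = t->base_priority;
  for (size_t i = 0; i < t->n_donated; i++)
    if (t->donated[i] > best)
      best = t->donated[i];
  return best;
}

/* Setting the priority changes only the base value.  Donations stay
   recorded, so they take over again if the base drops below them. */
static void
toy_set_priority (struct toy_thread *t, int new_priority)
{
  t->base_priority = new_priority;
}
```

Under this model, a thread with base priority 10 and a donation of 30 runs
at 30. Raising its own priority to 40 makes it run at 40, and lowering it to
5 brings the donation of 30 back into effect until the donation is released.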
+
+ - Godmar
+
+From: Godmar Back
+Subject: project 4 question/comment regarding caching inode data
+To: Ben Pfaff
+Date: Sat, 14 Jan 2006 15:59:33 -0500
+
+Ben,
+
+in section 6.3.3 in the P4 FAQ, you write:
+
+"You can store a pointer to inode data in struct inode, if you want,"
+
+Should you point out that if they indeed do that, they likely wouldn't
+be able to support more than 64 open inodes systemwide at any given
+point in time.
+
+(This seems like a rather strong limitation; do your current tests
+open more than 64 files?
+It would also point to an obvious way to make the projects harder by
+specifically disallowing that inode data be locked in memory during
+the entire time an inode is kept open.)
+
+ - Godmar
+
+From: Godmar Back
+Subject: on caching in project 4
+To: Ben Pfaff
+Date: Mon, 9 Jan 2006 20:58:01 -0500
+
+here's an idea for future semesters.
+
+I'm in the middle of project 4, I've started by implementing a buffer
+cache and plugging it into the existing filesystem. Along the way I
+was wondering how we could test the cache.
+
+Maybe one could adopt a similar testing strategy as in project 1 for
+the MLFQS scheduler: add a function that reads "get_cache_accesses()"
+and a function "get_cache_hits()". Then create a version of pintos
+that creates access traces for a to-be-determined workload. Run an
+off-line analysis that would determine how many hits a perfect cache
+would have (MAX), and how much say an LRU strategy would give (MIN).
+Then add a fudge factor to account for different index strategies and
+test that the reported number of cache hits/accesses is within (MIN,
+MAX) +/- fudge factor.
+
+(As an aside - I am curious why you chose to use a clock-style
+algorithm rather than the more straightforward LRU for your buffer
+cache implementation in your sample solution. Is there a reason for
+that?
+I was curious to see if it made a difference, so I implemented
+LRU for your cache implementation and ran the test workload of project
+4 and printed cache hits/accesses.
+I found that for that workload, the clock-based algorithm performs
+almost identical to LRU (within about 1%, but I ran nondeterministically
+with QEMU). I then reduced the cache size to 32 blocks and found again
+the same performance, which raises the suspicion that the test
+workload might not force any cache replacement, so the eviction
+strategy doesn't matter.)
+
+Godmar Back writes:
+
+> in your sample solution to P4, dir_reopen does not take any locks when
+> changing a directory's open_cnt. This looks like a race condition to
+> me, considering that dir_reopen is called from execute_process without
+> any filesystem locks held.
+
+* Get rid of rox--causes more trouble than it's worth
+
+* Reconsider command line arg style--confuses everyone.
+
+* Finish writing tour.
+
+* Introduce a "yield" system call to speed up the syn-* tests.
+
+via Godmar Back:
+
+* Project 3 solution needs FS lock.
+
+* Get rid of mmap syscall, add sbrk.
+
+* Make backtrace program accept multiple object file arguments,
+  e.g. add -u option to allow backtracing user program also.
+
+* page-linear, page-shuffle VM tests do not use enough memory to force
+  eviction. Should increase memory consumption.
+
+* Add FS persistence test(s).
+
+* lock_acquire(), lock_release() don't need additional intr_dis/enable
+  calls, because the semaphore protects lock->holder.
+  [ Think this over: is this really true when priority donation is
+    implemented? intr_dis/enable prevents the race with thread_set_priority.
+    Leaving it there could help the students getting the correct synchronization
+    right.
+  ]
+
+
+
+* process_death test needs improvement
+
+* Internal tests.
+
+* Improve automatic interpretation of exception messages.
+
+* Userprog project:
 
   - Mark read-only pages as actually read-only in the page table.
     Or, since this was consistently rated as the easiest project by the
@@ -22,9 +348,14 @@
     Alternately we could just remove the synchronization on pid
     selection and check that students fix it.
 
-* Documentation:
+* Filesys project:
+
+  - Need a better way to measure performance improvement of buffer
+    cache. Some students reported that their system was slower with
+    cache--likely, Bochs doesn't simulate a disk with a realistic
+    speed.
 
-  - Finish writing tour.
+* Documentation:
 
   - Add "Digging Deeper" sections that describe the nitty-gritty x86
     details for the benefit of those interested.
@@ -38,23 +369,10 @@
 
     . Low-level x86 stuff, like paged page tables.
 
-    . Other good ideas.
-
-  - mmap/munmap should use segment IDs like Nachos. Too hard
-    otherwise.
-
-  - Add src/testcases/vm, src/testcases/filesys and make it clear to use
-    them?
+    . Specifics on how to implement sbrk, malloc.
 
-* Tests:
-
-  - Release some of them.
-
-  - The threads, userprog, vm test source files could use
-    factorization and cleanup along the lines of fslib in the filesys
-    tests.
+    . Other good ideas.
 
-  - The p1-4.c testcase needs significant tuning. Currently it takes
-    too long (especially when SHOW_PROGRESS is turned on) and doesn't
-    show significant improvement.
+    . opendir/readdir/closedir
+    . everything needed for getcwd()