pintos-os.org Git - pintos-anon/blob - TODO

   1 -*- text -*-
   2
   3 Godmar says:
   4
   5 - In Project 2, we're missing tests that pass arguments to system calls
   6 that span multiple pages, where some are mapped and some are not.
   7 An implementation that only checks the first page, rather than all pages
   8 that can be touched during a call to read()/write() passes all tests.
   9
  10 - In Project 2, we're missing a test that would fail if they assumed
  11 that contiguous user-virtual addresses are laid out contiguously
  12 in memory.  The loading code should ensure that non-contiguous
  13 physical pages are allocated for the data segment (at least.)
  14
  15 - Need some tests that test that illegal accesses lead to process
  16 termination. I have written some, will add them. In P2, obviously,
  17 this would require that the students break this functionality since
  18 the page directory is initialized for them, still it would be good
  19 to have.
  20
  21 - There does not appear to be a test that checks that they close all
  22 fd's on exit.  Idea: add statistics & self-diagnostics code to palloc.c
  23 and malloc.c.  Self-diagnostics code could be used for debugging.
  24 The statistics code would report how much kernel memory is free.
  25 Add a system call "get_kernel_memory_information".  User programs
  26 could engage in a variety of activities and notice leaks by checking
  27 the kernel memory statistics.
  28
  29 ---
  30
  31 From: "Godmar Back" <godmar@gmail.com>
  32 Subject: priority donation tests
  33 To: "Ben Pfaff" <blp@cs.stanford.edu>
  34 Date: Fri, 3 Mar 2006 11:02:08 -0500
  35
  36 Ben,
  37
  38 it seems the priority donation tests are somewhat incomplete and allow
  39 incorrect implementations to pass with a perfect score.
  40
  41 We are seeing the following wrong implementations pass all tests:
  42
  43 - Implementations that assume that the priority of a thread waiting on
  44 a semaphore or condition variable cannot change between when the
  45 thread was blocked and when it is unblocked. The students implement
  46 this by doing an insert into an ordered list on block, rather than
  47 picking the maximum thread on unblock.
  48
  49 Neither of these two cases is detected; do you currently check for
  50 these mistakes manually?
  51
  52 I also wrote a test case for the second scenario:
  53 http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.c
  54 http://people.cs.vt.edu/~gback/pintos/priority-donate-sema.ck
  55
  56 From: "Godmar Back" <godmar@gmail.com>
  57 Subject: multiple threads waking up at same clock tick
  58 To: "Ben Pfaff" <blp@cs.stanford.edu>
  59 Date: Wed, 1 Mar 2006 08:14:47 -0500
  60
  61 Greg Benson points out another potential TODO item for P1.
  62
  63 ----
  64 One thing I recall:
  65
  66 The alarm tests do not test to see if multiple threads are woken up if
  67 their timers have expired.  That is, students can write a solution
  68 that just wakes up the first thread on the sleep queue rather than
  69 check for additional threads.  Of course, the next thread will be
  70 woken up on the next tick.  Also, this might be hard to test.
  71
  72 ---
  73 Way to test this: (from Godmar Back)
  74
  75 Thread A with high priority spins until 'ticks' changes, then calls to
  76 timer_sleep(X), Thread B with lower priority is then resumed, calls
  77 set_priority to make its priority equal to that of thread A, then
  78 calls timer_sleep(X), all of that before the next clock interrupt
  79 arrives.
  80
  81 On wakeup, each thread records wake-up time and calls yield
  82 immediately, forcing the scheduler to switch to the other
  83 equal-priority thread. Both wake-up times must be the same (and match
  84 the planned wake-up time.)
  85
  86 PS:
  87 I actually tested it and it's hard to pass with the current ips setting.
  88 The bounds on how quickly a thread would need to be able to return after
  89 sleep appear too tight.  Need another idea.
  90
  91 From: "Godmar Back" <godmar@gmail.com>
  92
  93 For reasons I don't currently understand, some of our students seem
  94 hesitant to include each thread in a second "all-threads" list and are
  95 looking for ways to implement the advanced scheduler without one.
  96
  97 Currently, I believe, all tests for the mlfqs are such that all
  98 threads are either ready or sleeping in timer_sleep(). This allows for
  99 an incorrect implementation in which recent-cpu and priorities are
 100 updated only for those threads that are on the alarm list or the ready
 101 list.
 102
 103 The todo item would be a test where a thread is blocked on a
 104 semaphore, lock or condition variable and have its recent_cpu decay to
 105 zero, and check that it's scheduled right after the unlock/up/signal.
 106
 107 From: "Godmar Back" <godmar@gmail.com>
 108 Subject: set_priority & donation - a TODO item
 109 To: "Ben Pfaff" <blp@cs.stanford.edu>
 110 Date: Mon, 20 Feb 2006 22:20:26 -0500
 111
 112 Ben,
 113
 114 it seems that there are currently no tests that check the proper
 115 behavior of thread_set_priority() when called by a thread that is
 116 running under priority donation.  The proper behavior, I assume, is to
 117 temporarily drop the donation if the set priority is higher, and to
 118 reassume the donation should the thread subsequently set its own
 119 priority again to a level that's lower than a still active donation.
 120
 121  - Godmar
 122
 123 From: Godmar Back <godmar@gmail.com>
 124 Subject: project 4 question/comment regarding caching inode data
 125 To: Ben Pfaff <blp@cs.stanford.edu>
 126 Date: Sat, 14 Jan 2006 15:59:33 -0500
 127
 128 Ben,
 129
 130 in section 6.3.3 in the P4 FAQ, you write:
 131
 132 "You can store a pointer to inode data in struct inode, if you want,"
 133
 134 Should you point out that if they indeed do that, they likely wouldn't
 135 be able to support more than 64 open inodes systemwide at any given
 136 point in time.
 137
 138 (This seems like a rather strong limitation; do your current tests
 139 open more than 64 files?
 140 It would also point to an obvious way to make the projects harder by
 141 specifically disallowing that inode data be locked in memory during
 142 the entire time an inode is kept open.)
 143
 144  - Godmar
 145
 146 From: Godmar Back <godmar@gmail.com>
 147 Subject: on caching in project 4
 148 To: Ben Pfaff <blp@cs.stanford.edu>
 149 Date: Mon, 9 Jan 2006 20:58:01 -0500
 150
 151 here's an idea for future semesters.
 152
 153 I'm in the middle of project 4, I've started by implementing a buffer
 154 cache and plugging it into the existing filesystem.  Along the way I
 155 was wondering how we could test the cache.
 156
 157 Maybe one could adopt a similar testing strategy as in project 1 for
 158 the MLQFS scheduler: add a function that reads "get_cache_accesses()"
 159 and a function "get_cache_hits()".  Then create a version of pintos
 160 that creates access traces for a to-be-determined workload.  Run an
 161 off-line analysis that would determine how many hits a perfect cache
 162 would have (MAX), and how much say an LRU strategy would give (MIN).
 163 Then add a fudge factor to account for different index strategies and
 164 test that the reported number of cache hits/accesses is within (MIN,
 165 MAX) +/- fudge factor.
 166
 167 (As an aside - I am curious why you chose to use a clock-style
 168 algorithm rather than the more straightforward LRU for your buffer
 169 cache implementation in your sample solution. Is there a reason for
 170 that?  I was curious to see if it made a difference, so I implemented
 171 LRU for your cache implementation and ran the test workload of project
 172 4 and printed cache hits/accesses.
 173 I found that for that workload, the clock-based algorithm performs
 174 almost identical to LRU (within about 1%, but I ran nondeterministally
 175 with QEMU). I then reduced the cache size to 32 blocks and found again
 176 the same performance, which raises the suspicion that the test
 177 workload might not force any cache replacement, so the eviction
 178 strategy doesn't matter.)
 179
 180 Godmar Back <godmar@gmail.com> writes:
 181
 182 > in your sample solution to P4, dir_reopen does not take any locks when
 183 > changing a directory's open_cnt. This looks like a race condition to
 184 > me, considering that dir_reopen is called from execute_process without
 185 > any filesystem locks held.
 186
 187 * Get rid of rox--causes more trouble than it's worth
 188
 189 * Reconsider command line arg style--confuses everyone.
 190
 191 * Finish writing tour.
 192
 193 via Godmar Back:
 194
 195 * Get rid of mmap syscall, add sbrk.
 196
 197 * page-linear, page-shuffle VM tests do not use enough memory to force
 198   eviction.  Should increase memory consumption.
 199
 200 * Add FS persistence test(s).
 201
 202 * process_death test needs improvement
 203
 204 * Internal tests.
 205
 206 * Improve automatic interpretation of exception messages.
 207
 208 * Userprog project:
 209
 210   - Mark read-only pages as actually read-only in the page table.  Or,
 211     since this was consistently rated as the easiest project by the
 212     students, require them to do it.
 213
 214   - Don't provide per-process pagedir implementation but only
 215     single-process implementation and require students to implement
 216     the separation?  This project was rated as the easiest after all.
 217     Alternately we could just remove the synchronization on pid
 218     selection and check that students fix it.
 219
 220 * Filesys project:
 221
 222   - Need a better way to measure performance improvement of buffer
 223     cache.  Some students reported that their system was slower with
 224     cache--likely, Bochs doesn't simulate a disk with a realistic
 225     speed.
 226
 227 * Documentation:
 228
 229   - Add "Digging Deeper" sections that describe the nitty-gritty x86
 230     details for the benefit of those interested.
 231
 232   - Add explanations of what "real" OSes do to give students some
 233     perspective.
 234
 235 * Assignments:
 236
 237   - Add extra credit:
 238
 239     . Low-level x86 stuff, like paged page tables.
 240
 241     . Specifics on how to implement sbrk, malloc.
 242
 243     . Other good ideas.
 244
 245     . opendir/readdir/closedir
 246
 247     . everything needed for getcwd()
 248
 249 To add partition support:
 250
 251 - Find four partition types that are more or less unused and choose to
 252   use them for Pintos.  (This is implemented.)
 253
 254 - Bootloader reads partition tables of all BIOS devices to find the
 255   first that has the "Pintos kernel" partition type.  (This is
 256   implemented.)  Ideally the bootloader would make sure there is
 257   exactly one such partition, but I didn't implement that yet.
 258
 259 - Bootloader reads kernel into memory at 1 MB using BIOS calls.  (This
 260   is implemented.)
 261
 262 - Kernel arguments have to go into a separate sector because the
 263   bootloader is otherwise too big to fit now?  (I don't recall if I
 264   did anything about this.)
 265
 266 - Kernel at boot also scans partition tables of all the disks it can
 267   find to find the ones with the four Pintos partition types (perhaps
 268   not all exist).  After that, it makes them available to the rest of
 269   the kernel (and doesn't allow access to other devices, for safety).
 270
 271 - "pintos" and "pintos-mkdisk" need to write a partition table to the
 272   disks that they create.  "pintos-mkdisk" will need to take a new
 273   parameter specifying the type.  (I might have partially implemented
 274   this, don't remember.)
 275
 276 - "pintos" should insist on finding a partition header on disks handed
 277   to it, for safety.
 278
 279 - Need some way for "pintos" to assemble multiple disks or partitions
 280   into a single image that can be copied directly to a USB block
 281   device.  (I don't know whether I came up with a good solution yet or
 282   not, or whether I implemented any of it.)
 283
 284 To add USB support:
 285
 286 - Needs to be able to scan PCI bus for UHCI controller.  (I
 287   implemented this partially.)
 288
 289 - May want to be able to initialize USB controllers over CardBus
 290   bridges.  I don't know whether this requires additional work or if
 291   it's useful enough to warrant extra work.  (It's of special interest
 292   for me because I have a laptop that only has USB via CardBus.)
 293
 294 - There are many protocol layers involved: SCSI over USB-Mass Storage
 295   over USB over UHCI over PCI.  (I may be forgetting one.)  I don't
 296   know yet whether it's best to separate the layers or to merge (some
 297   of) them.  I think that a simple and clean organization should be a
 298   priority.
 299
 300 - VMware can likely be used for testing because it can expose host USB
 301   devices as guest USB devices.  This is safer and more convenient
 302   than using real hardware for testing.
 303
 304 - Should test with a variety of USB keychain devices because there
 305   seems to be wide variation among them, especially in the SCSI
 306   protocols they support.  Should try to use a "lowest-common
 307   denominator" SCSI protocol if any such thing really exists.
 308
 309 - Might want to add a feature whereby kernel arguments can be given
 310   interactively, rather than passed on-disk.  Needs some though.