pintos-os.org Git - pintos-anon/blob - doc/threads.texi

   1 @node Project 1--Threads, Project 2--User Programs, Introduction, Top
   2 @chapter Project 1: Threads
   3
   4 In this assignment, we give you a minimally functional thread system.
   5 Your job is to extend the functionality of this system to gain a
   6 better understanding of synchronization problems. Additionally, you
   7 will use at least part of this increased functionality in future
   8 assignments.
   9
  10 You will be working in primarily in the @file{threads} directory for
  11 this assignment, with some work in the @file{devices} directory on the
  12 side.  Compilation should be done in the @file{threads} directory.
  13
  14 @menu
  15 * Understanding Threads::
  16 * Debugging versus Testing::
  17 * Tips::
  18 * Problem 1-1 Alarm Clock::
  19 * Problem 1-2 Join::
  20 * Problem 1-3 Priority Scheduling::
  21 * Problem 1-4 Advanced Scheduler::
  22 * Threads FAQ::
  23 * Multilevel Feedback Scheduling::
  24 @end menu
  25
  26 @node Understanding Threads
  27 @section Understanding Threads
  28
  29 The first step is to read and understand the initial thread system.
  30 Pintos, by default, implements thread creation and thread completion,
  31 a simple scheduler to switch between threads, and synchronization
  32 primitives (semaphores, locks, and condition variables).
  33 @c FIXME: base system doesn't do anything. Debugger sucks.
  34 However, there's a lot of magic going on in some of this code, so you
  35 should compile and run the base system to see a simple test of the
  36 code.  You should read through the source code by hand to see what's
  37 going on.  If you like, you can add calls to @code{printf()} almost
  38 anywhere, then recompile and run to see what happens and in what
  39 order.
  40
  41 When a thread is created, you are creating a new context to be
  42 scheduled. You provide a function to be run in this context as an
  43 argument to @code{thread_create()}. The first time the thread is
  44 scheduled and runs, it will start from the beginning of that function
  45 and execute it in the context. When that function returns, that thread
  46 completes. Each thread, therefore, acts like a mini-program running
  47 inside Pintos, with the function passed to @code{thread_create()}
  48 acting like @code{main()}.
  49
  50 At any given time, Pintos is running exactly one thread, with the
  51 others switched out.  The scheduler decides which thread to run next
  52 when it needs to switch between them.  (If no thread is ready to run
  53 at any given time, then the special ``idle'' thread runs.)  The
  54 synchronization primitives are used to force context switches when one
  55 thread needs to wait for another thread to do something.
  56
  57 The exact mechanics of a context switch are pretty gruesome and have
  58 been provided for you in @file{threads/switch.S} (this is 80@var{x}86
  59 assembly; don't worry about understanding it).  It involves saving the
  60 state of the currently running thread and restoring the state of the
  61 thread we're switching to.
  62 @c FIXME
  63 Slowly trace through a context switch to see what happens.  Be sure to
  64 keep track of each thread's address and state, and what procedures are
  65 on the call stack for each thread. You will notice that when one
  66 thread calls @code{switch_threads()}, another thread starts running,
  67 and the first thing the new thread does is to return from
  68 @code{switch_threads()}. We realize this comment will seem cryptic to
  69 you at this point, but you will understand threads once you understand
  70 why the @code{switch_threads()} that gets called is different from the
  71 @code{switch_threads()} that returns.
  72
  73 @strong{Warning}: In Pintos, each thread is assigned a small,
  74 fixed-size execution stack just under @w{4 kB} in size.  The kernel
  75 does try to detect stack overflow, but it cannot always succeed.  You
  76 ma cause bizarre problems, such as mysterious kernel panics, if you
  77 declare large data structures as non-static local variables,
  78 e.g. @samp{int buf[1000];}.  Alternatives to stack allocation include
  79 the page allocator in @file{threads/palloc.c} and the block allocator
  80 in @file{threads/malloc.c}.  Note that the page allocator doles out
  81 @w{4 kB} chunks and that @code{malloc()} has a @w{2 kB} block size
  82 limit.  If you need larger chunks, consider using a linked structure
  83 instead.
  84
  85 @node Debugging versus Testing
  86 @section Debugging versus Testing
  87
  88 When you're debugging code, it's useful to be able to be able to run a
  89 program twice and have it do exactly the same thing.  On second and
  90 later runs, you can make new observations without having to discard or
  91 verify your old observations.  This property is called
  92 ``reproducibility.''  The simulator we use, Bochs, can be set up for
  93 reproducibility.  If you use the Bochs configuration files we provide,
  94 which specify @samp{ips: @var{n}} where @var{n} is a number of
  95 simulated instructions per second, your simulations can be
  96 reproducible.
  97
  98 Of course, a simulation can only be reproducible from one run to the
  99 next if its input is the same each time.  For simulating an entire
 100 computer, as we do, this means that every part of the computer must be
 101 the same.  For example, you must use the same disks, the same version
 102 of Bochs, and you must not hit any keys on the keyboard (because you
 103 could not be sure to hit them at exactly the same point each time)
 104 during the runs.
 105
 106 While reproducibility is useful for debugging, it is a problem for
 107 testing thread synchronization, an important part of this project.  In
 108 particular, when Bochs is set up for reproducibility, timer interrupts
 109 will come at perfectly reproducible points, and therefore so will
 110 thread switches.  That means that running the same test several times
 111 doesn't give you any greater confidence in your code's correctness
 112 than does running it only once.
 113
 114 So, to make your code easier to test, we've added a feature to Bochs
 115 that makes timer interrupts come at random intervals, but in a
 116 perfectly predictable way.  In particular, if you put a line
 117 @samp{ips-jitter: @var{seed}}, where @var{seed} is an integer, into
 118 your Bochs configuration file, then timer interrupts will come at
 119 irregularly spaced intervals.  Within a single @var{seed} value,
 120 execution will still be reproducible, but timer behavior will change
 121 as @var{seed} is varied.  Thus, for the highest degree of confidence
 122 you should test your code with many seed values.
 123
 124 @node Tips
 125 @section Tips
 126
 127 There should be no busy-waiting in any of your solutions to this
 128 assignment.  Furthermore, resist the temptation to directly disable
 129 interrupts in your solution by calling @code{intr_disable()} or
 130 @code{intr_set_level()}, although you may find doing so to be useful
 131 while debugging.  Instead, use semaphores, locks and condition
 132 variables to solve synchronization problems.  Hint: read the comments
 133 in @file{threads/synch.h} if you're unsure what synchronization
 134 primitives may be used in what situations.
 135
 136 Given some designs of some problems, there may be one or two instances
 137 in which it is appropriate to directly change the interrupt levels
 138 instead of relying on the given synchroniztion primitives.  This must
 139 be justified in your @file{DESIGNDOC} file.  If you're not sure you're
 140 justified, ask!
 141
 142 While all parts of this assignment are required if you intend to earn
 143 full credit on this project, keep in mind that Problem 2 (Join) will
 144 be needed for future assignments, so you'll want to get this one
 145 right.  We don't give out solutions, so you're stuck with your Join
 146 code for the whole quarter.  Problem 1 (Alarm Clock) could be very
 147 handy, but not strictly required in the future.  The upshot of all
 148 this is that you should focus heavily on making sure that your
 149 implementation of Join works correctly, since if it's broken, you will
 150 need to fix it for future assignments.  The other parts can be turned
 151 off in the future if you find you can't make them work quite right.
 152
 153 Also keep in mind that Problem 4 (the MLFQS) builds on the features you
 154 implement in Problem 3, so to avoid unnecessary code duplication, it
 155 would be a good idea to divide up the work among your team members
 156 such that you have Problem 3 fully working before you begin to tackle
 157 Problem 4.
 158
 159 @node Problem 1-1 Alarm Clock
 160 @section Problem 1-2: Alarm Clock
 161
 162 Improve the implementation of the timer device defined in
 163 @file{devices/timer.c} by reimplementing @code{timer_sleep()}.
 164 Threads call @code{timer_sleep(@var{x})} to suspend execution until
 165 time has advanced by at least @w{@var{x} timer ticks}.  This is
 166 useful for threads that operate in real-time, for example, for
 167 blinking the cursor once per second.  There is no requirement that
 168 threads start running immediately after waking up; just put them on
 169 the ready queue after they have waited for approximately the right
 170 amount of time.
 171
 172 A working implementation of this function is provided.  However, the
 173 version provided is poor, because it ``busy waits,'' that is, it spins
 174 in a tight loop checking the current time until the current time has
 175 advanced far enough.  This is undesirable because it wastes time that
 176 could potentially be used more profitably by another thread.  Your
 177 solution should not busy wait.
 178
 179 The argument to @code{timer_sleep()} is expressed in timer ticks, not
 180 in milliseconds or some other unit.
 181
 182 @node Problem 1-2 Join
 183 @section Problem 1-2: Join
 184
 185 Implement @code{thread_join(struct thread *)} in
 186 @file{threads/thread.c}.  There is already a prototype for it in
 187 @file{threads/thread.h}, which you should not change.  This function
 188 causes the currently running thread to block until thread passed as an
 189 argument exits.  If A is the running thread and B is the argument,
 190 then we say that ``A joins B'' in this case.
 191
 192 The model for @code{thread_join()} is the @command{wait} system call
 193 in Unix-like systems.  (Try reading the manpages.)  That system call
 194 can only be used by a parent process to wait for a child's death.  You
 195 should implement @code{thread_join()} to have the same restriction.
 196 That is, a thread may only join on its immediate children.
 197
 198 A thread need not ever be joined.  Your solution should properly free
 199 all of a thread's resources, including its @code{struct thread},
 200 whether it is ever joined or not, and regardless of whether the child
 201 exits before or after its parent.  That is, a thread should be freed
 202 exactly once in all cases.
 203
 204 Joining a given thread is idempotent.  That is, joining a thread T
 205 multiple times is equivalent to joining it once, because T has already
 206 exited at the time of the later joins.  Thus, joins on T after the
 207 first should return immediately.
 208
 209 The behavior of calling @code{thread_join()} on an thread that is not
 210 the caller's child is undefined.  You need not handle this case
 211 gracefully.
 212
 213 Consider all the ways a join can occur: nested joins (A joins B when B
 214 is joined on C), multiple joins (A joins B, then A joins C), and so
 215 on.  Does your join work if @code{thread_join()} is called on a thread
 216 that has not yet been scheduled for the first time?  You should handle
 217 all of these cases.  Write test code that demonstrates the cases your
 218 join works for.  Don't overdo the output volume, please!
 219
 220 Be careful to program this function correctly.  You will need its
 221 functionality for project 2.
 222
 223 @node Problem 1-3 Priority Scheduling
 224 @section Problem 1-3 Priority Scheduling
 225
 226 Implement priority scheduling in Pintos.  Priority
 227 scheduling is a key building block for real-time systems.  Implement functions
 228 @code{thread_set_priority()} to set the priority of a thread and
 229 @code{thread_get_priority()} to get the priority of a thread.  There
 230 are already prototypes for these functions in @file{threads/thread.h},
 231 which you should not change.
 232
 233 When a thread is added to the ready list that has a higher priority
 234 than the currently running thread, the current thread should
 235 immediately yield the processor to the new thread.  Similarly, when
 236 threads are waiting for a lock, semaphore or condition variable, the
 237 highest priority waiting thread should be woken up first.  A thread's
 238 priority may be set at any time, including while the thread is waiting
 239 on a lock, semaphore, or condition variable.
 240
 241 One issue with priority scheduling is ``priority inversion'': if a
 242 high priority thread needs to wait for a low priority thread (for
 243 instance, for a lock held by a low priority thread, or in
 244 @code{thread_join()} for a thread to complete), and a middle priority
 245 thread is on the ready list, then the high priority thread will never
 246 get the CPU because the low priority thread will not get any CPU time.
 247 A partial fix for this problem is to have the waiting thread
 248 ``donate'' its priority to the low priority thread while it is holding
 249 the lock, then recall the donation once it has acquired the lock.
 250 Implement this fix.
 251
 252 You will need to account for all different orders that priority
 253 donation and inversion can occur.  Be sure to handle multiple
 254 donations, in which multiple priorities are donated to a thread.  You
 255 must also handle nested donation: given high, medium, and low priority
 256 threads H, M, and L, respectively, and supposing H is waiting on a
 257 lock that M holds and M is waiting on a lock that L holds, both M and
 258 L should be boosted to H's priority.
 259
 260 You only need to implement priority donation when a thread is waiting
 261 for a lock held by a lower-priority thread. You do not need to
 262 implement this fix for semaphores, condition variables or joins.
 263 However, you do need to implement priority scheduling in all cases.
 264
 265 @node Problem 1-4 Advanced Scheduler
 266 @section Problem 1-4 Advanced Scheduler
 267
 268 Implement Solaris's multilevel feedback queue scheduler (MLFQS) to
 269 reduce the average response time for running jobs on your system.
 270 @xref{Multilevel Feedback Scheduling}, for a detailed description of
 271 the MLFQS requirements.
 272
 273 Demonstrate that your scheduling algorithm reduces response time
 274 relative to the original Pintos scheduling algorithm (round robin) for
 275 at least one workload of your own design (i.e. in addition to the
 276 provided test).
 277
 278 You may assume a static priority for this problem. It is not necessary
 279 to ``re-donate'' a thread's priority if it changes (although you are
 280 free to do so).
 281
 282 @node Threads FAQ, Multilevel Feedback Scheduling, Problem 1-4 Advanced Scheduler, Project 1--Threads
 283 @section FAQ
 284
 285 @enumerate 1
 286 @item General FAQs
 287
 288 @enumerate 1
 289 @item
 290 @b{I am adding a new @file{.h} or @file{.c} file.  How do I fix the
 291 @file{Makefile}s?}@anchor{Adding c or h Files}
 292
 293 To add a @file{.c} file, edit the top-level @file{Makefile.build}.
 294 You'll want to add your file to variable @samp{@var{dir}_SRC}, where
 295 @var{dir} is the directory where you added the file.  For this
 296 project, that means you should add it to @code{threads_SRC}, or
 297 possibly @code{devices_SRC} if you put in the @file{devices}
 298 directory.  Then run @code{make}.  If your new file doesn't get
 299 compiled, run @code{make clean} and then try again.
 300
 301 When you modify the top-level @file{Makefile.build}, the modified
 302 version should be automatically copied to
 303 @file{threads/build/Makefile} when you re-run make.  The opposite is
 304 not true, so any changes will be lost the next time you run @code{make
 305 clean} from the @file{threads} directory.  Therefore, you should
 306 prefer to edit @file{Makefile.build} (unless your changes are meant to
 307 be truly temporary).
 308
 309 There is no need to edit the @file{Makefile}s to add a @file{.h} file.
 310
 311 @item
 312 @b{If a thread finishes, should its children be terminated immediately,
 313 or should they finish normally?}
 314
 315 You should feel free to decide what semantics you think this
 316 should have. You need only provide justification for your
 317 decision.
 318
 319 @item
 320 @b{Why can't I disable interrupts?}
 321
 322 Turning off interrupts should only be done for short amounts of time,
 323 or else you end up losing important things such as disk or input
 324 events.  Turning off interrupts also increases the interrupt handling
 325 latency, which can make a machine feel sluggish if taken too far.
 326 Therefore, in general, setting the interrupt level should be used
 327 sparingly.  Also, any synchronization problem can be easily solved by
 328 turning interrupts off, since while interrupts are off, there is no
 329 concurrency, so there's no possibility for race condition.
 330
 331 To make sure you understand concurrency well, we are discouraging you
 332 from taking this shortcut at all in your solution.  If you are unable
 333 to solve a particular synchronization problem with semaphores, locks,
 334 or conditions, or think that they are inadequate for a particular
 335 reason, you may turn to disabling interrupts.  If you want to do this,
 336 we require in your design document a complete justification and
 337 scenario (i.e.@: exact sequence of events) to show why interrupt
 338 manipulation is the best solution.  If you are unsure, the TAs can
 339 help you determine if you are using interrupts too haphazardly.  We
 340 want to emphasize that there are only limited cases where this is
 341 appropriate.
 342
 343 @item
 344 @b{Where might interrupt-level manipuation be appropriate?}
 345
 346 You might find it necessary in some solutions to the Alarm problem.
 347
 348 You might want it at one small point for the priority scheduling
 349 problem.  Note that it is not required to use interrupts for these
 350 problems.  There are other, equally correct solutions that do not
 351 require interrupt manipulation.  However, if you do manipulate
 352 interrupts and @strong{correctly and fully document it} in your design
 353 document, we will allow limited use of interrupt disabling.
 354 @end enumerate
 355
 356 @item Alarm Clock FAQs
 357
 358 @enumerate 1
 359 @item
 360 @b{Why can't I use most synchronization primitives in an interrupt
 361 handler?}
 362
 363 As you've discovered, you cannot sleep in an external interrupt
 364 handler.  Since many lock, semaphore, and condition variable functions
 365 attempt to sleep, you won't be able to call those in
 366 @code{timer_interrupt()}.  You may still use those that never sleep.
 367
 368 Having said that, you need to make sure that global data does not get
 369 updated by multiple threads simultaneously executing
 370 @code{timer_sleep()}.  Here are some pieces of information to think
 371 about:
 372
 373 @enumerate a
 374 @item
 375 Interrupts are turned off while @code{timer_interrupt()} runs.  This
 376 means that @code{timer_interrupt()} will not be interrupted by a
 377 thread running in @code{timer_sleep()}.
 378
 379 @item
 380 A thread in @code{timer_sleep()}, however, can be interrupted by a
 381 call to @code{timer_interrupt()}, except when that thread has turned
 382 off interrupts.
 383
 384 @item
 385 Examples of synchronization mechanisms have been presented in lecture.
 386 Going over these examples should help you understand when each type is
 387 useful or needed.
 388 @end enumerate
 389
 390 @item
 391 @b{What about timer overflow due to the fact that times are defined as
 392 integers? Do I need to check for that?}
 393
 394 Don't worry about the possibility of timer values overflowing.  Timer
 395 values are expressed as signed 63-bit numbers, which at 100 ticks per
 396 second should be good for almost 2,924,712,087 years.
 397 @end enumerate
 398
 399 @item Join FAQs
 400
 401 @enumerate 1
 402 @item
 403 @b{Am I correct to assume that once a thread is deleted, it is no
 404 longer accessible by the parent (i.e.@: the parent can't call
 405 @code{thread_join(child)})?}
 406
 407 A parent joining a child that has completed should be handled
 408 gracefully and should act as a no-op.
 409 @end enumerate
 410
 411 @item Priority Scheduling FAQs
 412
 413 @enumerate 1
 414 @item
 415 @b{Doesn't the priority scheduling lead to starvation? Or do I have to
 416 implement some sort of aging?}
 417
 418
 419 It is true that strict priority scheduling can lead to starvation
 420 because thread may not run if a higher-priority thread is runnable.
 421 In this problem, don't worry about starvation or any sort of aging
 422 technique.  Problem 4 will introduce a mechanism for dynamically
 423 changing thread priorities.
 424
 425 This sort of scheduling is valuable in real-time systems because it
 426 offers the programmer more control over which jobs get processing
 427 time.  High priorities are generally reserved for time-critical
 428 tasks. It's not ``fair,'' but it addresses other concerns not
 429 applicable to a general-purpose operating system.
 430
 431 @item
 432 @b{After a lock has been released, does the program need to switch to
 433 the highest priority thread that needs the lock (assuming that its
 434 priority is higher than that of the current thread)?}
 435
 436 When a lock is released, the highest priority thread waiting for that
 437 lock should be unblocked and put on the ready to run list.  The
 438 scheduler should then run the highest priority thread on the ready
 439 list.
 440
 441 @item
 442 @b{If a thread calls @code{thread_yield()} and then it turns out that
 443 it has higher priority than any other threads, does the high-priority
 444 thread continue running?}
 445
 446 Yes.  If there is a single highest-priority thread, it continues
 447 running until it blocks or finishes, even if it calls
 448 @code{thread_yield()}.
 449
 450 @item
 451 @b{If the highest priority thread is added to the ready to run list it
 452 should start execution immediately.  Is it immediate enough if I
 453 wait until next timer interrupt occurs?}
 454
 455 The highest priority thread should run as soon as it is runnable,
 456 preempting whatever thread is currently running.
 457
 458 @item
 459 @b{What happens to the priority of the donating thread?  Do the priorities
 460 get swapped?}
 461
 462 No.  Priority donation only changes the priority of the low-priority
 463 thread.  The donating thread's priority stays unchanged.  Also note
 464 that priorities aren't additive: if thread A (with priority 5) donates
 465 to thread B (with priority 3), then B's new priority is 5, not 8.
 466
 467 @item
 468 @b{Can a thread's priority be changed while it is sitting on the ready
 469 queue?}
 470
 471 Yes.  Consider this case: low-priority thread L currently has a lock
 472 that high-priority thread H wants.  H donates its priority to L (the
 473 lock holder).  L finishes with the lock, and then loses the CPU and is
 474 moved to the ready queue.  Now L's old priority is restored while it
 475 is in the ready queue.
 476
 477 @item
 478 @b{Can a thread's priority change while it is sitting on the queue of a
 479 semaphore?}
 480
 481 Yes.  Same scenario as above except L gets blocked waiting on a new
 482 lock when H restores its priority.
 483
 484 @item
 485 @b{Why is pubtest3's FIFO test skipping some threads! I know my scheduler
 486 is round-robin'ing them like it's supposed to!  Our output is like this:}
 487
 488 @example
 489 Thread 0 goes.
 490 Thread 2 goes.
 491 Thread 3 goes.
 492 Thread 4 goes.
 493 Thread 0 goes.
 494 Thread 1 goes.
 495 Thread 2 goes.
 496 Thread 3 goes.
 497 Thread 4 goes.
 498 @end example
 499
 500 @noindent @b{which repeats 5 times and then}
 501
 502 @example
 503 Thread 1 goes.
 504 Thread 1 goes.
 505 Thread 1 goes.
 506 Thread 1 goes.
 507 Thread 1 goes.
 508 @end example
 509
 510 This happens because context switches are being invoked by the test
 511 when it explicitly calls @code{thread_yield()}.  However, the time
 512 slice timer is still alive and so, every tick (by default), thread 1
 513 gets switched out (caused by @code{timer_interrupt()} calling
 514 @code{intr_yield_on_return()}) before it gets a chance to run its
 515 mainline.  It is by coincidence that Thread 1 is the one that gets
 516 skipped in our example.  If we use a different jitter value, the same
 517 behavior is seen where a thread gets started and switched out
 518 completely.
 519
 520 Solution: Increase the value of @code{TIME_SLICE} in
 521 @file{devices/timer.c} to a very high value, such as 10000, to see
 522 that the threads will round-robin if they aren't interrupted.
 523
 524 @item
 525 @b{What happens when a thread is added to the ready list which has
 526 higher priority than the currently running thread?}
 527
 528 The correct behavior is to immediately yield the processor.  Your
 529 solution must act this way.
 530
 531 @item
 532 @b{What range of priorities should be supported and what should the
 533 default priority of a thread be?}
 534
 535 Your implementation should support priorities from 0 through 59 and
 536 the default priority of a thread should be 29.
 537 @end enumerate
 538
 539 @item Advanced Scheduler FAQs
 540
 541 @enumerate 1
 542 @item
 543 @b{What is the interval between timer interrupts?}
 544
 545 Timer interrupts occur @code{TIMER_FREQ} times per second.  You can
 546 adjust this value by editing @file{devices/timer.h}.  The default is
 547 100 Hz.
 548
 549 @item
 550 @b{Do I have to modify the dispatch table?}
 551
 552 No, although you are allowed to. It is possible to complete
 553 this problem (i.e.@: demonstrate response time improvement)
 554 without doing so.
 555
 556 @item
 557 @b{When the scheduler changes the priority of a thread, how does this
 558 affect priority donation?}
 559
 560 Short (official) answer: Don't worry about it. Your priority donation
 561 code may assume static priority assignment.
 562
 563 Longer (unofficial) opinion: If you wish to take this into account,
 564 however, your design may end up being ``cleaner.''  You have
 565 considerable freedom in what actually takes place. I believe what
 566 makes the most sense is for scheduler changes to affect the
 567 ``original'' (non-donated) priority.  This change may actually be
 568 masked by the donated priority.  Priority changes should only
 569 propagate with donations, not ``backwards'' from donees to donors.
 570
 571 @item
 572 @b{What is meant by ``static priority''?}
 573
 574 Once thread A has donated its priority to thread B, if thread A's
 575 priority changes (due to the scheduler) while the donation still
 576 exists, you do not have to change thread B's donated priority.
 577 However, you are free to do so.
 578
 579 @item
 580 @b{Do I have to make my dispatch table user-configurable?}
 581
 582 No.  Hard-coding the dispatch table values is fine.
 583 @end enumerate
 584 @end enumerate
 585
 586 @include mlfqs.texi