pintos-os.org Git - pintos-anon/blob - doc/threads.texi

   1 @node Project 1--Threads, Project 2--User Programs, Top, Top
   2 @chapter Project 1: Threads
   3
   4 In this assignment, we give you a minimally functional thread system.
   5 Your job is to extend the functionality of this system to gain a
   6 better understanding of synchronization problems. Additionally, you
   7 will use at least part of this increased functionality in future
   8 assignments.
   9
  10 You will be working in primarily in the @file{threads} directory for
  11 this assignment, with some work in the @file{devices} directory on the
  12 side.  Compilation should be done in the @file{threads} directory.
  13
  14 @menu
  15 * Understanding Threads::
  16 * Debugging versus Testing::
  17 * Tips::
  18 * Problem 1-1 Alarm Clock::
  19 * Problem 1-2 Join::
  20 * Problem 1-3 Priority Scheduling::
  21 * Problem 1-4 Advanced Scheduler::
  22 * Threads FAQ::
  23 @end menu
  24
  25 @node Understanding Threads, Debugging versus Testing, Project 1--Threads, Project 1--Threads
  26 @section Understanding Threads
  27
  28 The first step is to read and understand the initial thread system.
  29 Pintos, by default, implements thread creation and thread completion,
  30 a simple scheduler to switch between threads, and synchronization
  31 primitives (semaphores, locks, and condition variables).
  32 @c FIXME: base system doesn't do anything. Debugger sucks.
  33 However, there's a lot of magic going on in some of this code, so you
  34 should compile and run the base system to see a simple test of the
  35 code.  You should trace the execution using your favorite debugger to
  36 get a sense for what's going on.
  37
  38 When a thread is created, you are creating a new context to be
  39 scheduled. You provide a function to be run in this context as an
  40 argument to @code{thread_create()}. The first time the thread is
  41 scheduled and runs, it will start from the beginning of that function
  42 and execute it in the context. When that function returns, that thread
  43 completes. Each thread, therefore, acts like a mini-program running
  44 inside Pintos, with the function passed to @code{thread_create()}
  45 acting like @code{main()}.
  46
  47 At any given time, Pintos is running exactly one thread, with the
  48 others switched out.  The scheduler decides which thread to run next
  49 when it needs to switch between them.  (If no thread is ready to run
  50 at any given time, then the special ``idle'' thread runs.)  The
  51 synchronization primitives are used to force context switches when one
  52 thread needs to wait for another thread to do something.
  53
  54 The exact mechanics of a context switch are pretty gruesome and have
  55 been provided for you in @file{switch.S} (this is 80@var{x}86
  56 assembly; don't worry about understanding it). It involves saving the
  57 state of the currently running thread and restoring the state of the
  58 thread we're switching to.
  59 @c FIXME
  60 Slowly trace through a context switch to see what happens.  Be sure to
  61 keep track of each thread's address and state, and what procedures are
  62 on the call stack for each thread. You will notice that when one
  63 thread calls @code{switch_threads()}, another thread starts running,
  64 and the first thing the new thread does is to return from
  65 @code{switch_threads()}. We realize this comment will seem cryptic to
  66 you at this point, but you will understand threads once you understand
  67 why the @code{switch_threads()} that gets called is different from the
  68 @code{switch_threads()} that returns.
  69
  70 @strong{Warning}: In Pintos, each thread is assigned a small,
  71 fixed-size execution stack just under @w{4 kB} in size.  The kernel
  72 does try to detect stack overflow, but it cannot always succeed.  You
  73 ma cause bizarre problems, such as mysterious kernel panics, if you
  74 declare large data structures as non-static local variables,
  75 e.g. @samp{int buf[1000];}.  Alternatives to stack allocation include
  76 the page allocator in @file{threads/palloc.c} and the block allocator
  77 in @file{threads/malloc.c}.  Note that the page allocator doles out
  78 @w{4 kB} chunks and that @code{malloc()} has a @w{2 kB} block size
  79 limit.  If you need larger chunks, consider using a linked structure
  80 instead.
  81
  82 @node Debugging versus Testing, Tips, Understanding Threads, Project 1--Threads
  83 @section Debugging versus Testing
  84
  85 When you're debugging code, it's useful to be able to be able to run a
  86 program twice and have it do exactly the same thing.  On second and
  87 later runs, you can make new observations without having to discard or
  88 verify your old observations.  This property is called
  89 ``reproducibility.''  The simulator we use, Bochs, can be set up for
  90 reproducibility.  If you use the Bochs configuration files we provide,
  91 which specify @samp{ips: @var{n}} where @var{n} is a number of
  92 simulated instructions per second, your simulations can be
  93 reproducible.
  94
  95 Of course, a simulation can only be reproducible from one run to the
  96 next if its input is the same each time.  For simulating an entire
  97 computer, as we do, this means that every part of the computer must be
  98 the same.  For example, you must use the same disks, the same version
  99 of Bochs, and you must not hit any keys on the keyboard (because you
 100 could not be sure to hit them at exactly the same point each time)
 101 during the runs.
 102
 103 While reproducibility is useful for debugging, it is a problem for
 104 testing thread synchronization, an important part of this project.  In
 105 particular, when Bochs is set up for reproducibility, timer interrupts
 106 will come at perfectly reproducible points, and therefore so will
 107 thread switches.  That means that running the same test several times
 108 doesn't give you any greater confidence in your code's correctness
 109 than does running it only once.
 110
 111 So, to make your code easier to test, we've added a feature to Bochs
 112 that makes timer interrupts come at random intervals, but in a
 113 perfectly predictable way.  In particular, if you put a line
 114 @samp{ips-jitter: @var{seed}}, where @var{seed} is an integer, into
 115 your Bochs configuration file, then timer interrupts will come at
 116 irregularly spaced intervals.  Within a single @var{seed} value,
 117 execution will still be reproducible, but timer behavior will change
 118 as @var{seed} is varied.  Thus, for the highest degree of confidence
 119 you should test your code with many seed values.
 120
 121 @node Tips, Problem 1-1 Alarm Clock, Debugging versus Testing, Project 1--Threads
 122 @section Tips
 123
 124 There should be no busy-waiting in any of your solutions to this
 125 assignment.  Furthermore, resist the temptation to directly disable
 126 interrupts in your solution by calling @code{intr_disable()} or
 127 @code{intr_set_level()}, although you may find doing so to be useful
 128 while debugging.  Instead, use semaphores, locks and condition
 129 variables to solve synchronization problems.  Hint: read the comments
 130 in @file{threads/synch.h} if you're unsure what synchronization
 131 primitives may be used in what situations.
 132
 133 Given some designs of some problems, there may be one or two instances
 134 in which it is appropriate to directly change the interrupt levels
 135 instead of relying on the given synchroniztion primitives.  This must
 136 be justified in your @file{DESIGNDOC} file.  If you're not sure you're
 137 justified, ask!
 138
 139 While all parts of this assignment are required if you intend to earn
 140 full credit on this project, keep in mind that Problem 2 (Join) will
 141 be needed for future assignments, so you'll want to get this one
 142 right.  We don't give out solutions, so you're stuck with your Join
 143 code for the whole quarter.  Problem 1 (Alarm Clock) could be very
 144 handy, but not strictly required in the future.  The upshot of all
 145 this is that you should focus heavily on making sure that your
 146 implementation of Join works correctly, since if it's broken, you will
 147 need to fix it for future assignments.  The other parts can be turned
 148 off in the future if you find you can't make them work quite right.
 149
 150 Also keep in mind that Problem 4 (the MFQS) builds on the features you
 151 implement in Problem 3, so to avoid unnecessary code duplication, it
 152 would be a good idea to divide up the work among your team members
 153 such that you have Problem 3 fully working before you begin to tackle
 154 Problem 4.
 155
 156 @node Problem 1-1 Alarm Clock, Problem 1-2 Join, Tips, Project 1--Threads
 157 @section Problem 1-2: Alarm Clock
 158
 159 Improve the implementation of the timer device defined in
 160 @file{devices/timer.c} by reimplementing @code{timer_sleep()}.
 161 Threads call @code{timer_sleep(@var{x})} to suspend execution until
 162 time has advanced by at least @w{@var{x} timer ticks}.  This is
 163 useful for threads that operate in real-time, for example, for
 164 blinking the cursor once per second.  There is no requirement that
 165 threads start running immediately after waking up; just put them on
 166 the ready queue after they have waited for approximately the right
 167 amount of time.
 168
 169 A working implementation of this function is provided.  However, the
 170 version provided is poor, because it ``busy waits,'' that is, it spins
 171 in a tight loop checking the current time until the current time has
 172 advanced far enough.  This is undesirable because it wastes time that
 173 could potentially be used more profitably by another thread.  Your
 174 solution should not busy wait.
 175
 176 The argument to @code{timer_sleep()} is expressed in timer ticks, not
 177 in milliseconds or some other unit.
 178
 179 @node Problem 1-2 Join, Problem 1-3 Priority Scheduling, Problem 1-1 Alarm Clock, Project 1--Threads
 180 @section Problem 1-2: Join
 181
 182 Implement @code{thread_join(struct thread *)} in
 183 @file{threads/thread.c}.  There is already a prototype for it in
 184 @file{threads/thread.h}, which you should not change.  This function
 185 causes the currently running thread to block until thread passed as an
 186 argument exits.  If A is the running thread and B is the argument,
 187 then we say that ``A joins B'' in this case.
 188
 189 The model for @code{thread_join()} is the @command{wait} system call
 190 in Unix-like systems.  (Try reading the manpages.)  That system call
 191 can only be used by a parent process to wait for a child's death.  You
 192 should implement @code{thread_join()} to have the same restriction.
 193 That is, a thread may only join on its immediate children.
 194
 195 A thread need not ever be joined.  Your solution should properly free
 196 all of a thread's resources, including its @code{struct thread},
 197 whether it is ever joined or not, and regardless of whether the child
 198 exits before or after its parent.  That is, a thread should be freed
 199 exactly once in all cases.
 200
 201 Joining a given thread is idempotent.  That is, joining a thread T
 202 multiple times is equivalent to joining it once, because T has already
 203 exited at the time of the later joins.  Thus, joins on T after the
 204 first should return immediately.
 205
 206 The behavior of calling @code{thread_join()} on an thread that is not
 207 the caller's child is undefined.  You need not handle this case
 208 gracefully.
 209
 210 Consider all the ways a join can occur: nested joins (A joins B when B
 211 is joined on C), multiple joins (A joins B, then A joins C), and so
 212 on.  Does your join work if @code{thread_join()} is called on a thread
 213 that has not yet been scheduled for the first time?  You should handle
 214 all of these cases.  Write test code that demonstrates the cases your
 215 join works for.  Don't overdo the output volume, please!
 216
 217 Be careful to program this function correctly.  You will need its
 218 functionality for project 2.
 219
 220 @node Problem 1-3 Priority Scheduling, Problem 1-4 Advanced Scheduler, Problem 1-2 Join, Project 1--Threads
 221 @section Problem 1-3 Priority Scheduling
 222
 223 Implement priority scheduling in Pintos.  Priority
 224 scheduling is a key building block for real-time systems.  Implement functions
 225 @code{thread_set_priority()} to set the priority of a thread and
 226 @code{thread_get_priority()} to get the priority of a thread.  There
 227 are already prototypes for these functions in @file{threads/thread.h},
 228 which you should not change.
 229
 230 When a thread is added to the ready list that has a higher priority
 231 than the currently running thread, the current thread should
 232 immediately yield the processor to the new thread.  Similarly, when
 233 threads are waiting for a lock, semaphore or condition variable, the
 234 highest priority waiting thread should be woken up first.  A thread's
 235 priority may be set at any time, including while the thread is waiting
 236 on a lock, semaphore, or condition variable.
 237
 238 One issue with priority scheduling is ``priority inversion'': if a
 239 high priority thread needs to wait for a low priority thread (for
 240 instance, for a lock held by a low priority thread, or in
 241 @code{thread_join()} for a thread to complete), and a middle priority
 242 thread is on the ready list, then the high priority thread will never
 243 get the CPU because the low priority thread will not get any CPU time.
 244 A partial fix for this problem is to have the waiting thread
 245 ``donate'' its priority to the low priority thread while it is holding
 246 the lock, then recall the donation once it has acquired the lock.
 247 Implement this fix.
 248
 249 You will need to account for all different orders that priority
 250 donation and inversion can occur.  Be sure to handle multiple
 251 donations, in which multiple priorities are donated to a thread.  You
 252 must also handle nested donation: given high, medium, and low priority
 253 threads H, M, and L, respectively, and supposing H is waiting on a
 254 lock that M holds and M is waiting on a lock that L holds, both M and
 255 L should be boosted to H's priority.
 256
 257 You only need to implement priority donation when a thread is waiting
 258 for a lock held by a lower-priority thread. You do not need to
 259 implement this fix for semaphores, condition variables or joins.
 260 However, you do need to implement priority scheduling in all cases.
 261
 262 @node Problem 1-4 Advanced Scheduler, Threads FAQ, Problem 1-3 Priority Scheduling, Project 1--Threads
 263 @section Problem 1-4 Advanced Scheduler
 264
 265 Implement Solaris's multilevel feedback queue scheduler (MFQS), as
 266 explained below, to reduce the average response time for running jobs
 267 on your system.
 268 @c FIXME need link
 269
 270 Demonstrate that your scheduling algorithm reduces response time
 271 relative to the original Pintos scheduling algorithm (round robin) for
 272 at least one workload of your own design (i.e. in addition to the
 273 provided test).
 274
 275 You may assume a static priority for this problem. It is not necessary
 276 to ``re-donate'' a thread's priority if it changes (although you are
 277 free to do so).
 278
 279 @node Threads FAQ,  , Problem 1-4 Advanced Scheduler, Project 1--Threads
 280 @section FAQ
 281
 282 @enumerate 1
 283 @item General FAQs
 284
 285 @enumerate 1
 286 @item
 287 @b{I am adding a new @file{.h} or @file{.c} file.  How do I fix the
 288 @file{Makefile}s?}
 289
 290 To add a @file{.c} file, edit the top-level @file{Makefile.build}.
 291 You'll want to add your file to variable @samp{@var{dir}_SRC}, where
 292 @var{dir} is the directory where you added the file.  For this
 293 project, that means you should add it to @code{threads_SRC}, or
 294 possibly @code{devices_SRC} if you put in the @file{devices}
 295 directory.  Then run @code{make}.  If your new file doesn't get
 296 compiled, run @code{make clean} and then try again.
 297
 298 There is no need to edit the @file{Makefile}s to add a @file{.h} file.
 299
 300 @item
 301 @b{If a thread finishes, should its children be terminated immediately,
 302 or should they finish normally?}
 303
 304 You should feel free to decide what semantics you think this
 305 should have. You need only provide justification for your
 306 decision.
 307
 308 @item
 309 @b{Why can't I disable interrupts?}
 310
 311 Turning off interrupts should only be done for short amounts of time,
 312 or else you end up losing important things such as disk or input
 313 events.  Turning off interrupts also increases the interrupt handling
 314 latency, which can make a machine feel sluggish if taken too far.
 315 Therefore, in general, setting the interrupt level should be used
 316 sparingly.  Also, any synchronization problem can be easily solved by
 317 turning interrupts off, since while interrupts are off, there is no
 318 concurrency, so there's no possibility for race condition.
 319
 320 To make sure you understand concurrency well, we are discouraging you
 321 from taking this shortcut at all in your solution.  If you are unable
 322 to solve a particular synchronization problem with semaphores, locks,
 323 or conditions, or think that they are inadequate for a particular
 324 reason, you may turn to disabling interrupts.  If you want to do this,
 325 we require in your design document a complete justification and
 326 scenario (i.e.@: exact sequence of events) to show why interrupt
 327 manipulation is the best solution.  If you are unsure, the TAs can
 328 help you determine if you are using interrupts too haphazardly.  We
 329 want to emphasize that there are only limited cases where this is
 330 appropriate.
 331
 332 @item
 333 @b{Where might interrupt-level manipuation be appropriate?}
 334
 335 You might find it necessary in some solutions to the Alarm problem.
 336
 337 You might want it at one small point for the priority scheduling
 338 problem.  Note that it is not required to use interrupts for these
 339 problems.  There are other, equally correct solutions that do not
 340 require interrupt manipulation.  However, if you do manipulate
 341 interrupts and @strong{correctly and fully document it} in your design
 342 document, we will allow limited use of interrupt disabling.
 343 @end enumerate
 344
 345 @item Alarm Clock FAQs
 346
 347 @enumerate 1
 348 @item
 349 @b{Why can't I use most synchronization primitives in an interrupt
 350 handler?}
 351
 352 As you've discovered, you cannot sleep in an external interrupt
 353 handler.  Since many lock, semaphore, and condition variable functions
 354 attempt to sleep, you won't be able to call those in
 355 @code{timer_interrupt()}.  You may still use those that never sleep.
 356
 357 Having said that, you need to make sure that global data does not get
 358 updated by multiple threads simultaneously executing
 359 @code{timer_sleep()}.  Here are some pieces of information to think
 360 about:
 361
 362 @enumerate a
 363 @item
 364 Interrupts are turned off while @code{timer_interrupt()} runs.  This
 365 means that @code{timer_interrupt()} will not be interrupted by a
 366 thread running in @code{timer_sleep()}.
 367
 368 @item
 369 A thread in @code{timer_sleep()}, however, can be interrupted by a
 370 call to @code{timer_interrupt()}, except when that thread has turned
 371 off interrupts.
 372
 373 @item
 374 Examples of synchronization mechanisms have been presented in lecture.
 375 Going over these examples should help you understand when each type is
 376 useful or needed.
 377 @end enumerate
 378
 379 @item
 380 @b{What about timer overflow due to the fact that times are defined as
 381 integers? Do I need to check for that?}
 382
 383 Don't worry about the possibility of timer values overflowing.  Timer
 384 values are expressed as signed 63-bit numbers, which at 100 ticks per
 385 second should be good for almost 2,924,712,087 years.
 386 @end enumerate
 387
 388 @item Join FAQs
 389
 390 @enumerate 1
 391 @item
 392 @b{Am I correct to assume that once a thread is deleted, it is no
 393 longer accessible by the parent (i.e.@: the parent can't call
 394 @code{thread_join(child)})?}
 395
 396 A parent joining a child that has completed should be handled
 397 gracefully and should act as a no-op.
 398 @end enumerate
 399
 400 @item Priority Scheduling FAQs
 401
 402 @enumerate 1
 403 @item
 404 @b{Doesn't the priority scheduling lead to starvation? Or do I have to
 405 implement some sort of aging?}
 406
 407
 408 It is true that strict priority scheduling can lead to starvation
 409 because thread may not run if a higher-priority thread is runnable.
 410 In this problem, don't worry about starvation or any sort of aging
 411 technique.  Problem 4 will introduce a mechanism for dynamically
 412 changing thread priorities.
 413
 414 This sort of scheduling is valuable in real-time systems because it
 415 offers the programmer more control over which jobs get processing
 416 time.  High priorities are generally reserved for time-critical
 417 tasks. It's not ``fair,'' but it addresses other concerns not
 418 applicable to a general-purpose operating system.
 419
 420 @item
 421 @b{After a lock has been released, does the program need to switch to
 422 the highest priority thread that needs the lock (assuming that its
 423 priority is higher than that of the current thread)?}
 424
 425 When a lock is released, the highest priority thread waiting for that
 426 lock should be unblocked and put on the ready to run list.  The
 427 scheduler should then run the highest priority thread on the ready
 428 list.
 429
 430 @item
 431 @b{If a thread calls @code{thread_yield()} and then it turns out that
 432 it has higher priority than any other threads, does the high-priority
 433 thread continue running?}
 434
 435 Yes.  If there is a single highest-priority thread, it continues
 436 running until it blocks or finishes, even if it calls
 437 @code{thread_yield()}.
 438
 439 @item
 440 @b{If the highest priority thread is added to the ready to run list it
 441 should start execution immediately.  Is it immediate enough if I
 442 wait until next timer interrupt occurs?}
 443
 444 The highest priority thread should run as soon as it is runnable,
 445 preempting whatever thread is currently running.
 446
 447 @item
 448 @b{What happens to the priority of the donating thread?  Do the priorities
 449 get swapped?}
 450
 451 No.  Priority donation only changes the priority of the low-priority
 452 thread.  The donating thread's priority stays unchanged.  Also note
 453 that priorities aren't additive: if thread A (with priority 5) donates
 454 to thread B (with priority 3), then B's new priority is 5, not 8.
 455
 456 @item
 457 @b{Can a thread's priority be changed while it is sitting on the ready
 458 queue?}
 459
 460 Yes.  Consider this case: low-priority thread L currently has a lock
 461 that high-priority thread H wants.  H donates its priority to L (the
 462 lock holder).  L finishes with the lock, and then loses the CPU and is
 463 moved to the ready queue.  Now L's old priority is restored while it
 464 is in the ready queue.
 465
 466 @item
 467 @b{Can a thread's priority change while it is sitting on the queue of a
 468 semaphore?}
 469
 470 Yes.  Same scenario as above except L gets blocked waiting on a new
 471 lock when H restores its priority.
 472
 473 @item
 474 @b{Why is pubtest3's FIFO test skipping some threads! I know my scheduler
 475 is round-robin'ing them like it's supposed to!  Our output is like this:}
 476
 477 @example
 478 Thread 0 goes.
 479 Thread 2 goes.
 480 Thread 3 goes.
 481 Thread 4 goes.
 482 Thread 0 goes.
 483 Thread 1 goes.
 484 Thread 2 goes.
 485 Thread 3 goes.
 486 Thread 4 goes.
 487 @end example
 488
 489 @noindent @b{which repeats 5 times and then}
 490
 491 @example
 492 Thread 1 goes.
 493 Thread 1 goes.
 494 Thread 1 goes.
 495 Thread 1 goes.
 496 Thread 1 goes.
 497 @end example
 498
 499 This happens because context switches are being invoked by the test
 500 when it explicitly calls @code{thread_yield()}.  However, the time
 501 slice timer is still alive and so, every tick (by default), thread 1
 502 gets switched out (caused by @code{timer_interrupt()} calling
 503 @code{intr_yield_on_return()}) before it gets a chance to run its
 504 mainline.  It is by coincidence that Thread 1 is the one that gets
 505 skipped in our example.  If we use a different jitter value, the same
 506 behavior is seen where a thread gets started and switched out
 507 completely.
 508
 509 Solution: Increase the value of @code{TIME_SLICE} in
 510 @file{devices/timer.c} to a very high value, such as 10000, to see
 511 that the threads will round-robin if they aren't interrupted.
 512
 513 @item
 514 @b{What happens when a thread is added to the ready list which has
 515 higher priority than the currently running thread?}
 516
 517 The correct behavior is to immediately yield the processor.  Your
 518 solution must act this way.
 519
 520 @item
 521 @b{What range of priorities should be supported and what should the
 522 default priority of a thread be?}
 523
 524 Your implementation should support priorities from 0 through 59 and
 525 the default priority of a thread should be 29.
 526 @end enumerate
 527
 528 @item Advanced Scheduler FAQs
 529
 530 @enumerate 1
 531 @item
 532 @b{What is the interval between timer interrupts?}
 533
 534 Timer interrupts occur @code{TIMER_FREQ} times per second.  You can
 535 adjust this value by editing @file{devices/timer.h}.  The default is
 536 100 Hz.
 537
 538 @item
 539 @b{Do I have to modify the dispatch table?}
 540
 541 No, although you are allowed to. It is possible to complete
 542 this problem (i.e.@: demonstrate response time improvement)
 543 without doing so.
 544
 545 @item
 546 @b{When the scheduler changes the priority of a thread, how does this
 547 affect priority donation?}
 548
 549 Short (official) answer: Don't worry about it. Your priority donation
 550 code may assume static priority assignment.
 551
 552 Longer (unofficial) opinion: If you wish to take this into account,
 553 however, your design may end up being ``cleaner.''  You have
 554 considerable freedom in what actually takes place. I believe what
 555 makes the most sense is for scheduler changes to affect the
 556 ``original'' (non-donated) priority.  This change may actually be
 557 masked by the donated priority.  Priority changes should only
 558 propagate with donations, not ``backwards'' from donees to donors.
 559
 560 @item
 561 @b{What is meant by ``static priority''?}
 562
 563 Once thread A has donated its priority to thread B, if thread A's
 564 priority changes (due to the scheduler) while the donation still
 565 exists, you do not have to change thread B's donated priority.
 566 However, you are free to do so.
 567
 568 @item
 569 @b{Do I have to make my dispatch table user-configurable?}
 570
 571 No.  Hard-coding the dispatch table values is fine.
 572 @end enumerate
 573 @end enumerate