pintos-os.org Git - pintos-anon/blob - doc/threads.texi

   1 @node Project 1--Threads, Project 2--User Programs, Top, Top
   2 @chapter Project 1: Threads
   3
   4 In this assignment, we give you a minimally functional thread system.
   5 Your job is to extend the functionality of this system to gain a
   6 better understanding of synchronization problems. Additionally, you
   7 will use at least part of this increased functionality in future
   8 assignments.
   9
  10 You will be working in primarily in the @file{threads} directory for
  11 this assignment, with some work in the @file{devices} directory on the
  12 side.  Compilation should be done in the @file{threads} directory.
  13
  14 @menu
  15 * Understanding Threads::
  16 * Debugging versus Testing::
  17 * Tips::
  18 * Problem 1-1 Alarm Clock::
  19 * Problem 1-2 Join::
  20 * Problem 1-3 Priority Scheduling::
  21 * Problem 1-4 Advanced Scheduler::
  22 * Threads FAQ::
  23 @end menu
  24
  25 @node Understanding Threads, Debugging versus Testing, Project 1--Threads, Project 1--Threads
  26 @section Understanding Threads
  27
  28 The first step is to read and understand the initial thread system.
  29 Pintos, by default, implements thread creation and thread completion,
  30 a simple scheduler to switch between threads, and synchronization
  31 primitives (semaphores, locks, and condition variables).
  32 @c FIXME: base system doesn't do anything. Debugger sucks.
  33 However, there's a lot of magic going on in some of this code, so you
  34 should compile and run the base system to see a simple test of the
  35 code.  You should read through the source code by hand to see what's
  36 going on.  If you like, you can add calls to @code{printf()} almost
  37 anywhere, then recompile and run to see what happens and in what
  38 order.
  39
  40 When a thread is created, you are creating a new context to be
  41 scheduled. You provide a function to be run in this context as an
  42 argument to @code{thread_create()}. The first time the thread is
  43 scheduled and runs, it will start from the beginning of that function
  44 and execute it in the context. When that function returns, that thread
  45 completes. Each thread, therefore, acts like a mini-program running
  46 inside Pintos, with the function passed to @code{thread_create()}
  47 acting like @code{main()}.
  48
  49 At any given time, Pintos is running exactly one thread, with the
  50 others switched out.  The scheduler decides which thread to run next
  51 when it needs to switch between them.  (If no thread is ready to run
  52 at any given time, then the special ``idle'' thread runs.)  The
  53 synchronization primitives are used to force context switches when one
  54 thread needs to wait for another thread to do something.
  55
  56 The exact mechanics of a context switch are pretty gruesome and have
  57 been provided for you in @file{threads/switch.S} (this is 80@var{x}86
  58 assembly; don't worry about understanding it).  It involves saving the
  59 state of the currently running thread and restoring the state of the
  60 thread we're switching to.
  61 @c FIXME
  62 Slowly trace through a context switch to see what happens.  Be sure to
  63 keep track of each thread's address and state, and what procedures are
  64 on the call stack for each thread. You will notice that when one
  65 thread calls @code{switch_threads()}, another thread starts running,
  66 and the first thing the new thread does is to return from
  67 @code{switch_threads()}. We realize this comment will seem cryptic to
  68 you at this point, but you will understand threads once you understand
  69 why the @code{switch_threads()} that gets called is different from the
  70 @code{switch_threads()} that returns.
  71
  72 @strong{Warning}: In Pintos, each thread is assigned a small,
  73 fixed-size execution stack just under @w{4 kB} in size.  The kernel
  74 does try to detect stack overflow, but it cannot always succeed.  You
  75 ma cause bizarre problems, such as mysterious kernel panics, if you
  76 declare large data structures as non-static local variables,
  77 e.g. @samp{int buf[1000];}.  Alternatives to stack allocation include
  78 the page allocator in @file{threads/palloc.c} and the block allocator
  79 in @file{threads/malloc.c}.  Note that the page allocator doles out
  80 @w{4 kB} chunks and that @code{malloc()} has a @w{2 kB} block size
  81 limit.  If you need larger chunks, consider using a linked structure
  82 instead.
  83
  84 @node Debugging versus Testing, Tips, Understanding Threads, Project 1--Threads
  85 @section Debugging versus Testing
  86
  87 When you're debugging code, it's useful to be able to be able to run a
  88 program twice and have it do exactly the same thing.  On second and
  89 later runs, you can make new observations without having to discard or
  90 verify your old observations.  This property is called
  91 ``reproducibility.''  The simulator we use, Bochs, can be set up for
  92 reproducibility.  If you use the Bochs configuration files we provide,
  93 which specify @samp{ips: @var{n}} where @var{n} is a number of
  94 simulated instructions per second, your simulations can be
  95 reproducible.
  96
  97 Of course, a simulation can only be reproducible from one run to the
  98 next if its input is the same each time.  For simulating an entire
  99 computer, as we do, this means that every part of the computer must be
 100 the same.  For example, you must use the same disks, the same version
 101 of Bochs, and you must not hit any keys on the keyboard (because you
 102 could not be sure to hit them at exactly the same point each time)
 103 during the runs.
 104
 105 While reproducibility is useful for debugging, it is a problem for
 106 testing thread synchronization, an important part of this project.  In
 107 particular, when Bochs is set up for reproducibility, timer interrupts
 108 will come at perfectly reproducible points, and therefore so will
 109 thread switches.  That means that running the same test several times
 110 doesn't give you any greater confidence in your code's correctness
 111 than does running it only once.
 112
 113 So, to make your code easier to test, we've added a feature to Bochs
 114 that makes timer interrupts come at random intervals, but in a
 115 perfectly predictable way.  In particular, if you put a line
 116 @samp{ips-jitter: @var{seed}}, where @var{seed} is an integer, into
 117 your Bochs configuration file, then timer interrupts will come at
 118 irregularly spaced intervals.  Within a single @var{seed} value,
 119 execution will still be reproducible, but timer behavior will change
 120 as @var{seed} is varied.  Thus, for the highest degree of confidence
 121 you should test your code with many seed values.
 122
 123 @node Tips, Problem 1-1 Alarm Clock, Debugging versus Testing, Project 1--Threads
 124 @section Tips
 125
 126 There should be no busy-waiting in any of your solutions to this
 127 assignment.  Furthermore, resist the temptation to directly disable
 128 interrupts in your solution by calling @code{intr_disable()} or
 129 @code{intr_set_level()}, although you may find doing so to be useful
 130 while debugging.  Instead, use semaphores, locks and condition
 131 variables to solve synchronization problems.  Hint: read the comments
 132 in @file{threads/synch.h} if you're unsure what synchronization
 133 primitives may be used in what situations.
 134
 135 Given some designs of some problems, there may be one or two instances
 136 in which it is appropriate to directly change the interrupt levels
 137 instead of relying on the given synchroniztion primitives.  This must
 138 be justified in your @file{DESIGNDOC} file.  If you're not sure you're
 139 justified, ask!
 140
 141 While all parts of this assignment are required if you intend to earn
 142 full credit on this project, keep in mind that Problem 2 (Join) will
 143 be needed for future assignments, so you'll want to get this one
 144 right.  We don't give out solutions, so you're stuck with your Join
 145 code for the whole quarter.  Problem 1 (Alarm Clock) could be very
 146 handy, but not strictly required in the future.  The upshot of all
 147 this is that you should focus heavily on making sure that your
 148 implementation of Join works correctly, since if it's broken, you will
 149 need to fix it for future assignments.  The other parts can be turned
 150 off in the future if you find you can't make them work quite right.
 151
 152 Also keep in mind that Problem 4 (the MFQS) builds on the features you
 153 implement in Problem 3, so to avoid unnecessary code duplication, it
 154 would be a good idea to divide up the work among your team members
 155 such that you have Problem 3 fully working before you begin to tackle
 156 Problem 4.
 157
 158 @node Problem 1-1 Alarm Clock, Problem 1-2 Join, Tips, Project 1--Threads
 159 @section Problem 1-2: Alarm Clock
 160
 161 Improve the implementation of the timer device defined in
 162 @file{devices/timer.c} by reimplementing @code{timer_sleep()}.
 163 Threads call @code{timer_sleep(@var{x})} to suspend execution until
 164 time has advanced by at least @w{@var{x} timer ticks}.  This is
 165 useful for threads that operate in real-time, for example, for
 166 blinking the cursor once per second.  There is no requirement that
 167 threads start running immediately after waking up; just put them on
 168 the ready queue after they have waited for approximately the right
 169 amount of time.
 170
 171 A working implementation of this function is provided.  However, the
 172 version provided is poor, because it ``busy waits,'' that is, it spins
 173 in a tight loop checking the current time until the current time has
 174 advanced far enough.  This is undesirable because it wastes time that
 175 could potentially be used more profitably by another thread.  Your
 176 solution should not busy wait.
 177
 178 The argument to @code{timer_sleep()} is expressed in timer ticks, not
 179 in milliseconds or some other unit.
 180
 181 @node Problem 1-2 Join, Problem 1-3 Priority Scheduling, Problem 1-1 Alarm Clock, Project 1--Threads
 182 @section Problem 1-2: Join
 183
 184 Implement @code{thread_join(struct thread *)} in
 185 @file{threads/thread.c}.  There is already a prototype for it in
 186 @file{threads/thread.h}, which you should not change.  This function
 187 causes the currently running thread to block until thread passed as an
 188 argument exits.  If A is the running thread and B is the argument,
 189 then we say that ``A joins B'' in this case.
 190
 191 The model for @code{thread_join()} is the @command{wait} system call
 192 in Unix-like systems.  (Try reading the manpages.)  That system call
 193 can only be used by a parent process to wait for a child's death.  You
 194 should implement @code{thread_join()} to have the same restriction.
 195 That is, a thread may only join on its immediate children.
 196
 197 A thread need not ever be joined.  Your solution should properly free
 198 all of a thread's resources, including its @code{struct thread},
 199 whether it is ever joined or not, and regardless of whether the child
 200 exits before or after its parent.  That is, a thread should be freed
 201 exactly once in all cases.
 202
 203 Joining a given thread is idempotent.  That is, joining a thread T
 204 multiple times is equivalent to joining it once, because T has already
 205 exited at the time of the later joins.  Thus, joins on T after the
 206 first should return immediately.
 207
 208 The behavior of calling @code{thread_join()} on an thread that is not
 209 the caller's child is undefined.  You need not handle this case
 210 gracefully.
 211
 212 Consider all the ways a join can occur: nested joins (A joins B when B
 213 is joined on C), multiple joins (A joins B, then A joins C), and so
 214 on.  Does your join work if @code{thread_join()} is called on a thread
 215 that has not yet been scheduled for the first time?  You should handle
 216 all of these cases.  Write test code that demonstrates the cases your
 217 join works for.  Don't overdo the output volume, please!
 218
 219 Be careful to program this function correctly.  You will need its
 220 functionality for project 2.
 221
 222 @node Problem 1-3 Priority Scheduling, Problem 1-4 Advanced Scheduler, Problem 1-2 Join, Project 1--Threads
 223 @section Problem 1-3 Priority Scheduling
 224
 225 Implement priority scheduling in Pintos.  Priority
 226 scheduling is a key building block for real-time systems.  Implement functions
 227 @code{thread_set_priority()} to set the priority of a thread and
 228 @code{thread_get_priority()} to get the priority of a thread.  There
 229 are already prototypes for these functions in @file{threads/thread.h},
 230 which you should not change.
 231
 232 When a thread is added to the ready list that has a higher priority
 233 than the currently running thread, the current thread should
 234 immediately yield the processor to the new thread.  Similarly, when
 235 threads are waiting for a lock, semaphore or condition variable, the
 236 highest priority waiting thread should be woken up first.  A thread's
 237 priority may be set at any time, including while the thread is waiting
 238 on a lock, semaphore, or condition variable.
 239
 240 One issue with priority scheduling is ``priority inversion'': if a
 241 high priority thread needs to wait for a low priority thread (for
 242 instance, for a lock held by a low priority thread, or in
 243 @code{thread_join()} for a thread to complete), and a middle priority
 244 thread is on the ready list, then the high priority thread will never
 245 get the CPU because the low priority thread will not get any CPU time.
 246 A partial fix for this problem is to have the waiting thread
 247 ``donate'' its priority to the low priority thread while it is holding
 248 the lock, then recall the donation once it has acquired the lock.
 249 Implement this fix.
 250
 251 You will need to account for all different orders that priority
 252 donation and inversion can occur.  Be sure to handle multiple
 253 donations, in which multiple priorities are donated to a thread.  You
 254 must also handle nested donation: given high, medium, and low priority
 255 threads H, M, and L, respectively, and supposing H is waiting on a
 256 lock that M holds and M is waiting on a lock that L holds, both M and
 257 L should be boosted to H's priority.
 258
 259 You only need to implement priority donation when a thread is waiting
 260 for a lock held by a lower-priority thread. You do not need to
 261 implement this fix for semaphores, condition variables or joins.
 262 However, you do need to implement priority scheduling in all cases.
 263
 264 @node Problem 1-4 Advanced Scheduler, Threads FAQ, Problem 1-3 Priority Scheduling, Project 1--Threads
 265 @section Problem 1-4 Advanced Scheduler
 266
 267 Implement Solaris's multilevel feedback queue scheduler (MFQS), as
 268 explained in this @uref{mlfqs.pdf, , PDF} or @uref{mlfqs.ps, ,
 269 PostScript} file, to reduce the average response time for running jobs
 270 on your system.
 271
 272 Demonstrate that your scheduling algorithm reduces response time
 273 relative to the original Pintos scheduling algorithm (round robin) for
 274 at least one workload of your own design (i.e. in addition to the
 275 provided test).
 276
 277 You may assume a static priority for this problem. It is not necessary
 278 to ``re-donate'' a thread's priority if it changes (although you are
 279 free to do so).
 280
 281 @node Threads FAQ,  , Problem 1-4 Advanced Scheduler, Project 1--Threads
 282 @section FAQ
 283
 284 @enumerate 1
 285 @item General FAQs
 286
 287 @enumerate 1
 288 @item
 289 @b{I am adding a new @file{.h} or @file{.c} file.  How do I fix the
 290 @file{Makefile}s?}
 291
 292 To add a @file{.c} file, edit the top-level @file{Makefile.build}.
 293 You'll want to add your file to variable @samp{@var{dir}_SRC}, where
 294 @var{dir} is the directory where you added the file.  For this
 295 project, that means you should add it to @code{threads_SRC}, or
 296 possibly @code{devices_SRC} if you put in the @file{devices}
 297 directory.  Then run @code{make}.  If your new file doesn't get
 298 compiled, run @code{make clean} and then try again.
 299
 300 There is no need to edit the @file{Makefile}s to add a @file{.h} file.
 301
 302 @item
 303 @b{If a thread finishes, should its children be terminated immediately,
 304 or should they finish normally?}
 305
 306 You should feel free to decide what semantics you think this
 307 should have. You need only provide justification for your
 308 decision.
 309
 310 @item
 311 @b{Why can't I disable interrupts?}
 312
 313 Turning off interrupts should only be done for short amounts of time,
 314 or else you end up losing important things such as disk or input
 315 events.  Turning off interrupts also increases the interrupt handling
 316 latency, which can make a machine feel sluggish if taken too far.
 317 Therefore, in general, setting the interrupt level should be used
 318 sparingly.  Also, any synchronization problem can be easily solved by
 319 turning interrupts off, since while interrupts are off, there is no
 320 concurrency, so there's no possibility for race condition.
 321
 322 To make sure you understand concurrency well, we are discouraging you
 323 from taking this shortcut at all in your solution.  If you are unable
 324 to solve a particular synchronization problem with semaphores, locks,
 325 or conditions, or think that they are inadequate for a particular
 326 reason, you may turn to disabling interrupts.  If you want to do this,
 327 we require in your design document a complete justification and
 328 scenario (i.e.@: exact sequence of events) to show why interrupt
 329 manipulation is the best solution.  If you are unsure, the TAs can
 330 help you determine if you are using interrupts too haphazardly.  We
 331 want to emphasize that there are only limited cases where this is
 332 appropriate.
 333
 334 @item
 335 @b{Where might interrupt-level manipuation be appropriate?}
 336
 337 You might find it necessary in some solutions to the Alarm problem.
 338
 339 You might want it at one small point for the priority scheduling
 340 problem.  Note that it is not required to use interrupts for these
 341 problems.  There are other, equally correct solutions that do not
 342 require interrupt manipulation.  However, if you do manipulate
 343 interrupts and @strong{correctly and fully document it} in your design
 344 document, we will allow limited use of interrupt disabling.
 345 @end enumerate
 346
 347 @item Alarm Clock FAQs
 348
 349 @enumerate 1
 350 @item
 351 @b{Why can't I use most synchronization primitives in an interrupt
 352 handler?}
 353
 354 As you've discovered, you cannot sleep in an external interrupt
 355 handler.  Since many lock, semaphore, and condition variable functions
 356 attempt to sleep, you won't be able to call those in
 357 @code{timer_interrupt()}.  You may still use those that never sleep.
 358
 359 Having said that, you need to make sure that global data does not get
 360 updated by multiple threads simultaneously executing
 361 @code{timer_sleep()}.  Here are some pieces of information to think
 362 about:
 363
 364 @enumerate a
 365 @item
 366 Interrupts are turned off while @code{timer_interrupt()} runs.  This
 367 means that @code{timer_interrupt()} will not be interrupted by a
 368 thread running in @code{timer_sleep()}.
 369
 370 @item
 371 A thread in @code{timer_sleep()}, however, can be interrupted by a
 372 call to @code{timer_interrupt()}, except when that thread has turned
 373 off interrupts.
 374
 375 @item
 376 Examples of synchronization mechanisms have been presented in lecture.
 377 Going over these examples should help you understand when each type is
 378 useful or needed.
 379 @end enumerate
 380
 381 @item
 382 @b{What about timer overflow due to the fact that times are defined as
 383 integers? Do I need to check for that?}
 384
 385 Don't worry about the possibility of timer values overflowing.  Timer
 386 values are expressed as signed 63-bit numbers, which at 100 ticks per
 387 second should be good for almost 2,924,712,087 years.
 388 @end enumerate
 389
 390 @item Join FAQs
 391
 392 @enumerate 1
 393 @item
 394 @b{Am I correct to assume that once a thread is deleted, it is no
 395 longer accessible by the parent (i.e.@: the parent can't call
 396 @code{thread_join(child)})?}
 397
 398 A parent joining a child that has completed should be handled
 399 gracefully and should act as a no-op.
 400 @end enumerate
 401
 402 @item Priority Scheduling FAQs
 403
 404 @enumerate 1
 405 @item
 406 @b{Doesn't the priority scheduling lead to starvation? Or do I have to
 407 implement some sort of aging?}
 408
 409
 410 It is true that strict priority scheduling can lead to starvation
 411 because thread may not run if a higher-priority thread is runnable.
 412 In this problem, don't worry about starvation or any sort of aging
 413 technique.  Problem 4 will introduce a mechanism for dynamically
 414 changing thread priorities.
 415
 416 This sort of scheduling is valuable in real-time systems because it
 417 offers the programmer more control over which jobs get processing
 418 time.  High priorities are generally reserved for time-critical
 419 tasks. It's not ``fair,'' but it addresses other concerns not
 420 applicable to a general-purpose operating system.
 421
 422 @item
 423 @b{After a lock has been released, does the program need to switch to
 424 the highest priority thread that needs the lock (assuming that its
 425 priority is higher than that of the current thread)?}
 426
 427 When a lock is released, the highest priority thread waiting for that
 428 lock should be unblocked and put on the ready to run list.  The
 429 scheduler should then run the highest priority thread on the ready
 430 list.
 431
 432 @item
 433 @b{If a thread calls @code{thread_yield()} and then it turns out that
 434 it has higher priority than any other threads, does the high-priority
 435 thread continue running?}
 436
 437 Yes.  If there is a single highest-priority thread, it continues
 438 running until it blocks or finishes, even if it calls
 439 @code{thread_yield()}.
 440
 441 @item
 442 @b{If the highest priority thread is added to the ready to run list it
 443 should start execution immediately.  Is it immediate enough if I
 444 wait until next timer interrupt occurs?}
 445
 446 The highest priority thread should run as soon as it is runnable,
 447 preempting whatever thread is currently running.
 448
 449 @item
 450 @b{What happens to the priority of the donating thread?  Do the priorities
 451 get swapped?}
 452
 453 No.  Priority donation only changes the priority of the low-priority
 454 thread.  The donating thread's priority stays unchanged.  Also note
 455 that priorities aren't additive: if thread A (with priority 5) donates
 456 to thread B (with priority 3), then B's new priority is 5, not 8.
 457
 458 @item
 459 @b{Can a thread's priority be changed while it is sitting on the ready
 460 queue?}
 461
 462 Yes.  Consider this case: low-priority thread L currently has a lock
 463 that high-priority thread H wants.  H donates its priority to L (the
 464 lock holder).  L finishes with the lock, and then loses the CPU and is
 465 moved to the ready queue.  Now L's old priority is restored while it
 466 is in the ready queue.
 467
 468 @item
 469 @b{Can a thread's priority change while it is sitting on the queue of a
 470 semaphore?}
 471
 472 Yes.  Same scenario as above except L gets blocked waiting on a new
 473 lock when H restores its priority.
 474
 475 @item
 476 @b{Why is pubtest3's FIFO test skipping some threads! I know my scheduler
 477 is round-robin'ing them like it's supposed to!  Our output is like this:}
 478
 479 @example
 480 Thread 0 goes.
 481 Thread 2 goes.
 482 Thread 3 goes.
 483 Thread 4 goes.
 484 Thread 0 goes.
 485 Thread 1 goes.
 486 Thread 2 goes.
 487 Thread 3 goes.
 488 Thread 4 goes.
 489 @end example
 490
 491 @noindent @b{which repeats 5 times and then}
 492
 493 @example
 494 Thread 1 goes.
 495 Thread 1 goes.
 496 Thread 1 goes.
 497 Thread 1 goes.
 498 Thread 1 goes.
 499 @end example
 500
 501 This happens because context switches are being invoked by the test
 502 when it explicitly calls @code{thread_yield()}.  However, the time
 503 slice timer is still alive and so, every tick (by default), thread 1
 504 gets switched out (caused by @code{timer_interrupt()} calling
 505 @code{intr_yield_on_return()}) before it gets a chance to run its
 506 mainline.  It is by coincidence that Thread 1 is the one that gets
 507 skipped in our example.  If we use a different jitter value, the same
 508 behavior is seen where a thread gets started and switched out
 509 completely.
 510
 511 Solution: Increase the value of @code{TIME_SLICE} in
 512 @file{devices/timer.c} to a very high value, such as 10000, to see
 513 that the threads will round-robin if they aren't interrupted.
 514
 515 @item
 516 @b{What happens when a thread is added to the ready list which has
 517 higher priority than the currently running thread?}
 518
 519 The correct behavior is to immediately yield the processor.  Your
 520 solution must act this way.
 521
 522 @item
 523 @b{What range of priorities should be supported and what should the
 524 default priority of a thread be?}
 525
 526 Your implementation should support priorities from 0 through 59 and
 527 the default priority of a thread should be 29.
 528 @end enumerate
 529
 530 @item Advanced Scheduler FAQs
 531
 532 @enumerate 1
 533 @item
 534 @b{What is the interval between timer interrupts?}
 535
 536 Timer interrupts occur @code{TIMER_FREQ} times per second.  You can
 537 adjust this value by editing @file{devices/timer.h}.  The default is
 538 100 Hz.
 539
 540 @item
 541 @b{Do I have to modify the dispatch table?}
 542
 543 No, although you are allowed to. It is possible to complete
 544 this problem (i.e.@: demonstrate response time improvement)
 545 without doing so.
 546
 547 @item
 548 @b{When the scheduler changes the priority of a thread, how does this
 549 affect priority donation?}
 550
 551 Short (official) answer: Don't worry about it. Your priority donation
 552 code may assume static priority assignment.
 553
 554 Longer (unofficial) opinion: If you wish to take this into account,
 555 however, your design may end up being ``cleaner.''  You have
 556 considerable freedom in what actually takes place. I believe what
 557 makes the most sense is for scheduler changes to affect the
 558 ``original'' (non-donated) priority.  This change may actually be
 559 masked by the donated priority.  Priority changes should only
 560 propagate with donations, not ``backwards'' from donees to donors.
 561
 562 @item
 563 @b{What is meant by ``static priority''?}
 564
 565 Once thread A has donated its priority to thread B, if thread A's
 566 priority changes (due to the scheduler) while the donation still
 567 exists, you do not have to change thread B's donated priority.
 568 However, you are free to do so.
 569
 570 @item
 571 @b{Do I have to make my dispatch table user-configurable?}
 572
 573 No.  Hard-coding the dispatch table values is fine.
 574 @end enumerate
 575 @end enumerate