pintos-os.org Git - pintos-anon/blob - doc/tour.texi

   1 @node Pintos Tour, Project 1--Threads, Introduction, Top
   2 @chapter A Tour Through Pintos
   3
   4 This chapter is a brief tour through the Pintos code.  It covers the
   5 entire code base, but you'll only be using Pintos one part at a time,
   6 so you may find that you want to read each part as you work on the
   7 corresponding project.
   8
   9 Hint: try using ``tags'' to follow along with references to function
  10 and variable names (@pxref{Tags}).
  11
  12 @menu
  13 * Pintos Loading::
  14 * Threads Tour::
  15 @end menu
  16
  17 @node Pintos Loading
  18 @section Loading
  19
  20 This section covers the Pintos loader and basic kernel
  21 initialization.
  22
  23 @menu
  24 * Pintos Loader::
  25 * Kernel Initialization::
  26 @end menu
  27
  28 @node Pintos Loader
  29 @subsection The Loader
  30
  31 The first part of Pintos that runs is the loader, in
  32 @file{threads/loader.S}.  The PC BIOS loads the loader into memory.
  33 The loader, in turn, is responsible for initializing the CPU, loading
  34 the rest of Pintos into memory, and then jumping to its start.  It's
  35 not important to understand exactly what the loader does, but if
  36 you're interested, read on.  You should probably read along with the
  37 loader's source.  You should also understand the basics of the
  38 80@var{x}86 architecture as described by chapter 3 of
  39 @bibref{IA32-v1}.
  40
  41 Because the PC BIOS loads the loader, the loader has to play by the
  42 BIOS's rules.  In particular, the BIOS only loads 512 bytes (one disk
  43 sector) into memory.  This is a severe restriction and it means that,
  44 practically speaking, the loader has to be written in assembly
  45 language.
  46
  47 Pintos' loader first initializes the CPU.  The first important part of
  48 this is to enable the A20 line, that is, the CPU's address line
  49 numbered 20.  For historical reasons, PCs start out with this address
  50 line fixed at 0, which means that attempts to access memory beyond the
  51 first 1 MB (2 raised to the 20th power) will fail.  Pintos wants to
  52 access more memory than this, so we have to enable it.
  53
  54 Next, the loader asks the BIOS for the PC's memory size.  Again for
  55 historical reasons, the function that we call in the BIOS to do this
  56 can only detect up to 64 MB of RAM, so that's the practical limit that
  57 Pintos can support.  The memory size is stashed away in a location in
  58 the loader that the kernel can read after it boots.
  59
  60 Third, the loader creates a basic page table.  This page table maps
  61 the 64 MB at the base of virtual memory (starting at virtual address
  62 0) directly to the identical physical addresses.  It also maps the
  63 same physical memory starting at virtual address
  64 @code{LOADER_PHYS_BASE}, which defaults to @t{0xc0000000} (3 GB).  The
  65 Pintos kernel only wants the latter mapping, but there's a
  66 chicken-and-egg problem if we don't include the former: our current
  67 virtual address is roughly @t{0x7c00}, the location where the BIOS
  68 loaded us, and we can't jump to @t{0xc0007c00} until we turn on the
  69 page table, but if we turn on the page table without jumping there,
  70 then we've just pulled the rug out from under ourselves.  At any rate,
  71 it's necessary.
  72
  73 After the page table is initialized, we load the CPU's control
  74 registers to turn on protected mode and paging, and then we set up the
  75 segment registers.  We aren't equipped to handle interrupts in
  76 protected mode yet, so we disable interrupts.
  77
  78 Finally it's time to load the kernel from disk.  We choose a simple,
  79 although inflexible, method to do this: we program the IDE disk
  80 controller directly.  We assume that the kernel is stored starting
  81 from the second sector of the first IDE disk (the first sector
  82 contains the boot loader itself).  We also assume that the BIOS has
  83 already set up the IDE controller for us.  We read
  84 @code{KERNEL_LOAD_PAGES} 4 kB pages of data from the disk directly
  85 into virtual memory starting @code{LOADER_KERN_BASE} bytes past
  86 @code{LOADER_PHYS_BASE}, which by default means that we load the
  87 kernel starting 1 MB into physical memory.
  88
  89 Then we jump to the start of the compiled kernel image.  Using the
  90 ``linker script'' in @file{threads/kernel.lds.S}, the kernel has
  91 arranged that it begins with the assembly module
  92 @file{threads/start.S}.  This assembly module just calls
  93 @func{main}, which never returns.
  94
  95 There's one more trick to the loader: the Pintos kernel command line
  96 is stored in the boot loader.  The @command{pintos} program actually
  97 modifies the boot loader on disk each time it runs the kernel, putting
  98 in whatever command line arguments the user supplies to the kernel,
  99 and then the kernel at boot time reads those arguments out of the boot
 100 loader in memory.  This is something of a nasty hack, but it is simple
 101 and effective.
 102
 103 @node Kernel Initialization
 104 @subsection Kernel Initialization
 105
 106 The kernel proper starts with the @func{main} function.  The
 107 @func{main} function is written in C, as will be most of the code we
 108 encounter in Pintos from here on out.
 109
 110 When @func{main} starts, the system is in a pretty raw state.  We're
 111 in protected mode with paging enabled, but hardly anything else is
 112 ready.  Thus, the @func{main} function consists primarily of calls
 113 into other Pintos modules' initialization functions.  You should
 114 notice that these are usually named @func{@var{module}_init}, where
 115 @var{module} is the module's name, @file{@var{module}.c} is the
 116 module's source code, and @file{@var{module}.h} is the module's
 117 header.
 118
 119 First we initialize kernel RAM in @func{ram_init}.  The first step
 120 is to clear out the kernel's so-called ``BSS'' segment.  The BSS is a
 121 segment that should be initialized to all zeros.  In C, whenever you
 122 declare a variable outside a function without providing an
 123 initializer, that variable goes into the BSS.@footnote{This isn't
 124 actually required by the ANSI C standard, but it is the case with most
 125 implementations of C on most machines.}  Because it's all zeros, the
 126 BSS isn't stored in the image that the loader brought into memory.  We
 127 just use @func{memset} to zero it out.  The other task of
 128 @func{ram_init} is to read out the machine's memory size from where
 129 the loader stored it and put it into the @code{ram_pages} variable for
 130 later use.
 131
 132 Next, @func{thread_init} initializes the thread system.  We will defer
 133 full discussion to our discussion of Pintos threads below.  It is
 134 called so early in initialization because the console, initialized
 135 just below, tries to use locks and locks in turn require there to be a
 136 running thread.
 137
 138 Then we initialize the console so that we can use @func{printf}.
 139 @func{main} calls @func{vga_init}, which initializes the VGA text
 140 display and clears the screen.  It also calls @func{serial_init_poll}
 141 to initialize the first serial port in ``polling mode,'' that is,
 142 where the kernel busy-waits for the port to be ready for each
 143 character to be output.  (We use polling mode until we're ready to set
 144 up interrupts later.)  Finally we initialize the console device and
 145 print a startup message to the console.
 146
 147 @func{main} calls @func{argv_init} to parse the kernel command line.
 148 This entails reading the command line out of the loader and updating
 149 global variables appropriately.
 150
 151 The next block of functions we call initialize the kernel's memory
 152 system.  @func{palloc_init} sets up the kernel page allocator, which
 153 doles out memory one or more pages at a time.  @func{malloc_init} sets
 154 up the allocator that handles odd-sized allocations.
 155 @func{paging_init} sets up a page table for the kernel.
 156
 157 In projects 2 and later, @func{main} also calls @func{tss_init} and
 158 @func{gdt_init}, but we'll talk about those later.
 159
 160 @func{main} calls @func{random_init} to initialize the kernel random
 161 number generator.  @func{argv_init} might already have done this (if
 162 the user specified @option{-rs} on the @command{pintos} command line)
 163 but calling it a second time is harmless and has no effect.
 164
 165 We initialize the interrupt system in the next set of calls.
 166 @func{intr_init} sets up the CPU's @dfn{interrupt descriptor table}
 167 (IDT) to ready it for interrupt handling, then @func{timer_init} and
 168 @func{kbd_init} prepare us to handle timer interrupts and keyboard
 169 interrupts, respectively.  In projects 2 and later, we also prepare to
 170 handle interrupts caused by user programs using @func{exception_init}
 171 and @func{syscall_init}.
 172
 173 Now that interrupts are set up, we can start preemptively scheduling
 174 threads with @func{thread_start}, which also enables interrupts.
 175 Interrupt-driven serial port I/O is also possible now, so we use
 176 @func{serial_init_queue} to switch to that mode.  Finally,
 177 @func{timer_calibrate} calibrates the timer for accurate short delays.
 178
 179 If the filesystem is compiled in, as it will be in project 2 and
 180 later, we now initialize the disks with @func{disk_init}, then the
 181 filesystem with @func{filesys_init}, and run any operations that were
 182 requested on the kernel command line with @func{fsutil_run}.
 183
 184 Boot is complete, so we print a message.
 185
 186 For project 1, now we execute @func{test}, which runs whatever test is
 187 currently compiled into the kernel.  For later projects, we will
 188 instead run a user program and wait for it to complete.
 189
 190 Finally, if @option{-q} was specified on the kernel command line, we
 191 call @func{power_off} to terminate the machine simulator.  Otherwise,
 192 @func{main} calls @func{thread_exit}, which allows any other running
 193 threads to continue running.
 194
 195 @node Threads Tour
 196 @section Threads
 197
 198 @menu
 199 * struct thread::
 200 * Thread Functions::
 201 * Thread Switching::
 202 * Synchronization::
 203 * Interrupt Handling::
 204 * Memory Allocation::
 205 @end menu
 206
 207 @node struct thread
 208 @subsection @code{struct thread}
 209
 210 The main Pintos data structure for threads is @struct{thread},
 211 declared in @file{threads/thread.h}.  @struct{thread} has these
 212 members with the given type:
 213
 214 @table @code
 215 @item tid_t tid;
 216 The thread's thread identifier or @dfn{tid}.  Every thread must have a
 217 tid that is unique over the entire lifetime of the kernel.  By
 218 default, @code{tid_t} is a @code{typedef} for @code{int} and each new
 219 thread receives the numerically next higher tid, starting from 1 for
 220 the initial process.  You can change the type and the numbering scheme
 221 if you like.
 222
 223 @item enum thread_status status;
 224 The thread's state, one of these:
 225
 226 @table @code
 227 @item THREAD_RUNNING
 228 The thread is running.  Exactly one thread is running at a given time.
 229 @func{thread_current} returns the running thread.
 230
 231 @item THREAD_READY
 232 The thread is ready to run, but it's not running right now.  The
 233 thread could be selected to run the next time the scheduler is
 234 invoked.  Ready threads are kept in a doubly linked list called
 235 @code{ready_list}.
 236
 237 @item THREAD_BLOCKED
 238 The thread is waiting for something, e.g.@: a lock to become
 239 available, an interrupt to be invoked.  The thread won't be scheduled
 240 again until it transitions to the @code{THREAD_READY} state with a
 241 call to @func{thread_unblock}.
 242
 243 @item THREAD_DYING
 244 The thread will be destroyed by the scheduler after switching to the
 245 next thread.
 246 @end table
 247
 248 @item char name[16];
 249 The thread's name as a string, or at least the first few characters of
 250 it.
 251
 252 @item uint8_t *stack;
 253 Every thread has its own stack to keep track of its state.  When the
 254 thread is running, the CPU's stack pointer register tracks the top of
 255 the stack and this member is unused.  But when the CPU switches to
 256 another thread, this member saves the thread's stack pointer.  No
 257 other members are needed to save the thread's registers, because the
 258 other registers that must be saved are saved on the stack.
 259
 260 @item int priority;
 261 A thread priority, ranging from the lowest possible priority
 262 @code{PRI_MIN} (0) to the highest possible priority @code{PRI_MAX}
 263 (59).  Pintos as provided ignores thread priorities, but you will
 264 implement priority scheduling in problem 1-2 (@pxref{Problem 1-2
 265 Priority Scheduling}).
 266
 267 @item struct list_elem elem;
 268 A ``list element'' used to put the thread into doubly linked lists,
 269 either the list of threads ready to run or a list of threads waiting
 270 on a semaphore.  Take a look at @file{lib/kernel/list.h} for
 271 information on how to use the Pintos doubly linked list ADT.
 272
 273 @item uint32_t *pagedir;
 274 Only present in project 2 and later.
 275
 276 @item unsigned magic;
 277 Always set to @code{THREAD_MAGIC}, which is just a random number
 278 defined in @file{threads/thread.c}, and used to detect stack overflow.
 279 See below for more information.
 280 @end table
 281
 282 Every @struct{thread} occupies the beginning of its own page of
 283 memory.  The rest of the page is used for the thread's stack, which
 284 grows downward from the end of the page.  It looks like this:
 285
 286 @example
 287 @group
 288         4 kB +---------------------------------+
 289              |          kernel stack           |
 290              |                |                |
 291              |                |                |
 292              |                V                |
 293              |         grows downward          |
 294              |                                 |
 295              |                                 |
 296              |                                 |
 297              |                                 |
 298              |                                 |
 299              |                                 |
 300              |                                 |
 301              |                                 |
 302              +---------------------------------+
 303              |              magic              |
 304              |                :                |
 305              |                :                |
 306              |              status             |
 307              |               tid               |
 308         0 kB +---------------------------------+
 309 @end group
 310 @end example
 311
 312 The upshot of this is twofold.  First, @struct{thread} must not be
 313 allowed to grow too big.  If it does, then there will not be enough
 314 room for the kernel stack.  Our base @struct{thread} is only a few
 315 bytes in size.  It probably should stay well under 1 kB.
 316
 317 Second, kernel stacks must not be allowed to grow too large.  If a
 318 stack overflows, it will corrupt the thread state.  Thus, kernel
 319 functions should not allocate large structures or arrays as non-static
 320 local variables.  Use dynamic allocation with @func{malloc} or
 321 @func{palloc_get_page} instead.
 322
 323 This brings us to the purpose of @struct{thread}'s @code{magic}
 324 member.  If a thread's stack does overflow, then @code{magic} will be
 325 the first member of @struct{thread} to be overwritten, because it is
 326 closest to the kernel stack.  Thus, some of the thread functions
 327 (notably @func{thread_current}) check that @code{magic} has the proper
 328 value.
 329
 330 @node Thread Functions
 331 @subsection Thread Functions
 332
 333 @file{threads/thread.c} implements several public functions for thread
 334 support.  Let's take a look at the most useful:
 335
 336 @deftypefun void thread_init (void)
 337 Called by @func{main} to initialize the thread system.  Its main
 338 purpose is to create a @struct{thread} for Pintos's initial thread.
 339 This is possible because the Pintos loader sets up the initial
 340 thread's stack at the end of a page.  Before @func{thread_init} runs,
 341 @func{thread_current} will fail because the running thread's
 342 @code{magic} value is incorrect.  Lots of functions call
 343 @func{thread_current} directly or indirectly, including
 344 @func{lock_acquire} for locking a lock, so @func{thread_init} is
 345 called early in Pintos initialization.
 346 @end deftypefun
 347
 348 @deftypefun void thread_start (void)
 349 Called by @func{main} to start the scheduler.  Creates the idle
 350 thread, that is, the thread that is scheduled when no other thread is
 351 ready.  Then enables interrupts, which enables the scheduler because
 352 processes are rescheduled on return from the timer interrupt, using
 353 @func{intr_yield_on_return} (@pxref{External Interrupt Handling}).
 354 @end deftypefun
 355
 356 @deftypefun void thread_create (const char *@var{name}, int @var{priority}, thread_func *@var{func}, void *@var{aux})
 357 Creates and starts a new thread named @var{name} with the given
 358 @var{priority}, returning the new thread's tid.  The thread executes
 359 @var{func}, passing @var{aux} as the function's single argument.
 360
 361 @func{thread_create} works by allocating a page for the thread's
 362 @struct{thread} and stack and initializing its members, then setting
 363 up a set of fake stack frames for it (we'll talk more about this
 364 later).  The thread was initialized in the blocked state, so the final
 365 action before returning is to unblock it, allowing the new thread to
 366 be scheduled.
 367 @end deftypefun
 368
 369 @deftypefun void thread_block (void)
 370 Transitions the running thread from the running state to the blocked
 371 state.  This thread will not run again until @func{thread_unblock} is
 372 called on it, so you'd better have some way arranged for that to
 373 happen.
 374 @end deftypefun
 375
 376 @deftypefun void thread_unblock (struct thread *@var{thread})
 377 Transitions @var{thread}, which must be in the blocked state, to the
 378 ready state, allowing it to continue to run.  This is called when the
 379 event that the thread is waiting for happens, such as the lock that
 380 the thread is waiting on becoming available.
 381 @end deftypefun
 382
 383 @deftypefun struct thread *thread_current (void)
 384 Returns the running thread.
 385 @end deftypefun
 386
 387 @deftypefun void thread_exit (void) NO_RETURN
 388 Causes the current thread to exit.  Never returns (hence
 389 @code{NO_RETURN}: @pxref{UNUSED NO_RETURN NO_INLINE PRINTF_FORMAT}).
 390 @end deftypefun
 391
 392 @deftypefun void thread_yield (void)
 393 Yields the CPU to the scheduler, which picks a new thread to run.  The
 394 new thread might be the current thread, so you can't depend on this
 395 function to keep this thread from running for any particular length of
 396 time.
 397 @end deftypefun
 398
 399 @node Thread Switching
 400 @subsection Thread Switching
 401
 402 @func{schedule} is the function responsible for switching threads.  It
 403 is internal to @file{threads/thread.c} and called only by the three
 404 public thread functions that need to switch threads:
 405 @func{thread_block}, @func{thread_exit}, and @func{thread_yield}.
 406 Before these functions call @func{schedule}, they all disable
 407 interrupts (or ensure that they are already disabled) and then change
 408 the running thread's state to something other than running.
 409
 410 The actual @func{schedule} implementation is simple.  It records the
 411 current thread in local variable @var{cur}, determines the next thread
 412 to run as local variable @var{next} (by calling
 413 @func{next_thread_to_run}, and then calls @func{switch_threads} to do
 414 the actual thread switch.  The thread we switched to was also running
 415 inside @func{switch_threads}, as are all the threads not currently
 416 running in Pintos, so the new thread now returns out of
 417 @func{switch_threads}, returning the previously running thread.
 418
 419 @func{switch_threads} is an assembly language routine in
 420 @file{threads/switch.S}.  It saves registers on the stack, stores the
 421 CPU's current stack pointer into the current @struct{thread}'s @code{stack}
 422 member, restores the new thread's @code{stack} into the CPU's stack
 423 pointer, restores registers from the stack, and returns.
 424
 425 The rest of the scheduler is implemented as @func{schedule_tail}.  It
 426 marks the new thread as running.  If the thread we just switched from
 427 is in the dying state, then it also frees the page that contained the
 428 dying thread's @struct{thread} and stack, which couldn't be freed
 429 before the thread switch because the switch needed to use it.
 430
 431 Running a thread for the first time is a special case.  When
 432 @func{thread_create} creates a new thread, it goes through a fair
 433 amount of trouble to get it started properly.  In particular, a new
 434 thread hasn't started running yet, so there's no way for it to be
 435 running inside @func{switch_threads} as the scheduler expects.  To
 436 solve the problem, @func{thread_create} creates some fake stack frames
 437 in the new thread's stack:
 438
 439 @itemize @bullet
 440 @item
 441 The topmost fake stack frame is for @func{switch_threads}, represented
 442 by @struct{switch_threads_frame}.  The important part of this frame is
 443 its @code{eip} member, the return address.  We point @code{eip} to
 444 @func{switch_entry}.
 445
 446 @item
 447 The next fake stack frame is for @func{switch_entry}, an assembly
 448 language routine in @file{threads/switch.S} that adjusts the stack
 449 pointer,@footnote{This is because @func{switch_threads} takes
 450 arguments on the stack and the 80@var{x}86 SVR4 calling convention
 451 requires the caller, not the called function, to remove them when the
 452 call is complete.  See @bibref{SysV-i386} chapter 3 for details.}
 453 calls @func{schedule_tail} (this special case is why
 454 @func{schedule_tail} is separate from @func{schedule}), and returns.
 455 We fill in its stack frame so that it returns into
 456 @func{kernel_thread}, a function in @file{threads/thread.c}.
 457
 458 @item
 459 The final stack frame is for @func{kernel_thread}, which enables
 460 interrupts and calls the thread's function (the function passed to
 461 @func{thread_create}).  If the thread's function returns, it calls
 462 @func{thread_exit} to terminate the thread.
 463 @end itemize
 464
 465 @node Synchronization
 466 @subsection Synchronization
 467
 468 Situations often arise in threaded programs in which threads want to
 469 share resources.  If resource sharing is not handled in a careful,
 470 controlled fashion, then threads can end up making a big mess.
 471 This is especially the case in operating system kernels, where
 472 faulty sharing can crash the entire machine.  Pintos provides several
 473 synchronization primitives to help out.
 474
 475 @menu
 476 * Disabling Interrupts::
 477 * Semaphores::
 478 * Locks::
 479 * Condition Variables::
 480 @end menu
 481
 482 @node Disabling Interrupts
 483 @subsubsection Disabling Interrupts
 484
 485 The crudest way to do synchronization is to disable interrupts, that
 486 is, to temporarily prevent the CPU from responding to interrupts.  If
 487 interrupts are off, no other thread will preempt the running thread,
 488 because thread preemption is driven by the timer interrupt.  If
 489 interrupts are on, as they normally are, then the running thread may
 490 be preempted by another at any time, whether between two C statements
 491 or even within the execution of one.
 492
 493 Incidentally, this means that Pintos is a ``preemptible kernel,'' that
 494 is, kernel threads can be preempted at any time.  Traditional Unix
 495 systems are ``nonpreemptible,'' that is, kernel threads can only be
 496 preempted at points where they explicitly call into the scheduler.
 497 User programs can be preempted at any time in both models.  As you
 498 might imagine, preemptible kernels require more explicit
 499 synchronization.
 500
 501 You should have little need to set the interrupt state directly.  Most
 502 of the time you should use the other synchronization primitives
 503 described in the following sections.
 504
 505 Types and functions for disabling and enabling interrupts are in
 506 @file{threads/interrupt.h}.
 507
 508 @deftp Type {enum intr_level}
 509 One of @code{INTR_OFF} or @code{INTR_ON}, denoting that interrupts are
 510 disabled or enabled, respectively.
 511 @end deftp
 512
 513 @deftypefun {enum intr_level} intr_get_level (void)
 514 Returns the current interrupt state.
 515 @end deftypefun
 516
 517 @deftypefun {enum intr_level} intr_set_level (enum intr_level @var{level})
 518 Turns interrupts on or off according to @var{level} and returns the
 519 previous interrupt state.
 520 @end deftypefun
 521
 522 @deftypefun {enum intr_level} intr_enable (void)
 523 Turns interrupts on and returns the previous interrupt state.
 524 @end deftypefun
 525
 526 @deftypefun {enum intr_level} intr_disable (void)
 527 Turns interrupts off and returns the previous interrupt state.
 528 @end deftypefun
 529
 530 @node Semaphores
 531 @subsubsection Semaphores
 532
 533 A semaphore is a nonnegative integer along with two operators
 534 for atomically manipulating it, which are:
 535
 536 @itemize @bullet
 537 @item
 538 ``Down'' or ``P'': wait for the value to become positive, then
 539 decrement it.
 540
 541 @item
 542 ``Up'' or ``V'': increment the value (and wake up one waiting thread,
 543 if any).
 544 @end itemize
 545
 546 A semaphore initialized to 1 is typically used for controlling access
 547 to a resource.  Before a block of code starts using the resource, it
 548 ``downs'' the semaphore, then after it is done with the resource it
 549 ``ups'' the resource.  In such a case a lock, described below, may be
 550 more appropriate.
 551
 552 A semaphore initialized to 0 can be useful for waiting for an event
 553 that will happen exactly once.  For example, suppose thread @var{A}
 554 starts another thread @var{B} and wants to wait for @var{B} to signal
 555 that some activity is complete.  @var{A} can create a semaphore
 556 initialized to 0, pass it to @var{B} as it starts it, and then
 557 ``down'' the semaphore.  When @var{B} finishes its activity, it
 558 ``ups'' the semaphore.  This works regardless of whether @var{A}
 559 ``downs'' the semaphore or @var{B} ``ups'' it first.
 560
 561 Pintos declared its semaphore type and operations on them in
 562 @file{threads/synch.h}.
 563
 564 @deftp {Type} {struct semaphore}
 565 Represents a semaphore.
 566 @end deftp
 567
 568 @deftypefun void sema_init (struct semaphore *sema, unsigned value, const char *name)
 569 Initializes @var{sema} as a new semaphore with the given initial
 570 @var{value}.  Gives @var{sema} the specified @var{name} for use in
 571 debugging.
 572 @end deftypefun
 573
 574 @deftypefun void sema_down (struct semaphore *@var{sema})
 575 Executes the ``down'' or ``P'' operation on the semaphore, waiting for
 576 its value to become positive and then decrementing it by one.
 577 @end deftypefun
 578
 579 @deftypefun void sema_up (struct semaphore *@var{sema})
 580 Increments @var{sema}'s value.  If any threads are waiting on
 581 @var{sema}, wakes one of them up.
 582 @end deftypefun
 583
 584 Semaphores are internally built out of interrupt disabling
 585 (@pxref{Disabling Interrupts}), thread blocking and unblocking
 586 (@func{thread_block} and @func{thread_unblock}).  Semaphores maintain
 587 a doubly linked list of waiting threads using the linked list
 588 implementation in @file{lib/kernel/list.c}.
 589
 590 @node Locks
 591 @subsubsection Locks
 592
 593 A lock is a specialization of a semaphore (@pxref{Semaphores}) with an
 594 initial value of 1.  The difference between a lock and such a
 595 semaphore is twofold.  First, a semaphore can have a value greater
 596 than 1, but a lock can only be owned by a single thread at a time.
 597 Second, a semaphore does not have an owner, meaning that one thread
 598 can ``down'' the semaphore and then another one ``up'' it, but with a
 599 lock a single thread must both acquire and release it.  When these
 600 restrictions prove onerous, it's a good sign that a semaphore should
 601 be used, instead of a lock.
 602
 603 Locks in Pintos are not ``recursive,'' that is, it is an error for the
 604 thread currently holding a lock to try to acquire that lock.
 605
 606 Lock types and functions are declared in @file{threads/synch.h}.
 607
 608 @deftp {Type} {struct lock}
 609 Represents a lock.
 610 @end deftp
 611
 612 @deftypefun void lock_init (struct lock *@var{lock}, const char *@var{name})
 613 Initializes @var{lock} as a new lock and gives it the specified
 614 @var{name} for use in debugging.
 615 @end deftypefun
 616
 617 @deftypefun void lock_acquire (struct lock *@var{lock})
 618 Acquires @var{lock} for use by the current thread, first waiting for
 619 any current owner to release it if necessary.
 620 @end deftypefun
 621
 622 @deftypefun void lock_release (struct lock *@var{lock})
 623 Releases @var{lock}, which the current thread must own.
 624 @end deftypefun
 625
 626 @deftypefun bool lock_held_by_current_thread (const struct lock *@var{lock})
 627 Returns true if the running thread owns @var{lock},
 628 false otherwise.
 629 @end deftypefun
 630
 631 @node Condition Variables
 632 @subsubsection Condition Variables
 633
 634 A condition variable allows one piece of code to signal a condition
 635 and cooperating code to receive the signal and act upon it.  Each
 636 condition variable is associated with a lock.  A given condition
 637 variable is associated with only a single lock, but one lock may be
 638 associated with any number of condition variables.  A set of condition
 639 variables taken together with their lock is called a ``monitor.''
 640
 641 A thread that owns the monitor lock is said to be ``in the monitor.''
 642 The thread in the monitor has control over all the data protected with
 643 the lock.  It may freely examine or modify this data.  If it discovers
 644 that it needs to wait for some condition to become true, then it
 645 ``waits'' on the associated condition, which releases the lock and
 646 waits for the condition to be signaled.  If it, on the other hand, has
 647 caused one of these conditions to become true, it ``signals'' the
 648 condition to wake up one waiter, or ``broadcasts'' the condition to
 649 wake all of them.
 650
 651 The classical example of a monitor is handling a buffer into which one
 652 ``producer'' thread writes characters and out of which a second
 653 ``consumer'' thread reads characters.  To implement this case we need,
 654 besides the monitor lock, two condition variables which we will call
 655 @var{not_full} and @var{not_empty}:
 656
 657 @example
 658 char buf[BUF_SIZE];     /* @r{Buffer.} */
 659 size_t n = 0;           /* @r{0 <= n <= @var{BUF_SIZE}: # of characters in buffer.} */
 660 size_t head = 0;        /* @r{@var{buf} index of next char to write (mod @var{BUF_SIZE}).} */
 661 size_t tail = 0;        /* @r{@var{buf} index of next char to read (mod @var{BUF_SIZE}).} */
 662 struct lock lock;       /* @r{Monitor lock.} */
 663 struct condition not_empty; /* @r{Signaled when the buffer is not empty.} */
 664 struct condition not_full; /* @r{Signaled when the buffer is not full.} */
 665
 666 @dots{}@r{initialize the locks and condition variables}@dots{}
 667
 668 void put (char ch) @{
 669   lock_acquire (&lock);
 670   while (n == BUF_SIZE)            /* @r{Can't add to @var{buf} as long as it's full.} */
 671     cond_wait (&not_full, &lock);
 672   buf[head++ % BUF_SIZE] = ch;     /* @r{Add @var{ch} to @var{buf}.} */
 673   n++;
 674   cond_signal (&not_empty, &lock); /* @r{@var{buf} can't be empty anymore.} */
 675   lock_release (&lock);
 676 @}
 677
 678 char get (void) @{
 679   char ch;
 680   lock_acquire (&lock);
 681   while (n == 0)                  /* @r{Can't read from @var{buf} as long as it's empty.} */
 682     cond_wait (&not_empty, &lock);
 683   ch = buf[tail++ % BUF_SIZE];    /* @r{Get @var{ch} from @var{buf}.} */
 684   n--;
 685   cond_signal (&not_full, &lock); /* @r{@var{buf} can't be full anymore.} */
 686   lock_release (&lock);
 687 @}
 688 @end example
 689
 690 As the code above illustrates, Pintos monitors are ``Mesa'' style, not
 691 ``Hoare'' style, that is, sending and receiving a signal are not an
 692 atomic operation.  Thus, typically the caller must recheck the
 693 condition after the wait completes and, if necessary, wait again.
 694
 695 Condition variable types and functions are declared in
 696 @file{threads/synch.h}.
 697
 698 @deftp {Type} {struct condition}
 699 Represents a condition variable.
 700 @end deftp
 701
 702 @deftypefun void cond_init (struct condition *@var{cond}, const char *@var{name})
 703 Initializes @var{cond} as a new condition variable and gives it the
 704 specified @var{name} for use in debugging.
 705 @end deftypefun
 706
 707 @deftypefun void cond_wait (struct condition *@var{cond}, struct lock *@var{name})
 708 Atomically releases @var{lock} (the monitor lock) and waits for
 709 @var{cond} to be signaled by some other piece of code.  After
 710 @var{cond} is signaled, @var{lock} is reacquired before returning.
 711 @var{lock} must be held before calling this function.
 712 @end deftypefun
 713
 714 @deftypefun void cond_signal (struct condition *@var{cond}, struct lock *@var{lock})
 715 If any threads are waiting on @var{cond} (protected by monitor lock
 716 @var{lock}), then this function signals one of them to wake up from
 717 its wait.  @var{lock} must be held before calling this function.
 718 @end deftypefun
 719
 720 @deftypefun void cond_broadcast (struct condition *@var{cond}, struct lock *@var{lock})
 721 Wakes up all threads, if any, waiting on @var{cond} (protected by
 722 monitor lock @var{lock}).  @var{lock} must be held before calling this
 723 function.
 724 @end deftypefun
 725
 726 @node Interrupt Handling
 727 @subsection Interrupt Handling
 728
 729 An @dfn{interrupt} notifies the CPU of some event.  Much of the work
 730 of an operating system relates to interrupts in one way or another.
 731 For our purposes, we classify interrupts into two broad categories:
 732
 733 @itemize @bullet
 734 @item
 735 @dfn{External interrupts}, that is, interrupts originating outside the
 736 CPU.  These interrupts come from hardware devices such as the system
 737 timer, keyboard, serial ports, and disks.  External interrupts are
 738 @dfn{asynchronous}, meaning that they don't occur in a fashion
 739 synchronized with anything going on in the CPU.  External interrupts
 740 are what @func{intr_disable} and related functions can arrange to
 741 postpone (@pxref{Disabling Interrupts}).
 742
 743 @item
 744 @dfn{Internal interrupts}, that is, interrupts caused by something
 745 executing on the CPU.  These interrupts are caused by something
 746 unusual happening during instruction execution: accessing invalid
 747 memory (a @dfn{page fault}), executing invalid instructions, and
 748 various other disallowed activities.  Because they are caused by CPU
 749 instructions, internal interrupts are @dfn{synchronous} or
 750 synchronized with CPU instructions.  @func{intr_disable} does not
 751 disable internal interrupts.
 752 @end itemize
 753
 754 Because the CPU treats all interrupts largely the same way, regardless
 755 of source, Pintos uses the same infrastructure for both internal and
 756 external interrupts, up to a point.  The next section describes this
 757 common infrastructure.  But external and internal interrupts are
 758 actually quite different, so we also follow up that section with one
 759 specific to each category.
 760
 761 If you haven't already read chapter 3 in @bibref{IA32-v1}, it is
 762 recommended that you do so now.  You might also want to skim chapter 5
 763 in @bibref{IA32-v3}.
 764
 765 @menu
 766 * Interrupt Infrastructure::
 767 * Internal Interrupt Handling::
 768 * External Interrupt Handling::
 769 @end menu
 770
 771 @node Interrupt Infrastructure
 772 @subsubsection Interrupt Infrastructure
 773
 774 When an interrupt occurs while the kernel is running, the CPU saves
 775 its most essential state on the stack and jumps to an interrupt
 776 handler routine.  The 80@var{x}86 architecture allows for 256 possible
 777 interrupts, each of which can have its own handler. The handler for
 778 each interrupt is defined in an array called the @dfn{interrupt
 779 descriptor table} or IDT.
 780
 781 In Pintos, @func{intr_init} in @file{threads/interrupt.c} sets up the
 782 IDT so that each entry points to a unique entry point in
 783 @file{threads/intr-stubs.S} named @func{intr@var{NN}_stub}, where
 784 @var{NN} is the interrupt number in
 785 hexadecimal.  Because the CPU doesn't give
 786 us any other way to find out the interrupt number, this entry point
 787 pushes the interrupt number on the stack.  Then it jumps to
 788 @func{intr_entry}, which pushes all the registers that the processor
 789 didn't save for us, and then calls @func{intr_handler}, which
 790 brings us back into C in @file{threads/interrupt.c}.
 791
 792 The main job of @func{intr_handler} is to call any function that has
 793 been registered for handling the particular interrupt.  (If there's no
 794 function registered, it dumps some information to the console and
 795 panics the kernel.)  It does some extra processing for external
 796 interrupts that we'll discuss later.
 797
 798 When @func{intr_handler} returns, the assembly code in
 799 @file{threads/intr-stubs.S} restores all the CPU registers saved
 800 earlier and directs the CPU to return from the interrupt.
 801
 802 A few types and functions apply to both internal and external
 803 interrupts.
 804
 805 @deftypefun void intr_register (uint8_t @var{vec}, int @var{dpl}, enum intr_level @var{level}, intr_handler_func *@var{func}, const char *@var{name})
 806 Registers @var{func} to be called when the interrupt numbered
 807 @var{vec} is triggered.  Names the interrupt @var{name} for debugging
 808 purposes.
 809
 810 If @var{level} is @code{INTR_OFF} then handling of further interrupts
 811 will be disabled while the interrupt is being processed.  If @var{vec}
 812 denotes an external interrupt, then @var{level} must be
 813 @code{INTR_OFF}.  Otherwise, interrupts should normally be turned on
 814 during the handling of an internal interrupt.
 815
 816 For internal interrupts, @var{dpl} determines how the interrupt can be
 817 invoked.  If @var{dpl} is 0, then the interrupt can be invoked only by
 818 kernel threads.  Otherwise @var{dpl} should be 3, which allows user
 819 processes to invoke the interrupt as well (this is useful only
 820 starting with project 2).  @var{dpl} has no effect on external
 821 interrupts
 822 @end deftypefun
 823
 824 @deftp {Type} {void intr_handler_func (struct intr_frame *@var{frame})}
 825 This is the type of an interrupt handler function.  Its @var{frame}
 826 argument (see below) allows it to determine the cause of the interrupt
 827 and the state of the thread that was interrupted.
 828 @end deftp
 829
 830 @deftp {Type} {struct intr_frame}
 831 The stack frame of an interrupt handler.  Its most interesting members
 832 are as follows:
 833 @table @code
 834 @item uint32_t edi;
 835 @item uint32_t esi;
 836 @item uint32_t ebp;
 837 @item uint32_t esp_dummy;
 838 @item uint32_t ebx;
 839 @item uint32_t edx;
 840 @item uint32_t ecx;
 841 @item uint32_t eax;
 842 @item uint16_t es;
 843 @item uint16_t ds;
 844 Register values in the interrupted thread saved by @func{intr_entry}.
 845 The @code{esp_dummy} value isn't actually used (refer to the
 846 description of @code{PUSHA} in @bibref{IA32-v2b} for details).
 847
 848 @item uint32_t vec_no;
 849 The interrupt vector number, ranging from 0 to 255.
 850
 851 @item uint32_t error_code;
 852 The ``error code'' pushed on the stack by the CPU for some internal
 853 interrupts.
 854
 855 @item void (*eip) (void);
 856 The address of the next instruction to be executed by the interrupted
 857 thread.
 858
 859 @item void *esp;
 860 The interrupted thread's stack pointer.
 861 @end table
 862 @end deftp
 863
 864 @deftypefun {const char *} intr_name (uint8_t @var{vec})
 865 Returns the registered name of the interrupt numbered @var{vec}, or
 866 @code{"unknown"} if the interrupt's name is not known.
 867 @end deftypefun
 868
 869 @node Internal Interrupt Handling
 870 @subsubsection Internal Interrupt Handling
 871
 872 When an internal interrupt occurs, it is because the running kernel
 873 thread (or, starting from project 2, the running user process) has
 874 caused it.  Thus, because it is related to a thread (or process), an
 875 internal interrupt is said to happen in a ``process context.''
 876
 877 In an internal interrupt, it can make sense to examine the
 878 @struct{intr_frame} passed to the interrupt handler.  In cases where
 879 the interrupted thread intentionally caused the interrupt, it can even
 880 make sense to modify it.  When the interrupt returns, modified members
 881 in @struct{intr_frame} become changes to the thread's registers.
 882 We'll use this in project 2 to return values from system call
 883 handlers.
 884
 885 There are no special restrictions on what an internal interrupt
 886 handler can or can't do.  Generally they should run with interrupts
 887 enabled, just like other code, and so they can be preempted by other
 888 kernel threads.  Thus, they do need to synchronize with other threads
 889 on shared data and other resources (@pxref{Synchronization}).
 890
 891 @node External Interrupt Handling
 892 @subsubsection External Interrupt Handling
 893
 894 Whereas an internal interrupt runs in the context of the thread that
 895 caused it, external interrupts do not have any predictable context.
 896 They are asynchronous, so it can be invoked at any time that
 897 interrupts have not been enabled.  We say that an external interrupt
 898 runs in an ``interrupt context.''
 899
 900 In an external interrupt, the @struct{intr_frame} passed to the
 901 handler is not very meaningful.  It describes the state of the thread
 902 or process that was interrupted, but there is no way to predict which
 903 one that is.  It is possible, though rarely useful, to examine it, but
 904 modifying it is a recipe for disaster.
 905
 906 The activities of an external interrupt handler are severely
 907 restricted.  First, only one external interrupt may be processed at a
 908 time, that is, nested external interrupt handling is not supported.
 909 This means that external interrupts must be processed with interrupts
 910 disabled (@pxref{Disabling Interrupts}) and that interrupts may not be
 911 enabled at any point during their execution.
 912
 913 Second, an interrupt handler must not call any function that can
 914 sleep, which includes @func{thread_yield}, @func{lock_acquire}, and
 915 many others.  This is because external interrupts use space on the
 916 stack of the kernel thread that was running at the time the interrupt
 917 occurred.  If the interrupt handler tried to sleep and that thread
 918 resumed, then the two uses of the single stack would interfere, which
 919 cannot be allowed.
 920
 921 Because an external interrupt runs with interrupts disabled, it
 922 effectively monopolizes the machine and delays all other activities.
 923 Therefore, external interrupt handlers should complete as quickly as
 924 they can.  Any activities that require much CPU time should instead
 925 run in a kernel thread, possibly a thread whose activity is triggered
 926 by the interrupt using some synchronization primitive.
 927
 928 External interrupts are also special because they are controlled by a
 929 device external to the CPU called a @dfn{programmable interrupt
 930 controller} or @dfn{PIC} for short.  When @func{intr_init} sets up the
 931 CPU's IDT, it also initializes the PIC for interrupt handling.  The
 932 PIC also has to be ``acknowledged'' at the end of processing for each
 933 external interrupt.  @func{intr_handler} takes care of that by calling
 934 @func{pic_end_of_interrupt}, which sends the proper signals to the
 935 PIC.
 936
 937 The following additional interrupts functions are related to external
 938 interrupts.
 939
 940 @deftypefun bool intr_context (void)
 941 Returns true if we are running in an interrupt context and false
 942 otherwise.  Mainly used at the beginning of functions that might sleep
 943 or that otherwise should not be, in this form:
 944 @example
 945 ASSERT (!intr_context ());
 946 @end example
 947 @end deftypefun
 948
 949 @deftypefun intr_yield_on_return (void)
 950 When called in an interrupt context, causes @func{thread_yield} to be
 951 called just before the interrupt returns.  This is used, for example,
 952 in the timer interrupt handler to cause a new thread to be scheduled
 953 on every timer tick.
 954 @end deftypefun
 955
 956 @node Memory Allocation
 957 @subsection Memory Allocation
 958
 959 Pintos contains two memory allocators, one that allocates memory in
 960 units of a page, and one that can allocate blocks of any size.
 961
 962 @menu
 963 * Page Allocation::
 964 * Block Allocator::
 965 @end menu
 966
 967 @node Page Allocation
 968 @subsubsection Page Allocation
 969
 970 The page allocator declared in @file{threads/palloc.h} allocates
 971 memory in units of a page.  It is most often used to allocate memory
 972 one page at a time, but it can also allocate multiple contiguous pages
 973 at once.
 974
 975 The page allocator divides the memory it allocates into two pools,
 976 called the kernel and user pools.  By default, each pool gets half of
 977 system memory, but this can be changed with a kernel command line
 978 option (@pxref{Why PAL_USER?}).  An allocation request draws from one
 979 pool or the other.  If one pool becomes empty, the other may still
 980 have free pages.  The user pool should be used for allocating memory
 981 for user processes and the kernel pool for all other allocations.
 982 This will only become important starting with project 3.  Until then,
 983 all allocations should be made from the kernel pool.
 984
 985 Each pools is managed with a bitmap of used pages, one bit per page in
 986 the pool.  An allocation request for @var{n} pages scans the bitmap,
 987 starting from the beginning, for @var{n} consecutive bits set to
 988 false, indicating that those pages are free, and then sets those bits
 989 to true to mark them as now used.  This is a ``first fit'' allocation
 990 strategy.
 991
 992 The page allocator is subject to fragmentation.  That is, it may not
 993 be possible to allocate @var{n} contiguous pages even though @var{n}
 994 or more pages are free, because the free pages are separated by used
 995 pages.  In fact, in pathological cases it may be impossible to
 996 allocate 2 contiguous pages even though @var{n} / 2 pages are free!
 997 However, single-page requests can't fail due to fragmentation.  Thus,
 998 it is best to limit the need to allocate more than one contiguous page
 999 as much as possible.
1000
1001 The interesting page allocator types and functions are described
1002 below.
1003
1004 @deftp {Type} {enum palloc_flags}
1005 A set of flags that describe how to allocate pages.  These flags may
1006 be combined in any combination:
1007
1008 @table @code
1009 @item PAL_ASSERT
1010 If the pages cannot be allocated, panic the kernel.  This is only
1011 appropriate during kernel initialization, because user processes
1012 should never be permitted to panic the kernel.
1013
1014 @item PAL_ZERO
1015 Clear the pages to all zeros before returning them.  If not set,
1016 newly allocated pages' contents are unpredictable.
1017
1018 @item PAL_USER
1019 Obtain the pages from the user pool.  If not set, pages are allocated
1020 from the kernel pool.
1021 @end table
1022 @end deftp
1023
1024 @deftypefun {void *} palloc_get_page (enum palloc_flags @var{flags})
1025 Obtains a single free page, allocating it in the manner specified by
1026 @var{flags}, and returns it.  Returns a null pointer if no pages are
1027 free.
1028 @end deftypefun
1029
1030 @deftypefun {void *} palloc_get_multiple (enum palloc_flags @var{flags}, size_t @var{page_cnt})
1031 Obtains @var{page_cnt} contiguous free pages, allocating them in the
1032 manner specified by @var{flags}, and returns them.  Returns a null
1033 pointer if no pages are free.
1034 @end deftypefun
1035
1036 @deftypefun void palloc_free_page (void *@var{page})
1037 Frees @var{page}, which must have been obtained using
1038 @func{palloc_get_page} or @func{palloc_get_multiple}.
1039 @end deftypefun
1040
1041 @deftypefun void palloc_free_multiple (void *@var{pages}, size_t @var{page_cnt})
1042 Frees the @var{page_cnt} contiguous pages starting at @var{pages}.
1043 All of the pages must have been obtained using @func{palloc_get_page}
1044 or @func{palloc_get_multiple}.
1045 @end deftypefun
1046
1047 @node Block Allocator
1048 @subsubsection Block Allocator
1049
1050 The block allocator, declared in @file{threads/malloc.h}, can allocate
1051 blocks of any size.  It is layered on top of the page allocator
1052 described in the previous section.  Blocks returned by the block
1053 allocator are obtained from the kernel pool.
1054
1055 The block allocator uses two different strategies for allocating
1056 memory.  The first of these applies to blocks no larger than 1 kB (one
1057 fourth of the the page size).  These allocations are rounded up to the
1058 nearest power of 2, or 16 bytes, whichever is larger.  Then they are
1059 grouped into a page used only for allocations of that power of 2
1060 size.
1061
1062 A different strategy applies to allocating blocks larger than 1 kB.
1063 These allocations (plus a small amount of overhead) are rounded up to
1064 the nearest page in size, and then the block allocator requests that
1065 number of contiguous pages from the page allocator.
1066
1067 In either case, the difference between the allocation requested size
1068 and the actual block size is wasted.  A real operating system would
1069 carefully tune its allocator to minimize this waste, but this is
1070 unimportant in an instructional system like Pintos.
1071
1072 As long as a page can be obtained from the page allocator, small
1073 allocations always succeed.  Most small allocations will not require a
1074 new page from the page allocator at all.  However, large allocations
1075 always require calling into the page allocator, and any allocation
1076 that needs more than one contiguous page can fail due to fragmentation,
1077 as already discussed in the previous section.  Thus, you should
1078 minimize the number of large allocations in your code, especially
1079 those over approximately 4 kB each.
1080
1081 The interface to the block allocator is through the standard C library
1082 functions @func{malloc}, @func{calloc}, and @func{free}.