pintos-os.org Git - pintos-anon/blob - sigcse2009/assignments.tex

   1 \section{Assignments}
   2 \label{sec:assignments}
   3
   4 \pintostestcounttable{}
   5
   6 %
   7 % Not sure if we need that.
   8 %
   9 % \subsection{Project 0}
  10 % If Pintos is used in a semester-long course, project 0 serves as a ``warm-up'' project.
  11 % In this project, students will gain familiarity with the Pintos source tree and some
  12 % supporting classes, in particular its implementation of doubly-linked lists.
  13 % In OS, doubly-linked lists are frequently used because they allow $O(1)$ insertion and
  14 % removal operations.  Moreover, they are often used in a style in which the list cell
  15 % containing the next and prev pointers is embedded in some larger structure, such as
  16 % a thread control block, rather than having separately allocated list cells.
  17 % In project 0, students use Pintos's list implementation to implement a simple, first-fit
  18 % memory allocator.
  19
  20 \subsection{Project 1 -- Threads}
  21 % intro
  22 Project 1 revolves around threads.  The baseline Pintos code boots into a kernel that
  23 supports multiple in-kernel threads.  It provides code for initialization, thread creation and
  24 destruction, context switches, thread blocking and unblocking as well as a simple but
  25 preemptive round-robin scheduler.
  26 Students study the existing, barebones threading system (about 600 lines of C code) to
  27 understand how threads are created and destroyed, and to understand the transitioning of
  28 threads between the READY, RUNNING, and BLOCKED states.  They also study how a thread's
  29 internal memory is managed, which is used to store its runtime stack and thread control block.
  30 Students can examine the context switch code, but the projects do not involve any modifications
  31 to it.
  32
  33 After reading the baseline code, the project asks students to implement several features
  34 that exercise thread state transitions.  The first part of this project includes a simple
  35 alarm clock, which requires maintaining a timer queue of sleeping threads and changing
  36 the timer interrupt handler to unblock those threads whose wakeup time has arrived.
  37 Students learn how to protect data structures that are shared between a thread and an
  38 interrupt handler.  The second part of the project constitutes a strict priority-based
  39 uniprocessor scheduler; students learn about the different ways in which
  40 such a scheduler must react to thread state changes.
  41 % cut for length
  42 %Project 1 also introduces synchronization primitives such as semaphores, locks,
  43 %and condition variables and explores the interaction between such primitives,
  44 %thread states, and the scheduler.
  45
  46 Based on the priority scheduler, students implement priority inheritance,
  47 which deepens their understanding of the interaction of threads and locks.
  48 We use the example of the near-failure of the Mars PathFinder mission to motivate
  49 the need for priority inheritance.
  50 Separately, students build a multi-level feedback queue scheduler on top of the strict
  51 priority scheduler.  This scheduler adjusts threads' priorities based on a sampling
  52 of how much CPU time a thread has received recently.
  53
  54 \paragraph{Testing and Grading}
  55 Project 1 is accompanied by 27 tests as shown in Table~\ref{table:tests}, which are
  56 run using the Bochs simulator by a grading script.
  57 Most of the tests are designed to produce deterministic output;
  58 the grading script will point out differences between expected and actual output.
  59 Usually, a test failure leads students to study the documented source code of the test
  60 and understand how the expected output derives from it.
  61
  62 The MLFQS scheduler tests require a different approach.  Since those tests rely on estimating CPU
  63 usage, they depend on how much CPU time a specific implementation uses, which in turn depends on how
  64 efficient it is.  We compute the expected CPU consumption values by simulation and provide an
  65 envelope within which the output is accepted.  The envelope is large enough to allow for minor
  66 inefficiencies, but major inefficiencies will usually lead to test failures.  Such failures
  67 convey the importance of using efficient algorithms and data structures within a OS kernel.
  68 % cfl: because wasting CPU cycles in the kernel reduces the amount available to applications.
  69
  70 % intro
  71 \paragraph{Learning Objectives}
  72 Project 1 has three learning objectives.  First, students will understand how
  73 the illusion that ``computers can do multiple things at once'' is created by a sequence
  74 of thread state transitions and context switches.  Second, they will understand how
  75 sophisticated scheduling policies can be built on top of a simple priority-based scheduler.
  76 Third, having seen the mechanisms a preemptive scheduler uses to create apparent
  77 concurrency, students gain a better intuition of the non-determinism inherent
  78 in concurrent systems.
  79
  80 %
  81 %
  82 %
  83 \subsection{Project 2 -- User Programs}
  84 The second project illustrates how OS implements protection and isolation between user processes,
  85 how user processes gain access to kernel services, and how user processes lay out the virtual
  86 address space in which their program and data is contained.
  87 The project adds support to Pintos to load and execute user programs that make use
  88 of kernel services.  We kept the provided code purposefully minimal, consisting of
  89 only a library that reads the program and data segments of ELF binaries.   These binaries
  90 are loaded into a new address space; the baseline code includes functionality to allocate
  91 physical memory and set up a page directory to establish the process's virtual address
  92 mappings.
  93
  94 Students implement support for a small set of system calls that allow processes to perform
  95 I/O, start new processes, wait for their termination, and exit.  Both the Pintos user
  96 process model and system call API are modeled after traditional Unix, with the exception
  97 that Pintos processes do not separate process creation (i.e., fork()) from program loading
  98 (i.e., exec) - instead, Pintos's exec() system call combines these two pieces of functionality
  99 into one.  The implementation of these calls requires the students to keep track of
 100 per-process resources such as file descriptors, deepening their understanding of how
 101 an operating system provides the abstraction of a process.
 102
 103 Like most current OS, Pintos exploits dual-mode operation in which user processes run
 104 in a nonprivileged mode that prevents access to kernel resources and to resources allocated
 105 to other processes.  Processes attempting to bypass these restrictions are terminated.
 106 Students implement the system call handling code, a key aspect of which includes the
 107 careful examination of arguments passed by user processes.
 108
 109 Project 2 also illustrates parallel programming techniques, notably fork/join
 110 parallelism, which students implement when providing support for the exec()/wait()/exit()
 111 system calls.  The rendezvous-style synchronization necessary to support this model
 112 can be implemented using semaphores, providing a practical example of this style
 113 synchronization presented in accompanying lectures.
 114
 115 % How threads extend into processes.
 116
 117 \paragraph{Testing and Grading}
 118 The tests for project 2 exclusively consist of user programs written in C.
 119 They are divided into functionality and robustness tests.  Functionality tests check that
 120 the operating system provides the expected set of services when it is used as
 121 expected.  A set of robustness tests checks that the OS rejects all attempts at passing
 122 invalid input to system calls.  To pass those tests, the student's kernel must be
 123 ``bullet-proof.'' We include a stress test in which we artificially induce low memory
 124 conditions by creating a large number of processes and pseudo-randomly introducing
 125 failures in some of them.  We expect the kernel to fully recover from such situations.
 126
 127 \paragraph{Learning Objectives}
 128 In project 2, students learn how the thread abstraction introduced in project 1 is
 129 extended into the process abstraction, which combines a thread, a virtual address space,
 130 and its associated resources.
 131 Project 2 enables students to understand how operating systems employ dual-mode
 132 operation to implement isolation between processes and to protect system resources
 133 even in the presence of failing or misbehaving processes.
 134 Students understand how processes transition into the kernel to access its services,
 135 and how kernels implement such services in a robust way.
 136 The principles learned in this exercise carry over to all scenarios
 137 in which applications must be robust in the face of input coming from untrusted
 138 sources and uncertain resource availability, as is the case in many server systems.
 139
 140 \subsection{Project 3 -- Virtual Memory}
 141 Project 3 asks students to implement several virtual memory techniques, including
 142 on-demand paging of programs, stack growth, page replacement, and memory-mapped files.
 143 This functionality is implemented in a page fault service handler and during the
 144 program loading process.
 145 We provide supporting code to create and maintain page directories, which hide
 146 the x86-specifics of how to program the memory management unit (MMU) and how
 147 to ensure consistency with the CPU's translation look-aside buffer (TLB).
 148 As a result, students can treat the MMU as an abstract device in which to
 149 install, update, or remove virtual to physical address mappings.
 150 Consequently, they are free to choose any design for the data structures needed to
 151 keep track of the state of each page or region in a process's virtual address
 152 space.
 153
 154 In early offerings, this significant creative freedom came at the cost that
 155 some students were lost as how to accomplish set goals.  We added an intermediate
 156 design review stage to this project using a structured questionnaire in which students
 157 outline their planned design.  We also provide a suggested order of implementation.
 158
 159 Like project 2, project 3 requires the use of parallel programming techniques.
 160 Since the Pintos kernel is fully preemptive, students must consider which data structures
 161 require locking, and they must design a locking strategy that both avoids deadlock
 162 and reduces unnecessary serialization.
 163
 164 \paragraph{Testing and Grading}
 165 Project 3 relies on project 2, therefore, we include all tests provided with project 2
 166 as regression tests to ensure that system call functionality does not break in the
 167 presence of virtual memory.  Furthermore, we provide functionality tests for the
 168 added functionality that lends itself to such testing, namely, memory-mapped files
 169 and stack growth.  Some of the project tasks, such as on-demand paging, are
 170 performance-enhancing techniques that do not directly add functionality that is
 171 apparent to user programs, which are graded by inspection.  We test the students
 172 page replacement code by varying the amount of physical memory available to
 173 the kernel when run under emulation, relative to the amount of memory that is
 174 accessed by our test programs.  Timeouts are used to detect grossly inefficient
 175 page replacement schemes.
 176
 177 \paragraph{Learning Objectives}
 178 In project 3, students learn how an OS creates the environment in which a user
 179 program executes as it relates to the program's code and variables.
 180 It also provides a deep understanding of how OS use fault resumption
 181 to virtualize a process's use of physical memory.
 182 In addition, students gain hands-on experience with page replacement algorithms
 183 and have the opportunity to observe their performance impact.
 184
 185 %
 186 %
 187 %
 188 \subsection{Project 4 -- File Systems}
 189 Project 4 asks the students to design and implement a hierarchical, multi-threaded
 190 filesystem and buffer cache.  In projects 2 and 3, students use a basic filesystem
 191 to access the disk, which supports only fixed-size files, no subdirectories,
 192 and which lacks a buffer cache.
 193 Though we suggest a traditional, Unix-like filesystem design, which stores file
 194 metadata in inodes and in which directories are treated as files, students have
 195 complete freedom in designing the layout of their filesystem's metadata as long
 196 as their design does not suffer from external fragmentation.
 197 Since our host tools will not know how to interpret the student's filesystems,
 198 we use an intermediate ``scratch'' disk or partition that is attached to the
 199 physical or virtual computer on which Pintos runs, and use the student's kernel
 200 to copy files in and out of their filesystems.
 201 Similarly, we encourage students to experiment with different replacement
 202 strategies for their buffer cache (though we require that their algorithm
 203 behaves at least as good as a least-recently-used (LRU) strategy.
 204
 205 As with all projects, this assignment includes additional parallel programming
 206 tasks: in this project, we suggest that students implement
 207 a multiple-reader, single-writer access scheme for individual buffer cache blocks.
 208
 209 \paragraph{Testing and Grading}
 210 Project 4 adds a new set of test cases that test the extended functionality.
 211 Project 4 does not require the virtual memory functionality, so it can be built
 212 on either project 2 or 3, depending on the instructor's judgment.
 213 For each functionality test, we provide a sibling persistence test that verifies
 214 that the changes done to the filesystem survive a shutdown and restart.
 215
 216 \paragraph{Learning Objectives}
 217 This project provides a deeper understanding of how OS's manage secondary storage
 218 while avoiding fragmentation and providing efficiency for commonly occurring
 219 disk access patterns.
 220 Students learn how the use of a buffer cache helps absorb disk requests and
 221 improve performance.
 222 They also gain insight into filesystem semantics in the presence of simultaneously
 223 occurring requests.
 224