From b8a1f89c3dcd584ebc1f91d9e1e8f42606d4df03 Mon Sep 17 00:00:00 2001 From: Godmar Back Date: Sat, 30 Aug 2008 02:45:54 +0000 Subject: [PATCH 1/1] various updates --- sigcse2009/Makefile | 2 +- sigcse2009/abstract.tex | 4 +- sigcse2009/assignments.tex | 126 ++++++++++++++++++------------------ sigcse2009/figures.tex | 3 +- sigcse2009/introduction.tex | 17 +++-- sigcse2009/principles.tex | 12 ++-- sigcse2009/racedt.tex | 9 ++- sigcse2009/rest.tex | 13 ++-- 8 files changed, 95 insertions(+), 91 deletions(-) diff --git a/sigcse2009/Makefile b/sigcse2009/Makefile index b5471d4..cb9d73a 100644 --- a/sigcse2009/Makefile +++ b/sigcse2009/Makefile @@ -5,7 +5,7 @@ BASE=sigcse2009 $(BASE).pdf: $(BASE).tex introduction.tex abstract.tex $(BASE).bib principles.tex assignments.tex figures.tex \ - rest.tex + rest.tex racedt.tex pdflatex $(BASE).tex -bibtex $(BASE) pdflatex $(BASE).tex diff --git a/sigcse2009/abstract.tex b/sigcse2009/abstract.tex index b472f73..beb3b03 100644 --- a/sigcse2009/abstract.tex +++ b/sigcse2009/abstract.tex @@ -1,7 +1,7 @@ Pintos is an instructional operating system, complete with documentation -and ready-made, structured assignments that introduce students to +and ready-made, modular projects that introduce students to the principles of multi-programming, scheduling, virtual memory, -and file systems. By allowing students to run their work product on +and filesystems. By allowing students to run their work product on actual hardware, while simultaneously benefiting from debugging and dynamic analysis tools provided in simulated and emulated environments, Pintos increases student engagement. Unlike tailored versions of diff --git a/sigcse2009/assignments.tex b/sigcse2009/assignments.tex index b68289c..36e366c 100644 --- a/sigcse2009/assignments.tex +++ b/sigcse2009/assignments.tex @@ -1,6 +1,7 @@ -\section{Assignments} +\section{Pintos Projects} \label{sec:assignments} +The Pintos instructional operating system is split into four projects. \pintostestcounttable{} % @@ -53,7 +54,7 @@ of how much CPU time a thread has received recently. \paragraph{Testing and Grading} Project 1 is accompanied by 27 tests as shown in Table~\ref{table:tests}, which are -run using the Bochs simulator by a grading script. +run by a grading script using the Bochs simulator. Most of the tests are designed to produce deterministic output; the grading script will point out differences between expected and actual output. Usually, a test failure leads students to study the documented source code of the test @@ -64,7 +65,7 @@ usage, they depend on how much CPU time a specific implementation uses, which in efficient it is. We compute the expected CPU consumption values by simulation and provide an envelope within which the output is accepted. The envelope is large enough to allow for minor inefficiencies, but major inefficiencies will usually lead to test failures. Such failures -convey the importance of using efficient algorithms and data structures within a OS kernel. +convey the importance of using efficient algorithms and data structures within an OS kernel. % cfl: because wasting CPU cycles in the kernel reduces the amount available to applications. % intro @@ -81,12 +82,12 @@ in concurrent systems. % % \subsection{Project 2 -- User Programs} -The second project illustrates how OS implements protection and isolation between user processes, -how user processes gain access to kernel services, and how user processes lay out the virtual +The second project illustrates how an OS implements protection and isolation between user processes, +how user processes access kernel services, and how user processes lay out the virtual address space in which their program and data is contained. -The project adds support to Pintos to load and execute user programs that make use -of kernel services. We kept the provided code purposefully minimal, consisting of -only a library that reads the program and data segments of ELF binaries. These binaries +Students first add support to Pintos to load and execute user programs. +We kept the provided code purposefully minimal to +only a library that reads the text and data segments of ELF binaries. These binaries are loaded into a new address space; the baseline code includes functionality to allocate physical memory and set up a page directory to establish the process's virtual address mappings. @@ -94,38 +95,36 @@ mappings. Students implement support for a small set of system calls that allow processes to perform I/O, start new processes, wait for their termination, and exit. Both the Pintos user process model and system call API are modeled after traditional Unix, with the exception -that Pintos processes do not separate process creation (i.e., fork()) from program loading +that Pintos does not separate process creation (i.e., fork()) from program loading (i.e., exec) - instead, Pintos's exec() system call combines these two pieces of functionality into one. The implementation of these calls requires the students to keep track of -per-process resources such as file descriptors, deepening their understanding of how +per-process resources such as file descriptors, which deepens their understanding of how an operating system provides the abstraction of a process. -Like most current OS, Pintos exploits dual-mode operation in which user processes run -in a nonprivileged mode that prevents access to kernel resources and to resources allocated -to other processes. Processes attempting to bypass these restrictions are terminated. +Like most OS, Pintos exploits dual-mode operation in which user processes run +in a nonprivileged mode. +% cfl: that prevents access to kernel resources and to resources allocated to other processes. +Processes attempting to bypass these restrictions are terminated. Students implement the system call handling code, a key aspect of which includes the careful examination of arguments passed by user processes. -Project 2 also illustrates parallel programming techniques, notably fork/join -parallelism, which students implement when providing support for the exec()/wait()/exit() -system calls. The rendezvous-style synchronization necessary to support this model -can be implemented using semaphores, providing a practical example of this style -synchronization presented in accompanying lectures. - -% How threads extend into processes. +Project 2 also illustrates concurrent programming techniques, notably fork/join +parallelism, which students implement using rendezvous-style synchronization +% based on semaphores +when providing support for the exec()/wait()/exit() system calls. \paragraph{Testing and Grading} -The tests for project 2 exclusively consist of user programs written in C. +All tests for Project 2 are user programs written in C. They are divided into functionality and robustness tests. Functionality tests check that -the operating system provides the expected set of services when it is used as -expected. A set of robustness tests checks that the OS rejects all attempts at passing +the operating system provides the specified set of services when it is used as +expected. Robustness tests check that the OS rejects all attempts at passing invalid input to system calls. To pass those tests, the student's kernel must be ``bullet-proof.'' We include a stress test in which we artificially induce low memory conditions by creating a large number of processes and pseudo-randomly introducing failures in some of them. We expect the kernel to fully recover from such situations. \paragraph{Learning Objectives} -In project 2, students learn how the thread abstraction introduced in project 1 is +Students learn how the thread abstraction introduced in Project 1 is extended into the process abstraction, which combines a thread, a virtual address space, and its associated resources. Project 2 enables students to understand how operating systems employ dual-mode @@ -140,85 +139,88 @@ sources and uncertain resource availability, as is the case in many server syste \subsection{Project 3 -- Virtual Memory} Project 3 asks students to implement several virtual memory techniques, including on-demand paging of programs, stack growth, page replacement, and memory-mapped files. -This functionality is implemented in a page fault service handler and during the -program loading process. +This functionality is primarily implemented in a page fault handler routine. +% and during the program loading process. We provide supporting code to create and maintain page directories, which hide the x86-specifics of how to program the memory management unit (MMU) and how to ensure consistency with the CPU's translation look-aside buffer (TLB). As a result, students can treat the MMU as an abstract device in which to -install, update, or remove virtual to physical address mappings. +install, update, or remove virtual-to-physical address mappings. Consequently, they are free to choose any design for the data structures needed to keep track of the state of each page or region in a process's virtual address space. In early offerings, this significant creative freedom came at the cost that -some students were lost as how to accomplish set goals. We added an intermediate +some students were lost as to how to accomplish set goals. We added an intermediate design review stage to this project using a structured questionnaire in which students -outline their planned design. We also provide a suggested order of implementation. +outline their planned design. We also provided a suggested order of implementation. -Like project 2, project 3 requires the use of parallel programming techniques. +Like Project 2, Project 3 requires the use of concurrent programming techniques. Since the Pintos kernel is fully preemptive, students must consider which data structures require locking, and they must design a locking strategy that both avoids deadlock -and reduces unnecessary serialization. +and unnecessary serialization. \paragraph{Testing and Grading} -Project 3 relies on project 2, therefore, we include all tests provided with project 2 +Project 3 relies on Project 2, therefore, we include all tests provided with Project 2 as regression tests to ensure that system call functionality does not break in the -presence of virtual memory. Furthermore, we provide functionality tests for the +presence of virtual memory. Furthermore, we provide tests for the added functionality that lends itself to such testing, namely, memory-mapped files and stack growth. Some of the project tasks, such as on-demand paging, are performance-enhancing techniques that do not directly add functionality that is -apparent to user programs, which are graded by inspection. We test the students +apparent to user programs; these tasks are graded by inspection. We test the students page replacement code by varying the amount of physical memory available to the kernel when run under emulation, relative to the amount of memory that is -accessed by our test programs. Timeouts are used to detect grossly inefficient -page replacement schemes. +accessed by our test programs. +Grossly inefficient page replacement schemes result test failures due to time-outs. \paragraph{Learning Objectives} -In project 3, students learn how an OS creates the environment in which a user +Students learn how an OS creates the environment in which a user program executes as it relates to the program's code and variables. -It also provides a deep understanding of how OS use fault resumption +It provides a thorough understanding of how OS use resumption after page faults to virtualize a process's use of physical memory. -In addition, students gain hands-on experience with page replacement algorithms -and have the opportunity to observe their performance impact. +Students gain hands-on experience with page replacement algorithms +and have the opportunity to directly observe their performance impact. % % % -\subsection{Project 4 -- File Systems} -Project 4 asks the students to design and implement a hierarchical, multi-threaded -filesystem and buffer cache. In projects 2 and 3, students use a basic filesystem -to access the disk, which supports only fixed-size files, no subdirectories, +\subsection{Project 4 -- Filesystems} +Project 4 asks students to design and implement a hierarchical, multi-threaded +filesystem and buffer cache. In Projects 2 and 3, students use a basic filesystem +we provide to access the disk, which supports only fixed-size files, no subdirectories, and which lacks a buffer cache. -Though we suggest a traditional, Unix-like filesystem design, which stores file -metadata in inodes and in which directories are treated as files, students have -complete freedom in designing the layout of their filesystem's metadata as long -as their design does not suffer from external fragmentation. +Although we suggest a traditional, Unix-like filesystem design, which stores file +metadata in inodes and treats directories as files, students have +complete freedom in designing the layout of their filesystem's metadata. +% as long as their design does not suffer from external fragmentation. Since our host tools will not know how to interpret the student's filesystems, -we use an intermediate ``scratch'' disk or partition that is attached to the +we attach an intermediate ``scratch'' disk or partition to the physical or virtual computer on which Pintos runs, and use the student's kernel -to copy files in and out of their filesystems. +to copy files into and out of their filesystems. Similarly, we encourage students to experiment with different replacement -strategies for their buffer cache (though we require that their algorithm -behaves at least as good as a least-recently-used (LRU) strategy. +strategies for their buffer cache, although we require that their algorithm +behaves at least as well as a least-recently-used (LRU) strategy. -As with all projects, this assignment includes additional parallel programming +As with all projects, this assignment includes additional concurrent programming tasks: in this project, we suggest that students implement -a multiple-reader, single-writer access scheme for individual buffer cache blocks. +a multiple-reader, single-writer access scheme for individual buffer cache blocks +to allow shared access when reading file data and metadata. \paragraph{Testing and Grading} -Project 4 adds a new set of test cases that test the extended functionality. -Project 4 does not require the virtual memory functionality, so it can be built -on either project 2 or 3, depending on the instructor's judgment. +Project 4 adds a new set of test cases for the extended functionality, including +stress tests that check correct behavior for deeply nested directories and +for nearly full disks. For each functionality test, we provide a sibling persistence test that verifies that the changes done to the filesystem survive a shutdown and restart. +Since the project does not require the virtual memory functionality, it can be built +on either Project 2 or 3, depending on the instructor's judgment. \paragraph{Learning Objectives} -This project provides a deeper understanding of how OS's manage secondary storage +This project provides a deep understanding of how OS's manage secondary storage while avoiding fragmentation and providing efficiency for commonly occurring -disk access patterns. +disk access patterns. Students learn how the use of a buffer cache helps absorb disk requests and -improve performance. -They also gain insight into filesystem semantics in the presence of simultaneously -occurring requests. +improves performance. +They also gain insight into filesystem semantics in the presence of +simultaneously occurring requests. diff --git a/sigcse2009/figures.tex b/sigcse2009/figures.tex index 68194ce..0ba5169 100644 --- a/sigcse2009/figures.tex +++ b/sigcse2009/figures.tex @@ -10,6 +10,7 @@ \end{figure} } +% old PDF that has legend at the bottom, we use pintosdetailfiguretwo instead now. \newcommand{\pintosdetailfigure}{ \begin{figure*}[htp] \centering @@ -44,7 +45,7 @@ \newcommand{\pintostestcounttable}{ \begin{table} - \begin{tabular}{lccc} + \begin{tabular}{cccc} Project & Functionality & Robustness & Regression \\ 1 & 27 & - & - \\ 2 & 41 & 35 & - \\ diff --git a/sigcse2009/introduction.tex b/sigcse2009/introduction.tex index e591bb8..a36b89b 100644 --- a/sigcse2009/introduction.tex +++ b/sigcse2009/introduction.tex @@ -3,7 +3,7 @@ Despite the wide use of higher-level languages and environments, gaining a robust understanding of operating systems (OS) fundamentals and training in the current design and -implementation practices of OS remains a cornerstone goal of +implementation practices of OS remains a cornerstone of undergraduate computer science education. % abstract/concrete @@ -21,23 +21,22 @@ from the point of view of the OS designer, whereas the external perspective assumes the role of a user or programmer using an OS's facilities~\cite{Bryant2002Computer}. +% we don't really say about the benefits of the 'internal' approach here. % is this too controversial for this audience? -The approach advocated in this paper adopts a concrete approach and the internal -perspective. Students who have been in the role of an -OS designer bring a better understanding of how to use one; and students -who have both studied, implemented, and evaluated core OS techniques obtain +The approach advocated in this paper is concrete and adopts the internal perspective. +Students who have studied, implemented, and evaluated core OS techniques attain a deeper understanding than those who have merely studied them. Finally, adopting a concrete approach brings significant secondary benefits, including training in modern software development techniques and tools. The C language remains the implementation language of choice for operating system kernels and for many embedded systems. -Practice and debugging skills in C, particularly using modern tools, +Practice and debugging experience in C, particularly using modern tools, not only increases students' ``market value,''~\cite{1292450} but provides students with the insight that a low-level programming and runtime model is not incompatible with high-level tools. Designing course material for the internal and concrete -approach is challenging for several reasons. While realistic, +approach is challenging. While realistic, assignments should be relatively simple and doable within a realistic time frame. Whereas assignments should use current hardware architectures, they must not impart too much transient knowledge. @@ -46,7 +45,7 @@ engineering practices and tools, such as dynamic program analysis. This paper introduces Pintos, an instructional operating system kernel that has been in use at multiple institutions for about 4 years. Pintos provides -a bootable kernel for standard personal computers. We provide four +a bootable kernel for standard x86-based personal computers. We provide four structured assignments in which students implement a basic priority scheduler, a multi-level feedback queue scheduler, a process-based multi-programming system, page-based virtual memory @@ -59,7 +58,7 @@ the parts that are created by students, and their relationship. Although Pintos follows in the tradition of instructional operating systems such as Nachos~\cite{Christopher1993Nachos}, OS/161~\cite{Holland2002New}, and GeekOS~\cite{Hovemeyer2004Running}, -PortOS~\cite{Atkin2002PortOS}), +PortOS~\cite{Atkin2002PortOS}, BLITZ~\cite{PorterOverview}, JOS~\cite{1088822}, or Yalnix~\cite{1088822}, we believe that it is unique in two diff --git a/sigcse2009/principles.tex b/sigcse2009/principles.tex index 221fb43..9b015be 100644 --- a/sigcse2009/principles.tex +++ b/sigcse2009/principles.tex @@ -18,8 +18,7 @@ by example, the coding style we expect from students. %This style includes purely syntactical convention such as the choice of the %GNU indentation style, and extends to commenting style and naming conventions. We continuously refined the internal code documentation over several -semesters, focusing on those -portions that initially proved difficult to understand or confusing. +semesters, focusing on those portions that initially proved difficult to understand. \paragraph{Maximize Creative Freedom} OS design involves a tremendous amount of creative freedom, both in the @@ -30,8 +29,8 @@ own data structures and associated algorithms as much as possible. \paragraph{Practice Test-driven Development} %Test-driven development~\cite{Edwards} -Each project includes a large number of test cases that is accessible -to students, as shown in Table~\ref{table:tests}. +Each project includes a large number of test cases that are accessible +to students, as shown in Table~\ref{table:tests}. They must implement the API that is exercised by these test cases. Students are encouraged to add their own test cases. @@ -56,8 +55,9 @@ As a result, Pintos kernels can be debugged in a manner that is substantially similar to how user programs are being debugged. \paragraph{Include Analysis Tools} -Static and dynamic analysis tools are now being widely used; an OS course should -be no exception. In Section~\ref{sec:dynamicanalysis}, we describe how we +Dynamic analysis tools are now being widely used in software development; +an OS course should be no exception. +In Section~\ref{sec:dynamicanalysis}, we describe how we extended the QEMU emulator~\cite{Bellard2005QEMU} to perform tailored analyses that find errors such as race conditions. diff --git a/sigcse2009/racedt.tex b/sigcse2009/racedt.tex index d5e2871..37cc78c 100644 --- a/sigcse2009/racedt.tex +++ b/sigcse2009/racedt.tex @@ -5,7 +5,7 @@ Data races and invalid memory accesses are some of the most common and difficult to debug errors that may occur in concurrent C code. We developed dynamic analysis tools that run on top of the QEMU system emulator~\cite{Bellard2005QEMU} to help detect these mistakes. -Since these tools do not require additional support from the Pintos kernel; +Since these tools do not require additional support from the Pintos kernel, students can use them without complicating their code. Data races are found by using a semaphore-aware modification of the RaceTrack algorithm~\cite{Yu2005RaceTrack}. @@ -18,7 +18,10 @@ call stacks for the racing threads is generated. Invalid memory accesses, such as a read from newly allocated but uninitialized data, are detected by tracking all memory accesses. Heap allocation calls are instrumented to map a range of addresses as uninitialized. When data is written to a memory address, it is marked as initialized. If an address -marked as uninitialized is read from, the error is reported and the address is marked as -uninitialized to mask spurious reports. +marked as uninitialized is read from, an error is reported. +%cfl: and the address is marked as uninitialized to mask spurious reports. % More sophisticated analysis may be implemented in the future. +Each of our tools presents students with one or more concrete backtraces that +show where the error occurred, which not only helps students debug their code, +but makes the concept of race conditions more concrete. diff --git a/sigcse2009/rest.tex b/sigcse2009/rest.tex index 7f22a85..fde4f50 100644 --- a/sigcse2009/rest.tex +++ b/sigcse2009/rest.tex @@ -1,13 +1,12 @@ \section{Future Work} In the future, we will expand Pintos's analysis capabilities to -provide quantitative information and include realistic -device models. +provide quantitative information and include realistic device models. We are also considering extending Pintos to multiple -CPUs, and developing assignments that involve +CPUs and assignments that involve networking and interprocess communication (IPC). -Although feedback from our industrial affiliates, who compare -students having used Pintos to students having taken courses -that use less concrete or external approaches, is highly favorable, -we need to perform a formal evaluation comparing learning +Although we have received highly favorable feedback from our +industrial affiliates, who compare students having used Pintos +to students having taken courses that use less concrete or external approaches, +we need to perform a formal evaluation to compare learning outcomes using Pintos to other alternatives. -- 2.30.2