#
BASE=sigcse2009
-$(BASE).pdf: $(BASE).tex introduction.tex abstract.tex $(BASE).bib principles.tex assignments.tex
+$(BASE).pdf: $(BASE).tex introduction.tex abstract.tex $(BASE).bib principles.tex assignments.tex figures.tex
pdflatex $(BASE).tex
-bibtex $(BASE)
pdflatex $(BASE).tex
%
% Not sure if we need that.
%
-\subsection{Project 0}
-If Pintos is used in a semester-long course, project 0 serves as a ``warm-up'' project.
-In this project, students will gain familiarity with the Pintos source tree and some
-supporting classes, in particular its implementation of doubly-linked lists.
-In OS, doubly-linked lists are frequently used because they allow $O(1)$ insertion and
-removal operations. Moreover, they are often used in a style in which the list cell
-containing the next and prev pointers is embedded in some larger structure, such as
-a thread control block, rather than having separately allocated list cells.
-In project 0, students use Pintos's list implementation to implement a simple, first-fit
-memory allocator.
+% \subsection{Project 0}
+% If Pintos is used in a semester-long course, project 0 serves as a ``warm-up'' project.
+% In this project, students will gain familiarity with the Pintos source tree and some
+% supporting classes, in particular its implementation of doubly-linked lists.
+% In OS, doubly-linked lists are frequently used because they allow $O(1)$ insertion and
+% removal operations. Moreover, they are often used in a style in which the list cell
+% containing the next and prev pointers is embedded in some larger structure, such as
+% a thread control block, rather than having separately allocated list cells.
+% In project 0, students use Pintos's list implementation to implement a simple, first-fit
+% memory allocator.
\subsection{Project 1 -- Threads}
% intro
Students can examine the context switch code, but the projects do not involve any modifications
to it.
-After reading the baseline code, the projects ask students to implement several features
+After reading the baseline code, the project asks students to implement several features
that exercise thread state transitions. The first part of this project includes a simple
alarm clock, which requires maintaining a timer queue of sleeping threads and changing
the timer interrupt handler to unblock those threads whose wakeup time has arrived.
Based on the priority scheduler, students implement two additional tasks: priority
inheritance and a multi-level feedback queue scheduler. Priority inheritance is a way
to avoid priority inversion, a phenomenon that most famously led to a near-failure
-of the Mars Pathfinder Mission~\cite{MarsPathFinder}. We use this example to motivate
+of the Mars Pathfinder Mission. We use this example to motivate
the problem. Implementing priority inheritance correctly requires a deep understanding of the
interaction of threads and locks.
Separately, students build a multi-level feedback queue scheduler on top of the strict
priority scheduler. This scheduler adjusts threads' priority based on a sampling of how
much CPU time a thread has received recently.
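The alarm clock described above can be sketched as a queue of sleepers ordered by wakeup time, which a per-tick handler drains from the front. This is only an illustrative sketch: `struct sleeper`, `sleep_until`, and `timer_tick` are hypothetical names, not the Pintos API, and a real kernel would have to protect the queue against the timer interrupt (for example, by disabling interrupts while inserting).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a thread control block; not the Pintos struct. */
struct sleeper {
    int64_t wakeup_tick;    /* Absolute tick at which to wake up. */
    bool blocked;           /* True while the thread is asleep. */
    struct sleeper *next;   /* Next sleeper, ordered by wakeup_tick. */
};

static struct sleeper *sleep_queue = NULL;  /* Earliest wakeup first. */

/* Puts S to sleep until WAKEUP_TICK, keeping the queue sorted.
   Assumption: the caller prevents concurrent access to the queue. */
void sleep_until(struct sleeper *s, int64_t wakeup_tick)
{
    struct sleeper **link = &sleep_queue;

    assert(s != NULL);
    s->wakeup_tick = wakeup_tick;
    s->blocked = true;

    /* Skip past every sleeper that wakes no later than S. */
    while (*link != NULL && (*link)->wakeup_tick <= wakeup_tick)
        link = &(*link)->next;
    s->next = *link;
    *link = s;
}

/* Called once per timer tick with the current tick count NOW.
   Unblocks every sleeper whose wakeup time has arrived and
   returns how many were unblocked. */
int timer_tick(int64_t now)
{
    int woken = 0;

    while (sleep_queue != NULL && sleep_queue->wakeup_tick <= now) {
        struct sleeper *s = sleep_queue;
        sleep_queue = s->next;
        s->next = NULL;
        s->blocked = false;
        woken++;
    }
    return woken;
}
```

Keeping the queue sorted means the handler only inspects sleepers that are actually due, which keeps the work done in interrupt context small.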
-\paragraph{Testing and Grading.}
+\paragraph{Testing and Grading}
Project 1 is accompanied by about $XX$ tests, which are run using the Bochs simulator by
a grading script. Most tests are designed to test a single aspect, but some tests
test more involved scenarios. Most of the tests are designed to produce a deterministic
because wasting CPU cycles in the kernel reduces the amount available to applications.
% intro
-\paragraph{Learning Objectives.}
+\paragraph{Learning Objectives}
Project 1 has three learning objectives. First, students will understand how
the illusion that ``computers can do multiple things at once'' is created by a sequence
of thread state transitions and context switches. Second, they will understand how
% How threads extend into processes.
-\paragraph{Testing and Grading.}
+\paragraph{Testing and Grading}
The tests for project 2 exclusively consist of user programs written in C.
They are divided into functionality and robustness tests. Functionality tests check that
the operating system provides the expected set of services when it is used as
conditions by creating a large number of processes and pseudo-randomly introducing
failures in some of them. We expect the kernel to fully recover from such situations.
-\paragraph{Learning Objectives.}
+\paragraph{Learning Objectives}
In project 2, students learn how the thread abstraction introduced in project 1 is
extended into the process abstraction, which combines a thread, a virtual address space,
and its associated resources.
In early offerings, this significant creative freedom came at the cost that
some students were lost as to how to accomplish the set goals. We added an intermediate
-design review stage to this projects using a structured questionnaire in which students
+design review stage to this project using a structured questionnaire in which students
outline their planned design. We also provide a suggested order of implementation.
-Like project 2, project 3 requires reasoning using parallel programming strategies.
+Like project 2, project 3 requires the use of parallel programming techniques.
Since the Pintos kernel is fully preemptive, students must consider which data structures
-require locking, and the must design a locking strategy that both avoids deadlock
+require locking, and they must design a locking strategy that both avoids deadlock
and reduces unnecessary serialization.
-\paragraph{Testing and Grading.}
+\paragraph{Testing and Grading}
Project 3 relies on project 2; therefore, we include all tests provided with project 2
as regression tests to ensure that system call functionality does not break in the
presence of virtual memory. Furthermore, we provide functionality tests for the
accessed by our test programs. Timeouts are used to detect grossly inefficient
page replacement schemes.
-\paragraph{Learning Objectives.}
+\paragraph{Learning Objectives}
In project 3, students learn how an OS creates the environment in which a user
-program executes, specifically as it relates to code and variables used in a program.
-It provides a deep understanding of how OS use fault resumption to
-to virtualize a process's interaction with physical memory.
+program executes as it relates to the program's code and variables.
+It also provides a deep understanding of how OS use fault resumption
+to virtualize a process's use of physical memory.
In addition, students gain hands-on experience with page replacement algorithms
and have the opportunity to observe their performance impact.
%
%
\subsection{Project 4}
-\paragraph{Testing and Grading.}
-\paragraph{Learning Objectives.}
+Project 4 asks students to design and implement a hierarchical, multi-threaded
+filesystem and buffer cache. In projects 2 and 3, students access the disk
+through a basic filesystem that supports only fixed-size files, has no
+subdirectories, and lacks a buffer cache.
+Though we suggest a traditional, Unix-like filesystem design, which stores file
+metadata in inodes and in which directories are treated as files, students have
+complete freedom in designing the layout of their filesystem's metadata as long
+as their design does not suffer from external fragmentation.
+Since our host tools will not know how to interpret the students' filesystems,
+we use an intermediate ``scratch'' disk or partition that is attached to the
+physical or virtual computer on which Pintos runs, and use the students' kernel
+to copy files in and out of their filesystems.
+Similarly, we encourage students to experiment with different replacement
+strategies for their buffer cache, though we require that their algorithm
+behave at least as well as a least-recently-used (LRU) strategy.
+
+As with all projects, this assignment includes additional parallel programming
+tasks: in this project, we require that students implement a multiple-reader,
+single-writer access scheme for individual buffer cache blocks.
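A multiple-reader, single-writer scheme for one cache block can be sketched as follows. This is a minimal illustration that assumes POSIX threads in place of the Pintos synchronization primitives; `struct block_lock` and the `block_*` functions are hypothetical names. The sketch favors readers, so a steady stream of readers could starve a writer; a student design may need to address that.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-block lock state; names are illustrative only. */
struct block_lock {
    pthread_mutex_t mutex;    /* Protects the fields below. */
    pthread_cond_t changed;   /* Broadcast whenever the block may be free. */
    int readers;              /* Number of active readers. */
    bool writer;              /* True while a writer holds the block. */
};

void block_lock_init(struct block_lock *bl)
{
    pthread_mutex_init(&bl->mutex, NULL);
    pthread_cond_init(&bl->changed, NULL);
    bl->readers = 0;
    bl->writer = false;
}

/* Readers may share the block, but must wait out an active writer. */
void block_read_acquire(struct block_lock *bl)
{
    pthread_mutex_lock(&bl->mutex);
    while (bl->writer)
        pthread_cond_wait(&bl->changed, &bl->mutex);
    bl->readers++;
    pthread_mutex_unlock(&bl->mutex);
}

void block_read_release(struct block_lock *bl)
{
    pthread_mutex_lock(&bl->mutex);
    assert(bl->readers > 0);
    if (--bl->readers == 0)
        pthread_cond_broadcast(&bl->changed);  /* A writer may proceed. */
    pthread_mutex_unlock(&bl->mutex);
}

/* A writer needs the block to itself: no readers and no other writer. */
void block_write_acquire(struct block_lock *bl)
{
    pthread_mutex_lock(&bl->mutex);
    while (bl->writer || bl->readers > 0)
        pthread_cond_wait(&bl->changed, &bl->mutex);
    bl->writer = true;
    pthread_mutex_unlock(&bl->mutex);
}

void block_write_release(struct block_lock *bl)
{
    pthread_mutex_lock(&bl->mutex);
    assert(bl->writer);
    bl->writer = false;
    pthread_cond_broadcast(&bl->changed);  /* Wake readers and writers. */
    pthread_mutex_unlock(&bl->mutex);
}
```

Keeping one such lock per cache block, rather than one for the whole cache, lets requests for different blocks proceed in parallel.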
+
+\paragraph{Testing and Grading}
+Project 4 adds a new set of test cases that test the extended functionality.
+Project 4 does not require the virtual memory functionality, so it can be built
+on either project 2 or project 3, depending on the instructor's judgment.
+For each functionality test, we provide a sibling persistence test that verifies
+that the changes made to the filesystem survive a shutdown and restart.
+
+\paragraph{Learning Objectives}
+This project provides a deeper understanding of how operating systems manage secondary storage
+while avoiding fragmentation and providing efficiency for commonly occurring
+disk access patterns.
+Students learn how the use of a buffer cache helps absorb disk requests and
+improve performance.
+They also gain insight into filesystem semantics in the presence of simultaneously
+occurring requests.
--- /dev/null
+\newcommand{\pintosenvfigure}{
+ \begin{figure}[htp]
+ \centering
+ \includegraphics[trim=.5in 3.2in .7in .3in, clip,width=\columnwidth]{pintosoptions.pdf}
+ \caption{The same Pintos instructional kernel runs in a
+ fully reproducible simulated environment, in an enhanced
+ emulated environment with dynamic analysis capability, and
+ on actual hardware.}
+ \label{fig:pintosenvs}
+ \end{figure}
+}
+
+\newcommand{\pintosdetailfigure}{
+ \begin{figure*}[htp]
+ \centering
+ \includegraphics[width=.7\textwidth]{pintosoverview.pdf}
+    \caption{Components of Pintos split into provided support code, test cases,
+ and components created in assignments. Overlapping components indicate
+ when students have to replace parts of the support code.}
+ \label{fig:pintosdetail}
+ \end{figure*}
+}
and tools. The C language remains the implementation language of choice
for operating system kernels and for many embedded systems.
Practice and debugging skills in C, particularly using modern tools,
-not only increases students' ``market value,'' but provides students with
+not only increase students' ``market value''~\cite{1292450} but also provide students with
the insight that a low-level programming and runtime model is not incompatible
with high-level tools.
Designing course material for the internal and concrete
-approach is challenging for several reasons. While realistic, assignments should be
-relatively simple and doable within a realistic time frame.
+approach is challenging for several reasons. While realistic,
+assignments should be relatively simple and doable within a realistic time frame.
Whereas assignments should use current hardware architectures,
they must not impart too much knowledge that is transient.
Assignments should include and emphasize the use of modern software
scheduler, a multi-level feedback queue scheduler, the ability to
load programs and support a set of system calls, page-based virtual memory
including on-demand paging, memory-mapped files, and swapping, and a
-simple hierarchical file system.
+simple hierarchical file system. Figure~\ref{fig:pintosdetail} gives an
+overview of the projects enabled by Pintos: it shows which software is
+provided, which is created by students, and how the test cases relate
+to Pintos modules.
Although Pintos follows in the tradition of instructional operating systems
-such as Nachos~\cite{Christopher1993Nachos},
+such as Nachos~\cite{Christopher1993Nachos}, OS/161~\cite{Holland2002New},
GeekOS~\cite{Hovemeyer2004Running},
-and OS/161~\cite{Holland2002New}, we believe that it is unique in two
+PortOS~\cite{Atkin2002PortOS},
+JOS~\cite{1088822}, and Yalnix~\cite{1088822},
+we believe that it is unique in two
aspects. First, Pintos runs on both real hardware and in emulated and
-simulated environments. Second, we have created a set of analysis tools
+simulated environments.\footnote{GeekOS also claims to run on real hardware; however,
+it requires a dedicated disk and does not support running off USB devices, making
+it impractical for many lab settings.}
+Second, we have created a set of analysis tools
for the emulated environment that allows students to detect programming
-mistakes such as race conditions.
+mistakes such as race conditions. Figure~\ref{fig:pintosenvs} shows
+the three environments in which the same kernel can be run.
This paper reports on the design philosophy that underlies Pintos,
-details its structure, and outline the nature and learning goals of each
-assignment.
-
+details its structure, and outlines the nature and learning goals of each
+assignment.
+
+\pintosenvfigure{}
+
+\pintosdetailfigure{}
+
+To be discussed:
+User-Mode Linux~\cite{1008027}
+iPodLinux~\cite{1352199}
+Linux in VM~\cite{Nieh2005Experiences}
+
% Challenges.
% How to embed principles?
% How to teach software engineering?
Each project involves a significant amount of reading code before
students write the first line of their code.
Because software maintenance constitutes the vast majority of all
-software development efforts~\cite{askEliforcite}, this setup mirrors the
+software development efforts~\cite{Boehm1981Software}, this setup mirrors the
environment in which most software engineers work.
We went to great lengths to write the entire Pintos baseline code,
and in particular the portions students will read, in a style that shows,
\paragraph{Practice Test-driven Development}
%Test-driven development~\cite{Edwards}
Each project includes a large number of test cases that are accessible
-to students. In keeping with us adopting an internal perspective, students
-do not develop test cases, rather, they must implement the API that is exercised
-by these test cases.
+to students.
+They must implement the API that is exercised by these test cases.
+Students are encouraged to add their own test cases.
\paragraph{Work in a Team}
The projects presented in this paper are designed to be accomplished by teams of
that fulfill a set of given requirements. We designed a set of structured questionnaires
in which students describe their design and discuss choices and trade-offs they made.
-\paragraph{Provide a reproducible, manageable environment.}
-Some concurrent environments are difficult to manage and debug.
-
-Teaching OS involves teaching concurrency
-
-Operating systems are fundamentally
+\paragraph{Provide a Reproducible, Manageable Environment}
+Operating systems are inherently concurrent environments, which can be difficult
+to debug. For educational use, we must provide an environment that is
+manageable and reproducible. The option of running Pintos in a simulated
+environment eliminates the non-determinism that concurrency introduces.
+As a result, Pintos kernels can be debugged in a manner that
+is substantially similar to how user programs are debugged.
\paragraph{Provide Analysis Tools}
+Static and dynamic analysis tools are now widely used; an OS course should
+be no exception. We have extended the QEMU emulator to perform tailored
+analyses that can point out errors such as race conditions.
+
--- /dev/null
+\section{Dynamic Analysis Tools}
+
+Data races and invalid memory accesses are among the most common and
+difficult-to-debug errors that occur in concurrent C code.
+We developed dynamic analysis tools that run on top of the QEMU
+system emulator~\cite{Bellard2005QEMU} to help detect these mistakes.
+Since these tools do not require additional support from the Pintos kernel,
+students can use them without complicating their code.
+
+Data races are found by using a semaphore-aware modification of the RaceTrack algorithm~\cite{RaceTrack}.
+Calls to Pintos's synchronization primitives are instrumented at runtime to track every thread's data
+sharing pattern. Meanwhile, every memory access records synchronization information to shadow memory
+maintained by the analysis tool. When the synchronization information for a memory address
+indicates that a data race occurred, a report including heap information for the data location and the
+call stacks for the racing threads is generated.
+
+Invalid memory accesses, such as a read from newly allocated but uninitialized data, are detected by
+tracking all memory accesses. Heap allocation calls are instrumented to map a range of addresses as
+uninitialized. When data is written to a memory address, it is marked as initialized. If an address
+marked as uninitialized is read from, the error is reported and the address is marked as
+initialized to mask repeated reports for the same address.
+% More sophisticated analysis may be implemented in the future.
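The uninitialized-read check described above can be sketched with one shadow flag per tracked byte. This is a simplified illustration, not the QEMU-based tool itself: the fixed-size arena and the `on_alloc`, `on_write`, and `on_read` hooks are hypothetical, and the sketch assumes that an address is treated as initialized after its first report so that repeated reports are suppressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARENA_SIZE 64   /* Hypothetical size of the tracked address range. */

/* Shadow memory: true if the corresponding byte was allocated
   but has not been written yet. */
static bool uninitialized[ARENA_SIZE];

/* Hook for a heap allocation covering [addr, addr + len). */
void on_alloc(size_t addr, size_t len)
{
    assert(addr <= ARENA_SIZE && len <= ARENA_SIZE - addr);
    for (size_t i = 0; i < len; i++)
        uninitialized[addr + i] = true;
}

/* Hook for a write: the byte now holds defined data. */
void on_write(size_t addr)
{
    assert(addr < ARENA_SIZE);
    uninitialized[addr] = false;
}

/* Hook for a read: returns true if an error should be reported.
   The flag is cleared afterwards so each address is reported once. */
bool on_read(size_t addr)
{
    assert(addr < ARENA_SIZE);
    if (!uninitialized[addr])
        return false;
    uninitialized[addr] = false;
    return true;
}
```

For example, after `on_alloc(8, 4)`, the first `on_read(8)` returns true and a second `on_read(8)` returns false, while a read that follows `on_write(9)` is never reported.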
+
-\section{Rest of paper}
-
-philosophy
-
\section{Future Work}
-Pintos doesn't do SMP or multicore.
-Pintos doesn't do IPC.
-Pintos doesn't do networking.
+In the future, we will expand Pintos's analysis capabilities to
+provide quantitative information and include realistic
+device models.
+We are also considering the extension of Pintos to multiple
+CPUs, and the development of assignments that involve
+networking and interprocess communication (IPC).
%
% This file is automatically generated by citeulike.org
%
+@inproceedings{1352201,
+ address = {New York, NY, USA},
+ author = {Brylow, Dennis },
+ booktitle = {SIGCSE '08: Proceedings of the 39th SIGCSE technical symposium on Computer science education},
+ citeulike-article-id = {3170966},
+ doi = {http://doi.acm.org/10.1145/1352135.1352201},
+ location = {Portland, OR, USA},
+ pages = {192--196},
+ posted-at = {2008-08-29 02:40:19},
+ priority = {2},
+ publisher = {ACM},
+ title = {An experimental laboratory environment for teaching embedded operating systems},
+ url = {http://dx.doi.org/http://doi.acm.org/10.1145/1352135.1352201},
+ year = {2008}
+}
+
+
+
+@article{1067462,
+ address = {New York, NY, USA},
+ author = {Goldweber, Michael and Davoli, Renzo and Morsiani, Mauro },
+ citeulike-article-id = {3170961},
+ doi = {http://doi.acm.org/10.1145/1151954.1067462},
+ journal = {SIGCSE Bull.},
+ number = {3},
+ pages = {49--53},
+ posted-at = {2008-08-29 02:36:28},
+ priority = {2},
+ publisher = {ACM},
+	title = {The Kaya OS project and the {$\mu$}MPS hardware emulator},
+ url = {http://dx.doi.org/http://doi.acm.org/10.1145/1151954.1067462},
+ volume = {37},
+ year = {2005}
+}
+
+
+
+@inproceedings{1008027,
+ address = {New York, NY, USA},
+ author = {Davoli, Renzo },
+ booktitle = {ITiCSE '04: Proceedings of the 9th annual SIGCSE conference on Innovation and technology in computer science education},
+ citeulike-article-id = {3170960},
+ doi = {http://doi.acm.org/10.1145/1007996.1008027},
+ location = {Leeds, United Kingdom},
+ pages = {112--116},
+ posted-at = {2008-08-29 02:36:03},
+ priority = {2},
+ publisher = {ACM},
+ title = {Teaching operating systems administration with user mode linux},
+ url = {http://dx.doi.org/http://doi.acm.org/10.1145/1007996.1008027},
+ year = {2004}
+}
+
+
+
+@inproceedings{299805,
+ address = {New York, NY, USA},
+ author = {Goldweber, Michael and Barr, John and Camp, Tracy and Grahm, John and Hartley, Stephen },
+ booktitle = {SIGCSE '99: The proceedings of the thirtieth SIGCSE technical symposium on Computer science education},
+ citeulike-article-id = {3170955},
+ doi = {http://doi.acm.org/10.1145/299649.299805},
+ location = {New Orleans, Louisiana, United States},
+ pages = {348--349},
+ posted-at = {2008-08-29 02:32:49},
+ priority = {2},
+ publisher = {ACM},
+ title = {A comparison of operating systems courseware},
+ url = {http://dx.doi.org/http://doi.acm.org/10.1145/299649.299805},
+ year = {1999}
+}
+
+
+
+@inproceedings{563384,
+ address = {New York, NY, USA},
+ author = {Atkin, Benjamin and Sirer, Emin G. },
+ booktitle = {SIGCSE '02: Proceedings of the 33rd SIGCSE technical symposium on Computer science education},
+ citeulike-article-id = {3170954},
+	doi = {http://doi.acm.org/10.1145/563340.563384},
+ location = {Cincinnati, Kentucky},
+ pages = {116--120},
+ posted-at = {2008-08-29 02:32:08},
+ priority = {2},
+ publisher = {ACM},
+ title = {PortOS: an educational operating system for the Post-PC environment},
+	url = {http://dx.doi.org/http://doi.acm.org/10.1145/563340.563384},
+ year = {2002}
+}
+
+
+
+@article{1088822,
+ address = {, USA},
+ author = {Anderson, Charles L. and Nguyen, Minh },
+ citeulike-article-id = {3170948},
+ journal = {J. Comput. Small Coll.},
+ number = {1},
+ pages = {183--190},
+ posted-at = {2008-08-29 02:26:24},
+ priority = {2},
+ publisher = {Consortium for Computing Sciences in Colleges},
+ title = {A survey of contemporary instructional operating systems for use in undergraduate courses},
+ volume = {21},
+ year = {2005}
+}
+
+
+
+@inproceedings{1370881,
+ address = {New York, NY, USA},
+ author = {Babka, Vlastimil and Bulej, Lubomir and Decky, Martin and Holub, Viliam and Tuma, Petr },
+ booktitle = {SEESE '08: Proceedings of the 2008 international workshop on Software Engineering in east and south europe},
+ citeulike-article-id = {3170946},
+ doi = {http://doi.acm.org/10.1145/1370868.1370881},
+ location = {Leipzig, Germany},
+ pages = {71--78},
+ posted-at = {2008-08-29 02:24:56},
+ priority = {2},
+ publisher = {ACM},
+ title = {Teaching operating systems: student assignments and the software engineering perspective},
+ url = {http://dx.doi.org/http://doi.acm.org/10.1145/1370868.1370881},
+ year = {2008}
+}
+
+
+
+@article{1292450,
+ address = {, USA},
+ author = {Gaspar, Alessio and Boyer, Naomi and Ejnioui, Abdel },
+ citeulike-article-id = {3170945},
+ journal = {J. Comput. Small Coll.},
+ number = {2},
+ pages = {120--127},
+ posted-at = {2008-08-29 02:23:25},
+ priority = {2},
+ publisher = {Consortium for Computing Sciences in Colleges},
+ title = {Role of the C language in current computing curricula part 1: survey analysis},
+ volume = {23},
+ year = {2007}
+}
+
+
+
+@inproceedings{1167448,
+ address = {New York, NY, USA},
+ author = {Hill, James H. and Gokhale, Aniruddha S. },
+ booktitle = {ACM-SE 43: Proceedings of the 43rd annual Southeast regional conference},
+ citeulike-article-id = {3170941},
+ doi = {http://doi.acm.org/10.1145/1167350.1167448},
+ location = {Kennesaw, Georgia},
+ pages = {355--358},
+ posted-at = {2008-08-29 02:19:41},
+ priority = {2},
+ publisher = {ACM},
+ title = {Visual OS: design and implementation of a visual framework for learning operating system concepts},
+ url = {http://dx.doi.org/http://doi.acm.org/10.1145/1167350.1167448},
+ year = {2005}
+}
+
+
+
+@article{1352199,
+ address = {New York, NY, USA},
+ author = {Lawson, Barry and Barnett, Lewis },
+ citeulike-article-id = {3170937},
+ doi = {http://doi.acm.org/10.1145/1352322.1352199},
+ journal = {SIGCSE Bull.},
+ number = {1},
+ pages = {182--186},
+ posted-at = {2008-08-29 02:18:53},
+ priority = {2},
+ publisher = {ACM},
+ title = {Using iPodLinux in an introductory OS course},
+ url = {http://dx.doi.org/http://doi.acm.org/10.1145/1352322.1352199},
+ volume = {40},
+ year = {2008}
+}
+
+
+
+@inproceedings{Bellard2005QEMU,
+ address = {Berkeley, CA, USA},
+ author = {Bellard, Fabrice },
+ booktitle = {ATEC'05: Proceedings of the USENIX Annual Technical Conference 2005 on USENIX Annual Technical Conference},
+ citeulike-article-id = {2373099},
+ pages = {41},
+ posted-at = {2008-08-29 02:02:24},
+ priority = {2},
+ publisher = {USENIX Association},
+ title = {QEMU, a fast and portable dynamic translator},
+ url = {http://portal.acm.org/citation.cfm?id=1247401},
+ year = {2005}
+}
+
+
+
+@book{Boehm1981Software,
+ author = {Boehm, Barry W. },
+ citeulike-article-id = {126034},
+ isbn = {0138221227},
+ posted-at = {2008-08-29 02:01:18},
+ priority = {2},
+ publisher = {Prentice Hall PTR},
+ title = {Software Engineering Economics},
+ url = {http://portal.acm.org/citation.cfm?id=539425},
+ year = {1981}
+}
+
+
+
@book{Deitel2003Operating,
abstract = {The third edition of \_Operating Systems\_**\_ has been entirely updated to
reflect current core operating system concepts and design considerations. To
\documentclass{sig-alternate}
+\usepackage{graphicx}
+
+% from http://mintaka.sdsu.edu/GF/bibliog/latex/floats.html
+% Alter some LaTeX defaults for better treatment of figures:
+% See p.105 of "TeX Unbound" for suggested values.
+% See pp. 199-200 of Lamport's "LaTeX" book for details.
+% General parameters, for ALL pages:
+\renewcommand{\topfraction}{0.9} % max fraction of floats at top
+\renewcommand{\bottomfraction}{0.8} % max fraction of floats at bottom
+% Parameters for TEXT pages (not float pages):
+\setcounter{topnumber}{2}
+\setcounter{bottomnumber}{2}
+\setcounter{totalnumber}{4} % 2 may work better
+\setcounter{dbltopnumber}{2} % for 2-column pages
+\renewcommand{\dbltopfraction}{0.9} % fit big float above 2-col. text
+\renewcommand{\textfraction}{0.07} % allow minimal text w. figs
+% Parameters for FLOAT pages (not text pages):
+\renewcommand{\floatpagefraction}{0.7} % require fuller float pages
+% N.B.: floatpagefraction MUST be less than topfraction !!
+\renewcommand{\dblfloatpagefraction}{0.7} % require fuller float pages
+% remember to use [htp] or [htpb] for placement
+%------------------
+
\begin{document}
%
% --- Author Metadata here ---
\title{The Pintos Instructional Operating System Kernel}
-\subtitle{[Draft]}
+% \subtitle{[Draft]}
\numberofauthors{3}
\author{
%\keywords{Fill in keywords here if we need them}
+\input{figures}
+
\input{introduction}
\input{principles}
\input{assignments}
+\input{racedt}
+
\input{rest}
% remove the following line before submitting!
-\nocite{*}
+% \nocite{*}
\bibliographystyle{abbrv}