   1 @node Introduction, Pintos Tour, Top, Top
   2 @chapter Introduction
   3
   4 Welcome to Pintos.  Pintos is a simple operating system framework for
   5 the 80@var{x}86 architecture.  It supports kernel threads, loading and
   6 running user programs, and a file system, but it implements all of
   7 these in a very simple way.  In the Pintos projects, you and your
   8 project team will strengthen its support in all three of these areas.
   9 You will also add a virtual memory implementation.
  10
  11 Pintos could, theoretically, run on a regular IBM-compatible PC.
  12 Unfortunately, it is impractical to supply every CS 140 student
  13 a dedicated PC for use with Pintos.  Therefore, we will run Pintos projects
  14 in a system simulator, that is, a program that simulates an 80@var{x}86
  15 CPU and its peripheral devices accurately enough that unmodified operating
  16 systems and software can run under it.  In class we will use the
  17 @uref{http://bochs.sourceforge.net, , Bochs} and
  18 @uref{http://fabrice.bellard.free.fr/qemu/, ,
  19 qemu} simulators.  Pintos has also been tested with
  20 @uref{http://www.vmware.com/products/server/gsx_features.html, ,
  21 VMware GSX Server}.
  22
  23 These projects are hard.  CS 140 has a reputation for taking a lot of
  24 time, and deservedly so.  We will do what we can to reduce the workload, such
  25 as providing a lot of support material, but there is plenty of
  26 hard work that needs to be done.  We welcome your
  27 feedback.  If you have suggestions on how we can reduce the unnecessary
  28 overhead of assignments, cutting them down to the important underlying
  29 issues, please let us know.
  30
  31 This chapter explains how to get started working with Pintos.  You
  32 should read the entire chapter before you start work on any of the
  33 projects.
  34
  35 @menu
  36 * Getting Started::
  37 * Grading::
  38 * License::
  39 * Acknowledgements::
  40 * Trivia::
  41 @end menu
  42
  43 @node Getting Started
  44 @section Getting Started
  45
  46 To get started, you'll have to log into a machine that Pintos can be
  47 built on.  The CS 140 ``officially supported''
  48 Pintos development machines are the machines in Sweet Hall managed by
  49 Stanford ITSS, as described on the
  50 @uref{http://www.stanford.edu/services/cluster/environs/sweet/, , ITSS
  51 webpage}.  You may use the Solaris or Linux machines.  We will test your
  52 code on these machines, and the instructions given here assume this
  53 environment.  However, Pintos and its supporting tools are portable
  54 enough that they should build ``out of the box'' in other environments.
  55
  56 Once you've logged into one of these machines, either locally or
  57 remotely, start out by adding our binaries directory to your @env{PATH}
  58 environment variable.  Under @command{csh}, Stanford's login shell, you can do so
  59 with this command:@footnote{The term @samp{`uname -m`} expands to either
  60 @file{sun4u} or @file{i686} according to the type of computer you're
  61 logged into.}
  62 @example
  63 set path = ( /usr/class/cs140/`uname -m`/bin $path )
  64 @end example
  65 @noindent
  66 @strong{Notice that both @samp{`} are left single quotes or
  67 ``backticks,'' not apostrophes (@samp{'}).}
  68 It is a good idea to add this line to the @file{.cshrc} file
  69 in your home directory.  Otherwise, you'll have to type it every time
  70 you log in.
  71
  72 @menu
  73 * Source Tree Overview::
  74 * Building Pintos::
  75 * Running Pintos::
  76 * Debugging versus Testing::
  77 @end menu
  78
  79 @node Source Tree Overview
  80 @subsection Source Tree Overview
  81
  82 Now you can extract the source for Pintos into a directory named
  83 @file{pintos/src}, by executing
  84 @example
  85 tar xzf /usr/class/cs140/pintos/pintos.tar.gz
  86 @end example
  87 Alternatively, fetch
  88 @uref{http://@/www.stanford.edu/@/class/@/cs140/@/pintos/@/pintos.@/tar.gz}
  89 and extract it in a similar way.
  90
  91 Let's take a look at what's inside.  Here's the directory structure
  92 that you should see in @file{pintos/src}:
  93
  94 @table @file
  95 @item threads/
  96 Source code for the base kernel, which you will modify starting in
  97 project 1.
  98
  99 @item userprog/
 100 Source code for the user program loader, which you will modify
 101 starting with project 2.
 102
 103 @item vm/
 104 An almost empty directory.  You will implement virtual memory here in
 105 project 3.
 106
 107 @item filesys/
 108 Source code for a basic file system.  You will use this file system
 109 starting with project 2, but you will not modify it until project 4.
 110
 111 @item devices/
 112 Source code for I/O device interfacing: keyboard, timer, disk, etc.
 113 You will modify the timer implementation in project 1.  Otherwise
 114 you should have no need to change this code.
 115
 116 @item lib/
 117 An implementation of a subset of the standard C library.  The code in
 118 this directory is compiled into both the Pintos kernel and, starting
 119 from project 2, user programs that run under it.  In both kernel code
 120 and user programs, headers in this directory can be included using the
 121 @code{#include <@dots{}>} notation.  You should have little need to
 122 modify this code.
 123
 124 @item lib/kernel/
 125 Parts of the C library that are included only in the Pintos kernel.
 126 This also includes implementations of some data types that you are
 127 free to use in your kernel code: bitmaps, doubly linked lists, and
 128 hash tables.  In the kernel, headers in this
 129 directory can be included using the @code{#include <@dots{}>}
 130 notation.
 131
 132 @item lib/user/
 133 Parts of the C library that are included only in Pintos user programs.
 134 In user programs, headers in this directory can be included using the
 135 @code{#include <@dots{}>} notation.
 136
 137 @item tests/
 138 Tests for each project.  You can modify this code if it helps you test
 139 your submission, but we will replace it with the originals before we run
 140 the tests.
 141
 142 @item examples/
 143 Example user programs for use starting with project 2.
 144
 145 @item misc/
 146 @itemx utils/
 147 These files may come in handy if you decide to try working with Pintos
 148 away from the ITSS machines.  Otherwise, you can ignore them.
 149 @end table
 150
 151 @node Building Pintos
 152 @subsection Building Pintos
 153
 154 As the next step, build the source code supplied for
 155 the first project.  First, @command{cd} into the @file{threads}
 156 directory.  Then, issue the @samp{make} command.  This will create a
 157 @file{build} directory under @file{threads}, populate it with a
 158 @file{Makefile} and a few subdirectories, and then build the kernel
 159 inside.  The entire build should take less than 30 seconds.
 160
 161 Watch the commands executed during the build.  On the Linux machines,
 162 the ordinary system tools are used.  On a SPARC machine, special build
 163 tools are used, whose names begin with @samp{i386-elf-}, e.g.@:
 164 @code{i386-elf-gcc}, @code{i386-elf-ld}.  These are ``cross-compiler''
 165 tools.  That is, the build is running on a SPARC machine (called the
 166 @dfn{host}), but the result will run on a simulated 80@var{x}86 machine
 167 (called the @dfn{target}).  The @samp{i386-elf-@var{program}} tools are
 168 specially built for this configuration.
 169
 170 After the build, these are the interesting files in the
 171 @file{build} directory:
 172
 173 @table @file
 174 @item Makefile
 175 A copy of @file{pintos/src/Makefile.build}.  It describes how to build
 176 the kernel.  @xref{Adding Source Files}, for more information.
 177
 178 @item kernel.o
 179 Object file for the entire kernel.  This is the result of linking
 180 object files compiled from each individual kernel source file into a
 181 single object file.  It contains debug information, so you can run
 182 @command{gdb} or @command{backtrace} (@pxref{Backtraces}) on it.
 183
 184 @item kernel.bin
 185 Memory image of the kernel.  These are the exact bytes loaded into
 186 memory to run the Pintos kernel.  To simplify loading, it is always
 187 padded out with zero bytes up to an exact multiple of 4 kB in
 188 size.
 189
 190 @item loader.bin
 191 Memory image for the kernel loader, a small chunk of code written in
 192 assembly language that reads the kernel from disk into memory and
 193 starts it up.  It is exactly 512 bytes long, a size fixed by the
 194 PC BIOS.
 195
 196 @item os.dsk
 197 Disk image for the kernel, which is just @file{loader.bin} followed by
 198 @file{kernel.bin}.  This file is used as a ``virtual disk'' by the
 199 simulator.
 200 @end table
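
The sizes described in the table above can be checked with a little
arithmetic.  The C sketch below rounds a kernel size up to the next
multiple of 4 kB and adds the 512-byte loader to estimate the size of
@file{os.dsk}.  It is only an illustration: the function names
@code{padded_kernel_size} and @code{disk_image_size} are hypothetical
and are not taken from the Pintos build scripts, and the sketch assumes
that ``4 kB'' means 4,096 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define LOADER_SIZE 512u        /* loader.bin is exactly 512 bytes. */
#define KERNEL_ALIGN 4096u      /* Assumption: "4 kB" means 4,096 bytes. */

/* Returns RAW_SIZE rounded up to the next multiple of KERNEL_ALIGN,
   mirroring the zero padding applied to kernel.bin. */
size_t
padded_kernel_size (size_t raw_size)
{
  size_t padded = (raw_size + KERNEL_ALIGN - 1) / KERNEL_ALIGN * KERNEL_ALIGN;

  /* Invariant: padding never shrinks the image and always aligns it. */
  assert (padded >= raw_size && padded % KERNEL_ALIGN == 0);
  return padded;
}

/* Returns the size of a disk image that consists of the loader
   followed by the padded kernel. */
size_t
disk_image_size (size_t raw_kernel_size)
{
  return LOADER_SIZE + padded_kernel_size (raw_kernel_size);
}
```

Comparing the result of @code{disk_image_size} with the size that
@command{ls -l} reports for @file{os.dsk} is a quick sanity check, under
the assumptions stated above.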
 201
 202 Subdirectories of @file{build} contain object files (@file{.o}) and
 203 dependency files (@file{.d}), both produced by the compiler.  The
 204 dependency files tell @command{make} which source files need to be
 205 recompiled when other source or header files are changed.
 206
 207 @node Running Pintos
 208 @subsection Running Pintos
 209
 210 We've supplied a program for conveniently running Pintos in a simulator,
 211 called @command{pintos}.  In the simplest case, you can invoke
 212 @command{pintos} as @code{pintos @var{argument}@dots{}}.  Each
 213 @var{argument} is passed to the Pintos kernel for it to act on.
 214
 215 Try it out.  First @command{cd} into the newly created @file{build}
 216 directory.  Then issue the command @code{pintos run alarm-multiple},
 217 which passes the arguments @code{run alarm-multiple} to the Pintos
 218 kernel.  In these arguments, @command{run} instructs the kernel to run a
 219 test and @code{alarm-multiple} is the test to run.
 220
 221 This command creates a @file{bochsrc.txt} file, which is needed for
 222 running Bochs, and then invokes Bochs.  Bochs opens a new window that
 223 represents the simulated machine's display, and a BIOS message briefly
 224 flashes.  Then Pintos boots and runs the @code{alarm-multiple} test
 225 program, which outputs a few screenfuls of text.  When it's done, you
 226 can close Bochs by clicking on the ``Power'' button in the window's top
 227 right corner, or rerun the whole process by clicking on the ``Reset''
 228 button just to its left.  The other buttons are not very useful for our
 229 purposes.
 230
 231 (If no window appeared at all, and you just got a terminal full of
 232 corrupt-looking text, then you're probably logged in remotely and X
 233 forwarding is not set up correctly.  In this case, you can fix your X
 234 setup, or you can use the @option{-v} option to disable X output:
 235 @code{pintos -v -- run alarm-multiple}.)
 236
 237 The text printed by Pintos inside Bochs probably went by too quickly to
 238 read.  However, you've probably noticed by now that the same text was
 239 displayed in the terminal you used to run @command{pintos}.  This is
 240 because Pintos sends all output both to the VGA display and to the first
 241 serial port, and by default the serial port is connected to Bochs's
 242 @code{stdout}.  You can log this output to a file by redirecting at the
 243 command line, e.g.@: @code{pintos run alarm-multiple > logfile}.
 244
 245 The @command{pintos} program offers several options for configuring the
 246 simulator or the virtual hardware.  If you specify any options, they
 247 must precede the commands passed to the Pintos kernel and be separated
 248 from them by @option{--}, so that the whole command looks like
 249 @code{pintos @var{option}@dots{} -- @var{argument}@dots{}}.  Invoke
 250 @code{pintos} without any arguments to see a list of available options.
 251 Options can select a simulator to use: the default is Bochs, but on the
 252 Linux machines @option{--qemu} selects qemu.  You can run the simulator
 253 with a debugger (@pxref{gdb}).  You can set the amount of memory to give
 254 the VM.  Finally, you can select how you want VM output to be displayed:
 255 use @option{-v} to turn off the VGA display, @option{-t} to use your
 256 terminal window as the VGA display instead of opening a new window
 257 (Bochs only), or @option{-s} to suppress the serial output to
 258 @code{stdout}.
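
The division between simulator options and kernel arguments can be
pictured with a short C sketch.  It illustrates only the @option{--}
convention described above and is not the logic of the @command{pintos}
program itself; @code{find_separator} and @code{first_kernel_arg} are
hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first "--" in ARGV[1] through
   ARGV[ARGC - 1], or 0 if there is none.  Index 0 can mean
   "not found" because ARGV[0] is the program name. */
int
find_separator (int argc, char *argv[])
{
  int i;

  assert (argc >= 1 && argv != NULL);
  for (i = 1; i < argc; i++)
    if (strcmp (argv[i], "--") == 0)
      return i;
  return 0;
}

/* Returns the index of the first kernel argument.  Without a "--"
   separator, every argument is treated as a kernel argument. */
int
first_kernel_arg (int argc, char *argv[])
{
  int sep = find_separator (argc, argv);

  return sep > 0 ? sep + 1 : 1;
}
```

For @code{pintos -v -- run alarm-multiple}, everything before the
separator is a simulator option and everything after it goes to the
kernel.  For @code{pintos run alarm-multiple}, there is no separator, so
both words go to the kernel.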
 259
 260 The Pintos kernel has commands and options other than @command{run}.
 261 These are not very interesting for now, but you can see a list of them
 262 using @option{-h}, e.g.@: @code{pintos -h}.
 263
 264 @node Debugging versus Testing
 265 @subsection Debugging versus Testing
 266
 267 When you're debugging code, it's useful to be able to run a
 268 program twice and have it do exactly the same thing.  On second and
 269 later runs, you can make new observations without having to discard or
 270 verify your old observations.  This property is called
 271 ``reproducibility.''  The simulator we use by default, Bochs, can be set
 272 up for
 273 reproducibility, and that's the way that @command{pintos} invokes it
 274 by default.
 275
 276 Of course, a simulation can only be reproducible from one run to the
 277 next if its input is the same each time.  For simulating an entire
 278 computer, as we do, this means that every part of the computer must be
 279 the same.  For example, you must use the same command-line arguments, the
 280 same disks, the same version
 281 of Bochs, and you must not hit any keys on the keyboard (because you
 282 could not be sure to hit them at exactly the same point each time)
 283 during the runs.
 284
 285 While reproducibility is useful for debugging, it is a problem for
 286 testing thread synchronization, an important part of most of the projects.  In
 287 particular, when Bochs is set up for reproducibility, timer interrupts
 288 will come at perfectly reproducible points, and therefore so will
 289 thread switches.  That means that running the same test several times
 290 doesn't give you any greater confidence in your code's correctness
 291 than does running it only once.
 292
 293 So, to make your code easier to test, we've added a feature, called
 294 ``jitter,'' to Bochs, that makes timer interrupts come at random
 295 intervals, but in a perfectly predictable way.  In particular, if you
 296 invoke @command{pintos} with the option @option{-j @var{seed}}, timer
 297 interrupts will come at irregularly spaced intervals.  Within a single
 298 @var{seed} value, execution will still be reproducible, but timer
 299 behavior will change as @var{seed} is varied.  Thus, for the highest
 300 degree of confidence you should test your code with many seed values.
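
Intervals that are ``random'' yet reproducible are the ordinary behavior
of a seeded pseudo-random number generator.  The C sketch below shows
the idea; it is not the jitter code added to Bochs.  The generator
constants, the range of 1 to 100 ticks, and the names
@code{struct jitter}, @code{jitter_init}, and @code{jitter_next} are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State of a small linear congruential generator. */
struct jitter
  {
    uint32_t state;             /* Current generator state. */
  };

/* Starts the sequence of J from SEED. */
void
jitter_init (struct jitter *j, uint32_t seed)
{
  assert (j != NULL);
  j->state = seed;
}

/* Returns the next timer interval, in hypothetical ticks, in the
   range 1 through 100.  The sequence depends only on the seed. */
unsigned
jitter_next (struct jitter *j)
{
  assert (j != NULL);
  j->state = j->state * 1664525u + 1013904223u;
  return (j->state >> 16) % 100 + 1;
}
```

Two generators started from the same seed produce identical interval
sequences, which is what keeps a run reproducible.  Changing the seed
changes the sequence, which is why testing with many seeds exercises
more thread interleavings.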
 301
 302 On the other hand, when Bochs runs in reproducible mode, timings are not
 303 realistic, meaning that a ``one-second'' delay may be much shorter or
 304 even much longer than one second.  You can invoke @command{pintos} with
 305 a different option, @option{-r}, to set up Bochs for realistic
 306 timings, in which a one-second delay should take approximately one
 307 second of real time.  Simulation in real-time mode is not reproducible,
 308 and options @option{-j} and @option{-r} are mutually exclusive.
 309
 310 On the Linux machines only, the qemu simulator is available as an
 311 alternative to Bochs (use @option{--qemu} when invoking
 312 @command{pintos}).  The qemu simulator is much faster than Bochs, but it
 313 only supports real-time simulation and does not have a reproducible
 314 mode.
 315
 316 @node Grading
 317 @section Grading
 318
 319 We will grade your assignments based on test results and design quality,
 320 each of which comprises 50% of your grade.
 321
 322 @menu
 323 * Testing::
 324 * Design::
 325 @end menu
 326
 327 @node Testing
 328 @subsection Testing
 329
 330 Your test result grade will be based on our tests.  Each project has
 331 several tests, each of which has a name beginning with @file{tests}.
 332 To completely test your submission, invoke @code{make check} from the
 333 project @file{build} directory.  This will build and run each test and
 334 print a ``pass'' or ``fail'' message for each one.  When a test fails,
 335 @command{make check} also prints some details of the reason for failure.
 336 After running all the tests, @command{make check} also prints a summary
 337 of the test results.
 338
 339 For project 1, the tests will probably run faster in Bochs.  For the
 340 rest of the projects, they will probably run faster in qemu.
 341
 342 You can also run individual tests one at a time.  A given test @var{t}
 343 writes its output to @file{@var{t}.output}, then a script scores the
 344 output as ``pass'' or ``fail'' and writes the verdict to
 345 @file{@var{t}.result}.  To run and grade a single test, @command{make}
 346 the @file{.result} file explicitly from the @file{build} directory, e.g.@:
 347 @code{make tests/threads/alarm-multiple.result}.  If @command{make} says
 348 that the test result is up-to-date, but you want to re-run it anyway,
 349 either run @code{make clean} or delete the @file{.output} file by hand.
 350
 351 By default, each test provides feedback only at completion, not during
 352 its run.  If you prefer, you can observe the progress of each test by
 353 specifying @option{VERBOSE=1} on the @command{make} command line, as in
 354 @code{make check VERBOSE=1}.  You can also provide arbitrary options to the
 355 @command{pintos} run by the tests with @option{PINTOSOPTS='@dots{}'},
 356 e.g.@: @code{make check PINTOSOPTS='--qemu'} to run the tests under
 357 qemu.
 358
 359 All of the tests and related files are in @file{pintos/src/tests}.
 360 Before we test your submission, we will replace the contents of that
 361 directory with a pristine, unmodified copy, to ensure that the correct
 362 tests are used.  Thus, you can modify some of the tests if that helps in
 363 debugging, but we will run the originals.
 364
 365 All software has bugs, so some of our tests may be flawed.  If you think
 366 a test failure is a bug in the test, not a bug in your code,
 367 please point it out.  We will look at it and fix it if necessary.
 368
 369 Please don't try to take advantage of our generosity in giving out our
 370 test suite.  Your code has to work properly in the general case, not
 371 just for the test cases we supply.  For example, it would be unacceptable
 372 to explicitly base the kernel's behavior on the name of the running
 373 test case.  Such attempts to side-step the test cases will receive no
 374 credit.  If you think your solution may be in a gray area here, please
 375 ask us about it.
 376
 377 @node Design
 378 @subsection Design
 379
 380 We will judge your design based on the design document and the source
 381 code that you submit.  We will read your entire design document and much
 382 of your source code.
 383
 384 Don't forget that the design document is 50% of your project grade.  It
 385 is better to spend one or two hours writing a good design document than
 386 it is to spend that time getting the last 5% of the points for tests and
 387 then trying to rush through writing the design document in the last 15
 388 minutes.
 389
 390 @menu
 391 * Design Document::
 392 * Source Code::
 393 @end menu
 394
 395 @node Design Document
 396 @subsubsection Design Document
 397
 398 We provide a design document template for each project.  For each
 399 significant part of a project, the template asks questions in four
 400 areas:
 401
 402 @table @strong
 403 @item Data Structures
 404
 405 The instructions for this section are always the same:
 406
 407 @quotation
 408 Copy here the declaration of each new or changed @code{struct} or
 409 @code{struct} member, global or static variable, @code{typedef}, or
 410 enumeration.  Identify the purpose of each in 25 words or less.
 411 @end quotation
 412
 413 The first part is mechanical.  Just copy new or modified declarations
 414 into the design document, to highlight for us the actual changes to data
 415 structures.  Each declaration should include the comment that should
 416 accompany it in the source code (see below).
 417
 418 We also ask for a very brief description of the purpose of each new or
 419 changed data structure.  The limit of 25 words or less is a guideline
 420 intended to save your time and avoid duplication with later areas.
 421
 422 @item Algorithms
 423
 424 This is where you tell us how your code works, through questions that
 425 probe your understanding of your code.  We might not be able to easily
 426 figure it out from the code, because many creative solutions exist for
 427 most OS problems.  Help us out a little.
 428
 429 Your answers should be at a level below the high-level description of
 430 requirements given in the assignment.  We have read the assignment too,
 431 so it is unnecessary to repeat or rephrase what is stated there.  On the
 432 other hand, your answers should be at a level above the low level of the
 433 code itself.  Don't give a line-by-line run-down of what your code does.
 434 Instead, use your answers to explain how your code works to implement
 435 the requirements.
 436
 437 @item Synchronization
 438
 439 An operating system kernel is a complex, multithreaded program, in which
 440 synchronizing multiple threads can be difficult.  This section asks
 441 about how you chose to synchronize this particular type of activity.
 442
 443 @item Rationale
 444
 445 Whereas the other sections primarily ask ``what'' and ``how,'' the
 446 rationale section concentrates on ``why.''  This is where we ask you to
 447 justify some design decisions, by explaining why the choices you made
 448 are better than alternatives.  You may be able to state these in terms
 449 of time and space complexity, which can be made as rough or informal
 450 arguments (formal language or proofs are unnecessary).
 451 @end table
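
As an illustration of the Data Structures area, an entry might consist
of declarations like the following, each copied from the source together
with its comment.  The structure, its members, and the variable are
invented for a fictitious project; they are not part of Pintos or of any
project solution.

```c
/* A message waiting in a mailbox (hypothetical example). */
struct message
  {
    int sender_id;              /* Identifier of the sending thread. */
    char payload[64];           /* Message contents, null-terminated. */
  };

/* Number of messages currently queued in all mailboxes
   (hypothetical example). */
int queued_message_cnt;
```

A one-line statement of purpose for each declaration, such as ``holds
one undelivered message,'' would then complete the entry.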
 452
 453 An incomplete, evasive, or non-responsive design document or one that
 454 strays from the template without good reason may be penalized.
 455 Incorrect capitalization, punctuation, spelling, or grammar can also
 456 cost points.  @xref{Project Documentation}, for a sample design document
 457 for a fictitious project.
 458
 459 @node Source Code
 460 @subsubsection Source Code
 461
 462 Your design will also be judged by looking at your source code.  We will
 463 typically look at the differences between the original Pintos source
 464 tree and your submission, based on the output of a command like
 465 @code{diff -urpb pintos.orig pintos.submitted}.  We will try to match up your
 466 description of the design with the code submitted.  Important
 467 discrepancies between the description and the actual code will be
 468 penalized, as will any bugs we find by spot checks.
 469
 470 The most important aspects of source code design are those that specifically
 471 relate to the operating system issues at stake in the project.  For
 472 example, the organization of an inode is an important part of file
 473 system design, so in the file system project a poorly designed inode
 474 would lose points.  Other issues are much less important.  For
 475 example, multiple Pintos design problems call for a ``priority
 476 queue,'' that is, a dynamic collection from which the minimum (or
 477 maximum) item can quickly be extracted.  Fast priority queues can be
 478 implemented many ways, but we do not expect you to build a fancy data
 479 structure even if it might improve performance.  Instead, you are
 480 welcome to use a linked list (and Pintos even provides one with
 481 convenient functions for sorting and finding minimums and maximums).
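
To make the point concrete, here is a minimal sketch of a priority
queue kept as a sorted, singly linked list.  It deliberately avoids the
Pintos list library so that it stands alone, and the names
@code{struct pq_node}, @code{pq_insert}, and @code{pq_pop_max} are
hypothetical.  Insertion takes time linear in the queue length, and
extracting the maximum takes constant time.

```c
#include <assert.h>
#include <stddef.h>

/* One queue element.  The caller owns the storage. */
struct pq_node
  {
    int priority;               /* Larger values are extracted first. */
    struct pq_node *next;       /* Next element, in descending priority. */
  };

/* Inserts NODE into the list at *HEAD, keeping descending priority
   order.  Among equal priorities, earlier insertions stay in front. */
void
pq_insert (struct pq_node **head, struct pq_node *node)
{
  assert (head != NULL && node != NULL);
  while (*head != NULL && (*head)->priority >= node->priority)
    head = &(*head)->next;
  node->next = *head;
  *head = node;
}

/* Removes and returns the highest-priority element, or a null
   pointer if the queue is empty. */
struct pq_node *
pq_pop_max (struct pq_node **head)
{
  struct pq_node *max;

  assert (head != NULL);
  max = *head;
  if (max != NULL)
    {
      *head = max->next;
      max->next = NULL;
    }
  return max;
}
```

An empty queue is simply a null head pointer, so no separate
initialization function is needed in this sketch.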
 482
 483 Pintos is written in a consistent style.  Make your additions and
 484 modifications in existing Pintos source files blend in, not stick out.
 485 In new source files, adopt the existing Pintos style by preference, but
 486 make your code self-consistent at the very least.  There should not be a
 487 patchwork of different styles that makes it obvious that three different
 488 people wrote the code.  Use horizontal and vertical white space to make
 489 code readable.  Add a brief comment on every structure, structure
 490 member, global or static variable, and function definition.  Update
 491 existing comments as you modify code.  Don't comment out or use the
 492 preprocessor to ignore blocks of code (instead, remove them entirely).
 493 Use assertions to document key invariants.  Decompose code into
 494 functions for clarity.  Code that is difficult to understand because it
 495 violates these or other ``common sense'' software engineering practices
 496 will be penalized.
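
As one possible illustration of these guidelines, the sketch below
comments every structure member and function and uses assertions to
state the invariants each function relies on.  The ring buffer is a
hypothetical example, not Pintos code, and it uses the standard
@code{assert} macro so that it stands alone.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8             /* Capacity of a ring, in bytes. */

/* A fixed-size ring buffer of bytes (hypothetical example). */
struct ring
  {
    unsigned char data[RING_SIZE];      /* Storage. */
    size_t head;                        /* Index of the next byte to read. */
    size_t count;                       /* Number of bytes stored. */
  };

/* Appends BYTE to R.  The caller must ensure that R is not full. */
void
ring_put (struct ring *r, unsigned char byte)
{
  assert (r != NULL);
  assert (r->count < RING_SIZE);        /* Invariant: never overfill. */
  r->data[(r->head + r->count) % RING_SIZE] = byte;
  r->count++;
}

/* Removes and returns the oldest byte in R.  The caller must ensure
   that R is not empty. */
unsigned char
ring_get (struct ring *r)
{
  unsigned char byte;

  assert (r != NULL);
  assert (r->count > 0);                /* Invariant: never read empty. */
  byte = r->data[r->head];
  r->head = (r->head + 1) % RING_SIZE;
  r->count--;
  return byte;
}
```

The assertions double as documentation: a reader learns the
preconditions of each function without tracing every caller.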
 497
 498 In the end, remember your audience.  Code is written primarily to be
 499 read by humans.  It has to be acceptable to the compiler too, but the
 500 compiler doesn't care about how it looks or how well it is written.
 501
 502 @node License
 503 @section License
 504
 505 Pintos is distributed under a liberal license that allows free use,
 506 modification, and distribution.  Students and others who work on Pintos
 507 own the code that they write and may use it for any purpose.
 508
 509 In the context of Stanford's CS 140 course, please respect the spirit
 510 and the letter of the honor code by refraining from reading any homework
 511 solutions available online or elsewhere.  Reading the source code for
 512 other operating system kernels, such as Linux or FreeBSD, is allowed,
 513 but do not copy code from them literally.  Please cite the code that
 514 inspired your own in your design documentation.
 515
 516 Pintos comes with NO WARRANTY, not even for MERCHANTABILITY or FITNESS
 517 FOR A PARTICULAR PURPOSE.
 518
 519 The @file{LICENSE} file at the top level of the Pintos source
 520 distribution has full details of the license and lack of warranty.
 521
 522 @node Acknowledgements
 523 @section Acknowledgements
 524
 525 Pintos and this documentation were written by Ben Pfaff
 526 @email{blp@@cs.stanford.edu}.
 527
 528 The original structure and form of Pintos were inspired by the Nachos
 529 instructional operating system from the University of California,
 530 Berkeley.  A few of the source files were originally more-or-less
 531 literal translations of the Nachos C++ code into C.  These files bear
 532 the original UCB license notice.
 533
 534 A few of the Pintos source files are derived from code used in the
 535 Massachusetts Institute of Technology's 6.828 advanced operating systems
 536 course.  These files bear the original MIT license notice.
 537
 538 The Pintos projects and documentation originated with those designed for
 539 Nachos by current and former CS 140 teaching assistants at Stanford
 540 University, including at least Yu Ping, Greg Hutchins, Kelly Shaw, Paul
 541 Twohey, Sameer Qureshi, and John Rector.  If you're not on this list but
 542 should be, please let me know.
 543
 544 Example code for condition variables (@pxref{Condition Variables}) is
 545 from classroom slides originally by Dawson Engler and updated by Mendel
 546 Rosenblum.
 547
 548 @node Trivia
 549 @section Trivia
 550
 551 Pintos originated as a replacement for Nachos with a similar design.
 552 Since then Pintos has greatly diverged from the Nachos design.  Pintos
 553 differs from Nachos in two important ways.  First, Pintos runs on real
 554 or simulated 80@var{x}86 hardware, but Nachos runs as a process on a
 555 host operating system.  Second, Pintos is written in C like most
 556 real-world operating systems, but Nachos is written in C++.
 557
 558 Why the name ``Pintos''?  First, like nachos, pinto beans are a common
 559 Mexican food.  Second, Pintos is small and a ``pint'' is a small amount.
 560 Third, like drivers of the eponymous car, students are likely to have
 561 trouble with blow-ups.