@node Reference Guide
@appendix Reference Guide
-This chapter is a reference for the Pintos code. It covers the
-entire code base, but you'll only be using Pintos one part at a time,
-so you may find that you want to read each part as you work on the
+This chapter is a reference for the Pintos code. The reference guide
+does not cover all of the code in Pintos, but it does cover those
+pieces that students most often find troublesome. You may find that
+you want to read each part of the reference guide as you work on the
project where it becomes important.
-(Actually, the reference guide is currently incomplete.)
-
We recommend using ``tags'' to follow along with references to function
and variable names (@pxref{Tags}).
@menu
* Pintos Loader::
-* Kernel Initialization::
+* Kernel Initialization::
+* Physical Memory Map::
@end menu
@node Pintos Loader
@func{main}, which never returns.
There's one more trick: the Pintos kernel command line
-is in stored the boot loader. The @command{pintos} program actually
+is stored in the boot loader. The @command{pintos} program actually
modifies a copy of the boot loader on disk each time it runs the kernel,
putting
in whatever command line arguments the user supplies to the kernel,
module's source code, and @file{@var{module}.h} is the module's
header.
-First we initialize kernel RAM in @func{ram_init}. The first step
-is to clear out the kernel's so-called ``BSS'' segment. The BSS is a
+The first step in @func{main} is to call @func{bss_init}, which clears
+out the kernel's ``BSS'', which is the traditional name for a
segment that should be initialized to all zeros. In most C
implementations, whenever you
declare a variable outside a function without providing an
initializer, that variable goes into the BSS. Because it's all zeros, the
BSS isn't stored in the image that the loader brought into memory. We
-just use @func{memset} to zero it out. The other task of
-@func{ram_init} is to read out the machine's memory size from where
-the loader stored it and put it into the @code{ram_pages} variable for
-later use.
-
-Next, @func{thread_init} initializes the thread system. We will defer
-full discussion to our discussion of Pintos threads below. It is
-called so early in initialization because the console, initialized
-just afterward, tries to use locks, and locks in turn require there to be a
-running thread.
-
-Then we initialize the console so that @func{printf} will work.
-@func{main} calls @func{vga_init}, which initializes the VGA text
-display and clears the screen. It also calls @func{serial_init_poll}
-to initialize the first serial port in ``polling mode,'' that is,
-where the kernel busy-waits for the port to be ready for each
-character to be output. (We use polling mode until we're ready to enable
-interrupts, later.) Finally we initialize the console device and
-print a startup message to the console.
-
-@func{main} calls @func{read_command_line} to break the kernel command
+just use @func{memset} to zero it out.
+
+Next, @func{main} calls @func{read_command_line} to break the kernel command
line into arguments, then @func{parse_options} to read any options at
the beginning of the command line. (Actions specified on the
command line execute later.)
-@func{main} calls @func{random_init} to initialize the kernel random
-number generator. If the user specified @option{-rs} on the
-@command{pintos} command line, @func{parse_options} already did
-this, but calling it a second time is harmless.
+@func{thread_init} initializes the thread system. We will defer full
+discussion to our discussion of Pintos threads below. It is called so
+early in initialization because a valid thread structure is a
+prerequisite for acquiring a lock, and lock acquisition in turn is
+important to other Pintos subsystems. Then we initialize the console
+and print a startup message to the console.
-The next block of functions we call initialize the kernel's memory
+The next block of functions we call initializes the kernel's memory
system. @func{palloc_init} sets up the kernel page allocator, which
doles out memory one or more pages at a time (@pxref{Page Allocator}).
@func{malloc_init} sets
@func{timer_calibrate} calibrates the timer for accurate short delays.
If the file system is compiled in, as it will starting in project 2, we
-initialize the disks with @func{disk_init}, then the
+initialize the IDE disks with @func{ide_init}, then the
file system with @func{filesys_init}.
Boot is complete, so we print a message.
@func{main} calls @func{thread_exit}, which allows any other running
threads to continue running.
+@node Physical Memory Map
+@subsection Physical Memory Map
+
+@multitable {@t{00000000}--@t{00000000}} {Hardware} {Some much longer explanatory text}
+@headitem Memory Range
+@tab Owner
+@tab Contents
+
+@item @t{00000000}--@t{000003ff} @tab CPU @tab Real mode interrupt table.
+@item @t{00000400}--@t{000005ff} @tab BIOS @tab Miscellaneous data area.
+@item @t{00000600}--@t{00007bff} @tab --- @tab ---
+@item @t{00007c00}--@t{00007dff} @tab Pintos @tab Loader.
+@item @t{0000e000}--@t{0000efff} @tab Pintos
+@tab Stack for loader; kernel stack and @struct{thread} for initial
+kernel thread.
+@item @t{0000f000}--@t{0000ffff} @tab Pintos
+@tab Page directory for startup code.
+@item @t{00010000}--@t{00020000} @tab Pintos
+@tab Page tables for startup code.
+@item @t{00020000}--@t{0009ffff} @tab Pintos
+@tab Kernel code, data, and uninitialized data segments.
+@item @t{000a0000}--@t{000bffff} @tab Video @tab VGA display memory.
+@item @t{000c0000}--@t{000effff} @tab Hardware
+@tab Reserved for expansion card RAM and ROM.
+@item @t{000f0000}--@t{000fffff} @tab BIOS @tab ROM BIOS.
+@item @t{00100000}--@t{03ffffff} @tab Pintos @tab Dynamic memory allocation.
+@end multitable
+
@node Threads
@section Threads
* Semaphores::
* Locks::
* Monitors::
-* Memory Barriers::
+* Optimization Barriers::
@end menu
@node Disabling Interrupts
A @dfn{lock} is like a semaphore with an initial value of 1
(@pxref{Semaphores}). A lock's equivalent of ``up'' is called
-``acquire'', and the ``down'' operation is called ``release''.
+``release'', and the ``down'' operation is called ``acquire''.
Compared to a semaphore, a lock has one added restriction: only the
thread that acquires a lock, called the lock's ``owner'', is allowed to
@}
@end example
-@node Memory Barriers
-@subsection Memory Barriers
+Note that @code{BUF_SIZE} must divide evenly into @code{SIZE_MAX + 1}
+for the above code to be completely correct. Otherwise, it will fail
+the first time @code{head} wraps around to 0. In practice,
+@code{BUF_SIZE} would ordinarily be a power of 2.
+
+@node Optimization Barriers
+@subsection Optimization Barriers
@c We should try to come up with a better example.
@c Perhaps something with a linked list?
-Suppose we add a ``feature'' that, whenever a timer interrupt
-occurs, the character in global variable @code{timer_put_char} is
-printed on the console, but only if global Boolean variable
-@code{timer_do_put} is true.
+An @dfn{optimization barrier} is a special statement that prevents the
+compiler from making assumptions about the state of memory across the
+barrier. The compiler will not reorder reads or writes of variables
+across the barrier or assume that a variable's value is unmodified
+across the barrier, except for local variables whose address is never
+taken. In Pintos, @file{threads/synch.h} defines the @code{barrier()}
+macro as an optimization barrier.
-If interrupts are enabled, this code for setting up @samp{x} to be
-printed is clearly incorrect, because the timer interrupt could intervene
-between the two assignments:
+One reason to use an optimization barrier is when data can change
+asynchronously, without the compiler's knowledge, e.g.@: by another
+thread or an interrupt handler. The @func{too_many_loops} function in
+@file{devices/timer.c} is an example. This function starts out by
+busy-waiting in a loop until a timer tick occurs:
@example
-timer_do_put = true; /* INCORRECT CODE */
-timer_put_char = 'x';
+/* Wait for a timer tick. */
+int64_t start = ticks;
+while (ticks == start)
+ barrier ();
@end example
-It might not be as obvious that the following code is just as
-incorrect:
+@noindent
+Without an optimization barrier in the loop, the compiler could
+conclude that the loop would never terminate, because @code{start} and
+@code{ticks} start out equal and the loop itself never changes them.
+It could then ``optimize'' the function into an infinite loop, which
+would definitely be undesirable.
+
+Optimization barriers can be used to avoid other compiler
+optimizations. The @func{busy_wait} function, also in
+@file{devices/timer.c}, is an example. It contains this loop:
@example
-timer_put_char = 'x'; /* INCORRECT CODE */
-timer_do_put = true;
+while (loops-- > 0)
+ barrier ();
@end example
-The reason this second example might be a problem is that the compiler
-is, in general, free to reorder operations when it doesn't have a
-visible reason to keep them in the same order. In this case, the
-compiler doesn't know that the order of assignments is important, so its
-optimization pass is permitted to exchange their order.
-There's no telling whether it will actually do this, and it is possible
-that passing the compiler different optimization flags or changing
-compiler versions will produce different behavior.
-
-The following is @emph{not} a solution, because locks neither prevent
-interrupts nor prevent the compiler from reordering the code within the
-region where the lock is held:
+@noindent
+The goal of this loop is to busy-wait by counting @code{loops} down
+from its original value to 0. Without the barrier, the compiler could
+delete the loop entirely, because it produces no useful output and has
+no side effects. The barrier forces the compiler to pretend that the
+loop body has an important effect.
+
+Finally, optimization barriers can be used to force the ordering of
+memory reads or writes. For example, suppose we add a ``feature''
+that, whenever a timer interrupt occurs, the character in global
+variable @code{timer_put_char} is printed on the console, but only if
+global Boolean variable @code{timer_do_put} is true. The best way to
+set up @samp{x} to be printed is then to use an optimization barrier,
+like this:
@example
-lock_acquire (&timer_lock); /* INCORRECT CODE */
timer_put_char = 'x';
+barrier ();
timer_do_put = true;
-lock_release (&timer_lock);
@end example
-Fortunately, real solutions do exist. One possibility is to
-disable interrupts around the assignments. This does not prevent
-reordering, but it makes the assignments atomic as observed by the
-interrupt handler. It also has the extra runtime cost of disabling and
-re-enabling interrupts:
+Without the barrier, the code is buggy because the compiler is free to
+reorder operations when it doesn't see a reason to keep them in the
+same order. In this case, the compiler doesn't know that the order of
+assignments is important, so its optimizer is permitted to exchange
+their order. There's no telling whether it will actually do this, and
+it is possible that passing the compiler different optimization flags
+or using a different version of the compiler will produce different
+behavior.
+
+Another solution is to disable interrupts around the assignments.
+This does not prevent reordering, but it prevents the interrupt
+handler from intervening between the assignments. It also has the
+extra runtime cost of disabling and re-enabling interrupts:
@example
enum intr_level old_level = intr_disable ();
intr_set_level (old_level);
@end example
-A second possibility is to mark the declarations of
+A second solution is to mark the declarations of
@code{timer_put_char} and @code{timer_do_put} as @samp{volatile}. This
keyword tells the compiler that the variables are externally observable
and restricts its latitude for optimization. However, the semantics of
@samp{volatile} are not well-defined, so it is not a good general
-solution.
+solution. The base Pintos code does not use @samp{volatile} at all.
-Usually, the best solution is to use a compiler feature called a
-@dfn{memory barrier}, a special statement that prevents the compiler
-from reordering memory operations across the barrier. In Pintos,
-@file{threads/synch.h} defines the @code{barrier()} macro as a memory
-barrier. Here's how we would use a memory barrier to fix this code:
+The following is @emph{not} a solution, because locks neither prevent
+interrupts nor prevent the compiler from reordering the code within the
+region where the lock is held:
@example
+lock_acquire (&timer_lock); /* INCORRECT CODE */
timer_put_char = 'x';
-barrier ();
timer_do_put = true;
+lock_release (&timer_lock);
@end example
-The compiler also treats invocation of any function defined externally,
-that is, in another source file, as a limited form of a memory barrier.
-Specifically, the compiler assumes that any externally defined function
-may access any statically or dynamically allocated data and any local
-variable whose address is taken. This often means that explicit
-barriers can be omitted, and, indeed, this is why the base Pintos code
-does not need any barriers.
+The compiler treats invocation of any function defined externally,
+that is, in another source file, as a limited form of optimization
+barrier. Specifically, the compiler assumes that any externally
+defined function may access any statically or dynamically allocated
+data and any local variable whose address is taken. This often means
+that explicit barriers can be omitted. It is one reason that Pintos
+contains few explicit barriers.
A function defined in the same source file, or in a header included by
-the source file, cannot be relied upon as a memory barrier.
+the source file, cannot be relied upon as a optimization barrier.
This applies even to invocation of a function before its
definition, because the compiler may read and parse the entire source
file before performing optimization.
The page allocator divides the memory it allocates into two pools,
called the kernel and user pools. By default, each pool gets half of
-system memory, but this can be changed with the @option{-ul} kernel
+system memory above @w{1 MB}, but the division can be changed with the
+@option{-ul} kernel
command line
option (@pxref{Why PAL_USER?}). An allocation request draws from one
pool or the other. If one pool becomes empty, the other may still