X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Ftour.texi;h=41fde25ae08c6b163e1d7118486e5540b8a412b3;hb=c574297b7acbcbf04d128113cf91dbbcc1730f9c;hp=abbcb692b1f5f0e1b2890a7453d7beea762ecad6;hpb=4b82f041ebf3300af81fc19cb62fd8915e26aa94;p=pintos-anon diff --git a/doc/tour.texi b/doc/tour.texi index abbcb69..41fde25 100644 --- a/doc/tour.texi +++ b/doc/tour.texi @@ -1,4 +1,4 @@ -@node Pintos Tour +@node Pintos Tour, Project 1--Threads, Introduction, Top @chapter A Tour Through Pintos This chapter is a brief tour through the Pintos code. It covers the @@ -6,12 +6,25 @@ entire code base, but you'll only be using Pintos one part at a time, so you may find that you want to read each part as you work on the corresponding project. +Hint: try using ``tags'' to follow along with references to function +and variable names (@pxref{Tags}). + +@menu +* Pintos Loading:: +* Threads Tour:: +@end menu + @node Pintos Loading @section Loading This section covers the Pintos loader and basic kernel initialization. +@menu +* Pintos Loader:: +* Kernel Initialization:: +@end menu + @node Pintos Loader @subsection The Loader @@ -77,7 +90,7 @@ Then we jump to the start of the compiled kernel image. Using the ``linker script'' in @file{threads/kernel.lds.S}, the kernel has arranged that it begins with the assembly module @file{threads/start.S}. This assembly module just calls -@code{main()}, which never returns. +@func{main}, which never returns. There's one more trick to the loader: the Pintos kernel command line is stored in the boot loader. The @command{pintos} program actually @@ -90,18 +103,20 @@ and effective. @node Kernel Initialization @subsection Kernel Initialization -The kernel proper starts with the @code{main()} function. The -@code{main()} function is written in C, as will be most of the code we +The kernel proper starts with the @func{main} function. The +@func{main} function is written in C, as will be most of the code we encounter in Pintos from here on out. -When @code{main()} starts, the system is in a pretty raw state. We're +When @func{main} starts, the system is in a pretty raw state. We're in protected mode with paging enabled, but hardly anything else is -ready. Thus, the @code{main()} function consists primarily of calls -into other Pintos modules' initialization functions, which are -typically named @code{@var{module}_init()}, where @var{module} is the -module's name. +ready. Thus, the @func{main} function consists primarily of calls +into other Pintos modules' initialization functions. You should +notice that these are usually named @func{@var{module}_init}, where +@var{module} is the module's name, @file{@var{module}.c} is the +module's source code, and @file{@var{module}.h} is the module's +header. -First we initialize kernel RAM in @code{ram_init()}. The first step +First we initialize kernel RAM in @func{ram_init}. The first step is to clear out the kernel's so-called ``BSS'' segment. The BSS is a segment that should be initialized to all zeros. In C, whenever you declare a variable outside a function without providing an @@ -109,12 +124,248 @@ initializer, that variable goes into the BSS.@footnote{This isn't actually required by the ANSI C standard, but it is the case with most implementations of C on most machines.} Because it's all zeros, the BSS isn't stored in the image that the loader brought into memory. We -just use @code{memset()} to zero it out. The other task of -@code{ram_init()} is to read out the machine's memory size from where +just use @func{memset} to zero it out. The other task of +@func{ram_init} is to read out the machine's memory size from where the loader stored it and put it into the @code{ram_pages} variable for later use. +Next, @func{thread_init} initializes the thread system. We will defer +full discussion to our discussion of Pintos threads below. It is +called so early in initialization because the console, initialized +just below, tries to use locks and locks in turn require there to be a +running thread. + +Then we initialize the console so that we can use @func{printf}. +@func{main} calls @func{vga_init}, which initializes the VGA text +display and clears the screen. It also calls @func{serial_init_poll} +to initialize the first serial port in ``polling mode,'' that is, +where the kernel busy-waits for the port to be ready for each +character to be output. (We use polling mode until we're ready to set +up interrupts later.) Finally we initialize the console device and +print a startup message to the console. + +@func{main} calls @func{argv_init} to parse the kernel command line. +This entails reading the command line out of the loader and updating +global variables appropriately. + +The next block of functions we call initialize the kernel's memory +system. @func{palloc_init} sets up the kernel page allocator, which +doles out memory one or more pages at a time. @func{malloc_init} sets +up the allocator that handles odd-sized allocations. +@func{paging_init} sets up a page table for the kernel. + +In projects 2 and later, @func{main} also calls @func{tss_init} and +@func{gdt_init}, but we'll talk about those later. + +@func{main} calls @func{random_init} to initialize the kernel random +number generator. @func{argv_init} might already have done this (if +the user specified @option{-rs} on the @command{pintos} command line) +but calling it a second time is harmless and has no effect. + +We initialize the interrupt system in the next set of calls. +@func{intr_init} sets up the CPU's @dfn{interrupt descriptor table} +(IDT) to ready it for interrupt handling, then @func{timer_init} and +@func{kbd_init} prepare us to handle timer interrupts and keyboard +interrupts, respectively. In projects 2 and later, we also prepare to +handle interrupts caused by user programs using @func{exception_init} +and @func{syscall_init}. + +Now that interrupts are set up, we can start preemptively scheduling +threads with @func{thread_start}, which also enables interrupts. +Interrupt-driven serial port I/O is also possible now, so we use +@func{serial_init_queue} to switch to that mode. + +If the filesystem is compiled in, as it will be in project 2 and +later, we now initialize the disks with @func{disk_init}, then the +filesystem with @func{filesys_init}, and run any operations that were +requested on the kernel command line with @func{fsutil_run}. +Boot is complete, so we print a message. + +For project 1, now we execute @func{test}, which runs whatever test is +currently compiled into the kernel. For later projects, we will +instead run a user program and wait for it to complete. + +Finally, if @option{-q} was specified on the kernel command line, we +call @func{power_off} to terminate the machine simulator. Otherwise, +@func{main} calls @func{thread_exit}, which allows any other running +threads to continue running. @node Threads Tour @section Threads + +@menu +* struct thread:: +@end menu + +@node struct thread +@subsection @struct{thread} + +The main Pintos data structure for threads is @struct{thread}, +declared in @file{threads/thread.h}. @struct{thread} has these +members with the given type: + +@table @code +@item tid_t tid; +The thread's thread identifier or @dfn{tid}. Every thread must have a +tid that is unique over the entire lifetime of the kernel. By +default, @code{tid_t} is a @code{typedef} for @code{int} and each new +thread receives the numerically next higher tid, starting from 1 for +the initial process. You can change the type and the numbering scheme +if you like. + +@item enum thread_status status; +The thread's state, one of these: + +@table @code +@item THREAD_RUNNING +The thread is running. Exactly one thread is running at a given time. +@func{thread_current} returns the running thread. + +@item THREAD_READY +The thread is ready to run, but it's not running right now. The +thread could be selected to run the next time the scheduler is +invoked. Ready threads are kept in a doubly linked list called +@code{ready_list}. + +@item THREAD_BLOCKED +The thread is waiting for something, e.g.@: a lock to become +available, an interrupt to be invoked. The thread won't be scheduled +again until it transitions to the @code{THREAD_READY} state with a +call to @func{thread_unblock}. + +@item THREAD_DYING +The thread will be destroyed by the scheduler after switching to the +next thread. +@end table + +@item char name[16]; +The thread's name as a string, or at least the first few characters of +it. + +@item uint8_t *stack; +Every thread has its own stack to keep track of its state. When the +thread is running, the CPU's stack pointer register tracks the top of +the stack and this member is unused. But when the CPU switches to +another thread, this member saves the thread's stack pointer. No +other members are needed to save the thread's registers, because the +other registers that must be saved are saved on the stack. + +@item int priority; +A thread priority, ranging from the lowest possible priority +@code{PRI_MIN} (0) to the highest possible priority @code{PRI_MAX} +(59). Pintos as provided ignores thread priorities, but you will +implement priority scheduling in problem 1-3 (@pxref{Problem 1-3 +Priority Scheduling}). + +@item list_elem elem; +A ``list element'' used to put the thread into doubly linked lists, +either the list of threads ready to run or a list of threads waiting +on a semaphore. Take a look at @file{lib/kernel/list.h} for +information on how to use the Pintos doubly linked list ADT. + +@item uint32_t *pagedir; +Only present in project 2 and later. + +@item unsigned magic; +Always set to @code{THREAD_MAGIC}, which is just a random number +defined in @file{threads/thread.c}, and used to detect stack overflow. +See below for more information. +@end table + +Every @struct{thread} occupies the beginning of its own page of +memory. The rest of the page is used for the thread's stack, which +grows downward from the end of the page. It looks like this: + +@example + 4 kB +---------------------------------+ + | kernel stack | + | | | + | | | + | V | + | grows downward | + | | + | | + | | + | | + | | + | | + | | + | | + +---------------------------------+ + | magic | + | : | + | : | + | name | + | status | + 0 kB +---------------------------------+ +@end example + +The upshot of this is twofold. First, @struct{thread} must not be +allowed to grow too big. If it does, then there will not be enough +room for the kernel stack. Our base @struct{thread} is only a few +bytes in size. It probably should stay well under 1 kB. + +Second, kernel stacks must not be allowed to grow too large. If a +stack overflows, it will corrupt the thread state. Thus, kernel +functions should not allocate large structures or arrays as non-static +local variables. Use dynamic allocation with @func{malloc} or +@func{palloc_get_page} instead. + +This brings us to the purpose of @struct{thread}'s @code{magic} +member. If a thread's stack does overflow, then @code{magic} will be +the first member of @struct{thread} to be overwritten, because it is +closest to the kernel stack. Thus, some of the thread functions +(notably @func{thread_current}) check that @code{magic} has the proper +value. + +@node Thread Functions +@subsection Thread Functions + +@file{threads/thread.c} implements several public functions for +thread support. Let's take a look at some of them: + +@deftypefun void thread_init (void) +Called by @func{main} to initialize the thread system. Its main +purpose is to create a @struct{thread} for Pintos's initial thread. +This is possible because the Pintos loader sets up the initial +thread's stack at the end of a page. Before @func{thread_init} runs, +@func{thread_current} will fail because the running thread's +@code{magic} value is incorrect. Lots of functions call +@func{thread_current} directly or indirectly, including +@func{lock_acquire} for locking a lock, so @func{thread_init} is +called early in Pintos initialization. +@end deftypefun + +@deftypefun void thread_start (void) +Called by @func{main} to start the scheduler. Creates the idle +thread, that is, the thread that is scheduled when no other thread is +ready. Then enables interrupts, which enables the scheduler because +processes are rescheduled in the return path from the timer +interrupt. FIXME +@end deftypefun + +@deftypefun void thread_tick (void) +Called by @func{timer_interrupt} on every timer interrupt. Just +increments counters that keep track of how often various kinds of +threads run. +@end deftypefun + +@deftypefun void thread_print_stats (void) +Prints the statistics kept by @func{thread_ticK] to the console. +Called when Pintos is terminating. +@end deftypefun + +@deftypefun void thread_create (const char *@var{name}, int @var{priority}, @ + thread_func *@var{func}, void *@var{aux}) Creates and +starts a new thread named @var{name} with the given @var{priority}, +returning the new thread's tid. The thread executes @var{func}, +passing @var{aux} as the function's single argument. + +@func{thread_create} works by allocating a page for the thread's +@struct{thread} and stack and initializing its members, then setting +up a set of fake stack frames for it (we'll talk more about this +later). The thread was initialized in the blocked state, so the final +action before returning is to unblock it, allowing the new thread to +be scheduled. +@end deftypefun