Implement a proper block layer with partition support.

[pintos-anon] / doc / vm.texi
diff --git a/doc/vm.texi b/doc/vm.texi

index 59e88ca4dffccbe19c3ab90d1fb36621abe8363c..ab64a4de57f4b60a09ad35fc26c17e6029128169 100644 (file)
--- a/doc/vm.texi
+++ b/doc/vm.texi
@@ -1,657 +1,748 @@
-@node Project 3--Virtual Memory, Project 4--File Systems, Project 2--User Programs, Top
+@node Project 3--Virtual Memory
  @chapter Project 3: Virtual Memory
  
-By now you should be familiar with the inner workings of Pintos.
-You've already come a long way: your OS can properly handle multiple
-threads of execution with proper synchronization, and can load
-multiple user programs at once.  However, when loading user programs,
-your OS is limited by how much main memory the simulated machine has.
-In this assignment, you will remove that limitation.
-
-You will be using the @file{vm} directory for this project.  There is
-no new code to get acquainted with for this assignment.  The @file{vm}
-directory contains only the @file{Makefile}s.  The only change from
-@file{userprog} is that this new @file{Makefile} turns on the setting
-@option{-DVM}, which you will need for this assignment.  All code you
-write will either be newly generated files (e.g.@: if you choose to
-implement your paging code in their own source files), or will be
-modifications to pre-existing code (e.g.@: you will change the
-behavior of @file{addrspace.c} significantly).
-
-You will be building this assignment on the last one.  It will benefit
-you to get your project 2 in good working order before this assignment
-so those bugs don't keep haunting you.
+By now you should have some familiarity with the inner workings of
+Pintos.  Your
+OS can properly handle multiple threads of execution with proper
+synchronization, and can load multiple user programs at once.  However,
+the number and size of programs that can run is limited by the machine's
+main memory size.  In this assignment, you will remove that limitation.
+
+You will build this assignment on top of the last one.  Test programs
+from project 2 should also work with project 3.  You should take care to
+fix any bugs in your project 2 submission before you start work on
+project 3, because those bugs will most likely cause the same problems
+in project 3.
+
+You will continue to handle Pintos disks and file systems the same way
+you did in the previous assignment (@pxref{Using the File System}).
  
  @menu
-* VM Design::                   
-* Page Faults::                 
-* Disk as Backing Store::       
-* Memory Mapped Files::         
-* Stack::                       
-* Problem 3-1 Page Table Management::  
-* Problem 3-2 Paging To and From Disk::  
-* Problem 3-3 Memory Mapped Files::  
-* Virtual Memory FAQ::          
+* Project 3 Background::        
+* Project 3 Suggested Order of Implementation::  
+* Project 3 Requirements::      
+* Project 3 FAQ::               
  @end menu
  
-@node VM Design
-@section A Word about Design
-
-It is important for you to note that in addition to getting virtual
-memory working, this assignment is also meant to be an open-ended
-design problem.  We will expect you to come up with a design that
-makes sense.  You will have the freedom to choose how to do software
-translation on TLB misses, how to represent the swap partition, how to
-implement paging, etc.  In each case, we will expect you to provide a
-defensible justification in your design documentation as to why your
-choices are reasonable.  You should evaluate your design on all the
-available criteria: speed of handling a page fault, space overhead in
-memory, minimizing the number of page faults, simplicity, etc.
-
-In keeping with this, you will find that we are going to say as little
-as possible about how to do things.  Instead we will focus on what end
-functionality we require your OS to support.
-
-@node Page Faults
-@section Page Faults
-
-For the last assignment, whenever a context switch occurred, the new
-process would install its own page table into the machine.  The page
-table contained all the virtual-to-physical translations for the
-process.  Whenever the processor needed to look up a translation, it
-consulted the page table.  As long as the process only accessed
-memory that it didn't own, all was well.  If the process accessed
-memory it didn't own, it ``page faulted'' and @code{page_fault()}
-terminated the process.
-
-When we implement virtual memory, the rules have to change.  A page
-fault is no longer necessarily an error, since it might only indicate
-that the page must be brought in from a disk file or from swap.  You
-will have to implement a more sophisticated page fault handler to
-handle these cases.
-
-On the 80@var{x}86, the page table format is fixed by hardware.  The
-top-level data structure is a 4 kB page called the ``page directory''
-(PD) arranged as an array of 1,024 32-bit page directory entries
-(PDEs), each of which represents 4 MB of virtual memory.  Each PDE may
-point to the physical address of another 4 kB page called a ``page
-table'' (PT) arranged in the same fashion as an array of 1,024 32-bit
-page table entries (PTEs), each of which translates a single 4 kB
-virtual page into physical memory.
-
-Thus, translation of a virtual address into a physical address follows
-the three-step process illustrated in the diagram
-below:@footnote{Actually, virtual to physical translation on the
-80@var{x}86 architecture happens via an intermediate ``linear
-address,'' but Pintos (and most other 80@var{x}86 OSes) set up the CPU
-so that linear and virtual addresses are one and the same, so that you
-can effectively ignore this CPU feature.}
-
-@enumerate 1
-@item
-The top 10 bits of the virtual address (bits 22:31) are used to index
-into the page directory.  If the PDE is marked ``present,'' the
-physical address of a page table is read from the PDE thus obtained.
-If the PDE is marked ``not present'' then a page fault occurs.
+@node Project 3 Background
+@section Background
  
-@item
-The next 10 bits of the virtual address (bits 12:21) are used to index
-into the page table.  If the PTE is marked ``present,'' the physical
-address of a data page is read from the PTE thus obtained.  If the PTE
-is marked ``not present'' then a page fault occurs.
+@menu
+* Project 3 Source Files::      
+* Memory Terminology::          
+* Resource Management Overview::  
+* Managing the Supplemental Page Table::  
+* Managing the Frame Table::    
+* Managing the Swap Table::     
+* Managing Memory Mapped Files Back::  
+@end menu
  
+@node Project 3 Source Files
+@subsection Source Files
  
-@item
-The bottom 12 bits of the virtual address (bits 0:11) are added to the
-data page's physical base address, producing the final physical
-address.
-@end enumerate
+You will work in the @file{vm} directory for this project.  The
+@file{vm} directory contains only @file{Makefile}s.  The only
+change from @file{userprog} is that this new @file{Makefile} turns on
+the setting @option{-DVM}.  All code you write will be in new
+files or in files introduced in earlier projects.
  
-@example
-32                    22                     12                      0
-+--------------------------------------------------------------------+
-| Page Directory Index |   Page Table Index   |    Page Offset       |
-+--------------------------------------------------------------------+
-             |                    |                     |
-     _______/             _______/                _____/
-    /                    /                       /
-   /    Page Directory  /      Page Table       /    Data Page
-  /     .____________. /     .____________.    /   .____________.
-  |1,023|____________| |1,023|____________|    |   |____________|
-  |1,022|____________| |1,022|____________|    |   |____________|
-  |1,021|____________| |1,021|____________|    \__\|____________|
-  |1,020|____________| |1,020|____________|       /|____________|
-  |     |            | |     |            |        |            |
-  |     |            | \____\|            |_       |            |
-  |     |      .     |      /|      .     | \      |      .     |
-  \____\|      .     |_      |      .     |  |     |      .     |
-       /|      .     | \     |      .     |  |     |      .     |
-        |      .     |  |    |      .     |  |     |      .     |
-        |            |  |    |            |  |     |            |
-        |____________|  |    |____________|  |     |____________|
-       4|____________|  |   4|____________|  |     |____________|
-       3|____________|  |   3|____________|  |     |____________|
-       2|____________|  |   2|____________|  |     |____________|
-       1|____________|  |   1|____________|  |     |____________|
-       0|____________|  \__\0|____________|  \____\|____________|
-                           /                      /
-@end example
+You will probably be encountering just a few files for the first time:
  
+@table @file
+@item devices/block.h
+@itemx devices/block.c
+Provides sector-based read and write access to block device.  You will
+use this interface to access the swap partition as a block device.
+@end table
  
-FIXME need to explain virtual and physical memory layout - probably
-back in userprog project
-
-FIXME need to mention that there are many possible implementations and
-that the above is just an outline
-
-@node Disk as Backing Store
-@section Disk as Backing Store
-
-In VM systems, since memory is less plentiful than disk, you will
-effectively use memory as a cache for disk.  Looking at it from
-another angle, you will use disk as a backing store for memory.  This
-provides the abstraction of an (almost) unlimited virtual memory size.
-Part of your task in this project is to do this, with the additional
-constraint that your performance should be close to that provided by
-physical memory.  You will use the page tables' ``dirty'' bits to
-denote whether pages need to be written back to disk when they're
-evicted from main memory and the ``accessed'' bit for page replacement
-algorithms.  Whenever the hardware writes memory, it sets the dirty
-bit, and if it reads or writes to the page, it sets the accessed bit.
-
-As with any caching system, performance depends on the policy used to
-decide which things are kept in memory and which are only stored on
-disk.  On a page fault, the kernel must decide which page to replace.
-Ideally, it will throw out a page that will not be referenced for a
-long time, keeping in memory those pages that are soon to be
-referenced.  Another consideration is that if the replaced page has
-been modified, the page must be first saved to disk before the needed
-page can be brought in.  Many virtual memory systems avoid this extra
-overhead by writing modified pages to disk in advance, so that later
-page faults can be completed more quickly.
+@node Memory Terminology
+@subsection Memory Terminology
  
-@node Memory Mapped Files
-@section Memory Mapped Files
-
-The traditional way to access the file system is via @code{read} and
-@code{write} system calls, but that requires an extra level of copying
-between the kernel and the user level.  A secondary interface is
-simply to ``map'' the file into the virtual address space.  The
-program can then use load and store instructions directly on the file
-data.  (An alternative way of viewing the file system is as ``durable
-memory.''  Files just store data structures.  If you access data
-structures in memory using load and store instructions, why not access
-data structures in files the same way?)
-
-Memory mapped files are typically implemented using system calls.  One
-system call maps the file to a particular part of the address space.
-For example, one might map the file @file{foo}, which is 1000 bytes
-long, starting at address 5000.  Assuming that nothing else is already
-at virtual addresses 5000@dots{}6000, any memory accesses to these
-locations will access the corresponding bytes of @file{foo}.
-
-A consequence of memory mapped files is that address spaces are
-sparsely populated with lots of segments, one for each memory mapped
-file (plus one each for code, data, and stack).  You will implement
-memory mapped files for problem 3 of this assignment, but you should
-design your solutions to problems 1 and 2 to account for this.
-
-@node Stack
-@section Stack
-
-In project 2, the stack was a single page at the top of the user
-virtual address space.  The stack's location does not change in this
-project, but your kernel should allocate additional pages to the stack
-on demand.  That is, if the stack grows past its current bottom, the
-system should allocate additional pages for the stack as necessary,
-unless those pages are unavailable because they are in use by another
-segment, in which case some sort of fault should occur.
-
-@node Problem 3-1 Page Table Management
-@section Problem 3-1: Page Table Management
-
-Implement page directory and page table management to support virtual
-memory.  You will need data structures to accomplish the following
-tasks:
+Careful definitions are needed to keep discussion of virtual memory from
+being confusing.  Thus, we begin by presenting some terminology for
+memory and storage.  Some of these terms should be familiar from project
+2 (@pxref{Virtual Memory Layout}), but much of it is new.
  
-@itemize @bullet
-@item
-Some way of translating in software from virtual page frames to
-physical page frames (consider using a hash table---note
-that we provide one in @file{lib/kernel}).
+@menu
+* Pages::                       
+* Frames::                      
+* Page Tables::                 
+* Swap Slots::                  
+@end menu
  
-@item
-Some way of translating from physical page frames back to virtual
-page frames, so that when you replace a page, you can invalidate
-its translation(s).
+@node Pages
+@subsubsection Pages
  
-@item
-Some way of finding a page on disk if it is not in memory.  You won't
-need this data structure until part 2, but planning ahead is a good
-idea.
-@end itemize
+A @dfn{page}, sometimes called a @dfn{virtual page}, is a continuous
+region of virtual memory 4,096 bytes (the @dfn{page size}) in length.  A
+page must be @dfn{page-aligned}, that is, start on a virtual address
+evenly divisible by the page size.  Thus, a 32-bit virtual address can
+be divided into a 20-bit @dfn{page number} and a 12-bit @dfn{page
+offset} (or just @dfn{offset}), like this:
  
-You need to do the roughly the following to handle a page fault:
+@example
+@group
+               31               12 11        0
+              +-------------------+-----------+
+              |    Page Number    |   Offset  |
+              +-------------------+-----------+
+                       Virtual Address
+@end group
+@end example
  
-@enumerate 1
-@item
-Determine the location of the physical page backing the virtual
-address that faulted.  It might be in the file system, in swap,
-already be in physical memory and just not set up in the page table,
-or it might be an invalid virtual address.
+Each process has an independent set of @dfn{user (virtual) pages}, which
+are those pages below virtual address @code{PHYS_BASE}, typically
+@t{0xc0000000} (3 GB).  The set of @dfn{kernel (virtual) pages}, on the
+other hand, is global, remaining the same regardless of what thread or
+process is active.  The kernel may access both user and kernel pages,
+but a user process may access only its own user pages.  @xref{Virtual
+Memory Layout}, for more information.
  
-If the virtual address is invalid, that is, if there's no physical
-page backing it, or if the virtual address is above @code{PHYS_BASE},
-meaning that it belongs to the kernel instead of the user, then the
-process's memory access must be disallowed.  You should terminate the
-process at this point, being sure to free all of its resources.
+Pintos provides several useful functions for working with virtual
+addresses.  @xref{Virtual Addresses}, for details.
  
-@item
-If the physical page is not in physical memory, bring it into memory.
-If necessary to make room, first evict some other page from memory.
-(When you do that you need to first remove references to the page from
-any page table that refers to it.)
+@node Frames
+@subsubsection Frames
  
-@item
-Each user process's @code{struct thread} has a @samp{pagedir} member
-that points to its own per-process page directory.  Read the PDE for
-the faulting virtual address.
+A @dfn{frame}, sometimes called a @dfn{physical frame} or a @dfn{page
+frame}, is a continuous region of physical memory.  Like pages, frames
+must be page-size and page-aligned.  Thus, a 32-bit physical address can
+be divided into a 20-bit @dfn{frame number} and a 12-bit @dfn{frame
+offset} (or just @dfn{offset}), like this:
  
-@item 
-If the PDE is marked ``not present'' then allocate a new page table
-page and initialize the PDE to point to the new page table.  As when
-you allocated a data page, you might have to first evict some other
-page from memory.
+@example
+@group
+               31               12 11        0
+              +-------------------+-----------+
+              |    Frame Number   |   Offset  |
+              +-------------------+-----------+
+                       Physical Address
+@end group
+@end example
  
-@item
-Follow the PDE to the page table.  Point the PTE for the faulting
-virtual address to the physical page found in step 2.
-@end enumerate
+The 80@var{x}86 doesn't provide any way to directly access memory at a
+physical address.  Pintos works around this by mapping kernel virtual
+memory directly to physical memory: the first page of kernel virtual
+memory is mapped to the first frame of physical memory, the second page
+to the second frame, and so on.  Thus, frames can be accessed through
+kernel virtual memory.  
  
-You'll need to modify the ELF loader in @file{userprog/addrspace.c} to
-do page table management according to your new design.  As supplied,
-it reads all the process's pages from disk and initializes the page
-tables for them at the same time.  For testing purposes, you'll
-probably want to leave the code that reads the pages from disk, but
-use your new page table management code to construct the page tables
-only as page faults occur for them.
-
-@node Problem 3-2 Paging To and From Disk
-@section Problem 3-2: Paging To and From Disk
-
-Implement paging to and from disk.
-
-You will need routines to move a page from memory to disk and from
-disk to memory.  You may use the Pintos file system for swap space, or
-you may use the disk on interface @code{hd1:1}, which is otherwise
-unused.  A swap disk can theoretically be faster than using the file
-system, because it avoid file system overhead and because the swap
-disk and file system disk will be on separate hard disk controllers.
-You will definitely need to be able to retrieve pages from files in
-any case, so to avoid special cases it may be easier to use a file for
-swap.  You will still be using the basic file system provided with
-Pintos.  If you do everything correctly, your VM should still work
-when you implement your own file system for the next assignment.
-
-You will need a way to track pages which are used by a process but
-which are not in physical memory, to fully handle page faults.  Pages
-that you store on disk should not be constrained to be in sequential
-order, and consequently your swap file (or swap disk) should not
-require unused empty space.  You will also need a way to track all of
-the physical memory pages, in order to find an unused one when needed,
-or to evict a page when memory is needed but no empty pages are
-available.  The data structures that you designed in part 1 should do
-most of the work for you.
-
-You will need a page replacement algorithm.  The hardware sets the
-accessed and dirty bits when it accesses memory.  Therefore, you
-should be able to take advantage of this information to implement some
-algorithm which attempts to achieve LRU-type behavior.  We expect that
-your algorithm perform at least as well as a reasonable implementation
-of the second-chance (clock) algorithm.  You will need to show in your
-test cases the value of your page replacement algorithm by
-demonstrating for some workload that it pages less frequently using
-your algorithm than using some inferior page replacement policy.  The
-canonical example of a poor page replacement policy is random
-replacement.
-
-Since you will already be paging from disk, you should implement a
-``lazy'' loading scheme for new processes.  When a process is created,
-it will not run immediately.  Therefore, it doesn't make sense to load
-all its code, data, and stack into memory when the process is created,
-since it might incur additional disk accesses to do so (if it gets
-paged out before it runs).  When loading a new process, you should
-leave most pages on disk, and bring them in as demanded when the
-program begins running.  Your VM system should also use the executable
-file itself as backing store for read-only segments, since these
-segments won't change.
-
-There are a few special cases.  Look at the loop in
-@code{load_segment()} in @file{userprog/addrspace.c}.  Each time
-around the loop, @code{read_bytes} represents the number of bytes to
-read from the executable file and @code{zero_bytes} represents the number
-of bytes to initialize to zero following the bytes read.  The two
-always sum to @code{PGSIZE}.  The page handling depends on these
-variables' values:
+Pintos provides functions for translating between physical addresses and
+kernel virtual addresses.  @xref{Virtual Addresses}, for details.
  
-@itemize @bullet
-@item
-If @code{read_bytes} equals @code{PGSIZE}, the page should be demand
-paged from disk on its first access.
+@node Page Tables
+@subsubsection Page Tables
  
-@item 
-If @code{zero_bytes} equals @code{PGSIZE}, the page does not need to
-be read from disk at all because it is all zeroes.  You should handle
-such pages by creating a new page consisting of all zeroes at the
-first page fault.
+In Pintos, a @dfn{page table} is a data structure that the CPU uses to
+translate a virtual address to a physical address, that is, from a page
+to a frame.  The page table format is dictated by the 80@var{x}86
+architecture.  Pintos provides page table management code in
+@file{pagedir.c} (@pxref{Page Table}).
  
-@item
-If neither @code{read_bytes} nor @code{zero_bytes} equals
-@code{PGSIZE}, then part of the page is to be read from disk and the
-remainder zeroed.  This is a special case, which you should handle by
-reading the partial page from disk at executable load time and zeroing
-the rest of the page.  It is the only case in which loading should not
-be ``lazy''; even real OSes such as Linux do not load partial pages
-lazily.
-@end itemize
+The diagram below illustrates the relationship between pages and frames.
+The virtual address, on the left, consists of a page number and an
+offset.  The page table translates the page number into a frame number,
+which is combined with the unmodified offset to obtain the physical
+address, on the right.
  
-FIXME mention that you can test with these special cases eliminated
+@example
+@group
+                         +----------+
+        .--------------->|Page Table|-----------.
+       /                 +----------+            |
+   0   |  12 11 0                            0   V  12 11 0
+  +---------+----+                          +---------+----+
+  |Page Nr  | Ofs|                          |Frame Nr | Ofs|
+  +---------+----+                          +---------+----+
+   Virt Addr   |                             Phys Addr   ^
+                \_______________________________________/
+@end group
+@end example
  
-You may optionally implement sharing: when multiple processes are
-created that use the same executable file, share read-only pages among
-those processes instead of creating separate copies of read-only
-segments for each process.  If you carefully designed your data
-structures in part 1, sharing of read-only pages should not make this
-part significantly harder.
+@node Swap Slots
+@subsubsection Swap Slots
  
-@node Problem 3-3 Memory Mapped Files
-@section Problem 3-3: Memory Mapped Files
+A @dfn{swap slot} is a continuous, page-size region of disk space in the
+swap partition.  Although hardware limitations dictating the placement of
+slots are looser than for pages and frames, swap slots should be
+page-aligned because there is no downside in doing so.
  
-Implement memory mapped files.
+@node Resource Management Overview
+@subsection Resource Management Overview
  
-You will need to implement the following system calls:
+You will need to design the following data structures:
  
  @table @asis
-@item SYS_mmap
-@itemx bool mmap (int @var{fd}, void *@var{addr}, unsigned @var{length})
+@item Supplemental page table
+
+Enables page fault handling by supplementing the page table.
+@xref{Managing the Supplemental Page Table}.
+
+@item Frame table
+
+Allows efficient implementation of eviction policy.
+@xref{Managing the Frame Table}.
  
-Maps the file open as @var{fd} into the process's address space
-starting at @var{addr} for @var{length} bytes.  Returns true if
-successful, false on failure.  
+@item Swap table
  
-@item SYS_munmap
-@itemx bool munmap (void *addr, unsigned length)
+Tracks usage of swap slots.
+@xref{Managing the Swap Table}.
  
-Unmaps the segment specified by id.  This cannot be used to unmap
-segments mapped by the executable loader.  Returns 0 on success, -1 on
-failure.  When a file is unmapped, all outstanding changes are written
-to the file, and the segment's pages are removed from the process's
-list of used virtual pages.
+@item Table of file mappings
+
+Processes may map files into their virtual memory space.  You need a
+table to track which files are mapped into which pages.
  @end table
  
-Calls to @code{mmap} must fail if the address is not page-aligned, if
-the length is not positive and a multiple of @var{PGSIZE}.  You also
-must error check to make sure that the new segment does not overlap
-already existing segments, and fail if it isn't.  If the length passed
-to @code{mmap} is less than the file's length, you should only map the
-first part of the file.  If the length passed to @code{mmap} is longer
-than the file, the file should grow to the requested length.  Similar
-to the code segment, your VM system should be able to use the
-@code{mmap}'d file itself as backing store for the mmap segment, since
-the changes to the @code{mmap} segment will eventually be written to
-the file.  (In fact, you may choose to implement executable mappings
-as a special case of file mappings.)
-
-@node Virtual Memory FAQ
-@section FAQ
+You do not necessarily need to implement four completely distinct data
+structures: it may be convenient to wholly or partially merge related
+resources into a unified data structure.
+
+For each data structure, you need to determine what information each
+element should contain.  You also need to decide on the data structure's
+scope, either local (per-process) or global (applying to the whole
+system), and how many instances are required within its scope.
+
+To simplify your design, you may store these data structures in
+non-pageable memory.  That means that you can be sure that pointers
+among them will remain valid.
+
+Possible choices of data structures include arrays, lists, bitmaps, and
+hash tables.  An array is often the simplest approach, but a sparsely
+populated array wastes memory.  Lists are also simple, but traversing a
+long list to find a particular position wastes time.  Both arrays and
+lists can be resized, but lists more efficiently support insertion and
+deletion in the middle.
+
+Pintos includes a bitmap data structure in @file{lib/kernel/bitmap.c}
+and @file{lib/kernel/bitmap.h}.  A bitmap is an array of bits, each of
+which can be true or false.  Bitmaps are typically used to track usage
+in a set of (identical) resources: if resource @var{n} is in use, then
+bit @var{n} of the bitmap is true.  Pintos bitmaps are fixed in size,
+although you could extend their implementation to support resizing.
+
+Pintos also includes a hash table data structure (@pxref{Hash Table}).
+Pintos hash tables efficiently support insertions and deletions over a
+wide range of table sizes.
+
+Although more complex data structures may yield performance or other
+benefits, they may also needlessly complicate your implementation.
+Thus, we do not recommend implementing any advanced data structure
+(e.g.@: a balanced binary tree) as part of your design.
+
+@node Managing the Supplemental Page Table
+@subsection Managing the Supplemental Page Table
+
+The @dfn{supplemental page table} supplements the page table with
+additional data about each page.  It is needed because of the
+limitations imposed by the page table's format.  Such a data structure
+is often called a ``page table'' also; we add the word ``supplemental''
+to reduce confusion.
+
+The supplemental page table is used for at least two purposes.  Most
+importantly, on a page fault, the kernel looks up the virtual page that
+faulted in the supplemental page table to find out what data should be
+there.  Second, the kernel consults the supplemental page table when a
+process terminates, to decide what resources to free.
+
+You may organize the supplemental page table as you wish.  There are at
+least two basic approaches to its organization: in terms of segments or
+in terms of pages.  Optionally, you may use the page table itself as an
+index to track the members of the supplemental page table.  You will
+have to modify the Pintos page table implementation in @file{pagedir.c}
+to do so.  We recommend this approach for advanced students only.
+@xref{Page Table Entry Format}, for more information.
+
+The most important user of the supplemental page table is the page fault
+handler.  In project 2, a page fault always indicated a bug in the
+kernel or a user program.  In project 3, this is no longer true.  Now, a
+page fault might only indicate that the page must be brought in from a
+file or swap.  You will have to implement a more sophisticated page
+fault handler to handle these cases.  Your page fault handler, which you
+should implement by modifying @func{page_fault} in
+@file{userprog/exception.c}, needs to do roughly the following:
  
  @enumerate 1
  @item
-@b{Do we need a working HW 2 to implement HW 3?}
+Locate the page that faulted in the supplemental page table.  If the
+memory reference is valid, use the supplemental page table entry to
+locate the data that goes in the page, which might be in the file
+system, or in a swap slot, or it might simply be an all-zero page.  If
+you implement sharing, the page's data might even already be in a page
+frame, but not in the page table.
  
-Yes.
+If the page is unmapped, that is, if there's no data there, or if the
+page lies within kernel virtual memory, or if the access is an attempt
+to write to a read-only page, then the access is invalid.  Any invalid
+access terminates the process and thereby frees all of its resources.
  
  @item
-@b{How do I use the hash table provided in @file{lib/hash.c}?}
-
-FIXME
+Obtain a frame to store the page.  @xref{Managing the Frame Table}, for
+details.
  
-There are two things you need to use this hashtable:
+If you implement sharing, the data you need may already be in a frame,
+in which case you must be able to locate that frame.
  
-1. You need to decide on a key type. The key should be something
-that is unique for each object as inserting two objects with
-the same key will cause the second to overwrite the first.
-(The keys are compared with ==, so you should stick to
-integers and pointers unless you know how to do operator
-overloading.) You also need to write a hash function that
-converts key values to integers, which you will pass into the
-hash table constructor.
-
-2. Your key needs to be a field of your object type, and you
-will need to supply a 'get' function that given an object
-returns the key.
+@item
+Fetch the data into the frame, by reading it from the file system or
+swap, zeroing it, etc.
  
-Here's a quick example of how to construct a hash table. In
-this table the keys are Thread pointers and the objects are
-integers (you will be using different key/value pairs I'm
-sure). In addition, this hash function is pretty puny. You
-should probably use a better one.
+If you implement sharing, the page you need may already be in a frame,
+in which case no action is necessary in this step.
  
-@example
-FIXME
-@end example
+@item
+Point the page table entry for the faulting virtual address to the
+physical page.  You can use the functions in @file{userprog/pagedir.c}.
+@end enumerate
  
-and to construct the hash table:
+@node Managing the Frame Table
+@subsection Managing the Frame Table
  
-HashTable<Thread *, HashObject *> *htable;
+The @dfn{frame table} contains one entry for each frame that contains a
+user page.  Each entry in the frame table contains a pointer to the
+page, if any, that currently occupies it, and other data of your choice.
+The frame table allows Pintos to efficiently implement an eviction
+policy, by choosing a page to evict when no frames are free.
  
-htable = new HashTable<Thread *, HashObject *>(ExtractKeyFromHashObject,
-                                            MyKeyToHashValue);
+The frames used for user pages should be obtained from the ``user
+pool,'' by calling @code{palloc_get_page(PAL_USER)}.  You must use
+@code{PAL_USER} to avoid allocating from the ``kernel pool,'' which
+could cause some test cases to fail unexpectedly (@pxref{Why
+PAL_USER?}).  If you modify @file{palloc.c} as part of your frame table
+implementation, be sure to retain the distinction between the two pools.
  
-If you have any other questions about hash tables, the CS109
-and CS161 textbooks have good chapters on them, or you can come
-to any of the TA's office hours for further clarification.
+The most important operation on the frame table is obtaining an unused
+frame.  This is easy when a frame is free.  When none is free, a frame
+must be made free by evicting some page from its frame.
  
-@item
-@b{The current implementation of the hash table does not do something
-that we need it to do. What gives?}
+If no frame can be evicted without allocating a swap slot, but swap is
+full, panic the kernel.  Real OSes apply a wide range of policies to
+recover from or prevent such situations, but these policies are beyond
+the scope of this project.
  
-You are welcome to modify it.  It is not used by any of the code we
-provided, so modifying it won't affect any code but yours.  Do
-whatever it takes to make it work like you want it to.
+The process of eviction comprises roughly the following steps:
  
+@enumerate 1
  @item
-@b{Is the data segment page-aligned?}
-
-No.
+Choose a frame to evict, using your page replacement algorithm.  The
+``accessed'' and ``dirty'' bits in the page table, described below, will
+come in handy.
  
  @item
-@b{What controls the layout of user programs?}
+Remove references to the frame from any page table that refers to it.
  
-The linker is responsible for the layout of a user program in
-memory. The linker is directed by a ``linker script'' which tells it
-the names and locations of the various program segments. The
-test/script and testvm/script files are the linker scripts for the
-multiprogramming and virtual memory assignments respectively. You can
-learn more about linker scripts by reading the ``Scripts'' chapter in
-the linker manual, accessible via @samp{info ld}.
+Unless you have implemented sharing, only a single page should refer to
+a frame at any given time.
  
-@item Page Table Management FAQs
-@enumerate 1
  @item
-@b{How do we manage allocation of pages used for page tables?}
-
-You can use any reasonable algorithm to do so.  However, you should
-make sure that memory used for page tables doesn't grow so much that
-it encroaches deeply on the memory used for data pages.
-
-Here is one reasonable algorithm.  At OS boot time, reserve some fixed
-number of pages for page tables.  Then, each time a new page table
-page is needed, select one of these pages in ``round robin'' fashion.
-If the page in use, clean up any pointers to it.  Then use it for the
-new page table page.
+If necessary, write the page to the file system or to swap.
+@end enumerate
  
-@item
-@b{Our code handles the PageFault exceptions. However, the number of
-page faults handled does not show up in the final stats output. Is
-there a counter that we must increment to correct this problem?}
+The evicted frame may then be used to store a different page.
  
-FIXME 
+@menu
+* Accessed and Dirty Bits::     
+@end menu
  
-Yes, you'll need to update kernel->stats->numPageFaults when
-you handle a page fault in your code.
-@end enumerate
+@node Accessed and Dirty Bits
+@subsubsection Accessed and Dirty Bits
+
+80@var{x}86 hardware provides some assistance for implementing page
+replacement algorithms, through a pair of bits in the page table entry
+(PTE) for each page.  On any read or write to a page, the CPU sets the
+@dfn{accessed bit} to 1 in the page's PTE, and on any write, the CPU
+sets the @dfn{dirty bit} to 1.  The CPU never resets these bits to 0,
+but the OS may do so.
+
+You need to be aware of @dfn{aliases}, that is, two (or more) pages that
+refer to the same frame.  When an aliased frame is accessed, the
+accessed and dirty bits are updated in only one page table entry (the
+one for the page used for access).  The accessed and dirty bits for the
+other aliases are not updated.
+
+In Pintos, every user virtual page is aliased to its kernel virtual
+page.  You must manage these aliases somehow.  For example, your code
+could check and update the accessed and dirty bits for both addresses.
+Alternatively, the kernel could avoid the problem by only accessing user
+data through the user virtual address.
+
+Other aliases should only arise if you implement sharing for extra
+credit (@pxref{VM Extra Credit}), or if there is a bug in your code.
+
+@xref{Page Table Accessed and Dirty Bits}, for details of the functions
+to work with accessed and dirty bits.
+
+@node Managing the Swap Table
+@subsection Managing the Swap Table
+
+The swap table tracks in-use and free swap slots.  It should allow
+picking an unused swap slot for evicting a page from its frame to the
+swap partition.  It should allow freeing a swap slot when its page is read
+back or the process whose page was swapped is terminated.
+
+You may use the @code{BLOCK_SWAP} block device for swapping, obtaining
+the @struct{block} that represents it by calling @func{block_get_role}.
+From the
+@file{vm/build} directory, use the command @code{pintos-mkdisk swap.dsk
+--swap-size=@var{n}} to create an disk named @file{swap.dsk} that
+contains a @var{n}-MB swap partition.
+Afterward, @file{swap.dsk} will automatically be attached as an extra disk
+when you run @command{pintos}.  Alternatively, you can tell
+@command{pintos} to use a temporary @var{n}-MB swap disk for a single
+run with @option{--swap-size=@var{n}}.
+
+Swap slots should be allocated lazily, that is, only when they are
+actually required by eviction.  Reading data pages from the executable
+and writing them to swap immediately at process startup is not lazy.
+Swap slots should not be reserved to store particular pages.
+
+Free a swap slot when its contents are read back into a frame.
+
+@node Managing Memory Mapped Files Back
+@subsection Managing Memory Mapped Files
+
+The file system is most commonly accessed with @code{read} and
+@code{write} system calls.  A secondary interface is to ``map'' the file
+into virtual pages, using the @code{mmap} system call.  The program can
+then use memory instructions directly on the file data.
+
+Suppose file @file{foo} is @t{0x1000} bytes (4 kB, or one page) long.
+If @file{foo} is mapped into memory starting at address @t{0x5000}, then
+any memory accesses to locations @t{0x5000}@dots{}@t{0x5fff} will access
+the corresponding bytes of @file{foo}.
+
+Here's a program that uses @code{mmap} to print a file to the console.
+It opens the file specified on the command line, maps it at virtual
+address @t{0x10000000}, writes the mapped data to the console (fd 1),
+and unmaps the file.
  
-@item Paging FAQs
+@example
+#include <stdio.h>
+#include <syscall.h>
+int main (int argc UNUSED, char *argv[]) 
+@{
+  void *data = (void *) 0x10000000;     /* @r{Address at which to map.} */
  
-@enumerate 1
-@item
-@b{Can we assume (and enforce) that the user's stack will
-never increase beyond one page?}
+  int fd = open (argv[1]);              /* @r{Open file.} */
+  mapid_t map = mmap (fd, data);        /* @r{Map file.} */
+  write (1, data, filesize (fd));       /* @r{Write file to console.} */
+  munmap (map);                         /* @r{Unmap file (optional).} */
+  return 0;
+@}
+@end example
  
-No.  This value was useful for project 2, but for this assignment, you
-need to implement an extensible stack segment.
+A similar program with full error handling is included as @file{mcat.c}
+in the @file{examples} directory, which also contains @file{mcp.c} as a
+second example of @code{mmap}.
  
-@item
-@b{Does the virtual memory system need to support growth of the data
-segment?}
-
-No.  The size of the data segment is determined by the linker.  We
-still have no dynamic allocation in Pintos (although it is possible to
-``fake'' it at the user level by using memory-mapped files).
-Implementing @code{sbrk()} has been an extra-credit assignment in
-previous years, but adds little additional complexity to a
-well-designed system.
+Your submission must be able to track what memory is used by memory
+mapped files.  This is necessary to properly handle page faults in the
+mapped regions and to ensure that mapped files do not overlap any other
+segments within the process.
  
-@item
-@b{Does the virtual memory system need to support growth of the stack
-segment?}
+@node Project 3 Suggested Order of Implementation
+@section Suggested Order of Implementation
  
-Yes. If a page fault appears just below the last stack segment page,
-you must add a new page to the bottom of the stack. It is impossible
-to predict how large the stack will grow at compile time, so we must
-allocate pages as necessary. You should only allocate additional pages
-if they ``appear'' to be stack accesses.
+We suggest the following initial order of implementation:
  
+@enumerate 1
  @item
-@b{But what do you mean by ``appear'' to be stack accesses? How big can a
-stack growth be?  Under what circumstances do we grow the stack?}
+Frame table (@pxref{Managing the Frame Table}).  Change @file{process.c}
+to use your frame table allocator.
  
-If it looks like a stack request, then you grow the stack. Yes, that's
-ambiguous. You need to make a reasonable decision about what looks
-like a stack request. For example, you could decide a page, or two
-pages, or ten pages, or more@enddots{}  Or, you could use some other
-heuristic to figure this out.
+Do not implement swapping yet.  If you run out of frames, fail the
+allocator or panic the kernel.
  
-Make a reasonable decision and document it in your code and in
-your design document.  Please make sure to justify your decision.
+After this step, your kernel should still pass all the project 2 test
+cases.
  
  @item
-@b{How big should the file(s) we're using as a backing store for memory
-be?}
+Supplemental page table and page fault handler (@pxref{Managing the
+Supplemental Page Table}).  Change @file{process.c} to record the
+necessary information in the supplemental page table when loading an
+executable and setting up its stack.  Implement loading of code and data
+segments in the page fault handler.  For now, consider only valid
+accesses.
  
-These files will need to be able to grow based on the number of pages
-you're committed to storing on behalf of the processes currently in
-memory.  They should be able to grow to the full size of the disk.
+After this step, your kernel should pass all of the project 2
+functionality test cases, but only some of the robustness tests.
  @end enumerate
  
-@item Memory Mapped File FAQs
+From here, you can implement stack growth, mapped files, and page
+reclamation on process exit in parallel.
  
-@enumerate 1
-@item
-@b{How do we interact with memory-mapped files?}
+The next step is to implement eviction (@pxref{Managing the Frame
+Table}).  Initially you could choose the page to evict randomly.  At
+this point, you need to consider how to manage accessed and dirty bits
+and aliasing of user and kernel pages.  Synchronization is also a
+concern: how do you deal with it if process A faults on a page whose
+frame process B is in the process of evicting?  Finally, implement a
+eviction strategy such as the clock algorithm.
  
-Let's say you want to map a file called @file{foo} into your address
-space at address @t{0x10000000}. You open the file, determine its
-length, and then use Mmap:
+@node Project 3 Requirements
+@section Requirements
  
-@example
-#include <stdio.h>
-#include <syscall.h>
+This assignment is an open-ended design problem.  We are going to say as
+little as possible about how to do things.  Instead we will focus on
+what functionality we require your OS to support.  We will expect
+you to come up with a design that makes sense.  You will have the
+freedom to choose how to handle page faults, how to organize the swap
+partition, how to implement paging, etc.
  
-int main (void)
-@{
-    void *addr = (void *) 0x10000000;
-    int fd = open ("foo");
-    int length = filesize (fd);
-    if (mmap (fd, addr, length))
-        printf ("success!\n");
-@}
-@end example
+@menu
+* Project 3 Design Document::   
+* Paging::                      
+* Stack Growth::                
+* Memory Mapped Files::         
+@end menu
  
-Suppose @file{foo} is a text file and you want to print the first 64
-bytes on the screen (assuming, of course, that the length of the file
-is at least 64).  Without @code{mmap}, you'd need to allocate a
-buffer, use @code{read} to get the data from the file into the buffer,
-and finally use @code{write} to put the buffer out to the display. But
-with the file mapped into your address space, you can directly address
-it like so:
+@node Project 3 Design Document
+@subsection Design Document
+
+Before you turn in your project, you must copy @uref{vm.tmpl, , the
+project 3 design document template} into your source tree under the name
+@file{pintos/src/vm/DESIGNDOC} and fill it in.  We recommend that you
+read the design document template before you start working on the
+project.  @xref{Project Documentation}, for a sample design document
+that goes along with a fictitious project.
+
+@node Paging
+@subsection Paging
+
+Implement paging for segments loaded from executables.  All of these
+pages should be loaded lazily, that is, only as the kernel intercepts
+page faults for them.  Upon eviction, pages modified since load (e.g.@:
+as indicated by the ``dirty bit'') should be written to swap.
+Unmodified pages, including read-only pages, should never be written to
+swap because they can always be read back from the executable.
+
+Implement a global page replacement algorithm that approximates LRU.
+Your algorithm should perform at least as well as the ``second chance''
+or ``clock'' algorithm.
+
+Your design should allow for parallelism.  If one page fault requires
+I/O, in the meantime processes that do not fault should continue
+executing and other page faults that do not require I/O should be able
+to complete.  This will require some synchronization effort.
+
+You'll need to modify the core of the program loader, which is the loop
+in @func{load_segment} in @file{userprog/process.c}.  Each time around
+the loop, @code{page_read_bytes} receives the number of bytes to read
+from the executable file and @code{page_zero_bytes} receives the number
+of bytes to initialize to zero following the bytes read.  The two always
+sum to @code{PGSIZE} (4,096).  The handling of a page depends on these
+variables' values:
  
-@example
-write (addr, 64, STDOUT_FILENO);
-@end example
+@itemize @bullet
+@item
+If @code{page_read_bytes} equals @code{PGSIZE}, the page should be demand
+paged from the underlying file on its first access.
  
-Similarly, if you wanted to replace the first byte of the file,
-all you need to do is:
+@item
+If @code{page_zero_bytes} equals @code{PGSIZE}, the page does not need to
+be read from disk at all because it is all zeroes.  You should handle
+such pages by creating a new page consisting of all zeroes at the
+first page fault.
  
-@example
-addr[0] = 'b';
-@end example
+@item
+Otherwise, neither @code{page_read_bytes} nor @code{page_zero_bytes}
+equals @code{PGSIZE}.  In this case, an initial part of the page is to
+be read from the underlying file and the remainder zeroed.
+@end itemize
  
-When you're done using the memory-mapped file, you simply unmap
-it:
+@node Stack Growth
+@subsection Stack Growth
+
+Implement stack growth.  In project 2, the stack was a single page at
+the top of the user virtual address space, and programs were limited to
+that much stack.  Now, if the stack grows past its current size,
+allocate additional pages as necessary.
+
+Allocate additional pages only if they ``appear'' to be stack accesses.
+Devise a heuristic that attempts to distinguish stack accesses from
+other accesses.
+
+User programs are buggy if they write to the stack below the stack
+pointer, because typical real OSes may interrupt a process at any time
+to deliver a ``signal,'' which pushes data on the stack.@footnote{This rule is
+common but not universal.  One modern exception is the
+@uref{http://www.x86-64.org/documentation/abi.pdf, @var{x}86-64 System V
+ABI}, which designates 128 bytes below the stack pointer as a ``red
+zone'' that may not be modified by signal or interrupt handlers.}
+However, the 80@var{x}86 @code{PUSH} instruction checks access
+permissions before it adjusts the stack pointer, so it may cause a page
+fault 4 bytes below the stack pointer.  (Otherwise, @code{PUSH} would
+not be restartable in a straightforward fashion.)  Similarly, the
+@code{PUSHA} instruction pushes 32 bytes at once, so it can fault 32
+bytes below the stack pointer.
+
+You will need to be able to obtain the current value of the user
+program's stack pointer.  Within a system call or a page fault generated
+by a user program, you can retrieve it from the @code{esp} member of the
+@struct{intr_frame} passed to @func{syscall_handler} or
+@func{page_fault}, respectively.  If you verify user pointers before
+accessing them (@pxref{Accessing User Memory}), these are the only cases
+you need to handle.  On the other hand, if you depend on page faults to
+detect invalid memory access, you will need to handle another case,
+where a page fault occurs in the kernel.  Since the processor only 
+saves the stack pointer when an exception causes a switch from user
+to kernel mode, reading @code{esp} out of the @struct{intr_frame} 
+passed to @func{page_fault} would yield an undefined value, not the 
+user stack pointer.  You will need to arrange another way, such as 
+saving @code{esp} into @struct{thread} on the initial transition 
+from user to kernel mode.
+
+You should impose some absolute limit on stack size, as do most OSes.
+Some OSes make the limit user-adjustable, e.g.@: with the
+@command{ulimit} command on many Unix systems.  On many GNU/Linux systems,
+the default limit is 8 MB.
+
+The first stack page need not be allocated lazily.  You can allocate
+and initialize it with the command line arguments at load time, with 
+no need to wait for it to be faulted in.
+
+All stack pages should be candidates for eviction.  An evicted stack
+page should be written to swap.
  
-@example
-munmap (addr);
-@end example
+@node Memory Mapped Files
+@subsection Memory Mapped Files
+
+Implement memory mapped files, including the following system calls.
+
+@deftypefn {System Call} mapid_t mmap (int @var{fd}, void *@var{addr})
+Maps the file open as @var{fd} into the process's virtual address
+space.  The entire file is mapped into consecutive virtual pages
+starting at @var{addr}.
+
+Your VM system must lazily load pages in @code{mmap} regions and use the
+@code{mmap}ed file itself as backing store for the mapping.  That is,
+evicting a page mapped by @code{mmap} writes it back to the file it was
+mapped from.
+
+If the file's length is not a multiple of @code{PGSIZE}, then some
+bytes in the final mapped page ``stick out'' beyond the end of the
+file.  Set these bytes to zero when the page is faulted in from the
+file system,
+and discard them when the page is written back to disk.
+
+If successful, this function returns a ``mapping ID'' that
+uniquely identifies the mapping within the process.  On failure,
+it must return -1, which otherwise should not be a valid mapping id,
+and the process's mappings must be unchanged.
+
+A call to @code{mmap} may fail if the file open as @var{fd} has a
+length of zero bytes.  It must fail if @var{addr} is not page-aligned
+or if the range of pages mapped overlaps any existing set of mapped
+pages, including the stack or pages mapped at executable load time.
+It must also fail if @var{addr} is 0, because some Pintos code assumes
+virtual page 0 is not mapped.  Finally, file descriptors 0 and 1,
+representing console input and output, are not mappable.
+@end deftypefn
+
+@deftypefn {System Call} void munmap (mapid_t @var{mapping})
+Unmaps the mapping designated by @var{mapping}, which must be a
+mapping ID returned by a previous call to @code{mmap} by the same
+process that has not yet been unmapped.
+@end deftypefn
+
+All mappings are implicitly unmapped when a process exits, whether via
+@code{exit} or by any other means.  When a mapping is unmapped, whether
+implicitly or explicitly, all pages written to by the process are
+written back to the file, and pages not written must not be.  The pages
+are then removed from the process's list of virtual pages.
+
+Closing or removing a file does not unmap any of its mappings.  Once
+created, a mapping is valid until @code{munmap} is called or the process
+exits, following the Unix convention.  @xref{Removing an Open File}, for
+more information.
+
+If two or more processes map the same file, there is no requirement that
+they see consistent data.  Unix handles this by making the two mappings
+share the same physical page, but the @code{mmap} system call also has
+an argument allowing the client to specify whether the page is shared or
+private (i.e.@: copy-on-write).
  
-@item
-@b{What if two processes memory-map the same file?}
+@node Project 3 FAQ
+@section FAQ
  
-There is no requirement in Pintos that the two processes see
-consistent data.  Unix handles this by making the processes share the
-same physical page, but the @code{mmap} system call also has an
-argument allowing the client to specify whether the page is shared or
-private (i.e.@: copy-on-write).
+@table @b
+@item How much code will I need to write?
+
+Here's a summary of our reference solution, produced by the
+@command{diffstat} program.  The final row gives total lines inserted
+and deleted; a changed line counts as both an insertion and a deletion.
+
+This summary is relative to the Pintos base code, but the reference
+solution for project 3 starts from the reference solution to project 2.
+@xref{Project 2 FAQ}, for the summary of project 2.
+
+The reference solution represents just one possible solution.  Many
+other solutions are also possible and many of those differ greatly from
+the reference solution.  Some excellent solutions may not modify all the
+files modified by the reference solution, and some may modify files not
+modified by the reference solution.
+
+@verbatim
+ Makefile.build       |    4
+ devices/timer.c      |   42 ++
+ threads/init.c       |    5
+ threads/interrupt.c  |    2
+ threads/thread.c     |   31 +
+ threads/thread.h     |   37 +-
+ userprog/exception.c |   12
+ userprog/pagedir.c   |   10
+ userprog/process.c   |  319 +++++++++++++-----
+ userprog/syscall.c   |  545 ++++++++++++++++++++++++++++++-
+ userprog/syscall.h   |    1
+ vm/frame.c           |  162 +++++++++
+ vm/frame.h           |   23 +
+ vm/page.c            |  297 ++++++++++++++++
+ vm/page.h            |   50 ++
+ vm/swap.c            |   85 ++++
+ vm/swap.h            |   11
+ 17 files changed, 1532 insertions(+), 104 deletions(-)
+@end verbatim
+
+@item Do we need a working Project 2 to implement Project 3?
  
-@item
-@b{What happens if a user removes a @code{mmap}'d file?}
+Yes.
  
-@item
-You should follow the Unix convention and the mapping should still be
-valid.  This is similar to the question in the User Programs FAQ about
-a process with a file descriptor to a file that has been removed.
+@item What extra credit is available?
+@anchor{VM Extra Credit}
  
-@item
-@b{What if a process writes to a page that is memory-mapped, but the
-location written to in the memory-mapped page is past the end
-of the memory-mapped file?}
+You may implement sharing: when multiple processes are created that use
+the same executable file, share read-only pages among those processes
+instead of creating separate copies of read-only segments for each
+process.  If you carefully designed your data structures,
+sharing of read-only pages should not make this part significantly
+harder.
  
-Can't happen.  @code{mmap} extends the file to the requested length,
-and Pintos provides no way to shorten a file.  You can remove a file,
-but the mapping remains valid (see the previous question).
+@item How do we resume a process after we have handled a page fault?
  
-@item
-@b{Do we have to handle memory mapping @code{stdin} or @code{stdout}?}
+Returning from @func{page_fault} resumes the current user process
+(@pxref{Internal Interrupt Handling}).
+It will then retry the instruction to which the instruction pointer points.
  
-No.  Memory mapping implies that a file has a length and that a user
-can seek to any location in the file.  Since the console device has
-neither of these properties, @code{mmap} should return false when the
-user attempts to memory map a file descriptor for the console device.
+@item Does the virtual memory system need to support data segment growth?
  
-@item
-@b{What happens when a process exits with mmap'd files?}
+No.  The size of the data segment is determined by the linker.  We still
+have no dynamic allocation in Pintos (although it is possible to
+``fake'' it at the user level by using memory-mapped files).  Supporting
+data segment growth should add little additional complexity to a
+well-designed system.
  
-When a process finishes each of its @code{mmap}'d files is implicitly
-unmapped.  When a process @code{mmap}s a file and then writes into the
-area for the file it is making the assumption the changes will be
-written to the file.
+@item Why should I use @code{PAL_USER} for allocating page frames?
+@anchor{Why PAL_USER?}
  
-@item
-@b{If a user closes a mmaped file, should be automatically unmap it
-for them?}
+Passing @code{PAL_USER} to @func{palloc_get_page} causes it to allocate
+memory from the user pool, instead of the main kernel pool.  Running out
+of pages in the user pool just causes user programs to page, but running
+out of pages in the kernel pool will cause many failures because so many
+kernel functions need to obtain memory.
+You can layer some other allocator on top of @func{palloc_get_page} if
+you like, but it should be the underlying mechanism.
  
-No, once created the mapping is valid until @code{munmap} is called
-or the process exits.
-@end enumerate
-@end enumerate
+Also, you can use the @option{-ul} kernel command-line option to limit
+the size of the user pool, which makes it easy to test your VM
+implementation with various user memory sizes.
+@end table