pintos-os.org Git - pintos-anon/blob - doc/vm.texi

   1 @node Project 3--Virtual Memory, Project 4--File Systems, Project 2--User Programs, Top
   2 @chapter Project 3: Virtual Memory
   3
   4 By now you should be familiar with the inner workings of Pintos.
   5 You've already come a long way: your OS can properly handle multiple
   6 threads of execution with proper synchronization, and can load
   7 multiple user programs at once.  However, when loading user programs,
   8 your OS is limited by how much main memory the simulated machine has.
   9 In this assignment, you will remove that limitation.
  10
  11 You will be using the @file{vm} directory for this project.  The
  12 @file{vm} directory contains only the @file{Makefile}s.  The only
  13 change from @file{userprog} is that this new @file{Makefile} turns on
  14 the setting @option{-DVM}.  All code you write will either be newly
  15 generated files (e.g.@: if you choose to implement your paging code in
  16 their own source files), or will be modifications to pre-existing code
  17 (e.g.@: you will change the behavior of @file{process.c}
  18 significantly).
  19
  20 There are only a couple of source files you will probably be
  21 encountering for the first time:
  22
  23 @table @file
  24 @item devices/disk.h
  25 @itemx devices/disk.c
  26 Provides access to the physical disk, abstracting away the rather
  27 awful IDE interface.
  28 @end table
  29
  30 You will be building this assignment on the last one.  It will benefit
  31 you to get your project 2 in good working order before this assignment
  32 so those bugs don't keep haunting you.
  33
  34 All the test programs from the previous project should also work with
  35 this project.  You should also write programs to test the new features
  36 introduced in this project.
  37
  38 Your submission should define @code{THREAD_JOIN_IMPLEMENTED} in
  39 @file{constants.h} (@pxref{Conditional Compilation}).
  40
  41 @menu
  42 * VM Design::
  43 * Page Faults::
  44 * Disk as Backing Store::
  45 * Memory Mapped Files::
  46 * Stack::
  47 * Problem 3-1 Page Table Management::
  48 * Problem 3-2 Paging To and From Disk::
  49 * Problem 3-3 Memory Mapped Files::
  50 * Virtual Memory FAQ::
  51 @end menu
  52
  53 @node VM Design
  54 @section A Word about Design
  55
  56 It is important for you to note that in addition to getting virtual
  57 memory working, this assignment is also meant to be an open-ended
  58 design problem.  We will expect you to come up with a design that
  59 makes sense.  You will have the freedom to choose how to handle page
  60 faults, how to organize the swap disk, how to implement paging, etc.
  61 In each case, we will expect you to provide a defensible justification
  62 in your design documentation as to why your choices are reasonable.
  63 You should evaluate your design on all the available criteria: speed
  64 of handling a page fault, space overhead in memory, minimizing the
  65 number of page faults, simplicity, etc.
  66
  67 In keeping with this, you will find that we are going to say as little
  68 as possible about how to do things.  Instead we will focus on what end
  69 functionality we require your OS to support.
  70
  71 @node Page Faults
  72 @section Page Faults
  73
  74 For the last assignment, whenever a context switch occurred, the new
  75 process would install its own page table into the machine.  The page
  76 table contained all the virtual-to-physical translations for the
  77 process.  Whenever the processor needed to look up a translation, it
  78 consulted the page table.  As long as the process only accessed
  79 memory that it didn't own, all was well.  If the process accessed
  80 memory it didn't own, it ``page faulted'' and @func{page_fault}
  81 terminated the process.
  82
  83 When we implement virtual memory, the rules have to change.  A page
  84 fault is no longer necessarily an error, since it might only indicate
  85 that the page must be brought in from a disk file or from swap.  You
  86 will have to implement a more sophisticated page fault handler to
  87 handle these cases.
  88
  89 On the 80@var{x}86, the page table format is fixed by hardware.  We
  90 have provided code for managing page tables for you to use in
  91 @file{userprog/pagedir.c}.  The functions in there should provide an
  92 abstract interface to all the page table functionality that you need
  93 to complete the project.  However, you may still find it worthwhile to
  94 understand a little about the hardware page table format, so we'll go
  95 into a little of detail about that in this section.
  96
  97 The top-level paging data structure is a 4 kB page called the ``page
  98 directory'' (PD) arranged as an array of 1,024 32-bit page directory
  99 entries (PDEs), each of which represents 4 MB of virtual memory.  Each
 100 PDE may point to the physical address of another 4 kB page called a
 101 ``page table'' (PT) arranged in the same fashion as an array of 1,024
 102 32-bit page table entries (PTEs), each of which translates a single 4
 103 kB virtual page into physical memory.
 104
 105 Thus, translation of a virtual address into a physical address follows
 106 the three-step process illustrated in the diagram
 107 below:@footnote{Actually, virtual to physical translation on the
 108 80@var{x}86 architecture happens via an intermediate ``linear
 109 address,'' but Pintos (and most other 80@var{x}86 OSes) set up the CPU
 110 so that linear and virtual addresses are one and the same, so that you
 111 can effectively ignore this CPU feature.}
 112
 113 @enumerate 1
 114 @item
 115 The top 10 bits of the virtual address (bits 22:32) are used to index
 116 into the page directory.  If the PDE is marked ``present,'' the
 117 physical address of a page table is read from the PDE thus obtained.
 118 If the PDE is marked ``not present'' then a page fault occurs.
 119
 120 @item
 121 The next 10 bits of the virtual address (bits 12:22) are used to index
 122 into the page table.  If the PTE is marked ``present,'' the physical
 123 address of a data page is read from the PTE thus obtained.  If the PTE
 124 is marked ``not present'' then a page fault occurs.
 125
 126
 127 @item
 128 The bottom 12 bits of the virtual address (bits 0:12) are added to the
 129 data page's physical base address, producing the final physical
 130 address.
 131 @end enumerate
 132
 133 @example
 134 32                    22                     12                      0
 135 +--------------------------------------------------------------------+
 136 | Page Directory Index |   Page Table Index   |    Page Offset       |
 137 +--------------------------------------------------------------------+
 138              |                    |                     |
 139      _______/             _______/                _____/
 140     /                    /                       /
 141    /    Page Directory  /      Page Table       /    Data Page
 142   /     .____________. /     .____________.    /   .____________.
 143   |1,023|____________| |1,023|____________|    |   |____________|
 144   |1,022|____________| |1,022|____________|    |   |____________|
 145   |1,021|____________| |1,021|____________|    \__\|____________|
 146   |1,020|____________| |1,020|____________|       /|____________|
 147   |     |            | |     |            |        |            |
 148   |     |            | \____\|            |_       |            |
 149   |     |      .     |      /|      .     | \      |      .     |
 150   \____\|      .     |_      |      .     |  |     |      .     |
 151        /|      .     | \     |      .     |  |     |      .     |
 152         |      .     |  |    |      .     |  |     |      .     |
 153         |            |  |    |            |  |     |            |
 154         |____________|  |    |____________|  |     |____________|
 155        4|____________|  |   4|____________|  |     |____________|
 156        3|____________|  |   3|____________|  |     |____________|
 157        2|____________|  |   2|____________|  |     |____________|
 158        1|____________|  |   1|____________|  |     |____________|
 159        0|____________|  \__\0|____________|  \____\|____________|
 160                            /                      /
 161 @end example
 162
 163 Header @file{threads/mmu.h} has useful functions for various
 164 operations on virtual addresses.  You should look over the header
 165 yourself, but its most important functions include these:
 166
 167 @table @code
 168 @item pd_no(@var{va})
 169 Returns the page directory index in virtual address @var{va}.
 170
 171 @item pt_no(@var{va})
 172 Returns the page table index in virtual address @var{va}.
 173
 174 @item pg_ofs(@var{va})
 175 Returns the page offset in virtual address @var{va}.
 176
 177 @item pg_round_down(@var{va})
 178 Returns @var{va} rounded down to the nearest page boundary, that is,
 179 @var{va} but with its page offset set to 0.
 180
 181 @item pg_round_up(@var{va})
 182 Returns @var{va} rounded up to the nearest page boundary.
 183 @end table
 184
 185 @node Disk as Backing Store
 186 @section Disk as Backing Store
 187
 188 In VM systems, since memory is less plentiful than disk, you will
 189 effectively use memory as a cache for disk.  Looking at it from
 190 another angle, you will use disk as a backing store for memory.  This
 191 provides the abstraction of an (almost) unlimited virtual memory size.
 192 Part of your task in this project is to do this, with the additional
 193 constraint that your performance should be close to that provided by
 194 physical memory.  You will use the page tables' ``dirty'' bits to
 195 denote whether pages need to be written back to disk when they're
 196 evicted from main memory and the ``accessed'' bit for page replacement
 197 algorithms.  Whenever the hardware writes memory, it sets the dirty
 198 bit, and if it reads or writes to the page, it sets the accessed bit.
 199
 200 As with any caching system, performance depends on the policy used to
 201 decide which things are kept in memory and which are only stored on
 202 disk.  On a page fault, the kernel must decide which page to replace.
 203 Ideally, it will throw out a page that will not be referenced for a
 204 long time, keeping in memory those pages that are soon to be
 205 referenced.  Another consideration is that if the replaced page has
 206 been modified, the page must be first saved to disk before the needed
 207 page can be brought in.  Many virtual memory systems avoid this extra
 208 overhead by writing modified pages to disk in advance, so that later
 209 page faults can be completed more quickly.
 210
 211 @node Memory Mapped Files
 212 @section Memory Mapped Files
 213
 214 The traditional way to access the file system is via @code{read} and
 215 @code{write} system calls, but that requires an extra level of copying
 216 between the kernel and the user level.  A secondary interface is
 217 simply to ``map'' the file into the virtual address space.  The
 218 program can then use load and store instructions directly on the file
 219 data.  (An alternative way of viewing the file system is as ``durable
 220 memory.''  Files just store data structures.  If you access data
 221 structures in memory using load and store instructions, why not access
 222 data structures in files the same way?)
 223
 224 Memory mapped files are typically implemented using system calls.  One
 225 system call maps the file to a particular part of the address space.
 226 For example, one might map the file @file{foo}, which is 1000 bytes
 227 long, starting at address 5000.  Assuming that nothing else is already
 228 at virtual addresses 5000@dots{}6000, any memory accesses to these
 229 locations will access the corresponding bytes of @file{foo}.
 230
 231 A consequence of memory mapped files is that address spaces are
 232 sparsely populated with lots of segments, one for each memory mapped
 233 file (plus one each for code, data, and stack).  You will implement
 234 memory mapped files for problem 3 of this assignment, but you should
 235 design your solutions to problems 1 and 2 to account for this.
 236
 237 @node Stack
 238 @section Stack
 239
 240 In project 2, the stack was a single page at the top of the user
 241 virtual address space.  The stack's location does not change in this
 242 project, but your kernel should allocate additional pages to the stack
 243 on demand.  That is, if the stack grows past its current bottom, the
 244 system should allocate additional pages for the stack as necessary
 245 (unless those pages are unavailable because they are in use by another
 246 segment).
 247
 248 @node Problem 3-1 Page Table Management
 249 @section Problem 3-1: Page Table Management
 250
 251 Implement page directory and page table management to support virtual
 252 memory.  You will need data structures to accomplish the following
 253 tasks:
 254
 255 @itemize @bullet
 256 @item
 257 Some way of translating in software from virtual page frames to
 258 physical page frames.  Consider using a hash table (@pxref{Hash
 259 Table}).
 260
 261 @item
 262 Some way of translating from physical page frames back to virtual
 263 page frames, so that when you replace a page, you can invalidate
 264 its translation(s).
 265
 266 @item
 267 Some way of finding a page on disk if it is not in memory.  You won't
 268 need this data structure until part 2, but planning ahead is a good
 269 idea.
 270 @end itemize
 271
 272 The page fault handler, @func{page_fault} in
 273 @file{threads/exception.c}, needs to do roughly the following:
 274
 275 @enumerate 1
 276 @item
 277 Determine the location of the physical page backing the virtual
 278 address that faulted.  It might be in the file system, in swap,
 279 already be in physical memory and just not set up in the page table,
 280 or it might be an invalid virtual address.
 281
 282 If the virtual address is invalid, that is, if there's no physical
 283 page backing it, or if the virtual address is above @code{PHYS_BASE},
 284 meaning that it belongs to the kernel instead of the user, then the
 285 process's memory access must be disallowed.  You should terminate the
 286 process at this point, being sure to free all of its resources.
 287
 288 @item
 289 If the physical page is not in physical memory, bring it into memory.
 290 If necessary to make room, first evict some other page from memory.
 291 (When you do that you need to first remove references to the page from
 292 any page table that refers to it.)
 293
 294 @item
 295 Point the page table entry for the faulting virtual address to the
 296 physical page.  You can use the functions in @file{userprog/pagedir.c}.
 297 @end enumerate
 298
 299 You'll need to modify the ELF loader in @file{userprog/process.c} to
 300 do page table management according to your new design.  As supplied,
 301 it reads all the process's pages from disk and initializes the page
 302 tables for them at the same time.  For testing purposes, you'll
 303 probably want to leave the code that reads the pages from disk, but
 304 use your new page table management code to construct the page tables
 305 only as page faults occur for them.
 306
 307 You should use the @func{palloc_get_page} function to get the page
 308 frames that you use for storing user virtual pages.  Be sure to pass
 309 the @code{PAL_USER} flag to this function when you do so, because that
 310 allocates pages from a ``user pool'' separate from the ``kernel pool''
 311 that other calls to @func{palloc_get_page} make.
 312
 313 There are many possible ways to implement virtual memory.  The above
 314 is simply an outline of our suggested implementation.
 315
 316 @node Problem 3-2 Paging To and From Disk
 317 @section Problem 3-2: Paging To and From Disk
 318
 319 Implement paging to and from files and the swap disk.  You may use the
 320 disk on interface @code{hd1:1} as the swap disk, using the disk
 321 interface prototyped in @code{devices/disk.h}.
 322
 323 You will need routines to move a page from memory to disk and from
 324 disk to memory, where ``disk'' is either a file or the swap disk.  If
 325 you do everything correctly, your VM should still work when you
 326 implement your own file system for the next assignment.
 327
 328 You will need a way to track pages which are used by a process but
 329 which are not in physical memory, to fully handle page faults.  Pages
 330 that you write to swap should not be constrained to be in sequential
 331 order.  You will also need a way to track all of the physical memory
 332 pages, in order to find an unused one when needed, or to evict a page
 333 when memory is needed but no empty pages are available.  The data
 334 structures that you designed in part 1 should do most of the work for
 335 you.
 336
 337 You will need a page replacement algorithm.  The hardware sets the
 338 accessed and dirty bits when it accesses memory.  You can gain access
 339 this information using the functions prototyped in
 340 @file{userprog/pagedir.h}.  You should be able to take advantage of
 341 this information to implement some algorithm which attempts to achieve
 342 LRU-type behavior.  We expect that your algorithm perform at least as
 343 well as a reasonable implementation of the second-chance (clock)
 344 algorithm.  You will need to show in your test cases the value of your
 345 page replacement algorithm by demonstrating for some workload that it
 346 pages less frequently using your algorithm than using some inferior
 347 page replacement policy.  The canonical example of a poor page
 348 replacement policy is random replacement.
 349
 350 Since you will already be paging from disk, you should implement a
 351 ``lazy'' loading scheme for new processes.  When a process is created,
 352 it will not run immediately.  Therefore, it doesn't make sense to load
 353 all its code, data, and stack into memory when the process is created,
 354 since it might incur additional disk accesses to do so (if it gets
 355 paged out before it runs).  When loading a new process, you should
 356 leave most pages on disk, and bring them in as demanded when the
 357 program begins running.  Your VM system should also use the executable
 358 file itself as backing store for read-only segments, since these
 359 segments won't change.
 360
 361 There are a few special cases.  Look at the loop in
 362 @func{load_segment} in @file{userprog/process.c}.  Each time
 363 around the loop, @code{read_bytes} represents the number of bytes to
 364 read from the executable file and @code{zero_bytes} represents the number
 365 of bytes to initialize to zero following the bytes read.  The two
 366 always sum to @code{PGSIZE}.  The page handling depends on these
 367 variables' values:
 368
 369 @itemize @bullet
 370 @item
 371 If @code{read_bytes} equals @code{PGSIZE}, the page should be demand
 372 paged from disk on its first access.
 373
 374 @item
 375 If @code{zero_bytes} equals @code{PGSIZE}, the page does not need to
 376 be read from disk at all because it is all zeroes.  You should handle
 377 such pages by creating a new page consisting of all zeroes at the
 378 first page fault.
 379
 380 @item
 381 If neither @code{read_bytes} nor @code{zero_bytes} equals
 382 @code{PGSIZE}, then part of the page is to be read from disk and the
 383 remainder zeroed.  This is a special case.  You are allowed to handle
 384 it by reading the partial page from disk at executable load time and
 385 zeroing the rest of the page.  This is the only case in which we will
 386 allow you to load a page in a non-``lazy'' fashion.  Many real OSes
 387 such as Linux do not load partial pages lazily.
 388 @end itemize
 389
 390 Incidentally, if you have trouble handling the third case above, you
 391 can eliminate it temporarily by linking the test programs with a
 392 special ``linker script.''  Read @file{tests/userprog/Makefile} for
 393 details.  We will not test your submission with this special linker
 394 script, so the code you turn in must properly handle all cases.
 395
 396 For extra credit, you may implement sharing: when multiple processes
 397 are created that use the same executable file, share read-only pages
 398 among those processes instead of creating separate copies of read-only
 399 segments for each process.  If you carefully designed your data
 400 structures in part 1, sharing of read-only pages should not make this
 401 part significantly harder.
 402
 403 @node Problem 3-3 Memory Mapped Files
 404 @section Problem 3-3: Memory Mapped Files
 405
 406 Implement memory mapped files.
 407
 408 You will need to implement the following system calls:
 409
 410 @table @code
 411 @item SYS_mmap
 412 @itemx bool mmap (int @var{fd}, void *@var{addr}, unsigned @var{length})
 413
 414 Maps the file open as @var{fd} into the process's address space
 415 starting at @var{addr} for @var{length} bytes.  Returns true if
 416 successful, false on failure.
 417
 418 @item SYS_munmap
 419 @itemx bool munmap (void *addr, unsigned length)
 420
 421 Unmaps the segment specified by id.  This cannot be used to unmap
 422 segments mapped by the executable loader.  Returns 0 on success, -1 on
 423 failure.  When a file is unmapped, all outstanding changes are written
 424 to the file, and the segment's pages are removed from the process's
 425 list of used virtual pages.
 426 @end table
 427
 428 Calls to @code{mmap} must fail if the address is not page-aligned, if
 429 the length is not positive, or if the length is not a multiple of
 430 @code{PGSIZE}.  You also must error check to make sure that the new
 431 segment does not overlap already existing segments, and fail if it
 432 does.  If the length passed to @code{mmap} is less than the file's
 433 length, you should only map the first part of the file.  If the length
 434 passed to @code{mmap} is longer than the file, the call should fail.
 435 (Ideally it should extend the file, but our file system does not yet
 436 support growing files.)  Similar to the code segment, your VM system
 437 should be able to use the @code{mmap}'d file itself as backing store
 438 for the mapped segment, since the changes to the @code{mmap} segment
 439 will eventually be written to the file.  (In fact, you may choose to
 440 implement executable mappings as a special case of file mappings.)
 441
 442 @node Virtual Memory FAQ
 443 @section FAQ
 444
 445 @enumerate 1
 446 @item
 447 @b{Do we need a working HW 2 to implement HW 3?}
 448
 449 Yes.
 450
 451 @item
 452 @anchor{Hash Table}
 453 @b{How do I use the hash table provided in @file{lib/kernel/hash.c}?}
 454
 455 First, you need to embed a @code{hash_elem} object as a member of the
 456 object that the hash table will contain.  Each @code{hash_elem} allows
 457 the object to a member of at most one hash table at a given time.  All
 458 the hash table functions that deal with hash table items actually use
 459 the address of a @code{hash_elem}.  You can convert a pointer to a
 460 @code{hash_elem} member into a pointer to the structure in which
 461 member is embedded using the @code{hash_entry} macro.
 462
 463 Second, you need to decide on a key type.  The key should be something
 464 that is unique for each object, because a given hash table may not
 465 contain two objects with equal keys.  Then you need to write two
 466 functions.  The first is a @dfn{hash function} that converts a key
 467 into an integer.  Some sample hash functions that you can use or just
 468 examine are given in @file{lib/kernel/hash.c}.  The second function
 469 needed is a @dfn{comparison function} that compares a pair and returns
 470 true if the first is less than the second.  These two functions have
 471 to be compatible with the prototypes for @code{hash_hash_func} and
 472 @code{hash_less_func} in @file{lib/kernel/hash.h}.
 473
 474 Here's a quick example.  Suppose you want to put @code{struct thread}s
 475 in a hash table.  First, add a @code{hash_elem} to the thread
 476 structure by adding a line to its definition:
 477
 478 @example
 479 hash_elem h_elem;               /* Hash table element. */
 480 @end example
 481
 482 We'll choose the @code{tid} member in @code{struct thread} as the key,
 483 and write a hash function and a comparison function:
 484
 485 @example
 486 /* Returns a hash for E. */
 487 unsigned
 488 thread_hash (const hash_elem *e, void *aux UNUSED)
 489 @{
 490   struct thread *t = hash_entry (e, struct thread, h_elem);
 491   return hash_int (t->tid);
 492 @}
 493
 494 /* Returns true if A's tid is less than B's tid. */
 495 bool
 496 thread_less (const hash_elem *a_, const hash_elem *b_, void *aux UNUSED)
 497 @{
 498   struct thread *a = hash_entry (a_, struct thread, h_elem);
 499   struct thread *b = hash_entry (b_, struct thread, h_elem);
 500   return a->tid < b->tid;
 501 @}
 502 @end example
 503
 504 Then we can create a hash table like this:
 505
 506 @example
 507 struct hash threads;
 508
 509 hash_init (&threads, thread_hash, thread_less, NULL);
 510 @end example
 511
 512 Finally, if @code{@var{t}} is a pointer to a @code{struct thread},
 513 then we can insert it into the hash table with:
 514
 515 @example
 516 hash_insert (&threads, &@var{t}->h_elem);
 517 @end example
 518
 519 If you have any other questions about hash tables, the CS109
 520 and CS161 textbooks have good chapters on them, or you can come
 521 to any of the TA's office hours for further clarification.
 522
 523 @item
 524 @b{What are the @var{aux} parameters to the hash table functions good
 525 for?}
 526
 527 In simple cases you won't have any need for the @var{aux} parameters.
 528 In these cases you can just pass a null pointer to @func{hash_init}
 529 for @var{aux} and ignore the values passed to the hash function and
 530 comparison functions.  (You'll get a compiler warning if you don't use
 531 the @var{aux} parameter, but you can turn that off with the
 532 @code{UNUSED} macro, as shown above, or you can just ignore it.)
 533
 534 @var{aux} is useful when you have some property of the data in the
 535 hash table that's both constant and needed for hashing or comparisons,
 536 but which is not stored in the data items themselves.  For example, if
 537 the items in a hash table contain fixed-length strings, but the items
 538 themselves don't indicate what that fixed length is, you could pass
 539 the length as an @var{aux} parameter.
 540
 541 @item
 542 @b{The current implementation of the hash table does not do something
 543 that we need it to do. What gives?}
 544
 545 You are welcome to modify it.  It is not used by any of the code we
 546 provided, so modifying it won't affect any code but yours.  Do
 547 whatever it takes to make it work the way you want.
 548
 549 @item
 550 @b{What controls the layout of user programs?}
 551
 552 The linker is responsible for the layout of a user program in
 553 memory. The linker is directed by a ``linker script'' which tells it
 554 the names and locations of the various program segments.  You can
 555 learn more about linker scripts by reading the ``Scripts'' chapter in
 556 the linker manual, accessible via @samp{info ld}.
 557 @end enumerate
 558
 559 @menu
 560 * Problem 3-1 and 3-2 FAQ::
 561 * Problem 3-3 Memory Mapped File FAQ::
 562 @end menu
 563
 564 @node Problem 3-1 and 3-2 FAQ
 565 @subsection Problem 3-1 and 3-2 FAQ
 566
 567 @enumerate 1
 568 @item
 569 @b{Does the virtual memory system need to support growth of the stack
 570 segment?}
 571
 572 Yes. If a page fault appears just below the last stack segment page,
 573 you must add a new page to the bottom of the stack. It is impossible
 574 to predict how large the stack will grow at compile time, so we must
 575 allocate pages as necessary. You should only allocate additional pages
 576 if they ``appear'' to be stack accesses.
 577
 578 @item
 579 @b{Does the first stack page need to be loaded lazily?}
 580
 581 No, you can initialize the first stack page with the command line at
 582 load time.  There's no need to wait for it to be faulted in.  Even if
 583 you did wait, the very first instruction in the user program is likely
 584 to be one that faults in the page.
 585
 586 @item
 587 @b{Does the virtual memory system need to support growth of the data
 588 segment?}
 589
 590 No.  The size of the data segment is determined by the linker.  We
 591 still have no dynamic allocation in Pintos (although it is possible to
 592 ``fake'' it at the user level by using memory-mapped files).  However,
 593 implementing it would add little additional complexity to a
 594 well-designed system.
 595
 596 @item
 597 @b{But what do you mean by ``appear'' to be stack accesses? How big can a
 598 stack growth be?  Under what circumstances do we grow the stack?}
 599
 600 If it looks like a stack request, then you grow the stack. Yes, that's
 601 ambiguous. You need to make a reasonable decision about what looks
 602 like a stack request. For example, you could decide a page, or two
 603 pages, or ten pages, or more@enddots{}  Or, you could use some other
 604 heuristic to figure this out.
 605
 606 Make a reasonable decision and document it in your code and in
 607 your design document.  Please make sure to justify your decision.
 608
 609 @item
 610 @b{Why do I need to pass @code{PAL_USER} to @func{palloc_get_page}
 611 when I allocate physical page frames?}@anchor{Why PAL_USER?}
 612
 613 You can layer some other allocator on top of @func{palloc_get_page}
 614 if you like, but it should be the underlying mechanism, directly or
 615 indirectly, for two reasons.  First, running out of pages in the user
 616 pool just causes user programs to page, but running out of pages in
 617 the kernel pool will cause all kinds of problems, because many kernel
 618 functions depend on being able to allocate memory.  Second, you can
 619 use the @option{-ul} option to @command{pintos} to limit the size of
 620 the user pool, which makes it easy to test your VM implementation with
 621 various user memory sizes.
 622 @end enumerate
 623
 624 @node Problem 3-3 Memory Mapped File FAQ
 625 @subsection Problem 3-3: Memory Mapped File FAQ
 626
 627 @enumerate 1
 628 @item
 629 @b{How do we interact with memory-mapped files?}
 630
 631 Let's say you want to map a file called @file{foo} into your address
 632 space at address @t{0x10000000}. You open the file, determine its
 633 length, and then use @code{mmap}:
 634
 635 @example
 636 #include <stdio.h>
 637 #include <syscall.h>
 638
 639 int main (void)
 640 @{
 641     void *addr = (void *) 0x10000000;
 642     int fd = open ("foo");
 643     int length = filesize (fd);
 644     if (mmap (fd, addr, length))
 645         printf ("success!\n");
 646 @}
 647 @end example
 648
 649 Suppose @file{foo} is a text file and you want to print the first 64
 650 bytes on the screen (assuming, of course, that the length of the file
 651 is at least 64).  Without @code{mmap}, you'd need to allocate a
 652 buffer, use @code{read} to get the data from the file into the buffer,
 653 and finally use @code{write} to put the buffer out to the display. But
 654 with the file mapped into your address space, you can directly address
 655 it like so:
 656
 657 @example
 658 write (addr, 64, STDOUT_FILENO);
 659 @end example
 660
 661 Similarly, if you wanted to replace the first byte of the file,
 662 all you need to do is:
 663
 664 @example
 665 addr[0] = 'b';
 666 @end example
 667
 668 When you're done using the memory-mapped file, you simply unmap
 669 it:
 670
 671 @example
 672 munmap (addr);
 673 @end example
 674
 675 @item
 676 @b{What if two processes memory-map the same file?}
 677
 678 There is no requirement in Pintos that the two processes see
 679 consistent data.  Unix handles this by making the processes share the
 680 same physical page, but the @code{mmap} system call also has an
 681 argument allowing the client to specify whether the page is shared or
 682 private (i.e.@: copy-on-write).
 683
 684 @item
 685 @b{What happens if a user removes a @code{mmap}'d file?}
 686
 687 You should follow the Unix convention and the mapping should still be
 688 valid.  @xref{Removing an Open File}, for more information.
 689
 690 @item
 691 @b{What if a process writes to a page that is memory-mapped, but the
 692 location written to in the memory-mapped page is past the end
 693 of the memory-mapped file?}
 694
 695 Can't happen.  @code{mmap} checks that the mapped region is within the
 696 file's length and Pintos provides no way to shorten a file.  (Until
 697 project 4, there's no way to extend a file either.)  You can remove a
 698 file, but the mapping remains valid (see the previous question).
 699
 700 @item
 701 @b{Do we have to handle memory mapping @code{stdin} or @code{stdout}?}
 702
 703 No.  Memory mapping implies that a file has a length and that a user
 704 can seek to any location in the file.  Since the console device has
 705 neither of these properties, @code{mmap} should return false when the
 706 user attempts to memory map a file descriptor for the console device.
 707
 708 @item
 709 @b{What happens when a process exits with mapped files?}
 710
 711 When a process finishes, each of its mapped files is implicitly
 712 unmapped.  When a process @code{mmap}s a file and then writes into the
 713 area for the file it is making the assumption the changes will be
 714 written to the file.
 715
 716 @item
 717 @b{If a user closes a mapped file, should it be automatically
 718 unmapped?}
 719
 720 No, once created the mapping is valid until @code{munmap} is called
 721 or the process exits.
 722 @end enumerate