   1 @node Project 4--File Systems
   2 @chapter Project 4: File Systems
   3
   4 In the previous two assignments, you made extensive use of a
   5 file system without actually worrying about how it was implemented
   6 underneath.  For this last assignment, you will improve the
   7 implementation of the file system.  You will be working primarily in
   8 the @file{filesys} directory.
   9
  10 You may build project 4 on top of project 2 or project 3.  In either
  11 case, all of the functionality needed for project 2 must work in your
  12 filesys submission.  If you build on project 3, then all of the project
  13 3 functionality must work also, and you will need to edit
  14 @file{filesys/Make.vars} to enable VM functionality. You can receive up
  15 to 5% extra credit if you do enable VM.
  16
  17 @menu
  18 * Project 4 Background::
  19 * Project 4 Suggested Order of Implementation::
  20 * Project 4 Requirements::
  21 * Project 4 FAQ::
  22 @end menu
  23
  24 @node Project 4 Background
  25 @section Background
  26
  27 @menu
  28 * File System New Code::
  29 * Testing File System Persistence::
  30 @end menu
  31
  32 @node File System New Code
  33 @subsection New Code
  34
  35 Here are some files that are probably new to you.  These are in the
  36 @file{filesys} directory except where indicated:
  37
  38 @table @file
  39 @item fsutil.c
  40 Simple utilities for the file system that are accessible from the
  41 kernel command line.
  42
  43 @item filesys.h
  44 @itemx filesys.c
  45 Top-level interface to the file system.  @xref{Using the File System},
  46 for an introduction.
  47
  48 @item directory.h
  49 @itemx directory.c
  50 Translates file names to inodes.  The directory data structure is
  51 stored as a file.
  52
  53 @item inode.h
  54 @itemx inode.c
  55 Manages the data structure representing the layout of a
  56 file's data on disk.
  57
  58 @item file.h
  59 @itemx file.c
  60 Translates file reads and writes to disk sector reads
  61 and writes.
  62
  63 @item lib/kernel/bitmap.h
  64 @itemx lib/kernel/bitmap.c
  65 A bitmap data structure along with routines for reading and writing
  66 the bitmap to disk files.
  67 @end table
  68
  69 Our file system has a Unix-like interface, so you may also wish to
  70 read the Unix man pages for @code{creat}, @code{open}, @code{close},
  71 @code{read}, @code{write}, @code{lseek}, and @code{unlink}.  Our file
  72 system has calls that are similar, but not identical, to these.  The
  73 file system translates these calls into disk operations.
  74
  75 All the basic functionality is there in the code above, so that the
  76 file system is usable from the start, as you've seen
  77 in the previous two projects.  However, it has severe limitations
  78 which you will remove.
  79
  80 While most of your work will be in @file{filesys}, you should be
  81 prepared for interactions with all previous parts.
  82
  83 @node Testing File System Persistence
  84 @subsection Testing File System Persistence
  85
  86 By now, you should be familiar with the basic process of running the
  87 Pintos tests.  @xref{Testing}, for review, if necessary.
  88
  89 Until now, each test invoked Pintos just once.  However, an important
  90 purpose of a file system is to ensure that data remains accessible from
  91 one boot to another.  Thus, the tests that are part of the file system
  92 project invoke Pintos a second time.  The second run combines all the
  93 files and directories in the file system into a single file, then copies
  94 that file out of the Pintos file system into the host (Unix) file
  95 system.
  96
  97 The grading scripts check the file system's correctness based on the
  98 contents of the file copied out in the second run.  This means that your
  99 project will not pass any of the extended file system tests until the
 100 file system is implemented well enough to support @command{tar}, the
 101 Pintos user program that produces the file that is copied out.  The
 102 @command{tar} program is fairly demanding (it requires both extensible
 103 file and subdirectory support), so this will take some work.  Until
 104 then, you can ignore errors from @command{make check} regarding the
 105 extracted file system.
 106
 107 Incidentally, as you may have surmised, the file format used for copying
 108 out the file system contents is the standard Unix ``tar'' format.  You
  109 can use the Unix @command{tar} program to examine the copied-out files.  The tar file
 110 for test @var{t} is named @file{@var{t}.tar}.
 111
 112 @node Project 4 Suggested Order of Implementation
 113 @section Suggested Order of Implementation
 114
 115 To make your job easier, we suggest implementing the parts of this
 116 project in the following order:
 117
 118 @enumerate
 119 @item
 120 Buffer cache (@pxref{Buffer Cache}).  Implement the buffer cache and
 121 integrate it into the existing file system.  At this point all the
 122 tests from project 2 (and project 3, if you're building on it) should
 123 still pass.
 124
 125 @item
 126 Extensible files (@pxref{Indexed and Extensible Files}).  After this
 127 step, your project should pass the file growth tests.
 128
 129 @item
 130 Subdirectories (@pxref{Subdirectories}).  Afterward, your project
 131 should pass the directory tests.
 132
 133 @item
 134 Remaining miscellaneous items.
 135 @end enumerate
 136
 137 You can implement extensible files and subdirectories in parallel if
 138 you temporarily make the number of entries in new directories fixed.
 139
 140 You should think about synchronization throughout.
 141
 142 @node Project 4 Requirements
 143 @section Requirements
 144
 145 @menu
 146 * Project 4 Design Document::
 147 * Indexed and Extensible Files::
 148 * Subdirectories::
 149 * Buffer Cache::
 150 * File System Synchronization::
 151 @end menu
 152
 153 @node Project 4 Design Document
 154 @subsection Design Document
 155
 156 Before you turn in your project, you must copy @uref{filesys.tmpl, , the
 157 project 4 design document template} into your source tree under the name
 158 @file{pintos/src/filesys/DESIGNDOC} and fill it in.  We recommend that
 159 you read the design document template before you start working on the
 160 project.  @xref{Project Documentation}, for a sample design document
 161 that goes along with a fictitious project.
 162
 163 @node Indexed and Extensible Files
 164 @subsection Indexed and Extensible Files
 165
  166 The basic file system allocates each file as a single extent, making it
  167 vulnerable to external fragmentation: it may be impossible to allocate an
  168 @var{n}-block file even though @var{n} blocks are
  169 free.  Eliminate this problem by
 170 modifying the on-disk inode structure.  In practice, this probably means using
 171 an index structure with direct, indirect, and doubly indirect blocks.
 172 You are welcome to choose a different scheme as long as you explain the
 173 rationale for it in your design documentation, and as long as it does
 174 not suffer from external fragmentation (as does the extent-based file
 175 system we provide).
 176
 177 You can assume that the file system partition will not be larger than
 178 8 MB.  You must
 179 support files as large as the partition (minus metadata).  Each inode is
 180 stored in one disk sector, limiting the number of block pointers that it
 181 can contain.  Supporting 8 MB files will require you to implement
 182 doubly-indirect blocks.
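
To see why, here is a minimal sketch of the capacity arithmetic.  It
assumes 4-byte sector numbers, so that 128 pointers fit in one 512-byte
sector.  @code{max_file_bytes} and its parameters are hypothetical names
chosen for illustration, not part of Pintos.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u   /* Same value as BLOCK_SECTOR_SIZE in Pintos. */

/* Pointers per index block, assuming 4-byte sector numbers. */
#define PTRS_PER_SECTOR (SECTOR_SIZE / sizeof (uint32_t))

/* Returns the largest file size, in bytes, that a hypothetical inode
   can address with the given numbers of direct, singly indirect, and
   doubly indirect pointers. */
static uint64_t
max_file_bytes (unsigned direct, unsigned indirect, unsigned dbl_indirect)
{
  uint64_t sectors;

  /* Holds for 512-byte sectors and 4-byte sector numbers. */
  assert (PTRS_PER_SECTOR == 128);

  sectors = direct
            + (uint64_t) indirect * PTRS_PER_SECTOR
            + (uint64_t) dbl_indirect * PTRS_PER_SECTOR * PTRS_PER_SECTOR;
  return sectors * SECTOR_SIZE;
}
```
@end verbatim

Under these assumptions, one doubly indirect block alone covers
128 * 128 * 512 bytes, which is exactly 8 MB, whereas each singly
indirect block adds only 64 kB.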
 183
 184 An extent-based file can only grow if it is followed by empty space, but
 185 indexed inodes make file growth possible whenever free space is
 186 available.  Implement file growth.  In the basic file system, the file
 187 size is specified when the file is created.  In most modern file
 188 systems, a file is initially created with size 0 and is then expanded
 189 every time a write is made off the end of the file.  Your file system
 190 must allow this.
 191
 192 There should be no predetermined limit on the size of a file, except
 193 that a file cannot exceed the size of the file system (minus metadata).  This
 194 also applies to the root directory file, which should now be allowed
 195 to expand beyond its initial limit of 16 files.
 196
 197 User programs are allowed to seek beyond the current end-of-file (EOF).  The
 198 seek itself does not extend the file.  Writing at a position past EOF
 199 extends the file to the position being written, and any gap between the
 200 previous EOF and the start of the write must be filled with zeros.  A
 201 read starting from a position past EOF returns no bytes.
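
The EOF rules above can be illustrated with a small in-memory model.
This is only a sketch: @code{struct mem_file} and both functions are
hypothetical stand-ins, and a real implementation would allocate
sectors through the inode's index instead of calling @code{realloc}.

@verbatim
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a file; not a Pintos structure. */
struct mem_file
  {
    unsigned char *data;   /* File contents. */
    size_t length;         /* Current end-of-file position. */
  };

/* Reads up to SIZE bytes starting at OFS into BUF.
   Returns the number of bytes read, which is 0 at or past EOF. */
static size_t
mem_file_read_at (const struct mem_file *f, void *buf, size_t size,
                  size_t ofs)
{
  assert (f != NULL && buf != NULL);
  if (ofs >= f->length)
    return 0;
  if (size > f->length - ofs)
    size = f->length - ofs;
  memcpy (buf, f->data + ofs, size);
  return size;
}

/* Writes SIZE bytes from BUF at OFS, growing the file if necessary.
   Any gap between the old EOF and OFS is filled with zeros.
   Returns the number of bytes written, or 0 if memory runs out. */
static size_t
mem_file_write_at (struct mem_file *f, const void *buf, size_t size,
                   size_t ofs)
{
  size_t end = ofs + size;

  assert (f != NULL && buf != NULL);
  if (size == 0)
    return 0;               /* An empty write never extends the file. */
  if (end > f->length)
    {
      unsigned char *grown = realloc (f->data, end);
      if (grown == NULL)
        return 0;
      if (ofs > f->length)
        memset (grown + f->length, 0, ofs - f->length);
      f->data = grown;
      f->length = end;
    }
  memcpy (f->data + ofs, buf, size);
  return size;
}
```
@end verbatim

Note that there is no seek function in the model: seeking only changes
the offset that the caller later passes in, so it cannot extend the file
by itself.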
 202
 203 Writing far beyond EOF can cause many blocks to be entirely zero.  Some
 204 file systems allocate and write real data blocks for these implicitly
 205 zeroed blocks.  Other file systems do not allocate these blocks at all
 206 until they are explicitly written.  The latter file systems are said to
 207 support ``sparse files.''  You may adopt either allocation strategy in
 208 your file system.
 209
 210 @node Subdirectories
 211 @subsection Subdirectories
 212
 213 Implement a hierarchical name space.  In the basic file system, all
 214 files live in a single directory.  Modify this to allow directory
 215 entries to point to files or to other directories.
 216
 217 Make sure that directories can expand beyond their original size just
 218 as any other file can.
 219
 220 The basic file system has a 14-character limit on file names.  You may
 221 retain this limit for individual file name components, or may extend
 222 it, at your option.  You must allow full path names to be
 223 much longer than 14 characters.
 224
 225 Maintain a separate current directory for each process.  At
 226 startup, set the root as the initial process's current directory.
 227 When one process starts another with the @code{exec} system call, the
 228 child process inherits its parent's current directory.  After that, the
  229 two processes' current directories are independent, so that either process
 230 changing its own current directory has no effect on the other.
 231 (This is why, under Unix, the @command{cd} command is a shell built-in,
 232 not an external program.)
 233
 234 Update the existing system calls so that, anywhere a file name is
  235 provided by the caller, an absolute or relative path name may be used.
 236 The directory separator character is forward slash (@samp{/}).
 237 You must also support special file names @file{.} and @file{..}, which
 238 have the same meanings as they do in Unix.
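
One possible way to walk a path name is to split it into components one
at a time.  The following is a sketch, not the required design:
@code{next_component}, @code{path_is_absolute}, and
@code{COMPONENT_MAX} are hypothetical names, and the 14-character
component limit is simply the basic file system's limit carried over.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define COMPONENT_MAX 14   /* Assumed limit on one path component. */

/* Returns true if PATH starts at the root directory. */
static bool
path_is_absolute (const char *path)
{
  return path[0] == '/';
}

/* Copies the next component of *PATHP into PART and advances *PATHP
   past it.  Runs of slashes count as a single separator.
   Returns 1 if a component was copied, 0 at the end of the path,
   or -1 if the component is longer than COMPONENT_MAX. */
static int
next_component (const char **pathp, char part[COMPONENT_MAX + 1])
{
  const char *src;
  size_t len = 0;

  assert (pathp != NULL && *pathp != NULL);
  src = *pathp;

  while (*src == '/')
    src++;
  if (*src == '\0')
    {
      *pathp = src;
      return 0;
    }
  while (*src != '/' && *src != '\0')
    {
      if (len >= COMPONENT_MAX)
        return -1;
      part[len++] = *src++;
    }
  part[len] = '\0';
  *pathp = src;
  return 1;
}
```
@end verbatim

A caller would start from the root directory when
@code{path_is_absolute} returns true and from the process's current
directory otherwise, then look up each component in turn.  The
components @file{.} and @file{..} come back like any other name, so the
lookup code decides how to resolve them.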
 239
 240 Update the @code{open} system call so that it can also open directories.
 241 Of the existing system calls, only @code{close} needs to accept a file
 242 descriptor for a directory.
 243
 244 Update the @code{remove} system call so that it can delete empty
 245 directories (other than the root) in addition to regular files.
 246 Directories may only be deleted if they do not contain any files or
 247 subdirectories (other than @file{.} and @file{..}).  You may decide
 248 whether to allow deletion of a directory that is open by a process or in
 249 use as a process's current working directory.  If it is allowed, then
 250 attempts to open files (including @file{.} and @file{..}) or create new
 251 files in a deleted directory must be disallowed.
 252
 253 Implement the following new system calls:
 254
 255 @deftypefn {System Call} bool chdir (const char *@var{dir})
 256 Changes the current working directory of the process to
 257 @var{dir}, which may be relative or absolute.  Returns true if
 258 successful, false on failure.
 259 @end deftypefn
 260
 261 @deftypefn {System Call} bool mkdir (const char *@var{dir})
 262 Creates the directory named @var{dir}, which may be
 263 relative or absolute.  Returns true if successful, false on failure.
 264 Fails if @var{dir} already exists or if any directory name in
 265 @var{dir}, besides the last, does not already exist.  That is,
 266 @code{mkdir("/a/b/c")} succeeds only if @file{/a/b} already exists and
 267 @file{/a/b/c} does not.
 268 @end deftypefn
 269
 270 @deftypefn {System Call} bool readdir (int @var{fd}, char *@var{name})
 271 Reads a directory entry from file descriptor @var{fd}, which must
 272 represent a directory.  If successful, stores the null-terminated file
 273 name in @var{name}, which must have room for @code{READDIR_MAX_LEN + 1}
 274 bytes, and returns true.  If no entries are left in the directory,
 275 returns false.
 276
 277 @file{.} and @file{..} should not be returned by @code{readdir}.
 278
 279 If the directory changes while it is open, then it is acceptable for
 280 some entries not to be read at all or to be read multiple times.
 281 Otherwise, each directory entry should be read once, in any order.
 282
 283 @code{READDIR_MAX_LEN} is defined in @file{lib/user/syscall.h}.  If your
 284 file system supports longer file names than the basic file system, you
 285 should increase this value from the default of 14.
 286 @end deftypefn
 287
 288 @deftypefn {System Call} bool isdir (int @var{fd})
 289 Returns true if @var{fd} represents a directory,
 290 false if it represents an ordinary file.
 291 @end deftypefn
 292
 293 @deftypefn {System Call} int inumber (int @var{fd})
 294 Returns the @dfn{inode number} of the inode associated with @var{fd},
 295 which may represent an ordinary file or a directory.
 296
 297 An inode number persistently identifies a file or directory.  It is
 298 unique during the file's existence.  In Pintos, the sector number of the
 299 inode is suitable for use as an inode number.
 300 @end deftypefn
 301
 302 We have provided @command{ls} and @command{mkdir} user programs, which
 303 are straightforward once the above syscalls are implemented.
 304 We have also provided @command{pwd}, which is not so straightforward.
 305 The @command{shell} program implements @command{cd} internally.
 306
 307 The @code{pintos} @option{extract} and @option{append} commands should now
 308 accept full path names, assuming that the directories used in the
 309 paths have already been created.  This should not require any significant
 310 extra effort on your part.
 311
 312 @node Buffer Cache
 313 @subsection Buffer Cache
 314
 315 Modify the file system to keep a cache of file blocks.  When a request
 316 is made to read or write a block, check to see if it is in the
 317 cache, and if so, use the cached data without going to
 318 disk.  Otherwise, fetch the block from disk into the cache, evicting an
 319 older entry if necessary.  You are limited to a cache no greater than 64
 320 sectors in size.
 321
 322 You must implement a cache replacement algorithm that is at least as
 323 good as the ``clock'' algorithm.  We encourage you to account for
 324 the generally greater value of metadata compared to data.  Experiment
 325 to see what combination of accessed, dirty, and other information
 326 results in the best performance, as measured by the number of disk
 327 accesses.
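
As a baseline for such experiments, the plain clock algorithm might be
sketched as follows.  @code{struct cache_entry}, @code{cache},
@code{clock_hand}, and @code{clock_pick_victim} are hypothetical names,
and the sketch deliberately omits locking, metadata preference, and the
write-back of a dirty victim.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SIZE 64   /* Upper bound given by the assignment. */

/* Hypothetical bookkeeping for one cached sector. */
struct cache_entry
  {
    bool in_use;       /* Does this slot hold a sector? */
    bool accessed;     /* Set on every read or write of the slot. */
    bool dirty;        /* Must be written back before reuse. */
    unsigned sector;   /* Sector number held in this slot. */
  };

static struct cache_entry cache[CACHE_SIZE];
static size_t clock_hand;   /* Next slot the clock will examine. */

/* Returns the index of the slot to reuse.  A free slot or a slot whose
   accessed bit is clear is chosen immediately.  Otherwise the accessed
   bit is cleared, giving the slot a second chance, and the hand moves
   on.  The loop ends within two sweeps of the cache. */
static size_t
clock_pick_victim (void)
{
  assert (clock_hand < CACHE_SIZE);
  for (;;)
    {
      size_t idx = clock_hand;
      struct cache_entry *e = &cache[idx];

      clock_hand = (clock_hand + 1) % CACHE_SIZE;
      if (!e->in_use || !e->accessed)
        return idx;
      e->accessed = false;
    }
}
```
@end verbatim

The caller is expected to write the victim back to disk if its dirty
flag is set before loading a new sector into the slot.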
 328
 329 You can keep a cached copy of the free map permanently in memory if you
 330 like.  It doesn't have to count against the cache size.
 331
 332 The provided inode code uses a ``bounce buffer'' allocated with
 333 @func{malloc} to translate the disk's sector-by-sector interface into
 334 the system call interface's byte-by-byte interface.  You should get rid
 335 of these bounce buffers.  Instead, copy data into and out of sectors in
 336 the buffer cache directly.
 337
 338 Your cache should be @dfn{write-behind}, that is,
 339 keep dirty blocks in the cache, instead of immediately writing modified
 340 data to disk.  Write dirty blocks to disk whenever they are evicted.
 341 Because write-behind makes your file system more fragile in the face of
 342 crashes, in addition you should periodically write all dirty, cached
 343 blocks back to disk.  The cache should also be written back to disk in
 344 @func{filesys_done}, so that halting Pintos flushes the cache.
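
The flush step itself can be quite small.  The sketch below assumes a
hypothetical @code{struct wb_entry} and uses an in-memory array,
@code{fake_disk}, as a stand-in for the real block device so that it can
run outside Pintos.  Synchronization is omitted.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512
#define FAKE_DISK_SECTORS 16   /* Size of the stand-in disk. */

/* Hypothetical cache slot holding one sector's data. */
struct wb_entry
  {
    bool in_use;
    bool dirty;
    unsigned sector;
    unsigned char data[SECTOR_SIZE];
  };

/* Stand-in for the disk; a real cache would call the block layer. */
static unsigned char fake_disk[FAKE_DISK_SECTORS][SECTOR_SIZE];

static void
fake_disk_write (unsigned sector, const void *data)
{
  assert (sector < FAKE_DISK_SECTORS);
  memcpy (fake_disk[sector], data, SECTOR_SIZE);
}

/* Writes every dirty entry among the CNT entries back to disk and
   marks it clean.  Returns the number of sectors written. */
static size_t
cache_flush_all (struct wb_entry *entries, size_t cnt)
{
  size_t written = 0;
  size_t i;

  for (i = 0; i < cnt; i++)
    if (entries[i].in_use && entries[i].dirty)
      {
        fake_disk_write (entries[i].sector, entries[i].data);
        entries[i].dirty = false;
        written++;
      }
  return written;
}
```
@end verbatim

In Pintos, a background thread could call a function like this in a
loop, sleeping between passes, and @func{filesys_done} could call it
once more at shutdown.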
 345
 346 If you have @func{timer_sleep} from the first project working, write-behind is
  347 an excellent application for it.  Otherwise, you may implement a less general
 348 facility, but make sure that it does not exhibit busy-waiting.
 349
 350 You should also implement @dfn{read-ahead}, that is,
 351 automatically fetch the next block of a file
 352 into the cache when one block of a file is read, in case that block is
 353 about to be read.
 354 Read-ahead is only really useful when done asynchronously.  That means,
 355 if a process requests disk block 1 from the file, it should block until disk
 356 block 1 is read in, but once that read is complete, control should
 357 return to the process immediately.  The read-ahead request for disk
 358 block 2 should be handled asynchronously, in the background.
 359
 360 @strong{We recommend integrating the cache into your design early.}  In
 361 the past, many groups have tried to tack the cache onto a design late in
 362 the design process.  This is very difficult.  These groups have often
 363 turned in projects that failed most or all of the tests.
 364
 365 @node File System Synchronization
 366 @subsection Synchronization
 367
 368 The provided file system requires external synchronization, that is,
 369 callers must ensure that only one thread can be running in the file
 370 system code at once.  Your submission must adopt a finer-grained
 371 synchronization strategy that does not require external synchronization.
 372 To the extent possible, operations on independent entities should be
 373 independent, so that they do not need to wait on each other.
 374
 375 Operations on different cache blocks must be independent.  In
 376 particular, when I/O is required on a particular block, operations on
 377 other blocks that do not require I/O should proceed without having to
 378 wait for the I/O to complete.
 379
 380 Multiple processes must be able to access a single file at once.
 381 Multiple reads of a single file must be able to complete without
 382 waiting for one another.  When writing to a file does not extend the
  383 file, multiple processes should also be able to write to a single file at
 384 once.  A read of a file by one process when the file is being written by
 385 another process is allowed to show that none, all, or part of the write
 386 has completed.  (However, after the @code{write} system call returns to
 387 its caller, all subsequent readers must see the change.)  Similarly,
 388 when two processes simultaneously write to the same part of a file,
 389 their data may be interleaved.
 390
 391 On the other hand, extending a file and writing data into the new
 392 section must be atomic.  Suppose processes A and B both have a given
 393 file open and both are positioned at end-of-file.  If A reads and B
 394 writes the file at the same time, A may read all, part, or none of what
 395 B writes.  However, A may not read data other than what B writes, e.g.@:
 396 if B's data is all nonzero bytes, A is not allowed to see any zeros.
 397
 398 Operations on different directories should take place concurrently.
 399 Operations on the same directory may wait for one another.
 400
 401 Keep in mind that only data shared by multiple threads needs to be
 402 synchronized.  In the base file system, @struct{file} and @struct{dir}
 403 are accessed only by a single thread.
 404
 405 @node Project 4 FAQ
 406 @section FAQ
 407
 408 @table @b
 409 @item How much code will I need to write?
 410
 411 Here's a summary of our reference solution, produced by the
 412 @command{diffstat} program.  The final row gives total lines inserted
 413 and deleted; a changed line counts as both an insertion and a deletion.
 414
 415 This summary is relative to the Pintos base code, but the reference
 416 solution for project 4 is based on the reference solution to project 3.
 417 Thus, the reference solution runs with virtual memory enabled.
 418 @xref{Project 3 FAQ}, for the summary of project 3.
 419
 420 The reference solution represents just one possible solution.  Many
 421 other solutions are also possible and many of those differ greatly from
 422 the reference solution.  Some excellent solutions may not modify all the
 423 files modified by the reference solution, and some may modify files not
 424 modified by the reference solution.
 425
 426 @verbatim
 427  Makefile.build       |    5
 428  devices/timer.c      |   42 ++
 429  filesys/Make.vars    |    6
 430  filesys/cache.c      |  473 +++++++++++++++++++++++++
 431  filesys/cache.h      |   23 +
 432  filesys/directory.c  |   99 ++++-
 433  filesys/directory.h  |    3
 434  filesys/file.c       |    4
 435  filesys/filesys.c    |  194 +++++++++-
 436  filesys/filesys.h    |    5
 437  filesys/free-map.c   |   45 +-
 438  filesys/free-map.h   |    4
 439  filesys/fsutil.c     |    8
 440  filesys/inode.c      |  444 ++++++++++++++++++-----
 441  filesys/inode.h      |   11
 442  threads/init.c       |    5
 443  threads/interrupt.c  |    2
 444  threads/thread.c     |   32 +
 445  threads/thread.h     |   38 +-
 446  userprog/exception.c |   12
 447  userprog/pagedir.c   |   10
 448  userprog/process.c   |  332 +++++++++++++----
 449  userprog/syscall.c   |  582 ++++++++++++++++++++++++++++++-
 450  userprog/syscall.h   |    1
 451  vm/frame.c           |  161 ++++++++
 452  vm/frame.h           |   23 +
 453  vm/page.c            |  297 +++++++++++++++
 454  vm/page.h            |   50 ++
 455  vm/swap.c            |   85 ++++
 456  vm/swap.h            |   11
 457  30 files changed, 2721 insertions(+), 286 deletions(-)
 458 @end verbatim
 459
 460 @item Can @code{BLOCK_SECTOR_SIZE} change?
 461
 462 No, @code{BLOCK_SECTOR_SIZE} is fixed at 512.  For IDE disks, this
 463 value is a fixed property of the hardware.  Other disks do not
 464 necessarily have a 512-byte sector, but for simplicity Pintos only
 465 supports those that do.
 466 @end table
 467
 468 @menu
 469 * Indexed Files FAQ::
 470 * Subdirectories FAQ::
 471 * Buffer Cache FAQ::
 472 @end menu
 473
 474 @node Indexed Files FAQ
 475 @subsection Indexed Files FAQ
 476
 477 @table @b
 478 @item What is the largest file size that we are supposed to support?
 479
 480 The file system partition we create will be 8 MB or smaller.  However,
 481 individual files will have to be smaller than the partition to
 482 accommodate the metadata.  You'll need to consider this when deciding
 483 your inode organization.
 484 @end table
 485
 486 @node Subdirectories FAQ
 487 @subsection Subdirectories FAQ
 488
 489 @table @b
 490 @item How should a file name like @samp{a//b} be interpreted?
 491
 492 Multiple consecutive slashes are equivalent to a single slash, so this
 493 file name is the same as @samp{a/b}.
 494
 495 @item How about a file name like @samp{/../x}?
 496
 497 The root directory is its own parent, so it is equivalent to @samp{/x}.
 498
 499 @item How should a file name that ends in @samp{/} be treated?
 500
 501 Most Unix systems allow a slash at the end of the name for a directory,
 502 and reject other names that end in slashes.  We will allow this
 503 behavior, as well as simply rejecting a name that ends in a slash.
 504 @end table
 505
 506 @node Buffer Cache FAQ
 507 @subsection Buffer Cache FAQ
 508
 509 @table @b
 510 @item Can we keep a @struct{inode_disk} inside @struct{inode}?
 511
 512 The goal of the 64-block limit is to bound the amount of cached file
 513 system data.  If you keep a block of disk data---whether file data or
 514 metadata---anywhere in kernel memory then you have to count it against
 515 the 64-block limit.  The same rule applies to anything that's
 516 ``similar'' to a block of disk data, such as a @struct{inode_disk}
 517 without the @code{length} or @code{sector_cnt} members.
 518
 519 That means you'll have to change the way the inode implementation
  520 accesses its corresponding on-disk inode, since it currently
 521 just embeds a @struct{inode_disk} in @struct{inode} and reads the
 522 corresponding sector from disk when it's created.  Keeping extra
 523 copies of inodes would subvert the 64-block limitation that we place
 524 on your cache.
 525
 526 You can store a pointer to inode data in @struct{inode}, but if you do
 527 so you should carefully make sure that this does not limit your OS to 64
 528 simultaneously open files.
 529 You can also store other information to help you find the inode when you
  530 need it.  Similarly, you may store some metadata alongside each of your 64
 531 cache entries.
 532
 533 You can keep a cached copy of the free map permanently in memory if you
 534 like.  It doesn't have to count against the cache size.
 535
 536 @func{byte_to_sector} in @file{filesys/inode.c} uses the
 537 @struct{inode_disk} directly, without first reading that sector from
 538 wherever it was in the storage hierarchy.  This will no longer work.
  539 You will need to change @func{byte_to_sector} to obtain the
 540 @struct{inode_disk} from the cache before using it.
 541 @end table