   1 @node Project 4--File Systems
   2 @chapter Project 4: File Systems
   3
   4 In the previous two assignments, you made extensive use of a
   5 file system without actually worrying about how it was implemented
   6 underneath.  For this last assignment, you will improve the
   7 implementation of the file system.  You will be working primarily in
   8 the @file{filesys} directory.
   9
  10 You may build project 4 on top of project 2 or project 3.  In either
  11 case, all of the functionality needed for project 2 must work in your
  12 filesys submission.  If you build on project 3, then all of the project
  13 3 functionality must work also, and you will need to edit
  14 @file{filesys/Make.vars} to enable VM functionality.  You can receive up
  15 to 5% extra credit if you do enable VM.
  16
  17 @menu
  18 * Project 4 Background::
  19 * Project 4 Requirements::
  20 * Project 4 FAQ::
  21 @end menu
  22
  23 @node Project 4 Background
  24 @section Background
  25
  26 @menu
  27 * File System New Code::
  28 * Testing File System Persistence::
  29 @end menu
  30
  31 @node File System New Code
  32 @subsection New Code
  33
  34 Here are some files that are probably new to you.  These are in the
  35 @file{filesys} directory except where indicated:
  36
  37 @table @file
  38 @item fsutil.c
  39 Simple utilities for the file system that are accessible from the
  40 kernel command line.
  41
  42 @item filesys.h
  43 @itemx filesys.c
  44 Top-level interface to the file system.  @xref{Using the File System},
  45 for an introduction.
  46
  47 @item directory.h
  48 @itemx directory.c
  49 Translates file names to inodes.  The directory data structure is
  50 stored as a file.
  51
  52 @item inode.h
  53 @itemx inode.c
  54 Manages the data structure representing the layout of a
  55 file's data on disk.
  56
  57 @item file.h
  58 @itemx file.c
  59 Translates file reads and writes to disk sector reads
  60 and writes.
  61
  62 @item lib/kernel/bitmap.h
  63 @itemx lib/kernel/bitmap.c
  64 A bitmap data structure along with routines for reading and writing
  65 the bitmap to disk files.
  66 @end table
  67
  68 Our file system has a Unix-like interface, so you may also wish to
  69 read the Unix man pages for @code{creat}, @code{open}, @code{close},
  70 @code{read}, @code{write}, @code{lseek}, and @code{unlink}.  Our file
  71 system has calls that are similar, but not identical, to these.  The
  72 file system translates these calls into disk operations.
  73
  74 All the basic functionality is there in the code above, so that the
  75 file system is usable from the start, as you've seen
  76 in the previous two projects.  However, it has severe limitations
  77 which you will remove.
  78
  79 While most of your work will be in @file{filesys}, you should be
  80 prepared for interactions with all previous parts.
  81
  82 @node Testing File System Persistence
  83 @subsection Testing File System Persistence
  84
  85 By now, you should be familiar with the basic process of running the
  86 Pintos tests.  @xref{Testing}, for review, if necessary.
  87
  88 Until now, each test invoked Pintos just once.  However, an important
  89 purpose of a file system is to ensure that data remains accessible from
  90 one boot to another.  Thus, the tests that are part of the file system
  91 project invoke Pintos a second time.  The second run combines all the
  92 files and directories in the file system into a single file, then copies
  93 that file out of the Pintos file system into the host (Unix) file
  94 system.
  95
  96 The grading scripts check the file system's correctness based on the
  97 contents of the file copied out in the second run.  This means that your
  98 project will not pass any of the extended file system tests until the
  99 file system is implemented well enough to support @command{tar}, the
 100 Pintos user program that produces the file that is copied out.  The
 101 @command{tar} program is fairly demanding (it requires both extensible
 102 file and subdirectory support), so this will take some work.  Until
 103 then, you can ignore errors from @command{make check} regarding the
 104 extracted file system.
 105
 106 Incidentally, as you may have surmised, the file format used for copying
 107 out the file system contents is the standard Unix ``tar'' format.  You
 108 can use the Unix @command{tar} program to examine the resulting files.  The tar file
 109 for test @var{t} is named @file{@var{t}.tar}.
 110
 111 @node Project 4 Requirements
 112 @section Requirements
 113
 114 @menu
 115 * Project 4 Design Document::
 116 * Indexed and Extensible Files::
 117 * Subdirectories::
 118 * Buffer Cache::
 119 * File System Synchronization::
 120 @end menu
 121
 122 @node Project 4 Design Document
 123 @subsection Design Document
 124
 125 Before you turn in your project, you must copy @uref{filesys.tmpl, , the
 126 project 4 design document template} into your source tree under the name
 127 @file{pintos/src/filesys/DESIGNDOC} and fill it in.  We recommend that
 128 you read the design document template before you start working on the
 129 project.  @xref{Project Documentation}, for a sample design document
 130 that goes along with a fictitious project.
 131
 132 @node Indexed and Extensible Files
 133 @subsection Indexed and Extensible Files
 134
 135 The basic file system allocates files as a single extent, making it
 136 vulnerable to external fragmentation: an
 137 @var{n}-block file may fail to allocate even though @var{n} blocks are
 138 free, because no @var{n} of them are contiguous.  Eliminate this problem by
 139 modifying the on-disk inode structure.  In practice, this probably means using
 140 an index structure with direct, indirect, and doubly indirect blocks.
 141 You are welcome to choose a different scheme as long as you explain the
 142 rationale for it in your design documentation, and as long as it does
 143 not suffer from external fragmentation (as does the extent-based file
 144 system we provide).
 145
 146 You can assume that the disk will not be larger than 8 MB.  You must
 147 support files as large as the disk (minus metadata).  Each inode is
 148 stored in one disk sector, limiting the number of block pointers that it
 149 can contain.  Supporting 8 MB files will require you to implement
 150 doubly-indirect blocks.
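
The need for a doubly indirect block follows from simple arithmetic,
sketched below in C.  The sketch assumes 512-byte sectors and 4-byte
sector numbers, so that one index block holds 128 pointers.  The inode
layout (@code{DIRECT_CNT} direct pointers, one indirect pointer, and one
doubly indirect pointer) and every name in the code are hypothetical
choices made for illustration, not part of the provided sources.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed constants, for illustration only. */
#define SECTOR_SIZE 512u                    /* Bytes per disk sector. */
#define PTRS_PER_SECTOR (SECTOR_SIZE / 4u)  /* 4-byte sector numbers. */
#define DIRECT_CNT 12u                      /* Hypothetical direct pointers. */

enum index_level
  {
    LEVEL_DIRECT,
    LEVEL_INDIRECT,
    LEVEL_DOUBLY_INDIRECT,
    LEVEL_TOO_BIG
  };

/* Returns the level of the index that covers byte offset POS. */
enum index_level
index_level_for (unsigned long pos)
{
  unsigned long block = pos / SECTOR_SIZE;

  if (block < DIRECT_CNT)
    return LEVEL_DIRECT;
  block -= DIRECT_CNT;
  if (block < PTRS_PER_SECTOR)
    return LEVEL_INDIRECT;
  block -= PTRS_PER_SECTOR;
  if (block < (unsigned long) PTRS_PER_SECTOR * PTRS_PER_SECTOR)
    return LEVEL_DOUBLY_INDIRECT;
  return LEVEL_TOO_BIG;
}

/* Returns the largest file size, in bytes, that the whole index
   can address under the assumptions above. */
unsigned long
max_file_bytes (void)
{
  unsigned long blocks;

  assert (PTRS_PER_SECTOR == 128);  /* 512 / 4. */
  blocks = DIRECT_CNT + PTRS_PER_SECTOR
           + (unsigned long) PTRS_PER_SECTOR * PTRS_PER_SECTOR;
  return blocks * SECTOR_SIZE;
}
```

Under these assumptions the direct and indirect pointers together reach
only 140 sectors (70 kB), while the doubly indirect block alone adds
128 * 128 sectors, which is exactly 8 MB.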
 151
 152 An extent-based file can only grow if it is followed by empty space, but
 153 indexed inodes make file growth possible whenever free space is
 154 available.  Implement file growth.  In the basic file system, the file
 155 size is specified when the file is created.  In most modern file
 156 systems, a file is initially created with size 0 and is then expanded
 157 every time a write is made off the end of the file.  Your file system
 158 must allow this.
 159
 160 There should be no predetermined limit on the size of a file, except
 161 that a file cannot exceed the size of the disk (minus metadata).  This
 162 also applies to the root directory file, which should now be allowed
 163 to expand beyond its initial limit of 16 files.
 164
 165 User programs are allowed to seek beyond the current end-of-file (EOF).  The
 166 seek itself does not extend the file.  Writing at a position past EOF
 167 extends the file to the position being written, and any gap between the
 168 previous EOF and the start of the write must be filled with zeros.  A
 169 read starting from a position past EOF returns no bytes.
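
The end-of-file rules above can be illustrated with a toy in-memory
file.  This is only a sketch of the required semantics: the
@code{toy_file} structure and both functions are invented for the
example, and a real implementation would allocate disk sectors rather
than call @code{realloc}.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy in-memory "file", used only to illustrate the EOF rules. */
struct toy_file
  {
    unsigned char *data;  /* Contents, LENGTH bytes. */
    size_t length;        /* Current end-of-file. */
  };

/* Reads up to SIZE bytes at offset POS into BUF.  Returns the number
   of bytes read, which is 0 when POS is at or past end-of-file. */
size_t
toy_read_at (const struct toy_file *f, void *buf, size_t size, size_t pos)
{
  assert (f != NULL && buf != NULL);
  if (pos >= f->length)
    return 0;
  if (size > f->length - pos)
    size = f->length - pos;
  memcpy (buf, f->data + pos, size);
  return size;
}

/* Writes SIZE bytes from BUF at offset POS, growing the file if
   needed.  Any gap between the old end-of-file and POS is filled
   with zeros.  A zero-length write never extends the file.
   Returns the number of bytes written, 0 on allocation failure. */
size_t
toy_write_at (struct toy_file *f, const void *buf, size_t size, size_t pos)
{
  assert (f != NULL && buf != NULL);
  if (size == 0)
    return 0;
  if (pos + size > f->length)
    {
      unsigned char *p = realloc (f->data, pos + size);
      if (p == NULL)
        return 0;
      if (pos > f->length)
        memset (p + f->length, 0, pos - f->length);  /* Zero the gap. */
      f->data = p;
      f->length = pos + size;
    }
  memcpy (f->data + pos, buf, size);
  return size;
}
```

Note that seeking is not modeled at all: only a write changes
@code{length}, which matches the rule that a seek by itself does not
extend the file.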
 170
 171 Writing far beyond EOF can cause many blocks to be entirely zero.  Some
 172 file systems allocate and write real data blocks for these implicitly
 173 zeroed blocks.  Other file systems do not allocate these blocks at all
 174 until they are explicitly written.  The latter file systems are said to
 175 support ``sparse files.''  You may adopt either allocation strategy in
 176 your file system.
 177
 178 @node Subdirectories
 179 @subsection Subdirectories
 180
 181 Implement a hierarchical name space.  In the basic file system, all
 182 files live in a single directory.  Modify this to allow directory
 183 entries to point to files or to other directories.
 184
 185 Make sure that directories can expand beyond their original size just
 186 as any other file can.
 187
 188 The basic file system has a 14-character limit on file names.  You may
 189 retain this limit for individual file name components, or may extend
 190 it, at your option.  You must allow full path names to be
 191 much longer than 14 characters.
 192
 193 Maintain a separate current directory for each process.  At
 194 startup, set the root as the initial process's current directory.
 195 When one process starts another with the @code{exec} system call, the
 196 child process inherits its parent's current directory.  After that, the
 197 two processes' current directories are independent, so that either
 198 process changing its own current directory has no effect on the other.
 199 (This is why, under Unix, the @command{cd} command is a shell built-in,
 200 not an external program.)
 201
 202 Update the existing system calls so that, anywhere a file name is
 203 provided by the caller, an absolute or relative path name may be used.
 204 The directory separator character is forward slash (@samp{/}).
 205 You must also support special file names @file{.} and @file{..}, which
 206 have the same meanings as they do in Unix.
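
Path resolution usually starts by splitting a path name into its
components.  The following is a minimal sketch of one way to do that;
the function names and the @code{NAME_MAX_LEN} constant are
hypothetical, and the 14-character limit is simply the basic file
system's limit.

```c
#include <assert.h>
#include <stddef.h>

#define NAME_MAX_LEN 14  /* Component limit of the basic file system. */

/* Extracts the next component of the path at *SRCP into PART, which
   must have room for NAME_MAX_LEN + 1 bytes, and advances *SRCP past
   it.  Returns 1 if a component was stored, 0 at the end of the path,
   or -1 if a component is longer than NAME_MAX_LEN. */
int
next_component (char part[NAME_MAX_LEN + 1], const char **srcp)
{
  const char *src;
  size_t len = 0;

  assert (part != NULL && srcp != NULL && *srcp != NULL);
  src = *srcp;

  /* Skip runs of slashes, so that "a//b" is read like "a/b". */
  while (*src == '/')
    src++;
  if (*src == '\0')
    {
      *srcp = src;
      return 0;
    }

  /* Copy characters up to the next slash or the end of the string. */
  while (*src != '/' && *src != '\0')
    {
      if (len == NAME_MAX_LEN)
        return -1;
      part[len++] = *src++;
    }
  part[len] = '\0';
  *srcp = src;
  return 1;
}

/* Returns nonzero if PATH is absolute, that is, if lookup should
   start at the root instead of the current directory. */
int
path_is_absolute (const char *path)
{
  assert (path != NULL);
  return path[0] == '/';
}
```

A caller would pick the starting directory with
@code{path_is_absolute}, then look up each component in turn.  If
every directory stores real entries for @file{.} and @file{..}, those
names need no special handling in the loop, although that is a design
choice and not a requirement.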
 207
 208 Update the @code{open} system call so that it can also open directories.
 209 Of the existing system calls, only @code{close} needs to accept a file
 210 descriptor for a directory.
 211
 212 Update the @code{remove} system call so that it can delete empty
 213 directories (other than the root) in addition to regular files.
 214 Directories may only be deleted if they do not contain any files or
 215 subdirectories (other than @file{.} and @file{..}).  You may decide
 216 whether to allow deletion of a directory that is open by a process or in
 217 use as a process's current working directory.  If it is allowed, then
 218 attempts to open files (including @file{.} and @file{..}) or create new
 219 files in a deleted directory must be disallowed.
 220
 221 Implement the following new system calls:
 222
 223 @deftypefn {System Call} bool chdir (const char *@var{dir})
 224 Changes the current working directory of the process to
 225 @var{dir}, which may be relative or absolute.  Returns true if
 226 successful, false on failure.
 227 @end deftypefn
 228
 229 @deftypefn {System Call} bool mkdir (const char *@var{dir})
 230 Creates the directory named @var{dir}, which may be
 231 relative or absolute.  Returns true if successful, false on failure.
 232 Fails if @var{dir} already exists or if any directory name in
 233 @var{dir}, besides the last, does not already exist.  That is,
 234 @code{mkdir("/a/b/c")} succeeds only if @file{/a/b} already exists and
 235 @file{/a/b/c} does not.
 236 @end deftypefn
 237
 238 @deftypefn {System Call} bool readdir (int @var{fd}, char *@var{name})
 239 Reads a directory entry from file descriptor @var{fd}, which must
 240 represent a directory.  If successful, stores the null-terminated file
 241 name in @var{name}, which must have room for @code{READDIR_MAX_LEN + 1}
 242 bytes, and returns true.  If no entries are left in the directory,
 243 returns false.
 244
 245 @file{.} and @file{..} should not be returned by @code{readdir}.
 246
 247 If the directory changes while it is open, then it is acceptable for
 248 some entries not to be read at all or to be read multiple times.
 249 Otherwise, each directory entry should be read once, in any order.
 250
 251 @code{READDIR_MAX_LEN} is defined in @file{lib/user/syscall.h}.  If your
 252 file system supports longer file names than the basic file system, you
 253 should increase this value from the default of 14.
 254 @end deftypefn
 255
 256 @deftypefn {System Call} bool isdir (int @var{fd})
 257 Returns true if @var{fd} represents a directory,
 258 false if it represents an ordinary file.
 259 @end deftypefn
 260
 261 @deftypefn {System Call} int inumber (int @var{fd})
 262 Returns the @dfn{inode number} of the inode associated with @var{fd},
 263 which may represent an ordinary file or a directory.
 264
 265 An inode number persistently identifies a file or directory.  It is
 266 unique during the file's existence.  In Pintos, the sector number of the
 267 inode is suitable for use as an inode number.
 268 @end deftypefn
 269
 270 We have provided @command{ls} and @command{mkdir} user programs, which
 271 are straightforward once the above syscalls are implemented.
 272 We have also provided @command{pwd}, which is not so straightforward.
 273 The @command{shell} program implements @command{cd} internally.
 274
 275 The @code{pintos} @option{put} and @option{get} commands should now
 276 accept full path names, assuming that the directories used in the
 277 paths have already been created.  This should not require any significant
 278 extra effort on your part.
 279
 280 @node Buffer Cache
 281 @subsection Buffer Cache
 282
 283 Modify the file system to keep a cache of file blocks.  When a request
 284 is made to read or write a block, check to see if it is in the
 285 cache, and if so, use the cached data without going to
 286 disk.  Otherwise, fetch the block from disk into cache, evicting an
 287 older entry if necessary.  You are limited to a cache no greater than 64
 288 sectors in size.
 289
 290 You must implement a cache replacement algorithm that is at least as
 291 good as the ``clock'' algorithm.  Your algorithm must also account for
 292 the generally greater value of metadata compared to data.  Experiment
 293 to see what combination of accessed, dirty, and other information
 294 results in the best performance, as measured by the number of disk
 295 accesses.
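
As a baseline for those experiments, the plain ``clock'' algorithm can
be sketched as follows.  The structures and names are hypothetical, the
sector data is omitted, and the sketch deliberately ignores both
locking and the preference for metadata that your own algorithm must
add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SIZE 64  /* The project's limit on cached sectors. */

/* Hypothetical bookkeeping for one cache slot. */
struct cache_entry
  {
    bool in_use;      /* Slot holds a valid sector. */
    bool accessed;    /* Referenced since the hand last passed. */
    bool dirty;       /* Modified since it was read from disk. */
    unsigned sector;  /* Disk sector cached in this slot. */
  };

struct cache
  {
    struct cache_entry entries[CACHE_SIZE];
    size_t hand;      /* Next slot the clock hand examines. */
  };

/* Chooses a slot to reuse and returns its index.  A free slot is
   taken at once.  Otherwise the hand sweeps forward, clearing
   accessed bits, until it reaches an entry whose bit is already
   clear.  The caller must write the victim back if it is dirty. */
size_t
clock_pick_victim (struct cache *c)
{
  assert (c != NULL);
  for (;;)
    {
      size_t i = c->hand;
      struct cache_entry *e = &c->entries[i];

      c->hand = (c->hand + 1) % CACHE_SIZE;
      if (!e->in_use || !e->accessed)
        return i;
      e->accessed = false;  /* Give the entry a second chance. */
    }
}
```

The loop always terminates: after at most one full sweep every accessed
bit is clear, so the second sweep returns the first slot it examines.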
 296
 297 You can keep a cached copy of the free map permanently in memory if you
 298 like.  It doesn't have to count against the cache size.
 299
 300 The provided inode code uses a ``bounce buffer'' allocated with
 301 @func{malloc} to translate the disk's sector-by-sector interface into
 302 the system call interface's byte-by-byte interface.  You should get rid
 303 of these bounce buffers.  Instead, copy data into and out of sectors in
 304 the buffer cache directly.
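
Copying directly to and from cached sectors means breaking each
byte-range request into pieces that each fall within one sector.  The
sketch below computes the first such piece; the @code{chunk} structure
and @code{first_chunk} are hypothetical names, and the calculation is
only one possible way to organize it.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 512

/* One piece of a transfer that lies within a single sector. */
struct chunk
  {
    size_t sector_idx;  /* Index of the sector within the file. */
    size_t sector_ofs;  /* Starting byte offset within that sector. */
    size_t size;        /* Bytes to copy to or from that sector. */
  };

/* Computes the first chunk of a transfer of SIZE bytes at file
   offset POS in a file of LENGTH bytes.  A chunk size of 0 means
   that nothing is left to transfer. */
struct chunk
first_chunk (size_t pos, size_t size, size_t length)
{
  struct chunk c;
  size_t file_left = pos < length ? length - pos : 0;
  size_t sector_left;

  c.sector_idx = pos / SECTOR_SIZE;
  c.sector_ofs = pos % SECTOR_SIZE;
  sector_left = SECTOR_SIZE - c.sector_ofs;

  /* The chunk is bounded by the request, the file, and the sector. */
  c.size = size;
  if (c.size > file_left)
    c.size = file_left;
  if (c.size > sector_left)
    c.size = sector_left;

  assert (c.size <= size);
  return c;
}
```

A read loop would copy @code{size} bytes starting at
@code{sector_ofs} within the cached copy of the sector, advance the
file offset by the same amount, and repeat until the chunk size
reaches 0.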
 305
 306 Your cache should be @dfn{write-behind}, that is,
 307 keep dirty blocks in the cache, instead of immediately writing modified
 308 data to disk.  Write dirty blocks to disk whenever they are evicted.
 309 Because write-behind makes your file system more fragile in the face of
 310 crashes, in addition you should periodically write all dirty, cached
 311 blocks back to disk.  The cache should also be written back to disk in
 312 @func{filesys_done}, so that halting Pintos flushes the cache.
 313
 314 If you have @func{timer_sleep} from the first project working, write-behind is
 315 an excellent application for it.  Otherwise, you may implement a less general
 316 facility, but make sure that it does not exhibit busy-waiting.
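
The flush step shared by periodic write-behind and by shutdown might
look like the sketch below.  Everything here is hypothetical: the
@code{wb_entry} structure, the callback type, and
@code{cache_flush_all} are illustration only, and locking is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one cache slot; sector data omitted. */
struct wb_entry
  {
    bool in_use;      /* Slot holds a valid sector. */
    bool dirty;       /* Modified since it was last written to disk. */
    unsigned sector;  /* Disk sector cached in this slot. */
  };

/* Type of a callback that writes one cached sector to disk. */
typedef void write_sector_func (unsigned sector);

/* Writes every dirty, in-use entry among the CNT entries in ENTRIES
   back to disk through WRITE_SECTOR, marking each one clean.
   WRITE_SECTOR may be null, in which case entries are only marked
   clean.  Returns the number of entries written back. */
size_t
cache_flush_all (struct wb_entry *entries, size_t cnt,
                 write_sector_func *write_sector)
{
  size_t written = 0;
  size_t i;

  assert (entries != NULL);
  for (i = 0; i < cnt; i++)
    if (entries[i].in_use && entries[i].dirty)
      {
        if (write_sector != NULL)
          write_sector (entries[i].sector);
        entries[i].dirty = false;
        written++;
      }
  return written;
}
```

A background thread could alternate between a call like this and
@func{timer_sleep}, which avoids busy-waiting, and
@func{filesys_done} could make one final call before Pintos halts.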
 317
 318 You should also implement @dfn{read-ahead}, that is,
 319 automatically fetch the next block of a file
 320 into the cache when one block of a file is read, in case that block is
 321 about to be read.
 322 Read-ahead is only really useful when done asynchronously.  That means,
 323 if a process requests disk block 1 from the file, it should block until disk
 324 block 1 is read in, but once that read is complete, control should
 325 return to the process immediately.  The read-ahead request for disk
 326 block 2 should be handled asynchronously, in the background.
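
One way to hand read-ahead requests to a background thread is a small
bounded queue, sketched below.  The queue, its capacity, and the
function names are assumptions made for the example.  The
synchronization that a real implementation needs (a lock, plus some
way for the background thread to sleep while the queue is empty) is
omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RA_QUEUE_SIZE 16  /* Arbitrary capacity chosen for the sketch. */

/* Bounded FIFO of sectors waiting to be prefetched. */
struct ra_queue
  {
    unsigned sectors[RA_QUEUE_SIZE];
    size_t head;   /* Index of the oldest request. */
    size_t count;  /* Number of queued requests. */
  };

/* Queues SECTOR for prefetching.  Never waits: if the queue is full
   the request is dropped, because read-ahead is only a hint.
   Returns true if the request was queued. */
bool
ra_request (struct ra_queue *q, unsigned sector)
{
  assert (q != NULL);
  if (q->count == RA_QUEUE_SIZE)
    return false;
  q->sectors[(q->head + q->count) % RA_QUEUE_SIZE] = sector;
  q->count++;
  return true;
}

/* Removes the oldest request and stores it in *SECTOR.  Returns
   false if the queue is empty, in which case the background thread
   would go back to waiting. */
bool
ra_take (struct ra_queue *q, unsigned *sector)
{
  assert (q != NULL && sector != NULL);
  if (q->count == 0)
    return false;
  *sector = q->sectors[q->head];
  q->head = (q->head + 1) % RA_QUEUE_SIZE;
  q->count--;
  return true;
}
```

The reading process calls only @code{ra_request}, which returns
immediately, so it never waits for the prefetch of the next block.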
 327
 328 @strong{We recommend integrating the cache into your design early.}  In
 329 the past, many groups have tried to tack the cache onto a design late in
 330 the design process.  This is very difficult.  These groups have often
 331 turned in projects that failed most or all of the tests.
 332
 333 @node File System Synchronization
 334 @subsection Synchronization
 335
 336 The provided file system requires external synchronization, that is,
 337 callers must ensure that only one thread can be running in the file
 338 system code at once.  Your submission must adopt a finer-grained
 339 synchronization strategy that does not require external synchronization.
 340 To the extent possible, operations on independent entities should be
 341 independent, so that they do not need to wait on each other.
 342
 343 Operations on different cache blocks must be independent.  In
 344 particular, when I/O is required on a particular block, operations on
 345 other blocks that do not require I/O should proceed without having to
 346 wait for the I/O to complete.
 347
 348 Multiple processes must be able to access a single file at once.
 349 Multiple reads of a single file must be able to complete without
 350 waiting for one another.  When writing to a file does not extend the
 351 file, multiple processes should also be able to write a single file at
 352 once.  A read of a file by one process when the file is being written by
 353 another process is allowed to show that none, all, or part of the write
 354 has completed.  (However, after the @code{write} system call returns to
 355 its caller, all subsequent readers must see the change.)  Similarly,
 356 when two processes simultaneously write to the same part of a file,
 357 their data may be interleaved.
 358
 359 On the other hand, extending a file and writing data into the new
 360 section must be atomic.  Suppose processes A and B both have a given
 361 file open and both are positioned at end-of-file.  If A reads and B
 362 writes the file at the same time, A may read all, part, or none of what
 363 B writes.  However, A may not read data other than what B writes, e.g.@:
 364 if B's data is all nonzero bytes, A is not allowed to see any zeros.
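
One way to obtain this guarantee is to write the new data first and
publish the larger length last, so that a reader bounded by the
published length can never reach bytes that have not been written.
The sketch below shows only that ordering.  It uses C11 atomics as a
stand-in for whatever kernel synchronization you choose, a fixed-size
in-memory buffer in place of disk sectors, and it assumes that at most
one thread extends the file at a time (for example, by holding a
per-inode lock).  All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define CAPACITY 4096  /* Fixed size of the toy file's storage. */

struct growing_file
  {
    unsigned char data[CAPACITY];
    atomic_size_t length;  /* Number of bytes readers may see. */
  };

/* Appends up to SIZE bytes from BUF.  The data is copied first and
   the new length is published afterward.  Assumes the caller has
   excluded other extenders.  Returns the number of bytes appended. */
size_t
gf_append (struct growing_file *f, const void *buf, size_t size)
{
  size_t old;

  assert (f != NULL && buf != NULL);
  old = atomic_load_explicit (&f->length, memory_order_relaxed);
  if (size > CAPACITY - old)
    size = CAPACITY - old;
  memcpy (f->data + old, buf, size);
  atomic_store_explicit (&f->length, old + size, memory_order_release);
  return size;
}

/* Reads up to SIZE bytes at offset POS into BUF, never going past
   the published length.  Returns the number of bytes read. */
size_t
gf_read_at (struct growing_file *f, void *buf, size_t size, size_t pos)
{
  size_t len;

  assert (f != NULL && buf != NULL);
  len = atomic_load_explicit (&f->length, memory_order_acquire);
  if (pos >= len)
    return 0;
  if (size > len - pos)
    size = len - pos;
  memcpy (buf, f->data + pos, size);
  return size;
}
```

In this sketch a concurrent reader sees either none or all of an
append, which is stricter than the requirement but always within it.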
 365
 366 Operations on different directories should take place concurrently.
 367 Operations on the same directory may wait for one another.
 368
 369 @node Project 4 FAQ
 370 @section FAQ
 371
 372 @table @b
 373 @item How much code will I need to write?
 374
 375 Here's a summary of our reference solution, produced by the
 376 @command{diffstat} program.  The final row gives total lines inserted
 377 and deleted; a changed line counts as both an insertion and a deletion.
 378
 379 This summary is relative to the Pintos base code, but the reference
 380 solution for project 4 is based on the reference solution to project 3.
 381 Thus, the reference solution runs with virtual memory enabled.
 382 @xref{Project 3 FAQ}, for the summary of project 3.
 383
 384 The reference solution represents just one possible solution.  Many
 385 other solutions are also possible and many of those differ greatly from
 386 the reference solution.  Some excellent solutions may not modify all the
 387 files modified by the reference solution, and some may modify files not
 388 modified by the reference solution.
 389
 390 @verbatim
 391  Makefile.build       |    5
 392  devices/timer.c      |   42 ++
 393  filesys/Make.vars    |    6
 394  filesys/cache.c      |  473 +++++++++++++++++++++++++
 395  filesys/cache.h      |   23 +
 396  filesys/directory.c  |   99 ++++-
 397  filesys/directory.h  |    3
 398  filesys/file.c       |    4
 399  filesys/filesys.c    |  194 +++++++++-
 400  filesys/filesys.h    |    5
 401  filesys/free-map.c   |   45 +-
 402  filesys/free-map.h   |    4
 403  filesys/fsutil.c     |    8
 404  filesys/inode.c      |  444 ++++++++++++++++++-----
 405  filesys/inode.h      |   11
 406  threads/init.c       |    5
 407  threads/interrupt.c  |    2
 408  threads/thread.c     |   32 +
 409  threads/thread.h     |   38 +-
 410  userprog/exception.c |   12
 411  userprog/pagedir.c   |   10
 412  userprog/process.c   |  332 +++++++++++++----
 413  userprog/syscall.c   |  582 ++++++++++++++++++++++++++++++-
 414  userprog/syscall.h   |    1
 415  vm/frame.c           |  161 ++++++++
 416  vm/frame.h           |   23 +
 417  vm/page.c            |  297 +++++++++++++++
 418  vm/page.h            |   50 ++
 419  vm/swap.c            |   85 ++++
 420  vm/swap.h            |   11
 421  30 files changed, 2721 insertions(+), 286 deletions(-)
 422 @end verbatim
 423
 424 @item Can @code{DISK_SECTOR_SIZE} change?
 425
 426 No, @code{DISK_SECTOR_SIZE} is fixed at 512.  This is a fixed property
 427 of IDE disk hardware.
 428 @end table
 429
 430 @menu
 431 * Indexed Files FAQ::
 432 * Subdirectories FAQ::
 433 * Buffer Cache FAQ::
 434 @end menu
 435
 436 @node Indexed Files FAQ
 437 @subsection Indexed Files FAQ
 438
 439 @table @b
 440 @item What is the largest file size that we are supposed to support?
 441
 442 The disk we create will be 8 MB or smaller.  However, individual files
 443 will have to be smaller than the disk to accommodate the metadata.
 444 You'll need to consider this when deciding your inode organization.
 445 @end table
 446
 447 @node Subdirectories FAQ
 448 @subsection Subdirectories FAQ
 449
 450 @table @b
 451 @item How should a file name like @samp{a//b} be interpreted?
 452
 453 Multiple consecutive slashes are equivalent to a single slash, so this
 454 file name is the same as @samp{a/b}.
 455
 456 @item How about a file name like @samp{/../x}?
 457
 458 The root directory is its own parent, so it is equivalent to @samp{/x}.
 459
 460 @item How should a file name that ends in @samp{/} be treated?
 461
 462 Most Unix systems allow a slash at the end of the name for a directory,
 463 and reject other names that end in slashes.  We will accept either this
 464 behavior or simply rejecting every name that ends in a slash.
 465 @end table
 466
 467 @node Buffer Cache FAQ
 468 @subsection Buffer Cache FAQ
 469
 470 @table @b
 471 @item Can we keep a @struct{inode_disk} inside @struct{inode}?
 472
 473 The goal of the 64-block limit is to bound the amount of cached file
 474 system data.  If you keep a block of disk data---whether file data or
 475 metadata---anywhere in kernel memory then you have to count it against
 476 the 64-block limit.  The same rule applies to anything that's
 477 ``similar'' to a block of disk data, such as a @struct{inode_disk}
 478 without the @code{length} or @code{sector_cnt} members.
 479
 480 That means you'll have to change the way the inode implementation
 481 accesses its corresponding on-disk inode right now, since it currently
 482 just embeds a @struct{inode_disk} in @struct{inode} and reads the
 483 corresponding sector from disk when it's created.  Keeping extra
 484 copies of inodes would subvert the 64-block limitation that we place
 485 on your cache.
 486
 487 You can store a pointer to inode data in @struct{inode}, but if you do
 488 so you should carefully make sure that this does not limit your OS to 64
 489 simultaneously open files.
 490 You can also store other information to help you find the inode when you
 491 need it.  Similarly, you may store some metadata alongside each of your 64
 492 cache entries.
 493
 494 You can keep a cached copy of the free map permanently in memory if you
 495 like.  It doesn't have to count against the cache size.
 496
 497 @func{byte_to_sector} in @file{filesys/inode.c} uses the
 498 @struct{inode_disk} directly, without first reading that sector from
 499 wherever it was in the storage hierarchy.  This will no longer work.
 500 You will need to change @func{byte_to_sector} to obtain the
 501 @struct{inode_disk} from the cache before using it.
 502 @end table