   1 @node Project 4--File Systems
   2 @chapter Project 4: File Systems
   3
   4 In the previous two assignments, you made extensive use of a
   5 file system without actually worrying about how it was implemented
   6 underneath.  For this last assignment, you will improve the
   7 implementation of the file system.  You will be working primarily in
   8 the @file{filesys} directory.
   9
  10 You may build project 4 on top of project 2 or project 3.  In either
  11 case, all of the functionality needed for project 2 must work in your
  12 filesys submission.  If you build on project 3, then all of the project
  13 3 functionality must work also, and you will need to edit
  14 @file{filesys/Make.vars} to enable VM functionality.  You can receive up
  15 to 5% extra credit if you do enable VM.
  16
  17 The tests for project 4 will run much faster if
  18 you use the qemu emulator, e.g.@: via @code{make check
  19 PINTOSOPTS='--qemu'}.
  20
  21 @menu
  22 * Project 4 Background::
  23 * Project 4 Requirements::
  24 * Project 4 FAQ::
  25 @end menu
  26
  27 @node Project 4 Background
  28 @section Background
  29
  30 @menu
  31 * File System New Code::
  32 @end menu
  33
  34 @node File System New Code
  35 @subsection New Code
  36
  37 Here are some files that are probably new to you.  These are in the
  38 @file{filesys} directory except where indicated:
  39
  40 @table @file
  41 @item fsutil.c
  42 Simple utilities for the file system that are accessible from the
  43 kernel command line.
  44
  45 @item filesys.h
  46 @itemx filesys.c
  47 Top-level interface to the file system.  @xref{Using the File System},
  48 for an introduction.
  49
  50 @item directory.h
  51 @itemx directory.c
  52 Translates file names to inodes.  The directory data structure is
  53 stored as a file.
  54
  55 @item inode.h
  56 @itemx inode.c
  57 Manages the data structure representing the layout of a
  58 file's data on disk.
  59
  60 @item file.h
  61 @itemx file.c
  62 Translates file reads and writes to disk sector reads
  63 and writes.
  64
  65 @item lib/kernel/bitmap.h
  66 @itemx lib/kernel/bitmap.c
  67 A bitmap data structure along with routines for reading and writing
  68 the bitmap to disk files.
  69 @end table
  70
  71 Our file system has a Unix-like interface, so you may also wish to
  72 read the Unix man pages for @code{creat}, @code{open}, @code{close},
  73 @code{read}, @code{write}, @code{lseek}, and @code{unlink}.  Our file
  74 system has calls that are similar, but not identical, to these.  The
  75 file system translates these calls into disk operations.
  76
  77 All the basic functionality is there in the code above, so that the
  78 file system is usable from the start, as you've seen
  79 in the previous two projects.  However, it has severe limitations
  80 which you will remove.
  81
  82 While most of your work will be in @file{filesys}, you should be
  83 prepared for interactions with all previous parts.
  84
  85 @node Project 4 Requirements
  86 @section Requirements
  87
  88 @menu
  89 * Project 4 Design Document::
  90 * Indexed and Extensible Files::
  91 * Subdirectories::
  92 * Buffer Cache::
  93 * File System Synchronization::
  94 @end menu
  95
  96 @node Project 4 Design Document
  97 @subsection Design Document
  98
  99 Before you turn in your project, you must copy @uref{filesys.tmpl, , the
 100 project 4 design document template} into your source tree under the name
 101 @file{pintos/src/filesys/DESIGNDOC} and fill it in.  We recommend that
 102 you read the design document template before you start working on the
 103 project.  @xref{Project Documentation}, for a sample design document
 104 that goes along with a fictitious project.
 105
 106 @node Indexed and Extensible Files
 107 @subsection Indexed and Extensible Files
 108
 109 The basic file system allocates files as a single extent, making it
 110 vulnerable to external fragmentation; that is, it is possible that an
 111 @var{n}-block file cannot be allocated even though @var{n} blocks are
 112 free.  Eliminate this problem by
 113 modifying the on-disk inode structure.  In practice, this probably means using
 114 an index structure with direct, indirect, and doubly indirect blocks.
 115 You are welcome to choose a different scheme as long as you explain the
 116 rationale for it in your design documentation, and as long as it does
 117 not suffer from external fragmentation (as does the extent-based file
 118 system we provide).
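
As a rough illustration of the direct, indirect, and doubly indirect scheme mentioned above, the sketch below maps a byte offset to the index level that would hold its block pointer.  The names @code{locate_block}, @code{struct index_path}, and @code{DIRECT_CNT}, the choice of 12 direct pointers, and the assumption of 4-byte sector numbers (128 per 512-byte sector) are all hypothetical and not part of the Pintos base code.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 512        /* Matches DISK_SECTOR_SIZE in Pintos. */
#define PTRS_PER_SECTOR 128    /* Assumes 4-byte sector numbers. */
#define DIRECT_CNT 12          /* Hypothetical number of direct pointers. */

enum index_level
  {
    LEVEL_DIRECT,
    LEVEL_INDIRECT,
    LEVEL_DOUBLY,
    LEVEL_OUT_OF_RANGE
  };

/* Where a data block's pointer lives.  OUTER is used only for
   doubly indirect blocks; INNER is the slot in the final table. */
struct index_path
  {
    enum index_level level;
    size_t outer;
    size_t inner;
  };

/* Maps byte offset POS to the index slot for its data block. */
static struct index_path
locate_block (size_t pos)
{
  struct index_path p = { LEVEL_OUT_OF_RANGE, 0, 0 };
  size_t blk = pos / SECTOR_SIZE;

  if (blk < DIRECT_CNT)
    {
      p.level = LEVEL_DIRECT;
      p.inner = blk;
      return p;
    }
  blk -= DIRECT_CNT;

  if (blk < PTRS_PER_SECTOR)
    {
      p.level = LEVEL_INDIRECT;
      p.inner = blk;
      return p;
    }
  blk -= PTRS_PER_SECTOR;

  if (blk < (size_t) PTRS_PER_SECTOR * PTRS_PER_SECTOR)
    {
      p.level = LEVEL_DOUBLY;
      p.outer = blk / PTRS_PER_SECTOR;
      p.inner = blk % PTRS_PER_SECTOR;
      assert (p.inner < PTRS_PER_SECTOR);
      return p;
    }

  return p;   /* Beyond what this layout can address. */
}
```

A real implementation would follow the returned path through the cached index blocks to find the data sector, allocating index blocks on demand when a file grows.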
 119
 120 You can assume that the disk will not be larger than 8 MB.  You must
 121 support files as large as the disk (minus metadata).  Each inode is
 122 stored in one disk sector, limiting the number of block pointers that it
 123 can contain.  Supporting 8 MB files will require you to implement
 124 doubly indirect blocks.
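
The arithmetic behind this requirement can be checked with a short calculation.  The sketch below assumes 512-byte sectors and 4-byte sector numbers, so that one index sector holds 128 pointers; the helper @code{max_file_bytes} is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum
  {
    SECTOR_SIZE = 512,                        /* DISK_SECTOR_SIZE. */
    PTR_SIZE = 4,                             /* Assumed sector number size. */
    PTRS_PER_SECTOR = SECTOR_SIZE / PTR_SIZE  /* 128 pointers per sector. */
  };

/* Returns the largest file, in bytes, addressable by an inode
   holding the given numbers of direct, indirect, and doubly
   indirect pointers. */
static uint64_t
max_file_bytes (unsigned direct, unsigned indirect, unsigned doubly)
{
  uint64_t sectors;

  /* All of the pointers must fit in the inode's single sector. */
  assert (direct + indirect + doubly <= PTRS_PER_SECTOR);

  sectors = direct
            + (uint64_t) indirect * PTRS_PER_SECTOR
            + (uint64_t) doubly * PTRS_PER_SECTOR * PTRS_PER_SECTOR;
  return sectors * SECTOR_SIZE;
}
```

Under these assumptions, @code{max_file_bytes (0, 125, 0)} is 8,192,000 bytes, which falls short of 8 MB (8,388,608 bytes) even though almost the whole inode is spent on indirect pointers, whereas a single doubly indirect pointer, @code{max_file_bytes (0, 0, 1)}, already covers exactly 8 MB.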
 125
 126 An extent-based file can only grow if it is followed by empty space, but
 127 indexed inodes make file growth possible whenever free space is
 128 available.  Implement file growth.  In the basic file system, the file
 129 size is specified when the file is created.  In most modern file
 130 systems, a file is initially created with size 0 and is then expanded
 131 every time a write is made off the end of the file.  Your file system
 132 must allow this.
 133
 134 There should be no predetermined limit on the size of a file, except
 135 that a file cannot exceed the size of the disk (minus metadata).  This
 136 also applies to the root directory file, which should now be allowed
 137 to expand beyond its initial limit of 16 files.
 138
 139 User programs are allowed to seek beyond the current end-of-file (EOF).  The
 140 seek itself does not extend the file.  Writing at a position past EOF
 141 extends the file to the position being written, and any gap between the
 142 previous EOF and the start of the write must be filled with zeros.  A
 143 read starting from a position past EOF returns no bytes.
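
The seek and zero-fill semantics above can be modeled with a file held entirely in memory.  This is only a sketch: @code{struct mem_file}, @code{mem_file_write_at}, and @code{mem_file_read_at} are hypothetical names, a real implementation would allocate disk sectors rather than call @code{realloc}, and treating a zero-length write as a no-op is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A file modeled as a growable byte array. */
struct mem_file
  {
    unsigned char *data;
    size_t length;
  };

/* Writes SIZE bytes from BUF at offset POS, growing the file if
   needed.  Any gap between the old EOF and POS is zero-filled.
   Returns the number of bytes written. */
static size_t
mem_file_write_at (struct mem_file *f, const void *buf, size_t size,
                   size_t pos)
{
  size_t end = pos + size;

  assert (f != NULL && buf != NULL);
  if (size == 0)
    return 0;   /* Assumption: an empty write does not extend the file. */

  if (end > f->length)
    {
      unsigned char *p = realloc (f->data, end);
      if (p == NULL)
        return 0;
      if (pos > f->length)
        memset (p + f->length, 0, pos - f->length);
      f->data = p;
      f->length = end;
    }
  memcpy (f->data + pos, buf, size);
  return size;
}

/* Reads up to SIZE bytes at offset POS into BUF.  Returns the number
   of bytes read, which is 0 when POS is at or past EOF. */
static size_t
mem_file_read_at (const struct mem_file *f, void *buf, size_t size,
                  size_t pos)
{
  assert (f != NULL && buf != NULL);
  if (pos >= f->length)
    return 0;
  if (size > f->length - pos)
    size = f->length - pos;
  memcpy (buf, f->data + pos, size);
  return size;
}
```

Note that there is no separate seek operation in the model: a seek only changes the position later passed as @code{pos}, which is why it cannot extend the file by itself.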
 144
 145 Writing far beyond EOF can cause many blocks to be entirely zero.  Some
 146 file systems allocate and write real data blocks for these implicitly
 147 zeroed blocks.  Other file systems do not allocate these blocks at all
 148 until they are explicitly written.  The latter file systems are said to
 149 support ``sparse files.''  You may adopt either allocation strategy in
 150 your file system.
 151
 152 @node Subdirectories
 153 @subsection Subdirectories
 154
 155 Implement a hierarchical name space.  In the basic file system, all
 156 files live in a single directory.  Modify this to allow directory
 157 entries to point to files or to other directories.
 158
 159 Make sure that directories can expand beyond their original size just
 160 as any other file can.
 161
 162 The basic file system has a 14-character limit on file names.  You may
 163 retain this limit for individual file name components, or may extend
 164 it, at your option.  You must allow full path names to be
 165 much longer than 14 characters.
 166
 167 Maintain a separate current directory for each process.  At
 168 startup, set the root as the initial process's current directory.
 169 When one process starts another with the @code{exec} system call, the
 170 child process inherits its parent's current directory.  After that, the
 171 two processes' current directories are independent, so that either
 172 process changing its own current directory has no effect on the other.
 173 (This is why, under Unix, the @command{cd} command is a shell built-in,
 174 not an external program.)
 175
 176 Update the existing system calls so that, anywhere a file name is
 177 provided by the caller, an absolute or relative path name may be used.
 178 The directory separator character is forward slash (@samp{/}).
 179 You must also support special file names @file{.} and @file{..}, which
 180 have the same meanings as they do in Unix.
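
One way to approach path handling is to walk the name one component at a time, starting from the root for absolute names and from the process's current directory otherwise.  The sketch below shows only the splitting step.  @code{next_component}, @code{path_is_absolute}, and @code{COMPONENT_MAX} are hypothetical names, and the 14-character limit is the base file system's default, which you may have extended.

```c
#include <assert.h>
#include <string.h>

#define COMPONENT_MAX 14   /* Per-component limit of the base file system. */

/* Returns nonzero if PATH should be resolved from the root. */
static int
path_is_absolute (const char *path)
{
  return path[0] == '/';
}

/* Copies the next component of *PATHP into PART and advances *PATHP
   past it.  Runs of slashes count as a single separator.
   Returns 1 if a component was stored, 0 at the end of the path,
   and -1 if the component is longer than COMPONENT_MAX. */
static int
next_component (const char **pathp, char part[COMPONENT_MAX + 1])
{
  const char *p;
  size_t len;

  assert (pathp != NULL && *pathp != NULL && part != NULL);
  p = *pathp;

  while (*p == '/')
    p++;
  if (*p == '\0')
    {
      *pathp = p;
      return 0;
    }

  len = strcspn (p, "/");
  if (len > COMPONENT_MAX)
    return -1;

  memcpy (part, p, len);
  part[len] = '\0';
  *pathp = p + len;
  return 1;
}
```

The caller would look up each returned component in the directory reached so far.  The components @file{.} and @file{..} come back like any other name, so they can be handled either as real directory entries or as special cases in the lookup.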
 181
 182 Update the @code{remove} system call so that it can delete empty
 183 directories in addition to regular files.  Directories may only be
 184 deleted if they do not contain any files or subdirectories (other than
 185 @file{.} and @file{..}).
 186
 187 Update the @code{open} system call so that it can also open directories.
 188 Of the existing system calls, only @code{close} needs to accept a file
 189 descriptor for a directory.
 190
 191 Implement the following new system calls:
 192
 193 @deftypefn {System Call} bool chdir (const char *@var{dir})
 194 Changes the current working directory of the process to
 195 @var{dir}, which may be relative or absolute.  Returns true if
 196 successful, false on failure.
 197 @end deftypefn
 198
 199 @deftypefn {System Call} bool mkdir (const char *@var{dir})
 200 Creates the directory named @var{dir}, which may be
 201 relative or absolute.  Returns true if successful, false on failure.
 202 Fails if @var{dir} already exists or if any directory name in
 203 @var{dir}, besides the last, does not already exist.  That is,
 204 @code{mkdir("/a/b/c")} succeeds only if @file{/a/b} already exists and
 205 @file{/a/b/c} does not.
 206 @end deftypefn
 207
 208 @deftypefn {System Call} bool readdir (int @var{fd}, char *@var{name})
 209 Reads a directory entry from file descriptor @var{fd}, which must
 210 represent a directory.  If successful, stores the null-terminated file
 211 name in @var{name}, which must have room for @code{READDIR_MAX_LEN + 1}
 212 bytes, and returns true.  If no entries are left in the directory,
 213 returns false.
 214
 215 @file{.} and @file{..} should not be returned by @code{readdir}.
 216
 217 If the directory changes while it is open, then it is acceptable for
 218 some entries not to be read at all or to be read multiple times.
 219 Otherwise, each directory entry should be read once, in any order.
 220
 221 @code{READDIR_MAX_LEN} is defined in @file{lib/user/syscall.h}.  If your
 222 file system supports longer file names than the basic file system, you
 223 should increase this value from the default of 14.
 224 @end deftypefn
 225
 226 @deftypefn {System Call} bool isdir (int @var{fd})
 227 Returns true if @var{fd} represents a directory,
 228 false if it represents an ordinary file.
 229 @end deftypefn
 230
 231 @deftypefn {System Call} int inumber (int @var{fd})
 232 Returns the @dfn{inode number} of the inode associated with @var{fd}.
 233 Applicable to file descriptors for both files and directories.
 234
 235 An inode number persistently identifies a file or directory.  It is
 236 unique during the file's existence.  In Pintos, the sector number of the
 237 inode is suitable for use as an inode number.
 238 @end deftypefn
 239
 240 We have provided @command{ls} and @command{mkdir} user programs, which
 241 are straightforward once the above syscalls are implemented.
 242 We have also provided @command{pwd}, which is not so straightforward.
 243 The @command{shell} program implements @command{cd} internally.
 244
 245 The @code{pintos} @option{put} and @option{get} commands should now
 246 accept full path names, assuming that the directories used in the
 247 paths have already been created.  This should not require any significant
 248 extra effort on your part.
 249
 250 @node Buffer Cache
 251 @subsection Buffer Cache
 252
 253 Modify the file system to keep a cache of file blocks.  When a request
 254 is made to read or write a block, check to see if it is in the
 255 cache, and if so, use the cached data without going to
 256 disk.  Otherwise, fetch the block from disk into cache, evicting an
 257 older entry if necessary.  You are limited to a cache no greater than 64
 258 sectors in size.
 259
 260 Be sure to choose an intelligent cache replacement algorithm.
 261 Experiment to see what combination of accessed, dirty, and other
 262 information results in the best performance, as measured by the number
 263 of disk accesses.  For example, metadata is generally more valuable to
 264 cache than data.
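
As a starting point for experiments, the sketch below implements the clock (second-chance) policy over 64 slots using only the accessed bit.  It is one possible baseline, not the reference solution's policy; @code{struct cache_slot}, @code{slots}, @code{clock_hand}, and @code{cache_pick_victim} are hypothetical names, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 64   /* Maximum cache size allowed by the project. */

struct cache_slot
  {
    bool in_use;       /* Does this slot hold a sector? */
    bool accessed;     /* Set on every read or write of the slot. */
    bool dirty;        /* Must be written back before reuse. */
    unsigned sector;   /* Disk sector held in the slot. */
  };

static struct cache_slot slots[CACHE_SLOTS];
static size_t clock_hand;

/* Returns the index of the slot to reuse.  Free slots are preferred.
   Otherwise the clock hand sweeps the slots, clearing accessed bits
   and choosing the first slot whose bit is already clear.  The caller
   is responsible for writing the victim back if it is dirty. */
static size_t
cache_pick_victim (void)
{
  size_t i;

  assert (clock_hand < CACHE_SLOTS);

  for (i = 0; i < CACHE_SLOTS; i++)
    if (!slots[i].in_use)
      return i;

  for (;;)
    {
      size_t idx = clock_hand;
      clock_hand = (clock_hand + 1) % CACHE_SLOTS;

      if (slots[idx].accessed)
        slots[idx].accessed = false;   /* Give it a second chance. */
      else
        return idx;
    }
}
```

The loop always terminates, because one full sweep clears every accessed bit.  Variations worth measuring include preferring clean victims over dirty ones and giving metadata sectors extra chances.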
 265
 266 You can keep a cached copy of the free map permanently in memory if you
 267 like.  It doesn't have to count against the cache size.
 268
 269 The provided inode code uses a ``bounce buffer'' allocated with
 270 @func{malloc} to translate the disk's sector-by-sector interface into
 271 the system call interface's byte-by-byte interface.  You should get rid
 272 of these bounce buffers.  Instead, copy data into and out of sectors in
 273 the buffer cache directly.
 274
 275 Your cache should be @dfn{write-behind}, that is,
 276 keep dirty blocks in the cache, instead of immediately writing modified
 277 data to disk.  Write dirty blocks to disk whenever they are evicted.
 278 Because write-behind makes your file system more fragile in the face of
 279 crashes, in addition you should periodically write all dirty, cached
 280 blocks back to disk.  The cache should also be written back to disk in
 281 @func{filesys_done}, so that halting Pintos flushes the cache.
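
The write-behind behavior can be sketched against an in-memory stand-in for the disk.  Everything here is hypothetical: the @code{disk} array, @code{struct slot}, @code{cache_modify}, @code{slot_write_back}, and @code{cache_flush_all} exist only for illustration, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SECTOR_SIZE 512
#define CACHE_SLOTS 64
#define DISK_SECTORS 128   /* Size of the simulated disk. */

/* Stand-in for the real disk, plus a counter of sector writes. */
static unsigned char disk[DISK_SECTORS][SECTOR_SIZE];
static unsigned disk_writes;

struct slot
  {
    bool in_use;
    bool dirty;
    unsigned sector;
    unsigned char data[SECTOR_SIZE];
  };

static struct slot cache[CACHE_SLOTS];

/* Modifies a cached sector.  Nothing reaches the disk yet; the slot
   is only marked dirty. */
static void
cache_modify (size_t idx, unsigned sector, unsigned char byte)
{
  assert (idx < CACHE_SLOTS && sector < DISK_SECTORS);
  cache[idx].in_use = true;
  cache[idx].sector = sector;
  cache[idx].data[0] = byte;
  cache[idx].dirty = true;
}

/* Writes slot S to disk if it is dirty.  Used on eviction. */
static void
slot_write_back (struct slot *s)
{
  if (s->in_use && s->dirty)
    {
      memcpy (disk[s->sector], s->data, SECTOR_SIZE);
      disk_writes++;
      s->dirty = false;
    }
}

/* Writes every dirty slot to disk.  Intended to be called
   periodically and at shutdown. */
static void
cache_flush_all (void)
{
  size_t i;

  for (i = 0; i < CACHE_SLOTS; i++)
    slot_write_back (&cache[i]);
}
```

In a kernel, a dedicated thread could loop forever, sleeping for some interval and then calling a routine like @code{cache_flush_all}; the same routine could be called when the file system shuts down.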
 282
 283 If you have @func{timer_sleep} from the first project working, write-behind is
 284 an excellent application.  If you're still using the base
 285 implementation of @func{timer_sleep}, be aware that it busy-waits, which
 286 is not acceptable here (or elsewhere).  If @func{timer_sleep}'s delays seem too
 287 short or too long, reread the explanation of the @option{-r} option to
 288 @command{pintos} (@pxref{Debugging versus Testing}).
 289
 290 You should also implement @dfn{read-ahead}, that is,
 291 automatically fetch the next block of a file
 292 into the cache when one block of a file is read, in case that block is
 293 about to be read.
 294 Read-ahead is only really useful when done asynchronously.  That means,
 295 if a process requests disk block 1 from the file, it should block until disk
 296 block 1 is read in, but once that read is complete, control should
 297 return to the process immediately.  The read-ahead request for disk
 298 block 2 should be handled asynchronously, in the background.
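
The hand-off between the reading process and the background reader can be modeled as a bounded queue of sector numbers.  The sketch below shows only the queue; @code{read_ahead_request} and @code{read_ahead_next} are hypothetical names, and the lock and condition variable that a real kernel would need around the queue are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RA_QUEUE_LEN 16   /* Arbitrary bound on pending requests. */

static unsigned ra_queue[RA_QUEUE_LEN];
static size_t ra_head;    /* Index of the oldest pending request. */
static size_t ra_count;   /* Number of pending requests. */

/* Called by the reader after its own block has arrived.  Never
   waits: if the queue is full the hint is simply dropped, since
   read-ahead is only an optimization. */
static bool
read_ahead_request (unsigned sector)
{
  if (ra_count == RA_QUEUE_LEN)
    return false;
  ra_queue[(ra_head + ra_count) % RA_QUEUE_LEN] = sector;
  ra_count++;
  return true;
}

/* Called by the background thread.  Stores the next sector to
   prefetch in *SECTOR and returns true, or returns false if no
   request is pending. */
static bool
read_ahead_next (unsigned *sector)
{
  assert (sector != NULL);
  if (ra_count == 0)
    return false;
  *sector = ra_queue[ra_head];
  ra_head = (ra_head + 1) % RA_QUEUE_LEN;
  ra_count--;
  return true;
}
```

With this split, the process that asked for block 1 pays only the cost of enqueueing a request for block 2, while the background thread performs the actual disk read and places the result in the cache.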
 299
 300 @strong{We recommend integrating the cache into your design early.}  In
 301 the past, many groups have tried to tack the cache onto a design late in
 302 the design process.  This is very difficult.  These groups have often
 303 turned in projects that failed most or all of the tests.
 304
 305 @node File System Synchronization
 306 @subsection Synchronization
 307
 308 The provided file system requires external synchronization, that is,
 309 callers must ensure that only one thread can be running in the file
 310 system code at once.  Your submission must adopt a finer-grained
 311 synchronization strategy that does not require external synchronization.
 312 To the extent possible, operations on independent entities should be
 313 independent, so that they do not need to wait on each other.
 314
 315 Operations on different cache blocks must be independent.  In
 316 particular, when I/O is required on a particular block, operations on
 317 other blocks that do not require I/O should proceed without having to
 318 wait for the I/O to complete.
 319
 320 Multiple processes must be able to access a single file at once.
 321 Multiple reads of a single file must be able to complete without
 322 waiting for one another.  When writing to a file does not extend the
 323 file, multiple processes should also be able to write a single file at
 324 once.  A read of a file by one process when the file is being written by
 325 another process is allowed to show that none, all, or part of the write
 326 has completed.  (However, after the @code{write} system call returns to
 327 its caller, all subsequent readers must see the change.)  Similarly,
 328 when two processes simultaneously write to the same part of a file,
 329 their data may be interleaved.
 330
 331 On the other hand, extending a file and writing data into the new
 332 section must be atomic.  Suppose processes A and B both have a given
 333 file open and both are positioned at end-of-file.  If A reads and B
 334 writes the file at the same time, A may read all, part, or none of what
 335 B writes.  However, A may not read data other than what B writes, e.g.@:
 336 if B's data is all nonzero bytes, A is not allowed to see any zeros.
 337
 338 Operations on different directories should take place concurrently.
 339 Operations on the same directory may wait for one another.
 340
 341 @node Project 4 FAQ
 342 @section FAQ
 343
 344 @table @b
 345 @item How much code will I need to write?
 346
 347 Here's a summary of our reference solution, produced by the
 348 @command{diffstat} program.  The final row gives total lines inserted
 349 and deleted; a changed line counts as both an insertion and a deletion.
 350
 351 This summary is relative to the Pintos base code, but the reference
 352 solution for project 4 is based on the reference solution to project 3.
 353 Thus, the reference solution runs with virtual memory enabled.
 354 @xref{Project 3 FAQ}, for the summary of project 3.
 355
 356 The reference solution represents just one possible solution.  Many
 357 other solutions are also possible and many of those differ greatly from
 358 the reference solution.  Some excellent solutions may not modify all the
 359 files modified by the reference solution, and some may modify files not
 360 modified by the reference solution.
 361
 362 @verbatim
 363  Makefile.build       |    5
 364  devices/timer.c      |   42 ++
 365  filesys/Make.vars    |    6
 366  filesys/cache.c      |  473 +++++++++++++++++++++++++
 367  filesys/cache.h      |   23 +
 368  filesys/directory.c  |   99 ++++-
 369  filesys/directory.h  |    3
 370  filesys/file.c       |    4
 371  filesys/filesys.c    |  194 +++++++++-
 372  filesys/filesys.h    |    5
 373  filesys/free-map.c   |   45 +-
 374  filesys/free-map.h   |    4
 375  filesys/fsutil.c     |    8
 376  filesys/inode.c      |  444 ++++++++++++++++++-----
 377  filesys/inode.h      |   11
 378  threads/init.c       |    5
 379  threads/interrupt.c  |    2
 380  threads/thread.c     |   32 +
 381  threads/thread.h     |   38 +-
 382  userprog/exception.c |   12
 383  userprog/pagedir.c   |   10
 384  userprog/process.c   |  332 +++++++++++++----
 385  userprog/syscall.c   |  582 ++++++++++++++++++++++++++++++-
 386  userprog/syscall.h   |    1
 387  vm/frame.c           |  161 ++++++++
 388  vm/frame.h           |   23 +
 389  vm/page.c            |  297 +++++++++++++++
 390  vm/page.h            |   50 ++
 391  vm/swap.c            |   85 ++++
 392  vm/swap.h            |   11
 393  30 files changed, 2721 insertions(+), 286 deletions(-)
 394 @end verbatim
 395
 396 @item Can @code{DISK_SECTOR_SIZE} change?
 397
 398 No, @code{DISK_SECTOR_SIZE} is fixed at 512.  This is a fixed property
 399 of IDE disk hardware.
 400 @end table
 401
 402 @menu
 403 * Indexed Files FAQ::
 404 * Subdirectories FAQ::
 405 * Buffer Cache FAQ::
 406 @end menu
 407
 408 @node Indexed Files FAQ
 409 @subsection Indexed Files FAQ
 410
 411 @table @b
 412 @item What is the largest file size that we are supposed to support?
 413
 414 The disk we create will be 8 MB or smaller.  However, individual files
 415 will have to be smaller than the disk to accommodate the metadata.
 416 You'll need to consider this when deciding your inode organization.
 417 @end table
 418
 419 @node Subdirectories FAQ
 420 @subsection Subdirectories FAQ
 421
 422 @table @b
 423 @item How should a file name like @samp{a//b} be interpreted?
 424
 425 Multiple consecutive slashes are equivalent to a single slash, so this
 426 file name is the same as @samp{a/b}.
 427
 428 @item How about a file name like @samp{/../x}?
 429
 430 The root directory is its own parent, so it is equivalent to @samp{/x}.
 431
 432 @item How should a file name that ends in @samp{/} be treated?
 433
 434 Most Unix systems allow a slash at the end of the name for a directory,
 435 and reject other names that end in slashes.  We will accept either this
 436 behavior or simply rejecting every name that ends in a slash.
 437 @end table
 438
 439 @node Buffer Cache FAQ
 440 @subsection Buffer Cache FAQ
 441
 442 @table @b
 443 @item Can we keep a @struct{inode_disk} inside @struct{inode}?
 444
 445 The goal of the 64-block limit is to bound the amount of cached file
 446 system data.  If you keep a block of disk data---whether file data or
 447 metadata---anywhere in kernel memory then you have to count it against
 448 the 64-block limit.  The same rule applies to anything that's
 449 ``similar'' to a block of disk data, such as a @struct{inode_disk}
 450 without the @code{length} or @code{sector_cnt} members.
 451
 452 That means you'll have to change the way the inode implementation
 453 accesses its corresponding on-disk inode right now, since it currently
 454 just embeds a @struct{inode_disk} in @struct{inode} and reads the
 455 corresponding sector from disk when it's created.  Keeping extra
 456 copies of inodes would subvert the 64-block limitation that we place
 457 on your cache.
 458
 459 You can store a pointer to inode data in @struct{inode}, but if you do
 460 so you should carefully make sure that this does not limit your OS to 64
 461 simultaneously open files.
 462 You can also store other information to help you find the inode when you
 463 need it.  Similarly, you may store some metadata alongside each of your 64
 464 cache entries.
 465
 466 You can keep a cached copy of the free map permanently in memory if you
 467 like.  It doesn't have to count against the cache size.
 468
 469 @func{byte_to_sector} in @file{filesys/inode.c} uses the
 470 @struct{inode_disk} directly, without first reading that sector from
 471 wherever it was in the storage hierarchy.  This will no longer work.
 472 You will need to change @func{byte_to_sector} to obtain the
 473 @struct{inode_disk} from the cache before using it.
 474 @end table