   1 @node Project 4--File Systems
   2 @chapter Project 4: File Systems
   3
   4 In the previous two assignments, you made extensive use of a
   5 file system without actually worrying about how it was implemented
   6 underneath.  For this last assignment, you will improve the
   7 implementation of the file system.  You will be working primarily in
   8 the @file{filesys} directory.
   9
  10 You may build project 4 on top of project 2 or project 3.  In either
  11 case, all of the functionality needed for project 2 must work in your
  12 filesys submission.  If you build on project 3, then all of the project
  13 3 functionality must work also, and you will need to edit
  14 @file{filesys/Make.vars} to enable VM functionality.  You can receive up
  15 to 5% extra credit if you do enable VM.
  16
  17 @menu
  18 * Project 4 Background::
  19 * Project 4 Requirements::
  20 * Project 4 FAQ::
  21 @end menu
  22
  23 @node Project 4 Background
  24 @section Background
  25
  26 @menu
  27 * File System New Code::
  28 @end menu
  29
  30 @node File System New Code
  31 @subsection New Code
  32
  33 Here are some files that are probably new to you.  These are in the
  34 @file{filesys} directory except where indicated:
  35
  36 @table @file
  37 @item fsutil.c
  38 Simple utilities for the file system that are accessible from the
  39 kernel command line.
  40
  41 @item filesys.h
  42 @itemx filesys.c
  43 Top-level interface to the file system.  @xref{Using the File System},
  44 for an introduction.
  45
  46 @item directory.h
  47 @itemx directory.c
  48 Translates file names to inodes.  The directory data structure is
  49 stored as a file.
  50
  51 @item inode.h
  52 @itemx inode.c
  53 Manages the data structure representing the layout of a
  54 file's data on disk.
  55
  56 @item file.h
  57 @itemx file.c
  58 Translates file reads and writes to disk sector reads
  59 and writes.
  60
  61 @item lib/kernel/bitmap.h
  62 @itemx lib/kernel/bitmap.c
  63 A bitmap data structure along with routines for reading and writing
  64 the bitmap to disk files.
  65 @end table
  66
  67 Our file system has a Unix-like interface, so you may also wish to
  68 read the Unix man pages for @code{creat}, @code{open}, @code{close},
  69 @code{read}, @code{write}, @code{lseek}, and @code{unlink}.  Our file
  70 system has calls that are similar, but not identical, to these.  The
  71 file system translates these calls into disk operations.
  72
  73 All the basic functionality is there in the code above, so that the
  74 file system is usable from the start, as you've seen
  75 in the previous two projects.  However, it has severe limitations
  76 which you will remove.
  77
  78 While most of your work will be in @file{filesys}, you should be
  79 prepared for interactions with all previous parts.
  80
  81 @node Project 4 Requirements
  82 @section Requirements
  83
  84 @menu
  85 * Project 4 Design Document::
  86 * Indexed and Extensible Files::
  87 * Subdirectories::
  88 * Buffer Cache::
  89 * File System Synchronization::
  90 @end menu
  91
  92 @node Project 4 Design Document
  93 @subsection Design Document
  94
  95 Before you turn in your project, you must copy @uref{filesys.tmpl, , the
  96 project 4 design document template} into your source tree under the name
  97 @file{pintos/src/filesys/DESIGNDOC} and fill it in.  We recommend that
  98 you read the design document template before you start working on the
  99 project.  @xref{Project Documentation}, for a sample design document
 100 that goes along with a fictitious project.
 101
 102 @node Indexed and Extensible Files
 103 @subsection Indexed and Extensible Files
 104
  105 The basic file system allocates files as a single extent, making it
  106 vulnerable to external fragmentation: an @var{n}-block file may be
  107 impossible to allocate even though @var{n} blocks are free.  Eliminate
  108 this problem by
 109 modifying the on-disk inode structure.  In practice, this probably means using
 110 an index structure with direct, indirect, and doubly indirect blocks.
 111 You are welcome to choose a different scheme as long as you explain the
 112 rationale for it in your design documentation, and as long as it does
 113 not suffer from external fragmentation (as does the extent-based file
 114 system we provide).
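
As a rough illustration of the index structure suggested above, the
sketch below maps a byte offset to a slot in a direct, indirect, or
doubly indirect block.  It assumes 512-byte sectors and 4-byte sector
numbers, so that one index block holds 128 pointers.  The names
@code{DIRECT_CNT}, @code{struct index_path}, and @code{offset_to_path}
are hypothetical and chosen only for this example; they are not part of
the provided code, and the number of direct pointers is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE ((size_t) 512)
#define PTRS_PER_SECTOR (SECTOR_SIZE / 4)  /* 128 four-byte sector numbers. */
#define DIRECT_CNT ((size_t) 12)           /* Hypothetical direct-pointer count. */

enum level { LVL_DIRECT, LVL_INDIRECT, LVL_DOUBLE, LVL_TOO_BIG };

/* Where a byte offset lives in the index structure.
   idx[0] is the slot in the first-level array or block;
   idx[1] is used only for doubly indirect lookups. */
struct index_path
  {
    enum level level;
    size_t idx[2];
  };

static struct index_path
offset_to_path (size_t pos)
{
  struct index_path p = { LVL_TOO_BIG, { 0, 0 } };
  size_t blk = pos / SECTOR_SIZE;   /* Logical block number in the file. */

  if (blk < DIRECT_CNT)
    {
      p.level = LVL_DIRECT;
      p.idx[0] = blk;
      return p;
    }
  blk -= DIRECT_CNT;

  if (blk < PTRS_PER_SECTOR)
    {
      p.level = LVL_INDIRECT;
      p.idx[0] = blk;
      return p;
    }
  blk -= PTRS_PER_SECTOR;

  if (blk < PTRS_PER_SECTOR * PTRS_PER_SECTOR)
    {
      p.level = LVL_DOUBLE;
      p.idx[0] = blk / PTRS_PER_SECTOR;  /* Which indirect block. */
      p.idx[1] = blk % PTRS_PER_SECTOR;  /* Slot within that block. */
      assert (p.idx[0] < PTRS_PER_SECTOR);
      return p;
    }

  return p;                         /* Beyond what this layout can address. */
}
```

With this layout, offset 0 falls in direct slot 0, and the first byte
after the direct and singly indirect ranges falls in slot 0 of the first
block under the doubly indirect block.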
 115
 116 You can assume that the disk will not be larger than 8 MB.  You must
 117 support files as large as the disk (minus metadata).  Each inode is
 118 stored in one disk sector, limiting the number of block pointers that it
 119 can contain.  Supporting 8 MB files will require you to implement
 120 doubly-indirect blocks.
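
The arithmetic behind this requirement can be checked with a small
helper.  The sketch assumes 512-byte sectors and 4-byte sector numbers,
and the function @code{max_file_bytes} is a hypothetical name used only
here.  It computes the largest file that a given mix of direct,
indirect, and doubly indirect pointers can address.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u
#define PTRS_PER_SECTOR (SECTOR_SIZE / sizeof (uint32_t))

/* Largest file, in bytes, addressable by an inode holding DIRECT direct
   pointers, INDIRECT singly indirect pointers, and DBL doubly indirect
   pointers. */
static uint64_t
max_file_bytes (unsigned direct, unsigned indirect, unsigned dbl)
{
  uint64_t sectors;

  /* Holds for 512-byte sectors and 4-byte sector numbers. */
  assert (PTRS_PER_SECTOR == 128);

  sectors = direct
            + (uint64_t) indirect * PTRS_PER_SECTOR
            + (uint64_t) dbl * PTRS_PER_SECTOR * PTRS_PER_SECTOR;
  return sectors * SECTOR_SIZE;
}
```

Supposing, purely for illustration, that 125 pointer slots fit in the
inode sector, @code{max_file_bytes (125, 0, 0)} gives 64,000 bytes and
@code{max_file_bytes (0, 125, 0)} gives 8,192,000 bytes, which is still
short of 8 MB (8,388,608 bytes).  A single doubly indirect pointer,
@code{max_file_bytes (0, 0, 1)}, addresses exactly 8,388,608 bytes.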
 121
 122 An extent-based file can only grow if it is followed by empty space, but
 123 indexed inodes make file growth possible whenever free space is
 124 available.  Implement file growth.  In the basic file system, the file
 125 size is specified when the file is created.  In most modern file
 126 systems, a file is initially created with size 0 and is then expanded
 127 every time a write is made off the end of the file.  Your file system
 128 must allow this.
 129
 130 There should be no predetermined limit on the size of a file, except
 131 that a file cannot exceed the size of the disk (minus metadata).  This
 132 also applies to the root directory file, which should now be allowed
 133 to expand beyond its initial limit of 16 files.
 134
 135 User programs are allowed to seek beyond the current end-of-file (EOF).  The
 136 seek itself does not extend the file.  Writing at a position past EOF
 137 extends the file to the position being written, and any gap between the
 138 previous EOF and the start of the write must be filled with zeros.  A
 139 read starting from a position past EOF returns no bytes.
 140
 141 Writing far beyond EOF can cause many blocks to be entirely zero.  Some
 142 file systems allocate and write real data blocks for these implicitly
 143 zeroed blocks.  Other file systems do not allocate these blocks at all
 144 until they are explicitly written.  The latter file systems are said to
 145 support ``sparse files.''  You may adopt either allocation strategy in
 146 your file system.
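
The seek and write rules above can be modeled with a toy in-memory
file.  This is only a sketch of the required semantics, not of an
on-disk implementation: @code{struct toy_file}, @code{toy_write_at},
and @code{toy_read_at} are hypothetical names, and the model always
materializes the zeroed gap rather than keeping the file sparse.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A file modeled as one growable byte array. */
struct toy_file
  {
    unsigned char *data;
    size_t length;
  };

/* Writes SIZE bytes from BUF at offset OFS.  If the write ends past EOF,
   the file grows, and any gap between the old EOF and OFS is zeroed.
   Returns the number of bytes written, or 0 if memory runs out. */
static size_t
toy_write_at (struct toy_file *f, const void *buf, size_t size, size_t ofs)
{
  size_t end = ofs + size;

  assert (end >= ofs);              /* No arithmetic overflow. */
  if (size == 0)
    return 0;                       /* A zero-length write never extends. */

  if (end > f->length)
    {
      unsigned char *grown = realloc (f->data, end);
      if (grown == NULL)
        return 0;
      if (ofs > f->length)
        memset (grown + f->length, 0, ofs - f->length);
      f->data = grown;
      f->length = end;
    }
  memcpy (f->data + ofs, buf, size);
  return size;
}

/* Reads up to SIZE bytes at offset OFS into BUF.  A read that starts at
   or past EOF returns no bytes and does not change the file. */
static size_t
toy_read_at (const struct toy_file *f, void *buf, size_t size, size_t ofs)
{
  if (ofs >= f->length)
    return 0;
  if (size > f->length - ofs)
    size = f->length - ofs;
  memcpy (buf, f->data + ofs, size);
  return size;
}
```

Starting from an empty file, writing two bytes at offset 5 yields a
7-byte file whose first five bytes read back as zeros, while a read at
offset 100 before that write returns nothing and leaves the length at 0.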
 147
 148 @node Subdirectories
 149 @subsection Subdirectories
 150
 151 Implement a hierarchical name space.  In the basic file system, all
 152 files live in a single directory.  Modify this to allow directory
 153 entries to point to files or to other directories.
 154
 155 Make sure that directories can expand beyond their original size just
 156 as any other file can.
 157
 158 The basic file system has a 14-character limit on file names.  You may
 159 retain this limit for individual file name components, or may extend
 160 it, at your option.  You must allow full path names to be
 161 much longer than 14 characters.
 162
 163 Maintain a separate current directory for each process.  At
 164 startup, set the root as the initial process's current directory.
 165 When one process starts another with the @code{exec} system call, the
 166 child process inherits its parent's current directory.  After that, the
 167 two processes' current directories are independent, so that either
 168 changing its own current directory has no effect on the other.
 169 (This is why, under Unix, the @command{cd} command is a shell built-in,
 170 not an external program.)
 171
 172 Update the existing system calls so that, anywhere a file name is
  173 provided by the caller, an absolute or relative path name may be used.
 174 The directory separator character is forward slash (@samp{/}).
 175 You must also support special file names @file{.} and @file{..}, which
 176 have the same meanings as they do in Unix.
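
One possible way to walk a path name is to peel off one component at a
time and look each one up in the directory reached so far.  The sketch
below shows only the splitting step.  The function
@code{next_component} and the constant @code{COMPONENT_MAX} are
hypothetical names for this example, and the 14-character limit is
simply the basic file system's limit carried over.

```c
#include <assert.h>
#include <string.h>

#define COMPONENT_MAX 14    /* Per-component limit assumed for this sketch. */

/* Copies the next component of *SRCP into PART and advances *SRCP past
   it.  Leading slashes are skipped, so runs of slashes act as one.
   Returns 1 if a component was stored, 0 at the end of the string, and
   -1 if the component is longer than COMPONENT_MAX. */
static int
next_component (char part[COMPONENT_MAX + 1], const char **srcp)
{
  const char *src = *srcp;
  size_t len;

  assert (part != NULL && src != NULL);

  while (*src == '/')
    src++;
  if (*src == '\0')
    {
      *srcp = src;
      return 0;
    }

  len = strcspn (src, "/");         /* Length up to next slash or end. */
  if (len > COMPONENT_MAX)
    return -1;

  memcpy (part, src, len);
  part[len] = '\0';
  *srcp = src + len;
  return 1;
}
```

A caller would typically check whether the name begins with @samp{/} to
choose between the root and the process's current directory as the
starting point, then call @code{next_component} in a loop.  The
components @file{.} and @file{..} come back like any other name and are
left for the directory lookup to interpret.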
 177
 178 Update the @code{remove} system call so that it can delete empty
 179 directories in addition to regular files.  Directories may only be
 180 deleted if they do not contain any files or subdirectories (other than
 181 @file{.} and @file{..}).
 182
 183 Update the @code{open} system call so that it can also open directories.
 184 Of the existing system calls, only @code{close} needs to accept a file
 185 descriptor for a directory.
 186
 187 Implement the following new system calls:
 188
 189 @deftypefn {System Call} bool chdir (const char *@var{dir})
 190 Changes the current working directory of the process to
 191 @var{dir}, which may be relative or absolute.  Returns true if
 192 successful, false on failure.
 193 @end deftypefn
 194
 195 @deftypefn {System Call} bool mkdir (const char *@var{dir})
 196 Creates the directory named @var{dir}, which may be
 197 relative or absolute.  Returns true if successful, false on failure.
 198 Fails if @var{dir} already exists or if any directory name in
 199 @var{dir}, besides the last, does not already exist.  That is,
 200 @code{mkdir("/a/b/c")} succeeds only if @file{/a/b} already exists and
 201 @file{/a/b/c} does not.
 202 @end deftypefn
 203
 204 @deftypefn {System Call} bool readdir (int @var{fd}, char *@var{name})
 205 Reads a directory entry from file descriptor @var{fd}, which must
 206 represent a directory.  If successful, stores the null-terminated file
 207 name in @var{name}, which must have room for @code{READDIR_MAX_LEN + 1}
 208 bytes, and returns true.  If no entries are left in the directory,
 209 returns false.
 210
 211 @file{.} and @file{..} should not be returned by @code{readdir}.
 212
 213 If the directory changes while it is open, then it is acceptable for
 214 some entries not to be read at all or to be read multiple times.
 215 Otherwise, each directory entry should be read once, in any order.
 216
 217 @code{READDIR_MAX_LEN} is defined in @file{lib/user/syscall.h}.  If your
 218 file system supports longer file names than the basic file system, you
 219 should increase this value from the default of 14.
 220 @end deftypefn
 221
 222 @deftypefn {System Call} bool isdir (int @var{fd})
 223 Returns true if @var{fd} represents a directory,
 224 false if it represents an ordinary file.
 225 @end deftypefn
 226
 227 @deftypefn {System Call} int inumber (int @var{fd})
 228 Returns the @dfn{inode number} of the inode associated with @var{fd}.
 229 Applicable to file descriptors for both files and directories.
 230
 231 An inode number persistently identifies a file or directory.  It is
 232 unique during the file's existence.  In Pintos, the sector number of the
 233 inode is suitable for use as an inode number.
 234 @end deftypefn
 235
 236 We have provided @command{ls} and @command{mkdir} user programs, which
 237 are straightforward once the above syscalls are implemented.
 238 We have also provided @command{pwd}, which is not so straightforward.
 239 The @command{shell} program implements @command{cd} internally.
 240
 241 The @code{pintos} @option{put} and @option{get} commands should now
 242 accept full path names, assuming that the directories used in the
 243 paths have already been created.  This should not require any significant
 244 extra effort on your part.
 245
 246 @node Buffer Cache
 247 @subsection Buffer Cache
 248
 249 Modify the file system to keep a cache of file blocks.  When a request
 250 is made to read or write a block, check to see if it is in the
 251 cache, and if so, use the cached data without going to
 252 disk.  Otherwise, fetch the block from disk into cache, evicting an
 253 older entry if necessary.  You are limited to a cache no greater than 64
 254 sectors in size.
 255
 256 Be sure to choose an intelligent cache replacement algorithm.
 257 Experiment to see what combination of accessed, dirty, and other
 258 information results in the best performance, as measured by the number
 259 of disk accesses.  For example, metadata is generally more valuable to
 260 cache than data.
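
As one example of a replacement policy built on an accessed bit, the
sketch below implements the clock (second-chance) algorithm over a
64-entry table.  It is not the reference solution's policy, and it
omits synchronization, sector data, and write-back of dirty victims.
The names @code{struct cache_entry}, @code{cache}, @code{hand}, and
@code{pick_victim} are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_CNT 64        /* Cache size limit from the assignment. */

struct cache_entry
  {
    bool in_use;            /* Slot holds a sector? */
    bool accessed;          /* Touched since the hand last passed? */
    bool dirty;             /* Must be written back before reuse? */
    unsigned sector;        /* Disk sector cached here. */
  };

static struct cache_entry cache[CACHE_CNT];
static size_t hand;         /* Next slot the clock hand examines. */

/* Returns the index of the slot to reuse.  Free slots are taken
   immediately.  An in-use slot whose accessed bit is set gets a second
   chance: the bit is cleared and the hand moves on.  The caller is
   responsible for writing the victim back if it is dirty. */
static size_t
pick_victim (void)
{
  for (;;)
    {
      size_t idx = hand;
      struct cache_entry *e = &cache[idx];

      assert (hand < CACHE_CNT);
      hand = (hand + 1) % CACHE_CNT;

      if (!e->in_use)
        return idx;
      if (e->accessed)
        e->accessed = false;
      else
        return idx;
    }
}
```

The loop always terminates, because one full sweep clears every
accessed bit.  A policy that favors metadata could, for instance, give
inode and index blocks an extra pass before eviction; whether that
helps is something to measure by counting disk accesses.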
 261
 262 You can keep a cached copy of the free map permanently in memory if you
 263 like.  It doesn't have to count against the cache size.
 264
 265 The provided inode code uses a ``bounce buffer'' allocated with
 266 @func{malloc} to translate the disk's sector-by-sector interface into
 267 the system call interface's byte-by-byte interface.  You should get rid
 268 of these bounce buffers.  Instead, copy data into and out of sectors in
 269 the buffer cache directly.
 270
 271 Your cache should be @dfn{write-behind}, that is,
 272 keep dirty blocks in the cache, instead of immediately writing modified
 273 data to disk.  Write dirty blocks to disk whenever they are evicted.
 274 Because write-behind makes your file system more fragile in the face of
 275 crashes, in addition you should periodically write all dirty, cached
 276 blocks back to disk.  The cache should also be written back to disk in
 277 @func{filesys_done}, so that halting Pintos flushes the cache.
 278
 279 If you have @func{timer_sleep} from the first project working, write-behind is
 280 an excellent application.  If you're still using the base
 281 implementation of @func{timer_sleep}, be aware that it busy-waits, which
 282 is not acceptable here (or elsewhere).  If @func{timer_sleep}'s delays seem too
 283 short or too long, reread the explanation of the @option{-r} option to
 284 @command{pintos} (@pxref{Debugging versus Testing}).
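
The flush pass itself can be quite small.  The following sketch writes
back every dirty entry in a table and clears its dirty bit.  It leaves
out all locking, and the names @code{struct wb_entry},
@code{disk_write_fn}, @code{flush_dirty}, and @code{stub_write} are
hypothetical.  In particular, @code{stub_write} is only a stand-in that
counts calls; a real implementation would call the disk driver instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SECTOR_SIZE 512

struct wb_entry
  {
    bool in_use;
    bool dirty;
    unsigned sector;
    unsigned char data[SECTOR_SIZE];
  };

/* Type of a function that writes one sector's worth of data to disk. */
typedef void disk_write_fn (unsigned sector, const void *buf);

/* Stand-in for a real disk write: just counts how often it is called. */
static unsigned stub_writes;

static void
stub_write (unsigned sector, const void *buf)
{
  (void) sector;
  (void) buf;
  stub_writes++;
}

/* Writes every dirty, in-use entry among the CNT entries at ENTRIES to
   disk via WRITE_SECTOR and marks it clean.  Returns the number of
   entries written. */
static size_t
flush_dirty (struct wb_entry *entries, size_t cnt, disk_write_fn *write_sector)
{
  size_t flushed = 0;
  size_t i;

  assert (entries != NULL && write_sector != NULL);
  for (i = 0; i < cnt; i++)
    if (entries[i].in_use && entries[i].dirty)
      {
        write_sector (entries[i].sector, entries[i].data);
        entries[i].dirty = false;
        flushed++;
      }
  return flushed;
}
```

A background thread could call a routine like this in a loop, sleeping
with @func{timer_sleep} between passes, and @func{filesys_done} could
call it once more at shutdown.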
 285
 286 You should also implement @dfn{read-ahead}, that is,
 287 automatically fetch the next block of a file
 288 into the cache when one block of a file is read, in case that block is
 289 about to be read.
 290 Read-ahead is only really useful when done asynchronously.  That means,
 291 if a process requests disk block 1 from the file, it should block until disk
 292 block 1 is read in, but once that read is complete, control should
 293 return to the process immediately.  The read-ahead request for disk
 294 block 2 should be handled asynchronously, in the background.
 295
 296 @strong{We recommend integrating the cache into your design early.}  In
 297 the past, many groups have tried to tack the cache onto a design late in
 298 the design process.  This is very difficult.  These groups have often
 299 turned in projects that failed most or all of the tests.
 300
 301 @node File System Synchronization
 302 @subsection Synchronization
 303
 304 The provided file system requires external synchronization, that is,
 305 callers must ensure that only one thread can be running in the file
 306 system code at once.  Your submission must adopt a finer-grained
 307 synchronization strategy that does not require external synchronization.
 308 To the extent possible, operations on independent entities should be
 309 independent, so that they do not need to wait on each other.
 310
 311 Operations on different cache blocks must be independent.  In
 312 particular, when I/O is required on a particular block, operations on
 313 other blocks that do not require I/O should proceed without having to
 314 wait for the I/O to complete.
 315
 316 Multiple processes must be able to access a single file at once.
 317 Multiple reads of a single file must be able to complete without
 318 waiting for one another.  When writing to a file does not extend the
 319 file, multiple processes should also be able to write a single file at
 320 once.  A read of a file by one process when the file is being written by
 321 another process is allowed to show that none, all, or part of the write
 322 has completed.  (However, after the @code{write} system call returns to
 323 its caller, all subsequent readers must see the change.)  Similarly,
 324 when two processes simultaneously write to the same part of a file,
 325 their data may be interleaved.
 326
 327 On the other hand, extending a file and writing data into the new
 328 section must be atomic.  Suppose processes A and B both have a given
 329 file open and both are positioned at end-of-file.  If A reads and B
 330 writes the file at the same time, A may read all, part, or none of what
 331 B writes.  However, A may not read data other than what B writes, e.g.@:
 332 if B's data is all nonzero bytes, A is not allowed to see any zeros.
 333
 334 Operations on different directories should take place concurrently.
 335 Operations on the same directory may wait for one another.
 336
 337 @node Project 4 FAQ
 338 @section FAQ
 339
 340 @table @b
 341 @item How much code will I need to write?
 342
 343 Here's a summary of our reference solution, produced by the
 344 @command{diffstat} program.  The final row gives total lines inserted
 345 and deleted; a changed line counts as both an insertion and a deletion.
 346
 347 This summary is relative to the Pintos base code, but the reference
 348 solution for project 4 is based on the reference solution to project 3.
 349 Thus, the reference solution runs with virtual memory enabled.
 350 @xref{Project 3 FAQ}, for the summary of project 3.
 351
 352 The reference solution represents just one possible solution.  Many
 353 other solutions are also possible and many of those differ greatly from
 354 the reference solution.  Some excellent solutions may not modify all the
 355 files modified by the reference solution, and some may modify files not
 356 modified by the reference solution.
 357
 358 @verbatim
 359  Makefile.build       |    5
 360  devices/timer.c      |   42 ++
 361  filesys/Make.vars    |    6
 362  filesys/cache.c      |  473 +++++++++++++++++++++++++
 363  filesys/cache.h      |   23 +
 364  filesys/directory.c  |   99 ++++-
 365  filesys/directory.h  |    3
 366  filesys/file.c       |    4
 367  filesys/filesys.c    |  194 +++++++++-
 368  filesys/filesys.h    |    5
 369  filesys/free-map.c   |   45 +-
 370  filesys/free-map.h   |    4
 371  filesys/fsutil.c     |    8
 372  filesys/inode.c      |  444 ++++++++++++++++++-----
 373  filesys/inode.h      |   11
 374  threads/init.c       |    5
 375  threads/interrupt.c  |    2
 376  threads/thread.c     |   32 +
 377  threads/thread.h     |   38 +-
 378  userprog/exception.c |   12
 379  userprog/pagedir.c   |   10
 380  userprog/process.c   |  332 +++++++++++++----
 381  userprog/syscall.c   |  582 ++++++++++++++++++++++++++++++-
 382  userprog/syscall.h   |    1
 383  vm/frame.c           |  161 ++++++++
 384  vm/frame.h           |   23 +
 385  vm/page.c            |  297 +++++++++++++++
 386  vm/page.h            |   50 ++
 387  vm/swap.c            |   85 ++++
 388  vm/swap.h            |   11
 389  30 files changed, 2721 insertions(+), 286 deletions(-)
 390 @end verbatim
 391
 392 @item Can @code{DISK_SECTOR_SIZE} change?
 393
 394 No, @code{DISK_SECTOR_SIZE} is fixed at 512.  This is a fixed property
 395 of IDE disk hardware.
 396 @end table
 397
 398 @menu
 399 * Indexed Files FAQ::
 400 * Subdirectories FAQ::
 401 * Buffer Cache FAQ::
 402 @end menu
 403
 404 @node Indexed Files FAQ
 405 @subsection Indexed Files FAQ
 406
 407 @table @b
 408 @item What is the largest file size that we are supposed to support?
 409
 410 The disk we create will be 8 MB or smaller.  However, individual files
 411 will have to be smaller than the disk to accommodate the metadata.
 412 You'll need to consider this when deciding your inode organization.
 413 @end table
 414
 415 @node Subdirectories FAQ
 416 @subsection Subdirectories FAQ
 417
 418 @table @b
 419 @item How should a file name like @samp{a//b} be interpreted?
 420
 421 Multiple consecutive slashes are equivalent to a single slash, so this
 422 file name is the same as @samp{a/b}.
 423
 424 @item How about a file name like @samp{/../x}?
 425
 426 The root directory is its own parent, so it is equivalent to @samp{/x}.
 427
 428 @item How should a file name that ends in @samp{/} be treated?
 429
  430 Most Unix systems allow a slash at the end of the name for a directory,
  431 and reject other names that end in slashes.  We will accept either this
  432 behavior or simply rejecting any name that ends in a slash.
 433 @end table
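
The first two answers above can be demonstrated with a purely lexical
normalizer for absolute names.  This is only an illustration of the
equivalences, under the assumption that every named directory exists: a
real implementation would look up each component, including @file{..},
in the directory reached so far.  The function @code{normalize_abs} is
a hypothetical name, and the sketch silently drops a trailing slash
rather than implementing either of the permitted trailing-slash
behaviors.

```c
#include <assert.h>
#include <string.h>

/* Lexically normalizes absolute name PATH into OUT, which must have
   room for at least strlen (PATH) + 1 bytes.  Runs of slashes collapse
   to one, "." is dropped, and ".." removes the previous component.  At
   the root, ".." has no effect, because the root is its own parent. */
static void
normalize_abs (const char *path, char *out)
{
  size_t len = 0;                   /* Bytes stored in OUT so far. */

  assert (path[0] == '/');
  while (*path != '\0')
    {
      size_t n;

      while (*path == '/')
        path++;
      n = strcspn (path, "/");      /* Length of the next component. */
      if (n == 0)
        break;

      if (n == 1 && path[0] == '.')
        {
          /* "." names the current directory: nothing to do. */
        }
      else if (n == 2 && path[0] == '.' && path[1] == '.')
        {
          /* Strip the last component, if any, back to its slash. */
          while (len > 0)
            {
              len--;
              if (out[len] == '/')
                break;
            }
        }
      else
        {
          out[len++] = '/';
          memcpy (out + len, path, n);
          len += n;
        }
      path += n;
    }

  if (len == 0)
    out[len++] = '/';               /* Everything canceled out: the root. */
  out[len] = '\0';
}
```

Under these assumptions, @samp{/a//b} normalizes to @samp{/a/b} and
@samp{/../x} normalizes to @samp{/x}.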
 434
 435 @node Buffer Cache FAQ
 436 @subsection Buffer Cache FAQ
 437
 438 @table @b
 439 @item Can we keep a @struct{inode_disk} inside @struct{inode}?
 440
 441 The goal of the 64-block limit is to bound the amount of cached file
 442 system data.  If you keep a block of disk data---whether file data or
 443 metadata---anywhere in kernel memory then you have to count it against
 444 the 64-block limit.  The same rule applies to anything that's
 445 ``similar'' to a block of disk data, such as a @struct{inode_disk}
 446 without the @code{length} or @code{sector_cnt} members.
 447
 448 That means you'll have to change the way the inode implementation
 449 accesses its corresponding on-disk inode right now, since it currently
 450 just embeds a @struct{inode_disk} in @struct{inode} and reads the
 451 corresponding sector from disk when it's created.  Keeping extra
 452 copies of inodes would subvert the 64-block limitation that we place
 453 on your cache.
 454
  455 You can store a pointer to inode data in @struct{inode}, but if you do
 456 so you should carefully make sure that this does not limit your OS to 64
 457 simultaneously open files.
 458 You can also store other information to help you find the inode when you
  459 need it.  Similarly, you may store some metadata alongside each of your 64
 460 cache entries.
 461
 462 You can keep a cached copy of the free map permanently in memory if you
 463 like.  It doesn't have to count against the cache size.
 464
 465 @func{byte_to_sector} in @file{filesys/inode.c} uses the
 466 @struct{inode_disk} directly, without first reading that sector from
 467 wherever it was in the storage hierarchy.  This will no longer work.
  468 You will need to change @func{byte_to_sector} to obtain the
 469 @struct{inode_disk} from the cache before using it.
 470 @end table