@node Project 4--File Systems
@chapter Project 4: File Systems

In the previous two assignments, you made extensive use of a
file system without actually worrying about how it was implemented
underneath. For this last assignment, you will improve the
implementation of the file system. You will be working primarily in
the @file{filesys} directory.

You may build project 4 on top of project 2 or project 3. In either
case, all of the functionality needed for project 2 must work in your
filesys submission. If you build on project 3, then all of the project
3 functionality must also work, and you will need to edit
@file{filesys/Make.vars} to enable VM functionality. You can receive up
to 5% extra credit if you do enable VM.
* Project 4 Background::
* Project 4 Suggested Order of Implementation::
* Project 4 Requirements::

@node Project 4 Background

* File System New Code::
* Testing File System Persistence::

@node File System New Code
Here are some files that are probably new to you. These are in the
@file{filesys} directory except where indicated:

Simple utilities for the file system that are accessible from the

Top-level interface to the file system. @xref{Using the File System},

Translates file names to inodes. The directory data structure is

Manages the data structure representing the layout of a

Translates file reads and writes to disk sector reads

@item lib/kernel/bitmap.h
@itemx lib/kernel/bitmap.c
A bitmap data structure along with routines for reading and writing
the bitmap to disk files.
Our file system has a Unix-like interface, so you may also wish to
read the Unix man pages for @code{creat}, @code{open}, @code{close},
@code{read}, @code{write}, @code{lseek}, and @code{unlink}. Our file
system has calls that are similar, but not identical, to these. The
file system translates these calls into disk operations.

All the basic functionality is there in the code above, so that the
file system is usable from the start, as you've seen
in the previous two projects. However, it has severe limitations
which you will remove.

While most of your work will be in @file{filesys}, you should be
prepared for interactions with all previous parts.
@node Testing File System Persistence
@subsection Testing File System Persistence

By now, you should be familiar with the basic process of running the
Pintos tests. @xref{Testing}, for review, if necessary.

Until now, each test invoked Pintos just once. However, an important
purpose of a file system is to ensure that data remains accessible from
one boot to another. Thus, the tests that are part of the file system
project invoke Pintos a second time. The second run combines all the
files and directories in the file system into a single file, then copies
that file out of the Pintos file system into the host (Unix) file
system.

The grading scripts check the file system's correctness based on the
contents of the file copied out in the second run. This means that your
project will not pass any of the extended file system tests until the
file system is implemented well enough to support @command{tar}, the
Pintos user program that produces the file that is copied out. The
@command{tar} program is fairly demanding (it requires both extensible
file and subdirectory support), so this will take some work. Until
then, you can ignore errors from @command{make check} regarding the
extracted file system.

Incidentally, as you may have surmised, the file format used for copying
out the file system contents is the standard Unix ``tar'' format. You
can use the Unix @command{tar} program to examine these files. The tar file
for test @var{t} is named @file{@var{t}.tar}.
@node Project 4 Suggested Order of Implementation
@section Suggested Order of Implementation

We suggest implementing the parts of this project in the following
order to make your job easier:

@enumerate
@item
Buffer cache (@pxref{Buffer Cache}). Implement the buffer cache and
integrate it into the existing file system. At this point all the
tests from project 2 (and project 3, if you're building on it) should
still pass.

@item
Extensible files (@pxref{Indexed and Extensible Files}). After this
step, your project should pass the file growth tests.

@item
Subdirectories (@pxref{Subdirectories}). Afterward, your project
should pass the directory tests.

@item
Remaining miscellaneous items.
@end enumerate

You should think about synchronization throughout.
@node Project 4 Requirements
@section Requirements

* Project 4 Design Document::
* Indexed and Extensible Files::
* File System Synchronization::

@node Project 4 Design Document
@subsection Design Document

Before you turn in your project, you must copy @uref{filesys.tmpl, , the
project 4 design document template} into your source tree under the name
@file{pintos/src/filesys/DESIGNDOC} and fill it in. We recommend that
you read the design document template before you start working on the
project. @xref{Project Documentation}, for a sample design document
that goes along with a fictitious project.
@node Indexed and Extensible Files
@subsection Indexed and Extensible Files

The basic file system allocates each file as a single extent, which makes
it vulnerable to external fragmentation: an @var{n}-block file may be
impossible to allocate even though @var{n} blocks are free. Eliminate
this problem by modifying the on-disk inode structure. In practice, this
probably means using an index structure with direct, indirect, and doubly
indirect blocks. You are welcome to choose a different scheme as long as
you explain the rationale for it in your design documentation, and as
long as it does not suffer from external fragmentation (as does the
extent-based file system).

You can assume that the disk will not be larger than 8 MB. You must
support files as large as the disk (minus metadata). Each inode is
stored in one disk sector, limiting the number of block pointers that it
can contain. Supporting 8 MB files will require you to implement
doubly-indirect blocks.
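
As a rough check of the sizing argument above, the following sketch
computes how many sectors each level of a multilevel index can address.
It assumes 512-byte sectors and 4-byte sector numbers. The split into 12
direct pointers, 1 indirect pointer, and 1 doubly indirect pointer is
purely hypothetical, as are all of the names used; your own inode layout
may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only: 512-byte sectors, 4-byte sector
   numbers, and a hypothetical inode with 12 direct pointers, 1 indirect
   pointer, and 1 doubly indirect pointer. */
#define SECTOR_SIZE 512
#define PTRS_PER_SECTOR (SECTOR_SIZE / sizeof (uint32_t))
#define DIRECT_CNT 12
#define INDIRECT_CNT 1
#define DBL_INDIRECT_CNT 1

enum index_level
  {
    LEVEL_DIRECT,
    LEVEL_INDIRECT,
    LEVEL_DBL_INDIRECT,
    LEVEL_OUT_OF_RANGE
  };

/* Data sectors reachable through the indirect pointers. */
static size_t
indirect_sectors (void)
{
  return INDIRECT_CNT * PTRS_PER_SECTOR;
}

/* Data sectors reachable through the doubly indirect pointers. */
static size_t
dbl_indirect_sectors (void)
{
  return DBL_INDIRECT_CNT * PTRS_PER_SECTOR * PTRS_PER_SECTOR;
}

/* Largest file, in bytes, that this layout can address. */
static size_t
max_file_bytes (void)
{
  return (DIRECT_CNT + indirect_sectors () + dbl_indirect_sectors ())
         * SECTOR_SIZE;
}

/* Returns the index level that holds the pointer for the file's
   data sector number IDX (counting from 0). */
static enum index_level
level_for_sector (size_t idx)
{
  if (idx < DIRECT_CNT)
    return LEVEL_DIRECT;
  idx -= DIRECT_CNT;
  if (idx < indirect_sectors ())
    return LEVEL_INDIRECT;
  idx -= indirect_sectors ();
  if (idx < dbl_indirect_sectors ())
    return LEVEL_DBL_INDIRECT;
  return LEVEL_OUT_OF_RANGE;
}

/* Sanity check: the hypothetical layout must cover an 8 MB disk. */
static void
check_layout (void)
{
  assert (max_file_bytes () >= 8u * 1024 * 1024);
}
```

Under these assumptions, the direct and indirect pointers together reach
only 140 sectors (70 kB), whereas a single doubly indirect pointer adds
16,384 sectors, which by itself spans 8 MB.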
An extent-based file can only grow if it is followed by empty space, but
indexed inodes make file growth possible whenever free space is
available. Implement file growth. In the basic file system, the file
size is specified when the file is created. In most modern file
systems, a file is initially created with size 0 and is then expanded
every time a write is made off the end of the file. Your file system
must allow this.

There should be no predetermined limit on the size of a file, except
that a file cannot exceed the size of the disk (minus metadata). This
also applies to the root directory file, which should now be allowed
to expand beyond its initial limit of 16 files.

User programs are allowed to seek beyond the current end-of-file (EOF). The
seek itself does not extend the file. Writing at a position past EOF
extends the file to the position being written, and any gap between the
previous EOF and the start of the write must be filled with zeros. A
read starting from a position past EOF returns no bytes.
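
The seek and write rules above can be modeled in a few lines. This is a
minimal sketch that uses a heap buffer as a stand-in for a file;
@code{struct mem_file} and both functions are hypothetical names, not
part of Pintos, and a real implementation would allocate disk sectors
instead of calling @code{realloc}. The sketch also assumes that a
zero-length write never extends the file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a file; not a Pintos structure. */
struct mem_file
  {
    unsigned char *data;        /* File contents. */
    size_t length;              /* Current end-of-file. */
  };

/* Writes SIZE bytes from BUF at offset OFS, growing the file if needed.
   Any gap between the old EOF and OFS is filled with zeros.
   Returns the number of bytes written, or 0 if memory runs out. */
static size_t
mem_file_write_at (struct mem_file *f, const void *buf,
                   size_t size, size_t ofs)
{
  size_t end = ofs + size;

  assert (f != NULL && buf != NULL);
  if (size == 0)
    return 0;                   /* Assumed: empty writes never extend. */

  if (end > f->length)
    {
      unsigned char *p = realloc (f->data, end);
      if (p == NULL)
        return 0;
      if (ofs > f->length)
        memset (p + f->length, 0, ofs - f->length);   /* Zero the gap. */
      f->data = p;
      f->length = end;
    }
  memcpy (f->data + ofs, buf, size);
  return size;
}

/* Reads up to SIZE bytes at offset OFS into BUF.
   Returns the number of bytes read, which is 0 at or past EOF. */
static size_t
mem_file_read_at (const struct mem_file *f, void *buf,
                  size_t size, size_t ofs)
{
  size_t avail;

  assert (f != NULL && buf != NULL);
  if (ofs >= f->length)
    return 0;
  avail = f->length - ofs;
  if (size > avail)
    size = avail;
  memcpy (buf, f->data + ofs, size);
  return size;
}
```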
Writing far beyond EOF can cause many blocks to be entirely zero. Some
file systems allocate and write real data blocks for these implicitly
zeroed blocks. Other file systems do not allocate these blocks at all
until they are explicitly written. The latter file systems are said to
support ``sparse files.'' You may adopt either allocation strategy in
your file system.
@subsection Subdirectories

Implement a hierarchical name space. In the basic file system, all
files live in a single directory. Modify this to allow directory
entries to point to files or to other directories.

Make sure that directories can expand beyond their original size just
as any other file can.

The basic file system has a 14-character limit on file names. You may
retain this limit for individual file name components, or may extend
it, at your option. You must allow full path names to be
much longer than 14 characters.

Maintain a separate current directory for each process. At
startup, set the root as the initial process's current directory.
When one process starts another with the @code{exec} system call, the
child process inherits its parent's current directory. After that, the
two processes' current directories are independent, so that when either
process changes its own current directory, the other is unaffected.
(This is why, under Unix, the @command{cd} command is a shell built-in,
not an external program.)

Update the existing system calls so that, anywhere a file name is
provided by the caller, an absolute or relative path name may be used.
The directory separator character is forward slash (@samp{/}).
You must also support special file names @file{.} and @file{..}, which
have the same meanings as they do in Unix.
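
One way to walk a path name is to peel off one component at a time and
look each one up in the directory reached so far. The helper below is
only a sketch of the splitting step: @code{next_component},
@code{is_absolute}, and @code{NAME_MAX_LEN} are hypothetical names, and
the 14-character limit is assumed from the basic file system. Note that
@file{.} and @file{..} come out as ordinary components, so the caller
decides how to resolve them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed per-component limit, as in the basic file system. */
#define NAME_MAX_LEN 14

/* Returns true if PATH starts at the root directory. */
static bool
is_absolute (const char *path)
{
  return path[0] == '/';
}

/* Copies the next component of the path at *SRCP into PART, skipping
   any run of leading slashes, and advances *SRCP past the component.
   Returns 1 if a component was stored, 0 at the end of the path, or
   -1 if the component is longer than NAME_MAX_LEN characters. */
static int
next_component (char part[NAME_MAX_LEN + 1], const char **srcp)
{
  const char *src;
  char *dst = part;

  assert (srcp != NULL && *srcp != NULL);
  src = *srcp;

  /* Consecutive slashes count as a single separator. */
  while (*src == '/')
    src++;
  if (*src == '\0')
    {
      *srcp = src;
      return 0;
    }

  /* Copy up to the next slash or the end of the string. */
  while (*src != '/' && *src != '\0')
    {
      if (dst >= part + NAME_MAX_LEN)
        return -1;
      *dst++ = *src++;
    }
  *dst = '\0';
  *srcp = src;
  return 1;
}
```

A caller would typically start from the root directory when
@code{is_absolute} returns true and from the process's current directory
otherwise, then loop on @code{next_component} until it returns 0.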
Update the @code{open} system call so that it can also open directories.
Of the existing system calls, only @code{close} needs to accept a file
descriptor for a directory.

Update the @code{remove} system call so that it can delete empty
directories (other than the root) in addition to regular files.
Directories may only be deleted if they do not contain any files or
subdirectories (other than @file{.} and @file{..}). You may decide
whether to allow deletion of a directory that is open by a process or in
use as a process's current working directory. If it is allowed, then
attempts to open files (including @file{.} and @file{..}) or create new
files in a deleted directory must be disallowed.

Implement the following new system calls:
@deftypefn {System Call} bool chdir (const char *@var{dir})
Changes the current working directory of the process to
@var{dir}, which may be relative or absolute. Returns true if
successful, false on failure.
@end deftypefn

@deftypefn {System Call} bool mkdir (const char *@var{dir})
Creates the directory named @var{dir}, which may be
relative or absolute. Returns true if successful, false on failure.
Fails if @var{dir} already exists or if any directory name in
@var{dir}, besides the last, does not already exist. That is,
@code{mkdir("/a/b/c")} succeeds only if @file{/a/b} already exists and
@file{/a/b/c} does not.
@end deftypefn

@deftypefn {System Call} bool readdir (int @var{fd}, char *@var{name})
Reads a directory entry from file descriptor @var{fd}, which must
represent a directory. If successful, stores the null-terminated file
name in @var{name}, which must have room for @code{READDIR_MAX_LEN + 1}
bytes, and returns true. If no entries are left in the directory,
returns false.

@file{.} and @file{..} should not be returned by @code{readdir}.

If the directory changes while it is open, then it is acceptable for
some entries not to be read at all or to be read multiple times.
Otherwise, each directory entry should be read once, in any order.

@code{READDIR_MAX_LEN} is defined in @file{lib/user/syscall.h}. If your
file system supports longer file names than the basic file system, you
should increase this value from the default of 14.
@end deftypefn

@deftypefn {System Call} bool isdir (int @var{fd})
Returns true if @var{fd} represents a directory,
false if it represents an ordinary file.
@end deftypefn

@deftypefn {System Call} int inumber (int @var{fd})
Returns the @dfn{inode number} of the inode associated with @var{fd},
which may represent an ordinary file or a directory.

An inode number persistently identifies a file or directory. It is
unique during the file's existence. In Pintos, the sector number of the
inode is suitable for use as an inode number.
@end deftypefn
We have provided @command{ls} and @command{mkdir} user programs, which
are straightforward once the above syscalls are implemented.
We have also provided @command{pwd}, which is not so straightforward.
The @command{shell} program implements @command{cd} internally.

The @code{pintos} @option{put} and @option{get} commands should now
accept full path names, assuming that the directories used in the
paths have already been created. This should not require any significant
extra effort on your part.
@subsection Buffer Cache

Modify the file system to keep a cache of file blocks. When a request
is made to read or write a block, check to see if it is in the
cache, and if so, use the cached data without going to
disk. Otherwise, fetch the block from disk into the cache, evicting an
older entry if necessary. You are limited to a cache no greater than 64
sectors in size.

You must implement a cache replacement algorithm that is at least as
good as the ``clock'' algorithm. Your algorithm must also account for
the generally greater value of metadata compared to data. Experiment
to see what combination of accessed, dirty, and other information
results in the best performance, as measured by the number of disk
accesses.

You can keep a cached copy of the free map permanently in memory if you
like. It doesn't have to count against the cache size.
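
For reference, here is one possible shape for the ``clock'' replacement
policy mentioned above, with a simple bias toward metadata. It is a
sketch under stated assumptions, not the required design: every name is
hypothetical, the sector data and locking are omitted, and giving
metadata two ``chances'' instead of one is just one of many weightings
you could experiment with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_CNT 64            /* Cache size limit, in sectors. */

/* Hypothetical cache entry; the cached sector data is omitted. */
struct cache_entry
  {
    bool in_use;                /* Does this slot hold a sector? */
    unsigned chances;           /* Sweeps of the hand it may survive. */
  };

static struct cache_entry cache[CACHE_CNT];
static size_t clock_hand;       /* Next slot the hand will examine. */

/* Records an access to slot IDX.  Metadata is given two chances and
   ordinary data one, an assumed weighting chosen for illustration. */
static void
cache_touch (size_t idx, bool is_metadata)
{
  assert (idx < CACHE_CNT);
  cache[idx].in_use = true;
  cache[idx].chances = is_metadata ? 2 : 1;
}

/* Advances the clock hand until it finds a free slot or a slot with
   no chances left, and returns that slot's index.  Each slot passed
   over loses one chance, so the loop ends within three sweeps. */
static size_t
clock_pick_victim (void)
{
  for (;;)
    {
      size_t idx = clock_hand;
      struct cache_entry *e = &cache[idx];

      clock_hand = (clock_hand + 1) % CACHE_CNT;
      if (!e->in_use || e->chances == 0)
        return idx;
      e->chances--;
    }
}
```

With a full cache in which only slot 0 holds metadata, the hand passes
over slot 0 while it still has a chance left and evicts a data slot
first.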
The provided inode code uses a ``bounce buffer'' allocated with
@func{malloc} to translate the disk's sector-by-sector interface into
the system call interface's byte-by-byte interface. You should get rid
of these bounce buffers. Instead, copy data into and out of sectors in
the buffer cache directly.

Your cache should be @dfn{write-behind}, that is,
keep dirty blocks in the cache, instead of immediately writing modified
data to disk. Write dirty blocks to disk whenever they are evicted.
Because write-behind makes your file system more fragile in the face of
crashes, in addition you should periodically write all dirty, cached
blocks back to disk. The cache should also be written back to disk in
@func{filesys_done}, so that halting Pintos flushes the cache.
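
The write-behind rules can be illustrated with a small simulation in
which an in-memory array stands in for the disk. This is a sketch only:
all names are hypothetical, the caller picks the cache slot by hand
instead of using a replacement policy, and synchronization is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512
#define CACHE_CNT 64
#define DISK_SECTORS 128        /* Tiny simulated disk, for illustration. */

static unsigned char disk[DISK_SECTORS][SECTOR_SIZE];  /* Simulated disk. */
static unsigned disk_write_cnt;                        /* Sector writes. */

/* Hypothetical cache entry. */
struct cache_entry
  {
    bool in_use;                /* Does this slot hold a sector? */
    bool dirty;                 /* Modified since last written back? */
    unsigned sector;            /* Sector number cached here. */
    unsigned char data[SECTOR_SIZE];
  };

static struct cache_entry cache[CACHE_CNT];

/* Writes E back to the simulated disk if it holds modified data. */
static void
cache_write_back (struct cache_entry *e)
{
  if (e->in_use && e->dirty)
    {
      memcpy (disk[e->sector], e->data, SECTOR_SIZE);
      disk_write_cnt++;
      e->dirty = false;
    }
}

/* Copies SIZE bytes from SRC into SECTOR at byte offset OFS through
   cache slot SLOT.  If the slot caches a different sector, that sector
   is evicted (and written back if dirty) first.  The new data stays in
   the cache, marked dirty, instead of going to disk right away. */
static void
cache_store (size_t slot, unsigned sector, const void *src,
             size_t ofs, size_t size)
{
  struct cache_entry *e;

  assert (slot < CACHE_CNT && sector < DISK_SECTORS);
  assert (ofs <= SECTOR_SIZE && size <= SECTOR_SIZE - ofs);

  e = &cache[slot];
  if (!e->in_use || e->sector != sector)
    {
      cache_write_back (e);
      memcpy (e->data, disk[sector], SECTOR_SIZE);
      e->in_use = true;
      e->sector = sector;
    }
  memcpy (e->data + ofs, src, size);
  e->dirty = true;
}

/* Writes every dirty entry back to disk.  Intended to be called
   periodically and once more at shutdown. */
static void
cache_flush_all (void)
{
  size_t i;

  for (i = 0; i < CACHE_CNT; i++)
    cache_write_back (&cache[i]);
}
```

Note that a second call to @code{cache_flush_all} with no intervening
stores performs no disk writes, because flushing clears the dirty flags.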
If you have @func{timer_sleep} from the first project working, write-behind is
an excellent application. Otherwise, you may implement a less general
facility, but make sure that it does not exhibit busy-waiting.

You should also implement @dfn{read-ahead}, that is,
automatically fetch the next block of a file
into the cache when one block of a file is read, in case that block is
about to be read.
Read-ahead is only really useful when done asynchronously. That means,
if a process requests disk block 1 from the file, it should block until disk
block 1 is read in, but once that read is complete, control should
return to the process immediately. The read-ahead request for disk
block 2 should be handled asynchronously, in the background.
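
One common way to make read-ahead asynchronous is to have the reading
process post a request to a queue that a background thread drains. The
sketch below shows only the queue, as a fixed-size ring buffer. All
names are hypothetical, and the lock and condition variable that a real
kernel would need around the queue are deliberately left out. Because
read-ahead is only a hint, this sketch simply drops requests when the
queue is full rather than making the caller wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RA_QUEUE_CNT 16         /* Assumed capacity of the queue. */

/* Hypothetical queue of sectors waiting to be read ahead. */
struct ra_queue
  {
    unsigned sectors[RA_QUEUE_CNT];
    size_t head;                /* Index of the oldest request. */
    size_t cnt;                 /* Number of queued requests. */
  };

/* Called by the reading process after its own read completes.
   Returns false, without blocking, if the queue is full. */
static bool
ra_request (struct ra_queue *q, unsigned sector)
{
  assert (q != NULL);
  if (q->cnt == RA_QUEUE_CNT)
    return false;
  q->sectors[(q->head + q->cnt) % RA_QUEUE_CNT] = sector;
  q->cnt++;
  return true;
}

/* Called by the background thread.  Stores the oldest queued sector in
   *SECTOR and returns true, or returns false if nothing is queued. */
static bool
ra_next (struct ra_queue *q, unsigned *sector)
{
  assert (q != NULL && sector != NULL);
  if (q->cnt == 0)
    return false;
  *sector = q->sectors[q->head];
  q->head = (q->head + 1) % RA_QUEUE_CNT;
  q->cnt--;
  return true;
}
```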
@strong{We recommend integrating the cache into your design early.} In
the past, many groups have tried to tack the cache onto a design late in
the design process. This is very difficult. These groups have often
turned in projects that failed most or all of the tests.
@node File System Synchronization
@subsection Synchronization

The provided file system requires external synchronization, that is,
callers must ensure that only one thread can be running in the file
system code at once. Your submission must adopt a finer-grained
synchronization strategy that does not require external synchronization.
To the extent possible, operations on independent entities should be
independent, so that they do not need to wait on each other.

Operations on different cache blocks must be independent. In
particular, when I/O is required on a particular block, operations on
other blocks that do not require I/O should proceed without having to
wait for the I/O to complete.

Multiple processes must be able to access a single file at once.
Multiple reads of a single file must be able to complete without
waiting for one another. When writing to a file does not extend the
file, multiple processes should also be able to write a single file at
once. A read of a file by one process when the file is being written by
another process is allowed to show that none, all, or part of the write
has completed. (However, after the @code{write} system call returns to
its caller, all subsequent readers must see the change.) Similarly,
when two processes simultaneously write to the same part of a file,
their data may be interleaved.

On the other hand, extending a file and writing data into the new
section must be atomic. Suppose processes A and B both have a given
file open and both are positioned at end-of-file. If A reads and B
writes the file at the same time, A may read all, part, or none of what
B writes. However, A may not read data other than what B writes, e.g.@:
if B's data is all nonzero bytes, A is not allowed to see any zeros.
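
One way to satisfy the atomic-extension rule is to make the new length
visible to readers only after the new data is in place. The standalone
sketch below demonstrates that ordering with C11 atomics. This is an
assumption made for the sake of a self-contained example: inside Pintos
you would more likely protect the length with a lock. The names are
hypothetical, and the sketch assumes that writers which extend the file
are already serialized by some other means.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define FILE_CAP 4096           /* Assumed fixed capacity of the model. */

/* Hypothetical growable file whose length readers may poll. */
struct grow_file
  {
    unsigned char data[FILE_CAP];
    atomic_size_t length;       /* Published end-of-file. */
  };

/* Appends up to SIZE bytes from BUF.  The data is copied first and the
   new length is published afterward, so a concurrent reader sees
   either the old length or the new length with the data in place.
   Assumes that only one extending writer runs at a time.
   Returns the number of bytes appended. */
static size_t
file_append (struct grow_file *f, const void *buf, size_t size)
{
  size_t old;

  assert (f != NULL && buf != NULL);
  old = atomic_load_explicit (&f->length, memory_order_relaxed);
  if (size > FILE_CAP - old)
    size = FILE_CAP - old;
  memcpy (f->data + old, buf, size);
  atomic_store_explicit (&f->length, old + size, memory_order_release);
  return size;
}

/* Reads up to SIZE bytes at offset OFS into BUF, never going past the
   published length.  Returns the number of bytes read. */
static size_t
file_read_at (struct grow_file *f, void *buf, size_t size, size_t ofs)
{
  size_t len;

  assert (f != NULL && buf != NULL);
  len = atomic_load_explicit (&f->length, memory_order_acquire);
  if (ofs >= len)
    return 0;
  if (size > len - ofs)
    size = len - ofs;
  memcpy (buf, f->data + ofs, size);
  return size;
}
```

A reader positioned at the old end-of-file therefore gets either no
bytes or bytes that the writer actually stored, never the uninitialized
contents of a block that has been allocated but not yet filled.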
Operations on different directories should take place concurrently.
Operations on the same directory may wait for one another.

Keep in mind that only data shared by multiple threads needs to be
synchronized. In the base file system, @struct{file} and @struct{dir}
are accessed only by a single thread.
@item How much code will I need to write?

Here's a summary of our reference solution, produced by the
@command{diffstat} program. The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.

This summary is relative to the Pintos base code, but the reference
solution for project 4 is based on the reference solution to project 3.
Thus, the reference solution runs with virtual memory enabled.
@xref{Project 3 FAQ}, for the summary of project 3.

The reference solution represents just one possible solution. Many
other solutions are also possible and many of those differ greatly from
the reference solution. Some excellent solutions may not modify all the
files modified by the reference solution, and some may modify files not
modified by the reference solution.
@verbatim
 devices/timer.c      |   42 ++
 filesys/Make.vars    |    6
 filesys/cache.c      |  473 +++++++++++++++++++++++++
 filesys/cache.h      |   23 +
 filesys/directory.c  |   99 ++++-
 filesys/directory.h  |    3
 filesys/filesys.c    |  194 +++++++++-
 filesys/filesys.h    |    5
 filesys/free-map.c   |   45 +-
 filesys/free-map.h   |    4
 filesys/inode.c      |  444 ++++++++++++++++++-----
 threads/interrupt.c  |    2
 threads/thread.c     |   32 +
 threads/thread.h     |   38 +-
 userprog/exception.c |   12
 userprog/pagedir.c   |   10
 userprog/process.c   |  332 +++++++++++++----
 userprog/syscall.c   |  582 ++++++++++++++++++++++++++++++-
 userprog/syscall.h   |    1
 vm/frame.c           |  161 ++++++++
 vm/page.c            |  297 +++++++++++++++
 30 files changed, 2721 insertions(+), 286 deletions(-)
@end verbatim
@item Can @code{DISK_SECTOR_SIZE} change?

No, @code{DISK_SECTOR_SIZE} is fixed at 512. This is a fixed property
of IDE disk hardware.

* Indexed Files FAQ::
* Subdirectories FAQ::
@node Indexed Files FAQ
@subsection Indexed Files FAQ

@item What is the largest file size that we are supposed to support?

The disk we create will be 8 MB or smaller. However, individual files
will have to be smaller than the disk to accommodate the metadata.
You'll need to consider this when deciding your inode organization.

@node Subdirectories FAQ
@subsection Subdirectories FAQ

@item How should a file name like @samp{a//b} be interpreted?

Multiple consecutive slashes are equivalent to a single slash, so this
file name is the same as @samp{a/b}.

@item How about a file name like @samp{/../x}?

The root directory is its own parent, so it is equivalent to @samp{/x}.

@item How should a file name that ends in @samp{/} be treated?

Most Unix systems allow a slash at the end of the name for a directory,
and reject other names that end in slashes. We will allow this
behavior, as well as simply rejecting a name that ends in a slash.
@node Buffer Cache FAQ
@subsection Buffer Cache FAQ

@item Can we keep a @struct{inode_disk} inside @struct{inode}?

The goal of the 64-block limit is to bound the amount of cached file
system data. If you keep a block of disk data---whether file data or
metadata---anywhere in kernel memory then you have to count it against
the 64-block limit. The same rule applies to anything that's
``similar'' to a block of disk data, such as a @struct{inode_disk}
without the @code{length} or @code{sector_cnt} members.

That means you'll have to change the way the inode implementation
accesses its corresponding on-disk inode, since it currently
just embeds a @struct{inode_disk} in @struct{inode} and reads the
corresponding sector from disk when it's created. Keeping extra
copies of inodes would subvert the 64-block limitation that we place
on your cache.

You can store a pointer to inode data in @struct{inode}, but if you do
so you should carefully make sure that this does not limit your OS to 64
simultaneously open files.
You can also store other information to help you find the inode when you
need it. Similarly, you may store some metadata along each of your 64
cache entries.

You can keep a cached copy of the free map permanently in memory if you
like. It doesn't have to count against the cache size.

@func{byte_to_sector} in @file{filesys/inode.c} uses the
@struct{inode_disk} directly, without first reading that sector from
wherever it was in the storage hierarchy. This will no longer work.
You will need to change @func{byte_to_sector} to obtain the
@struct{inode_disk} from the cache before using it.