X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=blobdiff_plain;f=doc%2Ffilesys.texi;h=b71974f717da1437ab95a72ff941e5b8c81219c8;hb=9d6ad9a479ddae9628dc2d45862f9281986c430b;hp=090397c1e4e1fe10330befc771ab837e57ed903a;hpb=43ff64ec54910a7425ee72ee5b3ff3e73a682c27;p=pintos-anon diff --git a/doc/filesys.texi b/doc/filesys.texi index 090397c..b71974f 100644 --- a/doc/filesys.texi +++ b/doc/filesys.texi @@ -26,12 +26,13 @@ Your submission should define @code{THREAD_JOIN_IMPLEMENTED} in @menu * File System New Code:: -* Problem 4-1 Large Files:: +* File System Synchronization:: +* Problem 4-1 Indexed Files:: * Problem 4-2 File Growth:: * Problem 4-3 Subdirectories:: * Problem 4-4 Buffer Cache:: * File System Design Document Requirements:: -* File System FAQ:: +* File System FAQ:: @end menu @node File System New Code @@ -47,7 +48,9 @@ kernel command line. @item filesys.h @itemx filesys.c -Top-level interface to the file system. +Top-level interface to the file system. Please read the long comment +near the top of @file{filesys.c}, which introduces some details of the +file system code as provided. @item directory.h @itemx directory.c @@ -84,19 +87,59 @@ which you will remove. While most of your work will be in @file{filesys}, you should be prepared for interactions with all previous parts (as usual). -@node Problem 4-1 Large Files -@section Problem 4-1: Large Files +@node File System Synchronization +@section Synchronization -Modify the file system to allow the maximum size of a file to be as -large as the disk. You can assume that the disk will not be larger -than 8 MB. In the basic file system, each file is limited to a file -size of just under 64 kB. Each file has a header called an index node -or @dfn{inode} (represented by @struct{inode}) that is a table of -direct pointers to the disk blocks for that file. Since the inode is -stored in one disk sector, the maximum size of a file is limited by -the number of pointers that will fit in one disk sector. Increasing -the limit to 8 MB will require you to implement doubly-indirect -blocks. +The provided file system requires external synchronization, that is, +callers must ensure that only one thread can be running in the file +system code at once. Your submission should use a finer-grained +synchronization strategy. You will need to consider synchronization +issues for each type of file system object. The provided code uses the +following strategies: + +@itemize @bullet +@item +The free map and root directory are read each time they are needed for +an operation, and if they are modified, they are written back before the +operation completes. Thus, the free map is always consistent from an +external viewpoint. + +@item +Inodes are immutable in the provided file system, that is, their content +never changes between creation and deletion. Furthermore, only one copy +of an inode's data is maintained in memory at once, even if the file is +open in multiple contexts. + +@item +File data doesn't have to be consistent because it's just not part of +the model. In Unix and many other operating systems, a read of a file +by one process when the file is being written by another process can +show inconsistent results: it can show that none, all, or part of the +write has completed. (However, after the write system call returns to +its caller, all subsequent readers must see the change.) Similarly, +when two threads write to the same part of a file at the same time, +their data may be interleaved. External synchronization of the provided +file system ensures that reads and writes are fully serialized, but your +file system doesn't have to maintain full serialization as long as it +follows the rules above. +@end itemize + +@node Problem 4-1 Indexed Files +@section Problem 4-1: Indexed Files + +The basic file system allocates files as a single extent, making it +vulnerable to external fragmentation. Eliminate this problem by +modifying the inode structure. In practice, this probably means using +an index structure with direct, indirect, and doubly indirect blocks. +(You are welcome to choose a different scheme as long as you explain the +rationale for it in your design documentation, and as long as it does +not suffer from external fragmentation.) + +You can assume that the disk will not be larger than 8 MB. You must +support files as large as the disk (minus metadata). Each inode is +stored in one disk sector, limiting the number of block pointers that it +can contain. Supporting 8 MB files will require you to implement +doubly-indirect blocks. @node Problem 4-2 File Growth @section Problem 4-2: File Growth @@ -111,6 +154,19 @@ root directory file to expand beyond its current limit of ten files. Make sure that concurrent accesses to the inode remain properly synchronized. +The user is allowed to seek beyond the current end-of-file (EOF). The +seek itself does not extend the file. Writing at a position past EOF +extends the file to the position being written, and any gap between the +previous EOF and the start of the write must be filled with zeros. A +read past EOF returns zero bytes. + +Writing far beyond EOF can cause many blocks to be entirely zero. Some +file systems allocate and write real data blocks for these implicitly +zeroed blocks. Other file systems do not allocate these blocks at all +until they are explicitly written. The latter file systems are said to +support ``sparse files.'' You may adopt either allocation strategy in +your file system. + @node Problem 4-3 Subdirectories @section Problem 4-3: Subdirectories @@ -213,7 +269,11 @@ Note that write-behind makes your filesystem more fragile in the face of crashes. Therefore, you should periodically write all cached blocks to disk. If you have @func{timer_sleep} from the first project working, this is an -excellent application for it. +excellent application for it. (If you're still using the base +implementation of @func{timer_sleep}, be aware that it busy-waits, which +is not an acceptable solution.) If @func{timer_sleep}'s delays seem too +short or too long, reread the explanation of the @option{-r} option to +@command{pintos} (@pxref{Debugging versus Testing}). Likewise, read-ahead is only really useful when done asynchronously. That is, if a process wants disk block 1 from the file, it needs to @@ -235,6 +295,9 @@ As always, submit a design document file summarizing your design. Be sure to cover the following points: @itemize @bullet +@item +How did you choose to synchronize file system operations? + @item How did you structure your inodes? How many blocks did you access directly, via single-indirection, and/or via double-indirection? Why? @@ -421,16 +484,15 @@ the 64-block limit. The same rule applies to anything that's ``similar'' to a block of disk data, such as a @struct{inode_disk} without the @code{length} or @code{sector_cnt} members. +You can keep a cached copy of the free map in memory permanently if you +like. It doesn't have to count against the cache size. + That means you'll have to change the way the inode implementation accesses its corresponding on-disk inode right now, since it currently just embeds a @struct{inode_disk} in @struct{inode} and reads the -corresponding sector in from disk when it's created. - -There are two reasons for not storing inode data in @struct{inode}. -First, keeping extra copies of inodes would be cheating the 64-block -limitation that we place on your cache. Second, if two processes have -the same file open, you will create a huge synchronization headache for -yourself if each @struct{inode} has its own copy of the on-disk inode. +corresponding sector in from disk when it's created. Keeping extra +copies of inodes would be cheating the 64-block limitation that we place +on your cache. You can store pointers to inode data in @struct{inode}, if you want, and you can store some other small amount of information to help you find