pintos-os.org Git - pintos-anon/blob - ta-advice/HW4

   1                        PROJECT 4 GRADING ADVICE
   2                        ========================
   3
   4 This file contains advice for grading project 4.  General advice for
   5 Pintos grading is in README, which you should read first.
   6
   7                      INDEXED AND EXTENSIBLE FILES
   8                      ============================
   9
  10 A1:
  11
  12     struct inode_disk should change here.  Many students just present
  13     the new version instead of the detailed changes, which is fine
  14     because most of it changes.  The typical result looks something
  15     like this:
  16
  17         struct inode_disk
  18           {
  19             disk_sector_t sectors[SECTOR_CNT];  /* Sectors. */
  20             enum inode_type type;               /* FILE_INODE or DIR_INODE. */
  21             off_t length;                       /* File size in bytes. */
  22             unsigned magic;                     /* Magic number. */
  23           };
  24
  25     Some details:
  26
  27         - The "sectors" array might be divided into separate arrays or
  28           scalars devoted to direct, indirect, and doubly indirect
  29           blocks.  That's fine, and arguably cleaner.
  30
  31         - The "magic" member should be retained, but some groups might
  32           drop it because they don't understand its purpose.  That's
  33           OK.
  34
  35         - A "type" or "is_dir" member might have been added to
  36           directory entries instead, or its addition might be
  37           mentioned in B1, because it's more relevant there.
  38
  39         - The "start" member should be dropped.  If it's still there,
  40           take off points.
  41
  42         - The "unused" array should be dropped, because if there's
  43           extra space then it can be used for more direct blocks, etc.
  44           If there's an unused array that's between 0 and 3 bytes (not
  45           elements) long, then that's fine--probably they needed to
  46           pad out some "char" or "short" elements to make the whole
  47           thing exactly 512 bytes long.  Otherwise, take off points.
  48
  49     Most groups add a lock to struct inode that synchronizes file
  50     growth (see A3, A4).
  51
  52 A2:
  53
  54     Most groups use 123 or 124 direct blocks, 1 indirect block, and 1
  55     doubly indirect block, which yield the following maximums:
  56
  57         123 direct: (123 + 128 + 128**2) * 512 = 8,517,120 bytes = 8,317.5 kB
  58         124 direct: (124 + 128 + 128**2) * 512 = 8,517,632 bytes = 8,318 kB
  59
  60     Occasionally a group thinks that the "max file size" includes
  61     metadata size.  The metadata in either situation above would be of
  62     this size:
  63
  64         (1 + 1 + (1 + 128)) * 512 = 67,072 bytes = 65.5 kB
  65
  66     so that the totals would then be
  67
  68         123 direct: 8,517,120 + 67,072 = 8,584,192 bytes = 8,383 kB
  69         124 direct: 8,517,632 + 67,072 = 8,584,704 bytes = 8,383.5 kB
  70
  71     Check that their math is correct if it doesn't match one of these.
  72
  73     Students must "show their work" or at least give enough numbers in
  74     A1, A2, or A6 to allow you to check their work.  You shouldn't
  75     have to look at the source code to check it.  Take off points if
  76     you do.
  77
  78     Some students think that their design has 128 indirect blocks or
  79     128 doubly indirect blocks.  That's not true.  Take off points.
  80     (However, the blocks that the doubly indirect blocks point to are
  81     indirect blocks in their own right.  But in that case they would
  82     normally have 129, not 128, because of the indirect block that the
  83     inode points to directly.)
  84
  85     See also A6.
  86
  87 A3:
  88
  89     One important fact here is that Pintos files can grow but never
  90     shrink.  (Although deletion might be thought of as "shrinking" to
  91     size 0, this can never be observed by a user process, because the
  92     file's contents are retained as long as a user process has it
  93     open.)
  94
  95     A few approaches:
  96
  97         1. The most common approach is to use a lock to make extending
  98            a file into a critical section.  A write first checks the
  99            length of the file.  A write that won't extend the file
 100            doesn't need to take the lock and can proceed immediately.
 101            A write that extends the file takes the lock.
 102
 103            The lock should be per-inode (per-file).  Take off points
 104            if it is global.
 105
 106            There are some subtleties to this approach.  The first is
 107            that checking and updating the length of the file has to be
 108            synchronized.  If two writers try to write 10 bytes
 109            starting at the current EOF, they can both see the old file
 110            length.  Thus, the file length must be checked again after
 111            obtaining the lock, e.g.
 112
 113                 if (final_ofs > disk_inode->length)   // First check.
 114                   {
 115                     lock_acquire (&inode->grow_lock);
 116                     if (final_ofs > disk_inode->length)   // Second check.
 117                       {
 118                          // ...extend file...
 119                       }
 120                     lock_release (&inode->grow_lock);
 121                   }
 122
 123            Students who use this scheme must mention how they handle
 124            this problem.  If they don't, take off points.
 125
 126            An even more subtle problem is that of race-free reads of
 127            the "length" member of disk_inode.  There ideally should be
 128            some locking around access to it, but a call to barrier()
 129            or use of volatile would be sufficient also.  However,
 130            reading a single integer is normally atomic on modern
 131            machines, so we'll just ignore this problem.
 132
 133         2. Use sector-based locking in the buffer cache.  In this
 134            approach, the buffer cache supports shared and exclusive
 135            locking of sectors.  To read or write a file normally can
 136            use a shared lock on the inode sector; extending a file
 137            requires an exclusive lock on the inode sector.
 138
 139         3. If all file accesses (or all write accesses) require a lock
 140            and hold it during the entire read or write, then that's a
 141            trivial solution, but it doesn't meet the requirements in
 142            the Synchronization section of the assignment: "Multiple
 143            reads of a single file must be able to complete without
 144            waiting for one another.  When writing to a file does not
 145            extend the file, multiple processes should also be able to
 146            write a single file at once."  Take off points.
 147
 148            This is not the same as taking a lock across a couple of
 149            lines of code to check or set a variable in a race-free
 150            way, etc.  That's fine.
 151
 152 A4:
 153
 154     The typical solution to A4 depends on the solution chosen for A3:
 155
 156         1. a. Typically, the process extending the file doesn't update
 157               the file's length until the extension is fully
 158               completed.  Thus, before the extension is complete,
 159               readers won't see that the file's length has increased
 160               at all.
 161
 162            b. Another solution is to make reads past end-of-file wait
 163               for file extension to be completed, by making read past
 164               end-of-file take the lock needed to extend the file.
 165               That's fine too.
 166
 167         2. A process that wants to read the file will need to get a
 168            shared lock on the inode sector, which it can't get until
 169            the extending process has released its exclusive lock.
 170            Thus, this approach will be correct as long as the
 171            exclusive lock on the inode is held until the file
 172            extension is complete.
 173
 174         3. This approach trivially solves the problem but doesn't
 175            fulfill the synchronization requirements.  Take off points.
 176
 177 A5:
 178
 179     Many answers are implicitly predicated on the fact that Pintos
 180     locks and semaphores have fair, first-come, first-served queuing.
 181     Most students don't mention it, though.
 182
 183     The typical answer depends on the answer to A4:
 184
 185         1. a. Readers, and writers not at end-of-file, don't have to
 186               wait, so they have no fairness issues.  Writers that
 187               extend a file are waiting on a lock, which queues their
 188               extensions fairly.  After a file extension is complete,
 189               subsequent reads and writes within the new file length
 190               are completed fairly because there's no need for them to
 191               wait.
 192
 193            b. Essentially the same as 1a, except that readers beyond
 194               end-of-file are treated like writers beyond end-of-file.
 195               Because the lock queuing is fair, readers get serviced
 196               fairly too.
 197
 198         2. Fairness here depends on the fairness of locking in the
 199            buffer cache.  The group must explain how the buffer cache
 200            provides fair locking.  Generally, two rules are needed to
 201            ensure fairness:
 202
 203                 - A rule to prevent readers from waiting for writers
 204                   forever.  For example: when the exclusive holder of
 205                   a lock releases it, any processes waiting for a
 206                   shared lock are preferred over processes waiting for
 207                   an exclusive lock.
 208
 209                 - A rule to prevent writers from waiting for readers
 210                   forever.  For example: if one or more processes hold
 211                   a shared lock, and one or more processes are waiting
 212                   for an exclusive lock, then no more processes may
 213                   obtain a shared lock until at least one process has
 214                   obtained an exclusive lock.
 215
 216            The group must explain both rules; take off points
 217            otherwise.
 218
 219         3. This approach trivially solves the problem but doesn't
 220            fulfill the synchronization requirements.  Take off points.
 221
 222     There should be no use of a timer to ensure fairness.  That's not
 223     the right approach.
 224
 225     Fairness is not a matter of what kind of file access is "rare" or
 226     "common".  Take off points from groups that argue, e.g., that
 227     their implementation is unfair if a file is being extended but
 228     that file extension happens "rarely".
 229
 230 A6:
 231
 232     I haven't seen a group try to use a non-multilevel index structure
 233     yet.  Such a choice would need good rationale.
 234
 235     Good answers include mention of advantages of multilevel indexes:
 236
 237         - Both random and sequential access are efficient.
 238
 239         - Large files can be supported, without imposing a penalty on
 240           efficiency of small files.
 241
 242         - Works in presence of external fragmentation.
 243
 244         - Fast access to data for small files.
 245
 246     Some answers mention some disadvantages of multilevel indexes:
 247
 248         - It takes up to 3 disk accesses to find a data block in a big
 249           file (although caching helps with sequential access
 250           patterns).
 251
 252         - Fixed maximum file size.
 253
 254         - High overhead for small files.
 255
 256                             SUBDIRECTORIES
 257                             ==============
 258
 259 B1:
 260
 261     struct thread should have a new member to keep track of the
 262     process's current directory.  Possibilities include:
 263
 264         - Pointer to a struct dir or struct inode.  This is the best
 265           choice.
 266
 267         - A string.  This is not a good choice (see B6).
 268
 269         - A sector number.  This is not a good choice (see B6).
 270
 271     struct dir_entry or struct inode_disk should have a new member to
 272     distinguish ordinary files from directories.  It is often named
 273     something like "type" or "is_dir".  This might have been mentioned
 274     in A1 instead; look there if it's missing.  If it's not mentioned
 275     either place, take off points.
 276
 277     Directory synchronization possibilities include:
 278
 279         - Most implementations add a lock to struct inode or struct
 280           dir and use it to synchronize directory operations.  This is
 281           simple and effective.
 282
 283         - Some implementations use a global list of open directories.
 284           This generally requires a new global lock to control adding
 285           and removing list elements.  If there's a list but no
 286           locking or other explanation, deduct points.
 287
 288           In addition, this implementation needs some way to wait for
 289           a particular directory and to free directories when no one
 290           is still using them.  This can be done with an additional
 291           per-directory lock and a waiting-processes counter.  If
 292           there's no such data or explanation, deduct points.
 293
 294         - Some implementations use a global lock, or a few of them, to
 295           synchronize directory operations.  This is unacceptable;
 296           there is no reason to do global locking.  Deduct points.
 297
 298 B2:
 299
 300     There's nothing tricky here, really.  This should be a
 301     straightforward description of straightforward code.  There are
 302     only two slightly tricky bits:
 303
 304         - Absolute versus relative names.  When the current directory
 305           is represented by a struct dir or struct inode, the
 306           difference should just be whether traversal starts from the
 307           root directory or the current directory.  When the current
 308           directory is represented by a string, either way you start
 309           from the root directory, but if it's a relative name then
 310           you first traverse to the current directory before
 311           traversing the provided name.
 312
 313         - Whether to traverse the final component of the name.  Some
 314           system calls, e.g. create, remove, mkdir, don't want to
 315           traverse the final component, because they want to act on a
 316           directory entry.  Others, e.g. open, chdir, do want to
 317           traverse the final component, because they want to act on an
 318           inode.
 319
 320     The best answers talk about both of these.
 321
 322 B3:
 323
 324     This question is new for summer 2006, so the answers may not be
 325     what I expect.  I'm looking for roughly this:
 326
 327         It determines the current working directory (cwd) in reverse
 328         order, moving from the cwd to the root.  It first determines
 329         the inumber of ".".  Then it opens ".." and reads its entries
 330         until it finds the one that has that inumber.  The
 331         corresponding name is the last component of the cwd.  Then it
 332         reads "../..", looking for the inumber of "..", which
 333         determines the second-to-last component of the cwd.  It stops
 334         when the inumber of a directory and its parent are the same,
 335         because that means it's at the root.
 336
 337 B4:
 338
 339     This should explain how to apply the directory synchronization
 340     whose data structure was added in B1.  Most commonly, it's just a
 341     matter of taking the directory's lock around the directory
 342     operation.
 343
 344 B5:
 345
 346
 347
 348 B6:
 349
 350                              BUFFER CACHE
 351                              ============
 352
 353 C1:
 354
 355     struct inode should drop inode_disk; if it doesn't, then take off
 356     points.  This might have been mentioned in A1 instead.
 357
 358 C2:
 359
 360 C3:
 361
 362 C4:
 363
 364 C5:
 365
 366 C6:
 367
 368 C7: