PROJECT 4 GRADING ADVICE
========================

This file contains advice for grading project 4.  General advice for
Pintos grading is in README, which you should read first.

INDEXED AND EXTENSIBLE FILES
============================

A1: struct inode_disk should change here.  Many students just present
the new version instead of the detailed changes, which is fine because
most of it changes.  The typical result looks something like this:

    struct inode_disk
      {
        disk_sector_t sectors[SECTOR_CNT]; /* Sectors. */
        enum inode_type type;              /* FILE_INODE or DIR_INODE. */
        off_t length;                      /* File size in bytes. */
        unsigned magic;                    /* Magic number. */
      };

Some details:

- The "sectors" array might be divided into separate arrays or scalars
  devoted to direct, indirect, and doubly indirect blocks.  That's
  fine, and arguably cleaner.

- The "magic" member should be retained, but some groups might drop it
  because they don't understand its purpose.  That's OK.

- A "type" or "is_dir" member might have been added to directory
  entries instead, or its addition might be mentioned in B1, because
  it's more relevant there.

- The "start" member should be dropped.  If it's still there, take off
  points.

- The "unused" array should be dropped, because if there's extra space
  then it can be used for more direct blocks, etc.  If there's an
  unused array that's between 0 and 3 bytes (not elements) long,
  that's fine--probably they needed to pad out some "char" or "short"
  elements to make the whole thing exactly 512 bytes long.  Otherwise,
  take off points.

Most groups add a lock to struct inode that synchronizes file growth
(see A3, A4).

A2: Most groups use 123 or 124 direct blocks, 1 indirect block, and 1
doubly indirect block, which yield the following maximums:

    123 direct: (123 + 128 + 128**2) * 512 = 8,517,120 bytes = 8,317.5 kB
    124 direct: (124 + 128 + 128**2) * 512 = 8,517,632 bytes = 8,318   kB

Occasionally a group thinks that the "max file size" includes metadata
size.  The metadata in either situation above would be of this size:

    (1 + 1 + (1 + 128)) * 512 = 67,072 bytes = 65.5 kB

so that the totals would then be:

    123 direct: 8,517,120 + 67,072 = 8,584,192 bytes = 8,383   kB
    124 direct: 8,517,632 + 67,072 = 8,584,704 bytes = 8,383.5 kB

Check that their math is correct if it doesn't match one of these.
Students must "show their work" or at least give enough numbers in A1,
A2, or A6 to allow you to check their work.  You shouldn't have to
look at the source code to check it.  Take off points if you do.

Some students think that their design has 128 indirect blocks or 128
doubly indirect blocks.  That's not true.  Take off points.  (However,
the blocks that the doubly indirect block points to are indirect
blocks in their own right, so such a design would normally have 129
indirect blocks, not 128, counting the one that the inode points to
directly.)

See also A6.
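For concreteness, here is a minimal sketch of the offset-to-sector
lookup implied by the 123-direct layout above.  It is an illustration
only, not the expected code; read_ptr_block() is a hypothetical helper
that reads one block of 128 sector numbers (e.g. through the buffer
cache):

    #define DIRECT_CNT 123
    #define PTRS_PER_SECTOR 128  /* 512-byte sector / 4-byte disk_sector_t. */

    /* Returns the data sector that holds byte POS of the inode
       described by D.  POS must be within the file. */
    static disk_sector_t
    byte_to_sector (const struct inode_disk *d, off_t pos)
    {
      off_t idx = pos / DISK_SECTOR_SIZE;
      disk_sector_t ptrs[PTRS_PER_SECTOR];

      if (idx < DIRECT_CNT)                 /* Direct block. */
        return d->sectors[idx];
      idx -= DIRECT_CNT;

      if (idx < PTRS_PER_SECTOR)            /* Single indirect block. */
        {
          read_ptr_block (d->sectors[DIRECT_CNT], ptrs);
          return ptrs[idx];
        }
      idx -= PTRS_PER_SECTOR;

      /* Doubly indirect: the first level selects an indirect block,
         the second level selects the data block. */
      read_ptr_block (d->sectors[DIRECT_CNT + 1], ptrs);
      read_ptr_block (ptrs[idx / PTRS_PER_SECTOR], ptrs);
      return ptrs[idx % PTRS_PER_SECTOR];
    }

The three levels of lookup here are also the source of the "up to 3
disk accesses" figure mentioned in A6.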
A3: One important fact here is that Pintos files can grow but never
shrink.  (Although deletion might be thought of as "shrinking" to size
0, this can never be observed by a user process, because the file's
contents are retained as long as a user process has it open.)

A few approaches:

1. The most common approach is to use a lock to make extending a file
   into a critical section.  A write first checks the length of the
   file.  A write that won't extend the file doesn't need to take the
   lock and can proceed immediately.  A write that extends the file
   takes the lock.

   The lock should be per-inode (per-file).  Take off points if it is
   global.

   There are some subtleties to this approach.  The first is that
   checking and updating the length of the file has to be
   synchronized.  If two writers try to write 10 bytes starting at the
   current EOF, they can both see the old file length.  Thus, the file
   length must be checked again after obtaining the lock, e.g.:

       if (final_ofs > disk_inode->length)     // First check.
         {
           lock_acquire (&inode->grow_lock);
           if (final_ofs > disk_inode->length) // Second check.
             {
               // ...extend file...
             }
           lock_release (&inode->grow_lock);
         }

   Students who use this scheme must mention how they handle this
   problem.  If they don't, take off points.

   An even more subtle problem is that of race-free reads of the
   "length" member of disk_inode.  There ideally should be some
   locking around access to it, but a call to barrier() or use of
   volatile would be sufficient also.  However, reading a single
   integer is normally atomic on modern machines, so we'll just
   ignore this problem.

2. Use sector-based locking in the buffer cache.  In this approach,
   the buffer cache supports shared and exclusive locking of sectors.
   Reading or writing a file normally takes a shared lock on the
   inode sector; extending a file requires an exclusive lock on the
   inode sector.

3. If all file accesses (or all write accesses) require a lock and
   hold it during the entire read or write, then that's a trivial
   solution, but it doesn't meet the requirements in the
   Synchronization section of the assignment: "Multiple reads of a
   single file must be able to complete without waiting for one
   another.  When writing to a file does not extend the file,
   multiple processes should also be able to write a single file at
   once."  Take off points.

   This is not the same as taking a lock across a couple of lines of
   code to check or set a variable in a race-free way, etc.  That's
   fine.  The issue is waiting for I/O to complete, not for updating
   little bits of metadata.

A4: The typical solution to A4 depends on the solution chosen for A3:

1. a. Typically, the process extending the file doesn't update the
      file's length until the extension is fully completed.  Thus,
      before the extension is complete, readers won't see that the
      file's length has increased at all.  (See the sketch after this
      list.)

   b. Another solution is to make reads past end-of-file wait for
      file extension to be completed, by making a read past
      end-of-file take the lock needed to extend the file.  That's
      fine too.

2. A process that wants to read the file will need to get a shared
   lock on the inode sector, which it can't get until the extending
   process has released its exclusive lock.  Thus, this approach will
   be correct as long as the exclusive lock on the inode is held
   until the file extension is complete.

3. This approach trivially solves the problem but doesn't fulfill the
   synchronization requirements.  Take off points.
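A minimal sketch of approach 1a, in the context of inode_write_at()
and reusing the per-inode grow_lock from the A3 snippet.  Here
extend_file() and write_into_sectors() are hypothetical helpers that
allocate and zero the new sectors and copy the caller's data into
them, and final_ofs is offset + size, as in A3:

    lock_acquire (&inode->grow_lock);
    if (final_ofs > disk_inode->length)
      {
        /* Allocate and zero the new sectors, then write the new data
           into them, all before publishing the new length. */
        extend_file (disk_inode, final_ofs);
        write_into_sectors (inode, buffer, size, offset);
        barrier ();                         /* Data before length. */
        disk_inode->length = final_ofs;     /* Now readers can see it. */
      }
    lock_release (&inode->grow_lock);

A concurrent reader that checks the length sees either the old value,
so it never touches the half-written sectors, or the new value, only
after the data is already in place.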
A5: Many answers are implicitly predicated on the fact that Pintos
locks and semaphores have fair, first-come, first-served queuing.
Most students don't mention it, though.  Give a bonus point if they
do.

The typical answer depends on the answer to A4:

1. a. Readers, and writers not at end-of-file, don't have to wait, so
      they have no fairness issues.  Writers that extend a file are
      waiting on a lock, which queues their extensions fairly.  After
      a file extension is complete, subsequent reads and writes
      within the new file length are completed fairly because there's
      no need for them to wait.

   b. Essentially the same as 1a, except that readers beyond
      end-of-file are treated like writers beyond end-of-file.
      Because the lock queuing is fair, readers get serviced fairly
      too.

2. Fairness here depends on the fairness of locking in the buffer
   cache.  The group must explain how the buffer cache provides fair
   locking.  Generally, two rules are needed to ensure fairness (one
   possible implementation is sketched at the end of this answer):

   - A rule to prevent readers from waiting for writers forever.  For
     example: when the exclusive holder of a lock releases it, any
     processes waiting for a shared lock are preferred over processes
     waiting for an exclusive lock.

   - A rule to prevent writers from waiting for readers forever.  For
     example: if one or more processes hold a shared lock, and one or
     more processes are waiting for an exclusive lock, then no more
     processes may obtain a shared lock until at least one process
     has obtained an exclusive lock.

   The group must explain both rules; take off points otherwise.

3. This approach trivially solves the problem but doesn't fulfill the
   synchronization requirements.  Take off points.

There should be no use of a timer to ensure fairness.  That's not the
right approach.

Fairness is not a matter of what kind of file access is "rare" or
"common".  Take off points from groups that argue, e.g., that their
implementation is unfair if a file is being extended but that file
extension happens "rarely".
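One way to implement both rules from answer 2, sketched with Pintos
locks and condition variables.  This is an illustration, not the only
correct design, and all names here are hypothetical.  (rw_init(),
which calls lock_init() and cond_init() and zeroes the counters, is
omitted.)

    struct rw_lock
      {
        struct lock mutex;                  /* Protects members below. */
        struct condition read_ok, write_ok;
        int readers;                        /* Shared holders. */
        int waiting_readers, waiting_writers;
        int admitted;                       /* Readers let in by rule 1. */
        bool writer;                        /* Exclusive holder present? */
      };

    void
    rw_acquire_shared (struct rw_lock *rw)
    {
      lock_acquire (&rw->mutex);
      rw->waiting_readers++;
      /* Rule 2: a new reader waits while any writer holds or awaits
         the lock, unless this reader was admitted under rule 1.
         (A brand-new reader can steal an admitted slot from a
         longer-waiting one; a production version would fix that.) */
      while (rw->writer || (rw->waiting_writers > 0 && rw->admitted == 0))
        cond_wait (&rw->read_ok, &rw->mutex);
      rw->waiting_readers--;
      if (rw->admitted > 0)
        rw->admitted--;
      rw->readers++;
      lock_release (&rw->mutex);
    }

    void
    rw_release_shared (struct rw_lock *rw)
    {
      lock_acquire (&rw->mutex);
      if (--rw->readers == 0 && rw->waiting_writers > 0)
        cond_signal (&rw->write_ok, &rw->mutex);
      lock_release (&rw->mutex);
    }

    void
    rw_acquire_exclusive (struct rw_lock *rw)
    {
      lock_acquire (&rw->mutex);
      rw->waiting_writers++;
      while (rw->readers > 0 || rw->writer)
        cond_wait (&rw->write_ok, &rw->mutex);
      rw->waiting_writers--;
      rw->writer = true;
      lock_release (&rw->mutex);
    }

    void
    rw_release_exclusive (struct rw_lock *rw)
    {
      lock_acquire (&rw->mutex);
      rw->writer = false;
      if (rw->waiting_readers > 0)
        {
          /* Rule 1: readers already waiting get in before the next
             writer. */
          rw->admitted = rw->waiting_readers;
          cond_broadcast (&rw->read_ok, &rw->mutex);
        }
      else if (rw->waiting_writers > 0)
        cond_signal (&rw->write_ok, &rw->mutex);
      lock_release (&rw->mutex);
    }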
A6: I haven't seen a group try to use a non-multilevel index structure
yet.  Such a choice would need good rationale.

Good answers include mention of advantages of multilevel indexes:

- Both random and sequential access are efficient.

- Large files can be supported, without imposing a penalty on
  efficiency of small files.

- Works in presence of external fragmentation.

- Fast access to data for small files.

Some answers mention some disadvantages of multilevel indexes:

- It takes up to 3 disk accesses to find a data block in a big file
  (although caching helps with sequential access patterns).

- Fixed maximum file size.

- High overhead for small files.

At least two reasons must be mentioned.

SUBDIRECTORIES
==============

B1: struct thread should have a new member to keep track of the
process's current directory.  Possibilities include:

- Pointer to a struct dir or struct inode.  This is the best choice.

- A string.  This is not a good choice (see B6).

- A sector number.  This is not a good choice (see B6).

struct dir_entry or struct inode_disk should have a new member to
distinguish ordinary files from directories.  It is often named
something like "type" or "is_dir".  This might have been mentioned in
A1 instead; look there if it's missing.  If it's not mentioned in
either place, take off points.

Directory synchronization possibilities include:

- Most implementations add a lock to struct inode or struct dir and
  use it to synchronize directory operations.  This is simple and
  effective.

- Some implementations use a global list of open directories.  This
  generally requires a new global lock to control adding and removing
  list elements.  If there's a list but no locking or other
  explanation, deduct points.  In addition, this implementation needs
  some way to wait for a particular directory and to free directories
  when no one is still using them.  This can be done with an
  additional per-directory lock and a waiting-processes counter.  If
  there's no such data or explanation, deduct points.

- Some implementations use a global lock, or a few of them, to
  synchronize directory operations.  This is unacceptable; there is
  no reason to do global locking.  Deduct points.

B2: There's nothing tricky here, really.  This should be a
straightforward description of straightforward code.  There are only
two slightly tricky bits:

- Absolute versus relative names.  When the current directory is
  represented by a struct dir or struct inode, the difference should
  just be whether traversal starts from the root directory or the
  current directory.  When the current directory is represented by a
  string, either way you start from the root directory, but if it's a
  relative name then you first traverse to the current directory
  before traversing the provided name.

- Whether to traverse the final component of the name.  Some system
  calls, e.g. create, remove, mkdir, don't want to traverse the final
  component, because they want to act on a directory entry.  Others,
  e.g. open, chdir, do want to traverse the final component, because
  they want to act on an inode.

The best answers talk about both of these.

B3: I'm looking for roughly this: It determines components of the
current working directory (cwd) in reverse order, moving from the cwd
to the root.  It first determines the inumber of ".".  Then it opens
".." and reads its entries until it finds the one that has that
inumber.  The corresponding name is the last component of the cwd.
Then it reads "../..", looking for the inumber of "..", which
determines the second-to-last component of the cwd.  It stops when
the inumber of a directory and its parent are the same, because that
means it's at the root.  (A sketch follows.)

This question is new for summer 2006, so the answers may not be what
I expect.
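A rough sketch of that algorithm, with error handling omitted.  It
assumes that "." and ".." entries exist and that dir_readdir() skips
them, both project 4 details that vary by group; recursion stands in
for the "reverse order" accumulation by printing ancestors first:

    static void
    print_cwd (struct dir *dir)
    {
      struct inode *self, *parent_inode;

      dir_lookup (dir, ".", &self);
      dir_lookup (dir, "..", &parent_inode);

      if (inode_get_inumber (self) == inode_get_inumber (parent_inode))
        inode_close (parent_inode);   /* "." == "..": at the root, so
                                         print nothing.  (A real version
                                         prints "/" when the whole cwd
                                         is the root.) */
      else
        {
          struct dir *parent = dir_open (parent_inode);
          char name[NAME_MAX + 1];

          print_cwd (parent);         /* Ancestors first. */

          /* Our name is the parent entry whose inumber matches ".". */
          while (dir_readdir (parent, name))
            {
              struct inode *entry;
              bool found;

              if (!dir_lookup (parent, name, &entry))
                continue;
              found = inode_get_inumber (entry) == inode_get_inumber (self);
              inode_close (entry);
              if (found)
                {
                  printf ("/%s", name);
                  break;
                }
            }
          dir_close (parent);
        }
      inode_close (self);
    }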
B4: This should explain how to apply the directory synchronization
whose data structure was added in B1.  Most commonly, it's just a
matter of taking the directory's lock around the directory operation.

B5: The typical answer depends on how the current directory is
represented:

- Groups that use a struct dir or struct inode pointer to represent
  the current working directory usually answer "yes" (that they do
  prevent removal).  The usual way to prevent removal is through the
  open_cnt member in struct inode: a value of 1 means that it may be
  removed, a higher value means that it's also open by some other
  process or in use as a current working directory.  (It's a value of
  1, not 0, because typically the "remove" system call opens the
  directory inode as part of deleting it.)

- Groups that represent the current working directory with a string
  usually answer "no" (that removal is allowed), because it requires
  no additional effort: when a directory is deleted, traversing to it
  through a string fails, so operations on the directory also fail.

- Groups that use a sector number to represent the current working
  directory are hard to predict.  If they answer "no" (that removal
  is allowed), they may try to claim that the deleted directory inode
  on disk is recognized as deleted and that based on that it is
  special cased.  But that won't work reliably, because the inode may
  be reused at any time.  Deduct points.  If they give another
  answer, you're on your own.

If the answer is "yes" (that removal is prevented), extra code should
not be necessary to avoid deleting the chain of parents of processes'
working directories.  None of those directories are empty (they have
at least one child directory), so they may not be deleted on that
basis.  Deduct points if students added special code for this case.

B6: The rationale depends on how the current directory is
represented:

- Groups that use a struct dir or struct inode pointer to represent
  the current working directory:

  * Access to the cwd is fast and constant time, with no traversal
    necessary (better than a string representation).

  * Storage of the cwd is small and constant space (better than a
    string representation).

  * Easy to tell how many users a directory has (to prevent removing
    an in-use directory).

  * No need to limit the depth of the cwd within the file system and
    no need for dynamic allocation in the chdir system call (better
    than a string representation).

- Groups that represent the current working directory with a string
  typically make these claims:

  * No extra work is needed to handle removed directories.  This is
    true.

  * Recreating a removed directory makes operations in that directory
    work again.  This is not normally desirable.  In a more advanced
    OS environment, where files and directories can be renamed and
    moved, it is normally desirable for a process to retain its
    current directory across a series of operations, not for it to
    suddenly be in an entirely new place.

  * Operations that use file names are relatively infrequent compared
    to operations on file descriptors, so the penalty for having to
    re-traverse paths on each operation is amortized over many
    operations.  I doubt this is true.

  * Space efficiency (!).  This is obviously false.

- Groups that use a sector number to represent the current working
  directory: I haven't seen any of these groups make sensible claims
  yet.

At least two reasons must be given.

BUFFER CACHE
============

C1: struct inode should drop its struct inode_disk member; if it
doesn't, take off points.  This might have been mentioned in A1
instead.

Expect to see a cache entry structure containing the following
members:

- Sector number.

- Sector data (512 bytes) or a pointer to sector data.

- Dirty bit.

- A way to choose a block to evict, e.g.:

  * Accessed bit.

  * Aging, using multiple "accessed bits".  An access sets the high
    bit, and periodically the bits are shifted to the right.

  * LRU list.  This should probably be a linked list, which makes it
    easy and fast to move an entry to the front.  If a linked list
    element is used, there should be a global list also and probably
    a lock too.  Occasionally some group will rearrange the whole
    64-element array to manage LRU; that's inefficient, so deduct
    points.

  * Some kind of "points" scheme, where accesses add points based on
    the type of access.

  Often, any of these schemes are combined with a bit to distinguish
  data from metadata, so that metadata can be preferentially
  retained.

- Synchronization data, typically a lock.

- Often, a struct hash_elem to allow fast lookup.  This is not
  essential; linear search is fast enough for a 64-element list.

Generally there's a queue of blocks for use by a separate read-ahead
thread, as well as a lock to control access.

C2: Many schemes are possible.  For full credit, students must use an
algorithm that is at least as good as the second-chance (clock)
algorithm.  Some implementations I've seen, with comments:

- Reserve 1/3 of the cache for data, 2/3 for metadata, and do local
  replacement within each cache.  Manage metadata in LRU fashion,
  data in MRU fashion in the hope that sequential access to data is
  common.  This is fine, although I suspect that MRU may not be the
  best choice.

- Use the clock algorithm across data and metadata.  Skip over
  metadata on the first trip around the clock.  Special case calls to
  the "read" system call with "large" sizes, scheduling them for
  quick eviction in the hope that they're doing sequential access.  I
  suspect that this will actually penalize data *too* much, but it's
  OK.

- Do LRU based on a "points" scheme.  Data accesses add 1 point,
  metadata accesses add 2 points.  This is OK.

It should not be necessary to modify all 64 entries on an access;
deduct points if an implementation does.

Some implementations strictly prefer all metadata over all data, so
that the cache will end up with just 1 data block and 63 metadata
blocks in it much of the time.  This leans too far toward preferring
metadata.  Deduct points.

C3: The expected implementation is to use a background thread that
sleeps most of the time and flushes the cache whenever it wakes up.
Make sure they don't state or imply that the background thread
busy-waits.
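For reference, a minimal sketch of such a flusher, using timer_sleep()
so that the thread sleeps rather than busy-waits.  cache_flush() is a
hypothetical helper that writes every dirty cache block back to disk,
and the 30-second interval is an arbitrary choice:

    #include "devices/timer.h"

    #define FLUSH_INTERVAL (30 * TIMER_FREQ)   /* 30 seconds, in ticks. */

    static void
    flush_daemon (void *aux UNUSED)
    {
      for (;;)
        {
          timer_sleep (FLUSH_INTERVAL);  /* Blocks; no busy-waiting. */
          cache_flush ();                /* Write back all dirty blocks. */
        }
    }

    /* Started once, at file system initialization: */
    thread_create ("cache-flush", PRI_DEFAULT, flush_daemon, NULL);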
C4: I've seen a few different implementations:

- Most commonly, readers put block numbers on a shared queue.  A
  single background thread gets block numbers from the queue and
  reads the blocks.  This is fine.

- A reader creates a new thread to read the next block.  This creates
  one thread per read request, which is too expensive.  Deduct
  points.

- A single thread does *all* reads, both for immediate requests and
  read-ahead, prioritizing immediate requests.  This is fine (and
  kind of clever).  This thread might be in charge of eviction, too.

C5: There are multiple possibilities:

- "Double-checked locking": Search for the block in the cache without
  taking any lock.  Then, take the lock and see whether it's still
  the same block.  If so, you've got locked access to it.  Otherwise,
  release the lock and start over.  (See the sketch after this
  list.)

- Global lock: A global lock held during the whole eviction process
  normally does not meet the synchronization requirements, which say
  "...when I/O is required on a particular block, operations on other
  blocks that do not require I/O should proceed without having to
  wait for the I/O to complete."  Deduct points.

- Two-level locking: Use one global lock to protect the sector
  numbers in the list from changing, and a second per-block lock to
  prevent eviction.  In pseudo-code it looks like this:

      acquire global lock
      find block in cache
      acquire per-block lock
      release global lock
      do I/O
      release per-block lock

  The problem is that if thread Y wants to access block B while
  thread X is evicting block B, any thread Z that wants to access
  *any* block must wait for X's I/O to finish:

      Thread X             Thread Y           Thread Z
      -------------------  -----------------  --------------------
      get global lock
      find B in cache
      get B's lock
      release global lock
      start I/O            get global lock
                           find B in cache
                           block on B's lock
                                              block on global lock

  As you can see, Y still holds the global lock while it waits for
  B's lock, so Z has to wait for the global lock to be released,
  which won't happen until Y can get B's lock, which in turn won't
  happen until X's I/O is complete.

  There are ways around this problem.  If the students mention it and
  describe a solution, it's fine.  If they don't mention it at all,
  deduct points.

- Only one, specialized thread is allowed to do eviction.
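A minimal sketch of the double-checked locking scheme from the first
bullet above.  cache_find() is a hypothetical helper that scans the
64-entry array without taking any lock:

    struct cache_block
      {
        struct lock lock;              /* Per-block lock (see C1). */
        disk_sector_t sector;          /* Sector cached in this slot. */
        /* ...data, dirty bit, etc. (see C1)... */
      };

    static struct cache_block *
    cache_lock_sector (disk_sector_t sector)
    {
      for (;;)
        {
          /* Unlocked search: may race with eviction, so verify below. */
          struct cache_block *b = cache_find (sector);
          if (b == NULL)
            return NULL;            /* Miss: caller must fetch/evict. */

          lock_acquire (&b->lock);
          if (b->sector == sector)
            return b;               /* Still the same block: success. */

          /* The block was evicted and reused while we waited for the
             lock.  Start over. */
          lock_release (&b->lock);
        }
    }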
C6: This is generally a consequence of the locking done in C5.  You
should expect the students to explain that.  Just a sentence or two
is necessary.

C7: Three different workloads must be described, one for each
optimization.

Workloads likely to benefit from buffer caching include:

- Use of a large file, because caching saves time re-reading the
  inode and indirect and doubly indirect blocks.

- Random access within a small set of files that can be completely
  cached.

- Repeated reads with locality in space.

Workloads likely to benefit from read-ahead include:

- Sequential reads, especially when processing is necessary on each
  block, because that allows for greater parallelism.

Workloads likely to benefit from write-behind include:

- Multiple writes to small regions of a file, e.g. appending to a
  file a few bytes at a time.

- Writes interspersed with computation, so that a process never
  blocks on writes.