PROJECT 1 GRADING ADVICE
========================

This file contains advice for grading project 1.  General advice for
Pintos grading is in README, which you should read first.

ALARM CLOCK
===========

Most students solve this problem with a linked list of sleeping
threads and a timer interrupt handler that wakes up sleeping threads
at the appropriate time.  Details that distinguish solutions include:

  - List ordering: Good solutions order the list of threads by
    wake-up time.

  - Time left vs. wake-up time: Good solutions store each thread's
    wake-up time in the list, instead of the number of ticks
    remaining.

  - Interrupt efficiency: Good solutions that order the list by
    wake-up time and store wake-up times (see the above two items)
    only need their timer interrupt to examine as many items in the
    list as are actually waking up on this tick (plus one).
    Otherwise, every sleeping thread must be examined (and possibly
    modified) on every tick.

  - Allocation: Good solutions put the fields needed for sleeping in
    struct thread (or, possibly, in local variables).  Dynamic
    allocation is another possibility, but then malloc() failure must
    be dealt with; it is unacceptable to busy-wait or to fail to wait
    if malloc() fails.  A third possibility is a static array of list
    elements, but that puts a fixed limit on the number of
    simultaneously sleeping threads, which is also unacceptable.

  - Synchronization: Most solutions manipulate the list with
    interrupts disabled to protect against the interrupt handler.
    Other solutions are possible, but most of them are
    incorrect--watch out for missing memory barriers (see the Memory
    Barriers section in the reference guide) or bad assumptions of
    atomicity where there is none.  In most solutions, only the list
    insertion (typically a single call to list_insert_ordered())
    needs to be protected by disabling interrupts.

  - Wake-up precision: Good solutions wake up sleeping threads at
    exactly the tick they should (although the timer interrupt might
    be processed a little late due to disabled interrupts).  Some
    solutions deal with synchronization by delaying a wake-up that
    should occur during a call to timer_sleep() until the following
    timer tick; this is unacceptable.

  - Wake-one vs. wake-all: Good solutions wake up all the threads
    that are scheduled to wake up at a particular timer tick.  Bad
    solutions wake up only one thread and delay waking the next until
    the following timer tick.

A1:

Usually this includes a structure or, as in the sample solution, new
members in struct thread:

    /* Alarm clock. */
    int64_t wakeup_time;                /* Time to wake this thread up. */
    struct list_elem timer_elem;        /* Element in wait_list. */
    struct semaphore timer_sema;        /* Downed to sleep, upped to wake. */

Typically there is also a global list of sleeping threads:

    /* Threads waiting in timer_sleep(). */
    static struct list wait_list;

A2:

Usually, timer_sleep() initializes a data structure (or struct thread
members) and, with interrupts disabled, inserts it at the correct
position in the global list.  Then it goes to sleep by downing a
semaphore.

On each tick, the timer interrupt handler checks the front of the
list.  If the thread there is ready to wake up, the handler pops it
off the list, ups its semaphore, and checks the next one.  If the
front thread is not ready, then none of the others can be ready
either, so the handler is done.

A3:

Ordering the list by wake-up time and storing wake-up times instead
of ticks remaining, typically.  (Some students don't realize that the
latter is an optimization, because it seems so obvious; that's fine,
as long as they mention the first part.)
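To make the expected shape concrete, here is a minimal sketch of the
approach described above (not the sample solution itself), assuming
the struct thread members and wait_list from A1.  The wakeup_less()
and wake_sleepers() names are hypothetical; actual submissions vary:

    /* Orders threads by ascending wake-up time, for
       list_insert_ordered(). */
    static bool
    wakeup_less (const struct list_elem *a, const struct list_elem *b,
                 void *aux UNUSED)
    {
      return (list_entry (a, struct thread, timer_elem)->wakeup_time
              < list_entry (b, struct thread, timer_elem)->wakeup_time);
    }

    void
    timer_sleep (int64_t ticks)
    {
      struct thread *t = thread_current ();
      enum intr_level old_level;

      t->wakeup_time = timer_ticks () + ticks;
      sema_init (&t->timer_sema, 0);

      /* Only the insertion into the shared list needs interrupts
         disabled (see A4 and A5 below). */
      old_level = intr_disable ();
      list_insert_ordered (&wait_list, &t->timer_elem, wakeup_less, NULL);
      intr_set_level (old_level);

      sema_down (&t->timer_sema);
    }

    /* Called from the timer interrupt handler on every tick. */
    static void
    wake_sleepers (int64_t now)
    {
      while (!list_empty (&wait_list))
        {
          struct thread *t = list_entry (list_front (&wait_list),
                                         struct thread, timer_elem);
          if (now < t->wakeup_time)
            break;      /* Front thread not ready, so no one else is. */
          list_pop_front (&wait_list);
          sema_up (&t->timer_sema);
        }
    }

Note that if the timer fires between the insertion and sema_down(),
the semaphore is simply upped first and sema_down() returns at once,
so no wake-up is lost.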
A4:

Typically only the global list itself is shared with other threads,
and the list elements are private, so disabling interrupts while
inserting into the list is sufficient.  If the group introduces
additional synchronization, such as a semaphore, make sure that it is
actually needed.

A5:

Typically, disabling interrupts around the list insertion protects
against races with the timer interrupt.

A6:

A good discussion will mention some of the features on the list of
distinguishing characteristics above.  It is insufficient to compare
against "solutions" that do not satisfy the terms of the assignment,
such as the supplied code that busy-waits.

PRIORITY SCHEDULING
===================

Priority scheduling by itself is a simple exercise.  Priority
donation is more difficult.

Many groups do not seem to completely grasp the structure of priority
donation.  Some properties of priority donation that you may find
useful while grading:

  - A thread can directly donate its priority to at most one donee
    thread at a time, because donation occurs only when a thread is
    blocked waiting for a lock, and a thread can block waiting for
    only one lock at a time.  This property means that each thread
    can keep track of its donee, if any, with a single pointer in
    struct thread.

  - A thread can receive donations from any number of threads,
    because any number of threads can wait on a lock.  The donors may
    be waiting for a single lock or for multiple locks.

    These first two properties, taken together, mean that if a thread
    keeps a "struct list" of its donor threads, a single list_elem in
    the donor's "struct thread" is sufficient, because the donor
    cannot be directly donating to more than one thread.

  - Donations can nest to form a tree, but circularity is impossible
    as long as there is no deadlock in the locking structure.
    Deadlock would itself be a bug, so there is no need to worry
    about circularity in donations.

B1:

The sample solution has essentially a minimal set of additions to
"struct thread":

    /* Scheduler data. */
    int priority;                       /* Priority, including donations. */
    int normal_priority;                /* Priority, without donations. */
    struct list donors;                 /* Threads donating priority to us. */
    struct list_elem donor_elem;        /* Element in donors list. */
    struct thread *donee;               /* Thread we're donating to. */
    struct lock *want_lock;             /* Lock we're waiting to acquire. */

The sample solution redefines the "priority" member to include
priority donations and adds a second "normal_priority" member that
does not include them.  An alternative is to recalculate the donated
priority whenever it is needed, but this happens often, so a good
rationale would be required.

Some submissions omit the list of donors and instead record the
pre-donation priority in the "struct lock" whose ownership causes the
donation.  These submissions typically just restore the original
priority when the lock is released.  But this is insufficient.
Consider a thread of priority 3 that owns locks A and B and that
receives a donation of priority 5 via A, then a donation of priority
8 via B.  If it releases A first, its priority will be restored to 3,
even though it should be 5.  Deduct points for such an
implementation.

A common mistake is dynamically allocating the list of donations.  It
is unacceptable to fail to maintain the required semantics if
malloc() fails.  Failing to check malloc() for failure is also
unacceptable (see the OVERALL section).

A global list of donations is not necessary.  Deduct points if one is
present.
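To see why the donors list handles the lock A/lock B example above
correctly, here is a hedged sketch of recomputation using the B1
members.  The helper name is hypothetical, and it assumes the caller
has disabled interrupts:

    /* Recomputes T's priority from its base priority and its
       donors.  Because the result is derived from the full donors
       list, releasing lock A in the example above still leaves the
       priority-8 donation received via lock B in effect. */
    static void
    thread_recompute_priority (struct thread *t)
    {
      int priority = t->normal_priority;
      struct list_elem *e;

      for (e = list_begin (&t->donors); e != list_end (&t->donors);
           e = list_next (e))
        {
          struct thread *donor = list_entry (e, struct thread,
                                             donor_elem);
          if (donor->priority > priority)
            priority = donor->priority;
        }
      t->priority = priority;
    }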
Most solutions also add a member to struct semaphore_elem, used in
cond_signal() to find the highest-priority waiting thread to signal:

    struct thread *thread;              /* Thread waiting on the condition. */

If the students don't mention such an addition, check whether they
made one.  If so, deduct a few points for failing to mention it.  If
not, read the answer to B3 carefully and make sure that they actually
handle condition variables properly.

B2:

At minimum, the ASCII art here must include three threads and two
locks, e.g. for the sample solution:

      Thread H                  Thread M                  Thread L
    +----------------+        +----------------+        +----------------+
    |priority = 8    |        |priority = 8    |        |priority = 8    |
    |norm_pri = 8    |        |norm_pri = 5    |        |norm_pri = 3    |
    |donors = {}     |        |donors = {H}    |        |donors = {M}    |
    |donee = M       |------->|donee = L       |------->|donee = null    |
    |want_lock = A   |---\    |want_lock = B   |---\    |want_lock = null|
    +----------------+   |    +----------------+   |    +----------------+
        ^                |        ^   ^            |        ^
        |     Lock A     V        |   |   Lock B   V        |
        |   +---------------+     |   | +---------------+   |
        |   |holder = M     |-----/   | |holder = L     |---/
        \---|waiters = {H}  |         \-|waiters = {M}  |
            +---------------+           +---------------+

Here H (base priority 8) waits for lock A held by M (base priority
5), and M in turn waits for lock B held by L (base priority 3), so
the donations raise M's and L's effective priorities to 8.  Some
explanatory text must also be included.

B3:

The answer must describe what happens for each kind of
synchronization primitive.  For semaphores, typically sema_up()
selects the highest-priority thread on the wait list, instead of the
one that has been waiting the longest.  Locks are implemented in
terms of semaphores, so no extra work is typically needed (but the
submission must say this).  Condition variables need the same kind of
change as semaphores.

One minor issue is the ordering of the ready_list and other lists
(such as semaphore wait lists) that keep track of threads waiting for
various events.  It makes sense to maintain these in sorted order,
but threads' priorities can change while they are waiting on lists,
which would require their positions to be adjusted (see the FAQ for
this problem).  In practice it may be easier to simply search the
lists for maximum or minimum elements when required, as done in the
sample solution.

B4:

The expected answer is that lock_acquire() determines the holder of
the lock (lock->holder), if any, and then donates the current
thread's priority to it.  The answer must mention how the donation
cascades through nested donations.

There is a race in lock_acquire() between checking for the lock's
holder and "down"ing the semaphore.  The obvious way to fix it is by
disabling interrupts before checking the holder and keeping them off
until after sema_down() returns.  You will check the code for this by
hand (see below).

B5:

The expected answer is that lock_release() returns the donations to
the threads waiting for the lock, releases the semaphore, and
recomputes the current thread's priority.

Take a closer look if the answer says something about "restoring" a
previous priority, or similar, instead of recomputing it.  This
probably indicates that the implementation is buggy: it will fail in
the same way described in the last paragraph of B1 (above).  Deduct
points for such an implementation.
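For reference while reading submissions, here is a hedged sketch of
how the B4 answer often looks in code, using the B1 members above.
It is simplified (it does not show re-establishing donations from the
remaining waiters after the lock changes hands), and it is not the
only correct shape:

    void
    lock_acquire (struct lock *lock)
    {
      struct thread *cur = thread_current ();
      /* Disabling interrupts closes the race between examining
         lock->holder and downing the semaphore (see B4 above and B7
         below). */
      enum intr_level old_level = intr_disable ();

      if (lock->holder != NULL)
        {
          struct thread *t;

          cur->want_lock = lock;
          cur->donee = lock->holder;
          list_push_back (&lock->holder->donors, &cur->donor_elem);

          /* Cascade the donation through nested donees; the chain
             cannot be circular unless the locking structure
             deadlocks. */
          for (t = cur->donee; t != NULL; t = t->donee)
            if (t->priority < cur->priority)
              t->priority = cur->priority;
        }

      sema_down (&lock->semaphore);

      /* We hold the lock now; per B5, lock_release() in the previous
         holder already removed us from its donors list. */
      cur->want_lock = NULL;
      cur->donee = NULL;
      lock->holder = cur;
      intr_set_level (old_level);
    }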
B6:

The expected answer is that thread_set_priority() needs to recompute
the priority based on the highest donated priority and the new base
priority (see the FAQ for this problem for details).  However, the
donated priorities can change during the computation (e.g. a new
donation can occur when another thread attempts to take a lock owned
by this thread).

A lock cannot easily be used to avoid this race, because
lock_acquire() would need to take the lock before making a new
donation, which would cause lock_acquire() to call itself.  The
problem is probably solvable, but it is more straightforward to
simply disable interrupts.

Sometimes students identify other races in this function.  That is
acceptable, as long as it is a genuine race and they describe it
adequately.

B7:

The issue mentioned in the final paragraph of B3 above may come up
here.

Check the submitted code, by hand, for the following:

  1. When sema_up() unblocks a thread with a higher priority than the
     running thread, it yields to it, but *not* if it unblocks a
     thread with priority equal to or lower than the running
     thread's.

  2. When sema_up() does not unblock a thread, it does not yield.

  3. A race condition between priority donation, which is probably
     implemented in lock_acquire(), and "down"ing the underlying
     semaphore, in sema_down().  This likely requires disabling
     interrupts in lock_acquire(), as done in the sample solution.

  4. A race condition between the release of donated priorities,
     which is probably implemented in lock_release(), and "up"ing the
     underlying semaphore, in sema_up().  This likely requires
     disabling interrupts in lock_release(), as done in the sample
     solution.

ADVANCED SCHEDULER
==================

C1:

Most groups create a list of all threads so that they can update
recent_cpu in all of them during a timer interrupt, e.g.:

    /* List of all threads. */
    static struct list all_threads_list;

Some new "struct thread" members also make sense:

    int nice;                           /* Niceness. */
    fixed_point_t recent_cpu;           /* Recent amount of CPU time. */
    struct list_elem all_elem;          /* all_threads_list element. */

Typically the load average is maintained as a global variable, e.g.:

    static fixed_point_t load_avg;      /* Load average. */

Most groups choose to abstract fixed-point numbers to some extent, so
there is probably an associated type definition, e.g.:

    /* A fixed-point number. */
    typedef struct
      {
        int f;
      }
    fixed_point_t;

C2 and C3:

The first two rows are unambiguous and must be filled in just as
shown here:

    timer  recent_cpu    priority   thread
    ticks   A   B   C   A   B   C   to run
    -----  --  --  --  --  --  --   ------
        0   0   0   0  63  61  59     A
        4   4   0   0  62  61  59     A

From this point on, the answers will vary, because A and B have equal
priority (61) and there is no clearly stated rule on which to choose.
Some form of round-robin should be used, and in C3 students must
describe how they chose to do it.  The following answers give
priority to the thread that has run least recently:

        8   8   0   0  61  61  59     B
       12   8   4   0  61  60  59     A
       16  12   4   0  60  60  59     B
       20  12   8   0  60  59  59     C
       24  12   8   4  60  59  58     A
       28  16   8   4  59  59  58     B
       32  16  12   4  59  58  58     C
       36  16  12   8  59  58  57     A

Don't feel obligated to check every entry.  However, please do "spot
checks" based on at least these rules of thumb (see the sketch after
this list for recomputing individual entries):

  - Within each row, the recent_cpu values must sum to the number of
    timer ticks.

  - Only one priority value changes from one row to the next,
    decreasing by 1 for the thread that was chosen on the previous
    row.

  - A runs more than B, which runs more than C.
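For those spot checks, it can help to recompute an entry directly
from the priority formula, priority = PRI_MAX - recent_cpu/4 -
2*nice, with PRI_MAX = 63.  The nice values 0, 1, and 2 for A, B, and
C follow from the table's first row; the helper name below is
hypothetical.  Within this table recent_cpu is always a whole number
of ticks, so plain integer arithmetic suffices:

    /* Hypothetical spot-check helper for the table above. */
    static int
    table_priority (int recent_cpu, int nice)
    {
      return 63 - recent_cpu / 4 - 2 * nice;    /* PRI_MAX == 63. */
    }

    /* E.g., in the row for timer tick 8, thread A has
       recent_cpu = 8 and nice = 0, so table_priority (8, 0) == 61,
       as shown. */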
A few more notes:

  - It's possible that there are more ambiguities than this one, but
    that's the one that most students notice.

  - There is no need for non-integers or rounding in the table.
    recent_cpu is only reduced based on load_avg every 100 ticks, and
    the table is not that long.  (It's possible that some group could
    choose to reduce recent_cpu in some single row on the assumption
    that the 100th tick happens there, but no group has actually done
    that yet.)

C4:

This is the first quarter we've used this question, so I'm not sure
what kind of answers to expect.  Please use your judgment in grading
answers.

C5:

Many groups comment here on their choice between a single queue and
an array of 64 queues.  Good reasons for either choice include:

  - Scanning a 64-queue array may waste time if there aren't any
    high-priority processes.  (A good implementation could use a bit
    vector to avoid scanning empty queues, but this may not be
    obvious to students.)

  - A single queue has to be kept sorted, or it has to be searched
    for the maximum-priority element, but there's no such cost with
    64 queues--we can just "push" and "pop" at each end.

  - A single queue is more flexible if PRI_MIN or PRI_MAX is changed,
    especially if PRI_MIN is changed to be nonzero or if the range
    between PRI_MIN and PRI_MAX becomes much larger.

If they don't mention this choice, then they should talk about some
other important choice.  Make sure that each group's answer mentions
at least one advantage and one disadvantage of its design choices.

C6:

Generally the "right" answer here is that the group designed some
kind of abstraction for fixed-point math.  If they didn't, then a
good justification is needed.

Check the submitted code, by hand, for the following:

  1. That the code that updates the load average, recent_cpu, and
     thread priorities once per second is readable.  It is probably
     in thread_tick() in thread.c.  Look at the code that uses the
     fixed-point math, in particular, and try to figure out how it
     implements the formulas for load_avg and for updating
     recent_cpu.  As an example, the core of the code in the
     reference solution is pretty straightforward:

        /* Update load average. */
        load_avg = fix_add (fix_mul (fix_frac (59, 60), load_avg),
                            fix_mul (fix_frac (1, 60),
                                     fix_int (ready_threads)));

        /* Update recent_cpu.  First, calculate some constants: */
        fixed_point_t twice_load = fix_scale (load_avg, 2);
        fixed_point_t twice_load_plus_1 = fix_add (twice_load,
                                                   fix_int (1));
        fixed_point_t load_factor = fix_div (twice_load,
                                             twice_load_plus_1);

        /* Then, for each thread: */
        t->recent_cpu = fix_add (fix_mul (load_factor, t->recent_cpu),
                                 fix_int (t->nice));

     If you can't understand how their code does its calculations,
     take off a lot of points.  If you can't understand it *easily*,
     take off a few.

  2. That the timer interrupt does *not* update every thread's
     priority on every fourth timer tick.  The 4.4BSD scheduler
     specification says that priority is "recalculated once every
     fourth clock tick, for every thread," but the students should
     realize that only the *running* thread's recent_cpu, and
     therefore its priority, will actually change.  This code is
     probably also in thread_tick() in thread.c.

  3. For a race condition in thread_get_recent_cpu() or
     thread_get_load_avg().  Generally these need to be protected by
     disabling interrupts, because the values they return are updated
     in the timer interrupt.  A call to barrier() would be
     acceptable, too.
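For item 3, the usual fix looks something like the following sketch.
It assumes the load_avg variable and fix_scale() shown above;
fix_round(), which rounds a fixed-point value to the nearest integer,
is a hypothetical helper whose name varies between submissions:

    /* Returns 100 times the current system load average, rounded to
       the nearest integer.  load_avg is written by the timer
       interrupt, so it is read with interrupts disabled. */
    int
    thread_get_load_avg (void)
    {
      enum intr_level old_level = intr_disable ();
      fixed_point_t load = load_avg;
      intr_set_level (old_level);

      return fix_round (fix_scale (load, 100));
    }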