PROJECT 1 GRADING ADVICE
========================

This file contains advice for grading project 1.  General advice for
Pintos grading is in README, which you should read first.

ALARM CLOCK
===========

Most students solve this problem with a linked list of sleeping
threads and a timer interrupt handler that wakes up sleeping threads
at the appropriate time.  Details that distinguish solutions include:

  - List ordering: Good solutions order the list of threads by
    wake-up time.

  - Time left vs. wake-up time: Good solutions store each thread's
    wake-up time in the list, instead of the number of ticks
    remaining.

  - Interrupt efficiency: Good solutions that order the list by
    wake-up time and store wake-up times (see the above two items)
    only need their timer interrupt to examine as many items in the
    list as are actually waking up on this tick (plus one).
    Otherwise, every sleeping thread must be examined (and possibly
    modified) on every tick.

  - Allocation: Good solutions put the fields needed for sleeping in
    struct thread (or, possibly, in local variables).  Dynamic
    allocation is another possibility, but then malloc() failure must
    be dealt with; it is unacceptable to busy-wait or to fail to wait
    if malloc() fails.  A third possibility is a static array of list
    elements, but that puts a fixed limit on the number of
    simultaneously sleeping threads, which is also unacceptable.

  - Synchronization: Most solutions manipulate the list with
    interrupts disabled to protect against the interrupt handler.
    Other solutions are possible, but most of them are
    incorrect--watch out for missing memory barriers (see the Memory
    Barriers section in the reference guide) or bad assumptions of
    atomicity where there is none.  In most solutions, only the list
    insertion (typically a single call to list_insert_ordered())
    needs to be protected by disabling interrupts.

  - Wake-up precision: Good solutions wake up sleeping threads at
    exactly the tick they should (although the timer interrupt might
    be processed a little late due to disabled interrupts).  Some
    solutions deal with synchronization by delaying a wake-up that
    should occur during a call to timer_sleep() until the following
    timer tick; this is unacceptable.

  - Wake-one vs. wake-all: Good solutions wake up all the threads
    that are scheduled to wake up at a particular timer tick.  Bad
    solutions wake up only one thread and delay waking the next until
    the following timer tick.

A1:

Usually this includes a structure or, as in the sample solution, new
members in struct thread:

    /* Alarm clock. */
    int64_t wakeup_time;                /* Time to wake this thread up. */
    struct list_elem timer_elem;        /* Element in wait_list. */
    struct semaphore timer_sema;        /* Downed to sleep, upped to wake. */

Typically there is also a global list of sleeping threads:

    /* Threads waiting in timer_sleep(). */
    static struct list wait_list;

A2:

Usually, timer_sleep() initializes a data structure (or struct thread
members) and, with interrupts disabled, inserts it at the correct
position in the global list.  Then it goes to sleep by downing a
semaphore.

On each tick, the timer interrupt handler checks the front of the
list.  If the thread there is ready to wake up, the handler pops it
off the list, ups its semaphore, and checks the next one.  If the
front thread is not ready, then none of the others can be ready
either, so the handler is done.

A3:

Ordering the list by wake-up time and storing wake-up times instead
of ticks remaining, typically.  (Some students don't realize that the
latter is an optimization, because it seems so obvious; that's fine,
as long as they mention the first part.)
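To make the expected shape concrete, here is a minimal sketch of the
approach described above (not the sample solution itself), assuming
the struct thread members and wait_list from A1.  The wakeup_less()
and wake_sleepers() names are hypothetical; actual submissions vary:

    /* Orders threads by ascending wake-up time, for
       list_insert_ordered(). */
    static bool
    wakeup_less (const struct list_elem *a, const struct list_elem *b,
                 void *aux UNUSED)
    {
      return (list_entry (a, struct thread, timer_elem)->wakeup_time
              < list_entry (b, struct thread, timer_elem)->wakeup_time);
    }

    void
    timer_sleep (int64_t ticks)
    {
      struct thread *t = thread_current ();
      enum intr_level old_level;

      t->wakeup_time = timer_ticks () + ticks;
      sema_init (&t->timer_sema, 0);

      /* Only the insertion into the shared list needs interrupts
         disabled (see A4 and A5 below). */
      old_level = intr_disable ();
      list_insert_ordered (&wait_list, &t->timer_elem, wakeup_less, NULL);
      intr_set_level (old_level);

      sema_down (&t->timer_sema);
    }

    /* Called from the timer interrupt handler on every tick. */
    static void
    wake_sleepers (int64_t now)
    {
      while (!list_empty (&wait_list))
        {
          struct thread *t = list_entry (list_front (&wait_list),
                                         struct thread, timer_elem);
          if (now < t->wakeup_time)
            break;      /* Front thread not ready, so no one else is. */
          list_pop_front (&wait_list);
          sema_up (&t->timer_sema);
        }
    }

Note that if the timer fires between the insertion and sema_down(),
the semaphore is simply upped first and sema_down() returns at once,
so no wake-up is lost.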
A4:

Typically only the global list itself is shared with other threads,
and the list elements are private, so disabling interrupts while
inserting into the list is sufficient.  If the group introduces
additional synchronization, such as a semaphore, make sure that it is
actually needed.

A5:

Typically, disabling interrupts around the list insertion protects
against races with the timer interrupt.

A6:

A good discussion will mention some of the features on the list of
distinguishing characteristics above.  It is insufficient to compare
against "solutions" that do not satisfy the terms of the assignment,
such as the supplied code that busy-waits.

PRIORITY SCHEDULING
===================

Priority scheduling by itself is a simple exercise.  Priority
donation is more difficult.

Many groups do not seem to completely grasp the structure of priority
donation.  Some properties of priority donation that you may find
useful while grading:

  - A thread can directly donate its priority to at most one donee
    thread at a time, because donation occurs only when a thread is
    blocked waiting for a lock, and a thread can block waiting for
    only one lock at a time.  This property means that each thread
    can keep track of its donee, if any, with a single pointer in
    struct thread.

  - A thread can receive donations from any number of threads,
    because any number of threads can wait on a lock.  The donors may
    be waiting for a single lock or for multiple locks.

    These first two properties, taken together, mean that if a thread
    keeps a "struct list" of its donor threads, a single list_elem in
    the donor's "struct thread" is sufficient, because the donor
    cannot be directly donating to more than one thread.

  - Donations can nest to form a tree, but circularity is impossible
    as long as there is no deadlock in the locking structure.
    Deadlock would itself be a bug, so there is no need to worry
    about circularity in donations.

B1:

The sample solution has essentially a minimal set of additions to
"struct thread":

    /* Scheduler data. */
    int priority;                       /* Priority, including donations. */
    int normal_priority;                /* Priority, without donations. */
    struct list donors;                 /* Threads donating priority to us. */
    struct list_elem donor_elem;        /* Element in donors list. */
    struct thread *donee;               /* Thread we're donating to. */
    struct lock *want_lock;             /* Lock we're waiting to acquire. */

The sample solution redefines the "priority" member to include
priority donations and adds a second "normal_priority" member that
does not include them.  An alternative is to recalculate the donated
priority whenever it is needed, but this happens often, so a good
rationale would be required.

Some submissions omit the list of donors and instead record the
pre-donation priority in the "struct lock" whose ownership causes the
donation.  These submissions typically just restore the original
priority when the lock is released.  But this is insufficient.
Consider a thread of priority 3 that owns locks A and B and that
receives a donation of priority 5 via A, then a donation of priority
8 via B.  If it releases A first, its priority will be restored to 3,
even though it should be 5.  Deduct points for such an
implementation.

A common mistake is dynamically allocating the list of donations.  It
is unacceptable to fail to maintain the required semantics if
malloc() fails.  Failing to check malloc() for failure is also
unacceptable (see the OVERALL section).

A global list of donations is not necessary.  Deduct points if one is
present.
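To see why the donors list handles the lock A/lock B example above
correctly, here is a hedged sketch of recomputation using the B1
members.  The helper name is hypothetical, and it assumes the caller
has disabled interrupts:

    /* Recomputes T's priority from its base priority and its
       donors.  Because the result is derived from the full donors
       list, releasing lock A in the example above still leaves the
       priority-8 donation received via lock B in effect. */
    static void
    thread_recompute_priority (struct thread *t)
    {
      int priority = t->normal_priority;
      struct list_elem *e;

      for (e = list_begin (&t->donors); e != list_end (&t->donors);
           e = list_next (e))
        {
          struct thread *donor = list_entry (e, struct thread,
                                             donor_elem);
          if (donor->priority > priority)
            priority = donor->priority;
        }
      t->priority = priority;
    }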
Most solutions also add a member to struct semaphore_elem, used in
cond_signal() to find the highest-priority waiting thread to signal:

    struct thread *thread;              /* Thread waiting on the condition. */

If the students don't mention such an addition, check whether they
made one.  If so, deduct a few points for failing to mention it.  If
not, read the answer to B3 carefully and make sure that they actually
handle condition variables properly.

B2:

At minimum, the ASCII art here must include three threads and two
locks, e.g. for the sample solution:

      Thread H                  Thread M                  Thread L
    +----------------+        +----------------+        +----------------+
    |priority = 8    |        |priority = 8    |        |priority = 8    |
    |norm_pri = 8    |        |norm_pri = 5    |        |norm_pri = 3    |
    |donors = {}     |        |donors = {H}    |        |donors = {M}    |
    |donee = M       |------->|donee = L       |------->|donee = null    |
    |want_lock = A   |---\    |want_lock = B   |---\    |want_lock = null|
    +----------------+   |    +----------------+   |    +----------------+
        ^                |        ^   ^            |        ^
        |     Lock A     V        |   |   Lock B   V        |
        |   +---------------+     |   | +---------------+   |
        |   |holder = M     |-----/   | |holder = L     |---/
        \---|waiters = {H}  |         \-|waiters = {M}  |
            +---------------+           +---------------+

Here H (base priority 8) waits for lock A held by M (base priority
5), and M in turn waits for lock B held by L (base priority 3), so
the donations raise M's and L's effective priorities to 8.  Some
explanatory text must also be included.

B3:

The answer must describe what happens for each kind of
synchronization primitive.  For semaphores, typically sema_up()
selects the highest-priority thread on the wait list, instead of the
one that has been waiting the longest.  Locks are implemented in
terms of semaphores, so no extra work is typically needed (but the
submission must say this).  Condition variables need the same kind of
change as semaphores.

One minor issue is the ordering of the ready_list and other lists
(such as semaphore wait lists) that keep track of threads waiting for
various events.  It makes sense to maintain these in sorted order,
but threads' priorities can change while they are waiting on lists,
which would require their positions to be adjusted (see the FAQ for
this problem).  In practice it may be easier to simply search the
lists for maximum or minimum elements when required, as done in the
sample solution.

B4:

The expected answer is that lock_acquire() determines the holder of
the lock (lock->holder), if any, and then donates the current
thread's priority to it.  The answer must mention how the donation
cascades through nested donations.

There is a race in lock_acquire() between checking for the lock's
holder and "down"ing the semaphore.  The obvious way to fix it is by
disabling interrupts before checking the holder and keeping them off
until after sema_down() returns.  You will check the code for this by
hand (see below).

B5:

The expected answer is that lock_release() returns the donations to
the threads waiting for the lock, releases the semaphore, and
recomputes the current thread's priority.

Take a closer look if the answer says something about "restoring" a
previous priority, or similar, instead of recomputing it.  This
probably indicates that the implementation is buggy: it will fail in
the same way described in the last paragraph of B1 (above).  Deduct
points for such an implementation.
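For reference while reading submissions, here is a hedged sketch of
how the B4 answer often looks in code, using the B1 members above.
It is simplified (it does not show re-establishing donations from the
remaining waiters after the lock changes hands), and it is not the
only correct shape:

    void
    lock_acquire (struct lock *lock)
    {
      struct thread *cur = thread_current ();
      /* Disabling interrupts closes the race between examining
         lock->holder and downing the semaphore (see B4 above and B7
         below). */
      enum intr_level old_level = intr_disable ();

      if (lock->holder != NULL)
        {
          struct thread *t;

          cur->want_lock = lock;
          cur->donee = lock->holder;
          list_push_back (&lock->holder->donors, &cur->donor_elem);

          /* Cascade the donation through nested donees; the chain
             cannot be circular unless the locking structure
             deadlocks. */
          for (t = cur->donee; t != NULL; t = t->donee)
            if (t->priority < cur->priority)
              t->priority = cur->priority;
        }

      sema_down (&lock->semaphore);

      /* We hold the lock now; per B5, lock_release() in the previous
         holder already removed us from its donors list. */
      cur->want_lock = NULL;
      cur->donee = NULL;
      lock->holder = cur;
      intr_set_level (old_level);
    }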
B6:

The expected answer is that thread_set_priority() needs to recompute
the priority based on the highest donated priority and the new base
priority (see the FAQ for this problem for details).  However, the
donated priorities can change during the computation (e.g. a new
donation can occur when another thread attempts to take a lock owned
by this thread).

A lock cannot easily be used to avoid this race, because
lock_acquire() would need to take the lock before making a new
donation, which would cause lock_acquire() to call itself.  The
problem is probably solvable, but it is more straightforward to
simply disable interrupts.

Sometimes students identify other races in this function.  That is
acceptable, as long as it is a genuine race and they describe it
adequately.

B7:

The issue mentioned in the final paragraph of B3 above may come up
here.

Check the submitted code, by hand, for the following:

  1. When sema_up() unblocks a thread with a higher priority than the
     running thread, it yields to it, but *not* if it unblocks a
     thread with priority equal to or lower than the running
     thread's.

  2. When sema_up() does not unblock a thread, it does not yield.

  3. A race condition between priority donation, which is probably
     implemented in lock_acquire(), and "down"ing the underlying
     semaphore, in sema_down().  This likely requires disabling
     interrupts in lock_acquire(), as done in the sample solution.

  4. A race condition between the release of donated priorities,
     which is probably implemented in lock_release(), and "up"ing the
     underlying semaphore, in sema_up().  This likely requires
     disabling interrupts in lock_release(), as done in the sample
     solution.

ADVANCED SCHEDULER
==================

C1:

Most groups create a list of all threads so that they can update
recent_cpu in all of them during a timer interrupt, e.g.:

    /* List of all threads. */
    static struct list all_threads_list;

Some new "struct thread" members also make sense:

    int nice;                           /* Niceness. */
    fixed_point_t recent_cpu;           /* Recent amount of CPU time. */
    struct list_elem all_elem;          /* all_threads_list element. */

Typically the load average is maintained as a global variable, e.g.:

    static fixed_point_t load_avg;      /* Load average. */

Most groups choose to abstract fixed-point numbers to some extent, so
there is probably an associated type definition, e.g.:

    /* A fixed-point number. */
    typedef struct
      {
        int f;
      }
    fixed_point_t;

C2 and C3:

The first two rows are unambiguous and must be filled in just as
shown here:

    timer  recent_cpu    priority   thread
    ticks   A   B   C   A   B   C   to run
    -----  --  --  --  --  --  --   ------
        0   0   0   0  63  61  59     A
        4   4   0   0  62  61  59     A

From this point on, the answers will vary, because A and B have equal
priority (61) and there is no clearly stated rule on which to choose.
Some form of round-robin should be used, and in C3 students must
describe how they chose to do it.  The following answers give
priority to the thread that has run least recently:

        8   8   0   0  61  61  59     B
       12   8   4   0  61  60  59     A
       16  12   4   0  60  60  59     B
       20  12   8   0  60  59  59     C
       24  12   8   4  60  59  58     A
       28  16   8   4  59  59  58     B
       32  16  12   4  59  58  58     C
       36  16  12   8  59  58  57     A

Don't feel obligated to check every entry.  However, please do "spot
checks" based on at least these rules of thumb (see the sketch after
this list for recomputing individual entries):

  - Within each row, the recent_cpu values must sum to the number of
    timer ticks.

  - Only one priority value changes from one row to the next,
    decreasing by 1 for the thread that was chosen on the previous
    row.

  - A runs more than B, which runs more than C.
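For those spot checks, it can help to recompute an entry directly
from the priority formula, priority = PRI_MAX - recent_cpu/4 -
2*nice, with PRI_MAX = 63.  The nice values 0, 1, and 2 for A, B, and
C follow from the table's first row; the helper name below is
hypothetical.  Within this table recent_cpu is always a whole number
of ticks, so plain integer arithmetic suffices:

    /* Hypothetical spot-check helper for the table above. */
    static int
    table_priority (int recent_cpu, int nice)
    {
      return 63 - recent_cpu / 4 - 2 * nice;    /* PRI_MAX == 63. */
    }

    /* E.g., in the row for timer tick 8, thread A has
       recent_cpu = 8 and nice = 0, so table_priority (8, 0) == 61,
       as shown. */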
A few more notes:

  - It's possible that there are more ambiguities than this one, but
    that's the one that most students notice.

  - There is no need for non-integers or rounding in the table.
    recent_cpu is only reduced based on load_avg every 100 ticks, and
    the table is not that long.  (It's possible that some group could
    choose to reduce recent_cpu in some single row on the assumption
    that the 100th tick happens there, but no group has actually done
    that yet.)

C4:

This is the first quarter we've used this question, so I'm not sure
what kind of answers to expect.  Please use your judgment in grading
answers.

C5:

Many groups comment here on their choice between a single queue and
an array of 64 queues.  Good reasons for either choice include:

  - Scanning a 64-queue array may waste time if there aren't any
    high-priority processes.  (A good implementation could use a bit
    vector to avoid scanning empty queues, but this may not be
    obvious to students.)

  - A single queue has to be kept sorted, or it has to be searched
    for the maximum-priority element, but there's no such cost with
    64 queues--we can just "push" and "pop" at each end.

  - A single queue is more flexible if PRI_MIN or PRI_MAX is changed,
    especially if PRI_MIN is changed to be nonzero or if the range
    between PRI_MIN and PRI_MAX becomes much larger.

If they don't mention this choice, then they should talk about some
other important choice.  Make sure that each group's answer mentions
at least one advantage and one disadvantage of its design choices.

C6:

Generally the "right" answer here is that the group designed some
kind of abstraction for fixed-point math.  If they didn't, then a
good justification is needed.

Check the submitted code, by hand, for the following:

  1. That the code that updates the load average, recent_cpu, and
     thread priorities once per second is readable.  It is probably
     in thread_tick() in thread.c.  Look at the code that uses the
     fixed-point math, in particular, and try to figure out how it
     implements the formulas for load_avg and for updating
     recent_cpu.  As an example, the core of the code in the
     reference solution is pretty straightforward:

        /* Update load average. */
        load_avg = fix_add (fix_mul (fix_frac (59, 60), load_avg),
                            fix_mul (fix_frac (1, 60),
                                     fix_int (ready_threads)));

        /* Update recent_cpu.  First, calculate some constants: */
        fixed_point_t twice_load = fix_scale (load_avg, 2);
        fixed_point_t twice_load_plus_1 = fix_add (twice_load,
                                                   fix_int (1));
        fixed_point_t load_factor = fix_div (twice_load,
                                             twice_load_plus_1);

        /* Then, for each thread: */
        t->recent_cpu = fix_add (fix_mul (load_factor, t->recent_cpu),
                                 fix_int (t->nice));

     If you can't understand how their code does its calculations,
     take off a lot of points.  If you can't understand it *easily*,
     take off a few.

  2. That the timer interrupt does *not* update every thread's
     priority on every fourth timer tick.  The 4.4BSD scheduler
     specification says that priority is "recalculated once every
     fourth clock tick, for every thread," but the students should
     realize that only the *running* thread's recent_cpu, and
     therefore its priority, will actually change.  This code is
     probably also in thread_tick() in thread.c.

  3. For a race condition in thread_get_recent_cpu() or
     thread_get_load_avg().  Generally these need to be protected by
     disabling interrupts, because the values they return are updated
     in the timer interrupt.  A call to barrier() would be
     acceptable, too.
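For item 3, the usual fix looks something like the following sketch.
It assumes the load_avg variable and fix_scale() shown above;
fix_round(), which rounds a fixed-point value to the nearest integer,
is a hypothetical helper whose name varies between submissions:

    /* Returns 100 times the current system load average, rounded to
       the nearest integer.  load_avg is written by the timer
       interrupt, so it is read with interrupts disabled. */
    int
    thread_get_load_avg (void)
    {
      enum intr_level old_level = intr_disable ();
      fixed_point_t load = load_avg;
      intr_set_level (old_level);

      return fix_round (fix_scale (load, 100));
    }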