--- /dev/null
+ PROJECT 2 GRADING ADVICE
+ ========================
+
+This file contains advice for grading project 2. General advice for
+Pintos grading is in README, which you should read first.
+
+ ARGUMENT PASSING
+ ================
+
+A1:
+
+ Most correct solutions don't introduce any new data structures for
+ this part.
+
+    If any global variables are introduced, be suspicious of the
+    possibility of race conditions.
+
+A2:
+
+ The "Program Startup Details" section in the User Programs chapter
+ of the Pintos manual describes what needs to be done.
+
+ There are two main ways to put argv[] in the correct order:
+
+	- Many solutions use strtok_r() to divide the arguments into
+	  space-separated strings, store a pointer to each argument
+	  in a temporary storage area, and then push each pointer on
+	  the stack in reverse order.
+
+ - Another approach is to push the arguments obtained by
+ strtok_r() on the user stack directly, then reverse their
+ order when finished. This is what the sample solution does.
+
+ Students must specify how they avoid overflow:
+
+ - One acceptable approach is to check for overflow just before
+ pushing each item on the stack.
+
+ - Another acceptable approach is to make two passes. In the
+ first pass, calculate how many bytes of stack space are
+ required. Abort at this point if the stack would overflow.
+ Then, in the second pass, actually set up the stack.
+
+	- A third acceptable approach is to allocate additional stack
+	  pages as the currently allocated page or pages are exhausted.
+ If this is done, verify that the case where no more memory
+ is available is correctly handled (by returning an error and
+ ensuring that the pages already allocated are freed).
+
+ Few groups choose this approach.
+
+	- Some students assume that any command line under 4,096 bytes
+	  will fit.  This is wrong: space is also required for the
+	  pointers in argv[], for a little other data that goes on the
+	  stack, and for the user process's own use.
+
+ - Generalizing the previous approach, some students set some
+ other fixed limit on command-line length in bytes. Usually
+ they don't properly consider the consequences. Every two
+ bytes in a command line can require four bytes for a pointer
+ in argv[], as in a command line that consists of alternating
+ space and non-space characters. Thus, N bytes of
+ command-line arguments can require N + 4(N / 2) = 3N bytes
+ of stack space. Setting 3N = 4,096 we obtain N = 1,365.
+ Thus, any solution that sets a fixed cap on command-line
+ length greater than 1,365 is incorrect. Preferably, any
+ fixed limit should be significantly less than that.
+
+ - Some students believe that there is a 128-byte limit on the
+ length of a command line. This is wrong: the 128-byte limit
+ only applies to the "pintos" utility's command line. It
+ does not apply to the command line passed to the "exec"
+ system call.
+
+ - Some groups limit the number of command line arguments to a
+ fixed value. There's nothing wrong with that as long as
+ it's sufficiently large, e.g. 512.
+
+    One common fallacy is the assumption that each string must be
+    word-aligned for performance.  This is not true.  Aligning
+    strings is harmless, so it's okay if they do it, but stating that
+    it improves performance should incur a small deduction.
+
+A3:
+
+ strtok() uses a static object to hold the starting position for
+ the next token. Thus, it isn't safe for concurrent use in
+ different threads (the problem here) or even for nested use within
+ a single thread.
+
+ strtok_r() solves the problem by requiring the caller to provide
+ storage for the starting position for the next token.
+
+ Wrong answers:
+
+ - Some students believe that strtok_r() does not modify the
+ string it tokenizes whereas strtok() does. This is wrong:
+ both functions modify the string that they tokenize.
+
+ - Some students think strtok() uses a "static buffer" for
+ parsing. This is wrong: it modifies the string provided and
+ stores a pointer into it statically.
+
+A4:
+
+ Some correct answers:
+
+ - Increased flexibility: a variety of shells can accept
+ varying syntaxes without any need to change the kernel.
+
+ - Simplifies the kernel and improves its security: a simpler
+ kernel is a kernel less likely to have important bugs or
+ security holes. This is especially true given the
+ complicated parsing rules of real-world shells.
+
+ Some wrong answers:
+
+	- Saves time and disk space: parsing in user space versus
+	  kernel space involves no significant time or space
+	  trade-off, although some students think it does.
+
+ - Relative paths: some students think that relative paths are
+ implemented by the shell, but this is wrong.
+
+ - Fewer context switches: where parsing is done has no effect
+ on the number of context switches required to start a
+ process.
+
+ - Policy enforcement: some students think that the shell is in
+ charge of enforcing system security policies, but this is
+ not normally true.
+
+Check the submitted code, by hand, for the following:
+
+ 1. That the code used for argument passing is readable. This is
+ probably near the end of userprog/process.c, in or near
+ setup_stack(). In particular, many stack "push" operations are
+ required, so the code to do this should either be simple
+ (e.g. "*--p = something;") or, if it is more complicated,
+ abstracted into a function or set of functions.
+
+ SYSTEM CALLS
+ ============
+
+B1:
+
+ This question has a pretty big answer, because system calls need
+ quite a few data structures. The guidelines for B1 are therefore
+ broken down into B1(i) through B1(iv), below, to make them easier
+ to follow.
+
+B1(i):
+
+ A global lock or semaphore for file system synchronization is
+ needed, e.g.
+
+ /* Serializes file system operations. */
+ static struct lock fs_lock;
+
+B1(ii):
+
+ Some mechanism for tracking file descriptors is needed. There are
+ three common variants (see B2 for more on this topic):
+
+ - Per-thread list: adds a list to struct thread, like so:
+
+ struct list fds; /* List of file descriptors. */
+
+ Each member of the list includes at least a file and a fd
+ number, e.g.:
+
+ /* A file descriptor, for binding a file handle to a file. */
+ struct file_descriptor
+ {
+ struct list_elem elem; /* List element. */
+ struct file *file; /* File. */
+ int handle; /* File handle. */
+ };
+
+ This case also requires a way to come up with unique file
+ descriptors. The simplest and best way is to add a counter
+ to struct thread, e.g.:
+
+ int next_handle; /* Next handle value. */
+
+ Another alternative is to use a global counter, but this
+ requires adding a lock as well. (It's possible to use the
+ global file system lock or some other lock for this, but I
+ haven't seen that done in practice.) Deduct points if
+ synchronization is omitted:
+
+ static int next_fd = 2; /* Next file descriptor. */
+ static struct lock fd_lock; /* Protects next_fd. */
+
+ - Per-thread array: adds an array of 128 elements to struct
+ thread (128 because the assignment states that as a minimum
+ number):
+
+ struct file *fds[128];
+
+ Some solutions include some extra members to speed up
+ finding an empty slot.
+
+ - Global hash table: adds a global hash table that maps from a
+ descriptor number to a "struct file *". Check that there is
+ also a new hash_elem in struct thread in this case.
+
+ Some way to generate unique file descriptors is needed,
+ usually a global counter. A lock or other synchronization
+ is also required.
+
+ The global data for this case looks something like this:
+
+ static struct hash fd_hash;
+ static int fd_next = 2;
+ static struct lock fd_lock;
+
+ The hash table elements look something like this:
+
+ struct file_descriptor
+ {
+ struct hash_elem hash_elem;
+ struct file *file;
+ int handle;
+ pid_t owner;
+ };
+
+ Notice the inclusion of the "owner" member, which ensures
+ that a fd can only be used by the process that opened it.
+ If there's no mechanism to ensure this, deduct points.
+
+B1(iii):
+
+ Data structures for implementing the "wait" system call are needed
+ (see B5 and B8 for more on this topic). Fundamentally, "wait"
+ requires communication between a process and its children, usually
+ implemented through shared data. The shared data might be added
+ to struct thread, but many solutions separate it into a separate
+ structure. At least the following must be shared between a parent
+ and each of its children:
+
+ - Child's exit status, so that "wait" can return it.
+
+ - Child's thread id, for "wait" to compare against its
+ argument.
+
+ - A way for the parent to block until the child dies (usually
+ a semaphore).
+
+ - A way for the parent and child to tell whether the other is
+ already dead, in a race-free fashion (to ensure that their
+ shared data can be freed).
+
+ The sample solution uses a new "struct wait_status" for this
+ purpose:
+
+ /* Tracks the completion of a process.
+ Reference held by both the parent, in its `children' list,
+ and by the child, in its `wait_status' pointer. */
+ struct wait_status
+ {
+ struct list_elem elem; /* `children' list element. */
+ struct lock lock; /* Protects ref_cnt. */
+ int ref_cnt; /* 2=child and parent both alive,
+ 1=either child or parent alive,
+ 0=child and parent both dead. */
+ tid_t tid; /* Child thread id. */
+ int exit_code; /* Child exit code, if dead. */
+ struct semaphore dead; /* 1=child alive, 0=child dead. */
+ };
+
+ Of course, each thread needs a way to find the shared data for
+ each of its children and, if shared data is not stored in struct
+ thread, its own shared data as well. The sample solution adds a
+ pair of members to struct thread:
+
+ struct wait_status *wait_status; /* This process's completion state. */
+ struct list children; /* Completion status of children. */
+
+ Most solutions add an exit code member to struct thread, even if
+ there is an exit code member in a separate shared data structure.
+ That's fine.
+
+B1(iv):
+
+ Some data is needed to allow a process that calls "exec" to wait
+ until the new process completes loading. The sample solution, and
+ some student submissions, just add a new structure. A pointer to
+ the new structure is passed to the new thread as the "void *"
+ pointer argument to thread_create(). Here is the sample
+ solution's structure:
+
+ /* Data structure shared between process_execute() in the
+ invoking thread and execute_thread() in the newly invoked
+ thread. */
+ struct exec_info
+ {
+ const char *file_name; /* Program to load. */
+ struct semaphore load_done; /* "Up"ed when loading complete. */
+ struct wait_status *wait_status;/* Child process. */
+ bool success; /* Program successfully loaded? */
+ };
+
+ Many student submissions instead add members to struct thread to
+ do the same things, which is fine.
+
+B2:
+
+ See B1(ii) above for information on the basic data structures used
+ for file descriptors. Part of the answer to this question might
+ be given in B1.
+
+ Globally unique and per-process unique file descriptors are both
+ acceptable.
+
+ If a global counter is used to generate file descriptors, there
+ must be some mention of synchronization.
+
+ If an array totalling 512 bytes or more (e.g. 128 elements of
+ "int" or pointer type) is added to struct thread, then there must
+ be some mention that this is a large amount of data to add to
+ struct thread. See the "struct thread" section in the Reference
+ Guide chapter of the Pintos manual for explanation.
+
+    If the fd counter skips the first 3 fd values on the grounds that
+    they are reserved for stdin, stdout, and stderr, then the answer
+    must also mention that Pintos does not have stderr.
+
+ There seems to be some confusion about locking here among
+ students. No locking is necessary for per-thread data structures,
+ such as a thread-specific list of file descriptors. Locking is
+ required for data that might be concurrently accessed by multiple
+ threads.
+
+ There is no need for submissions to handle overflow of the counter
+ used for generating fds.
+
+B3:
+
+ This question was badly phrased in previous quarters, so there's
+ no pool of good student answers to compare against.
+
+ The answer should say which of the techniques described in
+ "Accessing User Memory" in the assignment is used. It should
+ describe a few of the implementation details, such as interfaces
+ of functions used to read or write user memory.
+
+ A few students believe that writing to the "eax" member of struct
+ intr_frame is a form of writing to user memory. This is wrong:
+ struct intr_frame is in the kernel stack. Modifying its "eax"
+ member changes the value of the EAX register seen by the process
+ when a system call returns, not any kind of user memory.
+
+ Occasionally, a group extends the page fault-based method
+ described in the Pintos manual. The method described there has a
+ little bit of overhead in that it requires the EAX register to be
+ set up with the address of a recovery handler before accessing
+ memory. This is necessary for the page fault handler to know what
+ to do. An alternative without this overhead is to maintain a
+ table that associates EIP values of potentially faulting
+ instructions with their recovery handlers. This technique is used
+ in the Linux kernel and elsewhere to good effect.
+
+B4:
+
+ In the first part of this question, students should address their
+ implementation. In the second part, they should address an ideal
+ implementation. They need to answer both parts, although the
+ answers may be very brief as long as they are correct.
+
+    An ideal implementation would inspect only as many page table
+    entries as there are aligned virtual pages spanned by the buffer.
+    Both 2-byte and 4,096-byte buffers can span either 1 or 2 pages,
+    so they would require 1 or 2 checks.
+
+ Student implementations often check page table entries more often
+ than this. That's acceptable, within reason. For example, one
+ common student implementation checks the first byte and last byte
+ of the user memory block to be accessed, plus one byte spaced
+ every 4,096 bytes within the block. That's fine, although it
+ often makes a few more checks than absolutely necessary.
+
+ Implementations that always inspect every byte read or written are
+ not acceptable. Take off points.
+
+ Bizarrely, some students believe that checking a page table entry
+ every 4 bytes within a block is correct. It's not: it will fail,
+ for example, in the case of a 2-byte block spanning a pair of
+ virtual pages. If they check each *aligned* 4-byte region, then
+ it's correct, but then they're still doing 1,024 checks per page,
+ which is still unacceptable.
+
+ An implementation that uses page faults to detect bad user memory
+ accesses requires 0 page table inspections to copy any number of
+ bytes. An answer from this point of view is acceptable too.
+
+ Some students suggest that further improvements could be made by
+ page-aligning the buffers passed to the kernel. That's a
+ reasonable suggestion.
+
+    Some very confused students think that each of the bytes in a
+    4,096-byte user memory region can be on a separate page.
+    Penalize them.
+
+B5 and B7:
+
+ These questions really have to be graded together, because they
+ are so closely related. See B1(iii) above for information on the
+ basic data structures used for "wait".
+
+    A call to "wait" should not require disabling interrupts or
+    calling thread_block() by hand.  Take off points if it does.
+
+ Global locks, semaphores, etc. should not generally be used for
+ implementing "wait". Their use forces serialization of process
+ exit, and they are unnecessary. They are also, often, an
+ indication of an incorrect implementation.
+
+ Some students indicate that they actually implement the wrong
+ semantics: If C exits before P, then "wait" by P on C should
+ return the process's exit code, not -1 as these students say.
+ Take off points for this. (But it's strange that they would think
+ so, seeing as the tests for "wait" are pretty thorough.)
+
+ The "exit" and "wait" system calls should not have to search a
+ global list of all processes, because it is inefficient and, more
+ importantly, unnecessary given that it is so easy to keep a
+ per-thread list of children. Take off points.
+
+ The procedure for "wait" varies broadly based on the data
+ structures used. If shared data separate from struct thread are
+ used, the procedure is simpler:
+
+ - Find the child in the list of shared data structures. (If
+ none is found, return -1.)
+
+ - Wait for the child to die, by downing a semaphore in the
+ shared data.
+
+ - Obtain the child's exit code from the shared data.
+
+ - Destroy the shared data structure and remove it from the
+ list.
+
+ - Return the exit code.
+
+    A common variation on the above is to mark the shared data
+    structure as "used" instead of removing it from the list and
+    destroying it; subsequent "wait"s then notice that it is already
+    used, skip the down of the semaphore, and return -1.
+
+ The procedure for "exit", in this case, is:
+
+ - Up the semaphore in the data shared with our parent process
+ (if any). In some kind of race-free way (such as using a
+ lock and a reference count or pair of boolean variables in
+ the shared data area), mark the shared data as unused by us
+ and free it if the parent is also dead.
+
+ - Iterate the list of children and, as in the previous step,
+ mark them as no longer used by us and free them if the child
+ is also dead.
+
+