PROJECT 2 GRADING ADVICE
========================

This file contains advice for grading project 2.  General advice for
Pintos grading is in README, which you should read first.

Most correct solutions don't introduce any new data structures for
this part of the assignment.

If any global variables are introduced, be suspicious about the
possibility of race conditions.

The "Program Startup Details" section in the User Programs chapter
of the Pintos manual describes what needs to be done.

There are two main ways to put argv[] in the correct order:

  - Many solutions use strtok_r() to divide the arguments into
    space-separated strings, store a pointer to each argument in
    a temporary storage area, and then push each pointer on the
    stack in reverse order.

  - Another approach is to push the arguments obtained by
    strtok_r() on the user stack directly, then reverse their
    order when finished.  This is what the sample solution does.

Students must specify how they avoid overflow:

  - One acceptable approach is to check for overflow just before
    pushing each item on the stack.

  - Another acceptable approach is to make two passes.  In the
    first pass, calculate how many bytes of stack space are
    required.  Abort at this point if the stack would overflow.
    Then, in the second pass, actually set up the stack.

  - A third acceptable approach is to allocate additional stack
    pages as the currently allocated page or pages are exhausted.
    If this is done, verify that the case where no more memory is
    available is correctly handled (by returning an error and
    ensuring that the pages already allocated are freed).  Few
    groups choose this approach.

  - Some students assume that any command line under 4,096 bytes
    long will fit.  This is wrong: some space is required for the
    pointers in argv[], as well as a little extra for other data
    that goes on the stack and some space for the user process's
    use as well.

  - Generalizing the previous approach, some students set some
    other fixed limit on command-line length in bytes.  Usually
    they don't properly consider the consequences.  Every two
    bytes in a command line can require four bytes for a pointer
    in argv[], as in a command line that consists of alternating
    space and non-space characters.  Thus, N bytes of command-line
    arguments can require N + 4(N / 2) = 3N bytes of stack space.
    Setting 3N = 4,096, we obtain N = 1,365.  Thus, any solution
    that sets a fixed cap on command-line length greater than
    1,365 bytes is incorrect.  Preferably, any fixed limit should
    be significantly less than that.

  - Some students believe that there is a 128-byte limit on the
    length of a command line.  This is wrong: the 128-byte limit
    only applies to the "pintos" utility's command line.  It does
    not apply to the command line passed to the "exec" system
    call.

  - Some groups limit the number of command-line arguments to a
    fixed value.  There's nothing wrong with that as long as it's
    sufficiently large, e.g. 512.

One common fallacy is the assumption that each string must be
word-aligned for performance.  This is not true.  It's okay if
they align strings, because it's harmless, but stating that it
improves performance should incur a small deduction.

strtok() uses a static object to hold the starting position for
the next token.  Thus, it isn't safe for concurrent use in
different threads (the problem here) or even for nested use
within a single thread.

strtok_r() solves the problem by requiring the caller to provide
storage for the starting position of the next token.

  - Some students believe that strtok_r() does not modify the
    string it tokenizes whereas strtok() does.  This is wrong:
    both functions modify the string that they tokenize.

  - Some students think strtok() uses a "static buffer" for
    parsing.  This is wrong: it modifies the string provided and
    stores a pointer into it statically.

    However, the manpage for strtok on GNU/Linux (at least) says
    that strtok uses a static buffer.  That's wrong.  So don't
    penalize students who repeat that claim.  Instead, please add
    a comment along these lines:

        Although some manpages for strtok do claim that strtok
        uses a static buffer, this is incorrect.  strtok uses a
        user-provided buffer and statically stores a pointer
        into it.

Some correct answers:

  - Increased flexibility: a variety of shells can accept varying
    syntaxes without any need to change the kernel.

  - Simplifies the kernel and improves its security: a simpler
    kernel is a kernel less likely to have important bugs or
    security holes.  This is especially true given the
    complicated parsing rules of real-world shells.

Some incorrect answers:

  - Saves time and disk space: doing parsing in user space or
    kernel space doesn't have a strong time or space trade-off,
    but some students think so.

  - Relative paths: some students think that relative paths are
    implemented by the shell, but this is wrong.

  - Fewer context switches: where parsing is done has no effect
    on the number of context switches required to start a
    process.  (It might be possible to justify this claim by
    giving specific details.)

  - Policy enforcement: some students think that the shell is in
    charge of enforcing system security policies, but this is
    wrong.

Check the submitted code, by hand, for the following:

  1. That the code used for argument passing is readable.  This
     is probably near the end of userprog/process.c, in or near
     setup_stack().  In particular, many stack "push" operations
     are required, so the code to do this should either be simple
     (e.g. "*--p = something;") or, if it is more complicated,
     abstracted into a function or set of functions.

This question has a pretty big answer, because system calls need
quite a few data structures.  The guidelines for B1 are therefore
broken down into B1(i) through B1(iv), below, to make them easier
to follow.

A global lock or semaphore for file system synchronization is
needed, e.g.:

    /* Serializes file system operations. */
    static struct lock fs_lock;

Some mechanism for tracking file descriptors is needed.  There
are three common variants (see B2 for more on this topic):

  - Per-thread list: adds a list to struct thread, like so:

        struct list fds;            /* List of file descriptors. */

    Each member of the list includes at least a file and a file
    handle, e.g.:

        /* A file descriptor, for binding a file handle to a file. */
        struct file_descriptor
          {
            struct list_elem elem;  /* List element. */
            struct file *file;      /* File. */
            int handle;             /* File handle. */
          };

    This case also requires a way to come up with unique file
    descriptors.  The simplest and best way is to add a counter
    to struct thread, e.g.:

        int next_handle;            /* Next handle value. */

    Another alternative is to use a global counter, but this
    requires adding a lock as well.  (It's possible to use the
    global file system lock or some other lock for this, but I
    haven't seen that done in practice.)  Deduct points if
    synchronization is omitted:

        static int next_fd = 2;     /* Next file descriptor. */
        static struct lock fd_lock; /* Protects next_fd. */

  - Per-thread array: adds an array of 128 elements to struct
    thread (128 because the assignment states that as the minimum
    number of open files to support), like so:

        struct file *fds[128];

    Some solutions include some extra members to speed up finding
    an empty slot.

  - Global hash table: adds a global hash table that maps from a
    descriptor number to a "struct file *".  Check that there is
    also a new hash_elem in struct thread in this case.

    Some way to generate unique file descriptors is needed,
    usually a global counter.  A lock or other synchronization is
    also needed to protect it.

    The global data for this case looks something like this:

        static struct hash fd_hash;
        static int fd_next = 2;
        static struct lock fd_lock;

    The hash table elements look something like this:

        struct file_descriptor
          {
            struct hash_elem hash_elem; /* Hash table element. */
            struct file *file;          /* File. */
            int handle;                 /* File handle. */
            struct thread *owner;       /* Owning thread. */
          };

    Notice the inclusion of the "owner" member, which ensures
    that a fd can only be used by the process that opened it.  If
    there's no mechanism to ensure this, deduct points.

Data structures for implementing the "wait" system call are
needed (see B5 and B8 for more on this topic).  Fundamentally,
"wait" requires communication between a process and its children,
usually implemented through shared data.  The shared data might
be added to struct thread, but many solutions separate it into a
separate structure.  At least the following must be shared
between a parent and each of its children:

  - Child's exit status, so that "wait" can return it.

  - Child's thread id, for "wait" to compare against its
    argument.

  - A way for the parent to block until the child dies (usually a
    semaphore).

  - A way for the parent and child to tell whether the other is
    already dead, in a race-free fashion (to ensure that their
    shared data can be freed).

The sample solution uses a new "struct wait_status" for this
purpose:

    /* Tracks the completion of a process.
       Reference held by both the parent, in its `children' list,
       and by the child, in its `wait_status' pointer. */
    struct wait_status
      {
        struct list_elem elem;      /* `children' list element. */
        struct lock lock;           /* Protects ref_cnt. */
        int ref_cnt;                /* 2=child and parent both alive,
                                       1=either child or parent alive,
                                       0=child and parent both dead. */
        tid_t tid;                  /* Child thread id. */
        int exit_code;              /* Child exit code, if dead. */
        struct semaphore dead;      /* 1=child alive, 0=child dead. */
      };

If the shared data is integrated into struct thread, then one of
the cleanest ways to do it is to use a pair of semaphores, both
initially set to 0, like so (see B5/B8 below for more information
on the usage of these semaphores):

    int exit_code;             /* Process's exit status. */
    struct semaphore exiting;  /* Up'd when process ready to exit. */
    struct semaphore may_exit; /* Up'd after parent reads exit code. */

Of course, each thread needs a way to find the shared data for
each of its children and, if the shared data is not stored in
struct thread, its own shared data as well.  The sample solution
adds a pair of members to struct thread:

    struct wait_status *wait_status; /* This process's completion state. */
    struct list children;            /* Completion status of children. */

If the shared data is integrated into struct thread, then a list
of children and a corresponding list element are needed:

    struct list children;           /* List of child processes. */
    struct list_elem child_elem;    /* Used for `children' list. */

Most solutions add an exit code member to struct thread, even if
there is an exit code member in a separate shared data structure.

Some data is needed to allow a process that calls "exec" to wait
until the new process completes loading.  The sample solution,
and some student submissions, just add a new structure.  A
pointer to the new structure is passed to the new thread as the
"void *" pointer argument to thread_create().  Here is the
sample solution's structure:

    /* Data structure shared between process_execute() in the
       invoking thread and start_process() in the newly invoked
       thread. */
    struct exec_info
      {
        const char *file_name;          /* Program to load. */
        struct semaphore load_done;     /* "Up"ed when loading complete. */
        struct wait_status *wait_status;/* Child process. */
        bool success;                   /* Program successfully loaded? */
      };

Many student submissions instead add members to struct thread to
do the same things, which is fine.

See B1(ii) above for information on the basic data structures
used for file descriptors.  Part of the answer to this question
might be found there.

Globally unique and per-process unique file descriptors are both
acceptable.

If a global counter is used to generate file descriptors, there
must be some mention of synchronization.

If an array totalling 512 bytes or more (e.g. 128 elements of
"int" or pointer type) is added to struct thread, then there must
be some mention that this is a large amount of data to add to
struct thread.  This can be mentioned here, or in B1(ii), or in
B10 below.  See the "struct thread" section in the Reference
Guide chapter of the Pintos manual for explanation.

If the fd counter skips the first 3 fd values, with mention that
this is for stdin, stdout, and stderr, then it must also mention
that Pintos does not have stderr.

There seems to be some confusion about locking here among
students.  No locking is necessary for per-thread data
structures, such as a thread-specific list of file descriptors.
Locking is required for data that might be concurrently accessed
by multiple threads.

There is no need for submissions to handle overflow of the
counter used for generating fds.

This question was badly phrased in previous quarters, so there's
no pool of good student answers to compare against.

The answer should say which of the techniques described in
"Accessing User Memory" in the assignment is used.  It should
describe a few of the implementation details, such as the
interfaces of functions used to read or write user memory.

A few students believe that writing to the "eax" member of struct
intr_frame is a form of writing to user memory.  This is wrong:
struct intr_frame is in the kernel stack.  Modifying its "eax"
member changes the value of the EAX register seen by the process
when a system call returns, not any kind of user memory.

Occasionally, a group extends the page fault-based method
described in the Pintos manual.  The method described there has a
little bit of overhead in that it requires the EAX register to be
set up with the address of a recovery handler before accessing
memory.  This is necessary for the page fault handler to know
what to do.  An alternative without this overhead is to maintain
a table that associates the EIP values of potentially faulting
instructions with their recovery handlers.  This technique is
used in the Linux kernel and elsewhere to good effect.

In the first part of this question, students should address their
implementation.  In the second part, they should address an ideal
implementation.  They need to answer both parts, although the
answers may be very brief as long as they are correct.

An ideal implementation would inspect only as many page table
entries as there are aligned virtual pages spanned by the buffer.
Both 2-byte and 4,096-byte buffers can span either 1 or 2 pages,
so they would require 1 or 2 checks.

Student implementations often check page table entries more often
than this.  That's acceptable, within reason.  For example, one
common student implementation checks the first byte and last byte
of the user memory block to be accessed, plus one byte spaced
every 4,096 bytes within the block.  That's fine, although it
often makes a few more checks than absolutely necessary.

Implementations that always inspect every byte read or written
are not acceptable.  Take off points.

Bizarrely, some students believe that checking a page table entry
every 4 bytes within a block is correct.  It's not: it will fail,
for example, in the case of a 2-byte block spanning a pair of
virtual pages.  If they check each *aligned* 4-byte region, then
it's correct, but then they're still doing 1,024 checks per page,
which is still unacceptable.

An implementation that uses page faults to detect bad user memory
accesses requires 0 page table inspections to copy any number of
bytes.  An answer from this point of view is acceptable too.

Some students suggest that further improvements could be made by
page-aligning the buffers passed to the kernel.  That's a
reasonable suggestion.

Some very confused students think that each of the bytes in a
4,096-byte user memory region can be on a separate page.
Penalize this answer.

These questions really have to be graded together, because they
are so closely related.  See B1(iii) above for information on the
basic data structures used for "wait".

A call to "wait" should not require disabling interrupts or
calling thread_block() by hand.  Take off points.

Global locks, semaphores, etc. should not generally be used for
implementing "wait".  Their use forces serialization of process
exit, and they are unnecessary.  They are also, often, an
indication of an incorrect implementation.

Some students indicate that they actually implement the wrong
semantics: if C exits before P, then "wait" by P on C should
return the child's exit code, not -1 as these students say.  Take
off points for this.  (But it's strange that they would think so,
seeing as the tests for "wait" are pretty thorough.)

The "exit" and "wait" system calls should not have to search a
global list of all processes, because that is inefficient and,
more importantly, unnecessary given that it is so easy to keep a
per-thread list of children.  Take off points.

The procedures for "wait" and "exit" vary broadly based on the
data structures used.  If shared data separate from struct thread
is used:

    The procedure for "wait" is roughly this:

      - Find the child in the list of shared data structures.
        (If none is found, return -1.)

      - Wait for the child to die, by downing a semaphore in the
        shared data.

      - Obtain the child's exit code from the shared data.

      - Destroy the shared data structure and remove it from the
        list.

      - Return the exit code.

    The procedure for "exit" is then:

      - Save the exit code in the shared data.

      - Up the semaphore in the data shared with our parent
        process (if any).  In some kind of race-free way (such as
        using a lock and a reference count or pair of boolean
        variables in the shared data area), mark the shared data
        as unused by us and free it if the parent is also dead.

      - Iterate the list of children and, as in the previous
        step, mark them as no longer used by us and free them if
        the child is also dead.

      - Terminate the thread.

    A common variation on the above "wait" procedure is to mark
    the shared data structure as "used" instead of removing it
    from the list and destroying it, then on subsequent "wait"s
    noticing that it is already used, skipping the down of the
    semaphore, and returning -1.

If the shared data is integrated into struct thread:

    The procedure for "wait" is roughly this:

      - Find the child in the list of child processes.  (If none
        is found, return -1.)

      - Wait for the child to die, by downing a semaphore in the
        child thread struct (this is the "exiting" semaphore in
        the example in B1(iii)).

      - Obtain the child's exit code from its thread struct.

      - Remove the child from the list of children.

      - Give the child permission to finish exiting, by "up"ing
        its may_exit semaphore.  (After this action, the child
        thread struct must no longer be accessed, because the
        child could wake up at any time and exit.  In particular,
        this means that the previous two steps *must* come before
        this one.)

      - Return the exit code.

    The procedure for "exit" is then:

      - Save the exit code in the exiting process's thread
        struct.

      - Up the "exiting" semaphore, to allow a waiting parent to
        retrieve the exit code.

      - Iterate the list of children and up each child's may_exit
        semaphore, to allow them to exit when they're ready.

      - Down the "may_exit" semaphore, to wait for our parent to
        retrieve our exit code.

      - Terminate the thread.

    A common variation on the above "wait" procedure is to mark a
    child as "used" instead of allowing it to terminate, then on
    subsequent "wait"s noticing that it is already used, skipping
    the down of the semaphore, and returning -1.  Then the "exit"
    procedure actually gives each child permission to terminate.

The main "trick" in this kind of implementation is that an
exiting process must not terminate until its parent process has
terminated or retrieved its exit status.  (If the exiting process
were allowed to terminate immediately, then its thread struct
would be freed and the exit status could not be retrieved at
all.)  If the students do not explain how they do this, take off
points.

(Students are not required to explain the "exit" procedure as
part of B5, but they usually end up doing so.)

Some groups come up with extraordinarily complicated, confusing
schemes for implementing wait/exit.  If you can't understand
their description, take off points.

You should expect to see an answer to each of the 4 parts in B8.
Here's an example answer that corresponds to the sample solution
(which uses shared data separate from the thread struct):

  - If P calls wait(C) before C exits, P will wait for C to die
    by "down"ing the semaphore in C's wait_status struct, which C
    will "up" when it dies.

  - If P calls wait(C) after C exits, P will "down" the semaphore
    similarly, but no waiting will be required.

  - If P terminates without waiting, before C exits, then C will
    "up" the semaphore when it terminates, but P will never
    "down" it, which is fine.

  - If P terminates without waiting, after C exits, then the same
    thing happens, except that C "up"s the semaphore before P
    dies instead of after.

Students should also say something about freeing resources and
about special cases:

  - When either C or P terminates, it decrements their shared
    wait_status's reference count (under the wait_status lock).
    When the reference count reaches 0, the wait_status is no
    longer referenced, so it is freed.

  - There are no special cases.

There are three common types of answers:

  - Pre-checking: System call handlers verify that all the data
    they need to access can be accessed before allocating any
    resources.  Thus, there's never anything to free on error.

    (Once project 3 rolls around, this will no longer be
    possible, so these students haven't done a good job of
    planning ahead.)

  - Manual freeing: If an error occurs, the system call handler
    manually frees everything allocated up to that point.  Often
    these answers focus on "abstraction", making the point that
    it's easier to free resources if the resources are correctly
    abstracted.

  - Resource pools: All the resources allocated by a thread
    (memory, locks, etc.) are put on a list.  If an error occurs,
    a handler iterates through the list and releases all of the
    resources on it.

    (This is a clever solution, but rare.)

Any of these are fine.

This depends on the data structures defined in B1(iv) above.

The usual solution is to use a semaphore initialized to 0.  The
process calling "exec" downs it.  The new process "up"s it once
the load success or failure is known.  A shared Boolean variable
is used to indicate success or failure.

Some students use thread_block() instead of a semaphore.  Take
off points.

Some students store the success indicator variable in the child
process.  This requires additional synchronization so that the
child process can't terminate before the parent reads the
status.  Typically, an additional semaphore is used to give the
process calling "exec" a chance to read the shared Boolean
variable without a race on the new process's termination.  This
is not a good approach, because it is so easy to do without the
additional synchronization by storing the success indicator in
the parent's thread struct or in a local variable (e.g. as in
the sample solution).  Take off points.

See "B5 and B8", above.

Just make sure of a few things here:

  - They actually answered "why", not "how" or "what".

  - Their answers do not contain any factual inaccuracies
    (e.g. do not claim that Pintos processes inherit file
    descriptors on exec, which contradicts what the assignment
    says).

  - Their answers seem "reasonable".

A typical answer is just "we didn't change anything".  Otherwise,
see "B9 and B10" above.

Take off points if they come up with some bizarre or stupid
scheme and don't properly justify it, e.g. one group decided that
the pid_t for a process should be the sum of its tid_t and its
parent's tid_t.

Take off points if an answer assumes that there can be more than
one thread per process and doesn't mention that this would be an
extension to the Pintos design.

Check the submitted code, by hand, for the following:

  1. That the system call handler, which is probably in
     userprog/syscall.c in syscall_handler(), is reasonably
     abstracted.  It should not all be in one function as one
     huge switch statement, for example.  The code that accesses
     user memory should be separated into specialized functions.
     If either of these is not the case, or if the code is
     otherwise difficult to read, take off points.

  2. The "Source Code" section in the Introduction to the Pintos
     manual says, "Update existing comments as you modify code."
     Most students ignore this.  Check whether the comment on
     process_wait() in process.c has been updated.  At a minimum,
     the two final sentences ("This function will be implemented
     in problem 2-2.  For now, it does nothing.") should have
     been removed.

  3. Find the implementation of the "open" system call.  Verify
     that all resources are released in error cases and that
     errors are properly handled.  One particular problem can be
     malloc() failure after a file is successfully opened--is the
     return value checked?  If so, is the file closed before
     returning an error?

     (Failure to check malloc() is in the OVERALL section of the
     grade report, under DESIGN.)