From 745a05b20f3fa03990307d012c71512d6c996899 Mon Sep 17 00:00:00 2001
From: Ben Pfaff
Date: Mon, 24 Jul 2006 03:30:06 +0000
Subject: [PATCH] Incomplete first draft.

---
 ta-advice/HW2 | 466 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 466 insertions(+)
 create mode 100644 ta-advice/HW2

diff --git a/ta-advice/HW2 b/ta-advice/HW2
new file mode 100644
index 0000000..2127d2e
--- /dev/null
+++ b/ta-advice/HW2
@@ -0,0 +1,466 @@
+                      PROJECT 2 GRADING ADVICE
+                      ========================
+
+This file contains advice for grading project 2.  General advice for
+Pintos grading is in README, which you should read first.
+
+                          ARGUMENT PASSING
+                          ================
+
+A1:
+
+    Most correct solutions don't introduce any new data structures for
+    this part.
+
+    If any global variables are introduced, be suspicious about the
+    possibility of race conditions.
+
+A2:
+
+    The "Program Startup Details" section in the User Programs chapter
+    of the Pintos manual describes what needs to be done.
+
+    There are two main ways to put argv[] in the correct order:
+
+      - Many solutions use strtok_r() to divide the arguments into
+        space-separated strings, storing a pointer to each argument in
+        a temporary storage area, then pushing each pointer on the
+        stack in reverse order.
+
+      - Another approach is to push the arguments obtained by
+        strtok_r() on the user stack directly, then reverse their
+        order when finished.  This is what the sample solution does.
+
+    Students must specify how they avoid overflow:
+
+      - One acceptable approach is to check for overflow just before
+        pushing each item on the stack (see the sketch after this
+        list).
+
+      - Another acceptable approach is to make two passes.  In the
+        first pass, calculate how many bytes of stack space are
+        required.  Abort at this point if the stack would overflow.
+        Then, in the second pass, actually set up the stack.
+
+      - A third acceptable approach is to allocate additional stack
+        pages as the currently allocated page or pages are exhausted.
+        If this is done, verify that the case where no more memory is
+        available is correctly handled (by returning an error and
+        ensuring that the pages already allocated are freed).
+
+        Few groups choose this approach.
+
+      - Some students assume that any command line under 4,096 bytes
+        long will fit.  This is wrong: some space is required for the
+        pointers in argv[], as well as a little extra for other data
+        that goes on the stack and some space for the user process's
+        use as well.
+
+      - Generalizing the previous approach, some students set some
+        other fixed limit on command-line length in bytes.  Usually
+        they don't properly consider the consequences.  Every two
+        bytes in a command line can require four bytes for a pointer
+        in argv[], as in a command line that consists of alternating
+        space and non-space characters.  Thus, N bytes of command-line
+        arguments can require N + 4(N / 2) = 3N bytes of stack space.
+        Setting 3N = 4,096, we obtain N = 1,365.  Thus, any solution
+        that sets a fixed cap on command-line length greater than
+        1,365 bytes is incorrect.  Preferably, any fixed limit should
+        be significantly less than that.
+
+      - Some students believe that there is a 128-byte limit on the
+        length of a command line.  This is wrong: the 128-byte limit
+        only applies to the "pintos" utility's command line.  It does
+        not apply to the command line passed to the "exec" system
+        call.
+
+      - Some groups limit the number of command-line arguments to a
+        fixed value.  There's nothing wrong with that as long as it's
+        sufficiently large, e.g. 512.
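+
+    For reference while reading setup_stack() code, here is a minimal
+    sketch of the first acceptable approach (checking for overflow
+    just before each push).  It is illustrative only, not the sample
+    solution: the helper name is made up, and it assumes a single-page
+    stack plus PHYS_BASE and PGSIZE from threads/vaddr.h and memcpy()
+    from <string.h>.
+
+        /* Pushes SIZE bytes from BUF onto the user stack whose top is
+           *ESP and which occupies the single page just below
+           PHYS_BASE.  Returns the new stack pointer, or a null
+           pointer if the push would overflow the page. */
+        static void *
+        push (void **esp, const void *buf, size_t size)
+        {
+          size_t used = (uint8_t *) PHYS_BASE - (uint8_t *) *esp;
+          if (size > PGSIZE - used)
+            return NULL;                /* Stack page would overflow. */
+          *esp = (uint8_t *) *esp - size;
+          memcpy (*esp, buf, size);
+          return *esp;
+        }
+
+    setup_stack() would then call a helper like this once per argument
+    string, once for word-alignment padding, once per argv[] pointer
+    (pushed in reverse order), and once each for argv, argc, and the
+    fake return address, aborting cleanly if any call returns a null
+    pointer.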
+ + One common fallacy is an assumption that each string must be + word-aligned for performance. This is not true. It's okay if + they align strings, because it's harmless, but stating that it + improves performance should incur a small deduction. + +A3: + + strtok() uses a static object to hold the starting position for + the next token. Thus, it isn't safe for concurrent use in + different threads (the problem here) or even for nested use within + a single thread. + + strtok_r() solves the problem by requiring the caller to provide + storage for the starting position for the next token. + + Wrong answers: + + - Some students believe that strtok_r() does not modify the + string it tokenizes whereas strtok() does. This is wrong: + both functions modify the string that they tokenize. + + - Some students think strtok() uses a "static buffer" for + parsing. This is wrong: it modifies the string provided and + stores a pointer into it statically. + +A4: + + Some correct answers: + + - Increased flexibility: a variety of shells can accept + varying syntaxes without any need to change the kernel. + + - Simplifies the kernel and improves its security: a simpler + kernel is a kernel less likely to have important bugs or + security holes. This is especially true given the + complicated parsing rules of real-world shells. + + Some wrong answers: + + - Saves time and disk space: doing parsing in user space or + kernel space doesn't have a strong time or space trade-off, + but some students think so. + + - Relative paths: some students think that relative paths are + implemented by the shell, but this is wrong. + + - Fewer context switches: where parsing is done has no effect + on the number of context switches required to start a + process. + + - Policy enforcement: some students think that the shell is in + charge of enforcing system security policies, but this is + not normally true. + +Check the submitted code, by hand, for the following: + + 1. That the code used for argument passing is readable. This is + probably near the end of userprog/process.c, in or near + setup_stack(). In particular, many stack "push" operations are + required, so the code to do this should either be simple + (e.g. "*--p = something;") or, if it is more complicated, + abstracted into a function or set of functions. + + SYSTEM CALLS + ============ + +B1: + + This question has a pretty big answer, because system calls need + quite a few data structures. The guidelines for B1 are therefore + broken down into B1(i) through B1(iv), below, to make them easier + to follow. + +B1(i): + + A global lock or semaphore for file system synchronization is + needed, e.g. + + /* Serializes file system operations. */ + static struct lock fs_lock; + +B1(ii): + + Some mechanism for tracking file descriptors is needed. There are + three common variants (see B2 for more on this topic): + + - Per-thread list: adds a list to struct thread, like so: + + struct list fds; /* List of file descriptors. */ + + Each member of the list includes at least a file and a fd + number, e.g.: + + /* A file descriptor, for binding a file handle to a file. */ + struct file_descriptor + { + struct list_elem elem; /* List element. */ + struct file *file; /* File. */ + int handle; /* File handle. */ + }; + + This case also requires a way to come up with unique file + descriptors. The simplest and best way is to add a counter + to struct thread, e.g.: + + int next_handle; /* Next handle value. 
*/ + + Another alternative is to use a global counter, but this + requires adding a lock as well. (It's possible to use the + global file system lock or some other lock for this, but I + haven't seen that done in practice.) Deduct points if + synchronization is omitted: + + static int next_fd = 2; /* Next file descriptor. */ + static struct lock fd_lock; /* Protects next_fd. */ + + - Per-thread array: adds an array of 128 elements to struct + thread (128 because the assignment states that as a minimum + number): + + struct file *fds[128]; + + Some solutions include some extra members to speed up + finding an empty slot. + + - Global hash table: adds a global hash table that maps from a + descriptor number to a "struct file *". Check that there is + also a new hash_elem in struct thread in this case. + + Some way to generate unique file descriptors is needed, + usually a global counter. A lock or other synchronization + is also required. + + The global data for this case looks something like this: + + static struct hash fd_hash; + static int fd_next = 2; + static struct lock fd_lock; + + The hash table elements look something like this: + + struct file_descriptor + { + struct hash_elem hash_elem; + struct file *file; + int handle; + pid_t owner; + }; + + Notice the inclusion of the "owner" member, which ensures + that a fd can only be used by the process that opened it. + If there's no mechanism to ensure this, deduct points. + +B1(iii): + + Data structures for implementing the "wait" system call are needed + (see B5 and B8 for more on this topic). Fundamentally, "wait" + requires communication between a process and its children, usually + implemented through shared data. The shared data might be added + to struct thread, but many solutions separate it into a separate + structure. At least the following must be shared between a parent + and each of its children: + + - Child's exit status, so that "wait" can return it. + + - Child's thread id, for "wait" to compare against its + argument. + + - A way for the parent to block until the child dies (usually + a semaphore). + + - A way for the parent and child to tell whether the other is + already dead, in a race-free fashion (to ensure that their + shared data can be freed). + + The sample solution uses a new "struct wait_status" for this + purpose: + + /* Tracks the completion of a process. + Reference held by both the parent, in its `children' list, + and by the child, in its `wait_status' pointer. */ + struct wait_status + { + struct list_elem elem; /* `children' list element. */ + struct lock lock; /* Protects ref_cnt. */ + int ref_cnt; /* 2=child and parent both alive, + 1=either child or parent alive, + 0=child and parent both dead. */ + tid_t tid; /* Child thread id. */ + int exit_code; /* Child exit code, if dead. */ + struct semaphore dead; /* 1=child alive, 0=child dead. */ + }; + + Of course, each thread needs a way to find the shared data for + each of its children and, if shared data is not stored in struct + thread, its own shared data as well. The sample solution adds a + pair of members to struct thread: + + struct wait_status *wait_status; /* This process's completion state. */ + struct list children; /* Completion status of children. */ + + Most solutions add an exit code member to struct thread, even if + there is an exit code member in a separate shared data structure. + That's fine. + +B1(iv): + + Some data is needed to allow a process that calls "exec" to wait + until the new process completes loading. 
The sample solution, and + some student submissions, just add a new structure. A pointer to + the new structure is passed to the new thread as the "void *" + pointer argument to thread_create(). Here is the sample + solution's structure: + + /* Data structure shared between process_execute() in the + invoking thread and execute_thread() in the newly invoked + thread. */ + struct exec_info + { + const char *file_name; /* Program to load. */ + struct semaphore load_done; /* "Up"ed when loading complete. */ + struct wait_status *wait_status;/* Child process. */ + bool success; /* Program successfully loaded? */ + }; + + Many student submissions instead add members to struct thread to + do the same things, which is fine. + +B2: + + See B1(ii) above for information on the basic data structures used + for file descriptors. Part of the answer to this question might + be given in B1. + + Globally unique and per-process unique file descriptors are both + acceptable. + + If a global counter is used to generate file descriptors, there + must be some mention of synchronization. + + If an array totalling 512 bytes or more (e.g. 128 elements of + "int" or pointer type) is added to struct thread, then there must + be some mention that this is a large amount of data to add to + struct thread. See the "struct thread" section in the Reference + Guide chapter of the Pintos manual for explanation. + + If the fd counter skips the first 3 fd values, with mention that + this is for stdin, stdout, and stderr, then it must also mention + that Pintos does not have stderr. + + There seems to be some confusion about locking here among + students. No locking is necessary for per-thread data structures, + such as a thread-specific list of file descriptors. Locking is + required for data that might be concurrently accessed by multiple + threads. + + There is no need for submissions to handle overflow of the counter + used for generating fds. + +B3: + + This question was badly phrased in previous quarters, so there's + no pool of good student answers to compare against. + + The answer should say which of the techniques described in + "Accessing User Memory" in the assignment is used. It should + describe a few of the implementation details, such as interfaces + of functions used to read or write user memory. + + A few students believe that writing to the "eax" member of struct + intr_frame is a form of writing to user memory. This is wrong: + struct intr_frame is in the kernel stack. Modifying its "eax" + member changes the value of the EAX register seen by the process + when a system call returns, not any kind of user memory. + + Occasionally, a group extends the page fault-based method + described in the Pintos manual. The method described there has a + little bit of overhead in that it requires the EAX register to be + set up with the address of a recovery handler before accessing + memory. This is necessary for the page fault handler to know what + to do. An alternative without this overhead is to maintain a + table that associates EIP values of potentially faulting + instructions with their recovery handlers. This technique is used + in the Linux kernel and elsewhere to good effect. + +B4: + + In the first part of this question, students should address their + implementation. In the second part, they should address an ideal + implementation. They need to answer both parts, although the + answers may be very brief as long as they are correct. 
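+
+    When judging the numbers of inspections that students report, it
+    may help to have the page arithmetic written out.  The following
+    is only an illustrative sketch (the function name is made up); it
+    assumes pg_no() from threads/vaddr.h.
+
+        /* Number of aligned virtual pages spanned by the SIZE bytes
+           starting at UADDR, which is the minimum number of page
+           table entries that must be inspected up front. */
+        static size_t
+        pages_spanned (const void *uaddr, size_t size)
+        {
+          if (size == 0)
+            return 0;
+          return pg_no ((const uint8_t *) uaddr + size - 1)
+                 - pg_no (uaddr) + 1;
+        }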
+
+    An ideal implementation would only inspect as many page table
+    entries as aligned virtual pages spanned by the buffer.  Both
+    2-byte and 4,096-byte buffers can span either 1 or 2 pages, so
+    they would require 1 or 2 checks.
+
+    Student implementations often check page table entries more often
+    than this.  That's acceptable, within reason.  For example, one
+    common student implementation checks the first byte and last byte
+    of the user memory block to be accessed, plus one byte spaced
+    every 4,096 bytes within the block.  That's fine, although it
+    often makes a few more checks than absolutely necessary.
+
+    Implementations that always inspect every byte read or written are
+    not acceptable.  Take off points.
+
+    Bizarrely, some students believe that checking a page table entry
+    every 4 bytes within a block is correct.  It's not: it will fail,
+    for example, in the case of a 2-byte block spanning a pair of
+    virtual pages.  If they check each *aligned* 4-byte region, then
+    it's correct, but then they're still doing 1,024 checks per page,
+    which is still unacceptable.
+
+    An implementation that uses page faults to detect bad user memory
+    accesses requires 0 page table inspections to copy any number of
+    bytes.  An answer from this point of view is acceptable too.
+
+    Some students suggest that further improvements could be made by
+    page-aligning the buffers passed to the kernel.  That's a
+    reasonable suggestion.
+
+    Some very confused students think that each of the bytes in a
+    4,096-byte user memory region can be on a separate page.  Penalize
+    them.
+
+B5 and B8:
+
+    These questions really have to be graded together, because they
+    are so closely related.  See B1(iii) above for information on the
+    basic data structures used for "wait".
+
+    A call to "wait" should not require disabling interrupts or
+    calling thread_block() by hand.  Take off points.
+
+    Global locks, semaphores, etc. should not generally be used for
+    implementing "wait".  Their use forces serialization of process
+    exit, and they are unnecessary.  They are also, often, an
+    indication of an incorrect implementation.
+
+    Some students indicate that they actually implement the wrong
+    semantics: if C exits before P, then "wait" by P on C should
+    return C's exit code, not -1 as these students say.  Take off
+    points for this.  (But it's strange that they would think so,
+    seeing as the tests for "wait" are pretty thorough.)
+
+    The "exit" and "wait" system calls should not have to search a
+    global list of all processes, because that is inefficient and,
+    more importantly, unnecessary given that it is so easy to keep a
+    per-thread list of children.  Take off points.
+
+    The procedure for "wait" varies broadly based on the data
+    structures used.  If shared data separate from struct thread are
+    used, the procedure is simpler (see the sketch below):
+
+      - Find the child in the list of shared data structures.  (If
+        none is found, return -1.)
+
+      - Wait for the child to die, by downing a semaphore in the
+        shared data.
+
+      - Obtain the child's exit code from the shared data.
+
+      - Destroy the shared data structure and remove it from the
+        list.
+
+      - Return the exit code.
+
+    A common variation on the above is to mark the shared data
+    structure as "used" instead of removing it from the list and
+    destroying it, then on subsequent "wait"s noticing that it is
+    already used, skipping the down of the semaphore, and returning
+    -1.
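+
+    Concretely, the basic procedure above might look something like
+    the sketch below.  This is not the sample solution: it assumes the
+    struct wait_status and per-thread `children' list from B1(iii),
+    and release_child() stands in for whatever race-free scheme the
+    group uses to free the shared data.
+
+        int
+        process_wait (tid_t child_tid)
+        {
+          struct thread *cur = thread_current ();
+          struct list_elem *e;
+
+          for (e = list_begin (&cur->children);
+               e != list_end (&cur->children); e = list_next (e))
+            {
+              struct wait_status *cs
+                = list_entry (e, struct wait_status, elem);
+              if (cs->tid == child_tid)
+                {
+                  int exit_code;
+                  list_remove (e);         /* Second wait returns -1. */
+                  sema_down (&cs->dead);   /* Block until child dies. */
+                  exit_code = cs->exit_code;
+                  release_child (cs);      /* Hypothetical helper. */
+                  return exit_code;
+                }
+            }
+          return -1;           /* Not a child, or already waited for. */
+        }
+
+    The "mark as used" variation would skip the list_remove() call and
+    instead test and set a flag in the shared data before downing the
+    semaphore.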
+ + The procedure for "exit", in this case, is: + + - Up the semaphore in the data shared with our parent process + (if any). In some kind of race-free way (such as using a + lock and a reference count or pair of boolean variables in + the shared data area), mark the shared data as unused by us + and free it if the parent is also dead. + + - Iterate the list of children and, as in the previous step, + mark them as no longer used by us and free them if the child + is also dead. + + -- 2.30.2