pintos-os.org Git - pintos-anon/blob - ta-advice/HW2

   1                        PROJECT 2 GRADING ADVICE
   2                        ========================
   3
   4 This file contains advice for grading project 2.  General advice for
   5 Pintos grading is in README, which you should read first.
   6
   7                            ARGUMENT PASSING
   8                            ================
   9
  10 A1:
  11
  12     Most correct solutions don't introduce any new data structures for
  13     this part.
  14
  15     If any global variables are introduced, be suspicious about the
  16     possibility for race conditions.
  17
  18 A2:
  19
  20     The "Program Startup Details" section in the User Programs chapter
  21     of the Pintos manual describes what needs to be done.
  22
  23     There are two main ways to put argv[] in the correct order:
  24
  25         - Many solutions use strtok_r() to divide the arguments into
  26           space-separated strings, storing a pointer to each argument
  27           in a temporary storage area, then push each pointer on the
  28           stack in reverse order.
  29
  30         - Another approach is to push the arguments obtained by
  31           strtok_r() on the user stack directly, then reverse their
  32           order when finished.  This is what the sample solution does.
  33
  34     Students must specify how they avoid overflow:
  35
  36         - One acceptable approach is to check for overflow just before
  37           pushing each item on the stack.
  38
  39         - Another acceptable approach is to make two passes.  In the
  40           first pass, calculate how many bytes of stack space are
  41           required.  Abort at this point if the stack would overflow.
  42           Then, in the second pass, actually set up the stack.
  43
  44         - A third acceptable approach is to allocate additional stack
  45           pages as the currently allocate page or pages are exhausted.
  46           If this is done, verify that the case where no more memory
  47           is available is correctly handled (by returning an error and
  48           ensuring that the pages already allocated are freed).
  49
  50           Few groups choose this approach.
  51
  52         - Some students assume that any command line under 4,096 bytes
  53           long will fit.  This is wrong: some space is required for
  54           the pointers in argv[], as well as a little extra for other
  55           data that goes on the stack and some space for the user
  56           process's use as well.
  57
  58         - Generalizing the previous approach, some students set some
  59           other fixed limit on command-line length in bytes.  Usually
  60           they don't properly consider the consequences.  Every two
  61           bytes in a command line can require four bytes for a pointer
  62           in argv[], as in a command line that consists of alternating
  63           space and non-space characters.  Thus, N bytes of
  64           command-line arguments can require N + 4(N / 2) = 3N bytes
  65           of stack space.  Setting 3N = 4,096 we obtain N = 1,365.
  66           Thus, any solution that sets a fixed cap on command-line
  67           length greater than 1,365 is incorrect.  Preferably, any
  68           fixed limit should be significantly less than that.
  69
  70         - Some students believe that there is a 128-byte limit on the
  71           length of a command line.  This is wrong: the 128-byte limit
  72           only applies to the "pintos" utility's command line.  It
  73           does not apply to the command line passed to the "exec"
  74           system call.
  75
  76         - Some groups limit the number of command line arguments to a
  77           fixed value.  There's nothing wrong with that as long as
  78           it's sufficiently large, e.g. 512.
  79
  80     One common fallacy is an assumption that each string must be
  81     word-aligned for performance.  This is not true.  It's okay if
  82     they align strings, because it's harmless, but stating that it
  83     improves performance should incur a small deduction.
  84
  85 A3:
  86
  87     strtok() uses a static object to hold the starting position for
  88     the next token.  Thus, it isn't safe for concurrent use in
  89     different threads (the problem here) or even for nested use within
  90     a single thread.
  91
  92     strtok_r() solves the problem by requiring the caller to provide
  93     storage for the starting position for the next token.
  94
  95     Wrong answers:
  96
  97         - Some students believe that strtok_r() does not modify the
  98           string it tokenizes whereas strtok() does.  This is wrong:
  99           both functions modify the string that they tokenize.
 100
 101         - Some students think strtok() uses a "static buffer" for
 102           parsing.  This is wrong: it modifies the string provided and
 103           stores a pointer into it statically.
 104
 105 A4:
 106
 107     Some correct answers:
 108
 109         - Increased flexibility: a variety of shells can accept
 110           varying syntaxes without any need to change the kernel.
 111
 112         - Simplifies the kernel and improves its security: a simpler
 113           kernel is a kernel less likely to have important bugs or
 114           security holes.  This is especially true given the
 115           complicated parsing rules of real-world shells.
 116
 117     Some wrong answers:
 118
 119         - Saves time and disk space: doing parsing in user space or
 120           kernel space doesn't have a strong time or space trade-off,
 121           but some students think so.
 122
 123         - Relative paths: some students think that relative paths are
 124           implemented by the shell, but this is wrong.
 125
 126         - Fewer context switches: where parsing is done has no effect
 127           on the number of context switches required to start a
 128           process.  (It might be possible to justify this claim by
 129           giving specific details.)
 130
 131         - Policy enforcement: some students think that the shell is in
 132           charge of enforcing system security policies, but this is
 133           not normally true.
 134
 135 Check the submitted code, by hand, for the following:
 136
 137   1. That the code used for argument passing is readable.  This is
 138      probably near the end of userprog/process.c, in or near
 139      setup_stack().  In particular, many stack "push" operations are
 140      required, so the code to do this should either be simple
 141      (e.g. "*--p = something;") or, if it is more complicated,
 142      abstracted into a function or set of functions.
 143
 144                              SYSTEM CALLS
 145                              ============
 146
 147 B1:
 148
 149     This question has a pretty big answer, because system calls need
 150     quite a few data structures.  The guidelines for B1 are therefore
 151     broken down into B1(i) through B1(iv), below, to make them easier
 152     to follow.
 153
 154 B1(i):
 155
 156     A global lock or semaphore for file system synchronization is
 157     needed, e.g.
 158
 159         /* Serializes file system operations. */
 160         static struct lock fs_lock;
 161
 162 B1(ii):
 163
 164     Some mechanism for tracking file descriptors is needed.  There are
 165     three common variants (see B2 for more on this topic):
 166
 167         - Per-thread list: adds a list to struct thread, like so:
 168
 169             struct list fds;                /* List of file descriptors. */
 170
 171           Each member of the list includes at least a file and a fd
 172           number, e.g.:
 173
 174             /* A file descriptor, for binding a file handle to a file. */
 175             struct file_descriptor
 176               {
 177                 struct list_elem elem;      /* List element. */
 178                 struct file *file;          /* File. */
 179                 int handle;                 /* File handle. */
 180               };
 181
 182           This case also requires a way to come up with unique file
 183           descriptors.  The simplest and best way is to add a counter
 184           to struct thread, e.g.:
 185
 186             int next_handle;                /* Next handle value. */
 187
 188           Another alternative is to use a global counter, but this
 189           requires adding a lock as well.  (It's possible to use the
 190           global file system lock or some other lock for this, but I
 191           haven't seen that done in practice.)  Deduct points if
 192           synchronization is omitted:
 193
 194             static int next_fd = 2;         /* Next file descriptor. */
 195             static struct lock fd_lock;     /* Protects next_fd. */
 196
 197         - Per-thread array: adds an array of 128 elements to struct
 198           thread (128 because the assignment states that as a minimum
 199           number):
 200
 201             struct file *fds[128];
 202
 203           Some solutions include some extra members to speed up
 204           finding an empty slot.
 205
 206         - Global hash table: adds a global hash table that maps from a
 207           descriptor number to a "struct file *".  Check that there is
 208           also a new hash_elem in struct thread in this case.
 209
 210           Some way to generate unique file descriptors is needed,
 211           usually a global counter.  A lock or other synchronization
 212           is also required.
 213
 214           The global data for this case looks something like this:
 215
 216             static struct hash fd_hash;
 217             static int fd_next = 2;
 218             static struct lock fd_lock;
 219
 220           The hash table elements look something like this:
 221
 222             struct file_descriptor
 223               {
 224                 struct hash_elem hash_elem;
 225                 struct file *file;
 226                 int handle;
 227                 pid_t owner;
 228               };
 229
 230           Notice the inclusion of the "owner" member, which ensures
 231           that a fd can only be used by the process that opened it.
 232           If there's no mechanism to ensure this, deduct points.
 233
 234 B1(iii):
 235
 236     Data structures for implementing the "wait" system call are needed
 237     (see B5 and B8 for more on this topic).  Fundamentally, "wait"
 238     requires communication between a process and its children, usually
 239     implemented through shared data.  The shared data might be added
 240     to struct thread, but many solutions separate it into a separate
 241     structure.  At least the following must be shared between a parent
 242     and each of its children:
 243
 244         - Child's exit status, so that "wait" can return it.
 245
 246         - Child's thread id, for "wait" to compare against its
 247           argument.
 248
 249         - A way for the parent to block until the child dies (usually
 250           a semaphore).
 251
 252         - A way for the parent and child to tell whether the other is
 253           already dead, in a race-free fashion (to ensure that their
 254           shared data can be freed).
 255
 256     The sample solution uses a new "struct wait_status" for this
 257     purpose:
 258
 259         /* Tracks the completion of a process.
 260            Reference held by both the parent, in its `children' list,
 261            and by the child, in its `wait_status' pointer. */
 262         struct wait_status
 263           {
 264             struct list_elem elem;      /* `children' list element. */
 265             struct lock lock;           /* Protects ref_cnt. */
 266             int ref_cnt;                /* 2=child and parent both alive,
 267                                            1=either child or parent alive,
 268                                            0=child and parent both dead. */
 269             tid_t tid;                  /* Child thread id. */
 270             int exit_code;              /* Child exit code, if dead. */
 271             struct semaphore dead;      /* 1=child alive, 0=child dead. */
 272           };
 273
 274     If the shared data is integrated into struct thread, then one of
 275     the cleanest ways to do it is to use a pair of semaphores, both
 276     initially set to 0, like so (see B5/B8 below for more information
 277     on the usage of these semaphores):
 278
 279         int exit_code;              /* Process's exit status. */
 280         struct semaphore exiting;   /* Up'd when process ready to exit. */
 281         struct semaphore may_exit;  /* Up'd after parent reads exit code. */
 282
 283     Of course, each thread needs a way to find the shared data for
 284     each of its children and, if shared data is not stored in struct
 285     thread, its own shared data as well.  The sample solution adds a
 286     pair of members to struct thread:
 287
 288         struct wait_status *wait_status; /* This process's completion state. */
 289         struct list children;            /* Completion status of children. */
 290
 291     If the shared data is integrated into struct thread, then a list
 292     of children and a corresponding list element is needed:
 293
 294         struct list children;            /* List of child processes. */
 295         struct list_elem child_elem;     /* Used for `children' list. */
 296
 297     Most solutions add an exit code member to struct thread, even if
 298     there is an exit code member in a separate shared data structure.
 299     That's fine.
 300
 301 B1(iv):
 302
 303     Some data is needed to allow a process that calls "exec" to wait
 304     until the new process completes loading.  The sample solution, and
 305     some student submissions, just add a new structure.  A pointer to
 306     the new structure is passed to the new thread as the "void *"
 307     pointer argument to thread_create().  Here is the sample
 308     solution's structure:
 309
 310         /* Data structure shared between process_execute() in the
 311            invoking thread and execute_thread() in the newly invoked
 312            thread. */
 313         struct exec_info
 314           {
 315             const char *file_name;          /* Program to load. */
 316             struct semaphore load_done;     /* "Up"ed when loading complete. */
 317             struct wait_status *wait_status;/* Child process. */
 318             bool success;                   /* Program successfully loaded? */
 319           };
 320
 321     Many student submissions instead add members to struct thread to
 322     do the same things, which is fine.
 323
 324 B2:
 325
 326     See B1(ii) above for information on the basic data structures used
 327     for file descriptors.  Part of the answer to this question might
 328     be given in B1.
 329
 330     Globally unique and per-process unique file descriptors are both
 331     acceptable.
 332
 333     If a global counter is used to generate file descriptors, there
 334     must be some mention of synchronization.
 335
 336     If an array totalling 512 bytes or more (e.g. 128 elements of
 337     "int" or pointer type) is added to struct thread, then there must
 338     be some mention that this is a large amount of data to add to
 339     struct thread.  This can be mentioned here, or in B1(ii), or in
 340     B10 below.  See the "struct thread" section in the Reference Guide
 341     chapter of the Pintos manual for explanation.
 342
 343     If the fd counter skips the first 3 fd values, with mention that
 344     this is for stdin, stdout, and stderr, then it must also mention
 345     that Pintos does not have stderr.
 346
 347     There seems to be some confusion about locking here among
 348     students.  No locking is necessary for per-thread data structures,
 349     such as a thread-specific list of file descriptors.  Locking is
 350     required for data that might be concurrently accessed by multiple
 351     threads.
 352
 353     There is no need for submissions to handle overflow of the counter
 354     used for generating fds.
 355
 356 B3:
 357
 358     This question was badly phrased in previous quarters, so there's
 359     no pool of good student answers to compare against.
 360
 361     The answer should say which of the techniques described in
 362     "Accessing User Memory" in the assignment is used.  It should
 363     describe a few of the implementation details, such as interfaces
 364     of functions used to read or write user memory.
 365
 366     A few students believe that writing to the "eax" member of struct
 367     intr_frame is a form of writing to user memory.  This is wrong:
 368     struct intr_frame is in the kernel stack.  Modifying its "eax"
 369     member changes the value of the EAX register seen by the process
 370     when a system call returns, not any kind of user memory.
 371
 372     Occasionally, a group extends the page fault-based method
 373     described in the Pintos manual.  The method described there has a
 374     little bit of overhead in that it requires the EAX register to be
 375     set up with the address of a recovery handler before accessing
 376     memory.  This is necessary for the page fault handler to know what
 377     to do.  An alternative without this overhead is to maintain a
 378     table that associates EIP values of potentially faulting
 379     instructions with their recovery handlers.  This technique is used
 380     in the Linux kernel and elsewhere to good effect.
 381
 382 B4:
 383
 384     In the first part of this question, students should address their
 385     implementation.  In the second part, they should address an ideal
 386     implementation.  They need to answer both parts, although the
 387     answers may be very brief as long as they are correct.
 388
 389     An ideal implementation would only inspect as many page table
 390     entries as aligned virtual pages spanned by the buffer.  Both
 391     2-byte and 4,096 buffers can span either 1 or 2 pages, so they
 392     would require 1 or 2 checks.
 393
 394     Student implementations often check page table entries more often
 395     than this.  That's acceptable, within reason.  For example, one
 396     common student implementation checks the first byte and last byte
 397     of the user memory block to be accessed, plus one byte spaced
 398     every 4,096 bytes within the block.  That's fine, although it
 399     often makes a few more checks than absolutely necessary.
 400
 401     Implementations that always inspect every byte read or written are
 402     not acceptable.  Take off points.
 403
 404     Bizarrely, some students believe that checking a page table entry
 405     every 4 bytes within a block is correct.  It's not: it will fail,
 406     for example, in the case of a 2-byte block spanning a pair of
 407     virtual pages.  If they check each *aligned* 4-byte region, then
 408     it's correct, but then they're still doing 1,024 checks per page,
 409     which is still unacceptable.
 410
 411     An implementation that uses page faults to detect bad user memory
 412     accesses requires 0 page table inspections to copy any number of
 413     bytes.  An answer from this point of view is acceptable too.
 414
 415     Some students suggest that further improvements could be made by
 416     page-aligning the buffers passed to the kernel.  That's a
 417     reasonable suggestion.
 418
 419     Some very confused students think that each of the bytes in a
 420     4,096 user memory region can be on a separate page.  Penalize
 421     them.
 422
 423 B5 and B8:
 424
 425     These questions really have to be graded together, because they
 426     are so closely related.  See B1(iii) above for information on the
 427     basic data structures used for "wait".
 428
 429     A call to "wait" should not require disabling interrupts or
 430     calling thread_block() by hand.  Take off points.
 431
 432     Global locks, semaphores, etc. should not generally be used for
 433     implementing "wait".  Their use forces serialization of process
 434     exit, and they are unnecessary.  They are also, often, an
 435     indication of an incorrect implementation.
 436
 437     Some students indicate that they actually implement the wrong
 438     semantics: If C exits before P, then "wait" by P on C should
 439     return the process's exit code, not -1 as these students say.
 440     Take off points for this.  (But it's strange that they would think
 441     so, seeing as the tests for "wait" are pretty thorough.)
 442
 443     The "exit" and "wait" system calls should not have to search a
 444     global list of all processes, because it is inefficient and, more
 445     importantly, unnecessary given that it is so easy to keep a
 446     per-thread list of children.  Take off points.
 447
 448     The procedures for "wait" and "exit" vary broadly based on the
 449     data structures used.  If shared data separate from struct thread
 450     are used:
 451
 452         The procedure for "wait" is roughly this:
 453
 454             - Find the child in the list of shared data structures.
 455               (If none is found, return -1.)
 456
 457             - Wait for the child to die, by downing a semaphore in the
 458               shared data.
 459
 460             - Obtain the child's exit code from the shared data.
 461
 462             - Destroy the shared data structure and remove it from the
 463               list.
 464
 465             - Return the exit code.
 466
 467         The procedure for "exit" is then:
 468
 469             - Save the exit code in the shared data.
 470
 471             - Up the semaphore in the data shared with our parent
 472               process (if any).  In some kind of race-free way (such
 473               as using a lock and a reference count or pair of boolean
 474               variables in the shared data area), mark the shared data
 475               as unused by us and free it if the parent is also dead.
 476
 477             - Iterate the list of children and, as in the previous
 478               step, mark them as no longer used by us and free them if
 479               the child is also dead.
 480
 481             - Terminate the thread.
 482
 483         A common variation on the above "wait" procedure is to mark
 484         the shared data structure as "used" instead of removing it
 485         from the list and destroying it, then on subsequent "wait"s
 486         noticing that it is already used, skipping the down of the
 487         semaphore, and returning -1.
 488
 489     If the shared data is integrated into struct thread:
 490
 491         The procedure for "wait" is roughly this:
 492
 493             - Find the child in the list of child processes.  (If none
 494               is found, return -1.)
 495
 496             - Wait for the child to die, by downing a semaphore in the
 497               child thread struct (this is the "exiting" semaphore in
 498               the example in B1(iii)).
 499
 500             - Obtain the child's exit code from its thread struct.
 501
 502             - Remove the child from the list of children.
 503
 504             - Give the child permission to finish exiting, by "up"ing
 505               its may_exit semaphore.  (After this action, the child
 506               thread struct must no longer be accessed, because the
 507               child could wake up at any time and exit.  In particular
 508               this means that the previous two steps *must* come
 509               before this one.)
 510
 511             - Return the exit code.
 512
 513         The procedure for "exit" is then:
 514
 515             - Save the exit code in the exiting process's thread
 516               struct.
 517
 518             - Up the "exiting" semaphore, to allow a waiting parent to
 519               retrieve the exit code.
 520
 521             - Iterate the list of children and up each child's
 522               may_exit semaphore, to allow them to exit when they're
 523               ready.
 524
 525             - Down the "may_exit" semaphore, to wait for our parent to
 526               retrieve our exit code.
 527
 528             - Terminate the thread.
 529
 530         A common variation on the above "wait" procedure is to mark a
 531         child as "used" instead of allowing it to terminate then on
 532         subsequent "wait"s noticing that it is already used, skipping
 533         the down of the semaphore, and returning -1.  Then the "exit"
 534         procedure actually gives each child permission to terminate.
 535
 536         The main "trick" in this kind of implementation is that an
 537         exiting process must not terminate until its parent process
 538         has terminated or retrieved its exit status.  (If the exiting
 539         process were allowed to terminate immediately, then its thread
 540         struct would be freed and the exit status could not be
 541         retrieved at all.)  If the students do not explain how they do
 542         this, deduct points.
 543
 544     (Students are not required to explain the "exit" procedure as part
 545     of B5, but they usually end up doing so.)
 546
 547     Some groups come up with extraordinarily complicated, confusing
 548     schemes for implementing wait/exit.  If you can't understand their
 549     description, take off points.
 550
 551     You should expect to see an answer to each of the 4 parts in B8.
 552     Here's an example answer that corresponds to the sample solution
 553     (which uses shared data separate from the thread struct):
 554
 555         - If P calls wait(C) before C exits, P will wait for C to die
 556           by "down"ing the semaphore in C's wait_status struct, which
 557           C will "up" when it dies.
 558
 559         - If P calls wait(C) after C exits, P will "down" the
 560           semaphore similarly but no waiting will be required.
 561
 562         - If P terminates without waiting, before C exits, then
 563           C will "up" the semaphore when it terminates, but P will
 564           never "down" it, which is fine.
 565
 566         - If P terminates without waiting, after C exits, then the
 567           same thing happens, except that C "up"s the semaphore before
 568           P dies instead of after.
 569
 570     Students should also say something about freeing resources and
 571     special cases, e.g.:
 572
 573         - When either C or P terminates, it decrements their shared
 574           wait_status's reference count (under the wait_status lock).
 575           When the reference count reaches 0, the wait_status is no
 576           longer referenced so it is freed.
 577
 578         - There are no special cases.
 579
 580 B6:
 581
 582     There are three common types of answers:
 583
 584         - Pre-checking: System call handlers verify that all the data
 585           they need to access can be accessed before allocating any
 586           resources.  Thus, there's never anything to free on error.
 587
 588           (Once project 3 rolls around, this will no longer be
 589           possible, so these students haven't done a good job of
 590           looking ahead.)
 591
 592         - Manual freeing: If an error occurs, the system call handler
 593           manually frees everything allocated up to that point.  Often
 594           these answers focus on "abstraction", making the point that
 595           it's easier to free resources if the resources are correctly
 596           abstracted.
 597
 598         - Resource pools: All the resources allocated by a thread
 599           (memory, locks, etc.) is put on a list.  If an error occurs,
 600           a handler iterates through the list and releases all of the
 601           resources.
 602
 603           (This is a clever solution, but rare.)
 604
 605     Any of these are fine.
 606
 607 B7:
 608
 609     This depends on the data structures defined in B1(iv) above.
 610
 611     The usual solution is to use a semaphore initialized to 0.  The
 612     process calling "exec" downs it.  The new process "up"s it once
 613     the load success or failure is known.  A shared Boolean variable
 614     is used to indicate success or failure.
 615
 616     Some students use thread_block() instead of a semaphore.  Take off
 617     points.
 618
 619     Some students store the success indicator variable in the child
 620     process.  This requires additional synchronization so that the
 621     child process can't terminate before the parent reads the status.
 622     Typically, an additional semaphore is used to give the process
 623     calling "exec" a chance to read the shared Boolean variable
 624     without a race on the new process's termination.  This is not a
 625     good approach, because it is so easy to do without the additional
 626     synchronization by storing the success indicator in the parent's
 627     thread struct or in a local variable (e.g. as in the sample
 628     solution).  Take off points.
 629
 630 B8:
 631
 632     See "B5 and B8", above.
 633
 634 B9 and B10:
 635
 636     Just make sure of a few things here:
 637
 638         - They actually answered "why", not "how" or "what".
 639
 640         - Their answers do not contain any factual inaccuracies
 641           (e.g. do not claim that Pintos processes inherit file
 642           descriptors on exec, which contradicts what the assignment
 643           says).
 644
 645         - Their answers seem "reasonable".
 646
 647 B11:
 648
 649     A typical answer is just "we didn't change anything".  Otherwise,
 650     see "B9 and B10" above.
 651
 652     Take off points if they come up with some bizarre or stupid scheme
 653     and don't properly justify it, e.g. one group decided that pid_t
 654     for a process should be the sum of its tid_t and its parent's
 655     tid_t.
 656
 657     Take off points if an answer assumes that there can be more than
 658     one thread per process and doesn't mention that this would be an
 659     extension to the Pintos design.
 660
 661 Check the submitted code, by hand, for the following:
 662
 663   1. That the system call handler, which is probably in
 664      userprog/syscall.c in syscall_handler(), is reasonably
 665      abstracted.  It should not all be in one function as one huge
 666      switch statement, for example.  The code that accesses user
 667      memory should be separated into specialized functions.  If either
 668      of these is not the case, or if the code is otherwise difficult
 669      to read, take off points.
 670
 671   2. The "Source Code" section in the Introduction to the Pintos
 672      manual says, "Update existing comments as you modify code."  Most
 673      students ignore this.  Check whether the comment on
 674      process_wait() in process.c has been updated.  At a minimum, the
 675      two final sentences ("This function will be implemented in
 676      problem 2-2.  For now, it does nothing.") should have been
 677      deleted.
 678
 679   3. Find the implementation of the "open" system call.  Verify that
 680      all resources are released in error cases and that errors are
 681      properly handled.  One particular problem can be malloc() failure
 682      after a file is successfully opened--is the return value checked?
 683      If so, is the file closed before returning an error?
 684
 685      (Failure to check malloc() is in the OVERALL section of the grade
 686      report, under DESIGN.)