PROJECT 4 GRADING ADVICE
========================
This file contains advice for grading project 4. General advice for
Pintos grading is in README, which you should read first.
INDEXED AND EXTENSIBLE FILES
============================
struct inode_disk should change here. Many students just present
the new version instead of the detailed changes, which is fine
because most of it changes. The typical result looks something
like this:

    struct inode_disk
      {
        disk_sector_t sectors[SECTOR_CNT];  /* Sectors. */
        enum inode_type type;               /* FILE_INODE or DIR_INODE. */
        off_t length;                       /* File size in bytes. */
        unsigned magic;                     /* Magic number. */
      };
- The "sectors" array might be divided into separate arrays or
  scalars devoted to direct, indirect, and doubly indirect
  blocks. That's fine, and arguably cleaner.
- The "magic" member should be retained, but some groups might
  drop it because they don't understand its purpose. That's
  worth a small deduction.
- A "type" or "is_dir" member might have been added to
  directory entries instead, or its addition might be
  mentioned in B1, because it's more relevant there.
- The "start" member should be dropped. If it's still there,
  take off points.
- The "unused" array should be dropped, because if there's
  extra space then it can be used for more direct blocks, etc.
  If there's an unused array that's between 0 and 3 bytes (not
  elements) long, then that's fine--probably they needed to
  pad out some "char" or "short" elements to make the whole
  thing exactly 512 bytes long. Otherwise, take off points.
Most groups add a lock to struct inode that synchronizes file
growth.
Most groups use 123 or 124 direct blocks, 1 indirect block, and 1
doubly indirect block, which yield the following maximums:

    123 direct: (123 + 128 + 128**2) * 512 = 8,517,120 bytes = 8,317.5 kB
    124 direct: (124 + 128 + 128**2) * 512 = 8,517,632 bytes = 8,318 kB
Occasionally a group thinks that the "max file size" includes
metadata size. The metadata in either situation above would be of
size

    (1 + 1 + (1 + 128)) * 512 = 67,072 bytes = 65.5 kB

so that the totals would then be

    123 direct: 8,517,120 + 67,072 = 8,584,192 bytes = 8,383 kB
    124 direct: 8,517,632 + 67,072 = 8,584,704 bytes = 8,383.5 kB

Check that their math is correct if it doesn't match one of these.
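The arithmetic above can be checked mechanically. A sketch, assuming
512-byte sectors and 128 sector numbers per indirect block (the
function names are illustrative, not from any student design):

```c
/* Arithmetic behind the maximum file sizes quoted above.
   Assumes 512-byte sectors and 128 sector numbers per indirect block. */

enum { SECTOR_SIZE = 512, PTRS_PER_SECTOR = 128 };

/* Data bytes addressable with the given counts of direct, indirect,
   and doubly indirect blocks in the inode. */
long
max_file_size (int direct, int indirect, int doubly_indirect)
{
  long data_sectors = direct
    + (long) indirect * PTRS_PER_SECTOR
    + (long) doubly_indirect * PTRS_PER_SECTOR * PTRS_PER_SECTOR;
  return data_sectors * SECTOR_SIZE;
}

/* Metadata bytes: the inode itself, each indirect block, and each
   doubly indirect block plus the indirect blocks it points to. */
long
metadata_size (int indirect, int doubly_indirect)
{
  long meta_sectors = 1 + indirect + doubly_indirect * (1L + PTRS_PER_SECTOR);
  return meta_sectors * SECTOR_SIZE;
}
```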
Students must "show their work" or at least give enough numbers in
A1, A2, or A6 to allow you to check their work. You shouldn't
have to look at the source code to check it. Take off points if
they don't show enough detail.
Some students think that their design has 128 indirect blocks or
128 doubly indirect blocks. That's not true. Take off points.
(However, the blocks that the doubly indirect blocks point to are
indirect blocks in their own right. But in that case they would
normally have 129, not 128, because of the indirect block that the
inode points to directly.)
One important fact here is that Pintos files can grow but never
shrink. (Although deletion might be thought of as "shrinking" to
size 0, this can never be observed by a user process, because the
file's contents are retained as long as a user process has it
open.)
1. The most common approach is to use a lock to make extending
   a file into a critical section. A write first checks the
   length of the file. A write that won't extend the file
   doesn't need to take the lock and can proceed immediately.
   A write that extends the file takes the lock.
   The lock should be per-inode (per-file). Take off points
   for a global lock.
   There are some subtleties to this approach. The first is
   that checking and updating the length of the file has to be
   synchronized. If two writers try to write 10 bytes
   starting at the current EOF, they can both see the old file
   length. Thus, the file length must be checked again after
   obtaining the lock, e.g.
       if (final_ofs > disk_inode->length)      // First check.
         {
           lock_acquire (&inode->grow_lock);
           if (final_ofs > disk_inode->length)  // Second check.
             {
               /* ...extend the file... */
             }
           lock_release (&inode->grow_lock);
         }
   Students who use this scheme must mention how they handle
   this problem. If they don't, take off points.
   An even more subtle problem is that of race-free reads of
   the "length" member of disk_inode. There ideally should be
   some locking around access to it, but a call to barrier()
   or use of volatile would also be sufficient. However,
   reading a single integer is normally atomic on modern
   machines, so we'll just ignore this problem.
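   The whole scheme, including the double check, can be sketched in C
   with POSIX threads standing in for Pintos primitives; "grow_lock",
   "write_at", and the elided sector allocation are illustrative:

```c
#include <pthread.h>

struct inode
  {
    long length;               /* Current file size in bytes. */
    pthread_mutex_t grow_lock; /* Serializes file extension. */
  };

/* Writes up to FINAL_OFS; returns how many bytes the file grew.
   The unlocked first check lets non-extending writes skip the lock. */
long
write_at (struct inode *inode, long final_ofs)
{
  long grew = 0;
  if (final_ofs > inode->length)          /* First check, no lock. */
    {
      pthread_mutex_lock (&inode->grow_lock);
      if (final_ofs > inode->length)      /* Second check, under the lock. */
        {
          grew = final_ofs - inode->length;
          /* ...allocate and zero the new sectors here... */
          inode->length = final_ofs;      /* Publish the new length last. */
        }
      pthread_mutex_unlock (&inode->grow_lock);
    }
  /* ...do the actual data write... */
  return grew;
}
```

   The second check is what makes two concurrent appends both extend the
   file correctly instead of both extending from the stale length.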
2. Use sector-based locking in the buffer cache. In this
   approach, the buffer cache supports shared and exclusive
   locking of sectors. Reading or writing a file normally
   uses a shared lock on the inode sector; extending a file
   requires an exclusive lock on the inode sector.
3. If all file accesses (or all write accesses) require a lock
   and hold it during the entire read or write, then that's a
   trivial solution, but it doesn't meet the requirements in
   the Synchronization section of the assignment: "Multiple
   reads of a single file must be able to complete without
   waiting for one another. When writing to a file does not
   extend the file, multiple processes should also be able to
   write a single file at once." Take off points.

   This is not the same as taking a lock across a couple of
   lines of code to check or set a variable in a race-free
   way, etc. That's fine. The issue is waiting for I/O to
   complete, not for updating little bits of metadata.
The typical solution to A4 depends on the solution chosen for A3:

1. a. Typically, the process extending the file doesn't update
      the file's length until the extension is fully
      completed. Thus, before the extension is complete,
      readers won't see that the file's length has increased,
      so they won't try to read the new data.
   b. Another solution is to make reads past end-of-file wait
      for file extension to be completed, by making a read past
      end-of-file take the lock needed to extend the file.
2. A process that wants to read the file will need to get a
   shared lock on the inode sector, which it can't get until
   the extending process has released its exclusive lock.
   Thus, this approach will be correct as long as the
   exclusive lock on the inode is held until the file
   extension is complete.
3. This approach trivially solves the problem but doesn't
   fulfill the synchronization requirements. Take off points.
Many answers are implicitly predicated on the fact that Pintos
locks and semaphores have fair, first-come, first-served queuing.
Most students don't mention it, though. Give a bonus point if
they do.

The typical answer depends on the answer to A4:
1. a. Readers, and writers not at end-of-file, don't have to
      wait, so they have no fairness issues. Writers that
      extend a file are waiting on a lock, which queues their
      extensions fairly. After a file extension is complete,
      subsequent reads and writes within the new file length
      are completed fairly because there's no need for them to
      wait.
   b. Essentially the same as 1a, except that readers beyond
      end-of-file are treated like writers beyond end-of-file.
      Because the lock queuing is fair, readers get serviced
      fairly.
2. Fairness here depends on the fairness of locking in the
   buffer cache. The group must explain how the buffer cache
   provides fair locking. Generally, two rules are needed to
   ensure fairness:
   - A rule to prevent readers from waiting for writers
     forever. For example: when the exclusive holder of
     a lock releases it, any processes waiting for a
     shared lock are preferred over processes waiting for
     an exclusive lock.
   - A rule to prevent writers from waiting for readers
     forever. For example: if one or more processes hold
     a shared lock, and one or more processes are waiting
     for an exclusive lock, then no more processes may
     obtain a shared lock until at least one process has
     obtained an exclusive lock.
   The group must explain both rules; take off points if
   either is missing.
3. This approach trivially solves the problem but doesn't
   fulfill the synchronization requirements. Take off points.
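For reference, the two fairness rules above can be sketched as a
shared/exclusive lock built from a POSIX mutex and condition variables
(Pintos code would use its own lock and condvar; all names here are
illustrative):

```c
#include <pthread.h>

struct shared_lock
  {
    pthread_mutex_t mutex;
    pthread_cond_t can_read, can_write;
    int readers, writer;                  /* Current holders. */
    int waiting_readers, waiting_writers; /* Queued processes. */
    int read_grants;   /* Readers admitted at last exclusive release. */
  };

/* Rule 2: a new reader waits if a writer holds or awaits the lock,
   unless it holds one of the grants handed out under rule 1. */
void
shared_acquire (struct shared_lock *l)
{
  pthread_mutex_lock (&l->mutex);
  l->waiting_readers++;
  while (l->writer || (l->waiting_writers > 0 && l->read_grants == 0))
    pthread_cond_wait (&l->can_read, &l->mutex);
  l->waiting_readers--;
  if (l->read_grants > 0)
    l->read_grants--;
  l->readers++;
  pthread_mutex_unlock (&l->mutex);
}

void
shared_release (struct shared_lock *l)
{
  pthread_mutex_lock (&l->mutex);
  if (--l->readers == 0)
    pthread_cond_signal (&l->can_write);
  pthread_mutex_unlock (&l->mutex);
}

void
exclusive_acquire (struct shared_lock *l)
{
  pthread_mutex_lock (&l->mutex);
  l->waiting_writers++;
  while (l->writer || l->readers > 0)
    pthread_cond_wait (&l->can_write, &l->mutex);
  l->waiting_writers--;
  l->writer = 1;
  pthread_mutex_unlock (&l->mutex);
}

/* Rule 1: on release, readers that were already waiting are
   preferred over waiting writers. */
void
exclusive_release (struct shared_lock *l)
{
  pthread_mutex_lock (&l->mutex);
  l->writer = 0;
  l->read_grants = l->waiting_readers;
  if (l->waiting_readers > 0)
    pthread_cond_broadcast (&l->can_read);
  else
    pthread_cond_signal (&l->can_write);
  pthread_mutex_unlock (&l->mutex);
}
```

This is only a sketch: for instance, a grant is not reclaimed if an
admitted reader is raced past by a newly arriving one.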
There should be no use of a timer to ensure fairness. That's not
an acceptable substitute for fair locking.
Fairness is not a matter of what kind of file access is "rare" or
"common". Take off points from groups that argue, e.g., that
their implementation is unfair if a file is being extended but
that file extension happens "rarely".
I haven't seen a group try to use a non-multilevel index structure
yet. Such a choice would need good rationale.
Good answers include mention of advantages of multilevel indexes:

- Both random and sequential access are efficient.

- Large files can be supported, without imposing a penalty on
  the efficiency of small files.

- Works in the presence of external fragmentation.

- Fast access to data for small files.

Some answers mention disadvantages of multilevel indexes:

- It takes up to 3 disk accesses to find a data block in a big
  file (although caching helps with sequential access
  patterns).

- Fixed maximum file size.

- High overhead for small files.

At least two reasons must be mentioned.
struct thread should have a new member to keep track of the
process's current directory. Possibilities include:

- Pointer to a struct dir or struct inode. This is the best
  choice (see B6).

- A string. This is not a good choice (see B6).

- A sector number. This is not a good choice (see B6).
struct dir_entry or struct inode_disk should have a new member to
distinguish ordinary files from directories. It is often named
something like "type" or "is_dir". This might have been mentioned
in A1 instead; look there if it's missing. If it's not mentioned
in either place, take off points.
Directory synchronization possibilities include:

- Most implementations add a lock to struct inode or struct
  dir and use it to synchronize directory operations. This is
  simple and effective.

- Some implementations use a global list of open directories.
  This generally requires a new global lock to control adding
  and removing list elements. If there's a list but no
  locking or other explanation, deduct points.
  In addition, this implementation needs some way to wait for
  a particular directory and to free directories when no one
  is still using them. This can be done with an additional
  per-directory lock and a waiting-processes counter. If
  there's no such data or explanation, deduct points.
- Some implementations use a global lock, or a few of them, to
  synchronize directory operations. This is unacceptable;
  there is no reason to do global locking. Deduct points.
There's nothing tricky here, really. This should be a
straightforward description of straightforward code. There are
only two slightly tricky bits:

- Absolute versus relative names. When the current directory
  is represented by a struct dir or struct inode, the
  difference should just be whether traversal starts from the
  root directory or the current directory. When the current
  directory is represented by a string, either way you start
  from the root directory, but if it's a relative name then
  you first traverse to the current directory before
  traversing the provided name.

- Whether to traverse the final component of the name. Some
  system calls, e.g. create, remove, mkdir, don't want to
  traverse the final component, because they want to act on a
  directory entry. Others, e.g. open, chdir, do want to
  traverse the final component, because they want to act on
  the file or directory itself.

The best answers talk about both of these.
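Both bits can be illustrated on the string level alone. A sketch,
using a hypothetical helper (split_path is a made-up name) that
reports whether traversal starts at the root and isolates the final
component:

```c
#include <string.h>

/* Hypothetical helper: copies the final component of NAME into LAST
   (which must have room for strlen (NAME) + 1 bytes) and returns 1 if
   NAME is absolute, 0 if it is relative. Traversal would start from
   the root directory in the first case, the current directory in the
   second, and would or wouldn't descend into LAST depending on the
   system call. */
int
split_path (const char *name, char *last)
{
  const char *slash = strrchr (name, '/');
  strcpy (last, slash != NULL ? slash + 1 : name);
  return name[0] == '/';
}
```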
I'm looking for roughly this:

    It determines components of the current working directory
    (cwd) in reverse order, moving from the cwd to the root. It
    first determines the inumber of ".". Then it opens ".." and
    reads its entries until it finds the one that has that
    inumber. The corresponding name is the last component of the
    cwd. Then it reads "../..", looking for the inumber of "..",
    which determines the second-to-last component of the cwd. It
    stops when the inumber of a directory and its parent are the
    same, because that means it's at the root.
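One step of that loop can be sketched with POSIX directory calls
standing in for Pintos's dir_* functions (cwd_last_component is an
illustrative name; real Pintos code would compare inumbers from its
own directory entries):

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Writes the name of the cwd's last component into NAME (of SIZE
   bytes) by searching ".." for the entry with the cwd's inumber.
   Returns 1 on success, 0 at the root (where "." and ".." coincide),
   -1 on error. */
int
cwd_last_component (char *name, size_t size)
{
  struct stat dot, dotdot, st;
  if (stat (".", &dot) < 0 || stat ("..", &dotdot) < 0)
    return -1;
  if (dot.st_ino == dotdot.st_ino && dot.st_dev == dotdot.st_dev)
    return 0;                   /* "." and ".." are the same: at root. */

  DIR *dir = opendir ("..");
  if (dir == NULL)
    return -1;
  struct dirent *de;
  while ((de = readdir (dir)) != NULL)
    {
      char path[1024];
      if (!strcmp (de->d_name, ".") || !strcmp (de->d_name, ".."))
        continue;
      snprintf (path, sizeof path, "../%s", de->d_name);
      if (stat (path, &st) == 0
          && st.st_ino == dot.st_ino && st.st_dev == dot.st_dev)
        {
          snprintf (name, size, "%s", de->d_name);
          closedir (dir);
          return 1;             /* Found the cwd's name in its parent. */
        }
    }
  closedir (dir);
  return -1;
}
```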
This question is new for summer 2006, so the answers may not be
easy to categorize yet.
This should explain how to apply the directory synchronization
whose data structure was added in B1. Most commonly, it's just a
matter of taking the directory's lock around the directory
operations.
The typical answer depends on how the current directory is
represented:
- Groups that use a struct dir or struct inode pointer to
  represent the current working directory usually answer "yes"
  (that they do prevent removal). The usual way to prevent
  removal is through the open_cnt member in struct inode: a
  value of 1 means that it may be removed; a higher value
  means that it's also open by some other process or in use as
  a current working directory. (It's a value of 1, not 0,
  because typically the "remove" system call opens the
  directory inode as part of deleting it.)
- Groups that represent the current working directory with a
  string usually answer "no" (that removal is allowed),
  because it requires no additional effort: when a directory
  is deleted, traversing to it through a string fails, so
  operations on the directory also fail.
- Groups that use a sector number to represent the current
  working directory are hard to predict. If they answer "no"
  (that removal is allowed), they may try to claim that the
  deleted directory inode on disk is recognized as deleted and
  that based on that it is special-cased. But that won't work
  reliably, because the inode may be reused at any time.
  Deduct points. If they give another answer, you're on your
  own.
If the answer is "yes" (that removal is prevented), extra code
should not be necessary to avoid deleting the chain of parents of
processes' working directories. None of those directories are
empty (they have at least one child directory), so they may not be
deleted on that basis. Deduct points if students added special
cases for this.
The rationale depends on how the current directory is represented:

- Groups that use a struct dir or struct inode pointer to
  represent the current working directory:

  * Access to the cwd is fast and constant time, with no
    traversal necessary (better than a string
    representation).

  * Storage of the cwd is small and constant space
    (better than a string representation).

  * Easy to tell how many users a directory has (to
    prevent removing an in-use directory).

  * No need to limit the depth of the cwd within the
    file system and no need for dynamic allocation in
    the chdir system call (better than a string
    representation).
- Groups that represent the current working directory with a
  string typically make these claims:

  * No extra work is needed to handle removed
    directories.

  * Recreating a removed directory makes operations in
    that directory work again.
    This is not normally desirable. In a more advanced
    OS environment, where files and directories can be
    renamed and moved, it is normally desirable for a
    process to retain its current directory across a
    series of operations, not for it to suddenly be in
    an entirely new place.
  * Operations that use file names are relatively
    infrequent compared to operations on file
    descriptors, so the penalty for having to
    re-traverse paths on each operation is amortized
    over many operations.

    I doubt this is true.

  * Space efficiency (!).

    This is obviously false.
- Groups that use a sector number to represent the current
  working directory: I haven't seen any of these groups make
  a convincing argument for it.

At least two reasons must be given.
struct inode should drop inode_disk; if it doesn't, then take off
points. This might have been mentioned in A1 instead.
Expect to see a cache entry structure containing the following
members:

- Sector data (512 bytes) or a pointer to sector data.
- A way to choose a block to evict, e.g.:
  * Aging, using multiple "accessed" bits. An access
    sets the high bit, and periodically the bits are
    shifted to the right.
  * LRU list. This should probably be a linked list,
    which makes it easy and fast to move an entry to the
    front. If a linked list element is used, there
    should be a global list also and probably a lock to
    protect them.

    Occasionally some group will rearrange the whole
    64-element array to manage LRU; that's inefficient.
  * Some kind of "points" scheme, where accesses add
    points based on the type of access.

  Often, any of these schemes is combined with a bit to
  distinguish data from metadata, so that metadata can be
  preferentially retained.
- Synchronization data, typically a lock.

- Often, a struct hash_elem to allow fast lookup. This is not
  essential; linear search is fast enough for a 64-element
  cache.

Generally there's a queue of blocks for use by a separate
read-ahead thread, as well as a lock to control access.
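Putting the members above together, one possible entry layout looks
like this (field names are illustrative, and a POSIX mutex stands in
for a Pintos lock):

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SECTOR_SIZE 512
#define CACHE_SIZE 64

struct cache_entry
  {
    int sector;                       /* Cached sector number, -1 if free. */
    uint8_t data[BLOCK_SECTOR_SIZE];  /* Sector data. */
    bool dirty;                       /* Must be written back on eviction. */
    bool accessed;                    /* Eviction hint (e.g. for clock). */
    bool is_metadata;                 /* Preferentially retained. */
    pthread_mutex_t lock;             /* Per-entry synchronization. */
  };

struct cache_entry cache[CACHE_SIZE];
```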
Many schemes are possible. For full credit, students must use an
algorithm that is at least as good as the second-chance (clock)
algorithm, and they must take the difference between data and
metadata into account in some way.
Some implementations I've seen, with comments:

- Reserve 1/3 of the cache for data, 2/3 for metadata, and do
  local replacement within each cache. Manage metadata in LRU
  fashion, data in MRU fashion in the hope that sequential
  access to data is common.

  This is fine, although I suspect that MRU may not be the
  best choice.
- Use the clock algorithm across data and metadata. Skip over
  metadata on the first trip around the clock. Special-case
  calls to the "read" system call with "large" sizes,
  scheduling them for quick eviction in the hope that they're
  doing sequential access.

  I suspect that this will actually penalize data *too* much.
- Do LRU based on a "points" scheme. Data accesses add 1
  point, metadata accesses add 2 points.

It should not be necessary to modify all 64 entries on an access.
Some implementations strictly prefer all metadata over all data,
so that the cache will end up with just 1 data block and 63
metadata blocks in it much of the time. This leans too far toward
preferring metadata. Deduct points.
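The clock-with-metadata-skip scheme described above can be sketched
with plain arrays standing in for the real cache (clock_pick is an
illustrative name):

```c
#include <stdbool.h>

#define CACHE_SIZE 64

int clock_hand; /* Next entry the clock hand will examine. */

/* Returns the index of the entry to evict. An entry's accessed bit
   buys it one more trip around the clock; metadata entries are also
   skipped on the first trip, so they are preferentially retained. */
int
clock_pick (bool accessed[], bool is_metadata[])
{
  int steps = 0;
  for (;;)
    {
      int i = clock_hand;
      clock_hand = (clock_hand + 1) % CACHE_SIZE;
      bool first_trip = steps++ < CACHE_SIZE;
      if (accessed[i])
        accessed[i] = false;            /* Give a second chance. */
      else if (!(first_trip && is_metadata[i]))
        return i;                       /* Unaccessed and evictable. */
    }
}
```

Note that the loop always terminates: after the first trip every
accessed bit has been cleared and metadata is no longer skipped.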
The expected implementation is to use a background thread that
sleeps most of the time and flushes the cache whenever it wakes up.

Make sure they don't state or imply that the background thread
busy-waits between flushes.
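A sketch of that structure, with a counter standing in for actual
disk writes and POSIX sleep() in place of timer_sleep() (all names
are illustrative):

```c
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

#define CACHE_SIZE 64

bool dirty[CACHE_SIZE]; /* Stand-in for the cache's dirty bits. */
int flush_count;        /* Stand-in for actual disk_write calls. */

/* Writes every dirty block back and clears its dirty bit. */
void
flush_cache (void)
{
  for (int i = 0; i < CACHE_SIZE; i++)
    if (dirty[i])
      {
        /* disk_write (...) would go here. */
        dirty[i] = false;
        flush_count++;
      }
}

/* Background thread: sleeps, flushes, repeats; it never busy-waits. */
void *
flusher_thread (void *arg)
{
  (void) arg;
  for (;;)
    {
      sleep (1);  /* Pintos would use timer_sleep (). */
      flush_cache ();
    }
}
```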
I've seen a few different implementations:

- Most commonly, readers put block numbers on a shared queue.
  A single background thread gets block numbers from the queue
  and reads the blocks. This is fine.

- A reader creates a new thread to read the next block.
  This creates one thread per read request, which is too
  expensive. Deduct points.

- A single thread does *all* reads, both for immediate
  requests and read-ahead, prioritizing immediate requests.
  This is fine (and kind of clever). This thread might be in
  charge of eviction, too.
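The shared-queue variant can be sketched as a small producer/consumer
queue, with POSIX primitives standing in for Pintos's (names are
illustrative):

```c
#include <pthread.h>

#define RA_QUEUE_SIZE 64

int ra_queue[RA_QUEUE_SIZE]; /* Circular buffer of sector numbers. */
int ra_head, ra_tail, ra_count;
pthread_mutex_t ra_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t ra_nonempty = PTHREAD_COND_INITIALIZER;

/* Called by a reader to schedule SECTOR for read-ahead. */
void
ra_request (int sector)
{
  pthread_mutex_lock (&ra_mutex);
  if (ra_count < RA_QUEUE_SIZE) /* Just drop requests when full. */
    {
      ra_queue[ra_tail] = sector;
      ra_tail = (ra_tail + 1) % RA_QUEUE_SIZE;
      ra_count++;
      pthread_cond_signal (&ra_nonempty);
    }
  pthread_mutex_unlock (&ra_mutex);
}

/* Called by the background thread; blocks until a request arrives. */
int
ra_next (void)
{
  pthread_mutex_lock (&ra_mutex);
  while (ra_count == 0)
    pthread_cond_wait (&ra_nonempty, &ra_mutex);
  int sector = ra_queue[ra_head];
  ra_head = (ra_head + 1) % RA_QUEUE_SIZE;
  ra_count--;
  pthread_mutex_unlock (&ra_mutex);
  return sector;
}
```

Dropping requests on overflow is fine for read-ahead, since it is
only a performance hint, not a correctness requirement.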
There are multiple possibilities:
- "Double-checked locking": Search for the block in the cache
  without taking any lock. Then, take the lock and see
  whether it's still the same block. If so, you've got locked
  access to it. Otherwise, release the lock and start over.
- Global lock: A global lock held during the whole eviction
  process normally does not meet the synchronization
  requirements, which say "...when I/O is required on a
  particular block, operations on other blocks that do not
  require I/O should proceed without having to wait for the
  I/O to complete." Deduct points.
- Two-level locking: Use one global lock to protect the sector
  numbers in the list from changing, and a second per-block
  lock to prevent eviction. In pseudo-code it looks roughly
  like this:

      acquire global lock
      find block in cache
      acquire per-block lock
      release global lock
      do I/O
      release per-block lock

  The problem is that if thread Y wants to access block B
  while thread X is evicting block B, any thread Z that wants
  to access *any* block must wait for X's I/O to finish:
      Thread X              Thread Y              Thread Z
      ------------------    ------------------    --------------------
      acquire B's lock
      start I/O on B
                            acquire global lock
                            block on B's lock
                                                  block on global lock
      finish I/O on B
      release B's lock
                            acquire B's lock
                            release global lock
                                                  acquire global lock

  As you can see, Y still holds the global lock while it waits
  for B's lock, so Z has to wait for the global lock to be
  released, which won't happen until Y can get B's lock, which
  in turn won't happen until X's I/O is complete.
  There are ways around this problem. If the students mention
  it and describe a solution, it's fine. If they don't
  mention it at all, deduct points.

- Only one specialized thread is allowed to do eviction.
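The double-checked locking possibility above can be sketched like
this (struct cache_slot and cache_lookup are illustrative names):

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_SIZE 64

struct cache_slot
  {
    int sector; /* Cached sector number, -1 if free. */
    pthread_mutex_t lock;
  };
struct cache_slot slots[CACHE_SIZE];

/* Returns a locked slot caching SECTOR, or NULL on a miss (the
   caller would then evict a slot and read the sector in). */
struct cache_slot *
cache_lookup (int sector)
{
  for (;;)
    {
      struct cache_slot *s = NULL;
      for (int i = 0; i < CACHE_SIZE; i++)
        if (slots[i].sector == sector)   /* Unlocked scan. */
          {
            s = &slots[i];
            break;
          }
      if (s == NULL)
        return NULL;
      pthread_mutex_lock (&s->lock);
      if (s->sector == sector)           /* Still the same block? */
        return s;                        /* Yes: locked access. */
      pthread_mutex_unlock (&s->lock);   /* Evicted meanwhile: retry. */
    }
}
```

The second check is essential: between the unlocked scan and the
lock acquisition, another thread may have evicted the block.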
This is generally a consequence of the locking done in C5. You
should expect the students to explain that. Just a sentence or
two is enough.
Three different workloads must be described, one for each of the
three features.
Workloads likely to benefit from buffer caching include:
- Use of a large file, because caching saves time re-reading
  the inode and indirect and doubly indirect blocks.

- Random access within a small set of files that can be
  kept entirely in the cache.

- Repeated reads with locality in space.
Workloads likely to benefit from read-ahead include:

- Sequential reads, especially when processing is necessary on
  each block, because that allows for greater parallelism.
Workloads likely to benefit from write-behind include:

- Multiple writes to small regions of a file, e.g. appending
  to a file a few bytes at a time.

- Writes interspersed with computation, so that a process
  never blocks on writes.