PINTOS GRADING ADVICE
=====================

This directory contains advice for TAs regarding Pintos grading.
This file contains overall advice for grading all the projects, and
each project has a file with additional, more specific advice for
grading that particular project.

Be familiar with the Grading subsection within the Introduction
chapter in the Pintos manual.  The principles stated there should
guide your grading decisions.  You should also carefully read the
Coding Standards chapter and, of course, the assignments themselves.

Grading is inherently subjective.  The most important principle is to
be fair.  Try to be patient with students, in the same way that you
would appreciate a TA being patient with you when you take classes
yourself.  In my experience, this takes practice: many TAs tend to be
fairly impatient in their first quarter of TAing and then improve in
their second.  I have noticed this pattern in myself and others.

Submission Structure
====================

At Stanford, each project submission puts files into a directory
named /usr/class/cs140/submissions/hw<N>/<username>, where <N> is the
project number (between 1 and 4) and <username> is the user name of
the team member who submitted the project.  Each submission directory
contains a tarball that actually contains the submission.  The
tarball contains pintos/src and the files and directories underneath
it.  If a student group submits more than once, then there will be
multiple tarballs, one for each submission.

Each submission directory also contains a file named grade.txt that
describes the group, giving the first name, last name, and email
address of each student.  There is only a single copy of this file,
regardless of the number of submissions.

If two different students from a single group both submit the
project, then you can end up with almost-identical submissions in two
different directories.  It's best to check for this before beginning
grading, to avoid duplicating effort.  The check-duplicates script in
this directory can help you with this (there should be a copy of it
in /usr/class/cs140/submissions).

Grading Test Results
====================

Obtaining test results should be the easier half of the grading
process.  The procedure for obtaining test results for one submitted
project is roughly this:

1. Extract the student code from its tarball into "pintos/src":

        tar xzf <tarball>.tar.gz

2. Delete the existing pintos/src/tests directory and replace it by a
   pristine copy:

        rm -rf pintos/src/tests
        cp -R /usr/class/cs140/winter13/pintos/src/tests pintos/src/tests

   (Make sure you are using the correct version of Pintos for the
   current offering of the course; it may be in a different location
   than the example above.)

3. Run "make clean" in the top-level directory, to get rid of any
   binaries or objects mistakenly included in the submission:

        (cd pintos/src && make clean)

4. Run "make grade" in the project-specific directory, e.g. threads:

        (cd pintos/src/threads && make grade)

5. Make a copy of the "grade" file that this produces, which is in
   the "build" directory:

        cp pintos/src/threads/build/grade tests.out

6. Compare the grade report that you produced against the one
   submitted by the group.  You can use "diff -u" or just compare the
   final grades:

        diff -u tests.out pintos/src/threads/GRADE

   If there are major discrepancies (e.g. all their tests passed, but
   all yours failed), then you should contact the group.  Otherwise,
   use the grade report that you produced.
   Grade reports can vary for a number of reasons: QEMU is not fully
   reproducible, Bochs sometimes has reproducibility bugs, the
   compilers used on different machines may produce code with
   different behavior, and so on.  Finally, it's possible that the
   group submitted a grade report that goes with an earlier version
   of their code.

7. Run "make clean" in pintos/src again:

        (cd pintos/src && make clean)

   You don't have to do this immediately, but if you try to grade too
   many projects without doing so, then you will probably run out of
   quota.

An alternative is to do the builds in temporary storage, e.g. in
/tmp.  This will probably be a lot faster than doing it in AFS, but
it is slightly less convenient.

There is a script called run-tests in this directory (and in
/usr/class/cs140/submissions) that can do most of this work for you.
Run "run-tests --help" for instructions.

You can automate running the tests in several directories using a
command like the following (in the default C shell):

        foreach d (*)
            (cd $d && run-tests threads)
        end

or in the Bourne shell:

        for d in *; do (cd $d && run-tests threads); done

Grading Design
==============

There are two parts to grading students' designs: their design
documents and their code.  Both are lumped into a single grade, taken
out of 100 points.

The form to use for grading each project is in hw<N>.txt in this
directory.  You should copy this file into each submission directory
and delete the lines that do not apply.  The grading form is divided
into sections, one for each problem, and an OVERALL section.  Each
section has its own set of deductions, with a cap on the total
deductions for that section.  To compute the overall design score,
first compute the (capped) deductions within each section, then
subtract the section deductions from 100.  If the final score is less
than zero, round it up to zero.

IMPORTANT: Be critical in grading designs.  Most submissions will
pass most of the tests, which means that they get almost 50% of the
grade for "free".  When TAs take off only a few points in the design
grade, total project scores can average 90% or even higher.  This may
not sound like a bad thing, but it is, because students who receive
numerically high grades think that they did well relative to the
other students.  At the end of the quarter, when the curve is
applied, these students are then understandably disappointed or angry
that their final grades are so low when their intermediate grades
seemed so high.  It is better to take off lots of points on projects
and thereby give students more realistic expectations about their
final course grades.

At the same time, don't be unfair.  You should only deduct points in
situations where students should have known what was expected of
them.  For example, don't invent your own standards for coding style
based on what you think is "right": stick to what is documented, or
to what any reasonable person might assume.

Grading Design Documents
------------------------

Be familiar with the Design Document subsection of the Introduction
chapter in the Pintos manual.

Deduct all the points for a given question in these cases:

  - Missing: The question is not answered at all.

  - Non-responsive: The response does not actually answer what is
    being asked.  (If the question does not reasonably apply to the
    solution chosen by the group, then the answer should explain why
    it does not.)

  - Too long: e.g. a "25 words or less" response takes a whole page.
    These limits are meant to save the group's time and yours, so
    don't waste time on such responses.
  - Too short: The response is evasive or excessively terse, to the
    point that you don't feel confident in the answer.

  - Grossly inaccurate: When you examine the code, you find that it
    bears no resemblance to the description.

  - Not implemented: The functionality described in the answer was
    not implemented.  This often happens when a group runs out of
    time before implementing the entire project.  Don't give credit
    for a design without an implementation.

Take off some points (use your judgment) for:

  - Conceptual errors: Statements, assumptions, or strong
    implications made in the design document are incorrect, e.g.
    assuming that unblocking a thread immediately schedules it.

  - Minor inaccuracies: Some aspects of the code do not match the
    design document's description.

  - Partial response: Multiple questions are asked, but only some of
    them are answered.

Minor issues (take off only a few points, and only if the problem is
severe):

  - Capitalization, punctuation, spelling, or grammar: An occasional
    mistake is tolerable, but repeated or frequent errors should be
    penalized.  Try to distinguish grammar errors made by non-native
    speakers of English, which are understandable, from those made by
    others, which are less so.

    In Emacs, it is easy to check the spelling of a word: put the
    cursor on or just after it, then type M-$.  You can also make
    Emacs highlight misspelled words with M-x flyspell-buffer.

  - Excessive redundancy: The answer restates much of what is
    specified in the assignment.

Instructions for recurring questions:

---- DATA STRUCTURES ----

Copy here the declaration of each new or changed struct or struct
member, global or static variable, typedef, or enumeration.
Identify the purpose of each in 25 words or less.

  - Deduct points if the required comment on each declaration is
    missing.  (The Introduction states, "Add a brief comment on every
    structure, structure member, global or static variable, and
    function definition.")

  - Deduct points if the response does not describe *purpose*.  We
    can see the type and the name of each entity from their
    declarations.  But why are they needed?  If the comments
    themselves adequately explain purpose, then that is sufficient.

    Comments should provide information that is not obvious from the
    code.  A common mistake is for a comment to duplicate information
    in the variable or function name.  For example, a variable
    "pageFaultCount" might have the comment "count of page faults";
    that's not much help!  (A sketch contrasting such a comment with
    a purpose-describing one appears at the end of this subsection.)

---- RATIONALE ----

Why did you choose this design?  In what ways is it superior to
another design you considered?

  - Deduct points for failing to compare their design to another
    *correct* possibility.
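As promised above, here is a minimal sketch of the difference between
a name-restating comment and a purpose-describing one.  The variable
and its stated purpose are invented for illustration; they are not
taken from Pintos or from any submission.

    /* Hypothetical declaration, shown twice with different comments. */

    /* Comment that merely restates the name -- tells the grader nothing: */
    static unsigned page_fault_cnt;   /* Count of page faults. */

    /* Comment that states the purpose -- this is what to look for: */
    static unsigned page_fault_cnt;   /* Page faults taken by the current
                                         process; consulted by the (here
                                         hypothetical) eviction policy to
                                         choose a victim frame. */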
Grading Code
------------

You should start by quickly scanning all the submitted code by eye.
Usually the easiest way to do this is with a command like

        diff -urpbN -X /usr/class/cs140/submissions/diff.ignore \
            /usr/class/cs140/pintos/pintos/src pintos/src | less

in a group's top-level directory.  The options to "diff" here are
useful:

  -u: produces a "unified" diff that is easier to read than the
      default format.

  -r: recurses on directories.

  -p: prints the name of the function in which each difference lies.

  -b: ignores differences in white space.

  -N: includes added files in the diff.

  -X .../diff.ignore: ignores files that match patterns listed in
      diff.ignore, which lists files that you don't really want to
      look at.  You can add to the list when you notice other files
      that should be ignored.

You can page through the "diff" output fairly quickly, perhaps a few
seconds per page.  Nevertheless, you should be able to notice some
important flaws:

  - Inconsistent style: indentation changing randomly between 4
    spaces and 8 spaces per level, between BSD and GNU brace
    placement, and so on.  (The Introduction states, "In new source
    files, adopt the existing Pintos style by preference, but make
    your code self-consistent at the very least.  There should not be
    a patchwork of different styles that makes it obvious that three
    different people wrote the code.")

  - Bad style: such as no indentation at all, or cramming many
    statements onto a single line.

  - Many very long source code lines (over 100 columns wide).

  - Lack of white space: a consistent lack of spaces after commas or
    around binary operators that makes code difficult to read.

  - Use of static or file-scope ("global") variables instead of
    automatic, block-scope ("local") variables: one student
    submission actually declared 12 (!) different global variables
    "just so we don't have to make a new var in each function".  This
    is unacceptable.

  - Use of struct thread members instead of automatic, block-scope
    ("local") variables: sometimes it's not obvious whether this is
    the case, but subtract points when it is.

  - Code copied into multiple places that should be abstracted into a
    function.

  - Gratuitous use of dynamic allocation: e.g. a struct that contains
    a pointer to a semaphore instead of a semaphore itself (a sketch
    of this case follows the list).
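To make the last point concrete, here is a minimal sketch.  The
structs are invented for illustration (they are not from Pintos or
from any submission), and the two versions use different tags only so
that they can appear side by side in a compilable example.

    #include "threads/synch.h"

    /* Gratuitous dynamic allocation: every user of this struct must
       allocate the semaphore separately, check for allocation failure,
       and remember to free it later. */
    struct child_wait_ptr
      {
        struct semaphore *done;   /* Must be allocated and freed separately. */
      };

    /* Preferred: embed the semaphore so that it lives and dies with the
       struct; a single sema_init() call sets it up, and nothing extra
       needs to be freed. */
    struct child_wait_embedded
      {
        struct semaphore done;    /* Initialized in place with sema_init(). */
      };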