pintos-os.org Git - pintos-anon/blob - doc/44bsd.texi

   1 @node 4.4BSD Scheduler, Coding Standards, References, Top
   2 @appendix 4.4@acronym{BSD} Scheduler
   3
   4 @iftex
   5 @macro tm{TEX}
   6 @math{\TEX\}
   7 @end macro
   8 @macro nm{TXT}
   9 @end macro
  10 @macro am{TEX, TXT}
  11 @math{\TEX\}
  12 @end macro
  13 @end iftex
  14
  15 @ifnottex
  16 @macro tm{TEX}
  17 @end macro
  18 @macro nm{TXT}
  19 @w{\TXT\}
  20 @end macro
  21 @macro am{TEX, TXT}
  22 @w{\TXT\}
  23 @end macro
  24 @end ifnottex
  25
  26 @ifhtml
  27 @macro math{TXT}
  28 \TXT\
  29 @end macro
  30 @end ifhtml
  31
  32 @macro m{MATH}
  33 @am{\MATH\, \MATH\}
  34 @end macro
  35
  36 The goal of a general-purpose scheduler is to balance threads' different
  37 scheduling needs.  Threads that perform a lot of I/O require a fast
  38 response time to keep input and output devices busy, but need little CPU
  39 time.  On the other hand, compute-bound threads need to receive a lot of
  40 CPU time to finish their work, but have no requirement for fast response
  41 time.  Other threads lie somewhere in between, with periods of I/O
  42 punctuated by periods of computation, and thus have requirements that
  43 vary over time.  A well-designed scheduler can often accommodate threads
  44 with all these requirements simultaneously.
  45
  46 For project 1, you must implement the scheduler described in this
  47 appendix.  Our scheduler resembles the one described in @bibref{4.4BSD},
  48 which is one example of a @dfn{multilevel feedback queue} scheduler.
  49 This type of scheduler maintains several queues of ready-to-run threads,
  50 where each queue holds threads with a different priority.  At any given
  51 time, the scheduler chooses a thread from the highest-priority non-empty
  52 queue.  If the highest-priority queue contains multiple threads, then
  53 they run in ``round robin'' order.
  54
  55 @menu
  56 * Thread Niceness::
  57 * Calculating Priority::
  58 * Calculating recent_cpu::
  59 * Calculating load_avg::
  60 * Fixed-Point Real Arithmetic::
  61 @end menu
  62
  63 @node Thread Niceness
  64 @section Niceness
  65
  66 Thread priority is dynamically determined by the scheduler using a
  67 formula given below.  However, each thread also has a relatively static
  68 @dfn{nice} value between -20 and 20 that determines how ``nice'' the
  69 thread should be to other threads.  A @var{nice} of zero does not affect
  70 thread priority.  A positive @var{nice} increases the numeric priority
  71 of a thread, decreasing its effective priority, and causes it to give up
  72 some CPU time it would otherwise receive.  On the other hand, a negative
  73 @var{nice} tends to take away CPU time from other threads.
  74
  75 The initial thread starts with a @var{nice} value of zero.  Other
  76 threads start with a @var{nice} value inherited from their parent
  77 thread.  You
  78 must implement these functions, for which we have provided skeleton
  79 definitions in @file{threads/thread.c}.
  80
  81 @deftypefun int thread_get_nice (void)
  82 Returns the current thread's @var{nice} value.
  83 @end deftypefun
  84
  85 @deftypefun void thread_set_nice (int @var{new_nice})
  86 Sets the current thread's @var{nice} value to @var{new_nice} and
  87 recalculates the thread's priority based on the new value
  88 (@pxref{Calculating Priority}).  If the running thread no longer has the
  89 highest priority, yields.
  90 @end deftypefun
  91
  92 @node Calculating Priority
  93 @section Calculating Priority
  94
  95 Our scheduler has 64 priorities and thus 64 ready queues, numbered 0
  96 (@code{PRI_MIN}) through 63 (@code{PRI_MAX}).  Lower numbers correspond
  97 to @emph{higher} priorities, so that priority 0 is the highest priority
  98 and priority 63 is the lowest.  Thread priority is calculated initially
  99 at thread initialization.  It is also recalculated once every fourth
 100 clock tick, for every thread.  In either case, it is determined by
 101 the formula
 102
 103 @center @t{@var{priority} = (@var{recent_cpu} / 4) + (@var{nice} * 2)},
 104
 105 @noindent where @var{recent_cpu} is an estimate of the CPU time the
 106 thread has used recently (see below) and @var{nice} is the thread's
 107 @var{nice} value.  The coefficients @math{1/4} and 2 on @var{recent_cpu}
 108 and @var{nice}, respectively, have been found to work well in practice
 109 but lack deeper meaning.  The calculated @var{priority} is always
 110 adjusted to lie in the valid range @code{PRI_MIN} to @code{PRI_MAX}.
 111
 112 This formula gives a thread that has received CPU
 113 time recently lower priority for being reassigned the CPU the next
 114 time the scheduler runs.  This is key to preventing starvation: a
 115 thread that has not received any CPU time recently will have a
 116 @var{recent_cpu} of 0, which barring a high @var{nice} value should
 117 ensure that it receives CPU time soon.
 118
 119 @node Calculating recent_cpu
 120 @section Calculating @var{recent_cpu}
 121
 122 We wish @var{recent_cpu} to measure how much CPU time each process has
 123 received ``recently.'' Furthermore, as a refinement, more recent CPU
 124 time should be weighted more heavily than less recent CPU time.  One
 125 approach would use an array of @var{n} elements to
 126 track the CPU time received in each of the last @var{n} seconds.
 127 However, this approach requires O(@var{n}) space per thread and
 128 O(@var{n}) time per calculation of a new weighted average.
 129
 130 Instead, we use a @dfn{exponentially weighted moving average}, which
 131 takes this general form:
 132
 133 @center @tm{x(0) = f(0),}@nm{x(0) = f(0),}
 134 @center @tm{x(t) = ax(t-1) + (1-a)f(t),}@nm{x(t) = a*x(t-1) + f(t),}
 135 @center @tm{a = k/(k+1),}@nm{a = k/(k+1),}
 136
 137 @noindent where @math{x(t)} is the moving average at integer time @am{t
 138 \ge 0, t >= 0}, @math{f(t)} is the function being averaged, and @math{k
 139 > 0} controls the rate of decay.  We can iterate the formula over a few
 140 steps as follows:
 141
 142 @center @math{x(1) = f(1)},
 143 @center @am{x(2) = af(1) + f(2), x(2) = a*f(1) + f(2)},
 144 @center @am{\vdots, ...}
 145 @center @am{x(5) = a^4f(1) + a^3f(2) + a^2f(3) + af(4) + f(5), x(5) = a**4*f(1) + a**3*f(2) + a**2*f(3) + a*f(4) + f(5)}.
 146
 147 @noindent The value of @math{f(t)} has a weight of 1 at time @math{t}, a
 148 weight of @math{a} at time @math{t+1}, @am{a^2, a**2} at time
 149 @math{t+2}, and so on.  We can also relate @math{x(t)} to @math{k}:
 150 @math{f(t)} has a weight of approximately @math{1/e} at time @math{t+k},
 151 approximately @am{1/e^2, 1/e**2} at time @am{t+2k, t+2*k}, and so on.
 152 From the opposite direction, @math{f(t)} decays to weight @math{w} at
 153 @am{t = \log_aw, t = ln(w)/ln(a)}.
 154
 155 The initial value of @var{recent_cpu} is 0 in the first thread
 156 created, or the parent's value in other new threads.  Each time a timer
 157 interrupt occurs, @var{recent_cpu} is incremented by 1 for the running
 158 thread only.  In addition, once per second the value of @var{recent_cpu}
 159 is recalculated for every thread (whether running, ready, or blocked),
 160 using this formula:
 161
 162 @center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}},
 163
 164 @noindent where @var{load_avg} is a moving average of the number of
 165 threads ready to run (see below).  If @var{load_avg} is 1, indicating
 166 that a single thread, on average, is competing for the CPU, then the
 167 current value of @var{recent_cpu} decays to a weight of .1 in
 168 @am{\log_{2/3}.1 \approx 6, ln(2/3)/ln(.1) = approx. 6} seconds; if
 169 @var{load_avg} is 2, then decay to a weight of .1 takes @am{\log_{3/4}.1
 170 \approx 8, ln(3/4)/ln(.1) = approx. 8} seconds.  The effect is that
 171 @var{recent_cpu} estimates the amount of CPU time the thread has
 172 received ``recently,'' with the rate of decay inversely proportional to
 173 the number of threads competing for the CPU.
 174
 175 Because of assumptions made by some of the tests, @var{recent_cpu} must
 176 be updated exactly when the system tick counter reaches a multiple of a
 177 second, that is, when @code{timer_ticks () % TIMER_FREQ == 0}, and not
 178 at any other time.
 179
 180 Take note that @var{recent_cpu} can be a negative quantity for a thread
 181 with a negative @var{nice} value.  Negative values of @var{recent_cpu}
 182 are not changed to 0.
 183
 184 You must implement @func{thread_get_recent_cpu}, for which there is a
 185 skeleton in @file{threads/thread.c}.
 186
 187 @deftypefun int thread_get_recent_cpu (void)
 188 Returns 100 times the current thread's @var{recent_cpu} value, rounded
 189 to the nearest integer.
 190 @end deftypefun
 191
 192 @node Calculating load_avg
 193 @section Calculating @var{load_avg}
 194
 195 Finally, @var{load_avg}, often known as the system load average,
 196 estimates the average number of threads ready to run over the past
 197 minute.  Like @var{recent_cpu}, it is an exponentially weighted moving
 198 average.  Unlike @var{priority} and @var{recent_cpu}, @var{load_avg} is
 199 system-wide, not thread-specific.  At system boot, it is initialized to
 200 0.  Once per second thereafter, it is updated according to the following
 201 formula:
 202
 203 @center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}},
 204
 205 @noindent where @var{ready_threads} is the number of threads that are
 206 either running or ready to run at time of update (not including the idle
 207 thread).
 208
 209 Because of assumptions made by some of the tests, @var{load_avg} must be
 210 updated exactly when the system tick counter reaches a multiple of a
 211 second, that is, when @code{timer_ticks () % TIMER_FREQ == 0}, and not
 212 at any other time.
 213
 214 You must implement @func{thread_get_load_avg}, for which there is a
 215 skeleton in @file{threads/thread.c}.
 216
 217 @deftypefun int thread_get_load_avg (void)
 218 Returns 100 times the current system load average, rounded to the
 219 nearest integer.
 220 @end deftypefun
 221
 222 @menu
 223 * Fixed-Point Real Arithmetic::
 224 @end menu
 225
 226 @node Fixed-Point Real Arithmetic
 227 @section Fixed-Point Real Arithmetic
 228
 229 In the formulas above, @var{priority}, @var{nice}, and
 230 @var{ready_threads} are integers, but @var{recent_cpu} and @var{load_avg}
 231 are real numbers.  Unfortunately, Pintos does not support floating-point
 232 arithmetic in the kernel, because it would
 233 complicate and slow the kernel.  Real kernels often have the same
 234 limitation, for the same reason.  This means that calculations on real
 235 quantities must be simulated using integers.  This is not
 236 difficult, but many students do not know how to do it.  This
 237 section explains the basics.
 238
 239 The fundamental idea is to treat the rightmost bits of an integer as
 240 representing a fraction.  For example, we can designate the lowest 10
 241 bits of a signed 32-bit integer as fractional bits, so that an integer
 242 @var{x} represents the real number
 243 @iftex
 244 @m{x/2^{10}}.
 245 @end iftex
 246 @ifnottex
 247 @m{x/(2**10)}, where ** represents exponentiation.
 248 @end ifnottex
 249 This is called a 21.10 fixed-point number representation, because there
 250 are 21 bits before the decimal point, 10 bits after it, and one sign
 251 bit.@footnote{Because we are working in binary, the ``decimal'' point
 252 might more correctly be called the ``binary'' point, but the meaning
 253 should be clear.} A number in 21.10 format represents, at maximum, a
 254 value of @am{(2^{31} - 1) / 2^{10} \approx, (2**31 - 1)/(2**10) =
 255 approx.} 2,097,151.999.
 256
 257 Suppose that we are using a @m{p.q} fixed-point format, and let @am{f =
 258 2^q, f = 2**q}.  By the definition above, we can convert an integer or
 259 real number into @m{p.q} format by multiplying with @m{f}.  For example,
 260 in 21.10 format the fraction 59/60 used in the calculation of
 261 @var{load_avg}, above, is @am{(59/60)2^{10}, 59/60*(2**10)} = 1,007
 262 (rounded to nearest).  To convert a fixed-point value back to an
 263 integer, divide by @m{f}.  (The normal @samp{/} operator in C rounds
 264 down.  To round to nearest, add @m{f / 2} before dividing.)
 265
 266 Many operations on fixed-point numbers are straightforward.  Let
 267 @code{x} and @code{y} be fixed-point numbers, and let @code{n} be an
 268 integer.  Then the sum of @code{x} and @code{y} is @code{x + y} and
 269 their difference is @code{x - y}.  The sum of @code{x} and @code{n} is
 270 @code{x + n * f}; difference, @code{x - n * f}; product, @code{x * n};
 271 quotient, @code{x / n}.
 272
 273 Multiplying two fixed-point values has two complications.  First, the
 274 decimal point of the result is @m{q} bits too far to the left.  Consider
 275 that @am{(59/60)(59/60), (59/60)*(59/60)} should be slightly less than
 276 1, but @tm{1,007\times 1,007}@nm{1,007*1,007} = 1,014,049 is much
 277 greater than @am{2^{10},2**10} = 1,024.  Shifting @m{q} bits right, we
 278 get @tm{1,014,049/2^{10}}@nm{1,014,049/(2**10)} = 990, or about 0.97,
 279 the correct answer.  Second, the multiplication can overflow even though
 280 the answer is representable.  For example, 128 in 21.10 format is
 281 @am{128 \times 2^{10}, 128*(2**10)} = 131,072 and its square @am{128^2,
 282 128**2} = 16,384 is well within the 21.10 range, but @tm{131,072^2 =
 283 2^{34}}@nm{131,072**2 = 2**34}, greater than the maximum signed 32-bit
 284 integer value @am{2^{31} - 1, 2**31 - 1}.  An easy solution is to do the
 285 multiplication as a 64-bit operation.  The product of @code{x} and
 286 @code{y} is then @code{((int64_t) x) * y / f}.
 287
 288 Dividing two fixed-point values has the opposite complications.  The
 289 decimal point will be too far to the right, which we fix by shifting the
 290 dividend @m{q} bits to the left before the division.  The left shift
 291 discards the top @m{q} bits of the dividend, which we can again fix by
 292 doing the division in 64 bits.  Thus, the quotient when @code{x} is
 293 divided by @code{y} is @code{((int64_t) x) * f / y}.
 294
 295 This section has consistently used multiplication or division by @m{f},
 296 instead of @m{q}-bit shifts, for two reasons.  First, multiplication and
 297 division do not have the surprising operator precedence of the C shift
 298 operators.  Second, multiplication and division are well-defined on
 299 negative operands, but the C shift operators are not.  Take care with
 300 these issues in your implementation.
 301
 302 The following table summarizes how fixed-point arithmetic operations can
 303 be implemented in C.  In the table, @code{x} and @code{y} are
 304 fixed-point numbers, @code{n} is an integer, fixed-point numbers are in
 305 signed @m{p.q} format where @m{p + q = 31}, and @code{f} is @code{1 <<
 306 q}:
 307
 308 @html
 309 <CENTER>
 310 @end html
 311 @multitable @columnfractions .5 .5
 312 @item Convert @code{n} to fixed point:
 313 @tab @code{n * f}
 314
 315 @item Convert @code{x} to integer (rounding down):
 316 @tab @code{x * f}
 317
 318 @item Convert @code{x} to integer (rounding to nearest):
 319 @tab @code{(x + f / 2) / f}
 320
 321 @item Add @code{x} and @code{y}:
 322 @tab @code{x + y}
 323
 324 @item Subtract @code{y} from @code{x}:
 325 @tab @code{x - y}
 326
 327 @item Add @code{x} and @code{n}:
 328 @tab @code{x + n * f}
 329
 330 @item Subtract @code{n} from @code{x}:
 331 @tab @code{x - n * f}
 332
 333 @item Multiply @code{x} by @code{y}:
 334 @tab @code{((int64_t) x) * y / f}
 335
 336 @item Multiply @code{x} by @code{n}:
 337 @tab @code{x * n}
 338
 339 @item Divide @code{x} by @code{y}:
 340 @tab @code{((int64_t) x) * f / y}
 341
 342 @item Divide @code{x} by @code{n}:
 343 @tab @code{x / n}
 344 @end multitable
 345 @html
 346 </CENTER>
 347 @end html