pintos-os.org Git - pintos-anon/blob - doc/44bsd.texi

   1 @node 4.4BSD Scheduler
   2 @appendix 4.4@acronym{BSD} Scheduler
   3
   4 @iftex
   5 @macro tm{TEX}
   6 @math{\TEX\}
   7 @end macro
   8 @macro nm{TXT}
   9 @end macro
  10 @macro am{TEX, TXT}
  11 @math{\TEX\}
  12 @end macro
  13 @end iftex
  14
  15 @ifnottex
  16 @macro tm{TEX}
  17 @end macro
  18 @macro nm{TXT}
  19 @w{\TXT\}
  20 @end macro
  21 @macro am{TEX, TXT}
  22 @w{\TXT\}
  23 @end macro
  24 @end ifnottex
  25
  26 @ifhtml
  27 @macro math{TXT}
  28 \TXT\
  29 @end macro
  30 @end ifhtml
  31
  32 @macro m{MATH}
  33 @am{\MATH\, \MATH\}
  34 @end macro
  35
  36 The goal of a general-purpose scheduler is to balance threads' different
  37 scheduling needs.  Threads that perform a lot of I/O require a fast
  38 response time to keep input and output devices busy, but need little CPU
  39 time.  On the other hand, compute-bound threads need to receive a lot of
  40 CPU time to finish their work, but have no requirement for fast response
  41 time.  Other threads lie somewhere in between, with periods of I/O
  42 punctuated by periods of computation, and thus have requirements that
  43 vary over time.  A well-designed scheduler can often accommodate threads
  44 with all these requirements simultaneously.
  45
  46 For project 1, you must implement the scheduler described in this
  47 appendix.  Our scheduler resembles the one described in @bibref{McKusick},
  48 which is one example of a @dfn{multilevel feedback queue} scheduler.
  49 This type of scheduler maintains several queues of ready-to-run threads,
  50 where each queue holds threads with a different priority.  At any given
  51 time, the scheduler chooses a thread from the highest-priority non-empty
  52 queue.  If the highest-priority queue contains multiple threads, then
  53 they run in ``round robin'' order.
  54
  55 Multiple facets of the scheduler require data to be updated after a
  56 certain number of timer ticks.  In every case, these updates should
  57 occur before any ordinary kernel thread has a chance to run, so that
  58 there is no chance that a kernel thread could see a newly increased
  59 @func{timer_ticks} value but old scheduler data values.
  60
  61 The 4.4@acronym{BSD} scheduler does not include priority donation.
  62
  63 @menu
  64 * Thread Niceness::
  65 * Calculating Priority::
  66 * Calculating recent_cpu::
  67 * Calculating load_avg::
  68 * 4.4BSD Scheduler Summary::
  69 * Fixed-Point Real Arithmetic::
  70 @end menu
  71
  72 @node Thread Niceness
  73 @section Niceness
  74
  75 Thread priority is dynamically determined by the scheduler using a
  76 formula given below.  However, each thread also has an integer
  77 @dfn{nice} value that determines how ``nice'' the thread should be to
  78 other threads.  A @var{nice} of zero does not affect thread priority.  A
  79 positive @var{nice}, to the maximum of 20, decreases the priority of a
  80 thread and causes it to give up some CPU time it would otherwise receive.
  81 On the other hand, a negative @var{nice}, to the minimum of -20, tends
  82 to take away CPU time from other threads.
  83
  84 The initial thread starts with a @var{nice} value of zero.  Other
  85 threads start with a @var{nice} value inherited from their parent
  86 thread.  You must implement the functions described below, which are for
  87 use by test programs.  We have provided skeleton definitions for them in
  88 @file{threads/thread.c}.
  89
  90 @deftypefun int thread_get_nice (void)
  91 Returns the current thread's @var{nice} value.
  92 @end deftypefun
  93
  94 @deftypefun void thread_set_nice (int @var{new_nice})
  95 Sets the current thread's @var{nice} value to @var{new_nice} and
  96 recalculates the thread's priority based on the new value
  97 (@pxref{Calculating Priority}).  If the running thread no longer has the
  98 highest priority, yields.
  99 @end deftypefun
 100
 101 @node Calculating Priority
 102 @section Calculating Priority
 103
 104 Our scheduler has 64 priorities and thus 64 ready queues, numbered 0
 105 (@code{PRI_MIN}) through 63 (@code{PRI_MAX}).  Lower numbers correspond
 106 to lower priorities, so that priority 0 is the lowest priority
 107 and priority 63 is the highest.  Thread priority is calculated initially
 108 at thread initialization.  It is also recalculated once every fourth
 109 clock tick, for every thread.  In either case, it is determined by
 110 the formula
 111
 112 @center @t{@var{priority} = @code{PRI_MAX} - (@var{recent_cpu} / 4) - (@var{nice} * 2)},
 113
 114 @noindent where @var{recent_cpu} is an estimate of the CPU time the
 115 thread has used recently (see below) and @var{nice} is the thread's
 116 @var{nice} value.  The result should be rounded down to the nearest
 117 integer (truncated).
 118 The coefficients @math{1/4} and 2 on @var{recent_cpu}
 119 and @var{nice}, respectively, have been found to work well in practice
 120 but lack deeper meaning.  The calculated @var{priority} is always
 121 adjusted to lie in the valid range @code{PRI_MIN} to @code{PRI_MAX}.
 122
 123 This formula gives a thread that has received CPU
 124 time recently lower priority for being reassigned the CPU the next
 125 time the scheduler runs.  This is key to preventing starvation: a
 126 thread that has not received any CPU time recently will have a
 127 @var{recent_cpu} of 0, which barring a high @var{nice} value should
 128 ensure that it receives CPU time soon.
 129
 130 @node Calculating recent_cpu
 131 @section Calculating @var{recent_cpu}
 132
 133 We wish @var{recent_cpu} to measure how much CPU time each process has
 134 received ``recently.'' Furthermore, as a refinement, more recent CPU
 135 time should be weighted more heavily than less recent CPU time.  One
 136 approach would use an array of @var{n} elements to
 137 track the CPU time received in each of the last @var{n} seconds.
 138 However, this approach requires O(@var{n}) space per thread and
 139 O(@var{n}) time per calculation of a new weighted average.
 140
 141 Instead, we use a @dfn{exponentially weighted moving average}, which
 142 takes this general form:
 143
 144 @center @tm{x(0) = f(0),}@nm{x(0) = f(0),}
 145 @center @tm{x(t) = ax(t-1) + (1-a)f(t),}@nm{x(t) = a*x(t-1) + f(t),}
 146 @center @tm{a = k/(k+1),}@nm{a = k/(k+1),}
 147
 148 @noindent where @math{x(t)} is the moving average at integer time @am{t
 149 \ge 0, t >= 0}, @math{f(t)} is the function being averaged, and @math{k
 150 > 0} controls the rate of decay.  We can iterate the formula over a few
 151 steps as follows:
 152
 153 @center @math{x(1) = f(1)},
 154 @center @am{x(2) = af(1) + f(2), x(2) = a*f(1) + f(2)},
 155 @center @am{\vdots, ...}
 156 @center @am{x(5) = a^4f(1) + a^3f(2) + a^2f(3) + af(4) + f(5), x(5) = a**4*f(1) + a**3*f(2) + a**2*f(3) + a*f(4) + f(5)}.
 157
 158 @noindent The value of @math{f(t)} has a weight of 1 at time @math{t}, a
 159 weight of @math{a} at time @math{t+1}, @am{a^2, a**2} at time
 160 @math{t+2}, and so on.  We can also relate @math{x(t)} to @math{k}:
 161 @math{f(t)} has a weight of approximately @math{1/e} at time @math{t+k},
 162 approximately @am{1/e^2, 1/e**2} at time @am{t+2k, t+2*k}, and so on.
 163 From the opposite direction, @math{f(t)} decays to weight @math{w} at
 164 time @am{t + \log_aw, t + ln(w)/ln(a)}.
 165
 166 The initial value of @var{recent_cpu} is 0 in the first thread
 167 created, or the parent's value in other new threads.  Each time a timer
 168 interrupt occurs, @var{recent_cpu} is incremented by 1 for the running
 169 thread only, unless the idle thread is running.  In addition, once per
 170 second the value of @var{recent_cpu}
 171 is recalculated for every thread (whether running, ready, or blocked),
 172 using this formula:
 173
 174 @center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}},
 175
 176 @noindent where @var{load_avg} is a moving average of the number of
 177 threads ready to run (see below).  If @var{load_avg} is 1, indicating
 178 that a single thread, on average, is competing for the CPU, then the
 179 current value of @var{recent_cpu} decays to a weight of .1 in
 180 @am{\log_{2/3}.1 \approx 6, ln(.1)/ln(2/3) = approx. 6} seconds; if
 181 @var{load_avg} is 2, then decay to a weight of .1 takes @am{\log_{3/4}.1
 182 \approx 8, ln(.1)/ln(3/4) = approx. 8} seconds.  The effect is that
 183 @var{recent_cpu} estimates the amount of CPU time the thread has
 184 received ``recently,'' with the rate of decay inversely proportional to
 185 the number of threads competing for the CPU.
 186
 187 Assumptions made by some of the tests require that these recalculations of
 188 @var{recent_cpu} be made exactly when the system tick counter reaches a
 189 multiple of a second, that is, when @code{timer_ticks () % TIMER_FREQ ==
 190 0}, and not at any other time.
 191
 192 The value of @var{recent_cpu} can be negative for a thread with a
 193 negative @var{nice} value.  Do not clamp negative @var{recent_cpu} to 0.
 194
 195 You may need to think about the order of calculations in this formula.
 196 We recommend computing the coefficient of @var{recent_cpu} first, then
 197 multiplying.  Some students have reported that multiplying
 198 @var{load_avg} by @var{recent_cpu} directly can cause overflow.
 199
 200 You must implement @func{thread_get_recent_cpu}, for which there is a
 201 skeleton in @file{threads/thread.c}.
 202
 203 @deftypefun int thread_get_recent_cpu (void)
 204 Returns 100 times the current thread's @var{recent_cpu} value, rounded
 205 to the nearest integer.
 206 @end deftypefun
 207
 208 @node Calculating load_avg
 209 @section Calculating @var{load_avg}
 210
 211 Finally, @var{load_avg}, often known as the system load average,
 212 estimates the average number of threads ready to run over the past
 213 minute.  Like @var{recent_cpu}, it is an exponentially weighted moving
 214 average.  Unlike @var{priority} and @var{recent_cpu}, @var{load_avg} is
 215 system-wide, not thread-specific.  At system boot, it is initialized to
 216 0.  Once per second thereafter, it is updated according to the following
 217 formula:
 218
 219 @center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}},
 220
 221 @noindent where @var{ready_threads} is the number of threads that are
 222 either running or ready to run at time of update (not including the idle
 223 thread).
 224
 225 Because of assumptions made by some of the tests, @var{load_avg} must be
 226 updated exactly when the system tick counter reaches a multiple of a
 227 second, that is, when @code{timer_ticks () % TIMER_FREQ == 0}, and not
 228 at any other time.
 229
 230 You must implement @func{thread_get_load_avg}, for which there is a
 231 skeleton in @file{threads/thread.c}.
 232
 233 @deftypefun int thread_get_load_avg (void)
 234 Returns 100 times the current system load average, rounded to the
 235 nearest integer.
 236 @end deftypefun
 237
 238 @node 4.4BSD Scheduler Summary
 239 @section Summary
 240
 241 The following formulas summarize the calculations required to implement the
 242 scheduler.  They are not a complete description of scheduler requirements.
 243
 244 Every thread has a @var{nice} value between -20 and 20 directly under
 245 its control.  Each thread also has a priority, between 0
 246 (@code{PRI_MIN}) through 63 (@code{PRI_MAX}), which is recalculated
 247 using the following formula every fourth tick:
 248
 249 @center @t{@var{priority} = @code{PRI_MAX} - (@var{recent_cpu} / 4) - (@var{nice} * 2)}.
 250
 251 @var{recent_cpu} measures the amount of CPU time a thread has received
 252 ``recently.''  On each timer tick, the running thread's @var{recent_cpu}
 253 is incremented by 1.  Once per second, every thread's @var{recent_cpu}
 254 is updated this way:
 255
 256 @center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}}.
 257
 258 @var{load_avg} estimates the average number of threads ready to run over
 259 the past minute.  It is initialized to 0 at boot and recalculated once
 260 per second as follows:
 261
 262 @center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}}.
 263
 264 @noindent where @var{ready_threads} is the number of threads that are
 265 either running or ready to run at time of update (not including the idle
 266 thread).
 267
 268 @node Fixed-Point Real Arithmetic
 269 @section Fixed-Point Real Arithmetic
 270
 271 In the formulas above, @var{priority}, @var{nice}, and
 272 @var{ready_threads} are integers, but @var{recent_cpu} and @var{load_avg}
 273 are real numbers.  Unfortunately, Pintos does not support floating-point
 274 arithmetic in the kernel, because it would
 275 complicate and slow the kernel.  Real kernels often have the same
 276 limitation, for the same reason.  This means that calculations on real
 277 quantities must be simulated using integers.  This is not
 278 difficult, but many students do not know how to do it.  This
 279 section explains the basics.
 280
 281 The fundamental idea is to treat the rightmost bits of an integer as
 282 representing a fraction.  For example, we can designate the lowest 14
 283 bits of a signed 32-bit integer as fractional bits, so that an integer
 284 @m{x} represents the real number
 285 @iftex
 286 @m{x/2^{14}}.
 287 @end iftex
 288 @ifnottex
 289 @m{x/(2**14)}, where ** represents exponentiation.
 290 @end ifnottex
 291 This is called a 17.14 fixed-point number representation, because there
 292 are 17 bits before the decimal point, 14 bits after it, and one sign
 293 bit.@footnote{Because we are working in binary, the ``decimal'' point
 294 might more correctly be called the ``binary'' point, but the meaning
 295 should be clear.} A number in 17.14 format represents, at maximum, a
 296 value of @am{(2^{31} - 1) / 2^{14} \approx, (2**31 - 1)/(2**14) =
 297 approx.} 131,071.999.
 298
 299 Suppose that we are using a @m{p.q} fixed-point format, and let @am{f =
 300 2^q, f = 2**q}.  By the definition above, we can convert an integer or
 301 real number into @m{p.q} format by multiplying with @m{f}.  For example,
 302 in 17.14 format the fraction 59/60 used in the calculation of
 303 @var{load_avg}, above, is @am{(59/60)2^{14}, 59/60*(2**14)} = 16,111
 304 (rounded to nearest).  To convert a fixed-point value back to an
 305 integer, divide by @m{f}.  (The normal @samp{/} operator in C rounds
 306 toward zero, that is, it rounds positive numbers down and negative
 307 numbers up.  To round to nearest, add @m{f / 2} to a positive number, or
 308 subtract it from a negative number, before dividing.)
 309
 310 Many operations on fixed-point numbers are straightforward.  Let
 311 @code{x} and @code{y} be fixed-point numbers, and let @code{n} be an
 312 integer.  Then the sum of @code{x} and @code{y} is @code{x + y} and
 313 their difference is @code{x - y}.  The sum of @code{x} and @code{n} is
 314 @code{x + n * f}; difference, @code{x - n * f}; product, @code{x * n};
 315 quotient, @code{x / n}.
 316
 317 Multiplying two fixed-point values has two complications.  First, the
 318 decimal point of the result is @m{q} bits too far to the left.  Consider
 319 that @am{(59/60)(59/60), (59/60)*(59/60)} should be slightly less than
 320 1, but @tm{16,111\times 16,111}@nm{16,111*16,111} = 259,564,321 is much
 321 greater than @am{2^{14},2**14} = 16,384.  Shifting @m{q} bits right, we
 322 get @tm{259,564,321/2^{14}}@nm{259,564,321/(2**14)} = 15,842, or about 0.97,
 323 the correct answer.  Second, the multiplication can overflow even though
 324 the answer is representable.  For example, 64 in 17.14 format is
 325 @am{64 \times 2^{14}, 64*(2**14)} = 1,048,576 and its square @am{64^2,
 326 64**2} = 4,096 is well within the 17.14 range, but @tm{1,048,576^2 =
 327 2^{40}}@nm{1,048,576**2 = 2**40}, greater than the maximum signed 32-bit
 328 integer value @am{2^{31} - 1, 2**31 - 1}.  An easy solution is to do the
 329 multiplication as a 64-bit operation.  The product of @code{x} and
 330 @code{y} is then @code{((int64_t) x) * y / f}.
 331
 332 Dividing two fixed-point values has opposite issues.  The
 333 decimal point will be too far to the right, which we fix by shifting the
 334 dividend @m{q} bits to the left before the division.  The left shift
 335 discards the top @m{q} bits of the dividend, which we can again fix by
 336 doing the division in 64 bits.  Thus, the quotient when @code{x} is
 337 divided by @code{y} is @code{((int64_t) x) * f / y}.
 338
 339 This section has consistently used multiplication or division by @m{f},
 340 instead of @m{q}-bit shifts, for two reasons.  First, multiplication and
 341 division do not have the surprising operator precedence of the C shift
 342 operators.  Second, multiplication and division are well-defined on
 343 negative operands, but the C shift operators are not.  Take care with
 344 these issues in your implementation.
 345
 346 The following table summarizes how fixed-point arithmetic operations can
 347 be implemented in C.  In the table, @code{x} and @code{y} are
 348 fixed-point numbers, @code{n} is an integer, fixed-point numbers are in
 349 signed @m{p.q} format where @m{p + q = 31}, and @code{f} is @code{1 <<
 350 q}:
 351
 352 @html
 353 <CENTER>
 354 @end html
 355 @multitable @columnfractions .5 .5
 356 @item Convert @code{n} to fixed point:
 357 @tab @code{n * f}
 358
 359 @item Convert @code{x} to integer (rounding toward zero):
 360 @tab @code{x / f}
 361
 362 @item Convert @code{x} to integer (rounding to nearest):
 363 @tab @code{(x + f / 2) / f} if @code{x >= 0}, @*
 364 @code{(x - f / 2) / f} if @code{x <= 0}.
 365
 366 @item Add @code{x} and @code{y}:
 367 @tab @code{x + y}
 368
 369 @item Subtract @code{y} from @code{x}:
 370 @tab @code{x - y}
 371
 372 @item Add @code{x} and @code{n}:
 373 @tab @code{x + n * f}
 374
 375 @item Subtract @code{n} from @code{x}:
 376 @tab @code{x - n * f}
 377
 378 @item Multiply @code{x} by @code{y}:
 379 @tab @code{((int64_t) x) * y / f}
 380
 381 @item Multiply @code{x} by @code{n}:
 382 @tab @code{x * n}
 383
 384 @item Divide @code{x} by @code{y}:
 385 @tab @code{((int64_t) x) * f / y}
 386
 387 @item Divide @code{x} by @code{n}:
 388 @tab @code{x / n}
 389 @end multitable
 390 @html
 391 </CENTER>
 392 @end html