pintos-os.org Git - pintos-anon/blob - doc/44bsd.texi

   1 @node 4.4BSD Scheduler, Coding Standards, References, Top
   2 @appendix 4.4@acronym{BSD} Scheduler
   3
   4 @iftex
   5 @macro tm{TEX}
   6 @math{\TEX\}
   7 @end macro
   8 @macro nm{TXT}
   9 @end macro
  10 @macro am{TEX, TXT}
  11 @math{\TEX\}
  12 @end macro
  13 @end iftex
  14
  15 @ifnottex
  16 @macro tm{TEX}
  17 @end macro
  18 @macro nm{TXT}
  19 @w{\TXT\}
  20 @end macro
  21 @macro am{TEX, TXT}
  22 @w{\TXT\}
  23 @end macro
  24 @end ifnottex
  25
  26 @ifhtml
  27 @macro math{TXT}
  28 \TXT\
  29 @end macro
  30 @end ifhtml
  31
  32 @macro m{MATH}
  33 @am{\MATH\, \MATH\}
  34 @end macro
  35
  36 The goal of a general-purpose scheduler is to balance threads' different
  37 scheduling needs.  Threads that perform a lot of I/O require a fast
  38 response time to keep input and output devices busy, but need little CPU
  39 time.  On the other hand, compute-bound threads need to receive a lot of
  40 CPU time to finish their work, but have no requirement for fast response
  41 time.  Other threads lie somewhere in between, with periods of I/O
  42 punctuated by periods of computation, and thus have requirements that
  43 vary over time.  A well-designed scheduler can often accommodate threads
  44 with all these requirements simultaneously.
  45
  46 For project 1, you must implement the scheduler described in this
  47 appendix.  Our scheduler resembles the one described in @bibref{4.4BSD},
  48 which is one example of a @dfn{multilevel feedback queue} scheduler.
  49 This type of scheduler maintains several queues of ready-to-run threads,
  50 where each queue holds threads with a different priority.  At any given
  51 time, the scheduler chooses a thread from the highest-priority non-empty
  52 queue.  If the highest-priority queue contains multiple threads, then
  53 they run in ``round robin'' order.
  54
  55 Multiple facets of the scheduler require data to be updated after a
  56 certain number of timer ticks.  In every case, these updates should
  57 occur before any ordinary kernel thread has a chance to run, so that
  58 there is no chance that a kernel thread could see a newly increased
  59 @func{timer_ticks} value but old scheduler data values.
  60
  61 The 4.4@acronym{BSD} scheduler does not include priority donation.
  62
  63 @menu
  64 * Thread Niceness::
  65 * Calculating Priority::
  66 * Calculating recent_cpu::
  67 * Calculating load_avg::
  68 * 4.4BSD Scheduler Summary::
  69 * Fixed-Point Real Arithmetic::
  70 @end menu
  71
  72 @node Thread Niceness
  73 @section Niceness
  74
  75 Thread priority is dynamically determined by the scheduler using a
  76 formula given below.  However, each thread also has an integer
  77 @dfn{nice} value that determines how ``nice'' the thread should be to
  78 other threads.  A @var{nice} of zero does not affect thread priority.  A
  79 positive @var{nice}, to the maximum of 20, increases the numeric
  80 priority of a thread, decreasing its effective priority, and causes it
  81 to give up some CPU time it would otherwise receive.  On the other hand,
  82 a negative @var{nice}, to the minimum of -20, tends to take away CPU
  83 time from other threads.
  84
  85 The initial thread starts with a @var{nice} value of zero.  Other
  86 threads start with a @var{nice} value inherited from their parent
  87 thread.  You must implement the functions described below, which are for
  88 use by test programs.  We have provided skeleton definitions for them in
  89 @file{threads/thread.c}.  by test programs
  90
  91 @deftypefun int thread_get_nice (void)
  92 Returns the current thread's @var{nice} value.
  93 @end deftypefun
  94
  95 @deftypefun void thread_set_nice (int @var{new_nice})
  96 Sets the current thread's @var{nice} value to @var{new_nice} and
  97 recalculates the thread's priority based on the new value
  98 (@pxref{Calculating Priority}).  If the running thread no longer has the
  99 highest priority, yields.
 100 @end deftypefun
 101
 102 @node Calculating Priority
 103 @section Calculating Priority
 104
 105 Our scheduler has 64 priorities and thus 64 ready queues, numbered 0
 106 (@code{PRI_MIN}) through 63 (@code{PRI_MAX}).  Lower numbers correspond
 107 to @emph{higher} priorities, so that priority 0 is the highest priority
 108 and priority 63 is the lowest.  Thread priority is calculated initially
 109 at thread initialization.  It is also recalculated once every fourth
 110 clock tick, for every thread.  In either case, it is determined by
 111 the formula
 112
 113 @center @t{@var{priority} = (@var{recent_cpu} / 4) + (@var{nice} * 2)},
 114
 115 @noindent where @var{recent_cpu} is an estimate of the CPU time the
 116 thread has used recently (see below) and @var{nice} is the thread's
 117 @var{nice} value.  The coefficients @math{1/4} and 2 on @var{recent_cpu}
 118 and @var{nice}, respectively, have been found to work well in practice
 119 but lack deeper meaning.  The calculated @var{priority} is always
 120 adjusted to lie in the valid range @code{PRI_MIN} to @code{PRI_MAX}.
 121
 122 This formula gives a thread that has received CPU
 123 time recently lower priority for being reassigned the CPU the next
 124 time the scheduler runs.  This is key to preventing starvation: a
 125 thread that has not received any CPU time recently will have a
 126 @var{recent_cpu} of 0, which barring a high @var{nice} value should
 127 ensure that it receives CPU time soon.
 128
 129 @node Calculating recent_cpu
 130 @section Calculating @var{recent_cpu}
 131
 132 We wish @var{recent_cpu} to measure how much CPU time each process has
 133 received ``recently.'' Furthermore, as a refinement, more recent CPU
 134 time should be weighted more heavily than less recent CPU time.  One
 135 approach would use an array of @var{n} elements to
 136 track the CPU time received in each of the last @var{n} seconds.
 137 However, this approach requires O(@var{n}) space per thread and
 138 O(@var{n}) time per calculation of a new weighted average.
 139
 140 Instead, we use a @dfn{exponentially weighted moving average}, which
 141 takes this general form:
 142
 143 @center @tm{x(0) = f(0),}@nm{x(0) = f(0),}
 144 @center @tm{x(t) = ax(t-1) + (1-a)f(t),}@nm{x(t) = a*x(t-1) + f(t),}
 145 @center @tm{a = k/(k+1),}@nm{a = k/(k+1),}
 146
 147 @noindent where @math{x(t)} is the moving average at integer time @am{t
 148 \ge 0, t >= 0}, @math{f(t)} is the function being averaged, and @math{k
 149 > 0} controls the rate of decay.  We can iterate the formula over a few
 150 steps as follows:
 151
 152 @center @math{x(1) = f(1)},
 153 @center @am{x(2) = af(1) + f(2), x(2) = a*f(1) + f(2)},
 154 @center @am{\vdots, ...}
 155 @center @am{x(5) = a^4f(1) + a^3f(2) + a^2f(3) + af(4) + f(5), x(5) = a**4*f(1) + a**3*f(2) + a**2*f(3) + a*f(4) + f(5)}.
 156
 157 @noindent The value of @math{f(t)} has a weight of 1 at time @math{t}, a
 158 weight of @math{a} at time @math{t+1}, @am{a^2, a**2} at time
 159 @math{t+2}, and so on.  We can also relate @math{x(t)} to @math{k}:
 160 @math{f(t)} has a weight of approximately @math{1/e} at time @math{t+k},
 161 approximately @am{1/e^2, 1/e**2} at time @am{t+2k, t+2*k}, and so on.
 162 From the opposite direction, @math{f(t)} decays to weight @math{w} at
 163 @am{t = \log_aw, t = ln(w)/ln(a)}.
 164
 165 The initial value of @var{recent_cpu} is 0 in the first thread
 166 created, or the parent's value in other new threads.  Each time a timer
 167 interrupt occurs, @var{recent_cpu} is incremented by 1 for the running
 168 thread only.  In addition, once per second the value of @var{recent_cpu}
 169 is recalculated for every thread (whether running, ready, or blocked),
 170 using this formula:
 171
 172 @center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}},
 173
 174 @noindent where @var{load_avg} is a moving average of the number of
 175 threads ready to run (see below).  If @var{load_avg} is 1, indicating
 176 that a single thread, on average, is competing for the CPU, then the
 177 current value of @var{recent_cpu} decays to a weight of .1 in
 178 @am{\log_{2/3}.1 \approx 6, ln(2/3)/ln(.1) = approx. 6} seconds; if
 179 @var{load_avg} is 2, then decay to a weight of .1 takes @am{\log_{3/4}.1
 180 \approx 8, ln(3/4)/ln(.1) = approx. 8} seconds.  The effect is that
 181 @var{recent_cpu} estimates the amount of CPU time the thread has
 182 received ``recently,'' with the rate of decay inversely proportional to
 183 the number of threads competing for the CPU.
 184
 185 Assumptions made by some of the tests require that updates to
 186 @var{recent_cpu} be made exactly when the system tick counter reaches a
 187 multiple of a second, that is, when @code{timer_ticks () % TIMER_FREQ ==
 188 0}, and not at any other time.
 189
 190 The value of @var{recent_cpu} can be negative for a thread with a
 191 negative @var{nice} value.  Do not clamp negative @var{recent_cpu} to 0.
 192
 193 You must implement @func{thread_get_recent_cpu}, for which there is a
 194 skeleton in @file{threads/thread.c}.
 195
 196 @deftypefun int thread_get_recent_cpu (void)
 197 Returns 100 times the current thread's @var{recent_cpu} value, rounded
 198 to the nearest integer.
 199 @end deftypefun
 200
 201 @node Calculating load_avg
 202 @section Calculating @var{load_avg}
 203
 204 Finally, @var{load_avg}, often known as the system load average,
 205 estimates the average number of threads ready to run over the past
 206 minute.  Like @var{recent_cpu}, it is an exponentially weighted moving
 207 average.  Unlike @var{priority} and @var{recent_cpu}, @var{load_avg} is
 208 system-wide, not thread-specific.  At system boot, it is initialized to
 209 0.  Once per second thereafter, it is updated according to the following
 210 formula:
 211
 212 @center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}},
 213
 214 @noindent where @var{ready_threads} is the number of threads that are
 215 either running or ready to run at time of update (not including the idle
 216 thread).
 217
 218 Because of assumptions made by some of the tests, @var{load_avg} must be
 219 updated exactly when the system tick counter reaches a multiple of a
 220 second, that is, when @code{timer_ticks () % TIMER_FREQ == 0}, and not
 221 at any other time.
 222
 223 You must implement @func{thread_get_load_avg}, for which there is a
 224 skeleton in @file{threads/thread.c}.
 225
 226 @deftypefun int thread_get_load_avg (void)
 227 Returns 100 times the current system load average, rounded to the
 228 nearest integer.
 229 @end deftypefun
 230
 231 @node 4.4BSD Scheduler Summary
 232 @section Summary
 233
 234 This section summarizes the calculations required to implement the
 235 scheduler.  It is not a complete description of scheduler requirements.
 236
 237 Every thread has a @var{nice} value between -20 and 20 directly under
 238 its control.  Each thread also has a priority, between 0
 239 (@code{PRI_MIN}) through 63 (@code{PRI_MAX}), which is recalculated
 240 using the following formula whenever the value of either term changes:
 241
 242 @center @t{@var{priority} = (@var{recent_cpu} / 4) + (@var{nice} * 2)}.
 243
 244 @var{recent_cpu} measures the amount of CPU time a thread has received
 245 ``recently.''  On each timer tick, the running thread's @var{recent_cpu}
 246 is incremented by 1.  Once per second, every thread's @var{recent_cpu}
 247 is updated this way:
 248
 249 @center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}}.
 250
 251 @var{load_avg} estimates the average number of threads ready to run over
 252 the past minute.  It is initialized to 0 at boot and recalculated once
 253 per second as follows:
 254
 255 @center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}}.
 256
 257 @noindent where @var{ready_threads} is the number of threads that are
 258 either running or ready to run at time of update (not including the idle
 259 thread).
 260
 261 @node Fixed-Point Real Arithmetic
 262 @section Fixed-Point Real Arithmetic
 263
 264 In the formulas above, @var{priority}, @var{nice}, and
 265 @var{ready_threads} are integers, but @var{recent_cpu} and @var{load_avg}
 266 are real numbers.  Unfortunately, Pintos does not support floating-point
 267 arithmetic in the kernel, because it would
 268 complicate and slow the kernel.  Real kernels often have the same
 269 limitation, for the same reason.  This means that calculations on real
 270 quantities must be simulated using integers.  This is not
 271 difficult, but many students do not know how to do it.  This
 272 section explains the basics.
 273
 274 The fundamental idea is to treat the rightmost bits of an integer as
 275 representing a fraction.  For example, we can designate the lowest 14
 276 bits of a signed 32-bit integer as fractional bits, so that an integer
 277 @m{x} represents the real number
 278 @iftex
 279 @m{x/2^{14}}.
 280 @end iftex
 281 @ifnottex
 282 @m{x/(2**14)}, where ** represents exponentiation.
 283 @end ifnottex
 284 This is called a 17.14 fixed-point number representation, because there
 285 are 17 bits before the decimal point, 14 bits after it, and one sign
 286 bit.@footnote{Because we are working in binary, the ``decimal'' point
 287 might more correctly be called the ``binary'' point, but the meaning
 288 should be clear.} A number in 17.14 format represents, at maximum, a
 289 value of @am{(2^{31} - 1) / 2^{14} \approx, (2**31 - 1)/(2**14) =
 290 approx.} 131,071.999.
 291
 292 Suppose that we are using a @m{p.q} fixed-point format, and let @am{f =
 293 2^q, f = 2**q}.  By the definition above, we can convert an integer or
 294 real number into @m{p.q} format by multiplying with @m{f}.  For example,
 295 in 17.14 format the fraction 59/60 used in the calculation of
 296 @var{load_avg}, above, is @am{(59/60)2^{14}, 59/60*(2**14)} = 16,111
 297 (rounded to nearest).  To convert a fixed-point value back to an
 298 integer, divide by @m{f}.  (The normal @samp{/} operator in C rounds
 299 toward zero, that is, it rounds positive numbers down and negative
 300 numbers up.  To round to nearest, add @m{f / 2} to a positive number, or
 301 subtract it from a negative number, before dividing.)
 302
 303 Many operations on fixed-point numbers are straightforward.  Let
 304 @code{x} and @code{y} be fixed-point numbers, and let @code{n} be an
 305 integer.  Then the sum of @code{x} and @code{y} is @code{x + y} and
 306 their difference is @code{x - y}.  The sum of @code{x} and @code{n} is
 307 @code{x + n * f}; difference, @code{x - n * f}; product, @code{x * n};
 308 quotient, @code{x / n}.
 309
 310 Multiplying two fixed-point values has two complications.  First, the
 311 decimal point of the result is @m{q} bits too far to the left.  Consider
 312 that @am{(59/60)(59/60), (59/60)*(59/60)} should be slightly less than
 313 1, but @tm{16,111\times 16,111}@nm{16,111*16,111} = 259,564,321 is much
 314 greater than @am{2^{14},2**14} = 16,384.  Shifting @m{q} bits right, we
 315 get @tm{259,564,321/2^{14}}@nm{259,564,321/(2**14)} = 15,842, or about 0.97,
 316 the correct answer.  Second, the multiplication can overflow even though
 317 the answer is representable.  For example, 64 in 17.14 format is
 318 @am{64 \times 2^{14}, 64*(2**14)} = 1,048,576 and its square @am{64^2,
 319 64**2} = 4,096 is well within the 17.14 range, but @tm{1,048,576^2 =
 320 2^{40}}@nm{1,048,576**2 = 2**40}, greater than the maximum signed 32-bit
 321 integer value @am{2^{31} - 1, 2**31 - 1}.  An easy solution is to do the
 322 multiplication as a 64-bit operation.  The product of @code{x} and
 323 @code{y} is then @code{((int64_t) x) * y / f}.
 324
 325 Dividing two fixed-point values has opposite issues.  The
 326 decimal point will be too far to the right, which we fix by shifting the
 327 dividend @m{q} bits to the left before the division.  The left shift
 328 discards the top @m{q} bits of the dividend, which we can again fix by
 329 doing the division in 64 bits.  Thus, the quotient when @code{x} is
 330 divided by @code{y} is @code{((int64_t) x) * f / y}.
 331
 332 This section has consistently used multiplication or division by @m{f},
 333 instead of @m{q}-bit shifts, for two reasons.  First, multiplication and
 334 division do not have the surprising operator precedence of the C shift
 335 operators.  Second, multiplication and division are well-defined on
 336 negative operands, but the C shift operators are not.  Take care with
 337 these issues in your implementation.
 338
 339 The following table summarizes how fixed-point arithmetic operations can
 340 be implemented in C.  In the table, @code{x} and @code{y} are
 341 fixed-point numbers, @code{n} is an integer, fixed-point numbers are in
 342 signed @m{p.q} format where @m{p + q = 31}, and @code{f} is @code{1 <<
 343 q}:
 344
 345 @html
 346 <CENTER>
 347 @end html
 348 @multitable @columnfractions .5 .5
 349 @item Convert @code{n} to fixed point:
 350 @tab @code{n * f}
 351
 352 @item Convert @code{x} to integer (rounding toward zero):
 353 @tab @code{x / f}
 354
 355 @item Convert @code{x} to integer (rounding to nearest):
 356 @tab @code{(x + f / 2) / f} if @code{x >= 0}, @*
 357 @code{(x - f / 2) / f} if @code{x <= 0}.
 358
 359 @item Add @code{x} and @code{y}:
 360 @tab @code{x + y}
 361
 362 @item Subtract @code{y} from @code{x}:
 363 @tab @code{x - y}
 364
 365 @item Add @code{x} and @code{n}:
 366 @tab @code{x + n * f}
 367
 368 @item Subtract @code{n} from @code{x}:
 369 @tab @code{x - n * f}
 370
 371 @item Multiply @code{x} by @code{y}:
 372 @tab @code{((int64_t) x) * y / f}
 373
 374 @item Multiply @code{x} by @code{n}:
 375 @tab @code{x * n}
 376
 377 @item Divide @code{x} by @code{y}:
 378 @tab @code{((int64_t) x) * f / y}
 379
 380 @item Divide @code{x} by @code{n}:
 381 @tab @code{x / n}
 382 @end multitable
 383 @html
 384 </CENTER>
 385 @end html