@node Debugging Tools, Development Tools, Project Documentation, Top
@appendix Debugging Tools

Many tools lie at your disposal for debugging Pintos.  This appendix
introduces you to a few of them.

@menu
* printf::
* ASSERT::
* DEBUG::
* UNUSED NO_RETURN NO_INLINE PRINTF_FORMAT::
* Backtraces::
* i386-elf-gdb::
* Debugging by Infinite Loop::
* Modifying Bochs::
* Debugging Tips::
@end menu

@node printf
@section @code{printf()}

Don't underestimate the value of @func{printf}.  Because of the way
@func{printf} is implemented in Pintos, you can call it from
practically anywhere in the kernel, whether it's in a kernel thread or
an interrupt handler, almost regardless of what locks are held.

@func{printf} is useful for more than printing data members.
It can also help you figure out when and where something goes wrong, even
when the kernel crashes or panics without a useful error message.  The
strategy is to sprinkle calls to @func{printf} with different strings
(e.g.@: @code{"1\n"}, @code{"2\n"}, @dots{}) throughout the pieces of
code you suspect are failing.  If you don't even see @code{1} printed,
then something bad happened before that point; if you see @code{1}
but not @code{2}, then something bad happened between those two
points; and so on.  Based on what you learn, you can then insert more
@func{printf} calls in the new, smaller region of code you suspect.
Eventually you can narrow the problem down to a single statement.
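
As a minimal sketch of this strategy, the host-side C fragment below
prints a numbered marker before each step of a made-up routine.  The
names @code{step} and @code{run_with_markers} are hypothetical and are
not part of Pintos, and the failure is only simulated by a return
value; in a real kernel the crash itself would cut the output short.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for one step of suspect code.  Returns 0 when
   the step "fails" (where a real kernel would crash or panic). */
int
step (int n, int bad_step)
{
  return n != bad_step;
}

/* Prints a numbered marker before each of TOTAL steps and returns the
   last marker printed.  The fault lies after that marker and before
   the next one. */
int
run_with_markers (int total, int bad_step)
{
  int last = 0;
  int n;

  assert (total > 0);
  for (n = 1; n <= total; n++)
    {
      printf ("%d\n", n);
      last = n;
      if (!step (n, bad_step))
        break;                  /* A real kernel would die here. */
    }
  return last;
}
```

Calling @code{run_with_markers (3, 2)} prints @code{1} and @code{2}
but not @code{3}, which places the simulated fault between the second
and third markers.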

@node ASSERT
@section @code{ASSERT}

Assertions are useful because they can catch problems early, before
they'd otherwise be noticed.  Pintos provides a macro for assertions
named @code{ASSERT}, defined in @file{<debug.h>}, that you can use for
this purpose.  Ideally, each function should begin with a set of
assertions that check its arguments for validity.  (Initializers for
functions' local variables are evaluated before assertions are
checked, so be careful not to assume that an argument is valid in an
initializer.)  You can also sprinkle assertions throughout the body of
functions in places where you suspect things are likely to go wrong.

When an assertion proves untrue, the kernel panics.  The panic message
should help you to find the problem.  See the description of
backtraces below for more information.
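
The sketch below illustrates the advice about checking arguments first
and keeping initializers from touching unchecked arguments.  It is
written to build on a host, so @code{ASSERT} is mapped to the standard
@code{assert}; that mapping, @code{struct buffer}, and
@code{buffer_free_space} are assumptions made for illustration and are
not Pintos code.

```c
#include <assert.h>
#include <stddef.h>

/* Host stand-in: in Pintos, ASSERT comes from <debug.h> and panics the
   kernel.  Mapping it to the standard assert is an assumption made so
   that the sketch builds outside Pintos. */
#define ASSERT(CONDITION) assert (CONDITION)

/* Hypothetical structure, for illustration only. */
struct buffer
  {
    size_t used;
    char data[16];
  };

size_t
buffer_free_space (const struct buffer *b)
{
  /* Deliberately not written as "size_t used = b->used;": that
     initializer would dereference B before the assertions run. */
  size_t used;

  ASSERT (b != NULL);
  ASSERT (b->used <= sizeof b->data);

  used = b->used;
  return sizeof b->data - used;
}
```

The local variable is assigned only after both assertions have passed,
so a null @code{b} is reported by the assertion rather than by a fault
in an initializer.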

@node DEBUG
@section @code{DEBUG}

The @code{DEBUG} macro, also defined in @file{<debug.h>}, is a sort of
conditional @func{printf}.  It takes as its arguments the name of a
``message class'' and a @func{printf}-like format string and
arguments.  The message class is used to filter the messages that are
actually displayed.  You select the messages to display on the Pintos
command line using the @option{-d} option.  This allows you to easily
turn different types of messages on and off while you debug, without
the need to recompile.

For example, suppose you want to output thread debugging messages.  To
use a class named @code{thread}, you could invoke @code{DEBUG} like
this:
@example
DEBUG(thread, "thread id: %d\n", id);
@end example
@noindent
and then to start Pintos with @code{thread} messages enabled you'd use
a command line like this:
@example
pintos run -d thread
@end example
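
To make the idea of a class-filtered @func{printf} concrete, here is a
rough host-side sketch of how such a macro could work.  It is not the
Pintos implementation: @code{enabled_class}, @code{debug_message}, and
the restriction to a single enabled class are all assumptions made to
keep the example short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: the single message class selected at startup, standing
   in for whatever -d passed on the command line. */
static const char *enabled_class = "thread";

/* Prints the message only if CLASS_NAME is the enabled class.
   Returns 1 if the message was printed, 0 if it was filtered out. */
int
debug_message (const char *class_name, const char *format, ...)
{
  va_list args;

  assert (class_name != NULL && format != NULL);
  if (strcmp (class_name, enabled_class) != 0)
    return 0;

  va_start (args, format);
  vprintf (format, args);
  va_end (args);
  return 1;
}

/* Stringizes the class so that callers write DEBUG (thread, ...). */
#define DEBUG(CLASS, ...) debug_message (#CLASS, __VA_ARGS__)
```

Because the filter is consulted at run time, changing which class is
enabled changes the output without rebuilding the callers, which is the
property the text describes.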

@node UNUSED NO_RETURN NO_INLINE PRINTF_FORMAT
@section UNUSED, NO_RETURN, NO_INLINE, and PRINTF_FORMAT

These macros, defined in @file{<debug.h>}, tell the compiler about
special attributes of a function or function parameter.  Their
expansions are GCC-specific.

@defmac UNUSED
Appended to a function parameter to tell the compiler that the
parameter might not be used within the function.  It suppresses the
warning that would otherwise appear.
@end defmac

@defmac NO_RETURN
Appended to a function prototype to tell the compiler that the
function never returns.  It allows the compiler to fine-tune its
warnings and its code generation.
@end defmac

@defmac NO_INLINE
Appended to a function prototype to tell the compiler to never emit
the function in-line.  Occasionally useful to improve the quality of
backtraces (see below).
@end defmac

@defmac PRINTF_FORMAT (@var{format}, @var{first})
Appended to a function prototype to tell the compiler that the
function takes a @func{printf}-like format string as its
@var{format}th argument and that the corresponding value arguments
start at the @var{first}th argument.  This lets the compiler tell you
if you pass the wrong argument types.
@end defmac
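
The following sketch shows where each macro goes.  The expansions given
here are the usual GCC attribute spellings and are assumed, not copied
from @file{<debug.h>}; the functions @code{format_msg}, @code{die},
@code{next_id}, and @code{visit} are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed GCC expansions; check <debug.h> for the real definitions. */
#define UNUSED __attribute__ ((unused))
#define NO_RETURN __attribute__ ((noreturn))
#define NO_INLINE __attribute__ ((noinline))
#define PRINTF_FORMAT(FMT, FIRST) \
  __attribute__ ((format (printf, FMT, FIRST)))

/* Hypothetical prototypes with the macros appended. */
int format_msg (char *buf, size_t size, const char *format, ...)
  PRINTF_FORMAT (3, 4);
void die (const char *msg) NO_RETURN;
int next_id (int id) NO_INLINE;
int visit (int value, void *aux UNUSED);

/* Formats into BUF; the format string is argument 3 and its values
   start at argument 4, matching PRINTF_FORMAT (3, 4) above. */
int
format_msg (char *buf, size_t size, const char *format, ...)
{
  va_list args;
  int length;

  assert (buf != NULL && size > 0);
  va_start (args, format);
  length = vsnprintf (buf, size, format, args);
  va_end (args);
  return length;
}

/* Never returns, as promised by NO_RETURN. */
void
die (const char *msg)
{
  fprintf (stderr, "fatal: %s\n", msg);
  abort ();
}

int
next_id (int id)
{
  return id + 1;
}

/* AUX is part of a callback-style signature but is not needed here. */
int
visit (int value, void *aux UNUSED)
{
  return value * 2;
}
```

With the format attribute in place, a mismatched call such as
@code{format_msg (buf, sizeof buf, "%s", 7)} should draw a compile-time
warning from GCC when format checking is enabled.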

@node Backtraces
@section Backtraces

When the kernel panics, it prints a ``backtrace,'' that is, a summary
of how your program got where it is, as a list of addresses inside the
functions that were running at the time of the panic.  You can also
insert a call to @func{debug_backtrace}, prototyped in
@file{<debug.h>}, at any point in your code.

The addresses in a backtrace are listed as raw hexadecimal numbers,
which are meaningless in themselves.  You can translate them into
function names and source file line numbers using a tool called
@command{i386-elf-addr2line}.@footnote{If you're using an 80@var{x}86
system for development, it's probably just called
@command{addr2line}.}

The output format of @command{i386-elf-addr2line} is not ideal, so
we've supplied a wrapper for it simply called @command{backtrace}.
Give it the name of your @file{kernel.o} as the first argument and the
hexadecimal numbers composing the backtrace (including the @samp{0x}
prefixes) as the remaining arguments.  It outputs the function name
and source file line numbers that correspond to each address.

If the translated form of a backtrace is garbled, or doesn't make
sense (e.g.@: function A is listed above function B, but B doesn't
call A), then it's a good sign that you're corrupting a kernel
thread's stack, because the backtrace is extracted from the stack.
Alternatively, it could be that the @file{kernel.o} you passed to
@command{backtrace} does not correspond to the kernel that produced
the backtrace.

@node i386-elf-gdb
@section @command{i386-elf-gdb}

You can run the Pintos kernel under the supervision of the
@command{i386-elf-gdb} debugger.@footnote{If you're using an
80@var{x}86 system for development, it's probably just called
@command{gdb}.}  There are two steps in the process.  First,
start Pintos with the @option{--gdb} option, e.g.@: @command{pintos
--gdb run}.  Second, in a separate terminal, invoke @command{gdb} on
@file{kernel.o}:
@example
i386-elf-gdb kernel.o
@end example
@noindent and issue the following @command{gdb} command:
@example
target remote localhost:1234
@end example

At this point, @command{gdb} is connected to Bochs over a local
network connection.  You can now issue any normal @command{gdb}
commands.  If you issue the @samp{c} command, the Bochs BIOS will take
control, load Pintos, and then Pintos will run in the usual way.  You
can pause the process at any point with @key{Ctrl+C}.  If you want
@command{gdb} to stop when Pintos starts running, set a breakpoint on
@func{main} with the command @code{break main} before @samp{c}.

You can read the @command{gdb} manual by typing @code{info gdb} at a
terminal command prompt, or you can view it in Emacs with the command
@kbd{C-h i}.  Here are a few commonly useful @command{gdb} commands:

@table @code
@item c
Continue execution until the next breakpoint or until @key{Ctrl+C} is
typed.

@item break @var{function}
@itemx break @var{filename}:@var{linenum}
@itemx break *@var{address}
Sets a breakpoint at the given function, line number, or address.
(Use a @samp{0x} prefix to specify an address in hex.)

@item p @var{expression}
Evaluates the given C expression and prints its value.
If the expression contains a function call, the function will actually
be executed, so be careful.

@item l *@var{address}
Lists a few lines of code around the given address.
(Use a @samp{0x} prefix to specify an address in hex.)

@item bt
Prints a stack backtrace similar to that output by the
@command{backtrace} program described above.

@item p/a @var{address}
Prints the name of the function or variable that occupies the given
address.
(Use a @samp{0x} prefix to specify an address in hex.)
@end table

You might notice that @command{gdb} tends to show code being executed
in an order different from the order in the source.  That is, the
current statement jumps around seemingly randomly.  This is due to
GCC's optimizer, which does tend to reorder code.  If it bothers you,
you can turn off optimization by editing
@file{pintos/src/Make.config}, removing @option{-O3} from the
@code{CFLAGS} definition.

If you notice other strange behavior while using @command{gdb}, there
are three possibilities.  The first is that there is a bug in your
modified Pintos.  The second is that there is a bug in Bochs's
interface to @command{gdb} or in @command{gdb} itself.  The third is
that there is a bug in the original Pintos code.  The first and second
are quite likely, and you should seriously consider both.  We hope
that the third is less likely, but it is also possible.

@node Debugging by Infinite Loop
@section Debugging by Infinite Loop

If you get yourself into a situation where the machine reboots in a
loop, you've probably hit a ``triple fault.''  In such a situation you
might not be able to use @func{printf} for debugging, because the
reboots might be happening even before everything needed for
@func{printf} is initialized.  In that case, you might want to
try what I call ``debugging by infinite loop.''

What you do is pick a place in the Pintos code, insert the statement
@code{for (;;);} there, and recompile and run.  There are two likely
possibilities:

@itemize @bullet
@item
The machine hangs without rebooting.  If this happens, you know that
the infinite loop is running.  That means that whatever caused the
problem must be @emph{after} the place you inserted the infinite loop.
Now move the infinite loop later in the code sequence.

@item
The machine reboots in a loop.  If this happens, you know that the
machine didn't make it to the infinite loop.  Thus, whatever caused the
reboot must be @emph{before} the place you inserted the infinite loop.
Now move the infinite loop earlier in the code sequence.
@end itemize

If you move the infinite loop around in a ``binary search'' fashion, you
can use this technique to pin down the exact spot where everything goes
wrong.  It should only take a few minutes at most.
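
The ``binary search'' can be sketched as follows.  This is only a
host-side model: each call to the hypothetical
@code{hangs_with_loop_at} stands for one recompile-and-run experiment
with @code{for (;;);} placed just before statement @var{pos}, and the
faulting statement is supplied as a parameter so that the sketch can
be run.

```c
#include <assert.h>

/* Simulated experiment.  With the infinite loop inserted just before
   statement POS, statements 0..POS-1 run first.  The machine hangs
   (returns 1) if the faulting statement FAULT has not run yet, and
   reboots (returns 0) otherwise.  Hypothetical model, not Pintos. */
int
hangs_with_loop_at (int pos, int fault)
{
  return pos <= fault;
}

/* Returns the index of the faulting statement among statements
   0..N-1, using about log2(N) experiments. */
int
find_fault (int n, int fault)
{
  int lo = 0;                   /* Loop at LO hangs: fault is at or after LO. */
  int hi = n;                   /* Loop at HI reboots: fault is before HI. */

  assert (n > 0 && fault >= 0 && fault < n);
  while (hi - lo > 1)
    {
      int mid = lo + (hi - lo) / 2;

      if (hangs_with_loop_at (mid, fault))
        lo = mid;               /* Hang: move the loop later. */
      else
        hi = mid;               /* Reboot: move the loop earlier. */
    }
  return lo;
}
```

Halving the suspect region on every run is what keeps the number of
recompiles small even for a long code sequence.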

@node Modifying Bochs
@section Modifying Bochs

An advanced debugging technique is to modify and recompile the
simulator.  This proves useful when the simulated hardware has more
information than it makes available to the OS.  For example, page
faults have a long list of potential causes, but the hardware does not
report to the OS exactly which one is the particular cause.
Furthermore, a bug in the kernel's handling of page faults can easily
lead to recursive faults, but a ``triple fault'' will cause the CPU to
reset itself, which is hardly conducive to debugging.

In a case like this, you might appreciate being able to make Bochs
print out more debug information, such as the exact type of fault that
occurred.  It's not very hard.  You start by retrieving the source
code for Bochs 2.1.1 from @uref{http://bochs.sourceforge.net} and
extracting it into a directory.  Then read
@file{pintos/src/misc/bochs-2.1.1.patch} and apply the patches needed.
Then run @file{./configure}, supplying the options you want (some
suggestions are in the patch file).  Finally, run @command{make}.
This will compile Bochs and eventually produce a new binary
@file{bochs}.  To use your @file{bochs} binary with @command{pintos},
put it in your @env{PATH}, and make sure that it is earlier than
@file{/usr/class/cs140/i386/bochs}.

Of course, to get any good out of this you'll have to actually modify
Bochs.  Instructions for doing this are firmly out of the scope of
this document.  However, if you want to debug page faults as suggested
above, a good place to start adding @func{printf}s is
@func{BX_CPU_C::dtranslate_linear} in @file{cpu/paging.cc}.

@node Debugging Tips
@section Tips

The page allocator in @file{threads/palloc.c} sets all the bytes in
pages to @t{0xcc} when they are freed.  Thus, if you see an attempt to
dereference a pointer like @t{0xcccccccc}, or some other reference to
@t{0xcc}, there's a good chance you're trying to reuse a page that's
already been freed.  Also, byte @t{0xcc} is the CPU opcode for
``invoke interrupt 3,'' so if you see an error like @code{Interrupt
0x03 (#BP Breakpoint Exception)}, Pintos tried to execute code in a
freed page.

Similarly, the block allocator in @file{threads/malloc.c} sets all
the bytes in freed blocks to @t{0xcd}.  The two bytes @t{0xcdcd} are
a CPU opcode for ``invoke interrupt @t{0xcd},'' so @code{Interrupt
0xcd (unknown)} is a good sign that you tried to execute code in a
block freed with @func{free}.
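
The fill patterns are easy to recognize mechanically.  The sketch
below runs on a host and only imitates the behavior described above:
@code{read_stale_word} fills a local array the way the text says a
freed page is filled, and @code{looks_freed} tests whether a 32-bit
value consists entirely of one fill byte.  Both function names and the
two macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FREED_PAGE_FILL 0xcc    /* Fill byte for freed pages, per the text. */
#define FREED_BLOCK_FILL 0xcd   /* Fill byte for freed blocks, per the text. */

/* Returns nonzero if every byte of VALUE equals FILL. */
int
looks_freed (uint32_t value, uint8_t fill)
{
  return value == fill * 0x01010101u;
}

/* Imitates reading a pointer-sized word out of a page after it has
   been freed and filled. */
uint32_t
read_stale_word (void)
{
  uint8_t page[64];
  uint32_t stale;

  memset (page, FREED_PAGE_FILL, sizeof page);
  memcpy (&stale, page, sizeof stale);
  assert (looks_freed (stale, FREED_PAGE_FILL));
  return stale;
}
```

A value that passes @code{looks_freed} is only a hint, since ordinary
data can contain the same bytes, but in a faulting address it is a
strong one.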