@node Debugging Tools, , Project Documentation, Top
@appendix Debugging Tools

Many tools lie at your disposal for debugging Pintos.  This appendix
introduces you to a few of them.

@menu
* printf::
* ASSERT::
* DEBUG::
* Backtraces::
* i386-elf-gdb::
* Modifying Bochs::
* Debugging Tips::
@end menu

@node printf
@section @code{printf()}

Don't underestimate the value of @code{printf()}.  The way
@code{printf()} is implemented in Pintos, you can call it from
practically anywhere in the kernel, whether it's in a kernel thread or
an interrupt handler, almost regardless of what locks are held.

@code{printf()} isn't useful just because it can print data members.
It can also help figure out when and where something goes wrong, even
when the kernel crashes or panics without a useful error message.  The
strategy is to sprinkle calls to @code{printf()} with different strings
(e.g.@: @code{"1\n"}, @code{"2\n"}, @dots{}) throughout the pieces of
code you suspect are failing.  If you don't even see @code{1} printed,
then something bad happened before that point; if you see @code{1}
but not @code{2}, then something bad happened between those two
points; and so on.  Based on what you learn, you can then insert more
@code{printf()} calls in the new, smaller region of code you suspect.
Eventually you can narrow the problem down to a single statement.

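As a minimal sketch of the numbered-marker strategy described above,
consider the following routine.  The function @code{suspect_routine()}
and its two steps are hypothetical stand-ins for whatever code you are
actually debugging; only the placement of the @code{printf()} calls
matters.

```c
#include <stdio.h>

/* Hypothetical routine under suspicion.  Steps A and B stand in for
   the real code being debugged. */
int
suspect_routine (int input)
{
  int doubled, shifted;

  printf ("1\n");               /* Reached the start of the routine. */
  doubled = input * 2;          /* Step A. */

  printf ("2\n");               /* Step A completed. */
  shifted = doubled + 3;        /* Step B. */

  printf ("3\n");               /* Step B completed. */
  return shifted;
}
```

If the output stops after @code{1}, step A is the place to look next;
if it stops after @code{2}, step B is.
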
@node ASSERT
@section @code{ASSERT}

Assertions are useful because they can catch problems early, before
they'd otherwise be noticed.  Pintos provides a macro for assertions
named @code{ASSERT}, defined in @code{<debug.h>}, that you can use for
this purpose.  Ideally, each function should begin with a set of
assertions that check its arguments for validity.  (Initializers for
functions' local variables are evaluated before assertions are
checked, so be careful not to assume that an argument is valid in an
initializer.)  You can also sprinkle assertions throughout the body of
functions in places where you suspect things are likely to go wrong.

When an assertion proves untrue, the kernel panics.  The panic message
should help you to find the problem.  See the description of
backtraces below for more information.

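The advice above about checking arguments first, and about
initializers, might look like the following sketch.  So that it
compiles outside the kernel, it defines its own stand-in @code{ASSERT}
that calls @code{abort()}; this is an assumption made for
illustration, not the Pintos definition.  In Pintos you would include
@code{<debug.h>} instead.  The @code{struct buffer} type and
@code{buffer_last()} are hypothetical as well.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the Pintos ASSERT macro (an assumption for this
   sketch).  The real macro panics the kernel instead of aborting. */
#define ASSERT(CONDITION)                                       \
  do {                                                          \
    if (!(CONDITION)) {                                         \
      fprintf (stderr, "assertion `%s' failed at %s:%d\n",      \
               #CONDITION, __FILE__, __LINE__);                 \
      abort ();                                                 \
    }                                                           \
  } while (0)

/* Hypothetical structure used only for illustration. */
struct buffer
  {
    char *data;
    size_t length;
  };

/* Returns the last byte stored in B. */
char
buffer_last (const struct buffer *b)
{
  /* Writing `size_t n = b->length;' here would dereference B before
     the assertions below have had a chance to run. */
  size_t n;

  ASSERT (b != NULL);
  ASSERT (b->data != NULL);
  ASSERT (b->length > 0);

  n = b->length;
  return b->data[n - 1];
}
```
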
@node DEBUG
@section @code{DEBUG}

The @code{DEBUG} macro, also defined in @code{<debug.h>}, is a sort of
conditional @code{printf()}.  It takes as its arguments the name of a
``message class'' and a @code{printf()}-like format string and
arguments.  The message class is used to filter the messages that are
actually displayed.  You select the messages to display on the Pintos
command line using the @option{-d} option.  This allows you to easily
turn different types of messages on and off while you debug, without
the need to recompile.

For example, suppose you want to output thread debugging messages.  To
use a class named @code{thread}, you could invoke @code{DEBUG} like
this:
@example
DEBUG(thread, "thread id: %d\n", id);
@end example
@noindent
and then to start Pintos with @code{thread} messages enabled you'd use
a command line like this:
@example
pintos run -d thread
@end example

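To make the idea of class-based filtering concrete, here is one way
such a macro could be built.  This is a sketch under stated
assumptions, not the definition in @code{<debug.h>}: the variable
@code{enabled_classes}, the helpers @code{class_is_enabled()} and
@code{debug_message()}, and the comma-separated list format are all
invented for illustration.  In Pintos the set of enabled classes comes
from the @option{-d} option rather than from a variable you assign.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Comma-separated list of enabled classes (hypothetical; stands in
   for whatever the -d option selected). */
const char *enabled_classes = "";

/* Returns 1 if NAME appears in the enabled list, 0 otherwise. */
int
class_is_enabled (const char *name)
{
  size_t len = strlen (name);
  const char *p = enabled_classes;

  while (*p != '\0')
    {
      size_t span = strcspn (p, ",");   /* Length of the next entry. */
      if (span == len && memcmp (p, name, len) == 0)
        return 1;
      p += span;
      if (*p == ',')
        p++;
    }
  return 0;
}

/* Prints the message only if class NAME is enabled.  Returns the
   number of characters printed, or 0 if the message was filtered. */
int
debug_message (const char *name, const char *format, ...)
{
  va_list args;
  int n;

  if (!class_is_enabled (name))
    return 0;

  va_start (args, format);
  n = vprintf (format, args);
  va_end (args);
  return n;
}

/* Stringifies the class so callers can write DEBUG (thread, ...). */
#define DEBUG(CLASS, ...) debug_message (#CLASS, __VA_ARGS__)
```

With @code{enabled_classes} left empty, every @code{DEBUG} call is
silent; setting it to @code{"thread"} lets only the @code{thread}
messages through.
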
@node Backtraces
@section Backtraces

When the kernel panics, it prints a ``backtrace,'' that is, a summary
of how your program got where it is, as a list of addresses inside the
functions that were running at the time of the panic.  You can also
insert a call to @code{debug_backtrace()}, prototyped in
@file{<debug.h>}, at any point in your code.

The addresses in a backtrace are listed as raw hexadecimal numbers,
which are meaningless in themselves.  You can translate them into
function names and source file line numbers using a tool called
@command{i386-elf-addr2line}.@footnote{If you're using an 80@var{x}86
system for development, it's probably just called
@command{addr2line}.}

The output format of @command{i386-elf-addr2line} is not ideal, so
we've supplied a wrapper for it simply called @command{backtrace}.
Give it the name of your @file{kernel.o} as the first argument and the
hexadecimal numbers composing the backtrace (including the @samp{0x}
prefixes) as the remaining arguments.  It outputs the function name
and source file line numbers that correspond to each address.

If the translated form of a backtrace is garbled, or doesn't make
sense (e.g.@: function A is listed above function B, but B doesn't
call A), then it's a good sign that you're corrupting a kernel
thread's stack, because the backtrace is extracted from the stack.
Alternatively, it could be that the @file{kernel.o} you passed to
@command{backtrace} does not correspond to the kernel that produced
the backtrace.

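To see why a corrupted stack produces a garbled backtrace, it can help
to model how return addresses are read off the stack.  The sketch
below is a simplified model, not the Pintos source: it assumes each
frame holds a saved pointer to its caller's frame followed by a return
address, and it walks an explicitly constructed chain rather than the
real stack.  The names @code{struct frame} and @code{walk_frames()}
are hypothetical.

```c
#include <stddef.h>

/* Simplified model of one stack frame. */
struct frame
  {
    struct frame *caller;       /* Caller's frame; NULL at the outermost. */
    unsigned long return_addr;  /* Address to resume at in the caller. */
  };

/* Follows the chain of saved frame pointers starting at INNERMOST,
   storing up to MAX return addresses into ADDRS.  Returns the number
   of addresses stored. */
size_t
walk_frames (const struct frame *innermost, unsigned long addrs[],
             size_t max)
{
  size_t n = 0;
  const struct frame *f;

  for (f = innermost; f != NULL && n < max; f = f->caller)
    addrs[n++] = f->return_addr;
  return n;
}
```

Every address reported comes straight out of memory that ordinary code
can overwrite.  If a stray write changes a @code{return_addr} field,
the walk reports that bogus value; if it changes a @code{caller}
field, everything after that point in the walk is wrong.
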
@node i386-elf-gdb
@section @command{i386-elf-gdb}

You can run the Pintos kernel under the supervision of the
@command{i386-elf-gdb} debugger.@footnote{If you're using an
80@var{x}86 system for development, it's probably just called
@command{gdb}.}  There are two steps in the process.  First,
start Pintos with the @option{--gdb} option, e.g.@: @command{pintos
--gdb run}.  Second, in a separate terminal, invoke @command{gdb} on
@file{kernel.o}:
@example
i386-elf-gdb kernel.o
@end example
@noindent and issue the following @command{gdb} command:
@example
target remote localhost:1234
@end example

At this point, @command{gdb} is connected to Bochs over a local
network connection.  You can now issue any normal @command{gdb}
commands.  If you issue the @samp{c} command, the Bochs BIOS will take
control, load Pintos, and then Pintos will run in the usual way.  You
can pause the process at any point with @key{Ctrl+C}.  If you want
@command{gdb} to stop when Pintos starts running, set a breakpoint on
@code{main()} with the command @code{break main} before @samp{c}.

You can read the @command{gdb} manual by typing @code{info gdb} at a
terminal command prompt, or you can view it in Emacs with the command
@kbd{C-h i}.  Here are a few commonly useful @command{gdb} commands:

@table @code
@item c
Continue execution until the next breakpoint or until @key{Ctrl+C} is
typed.

@item break @var{function}
@itemx break @var{filename}:@var{linenum}
@itemx break *@var{address}
Sets a breakpoint at the given function, line number, or address.
(Use a @samp{0x} prefix to specify an address in hex.)

@item p @var{expression}
Evaluates the given C expression and prints its value.
If the expression contains a function call, the function will actually
be executed, so be careful.

@item l *@var{address}
Lists a few lines of code around the given address.
(Use a @samp{0x} prefix to specify an address in hex.)

@item bt
Prints a stack backtrace similar to that output by the
@command{backtrace} program described above.

@item p/a @var{address}
Prints the name of the function or variable that occupies the given
address.
(Use a @samp{0x} prefix to specify an address in hex.)
@end table

If you notice unexplainable behavior while using @command{gdb}, there
are three possibilities.  The first is that there is a bug in your
modified Pintos.  The second is that there is a bug in Bochs's
interface to @command{gdb} or in @command{gdb} itself.  The third is
that there is a bug in the original Pintos code.  The first and second
are quite likely, and you should seriously consider both.  We hope
that the third is less likely, but it is also possible.

@node Modifying Bochs
@section Modifying Bochs

An advanced debugging technique is to modify and recompile the
simulator.  This proves useful when the simulated hardware has more
information than it makes available to the OS.  For example, page
faults have a long list of potential causes, but the hardware does not
report to the OS exactly which one is the particular cause.
Furthermore, a bug in the kernel's handling of page faults can easily
lead to recursive faults, but a ``triple fault'' will cause the CPU to
reset itself, which is hardly conducive to debugging.

In a case like this, you might appreciate being able to make Bochs
print out more debug information, such as the exact type of fault that
occurred.  It's not very hard.  You start by retrieving the source
code for Bochs 2.1.1 from @uref{http://bochs.sourceforge.net} and
extracting it into a directory.  Then read
@file{pintos/src/misc/bochs-2.1.1.patch} and apply the patches needed.
Then run @file{./configure}, supplying the options you want (some
suggestions are in the patch file).  Finally, run @command{make}.
This will compile Bochs and eventually produce a new binary
@file{bochs}.  To use your @file{bochs} binary with @command{pintos},
put it in your @env{PATH}, and make sure that it is earlier than
@file{/usr/class/cs140/i386/bochs}.

Of course, to get any good out of this you'll have to actually modify
Bochs.  Instructions for doing this are firmly out of the scope of
this document.  However, if you want to debug page faults as suggested
above, a good place to start adding @code{printf()}s is
@code{BX_CPU_C::dtranslate_linear()} in @file{cpu/paging.cc}.

@node Debugging Tips
@section Tips

The page allocator in @file{threads/palloc.c} sets all the bytes in
pages to @t{0xcc} when they are freed.  Thus, if you see an attempt to
dereference a pointer like @t{0xcccccccc}, or some other reference to
@t{0xcc}, there's a good chance you're trying to reuse a page that's
already been freed.  Also, byte @t{0xcc} is the CPU opcode for
``invoke interrupt 3,'' so if you see an error like @code{Interrupt
0x03 (#BP Breakpoint Exception)}, Pintos tried to execute code in a
freed page.

Similarly, the block allocator in @file{threads/malloc.c} sets all
the bytes in freed blocks to @t{0xcd}.  The two bytes @t{0xcdcd} are
a CPU opcode for ``invoke interrupt @t{0xcd},'' so @code{Interrupt
0xcd (unknown)} is a good sign that you tried to execute code in a
block freed with @code{free()}.
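
The fill-on-free behavior described above is easy to reproduce in
miniature.  The following sketch is a toy, not the code in
@file{threads/palloc.c}: @code{toy_page}, @code{toy_free_page()}, and
@code{toy_read_u32()} are hypothetical names, and the ``page'' is just
a static array.  It shows why a pointer-sized value read from freed
memory comes out as @t{0xcccccccc}.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* One statically allocated "page" standing in for allocator memory. */
unsigned char toy_page[TOY_PAGE_SIZE];

/* Toy counterpart to freeing a page: fill it with 0xcc so that any
   later use of its contents is easy to recognize. */
void
toy_free_page (void *page)
{
  memset (page, 0xcc, TOY_PAGE_SIZE);
}

/* Reads a 32-bit value at byte offset OFS within PAGE, much as code
   holding a stale pointer into the page would. */
uint32_t
toy_read_u32 (const void *page, size_t ofs)
{
  uint32_t value;

  memcpy (&value, (const unsigned char *) page + ofs, sizeof value);
  return value;
}
```

After @code{toy_free_page (toy_page)}, every call to
@code{toy_read_u32()} on that page returns @t{0xcccccccc}, whatever
the page held before.  Substituting @t{0xcd} for @t{0xcc} models the
block allocator's fill pattern in the same way.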