1 /* PSPP - computes sample statistics.
2 Copyright (C) 2007 Free Software Foundation, Inc.
4 This program is free software; you can redistribute it and/or
5 modify it under the terms of the GNU General Public License as
6 published by the Free Software Foundation; either version 2 of the
7 License, or (at your option) any later version.
9 This program is distributed in the hope that it will be useful, but
10 WITHOUT ANY WARRANTY; without even the implied warranty of
11 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
12 General Public License for more details.
14 You should have received a copy of the GNU General Public License
15 along with this program; if not, write to the Free Software
16 Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
19 /* Implementation-level model checker.
21 A model checker is a tool for software testing and
22 verification that works by exploring all the possible states
23 in a system and verifying their internal consistency. A
24 conventional model checker requires that the code in a system
25 be translated into a specification language. The model
26 checker then verifies the specification, rather than the code.
28 This is instead an implementation-level model checker, which
29 does not require a separate specification. Instead, the model
30 checker requires writing a second implementation of the system
31 being checked. The second implementation can usually be made
32 almost trivial in comparison to the one being checked, because
33 it's usually acceptable for its performance to be
34 comparatively poor, e.g. O(N^2) instead of O(lg N), and thus
35 to use much simpler algorithms.
37 For introduction to the implementation-level model checking
38 approach used here, please refer to the following papers:
40 Musuvathi, Park, Chou, Engler, Dill, "CMC: A Pragmatic
41 Approach to Model Checking Real Code", Proceedings of the
42 Fifth Symposium on Operating Systems Design and
43 Implementation (OSDI), Dec 2002.
44 http://sprout.stanford.edu/PAPERS/CMC-OSDI-2002/CMC-OSDI-2002.pdf
46 Yang, Twohey, Engler, Musuvathi, "Using Model Checking to
47 Find Serious File System Errors", Proceedings of the Sixth
48 Symposium on Operating System Design and Implementation
50 http://www.stanford.edu/~engler/osdi04-fisc.pdf
52 Yang, Twohey, Pfaff, Sar, Engler, "EXPLODE: A Lightweight,
53 General Approach to Finding Serious Errors in Storage
54 Systems", First Workshop on the Evaluation of Software
55 Defect Detection Tools (BUGS), June 2005.
56 http://benpfaff.org/papers/explode.pdf
58 Use of a model checker is appropriate when the system being
59 checked is difficult to test using handwritten tests. This
60 can be the case, for example, when the system has a
61 complicated internal state that is difficult to reason about
62 over a long series of operations.
64 The implementation model checker works by putting a set of one
65 of more initial states in a queue (and checking them for
66 consistency). Then the model checker removes a state from the
67 queue and applies all possible operations of interest to it
68 ("mutates" it), obtaining a set of zero or more child states
69 (and checking each of them for consistency). Each of these
70 states is itself added to the queue. The model checker
71 continues dequeuing states and mutating and checking them
72 until the queue is empty.
74 In pseudo-code, the process looks like this:
76 Q = { initial states }
79 for each operation applicable to S do:
84 In many cases this process will never terminate, because every
85 state has one or more child states. For some systems this is
86 unavoidable, but in others we can make the process finite by
87 pursuing a few stratagems:
89 1. Limit the maximum size of the system; for example, limit
90 the number of rows and columns in the implementation of a
91 table being checked. The client of the model checker is
92 responsible for implementing such limits.
94 2. Avoid checking a single state more than one time. This
95 model checker provides assistance for this function by
96 allowing the client to provide a hash of the system state.
97 States with identical hashes will only be checked once
100 When a system cannot be made finite, or when a finite system
101 is too large to check in a practical amount of time, the model
102 checker provides multiple ways to limit the checking run:
103 based on maximum depth, maximum unique states checked, maximum
104 errors found (by default, 1), or maximum time used for
107 The client of the model checker must provide three functions
108 via function pointers embedded into a "struct mc_class":
110 1. void init (struct mc *mc);
112 This function is called once at the beginning of a
113 checking run. It checks one or more initial states and
114 adds them to the model checker's queue. (If it does not
115 add any states to the queue, then there is nothing to
118 Here's an outline for writing the init function:
121 init_foo (struct mc *mc)
125 mc_name_operation (mc, "initial state");
126 foo = generate_initial_foo ();
127 if (!state_is_consistent (foo))
128 mc_error (mc, "inconsistent state");
129 mc_add_state (mc, foo);
132 2. void mutate (struct mc *mc, const void *data);
134 This function is called when a dequeued state is ready to
135 be mutated. For each operation that can be applied to
136 the client-specified DATA, it applies that operation to a
137 clone of the DATA, checks that the clone is consistent,
138 and adds the clone to the model checker's queue.
140 Here's an outline for writing the mutate function:
143 mutate_foo (struct mc *mc, void *state_)
145 struct foo *state = state_;
147 for (...each operation...)
148 if (mc_include_state (mc))
152 mc_name_operation (mc, "do operation %s", ...);
153 clone = clone_foo (state);
154 do_operation (clone);
155 if (!mc_discard_dup_state (mc, hash_foo (clone)))
157 if (!state_is_consistent (clone))
158 mc_error (mc, "inconsistent state");
159 mc_add_state (mc, clone);
166 Notes on the above outline:
168 - The call to mc_include_state allows currently
169 uninteresting operations to be skipped. It is not
172 - The call to mc_name_operation should give the current
173 operation a human-readable name. The name may
174 include printf-style format specifications.
176 When an error occurs, the model checker (by default)
177 replays the sequence of operations performed to reach
178 the error, printing the name of the operation at each
179 step, which is often sufficient information in itself
182 At higher levels of verbosity, the name is printed
185 - Operations should be performed on a copy of the data
186 provided. The data provided should not be destroyed
189 - The call to mc_discard_dup_state is needed to discard
190 (probably) duplicate states. It is otherwise
193 To reduce the probability of collisions, use a
194 high-quality hash function. MD4 is a reasonable
195 choice: it is fast but high-quality. In one test,
196 switching to MD4 from MD5 increased overall speed of
197 model checking by 8% and actually reduced (!) the
198 number of collisions.
200 The hash value needs to include enough of the state
201 to ensure that interesting states are not excluded,
202 but it need not include the entire state. For
203 example, in many cases, the structure of complex data
204 (metadata) is often much more important than the
205 contents (data), so it may be reasonable to hash only
208 mc_discard_dup_state may be called before or after
209 checking for consistency, but calling it first avoids
210 wasting time checking duplicate states for
211 consistency, which again can be a significant
214 - The mc_error function reports errors. It may be
215 called as many times as desired to report each kind
216 of inconsistency in a state.
218 - The mc_add_state function adds the new state to the
219 queue. It should be called regardless of whether an
220 error was reported, to indicate to the model checker
221 that state processing has finished.
223 - The mutation function should be deterministic, to
224 make it possible to reliably reproduce errors.
226 3. void destroy (struct mc *mc, void *data);
228 This function is called to discard the client-specified
229 DATA associated with a state.
231 Numerous options are available for configuring the model
232 checker. The most important of these are:
236 * Breadth-first search (the default): First try all the
237 operations with depth 1, then those with depth 2, then
238 those with depth 3, and so on.
240 This search algorithm finds the least number of
241 operations needed to trigger a given bug.
243 * Depth-first search: Searches downward in the tree of
244 states as fast as possible. Good for finding bugs that
245 require long sequences of operations to trigger.
247 * Random-first search: Searches through the tree of
248 states in random order.
250 * Explicit path: Applies an explicitly specified sequence
253 - Verbosity: By default, messages are printed only when an
254 error is encountered, but you can cause the checker to
255 print a message on every state transition. This is most
256 useful when the errors in your code cause segfaults or
257 some other kind of sudden termination.
259 - Treatment of errors: By default, when an error is
260 encountered, the model checker recursively invokes itself
261 with an increased verbosity level and configured to follow
262 only the error path. As long as the mutation function is
263 deterministic, this quickly and concisely replays the
264 error and describes the path followed to reach it in an
265 easily human-readable manner.
269 * Maximum depth: You can limit the depth of the operations
270 performed. Most often useful with depth-first search.
271 By default, depth is unlimited.
273 * Maximum queue length: You can limit the number of states
274 kept in the queue at any given time. The main reason to
275 do so is to limit memory consumption. The default
276 limit is 10,000 states. Several strategies are
277 available for choosing which state to drop when the
280 - Stop conditions: based on maximum unique states checked,
281 maximum errors found (by default, 1), or maximum time used
284 - Progress: by default, the checker prints a '.' on stderr
285 every .25 seconds, but you can substitute another progress
286 function or disable progress printing.
288 This model checker does not (yet) include two features
289 described in the papers cited above: utility scoring
290 heuristics to guide the search strategy and "choice points" to
291 explore alternative cases. The former feature is less
292 interesting for this model checker, because the data
293 structures we are thus far using it to model are much smaller
294 than those discussed in the paper. The latter feature we
295 should implement at some point. */
297 #ifndef LIBPSPP_MODEL_CHECKER_H
298 #define LIBPSPP_MODEL_CHECKER_H 1
303 #include <sys/time.h>
305 #include <libpspp/compiler.h>
307 /* An active model checking run. */
310 /* Provided by each client of the model checker. */
313 void (*init) (struct mc *);
314 void (*mutate) (struct mc *, const void *);
315 void (*destroy) (const struct mc *, void *);
318 /* Results of a model checking run. */
321 /* Configuration for running the model checker. */
324 /* Primary external interface to model checker. */
325 struct mc_results *mc_run (const struct mc_class *, struct mc_options *);
327 /* Functions for use from client-supplied "init" and "mutate"
329 bool mc_include_state (struct mc *);
330 bool mc_discard_dup_state (struct mc *, unsigned int hash);
331 void mc_name_operation (struct mc *, const char *, ...) PRINTF_FORMAT (2, 3);
332 void mc_vname_operation (struct mc *, const char *, va_list)
333 PRINTF_FORMAT (2, 0);
334 void mc_error (struct mc *, const char *, ...) PRINTF_FORMAT (2, 3);
335 void mc_add_state (struct mc *, void *data);
337 /* Functions for use from client-supplied "init", "mutate", and
338 "destroy" functions. */
339 const struct mc_options *mc_get_options (const struct mc *);
340 const struct mc_results *mc_get_results (const struct mc *);
341 void *mc_get_aux (const struct mc *);
343 /* A path of operations through a model to arrive at some
347 int *ops; /* Sequence of operations. */
348 size_t length; /* Number of operations. */
349 size_t capacity; /* Number of operations for which room is allocated. */
352 void mc_path_init (struct mc_path *);
353 void mc_path_copy (struct mc_path *, const struct mc_path *);
354 void mc_path_push (struct mc_path *, int new_state);
355 int mc_path_pop (struct mc_path *);
356 int mc_path_back (const struct mc_path *);
357 void mc_path_destroy (struct mc_path *);
359 int mc_path_get_operation (const struct mc_path *, size_t index);
360 size_t mc_path_get_length (const struct mc_path *);
363 void mc_path_to_string (const struct mc_path *, struct string *);
365 struct mc_options *mc_options_create (void);
366 struct mc_options *mc_options_clone (const struct mc_options *);
367 void mc_options_destroy (struct mc_options *);
369 /* Search strategy. */
372 MC_BROAD, /* Breadth-first search. */
373 MC_DEEP, /* Depth-first search. */
374 MC_RANDOM, /* Randomly ordered search. */
375 MC_PATH /* Follow one explicit path. */
378 enum mc_strategy mc_options_get_strategy (const struct mc_options *);
379 void mc_options_set_strategy (struct mc_options *, enum mc_strategy);
380 unsigned int mc_options_get_seed (const struct mc_options *);
381 void mc_options_set_seed (struct mc_options *, unsigned int seed);
382 int mc_options_get_max_depth (const struct mc_options *);
383 void mc_options_set_max_depth (struct mc_options *, int max_depth);
384 int mc_options_get_hash_bits (const struct mc_options *);
385 void mc_options_set_hash_bits (struct mc_options *, int hash_bits);
387 const struct mc_path *mc_options_get_follow_path (const struct mc_options *);
388 void mc_options_set_follow_path (struct mc_options *, const struct mc_path *);
390 /* Strategy for dropped states from the queue when it
392 enum mc_queue_limit_strategy
394 MC_DROP_NEWEST, /* Don't enqueue the new state at all. */
395 MC_DROP_OLDEST, /* Drop the oldest state in the queue. */
396 MC_DROP_RANDOM /* Drop a random state from the queue. */
399 int mc_options_get_queue_limit (const struct mc_options *);
400 void mc_options_set_queue_limit (struct mc_options *, int queue_limit);
401 enum mc_queue_limit_strategy mc_options_get_queue_limit_strategy (
402 const struct mc_options *);
403 void mc_options_set_queue_limit_strategy (struct mc_options *,
404 enum mc_queue_limit_strategy);
406 int mc_options_get_max_unique_states (const struct mc_options *);
407 void mc_options_set_max_unique_states (struct mc_options *,
408 int max_unique_states);
409 int mc_options_get_max_errors (const struct mc_options *);
410 void mc_options_set_max_errors (struct mc_options *, int max_errors);
411 double mc_options_get_time_limit (const struct mc_options *);
412 void mc_options_set_time_limit (struct mc_options *, double time_limit);
414 int mc_options_get_verbosity (const struct mc_options *);
415 void mc_options_set_verbosity (struct mc_options *, int verbosity);
416 int mc_options_get_failure_verbosity (const struct mc_options *);
417 void mc_options_set_failure_verbosity (struct mc_options *,
418 int failure_verbosity);
419 FILE *mc_options_get_output_file (const struct mc_options *);
420 void mc_options_set_output_file (struct mc_options *, FILE *);
422 typedef bool mc_progress_func (struct mc *);
423 int mc_options_get_progress_usec (const struct mc_options *);
424 void mc_options_set_progress_usec (struct mc_options *, int progress_usec);
425 mc_progress_func *mc_options_get_progress_func (const struct mc_options *);
426 void mc_options_set_progress_func (struct mc_options *, mc_progress_func *);
428 void *mc_options_get_aux (const struct mc_options *);
429 void mc_options_set_aux (struct mc_options *, void *aux);
431 /* Reason that a model checking run terminated. */
434 MC_CONTINUING, /* Run has not yet terminated. */
435 MC_SUCCESS, /* Queue emptied (ran out of states). */
436 MC_MAX_UNIQUE_STATES, /* Did requested number of unique states. */
437 MC_MAX_ERROR_COUNT, /* Too many errors. */
438 MC_END_OF_PATH, /* Processed requested path (MC_PATH only). */
439 MC_TIMEOUT, /* Timeout. */
440 MC_INTERRUPTED /* Received SIGINT (Ctrl+C). */
443 void mc_results_destroy (struct mc_results *);
445 enum mc_stop_reason mc_results_get_stop_reason (const struct mc_results *);
446 int mc_results_get_unique_state_count (const struct mc_results *);
447 int mc_results_get_error_count (const struct mc_results *);
449 int mc_results_get_max_depth_reached (const struct mc_results *);
450 double mc_results_get_mean_depth_reached (const struct mc_results *);
452 const struct mc_path *mc_results_get_error_path (const struct mc_results *);
454 int mc_results_get_duplicate_dropped_states (const struct mc_results *);
455 int mc_results_get_off_path_dropped_states (const struct mc_results *);
456 int mc_results_get_depth_dropped_states (const struct mc_results *);
457 int mc_results_get_queue_dropped_states (const struct mc_results *);
458 int mc_results_get_queued_unprocessed_states (const struct mc_results *);
459 int mc_results_get_max_queue_length (const struct mc_results *);
461 struct timeval mc_results_get_start (const struct mc_results *);
462 struct timeval mc_results_get_end (const struct mc_results *);
463 double mc_results_get_duration (const struct mc_results *);
465 #endif /* libpspp/model-checker.h */