1 /* PSPP - a program for statistical analysis.
2 Copyright (C) 2007, 2009, 2011 Free Software Foundation, Inc.
4 This program is free software: you can redistribute it and/or modify
5 it under the terms of the GNU General Public License as published by
6 the Free Software Foundation, either version 3 of the License, or
7 (at your option) any later version.
9 This program is distributed in the hope that it will be useful,
10 but WITHOUT ANY WARRANTY; without even the implied warranty of
11 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 GNU General Public License for more details.
14 You should have received a copy of the GNU General Public License
15 along with this program. If not, see <http://www.gnu.org/licenses/>. */
17 /* Implementation-level model checker.
19 A model checker is a tool for software testing and
20 verification that works by exploring all the possible states
21 in a system and verifying their internal consistency. A
22 conventional model checker requires that the code in a system
23 be translated into a specification language. The model
24 checker then verifies the specification, rather than the code.
26 This is instead an implementation-level model checker, which
27 does not require a separate specification. Instead, the model
28 checker requires writing a second implementation of the system
29 being checked. The second implementation can usually be made
30 almost trivial in comparison to the one being checked, because
31 it's usually acceptable for its performance to be
32 comparatively poor, e.g. O(N^2) instead of O(lg N), and thus
33 to use much simpler algorithms.
35 For introduction to the implementation-level model checking
36 approach used here, please refer to the following papers:
38 Musuvathi, Park, Chou, Engler, Dill, "CMC: A Pragmatic
39 Approach to Model Checking Real Code", Proceedings of the
40 Fifth Symposium on Operating Systems Design and
41 Implementation (OSDI), Dec 2002.
42 http://sprout.stanford.edu/PAPERS/CMC-OSDI-2002/CMC-OSDI-2002.pdf
44 Yang, Twohey, Engler, Musuvathi, "Using Model Checking to
45 Find Serious File System Errors", Proceedings of the Sixth
46 Symposium on Operating System Design and Implementation
48 http://www.stanford.edu/~engler/osdi04-fisc.pdf
50 Yang, Twohey, Pfaff, Sar, Engler, "EXPLODE: A Lightweight,
51 General Approach to Finding Serious Errors in Storage
52 Systems", First Workshop on the Evaluation of Software
53 Defect Detection Tools (BUGS), June 2005.
54 http://benpfaff.org/papers/explode.pdf
56 Use of a model checker is appropriate when the system being
57 checked is difficult to test using handwritten tests. This
58 can be the case, for example, when the system has a
59 complicated internal state that is difficult to reason about
60 over a long series of operations.
62 The implementation model checker works by putting a set of one
63 of more initial states in a queue (and checking them for
64 consistency). Then the model checker removes a state from the
65 queue and applies all possible operations of interest to it
66 ("mutates" it), obtaining a set of zero or more child states
67 (and checking each of them for consistency). Each of these
68 states is itself added to the queue. The model checker
69 continues dequeuing states and mutating and checking them
70 until the queue is empty.
72 In pseudo-code, the process looks like this:
74 Q = { initial states }
77 for each operation applicable to S do:
82 In many cases this process will never terminate, because every
83 state has one or more child states. For some systems this is
84 unavoidable, but in others we can make the process finite by
85 pursuing a few stratagems:
87 1. Limit the maximum size of the system; for example, limit
88 the number of rows and columns in the implementation of a
89 table being checked. The client of the model checker is
90 responsible for implementing such limits.
92 2. Avoid checking a single state more than one time. This
93 model checker provides assistance for this function by
94 allowing the client to provide a hash of the system state.
95 States with identical hashes will only be checked once
98 When a system cannot be made finite, or when a finite system
99 is too large to check in a practical amount of time, the model
100 checker provides multiple ways to limit the checking run:
101 based on maximum depth, maximum unique states checked, maximum
102 errors found (by default, 1), or maximum time used for
105 The client of the model checker must provide three functions
106 via function pointers embedded into a "struct mc_class":
108 1. void init (struct mc *mc);
110 This function is called once at the beginning of a
111 checking run. It checks one or more initial states and
112 adds them to the model checker's queue. (If it does not
113 add any states to the queue, then there is nothing to
116 Here's an outline for writing the init function:
119 init_foo (struct mc *mc)
123 mc_name_operation (mc, "initial state");
124 foo = generate_initial_foo ();
125 if (!state_is_consistent (foo))
126 mc_error (mc, "inconsistent state");
127 mc_add_state (mc, foo);
130 2. void mutate (struct mc *mc, const void *data);
132 This function is called when a dequeued state is ready to
133 be mutated. For each operation that can be applied to
134 the client-specified DATA, it applies that operation to a
135 clone of the DATA, checks that the clone is consistent,
136 and adds the clone to the model checker's queue.
138 Here's an outline for writing the mutate function:
141 mutate_foo (struct mc *mc, void *state_)
143 struct foo *state = state_;
145 for (...each operation...)
146 if (mc_include_state (mc))
150 mc_name_operation (mc, "do operation %s", ...);
151 clone = clone_foo (state);
152 do_operation (clone);
153 if (!state_is_consistent (clone))
154 mc_error (mc, "inconsistent state");
155 if (mc_discard_dup_state (mc, hash_foo (clone)))
158 mc_add_state (mc, clone);
162 Notes on the above outline:
164 - The call to mc_include_state allows currently
165 uninteresting operations to be skipped. It is not
168 - The call to mc_name_operation should give the current
169 operation a human-readable name. The name may
170 include printf-style format specifications.
172 When an error occurs, the model checker (by default)
173 replays the sequence of operations performed to reach
174 the error, printing the name of the operation at each
175 step, which is often sufficient information in itself
178 At higher levels of verbosity, the name is printed
181 - Operations should be performed on a copy of the data
182 provided. The data provided should not be destroyed
185 - The call to mc_discard_dup_state is needed to discard
186 (probably) duplicate states. It is otherwise
189 To reduce the probability of collisions, use a
190 high-quality hash function. MD4 is a reasonable
191 choice: it is fast but high-quality. In one test,
192 switching to MD4 from MD5 increased overall speed of
193 model checking by 8% and actually reduced (!) the
194 number of collisions.
196 The hash value needs to include enough of the state
197 to ensure that interesting states are not excluded,
198 but it need not include the entire state. For
199 example, in many cases, the structure of complex data
200 (metadata) is often much more important than the
201 contents (data), so it may be reasonable to hash only
204 mc_discard_dup_state may be called before or after
205 checking for consistency. Calling it after checking
206 may make checking a given number of unique states
207 take longer, but it also ensures that all paths to a
208 given state produce correct results.
210 - The mc_error function reports errors. It may be
211 called as many times as desired to report each kind
212 of inconsistency in a state.
214 - The mc_add_state function adds the new state to the
215 queue. It should be called regardless of whether an
216 error was reported, to indicate to the model checker
217 that state processing has finished.
219 - The mutation function should be deterministic, to
220 make it possible to reliably reproduce errors.
222 3. void destroy (struct mc *mc, void *data);
224 This function is called to discard the client-specified
225 DATA associated with a state.
227 Numerous options are available for configuring the model
228 checker. The most important of these are:
232 * Breadth-first search (the default): First try all the
233 operations with depth 1, then those with depth 2, then
234 those with depth 3, and so on.
236 This search algorithm finds the least number of
237 operations needed to trigger a given bug.
239 * Depth-first search: Searches downward in the tree of
240 states as fast as possible. Good for finding bugs that
241 require long sequences of operations to trigger.
243 * Random-first search: Searches through the tree of
244 states in random order.
246 * Explicit path: Applies an explicitly specified sequence
249 - Verbosity: By default, messages are printed only when an
250 error is encountered, but you can cause the checker to
251 print a message on every state transition. This is most
252 useful when the errors in your code cause segfaults or
253 some other kind of sudden termination.
255 - Treatment of errors: By default, when an error is
256 encountered, the model checker recursively invokes itself
257 with an increased verbosity level and configured to follow
258 only the error path. As long as the mutation function is
259 deterministic, this quickly and concisely replays the
260 error and describes the path followed to reach it in an
261 easily human-readable manner.
265 * Maximum depth: You can limit the depth of the operations
266 performed. Most often useful with depth-first search.
267 By default, depth is unlimited.
269 * Maximum queue length: You can limit the number of states
270 kept in the queue at any given time. The main reason to
271 do so is to limit memory consumption. The default
272 limit is 10,000 states. Several strategies are
273 available for choosing which state to drop when the
276 - Stop conditions: based on maximum unique states checked,
277 maximum errors found (by default, 1), or maximum time used
280 - Progress: by default, the checker prints a '.' on stderr
281 every .25 seconds, but you can substitute another progress
282 function or disable progress printing.
284 This model checker does not (yet) include two features
285 described in the papers cited above: utility scoring
286 heuristics to guide the search strategy and "choice points" to
287 explore alternative cases. The former feature is less
288 interesting for this model checker, because the data
289 structures we are thus far using it to model are much smaller
290 than those discussed in the paper. The latter feature we
291 should implement at some point. */
293 #ifndef LIBPSPP_MODEL_CHECKER_H
294 #define LIBPSPP_MODEL_CHECKER_H 1
299 #include <sys/time.h>
301 #include "libpspp/compiler.h"
303 /* An active model checking run. */
306 /* Provided by each client of the model checker. */
309 void (*init) (struct mc *);
310 void (*mutate) (struct mc *, const void *);
311 void (*destroy) (const struct mc *, void *);
314 /* Results of a model checking run. */
317 /* Configuration for running the model checker. */
320 /* Primary external interface to model checker. */
321 struct mc_results *mc_run (const struct mc_class *, struct mc_options *);
323 /* Functions for use from client-supplied "init" and "mutate"
325 bool mc_include_state (struct mc *);
326 bool mc_discard_dup_state (struct mc *, unsigned int hash);
327 void mc_name_operation (struct mc *, const char *, ...) PRINTF_FORMAT (2, 3);
328 void mc_vname_operation (struct mc *, const char *, va_list)
329 PRINTF_FORMAT (2, 0);
330 void mc_error (struct mc *, const char *, ...) PRINTF_FORMAT (2, 3);
331 void mc_add_state (struct mc *, void *data);
333 /* Functions for use from client-supplied "init", "mutate", and
334 "destroy" functions. */
335 const struct mc_options *mc_get_options (const struct mc *);
336 const struct mc_results *mc_get_results (const struct mc *);
337 void *mc_get_aux (const struct mc *);
339 /* A path of operations through a model to arrive at some
343 int *ops; /* Sequence of operations. */
344 size_t length; /* Number of operations. */
345 size_t capacity; /* Number of operations for which room is allocated. */
348 void mc_path_init (struct mc_path *);
349 void mc_path_copy (struct mc_path *, const struct mc_path *);
350 void mc_path_push (struct mc_path *, int new_state);
351 int mc_path_pop (struct mc_path *);
352 int mc_path_back (const struct mc_path *);
353 void mc_path_destroy (struct mc_path *);
355 int mc_path_get_operation (const struct mc_path *, size_t index);
356 size_t mc_path_get_length (const struct mc_path *);
359 void mc_path_to_string (const struct mc_path *, struct string *);
361 struct mc_options *mc_options_create (void);
362 struct mc_options *mc_options_clone (const struct mc_options *);
363 void mc_options_destroy (struct mc_options *);
365 /* Search strategy. */
368 MC_BROAD, /* Breadth-first search. */
369 MC_DEEP, /* Depth-first search. */
370 MC_RANDOM, /* Randomly ordered search. */
371 MC_PATH /* Follow one explicit path. */
374 enum mc_strategy mc_options_get_strategy (const struct mc_options *);
375 void mc_options_set_strategy (struct mc_options *, enum mc_strategy);
376 unsigned int mc_options_get_seed (const struct mc_options *);
377 void mc_options_set_seed (struct mc_options *, unsigned int seed);
378 int mc_options_get_max_depth (const struct mc_options *);
379 void mc_options_set_max_depth (struct mc_options *, int max_depth);
380 int mc_options_get_hash_bits (const struct mc_options *);
381 void mc_options_set_hash_bits (struct mc_options *, int hash_bits);
383 const struct mc_path *mc_options_get_follow_path (const struct mc_options *);
384 void mc_options_set_follow_path (struct mc_options *, const struct mc_path *);
386 /* Strategy for dropped states from the queue when it
388 enum mc_queue_limit_strategy
390 MC_DROP_NEWEST, /* Don't enqueue the new state at all. */
391 MC_DROP_OLDEST, /* Drop the oldest state in the queue. */
392 MC_DROP_RANDOM /* Drop a random state from the queue. */
395 int mc_options_get_queue_limit (const struct mc_options *);
396 void mc_options_set_queue_limit (struct mc_options *, int queue_limit);
397 enum mc_queue_limit_strategy mc_options_get_queue_limit_strategy (
398 const struct mc_options *);
399 void mc_options_set_queue_limit_strategy (struct mc_options *,
400 enum mc_queue_limit_strategy);
402 int mc_options_get_max_unique_states (const struct mc_options *);
403 void mc_options_set_max_unique_states (struct mc_options *,
404 int max_unique_states);
405 int mc_options_get_max_errors (const struct mc_options *);
406 void mc_options_set_max_errors (struct mc_options *, int max_errors);
407 double mc_options_get_time_limit (const struct mc_options *);
408 void mc_options_set_time_limit (struct mc_options *, double time_limit);
410 int mc_options_get_verbosity (const struct mc_options *);
411 void mc_options_set_verbosity (struct mc_options *, int verbosity);
412 int mc_options_get_failure_verbosity (const struct mc_options *);
413 void mc_options_set_failure_verbosity (struct mc_options *,
414 int failure_verbosity);
415 FILE *mc_options_get_output_file (const struct mc_options *);
416 void mc_options_set_output_file (struct mc_options *, FILE *);
418 typedef bool mc_progress_func (struct mc *);
419 mc_progress_func mc_progress_dots;
420 mc_progress_func mc_progress_fancy;
421 mc_progress_func mc_progress_verbose;
423 int mc_options_get_progress_usec (const struct mc_options *);
424 void mc_options_set_progress_usec (struct mc_options *, int progress_usec);
425 mc_progress_func *mc_options_get_progress_func (const struct mc_options *);
426 void mc_options_set_progress_func (struct mc_options *, mc_progress_func *);
428 void *mc_options_get_aux (const struct mc_options *);
429 void mc_options_set_aux (struct mc_options *, void *aux);
432 void mc_options_register_argv_parser (struct mc_options *,
433 struct argv_parser *);
434 void mc_options_usage (void);
436 /* Reason that a model checking run terminated. */
439 MC_CONTINUING, /* Run has not yet terminated. */
440 MC_SUCCESS, /* Queue emptied (ran out of states). */
441 MC_MAX_UNIQUE_STATES, /* Did requested number of unique states. */
442 MC_MAX_ERROR_COUNT, /* Too many errors. */
443 MC_END_OF_PATH, /* Processed requested path (MC_PATH only). */
444 MC_TIMEOUT, /* Timeout. */
445 MC_INTERRUPTED /* Received SIGINT (Ctrl+C). */
448 void mc_results_destroy (struct mc_results *);
450 enum mc_stop_reason mc_results_get_stop_reason (const struct mc_results *);
451 int mc_results_get_unique_state_count (const struct mc_results *);
452 int mc_results_get_error_count (const struct mc_results *);
454 int mc_results_get_max_depth_reached (const struct mc_results *);
455 double mc_results_get_mean_depth_reached (const struct mc_results *);
457 const struct mc_path *mc_results_get_error_path (const struct mc_results *);
459 int mc_results_get_duplicate_dropped_states (const struct mc_results *);
460 int mc_results_get_off_path_dropped_states (const struct mc_results *);
461 int mc_results_get_depth_dropped_states (const struct mc_results *);
462 int mc_results_get_queue_dropped_states (const struct mc_results *);
463 int mc_results_get_queued_unprocessed_states (const struct mc_results *);
464 int mc_results_get_max_queue_length (const struct mc_results *);
466 struct timeval mc_results_get_start (const struct mc_results *);
467 struct timeval mc_results_get_end (const struct mc_results *);
468 double mc_results_get_duration (const struct mc_results *);
470 void mc_results_print (const struct mc_results *, FILE *);
472 #endif /* libpspp/model-checker.h */