1 /* PSPP - a program for statistical analysis.
2 Copyright (C) 2007 Free Software Foundation, Inc.
4 This program is free software: you can redistribute it and/or modify
5 it under the terms of the GNU General Public License as published by
6 the Free Software Foundation, either version 3 of the License, or
7 (at your option) any later version.
9 This program is distributed in the hope that it will be useful,
10 but WITHOUT ANY WARRANTY; without even the implied warranty of
11 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 GNU General Public License for more details.
14 You should have received a copy of the GNU General Public License
15 along with this program. If not, see <http://www.gnu.org/licenses/>. */
17 /* Implementation-level model checker.
19 A model checker is a tool for software testing and
20 verification that works by exploring all the possible states
21 in a system and verifying their internal consistency. A
22 conventional model checker requires that the code in a system
23 be translated into a specification language. The model
24 checker then verifies the specification, rather than the code.
26 This is instead an implementation-level model checker, which
27 does not require a separate specification. Instead, the model
28 checker requires writing a second implementation of the system
29 being checked. The second implementation can usually be made
30 almost trivial in comparison to the one being checked, because
31 it's usually acceptable for its performance to be
32 comparatively poor, e.g. O(N^2) instead of O(lg N), and thus
33 to use much simpler algorithms.
35 For introduction to the implementation-level model checking
36 approach used here, please refer to the following papers:
38 Musuvathi, Park, Chou, Engler, Dill, "CMC: A Pragmatic
39 Approach to Model Checking Real Code", Proceedings of the
40 Fifth Symposium on Operating Systems Design and
41 Implementation (OSDI), Dec 2002.
42 http://sprout.stanford.edu/PAPERS/CMC-OSDI-2002/CMC-OSDI-2002.pdf
44 Yang, Twohey, Engler, Musuvathi, "Using Model Checking to
45 Find Serious File System Errors", Proceedings of the Sixth
46 Symposium on Operating System Design and Implementation
48 http://www.stanford.edu/~engler/osdi04-fisc.pdf
50 Yang, Twohey, Pfaff, Sar, Engler, "EXPLODE: A Lightweight,
51 General Approach to Finding Serious Errors in Storage
52 Systems", First Workshop on the Evaluation of Software
53 Defect Detection Tools (BUGS), June 2005.
54 http://benpfaff.org/papers/explode.pdf
56 Use of a model checker is appropriate when the system being
57 checked is difficult to test using handwritten tests. This
58 can be the case, for example, when the system has a
59 complicated internal state that is difficult to reason about
60 over a long series of operations.
62 The implementation model checker works by putting a set of one
63 of more initial states in a queue (and checking them for
64 consistency). Then the model checker removes a state from the
65 queue and applies all possible operations of interest to it
66 ("mutates" it), obtaining a set of zero or more child states
67 (and checking each of them for consistency). Each of these
68 states is itself added to the queue. The model checker
69 continues dequeuing states and mutating and checking them
70 until the queue is empty.
72 In pseudo-code, the process looks like this:
74 Q = { initial states }
77 for each operation applicable to S do:
82 In many cases this process will never terminate, because every
83 state has one or more child states. For some systems this is
84 unavoidable, but in others we can make the process finite by
85 pursuing a few stratagems:
87 1. Limit the maximum size of the system; for example, limit
88 the number of rows and columns in the implementation of a
89 table being checked. The client of the model checker is
90 responsible for implementing such limits.
92 2. Avoid checking a single state more than one time. This
93 model checker provides assistance for this function by
94 allowing the client to provide a hash of the system state.
95 States with identical hashes will only be checked once
98 When a system cannot be made finite, or when a finite system
99 is too large to check in a practical amount of time, the model
100 checker provides multiple ways to limit the checking run:
101 based on maximum depth, maximum unique states checked, maximum
102 errors found (by default, 1), or maximum time used for
105 The client of the model checker must provide three functions
106 via function pointers embedded into a "struct mc_class":
108 1. void init (struct mc *mc);
110 This function is called once at the beginning of a
111 checking run. It checks one or more initial states and
112 adds them to the model checker's queue. (If it does not
113 add any states to the queue, then there is nothing to
116 Here's an outline for writing the init function:
119 init_foo (struct mc *mc)
123 mc_name_operation (mc, "initial state");
124 foo = generate_initial_foo ();
125 if (!state_is_consistent (foo))
126 mc_error (mc, "inconsistent state");
127 mc_add_state (mc, foo);
130 2. void mutate (struct mc *mc, const void *data);
132 This function is called when a dequeued state is ready to
133 be mutated. For each operation that can be applied to
134 the client-specified DATA, it applies that operation to a
135 clone of the DATA, checks that the clone is consistent,
136 and adds the clone to the model checker's queue.
138 Here's an outline for writing the mutate function:
141 mutate_foo (struct mc *mc, void *state_)
143 struct foo *state = state_;
145 for (...each operation...)
146 if (mc_include_state (mc))
150 mc_name_operation (mc, "do operation %s", ...);
151 clone = clone_foo (state);
152 do_operation (clone);
153 if (!mc_discard_dup_state (mc, hash_foo (clone)))
155 if (!state_is_consistent (clone))
156 mc_error (mc, "inconsistent state");
157 mc_add_state (mc, clone);
164 Notes on the above outline:
166 - The call to mc_include_state allows currently
167 uninteresting operations to be skipped. It is not
170 - The call to mc_name_operation should give the current
171 operation a human-readable name. The name may
172 include printf-style format specifications.
174 When an error occurs, the model checker (by default)
175 replays the sequence of operations performed to reach
176 the error, printing the name of the operation at each
177 step, which is often sufficient information in itself
180 At higher levels of verbosity, the name is printed
183 - Operations should be performed on a copy of the data
184 provided. The data provided should not be destroyed
187 - The call to mc_discard_dup_state is needed to discard
188 (probably) duplicate states. It is otherwise
191 To reduce the probability of collisions, use a
192 high-quality hash function. MD4 is a reasonable
193 choice: it is fast but high-quality. In one test,
194 switching to MD4 from MD5 increased overall speed of
195 model checking by 8% and actually reduced (!) the
196 number of collisions.
198 The hash value needs to include enough of the state
199 to ensure that interesting states are not excluded,
200 but it need not include the entire state. For
201 example, in many cases, the structure of complex data
202 (metadata) is often much more important than the
203 contents (data), so it may be reasonable to hash only
206 mc_discard_dup_state may be called before or after
207 checking for consistency, but calling it first avoids
208 wasting time checking duplicate states for
209 consistency, which again can be a significant
212 - The mc_error function reports errors. It may be
213 called as many times as desired to report each kind
214 of inconsistency in a state.
216 - The mc_add_state function adds the new state to the
217 queue. It should be called regardless of whether an
218 error was reported, to indicate to the model checker
219 that state processing has finished.
221 - The mutation function should be deterministic, to
222 make it possible to reliably reproduce errors.
224 3. void destroy (struct mc *mc, void *data);
226 This function is called to discard the client-specified
227 DATA associated with a state.
229 Numerous options are available for configuring the model
230 checker. The most important of these are:
234 * Breadth-first search (the default): First try all the
235 operations with depth 1, then those with depth 2, then
236 those with depth 3, and so on.
238 This search algorithm finds the least number of
239 operations needed to trigger a given bug.
241 * Depth-first search: Searches downward in the tree of
242 states as fast as possible. Good for finding bugs that
243 require long sequences of operations to trigger.
245 * Random-first search: Searches through the tree of
246 states in random order.
248 * Explicit path: Applies an explicitly specified sequence
251 - Verbosity: By default, messages are printed only when an
252 error is encountered, but you can cause the checker to
253 print a message on every state transition. This is most
254 useful when the errors in your code cause segfaults or
255 some other kind of sudden termination.
257 - Treatment of errors: By default, when an error is
258 encountered, the model checker recursively invokes itself
259 with an increased verbosity level and configured to follow
260 only the error path. As long as the mutation function is
261 deterministic, this quickly and concisely replays the
262 error and describes the path followed to reach it in an
263 easily human-readable manner.
267 * Maximum depth: You can limit the depth of the operations
268 performed. Most often useful with depth-first search.
269 By default, depth is unlimited.
271 * Maximum queue length: You can limit the number of states
272 kept in the queue at any given time. The main reason to
273 do so is to limit memory consumption. The default
274 limit is 10,000 states. Several strategies are
275 available for choosing which state to drop when the
278 - Stop conditions: based on maximum unique states checked,
279 maximum errors found (by default, 1), or maximum time used
282 - Progress: by default, the checker prints a '.' on stderr
283 every .25 seconds, but you can substitute another progress
284 function or disable progress printing.
286 This model checker does not (yet) include two features
287 described in the papers cited above: utility scoring
288 heuristics to guide the search strategy and "choice points" to
289 explore alternative cases. The former feature is less
290 interesting for this model checker, because the data
291 structures we are thus far using it to model are much smaller
292 than those discussed in the paper. The latter feature we
293 should implement at some point. */
295 #ifndef LIBPSPP_MODEL_CHECKER_H
296 #define LIBPSPP_MODEL_CHECKER_H 1
301 #include <sys/time.h>
303 #include <libpspp/compiler.h>
305 /* An active model checking run. */
308 /* Provided by each client of the model checker. */
311 void (*init) (struct mc *);
312 void (*mutate) (struct mc *, const void *);
313 void (*destroy) (const struct mc *, void *);
316 /* Results of a model checking run. */
319 /* Configuration for running the model checker. */
322 /* Primary external interface to model checker. */
323 struct mc_results *mc_run (const struct mc_class *, struct mc_options *);
325 /* Functions for use from client-supplied "init" and "mutate"
327 bool mc_include_state (struct mc *);
328 bool mc_discard_dup_state (struct mc *, unsigned int hash);
329 void mc_name_operation (struct mc *, const char *, ...) PRINTF_FORMAT (2, 3);
330 void mc_vname_operation (struct mc *, const char *, va_list)
331 PRINTF_FORMAT (2, 0);
332 void mc_error (struct mc *, const char *, ...) PRINTF_FORMAT (2, 3);
333 void mc_add_state (struct mc *, void *data);
335 /* Functions for use from client-supplied "init", "mutate", and
336 "destroy" functions. */
337 const struct mc_options *mc_get_options (const struct mc *);
338 const struct mc_results *mc_get_results (const struct mc *);
339 void *mc_get_aux (const struct mc *);
341 /* A path of operations through a model to arrive at some
345 int *ops; /* Sequence of operations. */
346 size_t length; /* Number of operations. */
347 size_t capacity; /* Number of operations for which room is allocated. */
350 void mc_path_init (struct mc_path *);
351 void mc_path_copy (struct mc_path *, const struct mc_path *);
352 void mc_path_push (struct mc_path *, int new_state);
353 int mc_path_pop (struct mc_path *);
354 int mc_path_back (const struct mc_path *);
355 void mc_path_destroy (struct mc_path *);
357 int mc_path_get_operation (const struct mc_path *, size_t index);
358 size_t mc_path_get_length (const struct mc_path *);
361 void mc_path_to_string (const struct mc_path *, struct string *);
363 struct mc_options *mc_options_create (void);
364 struct mc_options *mc_options_clone (const struct mc_options *);
365 void mc_options_destroy (struct mc_options *);
367 /* Search strategy. */
370 MC_BROAD, /* Breadth-first search. */
371 MC_DEEP, /* Depth-first search. */
372 MC_RANDOM, /* Randomly ordered search. */
373 MC_PATH /* Follow one explicit path. */
376 enum mc_strategy mc_options_get_strategy (const struct mc_options *);
377 void mc_options_set_strategy (struct mc_options *, enum mc_strategy);
378 unsigned int mc_options_get_seed (const struct mc_options *);
379 void mc_options_set_seed (struct mc_options *, unsigned int seed);
380 int mc_options_get_max_depth (const struct mc_options *);
381 void mc_options_set_max_depth (struct mc_options *, int max_depth);
382 int mc_options_get_hash_bits (const struct mc_options *);
383 void mc_options_set_hash_bits (struct mc_options *, int hash_bits);
385 const struct mc_path *mc_options_get_follow_path (const struct mc_options *);
386 void mc_options_set_follow_path (struct mc_options *, const struct mc_path *);
388 /* Strategy for dropped states from the queue when it
390 enum mc_queue_limit_strategy
392 MC_DROP_NEWEST, /* Don't enqueue the new state at all. */
393 MC_DROP_OLDEST, /* Drop the oldest state in the queue. */
394 MC_DROP_RANDOM /* Drop a random state from the queue. */
397 int mc_options_get_queue_limit (const struct mc_options *);
398 void mc_options_set_queue_limit (struct mc_options *, int queue_limit);
399 enum mc_queue_limit_strategy mc_options_get_queue_limit_strategy (
400 const struct mc_options *);
401 void mc_options_set_queue_limit_strategy (struct mc_options *,
402 enum mc_queue_limit_strategy);
404 int mc_options_get_max_unique_states (const struct mc_options *);
405 void mc_options_set_max_unique_states (struct mc_options *,
406 int max_unique_states);
407 int mc_options_get_max_errors (const struct mc_options *);
408 void mc_options_set_max_errors (struct mc_options *, int max_errors);
409 double mc_options_get_time_limit (const struct mc_options *);
410 void mc_options_set_time_limit (struct mc_options *, double time_limit);
412 int mc_options_get_verbosity (const struct mc_options *);
413 void mc_options_set_verbosity (struct mc_options *, int verbosity);
414 int mc_options_get_failure_verbosity (const struct mc_options *);
415 void mc_options_set_failure_verbosity (struct mc_options *,
416 int failure_verbosity);
417 FILE *mc_options_get_output_file (const struct mc_options *);
418 void mc_options_set_output_file (struct mc_options *, FILE *);
420 typedef bool mc_progress_func (struct mc *);
421 int mc_options_get_progress_usec (const struct mc_options *);
422 void mc_options_set_progress_usec (struct mc_options *, int progress_usec);
423 mc_progress_func *mc_options_get_progress_func (const struct mc_options *);
424 void mc_options_set_progress_func (struct mc_options *, mc_progress_func *);
426 void *mc_options_get_aux (const struct mc_options *);
427 void mc_options_set_aux (struct mc_options *, void *aux);
429 /* Reason that a model checking run terminated. */
432 MC_CONTINUING, /* Run has not yet terminated. */
433 MC_SUCCESS, /* Queue emptied (ran out of states). */
434 MC_MAX_UNIQUE_STATES, /* Did requested number of unique states. */
435 MC_MAX_ERROR_COUNT, /* Too many errors. */
436 MC_END_OF_PATH, /* Processed requested path (MC_PATH only). */
437 MC_TIMEOUT, /* Timeout. */
438 MC_INTERRUPTED /* Received SIGINT (Ctrl+C). */
441 void mc_results_destroy (struct mc_results *);
443 enum mc_stop_reason mc_results_get_stop_reason (const struct mc_results *);
444 int mc_results_get_unique_state_count (const struct mc_results *);
445 int mc_results_get_error_count (const struct mc_results *);
447 int mc_results_get_max_depth_reached (const struct mc_results *);
448 double mc_results_get_mean_depth_reached (const struct mc_results *);
450 const struct mc_path *mc_results_get_error_path (const struct mc_results *);
452 int mc_results_get_duplicate_dropped_states (const struct mc_results *);
453 int mc_results_get_off_path_dropped_states (const struct mc_results *);
454 int mc_results_get_depth_dropped_states (const struct mc_results *);
455 int mc_results_get_queue_dropped_states (const struct mc_results *);
456 int mc_results_get_queued_unprocessed_states (const struct mc_results *);
457 int mc_results_get_max_queue_length (const struct mc_results *);
459 struct timeval mc_results_get_start (const struct mc_results *);
460 struct timeval mc_results_get_end (const struct mc_results *);
461 double mc_results_get_duration (const struct mc_results *);
463 #endif /* libpspp/model-checker.h */