Reactive Computing Systems
Introduction: Reactive computing systems are hardware/software ensembles that maintain an ongoing interaction with their environment, coordinating as well as facilitating these interactions. They are widely used in all walks of life—often in areas that directly affect life. Examples of reactive systems include device drivers that control the operation of computer peripherals such as disks and network cards, embedded control systems used in spacecraft or automobiles, cache coherency controllers that maintain memory consistency in multiprocessor machines, and even certain cardiac pacemakers that measure body functions (such as body electric fields and exercise/sleep patterns) to keep a defective heart beating properly. Model checking has already been employed in most of these areas, with its use imminent in critical areas such as medical systems. Clearly, knowing a little bit about the inner workings of a model checker can go a long way towards their proper usage.
Why model checking?
The design of most reactive systems is an involved as well as exacting task. Hundreds of engineers are involved in planning, analyzing, as well as building and testing the various hardware and software components that go into these systems. Despite all this exacting work, at least two vexing problems remain:
- Reactive systems often exhibit nasty bugs only when field-tested. Unfortunately, at such late stages of product development, identifying the root cause of bugs as well as finding solutions or workarounds takes valuable product engineering time. A manufacturer caught in this situation can very easily lose their competitive advantage, as these late life-cycle bug fixes can cost them dearly— especially if they miss critical market windows.
- The risk of undetected bugs in products is very high,2 in the form of law-suits and recalls. Since software testing methods are seldom exhaustive, product managers have a very difficult time deciding when to begin selling products.
Formal methods based on model checking are geared towards eliminating the above difficulties associated with late cycle debugging. While model checking is not a panacea, it has established a track record of finding many deep bugs missed by weeks or months of testing. Specifically,
- model checking is best used when a reactive protocol is in its early conceptual design stages. This is also the most cost-effective point at which to eliminate deep conceptual flaws.
- model checking can return answers — either successful verification outcomes or high level counterexamples —often in a matter of a few minutes to a few hours. In contrast, testing can wastefully explore vast expanses of the state-space over weeks or months of testing. Error location can also become nightmarishly hard during testing, as the state-space sizes are large, and because an astronomically large number of computation steps may be executed from when the actual erroneous steps were carried out until when the system crashes or other symptoms of “ill health” are manifested.
In reality, the success of model checking can be attributed to several facts. Many of these facts are just pure “common sense,” not exclusive to model checking in any way. Yet, it has been observed that model checking facilitates the use of such common sense! We now list the “virtues of model checking,” starting from the most pragmatic and going towards the more mathematical reasons.
Successive refinement: Human thought seldom advances3 without the benefit of symbolic thought or successive refinement. Once one erects symbols and defines “rules of the game,” one can begin playing. In the same sense, rather than remain frozen with indecision in the face of full design complexity, designers adopting model checking methods have at their disposal a vehicle for testing early prototypes and evaluating design alternatives. The models that a designer builds for reactive systems are largely finitestate machines. These finite-state machines can either be standard ones or embellished with extra information pertaining to communication or computation.
Exhaustively verify simplified models: Even though a manually created design model can be defective, creating one actually allows the designer to unload a piece of their mounting mental burden, and test the integrity of their thoughts through model checking experiments. The modern tendency in this area is to supplant manual model construction with abstraction refinement techniques that can help one gradually discover, through tool assistance, what a suitable abstract model is. This approach promises to considerably reduce the level of effort needed in creating abstract models.
State-space analysis tools are the ‘heart’ of any model checker. These tools help exhaustively analyze finite-state machine models through a combination of techniques that help reduce memory requirements as well as overall computation time. As Rusbhy points out, experience has shown that the exhaustive analysis of finite-state models of reactive systems can often lead to the discovery of unexpected interactions (“corner cases”) and bugs much more readily than testing methods can, for the same amount of resources (human and computer time) spent.