Model-Based Software Debugging

Model-based debugging is the application of model-based diagnosis techniques to software artifacts. This project focuses on automated modelling of and reasoning about imperative and object-oriented programs at the source code level to help software engineers isolate possible root causes of program failures more easily. The model-based approach does not rely on formal specifications; instead, it obtains the necessary information from test cases that are used to test the program under consideration. Therefore, this approach is particularly suited for users not trained in formal methods. Prototypes developed in the course of this project have demonstrated that the model-based approach is situated in the middle-ground between methods based on formal modelling and program verification, and light-weight debugging aids, such as program slicing. Notable improvements over other approaches have been achieved, in particular for programs where traditional debugging techniques do not perform well. Furthermore, the model-based framework poses minimal requirements on its underlying reasoning modules, thus facilitating the integration of different debugging techniques into a coherent framework that combines the strengths of individual approaches.

Contents


Introduction and Objectives

Faults in software systems not only are inconvenient for the user, but also result in a drastic increase of the total production costs. Developing tools that aid software engineers in automatically locating and possibly repairing such faults is not only a technically challenging task, but includes an important economic and social relevance. Since testing and debugging are among the most costly and time consuming tasks in the software development life cycle, a variety of intelligent debugging aids have been proposed within the last three decades. Model-based software debugging (MBSD) is a particular technique that exploits discrepancies between a program execution and the intended behaviour to isolate program fragments that could potentially explain an observed misbehaviour.

The main goal of the project is to provide a solid theoretical foundation for the localisation of bugs in computer programs and to develop a user-friendly and intuitive prototype debugger that supports its user in the localisation of source code faults in Java programs. Different to many other automated debugging techniques, we emphasise that the debugging framework should not require users to possess training in formal logic and formal specification and verification techniques.

The JADE (Java Debugging Experiments) project extends model-based diagnosis technology to the software domain, such that the debugger prototype automatically suggests possible bug locations to the programmer based on information about incorrect program outputs. The JADE project originally carried out at Vienna University of Technology, and later continued at the ACRC and the Technische Universitrat Graz, has been dealing with the application of model-based diagnosis techniques to the software domain. Based on encouraging experiences with a similar tool that was developed to debug commercial-grade circuit designs written in VHDL, the goal of the project is to develop and a model-based debugging tool for mainstream object-oriented languages.


Model-based Software Debugging

Model-based software debugging (MBSD) is an application of Model-based Diagnosis (MBD) techniques to debugging computer programs. Model-based diagnosis was initially focused on locating faults in physical systems, in particular faulty gates in electronic circuits. Originally, MBSD was applied primarily with the goal of identifying incorrect clauses in logic programs; the approach has since been extended to different programming languages, including VHDL and Java. Before describing the MBSD approach in detail, the underlying principles of MBD are summarised. The basic principle of MBD is to compare a model, a description of the correct behaviour of a system, to the observed behaviour of the system. Traditional MBD systems receive the description of the observed behaviour through direct measurements while the model is supplied by the system’s designer. The difference between the behaviour anticipated by the model and the actual observed behaviour is used to identify components that, when assumed to deviate from their normal behaviour, may explain the observed behaviour. The principles of model-based debugging are depicted in Figure 1 below. The key idea of adapting MBD for debugging is to exchange the roles of the model and the actual system: the model reflects the behaviour of the (incorrect) program, while the test cases specify the anticipated result. Differences between the values computed by the program and the specified results are used to compute model elements that, when assumed to behave differently, explain the observed misbehaviour.

Figure 1 Principles of Model-Based Debugging

The program’s instructions are partitioned into a set of model components which form the building blocks of explanations (the so-called “diagnoses”). Each component corresponds to a fragment of the program’s source text, such that explanations can be expressed in terms of the original program to indicate a potential bug to the programmer. Each component can operate in normal mode, where the component functions as specified in the program, or in one or more abnormal modes with different behaviour. Intuitively, each component mode corresponds to a modification of a part of the program. In contrast to mutation-based approaches, generic placeholders are used to represent modifications and no particular expression replacing the original program is assumed.

The model components, a formal model of the semantics of the programming language and a set of test cases are submitted to the conformance testing module to determine whether the program reflecting the fault assumptions is consistent with the test cases. A program is found consistent with a test case specification if the program possibly satisfies behaviour specified by the test case.

In case the altered program does not compute the anticipated result, the MBSD engine computes possible explanations in terms of mode assignments to components and invokes the conformance testing module to determine if an explanation is indeed valid. This process iterates until one (or all) possible explanations have been found.

A demonstration of the automated debugger using a model developed as part of the previous project at the Vienna University of Technology is available to illustrate the concept of model-based debugging. The prototype developed in this project has similar functionality, but currently does not provide a user-interface to visualise the results.


Modelling Programs for Debugging

To apply model-based debugging techniques to a program, a suitable representation of the program in terms of components must be derived. This transformation of source code into a formal model is fully automated and does not require user interaction. The focus of our research is to extend previous modelling techniques to eliminate shortcomings of the reasoning frameworks developed in the past. More specific, the following topics have been investigated:

  • A unified graph representation incorporating data and control flow has been developed to allow the debugging of programs with unstructured control flow and exceptions.
  • A combination of static and dynamic program analysis techniques is applied to derive program properties even in case not all values of a program state or are known. Also, this approach facilitates the debugging of programs using unbounded loop statements and recursive method calls, which were beyond the capabilities of previous approaches.
  • High-level design information, such as state charts, dependencies between variables, contracts, and constraints taken from UML class diagrams are utilized to detect different kinds of faults without requiring user interaction.
  • Automatic model abstraction and refinement provides means for accurately locate faults in method call hierarchies, loops, and recursive method invocations.

Models developed for VHDL and early models for Java were based on static analysis and represented the program as a fixed model representing the program’s structure. As test cases were not taken into account, it was necessary to represent all possible executions and fault assumptions. While this approach has been used successfully to debug VHDL programs, polymorphism, exception handling and side-effects often lead to large models and imprecise results for object-oriented programs.

This project investigates an approach where the model is constructed and refined dynamically, taking into account only executions that are feasible in an abstract domain given inputs, anticipated outputs and other constraints taken from a test case. The approach is based on Abstract Interpretation, generalised to (only partially determined) programs incorporating fault assumptions and test case information. Partially determined abstract states allow for considerable model reduction and also facilitate further iterative model refinement.

The benefit of dynamic model construction can lead to considerable improvements in both time and number of explanations compared to static models and models based solely on dependencies. It was also shown that techniques based on recent advances in SAT-based model checking techniques allow to improve results slightly at the expense of elevated computational complexity. While lightweight spectra-based approaches and cause-effect chains have been shown to isolate a number of faults precisely, in the majority of cases, results did not improve compared to top-down reading of a program’s source code. Model-based techniques provide a more consistent picture, where most faults are confined to a small fraction (typically 20%-30%) of the original program; however, additional information is necessary to pinpoint a fault precisely.


Current and Future Research

Since no single debugging approach can satisfactorily address all possible programs and faults, research in this area is necessarily open-ended. Among the on-going and future research topics we would like to address the following:

  • broadening fault modes to deal with a wider range of complex faults, such as assignments to incorrect variables, missing or swapped statements, structural faults in data structures,
  • providing simple user interaction for incremental specification of complex program behaviour, and
  • combining model-based and spectra-based approaches to improve precision as well as speed of our debugging prototype.