Automated Software Testing in the Absence of Specifications
-- The Yangtse Project on Software Testing and Dynamic Behavior Inference
Previous research has shown that specifications can play an important role in
automated software testing, either in guiding test-input generation or
checking the correctness of test executions. However, specifications
often do not exist in practice. The Yangtse project infers dynamic
behavior (often in the form of specifications) from program executions
and exploits the inferred behavior in automated software testing;
therefore, some benefits of specification-based testing can be achieved
without requiring developers to write specifications. The techniques
and tools developed in this project target a broad group of
developers because the inputs to the techniques or tools are simply
programs (or some other software development artifacts that commonly
exist in practice).
In particular, the project addresses
five important problems in testing practice: test generation, test
selection, test abstraction, test oracle augmentation, and regression
testing. The project addresses these problems by inferring various
types of dynamic behavior from program executions. Putting all these
pieces together, the project hopes to tackle these problems within a
feedback loop between test generation and dynamic behavior inference
(or specification inference). This project is conducted by the Automated Software
Engineering Research Group led by Tao
Xie at North Carolina State University, Raleigh, NC.
How is our research work related to software industry?
See below for Project Description, Sub-projects, and Related Publications.

Test Generation: Automated test generation tools should focus on generating useful tests rather than redundant ones. We have addressed this need by developing a novel definition of redundant tests based on method inputs [ASE 04]. In contrast to traditional approaches based on structural coverage, our approach guarantees that removing a redundant test decreases neither a test suite's fault-detection capability nor our confidence in the code under test. Based on this definition, we have built a redundant-test detection tool that can detect a high percentage (about 90%) of redundant tests among tests generated by Parasoft Jtest 4.5, a commercial testing tool that is widely used in industry. Because Jtest spends time generating redundant tests, its test generation takes a long time. In response to their customers' needs, Parasoft initiated a collaboration with us to incorporate our new techniques into later versions of Jtest.
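The idea of redundancy based on method inputs can be illustrated with a minimal sketch. It assumes that a method execution is identified by the method name, the receiver state before the call, and the arguments, and that a test is redundant when every method execution it performs was already exercised by earlier tests. The class name, the string encoding of states, and the `key` helper are hypothetical; this is not the tool described in [ASE 04].

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; names and the state encoding are assumptions. */
final class RedundantTestFilter {

    /** Encodes one method execution by its inputs (hypothetical string encoding). */
    static String key(String method, String receiverState, String arguments) {
        return method + "|" + receiverState + "|" + arguments;
    }

    /**
     * Each test is given as the list of method-input keys it executes.
     * Returns the indices of tests that exercise at least one unseen method input.
     */
    static List<Integer> nonRedundant(List<List<String>> tests) {
        Set<String> seen = new HashSet<>();
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < tests.size(); i++) {
            boolean addsNewInput = false;
            for (String methodInput : tests.get(i)) {
                // Set.add returns true only when the input was not seen before.
                if (seen.add(methodInput)) {
                    addsNewInput = true;
                }
            }
            if (addsNewInput) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> tests = List.of(
            List.of(key("push", "[]", "1")),
            List.of(key("push", "[]", "1"), key("pop", "[1]", "")),
            List.of(key("push", "[]", "1")));
        System.out.println(nonRedundant(tests)); // prints [0, 1]
    }
}
```

The third test repeats a method input that the first test already exercised, so the sketch treats it as redundant.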
Borrowing ideas from model checking, we have built a test-generation tool that generates only non-redundant tests by systematically exploring the object-state space of the Java class under test. To address state explosion, we have developed another tool for generating tests using symbolic execution [TACAS 05]. We have developed novel techniques for comparing symbolic states and pruning the exploration of object states without compromising the exhaustiveness of the exploration. Experimental results showed that the tool generates tests that achieve higher structural coverage faster than existing tools. We also empirically compared different automated generation techniques for object-oriented unit testing and showed that the techniques are complementary (i.e., detect different faults) [ASE 06]. We developed test generation tools for testing AspectJ programs [AOSD 06] and for integration testing [AST 06].
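Systematic exploration of an object-state space can be sketched as a breadth-first search that expands each distinct state once. The sketch below uses concrete states only (no symbolic execution) and assumes that states are immutable values compared with `equals()`; the class name, the `explore` signature, and the bounded-stack example in `main` are hypothetical and do not reflect the actual tools.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Illustrative sketch only: concrete-state exploration with pruning of visited states. */
final class StateSpaceExplorer {

    /**
     * Explores method sequences up to maxLength calls, starting from the initial state.
     * Each state is expanded once, so every emitted test ends with a
     * (state, method) pair that no other emitted test exercises.
     */
    static <S> List<List<String>> explore(S initial,
                                          Map<String, UnaryOperator<S>> methods,
                                          int maxLength) {
        List<List<String>> tests = new ArrayList<>();
        Set<S> visited = new HashSet<>();
        Queue<S> states = new ArrayDeque<>();
        Queue<List<String>> paths = new ArrayDeque<>(); // call sequence that reaches each queued state
        visited.add(initial);
        states.add(initial);
        paths.add(new ArrayList<>());
        while (!states.isEmpty()) {
            S state = states.remove();
            List<String> path = paths.remove();
            if (path.size() >= maxLength) {
                continue; // length bound reached; do not extend this sequence
            }
            for (Map.Entry<String, UnaryOperator<S>> method : methods.entrySet()) {
                List<String> test = new ArrayList<>(path);
                test.add(method.getKey());
                tests.add(test);
                S after = method.getValue().apply(state);
                // Prune: only states not seen before are queued for further exploration.
                if (visited.add(after)) {
                    states.add(after);
                    paths.add(test);
                }
            }
        }
        return tests;
    }

    public static void main(String[] args) {
        // Hypothetical class under test: a stack of capacity 2, abstracted to its size.
        Map<String, UnaryOperator<Integer>> methods = new LinkedHashMap<>();
        methods.put("push", size -> Math.min(size + 1, 2));
        methods.put("pop", size -> Math.max(size - 1, 0));
        for (List<String> test : explore(0, methods, 3)) {
            System.out.println(test);
        }
    }
}
```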
Test Selection: In the absence of specifications,
developers cannot generally afford to inspect a large number of
generated tests for correctness. To address this problem, we have
developed test selection techniques for selecting a valuable subset of
generated tests for inspection. In particular, we have developed the
operational violation approach and implemented the approach by
integrating Daikon (a dynamic invariant detector) and Parasoft Jtest
[ASEJ 06]. The approach uses Daikon to infer
specification-like properties from the existing tests and uses Jtest to
generate tests to violate these properties; the approach then selects
tests that violate these properties for inspection. Without requiring a
priori specifications, the approach exploits inferred properties to
gain many of the benefits of specification-based testing. Experimental
results showed that selected tests have a high probability of exposing
faults or failures in the code. In 2004, Agitar, Inc.
released Agitator,
a commercial testing tool that automatically generates initial tests,
infers specification-like properties, lets developers confirm these
properties, and generates more tests to violate these properties. The
success of Agitator confirms the utility of our operational violation
approach in industry. We have also developed a tool for automatically
selecting common and special tests from generated tests based on
statistical properties inferred from program executions [ISSRE 05].
We also empirically compared different automated
classification/selection techniques for object-oriented unit testing
and showed that the techniques are complementary (i.e., detect
different faults) [ASE 06].
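The selection step of the operational violation approach can be illustrated with a minimal sketch. It assumes a single kind of inferred property, a value range `lo <= value <= hi` over one observed value per test, as a stand-in for the much richer properties that Daikon infers. It does not call Daikon or Jtest, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: range properties stand in for inferred properties. */
final class OperationalViolation {

    /** Infers the property lo <= value <= hi from values observed in the existing tests. */
    static int[] inferRange(int[] observed) {
        if (observed.length == 0) {
            throw new IllegalArgumentException("need at least one observation");
        }
        int lo = observed[0];
        int hi = observed[0];
        for (int value : observed) {
            lo = Math.min(lo, value);
            hi = Math.max(hi, value);
        }
        return new int[] {lo, hi};
    }

    /** Returns the indices of generated tests whose observed value violates the range. */
    static List<Integer> selectViolating(int[] range, int[] generatedValues) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < generatedValues.length; i++) {
            int value = generatedValues[i];
            if (value < range[0] || value > range[1]) {
                selected.add(i); // violation: worth a developer's inspection
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        int[] range = inferRange(new int[] {1, 5, 3});   // inferred property: 1 <= value <= 5
        System.out.println(selectViolating(range, new int[] {2, 7, 0, 5})); // prints [1, 2]
    }
}
```

A violation does not by itself mean a fault: the selected test may expose a failure, or it may simply show that the existing tests were too narrow for the inferred property to hold in general.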
Test Abstraction: To reduce the cost of inspecting
tests, we have developed techniques for abstracting and visualizing
dynamic Java program behavior for inspection. Because the concrete
object states and transitions among them are too complicated to be
useful for inspection, we have developed a tool for producing an
abstract state of an object based on the return values of a set of
observers (public methods with non-void returns) invoked on the
object [ICFEM 04].
The tool can also produce an abstract state of an object by projecting
the concrete object state on a member field of the class [SAVCBS 04]
or on branch coverage of methods invoked on the concrete state
[RETR 05]. Given the abstract states and the transitions
among them, the tool extracts succinct and useful abstract object state
machines for inspection. The extracted views helped us to discover an
error in the widely used Java API documentation. Recently we extended
the tool to support testing multiple classes in integration testing [AST 06].
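The observer-based abstraction can be sketched as follows, assuming that an abstract state is the tuple of observer return values and that each recorded method call contributes one transition between abstract states. The class name, the string rendering of states, and the use of `java.util.ArrayDeque` as the class under test are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Function;

/** Illustrative sketch only: builds an abstract object state machine from observers. */
final class ObserverStateMachine<T> {
    private final Map<String, Function<T, Object>> observers;
    private final Set<String> transitions = new LinkedHashSet<>();

    ObserverStateMachine(Map<String, Function<T, Object>> observers) {
        this.observers = new LinkedHashMap<>(observers); // keep a stable observer order
    }

    /** Abstract state: the return values of all observers invoked on the object. */
    String abstractState(T object) {
        StringBuilder state = new StringBuilder();
        for (Map.Entry<String, Function<T, Object>> observer : observers.entrySet()) {
            if (state.length() > 0) {
                state.append(", ");
            }
            state.append(observer.getKey()).append('=').append(observer.getValue().apply(object));
        }
        return state.toString();
    }

    /** Executes one method call and records the transition between abstract states. */
    void record(T object, String methodName, Consumer<T> call) {
        String before = abstractState(object);
        call.accept(object);
        String after = abstractState(object);
        transitions.add(before + " --" + methodName + "--> " + after);
    }

    Set<String> transitions() {
        return transitions;
    }

    public static void main(String[] args) {
        Map<String, Function<ArrayDeque<Integer>, Object>> observers = new LinkedHashMap<>();
        observers.put("isEmpty", deque -> deque.isEmpty());
        ObserverStateMachine<ArrayDeque<Integer>> machine = new ObserverStateMachine<>(observers);
        ArrayDeque<Integer> stack = new ArrayDeque<>();
        machine.record(stack, "push", deque -> deque.push(1));
        machine.record(stack, "push", deque -> deque.push(2));
        machine.record(stack, "pop", deque -> deque.pop());
        machine.record(stack, "pop", deque -> deque.pop());
        machine.transitions().forEach(System.out::println);
    }
}
```

With the single observer `isEmpty`, many concrete stack contents collapse into two abstract states, which is what keeps the extracted machine small enough to inspect.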
Regression Testing: After a program is modified, traditional
regression testing compares the outputs of the same test on the new and
old program versions to assure that no regression faults are
introduced; however, behavioral differences between two program
versions are often not propagated to observable outputs. To address
this problem, we have characterized program behavior using value
spectra, which capture internal program states during a test execution,
and then compared the value spectra of the old and new versions to
detect internal behavioral differences [TSE 05]. Experimental results
showed that our approach
can effectively expose behavioral differences between versions even
when their program outputs are the same. We developed an automatic
approach for augmenting an automatically generated unit-test suite
with regression oracle checking by adding assertions to the suite
[ECOOP 06]. The augmented test suite has an improved capability of
guarding against regression faults. We also developed approaches for
differential test generation: generating test inputs that exhibit the
behavioral differences between two program versions [AST 07].
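Comparing internal states rather than outputs can be illustrated with a small sketch. It assumes a simplified stand-in for a value spectrum, in which each entry records the input and return value of an internal computation; the two program versions, the `clamp` label, and all method names are hypothetical and are not taken from [TSE 05].

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: a simplified spectrum of internal values per execution. */
final class ValueSpectra {

    /** One spectrum entry: an internal computation with its input and result. */
    static String entry(String function, int input, int result) {
        return function + "(" + input + ") -> " + result;
    }

    /** Returns the index of the first differing entry, or -1 if the spectra are equal. */
    static int firstDifference(List<String> oldSpectrum, List<String> newSpectrum) {
        int shared = Math.min(oldSpectrum.size(), newSpectrum.size());
        for (int i = 0; i < shared; i++) {
            if (!oldSpectrum.get(i).equals(newSpectrum.get(i))) {
                return i;
            }
        }
        return oldSpectrum.size() == newSpectrum.size() ? -1 : shared;
    }

    /** Hypothetical old version: clamps the input at 10 before comparing. */
    static boolean isLargeOld(int x, List<String> spectrum) {
        int clamped = Math.min(x, 10);
        spectrum.add(entry("clamp", x, clamped));
        return clamped > 5;
    }

    /** Hypothetical new version: clamps the input at 8 instead. */
    static boolean isLargeNew(int x, List<String> spectrum) {
        int clamped = Math.min(x, 8);
        spectrum.add(entry("clamp", x, clamped));
        return clamped > 5;
    }

    public static void main(String[] args) {
        List<String> oldSpectrum = new ArrayList<>();
        List<String> newSpectrum = new ArrayList<>();
        boolean oldOutput = isLargeOld(9, oldSpectrum);
        boolean newOutput = isLargeNew(9, newSpectrum);
        System.out.println("same output: " + (oldOutput == newOutput));
        System.out.println("first differing entry: " + firstDifference(oldSpectrum, newSpectrum));
    }
}
```

For the input 9, both versions return the same output, so an output comparison sees no change, while the spectra differ at the first entry.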
Dynamic Test/Program Behavior Inference: Inferred dynamic program behavior can help test
generation and test selection for inspection. We have investigated the
applications of the following types of inferred program
behavior: equivalence among object states and method
executions, axiomatic specifications, algebraic specifications, and
protocol specifications. The inferred program behavior not only helps
testing tools but also provides programmers with insights into test
executions: programmers can inspect a summary of all test executions
instead of a single test execution. This can further help
address the lack of test oracle problem.
Feedback Loop between Test Generation and Dynamic Behavior Inference: Specification-based test generation requires specifications a priori, but in practice, programmers often do not write down specifications. Dynamic behavior inference relies on a good-quality test suite to infer satisfactory behavior that approximates the specifications. There is thus a circular dependence between specification-based test generation and dynamic behavior inference. The feedback loop [FATES 03] that we have developed integrates these two activities by producing better tests and better approximated specifications automatically and incrementally. In addition, a by-product of the feedback loop is a set of selected tests for inspection, which indirectly tackles the lack of test oracle problem. The feedback loop allows programmers to enjoy the benefits of formal methods without the pain of writing specifications.
Related Publications: (Software Engineering Conferences) (Software Testing Researchers) Also see Tao Xie's publications.