SOFTWARE QUALITY
ASSURANCE
AUTHOR
PROF. DR.-ING. PETER LIGGESMEYER
This work is protected by copyright. All rights thus conferred, particularly the rights to copy and disseminate as well as the rights to translate and reprint the entire work or parts thereof, are reserved. Beyond the permissions regulated by copyright, no part of this work may be reproduced in any way (print, photocopy, microfilm, or any other process) nor processed, copied, or disseminated using electronic systems without the written permission of the University of Kaiserslautern.
Table of Contents
Glossary
About the Author
References
Objectives of the Textbook
1 Introduction
1.1 Learning objectives of this chapter
1.2 Introduction
1.3 Motivation
1.4 Definition of terms
1.5 Software quality assurance
1.6 Classification of testing, analysis, and verification techniques
1.6.1 Dynamic testing
1.6.2 Static analysis
1.6.3 Formal techniques: symbolic testing and formal verification
1.7 Problems for Chapter 1
2 Function–Oriented Testing
2.1 Learning objectives of this chapter
2.2 Introduction
2.3 Characteristics and aims of function–oriented tests
2.4 Functional equivalence class construction
2.4.1 Characteristics and objectives of functional equivalence class construction
2.4.2 Building equivalence classes
2.4.3 Evaluating functional equivalence class construction
2.5 State–based testing
2.5.1 Characteristics and aims of state–based testing
2.5.2 Description of state–based testing
2.5.3 Evaluation of state–based testing
2.6 Other function–oriented testing techniques
2.6.1 Transaction flow testing
2.6.2 Testing on the basis of decision tables or decision trees
2.7 Evaluation of function–oriented testing
2.8 Problems for Chapter 2
3 Control Flow Testing
3.1 Learning objectives of this chapter
3.2 Introduction
3.3 Characteristics and objectives of control–flow–oriented testing
3.4 Statement coverage test
3.4.1 Characteristics and objectives of the statement coverage test
3.4.2 Description of the statement coverage test
Glossary
All c–uses criterion: This criterion requires the execution of at least one
definition–clear path with respect to x from ni to every element of dcu(x, ni) for
every node ni and every variable x that is an element of def(ni).
All c–uses/some p–uses criterion: This criterion is fulfilled if for every node ni and every variable x that is an element of def(ni), a definition–clear path with respect to x from ni to every element in dcu(x, ni) is tested, or if at least one definition–clear path is tested with respect to x from ni to one element of dpu(x, ni) if dcu(x, ni) is empty.
All–defs criterion: This criterion requires a number of test paths such that for every node ni and for every variable x that is an element of def(ni), there is at least one definition–clear path with respect to x from ni to an element of dcu(x, ni) or dpu(x, ni). In other words, for every definition of every variable, at least one c–use or one p–use must be tested.
All du–paths criterion: For this criterion to be fulfilled, the requirement of the
all–uses criterion has to be extended so that all the du–paths are tested with
respect to the definitions of all variables.
All p–uses criterion: This criterion is fulfilled if for every node ni and every
variable x that is an element of def(ni), a definition–clear path is contained in the
tested paths with respect to x from ni to all elements of dpu(x, ni). Every
combination of every variable definition with its p–use should be tested.
All p–uses/some c–uses criterion: This criterion requires a definition–clear path
from every definition to every predicate use, and if it is not available, demands
that a corresponding path to a computational use be tested.
All–uses criterion: This criterion subsumes the all p–uses/some c–uses and the all c–uses/some p–uses criteria. It requires the testing of all definitions in combination with all reachable p–uses and c–uses.
Author: The creator of the product to be inspected who is responsible for
correcting the errors found during inspection.
Availability: The measure of the ability of an entity to be functional at a specified
point in time. It is expressed through the probability that the entity performs the
required function at a defined point in time and under defined operating
conditions.
Backward slicing: A method for generating a slice containing instructions that
influence an observed variable.
Boundary interior testing: An approach to testing where it is assumed that a
complete set of tests must test alternative paths through the top level of a program,
alternative paths through loops, and alternative boundary tests of loops.
Branch coverage test: A stricter testing technique than the statement coverage
test, whose objective is to execute all branches of the program to be tested, i.e.,
the run–through of all edges of the control flow graph.
Condition coverage test: A testing technique that handles the logical structure of
the conditions in the software.
Condition/decision coverage test: A testing technique that explicitly demands
branch coverage to be established in addition to simple condition coverage.
Control flow graph: A graph in which each node in the graph represents a basic
block, i.e. a straight–line piece of code without any jumps or jump targets; jump
targets start a block, jumps end a block, and directed edges are used to represent
jumps in the control flow.
Control flow testing techniques: Testing techniques that are based on the control structure or the control flow of the software to be tested.
Correctness: The degree of consistency between specification and program, or
the degree by which the program satisfies the customer’s expectations.
Data flow anomalies: Incorrect data flows through code in which the input data
is processed to determine interim results that are written into the memory to then
be reread and to finally be converted into output values.
Data flow attributed control flow graph: The control flow graph extended to
the data flow attributes (def, c–use, and p–use) is called data flow attributed
control flow graph.
Data flow tests: Test techniques in which test completeness is evaluated with
respect to coverage of data accesses.
dcu(x, ni): For a node ni and a variable x where x is an element of def(ni), it is the
set of all nodes nj that contain c–use(x) and for which a definition–clear path
exists with respect to x from node ni to nj.
Decision tables: Tables that represent the logic model of a problem to be solved
by software used to derive test cases.
Decision trees: The logic model of a problem to be solved by software represen-
ted in the form of a tree that is used to derive test cases.
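Each rule of a decision table maps one combination of conditions to an action, and each rule can be turned directly into a test case. The following sketch illustrates this with an invented login policy; the function `login` and its conditions are assumptions for the example, not taken from the text.

```python
# Illustrative sketch: a decision table for a hypothetical login policy,
# used to derive one test case per rule. Conditions and actions are
# invented for this example.

# Each rule maps a combination of conditions to the expected action.
decision_table = [
    # (valid_password, account_locked) -> expected action
    ((True,  False), "grant access"),
    ((True,  True),  "deny: locked"),
    ((False, False), "deny: bad password"),
    ((False, True),  "deny: locked"),
]

def login(valid_password, account_locked):
    # Hypothetical implementation under test.
    if account_locked:
        return "deny: locked"
    return "grant access" if valid_password else "deny: bad password"

# Every rule of the table becomes one test case.
for (valid, locked), expected in decision_table:
    assert login(valid, locked) == expected
print("all", len(decision_table), "rules tested")   # all 4 rules tested
```

Because the table enumerates all condition combinations, completeness of the derived test suite can be checked mechanically against the table.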
def(ni): The set of the globally defined variables for every node ni of the control
flow graph.
dpu(x, ni): The set of all edges (nj, nk) that contain p–use(x) and for which there exists a definition–clear path with respect to x from ni to (nj, nk).
du–path: A path p = (ni, …, nj, nk) with a global definition of x in ni for which the following holds true:
• p is definition–clear with respect to x, nk contains a c–use of x, and all nodes ni, …, nk are distinct, or ni, …, nj are distinct and ni = nk,
or
• p′ = (ni, …, nj) is definition–clear with respect to x, the edge (nj, nk) contains a p–use of x, and all nodes ni, …, nj are distinct.
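The sets dcu(x, ni) and dpu(x, ni) can be computed mechanically from a data flow attributed control flow graph. The following sketch illustrates this on an invented four-node graph; the node names, the def/use annotations, and the simplified treatment of definition-clear paths are assumptions made for the example, not material from the text.

```python
# Sketch: computing dcu(x, ni) and dpu(x, ni) on a small, invented
# control flow graph. succ maps nodes to successors; defs, cuse map
# nodes to defined / computationally used variables; puse maps edges
# to predicatively used variables.

def definition_clear_targets(succ, defs, start, x):
    """Nodes reachable from `start` via paths on which no intermediate
    node redefines x (the definition observed is the one in `start`)."""
    reachable, seen = set(), set()
    stack = list(succ.get(start, []))
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        reachable.add(n)
        if x not in defs.get(n, set()):      # a redefinition kills the path
            stack.extend(succ.get(n, []))
    return reachable

def dcu(succ, defs, cuse, x, ni):
    """All nodes nj containing a c-use of x that are reachable from ni
    on a definition-clear path with respect to x."""
    return {n for n in definition_clear_targets(succ, defs, ni, x)
            if x in cuse.get(n, set())}

def dpu(succ, defs, puse, x, ni):
    """All edges (nj, nk) containing a p-use of x whose start node nj is
    reachable from ni on a definition-clear path with respect to x."""
    ok = definition_clear_targets(succ, defs, ni, x) | {ni}
    return {(nj, nk) for (nj, nk), vars_ in puse.items()
            if x in vars_ and nj in ok}

# Example graph: n1: x := ...; n2: if (x > 0) then n3 else n4; n3: y := x + 1
succ = {"n1": ["n2"], "n2": ["n3", "n4"], "n3": [], "n4": []}
defs = {"n1": {"x"}, "n3": {"y"}}
cuse = {"n3": {"x"}}
puse = {("n2", "n3"): {"x"}, ("n2", "n4"): {"x"}}

print(dcu(succ, defs, cuse, "x", "n1"))   # {'n3'}
print(dpu(succ, defs, puse, "x", "n1"))   # both edges leaving n2
```

Note that this sketch treats only intermediate redefinitions as killing a path; more refined formulations distinguish uses and definitions within one basic block.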
Dynamic testing: Testing techniques in which the software is executed with concrete input values. The aim is to generate test cases that are representative, error–sensitive, economical, and have low redundancy.
Equivalence class testing: A method of testing where the complexity of a problem is reduced through successive division in such a manner that the results are equivalence classes from which elementary test cases can be constructed. All the values of an equivalence class are processed similarly by the software.
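Since all values of a class are assumed to be processed alike, one representative per class suffices as an elementary test case. The sketch below illustrates this for an invented score-validation function; the function, the class boundaries, and the representatives are assumptions for the example.

```python
# Illustrative sketch (not from the textbook): deriving test cases from
# equivalence classes for a hypothetical function that validates an
# exam score in the range 0..100.

def classify_score(score):
    if not isinstance(score, int):
        return "invalid type"
    if score < 0 or score > 100:
        return "out of range"
    return "pass" if score >= 50 else "fail"

# One representative value per equivalence class; every value of a class
# is assumed to be processed in the same way by the software.
equivalence_classes = {
    "valid, fail":  25,     # 0 <= score < 50
    "valid, pass":  75,     # 50 <= score <= 100
    "below range":  -5,     # score < 0
    "above range":  150,    # score > 100
    "wrong type":   "abc",  # non-integer input
}

for name, representative in equivalence_classes.items():
    print(name, "->", classify_score(representative))
```

Five test cases thus cover the whole input domain at the granularity of the chosen classes; boundary values (0, 49, 50, 100) would typically be added on top.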
Error: The wrong understanding of a specific programming language construct
which may cause a faulty command in the program and which, during the
program’s execution, may exhibit a deviation from the expected behavior or cause
a failure.
Failure: A malfunction that occurs dynamically during product usage.
Fault: A cause for a malfunction or failure that is statically contained in the
program code.
Formal inspection methods: Inspections that are carried out during exactly
defined phases with defined input and output criteria, a defined inspection rate,
and predefined goals to be achieved by different inspectors.
Forward slicing: A method for generating a slice that indicates which
instructions are influenced by a specific variable and how this influence takes
place.
Functional equivalence class testing: Technique that derives test cases from
equivalence classes of input and output values based on the specification of the
system being tested.
Functional testing: Testing techniques that test against specifications describing
the desired software functionality.
Informal review techniques without review meetings: Techniques that require
no review meetings and form the lower end of the scale in manual analysis
methods with respect to efficiency and effectiveness, but require very little effort.
Minimal multiple condition coverage test: A testing technique demanding that
along with the atomic partial decisions and overall decisions, all composite
decisions should be tested against the values true and false.
Moderator: A specialist with training as a moderator of inspections whose main
job is to create synergy within the inspection team.
Modified condition/decision coverage test: A testing technique that requires test
cases to demonstrate that every atomic partial decision can influence the logical
value of the overall decision, independent of the other partial decisions.
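The independence requirement of this criterion can be demonstrated by finding, for each atomic condition, a pair of input vectors that differ only in that condition yet flip the overall decision. The decision function and the helper below are invented for illustration, not a standard API.

```python
# Sketch of the MC/DC idea on the invented decision a and (b or c).
from itertools import product

def decision(a, b, c):
    # Hypothetical decision with three atomic partial decisions.
    return a and (b or c)

def mcdc_pairs(fn, n):
    """For each of the n atomic conditions, find two input vectors that
    differ only in that condition yet yield different overall decisions,
    i.e. the condition independently influences the outcome."""
    pairs = {}
    for i in range(n):
        for v in product([False, True], repeat=n):
            w = tuple(not x if j == i else x for j, x in enumerate(v))
            if fn(*v) != fn(*w):
                pairs[i] = (v, w)
                break
    return pairs

pairs = mcdc_pairs(decision, 3)
for i, (v, w) in pairs.items():
    print(f"condition {i}: {v} -> {decision(*v)}, {w} -> {decision(*w)}")
```

For this decision, every condition has such an independence pair, so MC/DC is achievable with a small subset of the 2³ = 8 combinations that the multiple condition coverage test would demand.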
Multiple condition coverage test: A testing technique that demands the test of all
logical combinations of the atomic partial decisions.
Path coverage test: A comprehensive control flow test technique that requires the
execution of all the different paths of a software module.
Principle of Integrated Quality Assurance: A method for quality assurance in
which, for a software development process structured into phases, constructive
methods are applied in each phase, and at the end of each phase, the quality of the
intermediate product is evaluated using analytical methods.
Quality characteristic: A property of a functional unit on the basis of which the quality is described and evaluated without indicating values.
Quality measure: A measure that permits inference on specific values and helps to check whether specification–related quality characteristics have been met.
Quality requirement: A requirement or a set of requirements that specifies one or several quality characteristics of a component or a system.
Quality target specification: A document in which the quality requirements are
defined.
Reader: A person who guides the inspection team through the meeting by reading
out the technical content, explaining it at the same time.
Recording clerk: A person who writes down and classifies all errors found
during inspection and also assists the moderator in preparing the inspection
reports.
Reliability: This is a collective term used to describe performance with respect to
availability, maintainability, and maintenance support, which can be defined as
the quality of an entity with respect to its ability to satisfy its reliability
requirements during or after pre–defined time spans under pre–defined application
conditions.
Robustness: The capability of an entity to function in a defined manner and
produce acceptable reactions under unusual operating conditions.
Safety: The state in which the risk of personal or material damage is within
tolerable limits or a measure for the ability of an item to not endanger people,
property, or the environment.
Simple condition coverage: A testing technique that requires the test of all
atomic component decisions against true and false.
Slicing: A method of testing that generates software slices, which are pieces of
software that can be examined for detecting errors.
Software Quality Assurance: Techniques for achieving a desired level of quality
for software applications.
State–based testing: Test techniques that use state charts to generate test cases.
Statement coverage test: The simplest control flow test method, whose purpose
is a minimal one–time execution of all statements of the program, i.e., coverage of
all nodes of the control flow graph.
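The difference between statement coverage and the stricter branch coverage can be made concrete with a function containing an if without an else. The tracing helper and the example function below are invented for illustration; this is a sketch, not a production coverage tool.

```python
# Illustrative sketch (invented example): a single test input executes
# every statement of absolute(), yet branch coverage additionally
# requires an input that takes the empty 'false' edge of the if.
import sys

def executed_lines(fn, *args):
    """Trace which lines of fn's body (as offsets from the def line) run."""
    lines = set()
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is fn.__code__:
            lines.add(frame.f_lineno - fn.__code__.co_firstlineno)
        return tracer
    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)
    return lines

def absolute(x):
    if x < 0:
        x = -x
    return x

# x = -3 runs the if-line, the assignment, and the return: all statements
# are covered by this single test case.
print(sorted(executed_lines(absolute, -3)))   # [1, 2, 3]
# x = 5 exercises the 'false' edge of the decision, which statement
# coverage alone does not demand.
print(sorted(executed_lines(absolute, 5)))    # [1, 3]
```

Statement coverage is thus satisfied by the first input alone, while branch coverage (coverage of all edges of the control flow graph) forces the second input as well.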
Static analysis: A testing technique where the software to be tested is not
executed and which can, in principle, be performed manually, but which is, in
practice, done using a tool, except in review techniques.
Static code analysis: An approach for testing software without executing it.
Structured testing: A testing approach in which all paths through a functional
module that require less than or equal to k (usually k >=2) iterations of loops are
tested at least once.
Structured walkthrough: This is an informal review technique with review
meetings where no formal procedure and no defined roles for team participants
exist. These reviews have lower efficiency and effectiveness than formal
inspections but the time consumed, the cost, and the use of resources are also less
than those for formal inspection methods.
Structure–oriented testing techniques: Dynamic testing techniques that assess
the completeness of testing using coverage of the source code.
Style analysis: An analysis that checks whether programmers adhere to a set of coding rules, also called “programming conventions”.
Transaction flow testing: Test techniques that use a transaction flow represented
by flow diagrams, sequence diagrams, or message sequence charts to generate test
cases.
About the Author
Prof. Liggesmeyer is the author of numerous scientific articles and popular books,
particularly the standard reference work “Software Quality” (2002, 2nd edition
2009). He is one of the six authors of the National Roadmap Embedded Systems
(NRMES). He also participates regularly in many national and international
program committees. He has been co–editor of various professional journals,
including “Informatik–Spektrum”, “Informatik Forschung & Entwicklung” (both
Springer–Verlag), “information technology” (Oldenbourg–Verlag), and “Lecture
Notes in Informatics” (GI).
Prof. Liggesmeyer is a member of the scientific steering board of SafeTRANS,
supervisory board member of Schloss Dagstuhl, a member of the scientific
advisory board for the platform “Industry 4.0”, and advisor of the cluster project
“fast”. He is also a member of the executive board of the German Chapter of the
ACM as well as a member of the expert advisory board of the quality seal
“Software made in Germany” and a board member of the ZukunftsRegion
Westpfalz e.V. (Western Palatinate Regional Association). Prof. Liggesmeyer
provides consulting to several institutions on the state and federal level. He is a
member of the platform “Security, Protection and Trust for the Society and the
Economy” of the Federal Ministry of the Interior as well as a member of the
platform “Digitization in Education and Science” of the Federal Ministry of
Education and Research in the German National IT Summit. Furthermore, Prof.
Liggesmeyer is active on the advisory board of the “Alliance for Cyber Security”
of the Federal Office for Information Security. Since 2016, he has been a member
of the State Council for Digital Development and Culture of the State of
Rhineland–Palatinate. In 2015, he was also the chairman of the Commission of
Experts for the Bavarian Center for Digitization (ZD.B). From 2011 to 2014, he
was a member of the University Council at the Darmstadt University of Applied
Sciences by order of the Hessian Ministry for Science and Arts. Furthermore, he
was a liaison officer of the German National Academic Foundation (2009 –
2014).
For more than 25 years, Prof. Liggesmeyer has been in frequent demand as a
lecturer for industry seminars and has been advising many leading companies on technology issues. He leads several strategic projects at the University
of Kaiserslautern that are funded by the Federal Ministry of Education and
Research (BMBF) and by the EU. He is a member of the Center for Digital
Commercial Vehicle Technology (ZNT) and of the research initiative “amsys”.
His research interests are safety and reliability analysis techniques for cyber–
physical systems and comprehensive security and safety analysis processes for
smart ecosystems, primarily in the application fields of digital commercial vehicle
technology, Industry 4.0, and “Smart Rural Areas”. Prof. Liggesmeyer is co–
author of several patents.
References
References for Chapter 1
DeMarco, T.: Structured Analysis and System Specification. Englewood Cliffs:
Prentice Hall, 1985
Booch, G., Rumbaugh, J., Jacobson, I.: The Unified Modeling Language User
Guide. Reading: Addison–Wesley, 1998
Informationstechnik; Bewertung von Software–Produkten; Qualitätsmerkmale
und Richtlinien für deren Anwendung. Berlin: Beuth Verlag, December 1991
IEEE Standard Glossary of Software Test Documentation. IEEE, New York 1983
Qualitätsmanagement – Begriffe. (ISO 8402: 1994); Dreisprachige Fassung EN
ISO 8402: 1995, Berlin: Beuth Verlag, August 1995
Birolini, A.: Zuverlässigkeit von Geräten und Systemen. Berlin, Heidelberg, New
York: Springer, 1997
Normen zu Qualitätsmanagement und zur Darlegung von Qualitätsmanagement-
systemen; Leitfaden zum Management von Zuverlässigkeitsprogrammen. Berlin:
Beuth Verlag, June 1994
Liggesmeyer, P., Rothfelder, M.: System Safety Improvement by Automated Soft-
ware Robustness Evaluation. Proceedings 15th International Conference on
Testing Computer Software, Washington, D.C., 71–77, June 1998
Leveson, N. G.: Safeware: System safety and computers. New York: Addison–
Wesley, 1995
Communications of the ACM, Vol. 33, No. 12, December 1990
Leveson, N. G., Turner, C. L.: An Investigation of the Therac–25 Accidents. IEEE
Computer, 26(7), 18–41, July 1993
Hohlfeld, B.: Zur Verifikation von modular zerlegten Programmen. Dissertation,
Fachbereich Informatik, Universität Kaiserslautern, 1988
Lions, J.L.: ARIANE 5 Flight 501 Failure, Report by the Inquiry Board. Paris,
July 1996
Musa, J. D., Iannino, A., Okumoto, K.: Software Reliability: Measurement,
Prediction, Application. New York: McGraw–Hill, 118, 1987
Jones, C.: Applied software measurement. New York: McGraw–Hill, 142–143,
1991
Boehm, B.: Software engineering economics. Englewood Cliffs: Prentice Hall,
534, 1981
Tai, K. C.: Program Testing Complexity and Test Criteria. IEEE Transactions on
Software Engineering, Vol. SE–6(6), 531–538, November 1980
Howden, W. E.: A Survey of Dynamic Analysis Methods. Tutorial: Software
Testing and Validation Techniques, New York: IEEE Computer Society Press,
209–231, 1981
Ebert, C.: Improving the Validation Process for a Better Field Quality in a Product
Line Architecture. Proceedings Informatik 2000, Berlin, Heidelberg, New York:
Springer, 372–388, 2000
Russell, G. W.: Experience with Inspection in Ultralarge–Scale Developments.
IEEE Software, 8(1), 25–31, January/February 1991
Schnurrer, K. E.: Programminspektionen: Erfahrungen und Probleme. Informatik–
Spektrum, Band 11, 312–322, 1988
Kosman, R. J., Restivo, T. J.: Incorporating the Inspection Process into a Software
Maintenance Organization. Proceedings Conference Software Maintenance,
October 1992
Yourdon, E.: Structured Walkthroughs (4th ed.). Englewood Cliffs: Prentice Hall,
335–347, 1989
1 Introduction
1.2 Introduction
Every organization that develops software endeavors to deliver the best quality.
One can only attain a goal when it is precisely defined, which is not the case for
the term ‘the best quality’. There are many ways to define software quality. Many
attributes of software contribute towards its overall quality. Not all of these
attributes are equally important to the end user and to the manufacturer. Certain
attributes that are absolutely essential for a particular software product may be
totally irrelevant for another product. Certain attributes in the system have
negative interdependencies. The argument that one wants to implement the best
quality demonstrates that this fundamental fact is not understood. The aim is not
software development with the best, but with the right quality. This demands that
quality requirements be defined in a so–called quality target specification.
Subsequently, measures aimed at attaining the specified quality can be
determined. As a rule, a combination of techniques and organizational means is required.
1.3 Motivation
With the growing penetration of computers in many application areas, assuring
reliability and correctness of software functionality has gained more importance.
Cost development exhibits a clear rise in software costs in comparison to
hardware costs, even though the life expectancy of software is significantly
longer. Cost–cutting in the area of software development would therefore be
especially economical. Analyzing the costs of software development depending
on the phase of the software development lifecycle leads to the following
conclusion: By far the greatest proportion of the expenditure arises during the
maintenance phase, after introduction to the market. This is a result of inadequate
software quality caused by errors originating from the software development
phases that are first discovered during field use. If, in addition, a software product
is badly structured, inadequately documented, and difficult to understand, the
correction of errors becomes a time–consuming venture. Likewise, the growing
complexity of software products contributes towards this tendency. The
localization of errors and their elimination is difficult. In particular, in
inadequately structured software, error correction in one part of the system may
result in further errors due to interdependencies with another part of the system. If
the error is caused during the early phases of software development, for example
in the specification of requirements or during the conceptual phase, extensive
modifications often become necessary. This causes high maintenance costs; at the
same time, employees are tied up who are required elsewhere.
There are, however, methods for building software that is reliable, understandable, and
easy to modify. These methods are either constructive or analytical. In practice,
functional decomposition and object–oriented software development methods are
widely used. Examples are Structured Analysis (SA) /DeMarco 85/, Structured
Design (SD), and the Unified Modeling Language (UML) /Booch et al. 98/.
However, no constructive procedure can guarantee error–free products. For this reason, the quality needs to be tested with analytical methods. If the software development process is structured into phases, constructive methods can be applied in each phase, and at the end of each phase, the quality of the intermediate products can be evaluated with analytical methods (Fig. 1.1). This is the so–called “Principle of Integrated Quality Assurance”. An intermediate product whose
quality is inadequate would not be forwarded to the next phase until a defined,
satisfactory quality level has been attained. This procedure facilitates early
detection of errors and enables their correction at a time when this is possible with
minimal expense. The aim is to detect as many errors as possible during the tests
that immediately follow the development step where the error originated. The aim
is to minimize the number of errors persisting over several phases. The quality
assurance process starts with the first intermediate product in order to detect
quality deviations early and facilitate actions to eliminate them. Merely testing a
‘complete’ program is not enough. There is an interdependency between
constructive and analytical quality assurance. Ideally, each constructive step
should be followed by an analytical one.
The early definition of quality requirements is of crucial importance. This quality
target specification requires defining the desired target values of so–called quality
characteristics at an early stage of the development process. Software testing
occupies a central position in software development. Every programmer would
run newly created software and make a few inputs in order to examine the
generated outputs and compare them with the expected results. He would also
read the printout of the program code to spot bugs. Basically, the first procedure is the application of an unsystematic dynamic test, and the second approach resembles the manual execution of an unsystematic static analysis. On
the one hand, testing methods – in their simpler form – are very common. On the
other hand, the various techniques as well as their systematic and their productive
efficiency are widely unknown. The objective of testing must be the generation of
reproducible results applying a clearly defined procedure. Unfortunately, in
normal practice, this is often not done. In this course, techniques and methods for
analytical software quality assurance will be discussed, with a focus on testing
techniques. Alongside dynamic tests, automated and manual static analyses
will be introduced.
1.4 Definition of terms

[Figure: development phases and their associated error types, e.g. design (UML design errors) and coding (C++ programming errors).]
Quality:
The term ‘quality’ is defined in /DIN EN ISO 9000:2005/ as the degree to which a set of inherent characteristics fulfills requirements. An early commitment to quality is important.

Quality Requirement:
A quality requirement is a need or expectation that is stated, generally implied, or obligatory. It specifies one or several quality characteristics of a component or a system.
Quality Characteristic:
The concrete specification and evaluation of quality is done by using so–called
quality characteristics. They represent properties of a functional unit on the basis
of which the quality is described and evaluated without indicating values. A
quality characteristic can be detailed over several steps by sub–characteristics.
There are various proposals for quality characteristics that are based on
publications by various authors. Examples of these characteristics are security,
reliability, availability, robustness, memory and runtime efficiency, flexibility,
portability, testability, and usability. While quality characteristics such as
flexibility, portability, and testability are important to the manufacturer, factors
such as security, reliability, availability, memory, and runtime efficiency as well
as usability are important to the end user. For this reason, they are relevant for the
manufacturer in the interest of customer satisfaction. The ISO/IEC standards 9126 /ISO IEC 9126 91/ and 25010 /ISO IEC 25010 11/ describe several quality characteristics. Normally, techniques used to assess the quality characteristic ‘correctness’ are distinguished from those used to verify other quality characteristics such as safety or reliability. In some publications, one can find a differentiation between functional and non–functional characteristics.
Interdependencies and correlations can exist between different quality
characteristics. Safety and availability are frequently antipodal optimization
targets. Safe systems are created in such a manner that component failures can be
identified, e.g., through redundancy, and the system transitions into a safe state.
The normal functionality of the system is not provided in this state, which reduces
its availability. Safe systems, e.g., dual–channel or 2–out–of–2 systems, generally have
lower availability in comparison to a single–channel structure. Software can be
optimized either in terms of memory efficiency, i.e., for minimum memory
demand, or in terms of runtime efficiency, i.e., for minimum runtime demand.
Often, an improvement of one of the quality characteristics results in a
deterioration of the other. Therefore, it does not make sense to demand optimum
quality in every respect. This is principally impossible because of the
interdependencies among the various quality characteristics. The specification of
target values for individual quality characteristics needs to be performed at an
early stage of the development process. This is done by carrying out a so–called
quality target definition.
6 Chapter 1 – Introduction
Quality Measure:
The concrete commitment to specifications of a quality characteristic occurs
through quality measures. These are measures that permit inference on specific
values. For example, MTTF (Mean Time to Failure) can be used as a measure of
the quality characteristic reliability.
Failure:
A malfunction or failure occurs dynamically during product usage. During dynamic testing of software, one does not detect faults but malfunctions or failures. These are a result of faults in the program. In hardware, failures may also result from wear and tear. IMPORTANT: The terms failures, faults, and errors may be defined differently in some publications.

Fault:
In software, a defect or fault constitutes a cause for malfunctions or failures: Faults are statically contained in the program code. In hardware, too, design flaws are possible (e.g., under–dimensioning of the cooling of a power semiconductor).
Error:
An error can be the result of a wrong understanding of a particular programming
language construct. Errors, faults, and failures are often related to each other.
Likewise, an error – the wrong understanding of a specific programming language
construct – may cause a faulty command in the program, which during its
execution then exhibits a different result than desired, namely a failure. The term
“error” is often used as a synonym for fault.
Correctness:
An IEEE Proposal /ANSI 729 83/ defines correctness as the degree of consistency
between specification and program, or as the degree by which the program
satisfies the customer’s expectations. Such definitions are geared towards
practice. They are not optimally suited for defining criteria for testing correctness
because their definition is already vague. As an example, the degree of customer
satisfaction depends to a large extent on the concrete user group surveyed. An
occasional user of a system will typically have requirements that are different
from those of a professional/frequent user. Among experts, the following
definition of the term correctness has been established:
• Correctness possesses no gradual character, i.e., software is either correct or not correct (correctness is a binary property).
• Error–free software is correct.
• Software is correct if it is consistent with its specifications.
• In the absence of a specification, no verification of correctness is possible.
Since with this definition, a program is regarded as incorrect if there is only a
minimal deviation from the specification, one may assume that there is no correct
program above a certain level of complexity. On the other hand, it is normal in
Safety:
/Birolini 97/ defines safety as a measure for the ability of an item to endanger
neither persons, nor property, nor the environment. One can distinguish the safety
of a failure–free system (accident prevention) from the technical safety of a
failure–prone system. Besides, there is the quality characteristic ‘security’. It
includes the attribute of a system to prevent loss of information and unauthorized
data access.
Reliability:
DIN EN ISO 8402 /DIN EN ISO 8402 95/ defines reliability as a collective term for describing performance with respect to availability, maintainability, and maintenance support. In DIN ISO 9000 Part 4 /DIN ISO 9000–4 94/, reliability is defined as the quality of an entity with respect to its ability to satisfy its reliability requirements during or after pre–defined time spans under pre–defined application conditions. Reliability in this comprehensive sense is referred to as dependability; dependability is thus not the same as reliability in the narrower sense used below.
/Birolini 97/ uses a restricted definition of reliability. Here, it is the measure of the
ability of an entity to remain functional, expressed through the probability that a
required function can be performed failure–free under defined operating
conditions for a specified period of time. This definition allows defining reliability
with the help of stochastic techniques. Thus, the expected value of the time span
until system failure – the so–called Mean Time to Failure (MTTF) – is specified
as the measure of reliability. In repairable systems, one uses the expected value
of the time span between two consecutive failures – the so–called Mean Time Between Failures
(MTBF). Alternatively, the probability of survival R(t) can be used. This is the
probability that the system will still be running at time t after it was started at time
zero. In the following, reliability will be used in this narrower sense. Reliability
thus describes the probability of a failure–free functionality of an entity over a
specified time span and under specified operating conditions. It can be described
with stochastic methods. Reliability is generally not constant. In software, it
grows through error corrections and increasing stability, and it decreases through
new errors introduced when functionality is extended. For hardware components,
wear and tear must be taken into account. The observed reliability also depends on
how a system is used. It is often possible to generate reliability predictions using
statistical procedures.
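Under the additional assumption of a constant failure rate λ = 1/MTTF (the exponential model – a common but by no means universal choice that the text above does not prescribe), R(t) takes a simple closed form. A minimal sketch:

```python
import math

def survival_probability(t, mttf):
    """R(t): probability of failure-free operation until time t,
    assuming a constant failure rate lambda = 1 / MTTF."""
    failure_rate = 1.0 / mttf
    return math.exp(-failure_rate * t)

# Example: a component with an MTTF of 10,000 hours.
r_1000h = survival_probability(1000.0, 10000.0)   # ≈ 0.905
```

Under this assumption, the MTTF is simply the reciprocal of the failure rate; for non-constant failure rates (e.g., reliability growth during error correction), more elaborate stochastic models are needed.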
Availability:
Availability is a measure of the ability of an entity to be functional at a specified
point in time. It is expressed through the probability that the entity will perform
the required function at a defined point in time and under defined operating
conditions.
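For repairable systems, the steady-state availability is commonly computed from the MTTF and the mean time to repair (MTTR); this standard reliability-engineering formula is not stated in the text above and is added here only as an illustration:

```python
def steady_state_availability(mttf, mttr):
    """A = MTTF / (MTTF + MTTR): the long-run fraction of time
    the entity is able to perform its required function."""
    return mttf / (mttf + mttr)

# Example: MTTF = 999 h, MTTR = 1 h  ->  availability of 0.999
availability = steady_state_availability(999.0, 1.0)
```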
Robustness:
Robustness is gradual:
Robustness means that an entity is capable of functioning in a defined manner and
producing acceptable reactions under unusual operating conditions /Liggesmeyer,
Rothfelder 98/. An entity that is correct per its specification may therefore still have
low robustness. According to the above definition, a system is correct if it
correctly performs all the functions described in its specification. If there are
situations that are not included in the specification, the system may fail, for
example. Even then it is considered correct. Such cases are typically unusual
operating situations, e.g., faulty or contradictory input data. Robustness is actually
a property of the specification rather than one of the implementation. A robust
system is the result of the correct implementation of a specification that specifies
a defined response even in abnormal situations. Since one cannot include every
conceivable case while setting up the specification, robustness also has a gradual
character. It is important to note that increasing robustness requires additional
development effort, which arises from the creation of specifications, realization,
and testing. As a result of this, it makes sense to set the required degree of
robustness early in the development by creating appropriate specifications.
ignition, Ariane 5 attained a horizontal speed of more than 32768 internal units.
This value, which was converted into a signed integer variable in Ada, caused an
overflow that was not handled. There was hardware redundancy, but the second
channel ran the same software and thus experienced the same problem. As a
consequence of a complete breakdown of the onboard navigation, diagnosis data
was sent to the host computer, which interpreted it as flight data. Absurd control
commands were given. The rocket threatened to break apart and exploded. The
software had been reused from Ariane 4 and had worked there without any
problems.
Empirical data show that software errors are not exceptions. /Musa et al. 87, p.
118/ found that the average software at the beginning of module testing contains
approx. 19.7 errors per thousand lines of source code. At the beginning of system
testing, there exist about six errors per thousand lines of source code in the
product. According to this research, released software contains about 1.5 errors
per thousand lines of source code. At first, this appears to be a low value. It is
clear, however, that today, large software products consist of a few million lines
of source code. Such a product can thus contain thousands of errors. Empirical
data on software errors and their dependency on software measures are contained
in /Fenton, Ohlsson 00/.
Figure 1.2: Relative share of development expenditure
Jones /Jones 91, pp. 142–143/ published data that shows a relatively clear increase
in the expenditure for analytical quality assurance, depending on the total
development expenditure of a project (Fig. 1.2). According to these data, in very
small projects, software coding represents the largest share of the expenditure. In
the largest project presented by Jones /Jones 91/, analytical quality assurance
accounted for 37% of the total development expenditure, while coding only
required 12%.
A renowned research work done by Boehm /Boehm 81, p. 534/ arrives at the result
that the ratio between maintenance and development expenditure of software
projects is between 2:1 and 1:1. Because maintenance essentially aims at
eliminating every error that was not detected by the time of the product’s release,
it makes sense to attribute maintenance expenditure largely to analytical
quality assurance. It is clear that in many cases, the largest share of development
expenses is spent on quality assurance.
Figure 1.3: Error count and error correction costs

                                      Analysis  Design  Coding  Module test  System test  Field use
Relative count of emerging errors       10%      40%     50%        –            –           –
Relative count of detected errors        3%       5%      7%       50%          25%         10%
Cost per error correction (€)           250      250     250      1000         3000       12500
technique and a few of the control flow testing techniques. In the class of
diversified techniques, the practical relevance of regression tests is especially
high.

Function–oriented test ≠ black–box test:
Function–oriented testing techniques are black–box techniques, even though not
all black–box techniques are function–oriented. Functional testing and black–box
testing are therefore not synonyms. The back–to–back test, for example, is a
black–box technique that is not function–oriented. By diversifying the testing
techniques used, the correctness of the test results can be evaluated by comparing
the results from several versions of the software.
Figure 1.5: Principle of structure–oriented tests (the test cases are derived from
the structure of the software, shown as a control flow graph with the nodes nstart,
n1, …, n6, nfinal; the tester evaluates the outputs against the specification)
2 Function–Oriented Testing
2.2 Introduction
In the following, dynamic tests will be described, which are used to assess the
completeness of a test by determining the degree to which the specification is
covered by the test cases. As specifications describe the desired software
functionality, testing techniques that test against specifications are called
function–oriented testing techniques. Function–oriented testing techniques are
absolutely essential. Standards always demand meticulous function–oriented test
planning. This is a direct requirement for the use of systematic function–oriented
testing techniques. Function–oriented tests take place in all test phases. In unit
testing, module specifications are considered to be the basis for the test. In
integration tests, interface specifications form the test reference. In system tests,
the test is carried out against the requirements definition. A test without systematic
function–oriented test planning is to be regarded as being inadequate. This chapter
introduces important function–oriented testing techniques that are used to develop
test cases from specifications.
Figure 2.1: Principle of function–oriented tests (the test cases are derived from
the specification; their completeness is judged against an image of the
specification, and the tester evaluates the outputs of the software)
exists requiring an error handling routine. These are called invalid equivalence
classes. Equivalence classes for output conditions are formed likewise by means
of case differentiation based on the specification. Once the output equivalence
classes are available, the input values leading to outputs in the respective output
equivalence classes must be determined. The selection of concrete test data from
an equivalence class can be made according to different criteria. A frequently
applied approach is the equivalence class boundary test. This heuristic process is
known as boundary value analysis. It is based on the experience that errors occur
more frequently on the boundary of equivalence classes. Another possible
approach is the test of particular values, for example, the value 0, or a stochastic
procedure. The creation of valid and invalid equivalence classes can already be
found in /Myers 79/. If an input provides for a range of values, this range
constitutes a valid equivalence class that may have to be segmented or limited at
its upper and lower end by introducing invalid equivalence classes.
In detail, the following rules are used to form equivalence classes /Myers 79/:
• If an input condition specifies a value range, one valid and two invalid
equivalence classes are to be formed (see the above example).
Example 2.2: Car, car owners
For example, between one and six owner(s) can be registered for one car.
One valid equivalence class:
• 1–6 owner(s).
Two invalid equivalence classes:
• No owner.
• More than 6 owners.
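For Example 2.2, boundary value analysis would pick the test data directly at the class edges. A small sketch (the validity check is a hypothetical implementation, not part of the original example):

```python
def owner_count_is_valid(n):
    """Hypothetical validity check for Example 2.2: 1 to 6 owners per car."""
    return 1 <= n <= 6

# Boundary value analysis selects the test data at the class edges:
boundary_test_data = {
    "valid, lower boundary": 1,
    "valid, upper boundary": 6,
    "invalid, no owner":     0,   # from the invalid class "no owner"
    "invalid, above range":  7,   # from the invalid class "more than 6 owners"
}
```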
• If an input condition specifies a set of values that are to be processed
differently, a separate valid equivalence class has to be formed for each value.
For all other values except the valid ones, one invalid equivalence class is to be
formed.
Example 2.3: Musical instruments
Keyboard instruments: piano, harpsichord, spinet, organ
Four valid equivalence classes:
• Piano
• Harpsichord
• Spinet
• Organ
One invalid equivalence class:
• All others, e.g., violin
• If elements of an equivalence class can be further classified into sub–categories
that need to be processed differently, the equivalence class needs to be
partitioned accordingly. For example, the elements of the equivalence class piano
• Test cases for invalid equivalence classes are constructed by selecting test
data from an invalid equivalence class. These are combined with values
that are selected from valid equivalence classes. Because an error handling
routine must exist for all invalid input values, the input of one erroneous
value per test case tests exactly the corresponding error handling routine. If
several erroneous values were entered in one test case, it could not be determined
which of them invoked the error handling routine.
Example 2.6: Inventory control program
An inventory control program in a shop contains an input possibility for the
registration of deliveries. For example, when wooden boards are delivered, the
wood type is entered. The program recognizes the wood types oak, beech, and
pine. Furthermore, the length – with a value between 100 and 500 – is entered in
centimeters. Another input signifies the number of delivered items, which must be
between 1 and 9999. In addition, each delivery carries an order number. Every
order number for the delivery of wood begins with the letter W.
With the above description and the application of the given rules, the
equivalence classes are constructed as shown in Tab. 2.1.
Table 2.1: Equivalence classes for the inventory control program

Input             Valid equivalence classes    Invalid equivalence classes
Type of wood      1) Oak                       4) All other types, e.g., steel
                  2) Beech
                  3) Pine
Length            5) 100–500                   6) Less than 100
                                               7) More than 500
Number of items   8) 1–9999                    9) Less than 1
                                               10) More than 9999
Order number      11) Begins with ‘W’          12) Does not begin with ‘W’
Tab. 2.2 represents a complete set of test cases. The test data is chosen from an
equivalence class without the application of a concrete strategy. The resulting test
suite consists of nine test cases. Three test cases are required to cover all valid
equivalence classes. Six test cases are needed to cover the invalid classes. Each
invalid equivalence class is covered separately by a test case. The invalid values
of the parameters are marked gray in the table.
Table 2.2: Test cases according to equivalence class analysis

                 Test cases with valid values   Test cases with invalid values
Test case        1      2      3                4      5      6      7      8      9
Type of wood     Oak    Beech  Pine             Steel  Beech  Beech  Beech  Beech  Beech
Order number     W1     W1     W1               W1     W1     W1     W1     W1     J2
The test cases represented in Tab. 2.3 emerge from the joint application of
equivalence class analysis and boundary value analysis. The boundary values of
the equivalence classes are chosen as test data. The abbreviation L or U after the
description of the tested equivalence class denotes a test of the lower or upper
boundaries of the specified equivalence classes, respectively. The resulting test
suite consists of three test cases with valid boundary values and six test cases for
covering the invalid boundary values. Each boundary value of an invalid
equivalence class is covered separately by a test case. The invalid values of the
parameters are marked gray in the table.
Table 2.3: Test cases according to equivalence class analysis and boundary value analysis

                        Test cases with valid values          Test cases with invalid values
Test case               1            2            3           4      5      6      7      8      9
(Additionally) tested   1), 5)L,     2), 5)U,     3)          4)     6)U    7)L    9)U    10)L   12)
equivalence classes     8)L, 11)     8)U
Type of wood            Oak          Beech        Pine        Steel  Beech  Beech  Beech  Beech  Beech
Order number            W1           W1           W1          W1     W1     W1     W1     W1     J2
Figure 2.2: State chart for the telephone example (states include ‘Dialing’ and
‘Disconnected’; the transitions are labeled with event/action pairs such as
‘Digit 0, digit 1, …, digit 9 / add digit to dialed number, validate dialed number’,
‘Hang up / reset dialed number’, ‘Unhook / reset dialed number’, and
‘Timeout / reset dialed number’)
State charts offer a direct basis for generating test cases. The simplest of all state–
based test completeness criteria calls for covering all states at least once with a
test case. According to Fig. 2.2, this strategy does not guarantee a run–through of
all transitions. It is therefore possible to visit, for instance, all states without
testing the transition from the state “Dialing” to the state “Disconnected”, which
occurs when the receiver is hung up. Therefore, testing all transitions is a more
thorough test criterion than only testing all states. Fig. 2.2 shows that a run–
through of all transitions guarantees that all states are visited at least once. The
test of all state transitions is possible with the following test cases (states and
events are depicted in italic and non–italic script, respectively).
The state chart in Fig. 2.2 represents the desired behavior. The test cases derived
from the state chart are exclusively normal cases; it does not yield test cases for
error situations. Such cases are easily overlooked when testing on the basis of
state charts. A “Failure” state may be added to the original state chart in order to
describe error handling (Fig. 2.3).
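The criterion “cover all transitions” can be checked mechanically against a transition table. The sketch below uses a deliberately reduced, hypothetical two-state fragment of the telephone example; the real state chart contains further states and events:

```python
# Transition table of a reduced, hypothetical fragment of the telephone example:
# (state, event) -> next state
transitions = {
    ("Dialing", "digit"):       "Dialing",       # add digit to dialed number
    ("Dialing", "hang up"):     "Disconnected",  # reset dialed number
    ("Disconnected", "unhook"): "Dialing",       # reset dialed number
}

def covered_transitions(start_state, event_sequences):
    """Return the set of (state, event) pairs exercised by a test suite."""
    covered = set()
    for events in event_sequences:
        state = start_state
        for event in events:
            covered.add((state, event))
            state = transitions[(state, event)]
    return covered

# A single test case that runs through every transition of the fragment:
suite = [["digit", "hang up", "unhook"]]
all_transitions_covered = covered_transitions("Dialing", suite) == set(transitions)
```

The same bookkeeping applied to the states alone yields the weaker all-states criterion discussed above.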
Figure 2.3: State chart for the telephone example, extended by a “Failure” state
(otherwise the same states and transitions as in Fig. 2.2)

(Sequence diagram: an actor sends Operation1() to a pre–existing object; a new
object of Class3 is created via new Class3(); the message Operation3() is sent to
the newly created object.)
Basically, each of the eight columns can be considered a test case. As the number
of columns grows exponentially with the number of conditions, simplifications
are generally necessary. This is often possible: Tab. 2.5 represents the same rules
as Tab. 2.4, but in a more condensed way. Payment by bill is impossible only if a
new private customer places an order of more than €1000. In all other cases, bill
payment is possible. Thus, the eight cases in Tab. 2.4 can be reduced to four.
Fig. 2.5 shows the corresponding decision tree.
Figure 2.5: Decision tree for the bill–payment rules (root: customer = new
customer?; below it: customer type = private customer?; the leaves state whether
payment by bill is possible)
Every path from the root to a leaf of the tree corresponds to a test case. Therefore,
there would be four test cases. The decision tree does not include the so–called
‘don’t cares’ of Tab. 2.5. Decision trees require a certain sequence of evaluation –
the root condition is evaluated first. This sequence of evaluation is not implied in
the decision table. From the decision tree, an IF–THEN–ELSEIF structure for the
evaluation of the decision can be easily derived.
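Following the rule stated above – payment by bill is impossible only for a new private customer ordering more than €1000 – the decision tree maps directly onto an IF–THEN–ELSEIF structure. A sketch (function and parameter names are illustrative):

```python
def bill_payment_possible(new_customer, private_customer, amount):
    """Illustrative IF-THEN-ELSEIF evaluation of the decision tree:
    the root condition is evaluated first."""
    if not new_customer:
        return True
    elif not private_customer:
        return True
    elif amount <= 1000:
        return True
    else:
        # a new private customer ordering more than 1000 euros
        return False
```

Each of the four return points corresponds to one root-to-leaf path of the tree, i.e., to one test case.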
3.2 Introduction
In this chapter, dynamic testing techniques will be described that assess the
completeness of testing using coverage of the source code. Therefore, they are
referred to as structure–oriented testing techniques. The described testing
techniques are based on the control structure or the control flow of the software to
be tested. As a result, we speak of control flow testing techniques. This group of
testing techniques has great practical significance, particularly regarding its
application in module testing. The control flow testing technique group is well
supported by test tool providers. Beyond this, there are accepted minimum criteria
in the area of control flow testing that should be observed. One minimal, i.e.,
essential, accepted test procedure is the so–called branch coverage test. In
particularly critical application areas, relevant standards demand the use of
additional testing techniques. A standard for software applications in aviation, for
example, recommends using a so–called condition coverage test. Some control
flow testing techniques have such an elementary character that a test that fails to
apply them in module testing must be considered inadequate.
34 Chapter 3 – Control Flow Testing
Figure 3.1: Control flow graph of the operation CountChar (nodes nstart, n1, …,
n5, nfinal)
Example 3.1: Control flow graph
Statement coverage demands the coverage of all nodes of the control flow graph
represented in Fig. 3.1. The test cases must be chosen such that the corresponding
program paths contain all nodes of the control flow graph. In the control flow
graph shown in Fig. 3.1, this is possible, for example, with the following test case:
Call of CountChar with: TotalCount = 0;
Characters read: ‘A’, ‘1’;
Execution path: (nstart, n1, n2, n3, n4, n5, n2, nfinal).
This test path contains all nodes. It does not contain all edges of the control flow
graph. The edge (n3, n5) that corresponds to the optional else–case is not
contained. In the example, as the else–case of the if–statement is not used, no
statements to be tested by the statement coverage are assigned to it.
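The source of CountChar is not reproduced in this excerpt. The following is a plausible reconstruction, consistent with the test case above and with the vowel decision quoted later in this chapter; it reads from a string instead of an input stream so that it can be exercised easily. The if-statement without an else-branch corresponds to the uncovered edge (n3, n5):

```python
def count_char(text):
    """Plausible reconstruction of CountChar: counts capital letters and,
    among them, the capital vowels, until a character outside 'A'..'Z' occurs."""
    total_count = 0
    vowel_count = 0
    for ch in text:                              # n2: loop condition
        if not ("A" <= ch <= "Z"):
            break                                # loop is left -> nfinal
        total_count += 1                         # n3
        if ch in ("A", "E", "I", "O", "U"):      # the vowel decision
            vowel_count += 1                     # n4
        # the empty else-branch corresponds to edge (n3, n5)
    return total_count, vowel_count

# The test case from Example 3.1: characters read 'A', '1'
totals = count_char("A1")   # 'A' is counted as letter and vowel; '1' ends the loop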
Like all coverage tests, the statement coverage test endeavors to fulfill the
essential criterion of executing potentially erroneous locations in the software.
The execution of all statements is a component of almost all important test
procedures, which also observe other aspects in addition. The statement coverage
test occupies a subordinate position as a stand–alone test procedure and is directly
supported only by a few tools. Statement coverage provides the possibility to
detect non–executable statements (i.e., so–called dead code). The statement
coverage measure can be used to quantify the achieved test coverage.
branches that are not executed. This is the case if no test datum has caused the
execution of a branch up to that point in the test. In addition, frequently executed
software components can be detected and, if desired, optimized.
Example 3.3: Test case
The branch coverage test demands the coverage of all branches of the control
flow graph in Fig. 3.1. This is possible if the following test case is used:
iterations should fulfill the branch coverage test. This is not an acceptable
attribute of a criterion for testing loops.

There is a non–linear correlation between coverage rate and test case count:
Simple branch coverage measurement also proves to be problematic. As all
branches are equally weighted without considering the dependencies between
them, there is no linear correlation between the achieved coverage rate and the
ratio of the number of executed test cases to the total number of test cases
required for 100% branch coverage (Fig. 3.2). The estimated number of test cases
needed to achieve 100% branch coverage is much greater than the number of test
cases that are normally executed for testing the main functionality.
The executed test cases initially already cover a large number of branches. After a
few test runs, a proportionally high coverage rate is achieved. The remaining test
cases similarly cover these branches, but increase the coverage rate only through
the execution of branches not contained in the paths already tested. Therefore, at
the end of the test, a relatively large number of test cases is required to achieve a
slight increase in the coverage rate. Furthermore, the statement coverage of a
program by a test case set is usually higher than the branch coverage of the
program by the same test case set.
Figure 3.2: Observed increase in the coverage rate with the number of test cases
(the rate rises quickly at first and approaches 100% only slowly)
into two classes and requires at least one test datum to be selected from each
class. A test of the complex decision (((u == 0) || (x > 5)) && ((y < 6) || (z == 0)))
against both logical values cannot be viewed as sufficient because the decision
structure is not taken into account.
An exhaustive branch coverage test can, for example, be achieved with the
following test cases:
Test case 1: u = 1, x = 4, y = 5, z = 0
Test case 2: u = 0, x = 6, y = 5, z = 0
We assume that the composite decision is evaluated from left to right and that the
evaluation ends as soon as the logical value of the overall decision is known. This
is referred to as non–exhaustive evaluation of decisions. Test case 1 results in the
following situation: The value 1 of the variable u yields the logical value false for
the first partial decision of the OR combination. Therefore, the second partial
decision of the OR combination sets the logical value of the OR combination. The
value 4 for the variable x inserted in the first partial decision (x > 5) also yields
the logical value false. The combination of the first two decisions has the logical
value false. Because of the following AND operator, it is already known at this
point – independent of the logical values of the remaining partial decisions –
that the overall decision possesses the logical value false.
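Non-exhaustive evaluation can be made visible by instrumenting the partial decisions. The sketch below records which partial decisions are actually evaluated for the two test cases above; Python's `and`/`or` operators short-circuit exactly like `&&` and `||`:

```python
evaluated = []  # names of the partial decisions evaluated by the last call

def part(name, value):
    """Record that a partial decision was evaluated and return its value."""
    evaluated.append(name)
    return value

def decision(u, x, y, z):
    del evaluated[:]   # reset the record for this evaluation
    return (part("u==0", u == 0) or part("x>5", x > 5)) and \
           (part("y<6", y < 6) or part("z==0", z == 0))

# Test case 1: the left OR is false, so the right side is never evaluated.
r1 = decision(1, 4, 5, 0)   # False; evaluated == ["u==0", "x>5"]
# Test case 2: each OR is decided by its first operand.
r2 = decision(0, 6, 5, 0)   # True;  evaluated == ["u==0", "y<6"]
```

In test case 1, only the two partial decisions of the left OR are evaluated; the AND short-circuits before the right side is touched.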
In many cases, the logical values of the remaining partial decisions towards the
right are not tested further. Independent of whether they are evaluated or not, the
the values of the variables u, x, y, and z. Then the partial decisions A, B, C, and D
can be independently true and false. Tab. 3.1 represents the 16 logical value
combinations that are possible with exhaustive evaluation of the decisions. A
simple condition coverage test, for example, can be achieved with the two test
cases 6 and 11. The four component decisions A to D are tested against true and
false, respectively. The component decisions (A || B) and (C || D) and the decision
((A || B) && (C || D)) are true in both cases. A branch coverage test is therefore
not achieved with these test cases.
If test cases 1 and 16 corresponding to Tab. 3.1 had been selected, exhaustive
branch coverage could have been achieved. As shown by the example, there are
also test cases that fulfill simple condition coverage without guaranteeing branch
coverage. Simple condition coverage with exhaustive evaluation of the decisions
does not assure branch coverage.
With a non–exhaustive evaluation of the decisions, there are only 7 possible
combinations in place of the 16 logical value combinations. These are stated in Tab. 3.2. For
example, cases 1 to 4 in Tab. 3.1 are mapped to case I in Tab. 3.2 because after
the evaluation of the component decisions, A and B are set to false, so that the
overall decision also possesses the value false. Therefore, it is not essential to test
the component decisions C and D in this situation. As shown in Tab. 3.2, test case
11, which together with test case 6 yields simple condition coverage under
exhaustive evaluation of the decisions, now leads to a different logical value
combination (VII according to Tab. 3.2). After the execution of the two test cases, the partial
decisions A and C are tested against true and false, respectively. The partial
decisions B and D are only tested against true. In contrast to the exhaustive
evaluation of decisions, the two test cases do not result in exhaustive simple
condition coverage. To achieve that, further test cases must be executed.
With non–exhaustive evaluation of decisions, simple condition coverage
subsumes the branch coverage test. Simple condition coverage has the advantage
that the number of essential test cases has a linear correlation with the number of
atomic component decisions. The disadvantage is that the branch coverage test
can be guaranteed only in the special case of non–exhaustive evaluation of
decisions.
Table 3.1: The 16 logical value combinations with exhaustive evaluation of the decisions

     A  B  C  D   A || B   C || D   (A || B) && (C || D)
1 f f f f f f f
2 f f f t f t f
3 f f t f f t f
4 f f t t f t f
5 f t f f t f f
6 f t f t t t t
7 f t t f t t t
8 f t t t t t t
9 t f f f t f f
10 t f f t t t t
11 t f t f t t t
12 t f t t t t t
13 t t f f t f f
14 t t f t t t t
15 t t t f t t t
16 t t t t t t t
Table 3.2: The seven logical value combinations with non–exhaustive evaluation of the decisions

      Cases in Tab. 3.1    A  B  C  D   A || B   C || D   (A || B) && (C || D)
I     1–4                  f  f  –  –   f        –        f
II    5                    f  t  f  f   t        f        f
III   6                    f  t  f  t   t        t        t
IV    7, 8                 f  t  t  –   t        t        t
V     9, 13                t  –  f  f   t        f        f
VI    10, 14               t  –  f  t   t        t        t
VII   11, 12, 15, 16       t  –  t  –   t        t        t
Number of atomic partial decisions   1        2        3        4         5         6         …   n
Number of test cases                 2^1 = 2  2^2 = 4  2^3 = 8  2^4 = 16  2^5 = 32  2^6 = 64  …   2^n
48 Chapter 3 – Control Flow Testing
In a non–exhaustive evaluation of decisions, not all 16 logical value combinations
can be observed; only those listed in Tab. 3.8 occur. Test data can, in principle,
be generated for all 16 combinations, but with non–exhaustive evaluation a
testing tool cannot register situations other than the seven given in Tab. 3.8. The
compiler should not be switched to exhaustive evaluation, because the tested
software should differ as little as possible from the software that will be released
and delivered later; if non–exhaustive evaluation is planned for the delivered
software, it should also be used during testing. Beyond
that, certain logical value combinations cannot be generated at all as a result of
coupled partial decisions. Of the 32 (2^5) logical value combinations of the decision
((ch ==‘A’) || (ch ==‘E’) || (ch ==‘I’) || (ch ==‘O’) || (ch ==‘U’)) of the operation
CountChar, only six can be generated (Tab. 3.9). This does not indicate an error
in the programming but is a property of the variable used in the decision.
This hampers the definition of a strictly defined test measure. The objective of
test measures is to obtain a quantitative statement about the thoroughness of the
test. Normally, this is the ratio of the number of tested objects (statements,
branches, or atomic partial decisions) to the number of objects to be tested.
With multiple condition coverage, a part of the required tests often cannot be
executed; hence, a simple measurement of the described form is inhibited.
Table 3.8: The seven logical value combinations feasible with non–exhaustive evaluation (cf. Tab. 3.2)

      A  B  C  D   A || B   C || D   (A || B) && (C || D)
I     f  f  –  –   f        –        f
II    f  t  f  f   t        f        f
III   f  t  f  t   t        t        t
IV    f  t  t  –   t        t        t
V     t  –  f  f   t        f        f
VI    t  –  f  t   t        t        t
VII   t  –  t  –   t        t        t
Table 3.9: Of the 32 possible logical value combinations of the vowel decision, only six can be generated

ch == ‘A’                                 t  f  f  f  f  f
ch == ‘E’                                 f  t  f  f  f  f
ch == ‘I’                                 f  f  t  f  f  f
ch == ‘O’                                 f  f  f  t  f  f
ch == ‘U’                                 f  f  f  f  t  f
(ch==‘A’) || (ch==‘E’) || (ch==‘I’) ||
(ch==‘O’) || (ch==‘U’)                    t  t  t  t  t  f
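That exactly six of the 32 combinations are feasible can be checked by brute force: a character can equal at most one of the five vowels, so only the all-false vector and the five “exactly one true” vectors occur. A small sketch:

```python
# Enumerate the truth-value vectors of the five partial decisions
# (ch=='A', ch=='E', ch=='I', ch=='O', ch=='U') over all 8-bit characters.
feasible = set()
for code in range(256):
    ch = chr(code)
    feasible.add((ch == "A", ch == "E", ch == "I", ch == "O", ch == "U"))

# Only 6 of the 2**5 = 32 combinations occur: all-false, plus one per vowel.
feasible_count = len(feasible)
```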
an executable test criterion for loops with reasonable effort. As a supplementary
condition, an exhaustive branch coverage test is required; what is sought is a
technique positioned between the branch coverage test and the path coverage test.
According to the fundamental definitions of the procedures, there exist loop
structures for which their requirements cannot be fulfilled. Heuristics for testing
loops are found in /Beizer 90/. /Riedemann 97/ contains a precise definition of
structured path tests and boundary–interior tests.
3.7.2.1 Description
Loops often, but not always, result in a high number of paths:
The number of paths in a software module is often extremely high due to the
presence of loops. This is not true for every type of loop. Loops with a constant
iteration count do not present any problems in this regard.
If 32767 is the largest possible value of a variable of type int, the operation
CountChar comprises 2^32768 − 1 paths. That is about 1.41 · 10^9864 paths. If one
could test 1000 paths per second, the path coverage test would take
4.5 · 10^9853 years. As loops are the root cause of this problem, it is advisable to
define limitations for the testing of paths that contain loops. This is done by
combining paths outside a certain number of traversing loops into classes that are
regarded as sufficiently tested by selecting a test path from the class. Howden
states the following definition:
Definitions from publications:
/Howden 75/:
In the boundary–interior approach to testing, it is assumed that a complete set of
tests must test alternative paths through the top level of a program, alternative
paths through loops, and alternative boundary tests of loops. A boundary test of a
loop is a test which causes the loop to be entered but not iterated.
An interior test causes a loop to be entered and then iterated at least once.
Experience indicates that both the boundary and interior conditions of the loop
should be tested.
The boundary–interior method separates paths into separate classes if they differ
other than in traversals of loops. If two paths P1 and P2 are the same except in
traversals of loops they are placed in separate classes if
1. one is a boundary and the other an interior test of a loop;
2. they enter or leave a loop along different loop entrance or loop exit
branches;
3. they are boundary tests of a loop and follow different paths through the
loop;
4. they are interior tests of a loop and follow different paths through the loop
on their first iteration of the loop.
/Howden 78a/:
In the structured testing approach all paths through a functional module which
require less than or equal to k (usually k >=2) iterations of loops are tested at
least once. Variations in this rule are necessary in special cases where this would
leave parts of a module untested because of complicated loop indexing operations
and dependencies between loop bounds.
/Howden 78b/:
Each path through a functional module which executes loops less than k times is
tested at least once.
/Tai 80/:
Since the number of distinct paths is usually very large or infinite, a limited
number of paths are selected by restricting the number of iterations of each loop
in a path (called structured testing).
Basically, in the structured path test, only paths up to the k–th execution of the
loop body are differentiated. By k we do not mean the number of loop iterations,
but the number of executions of the loop body; the definition of the technique in
the original literature is not completely clear on this point. The structured path
test with k = 2 is denoted as boundary–interior coverage. Boundary–interior
coverage distinguishes three cases – no execution of the loop body, exactly one
execution, and at least two executions. Because of the probable dependencies
between variables before, inside, and after the loop, this distinction is
meaningful; the same three cases also have to be observed with respect to loops
in data flow anomaly analysis, which suggests a dynamic test achieving
boundary–interior coverage. In the boundary–interior test, three classes of paths
originate with respect to the loops in the software module. The first class
contains the paths that do not execute the loop. In the case of non–rejecting
loops, this class is empty because the body of such a loop is executed at least
once. An example of how to deal with such loops is found in /Riedemann 97/.
The second class contains all paths that enter the loop but do not iterate it. The
third class consists of all paths that execute the loop body at least twice.
According to /Howden 75/, test cases for paths that enter the loop but do not
iterate it are called boundary tests. Test cases that cause at least one further
execution of the loop body are called interior tests. The structured path test with
k = 2 is identical to the boundary–interior test.
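The three path classes can be illustrated with a small invented example (not taken from the original text): a loop over an input vector whose body executes once per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Invented example: sums the elements of v. The loop body executes
// v.size() times, so the three path classes of the boundary-interior
// test are reached by inputs of length 0 (loop not entered), length 1
// (boundary test: entered but not iterated), and length >= 2 (interior
// test: iterated at least once).
int sum(const std::vector<int>& v) {
    int s = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {  // loop decision
        s += v[i];                                // loop body
    }
    return s;
}
```

A test suite with the inputs {}, {5}, and {2, 3} thus covers all three classes of the boundary–interior test for this loop.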
if ((a<0&&b>0)||(c==0&&d!=0))
...
a. State the logical values of the condition and of its atomic parts that are
minimally necessary to achieve simple condition coverage with exhaustive evaluation.
4.2 Introduction
In this chapter, dynamic testing techniques will be described that, like the
techniques presented in Chapter 3, evaluate the completeness of a test based on
the achieved coverage of the software source code. In contrast to the control
flow testing techniques of Chapter 3, the techniques described here use the data
flow to evaluate the completeness of the test cases. Data flow testing techniques
consider accesses to variables: a variable in a program can be accessed for
reading or for writing, and it is these accesses that guide the evaluation of the
test. Every software module contains data that is processed and control
structures that govern this processing. In contrast to control flow testing
techniques, data flow testing techniques shift data manipulation into the focus
of the test. The practical use of data flow testing techniques is limited due to a
lack of appropriate tools.
Just like all structural testing techniques, data flow testing techniques define no
rules for test case generation. This degree of freedom can be used to generate
the test cases with functional testing techniques and to use data flow testing
only for measuring the completeness of the test. Data flow testing is based on
an extended variant of the control flow graph. After a variable has been created
in memory, only the following two things can happen to it until it is destroyed
(for example, at the end of the program):
• It can be accessed for writing – i.e., its value is changed if necessary.
• It can be accessed for reading – i.e., its value is not changed.
Write access is also called definition (in short: def); note that a definition is not
the same as a declaration. Read access is called reference (in short: r) or use (in
short: u). In programming, there are two possible reasons for read access to a
variable: either the read value serves as input data for a calculation, or it
determines the logical value of a decision. The first form of access is called
computational use (in short: c–use); the second form of read access is called
predicate use (in short: p–use).
Every variable access must therefore belong to one of these three categories:
• Definition (def)
• Computational use (c–use)
• Predicate use (p–use).
Definitions (defs) of variables and computational uses (c–uses) of variables are
assigned to nodes in the control flow graph. The statement y := f(x1, …, xn)
contains c–uses of the variables x1, …, xn, followed by the definition of the
variable y. The statement if p(x1, …, xn) … contains predicate uses (p–uses) of
the variables x1, …, xn. Since decisions determine the control flow of a
program, the corresponding p–uses are assigned to edges. If the last statement
of a node is a decision in which the variables x1 to xn are used, all edges of the
control flow graph emanating from this node are assigned p–uses of x1, …, xn.
In certain testing techniques, the assignment of p–uses to edges ensures that
branch coverage is achieved as the accepted minimum criterion. A control flow
graph extended by these attributes is called a data flow attributed control flow
graph. It can be generated automatically from the source code with the help of
a tool. The module under test generally possesses a parameter interface through
which data can be imported into the module or exported to its environment.
This data flow should likewise be described through corresponding extensions
of the control flow graph. This is the purpose of the additional import and
export nodes nin and nout in the data flow representation of the control flow
graph: they describe the import and export of information via interface
parameters or global variables. The definitions of all imported variables are
assigned to the node nin; the c–uses of all exported variables are assigned to the
node nout. Fig. 4.1 shows the control flow graph of the operation CountChar
with data flow attributes.
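The residue of Fig. 4.1 suggests that CountChar reads characters and maintains the counters VowelCount and TotalCount. The following sketch is a reconstruction – the exact statements, loop form, and parameter passing are assumptions – with the data flow attributes noted per statement:

```cpp
#include <cassert>
#include <string>

// Reconstruction of a CountChar-like operation (the vowel test and the
// iteration over a string are assumptions, not the book's exact code).
// Comments give the data flow attributes assigned to each statement.
void CountChar(const std::string& input, int& VowelCount, int& TotalCount) {
    // nin: def(VowelCount), def(TotalCount) - values imported via the interface
    for (char Ch : input) {                    // def(Ch)
        if (Ch == 'A' || Ch == 'E' || Ch == 'I' ||
            Ch == 'O' || Ch == 'U') {          // p-use(Ch) on both outgoing edges
            VowelCount = VowelCount + 1;       // c-use(VowelCount), def(VowelCount)
        }
        TotalCount = TotalCount + 1;           // c-use(TotalCount), def(TotalCount)
    }
    // nout: c-use(VowelCount), c-use(TotalCount) - values exported via the interface
}
```

Each assignment contributes c–uses of the variables it reads followed by a def of the assigned variable, while the decision contributes p–uses attached to its outgoing edges, exactly as described above.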
Chapter 4 – Data Flow Testing 57
Figure 4.1: Control flow graph of the operation CountChar – data flow
representation (e.g., def(Ch) at the node containing cin >> Ch; c–use(VowelCount)
and c–use(TotalCount) at the node nout)
The control flow graph with data flow attributes of the operation MinMax is
represented in Fig. 4.2.
Figure 4.2: Control flow graph for MinMax – data flow representation
The set dpu(x, ni) is the set of all edges (nj, nk) for which x is an element of
p–use(nj, nk) and for which there exists a definition–clear path with respect to x
from ni to (nj, nk). Analogously, dcu(x, ni) denotes the set of all nodes nj for
which x is an element of c–use(nj) and for which a definition–clear path with
respect to x from ni to nj exists.
Testing techniques can be defined on the basis of control flow graphs provided
with data flow attributes.
• The all–defs criterion requires a set of test paths such that for every node
ni and for every variable x that is an element of def(ni), at least one
definition–clear path with respect to x from ni to an element of
dcu(x, ni) or dpu(x, ni) is contained. In other words, for every definition
of every variable, at least one c–use or one p–use must be tested. The
sample program contains definitions of the variables Min and Max in the
nodes nin and n2, respectively. The test path (nstart, nin, n1, n2, nout, nfinal)
tests the program sufficiently in terms of the all–defs criterion. The edge
(n1, nout) is not executed; the all–defs criterion subsumes neither branch
coverage nor statement coverage.
• The all p–uses criterion is fulfilled if for every node ni and every variable
x that is an element of def(ni), a definition–clear path with respect to x
from ni to every element of dpu(x, ni) is contained in the tested paths.
Every combination of a variable definition with every p–use reached by
this definition must be tested. In the sample program, the paths (nstart, nin,
n1, nout, nfinal) and (nstart, nin, n1, n2, nout, nfinal) must be executed to fulfill
this criterion, in order to test the p–uses of the edges (n1, n2) and
(n1, nout). The all p–uses criterion subsumes branch coverage.
• The all c–uses criterion requires, for every node ni and every variable x
that is an element of def(ni), the execution of at least one definition–clear
path with respect to x from ni to every element of dcu(x, ni).
• The all c–uses/some p–uses criterion requires a definition–clear path
from every definition to every computational use and, if no c–use exists
for a definition, demands the testing of a corresponding path to a
predicate use.
• The all p–uses/some c–uses criterion is symmetrical to the all c–uses/
some p–uses criterion. It requires a definition–clear path from every
definition to every predicate use and, if no p–use exists for a definition,
demands the testing of a corresponding path to a computational use.
P–uses are tested intensively by this procedure. The all p–uses/some
c–uses criterion subsumes the all p–uses criterion, the all–defs criterion,
and branch coverage. The sample program is tested sufficiently by
executing the paths (nstart, nin, n1, n2, nout, nfinal) and (nstart, nin, n1, nout,
nfinal).
• The all–uses criterion subsumes the all p–uses/some c–uses criterion and
the all c–uses/some p–uses criterion. It requires the testing of all
definitions in combination with all reachable p–uses and c–uses.
• If the requirement of the all–uses criterion is extended so that all
du–paths with respect to the definitions of all variables must be tested,
the all du–paths criterion is obtained. A du–path is a path
p = (ni, nj, ..., nk, nl) with a global definition of x in ni for which the
following holds:
  • p' = (nj, ..., nk) is definition–clear with respect to x and p reaches a
    c–use of x in nl, AND the nodes ni, nj, ..., nk, nl are distinct (loop–free)
    or the nodes nj, ..., nk are distinct and ni = nl (single loop traversal),
  or
  • p' = (nj, ..., nk) is definition–clear with respect to x and p reaches a
    p–use of x in the edge (nk, nl), and the nodes ni, nj, ..., nk, nl are
    distinct (loop–free).
According to this definition, the limitation to du–paths restricts the
number of test paths, analogously to the boundary–interior path test and
the structured path test. Note that complete test paths can contain many
loop executions compared to du–paths /Bieman, Schultz 89/. The
subsumption relations of the defs/uses criteria and further testing
techniques are represented in Fig. 4.3.
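As a sketch of how such criteria can be checked mechanically, consider the following simplified model (invented for this purpose, not a tool interface from the text): nodes carry defs and c–uses, p–uses sit on edges, and a set of test paths is checked against the all–defs criterion.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Simplified, invented model: a def of variable x at position i of a
// path is "served" if the same path later reaches a c-use or p-use of x
// without an intervening redefinition (a definition-clear subpath).
using Path = std::vector<std::string>;
using Edge = std::pair<std::string, std::string>;
struct Attr {
    std::map<std::string, std::set<std::string>> def, cuse;  // per node
    std::map<Edge, std::set<std::string>> puse;              // per edge
};

bool defServed(const Attr& a, const Path& p, std::size_t i, const std::string& x) {
    for (std::size_t j = i + 1; j < p.size(); ++j) {
        auto e = a.puse.find(Edge(p[j - 1], p[j]));
        if (e != a.puse.end() && e->second.count(x)) return true;  // p-use reached
        auto c = a.cuse.find(p[j]);
        if (c != a.cuse.end() && c->second.count(x)) return true;  // c-use reached
        auto d = a.def.find(p[j]);
        if (d != a.def.end() && d->second.count(x)) return false;  // redefinition
    }
    return false;
}

// all-defs: every definition occurring on the tested paths must reach at
// least one use via a definition-clear subpath of some tested path.
bool allDefsSatisfied(const Attr& a, const std::vector<Path>& paths) {
    for (const Path& p : paths)
        for (std::size_t i = 0; i < p.size(); ++i) {
            auto d = a.def.find(p[i]);
            if (d == a.def.end()) continue;
            for (const std::string& x : d->second) {
                bool served = false;
                for (const Path& q : paths)
                    for (std::size_t k = 0; k < q.size(); ++k)
                        if (q[k] == p[i] && defServed(a, q, k, x)) served = true;
                if (!served) return false;
            }
        }
    return true;
}
```

The stronger criteria differ only in the quantifier: all p–uses demands that every element of dpu(x, ni) be reached, not just some use, so the check would iterate over all p–uses instead of stopping at the first served use.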
Figure 4.3: Subsumption relations of data flow tests
5.2 Introduction
In this chapter, you will be introduced to methods that require little time and
effort because they can be extensively automated using tools. Different
objectives can be pursued with tool–supported static code analysis. Apart from
checking programming conventions, diagrams or tables can be generated
automatically. It is even possible to reliably detect certain errors through
automated analysis, or to support debugging after faulty behavior has occurred
during dynamic testing. One common characteristic of the methods introduced
here is that, with the exception of so–called dynamic slicing, they do not
require any execution of the software under test.
a) Control structures:
• The following should be obeyed in “switch” instructions:
  • Each case that contains instructions is closed with a break instruction.
  • Each case not containing instructions is not closed with a break
    instruction (fall–through to the next case).
  • There must be one – possibly empty – default case.
• Return instructions should occur only once per function.
• The ?–operator must not be used.
• Loops with multiple exits should be avoided as far as possible. If they do
  occur, care should be taken that they are exited at the end of the loop.
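A small sketch (the function and its values are invented for this example) of a switch statement that follows the rules above:

```cpp
#include <cassert>
#include <string>

// Illustrative sketch: every case with instructions ends in a break,
// the intentionally empty case 1 falls through, an (here empty) default
// case is present, and the function contains only one return statement.
std::string classify(int code) {
    std::string result = "unknown";
    switch (code) {
        case 1:        // no instructions: deliberate fall-through to case 2
        case 2:
            result = "low";
            break;
        case 3:
            result = "high";
            break;
        default:       // possibly empty default case is mandatory
            break;
    }
    return result;     // single return instruction per function
}
```

Assigning to a result variable instead of returning inside the switch keeps the single–return rule and the break rule satisfied at the same time.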
b) Operations:
• Operators consisting of a combination of operation and assignment must
  be avoided. Instead of x += y; , x = x + y; must be written.
• Assignments in expressions are not allowed.
• Multiple assignments are not permitted, for example,
  x = y + (z = 2);
• Post–increment (x++) and pre–increment (++x) as well as post–
  decrement and pre–decrement operations have to be avoided. The
  control of for–loops is an exception.
• Attention should be paid to the compatibility of operations and data
  types (for example, applying the operation mod only to positive integer
  values).
c) Data types, variables, and pointers:
• Type conversions must be carried out explicitly.
• Variable names must not start with an underscore (‘_’).
• Variables always have to be initialized.
• Invalid pointers must be marked as nil (or null) pointers.
Apart from such restricting rules, there are frequently also extending rules. An
example is the obligation to use input and output assertions. Assertions are
particularly suitable for checking whether input conditions have been fulfilled.
In C, they are realized by the macro “void assert(int expression)“. If
“expression” has the value “false” during execution of the macro, this violation
of the assertion is reported and the program execution is stopped. Checking for
valid parameter values, for example, is part of this. On the output side, it is
advisable to use assertions for checking the results. Therefore, the formulation
of both input and output assertions makes sense. Whether an input and an
output assertion exist for each function can be checked automatically.
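A sketch of this convention (the function integer_sqrt is invented for illustration): the first assertion checks the input condition, the last one checks the result before it is returned.

```cpp
#include <cassert>
#include <cmath>

// Input/output assertion sketch: the entry assert states the
// precondition, the exit assert states the postcondition of the result.
int integer_sqrt(int x) {
    assert(x >= 0);  // input assertion: only defined for non-negative x
    int r = static_cast<int>(std::sqrt(static_cast<double>(x)));
    while ((r + 1) * (r + 1) <= x) ++r;  // guard against rounding errors
    while (r * r > x) --r;
    assert(r * r <= x && (r + 1) * (r + 1) > x);  // output assertion
    return r;
}
```

If either assertion fails at run time, the violation is reported and execution stops, which localizes the error directly at the interface of the function.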
Chapter 5 – Tool–Supported Static Code Analysis 67
5.5 Slicing
Slicing corresponds to the analysis that is carried out in the manual search for the
cause of an observed faulty behavior. First, one checks whether the faulty
behavior was caused by the currently observed instruction. If this is not the case,
the instructions that provide data for the instruction or control their execution are
checked. This procedure can be continued until no predecessors exist anymore,
i.e., until input instructions are reached. The slice created in this way includes all
potential causes of the faulty value at the observed point. The following
information is available in compressed form through slicing:
• Instructions that can influence the value of an observed variable at a
specific point
• Participating variables and data flows
• Control influences
All instructions that are not part of the slice do not have to be checked in the
search for the error. The time consumed by the search for errors can therefore
be reduced by automatic slicing. A distinction is made between static and
dynamic slicing /Korel, Laski 88/. Static slicing can be carried out as a static
analysis: it is not necessary to run the program or to make assumptions about
the values of variables. This limitation is critical if dependencies result only
from the concrete value of a data item. Such situations occur, for example, in
connection with the use of pointers or arrays. Dynamic slicing is of help here.
In the following, we will focus on static slicing.
A suitable starting point for the representation of static slicing is the control
flow graph, which can be created by static analysis and extended by data flow
attributes, and which also serves as a basis for data flow testing. The
instructions (nodes) of the control flow graph are annotated with data flow
attributes, which describe the type of data access contained in the instructions.
A distinction is made between write access and read access. Write access is
called definition (def, d). Read access is known as reference. If a read access
takes place in a decision, we speak of a predicative reference (p–use, predicate
use, p). A read access in a calculation is called a computational reference
(c–use, computational use, c). The control flow graph in Fig. 5.1a refers to an
example from /Korel 87/. Often, the slice is also noted graphically: the
instructions are represented as nodes with the same numbering as in the control
flow graph, nodes connected by solid edges represent data dependencies, and
control dependencies are indicated by dashed edges. The following definitions
apply to the two edge types:
• Control edges point from instructions containing a predicative reference
  (p–use) to the directly controlled instructions, i.e., to those instructions
  that are executed only when the predicate has a specific value. Control
  edges are drawn only between the controlling instruction and the directly
  controlled instructions. If a further control level is nested within a
  controlled unit, no control edges are drawn that cross more than one
  level.
• Data flow edges point from instructions in which a variable is defined to
  instructions in which this variable is referenced. The variable must not
  be redefined between the definition and the reference; such a path is
  known as a definition–clear path with respect to this variable.
The backward slice is determined by searching the control flow graph against
the direction of the edges, starting at the statement containing the reference to
the observed variable, for definitions of this variable. If the defining statements
themselves contain computational references, the procedure is continued
recursively until no additional nodes are found. The dependencies between
instructions found in this way are data dependencies. If an observed node lies
in a unit whose execution is directly controlled by a decision, this is a control
dependency. For the predicative references of the variables occurring in the
decision, nodes with corresponding definitions – i.e., data flow dependencies –
are again searched recursively; these may in turn possess further control
dependencies.
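The backward search can be sketched as follows. The sketch is deliberately simplified to a single straight–line execution order, so the node numbering acts as the path; the example program, its numbering, and the one–level control–dependency handling are assumptions for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Each node records the variables it defines and references and the
// decision node that directly controls it (0 = none).
struct Node {
    std::set<char> defs, refs;
    int controlledBy;
};

// Backward slice for "value of var observed after node start": follow
// the nearest reaching definition (data dependency), then recursively
// the references of that definition and of its controlling decision.
std::set<int> backwardSlice(const std::map<int, Node>& cfg, int start, char var) {
    std::set<int> slice;
    std::vector<std::pair<int, char>> work{{start, var}};  // (search position, variable)
    while (!work.empty()) {
        std::pair<int, char> item = work.back();
        work.pop_back();
        for (int m = item.first; m >= 1; --m) {
            if (cfg.at(m).defs.count(item.second)) {
                if (slice.insert(m).second) {
                    for (char r : cfg.at(m).refs) work.push_back({m - 1, r});  // data deps
                    int c = cfg.at(m).controlledBy;                            // control dep
                    if (c != 0 && slice.insert(c).second)
                        for (char r : cfg.at(c).refs) work.push_back({c - 1, r});
                }
                break;  // later definitions shadow earlier ones
            }
        }
    }
    return slice;
}
```

For the invented program 1: a = input; 2: b = input; 3: if (a > 0); 4: x = b (controlled by 3); 5: y = a, the slice for x after node 5 contains nodes 1 to 4 but not node 5, which does not influence x.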
Figure 5.1a: Control flow graph with data flow attributes
Figure 5.1b: Slice with respect to max
Figure 5.1c: Slice with respect to avr
The following data flow attributes can be allocated to the nodes of the control
flow graph with reference to a variable x:
• x is defined (abbreviation: d): The variable x is assigned a value (for
  example, x = 5;).
• x is referenced (abbreviation: r): The value of the variable x is read in a
  calculation or a decision, i.e., the value of x is not modified (for
  example, y = x + 1; or if (x == 0) …).
• x is undefined (abbreviation: u): The value of the variable x is destroyed
  (for example, the destruction of local variables when a function ends).
  At the beginning of a program, all variables are in the undefined state.
• x is not used (empty): The instruction of the node does not influence the
  variable x; x is neither defined, nor referenced, nor undefined.
Data flow attributes are often abbreviated as d (defined), r (referenced), u
(undefined), and e (empty). The empty action (e) can be neglected, since it does
not affect the data flow. If one looks at the execution of a program, the data
flow with respect to a specific variable can be represented by a sequence made
up of definitions, references, and undefinitions. The state automaton shown in
Fig. 5.2 describes the process of data flow anomaly analysis. Beginning with
the start state S, a variable is initially undefined. The variable must be defined
before it can be referenced. A correct data flow remains in the states “defined”
and “referenced” and is ended by an undefinition in the state “undefined”. Each
correct data flow starts and ends in the state “undefined” and obeys the rules
described by the state automaton in Fig. 5.2. If the state “data flow anomaly” is
reached, or if the state “undefined” has not been reached at the end of a data
flow anomaly analysis, a data flow anomaly has been recognized. The state
automaton defines a so–called regular grammar. Such grammars also serve as
the basis for the lexical analysis performed in compilers. Therefore, data flow
anomaly analysis can be carried out by tools similar to compilers or may be
integrated into compilers.
Example 5.1: Let us look at the following two access sequences for a variable
(u: undefinition, d: definition, r: reference):
1. u r d r u
2. u d d r d u
Three types of data flow anomalies can be found: ur (reference of a variable
with an undefined value), dd (two sequential variable definitions), and du (a
defined variable value that is never referenced).
Sequence 1 starts with ur. The variable under consideration has a random value
at the time of the reference, since it was not defined beforehand. There is a data
flow anomaly of type ur: the reference of a variable with an undefined, random
value.
Sequence 2 includes two sequential variable definitions. The first definition does
not have any effect since the value is always overwritten by the second definition.
The data flow anomaly is of the type dd.
Sequence 2 ends with a definition followed by an undefinition. The value
assigned by the definition is not used since it is destroyed immediately afterwards.
This data flow anomaly is of the type du.
Figure 5.2: State automaton for data flow anomaly analysis
The operation MinMax in the example contains data flow anomalies. Fig. 5.3
shows the control flow graph with the data flow attributes for MinMax. Since
the values of the variables Min and Max are imported into the operation
through the interface, the data flow form of the control flow graph must be
used. Definitions of Min and Max are assigned to the import node nin. The
export of Min and Max when exiting the operation is represented by assigning
the corresponding variable references to the export node nout. In the nodes nstart
and nfinal, all variable values are set to undefined. The operation MinMax
contains two paths. Table 5.1 shows the sequences of data flow attributes of the
variables along these paths. These sequences can now be examined for the data
flow anomalies ur, dd, and du. Since this search is carried out for all variables
on all paths, anomalies are recognized with certainty. The sequence udrddru for
the variable Max contains two definitions directly following one another – i.e.,
an anomaly of the form dd. The sequence urdu for Temp starts with a
ur–anomaly – the reference of an undefined variable – and ends with a
du–anomaly, i.e., a definition that becomes ineffective because the value is
destroyed immediately afterwards.
Figure 5.3: Control flow graph for the operation MinMax
Table 5.1: Data flow table of the operation MinMax (attribute sequences along
the two paths)

Variable   Path (nstart, nin, n1, n2, n3, n4, nout, nfinal)   Path (nstart, nin, n1, nout, nfinal)
Min        u d r r r r u                                      u d r r u
Max        u d r d d r u                                      u d r r u
Temp       u r d u                                            u u
The operation MinMax receives two numbers through an interface; they are to
be sorted by size and returned. The local variable Temp is undefined at first. If
Min is larger than Max, the value of Temp is assigned to the variable Max.
However, the value of Temp is random, since it has neither been initialized nor
been transferred through the interface. This part of the operation is therefore
definitely faulty: there is a ur–anomaly. The two consecutive value assignments
to the variable Max are also suspicious. The first assignment is always
overwritten by the second one, so the first assignment makes no sense. This is a
dd–anomaly. Since Temp is not output but destroyed at the end of the
operation, the preceding assignment of a value to this variable is a du–anomaly.
Fig. 5.4 provides the control flow graph of the corrected version of the operation.
The data flows in Table 5.2 do not contain any anomalies.
Table 5.2: Data flow table of the corrected operation MinMax (attribute
sequences along the two paths)

Variable   Path (nstart, nin, n1, n2, n3, n4, nout, nfinal)   Path (nstart, nin, n1, nout, nfinal)
Min        u d r r d r u                                      u d r r u
Max        u d r r d r u                                      u d r r u
Temp       u d r u                                            u u
Figure 5.4: Control flow graph of the corrected operation MinMax
Fig. 5.5 shows the control flow graph with data flow attributes of the operation
Sqrt. The analysis of the path that is run through for non–positive input values is
quite simple. The following data flows are obtained, which do not have any
anomalies:
• X: u d r u
• W: u u
• ReturnValue: u d r u
Figure 5.5: Control flow graph for the operation Sqrt
The iterative approximation process is carried out for positive inputs. The data
flow for the variable X starts with the sequence udrr up to the loop decision. If
the loop is not entered, the sequence u follows directly. If the loop is entered,
the sequence rr is inserted in between, and this sequence is repeated with each
further loop execution. The data flows on these paths can therefore be given in
closed form. For the variable X, the data flow u d r r (r r)^n u with n ≥ 0
applies, where n indicates the number of loop executions. Closed expressions
are also obtained for the data flows of the variables W and ReturnValue:
• X: u d r r (r r)^n u with n ≥ 0
• W: u r (r d r)^n r u with n ≥ 0
• ReturnValue: u d r u
It is easy to see that if no data flow anomalies have occurred on the paths up to
the second loop execution, no further anomalies will occur.
The operation Sqrt contains a data flow anomaly for the variable W. The data
flow u r (r d r)^n r u, n ≥ 0, starts with a ur–anomaly: the value of the variable
W has not yet been initialized when it is read for the first time. However, the
operation works correctly for random initial values of W that happen to be
positive, so a dynamic test does not reveal the error with certainty. For negative
initial values of W, the negative root is determined. If W happens to be exactly
zero, the program “crashes”. Whereas this error can only be found in an
unreliable way using dynamic testing, it is definitely recognized by a data flow
anomaly analysis. Fig. 5.6 shows the corrected version of the control flow
graph.
Frequently, some of the paths that can be constructed theoretically cannot be
executed, because no corresponding input data or operational situations exist.
Data flow anomalies on such non–executable paths are not unusual, although
they can be avoided. They do not occur during program execution and
therefore cannot cause faulty behavior either. Since it is sometimes hard to
determine whether a certain path is executable, such anomalies should also be
removed in case of doubt.
Figure 5.6: Control flow graph for the corrected operation Sqrt (the corrected
graph contains the additional statement W = 1.0 with the attribute d(W) in
node n2)
Such errors are revealed with certainty by data flow anomaly analysis. Due to
the low time consumption compared to dynamic tests and the direct
localization of errors, data flow anomaly analysis is a highly interesting
technique in practice. The optimal tool support for this is a data flow anomaly
analysis component integrated into the compiler. Some compilers already
support this.
function unifit(x)
ahat = min(x);
ahat = 0; % for debugging purposes only
bhat = max(x);
tmp = (bhat - ahat) ./ alpha.^(1./length(x));
aci = [bhat - tmp; ahat];
bci = [bhat; ahat + tmp];
disp(sprintf('A_Hat= %f B_Hat= %f \n CI-A (lb)=%f \n CI-A (ub)=%f \n CI-B (lb)=%f \n CI-B (ub)=%f', ahat, bhat, aci(1), aci(2), bci(1), bci(2)));
end
What anomalies can be detected using data flow anomaly analysis? How can the
problems be solved?
Chapter 6 – Software Inspections and Reviews 81
6.2 Introduction
Manual testing of documents and code is a commonly used method in practice.
Numerous ways of performing such analyses exist, including inspections,
reviews, peer reviews, and structured walkthroughs. In this chapter, the three
main forms of manual analysis will be presented. Formal inspections are a
particularly effective means of finding errors and will therefore be discussed in
detail in the following. Since formal inspections are usually quite
time–consuming, however, simpler review methods cannot be rejected
completely. Manual analyses are particularly important in the early phases of
software development, for example for the checking of design documents.
They can also be used as a supplement to dynamic tests of the source code.
concerned with the preparation of the document under inspection. The inclusion
of several people in inspection and review teams has another positive aspect:
the expertise regarding the common product is spread throughout the
development team. Moreover, people learn details about the working methods
of their colleagues, which they may adopt into their own work. The
responsibility for the quality of the product is also borne by the entire
development team. If serious problems occur in a product after an inspection
has been carried out, the responsibility cannot be assigned to the author alone;
the inspection team as a whole bears it. The awareness of quality no longer
focuses on each individual work product but rather on the whole product
produced by the development team.
Experience shows that authors strive for comprehensible forms of expression
when writing documents, since several people will have to assess the products.
This improves readability: an easily understood, well–explained style is
preferred. Furthermore, critical components are identified early through the
timely use of inspection and review methods during development. This
information is important for appropriate risk management.
In the past, there were discussions about whether reviews or dynamic testing
were the better means of detecting errors. Experience shows that the two are
not alternatives: reviews and inspections supplement tool–supported tests.
Neither tool–supported static analyses nor dynamic testing can be replaced by
review and inspection methods. Where tool–supported static analysis is
possible, it should be carried out, as it takes less time and delivers higher
quality than a manual analysis. Tool–supported data flow anomaly analysis
(see the previous chapter) is an example. Manual tracing of data flow
anomalies during a code inspection is possible but does not make sense, since
tool–supported data flow anomaly analysis works more reliably and requires
less effort. With dynamic testing, only errors that lead to faulty behavior are
found; deviations from standards – e.g., violations of programming
conventions – are not recognized if they do not lead directly to failures. Such
deficiencies are detected in reviews. Reviews focus on “local” errors: faulty
behavior that arises from the interaction of remote parts of the code is hardly
ever found by a source code review. Reviews of architectural models can
address this issue, and dynamic testing is more reliable in revealing this type of
faulty behavior. Therefore, dynamic testing methods, tool–supported static
analyses, and review and inspection techniques complement one another.
Empirical investigations show that formal inspections in particular are a very
effective and efficient means of detecting errors.
overview session is unnecessary; in that case, it can and should be left out.
This diminishes the costs of the inspection and therefore increases its
efficiency.
3. Preparation
The inspectors must prepare themselves for the inspection meeting. For this
purpose, each inspector receives a complete set of the required documents.
These documents may not be modified until the inspection has been carried
out; this prevents an outdated state of the product from being discussed in the
inspection meeting. Since work on the product is halted until the inspection
meeting, the inspection has to be carried out with high priority. The inspectors
prepare for the inspection meeting individually, on the basis of the documents.
They note down all errors they find and all points that are unclear. There are
guidelines for the time to be spent on preparation, on the basis of which the
preparation time is planned. If too little time is assigned for preparation, the
inspectors will have little knowledge during the inspection meeting; they
would be almost unprepared, get to know the product only during the actual
meeting, and therefore find relatively few errors. This would reduce the
efficiency of the inspection. If too much time is used for preparation,
efficiency is also reduced, since the preparation time is added to the inspection
effort. Efficiency can be expressed as the quotient of the number of errors
found and the effort consumed for finding them. Hence there must be a
medium, optimal value for the preparation time. This value may differ between
organizations and between the various types of inspection; however, the
literature provides the guidelines given below. The main objective of the
preparation is to gain an understanding of the product – not, first and foremost,
the detection of errors. If errors are found, this is a welcome side effect. The
crucial factor is that after preparation, each inspector has a good understanding
of the function of the product to be inspected.
4. Inspection meeting

Conducting the inspection meeting is the central phase of a formal inspection.
The participants in the inspection meeting are assigned the following roles:
• Moderator
• Author
• Reader
• Recording clerk
• Further inspectors
The moderator should have completed special training

The moderator must be a recognized specialist with special training as a
moderator of inspections. The main literature on the subject requires the
moderator to be technically competent. This would mean that the moderator of a
code inspection, for example, must understand the programming language used.
This is definitely desirable, but not absolutely necessary: more important than
specialist technical knowledge is the ability to moderate. The moderator
conducts the meeting and ensures that the inspection is carried out in the
planned manner. He/she must specifically see to it that all inspectors work
inspection since the number of members of the inspection team affects the
inspection effort in a linear fashion. In order to be equally efficient, a team made
up of six people should therefore detect twice as many errors in the same time as a
team made up of three persons. This cannot be expected. The minimum number of
participants during inspections is three. If only three persons form an inspection
team, the moderator also takes on the role of the recording clerk. The remaining
two persons are the author and the reader.
IMPORTANT: Inspection results must not be used for assessing staff members

It is extremely important that the results of inspections (as well as of other
review methods) are not used for assessing staff members. Carrying out an
inspection is sometimes an unpleasant situation for the author of a product: a
group of colleagues criticizes his or her work. However, experience with
inspections teaches us that the results help to improve one's own work and to
reduce problems later on. A prerequisite for this, though, is that inspections
are understood as a purely technical instrument. The personal assessment of
staff must be fully detached from the inspection results. If the number of
errors found in an inspection is used as an indicator of a staff member's
performance, inspections are no longer used as an instrument for assuring
technical quality, but as an assessment procedure for personal performance.
This leads the author to attempt to deny as many errors as possible, which
reduces the efficiency of the inspection and makes the method almost useless. A
simple rule that helps to prevent personal assessment of the author is to let
only people of the same rank (peers) work together during inspections.
Checklists are useful

The errors found during the inspection meeting must be classified as far as
possible. It is advisable to have the recording clerk use a given classification
scheme for this purpose. It also makes sense to use checklists. On the one hand,
this ensures that no checks are left out. On the other hand, working through a
checklist helps to recognize whether the inspection is progressing steadily or
going around in circles; the latter is an indicator for the moderator to defer
the current point in the checklist and to proceed with the next one.
A goal–oriented inspection performed according to this procedure is strenuous;
concentrated work cannot be maintained over a long period of time without
fatigue. Experience shows that the duration of an inspection meeting should not
exceed two to three hours. At the end of the inspection meeting, a decision is
made on whether the product is accepted, conditionally accepted, or whether a
re–inspection is necessary.
5. Subsequent work

Next, the results of the inspection must be worked on. The author addresses the
errors noted in the inspection record. In the simplest case this means
correcting them. However, it is also possible that errors cannot be corrected
immediately; such an error must then be introduced into the change request
procedure so that an appropriate decision can be made about its correction. An
example of this type of situation is the detection of a design error during a
code inspection: the design documents are already under configuration control
and cannot be modified directly, so the correction must take place through
change management.
Moreover, working on the errors can show that a presumably erroneous point is,
in fact, correct, contrary to the view expressed during the inspection. In such
cases, the author must put forward his or her point of view in the so–called
follow–up. After the author has worked through the list of errors, the product
can be brought under configuration control if it was accepted in the inspection
meeting. Usually, products containing only a few small errors are accepted under
the condition that these errors are corrected by the author; checking the
correction is not necessary in this case. If the product was accepted
conditionally during the inspection meeting or if a re–inspection is required,
further steps are necessary.

6. Follow–up

These steps take place during the last inspection phase, the follow–up. If the
product was conditionally accepted during the inspection, the moderator can
check the corrections together with the author. However, if the moderator is not
a technical specialist, he/she will not be a suitable partner for this check. In
this case, it makes more sense to appoint another person from the inspection
team to check the correction of the errors; as a rule, the reader is selected
because of the required technical competence. It is therefore recommended that,
in the case of a conditional acceptance of the product, the correction of errors
be checked by the author and the reader together.
If, during the inspection, the decision is made that a re–inspection is
necessary, another inspection meeting must take place with the same inspectors.
This meeting focuses on the errors found in the previous inspection meeting:
using the error record from that meeting, the modifications of the product are
checked. As a rule, re–inspections are critical because they are not accounted
for in the project plan and therefore often lead to delays. Outstanding
inspection reports are also prepared in the follow–up; these are technical
reports on the detected errors and, in particular, on the effort and on the
preparation and inspection times.
Inspections are efficient and effective but time–consuming

Numerous publications show that formal software inspections are both a very
effective and a very efficient means of detecting errors. Relevant empirical
data can be found, for example, in /Thaler, Utesch 96/ and in /Ebert 00/;
further data is included in /Fagan 76/ and /Fagan 86/. The number of errors
detected per unit of document size is called effectiveness. Efficiency is the
quotient of the number of errors detected and the effort required for finding
them; efficiency is therefore a yardstick for economic viability.
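The two measures can be made concrete with a small worked example. The numbers
below are purely illustrative and are not taken from the cited studies.

```python
# Worked example of the two inspection metrics defined above.
# All numbers are illustrative, not empirical data from the literature.
errors_found = 12
document_size = 20        # e.g., pages of the inspected document
effort = 16               # person-hours for preparation and meeting

effectiveness = errors_found / document_size  # errors per page
efficiency = errors_found / effort            # errors per person-hour

print(effectiveness)  # 0.6
print(efficiency)     # 0.75
```

Note that a team can raise its effectiveness (find more errors per page) while
lowering its efficiency, simply by spending much more effort; the two measures
must therefore always be considered together.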
Thaler and Utesch compared the efficiency and effectiveness of software
inspections with conventional review methods and with dynamic tests. The results
are shown in Tab. 6.1. On the one hand, software inspections are superior to
conventional review methods (i.e., walkthroughs) with respect to efficiency and
effectiveness. On the other hand, Tab. 6.1 also shows the principal problem of
formal inspection methods: They are very time–consuming. Therefore, they can
only be used on relatively small parts of a product. In spite of their high efficiency
and good economic viability, formal inspections are usually too time–consuming
and expensive to apply to a complete project. Further statements on the efficiency
of inspection methods can be found in /Russel 91/, /Schnurrer 88/ and /Kosman,
Restivo 92/. Due to the relatively large amount of time consumed by formal
inspection methods, conventional review methods cannot be outright rejected.
The publications mentioned above do not compare the different types of errors
that can be revealed by specific quality assurance techniques.
Empirical data

In /Fagan 86/, Fagan indicates a rate of 500 net source code lines per hour for
the overview session. Net source code lines can be understood as source code
lines without comments. For the preparation rate, Fagan indicates 125 net source
code lines per hour. The inspection speed should be 90 net source code lines per
hour, and the maximum inspection rate should not exceed 125 net source code
lines per hour. Fig. 6.1 shows empirical results from /Thaler, Utesch 96/. It
can be clearly seen that the effectiveness of the inspection increases as the
inspection rate decreases. Still, for economic reasons, one should not aim at an
extremely low inspection rate. Fig. 6.2 shows empirical data from /Ebert 00/. As
shown in the upper diagram, effectiveness also increases here as the inspection
rate decreases. The reciprocal of efficiency shown in the lower diagram,
however, exhibits a minimum. To put it differently: Efficiency increases with a
decreasing inspection rate until a maximum value is reached and then drops again
if the inspection rate decreases further. According to /Ebert 00/, the optimum
lies at about 90 instructions per staff member hour.
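Fagan's rates can be used directly for planning the duration of the individual
inspection phases. The following sketch is our own illustration; the dictionary
and function names are invented, while the rates are those quoted from
/Fagan 86/ above.

```python
# Planning sketch: phase durations derived from the per-hour rates
# quoted from /Fagan 86/ (net source code lines = lines without comments).
RATES_NLOC_PER_HOUR = {
    'overview': 500,     # overview session
    'preparation': 125,  # individual preparation per inspector
    'inspection': 90,    # inspection meeting (maximum rate: 125)
}

def phase_hours(net_source_lines):
    """Return the planned hours per phase for a document of the given size."""
    return {phase: net_source_lines / rate
            for phase, rate in RATES_NLOC_PER_HOUR.items()}

hours = phase_hours(450)    # a module of 450 net source code lines
print(hours['inspection'])  # 5.0
```

Since one inspection meeting should not exceed two to three hours, a 450-line
module would have to be inspected in at least two separate meetings.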
Table 6.1: Efficiency and effectiveness comparison of testing methods from
/Thaler, Utesch 96/

Figure 6.1: Empirical data on the inspection rate from /Thaler, Utesch 96/
Figure 6.2: Empirical data on the inspection rate from /Ebert 00/

The same basic conditions apply to the execution of conventional reviews as to
formal inspection methods. Particular care must be taken because teams in
informal reviews are often rather small and team members may lack the
experience necessary to conduct the reviews efficiently. It is especially
important to create an objective, technical atmosphere for conventional reviews
as well. Conventional review
methods are described in detail in /Yourdon 89/. Empirical data on the efficiency
of conventional review methods are included in /Thaler, Utesch 96/.
Correctness
• Correctness has a binary character, i.e., an item is either correct or
incorrect.
• A fault–free realization is correct.
• An artifact is correct if it is consistent with its specification.
• If no specification exists for an artifact, correctness is not defined.
Robustness
• Robustness is the property of delivering acceptable behavior even in
exceptional situations (e.g., the ability of software to detect hardware
failures).
• A system that is correct – as measured by its specification – can
nevertheless have low robustness.
• Accordingly, robustness is a property of the specification rather than of
the implementation.
• A robust program is the result of the correct implementation of a good and
complete specification.
• Robustness has a gradual character.
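The distinction between correctness and robustness can be illustrated with a
small toy example of our own. Assume a specification that only says "return
x / y for y != 0"; the function and tuple convention below are invented for
illustration.

```python
# Toy illustration: correctness versus robustness.

def divide_correct(x, y):
    # Correct with respect to the (incomplete) specification "return x / y
    # for y != 0", but not robust: the exceptional input y == 0 simply
    # crashes with an unhandled ZeroDivisionError.
    return x / y

def divide_robust(x, y):
    # Result of implementing a more complete specification that also fixes
    # the behavior for the exceptional case y == 0: the caller receives a
    # documented (ok, result) pair instead of a crash.
    if y == 0:
        return (False, None)
    return (True, x / y)

print(divide_robust(6, 3))  # (True, 2.0)
print(divide_robust(6, 0))  # (False, None)
```

Both functions are correct against their respective specifications; only the
second specification makes the program robust, which matches the statement
above that robustness is primarily a property of the specification.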
Year of birth (YOB): valid equivalence class: 1900 <= YOB <= 1990 (class 3);
invalid equivalence classes: YOB < 1900 (class 4), YOB > 1990 (class 5)
a) [Figure: control flow graph of the program. Nodes: nin; n0 (int a0;);
n1 (int a1;); n2 (int j;); n4 (while (i>=0)); n5 (j=0;); n6 (while (j<i));
n7 (a0=get(j);); n8 (a1=get(j+1);); n9 (if (a0>a1)); n10 (put(j,a1););
n11 (put(j+1,a0);); n12 (j++;); n13 (i--;); nout; nfinal. The edges are
labeled a through t.]
Solutions to the Problems 97
b) The goal of statement coverage is to execute each statement at least once,
i.e., to execute all nodes of the control flow graph. This can be achieved
with the path:
abcdefghijklmnpqrst
c) Branch coverage is a stricter testing technique than statement coverage:
statement coverage is fully contained in branch coverage, i.e., branch coverage
subsumes statement coverage. It aims at executing all branches of the program
under test, which requires the execution of all edges of the control flow
graph:
abcdefghijklmnpijkopqrst
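The subsumption noted in c) can be illustrated with a small instrumentation
sketch. This is our own addition, not part of the original solution: the
sorting routine from the control flow graph is rewritten in Python, and each
decision node (n4, n6, n9) records which of its two outcomes was taken. Branch
coverage is reached when every decision has produced both outcomes.

```python
# Our own illustration: branch-coverage instrumentation of the sorting
# routine from the control flow graph (decision nodes n4, n6, n9).
executed = set()

def branch(node, outcome):
    """Record which outcome (True/False) of a decision node was taken."""
    executed.add((node, outcome))
    return outcome

def sort(a, i):
    # Bubble sort as in the control flow graph; i is the initial value of
    # the outer loop counter (typically len(a) - 1).
    while branch('n4', i >= 0):
        j = 0
        while branch('n6', j < i):
            a0, a1 = a[j], a[j + 1]        # n7, n8
            if branch('n9', a0 > a1):
                a[j], a[j + 1] = a1, a0    # n10, n11
            j += 1                         # n12
        i -= 1                             # n13
    return a

data = [3, 1, 2]
sort(data, len(data) - 1)
print(data)           # [1, 2, 3]
print(len(executed))  # 6: both outcomes of all three decisions
```

A single run already achieves branch coverage here because each of the three
decisions evaluates to both True and False; by subsumption, the same run also
achieves statement coverage.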
Complete evaluation:

      A  B  C  D  Result
1     T  F  T  F  F       (a=-1, b=0, c=0, d=0)
2     F  T  F  T  F       (a=0, b=1, c=1, d=1)

Non–exhaustive evaluation:

      A  B  C  D  Result
1     T  F  T  F  F       (a=-1, b=0, c=0, d=0)
2     F  -  F  -  F       (a=0, b=?, c=1, d=?)
3     T  T  -  -  T       (a=-1, b=1, c=?, d=?)
4     F  -  T  T  T       (a=0, b=?, c=0, d=1)
b)

      A  B  C  D  E  F  Total
1     T  F  T  F  F  F  F      (a=-1, b=0, c=0, d=0)
2     T  T  T  F  T  F  T      (a=-1, b=1, c=0, d=0)
3     T  F  T  T  F  T  T      (a=-1, b=0, c=0, d=1)
4     F  T  T  F  F  F  F
5     T  F  F  T  F  F  F      (a=-1, b=0, c=1, d=1)
[Figure: data flow attributed control flow graph. Nodes and annotations:
nin; n0 (int a0;); n1 (int a1;); n2 (int j;); n4 (while (i>=0)) with p-use(i)
on both outgoing edges; n5 (j=0;) with def(j); n6 (while (j<i)) with p-use(i),
p-use(j) on both outgoing edges; n7 (a0=get(j);) with c-use(j), def(a0);
n8 (a1=get(j+1);) with c-use(j), def(a1); n9 (if (a0>a1)) with p-use(a0),
p-use(a1) on both outgoing edges; n10 (put(j,a1);) with c-use(j), c-use(a1);
n11 (put(j+1,a0);) with c-use(j), c-use(a0); nout; nfinal. The edges are
labeled a through t.]
Index

A
all c–uses criterion ... v, 59
all c–uses/some p–uses criterion ... v, 59
all du–paths criterion ... v, 60
all p–uses criterion ... v, 59
all p–uses/some c–uses criterion ... v, 60
all uses criterion ... v, 60
all–def criterion ... v
author ... v, 86, 87, 88
availability ... v, 4, 5, 7

B
backward slicing ... v, 67
boundary interior coverage ... 49
boundary interior path test ... 50, 60
boundary interior testing ... v
branch coverage test ... vi, 13, 36, 37

C
condition coverage ... 49
condition coverage test ... vi, 33, 39, 41, 52
condition/decision coverage test ... vi, 43
control flow graph ... vi, 34, 35, 36, 37, 52, 56, 57, 58, 68, 69, 72, 73, 74, 75, 76, 77, 78, 97
control flow test ... viii, 33, 52
control flow–oriented testing ... 34
control flow–oriented testing techniques ... vi, 11, 33, 34
correctness ... vi, 2, 4, 5, 6, 12, 13, 17, 55, 64

D
data flow anomalies ... vi, 14, 71, 73, 76, 77, 78, 83
data flow anomaly analysis ... 14, 71, 72, 73, 75, 78, 83
data flow attributed control flow graph ... vi, 56
data flow–oriented test ... vi, 55, 61
dcu(x, ni) ... vi, 58, 59
decision tables ... vi, 29
decision trees ... vi, 29, 30
def(ni) ... vi, 58, 59
defs/uses test ... 57, 61
dpu(x, ni) ... vi, 58, 59
du–path ... vi
dynamic techniques ... 10
dynamic testing ... vii, 12

E
equivalence class testing ... vii
equivalence classes ... 19, 22, 24
error ... vii, 4, 6, 9, 10, 12

F
failure ... vii, 4, 6, 8, 26, 35, 36, 38, 71
fault ... vii, 4, 6
formal inspection ... 81, 82, 84, 86, 89, 91
formal inspection methods ... vii, 84, 90, 91
formal verification ... 14
forward slicing ... vii, 67
functional equivalence class construction ... 19, 24
functional equivalence class testing ... vii
functional testing ... vii, 12, 13
function–oriented test techniques ... 17, 18, 28
function–oriented testing ... 17, 18, 28, 29, 30

I
informal review techniques ... vii, 91, 92
inspections ... 82, 83, 84, 85, 88, 89, 91, 92
Integrated Quality Assurance ... viii, 2

L
loops ... 34, 37, 49, 50, 51, 52, 66, 76

M
minimal multiple condition coverage test ... vii, 39, 44
moderator ... vii, 85, 86, 87, 88, 89
modified condition/decision coverage test ... vii, 39, 45
multiple condition coverage test ... viii, 39, 46

P
path coverage test ... viii, 50, 52
Principle of Integrated Quality Assurance ... viii, 2, 4

Q
quality ... 2, 4, 5
quality characteristic ... viii, 3, 5, 8, 82
quality measure ... viii, 4, 6
quality requirement ... viii, 1, 3, 4, 5
quality target specification ... viii, 1, 3

S
security ... 5, 7
simple condition coverage ... viii, 39, 40, 41, 42, 43
slicing ... viii, 67, 71
Software Quality Assurance ... viii
state–based test ... viii, 25, 27
state–based testing ... viii, 25, 27
statement coverage test ... ix, 13, 35, 36
static analysis ... ix, 13
static code analysis ... ix, 63, 64
static techniques ... 10
structured path test ... 49, 50, 51, 60