Mälardalen University Press Dissertations
No. 214

AUTOMATIC TEST GENERATION FOR INDUSTRIAL CONTROL SOFTWARE

Eduard Enoiu
2016

Academic dissertation (Akademisk avhandling)
Faculty opponent: Professor Mats Heimdahl, University of Minnesota

ISBN 978-91-7485-291-2
ISSN 1651-4238
To Raluca
I want to live my life taking the risk all the time that I don’t know enough
yet. That I haven’t read enough, that I can’t know enough, that I’m always
operating hungrily on the margins of a potentially great harvest of future
knowledge and wisdom... take the risk of thinking for yourself. Much
more happiness, truth, beauty, and wisdom will come to you that way.
Christopher Hitchens
Acknowledgments
Many people have helped in the writing of this thesis, but none more than my
supervisors, Professor Daniel Sundmark, Professor Paul Pettersson and Dr Adnan
Čaušević, who are wonderful, wise and inspiring. Huge thanks also to Ola Sellin and
the TCMS team in Bombardier Transportation.
I’d like to thank Associate Professor Cristina Seceleanu, Professor Elaine Weyuker,
Dr Aida Čaušević and Associate Professor Stefan Stancescu. They have been incred-
ibly supportive throughout my entire higher education. I’m very grateful for their
guidance and encouragement.
Big thanks to my family for their love and support. I want to say thank you to my
friends at Mälardalen University, who provided that little spark of inspiration.
Finally, I would like to thank the Swedish Governmental Agency of Innovation Systems
(Vinnova), the Knowledge Foundation (KKS), Mälardalen University and Bombardier
Transportation whose financial support via the Advanced Test Automation for
Complex and Highly-Configurable Software-intensive Systems (ATAC) project and
ITS-EASY industrial research school, has made this thesis possible.
List of Publications
Appended Studies¹
This thesis is based on the following studies:
Study 1. Using Logic Coverage to Improve Testing Function Block Diagrams. Ed-
uard Enoiu, Daniel Sundmark, Paul Pettersson. Testing Software and Systems,
Proceedings of the 25th IFIP WG 6.1 International Conference ICTSS 2013, volume
8254, pages 1 - 16, Lecture Notes in Computer Science, 2013, Springer.
Study 2. Automated Test Generation using Model-Checking: An Industrial Eval-
uation. Eduard Enoiu, Adnan Čaušević, Elaine Weyuker, Tom Ostrand, Daniel
Sundmark and Paul Pettersson. International Journal on Software Tools for Tech-
nology Transfer (STTT), volume 18, issue 3, pages 335–353, 2016, Springer.
Study 5. Mutation-Based Test Generation for PLC Embedded Software using Model
Checking. Eduard Enoiu, Daniel Sundmark, Adnan Causevic, Robert Feldt and Paul
Pettersson. Testing Software and Systems, Proceedings of the 28th IFIP WG 6.1
International Conference ICTSS 2016, volume 9976, pages 155-171, Lecture Notes in
Computer Science, 2016, Springer.
Statement of Contribution
In all the included studies, the first author was the primary contributor to the
research, approach and tool development, study design, data collection, analysis and
reporting of the research work.
¹ All published studies included in this thesis are reprinted with explicit permission from the copyright holders (i.e., IEEE and Springer).
Other Studies
Other relevant papers co-authored by Eduard Enoiu but not included in this thesis:
5. Model-based Test Suite Generation for Function Block Diagrams using the UP-
PAAL Model Checker. Eduard Enoiu, Daniel Sundmark, and Paul Pettersson. Sixth
International Conference on Software Testing, Verification and Validation Workshops
(ICSTW), pages 158 - 167, 2013, IEEE.
Contents

Acknowledgments

I Thesis Summary
1 Introduction
2 Background
  2.1 Industrial Control Software
  2.2 Programmable Logic Controllers
  2.3 Test Generation
3 Summary of Contributions
  3.1 Motivation
  3.2 Research Objective
  3.3 RG1: Techniques and Tool Implementation
  3.4 RG2: Applicability
  3.5 RG3: Cost and Fault Detection Evaluation
4 Related Work
  4.1 Test Generation Technique and Tool Implementation
  4.2 Empirical Studies
5 Conclusions and Future Work
6 Outline of the Thesis
Bibliography

II Studies
1 Using Logic Coverage to Improve Testing Function Block Diagrams
  1 Introduction
  2 Preliminaries
    2.1 FBD Programs and Timer Components
    2.2 Networks of Timed Automata
    2.3 Logic-based Coverage Criteria
  3 Testing Methodology and Proposed Solutions
  4 Function Block Diagram Component Model
  5 Transforming Function Block Diagrams into Timed Automata
  6 Test Case Generation using the UPPAAL Model-Checker
  7 Logic Coverage Criteria for Function Block Diagrams
  8 Example: Train Startup Mode
    8.1 Experiments
    8.2 Logic Coverage and Timing Components
  9 Related Work
  10 Conclusion
  References
Thesis Summary
1 Introduction
Software plays a vital role in our daily lives and can be found in a number of
domains, ranging from mobile applications to medical systems. The emergence and widespread use of large, complex software products has profoundly influenced the traditional way of developing software. Organizations now need to deliver reliable, high-quality software products under increasingly stringent time constraints, which limits the amount of development and quality assurance that can be performed before delivery. Software testing is an important
verification and validation activity used to reveal software faults and make sure that
the expected behavior matches the actual software execution [4]. In the academic
literature a test or a test case is usually defined as an observation of the software,
executed using a set of inputs. A set of test cases is called a test suite. Eldh
[20] categorized the goals used for creating a test suite into the following groups:
specification-based testing, negative testing, random testing, coverage-based testing,
syntax and/or semantic-based testing, search-based testing, usage-based testing,
model-based testing, and combination techniques. For a long time these software
testing techniques have been divided into different levels with respect to distinct
software development activities (i.e., unit, integration, system testing).
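
To make these definitions concrete, a test case can be represented as a set of inputs paired with an expected output, and a test suite as a collection of such cases. The following Python sketch is purely illustrative; the names and the example program are hypothetical, not drawn from any system studied in this thesis.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TestCase:
    inputs: tuple     # stimuli applied to the software under test
    expected: object  # the output that the expected behavior calls for

def run_suite(program: Callable, suite: list) -> bool:
    """Execute every test case; the suite passes only if all cases do."""
    return all(program(*case.inputs) == case.expected for case in suite)

# A hypothetical two-case test suite for a program computing the maximum of two values.
suite = [TestCase((3, 7), 7), TestCase((9, 2), 9)]
print(run_suite(max, suite))  # True
```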
If software testing is severely constrained, this implies that less time is devoted to
assuring a proper level of software quality. As a solution to this challenge, automatic
test generation has been suggested to allow tests to be created with less effort and
at lower cost. In contrast to manual testing, test generation is automatic in the sense that tests satisfying a given test goal or requirement are created without human intervention. However, over the past few decades, it has been a challenge for
both practitioners and researchers to develop strong and applicable test generation
techniques and tools that are relevant in practice. The work included in this thesis
is part of a larger research endeavor well captured by the following quotation:

... reveal software faults and exercise different aspects of a program. However, tools for
automatic test generation are still few and far from being applicable in practice. As a
consequence, the evidence regarding the mainstream use of automatic test generation
is limited. This is especially problematic if we consider relying on automatic test
generation for thoroughly testing industrial safety-critical control software, such as is
found in trains, cars and airplanes. In these kinds of applications, software failures can lead to economic damage and, in some cases, loss of human lives. This motivated
us to investigate the use of automatic test generation and identify the empirical
evidence for, or against, the use of it in practice when developing industrial control
software.
In this thesis we study and develop automatic test generation techniques and tools
for a special class of software known as industrial control software. In particular, we
focus on IEC 61131-3, a widely used standard defining programming languages for industrial control systems.
2 Background
In this section, major aspects of industrial control software and automatic test
generation are discussed. The presented aspects are related to current research in
the field as well as the research done in this thesis.
[Figure: An example FBD program with inputs IN1–IN9 and outputs OUT1–OUT3, built from logic blocks (XOR, SEL, GT, AND), a set–reset block (SR) and timer blocks (TON with an 80 s delay, TP).]

The following sections will provide details on test generation as a general concept used in this thesis.
Requirement-based Testing
Like other software engineering disciplines, many of today’s automatic test generation
techniques use requirement models to guide the search towards achieving a certain goal.
Many notations are used for such models, from formal mathematical descriptions [17] and semi-formal notations such as the Unified Modeling Language (UML) [51] to
natural language requirements. Formal requirement models with precise semantics
are suitable for automatic test generation [71]. Even so, recent results have shown that natural language is still the dominant documentation format for requirement specification in the control and embedded software industry [67], even though engineers are dissatisfied with using natural language for this purpose. This thesis focuses on specifications expressed in natural language, as this is still a realistic scenario for testing industrial control software.

[Figure: The testing process: a test, derived from a test requirement, is executed against the software under test, yielding a test result and a verdict.]
A requirement model is an abstraction of a desired software behavior and different
types of abstractions are often needed to construct it. Requirement-based testing
(also known as specification-based testing) is a technique where tests are derived from
a requirement model that specifies the expected behavior of the software. Different
test thoroughness measures based on requirement models have been proposed for
guiding the generation of suitable tests: conformance testing [47, 32, 75], coverage
criteria that are based on the specifications [52, 3, 78], domain/category partitioning
[5, 55], just to name a few.
Requirement-based testing requires the understanding of both the specified require-
ments and the program under test. The specification usually contains preconditions,
input values and expected output values [4] and a tester uses this information to
manually or automatically check the software conformance with the specification. A
test suite should contribute to the demonstration that the specified requirements
and/or coverage criteria have indeed been satisfied. Engineering of industrial control
software typically requires a certain degree of certification according to safety stan-
dards [12]. These standards pose specific concerns on testing (e.g., the demonstration
of requirement testing and/or some level of code coverage). In this thesis we seek
to investigate the implications of using requirement-based testing for IEC 61131-3
control software.
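
To make this concrete, consider a minimal sketch of requirement-based testing; the requirement, program and names below are hypothetical, not taken from the studied systems. A requirement such as "if both input switches are on, the output shall be on" yields input values and expected outputs, and conformance is checked by executing the program on those inputs.

```python
def program_under_test(in1: bool, in2: bool) -> bool:
    # Hypothetical implementation of the requirement:
    # "if both input switches are on, the output shall be on".
    return in1 and in2

# Requirement-derived test cases: input values paired with expected outputs.
requirement_tests = [
    ((True, True), True),    # both switches on  -> output on
    ((True, False), False),  # one switch off    -> output off
    ((False, False), False), # both switches off -> output off
]

conforms = all(program_under_test(*ins) == exp for ins, exp in requirement_tests)
print("conforms to requirement" if conforms else "requirement violated")
```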
Coverage-based Testing
Code coverage criteria are used in software testing to assess the thoroughness of test cases [4]. These criteria are normally used to determine the extent to which the
software structure has been exercised. In the context of traditional programming
languages (e.g., Java and C#), decision coverage is usually referred to as branch
coverage. A test suite satisfies branch coverage if running the test cases causes each decision in the software to evaluate to true at least once and to false at least once. In this thesis, code coverage is used to determine whether the test goal is satisfied.
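
As an illustrative sketch (not the tooling used in this thesis, and with a hypothetical function and inputs), branch coverage of a single decision requires at least one test driving the decision to true and one driving it to false:

```python
def overload_guard(load: int, limit: int) -> bool:
    # Hypothetical guard with one decision; branch coverage requires the
    # condition below to evaluate to both True and False across the suite.
    if load > limit:
        return True
    return False

# A branch-adequate test suite: each outcome of the decision is exercised once.
branch_tests = [
    ((120, 100), True),   # load > limit  -> True branch
    ((80, 100), False),   # load <= limit -> False branch
]

assert all(overload_guard(*ins) == exp for ins, exp in branch_tests)
print("branch coverage satisfied: both outcomes of the decision were executed")
```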
Numerous techniques for automatic test generation based on code coverage criteria
[23, 11, 74, 82, 44] have been proposed in the last decade. An example of such
an approach is EvoSuite [23], a tool based on genetic algorithms, for automatic
test generation of Java programs. Another automatic test generation tool is KLEE
[11] which is based on dynamic symbolic execution and uses constraint solving
optimization as well as search heuristics to obtain high code coverage. In this thesis,
we investigate how an automatic test generation approach can be developed for IEC
61131-3 control software and how it can be adopted for testing industrial control
software. Moreover, we evaluate and compare such techniques with manual testing
performed by industrial engineers.
Mutation Testing
Recent work [28, 37] suggests that coverage criteria alone can be a poor indication
of fault detection. To tackle this issue, researchers have proposed approaches for
improving fault detection by using mutation analysis as a test goal. Mutation analysis
is the technique of automatically generating faulty implementations of a program
for the purpose of examining the fault detection ability of a test suite [16]. During
the process of generating mutants one should create syntactically and semantically
valid versions of the original program by introducing a single fault or multiple faults
(i.e., higher-order mutation [41]) into the program. A mutant is a new version of a
program created by making a small change to the original program. For example,
a mutant can be created by replacing a method with another, negating a variable,
or changing the value of a constant. The execution of a test case on the resulting
mutant may produce a different output than the original program, in which case
we say that the test case kills that mutant. The mutation score can be calculated
using either an output-only oracle (i.e., strong mutation [79]) or a state change oracle
(i.e., weak mutation [34]) against the set of mutants. When this technique is used to
generate test suites rather than evaluating existing ones, it is commonly referred to as
mutation testing or mutation-based test generation. Despite its effectiveness [43], no
attempt has been made to propose and evaluate mutation testing for PLC industrial
control software. This motivated us to develop an automatic test generation approach
based on mutation testing targeting this type of software.
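
The core of mutation analysis can be illustrated with a deliberately small sketch, assuming a toy program, a single relational-operator-replacement mutant, and a strong-mutation (output-only) oracle; all names are hypothetical and the sketch is not the approach implemented in this thesis.

```python
import ast

SOURCE = """
def overload_guard(load, limit):
    return load > limit
"""

class RelationalMutator(ast.NodeTransformer):
    """Relational operator replacement: mutate '>' into '>='."""
    def visit_Compare(self, node):
        if isinstance(node.ops[0], ast.Gt):
            node.ops[0] = ast.GtE()
        return node

def load_function(tree):
    namespace = {}
    exec(compile(ast.fix_missing_locations(tree), "<mutant>", "exec"), namespace)
    return namespace["overload_guard"]

original = load_function(ast.parse(SOURCE))
mutant = load_function(RelationalMutator().visit(ast.parse(SOURCE)))

# Strong mutation: a test kills the mutant if the observed outputs differ.
tests = [(120, 100), (80, 100), (100, 100)]
killing = [t for t in tests if original(*t) != mutant(*t)]
# One mutant in total, so the mutation score is 1.0 if any test kills it.
print(f"killing tests: {killing}, mutation score: {1.0 if killing else 0.0}")
```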
3 Summary of Contributions
In this section, the motivation for the thesis is presented, based on the challenges and gaps in the scientific knowledge. In addition, the overall research objective is stated and broken down into the specific research goals that the thesis work aims to achieve.
3.1 Motivation
Numerous automatic test generation techniques [54] produce test suites that achieve high code coverage (e.g., branch coverage), achieve high requirement coverage, or satisfy other related criteria (e.g., mutation testing). However, for industrial control software
contributions have been more sparse. The body of knowledge in automatic test
generation for IEC 61131-3 control programs is limited, in particular in regards
to tool support, empirical evidence for its applicability, usefulness, and evaluation
in industrial practice. The motivation for writing this thesis stems in part from a
fundamental issue raised by Heimdahl [30]:
From a software testing research point of view, this thesis is also motivated by the
need to provide evidence that automatic test generation can perform comparably
with manual testing performed by industrial engineers. Consequently, we identified
the general problem as:
In the next sections, we introduce the research objective of the thesis, present the
contributions, and show how the research goals are addressed by the contributions.
This research started with the problem of implementing, adopting and using automatic
test generation in industrial control software development, and ends with proposing a
solution for this problem while building empirical knowledge in the area of automatic
test generation.
Figure 3: Overview of how the studies included in this thesis (Studies 1–5) support the overall objective and the three research goals.
As shown in Figure 3, the research was performed in five studies (i.e., Studies 1–5). Combined, these studies support the overall objective of this thesis and contribute to the body of knowledge in automatic test generation for IEC 61131-3 control software. The studies included in this thesis used different research methods (i.e., case studies and controlled experiments), data collection strategies (unstructured interviews, document analysis and observation) and data analysis techniques.
Practically, the research objective was broken down into three research goals:
RG 1. Develop automatic test generation techniques for testing industrial control
software written in the IEC 61131-3 programming language.
This goal refers to building an approach for automatic test generation that would
support IEC 61131-3 control software. This thesis is mainly concerned with test
generation techniques where the test goal is based on the implementation itself (i.e.,
code coverage criteria, mutation testing). In certain application domains (e.g., railway
industry) testing IEC 61131-3 software requires certification according to safety
standards [12]. These standards recommend the demonstration of some level of code
coverage on the developed software. In order to be able to automatically generate
tests, a translation of the IEC 61131-3 software to timed automata is proposed.
The resulting model can be automatically analyzed towards finding suitable tests
achieving some type of code coverage. To tackle the issue of generating tests achieving
better fault detection, an approach is proposed that targets the detection of injected
faults for IEC 61131-3 software using mutation testing. Support for this research
goal was acquired in studies 1, 2 and 5.
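
The idea behind this pipeline can be shown in miniature: a coverage item is encoded as a reachability goal, the model checker searches for a witness trace, and that trace is turned into a test. The Python sketch below is a simplified stand-in under strong assumptions (a hypothetical finite transition system explored by breadth-first search); the actual approach uses UPPAAL timed automata, whose clocks and invariants are omitted here.

```python
from collections import deque

# A hypothetical finite model of a control program: (state, input) -> next state.
TRANSITIONS = {
    ("idle", "start"): "running",
    ("idle", "stop"): "idle",
    ("running", "overload"): "tripped",  # reaching 'tripped' is the coverage goal
    ("running", "stop"): "idle",
}

def generate_test(initial, goal):
    """Breadth-first reachability: the witness input sequence becomes the test case."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, inputs = frontier.popleft()
        if state == goal:
            return inputs
        for (src, inp), dst in TRANSITIONS.items():
            if src == state and dst not in visited:
                visited.add(dst)
                frontier.append((dst, inputs + [inp]))
    return None  # goal unreachable: the corresponding coverage item is infeasible

print(generate_test("idle", "tripped"))  # ['start', 'overload']
```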
RG 2 concerns the applicability of the proposed test generation approach. Applicability refers to the approach's success in meeting efficiency requirements (e.g., generation time) and test goal requirements (e.g., achieving code coverage), which makes this goal key to determining the approach's feasibility in an industrial context. The results contributing to this goal were acquired primarily from studies 1, 2 and 5.
RG 3. Compare automatic test generation with manual testing in terms of cost and
fault detection.
This goal addresses how automatically created tests compare to manually written
ones in terms of effectiveness and cost. This goal also concerns the difference between
designing tests based on requirements and creating test suites solely satisfying code
coverage. Given that recent work [28] suggests that code coverage criteria alone can
be a poor indication of fault detection, with this goal we seek to investigate overall
implications of using manual testing, specification-based testing, code coverage-
adequate and mutation-based automatic test generation for IEC 61131-3 industrial
control software. The aim of this goal was to provide experimental evidence of test
effectiveness in the sense of bug-finding and cost in terms of testing time. Explicit
work to contribute to this goal was performed in studies 3, 4 and 5.
In the next section we describe the main contributions of this thesis. These contributions are divided into three parts: techniques and tool implementation, applicability, and evaluation of cost and fault detection.
[Figure: Overview of the test generation approach: an IEC 61131-3 control program is translated into timed automata, and a test goal (a code coverage criterion or mutation testing) guides test generation using the UPPAAL model checker.]
The results indicate that model checking is suitable for handling code coverage criteria for real-world IEC 61131-3 programs, but they also reveal some potential limitations of the tool when used for test generation, such as the need to supply expected outputs manually. The evaluation in studies 2 and 5 showed that automatic test generation is efficient in terms of the time required to generate tests satisfying code and mutation coverage, and that it scales well for real-world IEC 61131-3 programs.
A Controlled Experiment
Case Studies
An industrial case study (i.e., study 4) comparing manual test suites created by
industrial engineers with test suites created automatically was performed. This
case study provided an understanding of automatic test generation and manual testing in their actual context and relied heavily on industrial resources. Unstructured interviews with several industrial engineers were used to explore the actual usage of
manual testing. We analyzed test specification documents to give input to the actual
empirical work done in this case study, e.g., manual test cases were analyzed and
executed on a set of programs provided by Bombardier Transportation. In addition,
a planned observation was used to observe how manual testing was performed at the
company. This study provided an overall view of the opportunities and limitations
of using automatic test generation in industrial practice.
In addition, study 4 provided support for improvement in selecting test goals and
fault detection. Practically, study 4 is a case study in which the cost and effectiveness
between manually and automatically created test cases were compared. In particular,
we measured the cost and effectiveness in terms of fault detection of tests created
using a coverage-adequate automatic test generation tool and manually created tests
by industrial engineers from an existing train control system. Recently developed
real-world industrial programs written in the IEC 61131-3 FBD programming
language were used. The results show that automatically generated tests achieve
similar code coverage as manually created tests but in a fraction of the time (an
improvement of roughly 90% on average). We also found that the use of an automated
test generation tool did not show better fault detection in terms of mutation score
compared to manual testing. This underscores the need to further study how manual
testing is performed in industrial practice. These findings support the hypothesis that manual testing performed by industrial engineers achieves high code coverage and good fault detection in terms of mutation score. This study suggests some issues
that would need to be addressed in order to use automatic test generation tools to
aid in testing of safety-critical embedded software.
Previously, no attempt had been made to evaluate mutation testing for control software written in the IEC 61131-3 programming language. This motivated us to further evaluate mutation-based test generation targeting this type of software in study 5. For
realistic validation we collected industrial experimental evidence on how mutation
testing compares with manual testing as well as automatic decision-coverage adequate
test generation. In the evaluation, manually seeded faults were provided by four
industrial engineers. The results show that even if mutation-based test generation
achieves better fault detection than automatic decision coverage-based test generation,
these mutation-adequate test suites are not better at detecting faults than manual
test suites. However, the mutation-based test suites are significantly less costly
to create, in terms of testing time, than manually created test suites. The results
suggest that the fault detection scores could be improved by considering some new
and improved mutation operators for IEC 61131-3 programs as well as higher-order
mutations.
4 Related Work
In this section, we identify key pieces of work that are related to the approach
presented in this thesis. This section begins with background information and then
discusses recent improvements to automatic test generation, like model checking,
and how the approach developed in this thesis benefited from these studies. We also
identify other techniques that specifically aim at testing IEC 61131-3 software, and discuss
how they are related to the techniques developed in this thesis. For the work on
comparing manual testing with automatic test generation, we discuss how other
studies are related to the results of this thesis.
4.1 Test Generation Technique and Tool Implementation
In this thesis we generate tests based on code coverage for offline execution. Recently, several
automatic test generation approaches [40, 80, 38, 69, 19] have been proposed for
IEC 61131-3 software. These techniques can typically produce tests for a given code
coverage criterion and have been shown to achieve high code coverage for various
IEC 61131-3 programs. Compared to the work in this thesis, these approaches lack tool support. In addition, CompleteTest augments previous work in this area with mutation testing.
4.2 Empirical Studies
Recently, Wang et al. [77] compared automatically generated tests with manual tests on several open-source projects. They found that automatically generated tests achieve higher code coverage but lower fault detection scores, with manual test suites also being better at discovering hard-to-cover code and hard-to-kill faults. Another closely related study by Kracht et al. [46] used EvoSuite
on a number of open-source Java projects and compared those tests with the ones
already manually created by developers. Automatically-generated tests achieved
similar code coverage and fault detection scores to manually created tests. Recently,
Shamshiri et al. [66] found that tests generated by EvoSuite achieved higher code
coverage than developer-written tests and detected 40% out of 357 real faults. The
results of this thesis indicate that, in IEC 61131-3 software development, automatic
test generation can achieve similar branch coverage to manual testing performed
by industrial engineers. However, these automatically generated tests do not show
better fault detection in terms of mutation score than manually created test suites.
The fault detection rate for automated implementation-based test generation and
manual testing was found, in some of the published studies [24, 25, 46, 77], to be
similar to the results of this thesis. Interestingly enough, our results indicate that
code coverage-adequate tests might even be slightly worse in terms of fault detection
compared to manual tests. However, a larger empirical study is needed to statistically
confirm this hypothesis.
Fraser et al. [24, 25] performed a controlled experiment and a follow-up replication
experiment on a total of 97 subjects. They found that automated test generation
performs well, achieving high code coverage but no measurable improvement over
manual testing in terms of the number of faults found by developers. Ramler et al.
[60] conducted a study and a follow-up replication [61], carried out with master’s
students and industrial professionals respectively, addressing the question of how automatic test generation with Randoop compares to manual testing. In these specific
experiments, they found that the number of faults detected by Randoop was similar
to manual testing. However, the fault detection rates for automatic implementation-
based test generation and manual specification-based testing were found [60] to be
significantly different from the experiments included in this thesis. This could stem from the fact that their subjects used Randoop rather than CompleteTest and that, in the experiments included in this thesis, subjects were given more time to manually test their programs than in previous controlled experiments. By using a more restrictive testing duration, one
would expect human participants to show less comprehensive understanding of the
task at hand.
Most studies concerning automatic test generation for mutation testing and related
to the work included in this thesis have focused on how to generate tests as quickly as
possible, improve the mutation score and/or compare with code coverage-based auto-
matic test generation [58, 81, 22, 42]. For example, mutation-based test generation
[81] is able to outperform code coverage-directed test generation in terms of mutant
killing. Frankl et al. [22] have shown that mutation testing is superior to several code coverage criteria in terms of effectiveness. This finding is in line with the results of
study 5. None of these studies has compared mutation testing with manual tests created by humans. The study of Fraser et al. [27] is, to our knowledge, the only one comparing mutation-based test generation with manual testing in terms of fault detection using manually seeded faults. They report that the
generated tests based on mutation testing find significantly more seeded faults than
manually written tests. In comparison, study 5 shows that mutation-adequate test
suites are not better at detecting seeded faults than manual test suites. This partly
stems from the fact that some of these undetected faults are not reflected in the
mutation operator list used for generating mutation adequate test suites.
The mutation operator list could be extended with other operators; this new list of mutation operators would need to be further evaluated in practice.
5 Conclusions and Future Work
The results of this thesis support the claim that automatic test generation is
efficient but currently not quite as effective as manual testing. This needs to be further
studied; we need to consider the implications and the extent to which automatic
test generation can be used in the development of reliable safety-critical industrial
control software.
To evaluate the potential application of automatic test generation techniques,
several studies are presented where CompleteTest is applied to industrial control
software from Bombardier Transportation. Even if the results are mainly based on data provided by one company, which can be seen as a limitation, we argue that having access to real industrial data from the safety-critical domain is relevant to
the advancement of automatic test generation. More studies are needed to generalize
the results of this thesis to other systems and domains.
Bibliography

[1] R. Alur. “Timed Automata”. In: Computer Aided Verification. 1999, pp. 688–688 (cit. on p. 11).
[2] R. Alur and D. Dill. “Automata for Modeling Real-time Systems”. In: Automata, Languages and Programming. Springer, 1990, pp. 322–335 (cit. on p. 11).
[3] Paul Ammann and Paul Black. “A Specification-based Coverage Metric to
Evaluate Test Sets”. In: International Journal of Reliability, Quality and Safety
Engineering. Vol. 8. 04. World Scientific, 2001, pp. 275–299 (cit. on p. 7).
[4] Paul Ammann and Jeff Offutt. Introduction to Software Testing. Cambridge
University Press, 2008 (cit. on pp. 3, 7, 8).
[5] Paul Ammann and Jeff Offutt. “Using Formal Methods to Derive Test Frames
in Category-partition Testing”. In: Conference on Safety, Reliability, Fault
Tolerance, Concurrency and Real Time, Security. 1994, pp. 69–79 (cit. on p. 7).
[6] L. Baresi, M. Mauri, A. Monti, and M. Pezze. “PLCTools: Design, Formal Vali-
dation, and Code Generation for Programmable Controllers”. In: International
Conference on Systems, Man, and Cybernetics. Vol. 4. 2000, pp. 2437–2442
(cit. on p. 16).
[7] Paul Black. “Modeling and Marshaling: Making Tests from Model Checker
Counter-Examples”. In: Digital Avionics Systems Conference. Vol. 1. IEEE,
2000, 1B3–1 (cit. on p. 16).
[8] William Bolton. Programmable Logic Controllers. Newnes, 2015 (cit. on p. 4).
[9] Sébastien Bornot, Ralf Huuck, and Ben Lukoschus. “Sequential Function
Charts”. In: Discrete Event Systems: Analysis and Control. Vol. 569. Springer,
2012, p. 255 (cit. on p. 11).
[10] Stuart Boyer. SCADA: Supervisory Control and Data Acquisition. International
Society of Automation, 2009 (cit. on p. 4).
[11] Cristian Cadar, Daniel Dunbar, and Dawson R Engler. “KLEE: Unassisted and
Automatic Generation of High-Coverage Tests for Complex Systems Programs”.
In: Symposium on Operating Systems Design and Implementation. Vol. 8.
USENIX, 2008, pp. 209–224 (cit. on pp. 8, 19).
[12] CENELEC. “50128: Railway Application–Communications, Signaling and Pro-
cessing Systems–Software for Railway Control and Protection Systems”. In:
Standard Report. 2001 (cit. on pp. 7, 10).
[13] Ilinca Ciupa, Bertrand Meyer, Manuel Oriol, and Alexander Pretschner. “Find-
ing Faults: Manual Testing vs. Random+ Testing vs. User Reports”. In: In-
ternational Symposium on Software Reliability Engineering (ISSRE). 2008,
pp. 157–166 (cit. on p. 17).
[14] SJ Cunning and Jerzy W Rozenblit. “Automatic Test Case Generation from
Requirements Specifications for Real-Time Embedded Systems”. In: Interna-
tional Conference on Systems, Man, and Cybernetics. Vol. 5. 1999, pp. 784–789
(cit. on p. 17).
[15] Raffaello D’Andrea and Geir E Dullerud. “Distributed Control Design for
Spatially Interconnected Systems”. In: Transactions on Automatic Control.
Vol. 48. 9. IEEE, 2003, pp. 1478–1495 (cit. on p. 4).
[16] Richard A DeMillo, Richard J Lipton, and Frederick G Sayward. “Hints on Test
Data Selection: Help for the Practicing Programmer”. In: Computer. Vol. 11. 4.
IEEE, 1978, pp. 34–41 (cit. on pp. 8, 13).
[17] Jeremy Dick and Alain Faivre. “Automating the Generation and Sequencing of
Test Cases from Model-based Specifications”. In: International Symposium of
Formal Methods Europe. 1993, pp. 268–284 (cit. on p. 6).
[18] Henning Dierks. “PLC-Automata: A New Class of Implementable Real-Time
Automata”. In: Theoretical Computer Science. Vol. 253. 1. Elsevier, 2001,
pp. 61–93 (cit. on p. 11).
[19] Kivanc Doganay, Markus Bohlin, and Ola Sellin. “Search Based Testing of
Embedded Systems Implemented in IEC 61131-3: An Industrial Case Study”.
In: International Conference on Software Testing, Verification and Validation
Workshops. IEEE, 2013, pp. 425–432 (cit. on pp. 16, 17).
[20] Sigrid Eldh. “On Test Design”. PhD Thesis. Mälardalen University Press, 2011 (cit. on p. 3).
[21] Phyllis G Frankl and Stewart N Weiss. “An Experimental Comparison of the
Effectiveness of Branch Testing and Data Flow Testing”. In: Transactions on
Software Engineering. Vol. 19. 8. IEEE, 1993, pp. 774–787 (cit. on p. 17).
[22] Phyllis G Frankl, Stewart N Weiss, and Cang Hu. “All-uses vs Mutation Testing:
an Experimental Comparison of Effectiveness”. In: Journal of Systems and
Software. Vol. 38. 3. Elsevier, 1997, pp. 235–253 (cit. on p. 18).
[23] Gordon Fraser and Andrea Arcuri. “Evosuite: Automatic Test Suite Generation
for Object-oriented Software”. In: Conference on Foundations of Software
Engineering. ACM, 2011, pp. 416–419 (cit. on pp. 8, 16, 19).
[24] Gordon Fraser, Matt Staats, Phil McMinn, Andrea Arcuri, and Frank Padberg.
“Does Automated Unit Test Generation Really Help Software Testers? A
Controlled Empirical Study”. In: Transactions on Software Engineering and
Methodology. Vol. 24. 4. ACM, 2014, p. 23 (cit. on pp. 17, 18).
[25] Gordon Fraser, Matt Staats, Phil McMinn, Andrea Arcuri, and Frank Padberg.
“Does Automated White-Box Test Generation Really Help Software Testers?”
In: International Symposium on Software Testing and Analysis. ACM, 2013,
pp. 291–301 (cit. on pp. 17, 18).
[26] Gordon Fraser, Franz Wotawa, and Paul E Ammann. “Testing with Model
Checkers: a Survey”. In: Journal on Software Testing, Verification and Relia-
bility. Vol. 19. 3. Wiley, 2009, pp. 215–261 (cit. on pp. 16, 17).
[27] Gordon Fraser and Andreas Zeller. “Mutation-Driven Generation of Unit Tests and Oracles”. In: Transactions on Software Engineering. Vol. 38. 2. IEEE, 2012, pp. 278–292 (cit. on p. 19).
[28] Gregory Gay, Matt Staats, Michael Whalen, and Mats Heimdahl. “The Risks
of Coverage-Directed Test Case Generation”. In: Transactions on Software
Engineering. Vol. 41. 8. IEEE, 2015, pp. 803–819 (cit. on pp. 8, 11).
[29] David Harel. “Statecharts: A Visual Formalism for Complex Systems”. In:
Science of Computer Programming. Vol. 8. 3. Elsevier, 1987, pp. 231–274 (cit.
on p. 11).
[30] Mats Heimdahl. “Safety and Software Intensive Systems: Challenges Old and
New”. In: Future of Software Engineering. 2007, pp. 137–152 (cit. on p. 9).
[31] Mats Heimdahl, Devaraj George, and Robert Weber. “Specification Test Cov-
erage Adequacy Criteria = Specification Test Generation Inadequacy Criteria”.
In: International Symposium on High Assurance Systems Engineering. 2004,
pp. 178–186 (cit. on p. 17).
[32] Teruo Higashino, Akio Nakata, Kenichi Taniguchi, and Ana R Cavalli. “Gener-
ating Test Cases for a Timed I/O Automaton Model”. In: Testing of Commu-
nicating Systems. Springer, 1999, pp. 197–214 (cit. on p. 7).
[33] Hyoung Seok Hong, Insup Lee, Oleg Sokolsky, and Hasan Ural. “A Temporal
Logic-Based Theory of Test Coverage and Generation”. In: Tools and Algorithms
for the Construction and Analysis of Systems. Springer, 2002, pp. 327–341
(cit. on p. 16).
[34] William E Howden. “Weak Mutation Testing and Completeness of Test Sets”.
In: Transactions on Software Engineering. 4. IEEE, 1982, pp. 371–379 (cit. on
p. 8).
[35] Monica Hutchins, Herb Foster, Tarak Goradia, and Thomas Ostrand. “Experi-
ments of the Effectiveness of Dataflow-and Controlflow-based Test Adequacy
Criteria”. In: International Conference on Software Engineering. 1994, pp. 191–
200 (cit. on p. 17).
[36] IEC. “International Standard on 61131-3 Programming Languages”. In: Pro-
grammable Controllers. IEC Library, 2014 (cit. on p. 5).
[37] Laura Inozemtseva and Reid Holmes. “Coverage is Not Strongly Correlated with
Test Suite Effectiveness”. In: International Conference on Software Engineering.
ACM, 2014, pp. 435–445 (cit. on pp. 8, 17).
[38] Marcin Jamro. “POU-Oriented Unit Testing of IEC 61131-3 Control Software”.
In: Transactions on Industrial Informatics. Vol. 11. 5. IEEE, 2015 (cit. on
pp. 16, 17).
[39] E. Jee, S. Kim, S. Cha, and I. Lee. “Automated Test Coverage Measurement for
Reactor Protection System Software Implemented in Function Block Diagram”.
In: Journal on Computer Safety, Reliability, and Security. Springer, 2010,
pp. 223–236 (cit. on p. 16).
[40] Eunkyoung Jee, Donghwan Shin, Sungdeok Cha, Jang-Soo Lee, and Doo-Hwan
Bae. “Automated Test Case Generation for FBD Programs Implementing
Reactor Protection System Software”. In: Software Testing, Verification and
Reliability. Vol. 24. 8. Wiley, 2014 (cit. on pp. 16, 17).
[41] Yue Jia and Mark Harman. “Higher Order Mutation Testing”. In: Information
and Software Technology. Vol. 51. 10. Elsevier, 2009, pp. 1379–1393 (cit. on
p. 8).
[42] Bryan F Jones, David E Eyres, and H-H Sthamer. “A Strategy for using Genetic
Algorithms to Automate Branch and Fault-based Testing”. In: The Computer
Journal. Vol. 41. 2. British Computer Society, 1998, pp. 98–107 (cit. on
p. 18).
[43] René Just, Darioush Jalali, Laura Inozemtseva, Michael D Ernst, Reid Holmes,
and Gordon Fraser. “Are Mutants a Valid Substitute for Real Faults in Software
Testing?” In: International Symposium on Foundations of Software Engineering.
ACM, 2014, pp. 654–665 (cit. on p. 8).
[44] Yunho Kim, Youil Kim, Taeksu Kim, Gunwoo Lee, Yoonkyu Jang, and Moon-
zoo Kim. “Automated Unit Testing of Large Industrial Embedded Software
using Concolic Testing”. In: International Conference on Automated Software
Engineering. IEEE, 2013, pp. 519–528 (cit. on p. 8).
[45] Eric D Knapp and Joel Thomas Langill. Industrial Network Security: Securing
critical infrastructure networks for smart grid, SCADA, and other Industrial
Control Systems. Syngress, 2014 (cit. on p. 4).
[46] Jeshua S Kracht, Jacob Z Petrovic, and Kristen R Walcott-Justice. “Empirically
Evaluating the Quality of Automatically Generated and Manually Written
Test Suites”. In: International Conference on Quality Software. IEEE, 2014,
pp. 256–265 (cit. on pp. 17, 18).
[47] Moez Krichen and Stavros Tripakis. “Black-box Conformance Testing for Real-
time Systems”. In: International SPIN Workshop on Model Checking of Software.
2004, pp. 109–126 (cit. on p. 7).
[48] K.G. Larsen, P. Pettersson, and W. Yi. “UPPAAL in a Nutshell”. In: Interna-
tional Journal on Software Tools for Technology Transfer (STTT). Vol. 1. 1.
Springer, 1997, pp. 134–152 (cit. on p. 12).
[49] Tadao Murata. “Petri Nets: Properties, Analysis and Applications”. In: Pro-
ceedings of the IEEE. Vol. 77. 4. IEEE, 1989, pp. 541–580 (cit. on p. 11).
[50] Akbar Siami Namin and James H Andrews. “The Influence of Size and Coverage
on Test Suite Effectiveness”. In: International Symposium on Software Testing
and Analysis. 2009, pp. 57–68 (cit. on p. 17).
[51] Jeff Offutt and Aynur Abdurazik. “Generating Tests from UML Specifications”.
In: International Conference on the Unified Modeling Language. 1999, pp. 416–
429 (cit. on p. 6).
[52] Jeff Offutt, Yiwei Xiong, and Shaoying Liu. “Criteria for Generating Specification-
based Tests”. In: International Conference on Engineering of Complex Computer
Systems. 1999, pp. 119–129 (cit. on p. 7).
[53] M. Öhman, S. Johansson, and K.E. Årzén. “Implementation Aspects of the
PLC standard IEC 1131-3”. In: Journal on Control Engineering Practice. Vol. 6.
4. Elsevier, 1998, pp. 547–555 (cit. on p. 5).
[54] Alessandro Orso and Gregg Rothermel. “Software Testing: a Research Travel-
ogue (2000–2014)”. In: Proceedings of the International conference on Software
Engineering (ICSE), Future of Software Engineering (2014), pp. 117–132 (cit.
on pp. 3, 7, 9).
[55] Thomas J. Ostrand and Marc J. Balcer. “The Category-partition Method for
Specifying and Generating Functional Tests”. In: Communications of the ACM.
Vol. 31. 6. ACM, 1988, pp. 676–686 (cit. on p. 7).
[56] Tolga Ovatman, Atakan Aral, Davut Polat, and Ali Osman Ünver. “An Overview
of Model Checking Practices on Verification of PLC Software”. In: Software &
Systems Modeling. Springer, 2014, pp. 1–24 (cit. on p. 19).
[57] Carlos Pacheco, Shuvendu K Lahiri, Michael D Ernst, and Thomas Ball.
“Feedback-Directed Random Test Generation”. In: International Conference on
Software Engineering. IEEE, 2007, pp. 75–84 (cit. on p. 16).
[58] Mike Papadakis and Nicos Malevris. “Automatic Mutation Test Case Generation
via Dynamic Symbolic Execution”. In: International Symposium on Software
Reliability Engineering. 2010, pp. 121–130 (cit. on p. 18).
[59] Alexander Pretschner, Wolfgang Prenninger, Stefan Wagner, Christian Kühnel,
Martin Baumgartner, Bernd Sostawa, Rüdiger Zölch, and Thomas Stauner.
“One Evaluation of Model-based Testing and its Automation”. In: International
Conference on Software Engineering. 2005, pp. 392–401 (cit. on p. 17).
[60] Rudolf Ramler, Dietmar Winkler, and Martina Schmidt. “Random Test Case
Generation and Manual Unit Testing: Substitute or Complement in Retrofitting
Tests for Legacy Code?” In: Euromicro Conference on Software Engineering
and Advanced Application. 2012, pp. 286–293 (cit. on pp. 17, 18).
[61] Rudolf Ramler, Klaus Wolfmaier, and Theodorich Kopetzky. “A Replicated
Study on Random Test Case Generation and Manual Unit Testing: How
Many Bugs Do Professional Developers Find?” In: Computer Software and
Applications Conference. IEEE, 2013, pp. 484–491 (cit. on pp. 17, 18).
[75] Jan Tretmans. “Model Based Testing with Labelled Transition Systems”. In:
Formal Methods and Testing. Springer, 2008, pp. 1–38 (cit. on p. 7).
[76] Willem Visser, Corina S Pasareanu, and Sarfraz Khurshid. “Test Input Genera-
tion with Java PathFinder”. In: SIGSOFT Software Engineering Notes. Vol. 29.
4. ACM, 2004, pp. 97–107 (cit. on pp. 16, 19).
[77] Xiaoyin Wang, Lingming Zhang, and Philip Tanofsky. “Experience Report:
How is Dynamic Symbolic Execution Different from Manual Testing? A Study
on KLEE”. In: International Symposium on Software Testing and Analysis.
ACM, 2015, pp. 199–210 (cit. on pp. 17, 18).
[78] Michael Whalen, Ajitha Rajan, Mats Heimdahl, and Steven Miller. “Coverage
Metrics for Requirements-based Testing”. In: International Symposium on
Software Testing and Analysis. 2006, pp. 25–36 (cit. on p. 7).
[79] MR Woodward and K Halewood. “From Weak to Strong, Dead or Alive? An
Analysis of Some Mutation Testing Issues”. In: Workshop on Software Testing,
Verification, and Analysis. 1988, pp. 152–158 (cit. on p. 8).
[80] Yi-Chen Wu and Chin-Feng Fan. “Automatic Test Case Generation for Struc-
tural Testing of Function Block Diagrams”. In: Information and Software
Technology. Vol. 56. 10. Elsevier, 2014 (cit. on pp. 16, 17).
[81] Lingming Zhang, Tao Xie, Lu Zhang, Nikolai Tillmann, Jonathan De Halleux,
and Hong Mei. “Test Generation via Dynamic Symbolic Execution for Mutation
Testing”. In: International Conference on Software Maintenance (ICSM). 2010,
pp. 1–10 (cit. on p. 18).
[82] Sai Zhang, David Saff, Yingyi Bu, and Michael D Ernst. “Combined Static
and Dynamic Automated Test Generation”. In: International Symposium on
Software Testing and Analysis. ACM, 2011, pp. 353–363 (cit. on p. 8).