SV Book Part1
1 BRICS, Department of Computer Science, Aalborg University, 9220 Aalborg Ø, Denmark.
2 Department of Computer Science, School of Science and Engineering, Reykjavík University, Iceland
Contents
Preface
1 Introduction
1.1 What are reactive systems?
1.2 Process algebras
3 Behavioural equivalences
3.1 Criteria for a good behavioural equivalence
3.2 Trace equivalence: a first attempt
3.3 Strong bisimilarity
3.4 Weak bisimilarity
3.5 Game characterization of bisimilarity
3.5.1 Weak bisimulation games
3.6 Further results on equivalence checking
8 Introduction
Bibliography
Index
Preface
This book is based on courses that have been held at Aalborg University and Reykjavík University over the past five to six years. The aim of those semester-long courses was to introduce computer science students, in the early stages of their MSc degrees or late in their BSc studies, to the theory of concurrency and to its applications in the modelling and analysis of reactive systems. This is an area
of formal methods that is finding increasing application outside academic circles,
and allows the students to appreciate how techniques and software tools based on
sound theoretical principles are very useful in the design and analysis of non-trivial
reactive computing systems.
In order to carry this message across to the students in the most effective way,
the courses on which the material in this book is based presented
• some of the prime models used in the theory of concurrency (with special
emphasis on state-transition models of computation like labelled transition
systems and timed automata),
• languages for describing actual systems and their specifications (with focus
on classic algebraic process calculi like Milner’s Calculus of Communicating
Systems and logics like modal and temporal logics), and
• their embodiment in tools for the automatic verification of computing sys-
tems.
The use of the theory and the associated software tools in the modelling and anal-
ysis of computing systems is a very important component in our courses since it
gives the students hands-on experience in the application of what they have learned,
and reinforces their belief that the theory they are studying is indeed useful and
worth mastering. Once we have succeeded in awakening an interest in the theory
of concurrency and its applications amongst our students, it will be more likely
that at least some of them will decide to pursue a more in-depth study of the more
advanced, and mathematically sophisticated, aspects of our field—for instance,
during their MSc thesis work or at a doctoral level.
It has been very satisfying for us to witness a change of attitudes in the students
taking our courses over the years. Indeed, we have gone from a state in which most
of the students saw very little point in taking the course on which this material is
based, to one in which the relevance of the material we cover is uncontroversial
to many of them! At the time when an early version of our course was elective at
Aalborg University, and taken only by a few mathematically inclined individuals,
one of our students remarked in his course evaluation form that ‘This course ought
to be mandatory for computer science students.' Now that the course is mandatory, it is
attended by all of the MSc students in computer science at Aalborg University, and
most of them happily play with the theory and tools we introduce in the course.
How did this change in attitude come about? And why do we believe that this
is an important change? In order to answer these questions, it might be best to
describe first the general area of computer science to which this textbook aims to contribute.
The correctness problem and its importance Computer scientists build arti-
facts (implemented in hardware, software or, as is the case in the fast-growing area
of embedded and interactive systems, using a combination of both) that are sup-
posed to offer some well defined services to their users. Since these computing
systems are deployed in very large numbers, and often control crucial, if not safety
critical, industrial processes, it is vital that they correctly implement the specifica-
tion of their intended behaviour. The problem of ascertaining whether a computing
system does indeed offer the behaviour described by its specification is called the
correctness problem, and is one of the most fundamental problems in computer
science. The field of computer science that studies languages for the description of
(models of) computer systems and their specifications, and (possibly automated)
methods for establishing the correctness of systems with respect to their specifica-
tions is called algorithmic verification.
Despite their fundamental scientific and practical importance, however, twen-
tieth century computer and communication technology has not paid sufficient at-
tention to issues related to correctness and dependability of systems in its drive
toward faster and cheaper products. (See the editorial (Patterson, 2005) by David
Patterson, former president of the ACM, for forceful arguments to this effect.)
As a result, system crashes are commonplace, sometimes leading to very costly,
when not altogether spectacular, system failures like the bug in the Intel Pentium's
floating-point division unit (Pratt, 1995) and the crash of the Ariane-5 rocket due
to a conversion of a 64-bit real number to a 16-bit integer (Lions, 1996).
Classic engineering disciplines have a time-honoured and effective approach to
building artifacts that meet their intended specifications: before actually construct-
ing the artifacts, engineers develop models of the design to be built and subject
them to a thorough analysis. Surprisingly, such an approach has only recently been
used extensively in the development of computing systems.
This textbook, and the courses we have given over the years based on the mate-
rial it presents, stem from our deep conviction that each well educated twenty-first
century computer scientist should be well versed in the technology of algorithmic,
model-based verification. Indeed, as recent advances in algorithmic verification
and applications of model checking (Clarke, Grumberg and Peled, 1999) have
shown, the tools and ideas developed within these fields can be used to analyze
designs of considerable complexity that, until a few years ago, were thought to be
intractable using formal analysis and modelling tools. (Companies such as AT&T,
Cadence, Fujitsu, HP, IBM, Intel, Motorola, NEC, Siemens and Sun—to mention
but a few—are using these tools increasingly on their own designs to reduce time
to market and ensure product quality.)
We believe that the availability of automatic software tools for model-based
analysis of systems is one of the two main factors behind the increasing interest
amongst students and practitioners alike in model-based verification technology.
Another is the realization that even small reactive systems—for instance, relatively
short concurrent algorithms—exhibit very complex behaviours due to their inter-
active nature. Unlike in the setting of sequential software, it is therefore not hard
for the students to realize that systematic and formal analysis techniques are useful,
when not altogether necessary, to obtain some level of confidence in the correctness
of our designs. The tool support that is now available to explore the behaviour of
models of systems expressed as collections of interacting state machines of some
sort makes the theory presented in this textbook very appealing for many students
at several levels of their studies.
It is our firmly held belief that only by teaching the beautiful theory of con-
current systems, together with its applications and associated verification tools, to
our students, we shall be able to transfer the available technology to industry, and
improve the reliability of embedded software and other reactive systems. We hope
that this textbook will offer a small contribution to this pedagogical endeavour.
Why this book? This book is by no means the first one devoted to aspects of
the theory of reactive systems. Some of the books that have been published in
this area over the last twenty years or so are the references (Baeten and Weijland,
1990; Fokkink, 2000; Hennessy, 1988; Hoare, 1985; Magee and Kramer, 1999;
Milner, 1989; Roscoe, 1999; Schneider, 1999; Stirling, 2001) to mention but a
few. However, unlike all the aforementioned books but (Fokkink, 2000; Magee and
Kramer, 1999; Schneider, 1999), the present book was explicitly written to serve as
a textbook, and offers a distinctive pedagogical approach to its subject matter that
derives from our extensive use of the material presented here in book form in the
classroom. In writing this textbook we have striven to transfer to paper the spirit
of the lectures on which this text is based. Our readers will find that the style in
which this book is written is often colloquial, and attempts to mimic the Socratic
dialogue with which we try to entice our student audience to take active part in the
lectures and associated exercise sessions. Explanations of the material presented
in this textbook are interspersed with questions to our readers and exercises that
invite the readers to check straight away whether they understand the material as
it is being presented. We believe that this makes this book suitable for self-study,
as well as for use as the main reference text in courses ranging from advanced
BSc courses to MSc courses in computer science and related subjects.
Of course, it is not up to us to say whether we have succeeded in conveying
the spirit of the lectures in the book you now hold in your hands, but we sincerely
hope that our readers will experience some of the excitement that we still have in
teaching our courses based on this material, and in seeing our students appreciate
it, and enjoy working with concurrency theory and the tools it offers to analyze
reactive systems.
For the instructor We have used much of the material presented in this textbook
in several one-semester courses at Aalborg University and at Reykjavík University,
amongst others. These courses usually consist of about thirty hours of lectures and
a similar number of hours of exercise sessions, where the students solve exercises
and work on projects related to the material in the course. As we already stated
above, we strongly believe that these practical sessions play a very important role
in making the students appreciate the importance of the theory they are learning,
and understand it in depth. Examples of recent courses based on this book may be
found at the URL
http://www.cs.aau.dk/rsbook/.
There the instructor will find suggested schedules for his/her courses, exercises that
can be used to supplement those in the textbook, links to other useful teaching re-
sources available on the web, further suggestions for student projects and electronic
slides that can be used for the lectures. (As an example, we usually supplement
lectures covering the material in this textbook with a series of four to six 45-minute
lectures on Binary Decision Diagrams (Bryant, 1992) and their use in verification
based on Henrik Reif Andersen’s excellent lecture notes (Andersen, 1998) that are
freely available on the web and on Randal Bryant's survey paper (Bryant, 1992).)
We strongly recommend that the teaching of the material covered in this book
be accompanied by the use of software tools for verification and validation. In our
http://www.cs.aau.dk/rsbook/.
In writing this book, we have tried to be at once pedagogical, careful and precise.
However, despite our efforts, we are sure that there is still room for improving
this text, and for correcting any mistake that may have escaped our attention. We
shall use the aforementioned web page to inform the reader about additions and
modifications to this book.
We welcome corrections (typographical or otherwise), comments and sugges-
tions from our readers. You can contact us by sending an email to the address
studies in Edinburgh, and would not have been possible without them. Even though
the other three authors were not students of Milner’s themselves, the strong intel-
lectual influence of his work and writings on their view of concurrency theory will
be evident to the readers of this book. Indeed, the ‘Edinburgh concurrency theory
school’ features prominently in the academic genealogy of each of the authors. For
example, Rocco De Nicola and Matthew Hennessy had a strong influence on the
view of concurrency theory and the work of Luca Aceto and/or Anna Ingolfsdottir,
and Jiri Srba enjoyed the liberal supervision of Mogens Nielsen.
The material upon which the courses we have held at Aalborg University and
elsewhere since the late 1980s were based has undergone gradual changes before
reaching the present form. Over the years, the part of the course devoted to Milner’s
Calculus of Communicating Systems and its underlying theory has decreased, and
so has the emphasis on some topics of mostly theoretical interest. At the same time,
the course material has grown to include models and specification languages for
real-time systems. The present material aims at offering a good balance between
classic and real-time systems, and between the theory and its applications.
Overall, as already stated above, the students’ appreciation of the theoretical
material covered here has been greatly increased by the availability of software
tools based on it. We thank all of the developers of the tools we use in our teaching;
their work has made our subject matter come alive for our students, and has been
instrumental in achieving whatever level of success we might have in our teaching
based on this textbook.
This book was partly written while Luca Aceto was on leave from Aalborg
University at Reykjavík University, Anna Ingolfsdottir was working at deCODE
Genetics, and Jiri Srba was visiting the University of Stuttgart sponsored by a
grant from the Alexander von Humboldt Foundation. They thank these institu-
tions for their hospitality and excellent working conditions. Luca Aceto and Anna
Ingolfsdottir were partly supported by the project ‘The Equational Logic of Paral-
lel Processes' (nr. 060013021) of The Icelandic Research Fund. Jiří Srba received
partial support from a grant of the Ministry of Education of the Czech Republic,
project No. 1M0545.
We thank Silvio Capobianco, Pierre-Louis Curien, Gudmundur Hreidarson,
Rocco De Nicola, Ralph Leibmann, MohammadReza Mousavi, Guy Vidal-Naquet
and the students of the Concurrency Course (Concurrence) (number 2–3) 2004–
2005, Master Parisien de Recherche en Informatique, for useful comments and
corrections on drafts of this text.
The authors used drafts of the book in courses taught in the spring of 2004,
2005 and 2006, and in the autumn of 2006, at Aalborg University, Reykjavík Univer-
sity and the University of Iceland. The students who took those courses offered
valuable feedback on the text, and gave us detailed lists of errata. We thank Claus
Brabrand for using a draft of the first part of this book in his course Semantics (Q1,
2005 and 2006) at Aarhus University. The suggestions from Claus and his stu-
dents helped us improve the text further. Moreover, Claus and Martin Mosegaard
designed and implemented an excellent simulator for Milner’s Calculus of Com-
municating Systems and the ‘bisimulation-game game’ that our students can use
to experiment with the behaviour of processes written in this language, and to play
the bisimulation game described in the textbook.
Last, but not least, we are thankful to David Tranah at Cambridge University
Press for his enthusiasm for our project, and to the three anonymous reviewers that
provided useful comments on a draft of this book.
Any remaining infelicity is solely our responsibility.
Luca Aceto and Anna Ingolfsdottir dedicate this book to their son Róbert, to
Anna’s sons Logi and Kári, and to Luca’s mother, Imelde Diomede Aceto. Kim
G. Larsen dedicates the book to his wife Merete and to his two daughters Mia and
Trine. Finally, Jiří Srba dedicates the book to his parents Jaroslava and Jiří, and to
his wife Vanda.
Luca Aceto and Anna Ingolfsdottir, Reykjavík, Iceland
Kim G. Larsen and Jiří Srba, Aalborg, Denmark
Part I
Chapter 1
Introduction
The aim of the first part of this book is to introduce three of the basic notions that
we shall use to describe, specify and analyze reactive systems, namely
• Milner's Calculus of Communicating Systems (CCS) (Milner, 1989),
• the model of labelled transition systems (LTSs), and
• Hennessy-Milner Logic (HML) (Hennessy and Milner, 1985) and its extension with recursive definitions of formulae (Larsen, 1990).
We shall present a general theory of reactive systems and its applications. In par-
ticular, we intend to show how
1. to describe actual systems using terms in our chosen models (that is, either
as terms in the process description language CCS or as labelled transition
systems),
After having worked through the material in this book, you will be able to
describe non-trivial reactive systems and their specifications using the aforemen-
tioned models, and verify the correctness of a model of a system with respect to
given specifications either manually or by using automatic verification tools like
the Edinburgh Concurrency Workbench (Cleaveland et al., 1993) and the model
checker for real-time systems UPPAAL (Behrmann et al., 2004).
Our, somewhat ambitious, aim is therefore to present a model of reactive sys-
tems that supports their design, specification and verification. Moreover, since
many real-life systems are hard to analyze manually, we should like to have com-
puter support for our verification tasks. This means that all the models and lan-
guages that we shall use in this book need to have a formal syntax and semantics.
(The syntax of a language consists of the rules governing the formation of state-
ments, whereas its semantics assigns meaning to each of the syntactically correct
statements in the language.) These requirements of formality are not only neces-
sary in order to be able to build computer tools for the analysis of systems’ descrip-
tions, but are also fundamental in agreeing upon what the terms in our models are
actually intended to describe in the first place. Moreover, as Donald Knuth once
wrote:
A person does not really understand something until after teaching
it to a computer, i.e. expressing it as an algorithm. ... An attempt to
formalize things as algorithms leads to a much deeper understanding
than if we simply try to comprehend things in the traditional way.
The pay-off of using formal models with an explicit formal semantics to describe
our systems will therefore be the possibility of devising algorithms for the anima-
tion, simulation and verification of system models. These would be impossible to
obtain if our models were specified only in an informal notation.
Now that we know what to expect from this book, it is time to get to work.
We shall begin our journey through the beautiful land of Concurrency Theory by
introducing a prototype description language for reactive systems and its seman-
tics. However, before setting off on such an enterprise, we should describe in more
detail what we actually mean by the term 'reactive system'.
S = z ← x; x ← y; y ← z
where the state s[x ↦ s(y), y ↦ s(x), z ↦ s(x)] is the one in which the value of
variable x is the value of y in state s and that of variables y and z is the value of x
in state s. The values of all of the other variables are those they had in state s. This
state transformation is a way of formally describing that the intended effect of S is
essentially to swap the values of the variables x and y.
On the other hand, the effect of the program

U = while true do skip,

where we use skip to stand for a 'no operation', is described by the partial function from states to states that is undefined on every state, that is, the always undefined function. This captures the fact that the computation
of U never produces a result (final state) irrespective of the initial state.
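This state-transformer view can be made concrete in a few lines of Python (our own illustrative encoding; the names run_S and run_U are hypothetical, not part of the book's notation): a state is a dictionary from variable names to values, a terminating program denotes a function on states, and the nonterminating program U denotes the everywhere-undefined partial function, modelled here by returning None.

```python
# A state is a dict mapping variable names to integer values.
def run_S(s):
    """Denotation of S = z <- x; x <- y; y <- z: swaps x and y via z."""
    t = dict(s)          # all other variables keep the values they had in s
    t["z"] = s["x"]
    t["x"] = s["y"]
    t["y"] = t["z"]      # t["z"] holds the old value of x
    return t

def run_U(s):
    """Denotation of the nonterminating program U: undefined on every
    state, modelled as the always-None partial function."""
    return None

print(run_S({"x": 1, "y": 2, "z": 0}))  # x and y swapped; z = old x
```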
In this view of computing systems, non-termination is a highly undesirable
phenomenon. An algorithm that fails to terminate on some inputs is not one the
users of a computing system would expect to have to use. A moment of reflection,
however, should make us realize that we already use many computing systems
whose behaviour cannot be readily described as a function from inputs to outputs—
not least because, at some level of abstraction, these systems are inherently meant
to be non-terminating. Examples of such computing systems are
• operating systems,
• communication protocols,
• (Process) Algebra,
• Logic.
prototype specification languages for reactive systems. They evolved from the in-
sights of many outstanding researchers over the last thirty years, and a brief history
of the evolution of the original ideas that led to their development may be found
in (Baeten, 2005). (For an accessible, but more advanced, discussion of the role that
algebra plays in process theory you may consult the survey paper (Luttik, 2006).)
A crucial initial observation that is at the heart of the notion of process algebra
is due to Milner, who noticed that concurrent processes have an algebraic struc-
ture. For example, once we have built two processes P and Q, we can form a
new process by combining P and Q sequentially or in parallel. The result of these
combinations will be a new process whose behaviour depends on that of P and Q
and on the operation that we have used to compose them. This is the first sense
in which these description languages are algebraic: they consist of a collection of
operations for building new process descriptions from existing ones.
Since these languages aim at specifying parallel processes that may interact
with one another, a key issue that needs to be addressed is how to describe commu-
nication/interaction between processes running at the same time. Communication
amounts to information exchange between a process that produces the informa-
tion (the sender), and a process that consumes it (the receiver). We often think of
this communication of information as taking place via some medium that connects
the sender and the receiver. If we are to develop a theory of communicating sys-
tems based on this view, it looks as if we have to decide upon the communication
medium used in inter-process communication. Several possible choices immedi-
ately come to mind. Processes may communicate via, e.g., (un)bounded buffers,
shared variables, some unspecified ether, or the tuple spaces used by Linda-like
languages (Gelernter, 1985). Which one do we choose? The answer is not at all
clear, and each specific choice may in fact reduce the applicability of our language
and the models that support it. A language that can properly describe processes that
communicate via, say, FIFO buffers may not readily allow us to specify situations
in which processes interact via shared variables, say.
The solution to this riddle is both conceptually simple and general. One of the
crucial original insights of figures like Hoare and Milner is that we need not distin-
guish between active components like senders and receivers, and passive ones like
the aforementioned kinds of communication media. All of these may be viewed as
processes—that is, as systems that exhibit behaviour. All of these processes can in-
teract via message-passing modelled as synchronized communication, which is the
only basic mode of interaction. This is the key idea underlying Hoare’s Communi-
cating Sequential Processes (CSP) (Hoare, 1978; Hoare, 1985), a highly influential
proposal for a programming language for parallel programs, and Milner’s Calcu-
lus of Communicating Systems (CCS) (Milner, 1989), the paradigmatic process
algebra.
Chapter 2

The Language CCS
We shall now introduce the language CCS. We begin by informally presenting the
process constructions allowed in this language and their semantics in Section 2.1.
We then proceed to put our developments on a more formal footing in Section 2.2.
[Figure: the process CS, with ports coffee, coin and pub.]
The most basic process of all is the process 0 (read 'nil'). This is the most boring process imaginable, as it
performs no action whatsoever. The process 0 offers the prototypical example of a
deadlocked behaviour—one that cannot proceed any further in its computation.
The most basic process constructor in CCS is action prefixing. Two example
processes built using 0 and action prefixing are a match and a complex match,
described by the expressions strike.0 and take.strike.0, respectively. Intuitively, a match is a process that dies when stricken (i.e., that
becomes the process 0 after executing the action strike), and a complex match is
one that needs to be taken before it can behave like a match. More generally, the
formation rule for action prefixing says that:
If P is a process and a is a label, then a.P is a process.
The idea is that a label, like strike or pub, will denote an input or output action on
a communication port, and that the process a.P is one that begins by performing
action a and behaves like P thereafter.
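As an illustrative sketch (our own Python encoding, not part of the book's formalism), 0 and action prefixing can be modelled by objects that report their outgoing transitions: 0 has none, while a.P has exactly one transition, labelled a and leading to P.

```python
class Nil:
    """The deadlocked process 0: no transitions at all."""
    def transitions(self):
        return []

class Prefix:
    """Action prefixing a.P: perform action a, then behave like P."""
    def __init__(self, action, cont):
        self.action, self.cont = action, cont
    def transitions(self):
        return [(self.action, self.cont)]

# Match = take.strike.0
match_proc = Prefix("take", Prefix("strike", Nil()))
print([a for a, _ in match_proc.transitions()])  # ['take']
```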
We have already mentioned that processes can be given names, very much like
procedures can. This means that we can introduce names for (complex) processes,
and that we can use these names in defining other process descriptions. For in-
stance, we can give the name Match to the complex match thus:
Match def= take.strike.0 .
Note that, since the process name Clock is a short-hand for the term on the right-
hand side of the above equation, we may repeatedly replace the name Clock with
its definition to obtain that
Clock def= tick.Clock
= tick.tick.Clock
= tick.tick.tick.Clock
...
= tick. ... .tick.Clock (with n occurrences of tick),
Choice The CCS constructs that we have presented so far would not allow us
to describe the behaviour of a vending machine that allows its paying customer
to choose between tea and coffee, say. In order to allow for the description of
processes whose behaviour may follow different patterns of interaction with their
environment, CCS offers the choice operator, which is written ‘+’. For example, a
vending machine offering either tea or coffee may be described thus:
CTM def= coin.(coffee.CTM + tea.CTM) . (2.2)
The idea here is that, after having received a coin as input, the process CTM is will-
ing to deliver either coffee or tea, depending on its customer’s choice. In general,
the formation rule for choice states that:

If P and Q are processes, then so is P + Q.

The process P + Q is one that has the initial capabilities of both P and Q. However,
choosing to perform initially an action from P will pre-empt the further execution
of actions from Q, and vice versa.
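In the same illustrative Python encoding used above (our own, with a hypothetical Rec helper for tying the recursive knot in named process definitions), choice simply takes the union of the summands' initial transitions, as in the vending machine CTM:

```python
class Nil:
    def transitions(self): return []

class Prefix:
    def __init__(self, action, cont): self.action, self.cont = action, cont
    def transitions(self): return [(self.action, self.cont)]

class Choice:
    """P + Q: the union of the initial capabilities of P and Q; taking a
    transition from one summand discards the other."""
    def __init__(self, p, q): self.p, self.q = p, q
    def transitions(self): return self.p.transitions() + self.q.transitions()

class Rec:
    """A named process given by a lazily evaluated body (e.g. CTM)."""
    def __init__(self, body): self.body = body
    def transitions(self): return self.body().transitions()

# CTM = coin.(coffee.CTM + tea.CTM)
CTM = Rec(lambda: Prefix("coin",
                         Choice(Prefix("coffee", CTM), Prefix("tea", CTM))))
print(sorted(a for a, _ in CTM.transitions()))  # ['coin']
```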
Exercise 2.1 Give a CCS process that describes a clock that ticks at least once,
and that may stop ticking after each clock tick.
Exercise 2.2 Give a CCS process that describes a coffee machine that may behave
like that given by (2.1), but may also steal the money it receives and fail at any
time.
Using the operators introduced so far, give a CCS process that ‘describes T ’.
[Figure: the coffee machine CM and the computer scientist CS, with ports coin, coffee and pub.]
computer scientist and the coffee machine to communicate in the parallel compo-
sition CM | CS. However, we do not require that they must communicate with one
another. Both the computer scientist and the coffee machine could use their com-
plementary ports to communicate with other reactive systems in their environment.
For example, another computer scientist CS′ can use the coffee machine CM, and,
in so doing, make sure that he can produce publications to beef up his curriculum
vitae, and thus be a worthy competitor for CS in the next competition for a tenured
position. (See Figure 2.3.) Alternatively, the computer scientist may have access
to another coffee machine in her environment, as pictured in Figure 2.4.
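The communication discipline just described can be sketched in our home-grown Python encoding. The '?'/'!' markers for input and output actions are our own convention (the book writes outputs with an overbar), and the one-shot CM and CS below are simplified, non-recursive versions of the running example: P | Q interleaves the moves of its components, and adds an internal tau move for each pair of complementary actions.

```python
TAU = "tau"

def complement(a):
    """Complementary port actions synchronize: 'coin!' matches 'coin?'."""
    return a[:-1] + ("?" if a.endswith("!") else "!")

class Nil:
    def transitions(self): return []

class Prefix:
    def __init__(self, action, cont): self.action, self.cont = action, cont
    def transitions(self): return [(self.action, self.cont)]

class Par:
    """P | Q: interleave the moves of P and Q, plus a tau transition for
    every pair of complementary actions they can perform together."""
    def __init__(self, p, q): self.p, self.q = p, q
    def transitions(self):
        ts = [(a, Par(p2, self.q)) for a, p2 in self.p.transitions()]
        ts += [(a, Par(self.p, q2)) for a, q2 in self.q.transitions()]
        ts += [(TAU, Par(p2, q2))
               for a, p2 in self.p.transitions()
               for b, q2 in self.q.transitions() if b == complement(a)]
        return ts

# One-shot versions: CM = coin?.coffee!.0 and CS = coin!.coffee?.0
cm = Prefix("coin?", Prefix("coffee!", Nil()))
cs = Prefix("coin!", Prefix("coffee?", Nil()))
print(sorted(a for a, _ in Par(cm, cs).transitions()))
# ['coin!', 'coin?', 'tau']
```

Note that the components may also move on their own: the coin actions remain available for interaction with the environment, exactly as discussed above.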
In general, given two CCS expressions P and Q, the process P | Q describes a system in which P and Q may proceed independently and may communicate via complementary ports.
Restriction and relabelling Since academics like the computer scientist often
live in a highly competitive ‘publish or perish’ environment, it may be fruitful
for her to make the coffee machine CM private to her, and therefore inaccessible
to her competitors. To make this possible, the language CCS offers an operation
called restriction, whose aim is to delimit the scope of channel names in much the
same way as variables have scope in block structured programming languages. For
instance, using the operations \coin and \coffee, we may hide the coin and coffee
ports from the environment of the processes CM and CS. Define the process SmUni
(for ‘Small University’) thus:
SmUni def= (CM | CS) \ coin \ coffee . (2.4)
[Figure 2.3: the coffee machine CM and the computer scientists CS and CS′.]
[Figure 2.4: the computer scientist CS with access to two coffee machines, CM and CM′.]
As pictured in Figure 2.5, the restricted coin and coffee ports may now only be
used for communication between the computer scientist and the coffee machine,
and are not available for interaction with their environment. Their scope is re-
stricted to the process SmUni. The only port of SmUni that is visible to its envi-
ronment, e.g., to the competing computer scientist CS′, is the one via which the
computer scientist CS outputs her publications. In general, the formation rule for
restriction is as follows:
In P \ L, the scope of the port names in L is restricted to P —those port names can
only be used for communication within P .
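A sketch of restriction in the same home-grown encoding (our own helper names; port strips our '?'/'!' markers): transitions whose port belongs to the restricted set are simply filtered out, while tau is never blocked.

```python
class Nil:
    def transitions(self): return []

class Prefix:
    def __init__(self, action, cont): self.action, self.cont = action, cont
    def transitions(self): return [(self.action, self.cont)]

def port(a):
    """Strip the input/output marker: 'coin?' and 'coin!' both use port 'coin'."""
    return a.rstrip("?!")

class Restrict:
    """P \\ L: transitions on a port in L are blocked; tau never is."""
    def __init__(self, p, ports): self.p, self.ports = p, set(ports)
    def transitions(self):
        return [(a, Restrict(p2, self.ports))
                for a, p2 in self.p.transitions()
                if a == "tau" or port(a) not in self.ports]

p = Prefix("coin?", Prefix("pub!", Nil()))
print([a for a, _ in Restrict(p, {"coin"}).transitions()])  # []
print([a for a, _ in Restrict(p, {"pub"}).transitions()])   # ['coin?']
```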
Since a computer scientist cannot live on coffee alone, it is beneficial for her
to have access to other types of vending machines offering, say, chocolate, dried
figs and crisps. The behaviour of these machines may be easily specified by means
of minor variations on equation 2.1 on page 11. For instance, we may define the
[Figure 2.5: the process SmUni; the ports coin and coffee are restricted, and only pub is visible to the environment.]
processes
CHM def= coin.choc.CHM
DFM def= coin.figs.DFM
CRM def= coin.crisps.CRM .
Note, however, that all of these vending machines follow a common behavioural
pattern, and may be seen as specific instances of a generic vending machine that
receives a coin as input, dispenses an item and restarts, namely the process
VM def= coin.item.VM .
Using the relabelling operation, we may then define CHM def= VM[choc/item], where VM[choc/item] is a process that behaves like VM, but outputs chocolate whenever VM dispenses the generic item. In general,
If P is a process and f is a function from labels to labels satisfying
certain requirements that will be made precise in Section 2.2, then
P [f ] is a process.
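Relabelling can be sketched in the same illustrative encoding: P[f] has exactly the transitions of P, with every label passed through f. The Rec helper below is our own hypothetical device for the recursive definition of VM.

```python
class Nil:
    def transitions(self): return []

class Prefix:
    def __init__(self, action, cont): self.action, self.cont = action, cont
    def transitions(self): return [(self.action, self.cont)]

class Rec:
    def __init__(self, body): self.body = body
    def transitions(self): return self.body().transitions()

class Relabel:
    """P[f]: behaves like P, but each action label is renamed through f."""
    def __init__(self, p, f): self.p, self.f = p, f
    def transitions(self):
        return [(self.f(a), Relabel(p2, self.f))
                for a, p2 in self.p.transitions()]

# VM = coin.item.VM;  CHM = VM[choc/item]
VM = Rec(lambda: Prefix("coin", Prefix("item", VM)))
CHM = Relabel(VM, lambda a: "choc" if a == "item" else a)
(a, p), = CHM.transitions()
print(a, [b for b, _ in p.transitions()])  # coin ['choc']
```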
By introducing the relabelling operation, we have completed our informal tour
of the operations offered by the language CCS for the description of process be-
haviours. We hope that this informal introduction has given our readers a feeling
for the language, and that our readers will agree with us that CCS is indeed a
language based upon very few operations with an intuitively clear semantic inter-
pretation. In passing, we have also hinted at the fact that CCS processes may be
seen as defining automata which describe their behaviour—see Exercise 2.3. We
shall now expand a little on the connection between CCS expressions and the au-
tomata describing their behaviour. The presentation will again be informal, as we
plan to highlight the main ideas underlying this connection rather than to focus im-
mediately on the technicalities. The formal connection between CCS expressions
and labelled transition systems will be presented in Section 2.2 using the tools of
Structural Operational Semantics (Plotkin, 1981; Plotkin, 2004b).
CS def= pub.CS1
CS1 def= coin.CS2
CS2 def= coffee.CS

Table 2.1: The definition of the process CS
instance, for the purpose of notational convenience in what follows, let us rede-
fine the process CS (originally defined in equation 2.3 on page 12) as in Table 2.1.
(This is the definition of the process CS that we shall use from now on, both when
discussing its behaviour in isolation and in the context of other processes—for in-
stance, as a component of the process SmUni.) Process CS can perform action pub
and evolve into a process whose behaviour is described by the CCS expression CS 1
in doing so. Process CS1 can then output a coin, thereby evolving into a process
whose behaviour is described by the CCS expression CS 2 . Finally, this process can
receive coffee as input, and behave like our good old CS all over again. Thus the
processes CS, CS1 and CS2 are the only possible states of the computation of pro-
cess CS. Note, furthermore, that there is really no conceptual difference between
processes and their states! By performing an action, a process evolves to another
process that describes what remains to be executed of the original one.
In CCS, processes change state by performing transitions, and these transitions
are labelled by the action that caused them. An example state transition is
CS --pub--> CS1 ,
which says that CS can perform action pub, and become CS 1 in doing so. The op-
erational behaviour of our computer scientist CS is therefore completely described
by the following labelled transition system.
CS --pub--> CS1 --coin--> CS2 --coffee--> CS
In much the same way, we can make explicit the set of states of the coffee machine
described in equation 2.1 on page 11 by rewriting that equation thus:
def
CM = coin.CM1
def
CM1 = coffee.CM .
2.1. SOME CCS PROCESS CONSTRUCTIONS 19
Note that the computer scientist is willing to output a coin in state CS 1 , as wit-
nessed by the transition
CS1 --coin--> CS2 ,
and the coffee machine is willing to accept that coin in its initial state, because of
the transition
CM --coin--> CM1 .
Therefore, when put in parallel with one another, these two processes may commu-
nicate and change state simultaneously. The result of the communication should
be described as a state transition of the form
CM | CS1 --τ--> CM1 | CS2 .
In this way, the behaviour of the process SmUni defined by equation 2.4 on page 13
can be described by the following labelled transition system.
SmUni --pub--> (CM | CS1) \ coin \ coffee
(CM | CS1) \ coin \ coffee --τ--> (CM1 | CS2) \ coin \ coffee
(CM1 | CS2) \ coin \ coffee --τ--> (CM | CS) \ coin \ coffee
(CM | CS) \ coin \ coffee --pub--> (CM | CS1) \ coin \ coffee
Example 2.1 Let us start with a variation on the classic example of a tea/coffee
vending machine. The very simplified behaviour of the process which determines
the interaction of the machine with a customer can be described as follows. From
the initial state—say, p—representing the situation ‘waiting for a request’, two pos-
sible actions are enabled. Either the tea button or the coffee button can be pressed
2.2. CCS, FORMALLY 21
(the corresponding action ‘tea’ or ‘coffee’ is executed) and the internal state of the
machine changes accordingly to p1 or p2 . Formally, this can be described by the
transitions
p --tea--> p1 and p --coffee--> p2 .
The target state p1 records that the customer has requested tea, whereas p 2 de-
scribes the situation in which coffee has been selected.
Now the customer is asked to insert the corresponding amount of money, let us
say one euro for a cup of tea and two euros for a cup of coffee. This is reflected
by corresponding changes in the control state of the vending machine. These state
changes can be modelled by the transitions
p1 --1€--> p3 and p2 --2€--> p3 ,
whose target state p3 records that the machine has received payment for the chosen
drink.
Finally, the drink is collected and the machine returns to its initial state p, ready
to accept the request of another customer. This corresponds to the transition
p3 --collect--> p .
(Figure: the labelled transition system for p, with transitions p --tea--> p1, p --coffee--> p2, p1 --1€--> p3, p2 --2€--> p3 and p3 --collect--> p.)
Sometimes, when referring only to the process p, we do not have to give names to
the other process states (in our example p 1 , p2 and p3 ) and it is sufficient to provide
the following labelled transition system for the process p.
(Figure 2.6: a labelled transition system with transitions p --a--> p1, p1 --b--> p, p2 --c--> p2 and p2 --d--> p1.)
(Figure: the same labelled transition system with anonymous states •, whose transitions are labelled tea, coffee, 1€, 2€ and collect.)
Remark 2.1 The definition of a labelled transition system permits situations like
that in Figure 2.6 (where p is the initial state). In that labelled transition system,
the state p2 , where the action c can be performed in a loop, is irrelevant for the
behaviour of the process p since, as you can easily check, p 2 can never be reached
from p. This motivates us to introduce the notion of reachable states. We say that
a state p′ in the transition system representing a process p is reachable from p iff there exists a directed path from p to p′. The set of all such states is called the set of reachable states. In our example this set contains exactly two states, namely p and p1.
Definition 2.1 [Labelled transition system] A labelled transition system (LTS) (at times also called a transition graph) is a triple (Proc, Act, {--α--> | α ∈ Act}), where:
• Proc is a set of states (or processes);
• Act is a set of actions (or labels);
• --α--> ⊆ Proc × Proc is a transition relation, for every α ∈ Act. As usual, we shall use the more suggestive notation s --α--> s′ in lieu of (s, s′) ∈ --α-->, and write s -α-/-> (read 's refuses α') iff s --α--> s′ for no state s′.
A labelled transition system is finite if its sets of states and actions are both finite.
For example, the LTS for the process SmUni defined by equation 2.4 on page 13
(see page 19) is formally specified thus:
Proc = {SmUni, (CM | CS1) \ coin \ coffee, (CM1 | CS2) \ coin \ coffee,
        (CM | CS) \ coin \ coffee}
Act = {pub, τ}
--pub--> = {(SmUni, (CM | CS1) \ coin \ coffee),
            ((CM | CS) \ coin \ coffee, (CM | CS1) \ coin \ coffee)} , and
--τ--> = {((CM | CS1) \ coin \ coffee, (CM1 | CS2) \ coin \ coffee),
          ((CM1 | CS2) \ coin \ coffee, (CM | CS) \ coin \ coffee)} .
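Such a finite LTS is concrete enough to be transcribed directly into a program. The sketch below is our own Python encoding (not part of the formalism itself): the LTS is represented as a triple of a state set, an action set and a family of transition relations, with the long restricted terms abbreviated by short variable names.

```python
# The LTS of SmUni, written out as the triple (Proc, Act, transitions).
SMUNI = "SmUni"
S1 = "(CM | CS1) \\ coin \\ coffee"
S2 = "(CM1 | CS2) \\ coin \\ coffee"
S3 = "(CM | CS) \\ coin \\ coffee"

Proc = {SMUNI, S1, S2, S3}
Act = {"pub", "tau"}

# trans[alpha] is the relation --alpha--> as a set of (source, target) pairs.
trans = {
    "pub": {(SMUNI, S1), (S3, S1)},
    "tau": {(S1, S2), (S2, S3)},
}
```

A quick sanity check on this encoding is that every transition relation mentions only states drawn from Proc, exactly as Definition 2.1 requires.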
As mentioned above, we shall often distinguish a so-called start state (or initial state), which is one selected state in which the system initially starts. For example, the start state for the process SmUni presented above is, not surprisingly, the
process SmUni itself.
α
Remark 2.2 Sometimes the transition relations → are presented as a ternary rela-
α
tion →⊆ Proc × Act × Proc and we write s → s0 whenever (s, α, s0 ) ∈→. This is
an alternative way to describe a labelled transition system and it defines the same
notion as Definition 2.1.
Notation 2.1 Let us now recall a few useful notations that will be used in connection with labelled transition systems.
• We can extend the transition relation to the elements of Act* (the set of all finite strings over Act, including the empty string ε). The definition is as follows:
  – s --ε--> s for every s ∈ Proc, and
  – s --αw--> s′ iff there is a state t ∈ Proc such that s --α--> t and t --w--> s′, for every s, s′ ∈ Proc, α ∈ Act and w ∈ Act*.
In other words, if w = α1 α2 · · · αn for α1, α2, . . . , αn ∈ Act then we write s --w--> s′ whenever there exist states s0, s1, . . . , sn−1, sn ∈ Proc such that
s = s0 --α1--> s1 --α2--> s2 --α3--> s3 --α4--> · · · --αn−1--> sn−1 --αn--> sn = s′ .
For the transition system in Figure 2.6 we have, for example, that p --ε--> p, p --ab--> p and p1 --bab--> p.
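The inductive definition of s --w--> s′ can be turned into a short recursive function. The Python sketch below is our own; it hardwires our reading of the transitions of Figure 2.6 (p --a--> p1, p1 --b--> p, p2 --c--> p2 and p2 --d--> p1) and the function name steps is an assumption of the sketch.

```python
# Our reading of the transitions in Figure 2.6:
# p --a--> p1, p1 --b--> p, p2 --c--> p2, p2 --d--> p1.
TRANS = {("p", "a"): {"p1"}, ("p1", "b"): {"p"},
         ("p2", "c"): {"p2"}, ("p2", "d"): {"p1"}}

def steps(s, w):
    """All states s2 with s --w--> s2, following the inductive definition:
    s --eps--> s, and s --(alpha w)--> s2 iff s --alpha--> t and
    t --w--> s2 for some state t."""
    if w == "":                          # base case: the empty string eps
        return {s}
    alpha, rest = w[0], w[1:]
    result = set()
    for t in TRANS.get((s, alpha), set()):
        result |= steps(t, rest)
    return result
```

For instance, steps("p", "ab") returns {"p"} and steps("p1", "bab") returns {"p"}, matching the examples in the text.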
• We write s → s′ whenever there is an action α ∈ Act such that s --α--> s′.
For the transition system in Figure 2.6 we have, for instance, that p → p1, p1 → p, p2 → p1 and p2 → p2.
• We use the notation s --α--> to mean that there is some s′ ∈ Proc such that s --α--> s′.
For the transition system in Figure 2.6 we have, for instance, that p --a--> and p1 --b-->.
• We write s →* s′ iff s --w--> s′ for some w ∈ Act*. In other words, →* is the reflexive and transitive closure of the relation →.
For the transition system in Figure 2.6 we have, for example, that p →* p, p →* p1, and p2 →* p.
Consider the labelled transition system with four states s, s1, s2 and s3, and with transitions s --a--> s1, s1 --a--> s2, s2 --a--> s3 and s3 --a--> s.
• Define the labelled transition system above formally as a triple (Proc, Act, {--α--> | α ∈ Act}).
• What is the reflexive closure of the binary relation --a-->? (A drawing is fine.)
• What is the symmetric closure of the binary relation --a-->? (A drawing is fine.)
• What is the transitive closure of the binary relation --a-->? (A drawing is fine.)
Definition 2.2 [Reachable states] Let T = (Proc, Act, {--α--> | α ∈ Act}) be a labelled transition system, and let s ∈ Proc be its initial state. We say that s′ ∈ Proc is reachable in the transition system T iff s →* s′. The set of reachable states contains all states reachable in T .
In the transition system from Figure 2.6, where p is the initial state, the set of
reachable states is equal to {p, p1 }.
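Computing the set of reachable states is a plain graph search. The Python sketch below is our own; it hardwires our reading of the transitions of Figure 2.6 and computes the reflexive and transitive closure from a given start state by breadth-first search.

```python
from collections import deque

# Our reading of the transitions in Figure 2.6:
# p --a--> p1, p1 --b--> p, p2 --c--> p2, p2 --d--> p1.
TRANS = {"p": {("a", "p1")}, "p1": {("b", "p")},
         "p2": {("c", "p2"), ("d", "p1")}}

def reachable(start):
    """All states s2 with start -->* s2: the reflexive and transitive
    closure of the one-step relation, computed by breadth-first search."""
    seen, todo = {start}, deque([start])
    while todo:
        s = todo.popleft()
        for _action, t in TRANS.get(s, set()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen
```

Called as reachable("p") it returns {"p", "p1"}, in agreement with the example above.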
Exercise 2.5 What would the set of reachable states in the labelled transition sys-
tem in Figure 2.6 be if its start state were p2 ?
The step from a process denoted by a CCS expression to the LTS describing
its operational behaviour is taken using the framework of Structural Operational
Semantics (SOS) as pioneered by Plotkin in (Plotkin, 2004b). (The history of
the development of the ideas that led to SOS is recounted by Plotkin himself
in (Plotkin, 2004a).) The key idea underlying this approach is that the collection of
CCS process expressions will be the set of states of a (large) labelled transition sys-
tem, whose actions will be either input or output actions on communication ports
or τ , and whose transitions will be exactly those that can be proven to hold by
means of a collection of syntax-driven rules. These rules will capture the informal
semantics of the CCS operators presented above in a very simple and elegant way.
The operational semantics of a CCS expression is then obtained by selecting that
expression as the start state in the LTS for the whole language, and restricting our-
selves to the collection of CCS expressions that are reachable from it by following
transitions.
where
• K is a process name in K;
• α is an action in Act;
f(τ) = τ and
f(ā) is the complement of f(a), for each label a ;
order: restriction and relabelling (tightest binding), action prefixing, parallel com-
position and summation. For example, the expression a.0 | b.P \ L + c.0 stands
for
((a.0) | (b.(P \ L))) + (c.0) .
Exercise 2.6 Which of the following expressions are syntactically correct CCS ex-
pressions? Why? Assume that A, B are process constants and a, b are channel
names.
• a.b.A + B
• a.B + [a/b]
• τ.τ.B + 0
• (a.b.A + a.0) | B
• (a.b.A + a.0).B
• (a.b.A + a.0) + B
• (0 | 0) + 0
Our readers can easily check that all of the processes presented in the previous
section are indeed CCS expressions. Another example of a CCS expression is
given by a counter, which is defined thus:
Counter0 def= up.Counter1 (2.5)
Countern def= up.Countern+1 + down.Countern−1 (n > 0) . (2.6)
The behaviour of such a process is intuitively clear. For each non-negative integer
n, the process Counter n behaves like a counter whose value is n; the ‘up’ actions
increase the value of the counter by one, and the ‘down’ actions decrease it by one.
It would also be easy to construct the (infinite state) LTS for this process based on
its syntactic description, and on the intuitive understanding of process behaviour
we have so far developed. However, intuition alone can lead us to wrong con-
clusions, and most importantly cannot be fed to a computer! To capture formally
our understanding of the semantics of the language CCS, we therefore introduce
the collection of SOS rules in Table 2.2. These rules are used to generate an LTS whose states are CCS expressions. In that LTS, a transition P --α--> Q holds for CCS expressions P, Q and action α if, and only if, it can be proven using the rules in Table 2.2.
A rule like

      ─────────────
      α.P --α--> P
is an axiom, as it has no premises—that is, it has no transition above the solid
line. This means that proving that a process of the form α.P affords the transition
α.P --α--> P (the conclusion of the rule) can be done without establishing any further sub-goal. Therefore each process of the form α.P affords the transition α.P --α--> P .
As an example, we have that the following transition
pub.CS1 --pub--> CS1 (2.7)
ACT
      ─────────────
      α.P --α--> P

SUMj
      Pj --α--> Pj′
      ──────────────────────    where j ∈ I
      Σi∈I Pi --α--> Pj′

COM1
      P --α--> P′
      ──────────────────────
      P | Q --α--> P′ | Q

COM2
      Q --α--> Q′
      ──────────────────────
      P | Q --α--> P | Q′

COM3
      P --a--> P′    Q --ā--> Q′
      ──────────────────────────
      P | Q --τ--> P′ | Q′

RES
      P --α--> P′
      ──────────────────────    where α, ᾱ ∉ L
      P \ L --α--> P′ \ L

REL
      P --α--> P′
      ──────────────────────
      P[f] --f(α)--> P′[f]

CON
      P --α--> P′
      ──────────────────────    where K def= P
      K --α--> P′

Table 2.2: SOS rules for CCS
This rule states that every transition of a term P determines a transition of the
expression P \ L, provided that neither the action producing the transition nor its
complement are in L. For example, as you can check, this side condition prevents
us from proving the existence of the transition
(coffee.CS) \ coffee --coffee--> CS \ coffee .
Finally, note that, when considering the binary version of the summation operator,
the family of rules SUMj reduces to the following two rules.
SUM1
      P1 --α--> P1′
      ──────────────────────
      P1 + P2 --α--> P1′

SUM2
      P2 --α--> P2′
      ──────────────────────
      P1 + P2 --α--> P2′
To get a feeling for the power of recursive definitions of process behaviours, con-
sider the process C defined thus:
C def= up.(C | down.0) . (2.8)
What are the transitions that this process affords? Using the rules for constants
and action prefixing, you should have little trouble in arguing that the only initial
transition for C is
C --up--> C | down.0 . (2.9)
What next? Observing that down.0 --down--> 0, using rule COM2 in Table 2.2 we can infer that
C | down.0 --down--> C | 0 .
Since it is reasonable to expect that the process C | 0 exhibits the same behaviour
as C—and we shall see later on that this does hold true—, the above transition
effectively brings our process back to its initial state, at least up to behavioural
equivalence. However, this is not all, because, as we have already proven (2.9),
using rule COM1 in Table 2.2 we have that the transition
C | down.0 --up--> (C | down.0) | down.0
is also possible. You might find it instructive to continue building a little more of
the transition graph for process C. As you may begin to notice, the LTS giving
the operational semantics of the process expression C looks very similar to that for
Counter0 , as given in (2.5). Indeed, we shall prove later on that these two processes
exhibit the same behaviour in a very strong sense.
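Because the rules in Table 2.2 are syntax-directed, they can be implemented as a recursive function computing the set of transitions of a term. The Python sketch below is our own encoding, not part of CCS itself: terms are nested tuples, output actions carry a leading '~' in place of the overbar, the internal action is the string "tau", and only binary summation is covered.

```python
# Example process definitions (our transcription of CM and CS, with
# outputs marked by '~' instead of the overbar).
DEFS = {
    "CM": ("act", "coin", ("act", "~coffee", ("con", "CM"))),
    "CS": ("act", "~pub", ("act", "~coin", ("act", "coffee", ("con", "CS")))),
}

def comp(a):                        # complementation: a <-> ~a
    return a[1:] if a.startswith("~") else "~" + a

def chan(a):                        # the channel name underlying a label
    return a[1:] if a.startswith("~") else a

def transitions(p):
    """All pairs (alpha, p2) with p --alpha--> p2, derived by the rules
    ACT, SUM, COM1-3, RES, REL and CON of Table 2.2."""
    kind = p[0]
    if kind == "nil":
        return set()
    if kind == "act":                                   # ACT
        return {(p[1], p[2])}
    if kind == "sum":                                   # SUM1, SUM2
        return transitions(p[1]) | transitions(p[2])
    if kind == "par":
        l, r = transitions(p[1]), transitions(p[2])
        res = {(a, ("par", q, p[2])) for a, q in l}     # COM1
        res |= {(a, ("par", p[1], q)) for a, q in r}    # COM2
        res |= {("tau", ("par", q1, q2))                # COM3
                for a, q1 in l for b, q2 in r
                if a != "tau" and b == comp(a)}
        return res
    if kind == "res":                                   # RES: alpha, ~alpha not in L
        return {(a, ("res", q, p[2])) for a, q in transitions(p[1])
                if a == "tau" or chan(a) not in p[2]}
    if kind == "rel":                                   # REL
        f = dict(p[2])
        def ren(a):
            if a == "tau":
                return a
            c = f.get(chan(a), chan(a))
            return "~" + c if a.startswith("~") else c
        return {(ren(a), ("rel", q, p[2])) for a, q in transitions(p[1])}
    if kind == "con":                                   # CON
        return transitions(DEFS[p[1]])
    raise ValueError(f"unknown term: {p!r}")

SmUni = ("res", ("par", ("con", "CM"), ("con", "CS")),
         frozenset({"coin", "coffee"}))
```

On this encoding, transitions(SmUni) contains a single pair, labelled '~pub', in agreement with the labelled transition system on page 19.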
Exercise 2.7 Use the rules of the SOS semantics for CCS to derive the LTS for the
process SmUni defined by equation 2.4 on page 13. (Use the definition of CS in
Table 2.1.)
Exercise 2.8 Assume that A def= b.a.B. By using the SOS rules for CCS prove the
existence of the following transitions:
• (A | b.0) \ {b} --τ--> (a.B | 0) \ {b},
• (A | b.a.B) + (b.A)[a/b] --b--> (A | a.B), and
• (A | b.a.B) + (b.A)[a/b] --a--> A[a/b].
Exercise 2.9 Draw (part of) the transition graph for the process name A whose
behaviour is given by the defining equation
A def= (a.A) \ b .
The resulting transition graph should have infinitely many states. Can you think of
a CCS term that generates a finite labelled transition system that should intuitively
have the same behaviour as A?
Exercise 2.10 Draw (part of) the transition graph for the process name A whose
behaviour is given by the defining equation
A def= (a0.A)[f]
where we assume that the set of channel names is {a0, a1, a2, . . .}, and f(ai) = ai+1 for each i.
The resulting transition graph should (again!) have infinitely many states. Can
you give an argument showing that there is no finite state labelled transition system
that could intuitively have the same behaviour as A?
Exercise 2.11
1. Draw the transition graph for the process name Mutex1 whose behaviour is
given by the defining equation
Mutex1 def= (User | Sem) \ {p, v}
User def= p̄.enter.exit.v̄.User
Sem def= p.v.Sem .
2. Draw the transition graph for the process name Mutex2 whose behaviour is
given by the defining equation
Mutex2 def= ((User | Sem) | User) \ {p, v} ,
3. Draw the transition graph for the process name FMutex whose behaviour is
given by the defining equation
FMutex def= ((User | Sem) | FUser) \ {p, v} ,
where User and Sem are defined as before, and the behaviour of FUser is
given by the defining equation
FUser def= p̄.enter.(exit.v̄.FUser + exit.v̄.0) .
Do you think that Mutex2 and FMutex are offering the same behaviour? Can
you argue informally for your answer?
• If B is full, then it is only willing to output the successor of the value it stores,
and empties itself in doing so.
Note that the input prefix ‘in’ now carries a parameter that is a variable—in this
case x—whose scope is the process that is prefixed by the input action—in this
example, B(x). The intuitive idea is that process B is willing to accept a non-
negative integer n as input, bind the received value to x and thereafter behave like
B(n)—that is, like a full one-place buffer storing the value n. The behaviour of
the process B(n) is then described by the second equation above, where the scope
of the formal parameter x is the whole right-hand side of the equation. Note that
output prefixes, like ‘out(x+1)’ above, may carry expressions—the idea being that
the value being output is the one that results from the evaluation of the expression.
The general SOS rule for input prefixing now becomes

      ──────────────────────────    where n ≥ 0
      a(x).P --a(n)--> P[n/x]
where we write P [n/x] for the expression that results by replacing each free oc-
currence of the variable x in P with n. The general SOS rule for output prefixing
is instead the one below.
In value passing CCS, as we have already seen in our definition of the one place
buffer B, process names may be parameterized by value variables. The general
form that these parameterized constants may take is A(x 1 , . . . , xn ), where A is a
process name, n ≥ 0 and x1 , . . . , xn are distinct value variables. The operational
semantics for these constants is given by the following rule.
      P[v1/x1, . . . , vn/xn] --α--> P′
      ─────────────────────────────────    where A(x1, . . . , xn) def= P and each ei has value vi
      A(e1, . . . , en) --α--> P′
To become familiar with these rules, you should apply them to the one-place buffer
B, and derive its possible transitions.
In what follows, we shall restrict ourselves to CCS expressions that have no
free occurrences of value variables—that is, to CCS expressions in which each
occurrence of a value variable, say y, is within the scope of an input prefix of the
form a(y) or of a parameterized constant A(x 1 , . . . , xn ) with y = xi for some
1 ≤ i ≤ n. For instance, the expression
a(x).b̄(y + 1).0
is disallowed because the single occurrence of the value variable y is bound neither
by an input prefixing nor by a parameterized constant.
Since processes in value passing CCS may manipulate data, it is natural to
add an if bexp then P else Q construct to the language, where bexp is a boolean
expression. Assume, by way of example, that we wish to define a one-place buffer
Pred that computes the predecessor function on the non-negative integers. This
may be defined thus:
Pred def= in(x).Pred(x)
Pred(x) def= if x = 0 then out(0).Pred else out(x − 1).Pred .
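The behaviour of Pred can be mirrored by a small function computing the outgoing transitions of each of its states. The Python sketch below is our own modelling: the empty buffer Pred is represented by None, a full buffer Pred(x) by the stored integer x, and the sampling of inputs to a finite range is purely an artefact of the sketch (the real process accepts any n ≥ 0).

```python
def pred_transitions(state):
    """Outgoing transitions of the value-passing process Pred.
    States: None models Pred (empty); an integer x models Pred(x).
    Pred(x) outputs 0 if x = 0 and x - 1 otherwise, returning to Pred."""
    if state is None:
        # Pred = in(x).Pred(x): one input transition per value.  Only a
        # finite sample of inputs is generated here.
        return {("in(%d)" % n, n) for n in range(3)}
    x = state
    out = 0 if x == 0 else x - 1     # if x = 0 then out(0) else out(x - 1)
    return {("out(%d)" % out, None)}
```

For example, pred_transitions(5) yields the single transition labelled out(4) back to the empty buffer.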
Use the Cell to define a two-place bag and a two-place FIFO queue. (Recall that
a bag, also known as multiset, is a set whose elements have multiplicity.) Give
specifications of the expected behaviour of these processes, and use the operational
rules given above to convince yourselves that your implementations are correct.
P_Q = (P[p′/p, e′/e, o′/o] | Q[p′/push, e′/empty, o′/pop]) \ {p′, o′, e′} .
Draw an initial fragment of the transition graph for this process. What behaviour
do you think B implements?
Exercise 2.14 (For the theoretically minded) Prove that the operational seman-
tics for value passing CCS we have given above is in complete agreement with the
semantics for this language via translation into the pure calculus given by Milner
in (Milner, 1989, Section 2.8).
Chapter 3
Behavioural equivalences
We have previously remarked that CCS, like all other process algebras, can be used
to describe both implementations of processes and specifications of their expected
behaviours. A language like CCS therefore supports the so-called single language
approach to process theory—that is, the approach in which a single language is
used to describe both actual processes and their specifications. An important in-
gredient of these languages is therefore a notion of behavioural equivalence or
behavioural approximation between processes. One process description, say SYS,
may describe an implementation, and another, say SPEC, may describe a specifica-
tion of the expected behaviour. To say that SYS and SPEC are equivalent is taken
to indicate that these two processes describe essentially the same behaviour, albeit
possibly at different levels of abstraction or refinement. To say that, in some formal
sense, SYS is an approximation of SPEC means roughly that every aspect of the
behaviour of this process is allowed by the specification SPEC, and thus that noth-
ing unexpected can happen in the behaviour of SYS. This approach to program
verification is also sometimes called implementation verification or equivalence
checking.
Spec def= pub.Spec ,
38 CHAPTER 3. BEHAVIOURAL EQUIVALENCES
and that the process C in equation 2.8 on page 30 behaves like a counter. Our order
of business now will be to introduce a suitable notion of behavioural equivalence
that will allow us to establish these expected equalities and many others.
Before doing so, it is however instructive to consider the criteria that we expect
a suitable notion of behavioural equivalence for processes to meet. First of all, we
have already used the term ‘equivalence’ several times, and since this is a mathe-
matical notion that some of you may not have met before, it is high time to define
it precisely.
Definition 3.1 Let X be a set. A binary relation over X is a subset of X × X, the
set of pairs of elements of X. If R is a binary relation over X, we often write x R y
instead of (x, y) ∈ R.
An equivalence relation over X is a binary relation R that satisfies the follow-
ing constraints:
• R is reflexive—that is, x R x for each x ∈ X;
(Figure: processes P and Q embedded in the same context C, yielding the composite processes C(P) and C(Q).)
refinement steps which are known to preserve some behavioural relation R. In this
approach, we might begin from our specification Spec and transform it into our
implementation Imp via a sequence of intermediate stages Speci (0 ≤ i ≤ n) thus:
Spec = Spec0 R Spec1 R Spec2 R · · · R Specn = Imp .
Since each of the steps above preserves the relation R, we would like to conclude
that Imp is a correct implementation of Spec with respect to R—that is, that
Spec R Imp
C[P ] R C[Q] .
Taking the point of view of standard automata theory, and abstracting from the no-
tion of ‘accept state’ that is missing altogether in our treatment, an automaton may
be completely identified by its set of traces, and thus two processes are equivalent
if, and only if, they afford the same traces.
This point of view is totally justified and natural if we view our LTSs as non-
deterministic devices that may generate or accept sequences of actions. However,
is it still a reasonable one if we view our automata as reactive machines that interact
with their environment?
To answer this question, consider the coffee and tea machine CTM defined in
equation 2.2 on page 11, and compare it with the following one:
CTM′ def= coin.coffee.CTM′ + coin.tea.CTM′ . (3.2)
You should be able to convince yourselves that CTM and CTM′ afford the same
traces. (Do so!) However, if you were a user of the coffee and tea machine who
wants coffee and hates tea, which machine would you like to interact with? We
certainly would prefer to interact with CTM as that machine will give us coffee
after receiving a coin, whereas CTM′ may refuse to deliver coffee after having
accepted our coin!
This informal discussion may be directly formalized within CCS by assuming
that the behaviour of the coffee starved user is described by the process
CA def= coin.coffee.CA .
Consider now the two terms
(CA | CTM) \ {coin, coffee, tea}
and
(CA | CTM′) \ {coin, coffee, tea}
that we obtain by forcing interaction between the coffee addict CA and the two
vending machines. Using the SOS rules for CCS, you should convince yourselves
that the former term can only perform an infinite computation consisting of τ -
labelled transitions, whereas the second term can deadlock thus:
(CA | CTM′) \ {coin, coffee, tea} --τ--> (coffee.CA | tea.CTM′) \ {coin, coffee, tea} .
Note that the target term of this transition captures precisely the deadlock situation
that we intuitively expected to have, namely that the user only wants coffee, but
the machine is only willing to deliver tea. So trace equivalent terms may exhibit
different deadlock behaviour when made to interact with other parallel processes—
a highly undesirable state of affairs.
In light of the above example, we are forced to reject the law
α.(P + Q) = α.P + α.Q ,
which is familiar from the standard theory of regular languages, for our desired
notion of behavioural equivalence. (Can you see why?) Therefore we need to
refine our notion of equivalence in order to differentiate processes that, like the two
vending machines above, exhibit different reactive behaviour while still having the
same traces.
1. Do the processes
(CA | CTM) \ {coin, coffee, tea}
and
(CA | CTM′) \ {coin, coffee, tea}
defined above have the same completed traces?
2. Is it true that if P and Q are two CCS processes affording the same com-
pleted traces and L is a set of labels, then P \ L and Q \ L also have the
same completed traces?
trace equivalent processes CTM and CTM′ exhibited different deadlock behaviour
when made to interact with a third parallel process, namely CA. In hindsight, this
is not overly surprising. In fact, when looking purely at the (completed) traces of a
process, we focus only on the sequences of actions that the process may perform,
but do not take into account the communication capabilities of the intermediate
states that the process traverses as it computes. As the above example shows,
the communication potential of the intermediate states does matter when we may
interact with the process at all times. In particular, there is a crucial difference in
the capabilities of the states reached by CTM and CTM′ after these processes have
received a coin as input. Indeed, after accepting a coin the machine CTM always
enters a state in which it is willing to output both coffee and tea, depending on
what its user wants, whereas the machine CTM′ can only enter a state in which it
is willing to deliver either coffee or tea, but not both.
The lesson that we may learn from the above discussion is that a suitable notion
of behavioural relation between reactive systems should allow us to distinguish
processes that may have different deadlock potential when made to interact with
other processes. Such a notion of behavioural relation must take into account the
communication capabilities of the intermediate states that processes may reach as
they compute. One way to ensure that this holds is to require that in order for two
processes to be equivalent, not only should they afford the same traces, but, in some
formal sense, the states that they reach should still be equivalent. You can easily
convince yourselves that trace equivalence does not meet this latter requirement,
as the states that CTM and CTM′ may reach after receiving a coin as input are not
trace equivalent.
The classic notion of strong bisimulation equivalence, introduced by David
Park in (Park, 1981) and widely popularized by Robin Milner in (Milner, 1989),
formalizes the informal requirements introduced above in a very elegant way.
Definition 3.2 [Strong bisimulation] A binary relation R over the set of states of an LTS is a bisimulation iff whenever s1 R s2 and α is an action:
- if s1 --α--> s1′, then there is a transition s2 --α--> s2′ such that s1′ R s2′;
- if s2 --α--> s2′, then there is a transition s1 --α--> s1′ such that s1′ R s2′.
Two states s and s′ are bisimilar, written s ∼ s′, iff there is a bisimulation that relates them. Henceforth the relation ∼ will be referred to as strong bisimulation equivalence or strong bisimilarity.
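For a finite LTS and a finite candidate relation, Definition 3.2 can be checked mechanically. The Python sketch below is our own: trans maps each state to its set of (action, target) pairs, and the example data transcribe the small two-system LTS and the relation used in the matching argument that follows.

```python
def is_bisimulation(trans, R):
    """Check the two clauses of Definition 3.2 for every pair in R, over
    the LTS given by trans[s] = set of (action, target) pairs."""
    for s1, s2 in R:
        for a, t1 in trans.get(s1, set()):       # every move of s1 ...
            if not any(b == a and (t1, t2) in R
                       for b, t2 in trans.get(s2, set())):
                return False
        for a, t2 in trans.get(s2, set()):       # ... and, symmetrically, of s2
            if not any(b == a and (t1, t2) in R
                       for b, t1 in trans.get(s1, set())):
                return False
    return True

# Our transcription of the example LTS: s --a--> s1, s --a--> s2,
# s1 --b--> s2, s2 --b--> s2, t --a--> t1 and t1 --b--> t1.
TRANS = {"s": {("a", "s1"), ("a", "s2")}, "s1": {("b", "s2")},
         "s2": {("b", "s2")}, "t": {("a", "t1")}, "t1": {("b", "t1")}}
R = {("s", "t"), ("s1", "t1"), ("s2", "t1")}
```

On this data is_bisimulation(TRANS, R) returns True, whereas the smaller candidate {("s", "t")} fails, because the a-moves of s then have no related matches.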
Since the operational semantics of CCS is given in terms of an LTS whose states
are CCS process expressions, the above definition applies equally well to CCS
(Figure: a labelled transition system with transitions s --a--> s1, s --a--> s2, s1 --b--> s2, s2 --b--> s2, t --a--> t1 and t1 --b--> t1.)
3.3. STRONG BISIMILARITY 45
– transitions from s:
  ∗ s --a--> s1 can be matched by t --a--> t1 and (s1, t1) ∈ R,
  ∗ s --a--> s2 can be matched by t --a--> t1 and (s2, t1) ∈ R, and
  ∗ these are all the transitions from s;
– transitions from t:
  ∗ t --a--> t1 can be matched, e.g., by s --a--> s2 and (s2, t1) ∈ R (another possibility would be to match it by s --a--> s1, but finding one matching transition is enough), and
  ∗ this is the only transition from t.
– transitions from s1:
  ∗ s1 --b--> s2 can be matched by t1 --b--> t1 and (s2, t1) ∈ R, and
  ∗ this is the only transition from s1;
– transitions from t1:
  ∗ t1 --b--> t1 can be matched by s1 --b--> s2 and (s2, t1) ∈ R, and
  ∗ this is the only transition from t1.
– transitions from s2:
  ∗ s2 --b--> s2 can be matched by t1 --b--> t1 and (s2, t1) ∈ R, and
  ∗ this is the only transition from s2;
– transitions from t1:
  ∗ t1 --b--> t1 can be matched by s2 --b--> s2 and (s2, t1) ∈ R, and
  ∗ this is the only transition from t1.
This completes the proof that R is a strong bisimulation and, since (s, t) ∈ R, we
get that s ∼ t.
In order to prove that, e.g., s1 ∼ s2 we can use the following relation
R = {(s1 , s2 ), (s2 , s2 )} .
Example 3.2 In this example we shall demonstrate that it is possible for the initial
state of a labelled transition system with infinitely many reachable states to be
strongly bisimilar to a state from which only finitely many states are reachable.
Consider the labelled transition system (Proc, Act, {--α--> | α ∈ Act}) where Proc = {si | i ≥ 1} ∪ {t}, Act = {a}, and the transitions are si --a--> si+1 for each i ≥ 1 together with the loop t --a--> t. Then the relation
R = {(si, t) | i ≥ 1}
is a strong bisimulation and it contains the pair (s1, t). The reader is invited to
verify this simple fact.
Consider now the two coffee and tea machines in our running example. We can argue that CTM and CTM′ are not strongly bisimilar as follows. Assume, towards a contradiction, that CTM and CTM′ are strongly bisimilar. This means that there is a strong bisimulation R such that
CTM R CTM′ .
Recall that
CTM′ --coin--> tea.CTM′ .
CTM --coin--> P
coffee.CTM + tea.CTM --coffee--> CTM ,
but tea.CTM′ cannot output coffee. Thus the first requirement in Definition 3.2 cannot be met. It follows that our assumption that the two machines were strongly bisimilar leads to a contradiction. We may therefore conclude that, as claimed, the processes CTM and CTM′ are not strongly bisimilar.
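The largest strong bisimulation over a finite LTS can also be computed, rather than guessed, as a greatest fixed point: start from all pairs of states and repeatedly delete pairs that violate Definition 3.2 until nothing changes. The Python sketch below is our own; it encodes CTM as coin.(coffee.CTM + tea.CTM) with an auxiliary state name M for coffee.CTM + tea.CTM, and CTM′ as in (3.2) with auxiliary names C′ and T′ for its intermediate states.

```python
def bisimilarity(trans, states):
    """Largest strong bisimulation over a finite state set, computed as a
    greatest fixed point: start from all pairs and delete pairs that
    violate the two clauses of Definition 3.2 until nothing changes."""
    def moves(s):
        return trans.get(s, set())

    R = {(s, t) for s in states for t in states}
    changed = True
    while changed:
        changed = False
        for s, t in list(R):
            ok = all(any(b == a and (p, q) in R for b, q in moves(t))
                     for a, p in moves(s))
            ok = ok and all(any(b == a and (p, q) in R for b, p in moves(s))
                            for a, q in moves(t))
            if not ok:
                R.discard((s, t))
                changed = True
    return R

# CTM = coin.(coffee.CTM + tea.CTM), with M naming coffee.CTM + tea.CTM;
# CTM' = coin.coffee.CTM' + coin.tea.CTM' as in (3.2), with C' and T'
# naming its intermediate states.  (These state names are ours.)
TRANS = {
    "CTM": {("coin", "M")},
    "M": {("coffee", "CTM"), ("tea", "CTM")},
    "CTM'": {("coin", "C'"), ("coin", "T'")},
    "C'": {("coffee", "CTM'")},
    "T'": {("tea", "CTM'")},
}
```

On this encoding the pair (CTM, CTM′) is deleted by the fixed-point computation, confirming the argument above, while all identity pairs survive.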
P def= a.P1 + b.P2
P1 def= c.P
P2 def= c.P
and
Q def= a.Q1 + b.Q2
Q1 def= c.Q3
Q2 def= c.Q3
Q3 def= a.Q1 + b.Q2 .
We claim that P ∼ Q. To prove that this does hold, it suffices to argue that the following relation is a strong bisimulation:
R = {(P, Q), (P, Q3), (P1, Q1), (P2, Q2)} .
(Figure: two labelled transition systems, one rooted at s with states s1, . . . , s4 and one rooted at t with states t1, . . . , t4, whose transitions are labelled a and b.)
Show that s ∼ t by finding a strong bisimulation R containing the pair (s, t).
Before looking at a few more examples, we now proceed to present some of the
general properties of strong bisimilarity. In particular, we shall see that ∼ is an
equivalence relation, and that it is preserved by all of the constructs in the CCS
language.
The following result states the most basic properties of strong bisimilarity, and
is our first theorem in this book.
Theorem 3.1 For all labelled transition systems, the relation ∼ is
1. an equivalence relation, and
2. the largest strong bisimulation.
1. In order to show that ∼ is an equivalence relation over the set of states Proc,
we need to argue that it is reflexive, symmetric and transitive. (See Defini-
tion 3.1.)
To prove that ∼ is reflexive, it suffices to exhibit a bisimulation that
contains the pair (s, s), for each state s ∈ Proc. It is not hard to see that the
identity relation
I = {(s, s) | s ∈ Proc}
is such a relation.
We now show that ∼ is symmetric. Assume, to this end, that s1 ∼ s2 for
some states s1 and s2 contained in Proc. We claim that s2 ∼ s1 also holds.
To prove this claim, recall that, since s1 ∼ s2, there is a bisimulation R that
contains the pair of states (s1, s2). Consider now the relation
    R⁻¹ = {(t, s) | (s, t) ∈ R} .
You should now be able to convince yourselves that the pair (s2, s1) is contained in R⁻¹, and that this relation is indeed a bisimulation. Therefore
s2 ∼ s1, as claimed.
We are therefore left to argue that ∼ is transitive. Assume, to this end, that
s1 ∼ s2 and s2 ∼ s3 for some states s1, s2 and s3 contained in Proc. We
claim that s1 ∼ s3 also holds. To prove this, recall that, since s1 ∼ s2 and
s2 ∼ s3, there are two bisimulations R and R' that contain the pairs of states
(s1, s2) and (s2, s3), respectively. Consider now the relation
    S = {(s1', s3') | (s1', s2') ∈ R and (s2', s3') ∈ R', for some s2'} .
You should now be able to convince yourselves that the pair (s1, s3) is contained in S, and that S is indeed a bisimulation. Therefore s1 ∼ s3, as claimed.
50 CHAPTER 3. BEHAVIOURAL EQUIVALENCES
2. We aim at showing that ∼ is the largest strong bisimulation over the set of
states Proc. To this end, observe, first of all, that the definition of ∼ states
that
    ∼ = ⋃ {R | R is a bisimulation} .
This yields immediately that each bisimulation is included in ∼. We are
therefore left to show that the right-hand side of the above equation is itself
a bisimulation. This we now proceed to do.
Since we have already shown that ∼ is symmetric, it is sufficient to prove
that if
    (s1, s2) ∈ ⋃ {R | R is a bisimulation} and s1 -α-> s1' ,    (3.3)
then there is a state s2' such that s2 -α-> s2' and
    (s1', s2') ∈ ⋃ {R | R is a bisimulation} .
    R = {(s1, s2)} ∪ ∼ .
Exercise 3.6 Prove that the relations we have built in the proof of Theorem 3.1 are
indeed bisimulations.
Exercise 3.7 In the proof of Theorem 3.1(2), we argued that the union of all of the
bisimulation relations over an LTS is itself a bisimulation. Use the argument we
adopted in the proof of that statement to show that the union of an arbitrary family
of bisimulations is always a bisimulation.
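On a finite labelled transition system this union, and hence ∼ itself, can be computed as a greatest fixed point: start from all of Proc × Proc and repeatedly delete the pairs that violate the transfer conditions. A Python sketch of this (our own encoding, run on a small example LTS):

```python
def bisimilarity(trans):
    """Largest strong bisimulation over a finite LTS, computed by
    pruning Proc x Proc until every remaining pair is matched."""
    rel = {(s, t) for s in trans for t in trans}
    changed = True
    while changed:
        changed = False
        for (s, t) in sorted(rel):
            matched = (
                all(any(b == a and (s1, t1) in rel for (b, t1) in trans[t])
                    for (a, s1) in trans[s]) and
                all(any(b == a and (s1, t1) in rel for (b, s1) in trans[s])
                    for (a, t1) in trans[t]))
            if not matched:
                rel.remove((s, t))
                changed = True
    return rel

# A small example: s can branch on a, t cannot, and yet s ~ t.
trans = {
    "s":  {("a", "s1"), ("a", "s2")},
    "s1": {("b", "s2")},
    "s2": {("b", "s2")},
    "t":  {("a", "t1")},
    "t1": {("b", "t1")},
}
print(("s", "t") in bisimilarity(trans))   # True
```

Pairs are only ever deleted when they belong to no bisimulation, so the set that remains is exactly the union of all bisimulations over the LTS.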
Exercise 3.8 Is it true that any strong bisimulation must be reflexive, transitive
and symmetric? If yes then prove it, if not then give counter-examples—that is
• define an LTS and a binary relation over states that is not reflexive but is a
strong bisimulation;
• define an LTS and a binary relation over states that is not symmetric but is a
strong bisimulation; and
• define an LTS and a binary relation over states that is not transitive but is a
strong bisimulation.
Are the relations you have constructed the largest strong bisimulations over your
labelled transition systems?
Exercise 3.9 (Recommended) A binary relation R over the set of states of an LTS
is a string bisimulation iff, whenever s1 R s2 and σ is a sequence of actions in Act:
- if s1 -σ-> s1', then there is a sequence of transitions s2 -σ-> s2' such that s1' R s2';
- if s2 -σ-> s2', then there is a sequence of transitions s1 -σ-> s1' such that s1' R s2'.
Two states s and s' are string bisimilar iff there is a string bisimulation that relates
them.
Prove that string bisimilarity and strong bisimilarity coincide. That is, show
that two states s and s' are string bisimilar iff they are strongly bisimilar.
Exercise 3.10 Assume that the defining equation for the constant K is K def= P.
Show that K ∼ P holds.
Exercise 3.11 Prove that two strongly bisimilar processes afford the same traces,
and thus that strong bisimulation equivalence satisfies the requirement for a behavioural equivalence we set out in equation (3.1). Hint: Use your solution to
Exercise 3.9 to show that, for each trace α1 · · · αk (k ≥ 0), if s ∼ t and s -α1···αk-> s' for some s', then t -α1···αk-> t' for some t'.
Is it true that strongly bisimilar processes have the same completed traces? (See
Exercise 3.2 for the definition of the notion of completed trace.)
Exercise 3.12 (Recommended) Show that the relations listed below are strong
bisimulations:
    {(P | Q, Q | P) | P and Q are CCS processes}
    {(P | 0, P) | P is a CCS process}
    {((P | Q) | R, P | (Q | R)) | P, Q and R are CCS processes} .
Conclude that, for all CCS processes P, Q and R,
    P | Q ∼ Q | P ,              (3.4)
    P | 0 ∼ P , and              (3.5)
    (P | Q) | R ∼ P | (Q | R) .  (3.6)
Does the following equivalence hold for all CCS processes P and Q, and channel name a?
    (P | Q) \ a ∼ (P \ a) | (Q \ a)
Does the following equivalence hold for all CCS processes P and Q, and relabelling function f?
    (P | Q)[f ] ∼ (P [f ]) | (Q[f ]) .
If your answer to the above questions is positive, then construct appropriate bisim-
ulations. Otherwise, provide a counter-example to the claim.
    P1 | P2 | · · · | Pk .
You should readily be able to convince yourselves that the pair of processes
(P | R, Q | R) is indeed contained in R, and thus that all we are left to do
to complete our argument is to show that R is a bisimulation. The proof
of this fact will, hopefully, also highlight that the above relation R was not
‘built out of thin air’, and will epitomize the creative process that underlies
the building of bisimulation relations.
First of all, observe that, by symmetry, to prove that R is a bisimulation, it is
sufficient to argue that if (P' | R', Q' | R') is contained in R and P' | R' -α-> S
for some action α and CCS process S, then Q' | R' -α-> T for some CCS
process T such that (S, T) ∈ R. This we now proceed to do.
α
Assume that (P 0 | R0 , Q0 | R0 ) is contained in R and P 0 | R0 → S for some
action α and CCS process S. We now proceed with the proof by a case
α
analysis on the possible origins of the transition P 0 | R0 → S. Recall that
the transition we are considering must be provable using the SOS rules for
parallel composition given in Table 2.2 on page 29. Therefore there are three
α
possible forms that the transition P 0 | R0 → S may take, namely:
    (P'' | R', Q'' | R') ∈ R .
    (P' | R'', Q' | R'') ∈ R .
You should readily be able to convince yourselves that the pair of processes
(P \ L, Q \ L) is indeed contained in R. Moreover, following the lines of the
proof we have just gone through for parallel composition, it is an instructive
exercise to show that this relation is indeed a strong bisimulation.
You are strongly encouraged to fill in the missing details in the proof. 2
Exercise 3.14 Prove that ∼ is preserved by action prefixing, summation and rela-
belling.
Exercise 3.15 (For the theoretically minded) For each set of labels L and pro-
cess P , we may wish to build the process τ L (P ) that is obtained by turning into
a τ each action α performed by P with α ∈ L or ᾱ ∈ L. Operationally, the
behaviour of the construct τL ( ) can be described by the following two rules.
         P -α-> P'
    ───────────────────────    if α ∈ L or ᾱ ∈ L
    τL(P) -τ-> τL(P')

         P -α-> P'
    ───────────────────────    if α = τ or α, ᾱ ∉ L
    τL(P) -α-> τL(P')
Prove that τL (P ) ∼ τL (Q), whenever P ∼ Q.
Consider the question of whether the operation τL( ) can be defined in CCS
modulo ∼; that is, can you find a CCS expression CL[ ] with a 'hole' (a place
holder into which another process can be plugged) such that, for each process P,
    τL(P) ∼ CL[P] ?
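On a finite LTS the effect of the two rules above is just a relabelling of transitions. A small Python sketch (our own encoding; we write the complement ᾱ of an action a as '!a', and τ as 'tau'):

```python
def tau_L(trans, L):
    """Hide the actions in L: any transition whose label (or whose
    label's complement) lies in L becomes a tau-transition; all other
    transitions are kept unchanged, as in the two rules above."""
    def hidden(a):
        base = a[1:] if a.startswith("!") else a   # strip a complement mark
        return a != "tau" and base in L
    return {s: {("tau" if hidden(a) else a, s1) for (a, s1) in succ}
            for s, succ in trans.items()}

trans = {"p": {("a", "q"), ("!b", "q")}, "q": {("c", "p")}}
hidden_lts = tau_L(trans, {"a", "b"})   # p's a and !b moves collapse to tau
```

Whether τL( ) is definable inside CCS up to ∼ is exactly what the exercise asks; note that, in the standard formulation, a relabelling [f] alone will not do, since relabelling functions map τ to τ and visible actions to visible actions.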
We can now show that, in fact, C and Counter 0 are strongly bisimilar. To this end,
note that this follows if we can show that the relation R below
is a strong bisimulation. (Can you see why?) The following result states that this
does hold true.
2. Assume that Counter_n -α-> Q for some action α and process Q. Then
Using CCS, we may specify the desired behaviour of a buffer with capacity one
thus:
    B^1_0 def= in.B^1_1
    B^1_1 def= out.B^1_0 .
The constant B^1_0 stands for an empty buffer with capacity one, that is, a buffer with
capacity one holding zero items, and B^1_1 stands for a full buffer with capacity
one, that is, a buffer with capacity one holding one item.
By analogy with the above definition, in general we may specify a buffer of
capacity n ≥ 1 as follows, where the superscript stands for the maximal capacity
of the buffer and the subscript for the number of elements the buffer currently
holds:
    B^n_0 def= in.B^n_1
    B^n_i def= in.B^n_{i+1} + out.B^n_{i-1}    for 0 < i < n
    B^n_n def= out.B^n_{n-1} .
certainly met when n = 2 because, as you can readily check, the relation depicted
in Figure 3.2 is a bisimulation showing that
    B^2_0 ∼ B^1_0 | B^1_0 .
That this holds regardless of the size of the buffer to be implemented is the import
of the following result.
Intuitively, the above relation relates a buffer of capacity n holding i items with a
parallel composition of n buffers of capacity one, provided that exactly i of them
are full.
It is not hard to see that
• (B^n_0 , B^1_0 | B^1_0 | · · · | B^1_0) ∈ R, and
• R is a strong bisimulation.
It follows that
    B^n_0 ∼ B^1_0 | B^1_0 | · · · | B^1_0    (n copies of B^1_0),
which was to be shown. We encourage you to fill in the details in this proof. 2
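For any fixed n the relation R used in this proof is finite, so its transfer conditions can be checked exhaustively. A Python sketch (our own encoding) for n = 3, relating the counter state B^n_i to every parallel composition of n one-place buffers in which exactly i cells are full:

```python
from itertools import product

n = 3

def spec_moves(i):
    """Transitions of B^n_i: 'in' when not full, 'out' when not empty."""
    moves = set()
    if i < n:
        moves.add(("in", i + 1))
    if i > 0:
        moves.add(("out", i - 1))
    return moves

def impl_moves(cells):
    """Transitions of B^1 | ... | B^1, one bit per one-place buffer."""
    moves = set()
    for k, full in enumerate(cells):
        flipped = list(cells)
        flipped[k] = 1 - full
        moves.add(("out" if full else "in", tuple(flipped)))
    return moves

# The relation from the proof: i is related to every configuration
# of the n cells in which exactly i of them are full.
R = {(i, cells) for i in range(n + 1)
     for cells in product((0, 1), repeat=n) if sum(cells) == i}

def transfer_ok(i, cells):
    for (a, j) in spec_moves(i):
        if not any(b == a and (j, c2) in R for (b, c2) in impl_moves(cells)):
            return False
    for (a, c2) in impl_moves(cells):
        if not any(b == a and (j, c2) in R for (b, j) in spec_moves(i)):
            return False
    return True

print(all(transfer_ok(i, cells) for (i, cells) in R))   # True
```

The check succeeds because an 'in' move of the counter can always be matched by filling some empty cell, and any 'out' move of a cell by decrementing the counter, exactly as the proof argues.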
Exercise 3.17 (Simulation) Let us say that a binary relation R over the set of
states of an LTS is a simulation iff whenever s1 R s2 and α is an action:
- if s1 -α-> s1', then there is a transition s2 -α-> s2' such that s1' R s2'.
We say that s' simulates s, written s ≲ s', iff there is a simulation R with s R s'.
Two states s and s' are simulation equivalent, written s ≃ s', iff s ≲ s' and s' ≲ s
both hold.
    a.0 ≲ a.a.0 and
    a.b.0 + a.c.0 ≲ a.(b.0 + c.0) .
Exercise 3.18 (Ready simulation) Let us say that a binary relation R over the set
of states of an LTS is a ready simulation iff whenever s1 R s2 and α is an action:
- if s1 -α-> s1', then there is a transition s2 -α-> s2' such that s1' R s2'; and
- if s2 -α->, then s1 -α->.
We say that s' ready simulates s, written s ≲_RS s', iff there is a ready simulation R
with s R s'. Two states s and s' are ready simulation equivalent, written s ≃_RS s',
iff s ≲_RS s' and s' ≲_RS s both hold.
    a.0 ≲_RS a.a.0 and
    a.b.0 + a.c.0 ≲_RS a.(b.0 + c.0) .
Is there a CCS process that can ready simulate any other CCS process?
Argue, first of all, that P and Q are not strongly bisimilar. Next show that, for each
CCS process R and set of labels L, the processes
    (P | R) \ L and (Q | R) \ L
can deadlock in exactly the same circumstances. So P and Q have the same deadlock behaviour in all parallel contexts, even though
strong bisimilarity distinguishes them.
The lesson to be learned from these observations is that more generous notions
of behavioural equivalence than bisimilarity may be necessary to validate some
desirable equivalences.
    P | Q ∼ Q | P ,
    P | 0 ∼ P , and
    (P | Q) | R ∼ P | (Q | R) .
Moreover, a wealth of other ‘structural equivalences’ like the ones above may be
proven to hold modulo strong bisimilarity. (See (Milner, 1989, Propositions 7–8).)
[Figure (Table 3.1): Start -pub-> (CMb | CS1) \ {coin, coffee}, from which two τ-labelled transitions lead to Good and to Bad, respectively]

where

Start ≡ (CMb | CS) \ {coin, coffee}            CS  def= pub.CS1
Good ≡ (coffee.CMb | CS2) \ {coin, coffee}     CS1 def= coin.CS2
Bad ≡ (CMb | CS2) \ {coin, coffee}             CS2 def= coffee.CS .
behaviours of the system (CMb | CS) \ {coin, coffee}. In that table, for the sake of
notational convenience, we use Start as a short-hand for the CCS expression
(CMb | CS) \ {coin, coffee}. The short-hands Good and Bad are also introduced in the picture using the 'declarations'
    Good ≡ (coffee.CMb | CS2) \ {coin, coffee} and
    Bad ≡ (CMb | CS2) \ {coin, coffee} .
Note that there are two possible τ-transitions that stem from the process
(CMb | CS1) \ {coin, coffee}, and that only one of them leads to a deadlocked state.
This is why internal transitions cannot simply be erased from the behaviour of processes: in light of their pre-emptive power
in the presence of nondeterministic choices, they may affect what we may observe.
Note that the pre-emptive power of internal transitions is unimportant in the
standard theory of automata as there we are only concerned with the possibility
of processing our input strings correctly. Indeed, as you may recall from your
courses in the theory of automata, the so-called ε-transitions do not increase the
expressive power of nondeterministic finite automata—see, for instance, the text-
book (Sipser, 2005, Chapter 1). In a reactive environment, on the other hand, this
power of internal transitions must be taken into account in a reasonable definition of
process behaviour because it may lead to undesirable consequences, e.g., the dead-
lock situation in the above example. We therefore expect that the behaviour of the
process SmUni is not equivalent to that of the process (CM b | CS) \ {coin, coffee}
since the latter may deadlock after outputting a publication, whereas the former
cannot.
In order to define a notion of bisimulation that allows us to abstract from inter-
nal transitions in process behaviours, and to differentiate the process SmUni from
(CMb | CS) \ {coin, coffee}, we begin by introducing a new notion of transition
relation between processes.
Definition 3.3 Let P and Q be CCS processes, or, more generally, states in an
LTS. For each action α, we shall write P =α=> Q iff either
• α ≠ τ and there are processes P' and Q' such that P (-τ->)* P' -α-> Q' (-τ->)* Q,
• or α = τ and P (-τ->)* Q,
where we write (-τ->)* for the reflexive and transitive closure of the relation -τ->.
Thus P =α=> Q holds if P can reach Q by performing an α-labelled transition, possibly preceded and followed by sequences of τ-labelled transitions. For example,
a.τ.0 =a=> 0 and a.τ.0 =a=> τ.0 both hold, as well as a.τ.0 =τ=> a.τ.0. In fact, we have
P =τ=> P for each process P.
In the LTS depicted in Table 3.1, apart from the obvious one-step pub-labelled
transition, we have that
    Start =pub=> Good ,
    Start =pub=> Bad , and
    Start =pub=> Start .
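On a finite LTS the weak transition relations of Definition 3.3 can be computed directly from the one-step transitions. A Python sketch (our own encoding, with 'tau' standing for τ):

```python
def weak_steps(trans, tau="tau"):
    """All triples (s, a, t) with s =a=> t: an a-step possibly padded
    with tau-steps on both sides, where =tau=> is the reflexive and
    transitive closure of the one-step tau-transitions."""
    eps = {(s, s) for s in trans}           # the pairs s =tau=> t
    grow = True
    while grow:
        grow = False
        for (s, t) in list(eps):
            for (a, u) in trans[t]:
                if a == tau and (s, u) not in eps:
                    eps.add((s, u))
                    grow = True
    weak = {(s, tau, t) for (s, t) in eps}
    for (s, m) in eps:                      # tau*, then a, then tau*
        for (a, u) in trans[m]:
            if a != tau:
                weak |= {(s, a, t) for (v, t) in eps if v == u}
    return weak

# The process a.tau.0, with states named after its sub-terms.
trans = {"a.tau.0": {("a", "tau.0")}, "tau.0": {("tau", "0")}, "0": set()}
w = weak_steps(trans)
print(("a.tau.0", "a", "0") in w, ("a.tau.0", "a", "tau.0") in w)   # True True
```

The two printed memberships are exactly the claims a.τ.0 =a=> 0 and a.τ.0 =a=> τ.0 made above.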
3.4. WEAK BISIMILARITY 65
Our order of business will now be to use the new transition relations presented
above to define a notion of bisimulation that can be used to equate processes that
offer the same observable behaviour despite possibly having very different amounts
of internal computations. The idea underlying the definition of the new notion of
bisimulation is that a transition of a process can now be matched by a sequence of
transitions from the other that has the same ‘observational content’ and leads to a
state that is bisimilar to that reached by the first process.
Two states s and s0 are observationally equivalent (or weakly bisimilar), written
s ≈ s0 , iff there is a weak bisimulation that relates them. Henceforth the relation
≈ will be referred to as observational equivalence or weak bisimilarity.
• Let us examine all possible transitions from the components of the pair (s, t).
If s -τ-> s1 then t =τ=> t and (s1, t) ∈ R. If t -a-> t1 then s =a=> s2 and
(s2, t1) ∈ R.
• Let us examine all possible transitions from (s1, t). If s1 -a-> s2 then t =a=> t1
and (s2, t1) ∈ R. Similarly if t -a-> t1 then s1 =a=> s2 and again (s2, t1) ∈ R.
• Consider now the pair (s2 , t1 ). Since neither s2 nor t1 can perform any
transition, it is safe to have this pair in R.
Hence we have shown that each pair from R satisfies the conditions given in Defi-
nition 3.4, which means that R is a weak bisimulation, as claimed.
We can readily argue that a.0 ≈ a.τ.0 by establishing a weak bisimulation that
relates these two processes. (Do so by renaming the states in the labelled transition
system and in the bisimulation above!) On the other hand, there is no weak bisim-
ulation that relates the process SmUni and the process Start in Table 3.1. In fact,
the process SmUni is observationally equivalent to the process
    Spec def= pub.Spec ,
Exercise 3.21 Prove that the behavioural equivalences claimed in Exercise 2.11
hold with respect to observational equivalence (weak bisimilarity).
The definition of weak bisimilarity is so natural, at least to our mind, that it is easy
to miss some of its crucial consequences. To highlight some of these, consider the
process
    A? def= a.0 + τ.B?
    B? def= b.0 + τ.A? .
Intuitively, this process describes a ‘polling loop’ that may be seen as an imple-
mentation of a process that is willing to receive on port a and port b, and then
terminate. Indeed, it is not hard to show that
A? ≈ B? ≈ a.0 + b.0 .
(Prove this!) This seems to be non-controversial until we note that A? and B? have
a livelock (that is, a possibility of divergence) due to the τ -loop
    A? -τ-> B? -τ-> A? ,
but a.0 + b.0 does not. The above equivalences capture one of the main features of
observational equivalence, namely the fact that it supports what is called ‘fair ab-
straction from divergence’. (See (Baeten, Bergstra and Klop, 1987), where Baeten,
Bergstra and Klop show that a proof rule embodying this idea, namely Koomen’s
fair abstraction rule, is valid with respect to observational equivalence.) This means
that observational equivalence assumes that if a process can escape from a loop
consisting of internal transitions, then it will eventually do so. This property of ob-
servational equivalence, that is by no means obvious from its definition, is crucial
in using it as a correctness criterion in the verification of communication protocols,
    Send def= acc.Sending               Rec def= trans.Del
    Sending def= send.Wait              Del def= del.Ack
    Wait def= ack.Send + error.Sending  Ack def= ack.Rec

    Med def= send.Med'
    Med' def= τ.Err + trans.Med
    Err def= error.Med
where the communication media may lose messages, and messages may have to be
retransmitted some arbitrary number of times in order to ensure their delivery.
Note moreover that 0 is observationally equivalent to the process
    Div def= τ.Div .
This means that a process that can only diverge is observationally equivalent to a
deadlocked one. This may also seem odd at first sight. However, you will probably
agree that, assuming that we can only observe a process by communicating with it,
the systems 0 and Div are observationally equivalent since both refuse each attempt
at communicating with them. (They do so for different reasons, but these reasons
cannot be distinguished by an external observer.)
As an example of an application of observational equivalence to the verification
of a simple protocol, consider the process Protocol defined by
Proof: The proof follows the lines of that of Theorem 3.1, and is therefore omitted.
2
Exercise 3.23 Fill in the details of the proof of the above theorem.
Show that s ≈ t by finding a weak bisimulation containing the pair (s, t).
Exercise 3.26 Show that, for all P, Q, the following equivalences, which are usually referred to as Milner's τ-laws, hold:
    α.τ.P ≈ α.P ,
    P + τ.P ≈ τ.P , and
    α.(P + τ.Q) + α.Q ≈ α.(P + τ.Q) .
Exercise 3.28 We say that a CCS process is τ -free iff none of the states that it can
reach by performing sequences of transitions affords a τ -labelled transition. For
example, a.0 is τ -free, but a.(b.0 | b̄.0) is not.
Prove that no τ -free CCS process is observationally equivalent to a.0 + τ.0.
Exercise 3.29 Prove that, for each CCS process P , the process P \ (Act − {τ })
is observationally equivalent to 0. Does this remain true if we consider processes
modulo strong bisimilarity?
Two states s and s' are weakly string bisimilar iff there is a weak string bisimulation that relates them.
Prove that weak string bisimilarity and weak bisimilarity coincide. That is,
show that two states s and s' are weakly string bisimilar iff they are weakly bisimilar.
The notion of observational equivalence that we have just defined seems to meet
many of our desiderata. There is, however, one important property that observa-
tional equivalence does not enjoy. In fact, unlike strong bisimilarity, observational
equivalence is not a congruence. This means that, in general, we cannot substitute
observationally equivalent processes one for the other in a process context without
affecting the overall behaviour of the system.
To see this, observe that 0 is observationally equivalent to τ.0. However, it is
not hard to see that
a.0 + 0 6≈ a.0 + τ.0 .
In fact, the transition a.0 + τ.0 -τ-> 0 from the process a.0 + τ.0 can only be matched
by a.0 + 0 =τ=> a.0 + 0, and the processes 0 and a.0 + 0 are not observationally
equivalent. However, we still have that weak bisimilarity is a congruence with
respect to the remaining CCS operators.
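Both of these claims can be verified mechanically on the finite LTS of the four processes involved. The sketch below (our own encoding; 'tau' stands for τ) computes weak bisimilarity naively: it first saturates the τ-steps into weak transitions, then prunes pairs as for strong bisimilarity, with the twist that a strong step may be answered by a weak one, so in particular a τ-move may be matched by idling.

```python
def weak_bisimilarity(trans, tau="tau"):
    eps = {(s, s) for s in trans}        # reflexive-transitive tau-closure
    grow = True
    while grow:
        grow = False
        for (s, t) in list(eps):
            for (a, u) in trans[t]:
                if a == tau and (s, u) not in eps:
                    eps.add((s, u))
                    grow = True

    def weak(s, a):                      # all t with s =a=> t
        if a == tau:
            return {t for (x, t) in eps if x == s}
        out = set()
        for (x, m) in eps:
            if x == s:
                for (b, u) in trans[m]:
                    if b == a:
                        out |= {t for (v, t) in eps if v == u}
        return out

    rel = {(s, t) for s in trans for t in trans}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(rel):
            ok = (all(any((s1, t1) in rel for t1 in weak(t, a))
                      for (a, s1) in trans[s]) and
                  all(any((s1, t1) in rel for s1 in weak(s, a))
                      for (a, t1) in trans[t]))
            if not ok:
                rel.remove((s, t))
                changed = True
    return rel

trans = {
    "0": set(), "tau.0": {("tau", "0")},
    "a.0+0": {("a", "0")}, "a.0+tau.0": {("a", "0"), ("tau", "0")},
}
rel = weak_bisimilarity(trans)
print(("0", "tau.0") in rel)           # True:  0 is weakly bisimilar to tau.0
print(("a.0+0", "a.0+tau.0") in rel)   # False: summation breaks the congruence
```

The second result fails precisely because the attack a.0 + τ.0 -τ-> 0 can only be answered by idling in a.0 + 0, and 0 is not weakly bisimilar to a.0 + 0.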
Proof: The proof follows the lines of that of Theorem 3.2, and is left as an exercise
for the reader. 2
Exercise 3.32 Prove Theorem 3.4. In the proof of the second claim in the proposition, you may find the following fact useful:
    if Q =a=> Q' and R -ā-> R', then Q | R =τ=> Q' | R' .
Show this fact by induction on the number of τ-steps in the transition Q =a=> Q'.
Exercise 3.33 Give restrictions on the syntax of CCS terms so that weak
bisimilarity becomes a congruence also with respect to the choice operator.
1. Assume, to begin with, that there are only two philosophers and two forks.
Model the philosophers and the forks as CCS processes, assuming that the
philosophers and the forks are numbered from 1 to 2, and that the philoso-
phers pick the forks up in increasing order. (When he becomes hungry, the
second philosopher begins by picking up the second fork, and then picks up
the first.) Argue that the system has a deadlock by finding a state in the re-
sulting labelled transition system that is reachable from the start state, and
has no outgoing transitions.
We encourage you to find a possible deadlock in the system by yourselves,
and without using the Workbench.
2. Argue that a model of the system with five philosophers and five forks also
exhibits a deadlock.
3. Finally, assume that there are five philosophers and five forks, and that the
philosophers pick the forks up in increasing order, apart from the fifth, who
picks up the first fork before the fifth. Use the the Edinburgh Concurrency
Here we are assuming that each philosopher performs action ‘think’ when he is
thinking, and that the funding agency is not interested in knowing which specific
philosopher is thinking!
Exercise 3.35 (For the theoretically minded) A binary relation R over the set of
states of an LTS is a branching bisimulation (Glabbeek and Weijland, 1996) iff it is
symmetric, and whenever s R t and α is an action (including τ ):
if s -α-> s', then
- either α = τ and s' R t,
- or there are a k ≥ 0 and a sequence of transitions
    t = t0 -τ-> t1 -τ-> t2 · · · tk -α-> t'
  such that s R tk and s' R t'.
Two states s and t are branching bisimulation equivalent (or branching bisimi-
lar) iff there is a branching bisimulation that relates them. The largest branching
bisimulation is called branching bisimilarity.
1. Show that branching bisimilarity is contained in weak bisimilarity.
2. Can you find two processes that are weakly bisimilar, but not branching
bisimilar?
3. Which of the τ -laws from Exercise 3.26 holds with respect to branching
bisimilarity?
Exercise 3.36 Define the binary relation ≈ c over the set of states of an LTS as
follows:
s1 ≈c s2 iff for each action α (including τ):
- if s1 -α-> s1', then there is a sequence of transitions s2 =τ=> s2'' -α-> s2''' =τ=> s2' such that s1' ≈ s2';
- if s2 -α-> s2', then there is a sequence of transitions s1 =τ=> s1'' -α-> s1''' =τ=> s1' such that s1' ≈ s2'.

3.5. GAME CHARACTERIZATION OF BISIMILARITY 73
Definition 3.5 [Strong bisimulation game] Let (Proc, Act, {-α-> | α ∈ Act}) be a
labelled transition system. A strong bisimulation game starting from the pair of
states (s1 , t1 ) ∈ Proc × Proc is a two-player game of an ‘attacker’ and a ‘de-
fender’.
The game is played in rounds, and configurations of the game are pairs of states
from Proc × Proc. In every round exactly one configuration is called current ;
initially the configuration (s1 , t1 ) is the current one.
In each round the players change the current configuration (s, t) according to
the following rules.
1. The attacker chooses either the left- or the right-hand side of the current
configuration (s, t) and an action α from Act.
• If the attacker chose left then he has to perform a transition s -α-> s' for
some state s' ∈ Proc.
• If the attacker chose right then he has to perform a transition t -α-> t' for
some state t' ∈ Proc.
2. In this step the defender must provide an answer to the attack made in the
previous step.
• If the attacker chose left then the defender plays on the right-hand side,
and has to respond by making a transition t -α-> t' for some t' ∈ Proc.
• If the attacker chose right then the defender plays on the left-hand side,
and has to respond by making a transition s -α-> s' for some s' ∈ Proc.
3. The configuration (s0 , t0 ) becomes the current configuration and the game
continues for another round according to the rules described above.
A finite play is lost by the player who is stuck and cannot make a move from
the current configuration (s, t) according to the rules of the game. Note that the
attacker loses a finite play only if both s ↛ and t ↛, i.e., there is no transition
from either the left- or the right-hand side of the configuration. The defender loses
a finite play if he has (on his side of the configuration) no available transition under
the action selected by the attacker.
It can also be the case that none of the players is stuck in any configuration
and the play is infinite. In this situation the defender is the winner of the play.
Intuitively, this is a natural choice of outcome because if the play is infinite then the
attacker has been unable to find a ‘difference’ in the behaviour of the two systems—
which will turn out to be bisimilar.
A given play is always winning either for the attacker or the defender and it
cannot be winning for both at the same time.
The following proposition relates strong bisimilarity with the corresponding
game characterization (see, e.g., (Stirling, 1995; Thomas, 1993)).
Proposition 3.3 States s1 and t1 of a labelled transition system are strongly bisim-
ilar if and only if the defender has a universal winning strategy in the strong bisim-
ulation game starting from the configuration (s 1 , t1 ). The states s1 and t1 are not
strongly bisimilar if and only if the attacker has a universal winning strategy.
By universal winning strategy we mean that the player can always win the game,
regardless of how the other player is selecting his moves. In case the opponent has
more than one choice for how to continue from the current configuration, all these
possibilities have to be considered.
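Proposition 3.3 also suggests how to compute winners on a finite LTS: the configurations from which the defender has a universal winning strategy form a greatest fixed point, and every configuration pruned from it yields a concrete attack. A Python sketch (our own encoding; the transition relation is our reading of the figure in Example 3.6 below):

```python
def defender_wins(trans):
    """Greatest fixed point: a configuration is kept iff every attack
    from it can be answered into a kept configuration."""
    win = {(s, t) for s in trans for t in trans}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(win):
            safe = (all(any(b == a and (s1, t1) in win for (b, t1) in trans[t])
                        for (a, s1) in trans[s]) and
                    all(any(b == a and (s1, t1) in win for (b, s1) in trans[s])
                        for (a, t1) in trans[t]))
            if not safe:
                win.remove((s, t))
                changed = True
    return win

def attacking_move(trans, win, s, t):
    """An attack that the defender cannot answer into a winning pair."""
    for (a, s1) in trans[s]:                       # attack on the left
        if not any(b == a and (s1, t1) in win for (b, t1) in trans[t]):
            return ("left", a, s1)
    for (a, t1) in trans[t]:                       # attack on the right
        if not any(b == a and (s1, t1) in win for (b, s1) in trans[s]):
            return ("right", a, t1)
    return None                                    # defender wins from here

trans = {
    "s": {("a", "s1")}, "s1": {("b", "s2"), ("c", "s3")},
    "s2": set(), "s3": set(),
    "t": {("a", "t1"), ("a", "t2")}, "t1": {("b", "t3")}, "t2": {("c", "t4")},
    "t3": set(), "t4": set(),
}
win = defender_wins(trans)
print(("s", "t") in win)                       # False: s and t are not bisimilar
print(attacking_move(trans, win, "s", "t"))    # ('left', 'a', 's1')
```

The returned attack is exactly the first universal winning strategy described in Example 3.6: play s -a-> s1, after which neither of the defender's answers leads into a defender-winning configuration.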
The notion of a universal winning strategy is best explained by means of an
example.
Example 3.5 Let us recall the transition system from Example 3.1.
[Figure: s -a-> s1, s -a-> s2, s1 -b-> s2, s2 -b-> s2; t -a-> t1, t1 -b-> t1]
We will show that the defender has a universal winning strategy from the configu-
ration (s, t) and hence, in light of Proposition 3.3, that s ∼ t. In order to do so, we
have to consider all possible attacker’s moves from this configuration and define
defender’s response to each of them. The attacker can make three different moves
from (s, t).
a
1. Attacker selects right-hand side, action a and makes the move t → t1 ,
a
2. Attacker selects left-hand side, action a and makes the move s → s2 .
a
3. Attacker selects left-hand side, action a and makes the move s → s1 .
a
• Defender’s answer to attack 1. is by playing s → s2 .
(Even though there are more possibilities it is sufficient to provide only one
suitable answer.)
The current configuration becomes (s 2 , t1 ).
a
• Defender’s answer to attack 2. is by playing t → t1 .
The current configuration becomes again (s 2 , t1 ).
a
• Defender’s answer to attack 3. is by playing t → t1 .
The current configuration becomes (s 1 , t1 ).
Now it remains to show that the defender has a universal winning strategy from the
configurations (s2 , t1 ) and (s1 , t1 ).
From (s2 , t1 ) it is easy to see that any continuation of the game will always
go through the same current configuration (s 2 , t1 ) and hence the game will be
necessarily infinite. According to the definition of a winning play, the defender is
the winner in this case.
From (s1, t1) the attacker has two possible moves, either s1 -b-> s2 or t1 -b-> t1.
In the former case the defender answers by t1 -b-> t1, and in the latter case by
s1 -b-> s2. The next configuration is in both cases (s2, t1), and we already know
that the defender has a winning strategy from this configuration.
Hence we have shown that the defender has a universal winning strategy from
the configuration (s, t) and, according to Proposition 3.3, this means that s ∼ t.
Example 3.6 Let us consider the following transition system (we provide only its
graphical representation).
[Figure: s -a-> s1, s1 -b-> s2, s1 -c-> s3; t -a-> t1, t -a-> t2, t1 -b-> t3, t2 -c-> t4]
We will show that s 6∼ t by describing a universal winning strategy for the attacker
in the bisimulation game starting from (s, t). We will in fact show two different
strategies (but of course finding one is sufficient for proving non-bisimilarity).
• In the first strategy, the attacker selects the left-hand side, action a and the
transition s -a-> s1. The defender can answer by t -a-> t1 or t -a-> t2. This means
that we will have to consider two different configurations in the next round,
namely (s1, t1) and (s1, t2). From (s1, t1) the attacker wins by playing the
transition s1 -c-> s3 on the left-hand side, and the defender cannot answer as
there is no c-transition from t1. From (s1, t2) the attacker wins by playing
s1 -b-> s2, and the defender again has no answer from t2. As we have analyzed all
the defender's possibilities, and the attacker wins in every one of them, we
have found a universal winning strategy for the attacker. Hence s and t are
not bisimilar.
• Now we provide another strategy, which is easier to describe and involves
switching sides. Starting from (s, t) the attacker plays on the right-hand
side according to the transition t -a-> t1, and the defender can only answer by
s -a-> s1 on the left-hand side (no more configurations need to be examined as
this is the only possibility for the defender). The current configuration hence
becomes (s1, t1). In the next round the attacker plays s1 -c-> s3 and wins the
game as t1 has no c-transition.
Example 3.7 Let us consider a slightly more complex transition system.
[Figure: s -a-> s1, s -a-> s2, s1 -b-> s3, s2 -b-> s2; t -a-> t1, t1 -b-> t1, t1 -b-> t2]
We will define attacker’s universal winning strategy from (s, t) and hence show
that s 6∼ t.
a
In the first round the attacker plays on the left-hand side the move s → s1
a
and the defender can only answer by t → t1 . The current configuration becomes
(s1 , t1 ). In the second round the attacker plays on the right-hand side according
b b
to the transition t1 → t1 and the defender can only answer by s 1 → s3 . The
current configuration becomes (s3 , t1 ). Now the attacker wins by playing again
b b
the transition t1 → t1 (or t1 → t2 ) and the defender loses because s3 9.
[Figure: four labelled transition systems rooted at s, t, u and v, with transitions labelled a and b]
Exercise 3.38 (For the theoretically minded) Prove Proposition 3.3 on page 75.
Hint: Argue that, using the universal winning strategy for the defender, you can
find a strong bisimulation, and conversely that, given a strong bisimulation, you
can define a universal winning strategy for the defender.
Exercise 3.39 (For the theoretically minded) Recall from Exercise 3.17 that a
binary relation R over the set of states of an LTS is a simulation iff whenever
s1 R s2 and a is an action then
• if s1 -a-> s1', then there is a transition s2 -a-> s2' such that s1' R s2'.
A binary relation R over the set of states of an LTS is a 2-nested simulation iff R
is a simulation and moreover R⁻¹ is included in the simulation preorder ≲.
Exercise 3.40 (For the theoretically minded) Can you change the rules of the
strong bisimulation game in such a way that it characterizes the ready simulation
preorder introduced in Exercise 3.18?
The definitions of a play and winning strategy are exactly as before, and we have a similar proposition as for the strong bisimulation game.

Proposition 3.4 Two states s1 and t1 of a labelled transition system are weakly bisimilar if and only if the defender has a universal winning strategy in the weak bisimulation game starting from the configuration (s1, t1). The states s1 and t1 are not weakly bisimilar if and only if the attacker has a universal winning strategy.
We remind the reader of the fact that, in the weak bisimulation game from the current configuration (s, t), if the attacker chooses a move under the silent action τ (let us say s -τ-> s′) then the defender can (as one possibility) simply answer by doing ‘nothing’, i.e., by idling in the state t (as we always have t =τ=> t). In that case, the current configuration becomes (s′, t).
Again, the notions of play and universal winning strategy in the weak bisimu-
lation game are best explained by means of an example.
Example 3.8 Consider the following transition system.
[Diagram: s -a-> s1, with s1 -a-> s2 and s1 -b-> s3; t -τ-> t1, with t1 -a-> t2 and t1 -a-> t3, t3 -τ-> t2, t2 -a-> t4 and t3 -b-> t5.]
We will show that s ≉ t by defining a universal winning strategy for the attacker in the weak bisimulation game from (s, t).

In the first round, the attacker selects the left-hand side and action a, and plays the move s -a-> s1. The defender has three possible moves to answer: (i) t =a=> t2 via t1, (ii) t =a=> t2 via t1 and t3, and (iii) t =a=> t3 via t1. In cases (i) and (ii) the current configuration becomes (s1, t2), and in case (iii) it becomes (s1, t3).

From the configuration (s1, t2) the attacker wins by playing s1 -b-> s3, and the defender loses because t2 affords no weak b-labelled transition.

From the configuration (s1, t3) the attacker plays the τ-move from the right-hand side: t3 -τ-> t2. The defender's only answer from s1 is s1 =τ=> s1, because no τ actions are enabled from s1. The current configuration becomes (s1, t2) and, as argued above, the attacker has a winning strategy from this pair.

This concludes the proof and shows that s ≉ t, because we found a universal winning strategy for the attacker.
Exercise 3.41 In the weak bisimulation game the attacker is allowed to use -a-> moves for the attacks, and the defender can use =a=> moves in response. Argue that if we modify the rules of the game so that the attacker can also use moves of the form =a=>, then this does not give the attacker any additional power. Conclude that both versions of the game provide the same answer about bisimilarity/non-bisimilarity of two processes.
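Computationally, a defender move t =a=> t′ is a τ-closure, followed by a strong a-step, followed by another τ-closure. The following sketch is illustrative Python (the function names are our own, not the book's); the transition set is the t-part of the LTS of Example 3.8 above:

```python
# Weak transitions =a=> as tau-closure ; strong step ; tau-closure.
# Transitions of the t-part of the LTS in Example 3.8.
trans = {("t", "tau", "t1"), ("t1", "a", "t2"),
         ("t1", "a", "t3"), ("t3", "tau", "t2")}

def tau_closure(p):
    """All states reachable from p by zero or more tau-steps."""
    seen, stack = {p}, [p]
    while stack:
        q = stack.pop()
        for (r, act, q2) in trans:
            if r == q and act == "tau" and q2 not in seen:
                seen.add(q2)
                stack.append(q2)
    return seen

def weak_succ(p, a):
    """All states q with p =a=> q."""
    if a == "tau":                     # =tau=> allows zero steps
        return tau_closure(p)
    return {q2
            for p1 in tau_closure(p)
            for (r, act, q) in trans if r == p1 and act == a
            for q2 in tau_closure(q)}

print(sorted(weak_succ("t", "a")))  # ['t2', 't3']
```

As computed, t =a=> t2 and t =a=> t3, which are exactly the end states of the defender's possible answers in Example 3.8.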
Chapter 4

Theory of fixed points
The aim of this chapter is to collect under one roof all the mathematical notions from the theory of partially ordered sets and lattices that are needed to introduce Tarski's classic fixed point theorem. You might think that this detour into some exotic-looking mathematics is unwarranted in this textbook. However, we shall then
put these possible doubts of yours to rest by using this fixed point theorem to give
an alternative definition of strong bisimulation equivalence. This reformulation of
the notion of strong bisimulation equivalence is not just mathematically pleasing,
but it also yields an algorithm for computing the largest strong bisimulation over fi-
nite labelled transition systems—i.e., labelled transition systems with only finitely
many states, actions and transitions. This is an illustrative example of how appar-
ently very abstract mathematical notions turn out to have algorithmic content and,
possibly unexpected, applications in Computer Science. As you will see in what
follows, we shall also put Tarski’s fixed point theorem to good use in Chapter 6,
where the theory developed in this chapter will allow us to understand the meaning
of recursively defined properties of reactive systems.
Definition 4.1 [Partially ordered sets] A partially ordered set (often abbreviated to poset) is a pair (D, ⊑), where D is a set, and ⊑ is a binary relation over D (i.e., a subset of D × D) such that:

• ⊑ is reflexive, i.e., d ⊑ d for each d ∈ D;
• ⊑ is antisymmetric, i.e., d ⊑ e and e ⊑ d imply d = e, for each d, e ∈ D;
• ⊑ is transitive, i.e., d ⊑ e and e ⊑ d′ imply d ⊑ d′, for each d, d′, e ∈ D.
We moreover say that (D, ⊑) is a totally ordered set if, for all d, e ∈ D, either d ⊑ e or e ⊑ d holds.
Example 4.1 The following structures are posets.

• (N, ≤), where N denotes the set of natural numbers, and ≤ stands for the standard ordering over N.

• (R, ≤), where R denotes the set of real numbers, and ≤ stands for the standard ordering over R.

• (A*, ≤), where A* is the set of strings over alphabet A, and ≤ denotes the prefix ordering between strings, i.e., for all s, t ∈ A*, s ≤ t iff there exists w ∈ A* such that sw = t. (Check that this is indeed a poset!)
• Let (A, ≤) be a finite totally ordered set. Then (A*, ≺), the set of strings in A* ordered lexicographically, is a poset. Recall that, for all s, t ∈ A*, the relation s ≺ t holds with respect to the lexicographic order if one of the following conditions applies:

  – s is a prefix of t, or
  – there are strings u, v, w ∈ A* and letters a, b ∈ A with a ≤ b and a ≠ b such that s = uav and t = ubw.
• Let (D, ⊑) be a poset and S be a set. Then the collection of functions from S to D is a poset when equipped with the ordering relation defined thus: f ⊑ g iff f(s) ⊑ g(s) for each s ∈ S.
We encourage you to think of other examples of posets you are familiar with.
Exercise 4.1 Convince yourselves that the structures mentioned in the above ex-
ample are indeed posets. Which of the above posets is a totally ordered set?
As witnessed by the list of structures in Example 4.1 and by the many other examples that you have met in your discrete mathematics courses, posets are abundant in mathematics. Another example of a poset that will play an important role in the developments to follow is the structure (2^S, ⊆), where S is a set, 2^S stands for the set of all subsets of S, and ⊆ denotes set inclusion. For instance, the structure (2^Proc, ⊆) is a poset for each set of states Proc in a labelled transition system.

Exercise 4.2 Is the poset (2^S, ⊆) totally ordered?
Definition 4.2 [Least upper bounds and greatest lower bounds] Let (D, ⊑) be a poset, and take X ⊆ D.

• We say that d ∈ D is an upper bound for X iff x ⊑ d for all x ∈ X. We say that d is the least upper bound (lub) of X, notation ⊔X, iff d is an upper bound for X and, moreover, d ⊑ d′ for every upper bound d′ of X.

• We say that d ∈ D is a lower bound for X iff d ⊑ x for all x ∈ X. We say that d is the greatest lower bound (glb) of X, notation ⊓X, iff d is a lower bound for X and, moreover, d′ ⊑ d for every lower bound d′ of X.
In the poset (N, ≤), all finite subsets of N have least upper bounds. Indeed, the least upper bound of such a set is its largest element. On the other hand, no infinite subset of N has an upper bound. Every non-empty subset of N has a least element, which is its greatest lower bound.

In (2^S, ⊆), every subset X of 2^S has a lub and a glb given by ∪X and ∩X, respectively. For example, consider the poset (2^N, ⊆), consisting of the family of subsets of the set of natural numbers N ordered by inclusion. Take X to be the collection of finite sets of even numbers. Then ∪X is the set of even numbers and ∩X is the empty set. (Can you see why?)
Exercise 4.4

1. Prove that the lub and the glb of a subset X of 2^S are indeed ∪X and ∩X, respectively.
2. Give examples of subsets of {a, b}* that have upper bounds in the poset ({a, b}*, ≤), where ≤ is the prefix ordering over strings defined in the third bullet of Example 4.1. Find examples of subsets of {a, b}* that do not have upper bounds in that poset.
As you have seen already, a poset like (2^S, ⊆) has the pleasing property that each of its subsets has both a least upper bound and a greatest lower bound. Posets with this property will play a crucial role in what follows, and we now introduce them formally.
Definition 4.3 [Complete lattices] A poset (D, ⊑) is a complete lattice iff ⊔X and ⊓X exist for every subset X of D.
Note that a complete lattice (D, ⊑) has a least element ⊥ = ⊓D, often called bottom, and a top element ⊤ = ⊔D. For example, the bottom element of the poset (2^S, ⊆) is the empty set, and the top element is S. (Why?) By Exercise 4.3, the least and top elements of a complete lattice are unique.
Exercise 4.5 Let (D, ⊑) be a complete lattice. What are ⊔∅ and ⊓∅? Hint: Each element of D is both a lower bound and an upper bound for ∅. Why?
Example 4.2
• The poset (N, ≤) is not a complete lattice because, as remarked previously,
it does not have least upper bounds for its infinite subsets.
• The poset (N ∪ {∞}, ⊑), obtained by adding a largest element ∞ to (N, ≤), is instead a complete lattice. This complete lattice can be pictured as the chain

  0 ↑ 1 ↑ 2 ↑ · · ·

with ∞ as top element, where ⊑ is the reflexive and transitive closure of the ↑ relation.
• (2^S, ⊆) is a complete lattice.
Of course, you should convince yourselves of these claims!
[Diagram: the complete lattice with elements ⊥, 0, 1 and ⊤, where ⊥ ⊑ 0, ⊥ ⊑ 1, 0 ⊑ ⊤ and 1 ⊑ ⊤, and the elements 0 and 1 are incomparable.]
The identity function over {⊥, 0, 1, ⊤} is monotonic, but the function mapping ⊥ to 0 and acting like the identity function on all of the other elements is not. (Why?)
Note that both of the posets mentioned above are in fact complete lattices.
Intuitively, if we view the partial order relation in a poset (D, ⊑) as an ‘information order’—that is, if we view d ⊑ d′ as meaning that ‘d′ has at least as much information as d’—then monotonic functions have the property that providing more information in the input will offer at least as much information as we had before in the output. (Our somewhat imprecise, but hopefully suggestive, slogan during lectures on this topic is that a monotonic function is one with the property that ‘the more you get in, the more you get out!’)
The following important theorem is due to Tarski (Tarski, 1955), and was also
independently proven for the special case of lattices of sets by Knaster (Knaster,
1928).
Theorem 4.1 [Tarski's fixed point theorem] Let (D, ⊑) be a complete lattice, and let f : D → D be monotonic. Then f has a largest fixed point z_max and a least fixed point z_min given by

  z_max = ⊔{x ∈ D | x ⊑ f(x)} and
  z_min = ⊓{x ∈ D | f(x) ⊑ x} .
Proof: First we shall prove that z_max is the largest fixed point of f. This involves proving the following two statements:

1. z_max is a fixed point of f, i.e., z_max = f(z_max), and
2. for each d ∈ D, if d is a fixed point of f, then d ⊑ z_max.

In what follows we prove each of these statements separately. In the rest of the proof we let

  A = {x ∈ D | x ⊑ f(x)} .

1. We first show that z_max is a fixed point of f. To this end, it suffices to prove that

  z_max ⊑ f(z_max)   (4.1)

and

  f(z_max) ⊑ z_max .   (4.2)

First of all, we shall show that (4.1) holds. By definition, we have that

  z_max = ⊔A .
Take any x ∈ A. Then x ⊑ z_max, since z_max is an upper bound for A, and hence, by the definition of A and the monotonicity of f,

  x ⊑ f(x) ⊑ f(z_max) .

Thus f(z_max) is an upper bound for the set A. By definition, z_max is the least upper bound of A. Thus z_max ⊑ f(z_max), and we have shown (4.1).
To prove that (4.2) holds, note that, from (4.1) and the monotonicity of f, we have that f(z_max) ⊑ f(f(z_max)). This implies that f(z_max) ∈ A. Therefore f(z_max) ⊑ z_max, as z_max is an upper bound for A.

From (4.1) and (4.2), we have that z_max ⊑ f(z_max) ⊑ z_max. By antisymmetry, it follows that z_max = f(z_max), i.e., z_max is a fixed point of f.

2. We now show that z_max is the largest fixed point of f. Let d be any fixed point of f. Then, in particular, we have that d ⊑ f(d). This implies that d ∈ A and therefore that d ⊑ ⊔A = z_max.

We have thus shown that z_max is the largest fixed point of f.
To show that z_min is the least fixed point of f, we proceed in a similar fashion by proving the following two statements:

1. z_min is a fixed point of f, i.e., z_min = f(z_min), and
2. for each d ∈ D, if d is a fixed point of f, then z_min ⊑ d.

To prove the first statement, one shows that

  f(z_min) ⊑ z_min   (4.3)

and

  z_min ⊑ f(z_min) .   (4.4)

Claim (4.3) can be shown following the proof for (4.1), and claim (4.4) can be shown following the proof for (4.2). The details are left as an exercise for the reader. Having shown that z_min is a fixed point of f, it is a simple matter to prove that it is indeed the least fixed point of f. (Do this as an exercise.) □
Consider, for example, a complete lattice of the form (2^S, ⊆), where S is a set, and a monotonic function f : 2^S → 2^S. If we instantiate the statement of the above theorem to this setting, the largest and least fixed points for f can be characterized thus:

  z_max = ∪{X ⊆ S | X ⊆ f(X)} and
  z_min = ∩{X ⊆ S | f(X) ⊆ X} .
Consider, for instance, the function f : 2^S → 2^S defined by f(X) = X ∪ {1, 2}, where S is any set containing 1 and 2. This function is monotonic, and by the above characterization its least fixed point is {1, 2}. This follows because X ∪ {1, 2} ⊆ X means that X already contains 1 and 2, and the smallest set with this property is {1, 2}.
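These set-theoretic characterizations can be checked by brute-force enumeration on a small powerset lattice. The following Python sketch is our own illustration, not the book's; the choice S = {1, 2, 3} is an arbitrary assumption made for the example:

```python
# Brute-force check of z_max = U{X ⊆ S | X ⊆ f(X)} and
# z_min = ∩{X ⊆ S | f(X) ⊆ X} for f(X) = X ∪ {1, 2} on (2^S, ⊆).
from itertools import chain, combinations

S = {1, 2, 3}

def f(X):
    return X | {1, 2}

def subsets(S):
    """All subsets of S, i.e., the elements of 2^S."""
    s = list(S)
    return [set(c) for c in
            chain.from_iterable(combinations(s, k) for k in range(len(s) + 1))]

post = [X for X in subsets(S) if X <= f(X)]   # post-fixed points: X ⊆ f(X)
pre  = [X for X in subsets(S) if f(X) <= X]   # pre-fixed points: f(X) ⊆ X

z_max = set().union(*post)
z_min = set.intersection(*pre)

print(sorted(z_min), sorted(z_max))  # [1, 2] [1, 2, 3]
```

Every subset is a post-fixed point of this f, so z_max is all of S, while the pre-fixed points are exactly the sets containing {1, 2}, whose intersection is {1, 2}.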
The following important theorem gives a characterization of the largest and
least fixed points for monotonic functions over finite complete lattices. We shall
see in due course how this result gives an algorithm for computing the fixed points
which will find application in equivalence checking and in the developments in
Chapter 6.
For a function f : D → D, an element d ∈ D and a natural number n, we write f^n(d) for the n-fold application of f to d, defined as follows:

  f^0(d) = d and
  f^(n+1)(d) = f(f^n(d)) .
Theorem 4.2 Let (D, ⊑) be a finite complete lattice and let f : D → D be monotonic. Then the least fixed point for f is obtained as

  z_min = f^m(⊥) ,

for some natural number m. Furthermore, the largest fixed point for f is obtained as

  z_max = f^M(⊤) ,

for some natural number M.

Proof: We only prove the first statement, as the proof for the second one is similar. As f is monotonic, we have the following non-decreasing sequence

  ⊥ ⊑ f(⊥) ⊑ f²(⊥) ⊑ · · ·

of elements of D.
Exercise 4.8 (For the theoretically minded) Fill in the details in the proof of the
above theorem.
Consider, for example, the function f : 2^{0,1} → 2^{0,1} given by

  f(X) = X ∪ {0} .

This function is monotonic, and 2^{0,1} is a complete lattice, when ordered using set inclusion, with the empty set as least element and {0, 1} as largest element. The above theorem gives an algorithm for computing the least and largest fixed points of f. To compute the least fixed point, we begin by applying f to the empty set. The result is {0}. Since we have added 0 to the input of f, we have not found our least fixed point yet. Therefore we proceed by applying f to {0}. We have that

  f({0}) = {0} ∪ {0} = {0} .

It follows that, not surprisingly, {0} is the least fixed point of the function f.
To compute the largest fixed point of f, we begin by applying f to the top element in our lattice, namely the set {0, 1}. Observe that

  f({0, 1}) = {0, 1} ∪ {0} = {0, 1} .

Therefore {0, 1} is the largest fixed point of f.
Use Theorem 4.2 to compute the least and largest fixed point of g.
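The iteration in Theorem 4.2 translates directly into a small program. Here is an illustrative Python sketch (the names are our own) for the function f(X) = X ∪ {0} on the complete lattice (2^{0,1}, ⊆) discussed above:

```python
# Compute the least and largest fixed points of a monotonic function
# on a finite complete lattice by iterating from bottom and from top.

def f(X):
    return X | {0}          # the function f(X) = X ∪ {0}

def iterate(f, start):
    """Apply f repeatedly from `start` until the value stabilizes;
    on a finite complete lattice this yields a fixed point of f."""
    current = start
    while f(current) != current:
        current = f(current)
    return current

bottom, top = set(), {0, 1}        # ⊥ and ⊤ in (2^{0,1}, ⊆)
print(sorted(iterate(f, bottom)))  # [0]     least fixed point f^m(⊥)
print(sorted(iterate(f, top)))     # [0, 1]  largest fixed point f^M(⊤)
```

The loop terminates because, as in the proof of the theorem, the iterates form a non-decreasing (respectively non-increasing) sequence in a finite lattice.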
Exercise 4.10 (For the theoretically minded) This exercise is for those amongst
you that enjoy the mathematics of partially ordered sets. It has no direct bearing
on the theory of reactive systems covered in the rest of the textbook.
  d0 ⊑ d1 ⊑ d2 ⊑ · · ·

in (D, ⊑) has a least upper bound (written ⊔_{i≥0} d_i). A function f : D → D is continuous (see, for instance, (Nielson and Nielson, 1992, Page 103)) if

  f(⊔_{i≥0} d_i) = ⊔_{i≥0} f(d_i)

for each increasing chain d0 ⊑ d1 ⊑ d2 ⊑ · · ·.
Hint: Consider the lattice pictured above, but turned upside down.
We invite those amongst you who would like to learn more about the mathematics
of partially ordered sets and lattices to consult the book (Davey and Priestley, 2002)
and the collection of notes (Harju, 2006).
In what follows we shall describe the relation ∼ as a fixed point of a suitable monotonic function. First we note that (2^(Proc×Proc), ⊆) (i.e., the set of binary relations over Proc ordered by set inclusion) is a complete lattice with ∪ and ∩ as least upper bound and greatest lower bound. (Why? In fact, you should be able to realize readily that we have seen this kind of complete lattice in our previous developments!)

Consider now a binary relation R over Proc—that is, an element of the set 2^(Proc×Proc). We define the set F(R) as follows: (s, t) ∈ F(R) if and only if

• for each transition s -a-> s′ there is a transition t -a-> t′ such that (s′, t′) ∈ R, and
• for each transition t -a-> t′ there is a transition s -a-> s′ such that (s′, t′) ∈ R.

In other words, F(R) contains all the pairs of processes from which, in one round of the bisimulation game, the defender can make sure that the players reach a current pair of processes that is already contained in R.
Since a relation R is a strong bisimulation exactly when R ⊆ F(R), and ∼ is the union of all strong bisimulations, we have

  ∼ = ∪{R ∈ 2^(Proc×Proc) | R ⊆ F(R)} .

Take a minute to look at the above equality, and compare it with the characterization of the largest fixed point of a monotonic function given by Tarski's fixed point theorem (Theorem 4.1). That theorem tells us that the largest fixed point of a monotonic function f is the least upper bound of the set of elements x such that x ⊑ f(x)—these are called the post-fixed points of the function. In our specific setting, the least upper bound of a subset of 2^(Proc×Proc) is given by ∪, and the post-fixed points of F are precisely the binary relations R over Proc such that R ⊆ F(R). This means that the definition of ∼ matches the one for the largest fixed point for F perfectly!
We note that if R, S ∈ 2^(Proc×Proc) and R ⊆ S, then F(R) ⊆ F(S)—that is, the function F is monotonic over (2^(Proc×Proc), ⊆). (Check this!) Therefore, as all the conditions for Tarski's theorem are satisfied, we can conclude that ∼ is indeed the largest fixed point of F. In particular, by Theorem 4.2, if Proc is finite then ∼ is equal to F^M(Proc × Proc) for some integer M ≥ 0. Note how this gives us an algorithm to calculate ∼ for a given finite labelled transition system. To compute ∼, simply evaluate the non-increasing sequence

  F^0(Proc × Proc) ⊇ F^1(Proc × Proc) ⊇ F^2(Proc × Proc) ⊇ · · ·

until the sequence stabilizes. (Recall that F^0(Proc × Proc) is just the top element in the complete lattice, namely Proc × Proc.)
Example 4.4 Consider the labelled transition system described by the following
defining equations in CCS:
Q1 = b.Q2 + a.Q3
Q2 = c.Q4
Q3 = c.Q4
Q4 = b.Q2 + a.Q3 + a.Q1 .
The set of states in the labelled transition system associated with these defining equations is

  Proc = {Qi | 1 ≤ i ≤ 4} .

Below, we use I to denote the identity relation over Proc, that is,

  I = {(Qi, Qi) | 1 ≤ i ≤ 4} .
Therefore, the only distinct processes that are related by the largest strong bisimulation over this labelled transition system are Q2 and Q3, and indeed Q2 ∼ Q3.
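The iterative algorithm just described is easy to program. The following Python sketch is our own illustration, not the book's tool set; it computes the largest strong bisimulation over the labelled transition system of Example 4.4 by iterating F from the top element Proc × Proc:

```python
# Largest strong bisimulation as the limit of F^i(Proc x Proc).
from itertools import product

# Transitions of Example 4.4: Q1 = b.Q2 + a.Q3, Q2 = c.Q4,
# Q3 = c.Q4, Q4 = b.Q2 + a.Q3 + a.Q1.
trans = {("Q1", "b", "Q2"), ("Q1", "a", "Q3"),
         ("Q2", "c", "Q4"), ("Q3", "c", "Q4"),
         ("Q4", "b", "Q2"), ("Q4", "a", "Q3"), ("Q4", "a", "Q1")}
procs = {"Q1", "Q2", "Q3", "Q4"}

def step(p):
    """All pairs (action, successor) of process p."""
    return {(a, q) for (r, a, q) in trans if r == p}

def F(R):
    """One round of the bisimulation game: keep the pairs (s, t)
    from which the defender can answer every attack inside R."""
    return {(s, t) for (s, t) in product(procs, procs)
            if all(any(a == b and (s1, t1) in R for (b, t1) in step(t))
                   for (a, s1) in step(s))
            and all(any(a == b and (s1, t1) in R for (b, s1) in step(s))
                    for (a, t1) in step(t))}

R = set(product(procs, procs))   # F^0: the top element Proc x Proc
while F(R) != R:                 # iterate until the sequence stabilizes
    R = F(R)

print(sorted(R))
```

On this LTS the iteration stabilizes after a few rounds at the identity relation together with the pair Q2, Q3 (in both orders), matching the analysis above.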
Exercise 4.11 Using the iterative algorithm described above, compute the largest
strong bisimulation over the labelled transition system described by the following
defining equations in CCS:
P1 = a.P2
P2 = a.P1
P3 = a.P2 + a.P4
P4 = a.P3 + a.P5
P5 = 0 .
You may find it useful to draw the labelled transition system associated with the
above CCS definition first.
Exercise 4.12 Use the iterative algorithm described above to compute the largest
bisimulation over the labelled transition system in Example 3.7.
Exercise 4.13 What is the worst case complexity of the algorithm outlined above
when run on a labelled transition system consisting of n states and m transitions?
Express your answer using O-notation, and compare it with the complexity of the
algorithm due to Page and Tarjan mentioned in Section 3.6.
Exercise 4.14 Let (Proc, Act, {-a-> | a ∈ Act}) be a labelled transition system. For each i ≥ 0, define the relation ∼i over Proc as follows:

• s1 ∼0 s2 holds always;

3. ∼i = F^i(Proc × Proc).
Exercise 4.15
Chapter 5

Hennessy-Milner logic
No doubt, you will be able to come up with many other examples of similar properties of the computer scientist that we may wish to verify.
All of the aforementioned properties, and many others, seem best checked by
exploring the state space of the process under consideration, rather than by trans-
forming them into equivalence checking questions. However, before even thinking
of checking whether these properties hold of a process, either manually or automat-
ically, we need to have a language for expressing them. This language must have
a formal syntax and semantics, so that it can be understood by a computer, and al-
gorithms to check whether a process affords a property may be devised. Moreover,
the use of a language with a well defined and intuitively understandable seman-
tics will also allow us to overcome the imprecision that often accompanies natural
language descriptions. For instance, what do we really mean when we say that
our computer scientist is willing to drink both coffee and tea now?
Do we mean that, in its current state, the computer scientist can perform both a
coffee-labelled transition and a tea-labelled one? Or do we mean that these tran-
sitions should be possible one after the other? And, may these transitions be pre-
ceded and/or followed by sequences of internal steps? Whether our computer sci-
entist affords the specified property clearly depends on the answer to the questions
above, and the use of a language with a formal semantics will help us understand
precisely what is meant. Moreover, giving a formal syntax to our specification
language will tell us what properties we can hope to express using it.
The approach to specification and verification of reactive systems that we shall
begin exploring in this section is often referred to as ‘model checking’. In this
approach we usually use different languages for describing actual systems and their
specifications. For instance, we may use CCS expressions or the LTSs that they
denote to describe actual systems, and some kind of logic to describe specifications.
In this section, we shall present a property language that has been introduced in
process theory by Hennessy and Milner in (Hennessy and Milner, 1985). This logic
is often referred to as Hennessy-Milner logic (or HML for short), and, as we shall
see in due course, has a very pleasing connection with the notion of bisimilarity.
Definition 5.1 The set M of Hennessy-Milner formulae over a set of actions Act is given by the following abstract syntax:

  F, G ::= tt | ff | F ∧ G | F ∨ G | ⟨a⟩F | [a]F ,

where a ∈ Act, and we use tt and ff to denote ‘true’ and ‘false’, respectively. If A = {a1, . . . , an} ⊆ Act (n ≥ 0), we use the abbreviation ⟨A⟩F for the formula ⟨a1⟩F ∨ · · · ∨ ⟨an⟩F and [A]F for the formula [a1]F ∧ · · · ∧ [an]F. (If A = ∅, then ⟨A⟩F = ff and [A]F = tt.)
We are interested in using the above logic to describe properties of CCS processes,
or, more generally, of states in an LTS over the set of actions Act. The meaning of
a formula in the language M is given by characterizing the collection of processes
that satisfy it. Intuitively, this can be described as follows.
• All processes satisfy tt.
• No process satisfies ff .
• A process satisfies F ∧ G (respectively, F ∨ G) iff it satisfies both F and G
(respectively, either F or G).
• A process satisfies haiF for some a ∈ Act iff it affords an a-labelled transi-
tion leading to a state satisfying F .
• A process satisfies [a]F for some a ∈ Act iff all of its a-labelled transitions
lead to a state satisfying F .
So, intuitively, a formula of the form haiF states that it is possible to perform
action a and thereby satisfy property F . Whereas a formula of the form [a]F states
that no matter how a process performs action a, the state it reaches in doing so will
necessarily satisfy the property F .
Logics that involve the use of expressions like possibly and necessarily are usually called modal logics, and, in some form or another, have been studied by philosophers throughout history, notably by Aristotle and philosophers in the middle ages. So Hennessy-Milner logic is a modal logic—in fact, a so-called multi-modal logic, since the logic involves modal operators that are parameterized by
actions. The semantics of formulae is defined with respect to a given labelled transition system

  (Proc, Act, {-a-> | a ∈ Act}) .
We shall use [[F]] to denote the set of processes in Proc that satisfy F. This we now proceed to define formally.

Definition 5.2 [Denotational semantics] We define [[F]] ⊆ Proc for each formula F ∈ M by structural induction on F:

  [[tt]] = Proc
  [[ff]] = ∅
  [[F ∧ G]] = [[F]] ∩ [[G]]
  [[F ∨ G]] = [[F]] ∪ [[G]]
  [[⟨a⟩F]] = ⟨·a·⟩[[F]]
  [[[a]F]] = [·a·][[F]] ,

where we use the set operators ⟨·a·⟩, [·a·] : 2^Proc → 2^Proc defined by

  ⟨·a·⟩S = {p ∈ Proc | p -a-> p′ and p′ ∈ S, for some p′} and
  [·a·]S = {p ∈ Proc | p -a-> p′ implies p′ ∈ S, for each p′} .
Example 5.1 In order to understand the definition of the set operators ⟨·a·⟩ and [·a·] introduced above, it is instructive to look at an example. Consider the following labelled transition system.
[Diagram: s -a-> s1, s -a-> s2, s1 -b-> s2, s2 -b-> s2, t -a-> t1 and t1 -b-> t1.]
Then

  ⟨·a·⟩{s1, t1} = {s, t} .

This means that ⟨·a·⟩{s1, t1} is the collection of states from which it is possible to perform an a-labelled transition ending up in either s1 or t1. On the other hand,

  [·a·]{s1, t1} = {s1, s2, t, t1} .

The idea here is that [·a·]{s1, t1} consists of the set of all processes that become either s1 or t1 no matter how they perform an a-labelled transition. Clearly, s does not have this property because it can perform the transition s -a-> s2, whereas t does because its only a-labelled transition ends up in t1. But why are s1, s2 and t1 in [·a·]{s1, t1}? To see this, look at the formal definition of the set

  [·a·]{s1, t1} = {p ∈ Proc | p -a-> p′ implies p′ ∈ {s1, t1}, for each p′} .

Since s1, s2 and t1 do not afford a-labelled transitions, it is vacuously true that all of their a-labelled transitions end up in either s1 or t1! This is the reason why those states are in the set [·a·]{s1, t1}.
We shall come back to this important point repeatedly in what follows.
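The two set operators are easy to compute mechanically on a finite LTS. The following Python sketch (illustrative code; the names are our own, and the transition set is read off the diagram of Example 5.1) evaluates ⟨·a·⟩S and [·a·]S:

```python
# The operators <.a.> and [.a.] of Example 5.1, computed by enumeration.
trans = {("s", "a", "s1"), ("s", "a", "s2"), ("s1", "b", "s2"),
         ("s2", "b", "s2"), ("t", "a", "t1"), ("t1", "b", "t1")}
procs = {"s", "s1", "s2", "t", "t1"}

def diamond(a, S):
    """<.a.>S: states with SOME a-transition into S."""
    return {p for p in procs if any((p, a, q) in trans for q in S)}

def box(a, S):
    """[.a.]S: states whose a-transitions ALL end in S (vacuously
    true for states without any a-transition)."""
    return {p for p in procs
            if all(q in S for (r, b, q) in trans if r == p and b == a)}

print(sorted(diamond("a", {"s1", "t1"})))  # ['s', 't']
print(sorted(box("a", {"s1", "t1"})))      # ['s1', 's2', 't', 't1']
```

The computed sets agree with the two equalities derived in the example, including the vacuous membership of s1, s2 and t1 in [·a·]{s1, t1}.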
Exercise 5.1 Consider the labelled transition system in the example above. What are ⟨·b·⟩{s1, t1} and [·b·]{s1, t1}?
Let us now re-examine the properties of our computer scientist that we mentioned
earlier, and let us see whether we can express them using HML. First of all, note
that, for the time being, we have defined the semantics of formulae in M in terms
of the one-step transitions -a->. This means, in particular, that we are not considering τ actions as unobservable. So, if we say that ‘a process P can do action a now’, then we really mean that the process can perform a transition of the form P -a-> Q for some Q.
How can we express, for instance, that our computer scientist is willing to drink
coffee now? Well, one way to say so using our logic is to say that the computer
scientist has the possibility of doing a coffee-labelled transition. This suggests that
we use a formula of the form hcoffeeiF for some formula F that should be satisfied
by the state reached by the computer scientist after having drunk her coffee. What
should this F be? Since we are not requiring anything of the subsequent behaviour
of the computer scientist, it makes sense to set F = tt. So, it looks as if we can
express our natural language requirement in terms of the formula hcoffeeitt. In fact,
since our property language has a formal semantics, we can actually prove that our
proposed formula is satisfied exactly by all the processes that have an outgoing
coffee-labelled transition. This can be done as follows:
  [[⟨coffee⟩tt]] = ⟨·coffee·⟩[[tt]]
               = ⟨·coffee·⟩Proc
               = {P | P -coffee-> P′ for some P′ ∈ Proc}
               = {P | P -coffee->} .

Similarly, we have that

  [[[tea]ff]] = [·tea·][[ff]]
             = [·tea·]∅
             = {P | P -tea-> P′ implies P′ ∈ ∅, for each P′}
             = {P | P -tea-/->} .

The last equality above follows from the fact that, for each process P,

  P -tea-/-> iff (P -tea-> P′ implies P′ ∈ ∅, for each P′) .
To see that this holds, observe first of all that if P -tea-> Q for some Q, then it is not true that P′ ∈ ∅ for all P′ such that P -tea-> P′. In fact, Q is a counter-example to the latter statement. So the implication from right to left is true. To establish the implication from left to right, assume that P -tea-/->. Then it is vacuously true that P′ ∈ ∅ for all P′ such that P -tea-> P′—indeed, since there is no such P′, there is no counter-example to that statement!
To sum up, we can express that a process cannot perform action a ∈ Act with
the formula [a]ff .
Suppose now that we want to say that the computer scientist must have a biscuit
after drinking coffee. This means that it is possible for the computer scientist to
have a biscuit in all the states that she can reach by drinking coffee. This can be
expressed by means of the formula
[coffee]hbiscuititt .
1. Use the semantics of the logic to check that the above formula expresses the
desired property of the computer scientist.
[Diagram: a labelled transition system with states s, s1, s2, s3 and s4, and transitions labelled a and b among them.]
2. Compute the following sets using the denotational semantics for Hennessy-Milner logic.

• [[[a][b]ff]] = ?
• [[⟨a⟩(⟨a⟩tt ∧ ⟨b⟩tt)]] = ?
• [[[a][a][b]ff]] = ?
• [[[a](⟨a⟩tt ∨ ⟨b⟩tt)]] = ?
  [tick](⟨tick⟩tt ∧ [tock]ff) .
Show also that, for each n ≥ 0, the process Clock satisfies the formula

  ⟨tick⟩ · · · ⟨tick⟩ tt  (n times) .
The semantics of Hennessy-Milner logic can equivalently be given in terms of a satisfaction relation |= between processes and formulae:

• P |= tt, for each P,
• P |= ff, for no P,
• P |= F ∧ G iff P |= F and P |= G,
• P |= F ∨ G iff P |= F or P |= G,
• P |= ⟨a⟩F iff P -a-> P′ for some P′ such that P′ |= F, and
• P |= [a]F iff P′ |= F, for each P′ such that P -a-> P′.
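The clauses above translate directly into a recursive satisfaction checker for finite LTSs. In the following Python sketch, the tuple encoding of formulae and all names are assumptions of this example, not the book's notation; the transition system is that of Example 5.1:

```python
# Recursive check of P |= F. Formulae: "tt", "ff", ("and", F, G),
# ("or", F, G), ("dia", a, F) for <a>F, and ("box", a, F) for [a]F.
trans = {("s", "a", "s1"), ("s", "a", "s2"), ("s1", "b", "s2"),
         ("s2", "b", "s2"), ("t", "a", "t1"), ("t1", "b", "t1")}

def succ(p, a):
    """The a-successors of process p."""
    return {q for (r, b, q) in trans if r == p and b == a}

def sat(p, f):
    if f == "tt":
        return True
    if f == "ff":
        return False
    op = f[0]
    if op == "and":
        return sat(p, f[1]) and sat(p, f[2])
    if op == "or":
        return sat(p, f[1]) or sat(p, f[2])
    if op == "dia":     # <a>F: SOME a-successor satisfies F
        return any(sat(q, f[2]) for q in succ(p, f[1]))
    if op == "box":     # [a]F: ALL a-successors satisfy F
        return all(sat(q, f[2]) for q in succ(p, f[1]))
    raise ValueError(f"unknown formula: {f!r}")

print(sat("s", ("dia", "a", ("dia", "b", "tt"))))  # True: s |= <a><b>tt
print(sat("t1", ("box", "a", "ff")))               # True: t1 cannot do a
```

Note how the ‘box’ clause is vacuously true for states without a-transitions, exactly as in the discussion of [·a·] earlier.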
Exercise 5.6 Show that the above definition of the satisfaction relation is equiva-
lent to that given in Definition 5.2. Hint: Use induction on the structure of formu-
lae.
Exercise 5.7 Find one labelled transition system with initial state s that satisfies all of the following properties:

• ⟨a⟩(⟨b⟩⟨c⟩tt ∧ ⟨c⟩tt),
• [a]⟨b⟩([c]ff ∧ ⟨a⟩tt).
Note that logical negation is not one of the constructs in the abstract syntax for M. However, the language M is closed under negation, in the sense that, for each formula F ∈ M, there is a formula F^c ∈ M that is equivalent to the negation of F. This formula F^c is defined inductively on the structure of F as follows:

1. tt^c = ff
2. ff^c = tt
3. (F ∧ G)^c = F^c ∨ G^c
4. (F ∨ G)^c = F^c ∧ G^c
5. (⟨a⟩F)^c = [a]F^c
6. ([a]F)^c = ⟨a⟩F^c .
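The inductive definition of F^c turns directly into a recursive program. Here is an illustrative Python sketch; the tuple encoding of formulae is an assumption of this example, not the book's notation:

```python
# F^c: the formula equivalent to the negation of F.
# Formulae: "tt", "ff", ("and", F, G), ("or", F, G),
# ("dia", a, F) for <a>F, and ("box", a, F) for [a]F.

def neg(f):
    if f == "tt":
        return "ff"
    if f == "ff":
        return "tt"
    op = f[0]
    if op == "and":
        return ("or", neg(f[1]), neg(f[2]))    # (F ∧ G)^c = F^c ∨ G^c
    if op == "or":
        return ("and", neg(f[1]), neg(f[2]))   # (F ∨ G)^c = F^c ∧ G^c
    if op == "dia":
        return ("box", f[1], neg(f[2]))        # (<a>F)^c = [a]F^c
    if op == "box":
        return ("dia", f[1], neg(f[2]))        # ([a]F)^c = <a>F^c
    raise ValueError(f"unknown formula: {f!r}")

print(neg(("dia", "a", ("and", "tt", ("box", "b", "ff")))))
# ('box', 'a', ('or', 'ff', ('dia', 'b', 'tt')))
```

Since negation only drives duals inwards, the result is again a formula of M, as claimed.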
Exercise 5.8
As a consequence of Proposition 5.1, we have that, for each process P and formula F, exactly one of P |= F and P |= F^c holds. In fact, each process is either contained in [[F]] or in [[F^c]].
In Exercise 5.5 you were asked to come up with formulae that distinguished
processes that we know are not strongly bisimilar. As a further example, consider
the processes

  A def= a.A + a.0 and
  B def= a.a.B + a.0 .

These two processes are not strongly bisimilar. In fact, A affords the transition

  A -a-> A .
This transition can only be matched by one of the transitions

  B -a-> 0

or

  B -a-> a.B .
However, neither 0 nor a.B is strongly bisimilar to A, because A can perform an a-labelled transition and become 0 in doing so. On the other hand,

  a.B -a-> B

is the only transition that is possible from a.B, and B is not strongly bisimilar to 0.
Based on this analysis, it seems that a property distinguishing the processes A and B is ⟨a⟩⟨a⟩[a]ff—that is, the process can perform a sequence of two a-labelled transitions, and in so doing reach a state from which no a-labelled transition is possible. In fact, you should be able to establish that A satisfies this property, but B does not. (Do so!)
Again, faced with two non-bisimilar processes, we have been able to find a for-
mula in the logic M that distinguishes them, in the sense that one process satisfies
it, but the other does not. Is this true in general? And what can we say about two
processes that satisfy precisely the same formulae in M? Are they guaranteed to
be strongly bisimilar?
We shall now present a seminal theorem, due to Hennessy and Milner, that
answers both of these questions in one fell swoop by establishing a satisfying,
and very fruitful, connection between the apparently unrelated notions of strong
bisimilarity and the logic M. The theorem applies to a class of processes that we
now proceed to define.
Definition 5.3 [Image finite process] A process P is image finite iff the collection {P′ | P -a-> P′} is finite for each action a.

An LTS is image finite if so is each of its states.
For instance, the process Arep defined by

  Arep def= a.0 | Arep

is not image finite. In fact, you should be able to prove by induction on n that, for each n ≥ 1,

  Arep -a-> a.0 | · · · | a.0 | 0 | Arep ,

where the term a.0 | · · · | a.0 | 0 consists of n parallel components.
Theorem 5.1 [Hennessy and Milner (Hennessy and Milner, 1985)] Let

  (Proc, Act, {-a-> | a ∈ Act})

be an image finite LTS. Assume that P, Q are states in Proc. Then P ∼ Q iff P and Q satisfy exactly the same formulae in Hennessy-Milner logic.
• Assume that P and Q satisfy the same formulae in M. We shall prove that P and Q are strongly bisimilar. To this end, note that it is sufficient to show that the relation

  R = {(P′, Q′) | P′, Q′ ∈ Proc satisfy exactly the same formulae in M}

is a strong bisimulation.

Assume that R R S and R -a-> R′. We shall now argue that there is a process S′ such that S -a-> S′ and R′ R S′. Since R is symmetric, this suffices to establish that R is a strong bisimulation.
Assume, towards a contradiction, that there is no S′ such that S -a-> S′ and S′ satisfies the same properties as R′. Since S is image finite, the set of processes that S can reach by performing an a-labelled transition is finite, say {S1, . . . , Sn} with n ≥ 0. By our assumption, none of the processes in the above set satisfies the same formulae as R′. So, for each i ∈ {1, . . . , n}, there is a formula Fi such that

  R′ |= Fi and Si ⊭ Fi .

(Why? Could it not be that R′ ⊭ Fi and Si |= Fi, for some i ∈ {1, . . . , n}?) We are now in a position to construct a formula that is satisfied by R, but not by S—contradicting our assumption that R and S satisfy the same formulae. In fact, the formula

  ⟨a⟩(F1 ∧ F2 ∧ · · · ∧ Fn)

is satisfied by R but not by S. (Check this!)
Exercise 5.9 (Mandatory) Fill in the details that we have omitted in the above
proof. What is the formula that we have constructed to distinguish R and S in the
proof of the implication from right to left if n = 0?
Remark 5.1 In fact, the implication from left to right in the above theorem holds
for arbitrary processes, not just image finite ones.
The above theorem has many applications in the theory of processes, and in ver-
ification technology. For example, a consequence of its statement is that if two
image finite processes are not strongly bisimilar, then there is a formula in M that
tells us one reason why they are not. Moreover, as the proof of the above theorem
suggests, we can always construct this distinguishing formula.
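The construction suggested by the proof can be sketched in code. The following Python sketch is entirely ours (the book presents no implementation; the process names, the string rendering of formulae, and the helper functions are our own choices). It first computes the partition-refinement approximants of strong bisimilarity on a finite LTS and then mirrors the case analysis of the proof to build a formula satisfied by the first state but not by the second:

```python
# Hypothetical encoding of Exercise 5.11's processes:
#   P = b.a.0 + b.0   and   Q = b.(a.0 + b.0),
# with A = a.0 and M = a.0 + b.0.
TRANS = {
    ("P", "b", "A"), ("P", "b", "0"), ("A", "a", "0"),
    ("Q", "b", "M"), ("M", "a", "0"), ("M", "b", "0"),
}
STATES = {s for (s, _, _) in TRANS} | {t for (_, _, t) in TRANS}
ACTS = sorted({a for (_, a, _) in TRANS})

def succ(s, a):
    return {t for (s2, a2, t) in TRANS if s2 == s and a2 == a}

def refinement_sequence():
    """Successive partition refinements; the final partition is bisimilarity."""
    seq = [{s: 0 for s in STATES}]
    while True:
        cls = seq[-1]
        sig = {s: (cls[s], tuple((a, tuple(sorted({cls[t] for t in succ(s, a)})))
                                 for a in ACTS)) for s in STATES}
        ids = {v: i for i, v in enumerate(sorted(set(sig.values())))}
        new = {s: ids[sig[s]] for s in STATES}
        if new == cls:
            return seq
        seq.append(new)

SEQ = refinement_sequence()

def distinguish(p, q):
    """An HML formula (as a string) satisfied by p but not by q."""
    n = next(i for i, cls in enumerate(SEQ) if cls[p] != cls[q])
    prev = SEQ[n - 1]   # p and q are still equivalent at round n - 1
    for a in ACTS:
        ps, qs = succ(p, a), succ(q, a)
        # Case 1: p has an a-derivative that no a-derivative of q matches.
        for p2 in sorted(ps):
            if all(prev[p2] != prev[q2] for q2 in qs):
                parts = [distinguish(p2, q2) for q2 in sorted(qs)] or ["tt"]
                return f"<{a}>({' & '.join(parts)})"
        # Case 2: q has an a-derivative that no a-derivative of p matches.
        for q2 in sorted(qs):
            if all(prev[p2] != prev[q2] for p2 in sorted(ps)):
                parts = [distinguish(p2, q2) for p2 in sorted(ps)] or ["ff"]
                return f"[{a}]({' | '.join(parts)})"
    raise AssertionError("p and q are strongly bisimilar")

print(distinguish("P", "Q"))   # <b>([a](ff)), i.e. the formula <b>[a]ff
```

Run on the processes b.a.0 + b.0 and b.(a.0 + b.0) of Exercise 5.11 below (encoded as P and Q), the sketch prints the formula ⟨b⟩[a]ff, which the first process satisfies and the second does not.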
Note, moreover, that the above characterization theorem for strong bisimilarity
is very general. For instance, in light of your answer to Exercise 3.30, it also applies
to observational equivalence, provided that we interpret HML over the labelled
transition system whose set of actions consists of all of the observable actions and
of the label τ , and whose transitions are precisely the ‘weak transitions’ whose
labels are either observable actions or τ .
Exercise 5.10 Consider the processes s, t and v in the figure below. (The figure is not reproduced here; it depicts three labelled transition systems, rooted at s, t and v respectively, over the states s, s1, s2, t, t1, t2, v, v1, v2, v3 and w, with a- and b-labelled transitions.) Find a distinguishing formula in Hennessy-Milner logic for each of the following pairs of processes:

• s and v, and

• t and v.
Verify your claims in the Edinburgh Concurrency Workbench (use the strongeq
and checkprop commands) and check whether you found the shortest distin-
guishing formula (use the dfstrong command).
Exercise 5.11 For each of the following CCS expressions decide whether they are
strongly bisimilar and, if they are not, find a distinguishing formula in Hennessy-
Milner logic:
• b.a.0 + b.0 and b.(a.0 + b.0),
Exercise 5.13 (For the theoretically minded) Consider the process Aω given by:

Aω = a.Aω .

Show that the processes A<ω and Aω + A<ω, where A<ω was defined in equation (5.1) on page 111, satisfy the same formulae in M, but are not strongly bisimilar. Conclude that Theorem 5.1 does not hold for processes that are not image finite.
Hint: To prove that the two processes satisfy the same formulae in M, use structural induction on formulae. You will find it useful to first establish a statement relating the formulae satisfied by each of the two processes to their modal depth. The modal depth of a formula is the maximum nesting of the modal operators in it.
Chapter 6

HML with Recursion
An HML formula can only describe a finite part of the overall behaviour of a pro-
cess. In fact, as each modal operator allows us to explore the effect of taking one
step in the behaviour of a process, using a single HML formula we can only de-
scribe properties of a fixed finite fragment of the computations of a process. As
those of you who solved Exercise 5.13 already discovered, how much of the be-
haviour of a process we can explore using a single formula is entirely determined
by its so-called modal depth—i.e., by the maximum nesting of modal operators in
it. For example, the formula ([a]⟨a⟩ff ) ∨ ⟨b⟩tt has modal depth 2, and checking
whether a process satisfies it or not involves only an analysis of its sequences of
transitions whose length is at most 2. (We will return to this issue in Section 6.6,
where a formal definition of the modal depth of a formula will be given.)
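Pending that formal definition, modal depth is computed by a short structural recursion. The following Python sketch is ours (formulae are encoded as nested tuples of our own devising) and checks the example above:

```python
# HML formulae as nested tuples:
# ("tt",), ("ff",), ("and", F, G), ("or", F, G), ("dia", a, F), ("box", a, F)
def md(f):
    """Modal depth: the maximum nesting of modal operators in a formula."""
    tag = f[0]
    if tag in ("tt", "ff"):
        return 0
    if tag in ("and", "or"):
        return max(md(f[1]), md(f[2]))
    if tag in ("dia", "box"):      # one modal operator on top of the subformula
        return 1 + md(f[2])
    raise ValueError(tag)

# ([a]<a>ff) ∨ <b>tt has modal depth 2:
F = ("or", ("box", "a", ("dia", "a", ("ff",))), ("dia", "b", ("tt",)))
print(md(F))   # 2
```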
However, we often wish to describe states of affairs
that may or must occur in arbitrarily long computations of a process. If we want
to express properties like, for example, that a process is always able to perform
a given action, we have to extend the logic. As the following example indicates,
one way of doing so is to allow for infinite conjunctions and disjunctions in our
property language.
Example 6.1 Consider the processes p and q in Figure 6.1. It is not hard to come
up with an HML formula that p satisfies and q does not. In fact, after performing
an a-action, p will always be able to perform another one, whereas q may fail to do
so. This can be captured formally in HML as follows:

p |= [a]⟨a⟩tt   but   q ⊭ [a]⟨a⟩tt .
(Figure 6.1, not reproduced here: the processes p, q and r, where p has an a-labelled loop, q has both an a-labelled loop and an a-labelled transition to r, and r has no outgoing transitions.)
Since a difference in the behaviour of the two processes can already be found by
examining their behaviour after two transitions, a formula that distinguishes them
is ‘small’.
Assume, however, that we modify the labelled transition system for q by adding a sequence of transitions to r thus:

r = r0 −a→ r1 −a→ r2 −a→ r3 · · · rn−1 −a→ rn   (n ≥ 0) .
The two processes can then still be distinguished, for instance by the formula [a]ⁿ⁺¹⟨a⟩tt, where [a]ⁿ⁺¹ stands for a sequence of modal operators [a] of length n + 1. However, no single formula in HML would work for all values of n. (Prove this claim!) This is unsatisfactory as there appears to be a general reason why the behaviours of p and
q are different. Indeed, the process p in Figure 6.1 can always (i.e., at any point in each of its computations) perform an a-action—that is, ⟨a⟩tt is always true. Let us call this invariance property Inv(⟨a⟩tt). We could describe it in an extension of HML as an infinite conjunction thus:

Inv(⟨a⟩tt) = ⟨a⟩tt ∧ [a]⟨a⟩tt ∧ [a][a]⟨a⟩tt ∧ · · · = ⋀i≥0 [a]ⁱ⟨a⟩tt

(the requirement that ⟨a⟩tt holds after any sequence of i transitions is expressed by the conjunct [a]ⁱ⟨a⟩tt, because a is the only action in our example labelled transition system).
On the other hand, the process q has the option of terminating at any time by performing the a-labelled transition leading to process r, or equivalently it is possible from q to satisfy [a]ff. Let us call this property Pos([a]ff ). We can express it in an extension of HML as the following infinite disjunction:

Pos([a]ff ) = [a]ff ∨ ⟨a⟩[a]ff ∨ ⟨a⟩⟨a⟩[a]ff ∨ · · · = ⋁i≥0 ⟨a⟩ⁱ[a]ff ,
where ⟨a⟩ⁱ stands for a sequence of modal operators ⟨a⟩ of length i. This formula can be read as saying that it is possible, after some finite number of a-labelled transitions, to reach a state in which [a]ff holds.

Such infinite conjunctions and disjunctions can, however, be avoided by resorting to recursive definitions of properties. For instance, the invariance property Inv(⟨a⟩tt) may be described by means of the recursive equation

X ≡ ⟨a⟩tt ∧ [a]X ,   (6.1)

where the variable X stands for Inv(⟨a⟩tt) itself, and where we write F ≡ G if and only if the formulae F and G are satisfied by exactly the same processes—i.e., if [[F ]] = [[G]]. The above recursive equation captures the
intuition that a process that can invariantly perform an a-labelled transition—that
is, one that can perform an a-labelled transition in all of its reachable states—can
certainly perform one now, and, moreover, each state that it reaches via one such
transition can invariantly perform an a-labelled transition. This looks deceptively
easy and natural. However, the mere fact of writing down an equation like (6.1)
does not mean that this equation makes sense! Indeed, equations may be seen as
implicitly defining the set of their solutions, and we are all familiar with equations
that have no solutions at all. For instance, the equation
x = x + 1   (6.2)
has no solution over the set of natural numbers, and there is no X ⊆ N such that
X = N \ X .   (6.3)

There are, on the other hand, equations that have more than one solution. An example is

X = {2} ∪ X ,   (6.4)

which is satisfied by infinitely many sets of natural numbers, namely all of the sets of natural numbers that contain the number 2. There are
also equations that have a finite number of solutions, but not a unique one. As an
example, consider the equation
X = {10} ∪ {n − 1 | n ∈ X, n ≠ 0} .   (6.5)
The only finite solution of this equation is the set {0, 1, . . . , 10}, and the only infinite solution is N itself.
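The least solution of equation (6.5) can be found mechanically by iterating the function described by its right-hand side, starting from the empty set; the greatest solution N is out of reach of such an iteration, since the underlying universe is infinite. A short Python sketch of ours:

```python
# Right-hand side of equation (6.5): X = {10} ∪ {n − 1 | n ∈ X, n ≠ 0}
def f(X):
    return {10} | {n - 1 for n in X if n != 0}

X = set()          # start from the least element of the lattice (2^N, ⊆)
steps = 0
while f(X) != X:   # iterate until a fixed point is reached
    X = f(X)
    steps += 1
print(sorted(X), steps)   # [0, 1, ..., 10] after 11 steps
```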
1. Why doesn’t Tarski’s fixed point theorem apply to yield a solution to the first
two of these equations?
∞+d=d+∞=∞ .
Does equation (6.2) have a solution in the resulting structure? How many
solutions does that equation have?
3. Use Tarski’s fixed point theorem to find the largest and least solutions of
(6.5).
Precise answers to these questions will be given in the remainder of this chapter.
However, to motivate our subsequent technical developments, it is appropriate here
to discuss briefly the first two questions above.
Recall that the meaning of a formula (with respect to a labelled transition sys-
tem) is the set of processes that satisfy it. Therefore, it is natural to expect that a
set S of processes that satisfy the formula described by equation (6.1) should be
such that:
S = ⟨·a·⟩Proc ∩ [·a·]S .
It is clear that S = ∅ is a solution to the equation (as no process can satisfy both ⟨a⟩tt and [a]ff ). However, the process p in Figure 6.1 can invariantly perform an a-transition and p ∉ ∅, so this cannot be the solution we are looking for. It turns out that the solution we need here is the largest one, namely S = {p}; the set S = ∅ is the least solution.
In other cases it is the least solution we are interested in. For instance, we can
express Pos([a]ff ) by the following equation:
Y ≡ [a]ff ∨ ⟨a⟩Y .
Here the largest solution is Y = {p, q, r} but, as the process p in Figure 6.1 cannot terminate at all, this is clearly not the solution we are interested in. The least solution of the above equation over the labelled transition system in Figure 6.1 is Y = {q, r}, and is exactly the set of processes in that labelled transition system that
intuitively satisfy Pos([a]ff ).
When we write down a recursively defined property, we can indicate whether we desire the least or the largest solution by adding this information to the equality sign. For Inv(⟨a⟩tt) we want the largest solution, and in this case we write

X =max ⟨a⟩tt ∧ [a]X .

For a property such as Pos(F ) we want instead the least solution, and we write

Y =min F ∨ ⟨Act⟩Y .
Intuitively, we use largest solutions for those properties that hold of a process un-
less it has a finite computation that disproves the property. For instance, process
q does not have property Inv (haitt) because it can reach a state in which no a-
labelled transition is possible. Conversely, we use least solutions for those proper-
ties that hold of a process if it has a finite computation sequence which ‘witnesses’
the property. For instance, a process has property Pos(⟨a⟩tt) if it has a computation
leading to a state that can perform an a-labelled transition. This computation is a
witness for the fact that the process can perform an a-labelled transition at some
point in its behaviour.
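Both solutions of the equation for Pos([a]ff ) can be computed directly on the labelled transition system of Figure 6.1. The Python sketch below is ours and encodes our reading of the figure (p has an a-loop; q has an a-loop and an a-transition to r; r has no transitions):

```python
TRANS = {("p", "a", "p"), ("q", "a", "q"), ("q", "a", "r")}
PROC = {"p", "q", "r"}

def succ(s, a):
    return {t for (s2, a2, t) in TRANS if s2 == s and a2 == a}

def dia(a, S):   # ⟨·a·⟩S: states with some a-transition into S
    return {s for s in PROC if succ(s, a) & S}

def box(a, S):   # [·a·]S: states all of whose a-transitions end in S
    return {s for s in PROC if succ(s, a) <= S}

def f(Y):        # right-hand side of  Y = [a]ff ∨ ⟨a⟩Y
    return box("a", set()) | dia("a", Y)

least = set()             # iterate up from the bottom ...
while f(least) != least:
    least = f(least)

largest = set(PROC)       # ... and down from the top
while f(largest) != largest:
    largest = f(largest)

print(sorted(least), sorted(largest))   # ['q', 'r'] ['p', 'q', 'r']
```

The computed least solution {q, r} and largest solution {p, q, r} agree with the discussion above.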
We shall appeal to the intuition given above in the following section, where we
present examples of recursively defined properties.
Exercise 6.3 Give a formula, built using HML and the temporal operators Pos
and/or Inv , that expresses a property satisfied by exactly one of the processes in
Exercise 5.13.
The property Safe(F ), stating that a process has a complete transition sequence all of whose states satisfy the formula F, can be defined recursively thus:

X =max F ∧ ([Act]ff ∨ ⟨Act⟩X) .
It turns out to be the largest solution that is of interest here, as we will argue formally later.
Intuitively, the recursively defined formula above states that a process p has a complete transition sequence all of whose states satisfy the formula F if, and only if,

• either p has no outgoing transition (in which case p will satisfy the formula [Act]ff ),

• or p has a transition leading to a state that has a complete transition sequence all of whose states satisfy the formula F.
Dually, the property Even(F ), stating that each computation of a process eventually reaches a state satisfying the formula F, can be defined recursively thus:

Y =min F ∨ (⟨Act⟩tt ∧ [Act]Y ) .
In this case we are interested in the least solution because Even(F ) should only be
satisfied by those processes that are guaranteed to reach a state satisfying F in all
of their computation paths.
Note that the definitions of Safe(F ) and Even(F ), respectively Inv(F ) and Pos(F ), are mutually dual, i.e., they can be obtained from one another by replacing ∨ by ∧, [Act] by ⟨Act⟩, and =min by =max. One can show that ¬Inv(F ) ≡ Pos(¬F ) and ¬Safe(F ) ≡ Even(¬F ), where we write ¬ for logical negation.
It is also possible to express that F should be satisfied in each transition se-
quence until G becomes true. There are two well known variants of this construc-
tion:
• F Uˢ G, the so-called strong until, that says that sooner or later p reaches a state where G is true and in all the states it traverses before this happens F must hold;

• F Uʷ G, the so-called weak until, that says that F must hold in all states p traverses until it gets into a state where G holds (but maybe this will never happen!).
These operators are defined by the following recursive equations:

F Uˢ G =min G ∨ (F ∧ ⟨Act⟩tt ∧ [Act](F Uˢ G)) , and

F Uʷ G =max G ∨ (F ∧ [Act](F Uʷ G)) .

It should be clear that, as the names indicate, strong until is a stronger condition than weak until. We can use the ‘until’ operators to express Even(F ) and Inv(F ). In fact, Even(G) ≡ tt Uˢ G and Inv(F ) ≡ F Uʷ ff .
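The two until operators can themselves be computed by fixed point iteration. The Python sketch below is ours and reuses our reading of Figure 6.1; it computes tt Uˢ [Act]ff (every computation eventually terminates) and its weak variant tt Uʷ [Act]ff, which holds trivially:

```python
# Our reading of Figure 6.1: p loops on a; q loops on a or moves to r; r is stuck.
TRANS = {("p", "a", "p"), ("q", "a", "q"), ("q", "a", "r")}
PROC = {"p", "q", "r"}

def succ(s):
    return {t for (s2, _, t) in TRANS if s2 == s}

def dia_act(S):   # ⟨Act⟩S
    return {s for s in PROC if succ(s) & S}

def box_act(S):   # [Act]S
    return {s for s in PROC if succ(s) <= S}

def lfp(f):       # least fixed point by iteration from the bottom
    S = set()
    while f(S) != S:
        S = f(S)
    return S

def gfp(f):       # greatest fixed point by iteration from the top
    S = set(PROC)
    while f(S) != S:
        S = f(S)
    return S

F, G = set(PROC), box_act(set())   # F = tt, G = [Act]ff (the state is stuck)
strong = lfp(lambda X: G | (F & dia_act(PROC) & box_act(X)))
weak   = gfp(lambda X: G | (F & box_act(X)))
print(sorted(strong), sorted(weak))   # ['r'] ['p', 'q', 'r']
```

Only r satisfies the strong until (both p and q have infinite computations that never reach a stuck state), while the weak variant holds everywhere—illustrating that strong until is indeed the stronger condition.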
Properties like ‘some time in the future’ and ‘until’ are examples of what we
call temporal properties. Tempora is Latin—it is plural for tempus, which means
‘time’—, and a logic that expresses properties that depend on time is called tempo-
ral logic. The study of temporal logics is very old and can be traced back to Aristo-
tle. Within the last 30 years, researchers in computer science have started showing
interest in temporal logic as, within this framework, it is possible to express prop-
erties of the behaviour of programs that change over time (Clarke, Emerson and Sistla, 1986; Manna and Pnueli, 1992; Pnueli, 1977).
The modal µ-calculus (Kozen, 1983) is a generalization of Hennessy-Milner
logic with recursion that allows for largest and least fixed point definitions to be
mixed freely. It has been shown that the modal µ-calculus is expressive enough
to describe any of the standard operators that occur in the framework of temporal
logic. In this sense by extending Hennessy-Milner logic with recursion we obtain
a temporal logic.
From the examples in this section we can see that least fixed points are used to
express that something will happen sooner or later, whereas the largest fixed points
are used to express invariance of some state of affairs during computations, or that
something does not happen as a system evolves.
(Figure 6.2, not reproduced here: a transition graph over the states p1, p2 and p3 with a- and b-labelled transitions; in particular, p3 has an a-labelled transition to p1.)
Example 6.2 Consider the formula F = ⟨a⟩X and let Proc be the set of states in the transition graph in Figure 6.2. If X is satisfied by p1, then ⟨a⟩X will be satisfied by p3; i.e., writing OF(S) for the set of states that satisfy F under the assumption that the states in S are exactly those satisfying X, we expect that p3 ∈ OF({p1}).

Definition 6.1 For each formula F, the function OF : 2^Proc → 2^Proc is defined by structural recursion as follows:

OX(S) = S
Ott(S) = Proc
Off(S) = ∅
OF1∧F2(S) = OF1(S) ∩ OF2(S)
OF1∨F2(S) = OF1(S) ∪ OF2(S)
O⟨a⟩F(S) = ⟨·a·⟩OF(S)
O[a]F(S) = [·a·]OF(S) .
A few words of explanation for the above definition are in order here. Intuitively,
the first equality in Definition 6.1 expresses the trivial observation that if we assume
that S is the set of states that satisfy X, then the set of states satisfying X is S!
The second equation states the, equally obvious, fact that every state satisfies tt
irrespective of the set of states that are assumed to satisfy X. The last equation
instead says that to calculate the set of states satisfying the formula [a]F under the
assumption that the states in S satisfy X, it is sufficient to
1. compute the set of states satisfying the formula F under the assumption that
the states in S satisfy X, and then
2. find the collection of states that end up in that set no matter how they perform
an a-labelled transition.
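Definition by structural recursion translates directly into code. In the Python sketch below (ours), formulae are nested tuples; the three-state LTS is a hypothetical stand-in for Figure 6.2 in which the only transition taken from the text is p3 −a→ p1—the remaining transitions are our invention:

```python
# A hypothetical stand-in for Figure 6.2; only p3 --a--> p1 is from the text.
TRANS = {("p3", "a", "p1"), ("p1", "b", "p2"), ("p2", "a", "p3")}
PROC = {"p1", "p2", "p3"}

def succ(s, a):
    return {t for (s2, a2, t) in TRANS if s2 == s and a2 == a}

def dia(a, S):   # ⟨·a·⟩S
    return {s for s in PROC if succ(s, a) & S}

def box(a, S):   # [·a·]S
    return {s for s in PROC if succ(s, a) <= S}

def O(f, S):
    """[[F]] under the assumption that S is exactly the set satisfying X."""
    tag = f[0]
    if tag == "X":   return set(S)
    if tag == "tt":  return set(PROC)
    if tag == "ff":  return set()
    if tag == "and": return O(f[1], S) & O(f[2], S)
    if tag == "or":  return O(f[1], S) | O(f[2], S)
    if tag == "dia": return dia(f[1], O(f[2], S))
    if tag == "box": return box(f[1], O(f[2], S))
    raise ValueError(tag)

# If X is assumed to hold exactly at p1, then <a>X holds exactly at p3:
print(O(("dia", "a", ("X",)), {"p1"}))   # {'p3'}
```

As expected from Example 6.2, assuming that X holds exactly at p1 makes ⟨a⟩X hold exactly at p3.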
Exercise 6.4 Given the transition graph from Example 6.2, use the above definition to calculate O[b]ff∧[a]X({p2}).
One can show that, for every formula F, the function OF is monotonic (see Definition 4.4) over the complete lattice (2^Proc, ⊆). In other words, for all subsets S1, S2 of Proc, if S1 ⊆ S2 then OF(S1) ⊆ OF(S2).

Exercise 6.5 Show that OF is monotonic for all F. Consider what will happen if we introduce negation into our logic. Hint: Use structural induction on F.
As mentioned before, the idea underlying the definition of the function OF is that if [[X]] ⊆ Proc gives the set of processes that satisfy X, then OF([[X]]) will be the set of processes that satisfy F. What is this set [[X]] then? Syntactically we shall assume that [[X]] is implicitly given by a recursive equation for X of the form

X =min FX   or   X =max FX .

As shown in the previous section, such an equation can be interpreted as the set equation

[[X]] = OFX([[X]]) .   (6.6)

As OFX is a monotonic function over a complete lattice, we know that (6.6) has solutions, i.e., that OFX has fixed points. In particular, Tarski's fixed point theorem (see Theorem 4.1) gives us that there is a unique largest fixed point, denoted by FIX OFX, and also a unique least one, denoted by fix OFX, given respectively by

FIX OFX = ⋃{S ⊆ Proc | S ⊆ OFX(S)} and
fix OFX = ⋂{S ⊆ Proc | OFX(S) ⊆ S} .

A set S with the property that S ⊆ OFX(S) is called a post-fixed point for OFX. Correspondingly, S is a pre-fixed point for OFX if OFX(S) ⊆ S.
When Proc is finite, we have the following characterization of the largest and least fixed points.

Theorem 6.1 If Proc is finite then FIX OFX = (OFX)^M(Proc) for some M, and fix OFX = (OFX)^m(∅) for some m.

Proof: This follows directly from the fixed point theorem for finite complete lattices. See Theorem 4.2 for the details. □
The above theorem gives us an algorithm for computing the least and largest set of processes solving an equation of the form (6.6). Consider, by way of example, the formula

X =max FX ,

where FX = ⟨b⟩tt ∧ [b]X. We wish to compute the set of processes in the labelled transition system below that satisfy this formula. (The figure is not reproduced here; in our reading of it, the states are s, s1, s2, t and t1, where s has a-labelled transitions to s1 and s2, t has an a-labelled transition to t1, s1 has a b-labelled transition to s2, and s2 and t1 have b-labelled self-loops.)
The set we are after is nothing but the largest fixed point of the set function defined by the right-hand side of the above equation—that is, the function OFX mapping each set of states S to the set

⟨·b·⟩{s, s1, s2, t, t1} ∩ [·b·]S .

Since we are looking for the largest fixed point of this function, we begin the iterative algorithm by taking S = {s, s1, s2, t, t1}, the set of all states in our labelled
transition system. We therefore have that our first approximation to the largest fixed point is the set

OFX({s, s1, s2, t, t1}) = {s1, s2, t1} .
Note that our candidate solution to the equation has shrunk in size, since an application of OFX to the set of all processes has removed the states s and t from our candidate solution. Intuitively, this is because, by applying OFX to the set of all states, we have found a reason why s and t do not afford the property specified by

X =max ⟨b⟩tt ∧ [b]X ,

namely that s and t do not have a b-labelled outgoing transition, and therefore that neither of them is in the set ⟨·b·⟩{s, s1, s2, t, t1}.
Following our iterative algorithm for the computation of the largest fixed point, we now apply the function OFX to the new candidate largest solution, namely {s1, s2, t1}. We now have that

OFX({s1, s2, t1}) = ⟨·b·⟩{s, s1, s2, t, t1} ∩ [·b·]{s1, s2, t1} = {s1, s2, t1} .

(You should convince yourselves that the above calculations are correct!) We have
now found that {s1, s2, t1} is a fixed point of the function OFX. By Theorem 6.1, this is the largest fixed point, and therefore the states s1, s2 and t1 are the only states in our labelled transition system that satisfy the property

X =max ⟨b⟩tt ∧ [b]X .
This is in complete agreement with our intuition because those are the only states
that can perform a b-action in all states that they can reach by performing sequences
of b-labelled transitions.
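The iteration just carried out can be replayed mechanically. The Python sketch below is ours and encodes our reading of the transition system above—s −a→ s1, s −a→ s2, s1 −b→ s2, s2 −b→ s2, t −a→ t1, t1 −b→ t1—and iterates OFX from the set of all states:

```python
TRANS = {("s", "a", "s1"), ("s", "a", "s2"), ("s1", "b", "s2"),
         ("s2", "b", "s2"), ("t", "a", "t1"), ("t1", "b", "t1")}
PROC = {"s", "s1", "s2", "t", "t1"}

def succ(s, a):
    return {t for (s2, a2, t) in TRANS if s2 == s and a2 == a}

def dia(a, S):   # ⟨·a·⟩S
    return {s for s in PROC if succ(s, a) & S}

def box(a, S):   # [·a·]S
    return {s for s in PROC if succ(s, a) <= S}

def O(S):   # O_FX for FX = <b>tt ∧ [b]X
    return dia("b", PROC) & box("b", S)

S = set(PROC)        # for the largest fixed point, start from the top
while O(S) != S:
    S = O(S)
print(sorted(S))     # ['s1', 's2', 't1'], as computed in the text
```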
Use Theorem 6.1 to compute the set of processes in the labelled transition system
above that satisfy this property.
6.3 Largest fixed points and invariant properties
We have previously given an informal argument for why invariant properties are
obtained as largest fixed points. In what follows we will formalize this argument,
and prove its correctness.
As we saw in the previous section, the property Inv(F ) is obtained as the largest fixed point of the recursive equation

X = F ∧ [Act]X .

We will now show that Inv(F ) defined in this way indeed expresses that F holds at all states in all transition sequences.
For this purpose we let I : 2^Proc → 2^Proc be the corresponding semantic function, i.e.,

I(S) = [[F ]] ∩ [·Act·]S .
By Tarski's fixed point theorem this equation has exactly one largest solution, given by

FIX I = ⋃{S | S ⊆ I(S)} .
To show that FIX I indeed characterizes precisely the set of processes for which
all states in all computations satisfy the property F , we need a direct (and obviously
correct) formulation of this set. This is given by the set Inv defined as follows:
Inv = {p | p −σ→ p′ implies p′ ∈ [[F ]], for each σ ∈ Act* and p′ ∈ Proc} .
The correctness of Inv(F ) with respect to this description can now be formulated
as follows.
Theorem 6.2 For every labelled transition system (Proc, Act, {−a→ | a ∈ Act}), it holds that Inv = FIX I.
Proof: We show the statement by proving each of the inclusions Inv ⊆ FIX I
and FIX I ⊆ Inv separately.
Inv ⊆ FIX I: To prove this inclusion it is sufficient to show that Inv ⊆ I(Inv ) (Why?). To this end, let p ∈ Inv. Then, for all σ ∈ Act* and p′ ∈ Proc,

p −σ→ p′ implies that p′ ∈ [[F ]] .   (6.7)

We must establish that p ∈ I(Inv ), or equivalently that p ∈ [[F ]] and that p ∈ [·Act·]Inv. We obtain the first one of these two statements by taking σ = ε in (6.7), because p −ε→ p always holds.

To prove that p ∈ [·Act·]Inv, we have to show that, for each process p′ and action a,

p −a→ p′ implies p′ ∈ Inv .

This is equivalent to proving that, for each sequence of actions σ′ and process p″,

p −a→ p′ and p′ −σ′→ p″ imply p″ ∈ [[F ]] .

However, this follows immediately by letting σ = aσ′ in (6.7).
FIX I ⊆ Inv: First we note that, since FIX I is a fixed point of I, it holds that

FIX I = [[F ]] ∩ [·Act·]FIX I .   (6.8)

To prove that FIX I ⊆ Inv, assume that p ∈ FIX I and that p −σ→ p′. We shall show that p′ ∈ [[F ]] by induction on |σ|, the length of σ.

Base case σ = ε: Then p = p′ and therefore, by (6.8) and our assumption that p ∈ FIX I, it holds that p′ ∈ [[F ]], which was to be shown.

Inductive step σ = aσ′: Then p −a→ p″ −σ′→ p′ for some p″. By (6.8) and our assumption that p ∈ FIX I, it follows that p″ ∈ FIX I. As |σ′| < |σ| and p″ ∈ FIX I, by the induction hypothesis we may conclude that p′ ∈ [[F ]], which was to be shown.

This completes the proof of the second inclusion.

The proof of the theorem is now complete. □
Formulae of Hennessy-Milner logic with one recursively defined variable X are given by the grammar

F ::= X | tt | ff | F1 ∧ F2 | F1 ∨ F2 | ⟨a⟩F | [a]F ,

where a ∈ Act and there is exactly one defining equation for the variable X, which is of the form

X =min FX   or   X =max FX ,

where FX is a formula of the logic which may contain occurrences of the variable X.
Let (Proc, Act, {−a→ | a ∈ Act}) be a labelled transition system and F a formula of Hennessy-Milner logic with one (recursively defined) variable X. Let s ∈ Proc. We shall describe a game between an ‘attacker’ and a ‘defender’ whose goal is to determine whether s satisfies F or not.

The configurations of the game are pairs of the form (s, F ) where s ∈ Proc and F is a formula of Hennessy-Milner logic with one variable X. For every configuration we define the following successor configurations according to the structure of the formula F (here s is ranging over Proc):
• (s, F1 ∧ F2) and (s, F1 ∨ F2) both have two successor configurations, namely (s, F1) and (s, F2),

• (s, ⟨a⟩F ) and (s, [a]F ) both have the successor configurations (s′, F ) for every s′ such that s −a→ s′, and

• (s, X) has only one successor configuration (s, FX), where X is defined via the equation X =max FX or X =min FX.
Note that the successor configuration of (s, X) is always uniquely determined, and we will denote this move by (s, X) → (s, FX). (It is suggestive to think of these moves that unwind fixed points as moves made by a referee for the game.) Similarly, successor configurations selected by the attacker will be denoted by −A→ moves and those chosen by the defender by −D→ moves.
We also notice that every play either

• terminates in (s, tt) or (s, ff ), or

• reaches a configuration in which one of the players is stuck—the attacker gets stuck in a configuration (s, [a]F ), and the defender in a configuration (s, ⟨a⟩F ), whenever s has no a-labelled transitions—or

• is infinite.

The winner of a play is determined as follows.

• The defender is a winner in every play that terminates in (s, tt) or in which the attacker gets stuck; the attacker is a winner in every play that terminates in (s, ff ) or in which the defender gets stuck.

• The attacker is a winner in every infinite play provided that X is defined via X =min FX; the defender is a winner in every infinite play provided that X is defined via X =max FX.
Remark 6.1 The intuition for the least and largest fixed point is as follows. If X is
defined as a least fixed point then the defender has to prove in finitely many rounds
that the property is satisfied. If a play of the game is infinite, then the defender
has failed to do so, and the attacker wins. If instead X is defined as a largest fixed
point, then it is the attacker who has to disprove in finitely many rounds that the
formula is satisfied. If a play of the game is infinite, then the attacker has failed to
do so, and the defender wins.
Theorem 6.3 [Game characterization] Let (Proc, Act, {−a→ | a ∈ Act}) be a labelled transition system and F a formula of Hennessy-Milner logic with one (recursively defined) variable X. Let s ∈ Proc. Then the following statements hold.
• State s satisfies F if and only if the defender has a universal winning strategy
starting from (s, F ).
• State s does not satisfy F if and only if the attacker has a universal winning
strategy starting from (s, F ).
The proof of this result is beyond the scope of this introductory textbook. We refer
the reader to (Stirling, 2001) for a proof of the above result and more information
on model checking games.
(Figure, not reproduced here: in our reading, it depicts the labelled transition system with states s, s1, s2 and s3 and transitions s −b→ s1, s1 −b→ s, s1 −b→ s2, s2 −a→ s3 and s3 −a→ s3.)
Example 6.3 We start with an example which does not use any recursively defined variable. We shall demonstrate that s |= [b](⟨b⟩[b]ff ∧ ⟨b⟩[a]ff ) by defining a universal winning strategy for the defender. As remarked before, we will use −A→ to denote that the successor configuration was selected by the attacker and −D→ to denote that it was selected by the defender. The game starts from the configuration

(s, [b](⟨b⟩[b]ff ∧ ⟨b⟩[a]ff )) .

Because [b] is the topmost operator, the attacker selects the successor configuration, and he has only one possibility, namely

(s, [b](⟨b⟩[b]ff ∧ ⟨b⟩[a]ff )) −A→ (s1, ⟨b⟩[b]ff ∧ ⟨b⟩[a]ff ) .
The next configuration is again selected by the attacker, who can play either

(s1, ⟨b⟩[b]ff ∧ ⟨b⟩[a]ff ) −A→ (s1, ⟨b⟩[b]ff )

or

(s1, ⟨b⟩[b]ff ∧ ⟨b⟩[a]ff ) −A→ (s1, ⟨b⟩[a]ff ) .

We have to show that the defender wins from either of these two configurations (we have to find a universal winning strategy).
• From (s1, ⟨b⟩[b]ff ) it is the defender who makes the next move; let him play (s1, ⟨b⟩[b]ff ) −D→ (s2, [b]ff ). Now the attacker should continue, but s2 has no b-labelled transition, so he is stuck and the defender wins this play.

• From (s1, ⟨b⟩[a]ff ) it is also the defender who makes the next move; let him play (s1, ⟨b⟩[a]ff ) −D→ (s, [a]ff ). Now the attacker should continue, but s has no a-labelled transition, so he is stuck again and the defender wins this play.
Consider next the property X defined by X =max ⟨a⟩tt ∧ [a]X, and let us demonstrate that s2 |= X by exhibiting a universal winning strategy for the defender from (s2, X). After the referee move (s2, X) → (s2, ⟨a⟩tt ∧ [a]X), the attacker can play either

(s2, ⟨a⟩tt ∧ [a]X) −A→ (s2, ⟨a⟩tt)

or

(s2, ⟨a⟩tt ∧ [a]X) −A→ (s2, [a]X) .
It is easy to see that the defender wins from the configuration (s2, ⟨a⟩tt) by the move (s2, ⟨a⟩tt) −D→ (s3, tt), so we shall investigate only the continuation of the game from (s2, [a]X). The attacker has only the move (s2, [a]X) −A→ (s3, X). After expanding the variable X, the game continues from (s3, ⟨a⟩tt ∧ [a]X). Again the attacker can play either

(s3, ⟨a⟩tt ∧ [a]X) −A→ (s3, ⟨a⟩tt)

or

(s3, ⟨a⟩tt ∧ [a]X) −A→ (s3, [a]X) .

In the first case the attacker loses as before. In the second case, the only continuation of the game is (s3, [a]X) −A→ (s3, X). However, we have already seen this configuration earlier in the game. To sum up, either the attacker loses in finitely many steps or the game is infinite. As we are considering a largest fixed point, in both cases the defender is the winner of the game.
Example 6.7 Let X =min ⟨a⟩tt ∨ ([b]X ∧ ⟨b⟩tt). This property informally says that along each b-labelled sequence there is eventually a state where the action a is enabled. We shall argue that s1 ⊭ X by finding a universal winning strategy for the attacker starting from (s1, X). The first move of the game is the referee move (s1, X) → (s1, ⟨a⟩tt ∨ ([b]X ∧ ⟨b⟩tt)), after which the defender selects either

(s1, ⟨a⟩tt ∨ ([b]X ∧ ⟨b⟩tt)) −D→ (s1, ⟨a⟩tt)

or

(s1, ⟨a⟩tt ∨ ([b]X ∧ ⟨b⟩tt)) −D→ (s1, [b]X ∧ ⟨b⟩tt) .

In the first case the defender loses, as he is supposed to pick an a-successor of the state s1 but s1 has none. In the second case the attacker proceeds as follows:

(s1, [b]X ∧ ⟨b⟩tt) −A→ (s1, [b]X) −A→ (s, X) .
The attacker then continues by (s, [b]X ∧ ⟨b⟩tt) −A→ (s, [b]X) −A→ (s1, X), and the situation (s1, X) has already been seen before. This means that the game is infinite (unless the defender loses in finitely many rounds) and hence the attacker is the winner of the game (since we are considering a least fixed point).
(Figure, not reproduced here: a labelled transition system over the states s, s1, s2, t, t1 and t2, with a- and b-labelled transitions.)
Use the game characterization for HML with recursion to show that

1. s1 satisfies the formula

X =max ⟨b⟩tt ∧ [b]X ;
where the formula Forever(b) is one that expresses that b-transitions can be executed forever. The formula Forever(b) is, however, itself specified recursively thus:

Forever(b) =max ⟨b⟩Forever(b) .
Our informally specified requirement is therefore formally expressed by means of
two recursive equations.
In general, a mutually recursive equational system has the form

X1 = FX1
⋮
Xn = FXn ,

where the formulae FX1, . . . , FXn may contain occurrences of each of the variables X1, . . . , Xn. An example of such a system is

X = [a]Y
Y = ⟨a⟩X .

So far we have only met a single recursive equation of the form

X = F ,
OXi(S1, . . . , Sn) = Si   (1 ≤ i ≤ n) .

The function [[D]] turns out to be monotonic over the complete lattice (D, ⊑), and we can obtain both the largest and least fixed points for the equational system in the same way as for the case of one variable.
Consider, for example, the mutually recursive formulae described by the system of equations below:

X =max ⟨a⟩Y ∧ [a]Y ∧ [b]ff
Y =max ⟨b⟩X ∧ [b]X ∧ [a]ff .

We wish to find the set of states in the following labelled transition system that satisfy the formula X. (The figure is not reproduced here; in our reading of it, the states are s, s1, s2 and s3, with transitions s −a→ s1, s1 −b→ s, s2 −a→ s3 and s3 −b→ s3.)
To this end, we can again apply the iterative algorithm for computing the largest fixed point of the function determined by the above system of equations. Note that, as formally explained before, such a function maps a pair of sets of states (S1, S2) to the pair of sets of states

(⟨·a·⟩S2 ∩ [·a·]S2 ∩ {s, s2} , ⟨·b·⟩S1 ∩ [·b·]S1 ∩ {s1, s3}) .   (6.10)

Here,

• ⟨·a·⟩S2 ∩ [·a·]S2 ∩ {s, s2} is the set of states that satisfy the right-hand side of the defining equation for X under these assumptions, and

• ⟨·b·⟩S1 ∩ [·b·]S1 ∩ {s1, s3} is the set of states that satisfy the right-hand side of the defining equation for Y under these assumptions.
To compute the largest solution to the system of equations above, we use the it-
erative algorithm provided by Theorem 6.1 starting from the top element in our
complete lattice, namely the pair
({s, s1 , s2 , s3 }, {s, s1 , s2 , s3 }) .
This corresponds to assuming that all states satisfy both X and Y . To obtain the
next approximation to the largest solution to our system of equations, we compute
the pair (6.10) taking S1 = S2 = {s, s1 , s2 , s3 }. The result is the pair
({s, s2 }, {s1 , s3 }) .
Note that we have shrunk both components in our original estimate to the largest
solution. This means that we have not yet found the largest solution we are looking
for. We therefore compute the pair (6.10) again taking the above pair as our new
input (S1 , S2 ). You should convince yourselves that the result of this computation
is the pair
({s, s2 }, {s1 }) .
Note that the first component in the pair has not changed since our previous approximation, but that s3 has been removed from the second component. This is because at this point we have discovered, for instance, that s3 does not afford a b-labelled transition ending up in either s or s2.
Since we have not yet found a fixed point, we compute the pair (6.10) again,
taking ({s, s2 }, {s1 }) as our new input (S1 , S2 ). The result of this computation is
the pair
({s}, {s1 }) .
Intuitively, at this iteration we have discovered a reason why s2 does not afford property X—namely that s2 has an a-labelled transition leading to state s3, which, as we saw before, does not have property Y.
If we now compute the pair (6.10) again, taking ({s}, {s1}) as our new input (S1, S2), we obtain ({s}, {s1}). We have therefore found the largest solution to our system of equations. It follows that process s satisfies X and process s1 satisfies Y.
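The computation of the largest solution of the system can be scripted as well. The Python sketch below is ours and uses the same reading of the figure as above—s −a→ s1, s1 −b→ s, s2 −a→ s3, s3 −b→ s3—iterating the pair function (6.10) from the top element of the product lattice:

```python
TRANS = {("s", "a", "s1"), ("s1", "b", "s"),
         ("s2", "a", "s3"), ("s3", "b", "s3")}
PROC = {"s", "s1", "s2", "s3"}

def succ(s, a):
    return {t for (s2, a2, t) in TRANS if s2 == s and a2 == a}

def dia(a, S):   # ⟨·a·⟩S
    return {s for s in PROC if succ(s, a) & S}

def box(a, S):   # [·a·]S
    return {s for s in PROC if succ(s, a) <= S}

def step(S1, S2):
    """One application of the pair function (6.10)."""
    X = dia("a", S2) & box("a", S2) & box("b", set())   # <a>Y ∧ [a]Y ∧ [b]ff
    Y = dia("b", S1) & box("b", S1) & box("a", set())   # <b>X ∧ [b]X ∧ [a]ff
    return X, Y

S1, S2 = set(PROC), set(PROC)   # top element of the product lattice
while step(S1, S2) != (S1, S2):
    S1, S2 = step(S1, S2)
print(sorted(S1), sorted(S2))   # ['s'] ['s1']
```

With this reading of the figure, the intermediate pairs produced by the iteration are exactly those listed in the text, ending in the largest solution ({s}, {s1}).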
Exercise 6.8

1. Show that ((2^Proc)^n, ⊑), with ⊑, ⊔ and ⊓ defined as described in the text above, is a complete lattice.
X = [a]Y
Y = ⟨a⟩X

A0 = a.A1 + a.a.0
A1 = a.A2 + a.0
A2 = a.A1 .

X = [a]Y
Y = ⟨a⟩X
Exercise 6.10 Note that in the above rephrasing of the characterization theorem
for HML, we only require that each formula satisfied by p is also satisfied by q, but
not that the converse also holds. Show, however, that if q satisfies all the formulae
in HML satisfied by p, then p and q satisfy the same formulae in HML.
Example 6.8 Assume that Act = {a} and that the process p is given by the equation

X = a.X .

We will show that p cannot be characterized up to bisimulation equivalence by a single recursion-free formula. To see this, we assume that such a formula exists and show that this leads to a contradiction. Towards a contradiction, we assume that for some Fp ∈ M,

[[Fp]] = [p]∼ .   (6.11)

In particular we have that

q |= Fp if and only if p ∼ q, for each process q .   (6.12)

We will obtain a contradiction by proving that (6.12) cannot hold for any formula Fp. Before we prove our statement, we have to introduce some notation.
(Figure 6.3, not reproduced here: the process p, which has an a-labelled self-loop, and, for each n, the chain pn −a→ pn−1 −a→ · · · −a→ p1 −a→ p0.)
Recall that, by the modal depth of a formula F, notation md(F ), we mean the maximum number of nested occurrences of the modal operators in F. Formally, this is defined by the following recursive definition:

1. md(tt) = md(ff ) = 0,
2. md(F1 ∧ F2) = md(F1 ∨ F2) = max{md(F1), md(F2)},
3. md(⟨a⟩F ) = md([a]F ) = 1 + md(F ).

Moreover, for each i ≥ 0, we define the process pi thus:

1. p0 = 0,
2. pi+1 = a.pi .
(The processes p and pi, for i ≥ 1, are depicted in Figure 6.3.) Observe that each
process pi can perform a sequence of i a-labelled transitions in a row and terminate
in doing so. Moreover, this is the only behaviour that pi affords.
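The recursive definition of modal depth translates directly into code. Here is a minimal Python sketch of our own, with formulae encoded as nested tuples (an encoding chosen purely for illustration):

```python
# HML formulae as nested tuples: ("tt",), ("ff",), ("and", F1, F2),
# ("or", F1, F2), ("dia", a, F) for <a>F, and ("box", a, F) for [a]F.
def md(f):
    """Modal depth: maximum nesting of <a> and [a] operators."""
    tag = f[0]
    if tag in ("tt", "ff"):
        return 0
    if tag in ("and", "or"):
        return max(md(f[1]), md(f[2]))
    if tag in ("dia", "box"):
        return 1 + md(f[2])
    raise ValueError(f"unknown constructor {tag!r}")
```

For instance, the formula [a]⟨a⟩tt has modal depth 2, while tt ∧ ff has modal depth 0.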
Now we can prove the following:
Exercise 6.12 (Recommended) Before reading on, you might want to try and
define a characteristic formula for some processes for which HML suffices. If you
fancy this challenge, we encourage you to read Example 6.9 to follow for inspiration.
Assume that a is the only action. For each i ≥ 0, construct an HML formula
that is a characteristic formula for process pi in Figure 6.3. Hint: First give a
characteristic formula for p0. Next show how to construct a characteristic formula
for pi+1 from that for pi.
Example 6.8 shows us that in order to obtain a characteristic formula even for
finite labelled transition systems we need to make use of the recursive extension of
Hennessy-Milner logic.
The construction of the characteristic formula involves two steps. First of all,
we need to construct an equational system that describes the formula; next we
should decide whether to adopt the least or the largest solution to this system. We
start our search for the characteristic formula by giving the equational system, and
choose the suitable interpretation for the fixed points afterwards.
We start by assuming that we have a finite transition system
({p1, . . . , pn}, Act, {−a→ | a ∈ Act})
and a set of variables X = {Xp1, . . . , Xpn, . . .} that contains (at least) as many
variables as there are states in the transition system. Intuitively, Xp is the syntactic
symbol for the characteristic formula for p, and its meaning will be given in terms
of an equational system.
A characteristic formula for a process has to describe which actions the
process can perform, which actions it cannot perform, and what happens to it after
it has performed each action. The following example illustrates these issues.
Example 6.9 If a coffee machine is given by Figure 6.4, we can construct a char-
acteristic formula for it as follows.
Let gkm be the initial state of the coffee machine. Then we see that gkm can
perform an m-action and that this is the only action it can perform in this state. The
picture also shows us that, by performing the m action, gkm will necessarily end
up in state q. This can be expressed as follows:
[Figure 6.4: the coffee machine, with an m-labelled transition from gkm to q, and
t- and k-labelled transitions from q back to gkm.]
The above observations tell us that
1. gkm can perform an m-action and thereby become q,
2. by performing an m-action, gkm can only become q, and
3. gkm can perform neither a t-action nor a k-action.
If we let Xgkm and Xq denote the characteristic formulae for gkm and q, respectively,
Xgkm can be expressed as
Xgkm = ⟨m⟩Xq ∧ [m]Xq ∧ [{t, k}]ff ,
where ⟨m⟩Xq expresses property 1 above, [m]Xq expresses property 2, and the
last conjunct [{t, k}]ff expresses property 3. To obtain the characteristic formula
for gkm, we have to define a recursive formula for Xq following the same strategy.
We observe that q can perform two actions, namely t and k, and in both cases it
becomes gkm. Xq can therefore be expressed as
Xq = ⟨t⟩Xgkm ∧ ⟨k⟩Xgkm ∧ [{t, k}]Xgkm ∧ [m]ff .
In the recursive formula above, the first conjunct ⟨t⟩Xgkm states that a process that
is bisimilar to q should be able to perform a t-labelled transition and thereby end up
in a state that is bisimilar to gkm—that is, that satisfies the characteristic property
Xgkm for state gkm. The interpretation of the second conjunct is similar. The
third conjunct instead states that all of the outgoing transitions from a state that is
bisimilar to q that are labelled with t or k will end up in a state that is bisimilar to
gkm. Finally, the fourth and last conjunct says that a process that is bisimilar to q
cannot perform action m.
Now we can generalize the strategy employed in the above example as follows. Let
Der(a, p) = {p′ | p −a→ p′} .
If p′ ∈ Der(a, p), then p, and any process that is bisimilar to p, can perform an
a-labelled transition and become p′; therefore p has the property ⟨a⟩Xp′ for each
such p′. Furthermore, if p −a→ p′ then p′ ∈ Der(a, p). Therefore p has the property
[a] ⋁_{p′ : p −a→ p′} Xp′ ,
for each action a. The above property states that, by performing action a, process p
(and any other process that is bisimilar to it) must become a process satisfying the
characteristic property of a state in Der(a, p). (Note that, if p affords no a-labelled
transition, then Der(a, p) is empty. In that case, since an empty disjunction is just
the formula ff, the above formula becomes simply [a]ff—which is what we would
expect.)
Since action a is arbitrary, we have that
p |= ⋀_a [a] ⋁_{p′ : p −a→ p′} Xp′ .
Combining this property with the properties ⟨a⟩Xp′ above, we are led to seek a
solution to the system of equations
Xp = ⋀_{a, p′ : p −a→ p′} ⟨a⟩Xp′ ∧ ⋀_a [a] ⋁_{p′ : p −a→ p′} Xp′ , (6.14)
one for each p ∈ Proc.
The solution can either be the least or the largest one (or, in fact, any other fixed
point, for all we know at this stage).
The following example shows that the least solution to (6.14) in general does
not yield the characteristic property for a process.
Example 6.10 Let p be the process given in Figure 6.5. In this case, assuming for
the sake of simplicity that a is the only action, the equational system obtained by
using (6.14) will have the form
Xp = ⟨a⟩Xp ∧ [a]Xp .
Since ⟨·a·⟩∅ = ∅, you should be able to convince yourselves that [[Xp]] = ∅ is
the least solution to this equation. This corresponds to taking Xp = ff as the
characteristic formula for p. However, p does not have the property ff, which
therefore cannot be the characteristic property for p.
[Figure 6.5: the process p, with a single a-labelled self-loop.]
In what follows we will show that the largest solution to (6.14) yields the char-
acteristic property for all p ∈ Proc. (Those amongst you who read Section 4.3
will notice that this is in line with our characterization of bisimulation equivalence
as the largest fixed point of a suitable monotonic function.) This is the content of
the following theorem, whose proof you can skip unless you are interested in the
mathematical developments.
Theorem 6.4 Let (Proc, Act, {−a→ | a ∈ Act}) be a finite transition system and, for
each p ∈ Proc, let Xp be defined by
Xp =max ⋀_{a, p′ : p −a→ p′} ⟨a⟩Xp′ ∧ ⋀_a [a] ⋁_{p′ : p −a→ p′} Xp′ . (6.15)
Then Xp is the characteristic property for p—that is, q |= Xp iff p ∼ q, for each
q ∈ Proc.
The assumption that Proc and Act be finite ensures that there is only a finite num-
ber of variables involved in the definition of the characteristic formula and that we
only obtain a formula with finite conjunctions and disjunctions on the right-hand
side of each equation.
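For a finite labelled transition system, the right-hand sides of the equations (6.15) can be generated mechanically. The following Python sketch is our own illustration: it produces the equations as strings, using `X_p`, `<a>`, `[a]`, `and`, `or` and `ff` as concrete syntax of our choosing, and is exercised on the coffee machine of Example 6.9.

```python
def characteristic_system(states, actions, trans):
    """For each state p, build the right-hand side of equation (6.15):
    a conjunct <a>X_p' for every transition p -a-> p', and a conjunct
    [a](X_p' or ...) for every action a (ff when p has no a-successor)."""
    eqs = {}
    for p in states:
        conjuncts = [f"<{a}>X_{q}" for (s, a, q) in sorted(trans) if s == p]
        for a in sorted(actions):
            succs = sorted(q for (s, b, q) in trans if s == p and b == a)
            body = " or ".join(f"X_{q}" for q in succs) if succs else "ff"
            conjuncts.append(f"[{a}]({body})")
        eqs[p] = " and ".join(conjuncts)
    return eqs

# The coffee machine of Example 6.9: gkm -m-> q, q -t-> gkm, q -k-> gkm.
coffee = characteristic_system(
    {"gkm", "q"}, {"m", "t", "k"},
    {("gkm", "m", "q"), ("q", "t", "gkm"), ("q", "k", "gkm")})
```

The equation generated for gkm contains the conjuncts `<m>X_q`, `[m](X_q)` and the "impossibility" conjuncts `[t](ff)` and `[k](ff)`, mirroring the formula derived by hand in Example 6.9.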
In the proof of the theorem, we will let DK be the declaration defined by
DK(Xp) = ⋀_{a, p′ : p −a→ p′} ⟨a⟩Xp′ ∧ ⋀_a [a] ⋁_{p′ : p −a→ p′} Xp′ .
From our previous discussion, we have that Xp is the characteristic property for p
if and only if, for the largest solution [[Xp]], where p ∈ Proc, we have that [[Xp]] =
[p]∼. In what follows, we write q |=max Xp if q belongs to [[Xp]] in the largest
solution for DK.
In order to prove Theorem 6.4, we shall establish the following two statements
separately, for each process q ∈ Proc:
1. if q |=max Xp, then p ∼ q, and
2. if p ∼ q, then q |=max Xp.
As the first step in the proof of Theorem 6.4, we prove the following lemma to the
effect that the former statement holds.
Lemma 6.1 Let Xp be defined as in (6.15). Then, for each q ∈ Proc, we have that
q |=max Xp ⇒ p ∼ q .
Proof. Define the relation R = {(p, q) | q |=max Xp}. We shall prove that R is a
bisimulation, that is, that the following two conditions hold for each action b:
a) (p, q) ∈ R and p −b→ p1 ⇒ ∃q1. q −b→ q1 and (p1, q1) ∈ R;
b) (p, q) ∈ R and q −b→ q1 ⇒ ∃p1. p −b→ p1 and (p1, q1) ∈ R.
a) Assume that (p, q) ∈ R and p −b→ p1. By the definition of R, this means that
q |=max Xp and p −b→ p1 .
As p −b→ p1, we obtain, in particular, that q |=max ⟨b⟩Xp1, which means that,
for some q1 ∈ Proc,
q −b→ q1 and q1 |=max Xp1 .
By the definition of R, the latter statement yields (p1, q1) ∈ R. We have therefore
found a q1 such that
q −b→ q1 and (p1, q1) ∈ R ,
as condition a) requires.
b) Assume that (p, q) ∈ R and q −b→ q1. This means that
q |=max Xp and q −b→ q1 .
Since q |=max Xp, we know, in particular, that q |=max [b] ⋁_{p′ : p −b→ p′} Xp′.
As we know that q −b→ q1, we obtain that
q1 |=max ⋁_{p′ : p −b→ p′} Xp′ .
Therefore there must exist a p1 such that q1 |=max Xp1 and p −b→ p1.
By the definition of R, it follows that (p1, q1) ∈ R. We have therefore proven that
∃p1. p −b→ p1 and (p1, q1) ∈ R ,
as condition b) requires. This completes the proof that R is a bisimulation. Since
(p, q) ∈ R whenever q |=max Xp, we may conclude that
q |=max Xp implies p ∼ q .
The following lemma completes the proof of our main theorem of this section. In
the statement of this result, and in its proof, we assume for notational convenience
that Proc = {p1 , . . . , pn }.
Lemma 6.2 ([p1 ]∼ , . . . , [pn ]∼ ) v [[DK ]]([p1 ]∼ , . . . , [pn ]∼ ), where DK is the dec-
laration defined on page 145.
(Can you see why?) The proof can be divided into two parts, namely showing that,
whenever p ∼ q,
a) q ∈ ⋂_{a, p′ : p −a→ p′} ⟨·a·⟩[p′]∼ , and
b) q ∈ ⋂_a [·a·] ⋃_{p′ : p −a→ p′} [p′]∼ .
For part a), assume that p ∼ q and p −a→ p′. Then there is a q′ such that q −a→ q′
and p′ ∼ q′. Hence
q ∈ ⟨·a·⟩[p′]∼ .
Since a and p′ were arbitrary, claim a) follows. For part b), assume that p ∼ q and
q −a→ q′ for some action a and process q′. Then there is a p′ such that p −a→ p′ and
p′ ∼ q′, so every a-derivative of q belongs to ⋃_{p′ : p −a→ p′} [p′]∼. Since a was
arbitrary, this is equivalent to
q ∈ ⋂_a [·a·] ⋃_{p′ : p −a→ p′} [p′]∼ .
Theorem 6.4 can now be expressed as the following lemma, whose proof completes
the argument for that result.
Exercise 6.13 What are the characteristic formulae for the processes p and q in
Figure 6.1?
Exercise 6.14 Define characteristic formulae for the simulation and ready simu-
lation preorders as defined in Definitions 3.17 and 3.18, respectively.
We already saw on page 120 how to describe a property of the form 'it is possible
for the system to reach a state satisfying F' using the template formula Pos(F),
namely
Pos(F) =min F ∨ ⟨Act⟩Pos(F) .
Therefore, all that we need to do to specify the above property using HML with
recursion is to 'plug in' a specification of the property 'the state has a livelock' in
lieu of F. How can we describe a property of the form 'the state has a livelock'
using HML with recursion? A livelock is an infinite sequence of internal steps of
the system; in other words, a state has a livelock if it affords a τ-labelled transition
to a state that itself has a livelock. This observation suggests the equation
LivelockNow = ⟨τ⟩LivelockNow .
As usual, we are faced with a choice in selecting a suitable solution for the above
equation. Since we are specifying a state of affairs that should hold forever, in this
case we should select the largest solution to the equation above. It follows that our
HML specification of the property ‘the state has a livelock’ is
LivelockNow =max ⟨τ⟩LivelockNow .
Exercise 6.15 What would be the least solution of the above equation?
Exercise 6.16 Consider the labelled transition system below, in which s −a→ p,
the state p affords a τ-labelled self-loop as well as a τ-labelled transition to q, and
q −τ→ r.
Use the iterative algorithm for computing the set of states in that labelled transition
system that satisfy the formula LivelockNow defined above.
Exercise 6.17 This exercise is for those amongst you who feel they need more
practice in computing fixed points using the iterative algorithm.
Consider the labelled transition system below.
[A labelled transition system over the states s, s1, s2 and s3, with a- and τ-labelled
transitions.]
Use the iterative algorithm for computing the set of states in that labelled transition
system that satisfy the formula LivelockNow defined above.
This looks natural and innocuous. However, first appearances can be deceiving!
Indeed, the equational systems we have considered so far have only allowed us to
express formulae purely in terms of largest or least solutions to systems of recur-
sion equations. (See Section 6.5.) For instance, in defining the characteristic for-
mulae for bisimulation equivalence, we only used systems of equations in which
the largest solution was sought for all of the equations in the system.
Our next question is whether we can extend our framework in such a way that
it can treat systems of equations with mixed solutions like the one describing the
formula P os(LivelockNow) above. How can we, for instance, compute the set of
processes in the labelled transition system
of Exercise 6.16 (in which s −a→ p, the state p has a τ-labelled self-loop and a
τ-labelled transition to q, and q −τ→ r)
that satisfy the formula Pos(LivelockNow)? In this case, the answer is not overly
difficult. In fact, you might have already noted that we can compute the set of
processes satisfying the formula Pos(LivelockNow) once we have in our hands
the collection of processes satisfying the formula LivelockNow. As you saw in
Exercise 6.16, the only state in the above labelled transition system satisfying the
formula LivelockNow is p. Therefore, we may obtain the collection of states
satisfying the formula Pos(LivelockNow) as the least solution of the set equation
S = {p} ∪ ⟨·Act·⟩S , (6.16)
where S ranges over subsets of {s, p, q, r}. We can calculate the least solution of
this equation using the iterative methods we introduced in Section 6.2.
Since we are looking for the least solution of the above equation, we begin by
obtaining our first approximation S(1) to the solution by computing the value of the
expression on the right-hand side of the equation when S = ∅, which is the least
element in the complete lattice consisting of the subsets of {s, p, q, r} ordered by
inclusion. We have that
S(1) = {p} ∪ ⟨·Act·⟩∅ = {p} .
Intuitively, we have so far discovered the (obvious!) fact that p has a possibility of
reaching a state where a livelock may arise because p has a livelock now.
Our second approximation S(2) is obtained by computing the set obtained by
evaluating the expression on the right-hand side of equation (6.16) when S =
S(1) = {p}. The result is
S(2) = {p} ∪ ⟨·Act·⟩{p} = {s, p} .
Intuitively, we have now discovered the new fact that s has a possibility of reaching
a state where a livelock may arise because s has a transition leading to p, which,
as we found out in the previous approximation, has itself a possibility of reaching
a livelock.
You should now be able to convince yourselves that the set {s, p} is indeed a
fixed point of equation (6.16)—that is, that
{s, p} = {p} ∪ ⟨·Act·⟩{s, p} .
It follows that {s, p} is the least solution of equation (6.16), and that the states s
and p are the only ones in our example labelled transition system that satisfy the
formula Pos(LivelockNow). This makes perfect sense intuitively, because s and
p are the only states in that labelled transition system that afford a sequence of
transitions leading to a state from which an infinite computation consisting of τ-
labelled transitions is possible. (In the case of p, this sequence is empty, since p
can embark on a τ-loop immediately.)
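The two fixed point computations of this section can be scripted. In the Python sketch below (our own illustration; the structure of the labelled transition system is our reading of the example, namely s −a→ p, a τ-loop on p, p −τ→ q and q −τ→ r), LivelockNow is obtained as a largest fixed point by iteration from the full state set, and Pos(LivelockNow) as a least fixed point by iteration from the empty set.

```python
def pre(trans, labels, target):
    """States with at least one transition, labelled in `labels`, into `target`."""
    return {p for (p, a, q) in trans if a in labels and q in target}

def gfp(f, top):
    """Largest fixed point of a monotonic f, by iteration from the top."""
    s = set(top)
    while f(s) != s:
        s = f(s)
    return s

def lfp(f):
    """Least fixed point of a monotonic f, by iteration from the empty set."""
    s = set()
    while f(s) != s:
        s = f(s)
    return s

# The example system, as read off the picture: s -a-> p, a tau-loop on p,
# p -tau-> q, and q -tau-> r.
states = {"s", "p", "q", "r"}
trans = {("s", "a", "p"), ("p", "tau", "p"),
         ("p", "tau", "q"), ("q", "tau", "r")}

livelock = gfp(lambda x: pre(trans, {"tau"}, x), states)      # -> {"p"}
pos = lfp(lambda x: livelock | pre(trans, {"a", "tau"}, x))   # -> {"s", "p"}
```

The iterations reproduce the hand computation above: the largest fixed point is {p}, and the least solution of (6.16) is {s, p}.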
Note that we could find the set of states satisfying Pos(LivelockNow) by first
computing [[LivelockNow]], and then using this set to compute
[[Pos(LivelockNow)]] ,
because the specification of the formula LivelockNow was independent of that of
Pos(LivelockNow). In general, we can apply this strategy when the collection
of equations can be partitioned into a sequence of 'blocks' such that
of equations can be partitioned into a sequence of ‘blocks’ such that
• the equations in the same block are all either largest fixed point equations or
least fixed point equations, and
• equations in each block only use variables defined in that block or in preced-
ing ones.
The following definition formalizes this class of systems of equations.
Definition 6.2 An n-nested mutually recursive equational system E is an n-tuple
⟨(D1, X1, m1), (D2, X2, m2), . . . , (Dn, Xn, mn)⟩,
where the Xi's are pairwise disjoint, finite sets of variables and, for each 1 ≤ i ≤ n,
does not meet these requirements because the variables X and Y are both defined
in mutually recursive fashion, and their definitions refer to different types of fixed
points. If we allow fixed points to be mixed completely freely, we obtain the modal
µ-calculus (Kozen, 1983), which was mentioned in Section 6.1. In this book we
shall, however, not allow full freedom in mixing fixed points in declarations, but
restrict ourselves to systems of equations satisfying the constraints in Definition 6.2.
Note that, employing the approach we described above for our running example in
this section, such systems of equations have a unique solution, obtained by solving
the first block and then proceeding with the others, using the solutions already
obtained for the preceding blocks.
Finally if F is a Hennessy-Milner formula defined over a set of variables Y =
{Y1 , . . . , Yk } that are declared by an n-nested mutually recursive equational system
E, then [[F]] is well-defined and can be expressed by
[[F]] = OF([[Y1]], . . . , [[Yk]]) , (6.17)
where [[Y1]], . . . , [[Yk]] are the sets of states satisfying the recursively defined
formulae associated with the variables Y1, . . . , Yk.
Exercise 6.18 Consider the labelled transition system in Exercise 6.17. Use equa-
tion (6.17) to compute the set of states satisfying the formula
F = ⟨Act⟩Pos(LivelockNow) .
Express this property using HML with recursion. Next, construct a rooted labelled
transition system that satisfies the property and one that does not. Check your
constructions by computing the set of states in the labelled transition systems you
have built that satisfy the formula.
• the modal µ-calculus (Kozen, 1983), i.e., Hennessy-Milner logic with arbi-
trarily many nested and recursively defined variables.
These logics are typical representatives of the so-called branching-time logics. The
view of time taken by these logics is that each moment in time may branch into
several distinct possible futures. Therefore, the structures used for interpreting
branching-time logics can be viewed as computation trees. This means that in
order to check for the validity of a formula, one has to consider a whole tree of
states reachable from the root. Another typical and well-known branching-time
logic is the computation tree logic or CTL (Clarke and Emerson, 1981), which
uses (nested) until as the only temporal operator, the next-time modality X and
existential/universal path quantifiers.
Another collection of temporal logics is that of the so-called linear-time logics.
The view of time taken by these logics is that each moment in time has a unique
successor. Suitable models for formulae in such logics are therefore computation
sequences. Here the validity of a formula is determined for a particular (fixed) trace
of the system and possible branching is not taken into account. A process satisfies
a linear-time formula if all of its computation sequences satisfy it. Linear tempo-
ral logic or LTL (Pnueli, 1977) is probably the most studied logic of this type, in
over sequential infinite-state systems (Caucal, 1996; Muller and Schupp, 1985).
The EXPTIME-hardness of model checking µ-calculus formulae over pushdown
automata is valid even in the case that the size of the formula is assumed to be
constant (fixed). On the other hand, for fixed formulae and processes in the BPA
class (pushdown automata with a single control state), the problem is decidable in
polynomial time (Burkart and Steffen, 1997; Walukiewicz, 2001). Model checking
of HML is PSPACE-complete for BPA, but, for a fixed formula, this problem is
again in P (Mayr, 1998).
The situation is, however, not that promising once we move from sequential
infinite-state systems to parallel infinite-state systems. Both for the class of Petri
nets (PN) and for its communication-free fragment BPP (CCS with parallel com-
position, recursion and action prefixing only) essentially all branching-time logics
with at least one recursively defined variable are undecidable. More precisely, the
logic EG, which can express that there exists a computation during which some
HML formula is invariantly true, is undecidable for BPP (Esparza and Kiehn,
1995) (and hence also for PN). The EF logic, which can essentially express
reachability properties, is decidable for BPP (Esparza, 1997) but undecidable for
PN (Esparza, 1994). On the other hand, the linear-time logic LTL (with a certain
restriction) is decidable for Petri nets and BPP (Esparza, 1994). This is an example
where LTL turns out to be more tractable than branching-time logics. A thorough
discussion of the relative merits of linear- and branching-time logics from a com-
plexity theoretic perspective may be found in, e.g., the paper (Vardi, 2001).
For further references and more detailed overviews we refer the reader, for
example, to the references (Burkart et al., 2001; Burkart and Esparza, 1997).
Chapter 7
Modelling mutual exclusion algorithms
In the previous chapters of this book, we have illustrated the use of the ingredients
in our methodology for the description and analysis of reactive systems by means
of simple, but hopefully illustrative, examples. As we have mentioned repeatedly,
the difficulty in understanding and reasoning reliably about even the simplest re-
active systems has long been recognized. Apart from the intrinsic scientific and
intellectual interest of a theory of reactive computation, this realization has served
as a powerful motivation for the development of the theory we have presented so
far, and of its associated verification techniques.
In order to offer you further evidence for the usefulness of the theory you have
learned so far in the modelling and analysis of reactive systems, we shall now use
it to model and analyze some well known mutual exclusion algorithms. These al-
gorithms are amongst the most classic ones in the theory of concurrent algorithms,
and have been investigated by many authors using a variety of techniques—see, for
instance, the classic papers (Dijkstra, 1965; Knuth, 1966; Lamport, 1986). Here,
they will give us the opportunity to introduce some modelling and verification tech-
niques that have proven their worth in the analysis of many different kinds of reac-
tive systems.
In order to illustrate concretely the steps that have to be taken in modelling
and verification problems, we shall consider a very elegant solution to the mutual
exclusion problem proposed by Peterson and discussed in (Peterson and Silber-
schatz, 1985).
In Peterson’s algorithm for mutual exclusion, there are two processes P 1 and
P2 , two boolean variables b1 and b2 and an integer variable k that may take the
values 1 and 2. The boolean variables b 1 and b2 have initial value false, whereas the
initial value of the variable k can be arbitrary. In order to ensure mutual exclusion,
each process Pi (i ∈ {1, 2}) executes the following algorithm, where we use j to
denote the index of the other process.
while true do
begin
‘noncritical section’;
bi := true;
k := j;
while (bj and k = j) do skip;
‘critical section’;
bi := false;
end
Like many concurrent algorithms in the literature, Peterson's mutual exclusion
algorithm is presented in pseudocode. Therefore one of our tasks, when modelling the
above algorithm, is to translate the pseudocode description of the behaviour of the
processes P1 and P2 into the model of labelled transition systems or into Milner’s
CCS. Moreover, the algorithm uses variables that are manipulated by the processes
P1 and P2 . Variables are not part of CCS because, as discussed in Section 1.2,
process calculi like CCS are based on the message passing paradigm, and not on
shared variables. However, this is not a major problem. In fact, following the
message passing paradigm, we can view variables as processes that are willing to
communicate with other computing agents in their environment that need to read
and/or write them.
By way of example, let us consider how to represent the boolean variable b1 as
a process. This variable will be encoded as a process with two states, namely B1t
and B1f. The former state will describe the ‘behaviour’ of the variable b1 holding
the value true, and the latter the ‘behaviour’ of the variable b1 holding the value
false. No matter what its value is, the variable b1 can be read (yielding information
on its value to the reading process) or written (possibly changing the value held
by the variable). We need to describe these possibilities in CCS. To this end, we
shall assume that processes read and write variables by communicating with them
using suitable communication ports. For instance, a process wishing to read the
value true from variable b1 will try to synchronize with the process representing
that variable on a specific communication channel, say b1rt—the acronym means
‘read the value true from b1’. Similarly, a process wishing to write the value false
into variable b1 will try to synchronize with the process representing that variable
on the communication channel b1wf—‘write false into b1’.
Using these ideas, the behaviour of the process describing the variable b1 can
be represented by the following CCS expressions:
B1f def= b1rf.B1f + b1wf.B1f + b1wt.B1t
B1t def= b1rt.B1t + b1wf.B1f + b1wt.B1t .
Intuitively, when in state B1t , the above process is willing to tell its environment
that its value is true, and to receive writing requests from other processes. The
communication of the value of the variable to its environment does not change the
state of the variable, whereas a writing request from a process in the environment
may do so.
The behaviour of the process describing the variable b2 can be represented in
similar fashion thus:
B2f def= b2rf.B2f + b2wf.B2f + b2wt.B2t
B2t def= b2rt.B2t + b2wf.B2f + b2wt.B2t .
Again, the process representing the variable k has two states, denoted by the constants
K1 and K2, because the variable k can only take the two values 1 and 2. By
analogy with the variable processes above, its behaviour can be described by
K1 def= kr1.K1 + kw1.K1 + kw2.K2
K2 def= kr2.K2 + kw1.K1 + kw2.K2 .
Exercise 7.1 You should now be in a position to generalize the above examples.
Assume that we have a variable v taking values over a data domain D. Can you
represent this variable using a CCS process?
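One way to render the idea behind Exercise 7.1 programmatically is to generate, for each value of the data domain, a state offering one read action and one write action per value. The Python sketch below is our own illustration; the channel-naming scheme mimics b1rt and b1wf from the text, and the CCS distinction between input and output actions is deliberately ignored here.

```python
def variable_process(name, domain):
    """Transition table of a CCS-style process for a variable over `domain`:
    in state d it offers a read action `<name>r<d>` (staying in state d) and,
    for every value e in the domain, a write action `<name>w<e>` (moving to
    state e). Writing the current value leaves the state unchanged, just as
    b1wt does in state B1t in the text."""
    return {d: [(f"{name}r{d}", d)] + [(f"{name}w{e}", e) for e in domain]
            for d in domain}
```

For the two-valued domain {t, f} this reproduces the shape of the B1t/B1f definitions above: each state has one read move and two write moves.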
we shall assume, for the sake of simplicity, that processes cannot fail or termi-
nate within the critical section. Under these assumptions, the initial behaviour of
process P1 can be described by the following CCS expression:
P1 def= b1wt.kw2.P11 .
The above expression says that process P1 begins by writing true in variable b1 and
2 in variable k. Having done so, it will enter a new state that will be represented
by the constant P11. This new constant will intuitively describe the behaviour of
process P1 while it is executing the following line of pseudocode:
• move to a new state, say P12 , otherwise. In state P12 , we expect that process
P1 will enter and then exit the critical section.
The first thing to note here is that we need to make a decision as to the precise
semantics of the informal pseudocode expression
bj and k = j.
P11 def= b2rf.P12 + b2rt.(kr2.P11 + kr1.P12) .
To complete the description of the behaviour of the process P1, we are left to present
the defining equation for the constant P12, describing the access to, and exit from,
the critical section, and the setting of the variable b1 to false:
P12 def= enter1.exit1.b1wf.P1 .
In the above CCS expression, we have labelled the enter and exit actions in a way
that makes it clear that it is process P1 that is entering and exiting the critical
section.
The CCS process describing the behaviour of process P2 in Peterson's algorithm
is entirely symmetric to the one we have just provided, and is defined thus:
P2 def= b2wt.kw1.P21
P21 def= b1rf.P22 + b1rt.(kr1.P21 + kr2.P22)
P22 def= enter2.exit2.b2wf.P2 .
The CCS process term representing the whole of Peterson’s algorithm consists
of the parallel composition of the terms describing the two processes running the
algorithm, and of those describing the variables. Since we are only interested in the
behaviour of the algorithm pertaining to the access to, and exit from, their critical
sections, we shall restrict all of the communication channels that are used to read
from, and write to, the variables. We shall use L to stand for that set of channel
names. Assuming that the initial value of the variable k is 1, our CCS description
of Peterson’s algorithm is therefore given by the term
Peterson def= (P1 | P2 | B1f | B2f | K1) \ L .
Exercise 7.3 (Mandatory!) Give a CCS process that describes the behaviour of
Hyman’s ‘mutual exclusion’ algorithm. Hyman’s algorithm was proposed in the
reference (Hyman, 1966). It uses the same variables as Peterson’s.
In Hyman’s algorithm, each process P i (i ∈ {1, 2}) executes the algorithm in
Figure 7.1, where as above we use j to denote the index of the other process.
Now that we have a formal description of Peterson’s algorithm, we can set our-
selves the goal to analyze its behaviour—manually or with the assistance of a soft-
ware tool that can handle specifications of reactive systems given in the language
CCS. In order to do so, however, we first need to specify precisely what it means for
an algorithm to ‘ensure mutual exclusion’. In our formalization, it seems natural
to identify ‘ensuring mutual exclusion’ with the following requirement:
At no point in the execution of the algorithm will both processes P1
and P2 be in their critical sections at the same time.
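As a sanity check on the requirement just stated, one can explore the (finite) state space of Peterson's pseudocode directly. The Python sketch below is our own illustration, not the CCS/CWB analysis of this chapter: each pseudocode line, including the loop test, is treated as one atomic step, which is a coarser granularity than the CCS model, where bj and k are read in separate transitions.

```python
from collections import deque

def step(state, i):
    """One atomic step of process i (1 or 2) of Peterson's algorithm.
    state = (pc1, pc2, b1, b2, k), with program counters:
    0 noncritical, 1 flag set, 2 busy-wait test, 3 critical section."""
    pc1, pc2, b1, b2, k = state
    pc, b = [None, pc1, pc2], [None, b1, b2]
    j = 3 - i
    if pc[i] == 0:                     # b_i := true
        b[i], pc[i] = True, 1
    elif pc[i] == 1:                   # k := j
        k, pc[i] = j, 2
    elif pc[i] == 2:                   # while (b_j and k = j) do skip
        if not (b[j] and k == j):
            pc[i] = 3                  # enter the critical section
    else:                              # exit; b_i := false
        b[i], pc[i] = False, 0
    return (pc[1], pc[2], b[1], b[2], k)

def check_mutex():
    """Breadth-first exploration of all reachable states; report whether
    mutual exclusion holds in every one of them."""
    init = (0, 0, False, False, 1)     # k initially 1, as in the text
    seen, todo = {init}, deque([init])
    while todo:
        s = todo.popleft()
        if s[0] == 3 and s[1] == 3:    # both in their critical sections
            return False
        for i in (1, 2):
            t = step(s, i)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return True
```

The search visits only a handful of states and confirms that no reachable state places both processes in their critical sections simultaneously.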
while true do
begin
‘noncritical section’;
bi := true;
while k ≠ j do begin
while bj do skip;
k:=i
end;
‘critical section’;
bi := false;
end
Figure 7.1: Hyman's mutual exclusion algorithm
How can we formalize this requirement? There are at least two options for do-
ing so, depending on whether we wish to use HML with recursion or CCS pro-
cesses/labelled transition systems as our specification formalism. In order to gain
experience in the use of both approaches to specification and verification, in what
follows we shall present specifications for mutual exclusion using HML with re-
cursion and CCS.
So all that we are left to do in order to formalize our requirement for mutual exclusion
is to give a formula F in HML describing the requirement that processes P1
and P2 are never in their critical sections at the same time.
In light of our CCS formalization of the processes P1 and P2, we know that process
Pi (i ∈ {1, 2}) is in its critical section precisely when it can perform action exiti.
So our formula F can be taken to be
F def= [exit1]ff ∨ [exit2]ff .
The formula Inv(F) now states that it is invariantly the case that either P1 is not
in the critical section or P2 is not in the critical section, which is an equivalent
formulation of our correctness criterion.
Throughout this chapter, we are interpreting the modalities in HML over the
transition system whose states are CCS processes, and whose transitions are the
weak transitions =α⇒, for any action α including τ. So a formula like [exit1]ff
is satisfied by all processes that do not afford a weak exit1-labelled transition—that
is, by those processes that cannot perform action exit1 no matter how many internal
steps they do beforehand.
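The weak transition relation used here can be computed from the strong one by closing under τ-steps on both sides. The following Python sketch (our own illustration, on a made-up three-state fragment) computes the targets of weak a-labelled transitions for an observable action a:

```python
def tau_closure(trans, start):
    """All states reachable from `start` via zero or more tau-labelled steps."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for (p, a, q) in trans:
            if p == s and a == "tau" and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def weak_successors(trans, s, a):
    """States reachable from s via a weak a-transition (tau* a tau*),
    for an observable action a."""
    after_a = {q for p in tau_closure(trans, s)
               for (x, b, q) in trans if x == p and b == a}
    result = set()
    for m in after_a:
        result |= tau_closure(trans, m)
    return result

# A made-up fragment: s can do exit1 only after one internal step.
example = {("s", "tau", "t"), ("t", "exit1", "u")}
```

On this fragment, s has a weak exit1-labelled transition (via one internal step), so s does not satisfy [exit1]ff under the weak interpretation, even though s affords no strong exit1-labelled transition.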
Would such a formula be a good specification for our correctness criterion? What
if we took G to be the formula
S ↦ [[F]] ∩ [·Act·]S ,
or by iteratively computing the largest fixed point of the above mapping. The good
news, however, is that we do not need to do so! One of the benefits of having
formal specifications of systems and of their correctness criteria is that, at least in
principle, they can be used as inputs for algorithms and tools that do the analysis
for us.
One such verification tool for reactive systems that is often used for educa-
tional purposes is the so-called Edinburgh Concurrency Workbench (henceforth
abbreviated to CWB) that is freely available at
http://homepages.inf.ed.ac.uk/perdita/cwb/.
The CWB accepts inputs specified in CCS and HML with recursive definitions,
and implements, amongst others, algorithms that check whether a CCS process
satisfies a formula in HML with recursion or not. One of its commands (namely,
checkprop) allows us to check, at the press of a button, that Peterson does indeed
satisfy property Inv(F) above, and therefore that it preserves mutual exclusion, as
its proposer intended.
Exercise 7.5 Use the CWB to check whether Peterson satisfies the two candidate
formulae Inv(G) in Exercise 7.4.
Exercise 7.6 (Mandatory) Use the CWB to check whether the CCS process for
Hyman’s algorithm that you gave in your answer to Exercise 7.3 satisfies the for-
mula Inv(F ) specifying mutual exclusion.
process descriptions. Such a notion of equivalence can be used as our yardstick for
correctness.
Unfortunately, there is no single notion of behavioural equivalence that fits
all purposes. We have already met notions of equivalence like trace equivalence
(Section 3.2), strong bisimilarity (Section 3.3) and weak bisimilarity (Section 3.4).
Moreover, this is just the tip of the iceberg of ‘reasonable’ notions of equiva-
lence or approximation between reactive systems. (The interested, and very keen,
reader may wish to consult van Glabbeek’s encyclopaedic studies (Glabbeek, 1990;
Glabbeek, 1993; Glabbeek, 2001) for an in-depth investigation of the notions of be-
havioural equivalence that have been proposed in the literature on concurrency the-
ory.) So, when using implementation verification to establish the correctness of an
implementation, such as our description of Peterson’s mutual exclusion algorithm,
we need to
1. express our specification of the desired behaviour of the implementation us-
ing our model for reactive systems—in our setting as a CCS term—, and
2. choose a suitable notion of behavioural equivalence to be used in checking
that the model of the implementation is correct with respect to the chosen
specification.
As you can see, in both of these steps we need to make creative choices—putting
to rest the usual perception that verifying the correctness of computing systems is
a purely mechanical endeavour.
So let us try and verify the correctness of Peterson’s algorithm for mutual ex-
clusion using implementation verification. According to the above checklist, the
first thing we need to do is to express the desired behaviour of a mutual exclusion
algorithm using a CCS process term.
Intuitively, we expect that a mutual exclusion algorithm like Peterson’s initially
allows both processes P1 and P2 to enter their critical sections. However, once one
of the two processes, say P1 , has entered its critical section, the other can only enter
after P1 has exited its critical section. A suitable specification of the behaviour of
a mutual exclusion algorithm seems therefore to be given by the CCS term
MutexSpec def= enter1.exit1.MutexSpec + enter2.exit2.MutexSpec .   (7.1)
Assuming that this is our specification of the expected behaviour of a mutual ex-
clusion algorithm, our next task is to prove that the process Peterson is equivalent
to, or a suitable approximation of, MutexSpec. What notion of equivalence or
approximation should we use for this purpose?
You should be able to convince yourselves readily that trace equivalence or
strong bisimilarity, as presented in Sections 3.2 and 3.3, will not do. (Why?) One
natural candidate is observational equivalence. However, process Peterson can
perform a sequence of τ-transitions leading to the state (P12 | P21 | B1t | B2t | K1) \ L,
and the state that is the target of that sequence affords an enter1-labelled tran-
sition, but cannot perform any weak enter2-labelled transition. On the other hand,
the only state that process MutexSpec can reach by performing internal transitions
is itself, and in that state both enter transitions are always enabled. It follows that
Peterson and MutexSpec are not observationally equivalent.
Exercise 7.7 What sequence of τ -transitions will bring process Peterson into state
(P12 | P21 | B1t | B2t | K1 ) \ L? (You will need five τ -steps.)
Argue that, as we claimed above, that state affords an enter 1 -labelled transi-
tion, but cannot perform a weak enter 2 -labelled transition.
This sounds like very bad news indeed. Observational equivalence allows us to
abstract away from some of the internal steps in the evolution of process Peterson,
but obviously not enough in this specific setting. We seem to need a more abstract
notion of equivalence to establish the, seemingly obvious, correctness of Peterson’s
algorithm with respect to our specification.
Observe that if we could show that the ‘observable content’ of each sequence
of actions performed by process Peterson is a trace of process MutexSpec, then we
could certainly conclude that Peterson does ensure mutual exclusion. In fact, this
would mean that at no point in its behaviour process Peterson can perform two exit
actions in a row—possibly with some internal steps in between them. But what
do we mean precisely by the ‘observable content’ of a sequence of actions? The
following definition formalizes this notion in a very natural way.
Definition 7.1 [Weak traces and weak trace equivalence] A weak trace of a process
P is a sequence a1 · · · ak (k ≥ 1) of observable actions such that there exists a
sequence of transitions

P = P0 ⇒^{a1} P1 ⇒^{a2} · · · ⇒^{ak} Pk ,

for some processes P1, . . . , Pk. Two processes are weak trace equivalent iff they
afford the same weak traces.
Note that the collection of weak traces coincides with that of traces for processes
that, like MutexSpec, do not afford internal transitions. (Why?)
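To make the notion concrete, the weak traces of a finite-state process can be enumerated mechanically by closing under τ-steps before each observable action. The following Python sketch is our own illustration and is not part of the CWB; the dictionary encoding of an LTS, the state names C1 and C2 for the derivatives of MutexSpec, and the inclusion of the empty sequence among the results are our conventions.

```python
TAU = "tau"  # the internal action

def tau_closure(lts, s):
    """All states reachable from s via zero or more tau-transitions."""
    seen, stack = {s}, [s]
    while stack:
        for act, tgt in lts.get(stack.pop(), ()):
            if act == TAU and tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return seen

def weak_traces(lts, state, max_len):
    """Weak traces of `state` up to length max_len: sequences a1...ak of
    observable actions such that state =a1=> ... =ak=> s' for some s'.
    (For convenience the empty sequence is included as well.)"""
    traces = set()
    def explore(s, prefix):
        traces.add(prefix)
        if len(prefix) == max_len:
            return
        for u in tau_closure(lts, s):
            for act, tgt in lts.get(u, ()):
                if act != TAU:
                    explore(tgt, prefix + (act,))
    explore(state, ())
    return traces

# MutexSpec = enter1.exit1.MutexSpec + enter2.exit2.MutexSpec,
# with C1 and C2 as (made-up) names for the two intermediate states.
mutex_spec = {
    "MutexSpec": {("enter1", "C1"), ("enter2", "C2")},
    "C1": {("exit1", "MutexSpec")},
    "C2": {("exit2", "MutexSpec")},
}

print(sorted(weak_traces(mutex_spec, "MutexSpec", 2)))
# → [(), ('enter1',), ('enter1', 'exit1'), ('enter2',), ('enter2', 'exit2')]
```

Note that no weak trace of this specification contains two enter-actions in a row, which is exactly the safety property discussed above.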
We claim that the processes Peterson and MutexSpec are weak trace equivalent,
and therefore that Peterson does meet our specification of mutual exclusion modulo
weak trace equivalence. This can be checked automatically using the command
mayeq provided by the CWB. (Do so!) This equivalence tells us that not only each
weak trace of process Peterson is allowed by the specification MutexSpec, but also
that process Peterson can exhibit as a weak trace each of the traces permitted by
the specification.
If we are just satisfied with checking the pure safety condition that no trace
of process Peterson violates the mutual exclusion property, then it suffices only to
show that Peterson is a weak trace approximation of MutexSpec. A useful proof
technique that can be used to establish this result is given by the notion of weak
simulation. (Compare with the notion of simulation defined in Exercise 3.17.)
Definition 7.2 [Weak simulation] Let us say that a binary relation R over the set
of states of an LTS is a weak simulation iff whenever s1 R s2 and α is an action
(including τ):

- if s1 →^{α} s1′, then there is a transition s2 ⇒^{α} s2′ such that s1′ R s2′.

We say that a state s2 weakly simulates a state s1 iff there is a weak simulation R
such that s1 R s2.
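Definition 7.2 suggests a simple algorithm for finite labelled transition systems: start from the full relation on states and repeatedly delete pairs that violate the transfer condition, until a greatest fixed point is reached. The Python sketch below is our own illustration of this idea, not the CWB's implementation; the dictionary encoding of an LTS and the action name tau are our conventions.

```python
TAU = "tau"  # the internal action

def tau_closure(lts, s):
    """States reachable from s by zero or more tau-steps."""
    seen, stack = {s}, [s]
    while stack:
        for act, tgt in lts.get(stack.pop(), ()):
            if act == TAU and tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return seen

def weak_succ(lts, s, act):
    """Targets of the weak transition s =act=> s'.
    For act == TAU this is just the tau-closure of s."""
    pre = tau_closure(lts, s)
    if act == TAU:
        return pre
    mid = {tgt for u in pre for a, tgt in lts.get(u, ()) if a == act}
    return {v for m in mid for v in tau_closure(lts, m)}

def largest_weak_simulation(lts):
    """Greatest fixed point: delete (s1, s2) while some transition of s1
    cannot be matched by a weak transition of s2 back into the relation."""
    states = set(lts) | {t for succ in lts.values() for _, t in succ}
    rel = {(s1, s2) for s1 in states for s2 in states}
    changed = True
    while changed:
        changed = False
        for s1, s2 in list(rel):
            for act, t1 in lts.get(s1, ()):
                if not any((t1, t2) in rel for t2 in weak_succ(lts, s2, act)):
                    rel.discard((s1, s2))
                    changed = True
                    break
    return rel  # (s1, s2) in rel  iff  s2 weakly simulates s1

lts = {"p": {("a", "done")}, "q": {("tau", "q1")}, "q1": {("a", "done")}}
print(("p", "q") in largest_weak_simulation(lts))
# → True: q matches p's a-transition by the weak transition q =a=> done
```

A pair (s1, s2) survives the refinement exactly when s2 weakly simulates s1.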
Proposition 7.1 For all states s, s′, s″ in a labelled transition system, the following
statements hold.

1. s weakly simulates s.

2. If s′ weakly simulates s, and s″ weakly simulates s′, then s″ weakly simu-
lates s.

3. If s′ weakly simulates s, then every weak trace of s is also a weak trace of s′.
In light of the above proposition, to show that Peterson is a weak trace approxima-
tion of MutexSpec, it suffices only to build a weak simulation that relates Peterson
with MutexSpec. The existence of such a weak simulation can be checked using
the command pre offered by the CWB. (Do so!)
Exercise 7.10 Assume that the CCS process Q weakly simulates P . Show that
Q + R weakly simulates P and P + R, for each CCS process R.
Exercise 7.11
1. Show that the processes α.P +α.Q and α.(P +Q) are weak trace equivalent
for each action α, and terms P, Q.
where we have assumed that our monitor process outputs on channel name bad
when it discovers that two enter actions have occurred without an intervening exit.
In order to check whether process Peterson ensures mutual exclusion, it is now
sufficient to let it interact with MutexTest, and ask whether the resulting system
can initially perform the action bad. Indeed, we have the following result.
Proposition 7.2 Let P be a CCS process whose only visible actions are contained
in the set L′ = {enter1, enter2, exit1, exit2}. Then (P | MutexTest) \ L′ ⇒^{bad} iff either

P ⇒^{σ} P′ ⇒^{enter1} P″ ⇒^{enter2} P‴   or   P ⇒^{σ} P′ ⇒^{enter2} P″ ⇒^{enter1} P‴ ,

for some P′, P″, P‴ and some sequence of actions σ in the regular language
(enter1 exit1 + enter2 exit2)∗.
Proof: For the 'if' implication, assume, without loss of generality, that

P ⇒^{σ} P′ ⇒^{enter1} P″ ⇒^{enter2} P‴ ,

for some P′, P″, P‴ and sequence of actions σ ∈ (enter1 exit1 + enter2 exit2)∗. We
shall argue that (P | MutexTest) \ L′ ⇒^{bad}. To see this, note that, using induction on
the length of the sequence σ, it is not hard to prove that

(P | MutexTest) \ L′ ⇒^{τ} (P′ | MutexTest) \ L′ .

Since P′ ⇒^{enter1} P″ ⇒^{enter2} P‴, we have that

(P′ | MutexTest) \ L′ ⇒^{τ} (P″ | MutexTest1) \ L′ ⇒^{τ} (P‴ | bad.0) \ L′ →^{bad} .

Combining these transitions, we obtain

(P | MutexTest) \ L′ ⇒^{bad} ,

as required.
Conversely, assume that (P | MutexTest) \ L′ ⇒^{bad}. Since bad.0 is the only state
of process MutexTest that can perform a bad-action, this means that, for some P‴,

(P | MutexTest) \ L′ ⇒^{τ} (P‴ | bad.0) \ L′ →^{bad} .

Because of the way MutexTest is constructed, this must be because, for some P′
and P″ such that either P′ ⇒^{enter1} P″ ⇒^{enter2} P‴ or P′ ⇒^{enter2} P″ ⇒^{enter1} P‴,

(P | MutexTest) \ L′ ⇒^{τ} (P′ | MutexTest) \ L′
                     ⇒^{τ} (P″ | MutexTest_i) \ L′   (i ∈ {1, 2})
                     ⇒^{τ} (P‴ | bad.0) \ L′ .

Using induction on the number of →^{τ}-steps in the weak transition

(P | MutexTest) \ L′ ⇒^{τ} (P′ | MutexTest) \ L′ ,

you can now argue that P ⇒^{σ} P′, for some sequence of actions σ in the regular
language (enter1 exit1 + enter2 exit2)∗. This completes the proof. □
Exercise 7.12 Fill in the details in the above proof.
Definition 7.3 [Tests] A test is a finite, rooted LTS T over the set of actions Act ∪
{bad}, where bad is a distinguished channel name not occurring in Act. We use
root(T ) to denote the start state of the LTS T .
As above, the idea is that a test acts as a monitor that ‘observes’ the behaviour of
a process and reports any occurrence of an undesirable situation by performing a
bad-labelled transition.
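This monitor idea can be prototyped directly. Since the CCS definition of MutexTest is not reproduced in this excerpt, the sketch below is our own reconstruction of its essence as a two-state observer that reports a violation exactly when a second enter-action occurs before an intervening exit; exploring the product of a process LTS with this observer plays the role of asking whether (P | MutexTest) \ L′ can weakly perform bad. The LTS encoding and all state names are our conventions.

```python
TAU = "tau"
ENTERS = {"enter1", "enter2"}
EXITS = {"exit1", "exit2"}

def violates_mutex(lts, start):
    """Can `start` weakly perform two enter-actions in a row
    (with arbitrary tau-steps in between) without an exit?"""
    # Monitor state: False = nobody in the critical section, True = one inside.
    seen = {(start, False)}
    stack = [(start, False)]
    while stack:
        s, inside = stack.pop()
        for act, tgt in lts.get(s, ()):
            if act == TAU:
                nxt = (tgt, inside)
            elif act in ENTERS:
                if inside:          # second enter: the monitor would say 'bad'
                    return True
                nxt = (tgt, True)
            elif act in EXITS:
                nxt = (tgt, False)
            else:
                continue            # other visible actions are ignored here
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

# The specification itself never violates mutual exclusion ...
mutex_spec = {
    "MutexSpec": {("enter1", "C1"), ("enter2", "C2")},
    "C1": {("exit1", "MutexSpec")},
    "C2": {("exit2", "MutexSpec")},
}
print(violates_mutex(mutex_spec, "MutexSpec"))   # → False

# ... while a broken 'algorithm' that lets both processes in does.
broken = {"p": {("enter1", "q")}, "q": {("tau", "q1")}, "q1": {("enter2", "r")}}
print(violates_mutex(broken, "p"))               # → True
```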
In the remainder of this section, tests will often be concisely described using
the regular fragment of Milner's CCS—that is, the fragment of CCS given by the
following grammar:

T ::= 0 | α.T | T + T | X ,
where α can be any action in Act as well as the distinguished action bad, and X
is a constant drawn from a given, finite set of process names. The right-hand side
of the defining equations for a constant can only be a term generated by the above
grammar. For example, the process MutexTest we specified above is a regular CCS
process, but the term
X def= a.(b.0 | X)
is not.
We now proceed to describe formally how tests can be used to check whether
a process satisfies a formula expressed in HML with recursion.
Definition 7.4
• For every state s of an LTS, we say that s passes the test T iff

(s | root(T)) \ L ⇏^{bad} .

(Recall that L stands for the collection of observable actions in CCS except
for the action bad.) Otherwise we say that s fails the test T.
• We say that the test T tests for the formula F (and that F is testable) iff for
every LTS and every state s of that LTS,

s |= F   iff   s passes the test T .
Example 7.1 The formula [a]ff is satisfied by those processes that do not afford
an ⇒^{a}-transition. We therefore expect that a suitable test for such a property is
T ≡ ā.bad.0. Indeed, the reader will easily realize that (s | T) \ L ⇏^{bad} iff s ⇏^{a},
for every state s. The formula [a]ff is thus testable, in the sense of Definition 7.4.
The formula defined by the recursion equation

F max= [a]ff ∧ [b]F

is satisfied by those states which cannot perform any ⇒^{a}-transition, no matter how
they engage in a sequence of ⇒^{b}-transitions. (Why?) A suitable test for such a
property is

X def= ā.bad.0 + b̄.X ,
and the recursively defined formula F is thus testable.
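The agreement between the recursively defined formula F and the test X can be checked mechanically on finite labelled transition systems. The Python sketch below (our own illustration, over a made-up LTS with actions a, b and τ) computes the set of states satisfying F = [a]ff ∧ [b]F as a greatest fixed point over weak transitions, and independently the set of states passing the test X = ā.bad.0 + b̄.X by searching for a weak a-capability along weak b-steps; the two computations coincide.

```python
TAU = "tau"

def tau_closure(lts, s):
    """States reachable from s by zero or more tau-steps."""
    seen, stack = {s}, [s]
    while stack:
        for act, tgt in lts.get(stack.pop(), ()):
            if act == TAU and tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return seen

def weak_succ(lts, s, act):
    """Targets of the weak transition s =act=> s' (act != tau)."""
    pre = tau_closure(lts, s)
    mid = {tgt for u in pre for a, tgt in lts.get(u, ()) if a == act}
    return {v for m in mid for v in tau_closure(lts, m)}

def satisfies_F(lts):
    """Greatest fixed point of  F = [a]ff ∧ [b]F  over weak transitions."""
    states = set(lts) | {t for succ in lts.values() for _, t in succ}
    sat = {s for s in states if not weak_succ(lts, s, "a")}        # [a]ff
    changed = True
    while changed:
        changed = False
        for s in set(sat):
            if any(t not in sat for t in weak_succ(lts, s, "b")):  # [b]F
                sat.discard(s)
                changed = True
    return sat

def passes_test(lts, s):
    """s passes  X = a̅.bad.0 + b̅.X  iff no state reachable from s by
    weak b-steps can perform a weak a-step (so 'bad' is never reported)."""
    seen, stack = {s}, [s]
    while stack:
        cur = stack.pop()
        if weak_succ(lts, cur, "a"):
            return False
        for t in weak_succ(lts, cur, "b"):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return True

# An illustrative LTS (our own): s -b-> s1 -a-> 0, and t -b-> t.
lts = {"s": {("b", "s1")}, "s1": {("a", "0")}, "t": {("b", "t")}}
states = set(lts) | {t for succ in lts.values() for _, t in succ}
assert satisfies_F(lts) == {s for s in states if passes_test(lts, s)}
print(sorted(satisfies_F(lts)))   # → ['0', 't']
```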
Exercise 7.13 Consider the labelled transition system given in the figure (diagram
omitted here). Compute the set of states in this labelled transition system that
satisfy the property

F max= [a]ff ∧ [b]F .

Which of the states in that labelled transition system pass the test

X def= ā.bad.0 + b̄.X ?
Exercise 7.14 Prove the claims that we have made in the above example.
In Example 7.1, we have met two examples of testable formulae. But, can each
formula in HML with recursion be tested in the sense introduced above? The
following instructive result shows that even some very simple HML properties are
not testable in the sense of Definition 7.4.
1. Let a be an action in L. Then the formula ⟨a⟩tt is not testable.

2. Let a and b be two distinct actions in L. Then the formula [a]ff ∨ [b]ff is not
testable.
• PROOF OF (1). Assume, towards a contradiction, that a test T tests for the
formula ⟨a⟩tt. Since T tests for ⟨a⟩tt and 0 ⊭ ⟨a⟩tt, we have that

(0 | root(T)) \ L ⇒^{bad} .

Consider now the term P = a.0 + τ.0. As P →^{a} 0, the process P satisfies
the formula ⟨a⟩tt. However, P fails the test T because

(P | root(T)) \ L →^{τ} (0 | root(T)) \ L ⇒^{bad} .
• PROOF OF (2). Assume, towards a contradiction, that a test T tests for the
formula [a]ff ∨ [b]ff, with a ≠ b. Since the state a.0 + b.0 does not satisfy
the formula [a]ff ∨ [b]ff, it follows that

((a.0 + b.0) | root(T)) \ L ⇒^{bad} .   (7.2)

We now proceed to show that this implies that either the state a.0 fails the
test T or b.0 does. This we do by examining the possible forms transition
(7.2) may take.

– CASE: ((a.0 + b.0) | root(T)) \ L ⇒^{bad} because root(T) ⇒^{bad}. In this
case, every state of an LTS fails the test T, and we are done.

– CASE: ((a.0 + b.0) | root(T)) \ L ⇒^{τ} (0 | t) \ L →^{bad}, because
root(T) ⇒^{ā} t for some state t of T. In this case, we may infer that

(a.0 | root(T)) \ L ⇒^{τ} (0 | t) \ L →^{bad} ,

and therefore that a.0 fails the test T. The case in which root(T) ⇒^{b̄} t,
for some state t of T, is entirely symmetric, and yields that b.0 fails the
test T.

Hence, as previously claimed, either a.0 fails the test T or b.0 does. Since
both a.0 and b.0 satisfy the formula [a]ff ∨ [b]ff, this contradicts our
assumption that T tests for it.
The collection of formulae in safety HML is the set of formulae in HML with
recursion that do not contain occurrences of ∨, ⟨α⟩ and variables defined using
least fixed point recursion equations. For instance, the formula ⟨a⟩X is not a legal
formula in safety HML if X is defined thus:

X min= ⟨b⟩tt ∨ ⟨a⟩X .
Exercise 7.15 (Strongly recommended) Can you build a test (denoted by a pro-
cess in the regular fragment of CCS) for each formula in safety HML without
recursion? Hint: Use induction on the structure of formulae.
It turns out that, with the addition of recursive formulae defined using largest fixed
points, the collection of testable formulae in HML with recursion is precisely the
one you built tests for in the previous exercise! This is the import of the following
result from (Aceto and Ingolfsdottir, 1999).
Thus we can construct tests for safety properties expressible in HML with recur-
sion. We refer the interested readers to (Aceto and Ingolfsdottir, 1999) for more
details, further developments and references to the literature.