Software verification has emerged as a key concern for ensuring the continued progress of information
technology. Full verification generally requires, as a crucial step, equipping each loop with a “loop invariant.”
Beyond their role in verification, loop invariants help program understanding by providing fundamental
insights into the nature of algorithms. In practice, finding sound and useful invariants remains a challenge.
Fortunately, many invariants seem intuitively to exhibit a common flavor. Understanding these fundamental
invariant patterns could therefore provide help for understanding and verifying a large variety of programs.
We performed a systematic identification, validation, and classification of loop invariants over a range
of fundamental algorithms from diverse areas of computer science. This article analyzes the patterns, as
uncovered in this study, governing how invariants are derived from postconditions; it proposes a taxonomy
of invariants according to these patterns; and it presents its application to the algorithms reviewed. The
discussion also shows the need for high-level specifications based on “domain theory.” It describes how the
invariants and the corresponding algorithms have been mechanically verified using an automated program
prover; the proof source files are available. The contributions also include suggestions for invariant inference
and for model-based specification.
Categories and Subject Descriptors: D.2.4 [Software/Program Verification]: Correctness Proofs; F.3.1
[Specifying and Verifying and Reasoning about Programs]: Invariants
General Terms: Algorithms, Verification
Additional Key Words and Phrases: Loop invariants, deductive verification, preconditions and postconditions,
formal verification
ACM Reference Format:
Carlo A. Furia, Bertrand Meyer, and Sergey Velder. 2014. Loop invariants: Analysis, classification, and
examples. ACM Comput. Surv. 46, 3, Article 34 (January 2014), 51 pages.
DOI: http://dx.doi.org/10.1145/2506375
The ETH part of this work was partially supported by the Swiss National Science Foundation under projects
ASII (# 200021-134976) and LSAT (# 200020-134974), and by the Hasler Foundation under project Mancom.
The NRU ITMO part of this work was performed in the ITMO Software Engineering Laboratory with
financial support from the mail.ru group.
Authors’ addresses: C. Furia and B. Meyer, Software Engineering, ETH Zentrum, Clausiusstrasse
59, 8092 Zurich, Switzerland and S. Velder, Software Engineering Laboratory, ITMO National Research
University, Kronversky Prospekt 49, Saint Petersburg 197101, Russia.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted
without fee provided that copies are not made or distributed for profit or commercial advantage and that
copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for
components of this work owned by others than ACM must be honored. Abstracting with credit is permitted.
To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this
work in other works requires prior specific permission and/or a fee. Permissions may be requested from
Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212)
869-0481, or [email protected].
© 2014 ACM 0360-0300/2014/01-ART34 $15.00
DOI: http://dx.doi.org/10.1145/2506375
ACM Computing Surveys, Vol. 46, No. 3, Article 34, Publication date: January 2014.
34:2 C. A. Furia et al.
1 With the exception of those in Sections 4.5 and 4.7, whose presentation is at a higher level of abstraction,
so that a complete formalization would have required complex axiomatization of geometric and numerical
properties beyond the focus of this article.
2 In the repository, the branch inv_survey contains only the algorithms described in the article; see
http://goo.gl/DsdrV for instructions on how to access it.
Loop Invariants: Analysis, Classification, and Examples 34:3
(the variant clause helps establish termination as discussed later). Init and Body are
each a compound (a list of instructions to be executed in sequence); either or both can be
empty, although Body normally will not. Exit and Inv (the inductive invariant) are both
Boolean expressions, that is to say, predicates on the program state. The semantics of
the loop is:
(1) Execute Init.
(2) Then, if Exit has value True, do nothing; if it has value False, execute Body, and
repeat step 2.
Another way of stating this informal specification is that the execution of the loop
consists of the execution of Init followed by zero or more executions of Body, stopping
as soon as Exit becomes True.
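This two-step semantics can be transcribed directly as executable code. The following Python sketch (ours, not the article's; names are illustrative) treats Init, Exit, and Body as callables:

```python
def from_until_loop(init, exit_cond, body):
    # Direct transcription of the informal semantics: execute Init, then
    # repeatedly test Exit and, while it is False, execute Body.
    init()
    while not exit_cond():
        body()

# Tiny demonstration (ours): record the order of executions.
log = []
from_until_loop(lambda: log.append("init"),
                lambda: len(log) >= 4,
                lambda: log.append("body"))
```

A "while" loop with continuation condition C is the special case with an empty Init and Exit equal to not C.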
There are many variations of the loop construct in imperative programming lan-
guages: “while” forms, which use a continuation condition rather than the inverse exit
condition; “do-until” forms, which always execute the loop body at least once, testing
for the condition at the end rather than on entry; and “for” or “do” forms (“across”
in Eiffel), which iterate over an integer interval or a data structure. They can all be
derived in a straightforward way from the previously mentioned basic form, on which
we will rely throughout this article.
The invariant Inv plays no direct role in the informal semantics but serves to reason
about the loop and its correctness. Inv is a correct invariant for the loop if it satisfies
the following conditions:
(1) Every execution of Init, started in the state preceding the loop execution, will yield
a state in which Inv holds.
(2) Every execution of Body, started in any state in which Inv holds and Exit does not
hold, will yield a state in which Inv holds again.
If these properties hold, then any terminating execution of the loop will yield a state
in which both Inv and Exit hold. This result is a consequence of the loop semantics,
which defines the loop execution as the execution of Init followed by zero or more
executions of Body, each performed in a state where Exit does not hold. If Init ensures
satisfaction of the invariant, and any one execution of Body preserves it (it is enough
to obtain this property for executions started in a state not satisfying Exit), then Init
followed by any number of executions of Body will.
Formally, the following classic inference rule [Hoare 1972; Meyer 1997] uses the
invariant to express the correctness requirement on any loop:
{P} Init {Inv}        {Inv ∧ ¬Exit} Body {Inv}
---------------------------------------------------
{P} from Init until Exit loop Body end {Inv ∧ Exit}
This is a partial correctness rule, useful only for loops that terminate. Proofs of termi-
nation are in general handled separately through the introduction of a loop variant: a
value from a well-founded set, usually taken to be the set of natural numbers, which de-
creases upon each iteration (again, it is enough to show that it does so for initial states
not satisfying Exit). Since in a well-founded set all decreasing sequences are finite,
the existence of a variant expression implies termination. The rest of this discussion
concentrates on the invariants; it only considers terminating algorithms, of course, and
includes the corresponding variant clauses, but does not explain why the corresponding
expressions are indeed loop variants (nonnegative and decreasing). Invariants, how-
ever, also feature in termination proofs, where they ensure that the variant ranges over
a well-founded set (or, equivalently, the values it takes are bounded from below).
If a loop is equipped with an invariant, proving its partial correctness means estab-
lishing the two hypotheses of the aforementioned rule:
—{P} Init {Inv}, stating that the initialization ensures the invariant, is called the
initiation property.
—{Inv ∧ ¬Exit} Body {Inv}, stating that the body preserves the invariant, is called the
consecution (or inductiveness) property.
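Both properties can also be checked dynamically on concrete runs. The following Python harness (a sketch of ours; a runtime check, of course, not a proof) asserts initiation once and consecution after every iteration:

```python
def run_loop_checked(state, init, exit_cond, body, inv):
    # Execute "from Init until Exit loop Body end", asserting the
    # initiation and consecution properties along the way.
    init(state)
    assert inv(state)              # initiation: {P} Init {Inv}
    while not exit_cond(state):
        body(state)
        assert inv(state)          # consecution: {Inv and not Exit} Body {Inv}
    return state                   # on exit, both Inv and Exit hold

# Toy instance (ours, not from the article): summing 1..n with the
# invariant total == i * (i + 1) / 2.
s = {"n": 10}
run_loop_checked(
    s,
    init=lambda s: s.update(i=0, total=0),
    exit_cond=lambda s: s["i"] == s["n"],
    body=lambda s: s.update(i=s["i"] + 1, total=s["total"] + s["i"] + 1),
    inv=lambda s: s["total"] == s["i"] * (s["i"] + 1) // 2,
)
```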
1.2. A Constructive View
We may look at the notion of loop invariant from the constructive perspective of a
programmer directing his or her program to reach a state satisfying a certain desired
property, the postcondition. In this view, program construction is a form of problem solv-
ing, and the various control structures are problem-solving techniques [Dijkstra 1976;
Meyer 1980; Gries 1981; Morgan 1994]; a loop solves a problem through successive
approximation.
The idea of this solution, illustrated in Figure 1, is the following:
—Generalize the postcondition (the characterization of possible solutions) into a
broader condition: the invariant.
—As a result, the postcondition can be defined as the combination (“and” in logic,
intersection in the figure) of the invariant and another condition: the exit condition.
—Find a way to reach the invariant from the previous state of the computation: the
initialization.
—Find a way, given a state that satisfies the invariant, to get to another state, still
satisfying the invariant but closer, in some appropriate sense, to the exit condition:
the body.
For the solution to reach its goal after a finite number of steps, we need a notion of
discrete “distance” to the exit condition. This is the loop variant.
The importance of this presentation of the loop process is that it highlights the nature
of the invariant: it is a generalized form of the desired postcondition, which in a special
case (represented by the exit condition) will give us that postcondition. This view of the
invariant, as a particular way of generalizing the desired goal of the loop computation,
explains why the loop invariant is such an important property of loops; one can argue
that understanding a loop means understanding its invariant (in spite of the obvious
observation that many programmers write loops without ever formally learning the
notion of invariant, although we may claim that if they understand what they are
doing they are relying on some intuitive understanding of the invariant anyway, like
Molière’s Mr. Jourdain speaking in prose without knowing it).
The key results of this article can be described as generalization strategies to obtain
invariants from postconditions.
gcd(x, x) = x. (2)
The second conjunct, a generalization of the postcondition, will serve as the invariant;
the first conjunct will serve as the exit condition. To obtain the loop body, we take
advantage of another mathematical property: for every x > y,
gcd(x, y) = gcd(x − y, y), (3)
yielding the well-known algorithm in Figure 2. (As with any assertion, writing clauses
successively in the invariant is equivalent to a logical conjunction.) This form of Euclid’s
algorithm uses subtraction; another form, given in Section 4.2.2, uses integer division.
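Rendered in Python (a sketch of ours; the figure itself uses the paper's Eiffel-style notation), with the essential invariant checked at each iteration against the library gcd:

```python
import math

def gcd_subtraction(a, b):
    # Subtraction form of Euclid's algorithm. Assumes a > 0 and b > 0.
    result, x = a, b
    while result != x:
        if result > x:
            result = result - x    # property (3): gcd(x, y) = gcd(x - y, y)
        else:
            x = x - result
        # essential invariant: gcd(result, x) remains gcd(a, b)
        assert math.gcd(result, x) == math.gcd(a, b)
    return result                  # on exit, gcd(result, result) == result
```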
We may use this example to illustrate some of the orthogonal categories in the
classification developed in the rest of this article:
—The last clause of the invariant is an essential invariant, representing a weakening of
the postcondition. The first two clauses are a bounding invariant, indicating that the
state remains within certain general boundaries and ensuring that the “essential”
part is defined.
—The essential invariant is a conservation invariant, indicating that a certain quantity
remains equal to its original value.
—The strategy that leads to this conservation invariant is uncoupling, which replaces
a property of one variable (Result), used in the postcondition, by a property of two
variables (Result and x), used in the invariant.
The proof of correctness follows directly from the mathematical property stated:
(2) establishes initiation, and (3) establishes consecution.
Section 4.2.2 shows how the same technique is applicable backward to guess likely
loop invariants given an algorithm annotated with pre- and postcondition: mutating
the latter yields a suitable loop invariant.
1.4. Other Kinds of Invariants
Loop invariants are the focus of this article, but before we return to them it is useful
to list some other kinds of invariants encountered in software. (Yet other invariants,
which lie even further beyond the scope of this discussion, play fundamental roles in
fields such as physics; consider, for example, the invariance of the speed of light under
a Lorentz transformation and of time under a Galilean transformation.)
In object-oriented programming, a class invariant (also directly supported by the
Eiffel notation [Eiffel 2006]) expresses a property of a class that:
—Every instance of the class possesses immediately after creation, and
—Every exported feature (operation) of the class preserves,
with the consequence that whenever such an object is accessible to the rest of the
software it satisfies the invariant, since the life of an object consists of creation followed
by any number of “qualified” calls x . f to exported features f by clients of the class.
The two properties listed are strikingly similar to initiation and consecution for loop
invariants, and the connection appears clearly if we model the life of an object as a loop:
from
    create x.make              -- Written in some languages as x := new C()
invariant
    CI                         -- The class invariant
until
    “x is no longer needed”
loop
    x.some_feature_of_the_class
end
Also useful are Lamport-style invariants [Lamport 1977], used to reason about con-
current programs, which obviate the need for the “ghost variables” of the Owicki-Gries
method [Owicki and Gries 1976]. Like other invariants, a Lamport invariant is a pred-
icate on the program state; the difference is that the definition of the states involves not
only the values of the program’s variables but also the current point of the execution of
the program (“Program Counter” or PC) and, in the case of a concurrent program, the
collection of the PCs of all its concurrent processes. An example of application is the
answer to the following problem posed by Lamport [2009].
Consider N processes numbered from 0 through N − 1 in which each process i executes
i0 : x[i] := 1
i1 : y[i] := x[(i − 1) mod N]
i2 :
and stops where each x[i] initially equals 0. (The reads and writes of each x[i] are assumed to be atomic.)
[. . .] The algorithm [. . .] maintains an inductive invariant. Do you know what that invariant is?
If we associate a proposition @(m, i) for m = 0, 1, 2 that holds precisely when the execu-
tion of process i reaches location im, an invariant for the algorithm can be expressed as:
@(2, i) =⇒ (@(0, (i − 1) mod N) ∧ y[i] = 0)
         ∨ (@(1, (i − 1) mod N) ∧ y[i] = 1)
         ∨ (@(2, (i − 1) mod N) ∧ y[i] = 1).
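The correctness property that such an invariant serves to prove (once every process has stopped, at least one y[i] equals 1) can be checked exhaustively for small N by enumerating all interleavings. The following Python sketch (ours, not from the article) does so:

```python
def reachable_final_states(n):
    # Exhaustive exploration of the interleavings of Lamport's n-process
    # algorithm. Process i executes:
    #   pc 0:  x[i] := 1
    #   pc 1:  y[i] := x[(i - 1) mod n]
    #   pc 2:  stopped
    start = (tuple([0] * n), tuple([0] * n), tuple([None] * n))  # (pcs, x, y)
    seen, stack, finals = {start}, [start], []
    while stack:
        pcs, x, y = stack.pop()
        if all(pc == 2 for pc in pcs):
            finals.append(y)
            continue
        for i in range(n):
            if pcs[i] == 0:                       # atomic write x[i] := 1
                nx = x[:i] + (1,) + x[i + 1:]
                nxt = (pcs[:i] + (1,) + pcs[i + 1:], nx, y)
            elif pcs[i] == 1:                     # atomic read into y[i]
                ny = y[:i] + (x[(i - 1) % n],) + y[i + 1:]
                nxt = (pcs[:i] + (2,) + pcs[i + 1:], x, ny)
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return finals

# Once every process has stopped, at least one y[i] equals 1.
assert all(1 in y for y in reachable_final_states(3))
```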
Yet another kind of invariant occurs in the study of dynamical systems, where an
invariant is a region I ⊆ S of the state space S such that any trajectory starting in I or
entering it stays in I indefinitely in the future:
∀x ∈ I, ∀t ∈ T : φ(t, x) ∈ I,
where T is the time domain and φ : T × S → S is the evolution function. The connection
between dynamical system invariants and loop invariants is clear in the constructive
view (Section 1.2) and can be formally derived by modeling programs as dynamical
systems or using some other operational formalism [Furia et al. 2012]. The differential
invariants introduced in the study of hybrid systems [Platzer 2010] are also variations
of the invariants defined by dynamical systems.
2. EXPRESSING INVARIANTS: DOMAIN THEORY
To discuss and compare invariants, we need to settle on the expressiveness of the
underlying invariant language: what do we accept as a loop invariant?
The question involves general assertions, not just invariants; more generally, we
must make sure that any convention for invariants is compatible with the general
The example of Euclid’s algorithm given earlier, simple as it is, was already an
example of the domain-theory-based approach because of its use of a function gcd in
the invariant clause
gcd(Result, x) = gcd(a, b) (4)
corresponding to a weakening of the routine postcondition
3 No relation with the study of partially ordered sets, also called domain theory [Abramsky and Jung 1994].
Expressing the invariant in the same style requires several more lines since the defi-
nition of the greatest common divisor must be expanded for both sides of (4).
Even for such a simple example, the limitations of the atomic-assertion style are
clear: because it requires going back to basic logical constructs every time, it does not
scale.
Another example where we can contrast the two styles is any program that computes
the maximum of an array. In the atomic-assertion style, the postcondition will be
written
∀k ∈ Z : a.lower ≤ k ≤ a.upper implies a[k] ≤ Result (every element between bounds has a
value at most Result)
∃k ∈ Z : a.lower ≤ k ≤ a.upper ∧ a[k] = Result (some element between bounds has the
value Result).
This property is the definition of the maximum and hence needs to be written some-
where. If we define a function “max” to capture this definition, the specification becomes
simply
Result = max(a).
The difference between the two styles becomes critical when we come to the invari-
ant of programs computing an array’s maximum. Two different algorithms appear in
Section 4.1. The first (Section 4.1.1) is the most straightforward; it moves an index i
from a.lower + 1 to a.upper, updating Result if the current value is higher than the
current result (initialized to the first element a [a.lower]). With a domain theory on
arrays, the function max will be available as well as a notion of slice, where the slice
a [i..j] for integers i and j is the array consisting of elements of a in the range [i, j].
Then the invariant is simply
Result = max(a [a.lower..i]),
which is ensured by initialization and, on exit when i = a.upper, yields the postcondition
Result = max(a) (based on the domain theory property that a [a.lower..a.upper] = a).
The atomic-assertion invariant would be a variation of the expanded postcondition:
∀k ∈ Z : a.lower ≤ k ≤ i implies a[k] ≤ Result
∃k ∈ Z : a.lower ≤ k ≤ i ∧ a[k] = Result.
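For concreteness, here is a Python rendering of the one-way algorithm (ours; 0-based indexing stands in for a.lower..a.upper, and the paper's figure is not reproduced), with the essential invariant checked dynamically:

```python
def max_one_way(a):
    # One-variable loop of Section 4.1.1 (sketch). Assumes a is non-empty.
    result = a[0]          # initialized to the first element
    i = 0
    while i != len(a) - 1:
        i += 1
        if a[i] > result:
            result = a[i]
        # essential invariant: result == max(a[0..i])
        assert result == max(a[:i + 1])
    return result
```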
Consider now a different algorithm for the same problem (Section 4.1.2), which works
by exploring the array from both ends, moving the left cursor i up if the element at
i is less than the element at j and otherwise moving the right cursor j down. The
atomic-assertion invariant can be written with an additional level of quantification:
∃m : (∀k ∈ Z : a.lower ≤ k ≤ a.upper implies a[k] ≤ m) ∧ (∃k ∈ Z : i ≤ k ≤ j ∧ a[k] = m). (5)
Alternatively, we can avoid quantifier alternation using the characterization based on
the complement property that the maximal element is not outside the slice a[i..j]:
∀k ∈ Z : (a.lower ≤ k < i ∨ j < k ≤ a.upper) =⇒ a[k] ≤ a[i] ∨ a[k] ≤ a[j]. (6)
3. CLASSIFYING INVARIANTS
Loop invariants and their constituent clauses can be classified along two dimensions:
—By their role with respect to the postcondition (Section 3.1), leading us to distinguish
between “essential” and “bounding” invariant properties.
—By the transformation technique that yields the invariant from the postcondition
(Section 3.2). Here we have techniques such as uncoupling and constant relaxation.
3.1. Classification by Role
In the typical loop strategy described in Section 1.2, it is essential that successive iter-
ations of the loop body remain in the convergence regions where the generalized form
of the postcondition is defined. The corresponding conditions make up the bounding
invariant; the clause describing the generalized postcondition is the essential invari-
ant. The bounding invariant for the greatest common divisor algorithm consists of the
clauses
Result > 0
x > 0.
The essential clause is
gcd(Result, x) = gcd(a, b),
yielding the postcondition when Result = x.
For the one-way maximum program, the bounding invariant is
a.lower ≤ i ≤ a.upper
and the essential invariant is
Result = max (a[a.lower..i]),
yielding the postcondition when i = a.upper. Note that the essential invariant would
not be defined without the bounding invariant, since the slice a[a.lower..i] would be
undefined (if i > a.upper) or would be empty and have no maximum (if i < a.lower).
For the two-way maximum program, the bounding invariant is
a.lower ≤ i ≤ j ≤ a.upper
and the essential invariant is
max(a) = max(a[i.. j]),
yielding the postcondition when i = j. Again, the essential invariant would not be
always defined without the bounding invariant.
The separation between bounding and essential invariants is often straightforward
as in these examples. In case of doubt, the following observation will help distinguish.
The functions involved in the invariant (and often, those of the postcondition) are often
partial; for example:
—gcd(u, v) is only defined if u and v are both nonzero (and, since we consider natural
integers only in the example, positive).
—For an array a and an integer i, a[i] is only defined if i ∈ [a.lower..a.upper], and the
slice a[i..j] is nonempty only if [i..j] ⊆ [a.lower..a.upper].
—max(a) is only defined if the array a is not empty.
Since the essential clauses, obtained by postcondition generalization, use gcd (Result,
x) and (in the array algorithms) array elements and maxima, the invariants must
include the bounding clauses as well to ensure that these essential clauses are
meaningful. A similar pattern applies to most of the invariants studied later.
3.2. Classification by Generalization Technique
The essential invariant is a mutation (often, a weakening) of the loop’s postcondition.
The following mutation techniques are particularly common:
4.1.2. Maximum: Two-Variable Loop. The one-way maximum algorithm results from ar-
bitrarily choosing to apply constant relaxation to either a.lower or (as in the previous
version) a.upper. Guided by a symmetry concern, we may choose double constant relax-
ation, yielding another maximum algorithm max_two_way, which traverses the array
from both ends. If i and j are the two relaxing variables, the loop body either increases
i or decreases j. When i = j, the loop has processed all of a, and hence i and j indicate
the maximum element.
The specification (precondition and postcondition) is the same as for the previous
algorithm. Figure 4 shows an implementation.
It is again trivial to prove initiation. Consecution relies on the following two domain
theory properties:
j > i ∧ a[i] ≥ a[j] =⇒ max(a[i..j]) = max(a[i..j − 1]) (8)
i < j ∧ a[j] ≥ a[i] =⇒ max(a[i..j]) = max(a[i + 1..j]). (9)
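A Python rendering of the two-way loop (ours; 0-based, the paper's Figure 4 is not reproduced), where each narrowing step is justified by property (8) or (9):

```python
def max_two_way(a):
    # Two-variable loop of Section 4.1.2 (sketch). Assumes a is non-empty.
    i, j = 0, len(a) - 1
    while i != j:
        # essential invariant: max(a) == max(a[i..j])
        if a[i] > a[j]:
            j -= 1     # property (8): dropping a[j] preserves the maximum
        else:
            i += 1     # property (9): dropping a[i] preserves the maximum
    return a[i]
```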
4.1.3. Search in an Unsorted Array. The following routine has_sequential returns the
position of an occurrence of an element key in an array a or, if key does not appear, a
special value. The algorithm applies to any sequential structure but is shown here for
arrays. For simplicity, we assume that the lower bound a.lower of the array is 1, so that
we can choose 0 as the special value. Obviously this assumption is easy to remove for
generality: just replace 0, as a possible value for Result, by a.lower −1.
The specification may use the domain theory notation elements (a) to express the set
of elements of an array a. A simple form of the postcondition is
Result ≠ 0 ⇐⇒ key ∈ elements(a), (10)
which just records whether the key has been found. We will instead use a form that
also records where the element appears if present:
Result ≠ 0 =⇒ key = a[Result] (11)
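A Python sketch of the routine (ours; the paper's figure is not reproduced). Following the section's convention, positions are 1-based (a.lower = 1) and 0 encodes "key does not appear":

```python
def has_sequential(a, key):
    # Sequential search over an unsorted sequence.
    i = 0
    result = 0
    while result == 0 and i != len(a):
        i += 1
        if a[i - 1] == key:        # a[i] in the paper's 1-based notation
            result = i
    return result
```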
4 Note that in this example it is OK for the array to be empty, so there is no precondition on a.upper, although
general properties of arrays imply that a.upper ≥ 0; the value 0 corresponds to an empty array.
4.1.4. Binary Search. Binary search works on sorted arrays by iteratively halving a
segment of the array where the searched element may occur. The search terminates
either when the element is found or when the segment becomes empty, implying that
the element appears nowhere in the array.
As already remarked by Knuth many years ago [Knuth 2011, Vol. 3, Sec. 6.2.1]:
Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly
tricky, and many programmers have done it wrong the first few times they tried.
Reasoning carefully on the specification (at the domain theory level) and the resulting
invariant helps avoid mistakes.
For the present discussion, it is interesting that the postcondition is the same as for
sequential search (Section 4.1.3), so that we can see where the generalization strategy
differs, taking advantage of the extra property that the array is sorted.
The algorithm and implementation now have the precondition
sorted (a),
where the domain theory predicate sorted (a), defined as
∀ j ∈ [a.lower..a.upper − 1] : a [ j] ≤ a [ j + 1],
expresses that an array is sorted upward. The domain theorem on which binary search
rests is that, for any value mid in [i..j] (where i and j are valid indexes of the array),
and any value key of type T (the type of the array elements):
key ∈ elements(a[i..j]) ⇐⇒ (key ≤ a[mid] ∧ key ∈ elements(a[i..mid]))
                         ∨ (key > a[mid] ∧ key ∈ elements(a[mid + 1..j])). (13)
This property leads to the key insight behind binary search, whose invariant follows
from the postcondition by variable introduction, mid serving as that variable.
Formula (13) is not symmetric with respect to i and j; a symmetric version is possible,
using, in the second disjunct, “≥” rather than “>” and mid rather than mid + 1. The
form given in (13) has the advantage of using two mutually exclusive conditions in the
comparison of key to a [mid]. As a consequence, we can limit ourselves to a value mid
chosen in [i..j − 1] (rather than [i..j]) since the first disjunct does not involve j and the
second disjunct cannot hold for mid = j (the slice a [mid + 1..j] being then empty). All
these observations and choices have direct consequences on the program text but are
better handled at the specification (theory) level.
We will start for simplicity with the version (10) of the postcondition that only records
presence or absence, repeated here for ease of reference:
Result ≠ 0 ⇐⇒ key ∈ elements(a). (14)
Duplicating the right-hand side of (14), writing a in slice form a[1..a.upper], and ap-
plying constant relaxation twice, to the lower bound 1 and the upper bound a.upper,
yields the essential invariant:
key ∈ elements(a[i.. j]) ⇐⇒ key ∈ elements(a) (15)
with the bounding invariant
1 ≤ i ≤ mid + 1 ∧ 1 ≤ mid ≤ j ≤ a.upper ∧ i ≤ j,
which combines the assumptions on mid necessary to apply (13)—also assumed in
(15)—and the additional knowledge that 1 ≤ i and j ≤ a.upper.
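A Python sketch of a binary search shaped by these observations (ours; the paper's figure is not reproduced). It picks mid in [i..j−1], narrows [i..j] using the two mutually exclusive disjuncts of (13), and uses 1-based positions with 0 meaning "absent":

```python
def binary_search(a, key):
    # Assumes a is sorted in nondecreasing order.
    if not a:
        return 0
    i, j = 1, len(a)
    while i != j:
        # invariant (15): key in a[i..j]  iff  key in a  (1-based slices)
        mid = (i + j) // 2          # mid lies in [i..j-1] since i < j
        if key <= a[mid - 1]:       # first disjunct of (13)
            j = mid
        else:                       # second disjunct: key > a[mid]
            i = mid + 1
    return i if a[i - 1] == key else 0
```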
4.2.1. Integer Division. The algorithm for integer division by successive differences com-
putes the integer quotient q and the remainder r of two integers m and n. The postcon-
dition reads
0 ≤ r < m
n = m · q + r.
The loop invariant consists of a bounding clause and an essential clause. The latter
is simply an element of the postcondition:
n = m · q + r.
The bounding clause weakens the other postcondition clause by keeping only its first
part:
0 ≤ r,
so that the dropped condition r < m becomes the exit condition. As a consequence,
r ≥ m holds in the loop body, and the assignment r:= r − m maintains the invariant
property 0 ≤ r. It is straightforward to prove the implementation in Figure 7 correct
with respect to this specification.
4.2.2. Greatest Common Divisor (with Division). Euclid’s algorithm for the greatest com-
mon divisor offers another example where clearly separating between the underlying
mathematical theory and the implementation yields a concise and convincing correct-
ness argument. Sections 1.3 and 2 previewed this example by using the form that
repeatedly subtracts one of the values from the other; here we will use the version that
uses division.
The greatest common divisor gcd(a, b) is the greatest integer that divides both a and
b, defined by the following axioms, where a and b are nonnegative integers such that
at least one of them is positive (“\\” denotes integer remainder):
a\\ gcd(a, b) = 0
b\\ gcd(a, b) = 0
∀d ∈ N : (a\\d = 0) ∧ (b\\d = 0) =⇒ d ≤ gcd(a, b).
5 The variant is simply y, which is guaranteed to decrease at every iteration and can be bounded from below
by the property 0 ≤ x\\y < y.
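A Python sketch of the division form of the loop (ours; the paper's figure is not reproduced), with y as the variant:

```python
def gcd_division(a, b):
    # Division form of Euclid's algorithm. Assumes a > 0 and b >= 0.
    x, y = a, b
    while y != 0:
        # essential invariant: gcd(x, y) == gcd(a, b); variant: y
        x, y = y, x % y    # Python's % plays the role of the paper's "\\"
    return x
```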
but a more efficient algorithm squares m for all 1s values in the binary representation
of n. In practice, there is no need to compute this binary representation.
Given the postcondition
Result = m^n,
we first rewrite it into the obviously equivalent form Result · 1^1 = m^n. Then, the
invariant is obtained by double constant relaxation: the essential property
Result · x^y = m^n
is easy to obtain initially (by setting Result, x, and y to 1, m, and n), yields the post-
condition when y = 0, and can be maintained while progressing toward this situation
thanks to the domain theory properties
x^(2z) = (x^2)^z (16)
x^z = x · x^(z−1). (17)
Using only (17) would lead to the inefficient (n − 1)-multiplication algorithm, but we
may use (16) for even values of y = 2z. This leads to the algorithm in Figure 9.
Proving initiation is trivial. Consecution is a direct application of the (16) and (17)
properties.
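A Python sketch of the Figure 9 loop (ours; variable names follow the text), maintaining the essential property through (16) and (17):

```python
def power(m, n):
    # Exponentiation by squaring. Assumes n >= 0.
    result, x, y = 1, m, n
    while y != 0:
        # essential invariant: result * x**y == m**n
        if y % 2 == 0:
            x, y = x * x, y // 2           # property (16), for even y
        else:
            result, y = result * x, y - 1  # property (17)
    return result                          # on exit y == 0, so result == m**n
```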
4.2.4. Long Integer Addition. The algorithm for long integer addition computes the sum
of two integers a and b given in any base as arrays of positional digits starting from
the least significant position. For example, the array sequence 3, 2, 0, 1 represents
the number 138 in base 5 as 3 · 50 + 2 · 51 + 0 · 52 + 1 · 53 = 138. For simplicity of
representation, in this algorithm we use arrays indexed by 0, so that we can readily
The postcondition of the long integer addition algorithm has two clauses. One speci-
fies that the pairwise sum of elements in a and b encodes the same number as Result:
Σ_{k=0}^{n−1} (a[k] + b[k]) · base^k = Σ_{k=0}^{n} Result[k] · base^k. (18)
Result may have one more digit than a or b, hence the different bound in the two sums,
where n denotes a’s and b’s length (normally written a.count and b.count). The second
postcondition clause is the consistency constraint that Result is indeed a representa-
tion in base base:
has_base(Result, base), (19)
where the predicate has_base is defined by a quantification over the array’s length:
has_base(v, b) ⇐⇒ ∀k ∈ N : 0 ≤ k < v.count =⇒ 0 ≤ v[k] < b.
Both postcondition clauses appear mutated in the loop invariant. First, we rewrite
Result in slice form Result [0..n] in (18) and (19). The first essential invariant clause
follows by applying constant relaxation to (19), with the variable expression i − 1
replacing constant n:
has_base(Result[0..i − 1], base).
The decrement is required because the loop updates i at the end of each iteration; it is
a form of aging (see Section 3.2).
To get the other part of the essential invariant, we first highlight the last term in
the summation on the right-hand side of (18):
∑_{k=0}^{n−1} (a[k] + b[k]) · base^k = Result[n] · base^n + ∑_{k=0}^{n−1} Result[k] · base^k.
We then introduce variables i and carry, replacing constants n and Result[n]. Variable
i is the loop counter, also mentioned in the other invariant clause; carry, as the name
indicates, stores the remainder of each pairwise addition, which will be carried over to
the next digit.
The domain property that the integer division by b of the sum of two base-b digits
v1, v2 is less than b (all variables are integers):
b > 0 ∧ v1, v2 ∈ [0..b − 1] =⇒ (v1 + v2) // b ∈ [0..b − 1]
suggests the bounding invariant clause 0 ≤ carry < base. Figure 10 shows the re-
sulting implementation, where the most significant digit is set after the loop before
terminating.
Initiation is trivial under the convention that an empty sum evaluates to zero. Con-
secution easily follows from the domain theoretic properties of the operations in the
loop body, and in particular from how the carry and the current digit d are set in each
iteration.
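A Python sketch of the scheme (ours, not the verified Eiffel of Figure 10; it assumes a and b are little-endian digit lists of the same length n):

```python
def long_add(a, b, base):
    """Sum of two little-endian digit arrays in the given base.

    The result has one digit more than the operands, as in postcondition (18)."""
    n = len(a)
    result = [0] * (n + 1)
    carry = 0                   # bounding invariant: 0 <= carry < base
    for i in range(n):
        # invariant: value(result[0..i-1]) + carry * base**i
        #            == value(a[0..i-1]) + value(b[0..i-1])
        d = a[i] + b[i] + carry
        result[i] = d % base    # current digit, in [0..base-1]: has_base holds
        carry = d // base
    result[n] = carry           # most significant digit, set after the loop
    return result

# [3, 2, 0, 1] encodes 138 in base 5; 138 + 138 = 276, encoded [1, 0, 1, 2, 0]
assert long_add([3, 2, 0, 1], [3, 2, 0, 1], 5) == [1, 0, 1, 2, 0]
```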
4.3. Sorting
A number of important algorithms sort an array based on pairwise comparisons and
swaps of elements. The following domain theory notations will be useful for arrays a
and b:
—perm (a,b) expresses that the arrays are permutations of each other (their elements
are the same, each occurring the same number of times as in the other array).
—sorted (a) expresses that the array elements appear in increasing order: ∀i ∈
[a.lower..a.upper − 1] : a[i] ≤ a[i + 1].
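These two predicates translate directly into executable checks, usable as test oracles for the sorting postconditions (a Python rendition, ours; the trailing underscore in sorted_ only avoids shadowing Python's built-in):

```python
from collections import Counter

def perm(a, b):
    """a and b are permutations of each other: same elements, same multiplicities."""
    return Counter(a) == Counter(b)

def sorted_(a):
    """The elements of a appear in increasing order: adjacent pairs are ordered."""
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

assert perm([2, 1, 1], [1, 2, 1]) and not perm([1, 2], [1, 1])
assert sorted_([1, 1, 3]) and not sorted_([2, 1])
```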
The sorting algorithms considered sort an array in place, with the specification:
sort(a : ARRAY [T ])
require
a.lower = 1
a.count = n ≥ 1
ensure
perm (a, old a)
sorted (a)
The type T indicates a generic type that is totally ordered and provides the comparison
operators <, ≤, ≥, and >. The precondition that the array be indexed from 1 and
element no greater than pivot; this is not enforced by the loop, whose invariant leaves
the value of a [low] unconstrained. In particular, in the special case of all elements
being no less than pivot, low and Result are set to zero after the loop.
In the correctness proof, it is useful to discuss the cases a [low] < pivot and a [low] ≥
pivot separately when proving consecution. In the former case, we combine a [1..low −
1] ≤ pivot and a [low] < pivot to establish the backward substitution a [1..low] ≤ pivot.
In the latter case, we combine low = high, a [high + 1..n] ≥ pivot, and a [low] ≥ pivot
to establish the backward substitution a [low..n] ≥ pivot. The other details of the proof
are straightforward.
4.3.2. Selection Sort. Selection sort is a straightforward sorting algorithm based on a
simple idea: to sort an array, find the smallest element, put it in the first position, and
repeat recursively from the second position on. Pre- and postcondition are the usual
ones for sorting (see Section 4.3) and hence require no further comment.
The first postcondition clause perm (a, old a) is also an essential loop invariant:
perm (a, old a). (20)
If we introduce a variable i to iterate over the array, another essential invariant clause
is derived by writing a in slice form a[1..n] and then by relaxing n into i:
sorted (a [1..i]) (21)
with the bounding clause
1 ≤ i ≤ n, (22)
which ensures that the sorted slice a[1..i] is always nonempty. The final component of
the invariant is also an essential weakening of the postcondition but is less straight-
forward to derive by syntactic mutation. If we split a [1..n] into the concatenation
a [1..i − 1] ◦ a [i..n], we notice that sorted (a [1..i − 1] ◦ a [i..n]) implies
∀k ∈ [i..n] : a [1..i − 1] ≤ a[k] (23)
as a special case. Formula (23) guarantees that the slice a[i..n], which has not been
sorted yet, contains elements that are no smaller than any of those in the sorted slice
a[1..i − 1].
The loop invariants (20)–(22) apply—possibly with minimal changes due to inessen-
tial details in the implementation—for any sorting algorithm that sorts an array se-
quentially, working its way from lower to upper indices. To implement the behavior
specific to selection sort, we introduce an inner loop that finds the minimum element in
the slice a [i..n], which is not yet sorted. To this end, it uses variables j and m: j scans
the slice sequentially starting from position i + 1; m points to the minimum element
found so far. Correspondingly, the inner loop’s postcondition is a[m] ≤ a[i..n], which
induces the essential invariant clause
a [m] ≤ a[i.. j − 1] (24)
specific to the inner loop, by constant relaxation and aging. The outer loop’s invariant
(23) clearly also applies to the inner loop—which does not change i or n—where it
implies that the element in position m is an upper bound on all elements already
sorted:
a [1.. i − 1] ≤ a[m]. (25)
Also specific to the inner loop are more complex bounding invariants relating the values
of i, j, and m to the array bounds:
1 ≤ i < j ≤ n + 1
i ≤ m < j.
The implementation in Figure 12 follows these invariants. The outer loop’s only task is
then to swap the “minimum” element pointed to by m with the lowest available position
pointed to by i.
The most interesting aspect of the correctness proof is proving consecution of the
outer loop’s invariant clause (21), and in particular that a[i] ≤ a[i + 1]. To this end,
notice that (24) guarantees that a [m] is the minimum of all elements in positions from
i to n, and that (25) is an upper bound on the other elements in positions from 1 to
i − 1. In particular, a[m] ≤ a[i + 1] and a[i − 1] ≤ a[m] hold before the swap on line
30. After the swap, a[i] equals the previous value of a[m]; thus a[i − 1] ≤ a[i] ≤ a[i +
1] holds as required. A similar reasoning proves the inductiveness of the main loop’s
other invariant clause (23).
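A Python rendition of selection sort (ours, 0-indexed, mirroring the structure of Figure 12; the run-time assertions instantiate the invariant clauses discussed above):

```python
def selection_sort(a):
    """Sort a in place by repeatedly selecting the minimum of the unsorted slice."""
    n = len(a)
    snapshot = list(a)                  # old a, to check perm (20) at the end
    for i in range(n):                  # outer invariant (21): a[:i] is sorted
        m = i                           # index of the minimum found so far
        for j in range(i + 1, n):       # inner invariant (24): a[m] <= a[i:j]
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]         # swap the minimum into position i
        assert all(a[k] <= a[k + 1] for k in range(i))  # consecution of (21)
    assert a == sorted(snapshot)        # perm (20) and sortedness together
    return a

assert selection_sort([3, 1, 2, 1]) == [1, 1, 2, 3]
```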
4.3.3. Insertion Sort. Insertion sort is another suboptimal sorting algorithm that is,
however, simple to present and implement and reasonably efficient on arrays of small
size. As the name suggests, insertion sort hinges on the idea of rearranging elements
in an array by inserting them in their correct positions with respect to the sorting
order; insertion is done by shifting the elements to make room for insertion. Pre- and
postcondition are the usual ones for sorting (see Section 4.3 and the comments in the
previous subsections).
The main loop’s essential invariant is as in selection sort (Section 4.3.2) and other
similar algorithms, as it merely expresses the property that the sorting has progressed
up to position i and has not changed the array content:
sorted (a[1..i]) (26)
perm (a, old a). (27)
This essential invariant goes together with the bounding clause 1 ≤ i ≤ n.
The main loop includes an inner loop, whose invariant captures the specific strategy
of insertion sort. The outer loop’s invariant (27) must be weakened, because the inner
loop overwrites a [i] while progressively shifting to the right elements in the slice
a [1..j]. If a local variable v stores the value of a [i] before entering the inner loop, we
can weaken (27) as:
perm (a [1.. j] ◦ v ◦ a[ j + 2..n], old a), (28)
where “◦” is the concatenation operator; that is, a’s element at position j + 1 is the
current candidate for inserting v—the value temporarily removed. After the inner loop
terminates, the outer loop will put v back into the array at position j + 1 (line 28 in
Figure 13), thus restoring the stronger invariant (27) (and establishing inductiveness
for it).
The clause (26), crucial for the correctness argument, is also weakened in the inner
loop. First, we “age” i by replacing it with i − 1; this corresponds to the fact that the
outer loop increments i at the beginning, and will then re-establish (26) only at the end
of each iteration. Therefore, the inner loop can only assume the weaker invariant:
sorted (a [1..i − 1]) (29)
that is not invalidated by shifting (which only temporarily duplicates elements). Shift-
ing has, however, another effect: since the slice a[j + 1..i] contains elements shifted up
from the sorted portion, the slice a[j + 1..i] is itself sorted, thus the essential invariant:
sorted (a [ j + 1..i]). (30)
We can derive the pair of invariants (29)–(30) from the inner loop’s postcondition (26):
write a [1..i] as a [1..i − 1] ◦ a[i..i]; weaken the formula sorted (a [1..i − 1] ◦ a[i..i]) into
the conjunction of sorted ( a [1..i − 1]) and sorted (a[i..i]); and replace one occurrence of
constant i in the second conjunct by a fresh variable j and age to derive sorted (a [j +
1..i]).
Finally, there is another essential invariant, specific to the inner loop. Since the
loop’s goal is to find a position, pointed to by j + 1, before i where v can be inserted, its
postcondition is:
v ≤ a [ j + 1..i], (31)
which is also a suitable loop invariant, combined with a bounding clause that constrains
j and i:
0 ≤ j < i ≤ n. (32)
Overall, clauses (28)–(32) are the inner loop invariant, and Figure 13 shows the match-
ing implementation.
As usual for this kind of algorithm, the crux of the correctness argument is proving
that the outer loop’s essential invariant is inductive, based on the inner loop’s. The
formal proof uses the following informal argument. Formulas (29) and (31) imply that
inserting v at j + 1 does not break the sortedness of the slice a [1..j + 1]. Furthermore,
(30) guarantees that the elements in the “upper” slice a [j + 1..i] are also sorted with
a [j] ≤ a[j + 1] ≤ a[j + 2]. (The detailed argument would discuss the cases j = 0,
0 < j < i − 1, and j = i − 1.) In all, the whole slice a [1..i] is sorted, as required by (26).
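A Python rendition of insertion sort (ours, 0-indexed, so the bounding clause (32) shifts accordingly; v is the temporarily removed value as in the derivation of (28)):

```python
def insertion_sort(a):
    """Sort a in place by inserting each element into the sorted prefix."""
    for i in range(1, len(a)):      # outer invariant (26): a[:i] is sorted
        v = a[i]                    # value temporarily removed from the array
        j = i - 1
        while j >= 0 and a[j] > v:  # shifted slice stays sorted and > v, as in (30)-(31)
            a[j + 1] = a[j]         # shift right; duplicates a[j] temporarily
            j -= 1
        a[j + 1] = v                # reinsert v, restoring perm(a, old a) (27)
    return a

assert insertion_sort([4, 2, 5, 1, 3]) == [1, 2, 3, 4, 5]
```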
4.3.4. Bubble Sort (Basic). As a sorting method, bubble sort is known not for its perfor-
mance but for its simplicity [Knuth 2011, Vol. 3, Sec. 5.2.2]. It relies on the notion of
inversion: a pair of elements that are not ordered, that is, such that the first is greater
than the second. The straightforward observation that an array is sorted if and only
if it has no inversions suggests to sort an array by iteratively removing all inversions.
Let us present invariants that match such a high-level strategy, deriving them from
the postcondition (which is the same as the other sorting algorithms of this section).
The postcondition perm (a, old a) that a’s elements be not changed is also an invariant
of the two nested loops used in bubble sort. The other postcondition sorted (a) is instead
weakened, but in a way different than in other sorting algorithms seen before. We
introduce a Boolean flag swapped, which records if there is some inversion that has
been removed by swapping a pair of elements. When swapped is false after a complete
scan of the array a, no inversions have been found, and hence a is sorted. Therefore,
we use ¬ swapped as exit condition of the main loop, and the weakened postcondition
¬ swapped =⇒ sorted (a) (33)
as its essential loop invariant.
The inner loop performs a scan of the input array that compares all pairs of adjacent
elements and swaps them when they are inverted. Since the scan proceeds linearly
from the first element to the last one, we get an essential invariant for the inner loop
by replacing n by i in (33) written in slice form:
¬ swapped =⇒ sorted (a[1..i]). (34)
The usual bounding invariant 1 ≤ i ≤ n and the outer loop’s invariant clause perm (a,
old a) complete the inner loop invariant.
The implementation is now straightforward to write as in Figure 14. The inner loop,
in particular, sets swapped to True whenever it finds some inversion while scanning.
This signals that more scans are needed before the array is certainly sorted.
Verifying the correctness of the annotated program in Figure 14 is easy, because the
essential loop invariants (33) and (34) are trivially true in all iterations where swapped
is set to True. On the other hand, this style of specification makes the termination
argument more involved: the outer loop’s variant (line 28 in Figure 14) must explicitly
refer to the number of inversions left in a, which are decreased by complete executions
of the inner loop.
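The strategy can be sketched in Python as follows (ours; the swapped flag plays the same role as in Figure 14):

```python
def bubble_sort(a):
    """Basic bubble sort: scan and swap adjacent inversions until none is left."""
    n = len(a)
    swapped = True
    while swapped:              # exit condition: a full scan found no inversion
        swapped = False
        for i in range(n - 1):  # inner invariant (34): not swapped implies
            if a[i] > a[i + 1]: #   sorted(a[:i+1])
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return a                    # not swapped, hence sorted(a) by (33)

assert bubble_sort([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]
```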
4.3.5. Bubble Sort (Improved). The inner loop in the basic version of bubble sort—
presented in Section 4.3.4—always performs a complete scan of the n-element array
a. This is often redundant, because swapping adjacent inverted elements guarantees
that the largest misplaced element is sorted after each iteration. Namely, the largest
element reaches the rightmost position after the first iteration, the second-largest one
reaches the penultimate position after the second iteration, and so on. This section
describes an implementation of bubble sort that takes advantage of this observation to
improve the running time.
The improved version still uses two nested loops. The outer loop’s essential invariant
has two clauses:
sorted (a [i..n]) (35)
is a weakening of the postcondition that encodes the knowledge that the “upper” part
of array a is sorted, and
i < n =⇒ a[1..i] ≤ a[i + 1] (36)
specifies that the elements in the unsorted slice a[1..i] are no larger than the first
“sorted” element a[i + 1]. The expression a[1..i] ≤ a[i + 1] is a mutation (constant
relaxation and aging) of a[1..n] ≤ a[n], which is, in turn, a domain property following
from the postcondition. Variable i is now used in the outer loop to mark the portion still
to be sorted; correspondingly, (36) is well defined only when i < n, and the bounding
invariant clause 1 ≤ i ≤ n is also part of the outer loop’s specification.
Continuing with the same logic, the inner loop’s postcondition:
a[1..i] ≤ a[i] (37)
states that the largest element in the slice a [1..i] has been moved to the highest
position. Constant relaxation, replacing i (not changed by the inner loop) with a fresh
variable j, yields a new essential component of the inner loop’s invariant:
a[1.. j] ≤ a[ j]. (38)
The outer loop’s invariant and the bounding clause 1 ≤ j ≤ i complete the specification
of the inner loop. Figure 15 displays the corresponding implementation.
The correctness proof follows standard strategies. In particular, the inner loop's postcondition
(37), that is, the inner loop's invariant when j = i, implies a[i − 1] ≤ a[i]
as a special case. This fact combines with the other clause (36) to establish the induc-
tiveness of the main loop’s essential clause:
sorted (a [i..n]).
Finally, proving termination is trivial for this program because each loop has an asso-
ciated iteration variable that is unconditionally incremented or decremented.
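A Python sketch of the improved version (ours, 0-indexed; each pass of the inner loop establishes postcondition (37) by moving the maximum of the unsorted slice to its right end):

```python
def bubble_sort_improved(a):
    """Bubble sort where the sorted upper slice a[i+1:] grows after each pass."""
    n = len(a)
    for i in range(n - 1, 0, -1):  # outer invariants (35)-(36): a[i:] is sorted
        for j in range(i):         # inner invariant (38): max of a[:j+1] is at a[j]
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
        # postcondition (37): the maximum of a[:i+1] now sits at a[i]
    return a

assert bubble_sort_improved([5, 1, 4, 2, 8]) == [1, 2, 4, 5, 8]
```

Termination is immediate: both loop counters are unconditionally decremented or incremented, as the text observes.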
4.3.6. Comb Sort. In an attempt to improve performance in critical cases, comb sort
generalizes bubble sort based on the observation that small elements initially stored
in the right-most portion of an array require a large number of iterations to be sorted.
This happens because bubble sort swaps adjacent elements; hence, it takes n scans of
an array of size n just to bring the smallest element from the right-most nth position to
the first one, where it belongs. Comb sort adds the flexibility of swapping nonadjacent
elements, thus allowing for a faster movement of small elements from right to left. A
sequence of nonadjacent equally spaced elements also conveys the image of a comb’s
teeth, hence the name “comb sort.”
Let us make this intuition rigorous and generalize the loop invariants, and the
implementation, of the basic bubble sort algorithm described in Section 4.3.4. Comb
sort is also based on swapping elements, therefore the—now well-known—invariant
perm (a, old a) also applies to its two nested loops. To adapt the other loop invariant
(33), we need a generalization of the predicate sorted that fits the behavior of comb
sort. Predicate gap_sorted (a, d), defined as:
gap sorted (a, d) ⇐⇒ ∀k ∈ [a.lower..a.upper − d] : a[k] ≤ a[k + d]
holds for arrays a such that the subsequence of d-spaced elements is sorted. Notice
that, for d = 1, gap_sorted reduces to sorted:
gap sorted (a, 1) ⇐⇒ sorted (a).
This fact will be used to prove the postcondition from the loop invariant upon
termination.
With this new piece of domain theory, we can easily generalize the essential and
bounding invariants of Figure 14 to comb sort. The outer loop considers decreasing
gaps; if variable gap stores the current value, the bounding invariant
1 ≤ gap ≤ n
defines its variability range. Precisely, the main loop starts with gap = n and
terminates with gap = 1, satisfying the essential invariant:
¬swapped =⇒ gap sorted (a, gap). (39)
The correctness of comb sort does not depend on how gap is decreased, as long as it
eventually reaches 1; if gap is initialized to 1, comb sort behaves exactly as bubble sort.
In practice, it is customary to divide gap by some chosen parameter c at every iteration
of the main loop.
Let us now consider the inner loop, which compares and swaps the subsequence of
d-spaced elements. The bubble sort invariant (34) generalizes to:
¬swapped =⇒ gap sorted (a [1..i − 1 + gap], gap) (40)
and its matching bounding invariant is:
1 ≤ i < i + gap ≤ n + 1
so that when i = n + 1 − gap the inner loop terminates and (40) is equivalent to (39).
This invariant follows from constant relaxation and aging; the substituted expression
i − 1 + gap is more involved, to accommodate how i is used and updated in the inner
loop, but is otherwise semantically straightforward.
The complete implementation is shown in Figure 16. The correctness argument is
exactly as for bubble sort in Section 4.3.4 but exploits the properties of the generalized
predicate gap_sorted instead of the simpler sorted.
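The generalized predicate and the resulting algorithm can be sketched in Python as follows (ours; the shrink factor 1.3 is a customary choice for the parameter c, not required for correctness):

```python
def gap_sorted(a, d):
    """Each subsequence of d-spaced elements of a is sorted."""
    return all(a[k] <= a[k + d] for k in range(len(a) - d))

def comb_sort(a, shrink=1.3):
    """Comb sort: bubble sort generalized to decreasing gaps."""
    n = len(a)
    gap, swapped = n, True
    while gap > 1 or swapped:            # exit: gap == 1 and no inversion found,
        gap = max(1, int(gap / shrink))  #   so (39) gives gap_sorted(a, 1) = sorted(a)
        swapped = False
        for i in range(n - gap):         # compare and swap gap-spaced elements
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a

data = [8, 4, 1, 56, 3, -44, 23, -6, 28, 0]
assert comb_sort(list(data)) == sorted(data)
assert gap_sorted([1, 9, 2, 10], 2) and not gap_sorted([2, 1], 1)
```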
4.4.1. Unbounded Knapsack Problem with Integer Weights. We have an unlimited collection
of items of n different types. An item of type k, for k = 1, . . . , n, has weight w[k] and
value v[k]. The unbounded knapsack problem asks what is the maximum overall value
that one can carry in a knapsack whose weight limit is a given weight. The attribute
“unbounded” refers to the fact that we can pick as many objects of any type as we
want: the only limit is given by the input value of weight, and by the constraint that
we cannot store fractions of an item—either we pick it or we don’t.
Any vector s of n nonnegative integers defines a selection of items whose overall
weight is given by the scalar product:
s · w = ∑_{1 ≤ k ≤ n} s[k] · w[k]
and whose overall value is similarly given by the scalar product s·v. Using this notation,
we introduce the domain theoretical function max_knapsack, which defines the solution
of the knapsack problem given a weight limit b and items of n types with weight and
value given by the vectors w and v:
max_knapsack(b, v, w, n) = κ ⇐⇒ (∃s ∈ N^n : s · w ≤ b ∧ s · v = κ) ∧ (∀t ∈ N^n : t · w ≤ b =⇒ t · v ≤ κ),
that is, the largest value achievable with the given limit. Whenever weights w, values v,
and number n of item types are clear from the context, we will abbreviate max_knapsack
(b, v, w, n) by just K(b).
The unbounded knapsack problem is NP-complete [Garey and Johnson 1979; Kellerer
et al. 2004]. It is, however, weakly NP-complete [Papadimitriou 1993], and in fact it
has a nice solution with pseudo-polynomial complexity based on a recurrence relation,
which suggests a straightforward dynamic programming algorithm. The recurrence
relation defines the value of K(b) based on the values K(b′) for b′ < b.
The base case is for b = 0. If we assume, without loss of generality, that no item has
null weight, it is clear that we cannot store anything in the knapsack without adding
some weight, and hence the maximum value attainable with a weight limit zero is also
zero: K(0) = 0. Let now b be a generic weight limit greater than zero. To determine the
value of K(b), we make a series of attempts as follows. First, we select some item of
type k, such that w[k] ≤ b. Then, we recursively consider the best selection of items for
a weight limit of b − w[k], and we set up a new selection by adding one item of type k
to it. The new configuration has weight no greater than b − w[k] + w[k] = b and value
v[k] + K(b − w[k]),
which is, by inductive hypothesis, the largest achievable by adding an element of type
k. Correspondingly, the recurrence relation defines K(b) as the maximum among all
values achievable by adding an object of some type:
K(b) = 0                                                          if b = 0     (41)
K(b) = max{ v[k] + K(b − w[k]) | k ∈ [1..n] and 0 ≤ w[k] ≤ b }    if b > 0.
The dynamic programming solution to the knapsack problem presented in this sec-
tion computes the recursive definition (41) for increasing values of b. It inputs arrays v
and w (storing the values and weights of all elements), the number n of element types,
and the weight-bound weight. The precondition requires that weight be nonnegative,
that all element weights w be positive, and that the arrays v and w be indexed from 1
to n:
weight ≥ 0
w>0
v.lower = w.lower = 1
v.upper = w.upper = n.
The last two clauses are merely for notational convenience and could be dropped. The
postcondition states that the routine returns the value K(weight) or, with more precise
notation:
Result = max_knapsack(weight, v, w, n).
The main loop starts with b = 0 and continues with unit increments until b = weight;
each iteration stores the value of K(b) in the local array m, so that m [weight] will contain
the final result upon termination. Correspondingly, the main loop’s essential invariant
follows by constant relaxation:
—Variable m [weight] replaces constant Result, thus connecting m to the returned
result.
—The range of variables [1..b] replaces constant weight, thus expressing the loop’s
incremental progress.
The loop invariant is thus:
∀y ∈ [0..b] : m[y] = max_knapsack(y, v, w, n), (42)
which goes together with the bounding clause
0 ≤ b ≤ weight
that qualifies b’s variability domain. With a slight abuse of notation, we concisely write
(42), and similar expressions, as:
m[0..b] = max_knapsack([0..b], v, w, n). (43)
The inner loop computes the maximum of definition (41) iteratively, for all element
types j, where 1 ≤ j ≤ n. To derive its essential invariant, we first consider its postcon-
dition (similarly as the analysis of selection sort in Section 4.3.2). Since the inner loop
terminates at the end of the outer loop’s body, the inner’s postcondition is the outer’s
invariant (43). Let us rewrite it by highlighting the value m[b] computed in the latest
iteration:
m[0..b − 1] = max_knapsack([0..b − 1], v, w, n) (44)
m[b] = best_value(b, v, w, n, n). (45)
Function best_value is part of the domain theory for knapsack, and it expresses the
“best” value that can be achieved given a weight limit of b, j ≤ n element types, and
assuming that the values K(b′) for lower-weight limits b′ < b are known:
best_value(b, v, w, j, n) = max{ v[k] + K(b − w[k]) | k ∈ [1..j] and 0 ≤ w[k] ≤ b }.
If we substitute variable j for constant n in (45), expressing the fact that the inner loop
tries one element type at a time, we get the inner loop essential invariant:
m[0..b − 1] = max_knapsack([0..b − 1], v, w, n)
m[b] = best_value(b, v, w, j, n).
The obvious bounding invariants 0 ≤ b ≤ weight and 0 ≤ j ≤ n complete the inner loop’s
specification. Figure 17 shows the corresponding implementation.
The correctness proof reverses the construction we highlighted following the usual
patterns seen in this section. In particular, notice that:
—When j = n the inner loop terminates, thus establishing (44) and (45).
—(44) and (45) imply (43) because the recursive definition (41) for some b only depends
on the previous values K(b′) for b′ < b, and (44) guarantees that m stores those values.
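A Python rendition of the dynamic programming solution (ours, mirroring Figure 17's structure; the maximum over an empty set is taken as 0, consistent with K(0) = 0):

```python
def max_knapsack(weight, v, w):
    """Unbounded knapsack by dynamic programming over recurrence (41).

    v[k], w[k]: value and (positive) weight of item type k; weight >= 0."""
    n = len(v)
    m = [0] * (weight + 1)          # m[b] will store K(b); base case K(0) = 0
    for b in range(1, weight + 1):  # outer invariant (43): m[0..b-1] = K(0..b-1)
        best = 0
        for k in range(n):          # inner invariant: best = best_value(b, v, w, k, n)
            if w[k] <= b:           # item type k fits within the limit b
                best = max(best, v[k] + m[b - w[k]])
        m[b] = best
    return m[weight]

# two item types: (value 10, weight 4) and (value 7, weight 3);
# best for limit 10: one of type 0 and two of type 1 (weight 4+3+3, value 24)
assert max_knapsack(10, [10, 7], [4, 3]) == 24
```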
4.4.2. Levenshtein Distance. The Levenshtein distance of two sequences s and t is the
minimum number of elementary edit operations (deletion, addition, or substitution of
one element in either sequence) necessary to turn s into t. The distance has a natural
recursive definition:
distance(s, t) =
    0                                            if m = n = 0
    m                                            if m > 0, n = 0
    n                                            if n > 0, m = 0
    distance(s[1..m − 1], t[1..n − 1])           if m > 0, n > 0, s[m] = t[n]
    1 + min( distance(s[1..m − 1], t),
             distance(s, t[1..n − 1]),
             distance(s[1..m − 1], t[1..n − 1]) )    if m > 0, n > 0, s[m] ≠ t[n]
where m and n, respectively, denote s’s and t’s length (written s.count and t.count when
s and t are arrays). The first three cases of the definition are trivial and correspond
to when s, t, or both are empty: the only way to get a nonempty string from an empty
one is by adding all the former’s elements. If s and t are both nonempty and their last
elements coincide, then the same number of operations as for the shorter sequences
s[1..m − 1] and t[1..n − 1] (which omit the last elements) are needed. Finally, if s’s and
t’s last elements differ, there are three options: (1) delete s[m] and then edit s[1..m− 1]
into t; (2) edit s into t[1..n− 1] and then add t[n] to the result; (3) substitute t[n] for s[m]
and then edit the rest s[1..m − 1] into t[1..n − 1]. Whichever of the options (1), (2), and
(3) leads to the minimal number of edit operations is the Levenshtein distance.
It is natural to use a dynamic programming algorithm to compute the Levenshtein
distance according to its recursive definition. The overall specification, implementa-
tion, and corresponding proofs are along the same lines as the knapsack problem of
Section 4.4.1; therefore, we briefly present only the most important details. The post-
condition is simply
Result = distance (s, t).
The implementation incrementally builds a bidimensional matrix d of distances
such that the element d[i, j] stores the Levenshtein distance of the sequences s[1..i]
and t[1.. j]. Correspondingly, there are two nested loops: the outer loop iterates over
rows of d, and the inner loop iterates over each column of d. Their essential invariants
express, through quantification, the partial progress achieved after each iteration:
∀h ∈ [0..i − 1], ∀k ∈ [0..n] : d[h, k] = distance(s[1..h], t[1..k])
∀h ∈ [0..i − 1], ∀k ∈ [0.. j − 1] : d[h, k] = distance(s[1..h], t[1..k]).
The standard bounding invariants on the loop variables i and j complete the
specification.
Figure 18 shows the implementation, which uses the compact across notation for
loops, similar to “for” loops in other languages. The syntax
across [a..b] invariant I as k loop B end
is simply a shorthand for:
from k := a invariant I until k = b + 1 loop B; k := k + 1 end
For brevity, Figure 18 omits the obvious loop invariants of the initialization loops at
lines 11 and 12.
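A Python rendition of the two nested loops (ours; d[i][j] stores distance(s[1..i], t[1..j]) as in Figure 18):

```python
def levenshtein(s, t):
    """Levenshtein distance by dynamic programming over the recursive definition."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                   # turning s[1..i] into the empty sequence
    for j in range(n + 1):
        d[0][j] = j                   # turning the empty sequence into t[1..j]
    for i in range(1, m + 1):         # outer: rows 0..i-1 of d are complete
        for j in range(1, n + 1):     # inner: row i is complete up to column j-1
            if s[i - 1] == t[j - 1]:  # last elements coincide
                d[i][j] = d[i - 1][j - 1]
            else:                     # deletion, addition, or substitution
                d[i][j] = 1 + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[m][n]

assert levenshtein("kitten", "sitting") == 3
```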
4.5. Computational Geometry: Rotating Calipers
The diameter of a polygon is its maximum width, that is, the maximum distance
between any pair of its points. For a convex polygon, it is clear that a pair of vertices
determines the diameter (such as vertices p3 and p7 in Figure 19). Shamos [1978] showed
that it is not necessary to check all O(n^2) pairs of vertices: his algorithm, described
later, runs in time O(n). The correctness of the algorithm rests on the notions of lines
of support and antipodal points. A line of support is analogous to a tangent: a line of
support of a convex polygon p is a line that intersects p such that the interior of p lies
entirely to one side of the line. An antipodal pair is then any pair of p’s vertices that
lie on two parallel lines of support. It is a geometric property that an antipodal pair
determines the diameter of any polygon p, and that a convex polygon with n vertices
has O(n) antipodal pairs. Figure 19(a), for example, shows two parallel lines of support
that identify the antipodal pair ( p1 , p5 ).
Shamos’s algorithm efficiently enumerates all antipodal pairs by rotating two lines
of support while maintaining them parallel. After a complete rotation around the
polygon, they have touched all antipodal pairs, and hence the algorithm can terminate.
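The specification itself, the diameter as the maximum distance over all vertex pairs, is directly executable as the O(n^2) brute-force baseline that rotating calipers improves upon (a Python sketch, ours):

```python
from itertools import combinations
from math import dist, isclose

def diameter_brute_force(p):
    """O(n^2) diameter of a polygon given as a list of (x, y) vertices:
    the maximum distance over all pairs of vertices. For a convex polygon
    this maximum is attained by an antipodal pair, which rotating calipers
    enumerates in O(n) instead."""
    return max(dist(p1, p2) for p1, p2 in combinations(p, 2))

# a unit square: the diameter is the diagonal, sqrt(2)
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
assert isclose(diameter_brute_force(square), 2 ** 0.5)
```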
Observing that two parallel support lines resemble the two jaws of a caliper, Toussaint
[1983] suggested the name “rotating calipers” to describe Shamos’s technique.
A presentation of the rotating calipers algorithm and a proof of its correctness with
the same level of detail as the algorithms in the previous sections would require the
development of a complex domain theory for geometric entities and of the corresponding
implementation primitives. Such a level of detail is beyond the scope of this article;
instead, we outline the essential traits of the specification and give a description of the
algorithm in pseudo-code in Figure 20.6
6 The algorithm in Figure 20 is slightly simplified, as it does not deal explicitly with the special case where a
line initially coincides with an edge: then, the minimum angle is zero, and hence the next vertex not on the
line should be considered. This problem can be avoided by adjusting the initialization so that no line coincides
with an edge.
The algorithm inputs a list p of at least three points such that it represents a
convex polygon (precondition on line 3) and returns the value of the polygon’s diameter
(postcondition on line 41).
It starts by adjusting the two parallel support lines on the two vertices with the
maximum difference in y coordinate, such as p1 and p5 in Figure 19(a). Then, it enters
a loop that computes the angle between each line and the next vertex on the polygon
and rotates both jaws by the minimum of such angles. At each iteration, it compares
the distance between the new antipodal pair and the currently stored maximum (in
Result) and updates the latter if necessary. In the example of Figure 19(a), jaw_a
determines the smallest angle, and hence both jaws are rotated by angle_a in Figure
19(b). Such an informal description suggests the obvious bounding invariants: the two jaws remain parallel (line 18); Result varies between some initial value and the final value diameter(p) (line 19); and the total rotation of the calipers is between zero and 180 + 180 degrees (line 20). The essential invariant is, however, harder
to state formally, because it involves a subset of the antipodal pairs reached by the
calipers. A semiformal presentation is:
Result = max { |p1, p2| : p1, p2 ∈ p ∧ reached(p1, total_rotation) ∧ reached(p2, total_rotation) }
whose intended meaning is that Result stores the maximum distance between all points p1, p2 among p's vertices that can be reached with a rotation of up to total_rotation degrees from the calipers' initial horizontal positions.
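The antipodal-pair enumeration behind the invariant above can be sketched in executable form. The following Python sketch is not the pseudo-code of Figure 20: it is a common cross-product formulation of Shamos's technique, in which a second index advances while the vertex it points to moves farther from the current edge; the names `diameter`, `area2`, and `dist2` are ours, not the article's.

```python
def diameter(poly):
    """Diameter of a convex polygon given as CCW-ordered (x, y) vertex pairs."""
    n = len(poly)

    def area2(a, b, c):
        # Twice the signed area of triangle (a, b, c): proportional to the
        # distance of c from line (a, b), so it grows while c moves away.
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    best = 0
    j = 1
    for i in range(n):
        # Advance j while it moves farther from edge (i, i+1); on exit,
        # poly[j] is antipodal to that edge.
        while area2(poly[i], poly[(i + 1) % n], poly[(j + 1) % n]) > \
              area2(poly[i], poly[(i + 1) % n], poly[j]):
            j = (j + 1) % n
        best = max(best, dist2(poly[i], poly[j]),
                   dist2(poly[(i + 1) % n], poly[j]))
    return best ** 0.5
```

Each edge is paired with its antipodal vertex exactly as the rotating jaws would, so the O(n) antipodal pairs are enumerated in a single pass.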
where ε denotes the empty sequence and “◦” is the concatenation operator. With this
notation, reverse’s postcondition is:
list = rev(old list)
with the matching property that list is still acyclic.
The domain theory for lists makes it possible to derive the loop invariant with
the usual techniques. Let us introduce a local variable reversed, which will store the
iteratively constructed reversed list. More precisely, every iteration of the loop removes
one element from the beginning of list and moves it to the beginning of reversed, until
all elements have been moved. When the loop terminates:
—list points to an empty list;
—reversed points to the reversal of old list.
Therefore, the routine concludes by assigning reversed to overwrite list. Backward
substitution yields the loop’s postcondition from the routine’s:
reversed = rev(old list). (47)
Using (46) for empty lists, we can equivalently write (47) as:
rev(list) ◦ reversed = rev(old list), (48)
which is the essential loop invariant, whereas list = Void is the exit condition. The other component of the loop invariant, also obtained by mutating the postcondition, is the constraint that list and reversed be acyclic.
Figure 22 shows the standard implementation. Figure 21 instead gives a graphical representation of reverse's behavior: Figure 21(a) shows the state of the lists in the middle of the computation, and Figure 21(b) shows how the state changes after one iteration, in a way that preserves the invariant.
The correctness proof relies on some straightforward properties of the rev and ◦
functions. Initiation follows from the property that s ◦ ε = s. Consecution relies on the
definition (46) for n > 0, so that:
rev (list.next) ◦ list.first = rev (list).
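The loop and its invariant can be sketched in Python, modeling Void as None and using plain Python lists for the domain-theory function rev; the helper names (`Node`, `to_py`) and the runtime `assert` checks are ours, standing in for the article's Eiffel code and mechanized proof.

```python
class Node:
    """Singly linked list cell: 'first' is the value, 'next' the rest."""
    def __init__(self, first, next=None):
        self.first = first
        self.next = next

def rev(seq):
    # Domain-theory function rev, modeled on plain Python lists.
    return list(reversed(seq))

def to_py(node):
    # Abstraction function: linked list -> Python list of values.
    out = []
    while node is not None:
        out.append(node.first)
        node = node.next
    return out

def reverse(head):
    old_list = to_py(head)     # snapshot of 'old list' for the invariant check
    reversed_ = None           # the local variable 'reversed' (Void as None)
    lst = head
    while lst is not None:     # exit condition: list = Void
        # Essential invariant (48): rev(list) ◦ reversed = rev(old list)
        assert rev(to_py(lst)) + to_py(reversed_) == rev(old_list)
        nxt = lst.next         # detach the first cell of list ...
        lst.next = reversed_   # ... and prepend it to reversed
        reversed_ = lst
        lst = nxt
    assert to_py(reversed_) == rev(old_list)   # postcondition (47)
    return reversed_
```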
We can obtain the essential invariant by weakening both conjuncts in (50) based on
two immediate properties of trees. First, Result ∈ t implies Result ≠ Void, because
no valid node is Void. Second, a node’s value belongs to the set of values of the subtree
rooted at the node:
n.value ∈ t[n].values
for n ∈ t. Thus, the following formula is a weakening of (50):
key ∈ t.values =⇒ Result ≠ Void ∧ key ∈ t[Result].values, (52)
which works as an essential loop invariant for binary-search-tree search.
Search works by moving Result to the left or right subtree—according to (49)—until
a value key is found or the subtree to be explored is empty. This corresponds to the
disjunctive exit condition Result = Void ∨ Result.value = key and to the bounding
invariant Result ≠ Void =⇒ Result ∈ t: we are within the tree until we hit Void.
Figure 23 shows the corresponding implementation.
Initiation follows from the precondition and from the identity t[t.root] = t. Consecution relies on the following domain property, which in turn follows from (49):
n ∈ t ∧ v ∈ t[n].values ∧ v < n.value =⇒ n.left ≠ Void ∧ v ∈ t[n.left].values
n ∈ t ∧ v ∈ t[n].values ∧ v > n.value =⇒ n.right ≠ Void ∧ v ∈ t[n.right].values.
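The search loop, its disjunctive exit condition, and invariant (52) can be sketched in Python (Void modeled as None; the helper `values` plays the role of the domain-theory function t.values; names and the runtime `assert` are ours, not the Figure 23 code).

```python
class TreeNode:
    """Binary search tree node; absent children (Void) are modeled as None."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def values(node):
    # Domain-theory function t.values: the set of values in the subtree.
    if node is None:
        return set()
    return {node.value} | values(node.left) | values(node.right)

def has(t, key):
    result = t
    # Exit condition: Result = Void or Result.value = key.
    while result is not None and result.value != key:
        # Essential invariant (52):
        # key in t.values implies Result /= Void and key in t[Result].values.
        assert key not in values(t) or key in values(result)
        # Move to the left or right subtree according to the ordering (49).
        result = result.left if key < result.value else result.right
    return result is not None
```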
The ordering property (49) entails that the leftmost node in a (nonempty) tree t—that
is, the first node without left child reached by always going left from the root—stores
the minimum of all node values. This property, expressible using the domain theory
function leftmost as:
min(t.values) = leftmost(t).value, (53)
leads to an algorithm to determine the node with minimum value in a binary search
tree, whose postcondition is thus:
Result = leftmost(t) (54)
Result.value = min(t.values). (55)
The algorithm only has to establish (54), which, combined with property (53), implies (55). In fact, the algorithm is oblivious to (49) and operates solely on structural properties of binary trees; (55) follows as an afterthought.
Duplicating the right-hand side of (54), writing t in the equivalent form t[t.root], and
applying constant relaxation to t.root yields the essential invariant
leftmost (t [Result]) = leftmost (t) (56)
with the bounding invariant Result ∈ t that we remain inside the tree t. These invari-
ants capture the procedure informally highlighted earlier: walk down the left children
until you reach the leftmost node. The corresponding implementation is in Figure 24.
Initiation follows by trivial identities. Consecution relies on a structural property of
the leftmost node in any binary tree:
n ∈ t ∧ n.left ≠ Void =⇒ n.left ∈ t ∧ leftmost(t[n]) = leftmost(t[n.left]).
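The walk down the left children, with invariant (56) checked at run time, can be sketched in Python (Void modeled as None; `leftmost` implements the domain-theory function of the same name; the rest of the names are ours, not the Figure 24 code).

```python
class TreeNode:
    """Binary tree node; a missing left child (Void) is modeled as None."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def leftmost(t):
    # Domain-theory function: follow left children from the root until
    # a node without a left child is reached.
    node = t
    while node.left is not None:
        node = node.left
    return node

def min_node(t):
    result = t                    # constant relaxation of t.root
    while result.left is not None:
        # Essential invariant (56): leftmost(t[Result]) = leftmost(t).
        assert leftmost(result) is leftmost(t)
        result = result.left
    assert result is leftmost(t)  # postcondition (54)
    return result
```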
The inner loop computes the sum in (58) for j ranging over the set reaching [i] of nodes
that directly reach i. The invariants of the across loops express the progress in the
computation of (58); we do not write them down explicitly as they are not particularly
insightful from the perspective of connecting postconditions and loop invariants.
Among the numerous attempts to improve the effectiveness and flexibility of dynamic
inference, Gupta and Heidepriem [2003] suggest improving the quality of inferred
contracts by using different test suites (based on code coverage and invariant coverage)
and by retaining only the contracts that are inferred with both techniques. Fraser and
Zeller [2011] simplify and improve test cases based on mining recurring usage patterns
in code bases; the simplified tests are easier to understand and focus on common
usage. Other approaches to improve the quality of inferred contracts combine static
and dynamic techniques [Csallner et al. 2008; Tillmann et al. 2006; Wei et al. 2011b].
To date, dynamic invariant inference has mostly been used to infer pre- and postconditions or intermediate assertions; it has only rarely been applied [Nguyen et al. 2012] to loop invariant inference. This is probably because dynamic techniques require a sufficiently varied collection of test cases, which is more difficult to achieve for loops, especially those that manipulate complex data structures.
which compares adjacent elements in positions i and i + 1. Since bubble sort rearranges elements in an array by swapping adjacent elements, proving it in Boogie was straightforward: the prover could figure out how to apply definition (59) to discharge the various verification conditions, which also feature comparisons of adjacent elements. Selection sort and insertion sort required substantially more guidance, in the form of additional domain theory properties and detailed intermediate verification steps, since their logic does not operate directly on adjacent elements, and hence their verification conditions are syntactically dissimilar to (59). Changing the definition of sorted into one that relates nonadjacent elements, such as ∀i, j : a.lower ≤ i ≤ j ≤ a.upper =⇒ a[i] ≤ a[j], is not sufficient to bridge the semantic gap between sortedness and other predicates: the logic of selection sort and insertion sort remains more complex to reason about than that of bubble sort. In contrast, comb sort was as easy as bubble sort, because it relies on a generalization of sorted that directly matches its logic (see Section 4.3.6).
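The two forms of the sortedness predicate can be written side by side in a short Python sketch (0-based indexing instead of the article's a.lower/a.upper bounds; the function names are ours). The two are logically equivalent by transitivity of ≤; what differs, and what matters to the prover, is only their syntactic shape.

```python
def sorted_adjacent(a):
    # Adjacent-elements form, as in definition (59): a[i] <= a[i + 1].
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

def sorted_pairwise(a):
    # Nonadjacent form: a[i] <= a[j] whenever i <= j.
    return all(a[i] <= a[j]
               for i in range(len(a))
               for j in range(i, len(a)))
```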
A similar experience occurred with the two dynamic programming algorithms of Section 4.4. While the implementation of Levenshtein distance closely mirrors the recursive definition of distance (Section 4.4.2) in computing the minimum, the relation between specification and implementation is less straightforward for the knapsack problem (Section 4.4.1), which correspondingly required a more complicated axiomatization for the proof in Boogie.
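As a concrete point of reference for the Levenshtein case, here is a textbook dynamic-programming formulation in Python; it mirrors the recursive definition by computing a minimum at each table cell, but it is a sketch of the standard algorithm, not the article's verified code.

```python
def levenshtein(s, t):
    # d[i][j] = edit distance between the prefixes s[:i] and t[:j].
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                  # delete all of s[:i]
    for j in range(n + 1):
        d[0][j] = j                  # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,           # deletion
                          d[i][j - 1] + 1,           # insertion
                          d[i - 1][j - 1] + cost)    # match or substitution
    return d[m][n]
```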
ACKNOWLEDGMENTS
This work is clearly based on the insights of Robert Floyd, C.A.R. Hoare, and Edsger Dijkstra in introducing
and developing the notion of loop invariant. The mechanized proofs were made possible by Rustan Leino’s
Boogie tool, a significant advance in turning axiomatic semantics into a practically usable technique. We
are grateful to our former and present ETH colleagues Bernd Schoeller, Nadia Polikarpova, and Julian
Tschannen for pioneering the use of Boogie in our environment.
REFERENCES
Samson Abramsky and Achim Jung. 1994. Domain theory. In Handbook of Logic in Computer Science III, S.
Abramsky, D. M. Gabbay, and T. S. E. Maibaum (Eds.). Oxford University Press.
Dirk Beyer, Thomas A. Henzinger, Rupak Majumdar, and Andrey Rybalchenko. 2007. Invariant synthesis for
combined theories. In Proceedings of the 8th International Conference on Verification, Model Checking,
and Abstract Interpretation (VMCAI’07), Byron Cook and Andreas Podelski (Eds.), Lecture Notes in
Computer Science, vol. 4349. Springer, 378–394.
Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David
Monniaux, and Xavier Rival. 2003. A static analyzer for large safety-critical software. In Proceedings of
the 2003 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’03).
ACM, 196–207.
Joshua Bloch. 2006. Extra, Extra – Read all about it: Nearly all binary searches and mergesorts are broken.
http://googleresearch.blogspot.ch/2006/06/extra-extra-read-all-about-it-nearly.html. (2006).
Marius Bozga, Peter Habermehl, Radu Iosif, Filip Konečný, and Tomáš Vojnar. 2009. Automatic verification
of integer array programs. In Proceedings of the 21st International Conference on Computer Aided
Verification (CAV’09), Ahmed Bouajjani and Oded Maler (Eds.), Lecture Notes in Computer Science,
vol. 5643. Springer, 157–172.
Aaron R. Bradley and Zohar Manna. 2007. The Calculus of Computation. Springer.
Aaron R. Bradley, Zohar Manna, and Henny B. Sipma. 2006. What’s decidable about arrays? In Proceed-
ings of the 7th International Conference on Verification, Model Checking, and Abstract Interpretation
(VMCAI’06), E. Allen Emerson and Kedar S. Namjoshi (Eds.), Lecture Notes in Computer Science,
vol. 3855. Springer, 427–442.
Bruno Buchberger. 2006. Mathematical theory exploration. In Proceedings of the 3rd International Joint Conference on Automated Reasoning (IJCAR'06). Lecture Notes in Computer Science, vol. 4130. Springer, 1–2.
Bor-Yuh Evan Chang and K. Rustan M. Leino. 2005. Abstract Interpretation with alien expressions and
heap structures. In Proceedings of the 6th International Conference on Verification, Model Checking, and
Abstract Interpretation (VMCAI’05), Radhia Cousot (Ed.), Lecture Notes in Computer Science Series,
vol. 3385. Springer, 147–163.
Edmund M. Clarke, Orna Grumberg, and Doron A. Peled. 1999. Model Checking. MIT Press.
Michael Colón, Sriram Sankaranarayanan, and Henny Sipma. 2003. Linear invariant generation using
non-linear constraint solving. In Proceedings of the 15th International Conference on Computer Aided
Verification(CAV’03), Jr. Warren A. Hunt and Fabio Somenzi (Eds.), Lecture Notes in Computer Science,
vol. 2725. Springer, 420–432.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algo-
rithms (3rd ed.). MIT Press.
Patrick Cousot and Radhia Cousot. 1977. Abstract interpretation: A unified lattice model for static analysis
of programs by construction or approximation of fixpoints. In Proceedings of the 4th Annual ACM
Symposium on Principles of Programming Languages (POPL’77). 238–252.
Patrick Cousot and Nicolas Halbwachs. 1978. Automatic discovery of linear restraints among variables of a
program. In Proceedings of the 5th Annual ACM Symposium on Principles of Programming Languages
(POPL’78). 84–96.
Christoph Csallner, Nikolai Tillmann, and Yannis Smaragdakis. 2008. DySy: Dynamic symbolic execution
for invariant inference. In Proceedings of the 30th International Conference on Software Engineering
(ICSE’08), Wilhelm Schäfer, Matthew B. Dwyer, and Volker Gruhn (Eds.). ACM, 281–290.
Guido de Caso, Diego Garbervetsky, and Daniel Gorín. 2009. Reducing the number of annotations in a
verification-oriented imperative language. In Proceedings of Automatic Program Verification.
Edsger W. Dijkstra. 1976. A Discipline of Programming. Prentice Hall.
Eiffel. 2006. Standard ECMA-367. Eiffel: Analysis, design and programming language. (2006).
Michael D. Ernst, Jake Cockrell, William G. Griswold, and David Notkin. 2001. Dynamically discovering
likely program invariants to support program evolution. IEEE Transactions of Software Engineering 27,
2 (2001), 99–123.
Robert W. Floyd. 1967. Assigning meanings to programs. In Mathematical Aspects of Computer Science J. T.
Schwartz (Ed.), Proceedings of Symposia in Applied Mathematics Series, vol. 19. American Mathematical
Society, 19–32.
Gordon Fraser and Andreas Zeller. 2011. Exploiting common object usage in test case generation. In Proceed-
ings of the IEEE 4th International Conference on Software Testing, Verification and Validation (ICST’11).
IEEE Computer Society, 80–89.
Carlo A. Furia, Dino Mandrioli, Angelo Morzenti, and Matteo Rossi. 2012. Modeling Time in Computing.
Monographs in Theoretical Computer Science. An EATCS Series. Springer.
Carlo A. Furia and Bertrand Meyer. 2010. Inferring loop invariants using postconditions. In Fields of Logic
and Computation: Essays Dedicated to Yuri Gurevich on the Occasion of His 70th Birthday, Andreas
Blass, Nachum Dershowitz, and Wolfgang Reisig (Eds.), Lecture Notes in Computer Science Series, vol.
6300. Springer, 277–300.
Michael R. Garey and David S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of
NP-Completeness. Freeman.
Carlo Ghezzi, Andrea Mocci, and Mattia Monga. 2009. Synthesizing intensional behavior models by graph
transformation. In Proceedings of the 31st International Conference on Software Engineering (ICSE’09).
IEEE, 430–440.
David Gries. 1981. The Science of Programming. Springer-Verlag.
Neelam Gupta and Zachary V. Heidepriem. 2003. A new structural coverage criterion for dynamic detection
of program invariants. In 18th IEEE International Conference on Automated Software Engineering
(ASE’03). IEEE Computer Society, 49–59.
John Hatcliff, Gary T. Leavens, K. Rustan M. Leino, Peter Müller, and Matthew J. Parkinson. 2012. Behav-
ioral interface specification languages. ACM Comput. Surv. 44, 3 (2012), 16.
Thomas A. Henzinger, Thibaud Hottelier, Laura Kovács, and Andrei Voronkov. 2010. Invariant and type in-
ference for matrices. In Proceedings of the 11th International Conference on Verification, Model Checking,
and Abstract Interpretation (VMCAI’10). Lecture Notes in Computer Science. Springer.
C. A. R. Hoare. 1969. An axiomatic basis for computer programming. Commun. ACM 12, 10 (1969), 576–580.
C. A. R. Hoare. 1972. Proof of correctness of data representations. Acta Inf. 1 (1972), 271–281.
C. A. R. Hoare. 2003. The verifying compiler: A grand challenge for computing research. J. ACM 50, 1 (2003),
63–69.
Mikoláš Janota. 2007. Assertion-based loop invariant generation. In Proceedings of the 1st International
Workshop on Invariant Generation (WING’07).
Ranjit Jhala and Rupak Majumdar. 2009. Software model checking. Comput. Surveys 41, 4 (2009).
Michael Karr. 1976. Affine relationships among variables of a program. Acta Informatica 6 (1976), 133–151.
Hans Kellerer, Ulrich Pferschy, and David Pisinger. 2004. Knapsack problems. Springer.
Donald E. Knuth. 2011. The Art of Computer Programming (Volumes 1–4A). Addison-Wesley. Originally published 1968–1973.
Laura Kovács and Andrei Voronkov. 2009. Finding loop invariants for programs over arrays using a theorem
prover. In Proceedings of the 12th International Conference on Fundamental Approaches to Software
Engineering (FASE’09), Marsha Chechik and Martin Wirsing (Eds.), Lecture Notes in Computer Science
Series, vol. 5503. Springer, 470–485.
Daniel Kroening and Ofer Strichman. 2008. Decision Procedures: An Algorithmic Point of View. Monographs
in Theoretical Computer Sciences. An EATCS Series. Springer.
Shuvendu K. Lahiri, Shaz Qadeer, Juan P. Galeotti, Jan W. Voung, and Thomas Wies. 2009. Intra-module
Inference. In Proceedings of the 21st International Conference on Computer Aided Verification (CAV’09),
Ahmed Bouajjani and Oded Maler (Eds.), Lecture Notes in Computer Science Series, vol. 5643. Springer,
493–508.
Leslie Lamport. 1977. Proving the correctness of multiprocess programs. IEEE Transactions on Software
Engineering 3, 2 (1977), 125–143.
Leslie Lamport. 2009. Teaching concurrency. SIGACT News 40, 1 (2009), 58–62.
K. Rustan M. Leino. 2008. This is Boogie 2. Manuscript KRML 178 (June 2008). http://research.microsoft.com/en-us/projects/boogie/.
Francesco Logozzo. 2004. Automatic inference of class invariants. In Proceedings of the 5th International
Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI’04), Bernhard Steffen
and Giorgio Levi (Eds.), Lecture Notes in Computer Science Series, vol. 2937. Springer, 211–222.
Kurt Mehlhorn and Peter Sanders. 2008. Algorithms and Data Structures: The Basic Toolbox. Springer.
Bertrand Meyer. 1980. A basis for the constructive approach to programming. In Proceedings of IFIP Congress
1980, Simon H. Lavington (Ed.). 293–298.
Bertrand Meyer. 1997. Object-oriented software construction (2nd ed.). Prentice Hall.
Carroll Morgan. 1994. Programming from Specifications (2nd ed.). Prentice Hall.
ThanhVu Nguyen, Deepak Kapur, Westley Weimer, and Stephanie Forrest. 2012. Using dynamic analysis
to discover polynomial and array invariants. In Proceedings of the 34th International Conference on
Software Engineering (ICSE’12). IEEE, 683–693.
Susan Owicki and David Gries. 1976. An axiomatic proof technique for parallel programs. Acta Informatica
6 (1976), 319–340.
Christos H. Papadimitriou. 1993. Computational Complexity. Addison-Wesley.
Corina S. Păsăreanu and Willem Visser. 2004. Verification of Java programs using symbolic execution
and invariant generation. In Proceedings of the 11th International SPIN Workshop on Model Checking
Software. Lecture Notes in Computer Science Series, vol. 2989. Springer, 164–181.
Jeff H. Perkins and Michael D. Ernst. 2004. Efficient incremental algorithms for dynamic detection of
likely invariants. In Proceedings of the 12th ACM SIGSOFT International Symposium on Foundations
of Software Engineering (SIGSOFT’04/FSE-12), Richard N. Taylor and Matthew B. Dwyer (Eds.). ACM,
23–32.
André Platzer. 2010. Differential-algebraic dynamic logic for differential-algebraic programs. J. Log. Comput.
20, 1 (2010), 309–352.
Nadia Polikarpova, Ilinca Ciupa, and Bertrand Meyer. 2009. A comparative study of programmer-written
and automatically inferred contracts. In Proceedings of the ACM/SIGSOFT International Symposium
on Software Testing and Analysis (ISSTA’09). 93–104.
Alan Robinson and Andrei Voronkov (Eds.). 2001. Handbook of Automated Reasoning. Elsevier.
Enric Rodrı́guez-Carbonell and Deepak Kapur. 2007. Generating all polynomial invariants in simple loops.
Journal of Symbolic Computation 42, 4 (2007), 443–476.
Sriram Sankaranarayanan, Henny Sipma, and Zohar Manna. 2004. Non-linear loop invariant generation
using Gröbner bases. In Proceedings of the 31st ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages (POPL’04), Neil D. Jones and Xavier Leroy (Eds.). ACM, 318–329.
Michael I. Shamos. 1978. Computational geometry. Ph.D. thesis, Yale University. http://goo.gl/XiXNl.
J. Michael Spivey. 1992. Z Notation – a reference manual (2nd ed.). Prentice Hall International Series in
Computer Science. Prentice Hall.
Nikolai Tillmann, Feng Chen, and Wolfram Schulte. 2006. Discovering likely method specifications. In
Proceedings of the 8th International Conference on Formal Engineering Methods (ICFEM’06). Lecture
Notes in Computer Science Series, vol. 4260. 717–736.
Godfried T. Toussaint. 1983. Solving geometric problems with the rotating calipers. In Proceedings of IEEE
MELECON.
Yi Wei, Carlo A. Furia, Nikolay Kazmin, and Bertrand Meyer. 2011a. Inferring Better Contracts. In Proceed-
ings of the 33rd International Conference on Software Engineering (ICSE’11), Richard N. Taylor, Harald
Gall, and Nenad Medvidović (Eds.). ACM, 191–200.
Yi Wei, Hannes Roth, Carlo A. Furia, Yu Pei, Alexander Horton, Michael Steindorfer, Martin Nordio, and
Bertrand Meyer. 2011b. Stateful testing: Finding more errors in code and contracts. In Proceedings
of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE’11), Perry
Alexander, Corina Pasareanu, and John Hosking (Eds.). ACM, 440–443.