Term Rewriting and All That

This is the first English language textbook offering a unified and self-contained introduction to the field of term rewriting. It covers all the basic material (abstract reduction systems, termination, confluence, completion, and combination problems), but also some important and closely connected subjects: universal algebra, unification theory, and Gröbner bases. The main algorithms are presented both informally and as programs in the functional language Standard ML (an appendix contains a quick and easy introduction to ML). Certain crucial algorithms like unification and congruence closure are covered in more depth and efficient Pascal programs are developed. The book contains many examples and over 170 exercises. This text is also an ideal reference book for professional researchers: results that have been spread over many conference and journal articles are collected together in a unified notation, detailed proofs of almost all theorems are provided, and each chapter closes with a guide to the literature.

Franz Baader has been professor of computer science at the Technical University of Aachen since 1993. His current research interests include knowledge representation (in particular, description logics, nonmonotonic logics, and modal logics) and automated deduction. In these areas he has published more than 50 articles in major journals and conferences.

Tobias Nipkow took up a professorship in the Computer Science Department of the Technical University in Munich in 1992. His research interests include term rewriting, theorem proving, and formal program development. In these areas he has published almost 50 articles in major journals and conferences.

Term Rewriting and All That
Franz Baader and Tobias Nipkow
CAMBRIDGE UNIVERSITY PRESS

PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom
CAMBRIDGE UNIVERSITY PRESS
The Edinburgh Building, Cambridge CB2 2RU, UK  http://www.cup.cam.ac.uk
40 West 20th Street, New York, NY 10011-4211, USA  http://www.cup.org
10 Stamford Road, Oakleigh, Melbourne 3166, Australia

© Cambridge University Press 1998

This book is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 1998
Printed in the United Kingdom at the University Press, Cambridge
Typeset by the author

A catalogue record for this book is available from the British Library

Library of Congress Cataloguing in Publication data
Baader, Franz.
Term rewriting and all that / Franz Baader and Tobias Nipkow.
p. cm.
Includes bibliographical references and index.
ISBN 0 521 45520 0 (hc : alk. paper)
1. Rewriting systems (Computer science).
I. Nipkow, Tobias, 1958– . II. Title.
QA267.B314 1998
005.13'1–dc21  97-28286 CIP
ISBN 0 521 45520 0 hardback

Contents

Preface
1 Motivating Examples
2 Abstract Reduction Systems
  2.1 Equivalence and reduction
  2.2 Well-founded induction
  2.3 Proving termination
  2.4 Lexicographic orders
  2.5 Multiset orders
  2.6 Orders in ML
  2.7 Proving confluence
  2.8 Bibliographic notes
3 Universal Algebra
  3.1 Terms, substitutions, and identities
  3.2 Algebras, homomorphisms, and congruences
  3.3 Free algebras
  3.4 Term algebras
  3.5 Equational classes
4 Equational Problems
  4.1 Deciding ≈E
  4.2 Term rewriting systems
  4.3 Congruence closure
  4.4 Congruence closure on graphs
  4.5 Syntactic unification
  4.6 Unification by transformation
  4.7 Unification and term rewriting in ML
  4.8 Unification of term graphs
  4.9 Bibliographic notes
5 Termination
  5.1 The decision problem
  5.2 Reduction orders
  5.3 The interpretation method
  5.4 Simplification orders
  5.5 Bibliographic notes
6 Confluence
  6.1 The decision problem
  6.2 Critical pairs
  6.3 Orthogonality
  6.4 Beyond orthogonality
  6.5 Bibliographic notes
7 Completion
  7.1 The basic completion procedure
  7.2 An improved completion procedure
  7.3 Proof orders
  7.4 Huet's completion procedure
  7.5 Huet's completion procedure in ML
  7.6 Bibliographic notes
8 Gröbner Bases and Buchberger's Algorithm
  8.1 The ideal membership problem
  8.2 Polynomial reduction
  8.3 Gröbner bases
  8.4 Buchberger's algorithm
  8.5 Bibliographic notes
9 Combination Problems
  9.1 Basic notions
  9.2 Termination
  9.3 Confluence
  9.4 Combining word problems
  9.5 Bibliographic notes
10 Equational Unification
  10.1 Basic definitions and results
  10.2 Commutative functions
  10.3 Associative and commutative functions
  10.4 Boolean rings
  10.5 Bibliographic notes
11 Extensions
  11.1 Rewriting modulo equational theories
  11.2 Ordered rewriting
  11.3 Conditional identities and conditional rewriting
  11.4 Higher-order rewrite systems
  11.5 Reduction strategies
  11.6 Narrowing
Appendix 1 Ordered Sets
Appendix 2 A Bluffer's Guide to ML
Bibliography
Index

Preface

Term rewriting is a branch of theoretical computer science which combines elements of logic, universal algebra, automated theorem proving and functional programming. Its foundation is equational logic. What distinguishes term rewriting from equational logic is that equations are used as directed replacement rules, i.e. the left-hand side can be replaced by the right-hand side, but not vice versa. This constitutes a Turing-complete computational model which is very close to functional programming. It has applications in algebra (e.g. Boolean algebra, group theory and ring theory), recursion theory (what is and is not computable with certain sets of rewrite rules), software engineering (reasoning about equationally defined data types such as numbers, lists, sets etc.), and programming languages (especially functional and logic programming). In general, term rewriting applies in any context where efficient methods for reasoning with equations are required.

To date, most of the term rewriting literature has been published in specialist conference proceedings (especially Rewriting Techniques and Applications and Automated Deduction in Springer's LNCS series) and journals (e.g. Journal of Symbolic Computation and Journal of Automated Reasoning).
In addition, several overview articles provide introductions into the field, and references to the relevant literature [141, 74, 204]. This is the first English book devoted to the theory and applications of term rewriting. It is ambitious in that it tries to serve two masters:

• The researcher, who needs a unified theory that covers, in detail and in a single volume, material that has previously only been collected in overview articles, and whose technical details are spread over the literature.
• The teacher or student, who needs a readable textbook in an area where there is hardly any literature for the non-specialist.

Our choice of material is fairly canonical: abstract reduction systems and universal algebra (the foundation), word problems (the motivation), unification (a central algorithm), termination, confluence and completion (the sine qua non of term rewriting). The inclusion of combination problems is also uncontroversial, except maybe for the rather technical topic of combining word problems. Two further topics show our own preferences and are not strictly core material: equational unification is included because of its significance for rewriting based theorem provers, Gröbner bases because they form an essential link between term rewriting and computer algebra.

Prerequisites are minimal: readers who have taken introductory courses such as discrete mathematics, (linear) algebra, or theoretical computer science are well equipped for this book. The basic notions of ordered sets are summarized in an appendix.

How to teach this book

The diagram below shows the dependencies between the different sections of the book.

[Diagram: dependency tree between the sections of the book]

An introductory undergraduate course should cover the trunk of the above tree. To give the students a more algorithmic understanding of completion, it is helpful also to introduce Huet's completion procedure (7.4) without formally justifying its correctness. The course should conclude with 11.2-11.6. A more advanced introduction at graduate level would also include 4.3-4.4, 4.8, 6.4, 7.2-7.4, 9.1-9.3, and (initial segments of) 10. For a mathematically oriented audience, 3.2-3.5 is mandatory and 8 contains an excellent application of rewriting methods in mathematics.

Chapter 2 on abstract reduction systems is the foundation that term rewriting rests on. Nevertheless we recommend not to teach this chapter en bloc but to interleave it with the rest of the book. Only Section 2.1 needs to be covered right at the start. The dependency of the remaining sections is as follows: 2.2-2.5 → 5 → 2.7 → 6. This groups together the abstract and concrete treatments of termination (2.2-2.5 and 5) and confluence (2.7 and 6).

Chapter 5 on termination has a special status in the dependency diagram. It is not the case that all of Chapter 5 is a prerequisite for the remainder of the book. In fact, almost the opposite is the case: one could read most of the remainder quite happily, except that one would not be able to follow particular termination arguments. However, due to the overall importance of termination, we recommend that students should be exposed at least to 5.1-5.3 and possibly one of the simplification orders in 5.4. The general theory of simplification orders should be reserved for a graduate-level course.

A final word of warning. A book also aimed at researchers is written with a higher level of formality than a pure textbook.
In places, the formal rigour of the book needs to be adjusted to the requirements of the classroom. The réle of ML Most of the theory in this book is constructive. Either we explicitly deal with particular algorithms, e.g. unification, or the proof of some theorem is essentially an algorithm, e.g. a decision procedure. We find that many computer science students take more easily to logical formalisms once they understand how to represent formulae as data structures and how to trans- form them. Therefore we have tried to accompany every major algorithm in this book by an implementation. As an implementation language we have chosen ML: functional languages are closest to our algorithms and ML is one of its best-known representatives. For those readers not familiar with ML, a concise summary of the core of the language is provided as an appendix. It should be emphasized that our ML programs are strictly added value: they reside in separate sections and are not required for an understanding of the main text (although we believe that their study enhances this under- standing). We should also point out that the programs are intentionally unoptimized. xii Preface They are written for clarity rather than efficiency. Nevertheless they cope well with small to medium sized examples. Their simplicity makes them an ideal vehicle for further developments, and we encourage our readers to experiment with them. They are available on the internet at http: //www4.informatik.tu-muenchen.de/“nipkow/ Acknowledgments David Basin, Eric Domenjoud, Harald Ganzinger, Bernhard Gramlich, Hen- rik Linnestad, Aart Middeldorp, Monica Nesi, Vincent van Oostrom, Larry Paulson, Manfred Schmidt-Schau8, Klaus Schulz, Wayne Snyder, Cesare Tinelli, and Markus Wenzel read individual chapters and commented exten- sively on them. In particular Aart Middeldorp’s amazing scrutiny uncovered a number of embarrassing mistakes. Michael Hanus, Maribel Fernéndez and Femke van Raamsdonk provided additional comments. Can A. Albayrak and Volker Braun produced first versions of some of the figures. The DFG funded a sabbatical of the second author at Cambridge Uni- versity Computer Laboratory where Larry Paulson greatly contributed to a very productive four months. Alison Woollatt of CUP provided essential TeXpertise. David Tranah, our very patient editor, suggested the title. We wish to thank them all. 1 Motivating Examples Equational reasoning is concerned with a rather restricted class of first-order languages: the only predicate symbol is equality. It is, however, at the heart of many problems in mathematics and computer science, which explains why developing specialized methods and tools for this type of reasoning is very popular and important. For example, in mathematics one often defines classes of algebras (such as groups, rings, etc.) by giving defining identities (which state associativity of the group operation, etc.). In this context, it is important to know which other identities can be derived from the defining ones. In algebraic specification, new operations are defined from given ones by stating characteristic identities that must hold for the defined operations. As a special case we have functional programs where functions are defined by recursion equations. For example, assume that we want to define addition of natural numbers using the constant 0 and the successor function s. This can be done with the identities} z+0 © 2, etsy) & s(a+y). 
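Read as left-to-right rewrite rules, the two identities x + 0 ≈ x and x + s(y) ≈ s(x + y) are already a small functional program. The following Standard ML sketch (our illustration, not code from the book; the datatype nat and the names Zero, S and add are our own choices) makes this concrete:

  (* naturals built from Zero and the successor constructor S *)
  datatype nat = Zero | S of nat;

  (* add mirrors the identities x + 0 = x and x + s(y) = s(x + y),
     applied only from left to right *)
  fun add x Zero  = x
    | add x (S y) = S (add x y);

Evaluating add (S Zero) (S (S Zero)) performs exactly the rewrite steps shown in the calculation below and returns S (S (S Zero)), i.e. 3.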
By applying these identities, we can calculate the sum of 1 (encoded as s(0)) and 2 (encoded as s(s(0))): 8(0) + s(s(0)) © s(s(0) + s(0)) © s(s(s(0)) + 0) © s(s(s(0))). In this calculation, we have interpreted the identities as rewrite rules that tell us how a subterm of a given term can be replaced by another term. This brings us to one of the key notions of this book, namely term rewrit- ing systems, What do we mean by terms? They are built from variables, + Throughout this book, we use ~ for identities to make a clear distinction between the object level sign for identity and our use of = for equality on the meta-level. 1 2 1 Motivating Examples constant symbols, and function symbols. In the above example, + is a binary function symbol, s is a unary function symbol, 0 is a constant sym- bol, and «,y are variables. Examples of terms over these symbols are 0, 2, s(s(0)), « + (0), s(s(s(0)) +0). In our example calculation, we have used the identities only from left to right, but in general, identities can be applied in both directions. In the following, we give two examples that illustrate some of the key issues arising in connection with identities and rewrite systems, and which will be treated in detail in this book. In the first example, the rewrite rules are intended to be used only in one direction (which is expressed by writing — instead of ~). This is an instance of rewriting as a computation mechanism. In the second, we consider the identities defining groups, which are intended to be used in both directions. This is an instance of rewriting as a deduction mechanism. Symbolic Differentiation We consider symbolic differentiation of arithmetic expressions that are built with the operations +, *, the indeterminates X,Y, and the numbers 0,1. For example, ((X+X)*Y)-+1 is an admissible expression. These expressions can be viewed as terms that are built from the constant symbols 0, 1, X, and Y, and the binary function symbols + and . For the partial derivative with respect to X, we introduce the additional (unary) function symbol Dx. The following rules are (some of the) well-known rules for computing the derivative: (Rl) Dx(X) > 1, (R2) Dx(Y) — 0, (R3)— Dx(ut+v) + Dx(u)+ Dx(v), (R4) Dx(u*v) — (u* Dx(v)) + (Dx(u) *v). In terms like Dx (u+v), the symbols u and v are variables, with the intended meaning that they can be replaced by arbitrary expressions. Thus, rule (R3) can be applied to terms having the same pattern as the left-hand side, i.e. a Dx followed by a +-expression. Starting with the term Dx (XX), the rules (R1)-(R4) lead to the possible reductions depicted in Fig. 1.1. We can use this example to illustrate two of the most important properties of term rewriting systems: + These variables should not be confused with the indeterminates X,Y of the arithmetic expres- sions, which are constant symbols. Motivating Examples 3 Dx(X *X) [ns (X * Dx(X)) + (Dx(X) * X) \ a (X #1) + (Dx(X)*X) (X * Dx(X)) + (1*X) \n A (X #1) + (1X) Fig. 1.1. Symbolic differentiation of the expression Dx(X * X). ‘Termination: Is it always the case that after finitely many rule applications we reach an expression to which no more rules apply? Such an expression is then called a normal form. For the rules (R1)-(R4) this is the case. It is, however, not com- pletely trivial to show this because rule (R4) leads to a considerable increase in the size of the expression. ‘An example of a non-terminating rule is utvrutu, which expresses commutativity of addition. 
The sequence (X * 1) + (1*X) — (LX) + (X *1) > (X *1)+(1*X) >... is an example for an infinite chain of applications of this rule. Of course, non- termination need not always be caused by a single rule; it could also result from the interaction of several rules. Confluence: If there are different ways of applying rules to a given term ¢, leading to different derived terms t, and ta, can ty and t2 be joined, i.e. can we always find a common term s that can be reached both from ty and from t2 by rule application? In Fig. 1.1 this is the case, and more generally, one can prove (but How?) that (R1)-(R4) are confluent. This shows that the symbolic differentiation of a given expression always leads to the same deri- 4 1 Motivating Examples vative (i.e. the term to which no more rules apply), independent of the strategy for applying rules. If we add the simplification rule (R5) ut+0>u ‘0 (R1)-(R4), we lose the confluence property (see Fig. 1.2). x (X + 0) fe \* Dx(X) Dx(X) + Dx(0) | RL | Ri 1 1+ Dx(0) Fig. 1.2. Dx(X) and Dx(X) + Dx(0) cannot be joined. In our example, non-confluence of (R1)-(R5) can be overcome by adding the rule Dx (0) + 0. More generally, one can ask whether this is always possible, i.e. can we always make a non-confluent system confluent by adding implied rules (completion of term rewriting systems). Because of their special form, the rules (R1)-(R4) constitute a functional program (on the left-hand side, the defined function Dx occurs only at the very outside). Termination of the rules means that Dx is a total function. Confluence of the rules means that the result of a computation is indepen- dent of the evaluation strategy. Confluence of (R1)-(R4) is not a lucky coincidence. We will prove that all term rewriting systems that constitute functional programs are confluent. Group Theory Let be a binary function symbol, i be a unary function symbol, ¢ be a constant symbol, and z,y,z be variable symbols. The class of all groups is defined by the identities (G1) (oy)oz ® ao(yoz), (G2) cor & g, ae (G3) a(x) o e, Motivating Examples 5 ie. a set G equipped with a binary operation o, a unary operation i, and containing an element ¢ is a group iff the operations satisfy the identities (G1)-(G3). Identity (G3) states only that for every group element g, the clement i(g) is a left-inverse of g with respect to the left-unit e. The identities (G1)-(G3) can be used to show that this left-inverse is also a right-inverse. In fact, using these identities, the term e can be transformed into the term xoi(z): e Zi(voi(a))o( (x 0 i(a)) (xo fa oi(x))) Bile oi(a)) 0 (wo ((i(z) 02) o(a))) RZ i(woi(a)) 0 (x ot (x) 0z)) oi(x)) & i(w oi(x)) o (wo i(w)) 02) oi(x)) 8 i(x 0 i(z)) 0 ((x0i(x)) 0 (wo i(z))) RG (x 0i(x)) 0 (xo i(x))) 0 (x oa(x)) Zeo(eoila)) Rxoi(z). a0 i(x)) This example illustrates that it is nontrivial to find such derivations, ie. to solve the so-called word problem for sets of identities: given a set of identities E and two terms s and ¢, is it possible to transform the term s into the term f, using the identities in E as rewrite rules that can be applied in both directions? One possible way of approaching this problem is to consider the identities as uni-directional rewrite rules: (RG1) (woy)oz + zo(yoz), (RG2) cor > a, (RG3) i(c)or > e. The basic idea is that the identities are only applied in the direction that “simplifies” a given term. One is now looking for normal forms, i.e. terms to which no more rules apply. In order to decide whether the terms s and ¢ are equivalent (i.e. 
can be transformed into each other by applying identities in both directions), we use the uni-directional rewrite rules to reduce s to a normal form @ and ¢ to a normal form #. Then we check whether $ and # are syntactically equal. There are, however, two problems that must be overcome before this method for deciding the word problem can be applied: e Equivalent terms can have distinct normal forms. In our example, both coi(c) and e are normal forms with respect to (RG1)-(RG3), and we have shown that they are equivalent. However, the above method for deciding 6 1 Motivating Examples the word problem would fail because it would find that the normal forms of x0 i(z) and e are distinct. © Normal forms need not exist: the process of reducing a term may lead to an infinite chain of rule applications. ‘We will see that termination and confluence are the important properties that ensure existence and uniqueness of normal forms. If a given set of identities leads to a non-confluent rewrite system, we do not have to give up. ‘We can again apply the idea of completion to extend the rewrite system to a confluent one. In the case of groups, a confluent and terminating extension of (RG1)-(RG3) exists (see Exercise 7.12 on page 184). 2 Abstract Reduction Systems This chapter is concerned with the abstract treatment of reduction, where reduction is synonymous with the traversal of some directed graph, the stepwise execution of some computation, the gradual transformation of some object (e.g. a term), or any similar step by step activity. Mathematically this means we are simply talking about binary relations. An abstract reduction system is a pair (A,—>), where the reduction — is a binary relation on the set A, i.e. + C Ax A. Instead of (a,b) € — we write a — b. The term “reduction” has been chosen because in many applications some- thing decreases with each reduction step, but cannot decrease forever. Yet this need not be the case, as witnessed by the reduction 0 > 1+ 2 --- Unless noted otherwise, all our discussions take place in the context of some arbitrary but fixed abstract reduction system (A,—). 2.1 Equivalence and reduction ‘We can view reduction in two ways: the first is as a directed computation, which, starting from some point ag, tries to reach a normal form by following the reduction ay > a; + +++. This corresponds to the idea of program evaluation. Or we may consider + merely as a description of “+, where a+b means that there is a path between a and b where the arrows can be traversed in both directions, for example, as in a9 ~ a1 — a2 — a3. This corresponds to the idea of identities which can be used in both directions. ‘The key question here is to decide if two elements a and b are equivalent, ic. ifa 4 b holds. Settling this question by an undirected search along both — and + is bound to be expensive. Wouldn’t it be nice if we could decide equivalence by reducing both a and b to their normal forms and testing if the normal forms are identical? As explained in the first chapter, this idea is only going to work if reduction terminates and normal forms are unique. 7 8 2 Abstract Reduction Systems Formally, we talk about termination and confluence of reduction, and the study of these two notions is one of the central themes of this book. 2.1.1 Basic definitions In the sequel, we define a great many symbols, not all of which will be put to immediate use. Therefore you may treat these definitions as a table of relevant notions which can be consulted when necessary. 
Given two relations R C A x B and S C B x C, their composition is defined by RoS:= {(x,z)€ Ax C | 3ye B. (a,y) € RA (y,z) € S} Definition 2.1.1 We are particularly interested in composing a reduction with itself and define the following notions: * := {(e,2)|c€ A} identity uy ton (i+ 1)-fold composition, i > 0 4 Usso > transitive closure + 4u4 reflexive transitive closure S = aus reflexive closure 3 {(y,2)|—y} inverse - a inverse a Ue symmetric closure a (4)t transitive symmetric closure 4 (4)* reflexive transitive symmetric closure Some remarks are in order: 1. Notations like “+ and < only work for arrow-like symbols. In the case of arbitrary relations RC A x A we write R*, R-! ete. 2. Some of the constructions can also be expressed nicely in terms of paths: ay if there is a path of length n from z to y, «sy if there is some finite path from z to y, xy if there is some finite nonempty path from = to y. 3. The word closure has a precise meaning: the P closure of R is the least set with property P which contains R. For example, ++, the reflexive transitive closure of —, is the least reflexive and transitive relation which contains —+. Note that for arbitrary P and R, the P closure of R need not exist, but in the above cases they always do because reflexivity, transitivity and symmetry are closed under arbitrary intersections. In 2.1 Equivalence and reduction 9 such cases the P closure of R can be defined directly as the intersection of all sets with property P which contain R. 4, It is easy to show that ¢ is the least equivalence relation containing —. Let us add some terminology to this notation: 1. « is reducible iff there is a y such that x — y. 2. «is in normal form (irreducible) iff it is not reducible. 3. y isa normal form of z iff « * y and y is in normal form. If « has a uniquely determined normal form, the latter is denoted by «|. y is a direct successor of x iff x — y. 5. y is a successor of x iff x 5 y a and y are joinable iff there is a z such that «+ z + y, in which case we write x | y. Example 2.1.2 1. Let A:=N-—{0,1} and (a) m is in normal form iff m is prime. (b) pis @ normal form of m iff p is a prime factor of m. (c) m | niff m and n are not relatively prime. (d) 4 =— because > and “divides” are already transitive. (:) S=4xA . Let A := {a,b}* (the set of words over the alphabet {a,b}) and > := {(ubav, uabv) | u,v € A}. Then (a) w is in normal form iff w is sorted, i.e. of the form a*b*. (b) Every w has a unique normal form w|, the result of sorting w. (c) wi | we iff wy + we iff w; and we contain the same number of as and bs. = = {(m,n) | m > n and n divides m}. Then y Finally we come to some of the central notions of this book. Definition 2.1.3 A reduction — is called Church-Rosserj iff ry > cly (see Fig. 2.1). confluent iff y tart y > yr lye (see Fig. 2.1). terminating _ iff there is no infinite descending chain a) a1 > ++ normalizing _ iff every element has a normal form. convergent, iff it is both confluent and terminating. Both reductions in Example 2.1.2 terminate, but only the second one is Church-Rosser and confluent. + Alonzo Church and J. Barkley Rosser proved that the A-calculus has this property [61]. 10 2 Abstract Reduction Systems Fig. 2.1. Church-Rosser property, confluence and semi-confluence. Remarks: 1. The diagrams in Fig. 2.1 have a precise meaning and are used throughout the book in this manner: solid arrows represent universal and dashed arrows existential quantification; the whole diagram is an implication of the form Vz. P(Z) => 39. Q(&,y). 
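Since an abstract reduction system is nothing but a binary relation, the notions just introduced can be tried out directly in ML. The sketch below is our illustration, not code from the book: it assumes that the reduction is finite and given as a list of pairs (x, y) meaning x → y over an equality type, and all function names are ours.

  (* direct successors of x *)
  fun succs r x = List.mapPartial (fn (a, b) => if a = x then SOME b else NONE) r;

  (* x is in normal form iff it has no direct successor *)
  fun isNormal r x = null (succs r x);

  (* all elements reachable from those in xs, i.e. the image under ->* *)
  fun reach r xs =
    let
      val new = List.filter (fn y => not (List.exists (fn z => z = y) xs))
                            (List.concat (map (succs r) xs))
    in
      if null new then xs else reach r (xs @ new)
    end;

  (* x and y are joinable iff some element is reachable from both *)
  fun joinable r (x, y) =
    List.exists (fn z => List.exists (fn w => w = z) (reach r [y])) (reach r [x]);

For a finite fragment of the divisibility reduction from Example 2.1.2, say r = [(12,2),(12,3),(12,4),(12,6),(6,2),(6,3),(4,2)], we get isNormal r 3 = true and joinable r (4,6) = true, since both 4 and 6 reduce to the prime 2. This brute-force search is only sensible for small finite relations; it is meant to make the definitions tangible, not to be efficient.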
For example, the confluence diagram becomes Vz, ¥1, 92 y1 & © > y > Az. yi zt yo. 2. Because x | y implies « ¢ y, the Church-Rosser property can also be phrased as an equivalence: « & y x | y. 3. Any terminating relation is normalizing, but the converse is not true, as the example in Fig 2.2 shows. Fig. 2.2. Confluent, normalizing and acyclic but not terminating. Thus we have come back to our initial motivation: the Church-Rosser property is exactly what we were looking for, namely the ability to test, equivalence by the search for a common successor. We will now see how it relates to termination and confluence. 2.1.2 Basic results It turns out that the Church-Rosser property and confluence coincide. The fact that any Church-Rosser relation is confluent is almost immediate, and the reverse implication has a beautiful diagrammatic proof which is shown in Fig. 2.3. It is based on the observation that any equivalence z > y can be 2.1 Equivalence and reduction u Fig. 2.3. Confluence implies the Church-Rosser property. written as a series of peaks as in the top of the diagram. Now you can use confluence to complete the diagram from the top to the bottom. The formal proof below yields some additional information by involving an intermediate property: Definition 2.1.4 A relation — is semi-confluent (Fig. 2.1) iff not>m > nly Although semi-confluence looks weaker than confluence, it turns out to be equivalent: Theorem 2.1.5 The following conditions are equivalent: 1. — has the Church-Rosser property. 2. — is confluent. 3. — is semi-confluent. Proof We show that the implications 1 = 2 = 3 = 1 hold. (1 = 2) If + has the Church-Rosser property and y; © x 7+ yp then y: ty and hence, by the Church-Rosser property, yi | yo, ie. > is confluent. (2+ 3) Obviously any confluent relation is semi-confluent. (3 > 1) If + is semi-confluent and x + y then we show x | y, ie. the Church-Rosser property, by induction on the length of the chain « y. If a =y, this is trivial. If ¢ 4 y © y/ we know « | y by induction hypothesis, ie. 2 z+ y for some suitable z. We show z | y' by case distinction: Jy follows directly from x | y. y= y': semi-confluence implies z | y/ and hence « | y/. The reasoning is displayed graphically in Fig. 2.4. a 12 2 Abstract Reduction Systems Case y + y! Case y + y! Fig. 2.4. Semi-confluence implies the Church-Rosser property. This theorem has some easy consequences: Corollary 2.1.6 If + is confluent and x @ y then 1. ay if y is in normal form, and 2. 2=y if both x and y are in normal form. Now we know that for confluent relations, two elements are equivalent iff they are joinable. Of course the test for joinability can be difficult (and even undecidable) if the relation does not terminate: given two elements which are not joinable, when should you stop the search for a common successor in case of an infinite reduction starting from one of the two elements, as in the following example? o 7a > @ >, b > hh > bk > It turns out that normalization suffices for determining joinability. To see this, let us explore the relationship between termination, normalization, confluence and the uniqueness of normal forms. Fact 2.1.7 If + is confluent, every element has at most one normal form. Since every element has at least one normal form if — is normalizing, it follows that for confluent and normalizing relations every element x has exactly one normal form which we write x|: Lemma 2.1.8 If + is normalizing and confluent, every element has a unique normal form. 
Having established under what conditions the notation «| is well-defined, we immediately obtain our main theorem: Theorem 2.1.9 If — is confluent and normalizing then x > y = xl =yl. 2.2 Well-founded induction 13 Proof The <-direction is trivial. Conversely, if x y then x| ¢ y| and hence «| = y| by Corollary 2.1.6. a ‘Thus we have finally arrived at a very goal-directed equivalence test: simply check if the normal forms of both elements are identical. Provided normal forms are computable and identity is decidable, equivalence also becomes decidable. Many authors prefer to work with termination instead of normalization and state Theorem 2.1.9 with “convergent” instead of “confluent and norma- lizing”. Although normalization suffices for finding normal forms, it means that breadth-first rather than depth-first search may be required, for ex- ample in Fig. 2.2. For this reason we will also concentrate on termination rather than normalization in the sequel. Exercises 2.1 Which closure operations commute? Find a proof or counterexample: (a) Is the reflexive closure of the transitive closure the same as the transitive closure of the reflexive closure, i.e. are (+)= and ()* the same and do they coincide with “+? (b) What about the transitive and the symmetric closure? Do (<) and (4) U (4)-? coincide? + 2.2 Show that — is confluent and normalizing iff every element has a unique normal form. 2.3 Find areduction — on N such that — is decidable but it is undecidable if some n is in normal form. 2.2 Well-founded induction This section introduces the important proof principle of well-founded in- duction and shows that it is enjoyed by all terminating relations. As a motivation, recall the principle of induction for natural numbers: a pro- perty P(n) holds for all natural numbers n. if we can show that P(n) holds under the (induction) hypothesis that P(m) holds for all m m, > --- of natural numbers. The principle of well-founded in- duction is a generalization of induction from (N,>) to any terminating reduction system (A,—). Formally, it is expressed by the following infe- 4 2 Abstract Reduction Systems rence rule: Yee A (WEA cy > P(y)) > P(e) (WET Va € A. P(x) ) where P is some property of elements of A. The horizontal line is simply another symbol for implication. In words: to prove P(r) for all x, it suffices to prove P(x) under the assumption that P(y) holds for all successors y of x. It may come as a bit of a surprise to see an induction schema without explicit base case. The solution to this puzzle is that the premise of WFI subsumes the base case. If — is terminating, the “base case” of the induction consists of showing that P(x) holds for all elements without successor, i.e. all normal forms. Hence the assumption (Vy € A.a Sy => P(y)) is trivially true and the premise of WFI degenerates to P(z), just as expected. WF is not correct for arbitrary —., but for terminating ones it is: Theorem 2.2.1 If — terminates then WFI holds. Proof by contraposition. Assume that WFI does not hold for —, i.e. there is some P such that the premise of WFI holds but the conclusion does not, i.e. sP(ao) for some ao € A. But then the premise of WFI implies that there must exist some a; such that ap % a1 and =P(a1). By the same argument, there must exist a a cued that a; az and +P(az). Hence there is an infinite chain ap + a, % a2 4 ---, i.e. + does not terminate. a As a first application of WFI, we can prove the converse of this theorem: Theorem 2.2.2 If WFI holds, then — terminates. 
Proof by WFI where P(x) := “there is no infinite chain starting from a”. The induction step is simple: if there is no infinite chain starting from any successor of «, then there is no infinite chain starting from « either. Hence the premise of WFI holds and we can conclude that P() holds for all , i. — terminates. a A few words on terminology. Terminating relations are usually called well- founded in the mathematical literature. Hence the term “well-founded induction”. In the computer science literature the terms Noetherian} and Noetherian induction are sometimes used instead. Strictly speaking, a reduction system (A,—+) is well-founded if every nonempty B C A has a minimal clement, ic. some b € B such that b > ¥! for no b € B. With + In honour of the mathematician Emmy Noether. 2.2 Well-founded induction 15 the help of the Axiom of Choice it can be shown that well-foundedness and termination are equivalent. We will now use well-founded induction to study some further properties of reductions which are related to termination. Definition 2.2.3 A relation — is called finitely branching. if each element has only finitely many direct successors, globally finite if cach clement has only finitely many successors, acyclic if there is no element a such that a * a. Note that — is globally finite iff * is finitely branching. Lemma 2.2.4 A finitely branching relation is globally finite if it is termi- nating. Proof Let — be finitely branching and terminating. We use well-founded induction to prove that for every element the set of all its successors is finite. Since this is true for all its direct successors (by induction hypothesis), of which there are only finitely many, it is also true for the element itself. O It is not true that a finitely branching relation is terminating if it is glo- bally finite. The reason is cycles. However, we have the following weaker implication: Lemma 2.2.5 Any acyclic relation is terminating if it is globally finite. The combination of the last two lemmas says that a finitely branching and acyclic relation is globally finite iff it is terminating. The special case of an acyclic relation induced by a tree is known as Kénig’s Lemma: A finitely branching tree is infinite iff it contains an infinite path. Exercises 2.4 Show that + is terminating iff — is. 2.5 Show that + is a strict partial order iff — is acyclic. 2.6 A relation — is called bounded iff for each element the length of all paths starting from it is bounded: Wz. Jn. dy. 2 3 y. (a) Is every terminating relation bounded? (b) Show that a finitely branching relation terminates iff it is boun- ded. 2.7 Prove Lemma 2.2.5. 16 2 Abstract Reduction Systems 2.3 Proving termination ‘The importance of termination hardly needs emphasizing: it is essential not just for programmers but also for theoreticians, as the previous sections, in particular the connection with well-founded induction, have shown. We will now examine a number of constructions for proving termination, a hard (because undecidable) task, as computer scientists well know. These con- structions are on the level of relations and are applicable to termination proofs of programs as well as to purely mathematical questions, for example from the realm of group theory. In connection with termination, it frequently pays to work with transitive relations or even partial orders. One reason is that there is a vast body of mathematical literature on partial orders. Another is that some of our constructions (e.g. 
the multiset order) are simpler for partial orders than for arbitrary relations. Fortunately, the transition to partial orders is without loss of generality: + terminates iff + does, in which case + is a strict order (Exercises 2.4 and 2.5). The most basic method for proving termination of some (A,—+) is to embed it into another abstract reduction system (B, >) which is known to terminate. This requires a monotone mapping y : A > B, where monotone means that c — 2’ implies p(x) > ¢y(2’). Now — terminates because an infinite chain x9 + 21 — ++ would induce an infinite chain g(x) > (a1) > +++. The mapping ¢ is often called a measure function and the whole construction is known as the inverse image construction (because + Cy (>) = {(z,2’) | y(x) > y(z')}). Note that if y is the identity, this yields that any subset of a terminating relation is terminating. Example 2.3.1 The most popular choice for termination proofs is an em- bedding into (N, >), which is known to terminate. For strings, i.e. A:= X* for some set X, there are two natural choices: 1. Length. ¢ is defined by y(w) := |w|. This proves termination of all length-decreasing reductions like uabby —1 uaav, where u,v € A are arbitrary and a,b € X are fixed. . Letters. For each a € X define yq(w) := “the number of occurrences of ain w”. This can cope with reductions like wav 2 vbu where u,v € A are arbitrary and a,b € X, a b, are fixed. » How about —1U—+2? We claim it also terminates, in which case Lemma 2.3.3 below tells us that there exists a measure function into N. Can you find one? Many program termination proofs follow the same schema by showing that 2.8 Proving termination 17 every computation step (e.g. loop iteration or recursive call) decreases the value of some expression y(Z) in terms of the program variables Z. Example 2.3.2 Assume all variables in the following program range over natural numbers: while ub > Ib+1 do begin r := (ub+lb) div 2; if ® then ub := r else lb := r end Termination is independent of the test (provided ® terminates and has no side effect) and can be proved with the measure function y(ub, 1b) := ub— Ib which decreases with every loop iteration. The popularity of measure functions into N is in part explained by the following completeness result: Lemma 2.3.3 A finitely branching reduction terminates iff there is a mo- notone embedding into (N,>). Proof The <=-direction follows from the soundness of the measure function approach. For the other direction, let + be a terminating and finitely branching reduction. Define ¢(zx) as the number of successors of a which, by Lemma 2.2.4, must be finite. Since — is terminating and hence acyclic, a — a’ implies that z' has strictly fewer successors than z. Alternatively, (x) can be defined as the length of the longest reduction starting from 2. Since — terminates, Exercise 2.6 implies that (a) is well-defined. o ‘The restriction to finitely branching relations is necessary, as the following example shows. Example 2.3.4 Let A := N x N and let — be defined by the two rules (i+1, 9) > (ih) and (i,j +1) — (i, 4) for all i, j,k > 0. This reduction is not finitely branching because the value of k in the first rule is not constrained by the left-hand side, Termination of + can be shown by a simple lexicographic construction (see Section 2.4). Yet there is no monotone function y from (N x N, +) into (N,>). For if there were such a function y, observe that monotonicity implies k := (1,1) > y(0,k) > y(0,k — 1) > ++ > (0,0). 
This is a contradiction because there are only k natural numbers below k and yet the chain y(0,k) > --- > ¢(0,0) has length k + 1. Even in the context of finitely branching reductions, an embedding into N can be tricky to find. 18 2 Abstract Reduction Systems Example 2.3.5 Let A = NxN and define the reduction by (i,j +1) — (i, j) and (i + 1,j) — (i,i). This reduction terminates at (0,0) for every start point. It is also finitely branching. Hence there is a measure function into N. In this particular case y(i, j) = i?+j does the job, but it takes a moment to find this function and prove that it is monotone. We will now discuss how to get around the above problems with measure functions into N by building complex orders from simpler ones using fixed constructions which preserve termination. Exercises 2.8 Find a measure function into N which proves termination of — in Example 2.1.2, part 2. 2.9 Find a measure function into N which proves termination of > U2 in Example 2.3.1. 2.4 Lexicographic orders Given two strict orders (A, >,4) and (B,>g), the lexicographic product >axp on A x B is defined by (@,y) >axa (o'y') 1 (@ >aa’)V(e=2' Ay >By)- If A and B are obvious from the context we write > instead of >axp- Sometimes we also write >A Xtex >B- The following property is routine to prove: Lemma 2.4.1 The lericographic product of two strict orders is again a strict order. More interestingly we have Theorem 2.4.2 The lexicographic product of two terminating relations is again terminating. Proof by contradiction. Assume there is an infinitely descending chain (a0, bo) > (a1,b1) > +++. This implies ap >4 a1 >4 ---. Since > terminates, this chain cannot contain an infinite number of strict steps a; >4 ai41- Hence there is a k such that a; = a;4; for all i > k. But this implies b; > bi41 for all i > k, which contradicts the termination of >. o This theorem proves termination of + on N x N in Examples 2.3.4 and 2.3.5: (i,j) — (W,7’) is defined such that (i, 7) is lexicographically greater 2.4 Levicographic orders 19 than (i’,j’), ie. + is a subset of the terminating relation >yxn. It also proves termination of —+1 U—+2 in Example 2.3.1: 1 decreases the length whereas — leaves the length invariant but decreases the number of as. Lexicographic products are essential in building up more complex orders from simpler ones. By iteration, we can form lexicographic products over any number of strict orders (Aj, >i), i= 1,...,n: >1..n, Where n > 1, is the lexicographic product of >; and >2,.n. Unwinding the recursion and writing > instead of >1,.n we get (1,666) fn) > (Yiye0 yn) 1% Fh Sn. (Vi < KE = Yi) Nk > Yee (2-1) Ifall (Aj, >i) are the same we write >f., for the n-fold lexicographic product. ‘The above results for the binary lexicographic product carry over to n-fold products: > is again a strict order and it terminates if all the >; terminate. The proofs are by induction on n. Instead of tuples of fixed length, we can also consider strings of arbitrary but finite length: given a strict order (A, >), the lexicographic order >j,, on A* is defined as UD fee v6 (ful > fol) V (lul = lo] Aw Heh 0) where |w| is the length of w and >!!! is the order on Al! defined in (2.1) above. More concisely, we can define >#,, as the lexicographic product of >ten and User >feg Where U >ten v 24> [ul > |v|- Since A’ and Ad are disjoint if i 4 j, the second component of this product is a union of orders over disjoint sets. 
Since such unions (this is easy to see) and lexicographic products (as shown above) preserve strict orders and termination, we have Lemma 2.4.3 If > is a strict order, so is >},,. If > terminates, so does Dice Despite its name, >j., is not the order used in dictionaries. The latter does not terminate: a >dict @@ >dict AAA >dict ***- Yet another interesting variation on lexicographic orders compares strings from left to right as follows: w1 >rez wo if we is a proper prefix of w; or if ‘w] = uav, w2 = ubw and a > b, where > is the underlying strict order. For example, if a > b, then aaaa >zex aaa >Ler abba. Unfortunately, >rex need not terminate either, even if > does (exercise!). Nevertheless, >ze can be a useful component in more complicated orders. Lemma 2.4.4 If > is a strict order, 80 is >zes- ‘The proof, a simple case analysis, is left as an exercise. 20 2 Abstract Reduction Systems A final word of warning about our definition of the lexicographic pro- duct. Although we assume the component relations to be strict orders, the definition works just as well for arbitrary relations. In fact, Theorem 2.4.2 depends on termination only. Nevertheless, the lexicographic product of two arbitrary relations may not be what you expect. For example >y Xver >N relates all (i, 7) and (i,k), simply because i >y i. Hence you should not use Xjez directly with reflexive relations. Given two partial orders >4 and >z, their lexicographic product should be defined as the reflexive closure of >A Xlez >p. (Remember that the strict part of a partial order > is written >.) Of course this can be written more succinctly, if slightly ambiguously, as >Axp. Alternatively, we can define the lexicographic product directly for partial orders: (2,y) 2axp (ay) + (@>az')V(e=a' Ay>sy/). It is easy to show that these two definitions of > 4x. are equivalent and that >axp is a partial order if >4 and >p are partial orders (exercise!) Exercises 2.10 Prove Theorem 2.4.2 by well-founded induction. 2.11 Show that the following process always terminates. There is a box full of black and white balls. Each step consists of removing an arbitrary ball from the box. If it happens to be a black ball, one also adds an arbitrary (but finite) number of white balls to the box. 2.12 Show that v; >j,, v2 implies uvjw >}, uv2w. 2.13 Show that >, is linear if both >4 and > are. 2.14 Show that >}, is linear if > is. 2.15 Why do the following two programs terminate, provided all variables range over positive natural numbers? while m # n do ifm > nthen m := m—nelsen :=n-m while m # n do ifm > nthen m := m-n else begin h := m; m := 7; n := h end What if the variables range over positive rational numbers? 2.5 Multiset orders 21 2.16 Show that the evaluation of the following recursively defined function, also known as Ackermann’s function, terminates for all m,n € N: ack(0,n) = n+1, ack(m+1,0) = ack(m,1), ack(m+1,n+1) = ack(m,ack(m+1,n)). I 2.17 Does termination of > imply termination of >pec? 2.18 Prove Lemma 2.4.4. 2.19 Show that >er is linear if > is. 2.20 Formalize the order used in dictionaries. 2.21 The lexicographic product of two quasi-orders 24 and 2p is defined as follows: (2,9) 2 (ay) 4 >aa'V(eraa' Ay2py). (a) Show that > is a quasi-order if both 24 and 2p are. (b) Show that >, the strict part of >, terminates if >4 and >, do. 2.5 Multiset orders Consider the following reduction on N*: u(i+1)v — uiiv for all u,v € N* and i€N. 
It turns out that — terminates, and because it is finitely branching, there should also exist a measure function into N. If you want to spare yourself the torture of finding that function, you should read on. One of the most powerful ways of building terminating orders is multisets. They are usually defined as “sets with repeated elements”, which the purist will find a contradiction in terms, but which conveys their nature quite well. Examples are {a,a,5} and {a,b,a}, which are identical, and {a,b 5}, which is distinct from them. Of course, we can also be more formal: Definition 2.5.1 A multiset M over a set A is a function M: A > N. Intuitively, M(c) is the number of copies of « € A in M. A multiset M is finite if there are only finitely many x such that M(x) > 0. Let M(A) denote the set of all finite multisets over A. Although multisets can be infinite, and much of the theory works for infinite multisets, the bit that is crucial for us fails: termination. Therefore all our multisets are assumed to be finite unless stated otherwise. We use standard set notation like {a,a,b} as an abbreviation of the func- tion {a++ 2,b++ 1,c++ 0} over the base set A = {a,b,c}. It will be obvious from the context if we refer to a set or a multiset. 22 2 Abstract Reduction Systems Most set operations are easily generalized to multisets by replacing the underlying Boolean operations by similar ones on N. Definition 2.5.2 Some basic operations and relations on M(A) are: Element : c¢M : M(z)>0. Inclusion : MCN : Vr € A. M(a) < N(z). Union : (MUN)(2) = M(z) + N(2). Difference : (M—N)(c) = M(x) + N(e) where m =n is m—n if m >n and is 0 otherwise. Some typical examples: 0 C {a,a} C {a,a,a}, {a,b} U {b,a} = {a,a,,B} and {a,b, b,b} — {a, a,b,c} = {b,D}. Now we come to the central concept of this section, an order on multisets: the smaller multiset is obtained from the larger one by removing a nonempty subset X and adding only elements which are smaller than some element in x. Definition 2.5.3 Given a strict order > on a set A, we define the corres- ponding multiset order >muz on M(A) as follows: M >mu N iff there exist X,Y € M(A) such that O#AX CM and N =(M-—X)UY and Wy EY. 3ceX.c>y. For example, {5,3, 1,1} >mu {4,3,3, 1} is verified by replacing X = {5,1} by Y = {4,3}. Note that X and Y are not uniquely determined: X = {5,3,1,1} and Y = {4,3, 3,1} work just as well. Sometimes it can be useful to realize that M >mui N holds iff you can get from M to N by carrying out the following procedure one or more times: remove an element x and add a finite number of elements, all of which are smaller than x (see Exercise 2.22). On finite multisets, the multiset order is again a strict order: Lemma 2.5.4 If > is a strict order, so is >mul- Proof Irreflexivity: if M >my M, there are X and Y such that X C M, M =(M-X)UY, ie. X =Y, and Wy € Y.3e € X. x > y, which implies Vy € X.dr € X. x > y. Since > is a strict order this implies that X is infinite, a contradiction. ‘Transitivity is more involved. If Mi >mut Mz >mu M3 then Mp = (My — X1) UY; and Mg = (Mz — X2) UY2, for multisets X; and Y; satisfying the appropriate conditions in the definition of >my:. We now claim that 2.5 Multiset orders 23 X := X,U(X2—¥j) and ¥ = (Yj — X2) UY) prove My >mur Mg. Let us look at the required conditions in turn. « X #0 is implied by X1 40. @ XC Mp = (My —X1)UYj implies X»—¥ C Mi —X; and hence, because 1 CM, X= XU (%2-N) CM. © We need to show that Ms = (Mj — X) UY =: M$, which follows if we can show Ma(a) = Mj(a) for an arbitrary a € A. 
We have Mj(a) = (Mi(a) = (X1(a)+(X2(a) ~ ¥i(a))))+((4a(@) + X2(a))+¥2(a)). Because X C Mi, the first “+” in this expression can be replaced by an ordinary minus “—”, which (after some arithmetic rearrangement) yields Mj(a) = (Mi (a)—X1(a))+((¥a(a) + ¥2(a))—(Xa(a) + Yi(a)))+¥o(a). Obviously, (m = n)—(n + m) = m—n, and thus we obtain M3(a) = (((Mi(a) - Xi(a)) + ¥i(a)) — X2(a)) + Yaa) = (Ma(a) — Xo(a)) + Ya(a) = Ms(a). © To prove Vy €Y. dr € X.a>yletyeY. fy %, Mi >mut Mo implies a > y for some x € X; C X. If y € Y2, M2 >mut M3 implies x > y for some « € X92. If « € X2—Yi C X, we are done. Otherwise « € Yi, in which case Mi >mut Mz implies x > x for some x € X1 C X and hence a > y by transitivity of > on A. o ‘The really important nontrivial property of >mut is Theorem 2.5.5 The multiset order >mu is terminating iff > is. Proof If > does not terminate, there is an infinite chain ap > a, > --+ which induces an infinite chain {a9} >mui {a1} >mui ‘++ of multisets. Hence >mut does not terminate either. If > terminates, we show by contradiction that >mu terminates. Assume there is an infinite chain Mo >mut Mi >mui «++. We can then build a finitely branching but infinite tree where the nodes are labelled with elements of A such that along each path the labels decrease w.r.t. >. Using Kénig’s Lemma, it follows that this tree must have an infinite branch, which yields an infinitely descending sequence in A, the desired contradiction. It remains to be seen how to construct this tree. Let 1 be an arbitrary element not in A, let Ay := AU {1}, and extend > by defining a > 1 for all a € A. Obviously (A, >) is still terminating. Now we grow the following tree whose nodes are labelled with elements of A. At stage i of the construction the non- leaf nodes form the multiset M;. The initial tree has a root with an arbitrary label and a successor node for each element of Mo, e.g. Mo = {5,3, 1,1}: 24 2 Abstract Reduction Systems Q ®®OO® Since Mo >mut Mj, there are finite multisets X and Y with the properties stated in the definition of >mu. For every y € Y add a new node labelled y and make it the child of some leaf node labelled « where x > y. By definition of >mut such an x must exist in X C Mp and hence « is among the current leaf nodes. In addition we add a son labelled | to each 2 € X. This ensures that even if Y is empty, the tree has grown. Example: M, = {4,3,3,1}, X = {5,1} and Y = {4,3}: Q ®@®OO®O © OOO © ‘This process can be continued for Mz, Mg, .... Thus we are constructing a finitely branching (the M; are finite) but infinite (for each M; at least one node is added) tree. Ignoring the root node, the labels on each path are strictly decreasing by construction. a Note that the proof does not require > to be a strict order but works for any relation. It is now easy to see that the reduction u(i+ 1)v — uéiv considered at the beginning of this section terminates: the mapping y : N* + M(N) defined by (it -.-in) = {i1,-.-; in} is obviously monotone (io(u(i + 1)v) = 9(u)U{i+T}UG(v) >mut P(u) Ui, }UY(v) = e(uiiv)) and >mur on M(N) terminates because > on N does. The above definition of >mx is quite intuitive but also a little cumbersome because of its many quantifiers and conditions. Therefore the following alternative characterization is useful: Lemma 2.5.6 If > is a strict order and M,N € M(A), then M>muN # MANAYNEN-M.3AmeM—-N.m>n. Proof For the =>-direction, assume M >mu N, in which case there are X and Y as in the definition of >mu. 
M ≠ N follows from irreflexivity of >mul. For the second conjunct, let y1 ∈ N − M = ((M − X) ∪ Y) − M = ((M ∪ Y) − X) − M = ((M ∪ Y) − M) − X = Y − X. Hence there is a y2 ∈ X such that y2 > y1. Either y2 ∈ X − Y = (M − (M − X)) − Y = M − ((M − X) ∪ Y) = M − N, in which case we are done, or y2 ∈ X ∩ Y (where (X ∩ Y)(x) := min(X(x), Y(x))), in which case there is a y3 ∈ X such that y3 > y2. Because our multisets are finite and > is a strict order, there is no infinite ascending chain y1 < y2 < y3 < ··· in X ∩ Y, i.e. this process must always terminate with some yn ∈ X − Y = M − N. Transitivity yields yn > y1.

The ⇐-direction is left as an exercise. □

It is worth noting that if > is linear, then M >mul N can be computed quite efficiently: sort M and N into descending order (w.r.t. >) and compare the resulting lists lexicographically w.r.t. >lex. Let sort(M) be the sorted version of M. It is easy to see that sort(M) >lex sort(N) implies M >mul N: either sort(N) is a proper prefix of sort(M), in which case N ⊊ M and hence M >mul N; or sort(M) = u m v and sort(N) = u n w such that m > n, in which case m is larger than all elements of w, which again implies M >mul N. Conversely, if sort(M) ≯lex sort(N), then either sort(M) = sort(N) or sort(N) >lex sort(M) (Exercise 2.19), and thus M = N or N >mul M; since >mul is a strict order, this implies M ≯mul N in both cases. Thus we conclude that

  M >mul N  ⇔  sort(M) >lex sort(N).                                 (2.2)

Let us briefly look at the multiset extension of partial orders. As in the lexicographic case, we have to be a bit careful. If we simply replace > by ≥ in Definition 2.5.3, we end up with {1} being greater than {1,1}, which is not desirable. Instead, ≥mul, the multiset extension of a partial order ≥, is defined as follows:

  M ≥mul N  :⇔  M >mul N ∨ M = N.

Given a quasi-order (A, ≳), we define its multiset extension via the induced partial order ≥ on A/∼:

  M ≳mul N  :⇔  M/∼ ≥mul N/∼,   where {a1,...,ak}/∼ := {[a1]∼,...,[ak]∼}.

Exercises

2.22 Given a strict order (A, >), define the following single-step relation >1mul on M(A):

  M >1mul N  :⇔  ∃x ∈ M, Y ∈ M(A). N = (M − {x}) ∪ Y ∧ ∀y ∈ Y. x > y.

Show that >mul is the same as the transitive closure of >1mul. (Hint: show that each relation is contained in the other using appropriate inductions.) Conclude that >mul is transitive.

2.23 Show that X and Y in the definition of >mul can always be chosen such that they are disjoint.

2.24 Give a counterexample to Lemma 2.5.4 for infinite multisets. Show that Lemma 2.5.4 also holds for infinite multisets provided there is no infinite ascending chain x0 < x1 < ···.

2.25 Prove the ⇐-direction of Lemma 2.5.6.

2.26 Show that if ≥ is a partial order, so is ≥mul, and that ≳mul is a quasi-order if ≳ is one.

2.6 Orders in ML

How should we implement strict/partial orders in general? The obvious implementation as a function ord: τ * τ -> bool has its problems:

• If ord(x,y) implements x > y, we cannot recover x ≥ y by writing ord(x,y) orelse x = y, because in general we cannot assume that the mathematical equality = on the base set A coincides with the programming language equality = on the type τ used to implement A. For example, if sets are implemented by lists, we do not have [1,2] = [2,1], although they are equal as sets.

• If ord(x,y) implements x ≥ y, we can compute x > y as ord(x,y) andalso not(ord(y,x)). This is mathematically correct but inefficient because of the two calls to ord. The performance penalty is exponential in the depth of the nesting of orders.

• Implementing both > and ≥ is likely to duplicate much of the code.

To overcome these problems we introduce

  datatype order = GR | EQ | NGE;

which represents the three outcomes >, = and ≱. We say that a function ord computes a strict/partial order >/≥ if

  ord(x,y) = GR    if x > y,
  ord(x,y) = EQ    if x = y,
  ord(x,y) = NGE   otherwise.

Note that by x = y we mean equality on the abstract, not the implementation level. The latter is the ML test x = y, which is too weak, as the set/list example demonstrates: on the implementation level, a partial order may turn into a quasi-order. The purpose of EQ instead of = is to hide that fact. On the other hand, we may even start with a quasi-order ≳, in which case GR, EQ and NGE represent ≻, ∼ and the remaining case (neither ≻ nor ∼).
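As a small illustration (the function below is our own example rather than part of the code developed in this chapter), the standard order on integers can be packaged as such an order-valued comparison:

  (* intOrd: int * int -> order
     computes the usual strict order > on int in the sense just described *)
  fun intOrd (x,y) = if x > y then GR else if x = y then EQ else NGE;

A base order of this kind can then be handed to the combinators developed in the following subsections, for example as lex intOrd or mul intOrd.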
2.6.1 Lexicographic orders

Unsurprisingly, × is implemented by * and A^n by list. The corresponding constructions ×lex and >lex^n are equally straightforward. Note that A^n should be implemented not as an n-fold product but as a list, in which case >lex^n has the following simple recursive implementation:

  (* lex: (α * β -> order) -> α list * β list -> order *)
  fun lex ord ([],[]) = EQ
    | lex ord (x::xs, y::ys) = case ord(x,y) of
          GR  => GR
        | EQ  => lex ord (xs,ys)
        | NGE => NGE;

If ord implements > then lex ord implements >lex^n for any n. Note that lex ord is undefined if the two argument lists have different lengths. The type of lex is slightly more general than one might have expected because ord could potentially compare elements of two different types. This kind of unexpected generalization is a frequent ML phenomenon which we will not comment on in the future.

2.6.2 Multiset orders

We represent finite multisets over a type τ by τ list, which leads to very simple algorithms. For example, ∪ becomes @. Multiset difference, however, needs to be parameterized by the order on τ because we need to compare elements for EQuality on the abstract level:

  (* rem1: (α * β -> order) -> α list * β -> α list *)
  fun rem1 ord ([], _) = []
    | rem1 ord (x::xs, y) = if ord(x,y) = EQ then xs
                            else x :: (rem1 ord (xs, y));

  (* mdiff: (α * β -> order) -> α list * β list -> α list *)
  fun mdiff ord (xs, []) = xs
    | mdiff ord (xs, y::ys) = mdiff ord (rem1 ord (xs,y), ys);

The starting point for an implementation of >mul is not its actual definition, which is marred by existential quantifiers, but Lemma 2.5.6, which can be expressed in ML almost verbatim:

  (* mul: (α * α -> order) -> α list * α list -> order *)
  fun mul ord (ms,ns) =
      let val nms = mdiff ord (ns,ms)
          val mns = mdiff ord (ms,ns)
      in  if null(nms) andalso null(mns) then EQ
          else if forall (fn n => exists (fn m => ord(m,n) = GR) mns) nms
               then GR else NGE
      end;

The "almost" is a consequence of the fact that we cannot use = to compare ms and ns. The test null(nms) andalso null(mns) is justified by the equivalence M = N ⇔ (M − N) = ∅ = (N − M) on the multiset level.

Assuming that the running time of ord is constant, mdiff ord (ms,ns) has time complexity O(mn), where m and n are the lengths of ms and ns. This is inherited by mul ord (ms,ns) because O(mn + nm + |mns|·|nms|) = O(mn). If ord is a linear order, condition (2.2) above allows an implementation of mul which runs in time O(m+n), provided multisets are represented by sorted lists.
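The following is one possible sketch of that linear-order variant (our own code, not part of this chapter's library): it assumes that both argument lists are already sorted in descending order w.r.t. ord and implements the lexicographic comparison of (2.2), where a proper prefix counts as smaller.

  (* mulLin: (α * α -> order) -> α list * α list -> order
     multiset comparison for a linear base order ord; both lists
     must be sorted in descending order w.r.t. ord               *)
  fun mulLin ord ([], [])   = EQ
    | mulLin ord (_::_, []) = GR     (* second list is a proper prefix of the first *)
    | mulLin ord ([], _::_) = NGE    (* first list is a proper prefix of the second *)
    | mulLin ord (m::ms, n::ns) = case ord(m,n) of
          GR  => GR
        | NGE => NGE
        | EQ  => mulLin ord (ms,ns);

For instance, with the intOrd sketched above, mulLin intOrd ([5,3,1,1], [4,3,3,1]) evaluates to GR, in line with the example after Definition 2.5.3.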
Exercises

2.27 Implement >lex.

2.28 Implement multisets as association lists which pair every element with the number of times it occurs in the multiset. Update the code for mdiff and mul accordingly.

2.7 Proving confluence

Proving confluence can be hard work because one has to consider forks y1 *← x →* y2 of arbitrary length. We will now look at ways of localizing the confluence test to single-step forks y1 ← x → y2.

Definition 2.7.1 A relation → is locally confluent (Fig. 2.5) iff

  y1 ← x → y2  ⟹  y1 ↓ y2.

[Fig. 2.5. Local confluence, strong confluence, and the diamond property.]

Local confluence is strictly weaker than confluence. A simple example is shown in Fig. 2.6 on the left: both local forks a ← 0 → 1 and 0 ← 1 → b can be closed, yet the reduction is not confluent. One might suspect that the cycle between 0 and 1 is responsible, but the second example in Fig. 2.6 (only an initial segment of the infinite graph generated by 2n → a, 2n+1 → b and n → n+1 is shown) proves that this is not the case: even for acyclic reductions, local confluence does not imply confluence.
