ML for the Working Programmer
2nd edition
Lawrence C. Paulson
University of Cambridge
CAMBRIDGE UNIVERSITY PRESS
Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
40 West 20th Street, New York, NY 10011-4211, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia
© Cambridge University Press 1991, 1996
First edition published 1991
First published in paperback (with corrections) 1992
Reprinted 1993, 1995.
Second edition published 1996
Printed in Great Britain at the University Press, Cambridge
British Library cataloguing in publication data available
Library of Congress cataloguing in publication data available
ISBN 0 521 57050 6 (hardback)
ISBN 0 521 56543 X (paperback)
The cover illustration is taken from Work by Ford Madox Brown
© Manchester City Art Galleries, and is reproduced with their permission
Trademarks. Miranda is a trademark of Research Software Limited. Sun and
SuperSPARC are trademarks of Sun Microsystems. Unix is a trademark of AT&T
Bell Laboratories. Poplog is a trademark of the University of Sussex. MLWorks is
a trademark of Harlequin Limited. DEC and PDP are trademarks of Digital
Equipment Corporation.
Dedication. For Sue, Nathan and Sarah
DISCLAIMER OF WARRANTY
The programs listed in this book are provided ‘as is’ without warranty of any kind.
We make no warranties, express or implied, that the programs are free of error, or
are consistent with any particular standard of merchantability, or that they will meet
your requirements for any particular application. They should not be relied upon for
solving a problem whose incorrect solution could result in injury to a person or loss
of property. If you do use the programs or procedures in such a manner, it is at your
own risk. The author and publisher disclaim all liability for direct, incidental or
consequential damages resulting from your use of the programs, modules or functions
in this book.
CONTENTS
Preface to the Second Edition
Preface
1 Standard ML
Functional Programming
1.1 Expressions versus commands
1.2 Expressions in procedural programming languages
1.3 Storage management
1.4 Elements of a functional language
1.5 The efficiency of functional programming
Standard ML
1.6 The evolution of Standard ML
1.7 The ML tradition of theorem proving
1.8 The new standard library
1.9 ML and the working programmer
2 Names, Functions and Types
Chapter outline
Value declarations
2.1 Naming constants
2.2 Declaring functions
2.3 Identifiers in Standard ML
Numbers, character strings and truth values
2.4 Arithmetic
2.5 Strings and characters
2.6 Truth values and conditional expressions
Pairs, tuples and records
2.7 Vectors: an example of pairing
2.8 Functions with multiple arguments and results
2.9 Records
2.10 Infix operators
The evaluation of expressions
2.11 Evaluation in ML: call-by-value
2.12 Recursive functions under call-by-value
2.13 Call-by-need, or lazy evaluation
Writing recursive functions
2.14 Raising to an integer power
2.15 Fibonacci numbers
2.16 Integer square roots
Local declarations
2.17 Example: real square roots
2.18 Hiding declarations using local
2.19 Simultaneous declarations
Introduction to modules
2.20 The complex numbers
2.21 Structures
2.22 Signatures
Polymorphic type checking
2.23 Type inference
2.24 Polymorphic function declarations
Summary of main points
3 Lists
Chapter outline
Introduction to lists
3.1 Building a list
3.2 Operating on a list
Some fundamental list functions
3.3 Testing lists and taking them apart
3.4 List processing by numbers
3.5 Append and reverse
3.6 Lists of lists, lists of pairs
Applications of lists
3.7 Making change
3.8 Binary arithmetic
3.9 Matrix transpose
3.10 Matrix multiplication
3.11 Gaussian elimination
3.12 Writing a number as the sum of two squares
3.13 The problem of the next permutation
The equality test in polymorphic functions
3.14 Equality types
3.15 Polymorphic set operations
3.16 Association lists
3.17 Graph algorithms
Sorting: A case study
3.18 Random numbers
3.19 Insertion sort
3.20 Quick sort
3.21 Merge sort
Polynomial arithmetic
3.22 Representing abstract data
3.23 Representing polynomials
3.24 Polynomial addition and multiplication
3.25 The greatest common divisor
Summary of main points
4 Trees and Concrete Data
Chapter outline
The datatype declaration
4.1 The King and his subjects
4.2 Enumeration types
4.3 Polymorphic datatypes
4.4 Pattern-matching with val, as, case
Exceptions
4.5 Introduction to exceptions
4.6 Declaring exceptions
4.7 Raising exceptions
4.8 Handling exceptions
4.9 Objections to exceptions
Trees
4.10 A type for binary trees
4.11 Enumerating the contents of a tree
4.12 Building a tree from a list
4.13 A structure for binary trees
Tree-based data structures
4.14 Dictionaries
4.15 Functional and flexible arrays
4.16 Priority queues
A tautology checker
4.17 Propositional Logic
4.18 Negation normal form
4.19 Conjunctive normal form
Summary of main points
5 Functions and Infinite Data
Chapter outline
Functions as values
5.1 Anonymous functions with fn notation
5.2 Curried functions
5.3 Functions in data structures
5.4 Functions as arguments and results
General-purpose functionals
5.5 Sections
5.6 Combinators
5.7 The list functionals map and filter
5.8 The list functionals takewhile and dropwhile
5.9 The list functionals exists and all
5.10 The list functionals foldl and foldr
5.11 More examples of recursive functionals
Sequences, or infinite lists
5.12 A type of sequences
5.13 Elementary sequence processing
5.14 Elementary applications of sequences
5.15 Numerical computing
5.16 Interleaving and sequences of sequences
Search strategies and infinite lists
5.17 Search strategies in ML
5.18 Generating palindromes
5.19 The Eight Queens problem
5.20 Iterative deepening
Summary of main points
6 Reasoning About Functional Programs
Chapter outline
Some principles of mathematical proof
6.1 ML programs and mathematics
6.2 Mathematical induction and complete induction
6.3 Simple examples of program verification
Structural induction
6.4 Structural induction on lists
6.5 Structural induction on trees
6.6 Function values and functionals
A general induction principle
6.7 Computing normal forms
6.8 Well-founded induction and recursion
6.9 Recursive program schemes
Specification and verification
6.10 An ordering predicate
6.11 Expressing rearrangement through multisets
6.12 The significance of verification
Summary of main points
7 Abstract Types and Functors
Chapter outline
Three representations of queues
7.1 Representing queues as lists
7.2 Representing queues as a new datatype
7.3 Representing queues as pairs of lists
Signatures and abstraction
7.4 The intended signature for queues
7.5 Signature constraints
7.6 The abstype declaration
7.7 Inferred signatures for structures
Functors
7.8 Testing the queue structures
7.9 Generic matrix arithmetic
7.10 Generic dictionaries and priority queues
Building large systems using modules
7.11 Functors with multiple arguments
7.12 Sharing constraints
7.13 Fully-functorial programming
7.14 The open declaration
7.15 Signatures and substructures
Reference guide to modules
7.16 The syntax of signatures and structures
7.17 The syntax of module declarations
Summary of main points
8 Imperative Programming in ML
Chapter outline
Reference types
8.1 References and their operations
8.2 Control structures
8.3 Polymorphic references
References in data structures
8.4 Sequences, or lazy lists
8.5 Ring buffers
8.6 Mutable and functional arrays
Input and output
8.7 String processing
8.8 Text input/output
8.9 Text processing examples
8.10 A pretty printer
Summary of main points
9 Writing Interpreters for the λ-Calculus
Chapter outline
A functional parser
9.1 Scanning, or lexical analysis
9.2 A toolkit for top-down parsing
9.3 The ML code of the parser
9.4 Example: parsing and displaying types
Introducing the λ-calculus
9.5 λ-terms and λ-reductions
9.6 Preventing variable capture in substitution
Representing λ-terms in ML
9.7 The fundamental operations
9.8 Parsing λ-terms
9.9 Displaying λ-terms
The λ-calculus as a programming language
9.10 Data structures in the λ-calculus
9.11 Recursive definitions in the λ-calculus
9.12 The evaluation of λ-terms
9.13 Demonstrating the evaluators
Summary of main points
10 A Tactical Theorem Prover
Chapter outline
A sequent calculus for first-order logic
10.1 The sequent calculus for propositional logic
10.2 Proving theorems in the sequent calculus
10.3 Sequent rules for the quantifiers
10.4 Theorem proving with quantifiers
Processing terms and formulæ in ML
10.5 Representing terms and formulæ
10.6 Parsing and displaying formulæ
10.7 Unification
Tactics and the proof state
10.8 The proof state
10.9 The ML signature
10.10 Tactics for basic sequents
10.11 The propositional tactics
10.12 The quantifier tactics
Searching for proofs
10.13 Commands for transforming proof states
10.14 Two sample proofs using tactics
10.15 Tacticals
10.16 Automatic tactics for first-order logic
Summary of main points
Project Suggestions
Bibliography
Syntax Charts
Index
PREFACE TO THE SECOND EDITION
With each reprinting of this book, a dozen minor errors have silently disappeared.
But a reprinting is no occasion for making improvements, however valuable, that
would affect the page numbering: we should then have several slightly different,
incompatible editions. An accumulation of major changes (and the Editor’s urg-
ings) have prompted this second edition.
As luck would have it, changes to ML have come about at the same time. ML
has a new standard library and the language itself has been revised. It is worth
stressing that the changes do not compromise ML’s essential stability. Some ob-
scure technical points have been simplified. Anomalies in the original definition
have been corrected. Existing programs will run with few or no changes. The
most visible changes are the new character type and a new set of top level library
functions.
The new edition brings the book up to date and greatly improves the presenta-
tion. Modules are now introduced early — in Chapter 2 instead of Chapter 7
— and used throughout. This effects a change of emphasis, from data struc-
tures (say, binary search trees) to abstract types (say, dictionaries). A typical sec-
tion introduces an abstract type and presents its ML signature. Then it explains
the ideas underlying the implementation, and finally presents the code as an ML
structure. Though reviewers have been kind to the first edition, many readers
have requested such a restructuring.
The programs have not just been moved about, but rewritten. They now reflect
modern thoughts on how to use modules. The open declaration, which obscures
a program’s modular structure, seldom appears. Functors are only used where
necessary. Programs are now indented with greater care. This, together with the
other changes, should make them much more readable than hitherto. They are
also better: there is a faster merge sort and simpler, faster priority queues.
The new standard library would in any case have necessitated an early men-
tion of modules. Although it entails changes to existing code, the new library
brings ML firmly into the fold of realistic languages. The library has been de-
signed, through a long process of consultation, to provide comprehensive sup-
port without needless complication. Its organization demonstrates the benefits
of ML modules. The string processing, input/output and system interface modules
provide real gains in power.
The library forced much rewriting. Readers would hardly like to read about
the function foldleft when the library includes a similar function called foldl. But
these functions are not identical; the rewriting involved more than a change of
name. Many sections that previously described useful functions now survey cor-
responding library structures.
The updated bibliography shows functional programming and ML used in a
wide variety of applications. ML meets the requirements for building reliable
systems. Software engineers expect a language to provide type safety, modular-
ity, compile-time consistency checking and fault tolerance (exceptions). Thanks
in part to the library, ML programs are portable. Commercially supported com-
pilers offer increasing quality and efficiency. ML can now run as fast as C, es-
pecially in applications requiring complicated storage management. The title of
this book, which has attracted some jibes, may well prove to be prophetic.
My greatest surprise was to see the first edition in the hands of beginning
programmers, when the first page told them to look elsewhere. To help beginners,
I have added a few especially simple examples, and removed most references
from the main text. The rewritten first chapter attempts to introduce basic
programming concepts in a manner suitable both to beginners and to experienced C
programmers. That is easier than it sounds: C does not attempt to give programmers
a problem-solving environment, merely to dress up the underlying hardware.
The first chapter still presupposes some basic knowledge of computers.
Instructors may still wish to start with Chapter 2, with its simple on-line sessions.
At the end of the book is a list of suggested projects. They are intentionally
vague; the first step in a major project is to analyse the requirements precisely. I
hope to see ML increasingly adopted for project work. The choice of ML, espe-
cially over insecure languages like C, may eventually be recognized as a mark
of professionalism.
I should like to thank everyone whose comments, advice or code made an impact
on this edition. They include Matthew Arcus, Jon Fairbairn, Andy Gordon,
Carl Gunter, Michael Hansen, Andrew Kennedy, David MacQueen, Brian Monahan,
Arthur Norman, Chris Okasaki, John Reppy, Hans Rischel, Peter Sestoft,
Mark Staples and Mads Tofte. Sestoft also gave me a pre-release of Moscow ML,
incorporating library updates. Alison Woollatt of CUP coded the LaTeX class file.
PREFACE
This book originated in lectures on Standard ML and functional programming. It
can still be regarded as a text on functional programming — one with a pragmatic
orientation, in contrast to the rather idealistic books that are the norm — but it is
primarily a guide to the effective use of ML. It even discusses ML’s imperative
features.
Some of the material requires an understanding of discrete mathematics: ele-
mentary logic and set theory. Readers will find it easier if they already have some
programming experience, but this is not essential.
The book is a programming manual, not a reference manual; it covers the major
aspects of ML without getting bogged down with every detail. It devotes some
time to theoretical principles, but is mainly concerned with efficient algorithms
and practical programming.
The organization reflects my experience with teaching. Higher-order functions
appear late, in Chapter 5. They are usually introduced at the very beginning
with some contrived example that only confuses students. Higher-order functions
are conceptually difficult and require thorough preparation. This book begins
with basic types, lists and trees. When higher-order functions are reached,
a host of motivating examples is at hand.
The exercises vary greatly in difficulty. They are not intended for assessing
students, but for providing practice, broadening the material and provoking
discussion.
Overview of the book. Most chapters are devoted to aspects of ML. Chapter 1
introduces the ideas behind functional programming and surveys the history of ML.
Chapters 2-5 cover the functional part of ML, including an introduction to modules.
Basic types, lists, trees and higher-order functions are presented. Broader
principles of functional programming are discussed.
Chapter 6 presents formal methods for reasoning about functional programs.
If this seems to be a distraction from the main business of programming, consider
that a program is worth little unless it is correct. Ease of formal reasoning is a
major argument in favour of functional programming.
Chapter 7 covers modules in detail, including functors (modules with parameters).
Chapter 8 covers ML’s imperative features: references, arrays and input/output.
The remainder of the book consists of extended examples. Chapter 9
presents a functional parser and a λ-calculus interpreter. Chapter 10 presents a
theorem prover, a traditional ML application.
The book is full of examples. Some of these serve only to demonstrate some
aspect of ML, but most are intended to be useful in themselves — sorting, func-
tional arrays, priority queues, search algorithms, pretty printing. Please note: al-
though I have tested these programs, they undoubtedly contain some errors.
Information and warning boxes. Technical asides, descriptions of library functions,
and notes for further study appear from place to place. They are highlighted
for the benefit of readers who wish to skip over them:
King Henry’s claim. There is no bar to make against your highness’ claim to
France but this, which they produce from Pharamond, In terram Salicam mulieres
ne succedant, ‘No woman shall succeed in Salique land’: which Salique land the
French unjustly gloze to be the realm of France, and Pharamond the founder of this law
and female bar. But their own authors faithfully affirm that the land Salique is in
Germany ...¹
ML is not perfect. Certain pitfalls can allow a simple coding error to waste
hours of a programmer’s time. The new standard library introduces incompat-
ibilities between old and new compilers. Warnings of possible hazards appear
throughout the book. They look like this:
Beware the Duke of Gloucester. O Buckingham! take heed of yonder dog.
Look, when he fawns, he bites; and when he bites, his venom tooth will rankle
to the death. Have not to do with him, beware of him; Sin, Death, and Hell have set
their marks on him, and all their ministers attend on him.
I hasten to add that nothing in ML can have consequences quite this dire. No
fault in a program can corrupt the ML system itself. On the other hand, programmers
must remember that even correct programs can do harm in the outside
world.
¹ No technical aside in this book is as long as the Archbishop’s speech, which
extends to 62 lines.
How to get a Standard ML compiler. Because Standard ML is fairly new on the
scene, many institutions will not have a compiler. The following is a partial list
of existing Standard ML compilers, with contact addresses. The examples in this
book were developed under Moscow ML, Poly/ML and Standard ML of New
Jersey. I have not tried the other compilers.
To obtain MLWorks, contact Harlequin Limited, Barrington Hall, Barrington,
Cambridge, CB2 5RG, England. Their email address is web@harlequin.com.
To obtain Moscow ML, contact Peter Sestoft, Mathematical Section, Royal
Veterinary and Agricultural University, Thorvaldsensvej 40, DK-1871
Frederiksberg C, Denmark. Or get the system from the World Wide Web:
http://www.dina.kvl.dk/~sestoft/mosml.html
To obtain Poly/ML, contact Abstract Hardware Ltd, 1 Brunel Science Park,
Kingston Lane, Uxbridge, Middlesex, UB8 3PQ, England. Their email address
is [email protected].
To obtain Poplog Standard ML, contact Integral Solutions Ltd, Berk House,
Basing View, Basingstoke, Hampshire, RG21 4RG, England. Their email address
is [email protected].
To obtain Standard ML of New Jersey, contact Andrew Appel, Computer Science
Department, Princeton University, Princeton NJ 08544-2087, USA. Better
still, fetch the files from the World Wide Web:
http://www.cs.princeton.edu/~appel/smlnj/
The programs in this book and answers to some exercises are available by
email; my address is [email protected]. If possible, please use the World
Wide Web; my home page is at
http://www.cl.cam.ac.uk/users/lcp/
Acknowledgements. The editor, David Tranah, assisted with all stages of the
writing and suggested the title. Graham Birtwistle, Glenn Bruns and David
Wolfram read the text carefully. Dave Berry, Simon Finn, Mike Fourman, Kent
Karlsson, Robin Milner, Richard O'Keefe, Keith van Rijsbergen, Nick Roth-
well, Mads Tofte, David N. Turner and the staff of Harlequin also commented
on the text. Andrew Appel, Gavin Bierman, Phil Brabbin, Richard Brooksby,
Guy Cousineau, Lal George, Mike Gordon, Martin Hansen, Darrell Kindred, Sil-
vio Meira, Andrew Morris, Khalid Mughal, Tobias Nipkow, Kurt Olender, Allen
Stoughton, Reuben Thomas, Ray Toal and Helen Wilson found errors in previ-
ous printings. Piete Brooks, John Carroll and Graham Titmus helped with the
computers. I wish to thank Dave Matthews for developing Poly/ML, which was
for many years the only efficient implementation of Standard ML.
Of the many works in the bibliography, Abelson and Sussman (1985), Bird
and Wadler (1988) and Burge (1975) have been especially helpful. Reade (1989)
contains useful ideas for implementing lazy lists in ML.
The Science and Engineering Research Council has supported LCF and ML in
numerous research grants over the past 20 years.
I wrote most of this book while on leave from the University of Cambridge.
I am grateful to the Computer Laboratory and Clare College for granting leave,
and to the University of Edinburgh for accommodating me for six months.
Finally, I should like to thank Sue for all she did to help, and for tolerating my
daily accounts of the progress of every chapter.
1 Standard ML
The first ML compiler was built in 1974. As the user community grew, various
dialects began to appear. The ML community then got together to develop and
promote a common language, Standard ML — sometimes called SML, or just ML.
Good Standard ML compilers are available.
Standard ML has become remarkably popular in a short time. Universities
around the world have adopted it as the first programming language to teach to
students. Developers of substantial applications have chosen it as their imple-
mentation language. One could explain this popularity by saying that ML makes
it easy to write clear, reliable programs. For a more satisfying explanation, let us
examine how we look at computer systems.
Computers are enormously complex. The hardware and software found in a
typical workstation are more than one mind can fully comprehend. Different
people understand the workstation on different levels. To the user, the work-
station is a word processor or spreadsheet. To the repair crew, it is a box con-
taining a power supply, circuit boards, etc. To the machine language program-
mer, the workstation provides a large store of bytes, connected to a processor that
can perform arithmetic and logical operations. The applications programmer
understands the workstation through the medium of the chosen programming
language.
Here we take ‘spreadsheet’, ‘power supply’ and ‘processor’ as ideal, abstract
concepts. We think of them in terms of their functions and limitations, but not in
terms of how they are built. Good abstractions let us use computers effectively,
without being overwhelmed by their complexity.
Conventional ‘high level’ programming languages do not provide a level of
abstraction significantly above machine language. They provide convenient notations,
but only those that map straightforwardly to machine code. A minor error
in the program can make it destroy other data or even itself. The resulting
behaviour can be explained only at the level of machine language, if at all!
ML is well above the machine language level. It supports functional program-
ming, where programs consist of functions operating on simple data structures.
Functional programming is ideal for many aspects of problem solving, as argued
briefly below and demonstrated throughout the book. Programming tasks can
be approached mathematically, without preoccupation with the computer’s internal
workings. ML also provides mutable variables and arrays. Mutable objects
can be updated using an assignment command; using them, any piece of conven-
tional code can be expressed easily. For structuring large systems, ML provides
modules: parts of the program can be specified and coded separately.
Most importantly of all, ML protects programmers from their own errors. Be-
fore a program may run, the compiler checks that all module interfaces agree
and that data are used consistently. For example, an integer may not be used as
a store address. (It is a myth that real programs must rely on such tricks.) As the
program executes, further checking ensures safety: even a faulty ML program
continues to behave as an ML program. It might run forever and it might return
to the user with an error message. But it cannot crash.
ML supports a level of abstraction that is oriented to the requirements of the
programmer, not those of the hardware. The ML system can preserve this abstraction,
even if the program is faulty. Few other languages offer such assurances.
Functional Programming
Programming languages come in several varieties. Languages like Fortran,
Pascal and C are called procedural: their main programming unit is the procedure.
A popular refinement of this approach centres on objects that carry their
own operations about with them. Such object-oriented languages include C++
and Modula-3. Both approaches rely on commands that act upon the machine
state; they are both imperative approaches.
Just as procedural languages are oriented around commands, functional lan-
guages are oriented around expressions. Programming without commands may
seem alien to some readers, so let us see what lies behind this idea. We begin
with a critique of imperative programming.
1.1 Expressions versus commands
Fortran, the first high-level programming language, gave programmers
the arithmetic expression. No longer did they have to code sequences of addi-
tions, loads and stores on registers: the FORmula TRANslator did this for them.
Why are expressions so important? Not because they are familiar: the Fortran
syntax for [formula] has but a passing resemblance to that formula. Let us
consider the advantages of expressions in detail. Expressions in Fortran can have
side effects: they can change the state. We shall focus on pure expressions, which
merely compute a value.
Expressions have a recursive structure. A typical expression like
    $f(E_1 + E_2) - (E_3)$
is built out of other expressions $E_1$, $E_2$ and $E_3$, and may itself form part of a larger
expression.
The value of an expression is given recursively in terms of the values of its
subexpressions. The subexpressions can be evaluated in any order, or even in
parallel.
Expressions can be transformed using mathematical laws. For instance, replacing
$E_1 + E_2$ by $E_2 + E_1$ does not affect the value of the expression above,
thanks to the commutative law of addition. This ability to substitute equals for
equals is called referential transparency. In particular, an expression may safely
be replaced by its value.
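As a small illustration in Standard ML, the pure expressions below can be reordered, or replaced by their values, without changing the result. This is only a sketch: the names square, e1 and e2 are invented for the example and do not come from the text.

```sml
(* Hypothetical names, chosen only for illustration. *)
fun square (x : int) = x * x;

val e1 = square 3;          (* a pure subexpression *)
val e2 = square 4;          (* another pure subexpression *)

(* Commutativity: swapping the operands leaves the value unchanged. *)
val a = e1 + e2;
val b = e2 + e1;

(* Replacing the pure expression square 3 by its value 9 is equally safe. *)
val c = 9 + square 4;

val same = (a = b) andalso (b = c);
```

All three bindings denote the same integer, so same evaluates to true.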
Commands share most of these advantages. In modern languages, commands
are built out of other commands. The meaning of a command like
    while $B_1$ do (if $B_2$ then $C_1$ else $C_2$)
can be given in terms of the meanings of its parts. Commands even enjoy referential
transparency: laws like
    (if $B$ then $C_1$ else $C_2$); $C$ $\equiv$ if $B$ then ($C_1$; $C$) else ($C_2$; $C$)
can be proved and applied as substitutions.
However, the meaning of an expression is simply the result of evaluating it,
which is why subexpressions can be evaluated independently of each other. The
meaning of an expression can be extremely simple, like the number 3. The mean-
ing of a command is a state transformation or something equally complicated. To
understand a command, you have to understand its full effect on the machine’s
state.
1.2 Expressions in procedural programming languages
How far have programming languages advanced since Fortran? Consider
Euclid’s Algorithm, which is defined by recursion, for computing the Greatest
Common Divisor (GCD) of two natural numbers:
    $\gcd(0, n) = n$
    $\gcd(m, n) = \gcd(n \bmod m, m)$ for $m > 0$
In Pascal, a procedural language, most people would code the GCD as an imper-
ative program:
    function gcd(m,n: integer): integer;
      var prevm: integer;
    begin
      while m<>0 do
        begin prevm := m; m := n mod m; n := prevm end;
      gcd := n
    end;
Here it is in Standard ML as a functional program:
    fun gcd(m,n) =
        if m=0 then n
               else gcd(n mod m, m);
The imperative program, though coded in a ‘high-level’ language, is hardly
clearer or shorter than a machine language program. It repeatedly updates three
quantities, one of which is just a temporary storage cell. Proving that it correctly
implements Euclid’s algorithm requires Floyd-Hoare proof rules, a tedious enterprise.
In contrast, the functional version obviously implements Euclid’s Algorithm.
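One way to see the correspondence is to evaluate a call by substituting equals for equals, exactly as the defining equations suggest. The sketch below repeats the function so that the block is self-contained; the sample arguments 12 and 18 and the name six are arbitrary choices made for this illustration.

```sml
(* The function from the text, repeated so the block stands alone. *)
fun gcd (m, n) =
    if m = 0 then n
    else gcd (n mod m, m);

(* Evaluation by substitution, one recursive call per step:
     gcd(12, 18)
       => gcd(18 mod 12, 12)  =  gcd(6, 12)
       => gcd(12 mod 6, 6)    =  gcd(0, 6)
       => 6
*)
val six = gcd (12, 18);
```

Each step applies one of the two defining equations; no machine state has to be tracked.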
A recursive program in Pascal would be only a slight improvement. Recur-
sive procedure calls are seldom implemented efficiently. Thirty years after its
introduction to programming languages, recursion is still regarded as something
to eliminate from programs. Correctness proofs for recursive procedures have a
sad history of complexity and errors.
Pascal expressions do not satisfy the usual mathematical laws. An optimizing
compiler might transform f(z) + u/2 into u/2 + f(z). However, these expres-
sions may not compute the same value if the ‘function’ f changes the value of u.
The meaning of an expression in Pascal involves states as well as values. For all
practical purposes, referential transparency has been lost.
In a purely functional language there is no state. Expressions satisfy the usual
mathematical laws, up to the limitations of the machine (for example, real arith-
metic is approximate). Purely functional programs can also be written in Stan-
dard ML. However, ML is not pure because of its assignments and input/output
commands. The ML programmer whose style is ‘almost’ functional had better
not be lulled into a false sense of referential transparency.
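The same hazard can be reproduced in ML once references are involved. The following is a minimal sketch, not taken from the text: u is a hypothetical reference cell, and f is an invented function that updates u as a side effect. It relies on the fact that Standard ML evaluates the operands of an infix operator from left to right.

```sml
(* Invented names: u is a mutable reference cell, and f updates it
   as a side effect before returning its argument. *)
val u = ref 10;

fun f (z : int) = (u := !u + 2; z);

(* Operands of + are evaluated from left to right. *)
val r1 = (u := 10; f 1 + !u div 2);   (* f runs first: 1 + 12 div 2 *)
val r2 = (u := 10; !u div 2 + f 1);   (* u is read first: 10 div 2 + 1 *)

val differ = (r1 <> r2);
```

Swapping the operands changes the result, so the commutative law cannot be applied blindly to expressions with side effects.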
1.3 Storage management
Expressions in procedural languages have progressed little beyond For-
tran; they have not kept up with developments in data structures. Suppose we
have employee records consisting of name, address, and other details. We can-
not write record-valued expressions, or return an employee record from a func-
tion; even if the language permits this, copying such large records is prohibitively
slow.
To avoid copying large objects, we can refer to them indirectly. Our record-
valued function could allocate storage space for the employee record, and return
its address. Instead of copying the record from one place to another, we copy its
address instead. When we are finished with the record, we deallocate (release)
its storage. (Presumably the employee got sacked.) Addresses used in this way
are called references or pointers.
Deallocation is the bugbear of this approach. The program might release the
storage prematurely, when the record is still in use. Once that storage is reallo-
cated, it will be used for different purposes at the same time. Anything could
happen, leading (perhaps much later) to a mysterious crash. This is one of the
most treacherous programming errors.
If we never deallocate storage, we might run out of it. Should we then avoid
using references? But many basic data structures, such as the linked list, require
references.
Functional languages, and some others, manage storage automatically. The
programmer does not decide when to deallocate a record’s storage. At intervals,
the run-time system scans the store systematically, marking everything that is
accessible and reclaiming everything that is not. This operation is called garbage
collection, although it is more like recycling. Garbage collection can be slow and
may require additional space, but it pays dividends.
Languages with garbage collection typically use references heavily in their in-
ternal representation of data. A function that ‘returns’ an employee record actu-
ally returns only a reference to it, but the programmer does not know or care.
The language gains in expressive power. The programmer, freed from the chore
of storage management, can work more productively.
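As a small illustration of record-valued expressions, here is a sketch in Standard ML. The type employee and the functions mkEmployee and giveRaise are hypothetical names chosen for this example; the fields are assumptions, not the book's.

```sml
(* Hypothetical record type for the employee example. *)
type employee = {name : string, address : string, salary : int};

(* A function that returns a whole record as its result. *)
fun mkEmployee (n, a, s) : employee = {name = n, address = a, salary = s};

(* Builds a new record; the old one is left untouched. *)
fun giveRaise ({name, address, salary} : employee, amount) : employee =
  {name = name, address = address, salary = salary + amount};

val e1 = mkEmployee ("Ada", "12 Hypothetical Lane", 100);
val e2 = giveRaise (e1, 20);
```

The programmer never allocates or releases storage here; once e1 is no longer accessible, the run-time system may reclaim it.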
1.4 Elements of a functional language
Functional programs work with values, not states. Their tools are ex-
pressions, not commands. How can assignments, arrays and loops be dispensed
with? Does not the outside world have state? These questions pose real chal-
lenges. The functional programmer can exploit a wide range of techniques to
solve problems.
Lists and trees. Collections of data can be processed as lists of the form
[a, b, c, d, e, ...].
Lists support sequential access: scanning from left to right. This suffices for most
purposes, even sorting and matrix operations. A more flexible way of organizing
data is as a tree:
Balanced trees permit random access: any part can be reached quickly. In theory,
trees offer the same efficiency as arrays; in practice, arrays are often faster. Trees
play key rôles in symbolic computation, representing logical terms and formulæ
in theorem provers. Lists and trees are represented using references, so the run-
time system must include a garbage collector.
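A binary tree can be sketched as an ML datatype. The constructor names Lf and Br and the functions size and depth are illustrative assumptions for this sketch.

```sml
(* A tree is either a leaf or a branch holding a label and two subtrees. *)
datatype 'a tree = Lf | Br of 'a * 'a tree * 'a tree;

(* Number of labels in the tree. *)
fun size Lf = 0
  | size (Br (_, left, right)) = 1 + size left + size right;

(* Length of the longest path from the root to a leaf. *)
fun depth Lf = 0
  | depth (Br (_, left, right)) = 1 + Int.max (depth left, depth right);

val t = Br (2, Br (1, Lf, Lf), Br (3, Lf, Lf));
```

In a balanced tree the depth grows only logarithmically with the size, which is why any part can be reached quickly.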
Functions. Expressions consist mainly of function applications. To increase the
power of expressions, functions must be freed from arbitrary restrictions. Func-
tions may take any type of arguments and return any type of result. As we shall
see, ‘any type’ includes functions themselves, which can be treated like other
data; making this work also requires a garbage collector.
Recursion. Variables in a functional program obtain their values from outside
(when a function is called) or by declaration. They cannot be updated, but re-
cursive calls can produce a changing series of argument values. Recursion is
easier to understand than iteration; if you do not believe this, recall our two
GCD programs. Recursion eliminates the baroque looping constructs of procedural
languages.¹

¹ Recursion does have its critics. Backus (1978) recommends providing iteration
primitives to replace most uses of recursion in function definitions. However,
his style of functional programming has not caught on.

Pattern-matching. Most functional languages allow a function to analyse its
argument using pattern-matching. A function to count the elements of a list
looks like this in ML:
    fun length [] = 0
      | length (x::xs) = 1 + length xs;
We instantly see that the length of the empty list ([]) is zero, and that the length
of a list consisting of the element x prefixed to the list xs is the length of xs plus
one. Here is the equivalent definition in Lisp, which lacks pattern-matching:
    (define (length x)
      (if (null? x)
          0
          (+ 1 (length (cdr x)))))
ML function declarations often consider half a dozen cases, with patterns much
more complicated than x::xs. Expressing such functions without using patterns
is terribly cumbersome. The ML compiler does this internally, and can do a better
job than the programmer could.
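To see the difference in ML itself, here is a sketch with two hypothetical functions: lenNoPatterns does without patterns, using the list primitives null and tl, while pairUp uses a nested pattern that would be awkward to express with explicit tests.

```sml
(* Length without pattern-matching: explicit test and selector. *)
fun lenNoPatterns xs =
  if null xs then 0 else 1 + lenNoPatterns (tl xs);

(* Group a list into consecutive pairs; a leftover element is dropped.
   The first pattern matches only lists with at least two elements. *)
fun pairUp (x :: y :: rest) = (x, y) :: pairUp rest
  | pairUp _ = [];

val n = lenNoPatterns [7, 8, 9];
val ps = pairUp [1, 2, 3, 4, 5];
```

Written without patterns, pairUp would need two nested emptiness tests and several selector calls, each a chance to pick the wrong component.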
Polymorphic type checking. Programmers, being human, often err. Using a non-
existent part of a data structure, supplying a function with too few arguments, or
confusing a reference to an object with the object itself are serious errors: they
could make the program crash. Fortunately, the compiler can detect them before
the program runs, provided the language enforces a type discipline. Types clas-
sify data as integers, reals, lists, etc., and let us ensure that they are used sensibly.
Some programmers resist type checking because it can be too restrictive. In
Pascal, a function to compute the length of a list must specify the (completely
irrelevant!) type of the list’s elements. Our ML length function works for all
lists because ML’s type system is polymorphic: it ignores the types of irrele-
vant components. Our Lisp version also works for all lists, because Lisp has no
compile-time type checking. Lisp is more flexible than ML; a single list can mix
elements of different types. The price of this freedom is hours spent hunting er-
rors that might have been caught automatically.
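A short sketch of polymorphism at work: the function is named len here (a hypothetical name, to avoid clashing with the built-in length), and the compiler infers a type that leaves the element type open.

```sml
(* Inferred type: 'a list -> int, for any element type 'a. *)
fun len [] = 0
  | len (_ :: xs) = 1 + len xs;

val n1 = len [1, 2, 3];                 (* a list of integers *)
val n2 = len ["a", "b"];                (* a list of strings *)
val n3 = len [[1.0], [2.0, 3.0], []];   (* a list of lists of reals *)

(* A list mixing types, such as [1, "two"], is rejected at compile time. *)
```

The same definition serves all three calls, yet each list must still be uniform in the type of its elements.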
Higher-order functions. Functions themselves are computational values. Even
Fortran lets a function be passed as an argument to another function, but few
procedural languages let function values play a full rôle as data structures.
A higher-order function (or functional) is a function that operates on other
functions. The functional map, when applied to a function $f$, returns another
function; that function takes
$[x_1, \ldots, x_n]$ to $[f(x_1), \ldots, f(x_n)]$.
Another higher-order function, when applied to a function $f$ and value $e$, returns
$f(x_1, f(x_2, \ldots, f(x_n, e) \ldots))$.
If $e = 0$ and $f = +$ (yes, the addition operator is a function) then we get the sum
of $x_1, \ldots, x_n$, computed by
$x_1 + (x_2 + \cdots + (x_n + 0) \cdots)$.
If $e = 1$ and $f = \times$ then we get their product, computed by
$x_1 \times (x_2 \times \cdots \times (x_n \times 1) \cdots)$.
Other computations are expressed by suitable choices for f and e.
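In Standard ML these two functionals are available as map and foldr. The sketch below uses them for the sum and product just described; foldRight is a hypothetical handwritten version, included only to show the recursion.

```sml
(* map applies a function to every element of a list. *)
val doubled = map (fn x => 2 * x) [1, 2, 3];

(* foldr f e [x1, ..., xn] computes f(x1, f(x2, ..., f(xn, e)...)). *)
val sum     = foldr (op +) 0 [1, 2, 3, 4];
val product = foldr (op * ) 1 [1, 2, 3, 4];

(* A handwritten equivalent of the same recursion. *)
fun foldRight f e [] = e
  | foldRight f e (x :: xs) = f (x, foldRight f e xs);
```

The notation op + turns the infix operator into an ordinary function on pairs, which is what allows it to be passed as the argument f.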
Infinite data structures. Infinite lists like [1, 2, 3, ...] can be given a
computational meaning. They can be of great help when tackling sophisticated
problems. Infinite lists are processed using lazy evaluation, which ensures that
no value (or part of a value) is computed until it is actually needed to obtain
the final
result. An infinite list never exists in full; it is rather a process for computing
successive elements upon demand.
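Standard ML evaluates strictly, so an infinite list has to be simulated by delaying the tail inside a function. The following is one possible sketch; the names seq, from, take and mapq are assumptions made for this example.

```sml
(* A sequence is empty, or a head plus a delayed tail. *)
datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

(* The infinite sequence k, k+1, k+2, ... *)
fun from k = Cons (k, fn () => from (k + 1));

(* Force only the first n elements. *)
fun take (0, _) = []
  | take (_, Nil) = []
  | take (n, Cons (x, xf)) = x :: take (n - 1, xf ());

(* Apply f to every element, still on demand. *)
fun mapq f Nil = Nil
  | mapq f (Cons (x, xf)) = Cons (f x, fn () => mapq f (xf ()));

val firstFive = take (5, from 1);
val squares = take (4, mapq (fn x => x * x) (from 1));
```

Nothing beyond the requested elements is ever computed: the sequence is a process for producing elements, not a stored structure.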
The search space in a theorem prover may form an infinite tree, whose success
nodes form an infinite list. Different search strategies produce different lists of
success nodes. The list can be given to another part of the program, which need
not know how it was produced.
Infinite lists can also represent sequences of inputs and outputs. Many of us
have encountered this concept in the pipes of the Unix operating system. A chain
of processes linked by pipes forms a single process. Each process consumes its
input when available and passes its output along a pipe to the next process. The
outputs of intermediate processes are never stored in full. This saves storage, but
more importantly it gives us a clear notation for combining processes. Mathe-
matically, every process is a function from inputs to outputs, and the chain of
processes is their composition.
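The pipe analogy can be sketched with ML's composition operator o. The names countWords and longestWord are hypothetical; note that o composes from right to left, so the rightmost stage runs first.

```sml
(* Split on white space, then count the pieces. *)
val countWords = length o String.tokens Char.isSpace;

(* Split, measure each word, then keep the largest length. *)
val longestWord = foldl Int.max 0 o map size o String.tokens Char.isSpace;

val n = countWords "the quick brown fox";
val m = longestWord "the quick brown fox";
```

Unlike Unix pipes, these stages run on ordinary (strict) lists, so each intermediate list is built in full; only the notation for combining stages carries over.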
Input and output. Communication with the outside world, which has state, is
hard to reconcile with functional programming. Infinite lists can handle sequen-
tial input and output (as mentioned above), but interactive programming and pro-
cess communication are thorny issues. Many functional approaches have been
investigated; monads are one of the most promising (Peyton Jones and Wadler,
1993). ML simply provides commands to perform input and output; thus, ML
abandons functional programming here.
Functional languages: a survey. The mainstream functional languages adopt
lazy evaluation, pattern-matching and ML-style polymorphic types. Miranda
is an elegant language by David A. Turner (1990a). Lazy ML is a dialect of ML with
lazy evaluation; its compiler generates efficient code (Augustsson and Johnsson, 1989).
Haskell was designed by a committee of researchers as a common language (Hudak
et al., 1992); it has been widely adopted.
John Backus (1978) introduced the language FP in a widely publicized lecture. FP
provides many higher-order functions (called ‘combining forms’), but the programmer
may not define new ones. Backus criticized the close coupling between programming
languages and the underlying hardware, coining the phrase von Neumann bottleneck for
the connection between the processor and the store. Many have argued that functional
languages are ideal for parallel hardware. Sisal has been designed for parallel numerical
computations; Cann (1992) claims that Sisal sometimes outperforms Fortran.
Many implementation techniques for functional programming, such as garbage col-
lection, originated with Lisp (McCarthy et al., 1962). The language includes low-level
features that can be misused to disastrous effect. Later dialects, including Scheme
(Abelson and Sussman, 1985) and Common Lisp, provide higher-order functions. Although
much Lisp code is imperative, the first functional programs were written in Lisp. Most
ML dialects include imperative features, but ML is more disciplined than Lisp. It has
compile-time type checking, and allows updates only to mutable objects.
1.5 The efficiency of functional programming
A functional program typically carries a large run-time system with a
resident compiler. The garbage collector may require a uniform representation of
data, making it occupy additional storage. The functional programmer is some-
times deprived of the most efficient data structures, such as arrays, strings and
bit vectors. A functional program may therefore be less efficient than the
corresponding C program, particularly in its storage demands.
ML is best suited for large, complex applications. Type checking, automatic
storage allocation and other advantages of functional programming can make the
difference between a program that works and one that doesn’t. Efficiency be-
comes a secondary issue; besides, with a demanding application, the difference
will be less pronounced. Most functional programs ought to run nearly as fast as
their procedural counterparts, perhaps five times slower in the worst case.
Efficiency is regarded with suspicion by many researchers, doubtless because
many programs have been ruined in its pursuit. Functional programmers have
sometimes chosen inefficient algorithms for the sake of clarity, or have sought
to enrich their languages rather than implement them better. This attitude, more
than technical reasons, has given functional programming a reputation for inef-
ficiency.
We must now redress the balance. Functional programs must be efficient, or
nobody will use them. Algorithms, after all, are designed to be efficient. The
Greatest Common Divisor of two numbers can be found by searching through
all possible candidates. This exhaustive search algorithm is clear, but useless.
Euclid’s Algorithm is fast and simple, having sacrificed clarity.
The exhaustive search algorithm for the GCD is an example of an executa-
ble specification. One approach to program design might start with this and ap-
ply transformations to make it more efficient, while preserving its correctness.
Eventually it might arrive at Euclid’s Algorithm. Program transformations can
indeed improve efficiency, but we should regard executable specifications with
caution. The Greatest Common Divisor of two integers is, by definition, the larg-
est integer that exactly divides both; the specification does not mention search at
all. The exhaustive search algorithm is too complicated to be a good specifica-
tion.
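The two GCD algorithms can be compared side by side. This is a sketch under stated assumptions: gcdSearch and gcd are names chosen here, and gcdSearch expects both arguments to be positive.

```sml
(* Exhaustive search: try candidates downwards from the smaller argument.
   Assumes m > 0 and n > 0, so the search stops at 1 at the latest. *)
fun gcdSearch (m, n) =
  let
    fun try d =
      if m mod d = 0 andalso n mod d = 0 then d else try (d - 1)
  in
    try (Int.min (m, n))
  end;

(* Euclid's Algorithm: replace (m, n) by (n mod m, m) until m is zero. *)
fun gcd (0, n) = n
  | gcd (m, n) = gcd (n mod m, m);

val a = gcdSearch (12, 18);
val b = gcd (12, 18);
val c = gcd (247, 1313);
```

The search may take a number of steps proportional to its smaller argument, whereas Euclid's Algorithm shrinks its arguments rapidly at every call.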
Functional programming and logic programming are instances of declarative
programming. The ideal of declarative programming is to free us from writing
programs: just state the requirements and the computer will do the rest.
Hoare (1989c) has explored this ideal in the case of the Greatest Common
Divisor, demonstrating that it is still a dream. A more realistic aim for
declarative programming is to make programs easier to understand. Their correctness
can be justified by simple mathematical reasoning, without thinking about bytes.
Declarative programming is still programming; we still have to code efficiently.
This book gives concrete advice about performance and tries to help you de-
cide where efficiency matters. Most natural functional definitions are also rea-
sonably efficient. Some ML compilers offer execution profiling, which measures
the time spent by each function. The function that spends the most time (never
the one you would expect) becomes a prime candidate for improvement. Such
bottom-up optimization can produce dramatic results, although it may not reveal
global causes of waste. These considerations hold for programming generally —
be it functional, procedural, object-oriented or whatever.
Correctness must come first. Clarity must usually come second, and efficiency
third. Any sacrifice of clarity makes the program harder to maintain, and must
be justified by a significant efficiency gain. A judicious mixture of realism and
principle, with plenty of patience, makes for efficient programs.
Applications of functional programming. Functional programming techniques
are used in artificial intelligence, formal methods, computer aided design, and
other tasks involving symbolic computation. Substantial compilers have been written
in (and for) Standard ML (Appel, 1992) and Haskell (Peyton Jones, 1992). Networking
software has been written in ML (Biagioni et al., 1994), in a project to demonstrate ML’s
utility for systems programming. A major natural language processing system, called
LOLITA, has been written in Haskell (Smith et al., 1994); the authors adopted functional
programming in order to manage the complexity of their system. Hartel and Plasmeijer
(1996) describe six major projects, involving diverse applications. Wadler and Gill
(1995) have compiled a list of real world applications; these cover many domains and
involve all the main functional languages.
Standard ML
Every successful language was designed for some specific purpose: Lisp
for artificial intelligence, Fortran for numerical computation, Prolog for natural
language processing. Conversely, languages designed to be general purpose,
such as the ‘algorithmic languages’ Algol 60 and Algol 68, have succeeded
more as sources of ideas than as practical tools.
ML was designed for theorem proving. This is not a broad field, and ML was
intended for the programming of one particular theorem prover — a specific pur-
pose indeed! This theorem prover, called Edinburgh LCF (Logic for Computable
Functions) spawned a host of successors, all of which were coded in ML. And
just as Lisp, Fortran and Prolog have applications far removed from their origins,
ML is being used in diverse problem areas.
1.6 The evolution of Standard ML
As ML was the Meta Language for the programming of proof strategies,
its designers incorporated the necessary features for this application:
• The inference rules and proof methods were to be represented as functions,
  so ML was given the full power of higher-order functional programming.
• The inference rules were to define an abstract type: the type of theorems.
  Strong type checking (as in Pascal) would have been too restrictive, so
  ML was given polymorphic type checking.
• Proof methods could be combined in complex ways. Failure at any point
  had to be detected so that another method could be tried. So ML was
  allowed to raise and trap exceptions.
• Since a theorem prover would be useless if there were loopholes, ML was
  designed to be secure, with no way of corrupting the environment.
The ML system of Edinburgh LCF was slow: programs were translated into Lisp
and then interpreted. Luca Cardelli wrote an efficient compiler for his version of
ML, which included a rich set of declaration and type structures. At Cambridge
University and INRIA, the ML system of LCF was extended and its performance
improved. ML also influenced HOPE; this purely functional language adopted
polymorphism and added recursive type definitions and pattern-matching.
Robin Milner led a standardization effort to consolidate the dialects into Stan-
dard ML. Many people contributed. The module language — the language’s
most complex and innovative feature — was designed by David MacQueen and
refined by Milner and Mads Tofte. In 1987, Milner won the British Computer
Society Award for Technical Excellence for his work on Standard ML. The first
compilers were developed at the Universities of Cambridge and Edinburgh; the
excellent Standard ML of New Jersey appeared shortly thereafter.
Several universities teach Standard ML as the students’ first programming lan-
guage. ML provides a level base for all students, whether they arrive knowing
C, Basic, machine language or no language at all. Using ML, students can learn
how to analyse problems mathematically, breaking the bad habits learned from
low-level languages. Significant computations can be expressed in a few lines.
Beginners especially appreciate that the type checker detects common errors, and
that nothing can crash the system!
Section 1.5 has mentioned applications of Standard ML to networking, com-
piler construction, etc. Theorem proving remains ML’s most important applica-
tion area, as we shall see below.
Further reading. Gordon et al. (1979) describe LCF. Landin (1966) discusses
the language ISWIM, upon which ML was originally based. The formal definition
of Standard ML has been published as a book (Milner et al., 1990), with a separate
volume of commentary (Milner and Tofte, 1990).
Standard ML has not displaced all other dialects. The French, typically, have gone
their own way. Their language CAML provides broadly similar features with the tradi-
tional ISWIM syntax (Cousineau and Huet, 1990). It has proved useful for experiments in
language design; extensions over Standard ML include lazy data structures and dynamic
types. CAML Light is a simple byte-code interpreter that is ideal for small computers.
Lazy dialects of ML also exist, as mentioned previously. HOPE continues to be used and
taught (Bailey, 1990).
1.7 The ML tradition of theorem proving
Theorem proving and functional programming go hand in hand. One of
the first functional programs ever written is a simple theorem prover (McCarthy
et al., 1962). Back in the 1970s, when some researchers were wondering what
functional programming was good for, Edinburgh LCF was putting it to work.
Fully automatic theorem proving is usually impossible: for most logics, no
automatic methods are known. The obvious alternative to automatic theorem
proving, proof checking, soon becomes intolerable. Most proofs involve long,
repetitive combinations of rules.
Edinburgh LCF represented a new kind of theorem prover, where the level of
automation was entirely up to the user. It was basically a programmable proof
checker. Users could write proof procedures in ML — the Meta Language —
rather than typing repetitive commands. ML programs could operate on expres-
sions of the Object Language, namely Scott’s Logic of Computable Functions.
Edinburgh LCF introduced the idea of representing a logic as an abstract type
of theorems. Each axiom was a primitive theorem while each inference rule was
a function from theorems to theorems. Type checking ensured that theorems
could be made only by axioms and rules. Applying inference rules to already
known theorems constructed proofs, rule by rule, in the forward direction.
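The idea of a logic as an abstract type can be sketched for a toy logic of implications. Everything here is a hypothetical illustration, not the LCF code: the signature LOGIC, the structure Logic, the single axiom scheme axiomK and the rule mp (modus ponens) are assumptions made for this example.

```sml
signature LOGIC =
sig
  datatype form = Atom of string | Imp of form * form
  type thm                                (* abstract: no constructors exported *)
  val axiomK : form * form -> thm         (* |- A -> (B -> A) *)
  val mp     : thm * thm -> thm           (* from |- A -> B and |- A infer |- B *)
  val concl  : thm -> form                (* inspect a theorem's formula *)
end;

structure Logic :> LOGIC =
struct
  datatype form = Atom of string | Imp of form * form
  type thm = form                         (* hidden by the opaque constraint *)
  fun axiomK (a, b) = Imp (a, Imp (b, a))
  fun mp (Imp (a, b), a') =
        if a = a' then b else raise Fail "mp: antecedent mismatch"
    | mp _ = raise Fail "mp: not an implication"
  fun concl th = th
end;

val p = Logic.Atom "p";
val q = Logic.Atom "q";
val k = Logic.axiomK (p, q);                    (* |- p -> (q -> p) *)
val k2 = Logic.axiomK (Logic.concl k, q);       (* |- K -> (q -> K) *)
val th = Logic.mp (k2, k);                      (* forward proof: |- q -> K *)
```

Outside the structure, a value of type Logic.thm can be obtained only through axiomK and mp, so the type checker guarantees that every theorem was derived by the rules.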
Tactics permitted a more natural style, backward proof. A tactic was a func-
tion from goals to subgoals, justified by the existence of an inference rule going
the other way. The tactic actually returned this inference rule (as a function) in
its result: tactics were higher-order functions.
Tacticals provided control structures for combining simple tactics into com-
plex ones. The resulting tactics could be combined to form still more complex
tactics, which in a single step could perform hundreds of primitive inferences.
Tacticals were even more ‘higher-order’ than tactics. New uses for higher-order
functions turned up in rewriting and elsewhere.
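The flavour of tacticals can be conveyed by a loose analogy on integers. This sketch does not model goals or theorems: a "method" is merely a function that may raise Fail, and orElse, andThen, repeat, halve and dec are hypothetical names.

```sml
(* Try m1; if it fails, try m2 on the same input. *)
fun orElse (m1, m2) x = m1 x handle Fail _ => m2 x;

(* Apply m1, then m2 to its result; failure of either propagates. *)
fun andThen (m1, m2) x = m2 (m1 x);

(* Apply m as often as possible, stopping at the first failure. *)
fun repeat m x =
  let
    val next = SOME (m x) handle Fail _ => NONE
  in
    case next of
        SOME y => repeat m y
      | NONE => x
  end;

(* Two simple methods that can fail. *)
fun halve n = if n mod 2 = 0 then n div 2 else raise Fail "odd";
fun dec n = if n > 0 then n - 1 else raise Fail "zero";

val r1 = repeat halve 40;
val r2 = orElse (halve, dec) 7;
val r3 = andThen (halve, halve) 12;
```

Each combinator takes functions and returns a function, and failure is signalled and trapped with exceptions; these are the two language features that the combination of proof methods relies upon.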
Further reading. Automated theorem proving originated as a task for artificial
intelligence. Later research applied it to reasoning tasks such as planning (Rich
and Knight, 1991). Program verification aims to prove software correct. Hardware veri-
fication, although a newer field, has been more successful; Graham (1992) describes the
verification of a substantial VLSI chip and surveys other work.
Offshoots of Edinburgh LCF include HOL88, which uses higher-order logic (Gordon
and Melham, 1993) and Nuprl, which supports constructive reasoning (Constable et al.,
1986).
Other recent systems adopt Standard ML. LAMBDA is a hardware synthesis tool, for
designing circuits and simultaneously proving their correctness using higher-order logic.
ALF is a proof editor for constructive type theory (Magnusson and Nordstrém, 1994).
1.8 The new standard library
The ML definition specifies a small library of standard declarations, in-
cluding operations on numbers, strings and lists. Many people have found this
library inadequate. For example, it has nothing to convert a character string such
as "3.14" into a real number. The library’s shortcomings have become more
apparent as people have used ML for systems programming and other unforeseen
areas. A committee, comprising several compiler writing teams, has drafted a
new ML standard library (Gansner and Reppy, 1996). As of this writing it is still
under development, but its basic outlines are known.
The library requires some minor changes to ML itself. It introduces a type of
characters, distinct from character strings of length one. It allows the coexistence
of numeric types that differ in their internal representations, and therefore in their
precisions; this changes the treatment of some numerical functions.
The library is organized using ML modules. The numerous functions are components
of ML structures, whose contents are specified using ML signatures. The
functions are invoked not by their name alone, but via the name of their structure;
for example, the sign function for real numbers is Real.sign, not just sign.
Many function names occur in more than one structure; the library also provides
Int.sign. When we later discuss modules, the library will help motivate the key
concepts. Here is a summary of the library’s main components, with the relevant
structures:
• Operations on lists and lists of pairs belong to the structures List and
  ListPair. Some of these will be described in later chapters.
• Integer operations belong to the structure Int. Integers may be available
  in various precisions. These may include the usual hardware integers
  (structure FixedInt), which are efficient but have limited size. They
  could include unlimited precision integers (structure IntInf), which are
  essential for some tasks.
• Real number operations belong to the structure Real, while functions
  such as sqrt, sin and cos belong to Math. The reals may also be available
  in various precisions. Structures have names such as Real32 or Real64,
  which specify the number of bits used.
• Unsigned integer arithmetic is available. This includes bit-level operations
  such as logical ‘and’, which are normally found only in low-level
  languages. The ML version is safe, as it does not allow the bits to be
  converted to arbitrary types. Structures have names such as Word8.
• Arrays of many forms are provided. They include mutable arrays like
  those of imperative languages (structure Array), and immutable arrays
  (structure Vector). The latter are suitable for functional programming,
  since they cannot be updated. Their initial value is given by some
  calculation, one presumably too expensive to perform repeatedly.
• Operations on characters and character strings belong to structures Char
  and String among others. The conversion between a type and its textual
  representation is defined in the type’s structure, such as Int.
• Input/output is available in several forms. The main ones are text I/O,
  which transfers lines of text, and binary I/O, which transfers arbitrary
  streams of bytes. The structures are TextIO and BinIO.
• Operating system primitives reside in structure OS. They are concerned
  with files, directories and processes. Numerous other operating system
  and input/output services may be provided.
• Calendar and time operations, including processor time measurements,
  are provided in structures Date, Time and Timer.
• Declarations needed by disparate parts of the library are collected into
  structure General.
Many other packages and tools, though not part of the library, are widely avail-
able. The resulting environment provides ample support for the most demanding
projects.
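A few structure-qualified calls give the flavour of the library. This sketch assumes a compiler that provides the library in the form in which it was later implemented, including the optional structure Word8; the value names on the left are hypothetical.

```sml
val s1 = Real.sign ~2.5;                         (* sign of a real, as an int *)
val s2 = Int.sign 42;                            (* same name, other structure *)
val x = valOf (Real.fromString "3.14");          (* string to real conversion *)
val root = Math.sqrt 16.0;
val pairs = ListPair.zip ([1, 2, 3], ["a", "b", "c"]);
val bits = Word8.andb (0wxF0, 0wx3C);            (* bit-level 'and' *)
val squares = Vector.tabulate (5, fn i => i * i);  (* immutable array *)
val third = Vector.sub (squares, 2);
val text = Int.toString 1996;                    (* int to its textual form *)
val upper = String.map Char.toUpper "ml";
```

Real.fromString returns an optional value, since not every string denotes a number; valOf extracts the result and raises an exception if there is none.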
1.9 ML and the working programmer
Software is notoriously unreliable. Wiener (1993) describes countless
cases where software failures have resulted in loss of life, business crises and
other calamities. Software products come not with a warranty, but with a war-
ranty disclaimer. Could we prevent these failures by coding in ML instead of C?
Of course not — but it would be a step in the right direction.
Part of the problem is the prevailing disdain for safety. Checks on the correct
use of arrays and references are costly, but they can detect errors before they do
serious harm. C. A. R. Hoare has said,
... it is absurd to make elaborate security checks on debugging runs, when no
trust is put in the results, and then remove them in production runs, when an
erroneous result could be expensive or disastrous. What would we think of a
sailing enthusiast who wears his life-jacket when training on dry land but
takes it off as soon as he goes to sea? (Hoare, 1989b, page 198)
This quote, from a lecture first given in 1973, has seldom been heeded. Typical
compilers omit checks unless specifically commanded to include them. The C
language is particularly unsafe: as its arrays are mere storage addresses, check-
ing their correct usage is impractical. The standard C library includes many pro-
cedures that risk corrupting the store; they are given a storage area but not told
its size! In consequence, the Unix operating system has many security loopholes.
The Internet Worm exploited these, causing widespread network disruption
(Spafford, 1989).
ML supports the development of reliable software in many ways. Compilers
do not allow checks to be omitted. Appel (1993) cites its safety, automatic storage
allocation, and compile-time type checking; these eliminate some major errors
altogether, and ensure the early detection of others. Appel shares the view
that functional programming is valuable, even in major projects.
Moreover, ML is defined formally. Milner et al. (1990) is not the first formal
definition of a programming language, but it is the first one that compiler writers
can understand.² Because the usual ambiguities are absent, compilers agree to a
remarkable extent. The new standard library will strengthen this agreement. A
program ought to behave identically regardless of which compiler runs it; ML is
close to this ideal.
A key advantage of ML is its module system. System components, however
large, can be specified and coded independently. Each component can supply its
specified services, protected from external tampering. One component can take
other components as parameters, and be compiled separately from them. Such
components can be combined in many ways, configuring different systems.
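A component that takes another component as a parameter is called a functor in ML. The sketch below is illustrative only: ORDER, MaxOf, IntOrder and StringOrder are hypothetical names, not part of any library.

```sml
(* The specification that a parameter component must meet. *)
signature ORDER =
sig
  type t
  val less : t * t -> bool
end;

(* A component parameterized over any structure matching ORDER. *)
functor MaxOf (Ord : ORDER) =
struct
  fun max (x, y) = if Ord.less (x, y) then y else x
  fun maximum [] = raise Empty
    | maximum (x :: xs) = foldl max x xs
end;

structure IntOrder =
struct
  type t = int
  fun less (x, y : int) = x < y
end;

structure StringOrder =
struct
  type t = string
  fun less (x, y : string) = x < y
end;

(* Two different systems configured from the same component. *)
structure IntMax = MaxOf (IntOrder);
structure StringMax = MaxOf (StringOrder);

val biggest = IntMax.maximum [3, 9, 4];
val later = StringMax.max ("ml", "c");
```

MaxOf is checked once against the signature ORDER alone, so it can be written and compiled without knowing which orderings will eventually be supplied.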
Viewed from a software engineering perspective, ML is an excellent language
for large systems. Its modules allow programmers to work in teams, and to reuse
components. Its types and overall safety contribute to reliability. Its exceptions
allow programs to respond to failures. Comparing ML with C, Appel admits that
ML programs need a great deal of space, but run acceptably fast. Software de-
velopers have a choice of commercially supported compilers.
We cannot soon expect to have ML programs running in our digital watches.
With major applications, however, reliability and programmer productivity are
basic requirements. Is the age of C drawing to a close?
² This is possible thanks to recent progress in the theory of programming languages.
The ML definition is an example of a structural operational semantics
(Hennessy, 1990).
2 Names, Functions and Types
Most functional languages are interactive. If you enter an expression, the com-
puter immediately evaluates it and displays the result. Interaction is fun; it gives
immediate feedback; it lets you develop programs in easily managed pieces.
We can enter an expression followed by a semicolon .. .
2+2;
... and ML responds
> 4: int
Here we see some conventions that will be followed throughout the book. Most
ML systems print a prompt character when waiting for input; here, the input is
shown in typewriter characters. The response is shown, in slanted characters,
> on a line like this.
At its simplest, ML is just a calculator. It has integers, as shown above, and real
numbers. ML can do simple arithmetic ...
3.2 - 2.3;
> 0.9 : real
... and square roots:
Math.sqrt 2.0;
> 1.414213562 : real
Again, anything typed to ML must end with a semicolon (;). ML has printed the
value and type. Note that real is the type of real numbers, while int is the type
of integers.
Interactive program development is more difficult with procedural languages
because they are too verbose. A self-contained program is too long to type as a
single input.
Chapter outline
This chapter introduces Standard ML and functional programming. The
basic concepts include declarations, simple data types, record types, recursive
functions and polymorphism. Although this material is presented using Standard
ML, it illustrates general principles.
The chapter contains the following sections:
Value declarations. Value and function declarations are presented using
elementary examples.
Numbers, character strings and truth values. The built-in types int, real, char,
string and bool support arithmetic, textual and logical operations.
Pairs, tuples and records. Ordered pairs and tuples allow functions to have
multiple arguments and results.
The evaluation of expressions. The difference between strict evaluation and
lazy evaluation is not just a matter of efficiency, but concerns the very meaning
of expressions.
Writing recursive functions. Several worked examples illustrate the use of
recursion.
Local declarations. Using let or local, names can be declared with a
restricted scope.
Introduction to modules. Signatures and structures are introduced by developing
a generic treatment of arithmetic operations.
Polymorphic type checking. The principles of polymorphism are introduced,
including type inference and polymorphic functions.
Value declarations
A declaration gives something a name. ML has many kinds of things that
can be named: values, types, signatures, structures and functors. Most names in
a program stand for values, like numbers, strings — and functions. Although
functions are values in ML, they have a special declaration syntax.
2.1 Naming constants
Any value of importance can be named, whether its importance is universal
(like the constant π) or transient (the result of a previous computation).
As a trivial example, suppose we want to compute the number of seconds in an
hour. We begin by letting the name seconds stand for 60.
val seconds = 60;
The value declaration begins with ML keyword val and ends with a semicolon.
Names in this book usually appear in italics. ML repeats the name, with its value
and type:
> val seconds = 60 : int
Let us declare constants for minutes per hour and hours per day:
val minutes = 60;
> val minutes = 60 : int
val hours = 24;
> val hours = 24 : int
These names are now valid in expressions:
seconds * minutes * hours;
> 86400 : int
If you enter an expression at top level like this, ML stores the value under the
name it. By referring to it you can use the value in a further calculation:
it div 24;
> 3600 : int
The name it always has the value of the last expression typed at top level. Any
previous value of it is lost. To save the value of it, declare a permanent name:
val secsinhour = it;
> val secsinhour = 3600 : int
Incidentally, names may contain underscores to make them easier to read:
val secs_in_hour = seconds * minutes;
> val secs_in_hour = 3600 : int
To demonstrate real numbers, we compute the area of a circle of radius r by the
formula area = πr²:
val pi = 3.14159;
> val pi = 3.14159 : real
val r = 2.0;
> val r = 2.0 : real
val area = pi * r * r;
> val area = 12.56636 : real
2.2 Declaring functions
The formula for the area of a circle can be made into an ML function like
this:
fun area (r) = pi*r*r;
The keyword fun starts the function declaration, while area is the function
name, r is the formal parameter, and pi*r*r is the body. The body refers to r
and to the constant pi declared above.
Because functions are values in ML, a function declaration is a form of value
declaration, and so ML prints the value and type:
> val area = fn : real -> real
The type, which in standard mathematical notation is real → real, says that area
takes a real number as argument and returns another real number. The value of a
function is printed as fn. In ML, as in most functional languages, functions are
abstract values: their internal structure is hidden.
Let us call the function, repeating the area calculation performed above:
area(2.0);
> 12.56636 : real
Let us try it with a different argument. Observe that the parentheses around the
argument are optional:
area 1.0;
> 3.14159 : real
The parentheses are also optional in function declarations. This definition of area
is equivalent to the former one.
fun area r = pi*r*r;
The evaluation of function applications is discussed in more detail below.
Comments. Programmers often imagine that their creations are too transparent
to require further description. This logical clarity will not be evident to others
unless the program is properly commented. A comment can describe the pur-
pose of a declaration, give a literature reference, or explain an obscure matter.
Needless to say, comments must be correct and up-to-date.
A comment in Standard ML begins with (* and ends with *), and may extend
over several lines. Comments can even be nested. They can be inserted almost
anywhere:
fun area r =    (*area of circle with radius r*)
    pi*r*r;
Functional programmers should not feel absolved from writing comments. People
once claimed that Pascal was self-documenting.
Redeclaring a name. Value names are called variables. Unlike variables in
imperative languages, they cannot be updated. But a name can be reused for another
purpose. If a name is declared again then the new meaning is adopted afterwards,
but does not affect existing uses of the name. Let us redeclare the constant pi:
val pi = 0.0;
> val pi = 0.0 : real
We can see that area still takes the original value of pi:
area(1.0);
> 3.14159 : real
At this point in the session, several variables have values. These include seconds,
minutes, area and pi, as well as the built-in operations provided by the library.
The set of bindings visible at any point is called the environment. The function
area refers to an earlier environment in which pi denotes 3.14159. Thanks to
the permanence of names (called static binding), redeclaring a function cannot
damage the system, the library or your program.
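A short session can make static binding concrete. The following is only a sketch: the names rate and fee are hypothetical, invented for this illustration, and the values in the comments are what the rules just described should produce.

```sml
(* Static binding: a function keeps the bindings that were visible
   when it was declared. The names rate and fee are hypothetical. *)
val rate = 2;
fun fee n = rate * n;      (* this fee refers to rate = 2 *)

val rate = 10;             (* a new binding; the old one is not updated *)
val oldFee = fee 3;        (* 6: fee still sees the original rate *)

fun fee n = rate * n;      (* redeclared, so it now refers to rate = 10 *)
val newFee = fee 3;        (* 30 *)
```

Only the second declaration of fee sees the new rate, which is why a function that calls a redeclared function must itself be recompiled.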
Correcting your program. Because of static binding, redeclaring a function
called by your program may have no visible effect. When modifying a program,
be sure to recompile the entire file. Large programs should be divided into modules:
Chapter 7 will explain this in detail. After the modified module has been recompiled,
the program merely has to be relinked.
2.3 Identifiers in Standard ML
An alphabetic name must begin with a letter, which may be followed
by any number of letters, digits, underscores (_), or primes ('), usually called
single quotes. For instance:
x  UB40  Hamlet_Prince_of_Denmark  n''3_H
The case of letters matters, so q differs from Q. Prime characters are allowed
because ML was designed by mathematicians, who like variables called x, x', x''.
When choosing names, be certain to avoid ML’s keywords:
abstype and andalso as case datatype do
else end eqtype exception fn fun functor
handle if in include infix infixr let local
nonfix of op open orelse raise rec
sharing sig signature struct structure
then type val where while with withtype
Watch especially for the short ones: as, fn, if, in, of, op.
ML also permits symbolic names. These consist of the characters
! % & $ # + - * / : < = > ? @ \ ~ ` ^ |
Names made up of these characters can be as long as you like:
----->  $^$^$^$  !!?@**??!!
Certain strings of special characters are reserved for ML’s syntax and should not
be used as symbolic names:
:  |  =  =>  ->  #  :>
A symbolic name is allowed wherever an alphabetic name is:
val +-+-+ = 1415;
> val +-+-+ = 1415 : int
Names are more formally known as identifiers. An identifier can simultaneously
denote a value, a type, a structure, a signature, a functor and a record field.
Exercise 2.1 On your computer, learn how to start an ML session and how to
terminate it. Then learn how to make the ML compiler read declarations from a
file; a typical command is use "myfile".
Numbers, character strings and truth values
The simplest ML values are integer and real numbers, strings and char-
acters, and the booleans or truth values. This section introduces these types with
their constants and principal operations.
2.4 Arithmetic
ML distinguishes between integers (type int) and real numbers (type
real). Integer arithmetic is exact (with unlimited precision in some ML systems)
while real arithmetic is only as accurate as the computer’s floating-point hard-
ware.
Integers. An integer constant is a sequence of digits, possibly beginning with a
minus sign (~). For instance:
9  ~23  01234  ~85601435654678
Integer operations include addition (+), subtraction (-), multiplication (*),
division (div) and remainder (mod). These are infix operators with conventional
precedences: thus in
(((m*n)*k) - (m div j)) + j
all the parentheses can be omitted without harm.
Real numbers. A real constant contains a decimal point or E notation, or both.
For instance:
0.01  2.718281828  ~1.2E12  7E~5
The ending En means ‘times the nth power of 10.’ A negative exponent begins
with the unary minus sign (~). Thus 123.4E~2 denotes 1.234.
Negative real numbers begin with unary minus (~). Infix operators for reals in-
clude addition (+), subtraction (-), multiplication (*) and division (/). Function
application binds more tightly than infix operators. For instance, area a + b is
equivalent to (area a) + b, not area (a + b).
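The precedence of function application is easy to check at top level. This sketch uses a hypothetical function twice on integers, chosen so that the results are exact:

```sml
(* twice is a hypothetical function used only for this illustration *)
fun twice n = 2 * n;

val p = twice 1 + 3;       (* parsed as (twice 1) + 3, giving 5 *)
val q = twice (1 + 3);     (* parentheses force the sum first, giving 8 *)
```

When in doubt, parenthesize the argument: the extra parentheses cost nothing.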
Unary plus and minus. The unary minus sign is a tilde (~). Do not confuse it
with the subtraction sign (-)! ML has no unary plus sign. Neither + nor - may
appear in the exponent of a real number.
Type constraints. ML can deduce the types in most expressions from the types of
the functions and constants in it. But certain built-in functions are overloaded,
having more than one meaning. For example, + and * are defined for both integers
and reals. The type of an overloaded function must be determined from the
context; occasionally types must be stated explicitly.
For instance, ML cannot tell whether this squaring function is intended for in-
tegers or reals, and therefore rejects it.
fun square x = x*x;
> Error- Unable to resolve overloading for *
Suppose the function is intended for real numbers. We can insert the type real in
a number of places.
We can specify the type of the argument:
fun square(x : real) = x*x;
> val square = fn : real -> real
We can specify the type of the result:
fun square x : real = x*x;
> val square = fn : real -> real
Equivalently, we can specify the type of the body:
fun square x = x*x : real;
> val square = fn : real -> real
Type constraints can also appear within the body, indeed almost anywhere.
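As a sketch of a constraint placed inside the body, the hypothetical variants below constrain a single occurrence of x, or the argument pattern; one constraint is enough to fix the meaning of the overloaded *.

```sml
(* Hypothetical variants of square; one constraint resolves the overloading. *)
fun squareR x = (x : real) * x;      (* constraint on one occurrence of x *)
fun squareI (x : int) = x * x;       (* the same body, resolved for integers *)

val a = squareR 1.5;                 (* 2.25 *)
val b = squareI 7;                   (* 49 *)
```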
Default overloading. The standard library introduces the notion of a default
overloading; the compiler may resolve the ambiguity in square by choosing
type int. Using a type constraint in such cases is still advisable, for clarity. The motiva-
tion for default overloadings is to allow different precisions of numbers to coexist. For
example, unless the precision of 1.23 is determined by its context, it will be assumed
to have the default precision for real numbers. As of this writing there is no experience
of using different precisions, but care is plainly necessary.
Arithmetic and the standard library. The standard library includes numerous
functions for integers and reals, of various precisions. Structure Int contains
such functions as abs (absolute value), min, max and sign. Here are some examples:
Int.abs ~4;
> 4 : int
Int.min(7, Int.sign 12);
> 1 : int
Structure Real contains analogous functions such as abs and sign, as well as functions
to convert between integers and reals. Calling real(i) converts i to the equivalent real
number. Calling round(r) converts r to the nearest integer. Other real-to-integer
conversions include floor, ceil and trunc. Conversion functions are necessary whenever
integers and reals appear in the same expression.
Structure Math contains higher mathematical functions on real numbers, such as sqrt,
sin, cos, atan (inverse tangent), exp and ln (natural logarithm). Each takes one real
argument and returns a real result.
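The conversions can be tried directly. This is a minimal sketch assuming a system that provides the standard library's top-level functions real, round, floor, ceil and trunc; the value names are arbitrary.

```sml
(* Mixing integers and reals requires an explicit conversion. *)
val mean = real (3 + 4) / 2.0;      (* the integer sum is converted first: 3.5 *)

val n1 = round 2.6;                 (* nearest integer: 3 *)
val n2 = floor ~2.5;                (* rounds towards minus infinity: ~3 *)
val n3 = ceil ~2.5;                 (* rounds towards plus infinity: ~2 *)
val n4 = trunc ~2.5;                (* rounds towards zero: ~2 *)

val root = Math.sqrt (real 16);     (* Math functions take and return reals: 4.0 *)
```

Writing (3 + 4) / 2.0 without the call to real would be rejected, since / is defined only for reals.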
Exercise 2.2 A Lisp hacker says: ‘Since the integers are a subset of the real
numbers, the distinction between them is wholly artificial — foisted on us by
hardware designers. ML should simply provide numbers, as Lisp does, and
automatically use integers or reals as appropriate.’ Do you agree? What
considerations are there?
Exercise 2.3 Which of these function definitions require type constraints?
fun double(n) = 2*n;
fun f u = Math.sin(u)/u;
fun g k = ~ k * k;
2.5 Strings and characters
Messages and other text are strings of characters. They have type string.
String constants are written in double quotes:
"How now! a rat? Dead, for a ducat, dead!";
> "How now! a rat? Dead, for a ducat, dead!" : string
The concatenation operator (^) joins two strings end-to-end:
"Fair " ^ "Ophelia";
> "Fair Ophelia" : string
The built-in function size returns the number of characters in a string. Here it
refers to "Fair Ophelia":
size (it);
> 12 : int
The space character counts, of course. The empty string contains no characters;
size("") is 0.
Here is a function that makes noble titles:
fun title(name) = "The Duke of " ^ name;
> val title = fn : string -> string
title "York";
> "The Duke of York" : string
Special characters. Escape sequences, which begin with a backslash (\), insert
certain special characters into a string. Here are some of them:
• \n inserts a newline character (line break).
• \t inserts a tabulation character.
• \" inserts a double quote.
• \\ inserts a backslash.
• \ followed by a newline and other white-space characters, followed by
another \ inserts nothing, but continues a string across the line break.
Here is a string containing newline characters:
"This above all:\nto thine own self be true\n";
The type char. Just as the number 3 differs from the set {3}, a character differs
from a one-character string. Characters have type char. The constants have the
form #s, where s is a string constant consisting of a single character. Here is a
letter, a space and a special character:
#"a"  #" "  #"\n"
The functions ord and chr convert between characters and character codes. Most
implementations use the ASCII character set; if k is in the range 0 ≤ k ≤ 255 then
chr(k) returns the character with code k. Conversely, ord(c) is the integer code
of the character c. We can use these to convert a number between 0 and 9 to a
character between #"0" and #"9":
fun digit i = chr(i + ord #"0");
> val digit = fn : int -> char
The functions str and String.sub convert between characters and strings. If c
is a character then str(c) is the corresponding string. Conversely, if s is a string
then String.sub(s, n) returns the nth character in s, counting from zero. Let us
try these, first expressing the function digit differently:
fun digit i = String.sub("0123456789", i);
> val digit = fn : int -> char
str (digit 5);
> "5" : string
The second definition of digit is preferable to the first, as it does not rely on
character codes.
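To compare the two definitions side by side, they can be given distinct names. The names digitByCode and digitByIndex are hypothetical, introduced only for this sketch, which assumes a character set (such as ASCII) in which the digit codes are consecutive.

```sml
(* The two versions of digit from the text, under hypothetical names. *)
fun digitByCode i  = chr (i + ord #"0");
fun digitByIndex i = String.sub ("0123456789", i);

(* On valid inputs the two agree; only the end points are checked here. *)
val agree = digitByCode 0 = digitByIndex 0
            andalso digitByCode 9 = digitByIndex 9;

(* str turns each character into a one-character string for concatenation. *)
val s = str (digitByIndex 4) ^ str (digitByCode 2);   (* "42" *)
```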
Strings, characters and the standard library. Structure String contains numerous
operations on strings. Structure Char provides functions such as isDigit,
isAlpha, etc., to recognize certain classes of character. A substring is a contiguous
subsequence of characters from a string; structure Substring provides operations for
extracting and manipulating them.
The ML Definition only has type string (Milner et al., 1990). The standard library
introduces the type char. It also modifies the types of built-in functions such as ord and
chr, which previously operated on single-character strings.
Exercise 2.4 For each version of digit, what do you expect in response to the
calls digit ~1 and digit 10? Try to predict the response before experimenting
on the computer.
2.6 Truth values and conditional expressions
To define a function by cases, where the result depends on the outcome
of a test, we employ a conditional expression.¹ The test is an expression E of
type bool, whose values are true and false. The outcome of the test chooses one
of two expressions E1 or E2. The value of the conditional expression
if E then E1 else E2
is that of E1 if E equals true, and that of E2 if E equals false. The else part is
mandatory.
The simplest tests are the relations:
• less than (<)
• greater than (>)
• less than or equals (<=)
• greater than or equals (>=)
¹ Because a Standard ML expression can update the state, conditional expressions
can also act like the if commands of procedural languages.
These are defined on integers and reals; they also test alphabetical ordering on
strings and characters. Thus the relations are overloaded and may require type
constraints. Equality (=) and its negation (<>) can be tested for most types.
For example, the function sign computes the sign (1, 0, or —1) of an integer.
It has two conditional expressions and a comment.
fun sign(n) =
    if n>0 then 1
    else if n=0 then 0
    else (*n<0*) ~1;
> val sign = fn : int -> int
Tests are combined by ML’s boolean operations:
• logical or (called orelse)
• logical and (called andalso)
• logical negation (the function not)
Functions that return a boolean value are known as predicates. Here is a predi-
cate to test whether its argument, a character, is a lower-case letter:
fun isLower c = #"a" <= c andalso c <= #"z";
> val isLower = fn : char -> bool
When a conditional expression is evaluated, either the then or the else
expression is evaluated, never both. The boolean operators andalso and orelse
behave differently from ordinary functions: the second operand is evaluated only
if necessary. Their names reflect this sequential behaviour.
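Sequential evaluation matters when the second operand is meaningful only if the first succeeds. In this sketch the hypothetical predicate divides guards a remainder operation, which would otherwise fail for a zero divisor.

```sml
(* divides is a hypothetical predicate: does d divide n exactly?
   The test d <> 0 is evaluated first; n mod d is evaluated only if it holds. *)
fun divides (d, n) = d <> 0 andalso n mod d = 0;

val guarded = divides (0, 10);   (* false; n mod 0 is never evaluated *)
val exact   = divides (5, 10);   (* true *)
val inexact = divides (3, 10);   (* false *)
```

Had andalso been an ordinary function, both operands would be evaluated before the call, and divides (0, 10) would attempt the division.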
Exercise 2.5 Let d be an integer and m a string. Write an ML boolean expres-
sion that is true just when d and m form a valid date: say 25 and "October".
Assume it is not a leap year.
Pairs, tuples and records
In mathematics, a collection of values is often viewed as a single value.
A vector in two dimensions is an ordered pair of real numbers. A statement about
two vectors v1 and v2 can be taken as a statement about four real numbers, and
those real numbers can themselves be broken down into smaller pieces, but thinking
at a high level is easier. Writing v1 + v2 for their vector sum saves us from
writing (x1 + x2, y1 + y2).
Dates are a more commonplace example. A date like 25 October 1415 con-
sists of three values. Taken as a unit, it is a triple of the form (day, month, year).
This elementary concept has taken remarkably long to appear in programming
languages, and only a few handle it properly.
Standard ML provides ordered pairs, triples, quadruples and so forth. For n ≥
2, the ordered collection of n values is called an n-tuple, or just a tuple. The tuple
whose components are x1, x2, ..., xn is written (x1, x2, ..., xn). Such a value is
created by an expression of the form (E1, E2, ..., En). With functions, tuples
give the effect of multiple arguments and results.
The components of an ML tuple may themselves be tuples or any other value.
For example, a period of time can be represented by a pair of dates, regardless of
how dates are represented. It also follows that nested pairs can represent n-tuples.
(In Classic ML, the original dialect, (x1, ..., xn-1, xn) was merely an abbreviation
for (x1, (..., (xn-1, xn)...)).)
An ML record has components identified by name, not by position. A record
with 20 components occupies a lot of space on the printed page, but is easier to
manage than a 20-tuple.
2.7 Vectors: an example of pairing
Let us develop the example of vectors. To try the syntax for pairs, enter
the vector (2.5, -1.2):
(2.5, ~1.2);
> (2.5, ~1.2) : real * real
The vector’s type, which in mathematical notation is real × real, is the type of a
pair of real numbers. Vectors are ML values and can be given names. We declare
the zero vector and two others, called a and b.
val zerovec = (0.0, 0.0);
> val zerovec = (0.0, 0.0) : real * real
val a = (1.5, 6.8);
> val a = (1.5, 6.8) : real * real
val b = (3.6, 0.9);
> val b = (3.6, 0.9) : real * real
Many functions on vectors operate on the components. The length of (x, y) is
√(x² + y²), while the negation of (x, y) is (−x, −y). To code these functions in
ML, simply write the argument as a pattern:
fun lengthvec (x,y) = Math.sqrt(x*x + y*y);
> val lengthvec = fn : real * real -> real
The function lengthvec takes the pair of values of x and y. It has type real ×
real → real: its argument is a pair of real numbers and its result is another real
number.² Here, a is a pair of real numbers.
lengthvec a;
> 6.963476143 : real
lengthvec (1.0, 1.0);
> 1.414213562 : real
Function negvec negates a vector with respect to the point (0, 0).
fun negvec (x,y) : real*real = (~x, ~y);
> val negvec = fn : real * real -> real * real
This function has type real × real → real × real: given a pair of real numbers it
returns another pair. The type constraint real × real is necessary because minus
(~) is overloaded.
We negate some vectors, giving a name to the negation of b:
negvec (1.0, 1.0);
> (~1.0, ~1.0) : real * real
val bn = negvec(b);
> val bn = (~3.6, ~0.9) : real * real
Vectors can be arguments and results of functions and can be given names. In
short, they have all the rights of ML’s built-in values, like the integers. We can
even declare a type of vectors:
type vec = real*real;
> type vec
Now vec abbreviates real × real. It is only an abbreviation though: every pair
of real numbers has type vec, regardless of whether it is intended to represent a
vector. We shall employ vec in type constraints.
2.8 Functions with multiple arguments and results
Here is a function that computes the average of a pair of real numbers.
fun average(x,y) = (x+y) /2.0;
> val average = fn : (real * real) -> real
² Function Math.sqrt, which is defined only for real numbers, constrains the
overloaded operators to type real.
This would be an odd thing to do to a vector, but average works for any two
numbers:
average (3.1,3.3);
> 3.2 : real
A function on pairs is, in effect, a function of two arguments: lengthvec(x, y) and
average(x, y) operate on the real numbers x and y. Whether we view (x, y) as a
vector is up to us. Similarly negvec takes a pair of arguments and returns a
pair of results.
Strictly speaking, every ML function has one argument and one result. With
tuples, functions can effectively have any number of arguments and results. Cur-
rying, discussed in Chapter 5, also gives the effect of multiple arguments.
Since the components of a tuple can themselves be tuples, two vectors can be
paired:
((2.0, 3.5), zerovec);
> ((2.0, 3.5), (0.0, 0.0)) : (real*real) * (real*real)
The sum of vectors (x1, y1) and (x2, y2) is (x1 + x2, y1 + y2). In ML, this function
takes a pair of vectors. Its argument pattern is a pair of pairs:
fun addvec ((x1,y1), (x2,y2)) : vec = (x1+x2, y1+y2);
> val addvec = fn : (real*real) * (real*real) -> vec
Type vec appears for the first time, constraining addition to operate on real
numbers. ML gives addvec the type
((real × real) × (real × real)) → vec
which is equivalent to the more concise (vec × vec) → vec. The ML system may
not abbreviate every real × real as vec.
Look again at the argument pattern of addvec. We may equivalently view this
function as taking
• one argument: a pair of pairs of real numbers
• two arguments: each a pair of real numbers
• four arguments: all real numbers, oddly grouped
Here we add the vectors (8.9,4.4) and b, then add the result to another vector.
Note that vec is the result type of the function.
addvec((8.9, 4.4), b);
> (12.5, 5.3) : vec
addvec(it, (0.1, 0.2));
> (12.6, 5.5) : vec
Vector subtraction involves subtraction of the components, but can be expressed
by vector operations:
fun subvec (v1,v2) = addvec(v1, negvec v2);
> val subvec = fn : (real*real) * (real*real) -> vec
The variables v1 and v2 range over pairs of reals.
subvec(a,b);
> (~2.1, 5.9) : vec
The distance between two vectors is the length of the difference:
fun distance (v1,v2) = lengthvec (subvec(v1,v2));
> val distance = fn : (real*real) * (real*real) -> real
Since distance never refers separately to v1 or v2, it can be simplified:
fun distance pairv = lengthvec (subvec pairv);
The variable pairv ranges over pairs of vectors. This version may look odd, but
is equivalent to its predecessor. How far is it from a to b?
distance (a,b);
> 6.262587325 : real
A final example will show that the components of a pair can have different types:
here, a real number and a vector. Scaling a vector means multiplying both
components by a constant.
fun scalevec (r, (x,y)) : vec = (r*x, r*y);
> val scalevec = fn : real * (real*real) -> vec
The type constraint vec ensures that the multiplications apply to reals. The
function scalevec takes a real number and a vector, and returns a vector.
scalevec(2.0, a);
> (3.0, 13.6) : vec
scalevec(2.0, it);
> (6.0, 27.2) : vec
Selecting the components of a tuple. A function defined on a pattern, say (x,y),
refers to the components of its argument through the pattern variables x and y. A
val declaration may also match a value against a pattern: each variable in the
pattern refers to the corresponding component.
Here we treat scalevec as a function returning two results, which we name xc
and yc.
val (xc,yc) = scalevec(4.0, a);
> val xc = 6.0 : real
> val yc = 27.2 : real
The pattern in a val declaration can be as complicated as the argument pattern
of a function definition. In this contrived example, a pair of pairs is split into four
parts, which are all given names.
val ((x1,y1), (x2,y2)) = (addvec(a,b), subvec(a,b));
> val x1 = 5.1 : real
> val y1 = 7.7 : real
> val x2 = ~2.1 : real
> val y2 = 5.9 : real
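The same technique works for any function that returns a tuple. As a sketch with integers, the hypothetical function divmod returns a quotient and a remainder together, and a val pattern names both results at once.

```sml
(* divmod is a hypothetical function returning two results as a pair. *)
fun divmod (n, d) = (n div d, n mod d);

(* The pattern (q, r) is matched against the pair that divmod returns. *)
val (q, r) = divmod (17, 5);      (* q = 3 and r = 2 *)

(* Quotient and remainder together reconstruct the dividend. *)
val check = q * 5 + r;            (* 17 *)
```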
The 0-tuple and the type unit. Previously we have considered n-tuples for n ≥ 2.
There is also a 0-tuple, written () and pronounced ‘unity,’ which has no compo-
nents. It serves as a placeholder in situations where no data needs to be conveyed.
The 0-tuple is the sole value of type unit.
Type unit is often used with procedural programming in ML. A procedure is
typically a ‘function’ whose result type is unit. The procedure is called for its
effect, not for its value, which is always (). For instance, some ML systems
provide a function use of type string → unit. Calling use "myfile" has the
effect of reading the definitions on the file "myfile" into ML.
A function whose argument type is unit passes no information to its body when
called. Calling the function simply causes its body to be evaluated. In Chapter 5,
such functions are used to delay evaluation for programming with infinite lists.
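Both uses of unit can be sketched briefly. The names greet and answer are hypothetical; print is the standard library's function for writing a string to the terminal.

```sml
(* A procedure: called for its effect, returning (). Type unit -> unit. *)
fun greet () = print "Good morrow\n";

(* A delayed computation: the body runs only when the function is called.
   Type unit -> int. *)
fun answer () = 6 * 7;

val u = greet ();      (* prints the greeting; u is () *)
val n = answer ();     (* the multiplication happens here, giving 42 *)
```

Declaring answer performs no arithmetic; each call answer () evaluates the body afresh.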
Exercise 2.6 Write a function to determine whether one time of day, in the form
(hours, minutes, AM or PM), comes before another. As an example, (11, 59,
"AM") comes before (1, 15, "PM").
Exercise 2.7 Old English money had 12 pence in a shilling and 20 shillings in
a pound. Write functions to add and subtract two amounts, working with triples
(pounds, shillings, pence).
2.9 Records
A record is a tuple whose components — called fields — have labels.
While each component of an n-tuple is identified by its position from 1 to n, the
fields of a record may appear in any order. Transposing the components of a
tuple is a common error. If employees are taken as triples (name, age, salary) then
there is a big difference between ("Jones", 25, 15300) and ("Jones",
15300, 25). But the records
{name="Jones", age=25, salary=15300}
and
{name="Jones", salary=15300, age=25}
are equal. A record is enclosed in braces { ... }; each field has the form label =
expression.
Records are appropriate when there are many components. Let us record five
fundamental facts about some Kings of England, and note ML’s response:
val henryV =
    {name    = "Henry V",
     born    = 1387,
     crowned = 1413,
     died    = 1422,
     quote   = "Bid them achieve me and then sell my bones"};
> val henryV =
>   {born = 1387,
>    died = 1422,
>    name = "Henry V",
>    quote = "Bid them achieve me and then sell my bones",
>    crowned = 1413}
>   : {born : int,
>      died : int,
>      name : string,
>      quote : string,
>      crowned : int}
ML has rearranged the fields into a standard order, ignoring the order in which
they were given. The record type lists each field as label : type, within braces.
Here are two more Kings:
val henryVI =
    {name    = "Henry VI",
     born    = 1421,
     crowned = 1422,
     died    = 1471,
     quote   = "Weep, wretched man, \
               \ I'll aid thee tear for tear"};
val richardIII =
    {name    = "Richard III",
     born    = 1452,
     crowned = 1483,
     died    = 1485,
     quote   = "Plots have I laid..."};
The quote of henryVI extends across two lines, using the backslash, newline,
backslash escape sequence.
Record patterns. A record pattern with fields label = variable gives each
variable the value of the corresponding label. If we do not need all the fields, we
can write three dots (...) in place of the others. Here we get two fields from
Henry V’s famous record, calling them nameV and bornV:
val {name=nameV, born=bornV, ...} = henryV;
> val nameV = "Henry V" : string
> val bornV = 1387 : int
Often we want to open up a record, making its fields directly visible. We can
specify each field in the pattern as label = label, making the variable and the
label identical. Such a specification can be shortened to simply label. We open
up Richard III:
val {name, born, died, quote, crowned} = richardIII;
> val name = "Richard III" : string
> val born = 1452 : int
> val died = 1485 : int
> val quote = "Plots have I laid..." : string
> val crowned = 1483 : int
To omit some fields, write (...) as before. Now quote stands for the quote of
Richard III. Obviously this makes sense for only one King at a time.
Record field selections. The selection #label gets the value of the given label
from a record.
#quote richardIII;
> "Plots have I laid..." : string
#died henryV - #born henryV;
> 35: int
Different record types can have labels in common. Both employees and Kings
have a name, whether "Jones" or "Henry V". The three Kings given above
have the same record type because they have the same number of fields with the
same labels and types.
Here is another example of different record types with some labels in common:
the n-tuple (x1, x2, ..., xn) is just an abbreviation for a record with numbered
fields:
{1 = x1, 2 = x2, ..., n = xn}
Yes, a label can be a positive integer! This obscure fact about Standard ML is
worth knowing for one reason: the selector #k gets the value of component k of
an n-tuple. So #1 selects the first component and #2 selects the second. If there
is a third component then #3 selects it, and so forth:
#2 ("a","b",3,false);
> "b" : string
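The correspondence between tuples and numbered records can be checked directly. In this sketch the name date is hypothetical; the final declaration compares a triple with the record it is said to abbreviate.

```sml
(* A date as a triple; the selectors #1, #2 and #3 pick out its components. *)
val date  = (25, "October", 1415);
val day   = #1 date;        (* 25 *)
val month = #2 date;        (* "October" *)
val year  = #3 date;        (* 1415 *)

(* The same value written as a record with numeric labels. *)
val same = ({1 = 25, 2 = "October", 3 = 1415} = date);   (* true *)
```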
Partial record specifications. A field selection that omits some of the fields
does not completely specify the record type; a function may only be defined
over a complete record type. For instance, a function cannot be defined for all records
that have fields born and died, without specifying the full set of field names (typically
using a type constraint). This restriction makes ML records efficient but inflexible. It
applies equally to record patterns and field selections of the form #label. Ohori (1995)
has defined and implemented flexible records for a variant of ML.
Declaring a record type. Let us declare the record type of Kings. This abbrevi-
ation will be useful for type constraints in functions.
type king = {name    : string,
             born    : int,
             crowned : int,
             died    : int,
             quote   : string};
> type king
We now can declare a function on type king to return the King’s lifetime:
fun lifetime(k: king) = #died k - #born k;
> val lifetime = fn : king -> int
Using a pattern, lifetime can be declared like this:
fun lifetime ({born, died, ...}: king) = died - born;
Either way the type constraint is mandatory. Otherwise ML will print a message
like ‘A fixed record type is needed here.’
lifetime henryV;
> 35 : int
lifetime richardIII;
> 33 : int
Exercise 2.8 Does the following function definition require a type constraint?
What is its type?
fun lifetime ({name, born, crowned, died, quote}) = died - born;
Exercise 2.9 Discuss the differences, if any, between the selector #born and
the function
fun bornat({born}) = born;
2.10 Infix operators
An infix operator is a function that is written between its two arguments.
We take infix notation for granted in mathematics. Imagine doing without it. In-
stead of 2+2=4 we should have to write =(+(2,2),4). Most functional languages
let programmers declare their own infix operators.
Let us declare an infix operator xor for ‘exclusive or.’ First we issue an ML
infix directive:
infix xor;
We now must write p xor q rather than xor (p,q):
fun (p xor q) = (p orelse q) andalso not (p andalso q);
> val xor = fn : (bool * bool) -> bool
The function xor takes a pair of booleans and returns a boolean.
true xor false xor true;
> false : bool
The infix status of a name concerns only its syntax, not its value, if any. Usually
a name is made infix before it has any value at all.
Precedence of infixes. Most people take m × n + i/j to mean (m × n) + (i/j),
giving × and / higher precedence than +. Similarly i − j − k means (i − j) −
k, since the operator − associates to the left. An ML infix directive may state a
precedence from 0 to 9. The default precedence is 0, which is the lowest. The
directive infix causes association to the left, while infixr causes association
to the right.
To demonstrate infixes, the following functions construct strings enclosed in
parentheses. Operator plus has precedence 6 (the precedence of + in ML) and
constructs a string containing a + sign.
infix 6 plus;
fun (a plus b) = "(" ^ a ^ "+" ^ b ^ ")";
> val plus = fn : string * string -> string
Observe that plus associates to the left:
"1" plus "2" plus "3";
> "((1+2)+3)" : string
Similarly, times has precedence 7 (like * in ML) and constructs a string contain-
ing a * sign.
infix 7 times;
fun (a times b) = "(" ^ a ^ "*" ^ b ^ ")";
> val times = fn : string * string -> string
"m" times "n" times "3" plus "i" plus "j" times "k";
> "((((m*n)*3)+i)+(j*k))" : string
The operator pow has higher precedence than times and associates to the right,
which is traditional for raising to a power. It produces a # sign. (ML has no op-
erator for powers.)
infixr 8 pow;
fun (a pow b) = "(" ^ a ^ "#" ^ b ^ ")";
> val pow = fn : string * string -> string
"m" times "i" pow "j" pow "2" times "n";
> "((m*(i#(j#2)))*n)" : string
Many infix operators have symbolic names. Let ++ be the operator for vector
addition:
infix ++;
fun ((x1,y1) ++ (x2,y2)) : vec = (x1+x2, y1+y2);
> val ++ = fn : (real*real) * (real*real) -> vec
It works exactly like addvec, but with infix notation:
b ++ (0.1,0.2) ++ (20.0, 30.0);
> (23.7, 31.1) : vec
Keep symbolic names separate. Symbolic names can cause confusion if you
run them together. Below, ML reads the characters +~ as one symbolic name,
then complains that this name has no value:
1+~3;
> Unknown name +~
Symbolic names must be separated by spaces or other characters:
1+ ~3;
> ~2 : int
Taking infixes as functions. Occasionally an infix has to be treated like an
ordinary function. In ML the keyword op overrides infix status: if ⊕ is an infix
operator then op⊕ is the corresponding function, which can be applied to a pair
in the usual way.
op++ ((2.5,0.0), (0.1,2.5));
> (2.6, 2.5) : real * real
op^ ("Mont", "joy");
> "Montjoy" : string
Infix status can be revoked. If ⊕ is an infix operator then the directive nonfix ⊕
makes it revert to ordinary function notation. A subsequent infix directive can
make ⊕ an infix operator again.
Here we deprive ML’s multiplication operator of its infix status. The attempt
to use it produces an error message, since we may not apply 3 as a function. But
* can be applied as a function:
nonfix *;
3*2;
> Error: Type conflict...
*(3,2);
> 6 : int
The nonfix directive is intended for interactive development of syntax, for try-
ing different precedences and association. Changing the infix status of estab-
lished operators leads to madness.
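The op keyword described above is most useful when an infix must be handed to another function. The following is a minimal sketch, assuming the Basis Library function foldl and list notation, neither of which has been introduced at this point; the names total, joined and same are purely illustrative.

```sml
(* foldl : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b  (Basis Library).   *)
(* Writing op before an infix yields an ordinary function on pairs.   *)
val total = foldl op+ 0 [1, 2, 3, 4];

(* foldl supplies (element, accumulator), so append on the right. *)
val joined = foldl (fn (s, acc) => acc ^ s) "" ["Mont", "joy"];

(* The same concatenation, applying the infix directly as a function. *)
val same = op^ ("Mont", "joy") = joined;
```

Without op, an expression such as foldl + 0 would be rejected, because + on its own is still parsed as an infix.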
The evaluation of expressions
An imperative program specifies commands to update the machine state.
During execution, the state changes millions of times per second. Its structure
changes too: local variables are created and destroyed. Even if the program has
a mathematical meaning independent of hardware details, that meaning is be-
yond the comprehension of the programmer. Axiomatic and denotational seman-
tic definitions make sense only to a handful of experts. Programmers trying to
correct their programs rely on debugging tools and intuition.
Functional programming aims to give each program a straightforward math-
ematical meaning. It simplifies our mental image of execution, for there are no
state changes. Execution is the reduction of an expression to its value, replacing
equals by equals. Most function definitions can be understood within elementary
mathematics.
When a function is applied, as in f(E), the argument E must be supplied to the
body of f. If the expression contains several function calls, one must be chosen
according to some evaluation rule. The evaluation rule in ML is call-by-value (or
strict evaluation), while most purely functional languages adopt call-by-need (or
lazy evaluation).
Each evaluation rule has its partisans. To compare the rules we shall consider
two trivial functions. The squaring function sqr uses its argument twice:
fun sqr(x) : int = x*x;
> val sqr = fn : int -> int
The constant function zero ignores its argument and returns 0:
fun zero(x : int) = 0;
> val zero = fn : int -> int
When a function is called, the argument is substituted for the function’s formal
parameter in the body. The evaluation rules differ over when, and how many
times, the argument is evaluated. The formal parameter indicates where in the
body to substitute the argument. The name of the formal parameter has no other
significance, and no significance outside of the function definition.
2.11 Evaluation in ML: call-by-value
Let us assume that expressions consist of constants, variables, function
calls and conditional expressions (if-then-else). Constants have explicit val-
ues; variables have bindings in the environment. So evaluation has only to deal
with function calls and conditionals. ML’s evaluation rule is based on an obvious
idea.
To compute the value of f(E), first compute the value of the expression E.
This value is substituted into the body of f, which then can be evaluated. Pattern-
matching is a minor complication. If f is declared by, say
fun f (x,y,z) = body
then substitute the corresponding parts of E's value for the pattern variables x, y
and z. (A practical implementation performs no substitutions, but instead binds
the formal parameters in the local environment.)
Consider how ML evaluates sqr(sqr(sqr(2))). Of the three function calls, only
the innermost call has a value for the argument. So sqr(sqr(sqr(2))) reduces to
sqr(sqr(2 × 2)). The multiplication must now be evaluated, yielding sqr(sqr(4)).
Evaluating the inner call yields sqr(4 × 4), and so forth. Reductions are written
sqr(sqr(4)) => sqr(4 × 4). The full evaluation looks like this:
sqr(sqr(sqr(2))) => sqr(sqr(2 × 2))
                 => sqr(sqr(4))
                 => sqr(4 × 4)
                 => sqr(16)
                 => 16 × 16
                 => 256
Now consider zero(sqr(sqr(sqr(2)))). The argument of zero is the expression
evaluated above. It is evaluated but the value is ignored:
zero(sqr(sqr(sqr(2)))) => zero(sqr(sqr(2 × 2)))
                       ...
                       => zero(256)
                       => 0
Such waste! Functions like zero are uncommon, but frequently a function’s result
does not depend on all of its arguments.
ML’s evaluation rule is known as call-by-value because a function is always
given its argument’s value. It is not hard to see that call-by-value corresponds
to the usual way we should perform a calculation on paper. Almost all program-
ming languages adopt it. But perhaps we should look for an evaluation rule that
reduces zero(sqr(sqr(sqr(2)))) to 0 in one step. Before such issues can be exam-
ined, we must have a look at recursion.
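The waste can be made visible. The sketch below is not from the text: it uses a reference cell (an imperative feature not yet introduced) to count how often the argument is evaluated, and redefines sqr with that side effect purely for the experiment. The name evaluations is hypothetical.

```sml
(* A counter makes evaluation of the argument observable. *)
val evaluations = ref 0;

(* sqr as before, except that each call bumps the counter first. *)
fun sqr (x : int) = (evaluations := !evaluations + 1; x * x);

fun zero (x : int) = 0;

(* Under call-by-value all three calls to sqr run before zero is entered. *)
val result = zero (sqr (sqr (sqr 2)));
val count  = !evaluations;
```

Here result is 0, yet count is 3: the argument was fully evaluated although zero never uses it.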
2.12 Recursive functions under call-by-value
The factorial function is a standard example of recursion. It includes a
base case, n = 0, where evaluation stops.
fun fact n =
    if n=0 then 1 else n * fact(n-1);
> val fact = fn : int -> int
fact 7;
> 5040 : int
fact 35;
> 10333147966386144929666651337523200000000 : int
ML evaluates fact(4) as follows. The argument, 4, is substituted for n in the body,
yielding
if 4=0 then 1 else 4 × fact(4 - 1)
Since 4 = 0 is false, the conditional reduces to 4 × fact(4 - 1). Then 4 - 1 is
selected, and the entire expression reduces to 4 × fact(3). Figure 2.1 summarizes
the evaluation. The conditionals are not shown: they behave similarly apart from
n = 0, when the conditional returns 1.
Figure 2.1 Evaluation of fact(4)
fact(4) => 4 × fact(4 - 1)
        => 4 × fact(3)
        => 4 × (3 × fact(3 - 1))
        => 4 × (3 × fact(2))
        => 4 × (3 × (2 × fact(2 - 1)))
        => 4 × (3 × (2 × fact(1)))
        => 4 × (3 × (2 × (1 × fact(1 - 1))))
        => 4 × (3 × (2 × (1 × fact(0))))
        => 4 × (3 × (2 × (1 × 1)))
        => 4 × (3 × (2 × 1))
        => 4 × (3 × 2)
        => 4 × 6
        => 24
Figure 2.2 Evaluation of facti(4, 1)
facti(4, 1) => facti(4 - 1, 4 × 1)
            => facti(3, 4)
            => facti(3 - 1, 3 × 4)
            => facti(2, 12)
            => facti(2 - 1, 2 × 12)
            => facti(1, 24)
            => facti(1 - 1, 1 × 24)
            => facti(0, 24)
            => 24
The evaluation of fact(4) exactly follows the mathematical definition of fac-
torial: 0! = 1, and n! = n × (n - 1)! if n > 0. Could the execution of a recursive
procedure be shown as succinctly?
Iterative functions. Something is odd about the computation of fact(4). As the
recursion progresses, more and more numbers are waiting to be multiplied. The
multiplications cannot take place until the recursion terminates with fact(0). At
that point 4 × (3 × (2 × (1 × 1))) must be evaluated. This paper calculation shows
that fact is wasting space.
A more efficient version can be found by thinking about how we should com-
pute factorials. By the associative law, each multiplication can be done at once:
4 × (3 × fact(2)) = (4 × 3) × fact(2) = 12 × fact(2)
The computer will not apply such laws unless we force it to. The function facti
keeps a running product in p, which initially should be 1:
fun facti (n,p) =
    if n=0 then p else facti(n-1, n*p);
> val facti = fn : int * int -> int
Compare the evaluation for facti(4, 1), shown in Figure 2.2, with that of fact(4).
The intermediate expressions stay small; each multiplication can be done at once;
storage requirements remain constant. The evaluation is iterative, also termed
tail recursive. In Section 6.3 we shall prove that facti gives correct results by
establishing the law facti(n, p) = n! × p.
Good compilers detect iterative forms of recursion and execute them effi-
ciently. The result of the recursive call facti(n - 1, n × p) undergoes no further
computation, but is immediately returned as the value of facti(n, p). Such a tail
call can be executed by assigning the arguments n and p their new values and
then jumping back into the function, avoiding the cost of a proper function in-
vocation. The recursive call in fact is not a tail call because its value undergoes
further computation, namely multiplication by n.
Many functions can be made iterative by adding an argument, like p in facti.
Sometimes the iterative function runs much faster. Sometimes, making a func-
tion iterative is the only way to avoid running out of store. However, adding an
extra argument to every recursive function is a bad habit. It leads to ugly, con-
voluted code that might run slower than it should.
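When an accumulating argument is justified, it need not leak into the function's interface. The following sketch hides it in a local helper, using a let declaration of the kind introduced later in this chapter. The names factorial and loop are illustrative, not from the text.

```sml
(* Callers see only factorial; the running product is private to loop. *)
fun factorial n =
  let
    (* loop (k, p) returns k! * p; its recursive call is a tail call. *)
    fun loop (k, p) = if k = 0 then p else loop (k - 1, k * p)
  in
    loop (n, 1)
  end;

val f7 = factorial 7;
```

The caller cannot pass a wrong initial value for the product, which is one risk of exposing facti directly.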
The special role of conditional expressions. The conditional expression permits
definition by cases. Recall how the factorial function is defined:
0! = 1
n! = n × (n - 1)!    for n > 0
These equations determine n! for all integers n ≥ 0. Omitting the condition n >
0 from the second equation would lead to absurdity:
1 = 0! = 0 × (-1)! = 0
Similarly, in the conditional expression
if E then E1 else E2
ML evaluates E1 only if E = true, and evaluates E2 only if E = false.
Due to call-by-value, there is no ML function cond such that cond(E, E1, E2)
is evaluated like a conditional expression. Let us try to declare one and use it to
code the factorial function:
fun cond(p,x,y) : int = if p then x else y;
> val cond = fn : bool * int * int -> int
fun badf n = cond(n=0, 1, n*badf(n-1));
> val badf = fn : int -> int
This may look plausible, but every call to badf runs forever. Observe the evalu-
ation of badf (0):
badf(0) => cond(true, 1, 0 × badf(-1))
        => cond(true, 1, 0 × cond(false, 1, -1 × badf(-2)))
Although cond never requires the values of all three of its arguments, the call-
by-value rule evaluates them all. The recursion cannot terminate.
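A strict language can still postpone evaluation by wrapping each branch in a function. The sketch below is an illustration rather than the text's method: it relies on fn notation (covered later in the book), and the names condDelayed and goodf are hypothetical.

```sml
(* Each branch is a function of unit, so nothing runs until it is called. *)
fun condDelayed (p, x : unit -> int, y : unit -> int) : int =
  if p then x () else y ();

(* Unlike badf, the recursive call sits inside a function body and is
   evaluated only when the test n = 0 fails. *)
fun goodf n = condDelayed (n = 0, fn () => 1, fn () => n * goodf (n - 1));

val g5 = goodf 5;
```

The arguments are still passed by value, but the values are now unevaluated functions, so the recursion terminates.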
Conditional and/or. ML's boolean infix operators andalso and orelse are
not functions, but stand for conditional expressions.
The expression E1 andalso E2 abbreviates
if E1 then E2 else false.
The expression E1 orelse E2 abbreviates
if E1 then true else E2.
These operators compute the boolean and/or, but evaluate E2 only if necessary.
If they were functions, the call-by-value rule would evaluate both arguments. All
other ML infixes are really functions.
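The difference is easy to observe with an operand that must not be evaluated. This is a small sketch under stated assumptions: it uses exceptions (raise and handle, introduced later in the book) and the Basis exception Fail; the names boom and strictAnd are hypothetical.

```sml
(* Evaluating boom () raises an exception, so we can tell if it happened. *)
fun boom () : bool = raise Fail "second operand was evaluated";

val skippedAnd = false andalso boom ();   (* boom () is never called *)
val skippedOr  = true orelse boom ();     (* boom () is never called *)

(* An ordinary function receives both arguments already evaluated. *)
fun strictAnd (p, q) = if p then q else false;

(* The call raises Fail before strictAnd is even entered. *)
val raised = (strictAnd (false, boom ()); false) handle Fail _ => true;
```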
The sequential evaluation of andalso and orelse makes them ideal for
expressing recursive predicates (boolean-valued functions). The function
powoftwo tests whether a number is a power of two:
fun even n = (n mod 2 = 0);
> val even = fn : int -> bool
fun powoftwo n = (n=1) orelse
(even(n) andalso powoftwo(n div 2));
> val powoftwo = fn : int -> bool
You might expect powoftwo to be defined by conditional expressions, and so it
is, through orelse and andalso. Evaluation terminates once the outcome is
decided:
powoftwo(6) => (6=1) orelse (even(6) andalso ···)
            => even(6) andalso powoftwo(6 div 2)
            => powoftwo(3)
            => (3=1) orelse (even(3) andalso ···)
            => even(3) andalso powoftwo(3 div 2)
            => false
Exercise 2.10 Write the reduction steps for powoftwo(8).
Exercise 2.11 Is powoftwo an iterative function?
2.13 Call-by-need, or lazy evaluation
The call-by-value rule has accumulated a catalogue of complaints. It
evaluates E superfluously in zero(E). And it evaluates E1 or E2 superfluously in
cond(E, E1, E2). Conditional expressions and similar operations cannot be func-
tions. ML provides andalso and orelse, but we have no means of defining
similar things.
Shall we give functions their arguments as expressions, not as values? The
general idea is this:
To compute the value of f(E), substitute E immediately into the body of f.
Then compute the value of the resulting expression.
This is the call-by-name rule. It reduces zero(sqr(sqr(sqr(2)))) at once to 0. But
it does badly by sqr(sqr(sqr(2))). It duplicates the argument, sqr(sqr(2)). The
result of this ‘reduction’ is
sqr(sqr(2)) × sqr(sqr(2)).
This happens because sqr(x) = x × x.
Multiplication, like other arithmetic operations, needs special treatment. It
must be applied to values, not expressions: it is an example of a strict function.
To evaluate E1 × E2, the expressions E1 and E2 must be evaluated first.
Let us carry on with the evaluation. As the outermost function is ×, which is
strict, the rule selects the leftmost call to sqr. Its argument is also duplicated:
(sqr(2) × sqr(2)) × sqr(sqr(2))
A full evaluation goes something like this.
sqr(sqr(sqr(2))) => sqr(sqr(2)) × sqr(sqr(2))
                 => (sqr(2) × sqr(2)) × sqr(sqr(2))
                 => ((2 × 2) × sqr(2)) × sqr(sqr(2))
                 => (4 × sqr(2)) × sqr(sqr(2))
                 => (4 × (2 × 2)) × sqr(sqr(2))
Does it ever reach the answer? Eventually. But call-by-name cannot be the eval-
uation rule we want.
The call-by-need rule (lazy evaluation) is like call-by-name, but ensures that
each argument is evaluated at most once. Rather than substituting an expression
into the function’s body, the occurrences of the argument are linked by pointers.
If the argument is ever evaluated, the value will be shared with its other occur-
rences. The pointer structure forms a directed graph of functions and arguments.
As a part of the graph is evaluated, it is updated by the resulting value. This is
called graph reduction.
Figure 2.3 presents a graph reduction. Every step replaces an occurrence of
sqr(E) by E × E, where the two Es are shared. There is no wasteful duplica-
tion: only three multiplications are performed. We seem to have the best of both
worlds, for zero(E) reduces immediately to 0. But the graph manipulations are
expensive.
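The sharing can be imitated in ML itself. The sketch below is only an approximation of graph reduction, assuming datatypes and references (both covered later in the book): a suspension is a mutable cell holding either a delayed computation or its cached value. All names (susp, delay, force, times, multiplications) are hypothetical.

```sml
(* A suspension is either a delayed computation or its cached value. *)
datatype 'a susp = Delayed of unit -> 'a | Forced of 'a;

fun delay f = ref (Delayed f);

(* force evaluates at most once, then overwrites the cell with the value. *)
fun force s =
  case !s of
      Forced v  => v
    | Delayed f => let val v = f () in s := Forced v; v end;

(* Count multiplications so that the sharing can be observed. *)
val multiplications = ref 0;
fun times (a, b) = (multiplications := !multiplications + 1; a * b);

(* Both uses of the argument refer to one shared suspension. *)
fun sqr s = delay (fn () => times (force s, force s));

val answer = force (sqr (sqr (sqr (delay (fn () => 2)))));
val count  = !multiplications;
```

The result is 256 after exactly three multiplications, matching the count claimed for the graph reduction; the extra cells and tests hint at why the bookkeeping is costly.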
Lazy evaluation of cond(E, E1, E2) behaves like a conditional expression pro-
vided that its argument, the tuple (E, E1, E2), is itself evaluated lazily. The de-
tails of this are quite subtle: tuple formation must be viewed as a function. The
idea that a data structure like (E, E1, E2) can be partially evaluated (either E1
or E2 but not both) leads to infinite lists.
Figure 2.3 Graph reduction of sqr(sqr(sqr(2))) [diagram not reproduced]
Figure 2.4 A space leak with lazy evaluation
facti(4, 1) => facti(4 - 1, 4 × 1)
            => facti(3 - 1, 3 × (4 × 1))
            => facti(2 - 1, 2 × (3 × (4 × 1)))
            => facti(1 - 1, 1 × (2 × (3 × (4 × 1))))
            => 1 × (2 × (3 × (4 × 1)))
            => 24
A comparison of strict and lazy evaluation. Call-by-need does the least possible
evaluation. It may seem like the route to efficiency. But it requires much book-
keeping. Realistic implementations became possible only after David Turner
(1979) applied graph reduction to combinators. He exploited obscure facts about
the λ-calculus to develop new compilation techniques, which researchers con-
tinue to improve. Every new technology has its evangelists: some people are
claiming that lazy evaluation is the way, the truth and the light. Why does Stan-
dard ML not adopt it?
Lazy evaluation says that zero(E) = 0 even if E fails to terminate. This flies
in the face of mathematical tradition: an expression is meaningful only if all its
parts are. Alonzo Church, the inventor of the λ-calculus, preferred a variant (the
λI-calculus) banning constant functions like zero.
Infinite data structures complicate mathematical reasoning. To fully under-
stand lazy evaluation, it is necessary to know some domain theory, as well as the
theory of the λ-calculus. The output of a program is not simply a value, but a
partially evaluated expression. These concepts are not easy to learn, and many
of them are mechanistic. If we can only think in terms of the evaluation mecha-
nism, we are no better off than the procedural programmers.
Efficiency is problematical too. Sometimes lazy evaluation saves enormous
amounts of space; sometimes it wastes space. Recall that facti is more efficient
than fact under strict evaluation, performing each multiplication at once. Lazy
evaluation of facti(n, p) evaluates n immediately (for the test n = 0), but not p.
The multiplications accumulate; we have a space leak (Figure 2.4).
Most lazy programming languages are purely functional. Can lazy evaluation
be combined with commands, such as are used in ML to perform input/output?
Subexpressions would be evaluated at unpredictable times; it would be impos-
sible to write reliable programs. Much research has been directed at combining
functional and imperative programming (Peyton Jones and Wadler, 1993).
Writing recursive functions
Since recursion is so fundamental to functional programming, let us take
the time to examine a few recursive functions. There is no magic formula for
program design, but perhaps it is possible to learn by example. One recursive
function we have already seen implements Euclid’s Algorithm:
fun gcd(m,n) =
    if m=0 then n
    else gcd(n mod m, m);
> val gcd = fn : int * int -> int
The Greatest Common Divisor of two integers is by definition the greatest integer
that divides both. Euclid’s Algorithm is correct because the divisors of m and n
are the same as those of m and n - m, and, by repeated subtraction, the same as
the divisors of m and n mod m. Regarding its efficiency, consider
gcd(5499, 6812) => gcd(1313, 5499) => gcd(247, 1313)
                => gcd(78, 247) => gcd(13, 78) => gcd(0, 13) => 13.
Euclid’s Algorithm dates from antiquity. We seldom can draw on 2000 years of
expertise, but we should aim for equally elegant and efficient solutions.
Recursion involves reducing a problem to smaller subproblems. The key to
efficiency is to select the right subproblems. There must not be too many of them,
and the rest of the computation should be reasonably simple.
2.14 Raising to an integer power
The obvious way to compute x^k is to multiply repeatedly by x. Using
recursion, the problem x^k is reduced to the subproblem x^{k-1}. But x^{10} need not
involve 10 multiplications. We can compute x^5 and then square it. Since x^5 =
x × x^4, we can compute x^4 by squaring also:
x^{10} = (x^5)^2 = (x × x^4)^2 = (x × (x^2)^2)^2
By exploiting the law x^{2n} = (x^n)^2 we have improved vastly over repeated mul-
tiplication. But the computation is still messy: using instead x^{2n} = (x^2)^n elimi-
nates the nested squaring:
2^{10} = 4^5 = 4 × 16^2 = 4 × 256^1 = 1024
By this approach, power computes x^k for real x and integer k > 0:
fun power(x,k) : real =
if k=1 then x
else if k mod 2 = 0 then power (x*x, k div 2)
else x * power(x*x, k div 2);
> val power = fn : real * int -> real
Note how mod tests whether the exponent is even. Integer division (div) trun-
cates its result to an integer if k is odd. The function power embodies the equa-
tions (for n > 0)
x^1 = x
x^{2n} = (x^2)^n
x^{2n+1} = x × (x^2)^n.
We can test power using the built-in exponentiation function Math.pow:
power(2.0,10);
> 1024.0 : real
power(1.01, 925);
> 9937.353723 : real
Math.pow(1.01, 925.0);
> 9937.353723 : real
Reducing x^{2n} to (x^2)^n instead of (x^n)^2 makes power iterative in its first recursive
call. The second call (for odd exponents) can be made iterative only by introduc-
ing an argument to hold the result, which is a needless complication.
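For comparison, here is roughly what that complication looks like. This is a sketch, not the text's recommended definition: the names powerIter, loop and acc are hypothetical, and the base case k = 0 is used so that the accumulator can start at 1.0.

```sml
(* loop (x, k, acc) computes acc * x^k; both recursive calls are tail calls. *)
fun powerIter (x : real, k : int) : real =
  let
    fun loop (x : real, k, acc : real) =
      if k = 0 then acc
      else if k mod 2 = 0 then loop (x * x, k div 2, acc)
      else loop (x * x, k div 2, x * acc)
  in
    loop (x, k, 1.0)
  end;

val p = powerIter (2.0, 10);
```

The extra argument buys constant stack space, but the recursion depth of power is only about log2 k, so the gain is negligible here.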
Exercise 2.12 Write the computation steps for power(2.0, 29).
Exercise 2.13 How many multiplications does power(x, k) need in the worst
case?
Exercise 2.14 Why not take k = 0 for the base case instead of k = 1?
2.15 Fibonacci numbers
The Fibonacci sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ..., is popular
with mathematical hobbyists because it enjoys many fascinating properties. The
sequence (F_n) is defined by
F_0 = 0
F_1 = 1
F_n = F_{n-2} + F_{n-1}    for n ≥ 2.
The corresponding recursive function is a standard benchmark for measuring the
efficiency of compiled code! It is far too slow for any other use because it com-
putes subproblems repeatedly. For example, since
F_8 = F_6 + F_7 = F_6 + (F_5 + F_6),
it computes F_6 twice.
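The recursive function in question is not listed in the text; a direct transcription of the three defining equations might look like the following sketch, where the name naivefib is an assumption.

```sml
(* One clause per defining equation; the two recursive calls overlap,
   so the running time grows exponentially with n. *)
fun naivefib n =
  if n = 0 then 0
  else if n = 1 then 1
  else naivefib (n - 2) + naivefib (n - 1);

val f10 = naivefib 10;
```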
Each Fibonacci number is the sum of the previous two:
0+1=1  1+1=2  1+2=3  2+3=5  3+5=8  ···
So we should compute with pairs of numbers. Function nextfib takes (F_{n-1}, F_n)
and returns the next pair (F_n, F_{n+1}).
fun nextfib(prev, curr : int) = (curr, prev+curr);
> val nextfib = fn : int * int -> int * int
The special name it, by referring to the previous pair, helps us demonstrate the
function:
nextfib (0,1);
> (1, 1) : int * int
nextfib it;
> (1, 2) : int * int
nextfib it;
> (2, 3) : int * int
nextfib it;
> (3, 5) : int * int
Recursion applies nextfib the requisite number of times:
fun fibpair (n) =
    if n=1 then (0,1) else nextfib(fibpair(n-1));
> val fibpair = fn : int -> int * int
It quickly computes (F_29, F_30), which previously would have required nearly
three million function calls:
fibpair 30;
> (514229, 832040) : int * int
Let us consider in detail why fibpair is correct. Clearly fibpair(1) = (F_0, F_1).
And if, for n ≥ 1, we have
fibpair(n) = (F_{n-1}, F_n),
then
fibpair(n + 1) = (F_n, F_{n-1} + F_n) = (F_n, F_{n+1}).
We have just seen a proof of the formula fibpair(n) = (F_{n-1}, F_n) by mathemat-
ical induction. We shall see many more examples of such proofs in Chapter 6.
Proving properties of functional programs is often straightforward; this is one of
the main advantages of functional languages.
The function fibpair uses a correct and fairly efficient algorithm for comput-
ing Fibonacci numbers, and it illustrates computing with pairs. But its pattern of
recursion wastes space: fibpair builds the nest of calls
nextfib(nextfib(··· nextfib(0, 1) ···)).
To make the algorithm iterative, let us turn the computation inside out:
fun itfib (n, prev, curr) : int =
    if n=1 then curr (*does not work for n=0*)
    else itfib (n-1, curr, prev+curr);
> val itfib = fn : int * int * int -> int
The function fib calls itfib with correct initial arguments:
fun fib (n) = itfib(n,0,1);
> val fib = fn : int -> int
fib 30;
> 832040 : int
fib 100;
> 354224848179261915075 : int
For Fibonacci numbers, iteration is clearer than recursion:
itfib(7, 0, 1) => itfib(6, 1, 1) => ··· => itfib(1, 8, 13) => 13
In Section 6.3 we shall show that itfib is correct by proving the rather unusual
law itfib(n, F_k, F_{k+1}) = F_{k+n}.
Exercise 2.15 How is the repeated computation in the recursive definition of
F_n related to the call-by-name rule? Could lazy evaluation execute this definition
efficiently?
Exercise 2.16 Show that the number of steps needed to compute F_n by its re-
cursive definition is exponential in n. How many steps does fib perform? Assume
that call-by-value is used.
Exercise 2.17 What is the value of itfib(n, F_{k-1}, F_k)?
2.16 Integer square roots
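The integer square root of n is the integer k such that
k^2 ≤ n < (k+1)^2.
To compute it by recursion we need a smaller subproblem, here obtained by
dividing by 4. Suppose n > 0. Since n may not be exactly divisible by 4, write
n = 4m + r, where r = 0, 1, 2, or 3. Since m < n we can recursively find the
integer square root of m:
i^2 ≤ m < (i+1)^2.
Since m and i are integers, m + 1 ≤ (i+1)^2. Multiplication by 4 implies 4i^2 ≤ 4m
and 4(m + 1) ≤ 4(i+1)^2. Therefore
(2i)^2 ≤ 4m ≤ n < 4m + 4 ≤ (2i + 2)^2.
So the integer square root of n is 2i or 2i + 1; it remains only to test whether
(2i + 1)^2 ≤ n:
fun increase (k,n) =
    if (k+1)*(k+1) > n then k else k+1;
> val increase = fn : int * int -> int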
The recursion terminates when n = 0. Repeated integer division will reduce any
number to 0 eventually:
fun introot n =
if n=0 then 0 else increase(2 * introot(n div 4), n);
> val introot = fn : int -> int
There are faster methods of computing square roots, but ours is respectably fast
and is a simple demonstration of recursion.
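One way to gain confidence in introot is to test the defining inequality over a range of inputs. The sketch below repeats the two declarations so that it stands alone; isRoot and allOk are hypothetical helper names, and the bound 1000 is arbitrary.

```sml
fun increase (k, n) = if (k + 1) * (k + 1) > n then k else k + 1;
fun introot n = if n = 0 then 0 else increase (2 * introot (n div 4), n);

(* isRoot (k, n) holds when k*k <= n < (k+1)*(k+1). *)
fun isRoot (k, n) = k * k <= n andalso n < (k + 1) * (k + 1);

(* allOk n checks introot on every integer from n down to 0. *)
fun allOk n = isRoot (introot n, n) andalso (n = 0 orelse allOk (n - 1));

val checked = allOk 1000;
```

Such a test is no substitute for the argument given above, but it catches slips such as testing against m instead of n.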
introot 123456789;
> 11111 : int
it*it;
> 123454321 : int
introot 2000000000000000000000000000000;
> 1414213562373095 : int
it*it;
> 1999999999999999861967979879025 : int
Exercise 2.18 Code this integer square root algorithm using iteration in a pro-
cedural programming language.
Exercise 2.19 Declare an ML function for computing the Greatest Common Di-
visor, based on these equations (m and n range over positive integers):
GCD(2m, 2n) = 2 × GCD(m, n)
GCD(2m, 2n + 1) = GCD(m, 2n + 1)
GCD(2m + 1, 2n + 1) = GCD(n - m, 2m + 1)    m < n

> val fraction = fn : int * int -> int * int
We have used a let expression, which has the general form
let D in E end
During evaluation, the declaration D is evaluated first: expressions within the
declaration are evaluated, and their results given names. The environment thus
created is visible only inside the let expression. Then the expression E is eval-
uated, and its value returned.
Typically D is a compound declaration, which consists of a list of declarations:
D1; D2; ...; Dn
The effect of each declaration is visible in subsequent ones. The semicolons are
optional and many programmers omit them.
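As an illustration of let, consider a function with the type shown in the response above, int * int -> int * int. The body below is an assumed reconstruction, not necessarily the text's own declaration: it supposes that fraction reduces a numerator and denominator by their greatest common divisor, and the local name com is hypothetical.

```sml
fun gcd (m, n) = if m = 0 then n else gcd (n mod m, m);

(* Assumed behaviour: divide both parts by their common divisor. *)
fun fraction (n, d) =
  let val com = gcd (n, d)       (* computed once, used twice *)
  in  (n div com, d div com)  end;

val reduced = fraction (5499, 6812);
```

Naming the divisor avoids calling gcd twice, and com is invisible outside the let expression.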
2.17 Example: real square roots
The Newton-Raphson method finds roots of a function: in other words,
it solves equations of the form f(x) = 0. Given a good initial approximation, it
converges rapidly. It is highly effective for computing square roots, solving the
equation a - x^2 = 0. To compute √a, choose any positive x_0, say 1, as the first
approximation. If x is the current approximation then the next approximation is
(a/x + x)/2. Stop as soon as the difference becomes small enough.
The function findroot performs this computation, where x approximates the
square root of a and acc is the desired accuracy (relative to x). Since the next
approximation is used several times, it is given the name nextx using let.
fun findroot (a, x, acc) =
    let val nextx = (a/x + x) / 2.0
    in  if abs (x-nextx) < acc*x
        then nextx else findroot (a, nextx, acc)
    end;
> val findroot = fn : (real * real * real) -> real
The function sqroot calls findroot with suitable starting values.
fun sqroot a = findroot (a, 1.0, 1.0E~10);
> val sqroot = fn : real -> real
sqroot 2.0;
> 1.414213562 : real
it*it;
> 2.0 : real
Nested function declarations. Our square root function is still not ideal. The ar-
guments a and acc are passed unchanged in every recursive call of findroot. They
can be made global to findroot for efficiency and clarity.
A further let declaration nests findroot within sqroot. The accuracy acc is
declared first, to be visible in findroot; the argument a is also visible.
fun sqroot a =
    let val acc = 1.0E~10
        fun findroot x =
            let val nextx = (a/x + x) / 2.0
            in  if abs (x-nextx) < acc*x
                then nextx else findroot nextx
            end
    in  findroot 1.0  end;
> val sqroot = fn : real -> real
As we see from ML’s response, findroot is not visible outside sqroot.
Most kinds of declaration are permitted within let. Values, functions, types
and exceptions may be declared.
When not to use let. Consider taking the minimum of f(x) and g(x). You could
name these quantities using let:
let val a = f x
    val b = g x
in
    if a<b then a else b
end
Better, declare a function for the minimum of two real numbers:
fun min(a,b) : real = if a<b then a else b;
Now min (f x, g x) is clear because min computes something familiar. Take
every opportunity to declare meaningful functions, even if they are only needed
once.
2.18 Hiding declarations using local
A local declaration resembles a let expression:
local D1 in D2 end
This declaration behaves like the list of declarations D1; D2 except that D1 is vis-
ible only within D2, not outside. Since a list of declarations is regarded as one
declaration, both D1 and D2 can declare any number of names.
While let is frequently used, local is not. Its sole purpose is to hide a dec-
laration. Recall itfib and fib, which compute Fibonacci numbers. The function
itfib should be called only from fib:
local
    fun itfib (n, prev, curr) : int =
        if n=1 then curr
        else itfib (n-1, curr, prev+curr)
in
    fun fib (n) = itfib(n,0,1)
end;
> val fib = fn : int -> int
Here the local declaration makes itfib private to fib.
Exercise 2.20 Above we have used local to hide the function itfib. Why not
simply nest the declaration of itfib within fib? Compare with the treatment of
findroot and sqroot.
Exercise 2.21 Using let, we can eliminate the expensive squaring operation
in our integer square root function. Code a variant of introot that maps n to its
integer square root k, paired with the difference n - k^2. Only simple multipli-
cations and divisions are needed; an optimizing compiler could replace them by
bit operations.
2.19 Simultaneous declarations
A simultaneous declaration defines several names at once. Normally the
declarations are independent. But fun declarations allow recursion, so a simul-
taneous declaration can introduce mutually recursive functions.
A val declaration of the form
val Id1 = E1 and ··· and Idn = En
evaluates the expressions E1, ..., En and then declares the identifiers Id1, ..., Idn
to have the corresponding values. Since the declarations do not take effect until
all the expressions are evaluated, their order is immaterial.
Here we declare names for π, e and the logarithm of 2.
val pi = 4.0 * Math.atan 1.0
and e = Math.exp 1.0
and log2 = Math.ln 2.0;
> pi = 3.141592654 : real
> e = 2.718281828 : real
> log2 = 0.6931471806 : real
A single input declares three names. The simultaneous declaration emphasizes
that they are independent.
Now let us declare the chimes of Big Ben:
val one = "BONG ";
> val one = "BONG " : string
val three = one^one^one;
> val three = "BONG BONG BONG " : string
val five = three^one^one;
> val five = "BONG BONG BONG BONG BONG " : string
There must be three separate declarations, and in this order.
A simultaneous declaration can also swap the values of names:
val one = three and three = one;
> val one = "BONG BONG BONG " : string
> val three = "BONG " : string
This is, of course, a silly thing to do! But it illustrates that the declarations occur
at the same time. Consecutive declarations would give one and three identical
bindings.
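The contrast can be checked directly. In this sketch the two styles are wrapped in let expressions so that the outer names one and three stay intact; the result names swapOne, swapThree, seqOne and seqThree are hypothetical.

```sml
val one = "BONG ";
val three = one ^ one ^ one;

(* Simultaneous: both right-hand sides refer to the old bindings. *)
val (swapOne, swapThree) =
  let val one = three and three = one in (one, three) end;

(* Consecutive: the second declaration already sees the new one. *)
val (seqOne, seqThree) =
  let val one = three  val three = one in (one, three) end;
```

Only the simultaneous form swaps; the consecutive form leaves both names bound to the three-chime string.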
Mutually recursive functions. Several functions are mutually recursive if they
are declared recursively in terms of each other. A recursive descent parser is a
typical case. This sort of parser has one function for each element of the gram-
mar, and most grammars are mutually recursive: an ML declaration can contain
expressions, while an expression can contain declarations. Functions to traverse
the resulting parse tree will also be mutually recursive.
Parsing and trees are discussed later in this book. For a simpler example, con-
sider summing the series
π/4 = 1 - 1/3 + 1/5 - 1/7 + ··· + 1/(4k+1) - 1/(4k+3) + ···
By mutual recursion, the final term of the summation can be either positive or
negative:
fun pos d = neg(d-2.0) + 1.0/d
and neg d = if d>0.0 then pos(d-2.0) - 1.0/d
            else 0.0;
> val pos = fn : real -> real
> val neg = fn : real -> real
Two functions are declared. The series converges leisurely:
4.0 * pos(201.0);
> 3.151493401
4.0 * neg(8003.0);
> 3.141342779
Mutually recursive functions can often be combined into one function
wi
help of an additional argument:
fun sum(d,one) =
if d>0.0 then sum(d-2.0, ~one) + one/d else 0.0;
Now sum(d,1.0) returns the same value as pos(d), and sum(d,~1.0) returns
the same value as neg(d).
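The claimed agreement can be checked directly. The sketch below repeats the three declarations from the text and compares results up to a small tolerance; the helper close and the names samePos and sameNeg are introduced here only for illustration.

```sml
fun pos d = neg(d-2.0) + 1.0/d
and neg d = if d>0.0 then pos(d-2.0) - 1.0/d
                     else 0.0;

fun sum(d,one) =
    if d>0.0 then sum(d-2.0, ~one) + one/d else 0.0;

(* Hypothetical helper: reals are compared up to a tolerance, not with =. *)
fun close (a,b) = Real.abs(a-b) < 1E~9;

val samePos = close (sum(201.0, 1.0),   pos 201.0);
val sameNeg = close (sum(8003.0, ~1.0), neg 8003.0);
```

The extra argument one carries the sign that was previously encoded by the choice between pos and neg.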
Emulating goto statements. Functional programming and procedural programming
are more alike than you may imagine. Any combination of goto and assignment
statements (the worst of procedural code) can be translated to a
set of mutually recursive functions. Here is a simple case:
var x := 0; y := 0; z := 0;
F: x := x+1; goto G
G: if y<z then goto F else (y := x+y; goto H)
H: if z>0 then (z := z-x; goto F) else stop
For each of the labels, F, G and H, declare mutually recursive functions. The
argument of each function is a tuple holding all of the variables.
fun F(x,y,z) = G(x+1,y,z)
and G(x,y,z) = if y<z then F(x,y,z) else H(x,x+y,z)
and H(x,y,z) = if z>0 then F(x,y,z-x) else (x,y,z);
> val F = fn : int * int * int -> int * int * int
> val G = fn : int * int * int -> int * int * int
> val H = fn : int * int * int -> int * int * int
Calling F(0,0,0) gives x, y and z their initial values for execution, and returns
the result of the procedural code.
F(0,0,0);
> (1, 1, 0) : int * int * int
Functional programs are referentially transparent, yet can be totally opaque. If
your code starts to look like this, beware!
Exercise 2.22 What is the effect of this declaration?
val (pi,log2) = (log2, pi);
Exercise 2.23 Consider the sequence (P_n) defined for n ≥ 1 by
\[ P_n = 1 + \sum_{k=1}^{n-1} P_k. \]
(In particular, P_1 = 1.) Express this computation as an ML function. How efficient
is it? Is there a faster way of computing P_n?
Introduction to modules
An engineer understands a device in terms of its component parts, and
those, similarly, in terms of their subcomponents. A bicycle has wheels; a wheel
has a hub; a hub has bearings, and so forth. It takes several stages before we
reach the level of individual pieces of metal and plastic. In this way one can un-
derstand the entire bike at an abstract level, or parts of it in detail. The engineer
can improve the design by modifying one part, often without thinking about the
other parts.
Programs (which are more complicated than bicycles!) should also be seen as
consisting of components. Traditionally, a subprogram is a procedure or func-
tion, but these are too small — it is like regarding the bicycle as composed of
thousands of metal shapes. Many recent languages regard programs as consist-
ing of modules, each of which defines its own data structures and associated op-
erations. The interface to each module is specified separately from the module it-
self. Different modules can therefore be coded by different members of a project
team; the compiler can check that each module meets its interface specification.
Consider our vector example. The function addvec is useless in isolation; it
must be used together with other vector operations, all sharing the same repre-
sentation of vectors. We can guess that the other operations are related because
their names all end with vec, but nothing enforces this naming convention. They
should be combined together to form a program module.
An ML structure combines related types, values and other structures, with a
uniform naming discipline. An ML signature specifies a class of structures by
listing the name and type (or other attributes) of each component.
Standard ML’s signatures and structures have analogues in other languages,
such as Modula-2’s definition and implementation modules (Wirth, 1985). ML
also provides functors — structures taking other structures as parameters — but
we shall defer these until Chapter 7.
2.20 The complex numbers
Many types of mathematical object can be added, subtracted, multiplied
and divided. Besides the familiar integer and real numbers, there are the rational
numbers, matrices, polynomials, etc. Our example below will be the complex
numbers, which are important in scientific mathematics. We shall gather
up their arithmetic operations using a structure Complex, then declare a signature
for Complex that also matches any structure that defines the same arithmetic
operations. This will provide a basis for generic arithmetic.
We start with a quick introduction to the complex numbers. A complex number
has the form x + iy, where x and y are real numbers and i is a constant
postulated to satisfy i² = −1. Thus, x and y determine the complex number.
The complex number zero is 0 + i0. The sum of two complex numbers consists
of the sums of the x and y parts; the difference is similar. The definitions of
product and reciprocal look complicated, but are easy to justify using algebraic
laws and the axiom i² = −1:
\begin{align*}
(x + iy) + (x' + iy') &= (x + x') + i(y + y') \\
(x + iy) - (x' + iy') &= (x - x') + i(y - y') \\
(x + iy) \times (x' + iy') &= (xx' - yy') + i(xy' + x'y) \\
1/(x + iy) &= (x - iy)/(x^2 + y^2)
\end{align*}
In the reciprocal above, the y component is −y/(x² + y²). We can now define the
complex quotient z/z' as z × (1/z').
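The two laws described as complicated can be justified in a line each. The following worked steps are a sketch using only ordinary algebra and the axiom i² = −1; for the reciprocal, numerator and denominator are multiplied by x − iy.

```latex
\begin{align*}
(x + iy)(x' + iy') &= xx' + i\,xy' + i\,x'y + i^2\,yy'
                    = (xx' - yy') + i(xy' + x'y) \\
\frac{1}{x + iy}   &= \frac{x - iy}{(x + iy)(x - iy)}
                    = \frac{x - iy}{x^2 - i^2 y^2}
                    = \frac{x - iy}{x^2 + y^2}
\end{align*}
```

The reciprocal is defined only when x and y are not both zero, since otherwise the denominator vanishes.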
By analogy with our vector example, we could implement the complex numbers
by definitions such as
type complex = real*real;
val complexzero = (0.0, 0.0);
but it is better to use a structure.
Further reading. Penrose (1989) explains the complex number system in
detail, with plenty of motivation and examples. He discusses the connection
between the complex numbers and fractals, including a definition of the Mandelbrot set.
Later in the book, the complex numbers play a central rôle in his discussion of quantum
mechanics. Penrose gives the complex numbers a metaphysical significance; this
might be taken with a pinch of salt! Feynman et al. (1963) give a more technical but
marvellously enjoyable description of the complex numbers in Chapter 22.
2.21 Structures
Declarations can be grouped to form a structure by enclosing them in the
keywords struct and end. The result can be bound to an ML identifier using
a structure declaration:
structure Complex =
  struct
  type t = real*real;
  val zero = (0.0, 0.0);
  fun sum   ((x,y), (x',y')) = (x+x', y+y') : t;
  fun diff  ((x,y), (x',y')) = (x-x', y-y') : t;
  fun prod  ((x,y), (x',y')) = (x*x' - y*y', x*y' + x'*y) : t;
  fun recip (x,y) =
        let val t = x*x + y*y
        in  (x/t, ~y/t)  end
  fun quo   (z,z') = prod(z, recip z');
  end;
Where structure Complex is visible, its components are known by compound
names such as Complex.zero and Complex.sum. Inside the structure body, the
components are known by their ordinary identifiers, such as zero and sum; note
the use of recip in the declaration of quo. The type of complex numbers is called
Complex.t. When the purpose of a structure is to define a type, that type is commonly
called t.
We may safely use short names. They cannot clash with names occurring in
other structures. The standard library exploits this heavily, for example to distinguish
the absolute value functions Int.abs and Real.abs.
Let us experiment with our new structure. We declare two ML identifiers, i
and a; a mathematician would normally write their values as i and 0.3, respec-
tively.
val i= (0.0, 1.0);
> val i = (0.0, 1.0) : real * real
val a= (0.3, 0.0);
> val a = (0.3, 0.0) : real * real
In two steps we form the sum a + i + 0.7, which equals 1 + i. Finally we square
that number to obtain 2i:
val b = Complex.sum(a, i);
> val b = (0.3, 1.0) : Complex.t
Complex.sum(b, (0.7, 0.0));
> (1.0, 1.0) : Complex.t
Complex.prod(it, it);
> (0.0, 2.0) : Complex.t
Observe that Complex.t is the same type as real × real; what is more confusing,
it is the same type as vec above. Chapter 7 describes how to declare an abstract
type, whose internal representation is hidden.
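The remaining components can be exercised in the same way. The sketch below repeats the structure so that it is self-contained, then computes 1/i, which should be −i, and the quotient of 1 + i by itself, which should be 1. The names oneOverI and w are chosen here for illustration only.

```sml
structure Complex =
  struct
  type t = real*real;
  val zero = (0.0, 0.0);
  fun sum   ((x,y), (x',y')) = (x+x', y+y') : t;
  fun diff  ((x,y), (x',y')) = (x-x', y-y') : t;
  fun prod  ((x,y), (x',y')) = (x*x' - y*y', x*y' + x'*y) : t;
  fun recip (x,y) =
        let val t = x*x + y*y
        in  (x/t, ~y/t)  end
  fun quo   (z,z') = prod(z, recip z');
  end;

val i        = (0.0, 1.0);
val oneOverI = Complex.recip i;                       (* 1/i = -i *)
val w        = Complex.quo ((1.0, 1.0), (1.0, 1.0));  (* z/z = 1  *)
```

Because Complex.t is just real × real, ordinary pairs of reals can be passed to these functions without any conversion.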
Structures look a bit like records, but there are major differences. A record’s
components can only be values (including, perhaps, other records). A structure’s
components may include types and exceptions (as well as other structures). But
you cannot compute with structures: they can only be created when the program
modules are being linked together. Structures should be seen as encapsulated
environments.
2.22 Signatures
A signature is a description of each component of a structure. ML responds
to our declaration of the structure Complex by printing its view of the
corresponding signature:
structure Complex = ...;
> structure Complex :
>   sig
>     type t
>     val diff : (real * real) * (real * real) -> t
>     val prod : (real * real) * (real * real) -> t
>     val quo : (real * real) * (real * real) -> t
>     val recip : real * real -> real * real
>     val sum : (real * real) * (real * real) -> t
>     val zero : real * real
>   end
The keywords sig and end enclose the signature body. It shows the types of all
the components that are values, and mentions the type t. (Some compilers display
eqtype t instead of type t, informing us that t is a so-called equality type.)
The signature inferred by the ML compiler is frequently not the best one for our
purposes. The structure may contain definitions that ought to be kept private. By
declaring our own signature and omitting the private names, we can hide them
from users of the structure. We might, for instance, hide the name recip.
The signature printed above expresses the type of complex numbers sometimes
as t and sometimes as real × real. If we use t everywhere, then we obtain a
general signature that specifies a type t equipped with operators sum, prod, etc.:
signature ARITH =
  sig
  type t
  val zero : t
  val sum  : t * t -> t
  val diff : t * t -> t
  val prod : t * t -> t
  val quo  : t * t -> t
  end;
The declaration gives the name ARITH to the signature enclosed within the brackets
sig and end. We can declare other structures and make them conform to
signature ARITH. Here is the skeleton of a structure for the rational numbers:
structure Rational : ARITH =
  struct
  type t = int*int;
  val zero = (0, 1);
  ...
  end;
A signature specifies the information that ML needs to integrate program units
safely. It cannot specify what the components actually do. A well-documented
signature includes comments describing the purpose of each component. Com-
ments describing a component’s implementation belong in the structure, not in
the signature. Signatures can be combined in various ways to form new signa-
tures; structures can similarly be combined.
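As a sketch of how a signature hides a private name, the complex number structure can be declared with the constraint ARITH. This is one possible arrangement, not necessarily how later chapters organize it; the name twoI is introduced here for illustration.

```sml
signature ARITH =
  sig
  type t
  val zero : t
  val sum  : t * t -> t
  val diff : t * t -> t
  val prod : t * t -> t
  val quo  : t * t -> t
  end;

structure Complex : ARITH =
  struct
  type t = real*real;
  val zero = (0.0, 0.0);
  fun sum   ((x,y), (x',y')) = (x+x', y+y') : t;
  fun diff  ((x,y), (x',y')) = (x-x', y-y') : t;
  fun prod  ((x,y), (x',y')) = (x*x' - y*y', x*y' + x'*y) : t;
  (* recip is still used inside the structure, but ARITH does not mention it. *)
  fun recip (x,y) =
        let val t = x*x + y*y
        in  (x/t, ~y/t)  end
  fun quo   (z,z') = prod(z, recip z');
  end;

val twoI = Complex.prod ((1.0, 1.0), (1.0, 1.0));   (* (1+i) squared *)
(* Writing Complex.recip here would be rejected: the name is hidden. *)
```

The constraint removes recip from view but leaves quo working, since quo refers to recip from inside the structure body.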
ML functors can express generic modules: for example, ones that take any
structure conforming to signature ARITH. The standard library offers extensive
possibilities for this. An ML system may provide floating point numbers in vari-
ous precisions, as structures matching signature FLOAT. A numerical algorithm
can be coded as a functor. Applying the functor to a floating point structure spe-
cializes the algorithm to the desired precision. ML thus has some of the power
of object-oriented languages such as C++ — though in a more restrictive form,
since structures are not computable values.
Exercise 2.24 Declare a structure Real, matching signature ARITH, such that
Real.t is the type real and the components zero, sum, prod, etc., denote the corresponding
operations on type real.
Exercise 2.25 Complete the declaration of structure Rational above, basing
your definitions on the laws n/d + n'/d' = (nd' + n'd)/dd', (n/d) × (n'/d') =
nn'/dd', and 1/(n/d) = d/n. Use the function gcd to maintain the fractions in
lowest terms, and ensure that the denominator is always positive.
Polymorphic type checking
Until recently, the debate on type checking has been deadlocked, with
two rigid positions:
• Weakly typed languages like Lisp and Prolog give programmers the freedom
they need when writing large programs.
• Strongly typed languages like Pascal give programmers security by restricting
their freedom to make mistakes.
Polymorphic type checking offers a new position: the security of strong type
checking, as well as great flexibility. Programs are not cluttered with type spec-
ifications since most type information is deduced automatically.
A type denotes a collection of values. A function's argument type specifies
which values are acceptable as arguments. The result type specifies which values
could be returned as results. Thus, div demands a pair of integers as argument;
its result can only be an integer. If the divisor is zero then there will be no result
at all: an error will be signalled instead. Even in this exceptional situation, the
function div is faithful to its type.
ML can also assign a type to the identity function, which returns its argument
unchanged. Because the identity function can be applied to an argument of any
type, it is polymorphic. Generally speaking, an object is polymorphic if it can be
regarded as having multiple types. ML polymorphism is based on type schemes,
which are like patterns or templates for types. For instance, the identity function
has the type scheme α → α.
2.23 Type inference
Given little or no explicit type information, ML can infer all the types
involved with a function declaration. Type inference follows a natural but rigor-
ous procedure. ML notes the types of any constants, and applies type checking
rules for each form of expression. Each variable must have the same type every-
where in the declaration. The type of each overloaded operator (like +) must be
determined from the context.
Here is the type checking rule for the conditional expression. If E has type
bool and E1 and E2 have the same type, say τ, then
if E then E1 else E2
also has type τ. Otherwise, the expression is ill-typed.
Let us examine, step by step, the type checking of facti:
fun facti (n,p) =
if n=0 then p else facti(n-1, n*p);
The constants 0 and 1 have type int. Therefore n=0 and n-1 involve integers, so
n has type int. Now n*p must be integer multiplication, so p has type int. Since
p is returned as the result of facti, its result type is int and its argument type is
int × int. This fits with the recursive call. Having made all these checks, ML can
respond
> val facti = fn : int * int -> int
If the types are not consistent, the compiler rejects the declaration.
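One way to confirm the outcome of type inference is to write the inferred types into the declaration as constraints. In the sketch below the constraints are redundant, so the compiler should accept the declaration exactly as before; the name f5 is introduced here for illustration.

```sml
(* The same function with every inferred type stated explicitly. *)
fun facti (n : int, p : int) : int =
    if n=0 then p else facti(n-1, n*p);

val f5 = facti (5, 1);
```

Changing any one of the three constraints to real would make the types inconsistent with the constants 0 and 1, and the declaration would be rejected.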
Exercise 2.26 Describe the steps in the type checking of itfib.
Exercise 2.27 Type check the following function declaration:
fun f (k,m) = if k=0 then 1 else f(k-1);
2.24 Polymorphic function declarations
If type inference leaves some types completely unconstrained then the
declaration is polymorphic (literally, ‘having many forms’). Most polymorphic
functions involve pairs, lists and other data structures. They usually do something
simple, like pairing a value with itself:
fun pairself x = (x,x);
> val pairself = fn : 'a -> 'a * 'a
This type is polymorphic because it contains a type variable, namely 'a. In ML,
type variables begin with a prime (single quote) character.
'a 'b 'c 'we_band_of_brothers '3
Let us write α, β, γ for the ML type variables 'a, 'b, 'c, because type variables
are traditionally Greek letters. Write x : τ to mean ‘x has type τ,’ for instance
pairself : α → (α × α). Incidentally, × has higher precedence than →; the type
of pairself can be written α → α × α.
A polymorphic type is a type scheme. Substituting types for type variables
forms an instance of the scheme. A value whose type is polymorphic has in-
finitely many types. When pairself is applied to a real number, it effectively has
type real → real × real.
pairself 4.0;
> (4.0, 4.0) : real * real
Applied to an integer, pairself effectively has type int → int × int.
pairself 7;
> (7, 7) : int * int
Here pairself is applied to a pair; the result is called pp.
val pp = pairself ("Help!",999);
> val pp = (("Help!", 999), ("Help!", 999))
> : (string * int) * (string * int)
Projection functions return a component of a pair. The function fst returns the
first component; snd returns the second:
fun fst (x,y) = x;
> val fst = fn : 'a * 'b -> 'a
fun snd (x,y) = y;
> val snd = fn : 'a * 'b -> 'b
Before considering their polymorphic types, we apply them to pp:
fst pp;
> ("Help!", 999) : string * int
snd (fst pp);
> 999 : int
The type of fst is α × β → α, with two type variables. The argument pair may
involve any two types τ1 and τ2 (not necessarily different); the result has type τ1.
Polymorphic functions can express other functions. The function that takes
((x, y), w) to x could be coded directly, but two applications of fst also work:
fun fstfst z = fst (fst z);
> val fstfst = fn : ('a * 'b) * 'c -> 'a
fstfst pp;
> "Help!" : string
The type (α × β) × γ → α is what we should expect for fstfst. Note that a polymorphic
function can have different types within the same expression. The inner
fst has type (α × β) × γ → α × β; the outer fst has type α × β → α.
Now for something obscure: what does this function do?
fun silly x = fstfst (pairself (pairself x));
> val silly = fn : 'a -> 'a
Not very much:
silly "Hold off your hands.";
> "Hold off your hands." : string
Its type, α → α, suggests that silly is the identity function. This function can be
expressed rather more directly:
fun I x = x;
> val I = fn : 'a -> 'a
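Instances of a type scheme can be made visible by attaching type constraints. The sketch below binds particular instances of pairself and fst to new names; pairInt, pairStr and fstIS are hypothetical names used only here.

```sml
fun pairself x = (x,x);
fun fst (x,y) = x;

(* Each constraint selects one instance of the polymorphic type. *)
val pairInt : int -> int * int          = pairself;
val pairStr : string -> string * string = pairself;
val fstIS   : int * string -> int       = fst;

val p = pairInt 7;
val s = pairStr "Help!";
val n = fstIS (3, "three");
```

After these declarations pairInt accepts only integers, while pairself itself remains fully polymorphic.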
Further issues. Milner (1978) gives an algorithm for polymorphic type checking
and proves that a type-correct program cannot suffer a run-time type error.
Damas and Milner (1982) prove that the types inferred by this algorithm are principal:
they are as polymorphic as possible. Cardelli and Wegner (1985) survey several
approaches to polymorphism. For Standard ML, things are quite complicated.
Equality testing is polymorphic in a limited sense: it is defined for most, not all, types.
Standard ML provides a class of equality type variables to range over this restricted col-
lection of types. See Section 3.14.
Recall that certain built-in functions are overloaded: addition (+) is defined for in-
tegers and reals, for instance. Overloading sits uneasily with polymorphism. It com-
plicates the type checking algorithm and frequently forces programmers to write type
constraints. Fortunately there are only a few overloaded functions. Programmers can-
not introduce further overloading.
Summary of main points
• A variable stands for a value; it can be redeclared but not updated.
• Basic values have type int, real, char, string or bool.
• Values of any types can be combined to form tuples and records.
• Numerical operations can be expressed as recursive functions.
• An iterative function employs recursion in a limited fashion, where recursive
calls are essentially jumps.
• Structures and signatures serve to organize large programs.
• A polymorphic type is a scheme containing type variables.
Lists
In a public lecture, C. A. R. Hoare (1989a) described his algorithm for finding the
ith smallest integer in a collection. This algorithm is subtle, but Hoare described
it with admirable clarity as a game of solitaire. Each playing card carried an in-
teger. Moving cards from pile to pile by simple rules, the required integer could
quickly be found.
Then Hoare changed the rules of the game. Each card occupied a fixed posi-
tion, and could only be moved if exchanged with another card. This described the
algorithm in terms of arrays. Arrays have great efficiency, but they also have a
cost. They probably defeated much of the audience, as they defeat experienced
programmers. Mills and Linger (1986) claim that programmers become more
productive when arrays are restricted to stacks, queues, etc., without subscript-
ing.
Functional programmers often process collections of items using lists. Like
Hoare’s stacks of cards, lists allow items to be dealt with one at a time, with great
clarity. Lists are easy to understand mathematically, and turn out to be more ef-
ficient than commonly thought.
Chapter outline
This chapter describes how to program with lists in Standard ML. It pre-
sents several examples that would normally involve arrays, such as matrix oper-
ations and sorting.
The chapter contains the following sections:
Introduction to lists. The notion of list is introduced. Standard ML operates
on lists using pattern-matching.
Some fundamental list functions. A family of functions is presented. These are
instructive examples of list programming, and are indispensable when tackling
harder problems.
Applications of lists. Some increasingly complicated examples illustrate the
variety of problems that can be solved using lists.
The equality test in polymorphic functions. Equality polymorphism is introduced
and demonstrated with many examples. These include a useful collection
of functions on finite sets.
Sorting: A case study. Procedural programming and functional programming
are compared in efficiency. In one experiment, a procedural program runs only
slightly faster than a much clearer functional program.
Polynomial arithmetic. Computers can solve algebraic problems. Lists are
used to add, multiply and divide polynomials in symbolic form.
Introduction to lists
A list is a finite sequence of elements. Typical lists are [3,5,9] and
["fair", "Ophelia"]. The empty list, [], has no elements. The order of
elements is significant, and elements may appear more than once. For instance,
the following lists are all different:
[3,4] [4,3] [3,4,3] [3,3,4]
The elements of a list may have any type, including tuples and even other lists.
Every element of a list must have the same type. Suppose this type is τ; the type
of the list is then τ list. Thus
[(1,"One"), (2,"Two"), (3,"Three")] : (int*string) list
[ [3.1], [], [5.7, ~0.6] ] : (real list) list
The empty list, [], has the polymorphic type α list. It can be regarded as having
any type of elements.
Observe that the type operator list has a postfix syntax. It binds more tightly
than × and →. So int × string list is the same type as int × (string list), not
(int × string) list. Also, int list list is the same type as (int list) list.
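These precedence rules can be tested by writing the types as constraints. In this sketch the value names a, b, c and d are arbitrary; each declaration is accepted only if the constraint parses as the comment says.

```sml
val a : int * string list   = (1, ["one", "uno"]);      (* int * (string list) *)
val b : (int * string) list = [(1,"one"), (2,"two")];   (* a list of pairs     *)
val c : int list list       = [[1,2], [], [3]];
val d : (int list) list     = c;                        (* the same type as c  *)
```

Swapping the values of a and b would be rejected, since a pair whose second component is a list is not a list of pairs.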
3.1 Building a list
Every list is constructed by just two primitives: the constant nil and the
infix operator ::, pronounced ‘cons’ for ‘construct.’
• nil is a synonym for the empty list, [].
• The operator :: makes a list by putting an element in front of an existing
list.
Every list is either nil, if empty, or has the form x :: l where x is its head and l
its tail. The tail is itself a list. The list operations are not symmetric: the first
element of a list is much more easily reached than the last.
If l is the list [x1, ..., xn] and x is a value of correct type then x :: l is the
list [x, x1, ..., xn]. Making the new list does not affect the value of l. The list
[3, 5, 9] is constructed as follows:
nil = []
9 :: [] = [9]
5 :: [9] = [5, 9]
3 :: [5, 9] = [3, 5, 9]
Observe that the elements are taken in reverse order. The list [3, 5, 9] can be
written in many ways, such as 3 :: (5 :: (9 :: nil)), or 3 :: (5 :: [9]), or 3 :: [5, 9].
To save writing parentheses, the infix operator ‘cons’ groups to the right. The
notation [x1, x2, ..., xn] stands for x1 :: x2 :: ··· :: xn :: nil. The elements may
be given by expressions; a list of the values of various real expressions is
[ Math.sin 0.5, Math.cos 0.5, Math.exp 0.5 ];
> [0.479425539, 0.877582562, 1.64872127] : real list
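The different ways of writing [3, 5, 9] can be compared with the equality test, as in the sketch below. The names l1, l2, l3, allSame, tl59 and whole are introduced here for illustration.

```sml
val l1 = 3 :: (5 :: (9 :: nil));
val l2 = 3 :: 5 :: 9 :: nil;      (* :: groups to the right *)
val l3 = 3 :: [5, 9];
val allSame = (l1 = [3,5,9]) andalso (l2 = l1) andalso (l3 = l1);

(* Putting an element in front of a list leaves that list unchanged. *)
val tl59  = [5, 9];
val whole = 3 :: tl59;
```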
List notation makes a list with a fixed number of elements. Consider how to build
the list of integers from m to n:
[m, m+1, ..., n]
First compare m and n. If m > n then there are no numbers between m and n; the
list is empty. Otherwise the head of the list is m and the tail is [m+1, ..., n].
Constructing the tail recursively, the result is obtained by
m :: [m+1, ..., n].
This process corresponds to a simple ML function:
fun upto (m,n) =
    if m>n then [] else m :: upto(m+1,n);
> val upto = fn : int * int -> int list
upto (2,5);
> [2, 3, 4, 5] : int list
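It may help to follow the recursion by hand. The sketch below repeats upto and records one evaluation as a comment, together with the two boundary cases; r1, r2 and r3 are illustrative names.

```sml
fun upto (m,n) =
    if m>n then [] else m :: upto(m+1,n);

(* upto(2,4) => 2 :: upto(3,4)
             => 2 :: 3 :: upto(4,4)
             => 2 :: 3 :: 4 :: upto(5,4)
             => 2 :: 3 :: 4 :: []          *)
val r1 = upto (2,4);
val r2 = upto (7,7);   (* a single element: m = n *)
val r3 = upto (5,2);   (* empty: m > n            *)
```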
Lists in other languages. Weakly typed languages like Lisp and Prolog represent
lists by pairing, as in (3, (5, (9, "nil"))). Here "nil" is some end
marker and the list is [3, 5, 9]. This representation of lists does not work in ML because
the type of a ‘list’ would depend on the number of elements. What type could upto have?
ML’s syntax for lists differs subtly from Prolog’s. In Prolog, [5|[6]] is the same
list as [5,6]. In ML, [5::[6]] is the same list as [[5,6]].
3.2 Operating on a list
Lists, like tuples, are structured values. In ML, a function on tuples can
be written with a pattern for its argument, showing its structure and naming the
components. Functions over lists can be written similarly. For example,
fun prodof3 [i,j,k] : int = i*j*k;
declares a function to take the product of a list of numbers, but only if there
are exactly three of them!
List operations are usually defined by recursion, treating several cases. What
is the product of a list of integers?
• If the list is empty, the product is 1 (by convention).
• If the list is non-empty, the product is the head times the product of the
tail.
It can be expressed in ML like this:
fun prod []      = 1
  | prod (n::ns) = n * (prod ns);
> val prod = fn : int list -> int
The function consists of two clauses separated by a vertical bar (|). Each clause
treats one argument pattern. There may be several clauses and complex patterns,
provided the types agree. Since the patterns involve lists, and the result can be
the integer 1, ML infers that prod maps a list of integers to an integer.
prod [2,3,5];
> 30 : int
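The same two-clause scheme serves for other operations on integer lists. As a sketch, here is the analogous function for addition, where the empty list yields 0 rather than 1; the name sumlist is hypothetical and is not taken from the text.

```sml
(* Sum of a list of integers, by the same case analysis as prod. *)
fun sumlist []      = 0
  | sumlist (n::ns) = n + sumlist ns;

val s = sumlist [2,3,5];
```

Here the constant 0 in the first clause is what tells ML that the elements are integers.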
Empty versus non-empty is the commonest sort of case analysis for lists. Finding
the maximum of a list of integers requires something different, for the empty list
has no maximum. The two cases are
• The maximum of the one-element list [m] is m.
• To find the maximum of a list with two or more elements [m, n, ...],
remove the smaller of m or n and find the maximum of the remaining
numbers.
This gives the ML function
fun maxl [m] : int = m
  | maxl (m::n::ns) = if m>n then maxl(m::ns)
                             else maxl(n::ns);
> ***Warning: Patterns not exhaustive
> val maxl = fn : int list -> int
Note the warning message: ML detects that maxl is undefined for the empty list.
Also, observe how the pattern m :: n :: ns describes a list of the form [m, n, ...].
The smaller element is dropped in the recursive call.
The function works, except for the empty list.
maxl [ ~4, 0, ~12];
> 0 : int
maxl [];
> Exception: Match
An exception, for the time being, can be regarded as a run-time error. The function
maxl has been applied to an argument for which it is undefined. Normally
exceptions abort execution. They can be trapped, as we shall see in the next chapter.
Intermediate lists. Lists are sometimes generated and consumed within a com-
putation. For instance, the factorial function has a clever definition using prod
and upto:
fun factl (n) = prod (upto (1,n));
> val factl = fn : int -> int
factl 7;
> 5040 : int
This declaration is concise and clear, avoiding explicit recursion. The cost of
building the list [1, 2, ..., 7] may not matter. However, functional programming
should facilitate reasoning about programs. This does not happen here. The trivial
law
factl(m + 1) = (m + 1) × factl(m)
has no obvious proof. Opening up its definition, we get
factl(m + 1) = prod(upto(1, m + 1)) = ?
The next step is unclear because the recursion in upto follows its first argument,
not the second. The honest recursive definition of factorial seems better.
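For comparison, here is a sketch that places the direct recursive definition beside factl and tests that the two agree on small arguments. The name fact and the test value agree are introduced for this illustration; upto and prod are repeated so that the fragment stands alone.

```sml
fun upto (m,n) = if m>n then [] else m :: upto(m+1,n);
fun prod []      = 1
  | prod (n::ns) = n * prod ns;
fun factl n = prod (upto (1,n));

(* Direct recursion: fact(m+1) = (m+1) * fact m holds by definition. *)
fun fact n = if n=0 then 1 else n * fact(n-1);

(* Testing agreement on 0..10 is evidence, not a proof. *)
val agree = List.all (fn n => fact n = factl n) (upto (0,10));
```

With fact, the law in question is simply the second branch of the conditional, which is why reasoning about it is immediate.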
Strings and lists. Lists are important in string processing. Most functional lan-
guages provide a type of single characters, regarding strings as lists of characters.
With the new standard library, ML has acquired a character type — but it does not
regard strings as lists. The built-in function explode converts a string to a list of
characters. The function implode performs the inverse operation, joining a list
of characters to form a string.