Functional and Concurrent Programming: Core Concepts and Features
Michel Charpentier
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and the publisher was aware of a trademark
claim, the designations have been printed with initial capital letters or in all capitals.
The author and publisher have taken care in the preparation of this book, but make no expressed
or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is
assumed for incidental or consequential damages in connection with or arising out of the use of the
information or programs contained herein.
For information about buying this title in bulk quantities, or for special sales opportunities (which
may include electronic versions; custom cover designs; and content particular to your business, training
goals, marketing focus, or branding interests), please contact our corporate sales department at corpsales
@pearsoned.com or (800) 382-3419.
For questions about sales outside the United States, please contact [email protected].
All rights reserved. This publication is protected by copyright, and permission must be obtained from
the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any
form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information re-
garding permissions, request forms and the appropriate contacts within the Pearson Education Global
Rights & Permissions Department, please visit www.pearson.com.
ISBN-13: 978-0-13-746654-2
ISBN-10: 0-13-746654-4
To Karen and Andre
Contents
Preface
Acknowledgments
Chapter 3 Immutability
3.1 Pure and Impure Functions
3.2 Actions
3.3 Expressions Versus Statements
3.4 Functional Variables
Glossary
Index
List of Listings
2.1 Example of a method defined with a symbolic name.
2.2 Example of a method defined for infix invocation.
2.3 Example of a method added to a type by extension.
2.4 Example of a local function defined within another function.
2.5 Example of a function with variable-length arguments.
2.6 Example of a function with default values for some of its arguments.
2.7 Example of a function parameterized by a type; contrast with Lis. 2.8.
2.8 Example of parameterization by multiple types; contrast with Lis. 2.7.
7.1 Simple list lookup, using tail recursion; see also Lis. 7.2.
7.2 Simple list lookup, using tail recursion and pattern matching.
7.3 List length, not tail recursive; contrast with Lis. 7.4.
7.4 List length, using tail recursion; contrast with Lis. 7.3.
7.5 Recursive implementation of drop on lists; see also Lis. 7.6.
7.6 Recursive implementation of drop on lists; see also Lis. 7.5.
7.7 Implementation of getAt on lists, using drop.
7.8 Recursive implementation of take; contrast with Listings 7.21 to 7.23.
7.9 Recursive implementation of list concatenation.
7.10 Appending at the end of a list using concatenation.
7.11 Recursive implementation of list flattening.
7.12 Recursive implementation of list zipping.
7.13 Recursive implementation of list splitting.
7.14 Recursive implementation of list grouping.
7.15 Linear, tail recursive implementation of list reversal.
7.16 List building by prepending and reversing instead of appending.
7.17 Insertion sort.
7.18 Merge-sort; fixed from Lis. 6.5; see also Lis. 7.19.
7.19 Alternative implementation of merge-sort; see also Lis. 7.18.
7.20 Quick-sort with user-defined splitting; see also Lis. 10.1.
7.21 Tail recursive implementation of take; contrast with Lis. 7.8.
7.22 Buffer-based implementation of take; contrast with Lis. 7.21.
7.23 Loop-based implementation of take; contrast with Lis. 7.22.
8.6 Set conversion to balanced binary search trees, improved in Lis. 8.12.
8.7 Left and right rotations on binary search trees.
8.8 Rebalancing of binary search trees by rotations.
8.9 Key insertion in self-balancing binary search trees.
8.10 Key deletion in self-balancing binary search trees; see also Lis. 8.11.
8.11 Key deletion in self-balancing binary search trees; see also Lis. 8.10.
8.12 Conversion from set to binary search trees, improved from Lis. 8.6.
12.11 All the solutions to subset-sum; contrast with Lis. 12.10 and 12.12.
12.12 The solutions to subset-sum derived lazily; contrast with Lis. 12.11.
13.1 Pipeline example with no handling of failures; contrast with Lis. 13.2.
13.2 Pipeline example with failure handling; contrast with Lis. 13.1.
20.1 Thread-safe queue (single public lock); see also Lis. 20.5 and 20.7.
20.2 Batch insertions by client-side locking on a concurrent queue.
20.3 Batch extraction by client-side locking on a concurrent queue.
20.4 Batch extraction without synchronization; contrast with Lis. 20.3.
20.5 Thread-safe queue (single private lock); see also Lis. 20.1 and 20.7.
20.6 Batch processing from queue with a private lock.
20.7 Thread-safe queue (two private locks); see also Lis. 20.1 and 20.5.
20.8 Batch processing from a queue with a split lock.
21.1 Example concurrency from a thread pool; contrast with Lis. 17.1.
21.2 Using a thread pool to create threads and wait for their termination.
21.3 A server that processes requests sequentially; contrast with Lis. 21.4.
21.4 A server that processes requests concurrently; contrast with Lis. 21.3.
21.5 Example of scheduled execution using a timer pool.
21.6 Example of parallel evaluation of higher-order function map.
List of Figures
22.1 Possible deadlock of the incorrect box implementation in Lis. 22.3.
22.2 Single-threaded executions are sequentially consistent.
22.3 Multithreaded executions are not sequentially consistent.
27.1 Possible loss of value from concurrent calls to push on a stack.
27.2 Compare-and-set to push/pop a lock-free stack.
27.3 Message flow of actors in Lis. 27.7.
Foreword by Cay Horstmann
In my book Scala for the Impatient, I provide a rapid-fire introduction to the many
features of the Scala language and API. If you need to know how a particular feature
works, you will find a concise explanation and a minimal code example (with real code,
not fruits or animals). I assume that the reader is familiar with Java or a similar object-
oriented programming language and organize the material to maximize the experience
and intuition of such readers. In fact, I wrote the book because I was put off by the
learning materials at the time, which were disdainful of object-oriented programming
and biased toward functional programming as the superior paradigm.
That was more than a decade ago. Nowadays, functional techniques have become
much more mainstream, and it is widely recognized that the object-oriented and func-
tional paradigms complement each other. In this book, Michel Charpentier provides
an accessible introduction to functional and concurrent programming. Unlike my Scala
book, the material here is organized around concepts and techniques rather than lan-
guage features. Those concepts are developed quite a bit more deeply than they would
be in a book that is focused on a programming language. You will learn about nontrivial
and elegant techniques such as zippers and trampolines.
This book uses Scala 3 for most of its examples, which is a great choice. The concise
and elegant Scala syntax makes the concepts stand out without being obscured by a
thicket of notation. You will particularly notice that when the same concept is expressed
in Scala and in Java. You don’t need to know any Scala to get started, and only a modest
part of Scala is used in the code examples. Again, the focus of the book is concepts,
not programming language minutiae. Mastering these concepts will make you a better
programmer in any language, even if you never end up using Scala in your career.
I encourage you to actively work with the sample programs. Execute them, observe
their behavior, and experiment by making changes. I suggest that you use a program-
ming environment that supports Scala worksheets, such as Visual Studio Code, IntelliJ,
or the online Scastie service. With a worksheet, turnaround is quick and exploratory
programming is enjoyable.
Seven out of the 28 chapters are complete case studies that illustrate the material
that preceded them. They are chosen to be interesting without being overwhelming. I
am sure you will profit from working through them in detail.
The book is divided into two parts. The first part covers functional programming
with immutable data, algebraic data types, recursion, higher-order functions, and lazy
evaluation. Even if you are at first unexcited about reimplementing lists and trees, give
it a chance. Observe the contrast with traditional mutable data structures, and you
will find the journey rewarding. The book is blessedly free of complex category theory
that in my opinion—evidently shared by the author—requires a large amount of jargon
before yielding paltry gains.
The focus of the second part is concurrent programming. Here too the organization
along concepts rather than language and API features is refreshing. Concurrent pro-
gramming is a complex subject with many distinct use cases and no obvious way of
teaching it well. Michel has broken down the material into an interesting and thought-
provoking sequence of chapters that is quite different from what you may have seen
before. As with the first part, the ultimate aim is not to teach you a specific set of skills
and techniques, but to make you think at a higher level about program design.
I enjoyed reading and working through this unique book and very much hope that
you will too.
Cay Horstmann
Berlin, 2022
Preface
Before you start reading this book, it is important to think about the distinction between
programming languages and programming language features. I believe that developers
benefit from being able to rely on an extensive set of programming language features,
and that a solid understanding of these features—in any language—will help them be
productive in a variety of programming languages, present or future.
The world of programming languages is varied and continues to evolve all the time.
As a developer, you are expected to adapt and to repeatedly transfer your programming
skills from one language to another. Learning new programming languages is made easier
by mastering a set of core features that today’s languages often share, and that many
of tomorrow’s languages are likely to use as well.
Programming language features are illustrated in this book with numerous code
examples, primarily in Scala (for reasons that are detailed later). The concepts, however,
are relevant—to varying degrees—to other popular languages like Java, C++, Kotlin,
Python, C#, Swift, Rust, Go, JavaScript, and whatever languages might pop up in the
future to support strong typing as well as functional and/or concurrent programming.
As an illustration of the distinction between languages and features, consider the
following programming task:
Shift every number from a given list by a random amount between -10
and 10. Return a list of shifted numbers, omitting all values that are not
positive.
A Java programmer might implement the desired function as follows:
Java
List<Integer> randShift(List<Integer> nums, Random rand) {
    var shiftedNums = new java.util.ArrayList<Integer>(nums.size());
    for (int num : nums) {
        int shifted = num + rand.nextInt(-10, 11);
        if (shifted > 0) shiftedNums.add(shifted);
    }
    return shiftedNums;
}
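The same strategy, in Python, might look like this (a sketch that follows the steps of the Java function above):

Python
def rand_shift(nums, rand):
    shifted_nums = []
    for num in nums:
        shifted = num + rand.randrange(-10, 11)
        if shifted > 0:
            shifted_nums.append(shifted)
    return shifted_nums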
Although they are written in two different languages, both functions follow a similar
strategy: Create a new empty list to hold the shifted numbers, shift each original number
by a random amount, and add the new values to the result list only when they are
positive. For all intents and purposes, the two programs are the same.
Other programmers might choose to approach the problem differently. Here is one
possible Java variant:
Java
List<Integer> randShift(List<Integer> nums, Random rand) {
    return nums.stream()
               .map(num -> num + rand.nextInt(-10, 11))
               .filter(shifted -> shifted > 0)
               .toList();
}
The details of this implementation are not important for now—it relies on functional
programming concepts that will be discussed in Part I. What matters is that the code
is noticeably different from the previous Java implementation.
You can write a similar functional variant in Python:
Python
def rand_shift(nums, rand):
    return list(filter(lambda shifted: shifted > 0,
                       map(lambda num: num + rand.randrange(-10, 11), nums)))
This implementation is arguably closer to the second Java variant than it is to the first
Python program.
These four programs demonstrate two different ways to solve the original problem.
They contrast an imperative implementation—in Java or in Python—with a functional
implementation—again, in Java or in Python. What fundamentally distinguishes the
programs is not the languages—Java versus Python—but the features being used—
imperative versus functional. The programming language features used in the impera-
tive variant (assignment statements, loops) and in the functional variant (higher-order
functions, lambda expressions) exist independently from Java and Python; indeed, they
are available in many programming languages.
I am not saying that programming languages don’t matter. We all know that, for
a given task, some languages are a better fit than others. But I want to emphasize
core features and concepts that extend across languages, even when they appear under
a different syntax. For instance, an experienced Python programmer is more likely to
write the example functional program in this way:
Python
def rand_shift(nums, rand):
    return [shifted for shifted in (num + rand.randrange(-10, 11) for num in nums)
            if shifted > 0]
This code looks different from the earlier Python code—and the details are again unim-
portant. Notice that functions map and filter are nowhere to be seen. Conceptually,
though, this is the same program but written using a specific Python syntax known as
list comprehension, instead of map and filter.
The important concept to understand here is the use of map and filter (and more
generally higher-order functions, of which they are an example), not list comprehension.
You benefit from this understanding in two ways. First, more languages support higher-
order functions than have a comprehension syntax. If you are programming in Java, for
instance, you will have to write map and filter explicitly (at least for now). Second,
if you ever face a language that uses a somewhat unusual syntax, as Python does with
list comprehension, it will be easier to recognize what is going on once you realize that
it is just a variation of a concept you already understand.
The preceding code examples illustrate a contrast between a program written in plain
imperative style and one that leverages the functional programming features available
in many languages. I can make a similar argument with concurrent programming. Lan-
guages (and libraries) have evolved, and there is no reason to write today’s concurrent
programs the way we did 20 years ago. As a somewhat extreme example, travel back
not quite 20 years to 2004, the days of Java 1.4, and consider the following problem:
Given two tasks that each produce a string, invoke both tasks in parallel and
return the first string that is produced.
Assume a type StringComputation with a string-producing method compute. In
Java 1.4, the problem can be solved as follows (do not try to understand the code; it is
rather long, and the details are unimportant):
Java
String firstOf(final StringComputation comp1, final StringComputation comp2)
        throws InterruptedException {
    class Result {
        private String value = null;
This implementation uses features with which you may not be familiar (but which are
covered in Part II of the book).1 Here are the important points to notice:
• The code is about 30 lines long.
• It relies on synchronized methods, a form of locking available in the Java Virtual
Machine (JVM).
1 One reason such old-fashioned features are still covered in this book is that I believe they help us
understand the richer and fancier constructs that we should be using in practice. The other reason is
that the concurrent programming landscape is still evolving and recent developments, such as virtual
threads in the Java Virtual Machine, have the potential to make these older concepts relevant again.
The bottom line is that it is much easier to write simple and efficient programs with Java 19 than it was with Java 1. Feature-rich programming
languages can be harder to learn, but they are also more powerful once mastered.
Of course, what you find hard or easy depends a lot on your programming back-
ground, and it is important not to confuse simplicity with familiarity. The functional
variants of the Java and Python programs presented earlier are not more complicated
than the imperative variants, but for some programmers, they can certainly be less
familiar. Indeed, it is more difficult for a programmer to shift from an imperative to a
functional variant (or vice versa) within Java or Python than it is to shift from Java
to Python (or vice versa) within the same imperative or functional style. The latter
transition is mostly a matter of syntax, while the former requires a paradigm shift.
Most of the advantages of current, feature-rich programming languages revolve
around functional programming, concurrency, and types—hence the three themes of
this book. A common trend is to provide developers with abstractions that allow them
to dispense with writing nonessential implementation details, and code that is not writ-
ten is bug-free code.
Jumps and gotos, for instance, were long ago discarded in high-level programming
languages in favor of structured loops. But many loops can themselves be replaced
with functional alternatives that instead use a standard set of higher-order functions.
Similarly, writing concurrent programs directly in terms of threads and locks can be very
challenging. Relying on thread pools, futures, and other mechanisms instead can result
in simpler patterns. In many scenarios, you have no more reason to use loops and locks
than you have to write your own hash map or sorting method: It’s unnecessary work, it’s
error-prone, and it’s unlikely to achieve the performance of existing implementations. As
for types, the age-old dichotomy between safety—being able to catch errors thanks to
types—and flexibility—not being overly constrained in design choices because of types—
is often being resolved in favor of safe and flexible type systems, albeit complicated ones.
This book is not a comprehensive guide to everything you need to know about func-
tional and concurrent programming, or about types. But to leverage modern language
constructs in your everyday programming, you need to become familiar with the abstract
concepts that underlie these features. There is more to applying functional patterns than
being aware of the syntax for lambda expressions, for instance. This book introduces
only as many concepts as are needed to use language features effectively. There is a lot
more to functional and concurrent programming and to types than what the book cov-
ers. (There is also a lot more to Scala.) Advanced topics are left for you to explore
through other resources.
Why Scala?
As mentioned earlier, most of the code illustrations in this book are written in Scala.
This may not be the language you are most familiar with, or the language in which you
plan to develop your next application. It is a fair question to wonder why I chose it
instead of a more mainstream language.
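To give you a first taste, here is the number-shifting function written in Scala in three styles (a sketch; it assumes that scala.util.Random, with its between method, and scala.collection.mutable.ListBuffer are imported):

Scala
def randShift(nums: List[Int], rand: Random): List[Int] =
  val shiftedNums = ListBuffer.empty[Int] // a mutable list
  for num <- nums do
    val shifted = num + rand.between(-10, 11)
    if shifted > 0 then shiftedNums += shifted
  shiftedNums.toList

Scala
def randShift(nums: List[Int], rand: Random): List[Int] =
  nums.map(num => num + rand.between(-10, 11)).filter(shifted => shifted > 0)

Scala
def randShift(nums: List[Int], rand: Random): List[Int] =
  for
    num <- nums
    shifted = num + rand.between(-10, 11)
    if shifted > 0
  yield shifted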
The first function is imperative, based on an iteration and a mutable list. The next
variant is functional and uses map and filter explicitly. The last variant relies on
Scala’s for-comprehension, a mechanism similar to (but more powerful than) Python’s
list comprehension.
You can also use Scala to write a concise solution to the concurrency problem. It
uses futures and thread pools, like the earlier Java program:
Scala
def firstOf(comp1: StringComputation, comp2: StringComputation)
           (using ExecutionContext): String = {
  val future1 = Future(comp1.compute())
  val future2 = Future(comp2.compute())
  Await.result(Future.firstCompletedOf(Set(future1, future2)), timeout)
}
2 Different incarnations of Scala exist. This book uses the most common flavor of Scala, namely, the
one that runs on the JVM and leverages the JVM’s support for concurrency.
Given the book’s objectives, there are several benefits to using Scala for code illustra-
tions. First, this language is feature-rich, making it possible to illustrate many concepts
without switching languages. Many of the standard features of functional and concur-
rent programming exist in Scala, which also has a powerful type system. Second, Scala
was introduced fairly recently and was carefully (and often beautifully) designed. Com-
pared to some older languages, there is less historical baggage in Scala that can get in
the way when discussing underlying concepts. Finally, Scala syntax is quite conventional
and easy to follow for most programmers without prior exposure to the language.
Nevertheless, it is important to keep in mind that programming language features,
rather than Scala per se, are the focus of this book. Although I personally like it as a
teaching language, I am not selling Scala, and this is not a Scala book. It just happens
that I need a programming language that is clean and simple in all areas of interest,
and I believe Scala meets these requirements.
Target Audience
The target audience is programmers with enough experience to not be distracted by
simple matters of syntax. I assume prior Java experience, or enough overall program-
ming experience to read and understand simple Java code. Concepts such as classes,
methods, objects, types, variables, loops, and conditionals are assumed to be familiar. A
rudimentary understanding of program execution—execution stack, garbage collection,
exceptions—is also assumed, as well as basic exposure to data structures and algorithms.
For other key terms covered in depth in the book, the glossary provides a basic definition
and indicates the appropriate chapter or chapters where the concept is presented.
No prior knowledge of functional or concurrent programming is assumed. No prior
knowledge of Scala is assumed. Presumably, many readers will have some understanding
of functional or concurrent concepts, such as recursion or locks, but no such knowledge
is required. For instance, I do not expect you to necessarily understand the functional
Python and Java programs discussed earlier, or the two Java concurrent programs, or
the last two Scala functions. Indeed, I would argue that if these programs feel strange
and mysterious, this book is for you! By comparison, the imperative variant of the
number-shifting program should be easy to follow, and I expect you to understand
the corresponding code, whether it is written in Java, Python, or Scala. You are expected
to understand simple Scala syntax when it is similar to that of other languages and to
pick up new elements as they are introduced.
The syntax of Scala was inspired by Java’s syntax—and that of Java by C’s syntax—
which should make the transition fairly straightforward for most programmers. Scala
departs from Java in ways that will be explained as code examples are introduced. For
now, I’ll highlight just three differences:
• Semicolon inference. In Scala, terminating semicolons are inferred by the compiler
and rarely used explicitly. They may still appear occasionally—for instance, as a
way to place two statements on the same line.
In this book, code illustrations rely on indentation instead of curly braces when
possible and omit most end markers for the sake of compactness. I expect readers
to be able to read imperative Scala code in this form, like the preceding function.
The material that you might find to be the most applicable is found in the later chapters
of each part of the book. I have found this progression to be most conducive to a solid understanding
of features, which can then be translated into languages other than Scala. If you feel
that the early topics are well known and the pace too slow, please be patient.
This book is designed to be read in order, from beginning to end. Most chapters—and
their code illustrations—depend on ideas and programs presented in earlier chapters.
For instance, several solutions to the same problem are often presented in separate
chapters as a way to illustrate different sets of programming language features. It is
also the case that Part II on concurrent programming uses concepts from Part I on
functional programming.
While this makes it nearly impossible to proceed through the contents in a different
order, you are free to speed through sections that cover features with which you are
already familiar. Material from this book has been used to teach undergraduate and
graduate students who are told that, as long as the code makes sense, they are ready
to move on to the next part. It is when code starts to look puzzling that it is time to
slow down and pay closer attention to the explanations in the text.
There are several ways you can safely skip certain parts of the contents:
• Chapter 15 on types can be skipped entirely. Elsewhere in the book, several code
examples make simplifying assumptions to avoid intricate concepts such as type
bounds and type variance. A basic understanding of Java types, including generics
(but not necessarily with wildcards) and polymorphism, is sufficient.
• Any “aside” can be safely ignored. These are designed as complementary discus-
sions that you may expect to find, given the book’s topics (and I would not want
to disappoint you!), and they can sometimes be lengthy. They are rarely referred
to in the main text, and any of these references can be ignored.
• Any “case study” chapter can be skipped. I would not necessarily recommend
that you do so, however, because the case study code is where features are put
together in the most interesting ways. In any case, no concept or syntax needed in a
later part of the book is ever introduced in a case study. The main text does not
refer to code from the case studies, with one minor exception: Section 10.8 refers
to a binary search tree implementation developed in Chapter 8.
Additional Resources
The book’s companion website is hosted at https://fcpbook.org. It contains additional
resources, a list of errata, and access to the code illustrations, which are available from
GitHub. The code examples were compiled and tested using Scala 3.2. The author
welcomes comments and discussions, and can be reached at [email protected].
Acknowledgments
I want to thank past and present colleagues for the encouragements that got me started
and for their feedback on the early stages of this project. Some were confident I had
something to say (and could say it) before I realized it myself.
A special thanks to my students, who went through various iterations of the material
that ended up in this book. They were my guinea pigs. More times than I care to admit,
I subjected them to frantic improvisation because a feature suddenly needed for that
day’s lecture was only going to be introduced as part of the following week’s discussion.
(To arrange hundreds of code examples in a consistent order is harder than it looks.)
With last minute changes before every class, students got used to, in their own words,
“handouts and slides on which the ink is not quite yet dry.”
Some of the ideas for code illustrations in this book were gathered over a period of
thirty years, during which time I refined them by writing and rewriting many imple-
mentations in a variety of programming languages. As much as I’d like to specifically
thank the original authors of these examples, I can’t remember all the sources I’ve used,
I don’t know which of them were original, and I feel it wouldn’t be fair to mention some
names but not the others. Nevertheless, I don’t claim to have invented all the examples
used in this book. The code is mine (including bugs), but credit for program ideas that
originated elsewhere should go to their creators, whoever they are.
It is truly scary to think what this book would have been if I had been left on my
own. Whatever its current flaws, it was made astronomically better with the help of
my editor, Gregory Doench, and his production team. They were very patient with a
first-time author who clearly didn’t always know what he was doing.
My feelings toward anonymous and non-anonymous reviewers are mixed. Without a
doubt, they helped improve the book but at the cost of extending my prison sentence
every time I was hoping to get paroled. I am thankful for their help—feedback from Cay
Horstmann, Jeff Langr, and Philippe Quéinnec, and long email discussions with Brian
Goetz, in particular, come to mind—but I cannot say I always welcomed their input as
unmitigated good news. I had been warned that writing a book like this was a major
undertaking but not that I would have to write it four times.
Which brings me to my deepest gratitude. It goes to my family, who showed angelic
patience as I kept promising to be done “by next month” for more than a year. They
must have grown tired of “the book” repeatedly getting in the way of our family life.
Indeed, I’m amazed that my wife didn’t pick up the phone one day, call my editor, and
notify him: “That’s it. The book is finished. Done. Today. Now.”
About the Author
Michel Charpentier is an associate professor with the Computer Science department
at the University of New Hampshire (UNH). His interests over the years have ranged
from distributed systems to formal verification and mobile sensor networks. He has been
with UNH since 1999 and currently teaches courses in programming languages, concur-
rency, formal verification, and model-checking.
Register your copy of Functional and Concurrent Programming: Core Concepts and
Features on the InformIT site for convenient access to updates and/or corrections as
they become available. To start the registration process, go to informit.com/register
and log in or create an account. Enter the product ISBN (9780137466542) and click
Submit. Look on the Registered Products tab for an Access Bonus Content link next
to this product, and follow that link to access any available bonus materials. If you
would like to be notified of exclusive offers on new editions and updates, please check
the box to receive email from us.
Part I
Functional Programming
Chapter 1
Concepts of Functional Programming
There is no universally accepted definition of functional programming, but all can agree
that it involves programming with functions. This chapter is an overview of some of the
characteristics of a functional programming style. It argues that these characteristics
are all, to varying degrees, a consequence of choosing to program in terms of functions.
You are not expected to understand all these concepts from this chapter. Rather, the
idea is to point out that they are all somewhat related. The details come later.
1.2 Functions
Functions are a well-established mathematical concept: A function associates every value
from a set with a unique value from another set (both sets may or may not be the same).
The first set is often referred to as the domain of the function and the second as its
range or codomain. Functions are said to map values from the domain to values from
the codomain, and functions are sometimes called mappings. Functional programming
is rooted in a yearning to write programs in terms of constructs that approach mathe-
matical functions.
This mathematical notion of function does not necessarily coincide with what a
function keyword might do in a programming language, or what you think of when
you decide to implement a “function” in your favorite language. Instead, you need to
think back to the functions you saw in your high-school algebra class. In that sense,
f(x) = √x and g(x) = x² − 1 are functions (on real numbers), while Java's sorting
method Arrays.sort is not.
Why? The Java method sorts an array by rearranging its contents. By doing so, it
modifies the array; it does not map it to another value. In mathematics, functions do
not modify anything. What is there to modify, anyway? The number 4 is a value; √4
is another value, also a number; and so is 4² − 1. Functions map values to values, but
values are not modified. Using √4 or 4² − 1 in a larger expression does not "modify"
the number 4, whatever that would mean. Similarly, a sorting function—as opposed
to Java's Arrays.sort—would map an array value to another, sorted array value, but
would not modify its input array in any way.
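The distinction is easy to see in code. As a small sketch, Java's Arrays.sort rearranges its argument in place, while the sorted method of a Scala list returns a new, sorted list and leaves its input untouched:

Scala
val array = Array(3, 1, 2)
java.util.Arrays.sort(array) // rearranges array in place; returns no value

val list = List(3, 1, 2)
val sortedList = list.sorted // a new, sorted list; list itself is unchanged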
A core principle of functional programming is to organize code in terms of functions
that do not modify anything. Consider, for instance, these two Java “functions”:
Java
String firstString1(List<String> strings) {
    return strings.get(0);
}
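Java
String firstString2(List<String> strings) {
    return strings.remove(0); // List.remove(int) returns the removed element
}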
Both return the first string of a list of strings (assuming the list is not empty). Seen
as functions, they are equivalent: For any non-empty list of strings x, the strings
firstString1(x) and firstString2(x) are the same.1 However, the firstString2
function also removes the first string from the list—thus modifying the list—while
firstString1 does not.
1 This assumes that list x implements the remove method, which technically is optional in Java.
As this Java example illustrates, functions that modify some object and func-
tions that do not are often indistinguishable from their signatures: firstString1 and
firstString2 both take a List<String> argument and return a value of type String.
Instead, programmers are expected to rely on good naming and documentation—includ-
ing possibly annotations—to help emphasize that a function-like construct also modifies
the state of a system. For instance, the two functions in the preceding example could
be named getFirstString and getAndRemoveFirstString, respectively.
The starting point of functional programming is to rely, as fundamental organiza-
tional blocks of code, on functions that produce new values but do not modify existing
data in any way. State modifications, when necessary, are performed elsewhere in ways
that are unambiguous—that is, not through constructs that look like functions.
Aside on λ-Calculus
When discussing functions, notations from a typical algebra class are somewhat
ambiguous. One might write f(x) = x² − 1 as a way to define function f. Then,
f(2) is used to represent the value obtained by applying function f to the num-
ber 2. In the same way, f(y) is f applied to variable y. However, by itself,
f(x) = (x − 1)(x + 1) is far from clear. Is this a (re)definition of f? Or is it a
theorem about an existing function f, one stating that f(x) is always equal to
(x − 1)(x + 1)?
λ-Calculus (where λ is the Greek letter lambda) is a theory of functions
developed in the 1930s and serves as one of the mathematical foundations of
functional programming. In λ-calculus, the definition of f and the corresponding
theorem would be stated unambiguously. The definition of f could be written
as f = λx. (x² − 1); the application of f to number 2 would be f 2; and the
general theorem would be f x = (x − 1)(x + 1) (assuming a standard notation
for arithmetic operations).
This book focuses on practical programming, so it does not discuss λ-calculus
in any depth. It should be said, though, that many core ideas of functional pro-
gramming find their roots there. For instance, in λ-calculus, functions and data
are terms of the same algebra. This naturally leads to the notion of functions
as values, which is explored in Chapter 9. Currying (Section 9.2) is tied to the
fact that all functions in λ-calculus are single-argument functions. Addition, for
instance, would be defined as λ x. λ y. (x + y), which is a function of a single
argument (x) that returns another function λ y. (x + y) of a single argument (y).
Other aspects of functional programming are often discussed in terms of λ-
calculus. Hybrid programming languages, for example, sometimes discuss the
relationship between methods and functions (Section 9.6) in terms of the η-
conversion, which states that λ x. f x and f are equivalent. (In programming
language terms, you can think of it as the equivalence between the function that
applies method f and method f itself.)
The debt of functional programming to λ-calculus is nowhere more evident
than in the terminology used for function literals (Section 9.3): They are often
expressed as “lambda expressions,” and lambda is actually a keyword used to
define such functions in languages like Ruby and Python.
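In Scala, for instance, a curried addition can be written directly as a function that returns a function:

Scala
val add: Int => Int => Int = x => y => x + y
val addOne = add(1) // a one-argument function
addOne(41)          // 42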
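Consider now a firstString variant written against Scala's immutable List type (a sketch):

Scala
def firstString(strings: List[String]): String = strings.head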
It does not matter how the function is implemented: You can know for sure that it
does not modify the given list of strings, because the list simply cannot be modified.
(Immutability is discussed in Chapters 3 and 4.)
What about performance? Won’t immutable lists force the costly creation of new lists
as the state of a system evolves? Not necessarily. New immutable lists can often share
data with existing lists and be created with minimal copying and memory allocation.
Other data structures also have this property. They are typically defined in terms of
algebraic data types, and functional programming languages often implement a notion
of pattern matching to better support programming with such types. (Algebraic data
types and pattern matching are the topic of Chapter 5.)
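For example, prepending to an immutable list creates a new list that shares all of the old list's nodes:

Scala
val tail = List(2, 3, 4)
val list1 = 1 :: tail // the list 1, 2, 3, 4
val list2 = 0 :: tail // the list 0, 2, 3, 4; both lists share tail's nodes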
Another important observation about immutability is that code that produces an
immutable value from another immutable value gains nothing from being repeated mul-
tiple times. This is critical, because it means that, in a purely functional approach to
programming, loops are useless. Indeed, if the body of a loop does not change the state
of a system in any way, executing it ten times instead of one (or even zero times) makes
no difference. If you are used to imperative programming, you may find the idea of pro-
gramming without loops quite mystifying. It is actually a frequent mistake, when first
learning functional programming, to try to use a functional computation as the body
of a loop. Instead, functional programming makes heavy use of recursion, not only as a
replacement for loops, but also as a natural way to process algebraic data types that
are recursive in nature, like trees. (Recursion is discussed at length in Chapters 6 to 8,
and used throughout the book.)
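As a preview, here is a list-length function that uses recursion and pattern matching instead of a loop:

Scala
def length[A](list: List[A]): Int = list match
  case Nil       => 0                // an empty list has length 0
  case _ :: tail => 1 + length(tail) // one more than the length of the tail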
From the centrality of functions in functional programming emerges another idea,
that of functions as values. In functional programming languages, functions are regular
values, which can be stored in collections, or used as arguments and return values of
other functions. This gives rise to the concepts of higher-order functions and of func-
tion literals, often expressed in terms of lambda expressions. (Higher-order functions,
function literals, and lambda expressions are the topics of Chapters 9 and 10.)
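For example, a function can be stored in a variable and passed to the higher-order function map:

Scala
val double: Int => Int = x => 2 * x // a function value, built from a lambda expression
List(1, 2, 3).map(double)           // List(2, 4, 6)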
Since, in functional programming, functions can be used as values, you can some-
times replace an explicit argument with a function that can compute this argument, thus
delaying the evaluation of the argument until it is needed. The argument is then said to
be lazily evaluated. By the same token, functions can also be stored inside data struc-
tures to implement lazily evaluated types, such as streams and views. (Lazy evaluation
is covered in Chapters 12 and 14).
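As a glimpse, Scala's LazyList computes its elements only when they are needed, so it can even represent an unbounded sequence:

Scala
val evens = LazyList.from(0).filter(_ % 2 == 0) // no element is computed yet
evens.take(3).toList                            // List(0, 2, 4)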
Finally, handling failures by throwing and catching exceptions becomes inadequate
in programs that embed their control flow in higher-order functions. Instead, faults and
errors are better treated in functional programming as regular values—of well-chosen
types, defined for this purpose. (Chapter 13 discusses functional error handling.)
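For instance, a parsing function can return an optional value instead of throwing an exception:

Scala
def parseAge(str: String): Option[Int] =
  str.toIntOption.filter(_ >= 0) // None on invalid input, instead of an exception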
1.4 Summary
It is not uncommon for tutorials and overviews to answer the question What is functional
programming? by emphasizing one or more of the characteristics highlighted in this
chapter. You might hear that functional programming is all about recursion, or all about
immutability, or all about higher-order functions and lazy evaluation. In truth, all these
ideas are important, and all follow from the central principle of using functions as
the primary notion of computation. The concepts briefly mentioned here are thoroughly
explored in the first part of the book (Chapters 2 to 14) through small code illustrations
and longer case studies.
Chapter 2
Functions in Programming Languages
The simplicity of mathematical functions is often enriched in programming languages
with features that facilitate their use as programming abstractions. Some of the most
common features are discussed in this chapter. Hybrid languages, which combine
functional and object-oriented programming, typically distinguish between operators,
methods, and functions and define mechanisms to bridge all three. Additionally, type
parameterization, optional arguments, and variable-length arguments are commonly
used to define templates that represent families of functions.
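Consider, for illustration, an absolute value function defined in five different languages. In Java, it might be written:

Java
int abs(int x) {
    if (x > 0) return x; else return -x;
}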
JavaScript
function abs(x) {
    if (x > 0) return x; else return -x;
}
Python
def abs(x):
    return x if x > 0 else -x
Kotlin
fun abs(x: Int): Int = if (x > 0) x else -x
Scala
def abs(x: Int): Int = if x > 0 then x else -x
These five definitions show many similarities. For instance, the body of the function
is exactly the same in Java and in JavaScript, and also in Kotlin and in Scala. However,
you might also notice several differences:
• A function is introduced in Java without a keyword. By contrast, JavaScript uses
function, Kotlin uses fun, and Python and Scala use def.
• Java, JavaScript, and Python all use the keyword return to return a value. Kotlin
and Scala do not (at least in this example).
• The body of the function is delimited by curly braces in Java and JavaScript.
Python uses indentation. For a function body as simple as the absolute value,
the Kotlin and Scala variants use nothing. (When defining more complex func-
tions, Kotlin uses braces and Scala uses braces or indentation.)
• Types are handled differently: Java uses a “type variable” syntax, while Kotlin
and Scala use “variable: type.” More noticeably, JavaScript and Python do not
mention types at all.
• The languages use a different syntax to test whether input x is positive. Some rely
on parentheses; some don’t. Some include a then keyword; others don’t. More
importantly, the Python, Kotlin, and Scala variants use “if” as an expression,
with a value. The Java and JavaScript variants do not.
Some of these differences are trivial matters of syntax. Developers are expected to
seamlessly navigate such minor variations when switching languages. You may write def
a few times as you start programming in Kotlin, but it should not take long to adjust
to writing fun instead. Other dissimilarities run deeper and will be revisited (typing
strategies are discussed in Chapter 15, and the use of conditionals as expressions is
discussed in Chapter 3). This book uses mostly Scala—with some Java—in its code
illustrations. You may already be familiar with one or the other language—more often
Java than Scala, I suspect—but should quickly get used to both as you read the code
examples.
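2.2 Composing Functions

In imperative programming, two computations are often composed by executing one after the other, as in this sketch:

Scala
doOneThing()
doAnotherThing()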
The two functions doOneThing and doAnotherThing are executed in sequence, one then
the other. When functions are used as functions—mechanisms to produce values from
values—as opposed to acting in some way on the state of an application, they need to
be composed differently. Let’s supplement the absolute value function with a second
function, dots, for the purpose of an illustration:
Scala
def dots(length: Int): String = "." * length
This function creates a string of dots of a specified length. It can be combined with the
absolute value function to produce the string "..." from the number -3:
Scala
dots(abs(-3)) // the string "..."
Functions are composed by using the output of a function as the input to another
function. Note that I did not use sequential composition and did not write:
Scala
abs(-3);
dots(-3);
which would not have the desired effect. For sequential composition to work, function
abs would need to change a number (into its absolute value) before dots uses it to build
a string. This would work:
Scala
var num = -3;
num = abs(num);
dots(num); // the string "..."
However, this pattern cannot be expressed without introducing a variable num. You need
to store the effects of the first part of the sequence somewhere and tell the second part
where to find them.
Functions can be composed more easily because their effects are local. Ideally, a func-
tion only needs to know its input and does nothing more than produce an output (see
the discussion of pure functions in Chapter 3). This makes it possible to compose func-
tions into larger functions, for which functional programming languages define specific
operators:
Scala
(dots compose abs)(-3) // the string "..."
(abs andThen dots)(-3) // the string "..."
Functions abs and dots are composed, and the composed function is then applied to the
argument -3. By contrast, you cannot compose two sequential code fragments into one
without tying them together by mentioning explicitly what is being transformed by the
first part so the second part can use it, which is the role of variable num in the preceding
example.
If you are new to functional programming, you may find expressions like dots
compose abs or abs andThen dots somewhat mystifying. Instead of being applied to
arguments, functions abs and dots look like they are themselves arguments to the oper-
ators compose and andThen. Indeed, this is exactly what is happening, and Chapters 9
and 10 are dedicated to this very important feature of functional programming.
NOTE
Function dots breaks down if length is negative. To keep the code excerpts short and focused on
the concepts being illustrated, I deliberately omit all argument validation throughout the book. As
a reader, you are asked to assume that arguments have reasonable values and have been validated
elsewhere.
Scala uses methods with symbolic names to implement almost all its operators. For
instance, both “>” and “-” are implemented as methods. The expression x > 0 invokes
a method “>” on an object x with argument 0. It could be written as x.>(0), but there
really is no reason to do so. Instead, some languages let you use an infix notation when
invoking methods, which makes perfect sense for methods defined with symbolic names.
This includes your own user-defined methods:
Scala
class Node:
  def --> (that: Node): Edge = Edge(this, that)
  ...
Given this definition, the expression a --> b can be used to create an edge between
two nodes a and b. In effect, you have defined an operator “-->”. In addition, you can
sometimes specify that a method with a regular name can be used as an operator:
Scala
class Node:
  infix def to(that: Node): Edge = Edge(this, that)
  ...
Given this definition, you can build an edge between a and b with the expression a to b.
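Suppose now that you have a function that shortens strings to a maximum length (one possible implementation):

Scala
def shorten(str: String, maxLen: Int): String =
  if str.length <= maxLen then str else str.take(maxLen)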
If desired, you can make this function look like a method of the String type by defining
an extension:
Scala
extension (str: String)
  def short(maxLen: Int): String = shorten(str, maxLen)
Hybrid languages like Scala or Kotlin define other bridges between functions and
methods. In Chapter 9, we will see a conversion that operates in the opposite direction:
It lets you use a method where a function is expected.
In Scala, the body of a function is delimited by curly braces, but braces can often be
omitted if you use proper indentation, as in Listing 2.4.
An important property of local functions is that they can access the arguments and
local variables of their enclosing function. As an even quirkier definition of absolute
value, you could write:
Scala
def abs(x: Int): Int =
  def maxX(a: Int): Int = if a > x then a else x
  maxX(-x)
Note how local function maxX uses variable x, which is not one of its arguments.
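Scala also lets you define functions with variable-length arguments. For instance, an average function might accept one or more values (a sketch in the spirit of Listing 2.5):

Scala
def average(first: Double, others: Double*): Double =
  (first + others.sum) / (1 + others.length)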
You can think of this feature as a mechanism used to define a family of functions: a
one-argument average function, a two-argument average function, a three-argument
average function, etc. This is also true of optional arguments and type parameters,
which are described next.
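Default argument values let callers omit some arguments. For instance, a message-formatting function might be defined as follows (a sketch consistent with the calls shown below):

Scala
def formatMessage(msg: String, user: String = "", withNewline: Boolean = true): String =
  val prefix = if user.isEmpty then "" else s"$user: "
  if withNewline then prefix + msg + "\n" else prefix + msg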
Listing 2.6: Example of a function with default values for some of its arguments.
In this function, the user string defaults to the empty string, and the withNewline flag
defaults to true. All of the following are valid calls of this function:
Scala
formatMessage("hello") // "hello\n"
formatMessage("hello", "Joe") // "Joe: hello\n"
formatMessage("hello", "Joe", false) // "Joe: hello"
This does not work because the function expects a string as the second argument, not
the Boolean false, which should be the third argument. To get around this difficulty,
you need to specify the name of the argument explicitly:
Scala
formatMessage("hello", withNewline = false) // "hello"
With explicit names, arguments can be reordered arbitrarily. All of the following calls
are valid:
Scala
formatMessage(msg = "hello", user = "Joe")
formatMessage(user = "Joe", msg = "hello")
formatMessage(user = "Joe", withNewline = false, msg = "hello")
Even when they are not strictly necessary, you can sometimes rely on argument
names to improve code readability:
Scala
formatMessage("Tweedledee", "Tweedledum") // which is user and which is message?
formatMessage(msg = "Tweedledee", user = "Tweedledum") // clearer
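2.10 Type Parameters

Suppose you want a function first that returns the first element of a pair. Without type parameters, you might write:

Scala
def first(pair: (Any, Any)): Any = pair(0)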
Argument pair has type (Any, Any), which represents pairs of any types—Any being
the type of everything in Scala. The function can be applied to a pair of integers, as in
first((1, 2)), or a pair of strings, as in first(("A", "B")). The problem with this
definition of first is that its return type is Any. When you apply first to a pair of
integers, the function returns a value of type Any. This value happens to be an integer,
but the type information is lost:
Scala
first((1, 2)) // has type Any
first((1, 2)) + 10 // rejected by the compiler
first(("egg", "chicken")) // has type Any
first(("egg", "chicken")).toUpperCase // rejected by the compiler
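One way to work around the problem is a typecast:

Scala
first((1, 2)).asInstanceOf[Int] + 10 // compiles, but bypasses type checking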
This is the wrong solution. A better approach is to use a type parameter in the definition
of function first:
Scala
def first[A](pair: (A, A)): A = pair(0)
Listing 2.7: Example of a function parameterized by a type; contrast with Lis. 2.8.
The function now has a type parameter A and produces a value of type A from a pair of
type (A, A). Adding [A] to the definition of function first makes A a (type) variable
of the function. The name is irrelevant. You could define first as
Scala
def first[Type](pair: (Type, Type)): Type = pair(0)
A difficulty remains with heterogeneous pairs, such as one made of the
integer 1 and the string "chicken". In a call like first((1, "chicken")), the compiler
sets A to be Any,2 the only type that contains both values. To better handle
heterogeneous pairs, you can modify function first to use two type parameters:
Scala
def first[A, B](pair: (A, B)): A = pair(0)
Listing 2.8: Example of parameterization by multiple types; contrast with Lis. 2.7.
The expression first((1, "chicken")) + 10 can now be compiled, after the com-
piler infers types A=Int and B=String. Similarly, first(("egg", 2)).toUpperCase is
a valid expression, with types A=String and B=Int.
Type parameterization is a powerful feature that you should use as often as possible.
If you find yourself frequently testing and casting types at runtime, it could be a sign
of inadequate leverage of type parameterization. Step back, and see if adding type
parameters to functions, methods, or classes could help you avoid the typecasts.
2.11 Summary
• All mainstream programming languages implement at least one code-structuring
abstraction that can be used to represent a function. In languages that favor
object-oriented programming, many functions are implemented as methods.
• The role of a function is to produce output values from input values. Accordingly,
functional code is typically structured in terms of function composition, in which
the output of a function is used as the input of the next function.
• In addition to functions and methods, programming languages may rely on opera-
tors, often applied in prefix or infix notation. Some languages use symbolic names
to implement some or all of their operators as functions or methods.
• Functions often rely on other, intermediate functions in their definition. In some
languages, it is possible to define local functions within functions. A local function
can access the arguments and local variables of its enclosing function directly,
which simplifies its signature and improves code legibility. (This is similar to the
use of local classes in object-oriented languages.)
• For increased flexibility, functions may use named arguments, repeated arguments
(also called variable-length arguments or varargs), or default values. Some lan-
guages, such as Scala, Kotlin, and Python, implement all three mechanisms, while
others, such as Java, support repeated arguments but not named arguments or
default values.
2 Technically, the type inferred by the Scala compiler is Matchable, a strict subtype of Any, but you can ignore the distinction here.
1 You could argue that these functions are not quite the mathematical absolute value:
their domain is not the infinite set of integers with the usual addition and subtraction operations,
but rather a different set of 32-bit values with 2's complement arithmetic. Still, they implement a
mathematical function, just not the standard absolute value.
2 Input here is to be understood as an input value. When invoked multiple times on the same
mutable object, a pure function could still produce different outputs because the object—and thus the
input—has changed.
Chapter 3
Immutability

3.1 Pure and Impure Functions

A pure function's output depends only on its input value, and computing that output has no side effects.
As an example of code that fails to satisfy the first property of pure functions,
consider function format:
Scala
var prompt = "> "
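
def format(command: String): String = prompt + command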
This function’s behavior depends on the value of the external variable prompt. As a
result, it might produce different outputs when you call it multiple times on the same
input:
Scala
format("command") // "> command"
prompt = "% " // change the prompt
format("command") // "% command"
Function format produces two different strings from the same input, making it impure.
As an illustration of the second characteristic of pure functions—the absence of side
effects—recall function firstString2 from Chapter 1, rewritten here in Scala (but still
using Java lists):
Scala
import java.util
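
def firstString2(strings: util.List[String]): String = strings.remove(0)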
This function returns the first string from its input list but also has the side effect that
the value is removed from the list, making the function impure.
A function can even be impure by breaking both purity constraints:
Scala
var lastID = 0
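
def uniqueName(prefix: String): String =
  lastID += 1 // side effect on the external variable lastID
  prefix + lastID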
uniqueName("user-") // "user-1"
uniqueName("user-") // "user-2"
This function has the side effect of modifying variable lastID, on which its own com-
putation depends, so it fails to meet both of the requirements needed to be pure. By
contrast to these examples, function firstString1, which returns a string without re-
moving it from the list, is pure (assuming that method get on a Java list is pure, which
it is for standard list implementations).
Note that the two reasons that a function might be impure—side effects or depen-
dency on external mutable state—are closely related. Both stem from the notion of
mutable state: A function can be impure because of its own mutations, or because it is
impacted by the mutations of other functions. In particular, in a program that consists
entirely of functions free of side effects, all the functions are pure.3 Mutation is the
source of impurity.
3.2 Actions
Both functions firstString1 and firstString2 return the first string of a list, but
firstString2 also removes the string from the list as a side effect. A third variant of
the function is possible, one that removes the first string, but does not return it:
Scala
def removeFirstString(strings: util.List[String]): Unit = strings.remove(0)
This function uses the same implementation as firstString2, but its return type is dif-
ferent: Unit instead of String. For a programmer new to functional programming, Unit
is an unusual type. There is only one value of type Unit, typically referred to as unit, and
represented in Scala by the token “()”.
As a function, removeFirstString is not very useful: From its return type, you
already know that the function will return unit before you even call it. So why call it
at all? For its side effect of removing the first string from the list.
Type Unit is used when a function has nothing useful to return and is defined
solely for the purpose of exploiting its side effects. Such functions are sometimes called
procedures. In this book, I use the simpler and shorter term action instead of procedure,
but this is not standard terminology. By choosing Unit as the return type, you can make
it clear to the user that a function has side effects. Indeed, a function with return type
Unit and no side effects would truly be useless.
Several hybrid languages rely on the Unit type to ensure that all methods are func-
tions: Every method returns a value, even if this value is sometimes useless. As an older
language, Java uses the notion of void methods, which do not return anything, and
thus are not functions, not even peculiar ones. In Java, removeFirstString would be a
void method, making it clear that it is invoked only for the purpose of exerting a side
effect.
You can make your code less surprising by relying as much as possible on actions and
pure functions. Take, for example, class java.util.Arrays. All its methods are either
actions like sort or fill or setAll, which modify their input but return no value, or
pure functions, like binarySearch or hashCode or equals, which return a new value
without side effects.
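For instance (a minimal illustration of both kinds):
Java
int[] values = {3, 1, 2};
java.util.Arrays.sort(values); // action: sorts the array in place, returns nothing
int index = java.util.Arrays.binarySearch(values, 2); // pure: returns an index (here, 1)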
Functions that include side effects but are not actions, like firstString2, tend to be
more confusing. In particular, impurity is not obvious from the function’s signature, since
3 For this reason, “pure” is sometimes used loosely to mean “free of side effects.”
a meaningful pure function could be defined with the same signature. This could lead
you to call a function without being aware of its side effects, with adverse consequences.
In other words, it is an easy mistake to use firstString2 with the implementation of
firstString1 in mind.
Still, impure functions have their uses. When something has clear side effects—there
is no way it could be a pure function—an action can safely be replaced by a function
that returns something useful. For instance, the Java method add, which clearly adds a
value to a set, also returns a Boolean:
Java
Set<String> set = ...;
if (set.add("X")) {
// "X" was added and was not already in the set
} else {
// "X" was already in the set, which was left unchanged
}
You can rely on the value returned by add to know if an element was indeed inserted
into the set, or if the element was already in the set and did not need to be added.
A Boolean value is not the only choice that makes sense for an adding function.
Scala’s set method “+=”, for instance, returns the set itself, making it possible to chain
operations:
Scala
val set: Set[String] = ...
set += "X" -= "Y"
String "X" is first added to the set. Method “+=” returns the set itself, on which method
“-=” is then applied to remove string "Y".
Note that method “-=” also returns the set, but in this example, the value being
returned is ignored. In other words, “-=” is used here as if it were an action. When
the return value of a function with side effects is ignored, I will sometimes refer to the
function simply as an action instead of the more cumbersome “impure function used as
an action.”
In summary, there are three possible variants for a “first string” function:
Scala
// pure function, cannot be used as an action
def getFirstString(strings: util.List[String]): String = strings.get(0)
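
// impure function, can be used as an action (this name is illustrative)
def getAndRemoveFirstString(strings: util.List[String]): String = strings.remove(0)

// action, cannot be used as a function (from earlier in this section)
def removeFirstString(strings: util.List[String]): Unit = strings.remove(0)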
The pure function has no side effects, while the action does not return any meaningful
value. The impure function is a combination of both. You can argue that it is the most
powerful and versatile combination, but it is also the most susceptible to bugs if used
erroneously—that is, as if it were a pure function.
4 In C or Java, the statement prompt = "% " is also an expression, with value "% ". Like impure
functions, statements that are also expressions can be a source of confusion. In more recent languages,
such as Scala or Kotlin, prompt = "% " is used only as a statement. As an expression, it has type Unit
and is useless.
In the last line of this example, the if-then-else expression is used as the input to
action println. This has the same effect as the Java program in Listing 3.2, but proceeds
differently: Instead of selecting between two print statements, if-then-else is used to
select between two strings, and the selected string is printed. Of course, nothing prevents
you from using if-then-else to combine expressions with side effects. The behavior of
Listing 3.2 can also be achieved in Scala as follows:
Scala
if num > 100 then println("large") else println("small")
This is still an expression, but its value—unit—is useless. The expression is evaluated
only for its side effect.
As an expression, if-then-else makes no sense without the else part, which is typ-
ically required in a functional programming language. However, if-then (with no else)
does make sense when using actions. As a hybrid language, Scala implicitly substitutes
else () for a missing else clause when if-then-else is used imperatively.
Another fundamental construct used in imperative programming to combine state-
ments is the while-loop (and its for-loop and do-loop derivatives). Although they are
useful with statements, loops make little sense with expressions: The repeated evaluation
of a pure function is pointless. When shifting from imperative to functional program-
ming, such ineffective loops can be replaced with either recursion (Chapters 6 and 7) or
higher-order functions (Chapters 9 and 10).
3.4 Functional Variables
Side effects are carried out through assignments, either directly (for example, to modify variables prompt and lastID) or indirectly (for example, inside the
implementation of list method remove). A functional programming style aims at elimi-
nating side effects so as to write programs in terms of pure functions. Not surprisingly,
this means targeting the assignment statement for elimination (or at least much reduced
usage).
If you look closely at the Scala examples so far, you’ll notice that I have introduced
some local variables with the keyword val, and others with the keyword var. You may
also notice that the var variables were used as the targets of assignment statements,
but the val variables were not. Indeed, the distinctive difference between the two is
that val is used to proscribe assignments:
Scala
val two = 2
val three = two + 1
two = two + 1 // rejected by the compiler
What val really does is give a name to an expression. Despite similarities in the syntax,
you are better off thinking of val two = 2 as giving the name two to the value 2,
rather than as assigning 2 to variable two. Because of its val definition, two is not an
ordinary variable in the C or Java sense. In particular, it cannot be reassigned with a
different value. If you attempt to write this kind of assignment, such as two = two + 1
or two = 3, the compiler will reject it as a “reassignment to val” error.
As a consequence of this restriction, two and 2 are now equivalent expressions;
wherever the name two appears (within the scope of the val declaration), it means 2.
This property is known as referential transparency: The expressions two and 2 are
interchangeable, and one can always be substituted for the other. You do not need
to carefully parse code to see if two is being modified somewhere, because it cannot
be changed. This would not be true if two was introduced as a reassignable variable,
using var. Then it could be initialized with 2 at the beginning of the program but have
another value by the time it is used inside a later expression.
Names introduced using val are often said to be functional variables. You might
also see them being referred to as immutable, non-reassignable, or assign-once variables.
Functional programming languages rely heavily on this notion of variables. Hybrid lan-
guages might complement functional variables with old-fashioned, reassignable variables.
In Scala, the declaration var two = 2 assigns the value 2 to the name two, but the vari-
able can later be updated via assignment statements like two = 3 or two += 1, and the
referential transparency property is lost.
If you are new to functional programming, you may find the idea of programming
without assignments quite puzzling. Still, if you want to practice with a functional
programming style in a language that supports both types of variables, you should strive
to maximize your use of functional variables. Purists might even consider any use of an
assignment statement to be a “code smell,” but pragmatic users of hybrid languages
will know when reassignable variables are warranted, and when they are better avoided.
As an illustration, consider the problem of parsing a command-line option to set a
verbosity level: 0 is the default level, -v sets it to 1, and -vv sets it to 2. An imperative
programmer could implement this as follows:
Scala
var verbosity = 0
if arg == "-v" then verbosity = 1
else if arg == "-vv" then verbosity = 2
This sets verbosity to the desired value, but the variable is reassignable, and there is no
referential transparency. Could verbosity be declared as a val instead? A programmer
used to a more functional style would implement the calculation in this way:
Scala
val verbosity = if arg == "-v" then 1 else if arg == "-vv" then 2 else 0
In Java, final is used to prevent variable reassignment. If you use the earlier imperative
approach to define verbosity, you won’t be able to mark it as final.
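For instance, here is a sketch of the same computation in Java, where a conditional expression makes the final modifier possible:
Java
final int verbosity = arg.equals("-v") ? 1 : arg.equals("-vv") ? 2 : 0;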
Working with immutable data also requires a shift away from the imperative programming mindset. You will need to get used to different interfaces and
programming patterns.
Consider sets as an example. The Scala standard library implements both mutable
and immutable sets. If you come from a Java or Python background, mutable sets feel
completely natural:
Scala
val set = Set("A", "B") // a mutable set
set += "C"
// set is now {A,B,C}
The action set += "C" modifies the set by adding "C" to it. Alternatively, you can
rewrite the example using an immutable set, stored in a mutable variable:
Scala
var set = Set("A", "B") // an immutable set
set = set + "C"
// set is now {A,B,C}
Instead of a “+=” method that modifies the set, immutable sets define a “+” method
that creates a new set from an existing set. The set on which method “+” is applied is
not modified—it is immutable.
Avoid the common mistake of using persistent structures with a mind frame that
expects mutability:
Scala
// DON'T DO THIS!
var set = Set("A", "B") // an immutable set
set + "C"
// set is still {A,B}
Here, method “+” is used as if its effect was to modify a set by adding an element. But
that is not what the method does. Instead, a new set is created—and not used—and
variable set is unchanged, so it is still equal to the set {A, B}.
The similarity makes it easier to switch from one approach to the other. However, it
can also be a source of confusion:
Scala
var set = Set("A", "B") // a set of unknown type
set += "C" // is this a call to += or a reassignment of a var?
val set1 = Set("A", "B") // immutable set
import scala.collection.mutable
val set2 = mutable.Set(1, 2, 3)
import scala.collection.mutable.Set as MutableSet
val set3 = MutableSet("X", "Y") // mutable set, via a renamed import (values illustrative)
In this code, set1 is immutable, while set2 and set3 are mutable. Importing scala
.collection.mutable.Set without renaming—as I did earlier in the chapter—is con-
sidered bad practice.
NOTE
For convenience and readability, I often use, throughout the book, formulations like “an element is
added to/removed from the set” or “the list is reversed” or “the kth value is replaced,” even when
referring to persistent structures. These statements are intended to mean “a new set is created with
an element added/removed,” “a new list is created containing all values in reverse order,” etc.
3.7 Functional Lists
The nested expression starts with the empty list nil, uses cons to build the list
(cons 3 nil), which is the list [3], then uses cons again to add 2, then 1, in front of
the list. The head of the final list is 1; its tail is the list (cons 2 (cons 3 nil)).
Languages of the ML family popularized the use of the infix operator “::” to rep-
resent cons. It is also being used in more recent languages like F# and Scala. The Lisp
code just given can be written in Scala as follows:
Scala
1 :: 2 :: 3 :: Nil
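The listing that defines the three lists discussed next is elided here; definitions consistent with Figure 3.1 and the cell count below would be:
Scala
val a = 1 :: 2 :: 3 :: Nil // allocates three cells
val b = 0 :: a             // allocates one new cell, in front of a
val c = a.tail             // allocates nothing: the list [2,3] inside a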
The memory allocation of these three lists looks something like Figure 3.1. Although the
three lists together contain nine values, there are only four cells allocated in memory,
thanks to data sharing between a, b, and c. Because lists are immutable, this sharing
is harmless. The same is true, to some degree, of other persistent data structures. The
method “+” used earlier on an immutable set does not make a full copy of the set. If
largeSet contains 1 million elements, largeSet + x is a new set that contains 1 million
and one elements, but the two sets share the vast majority of the memory allocated to
implement them.
Figure 3.1 Lists a, b, and c sharing cells in memory: b adds the cell 0 in front of a, and c is the tail of a.
Another consequence of this data sharing is that, on functional lists, both the head
and the tail methods are efficient, constant-time operations—typically, a single pointer
dereference. As long as you can express an algorithm in terms of head and tail, lists
make a reasonable implementation choice. Be aware, however, that other list operations
may not be as fast. Methods such as last and length, for instance, require a computing
time that is proportional to the length of the list, and methods such as appended or
concat need to copy the entire list.
The second reason the head/tail structure of functional lists is important is that the
tail of a non-empty list is itself a list. In consequence, many computations on lists can
be implemented efficiently using recursion. Chapter 7 is dedicated to recursive program-
ming with lists.
In a hybrid application that combines pure functions, impure functions, and actions,
you need to worry about possible side effects when calling functions that are not un-
der your control. The beauty of immutable values is that they can be shared safely,
knowing that no unexpected side effect can modify them.
As an illustration, consider an application that keeps track of registered users in a
mutable set:
Scala
type Directory = mutable.Set[String]
def newRegistrations(newDir: Directory, oldDir: Directory): Directory = ...
The problem here is that function newRegistrations is not necessarily pure. It could
be implemented as follows:
Scala
def newRegistrations(newDir: Directory, oldDir: Directory): Directory =
newDir.subtractAll(oldDir)
This implementation removes from the set newDir every user already in the set oldDir,
and returns newDir. The return value is correct, but the set newDir has been modified.
Because you worry that newRegistrations could be implemented like this, you need
to make sure that the directories you pass as arguments will not be modified. So, you
end up calling newRegistrations on copies of your directories:
Scala
val todayReg = newRegistrations(todayDir.clone(), yesterdayDir.clone())
Unless you know for sure that newRegistrations does not modify its arguments,6 you
cannot dispense with the calls to clone, sometimes known as defensive copies. The
sad part of this story is that it is also possible—likely, even—that newRegistrations is
implemented as a pure function. In that case, the defensive copies are entirely wasted.
Wasted defensive copies are actually quite common7 in large applications that depend
on a multitude of libraries. You can avoid them by using immutable directories instead:
Scala
type Directory = Set[String]
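With immutable sets, newRegistrations cannot modify its arguments, no matter how it is implemented; a minimal sketch:
Scala
def newRegistrations(newDir: Directory, oldDir: Directory): Directory =
  newDir diff oldDir // set difference: builds a new set, leaves both arguments intact

val todayReg = newRegistrations(todayDir, yesterdayDir) // no defensive copies needed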
6 For a function that is not under your control, checking the source code is not enough. You need some guarantees, typically in the documentation,
that the function not only is but will remain pure.
7 It can also be the case that you are calling a method that stores its arguments and makes its
own defensive copies because it worries that the arguments might be modified after the call. In the
worst case, you may end up cloning objects both on the calling side and inside the code that is called.
Paradoxically, sharing immutable objects often results in less memory allocation, not more.
3.9 Updating Collections of Mutable/Immutable Objects
Even though the list is immutable, its contents are being modified. However, the same
list, containing the same objects, is returned.
Alternatively, you could implement loads as an immutable type:
Scala
trait Load:
def weight: Int // current weight
def reduced: Load // a reduced load, with a weight reduced by one
The new load objects produced by reduced must be stored into a new list. You can replace the for-do syntax with a for-yield syntax that
does just that:
Scala
def reduceAll(loads: List[Load]): List[Load] = for load <- loads yield load.reduced
Method reduced is applied to all the loads in the list but, since it has no side effects, the
loads do not change. Instead, all the new load objects that are created by reduced are
assembled into a new list, which is returned. In general, for-yield applies a function to
a collection of values and returns a new collection with the outputs.9 The function being
applied is typically pure.
3.10 Summary
• The responsibility of a function is to calculate an output value from one or more
input values. While doing so, functions should refrain from depending on an exter-
nal mutable state and from modifying anything in a system (or its environment)
through side effects. This includes modifying the function’s arguments, if they are
mutable.
• Functions that adhere to this contract are said to be pure. A functional program-
ming style emphasizes the use of pure functions as often as possible.
• Programming with pure functions is programming with values: Output values pro-
duced by functions are used as input arguments to other functions. This results in
a programming style centered on expressions instead of statements. Programming
languages that support a functional style often reflect this choice in their control
structures. In Scala, for instance, if-then-else is an expression, and so are code
blocks.
• In functional programming, the main use of variables is to name expressions.
The resulting names are variables in the mathematical sense, as in “let x be √2
in . . . .” They are not variables in the C/Java sense of memory locations that can
be reassigned. Hybrid languages make use of both kinds of variables: functional
(non-reassignable) variables and traditional, imperative (reassignable) variables.
In Scala, these variables are introduced by the keywords val and var, respectively.
• Functional programming and object-oriented programming can be combined into a
functional–object-oriented programming style that centers on immutable objects,
also called functional objects. Methods on a functional object are not used to
modify the object, but rather to produce a new object.
9 The for-yield syntax is somewhat particular to Scala, although other languages define similar
constructs. The more universal way to implement this transformation is to use a higher-order function
called map. See Section 10.9 for a discussion of map and its relationship with for-yield.
Chapter 4
Case Study: Active–Passive Sets
Method allActive is simpler than before: You can just return a reference to field
activeSet instead of having to copy the set. This works because activeSet is an immu-
table set, and returning a reference to it is harmless. The code for methods activate
and deactivate looks unchanged but is actually compiled differently. Instead of call-
ing methods += and -= on a mutable set, you are now creating a new immutable
set and reassigning field activeSet with it. (The similarity in this syntax was dis-
cussed in Section 3.6.) Methods activateAll and deactivateAll are implemented
without any set operation. They simply (and efficiently) reset field activeSet with
the value elements, or with an empty set. The other methods (isActive, isPassive,
allPassive, isAllActive, and isAllPassive) are unchanged.
In summary, there are two ways you can define activeSet to implement a mutable
active–passive set:
Scala
private val activeSet = mutable.Set.empty[A] // in Listing 4.1
private var activeSet = Set.empty[A] // in Listing 4.2
In one case, you activate/deactivate elements by modifying a mutable set. In the other
case, you do it by building a new immutable set and reassigning a mutable field.
The first set in a pair represents the entire collection; the second set is the subset of
active elements. Both sets are immutable. All the remaining code is written as functions:
Scala
private def elements[A](ap: ActivePassive[A]): Set[A] = ap(0)
private def activeSet[A](ap: ActivePassive[A]): Set[A] = ap(1)
3 Because of the keyword opaque, this definition makes the types ActivePassive[A] and (Set[A],
Set[A]) distinct. This prevents you from using a (Set[A], Set[A]) value by mistake as an argument
to a function that expects ActivePassive[A], and vice versa. In particular, you avoid the risk of calling
an active–passive function with a pair that does not represent an active–passive set—that is, one in
which the second set is not a subset of the first set.
// in Listing 4.3
def isAllActive[A](ap: ActivePassive[A]): Boolean =
activeSet(ap).size == elements(ap).size
Because active–passive sets are now immutable, the operations that activate or deac-
tivate elements change more substantially. They now need to create a new value: a new
active–passive pair that differs from the previous one by having some elements activated
or deactivated. Notice that the signatures are different from before: Instead of Unit, they
now have ActivePassive[A] as their return type. For instance, to activate a single ele-
ment elem in function activate, you create a new active set, activeSet(ap) + elem,
and use it to make a new active–passive pair. Two private functions are added to access
pair elements so as to avoid writing all the code in terms of ap(0) and ap(1), which
would be error-prone. Finally, the class constructor from the object-oriented variants is
replaced with a createActivePassive function.
All the functions used in the implementation are pure, and active–passive values are
used by function composition. Instead of
Scala
ap.activateAll()
ap.deactivate(A)
ap.deactivate(B)
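the functional variant expresses the same sequence as nested function calls, each producing a new value:
Scala
deactivate(deactivate(activateAll(ap), A), B)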
The ActivePassive class is back, but defined differently. Both fields, elements
and activeSet, are now immutable sets and are never reassigned.4 As a consequence,
active–passive objects are never modified. Instead, activation and deactivation methods
return a new active–passive set, as in the functional variant. To activate an element,
for instance, you create a new activeSet, as in Listing 4.3, but instead of using it to
build a new pair, you wrap the set into a new ActivePassive instance. Internally, each
new active–passive set is built using a private, two-set constructor. Users, however, can
only use a single-set public constructor (def this in the source code). This makes it
impossible for them to create nonsensical objects in which activeSet is not a subset of
elements.
As with the functional variant, these active–passive sets rely on function composition,
but you write the code with a method-call syntax. Instead of
Scala
deactivate(deactivate(activateAll(ap), A), B)
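you now write a chain of method calls, along these lines (a sketch; each call returns a new active–passive object):
Scala
ap.activateAll().deactivate(A).deactivate(B)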
As a final note, be aware that mutable and immutable variants typically use dif-
ferent function and method names. To better highlight differences and similarities, I
used the same names in all the variants. In practice, the functional variants should use
names consistent with the fact that values are immutable. We saw, for instance, that
mutable sets in Scala use a method +=, while immutable sets use +. Similarly, arrays
have a method update, while lists have a method updated. Here, the immutable vari-
ant of ActivePassive in Listing 4.4 would have been better written using activated,
deactivated, allActivated, and allDeactivated instead of activate, deactivate,
activateAll, and deactivateAll.
4.4 Summary
The point of this case study is to illustrate the differences between mutable and immu-
table data. There are really two variants of the active–passive sets here: Listings 4.1
and 4.2, on the one hand, and Listings 4.3 and 4.4, on the other hand. The first two
implement a mutable variant, the last two an immutable variant.
The fact that Listing 4.1 relies on mutable sets while Listing 4.2 uses immutable sets
is an implementation choice. From a user standpoint, the two ActivePassive classes
are equivalent. In the same way, Listing 4.3 is expressed in terms of pairs and func-
tions, while Listing 4.4 uses a class and methods instead. It slightly changes how you
write code—for example, activate(ap, A) versus ap.activate(A)—but underneath,
4 In Scala, fields introduced in the constructor are implicitly val, unless you use var explicitly.
both implementations are the same. They both represent active–passive sets as pairs of
immutable sets, either using a built-in pair type, or our own, two-field ActivePassive
class as a wrapper. Both implementations share the same fundamental property that
active–passive sets are immutable, and activation/deactivation always creates a new set.
When using a true functional programming language, you are likely to write this
immutable variant directly in terms of functions. However, with a hybrid object-oriented/
functional language, using functional objects is an attractive option that arguably re-
sults in cleaner code than a purely functional implementation.
You may wonder about the relative performance of each implementation—the ques-
tion of performance always comes up when discussing immutability. Immutable active–
passive sets incur two performance costs:
• Activation and deactivation of elements require the allocation of a new wrapping
container for the two sets—either a pair or an instance of class ActivePassive.
Discarded pairs, sets, and wrappers also create additional work for the garbage
collector.
• Methods += and -= on mutable sets are likely to be slightly more efficient than +
and - on immutable sets, though not necessarily by much.
Overall, these disadvantages are likely to be minor compared to the benefit of simpler
and safer data sharing and should not stop you from going the immutable route if that’s
what makes sense for your application. As always, there is no need to worry about
minor differences in performance until profiling has established that they contribute to
a bottleneck in your application.
Chapter 5
Pattern Matching and
Algebraic Data Types
Most functional programming languages—as well as many of the hybrid languages that
aspire to support functional programming—define a form of pattern matching. Pattern
matching has many uses, from a simple switch-like construct to runtime type checking
and casting, but it is most effective when applied to algebraic data types, which combine
alternatives and aggregation.
Scala
arg match
case "--" => ... // end of options
case longOpt if longOpt.startsWith("--") => ... // long option
case shortOpt if shortOpt.startsWith("-") => ... // short option
case plain => ... // plain argument
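The remark that follows concerns pattern matching on exception types; the corresponding listing is elided here, but a sketch along these lines (the function name is assumed) illustrates the issue:
Scala
import java.io.IOException

def describe(error: Exception): String = error match
  case _: IOException           => "I/O error"
  case _: IllegalStateException => "illegal state"
  case _: Exception             => "other error" // must come last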
Note how ordering again is important: The pattern case _: Exception would catch
I/O and illegal state exceptions if it appeared at the top.
A functional switch with guards can be very useful, but to fully leverage the power of
pattern matching, you need to apply it to composite types and types with alternatives.
These types are sometimes referred to as algebraic data types (see the aside on sum and
product types at the end of Section 5.5). The next few sections explore several common
types of that nature.
5.2 Tuples
The simplest way to compose two types is to aggregate them into a pair. For instance,
(String, Int) is Scala’s type for pairs in which the first value is a string and the second
value is an integer, like ("foo", 42) or ("bar", 0). This generalizes to N-tuples.
You can use pattern matching not only to “switch” between tuples—for instance, to
select pairs in which the second value is zero—but also to extract values from the tuple:
Scala
val pair: (String, Int) = ("foo", 42)
pair match
case (str, 0) => ... // no match because the number in pair is not zero
case (str, n) => ... // str is the string "foo", n is the integer 42
This is equivalent to
Scala
val str = pair(0)
val n = pair(1)
Scala
case class Record(city: String, temperature: Int) // the class name is assumed; fields match the usage below
val rec = Record("Phoenix", 122)

rec.city // "Phoenix"
rec.temperature // 122
By using a case class instead of a tuple, you can sometimes improve code legibility—
rec.city instead of rec(0)—while keeping the convenience of pattern matching. In
particular, you can define separate, distinct types that all contain the same type com-
ponents, like a city and a temperature, or a label and a count, or a unit of time and a
duration, instead of the more generic (String, Int). Case classes are used for conve-
nience in several of this book’s case studies and illustrations.
1 Case classes also differ from regular classes in a few other ways (they redefine equality and string
representation, and their constructor arguments are implicitly public val variables), but they are mostly
used in this book for the purpose of pattern matching.
5.3 Options
Tuples aggregate values but offer no choice: A pair is always made of two values. Other
types are defined to take one of multiple forms as alternatives. Options are a simple
and widely used type with two alternatives: An option can contain a value or it can be
empty. Consider the following example:
Scala
val someNum: Option[Int] = Some(42)
val noNum: Option[Int] = None
A value of type Option[Int]2 represents either a single integer or nothing. You can use
it as the return type of a function that is not guaranteed to produce a result, such as
a search that may or may not find what it is looking for. Returning options is much
preferable to using null for this purpose, a topic we will explore in Chapter 13.
You can use pattern matching on options:
Scala
optNum match
case Some(x) => if x > 0 then x else 0
case None => 0
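A second variant uses a pattern guard and the wildcard:
Scala
optNum match
  case Some(x) if x > 0 => x
  case _ => 0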
The underscore pattern handles any value that did not match the first pattern, in this
case an option with a non-positive number or an empty option. Note that you can flip
the patterns in the first variant, but not the second:
2 Scala’s Option type is called Optional in Java, and Maybe in other programming languages.
Scala
optNum match
case None => 0
case Some(x) => if x > 0 then x else 0
This code still works, but this next variant does not:
Scala
// DON'T DO THIS!
optNum match
case _ => 0
case Some(x) if x > 0 => x
The underscore pattern matches anything, and the second pattern is unreachable.
Patterns can be nested arbitrarily. In particular, you can use “@” to capture a com-
ponent of a composite type while breaking it into its own components at the same
time. For instance, if an expression is a list of options of pairs, pattern matching can be
applied to the list, and to an option inside the list, and to a pair inside an option, all
within the same pattern:
Scala
List(Some(("foo", 42)), None) match
case (head @ Some(str, n)) :: tail => <expr>
Inside expression <expr>, variables head, str, n, and tail denote the values Some(("foo", 42)), "foo", 42, and List(None), respectively.
Finally, note that Scala defines additional list patterns for convenience. Instead of
“x :: Nil”, which matches a list with a single element, you can use the more readable
pattern List(x). In the same way, a list of exactly three elements can be matched us-
ing List(x, y, z) instead of the awkward x :: y :: z :: Nil. Some of these more
readable patterns are used in later code examples.
5.5 Trees
In this chapter, we have already applied pattern matching to several algebraic data
types: tuples, options, and lists. Tuples are used to aggregate multiple types, options
are an alternative between a type or nothing, and lists are defined both in terms of an
alternative between empty and non-empty lists and as the aggregation of a head and a
tail.
In addition to involving both alternatives and aggregation, the list type has the
remarkable property of being defined inductively: Either a list is empty, or it consists of
the aggregation of a value (the head) and another list (the tail). A data type like this is
said to be recursive: Inside a (non-empty) list, there is a list. Trees are another classic
recursive data type, frequently used in programming.
Figure 5.1 A Boolean expression represented as a tree: the values true and false at the leaves, with operators such as ∨ and ¬ at the internal nodes.
Scala
enum BoolExpr:
case T
case F
case Not(e: BoolExpr)
case And(e1: BoolExpr, e2: BoolExpr)
case Or(e1: BoolExpr, e2: BoolExpr)
As with lists, the tree type involves alternatives (five possibilities), aggregation (left and
right operands of And and Or), and recursion (Not, And, and Or are trees that contain
subtrees). Using this type, the tree in Figure 5.1 can be written as a corresponding nested BoolExpr expression.
Trees can be processed very naturally using pattern matching to switch between
alternatives and to extract subtrees. For instance, you can write a function that evaluates
a tree to its truth value:
Scala
def eval(expr: BoolExpr): Boolean =
import BoolExpr.*
expr match
case T => true
case F => false
case Not(e) => !eval(e)
case And(e1, e2) => eval(e1) && eval(e2)
case Or(e1, e2) => eval(e1) || eval(e2)
Given the recursive nature of trees, it is not surprising that function eval is itself
recursive. Recursion is the topic of Chapter 6.
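For instance, with made-up expressions:
Scala
import BoolExpr.*

eval(And(Or(T, Not(F)), T)) // true
eval(Not(Or(F, T)))         // false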
Explorations of types like the options, pairs, lists, and trees used in this chapter
are often framed in terms of sum and product types. You do not need to know
the names to effectively use these types, but a brief discussion of this terminology
can help you better understand the algebraic nature of such types.
Consider two Scala types defined as follows:
Scala
type Stooge = "Larry" | "Curly" | "Moe"
type Digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
For the purpose of our discussion, types can be thought of as sets of values:
The type Boolean contains two values, true and false; Int has 2³² possible values; and String has an infinite number of values. Here, we rely on two small
types: Stooge, with three values, and Digit, with ten values.
Product types are used for aggregation. For instance, the pair type
(Stooge, Digit) contains values that consist of one stooge and one digit:
Scala
type StoogeAndDigit = (Stooge, Digit)
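Sum types represent alternatives instead. The definition of the Number sum type discussed next is elided here; a sketch consistent with the text (ten Pos values, ten Neg values, plus Zero, for 21 values in total) is:
Scala
enum Number:
  case Zero
  case Pos(digit: Digit)
  case Neg(digit: Digit)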
Pos(3) and Neg(3) are different values, and the Number type contains 21
values. The name Zero is no more relevant than the names Pos and Neg, and
the type can be denoted as Digit + Digit + 1, where “1” represents the single-
valued type that contains Zero.
Accessing a list element through an index is a linear-time operation: It takes time proportional to the value of the index. Furthermore, to change
a value at index n, you would have to discard n elements from the list, and allocate a new
list of length n. For instance, if the index points to the first “a” in list [P,l, a ,t,o],
and you want to replace it with “u” to produce [P,l, u ,t,o], you will need to create
a new list [P,l,u]:
The two lists then share the suffix [t,o]: the cells for “t” and “o” are common to Plato and Pluto, while “P”, “l”, and “u” are newly allocated.
The zipper is a clever data structure that avoids this drawback. It is implemented as
a pair of lists (left, right): The second list contains the elements right of the cursor; the
first list contains the elements left of the cursor, but in reverse order. All the necessary
operations—moving the cursor left or right, and querying and updating the element
under the cursor—can then be implemented in constant time.
To help visualize zippers, consider again the list [P,l, a ,t,o] with current ele-
ment “a”. It is represented as this zipper:
Scala
(List('l', 'P'), List('a', 't', 'o'))
After moving the cursor one position to the right, onto “t”, the zipper becomes:
Scala
(List('a', 'l', 'P'), List('t', 'o'))
To implement zippers, first define a Zipper type as a pair of lists. A zipper value is
constructed from a non-empty list of elements:
Scala
type Zipper[A] = (List[A], List[A])
def fromList[A](list: List[A]): Zipper[A] = (List.empty, list) // cursor at the leftmost position
When you construct a zipper from a list, the cursor is in the leftmost position (no ele-
ments in the left list). A zipper always contains at least one value (the current element),
and the right list in a pair is never empty (a proper implementation of fromList would
reject an empty list argument with an exception).
The head of the right list is the element under the cursor. It can be queried and
updated efficiently:
Scala
def get[A](zipper: Zipper[A]): A = zipper match
case (_, x :: _) => x
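The corresponding update replaces the head of the right list (the function name set is assumed here); moving the cursor to the right can then be written with a pattern guard, as in this sketch:
Scala
def set[A](zipper: Zipper[A], value: A): Zipper[A] = zipper match
  case (left, _ :: right) => (left, value :: right) // the right list is never empty

def moveRight[A](zipper: Zipper[A]): Zipper[A] = zipper match
  case (left, x :: right) if right.nonEmpty => (x :: left, right)
  case _ => zipper // cursor already on the last element: no move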
Scala
def moveRight[A](zipper: Zipper[A]): Zipper[A] = zipper match
case (left, x :: (right @ _ :: _)) => (x :: left, right)
case _ => zipper
This also works but, unless you are programming in a language without pattern condi-
tions (like SML), the variant with a condition is more readable.
Finally, a zipper can be turned back into a list if needed:
Scala
def toList[A](zipper: Zipper[A]): List[A] = zipper match
case (left, right) => left.reverse ::: right
5.7 Extractors
We have seen how pattern matching can be used to switch between alternatives and
to extract components of a composite type. As a generalization, pattern matching can
sometimes be used to decompose into parts a type that is not naturally composite.
As an example, consider the problem of extracting the alpha, red, green, and blue
components from a color represented as a 32-bit ARGB integer. It can be implemented
as follows:
Scala
object ARGB:
  def unapply(argb: Int): (Int, Int, Int, Int) =
    var bits = argb
    val b = bits & 0xFF
    bits >>>= 8
    val g = bits & 0xFF
    bits >>>= 8
    val r = bits & 0xFF
    bits >>>= 8
    val a = bits & 0xFF
    (a, r, g, b)
In Scala, the name unapply of the function plays a special role. With object ARGB in
scope, you can now use pattern matching to split a number into its color elements:
Scala
val ARGB(a, r, g, b) = 0xABCDEF12
5.8 Summary
• Pattern matching is a common feature of functional programming languages. In its
simplest form, you can think of it as a powerful form of a switch expression. It can
be used to test Boolean conditions, compare values to constants, or check runtime
types.
• The real power of pattern matching, however, comes from being used to process
algebraic data types. What characterizes these types is that they are defined in
terms of alternatives (also called sums) and aggregations (also called products) of
other types. Pattern matching can be used to switch between alternatives, and to
extract components in aggregated types.
3 In Scala, the r method creates a regular expression out of a string. Also, the s prefix enables interpolation in string literals: s"$x" is a string that contains the value of x.
Chapter 6
Recursive Programming
What would the processing of a collection look like in the functional world? First, you
would replace the mutable collection with an immutable one. Then, since you cannot
actually remove anything from an immutable collection, you would need to replace
the action processOne with a function1 that processes an element and returns a new
collection with that element removed:
Scala
// processes one element, and returns a collection with this element removed
def processOne[A](collection: ImmutableCollection[A]): ImmutableCollection[A] = ...
So far, so good. Now, you need a way to use this function to process an entire
collection, one element at a time. The same approach as before will not do: If
collection.nonEmpty is true, processOne(collection) will not change that—the col-
lection is immutable. Therefore, collection.nonEmpty will remain true, and the loop
will never terminate. Can a loop be used at all? Yes, and this program would work:
Scala
def processCollection[A](collection: ImmutableCollection[A]): Unit =
var remaining = collection
while remaining.nonEmpty do
remaining = processOne(remaining)
This gets the job done, but at the cost of reintroducing mutation through an assign-
ment statement: Something in the body of the loop must have some side effect, and
since collection cannot change, a mutable variable remaining is introduced. (The
same approach was used in Section 3.6 to implement a mutable state in terms of an im-
mutable data structure.) The loop-based solution works only because there is mutation.
To achieve a fully immutable style—no var—you need to take a different route.
If a function f is free of side effects, the repetition f(x); f(x); f(x) is pointless.
Think of code that executes Math.sqrt(4.0); Math.sqrt(4.0); Math.sqrt(4.0).
What is the point? Instead, the output of f must be used, typically as the input to
further computing. Instead of loops, what you need is a mechanism that can arbitrarily
nest function calls. This mechanism is recursion.
Let’s go back to the problem of processing an entire immutable collection. A func-
tion needs to process one element from the collection and then process, in the same
fashion, a new collection with that one element removed. In other words, you need
to invoke processOne on the value produced by a previous call to processOne. Since
processCollection is designed to invoke processOne, all you need to do is to
call processCollection again on the output of processOne:
Scala
def processCollection[A](collection: ImmutableCollection[A]): Unit =
if collection.nonEmpty then processCollection(processOne(collection))
1 Or an impure function, as long as it has no side effect on the collection itself, which is all that matters for the purpose of this illustration.
The output of processOne is processed through another call to function processCollection, and so on. The computation that
processes the entire list is processOne(processOne(processOne(...))), which is of
the form f(f(f(x))), not f(x); f(x); f(x). No mutable variable is involved.
You can always replace code that uses loops and mutable variables with code that
uses recursion and immutable variables instead. Function composition combined with
recursion has the same expressive power as sequential composition combined with loops.
Anything that can be computed using one of these approaches can be computed using
the other.
Figure 6.1 The successive positions, numbered 0 to 7, of the three discs as they are moved from the left peg to the right peg.
The strategy is recursive: To move n discs from the left peg L to the right peg R, first move n − 1 discs from L to M
using R; then move one disc from L to R; then move n − 1 discs from M to R using L.
If n = 0, there is nothing to do. This strategy can be translated into a recursive function
that prints all the steps needed to move a given number of discs:
Scala
def hanoi[A](n: Int, from: A, middle: A, to: A): Unit =
if n > 0 then
hanoi(n - 1, from, to, middle)
println(s"$from -> $to")
hanoi(n - 1, middle, from, to)
This function does nothing if n = 0. Otherwise, a first recursive call is used to move
n−1 discs from the left peg to the middle peg, using the right peg as temporary storage.
The effect of this call is to leave the nth disc free (position 3 in Figure 6.1). The print
statement then moves this disc from left to right (position 4 in Figure 6.1). A second
recursive call moves n − 1 discs from their storage peg (middle) to the right peg, using the
left peg as storage. When called as hanoi(3, 'L', 'M', 'R'), this function produces
the same sequence of steps as Figure 6.1:
L -> R
L -> M
R -> M
L -> R
M -> L
M -> R
L -> R
What makes the Tower of Hanoi a compelling example is that there is no straight-
forward loop-based equivalent that implements the same algorithm.3 One reason for
this is that function hanoi makes two recursive calls to itself, compared to the single
call needed in the factorial or greatest common divisor examples.
3 Iterative solutions do exist that rely on loops, but their correctness is harder to justify, and even the simplest loop-based program is still more complex than the recursive function presented here.
Keep this principle in mind to help catch mistakes quickly. For instance, consider
this incorrect attempt to define a function to extract the last element of a list:
Scala
// DON'T DO THIS!
def last[A](list: List[A]): A = list match
case Nil => throw NoSuchElementException("last(empty)")
case _ :: tail => last(tail)
When a split produces a part as large as the original list, the function is applied recursively to a sorting problem that is not smaller. This
mergeSort function is incorrect: It does not terminate when called on a non-empty
list.
When a problem has multiple parameters, it is sometimes enough that one of them,
or a combination of them, becomes smaller. For example, a pair that consists of a
positive number and a list can become smaller by the number getting smaller, the
list getting shorter, or the sum of the number plus the length of the list getting
smaller, among other ways.
Some cases are more complicated and necessitate more refined notions of “smaller.”
For instance, in this classic example (sometimes known as the Ackermann func-
tion),4 the computation is guaranteed to terminate because the pair (x,y) de-
creases in lexicographic ordering: Either x decreases, or x stays unchanged and y
decreases:
Scala
def a(x: BigInt, y: BigInt): BigInt = (x, y) match
case (0, _) => y + 1
case (_, 0) => a(x - 1, 1)
case _ => a(x - 1, a(x, y - 1))
4 Although the computation is guaranteed to terminate, its evaluation will run out of stack space on most inputs. The Ackermann function was defined for the theoretical study of recursive functions and has no practical purpose.
A binary tree is either empty, or is made of a value and two children, typically named left and right, which are themselves binary trees:5
Scala
enum BinTree[+A]:
case Empty
case Node(value: A, left: BinTree[A], right: BinTree[A])
You can then calculate the size of a tree by relying on the size of its forest, and the size
of a forest by relying on the sizes of its trees:
Scala
def treeSize[A](tree: Tree[A]): Int = tree match
case Empty => 0
case Node(_, trees) => 1 + forestSize(trees)
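The listing for forestSize is elided here; a sketch consistent with the text, assuming a general Tree type whose Node carries a list of child trees, is:
Scala
def forestSize[A](trees: List[Tree[A]]): Int = trees match
  case Nil => 0
  case tree :: rest => treeSize(tree) + forestSize(rest)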
5 Ignore the “+” sign in front of type parameter A. Although necessary here, it is not relevant to our
discussion. If curious, you can read about it in Section 15.8 on type variance.
6.5 Tail Recursion
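The discussion that follows refers to a call pattern whose listing is elided here: a function f whose very last action is to call some function g. A minimal sketch, with the variable names used below:
Scala
def g(t: Int, u: Int, v: Int): Int = t + u + v

def f(x: Int, y: Int, z: Int): Int =
  val t = x + y
  val u = y - z
  val v = x * z
  g(t, u, v) // tail call: nothing remains to be done in f afterward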
In this pattern, when control returns from the call to g, it will then immediately return
from the call to f—function f has nothing to do after the call to g. In particular, by
the time function f initiates its call to function g, its x, y, and z arguments are no
longer needed (and neither are its local variables). They can be popped from the stack
before branching into g, thus limiting stack growth—pop a frame, then push a frame.
Furthermore, if the call is a recursive call (f and g are the same function), the whole
stack frame can be reused by reassigning variables x, y, and z to have the values t, u,
and v, and then branching back to the beginning of f. In that case, you can execute the
computation as a loop, without using any stack at all.
Some functional languages may guarantee that some or all tail calls are optimized,
while others might focus solely on tail recursive calls, and still others might leave the
decision to specific compiler implementations. Hybrid languages, in which functions
are implemented in terms of methods, tend to face a more challenging situation, since
method invocation typically involves its own unique characteristics (see the discussion
of dynamic dispatching in Section 15.7).
As an experiment, consider the small class shown here, with a single recursive
method:
Scala
class TailRecursionTest:
def zero(x: Int): 0 = if x == 0 then 0 else zero(x - 1)
The useless method always returns zero (hence its name and return type).
The bytecode generated for it by the Scala compiler looks something like this:
bytecode
public int zero(int);
Code:
0: iload 1
1: iconst_0
2: if_icmpne 9
5: iconst_0
6: goto 16
9: aload 0
10: iload 1
11: iconst_1
12: isub
13: invokevirtual #16 // Method zero:(I)I
16: ireturn
The details are unimportant; the key instruction is the invokevirtual on line 13: the recursive call is compiled as an actual method invocation, because dynamic dispatching prevents the optimization. When dynamic dispatching is
disabled, by making zero a final method, for instance, the bytecode generated
by the compiler changes:
bytecode
public final int zero(int);
Code:
0: aload 0
1: astore 2
2: iload 1
3: istore 3
4: iload 3
5: iconst_0
6: if_icmpne 13
9: iconst_0
10: goto 30
13: aload 2
14: astore 4
16: iload 3
17: iconst_1
18: isub
19: istore 5
21: aload 4
23: astore 2
24: iload 5
26: istore 3
27: goto 31
30: ireturn
31: goto 4
34: athrow
35: athrow
Again, the details are unimportant, but the essential thing to notice is that
the invokevirtual instruction is gone. Instead, we see goto 4 on line 31: The
recursive function is now implemented as a loop.
Scala
def search(sortedSeq: IndexedSeq[String], target: String): Option[Int] =
var from = 0
var to = sortedSeq.length - 1
while from <= to do
val middle = (from + to) / 2
sortedSeq(middle) match
case midVal if target > midVal => from = middle + 1
case midVal if target < midVal => to = middle - 1
case _ /* found at middle */ => return Some(middle)
end while
None
This function works by searching a slice of the array between from and to, initialized to
the entire array. Each loop iteration starts by looking at the value in the middle of the
range, midVal. If the target is larger than the midpoint value, the function keeps looking
in the upper part of the range, between middle + 1 and to, using the fact that the
sequence is sorted. If the target is less than the midpoint value, the function continues
instead with the lower part of the range, between from and middle - 1. Otherwise,
the value must be equal to the target, so the function returns the index at which it was
found. Each new iteration of the loop performs a search in a smaller range. The search
ends when the target value is found or the range becomes empty (from > to).
Instead of using a loop, you can implement these successive searches with recursive
calls instead:
Scala
def search(sortedSeq: IndexedSeq[String], target: String): Option[Int] =
@tailrec
def doSearch(from: Int, to: Int): Option[Int] =
if from > to then None
else
val middle = (from + to) / 2
sortedSeq(middle) match
case midVal if target > midVal => doSearch(middle + 1, to)
case midVal if target < midVal => doSearch(from, middle - 1)
case _ /* found at middle */ => Some(middle)
doSearch(0, sortedSeq.length - 1)
Even though one uses a loop and the other uses recursion, these two implementa-
tions are fairly similar. Recursive calls to doSearch bring the computation back to the
calculation of a new midpoint, just like the loop does. Indeed, function doSearch is tail
recursive and is compiled in Scala as a loop. Thus, instead of invoking doSearch again,
the recursive calls are implemented by updating a local variable—either from or to—
and by jumping back to the beginning of the function, just like in the loop variant.
After compilation, both implementations use bytecode that, for all practical purposes,
is equivalent. Which one you decide to write is a matter of taste.
Sometimes, tail recursion happens quite naturally, without having to think too much
about it, as in the binary search example. Function last from Listing 6.4, for instance,
is tail recursive too. Of course, it is also incorrect, but that can be fixed without losing
tail recursion:
Scala
@tailrec
def last[A](list: List[A]): A = list match
case Nil => throw NoSuchElementException("last(empty)")
case head :: tail => if tail.isEmpty then head else last(tail)
Listing 6.10: Correct tail recursive implementation of last; fixed from Lis. 6.4.
Sometimes, though, a function is not naturally tail recursive. For instance, the
factorial function from Listing 6.2 is not tail recursive. The recursive call is not
in a tail position: After control returns from it, the number still needs to be multiplied
by n before a value is returned. If you need an optimized implementation that does
not grow the execution stack, you can switch back to using a while-loop, if the pro-
gramming language supports it. Otherwise, you need to rewrite the recursive function
slightly differently. The standard trick consists of adding a second argument that serves
as an accumulator for the value being calculated:
Scala
def factorial(n: Int): Int =
@tailrec
def loop(m: Int, f: Int): Int = if m == 0 then f else loop(m - 1, m * f)
loop(n, 1)
Listing 6.11: Tail recursive implementation of factorial; contrast with Lis. 6.2.
The second argument of function loop contains the part of the factorial already calculated, namely the product (m+1) × (m+2) × · · · × n. The multiplication m * f takes place before the recursive call, which is
now in tail position. Function loop is tail recursive, and is compiled as a loop. However,
the elegance of the simple recursive implementation from Listing 6.2 is somewhat lost.
Functions that involve multiple recursive calls are never tail recursive—at most one
recursive call can be in tail position. Can they be made tail recursive? Yes, but at
the cost of introducing additional data structures. Consider, for instance, the case of
function size on binary trees in Listing 6.7. This function makes two recursive calls,
one on the left tree and one on the right tree. To trigger the tail recursion optimization,
you might be tempted to employ the same trick used for function factorial:
Scala
def size[A](tree: BinTree[A]): Int =
def loop(tr: BinTree[A], sz: Int): Int = tr match
case Empty => sz
case Node(_, left, right) => loop(right, loop(left, sz + 1))
loop(tree, 0)
One recursive call to loop is now in tail position and could potentially be optimized,
but the nested call loop(left, ...) still needs to use a stack frame: After control
returns from it, the loop(right, ...) call still needs to take place and still uses the
current frame (to access local variable right).
A true tail recursive variant needs to make a single recursive call only. You can
achieve this by introducing your own stack, as a list of trees:
Scala
def size[A](tree: BinTree[A]): Int =
@tailrec
def sizeSum(list: List[BinTree[A]], sum: Int): Int = list match
case Nil => sum
case Empty :: trees => sizeSum(trees, sum)
case Node(_, left, right) :: trees => sizeSum(left :: right :: trees, sum + 1)
sizeSum(List(tree), 0)
Function sizeSum computes the sum of the sizes of a list of trees. It is tail recursive.
When an empty tree is found in the list, it contributes nothing to the sum. When a
node is taken out of the list, the sum accumulator is incremented by one, and the node’s
children are added to the list of trees to be processed. The size of a tree is obtained by
applying function sizeSum to a list that contains only this tree. In effect, the execution
stack has been replaced with a regular list. As with factorial, tail recursion is achieved
at the cost of a loss of elegance.
Seeking to replace general recursion with tail recursion involves trade-offs. While
pushing and popping the execution stack is likely to be faster than list operations, stack
space tends to be limited. By contrast, the added list in function sizeSum is allocated
in heap memory alongside other objects. Heap memory is typically orders of magnitude
larger than stack space.
Other approaches exist to optimize recursive calls and trade heap space for stack
space. One notable technique, known as a trampoline, has been used to translate lan-
guages with tail-call optimization into languages without it. Trampolines are explored
as a case study in Chapter 14.
6.7 Summary
• Loops are a programming language mechanism that implements repetition. A loop
is only useful if its body implements some form of state change, either by mutating
objects or by reassigning variables.
Chapter 7
Recursion on Lists
NOTE
The Scala collection library implements many list operations as methods of its List type. In this
chapter, several are reimplemented as functions for the purpose of our discussion of recursion and
recursive algorithms. The code examples deliberately rely on previously defined functions instead of
the corresponding standard methods—for instance, writing head(list) where list.head could be
used—so the code is closer to what it could be in another programming language with functional
lists. This approach is specific to this chapter. Later parts of the book use the corresponding List
methods instead for convenience and readability.
Function last is based on the idea that, in general, the last element of a list is the last
element of its tail. In other words, it relies on the following equality:
last(list) = last(tail(list))
Because empty lists have no last element, this equality holds only when neither list nor
tail(list) is empty. These two special cases are handled non-recursively in function
last—case Nil and if isEmpty(tail)—and the recursive call then simply follows
the equality.
Every recursive algorithm is founded on such an equality. For instance, the size of
a tree equals the sum of the sizes of the children, plus one; and the good old factorial
implementation discussed earlier is based on the equality n! = n × (n − 1)!. To become
fluent in recursive programming, you need to learn to identify these equalities.
A frequent list pattern involves calculating a function f on a list by using a single
recursive call on the tail of the list:
f(list) = g(f(tail(list)))
Your goal is to calculate f(list), and you have the value f(tail(list)) to work with.
You get this value for free by the “magic” of recursion. Your job is to formulate and
implement function g.
In the case of function last, g is the identity function, and f(list) is simply
f(tail(list)). Many of the examples that follow rely on a function g, which may or
may not be written explicitly. For instance, function contains in Listing 7.1, which
searches for a target in a list, uses g(x) = (head(list)==target) || x. Function
length in Listing 7.3, which calculates the length of a list, uses g(x) = x + 1.
Once you figure out the general equality, the recursive function usually follows
straightforwardly. All you need is to take care of the special cases where the equal-
ity does not hold. They become the non-recursive cases of the computation, like short
lists in function last, or zero in factorial. The equality itself defines the recursion.
The next several sections illustrate this principle by looking at typical list calculations
and by deriving for each a recursive function from an equality and one or more special
cases.
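7.2 Traversing Lists
The first example is a function contains, which searches a list for a target value. A list contains a target if the head of the list is the target, or if the tail of the list contains the target:
contains(list, target) = (head(list) == target) || contains(tail(list), target)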
The equality is ill defined for the empty list, which has neither head nor tail, and which
needs to be treated as an easy special case (the empty list contains nothing):
Scala
@tailrec
def contains[A](list: List[A], target: A): Boolean =
!isEmpty(list) && (head(list) == target || contains(tail(list), target))
Listing 7.1: Simple list lookup, using tail recursion; see also Lis. 7.2.
Note how the code practically reads itself: “A list contains a target if it is non-empty
and either the first element of the list is the target, or the target is somewhere else in
the list.”1 The function is tail recursive because the logical operator “||” uses shortcut
evaluation in Scala (as in C or Java) and thus is equivalent to the following expression:
Scala
(if head(list) == target then true else contains(tail(list), target))
With “||” expanded as if-then-else, the tail recursion appears more clearly.
Instead of relying on head and tail functions, you can use pattern matching:
Scala
@tailrec
def contains[A](list: List[A], target: A): Boolean = list match
case Nil => false
case head :: tail => head == target || contains(tail, target)
Listing 7.2: Simple list lookup, using tail recursion and pattern matching.
In the remainder of this chapter—and throughout the book—code illustrations mostly
follow this pattern-matching style, which I often find easier to read.
The next example is a recursive function to calculate the length of a list. A non-
empty list contains as many elements as its tail, plus one for the head of the list:
length(list) = 1 + length(tail(list))
After the empty list, of length zero, is handled as a special case, the implementation
has the following form:
Scala
def length[A](list: List[A]): Int = list match
case Nil => 0
case _ :: tail => 1 + length(tail)
Listing 7.3: List length, not tail recursive; contrast with Lis. 7.4.
A drawback of this implementation is that the function is not tail recursive. Its evalua-
tion requires as many stack frames as there are values in the list and is likely to fail on
large lists by running out of space.
You can derive a tail recursive variant by applying the same transformation used
to write a tail recursive factorial function: Add a second argument for the length,
and update it before the recursive call. Helper function addLength adds the length of a
list to a given accumulator. The length function is then implemented by using addLength
to add the length of a list to zero:
1 For this reason, functional programming is often said to be more declarative than imperative
programming.
Scala
def length[A](list: List[A]): Int =
@tailrec
def addLength(theList: List[A], len: Int): Int = theList match
case Nil => len
case _ :: tail => addLength(tail, len + 1)
addLength(list, 0)
Listing 7.4: List length, using tail recursion; contrast with Lis. 7.3.
Function addLength is tail recursive. The whole length function can now be compiled
as a loop, which will avoid stack overflow issues.
7.3 Returning Lists
As a first example, consider the standard function drop, which is used to remove
the first n elements from a list: drop(List(A,B,C,D), 2) is List(C,D). Its recursive
equality follows from the fact that, to remove n elements from a list, you need to first
remove the head of the list, and then remove n − 1 elements from the tail (Figure 7.1):
drop(list, n) = drop(tail(list), n - 1)
For the equality to be valid, the list needs to have a tail, and n − 1 cannot be a negative
number, since negative numbers make no sense in this context. From this observation,
two special cases follow, one for the empty list and one for n = 0:
Scala
@tailrec
def drop[A](list: List[A], n: Int): List[A] =
if n == 0 then list
else list match
    case Nil => list
    case _ :: tail => drop(tail, n - 1)
Listing 7.5: Recursive implementation of drop on lists; see also Lis. 7.6.
Function drop takes two arguments: a list and a number n of elements to drop. The
special case n = 0 (no elements removed) is handled by returning the list unchanged.
You can handle the other special case (empty list) in two different ways, depending on
the desired semantics of removing elements from an empty list:
• There are no elements to remove in an empty list: Throw an exception.
• Whatever is removed, an empty list remains empty: Return an empty list.
Some programming languages, like ML, follow the first interpretation. The Scala stan-
dard library uses the second interpretation, which I also chose here. Function drop is
tail recursive and does not incur stack space usage at runtime.
As an alternative to if-then-else followed by pattern matching, you can use pat-
tern matching directly on both the number and the list by joining them as a pair:
Scala
@tailrec
def drop[A](list: List[A], n: Int): List[A] = (list, n) match
case (_, 0) | (Nil, _) => list
case (_ :: tail, _) => drop(tail, n - 1)
Listing 7.6: Recursive implementation of drop on lists; see also Lis. 7.5.
Pattern (_, 0) is the case n = 0, and pattern (Nil, _) is the empty list case. In both
cases, the function returns the list unchanged.
Function drop also gives you access to a list element by its position: To implement a function getAt, drop the first i elements, and the desired value is the head of the remaining list (fetching a value from an empty list triggers an exception):
Scala
def getAt[A](list: List[A], i: Int): A = drop(list, i) match
case Nil => throw NoSuchElementException("getAt(empty)")
case value :: _ => value
Listing 7.7: Implementation of getAt on lists, using drop.
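Function take is the converse of drop: It keeps the first n elements of a list instead of removing them. Its natural implementation follows from the equality take(list, n) = head(list) :: take(tail(list), n - 1), with the same two special cases as drop, except that nothing remains to be taken from an empty list (a reconstruction; the recursive case is quoted verbatim in the note at the end of this section):
Scala
def take[A](list: List[A], n: Int): List[A] = (list, n) match
  case (_, 0) | (Nil, _) => Nil
  case (head :: tail, _) => head :: take(tail, n - 1)
Listing 7.8: Recursive implementation of take; contrast with Listings 7.21 to 7.23.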
A key difference between functions take and drop is that in the case of drop, the
input and output lists can share data in memory, but function take needs to allocate a
new list to store the elements being extracted (Figure 7.3). Note also that, unlike drop,
function take is not tail recursive. It works by pushing the first n elements of the input
list onto the execution stack, then building a list from them as the function calls return.
2 This is especially true in Scala, which uses the tempting syntax list(i) and array(i) to access
elements by index: The two expressions look the same, but the list lookup takes time proportional to the index.
NOTE
Section 7.9 briefly discusses building lists with limited use of the execution stack, and possible scal-
able implementations of function take are discussed. However, many code examples throughout
this book continue to build lists using the execution stack, as in Listing 7.8. My motivation for
using this approach is to avoid unnecessary distractions and to keep code illustrations focused on
the concepts at hand. In this section, which focuses on the equalities at the core of recursive algo-
rithms, I am willing to sacrifice library-level performance and robustness for the benefit of code that
draws more clearly from the corresponding equality. In the implementation of take, for instance,
the code head::take(tail, n - 1) and the equality head(list)::take(tail(list), n - 1)
are almost identical. Within the context of this book, it is not worth losing this clarity to improve
the code’s performance.
As for the underlying equality, the concatenation of list1 and list2 produces, in
general, a list that starts with the head of list1 and continues with the concatenation
of the tail of list1 with list2 (Figure 7.4):
concat(list1, list2) = head(list1) :: concat(tail(list1), list2)
For this equality to make sense, the first list needs to have a head and a tail (the second
list can be empty). You handle the special case where the first list is empty by observing
that the concatenation of an empty list and another list is the other list itself:
Scala
def concat[A](list1: List[A], list2: List[A]): List[A] = list1 match
case Nil => list2
case head1 :: tail1 => head1 :: concat(tail1, list2)
Listing 7.9: Recursive implementation of list concatenation.
Lists can contain elements that are themselves lists. A list of lists can be flattened
by repeated concatenation:
Scala
flatten(List(List(1, 2, 3), Nil, List(4, 5), List(6))) // List(1, 2, 3, 4, 5, 6)
List(1, 2, 3) ::: Nil ::: List(4, 5) ::: List(6) // List(1, 2, 3, 4, 5, 6)
The list returned by function flatten begins with the elements of the first inner list
and continues by concatenation with the result of flattening the tail (Figure 7.5):
flatten(list) = concat(head(list), flatten(tail(list)))
The actual implementation mimics the equality, adding only a special consideration for
the empty list:
Scala
def flatten[A](list: List[List[A]]): List[A] = list match
case Nil => Nil
case head :: tail => concat(head, flatten(tail))
7.6 Recursion on Sublists Other Than the Tail
Recursion does not have to go through the tail of a list. Consider a function group that splits a list into consecutive sublists of a given length k. All the inner lists have length k, except possibly the last list, which may contain fewer than k elements.
than k elements. The corresponding recursive equality is based on the following idea.
First, use function take to take the first k elements of the main list—they form the first
inner list. Then use function drop to remove these first k elements from the main list
and process the remaining elements recursively:
group(list, k) = take(list, k) :: group(drop(list, k), k)
Functions group, take, and drop are all well defined on the empty list, leaving no special
cases to consider:
Scala
// DON'T DO THIS!
def group[A](list: List[A], k: Int): List[List[A]] =
take(list, k) :: group(drop(list, k), k)
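As written, however, this function never terminates: Functions take and drop are indeed well defined on an empty list, but because drop(Nil, k) is simply Nil, the evaluation of group(Nil, k) triggers another call to group(Nil, k), and so on forever. The empty list must still be handled as an explicit, non-recursive case.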
Consider now the problem of reversing a list. The reverse of a non-empty list can be expressed as the reverse of its tail, followed by the head, using a function append that adds a single element at the end of a list:
reverse(list) = append(reverse(tail(list)), head(list))
While this equality is indeed valid on a non-empty list, using it results in an unacceptable
reverse function:
Scala
// DON'T DO THIS!
def reverse[A](list: List[A]): List[A] = list match
case Nil => list
case head :: tail => append(reverse(tail), head)
Although this function correctly produces a list in reverse order, it suffers from two
major flaws. First, it is not tail recursive and will run out of stack space when applied
to large lists. Second, and most importantly, its performance is unsatisfactory, even for
those lists that fit in the execution stack.
If the list being reversed contains n elements, the length of reverse(tail) is
n − 1, and the evaluation of append necessitates n − 1 operations. But inside the
reverse(tail) computation, there is another call to append on a list with n − 2 ele-
ments; and inside the next nested computation is another call to append on a list of
length n − 3; and so forth. Overall, (n − 1) + (n − 2) + · · · + 1 = (n × (n − 1))/2 operations
are needed. As a result, the computing time of this reverse function is proportional to
the square of the length of the list being reversed.
To avoid this, introduce a second list argument:
Scala
def reverse[A](list: List[A]): List[A] =
@tailrec
def addToStack(rem: List[A], rev: List[A]): List[A] = rem match
case Nil => rev
case top :: bottom => addToStack(bottom, top :: rev)
addToStack(list, Nil)
This function proceeds by repeatedly adding the head of the list being reversed to the
front of an accumulator list rev. You can think of it as peeling cards from a deck and
putting them onto the table: In the end, the resulting deck of cards is the reverse of the
original deck.
Contrary to the previous attempt, this implementation only uses “::” directly to
build the resulting list, and is linear in complexity. As a simple experiment on the
desktop computer used to typeset this book, it took approximately 75 milliseconds
to reverse a list of 5000 numbers using the first implementation. This time decreased to
0.027 millisecond when the improved variant was used, a reduction to less than 1/2000 of the original time.
Incidentally, this implementation has also become tail recursive, and can handle lists of
arbitrary length.
There may be situations where you are tempted to append to a list because prepend-
ing would not produce a list in the right order. Don’t do it. Instead, build the list in the
wrong order, efficiently, and then reverse it. For instance, to extract space-separated to-
kens from a stream of characters, it would feel natural to use append to add a character
to the current token, then append again to add a token to a list of tokens. Instead, you
can use a combination of prepending and reversing:
Scala
def tokenize(stream: List[Char]): List[String] =
def addToken(token: List[Char], tokens: List[String]): List[String] =
if isEmpty(token) then tokens else reverse(token).mkString :: tokens
@tailrec
def loop(stream: List[Char], token: List[Char], tokens: List[String]): List[String] =
stream match
case w :: chars if w.isWhitespace => loop(chars, Nil, addToken(token, tokens))
case c :: chars => loop(chars, c :: token, tokens)
case Nil => addToken(token, tokens)
  reverse(loop(stream, Nil, Nil))
NOTE
Efficient sorting is typically not performed directly on lists. Instead, list values can be stored in a
temporary array, the array sorted, and the list reconstructed. Furthermore, practical sorting functions
should be parameterized by the type of the elements being sorted and the criterion used to order
them. This section focuses on direct sorting of lists of integers because it offers a good illustration
of recursive patterns. The resulting functions suffer from limitations and inefficiencies that would
likely be unacceptable in production code.
Consider the problem of sorting a list of integers in increasing order. You can decompose
a non-empty list into a head and a tail and apply recursion to sort the tail. You are then
left with a single value (the head) and a sorted list (the sorted tail). All that is needed
to complete the sorting function is to insert this value at the right place into the sorted
list. This strategy results in a form of insert-sort:
Scala
def insertInSorted(x: Int, sorted: List[Int]): List[Int] = sorted match
case Nil => List(x)
case min :: others =>
if x < min then x :: sorted else min :: insertInSorted(x, others)
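Function insertInSorted adds a value to a sorted list while keeping the list sorted. The sorting function itself recurses on the tail and inserts the head into the sorted result; a minimal version could be written as follows (insertSort is an illustrative name):
Scala
def insertSort(list: List[Int]): List[Int] = list match
  case Nil => list
  case head :: tail => insertInSorted(head, insertSort(tail))
Insert-sort may compare a value to many elements of the sorted list and tends to be slow on long lists. Two classic algorithms do better by splitting a list into two parts and sorting both parts recursively:
• Merge-sort: Split a list into two halves of (almost) equal length, sort both parts recursively, then merge the two sorted lists into one.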
• Quick-sort: Split a list into low and high values, sort both parts recursively, then
concatenate the two sorted lists into one.
Merge-sort uses a simple strategy to split a list in two, but has more work to do to
merge the lists after they have been sorted. Quick-sort relies on a more complicated
splitting strategy, but once sorted, lists only need to be concatenated.
First, merge-sort:
Scala
def merge(sortedA: List[Int], sortedB: List[Int]): List[Int] = (sortedA, sortedB) match
case (Nil, _) => sortedB
case (_, Nil) => sortedA
case (hA :: tA, hB :: tB) =>
if hA <= hB then hA :: merge(tA, sortedB) else hB :: merge(sortedA, tB)
Listing 7.18: Merge-sort; fixed from Lis. 6.5; see also Lis. 7.19.
The merging of sorted lists starts with two patterns to deal with empty lists—merging a
list with an empty list results in the list itself. The last pattern then compares the heads
of two non-empty lists. Whichever is smaller needs to come first in the merged list. Once
this value is selected, the remainder of the two lists are merged recursively. With the
merging function thus written, you implement the merge-sort by using functions length
and splitAt to split a list in the middle.
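The terminal cases are lists that are already sorted because they are empty or contain a single element (a sketch consistent with the discussion of the first pattern below):
Scala
def mergeSort(list: List[Int]): List[Int] = list match
  case Nil | List(_) => list
  case _ =>
    val (left, right) = splitAt(list, length(list) / 2)
    merge(mergeSort(left), mergeSort(right))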
An easy mistake would be to forget the case List(_) in the first pattern in mergeSort.
As mentioned earlier, in the discussion of Listing 6.5, this would result in a non-
terminating function. As a variant, you can apply pattern matching to the length of
the list, which is needed for the split, and write the merge-sort function as follows:
Scala
def mergeSort(list: List[Int]): List[Int] = length(list) match
case 0 | 1 => list
case len =>
val (left, right) = splitAt(list, len / 2)
merge(mergeSort(left), mergeSort(right))
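Listing 7.19: Merge-sort, using pattern matching on the list length; see also Lis. 7.18.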
Quick-sort, by contrast, needs a function that splits a list, according to a pivot value, into values
smaller than the pivot and values larger than the pivot. Function splitPivot, shown
next, works by splitting the tail of a list into low and high values according to the pivot,
and then adding the head of the list to the low or high part. With the splitting function
implemented, a quick-sort function only has to split, sort recursively, and concatenate:
Scala
def splitPivot(pivot: Int, list: List[Int]): (List[Int], List[Int]) = list match
case Nil => (Nil, Nil)
case h :: t =>
val (low, high) = splitPivot(pivot, t)
if h < pivot then (h :: low, high) else (low, h :: high)
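The quick-sort function itself uses the head of the list as the pivot (a sketch mirroring the partition-based variant of Listing 10.1):
Scala
def quickSort(list: List[Int]): List[Int] = list match
  case Nil => list
  case pivot :: others =>
    val (low, high) = splitPivot(pivot, others)
    concat(quickSort(low), pivot :: quickSort(high))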
Listing 7.20: Quick-sort with user-defined splitting; see also Lis. 10.1.
Merge-sort and quick-sort tend to outperform insert-sort: If the two lists to be sorted
recursively have about the same size, sorting time is reduced from n² to n log₂(n),
where n is the length of the list to be sorted. This equal size property of the split
is guaranteed in merge-sort but not in quick-sort. Although quick-sort works well on
average, it suffers from poor performance in extreme cases. For instance, if a list is
already sorted, and you choose its head as the pivot, as in our example, quick-sort takes
time proportional to the square of the length of the list.
The implementation of splitPivot shown here is not very good—it uses as much
stack as there are values in the list. We could rewrite it as a tail recursive function,
using accumulator variables for the low and high lists. However, there is an even better
way to achieve the same functionality efficiently by using a higher-order function called
partition. Higher-order functions are discussed in Chapters 9 and 10, and quick-sort
is reimplemented in Listing 10.1.
To build lists without relying on the execution stack, you can make a function like take tail recursive by accumulating elements in an additional argument:
Scala
def take[A](list: List[A], n: Int): List[A] =
  @tailrec
  def takeAndAdd(list: List[A], n: Int, added: List[A]): List[A] = (list, n) match
    case (_, 0) | (Nil, _) => added
    case (head :: tail, _) => takeAndAdd(tail, n - 1, head :: added)
  reverse(takeAndAdd(list, n, Nil))
Listing 7.21: Tail recursive implementation of take; contrast with Lis. 7.8.
Note how values are added to the accumulator using “::”, not append, for performance
reasons. As a result, function takeAndAdd accumulates the elements in reverse order,
and the accumulated list is reversed before it is returned.
Another approach, which is frequently used in libraries, is to implement pure func-
tions by using non-functional elements internally. You can implement take with an
intermediate ListBuffer, a mutable accumulator designed to build immutable lists:
Scala
def take[A](list: List[A], n: Int): List[A] =
@tailrec
def takeAndAdd(list: List[A], n: Int, added: mutable.ListBuffer[A]): List[A] =
(list, n) match
case (_, 0) | (Nil, _) => added.result()
case (head :: tail, _) => takeAndAdd(tail, n - 1, added += head)
takeAndAdd(list, n, mutable.ListBuffer.empty[A])
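Listing 7.22: Implementation of take, using a mutable list builder; contrast with Lis. 7.8.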
To add elements to the accumulator, a list buffer is mutated by its method “+=”. Once
building is complete, method result is used to produce an immutable list from the
mutable buffer, thus bringing us back into the realm of functional programming. Because
the buffer supports a constant-time append operation, no reversal is needed at the end.
Finally, if your language supports it, you can always use loops instead of recursion
to update the mutable buffer:
Scala
def take[A](list: List[A], n: Int): List[A] =
val added = mutable.ListBuffer.empty[A]
var elems = list
var rem = n
while rem > 0 && elems.nonEmpty do
added += head(elems)
elems = tail(elems)
rem -= 1
added.result()
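Listing 7.23: Implementation of take as a loop, using a mutable list builder; contrast with Lis. 7.8.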
After compilation, the last two variants of function take should be more or less equiv-
alent. While Listings 7.22 and 7.23 are the “right” way to implement take, we will
continue to rely mostly on “::” and immutable accumulators (or the execution stack)
to build lists in Part I of this book. This is done for clarity. Using mutable builders, even
hidden ones, would unnecessarily muddle the presentation of functional programming
concepts.
7.10 Summary
• Behind every recursive function lies a recursive equality: The solution to a given
problem equals a combination of solutions to smaller problems, solved recursively.
Having these equalities in mind can help you design recursive algorithms correctly.
This approach emphasizes the declarative side of functional programming by fo-
cusing attention on what the desired value is, instead of what the code must do
to compute it.
• A recursive equality is often well defined only on a subset of input data. Values
for which the equality does not hold need to be treated separately. They usually
result in one or more non-recursive cases in a recursive function.
• Functional lists form a recursive data structure: A non-empty list consists of a head
value, followed by another list (the tail). As such, lists are naturally amenable to
recursive programming. Many operations can be implemented on lists as functions
that perform a recursive call on the tail of the list. Recursion on one or more sub-
lists other than the tail is possible as well.
• Some list functions end up being naturally tail recursive. Others may need to be
rewritten from their natural form to achieve tail recursion. This rewrite usually
involves including an accumulator as an additional argument to the function.
There is often a trade-off between the simplicity of a recursive function in its
natural form and the robustness of a tail recursive variant.
• Functions that need to build lists can sometimes rely on the execution stack
to store list elements until the list is built. If, instead, tail recursion is needed,
an accumulator should always build a list from the front, using a constant-time
prepend operation. Building lists from the other end typically results in code that
is quadratic in its runtime.
• If, as a consequence of building from the front, a list is constructed in reverse order,
it can be reversed back after construction. List reversal takes time proportional
to the length of the list. An additional reversal at the end is usually a better
option than constructing a list in the right order if the cost of this construction is
quadratic.
Chapter 8
Case Study: Binary Search Trees
Figure 8.1 A binary search tree containing the keys 6, 20, 32, 43, 51, 52, 57, 60, 71, 78, and 83, with key 57 at the root.
The key ordering property lets a search follow a single branch from the root: At each node, the target is compared to the node's key to decide whether to stop or to continue left or right. To look for key 80 in this tree, for instance, you would compare it to 57, 71, 83, and 78 before failing. The number of comparisons equals the length of the branch being followed.
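8.2 Sets of Integers as Binary Search Trees
A set of integers can be represented as a binary search tree, defined as a recursive type (a sketch consistent with the pattern matching used in this section):
Scala
abstract class BinTree
case object Empty extends BinTree
case class Node(key: Int, left: BinTree, right: BinTree) extends BinTree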
Empty is the empty tree; non-empty trees consist of a Node that includes a key and
two children. Missing subtrees are represented as empty trees. Given this type, you
would then implement tree functions by using pattern matching. For instance, a function
isEmpty can be written as follows:
Scala
def isEmpty(tree: BinTree): Boolean = tree match
case Empty => true
case Node(_, _, _) => false
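Alternatively, in a hybrid functional–object-oriented style, isEmpty can become a method of the tree type itself (a sketch; the elided parts mirror the Node class below):
Scala
abstract class BinTree:
  def isEmpty: Boolean
  ... // abstract methods

object Empty extends BinTree:
  def isEmpty = true
  ... // concrete methods for the empty tree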
case class Node(key: Int, left: BinTree, right: BinTree) extends BinTree:
def isEmpty = false
... // concrete methods for a non-empty tree
NOTE
The BinTree class should be sealed to prevent subtypes other than Node and Empty from being
added. Also, class Node and object Empty should be private to avoid the creation of nonsensical
trees like Node(1, Node(2, Empty, Empty), Empty) (key 2 should be on the right, not the left,
of root 1). On such ill-formed trees, most code behaves incorrectly (the search algorithm discussed
earlier would fail to find key 2 in this tree). As important as they are when developing an actual
library, these considerations are not relevant to the discussion of recursion that is the focus of this
case study. For clarity, all classes and methods are left public throughout the chapter.
The size and height of a tree follow the same recursive pattern. The empty tree contains no keys and has height zero:
Scala
// in object Empty:
def size = 0
def height = 0
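In class Node, both methods combine the values computed by the children (a sketch; height counts the nodes along the longest branch):
Scala
// in class Node:
def size: Int = 1 + left.size + right.size
def height: Int = 1 + (left.height max right.height)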
NOTE
As written, size traverses the entire tree each time it is called. In class Node, you could replace def size = ...
with val size = .... The memory footprint of nodes would increase, but size would become a
constant-time operation. The same memory/speed trade-off is possible for height, min, or max. The
self-balancing trees implemented in Section 8.4 rely heavily on height calculations and would benefit
from making height a val instead of a def.
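Insertion produces a new tree without modifying the original; only the nodes along the search path are rebuilt (a sketch, without the rebalancing added in Section 8.4; on Empty, "+" would simply create Node(k, Empty, Empty)):
Scala
// in class Node:
def + (k: Int): Node =
  if k < key then Node(key, left + k, right)
  else if k > key then Node(key, left, right + k)
  else this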
Indeed, with one tree being shared at each level of the recursion, the trees before and
after insertion share all their nodes except for the one branch that is being traversed.
This is a fundamental property of persistent data structures, which was discussed earlier
in the context of functional lists in Section 3.7 (see, in particular, Figure 3.1).
If you are used to mutable types, beware of a common beginner mistake:
Scala
// DON'T DO THIS!
def + (k: Int): Node =
if k < key then
left + k
Node(key, left, right)
else if k > key then
right + k
Node(key, left, right)
else this
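The expressions left + k and right + k do build new trees, but their values are discarded: The method returns a node identical to the one it was called on, and the inserted key is lost. In a functional design, the results of the recursive calls must be used to construct the new tree, as in the correct implementation shown earlier.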
Method “-” starts like method “+”, by removing the target key k from either the left
or the right child. However, the case k == key, which is trivial in the insertion method,
is more challenging in the removal method. After removing the root of the tree, you are
left with two separate child trees. If either child is empty, you can simply return the
other tree, which contains all the keys in the set.
The complicated case is the final else: You are removing the root of a tree, and both
the left and right children are non-empty. In that case, you need to extract a key from
a child, and use it as the new root key. To maintain the ordering property of binary
search trees, a common strategy is to focus on the smallest key of the right child (or,
alternatively, the largest key of the left child). The smallest key of the right child is
greater than all the keys in the left child (all the right child keys are). It is also smaller
than all the other keys in the right child. Therefore, you can use this key as a new root,
and use the remaining keys of the right child to create a new right child. This way, the
new root is larger than all the keys in the left child and smaller than all the keys in
the new right child.
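Putting all the cases together, method "-" can take the following shape (a sketch; the variant in Listing 8.11 adds rebalancing for self-balancing trees):
Scala
// in class Node:
def - (k: Int): BinTree =
  if k < key then Node(key, left - k, right)
  else if k > key then Node(key, left, right - k)
  else if left.isEmpty then right
  else if right.isEmpty then left
  else
    val (minRight, othersRight) = right.minRemoved
    Node(minRight, left, othersRight)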
Extracting this key is the task of method minRemoved, which returns a pair: the
smallest key of a tree and the tree without this key. To remove key 57—the root—from
the example tree in Figure 8.1, you apply minRemoved to the right child. This produces
a pair (minRight, othersRight), where minRight is 60 and othersRight is the tree:
a tree with root 71, an empty left child, and a right child 83, which itself has 78 as its left child.
You then use minRight as the root of a new node whose right child is othersRight.
The left child is unchanged. Figure 8.2 shows the resulting tree.
Figure 8.2 The binary search tree from Figure 8.1 after key 57 is removed.
Method minRemoved is itself recursive: If a node has an empty left child, the node's key is the
smallest key in the tree, and its right child is what is left of the tree once this key is
removed. Otherwise, the smallest key in the tree is the smallest key in the left child, just
as before. This left child, with the smallest key removed, becomes the new left child of
the node. The key and right child of the node are unchanged.
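In code, the method can be written as follows (a sketch; minRemoved would be declared in BinTree and fail on the empty tree):
Scala
// in class Node:
def minRemoved: (Int, BinTree) =
  if left.isEmpty then (key, right)
  else
    val (min, othersLeft) = left.minRemoved
    (min, Node(key, othersLeft, right))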
If needed, you can build a list of all the keys in a tree. Because of the key ordering
property of binary search trees, an in-order traversal—left child, then root, then right
child—produces an ordered list of keys. A naive implementation is straightforward:
Scala
// DON'T DO THIS!
// in object Empty:
def toList = List.empty
// in class Node:
def toList = left.toList ::: key :: right.toList
Although this method does produce the right list, its performance suffers from the use
of the list concatenation operator “:::”, which takes time proportional to the length of
the first operand (see the discussion of function concat from Listing 7.9). Keep in mind
that, due to the recursive nature of method toList, this is not a one-time cost: The
concatenation operator is used again within the calls left.toList and right.toList,
and again in the recursive calls within these calls, and so on. To avoid this inefficiency,
you can build the list in an accumulator argument instead:
Scala
// in class BinTree:
def toList: List[Int] = makeList(List.empty)
def makeList(list: List[Int]): List[Int]
// in object Empty:
def makeList(list: List[Int]) = list
// in class Node:
def makeList(list: List[Int]) = left.makeList(key :: right.makeList(list))
Since keys are always added to the front of the list, you end up with a list in increasing
order, as desired.
Finally, you can provide users with a tree-building function. (Recall that Node and
Empty would not be public in an actual implementation.) The companion object defines
methods apply and fromSet to build a tree directly from a set of keys:
Scala
object BinTree:
  def empty: BinTree = Empty

  def apply(keys: Int*): BinTree = fromSet(keys.toSet)

  def fromSet(keys: Set[Int]): BinTree =
    def makeTree(sorted: List[Int]): BinTree = sorted match
      case Nil => Empty
      case _ =>
        val (left, right) = sorted.splitAt(sorted.length / 2)
        Node(right.head, makeTree(left), makeTree(right.tail))
    makeTree(keys.toList.sorted)
  end fromSet
Listing 8.6: Set conversion to balanced binary search trees, improved in Lis. 8.12.
Function fromSet works by first sorting the keys into a list. The middle value of
the list is used as the root key of the tree. The two children are built recursively. The
values left of the middle key make up the left child, and the values right of the middle
key make up the right child. Because the list is sorted, the resulting tree satisfies the
key ordering property of binary search trees. Furthermore, since the lengths of the left
and right lists differ by at most one, you end up with a well-balanced tree. A function
apply with variable-length arguments is added for convenience, so trees can be built
directly from enumerated keys, as in BinTree(4,5,1,3,2).
By contrast, a fromSet function that builds its tree by inserting keys one at a time, starting
from the empty tree, produces an extremely unbalanced tree if the input set of keys happens
to be enumerated in sorted order. With such an implementation, the expression
fromSet(BitSet(1,2,3,4,5)) produces a degenerate tree: a single branch of keys 1 to 5, in which
every node has an empty left child.
Red-black trees, another common self-balancing structure, rely on a looser balancing notion in which the longest branch is guaranteed to be at most twice as long as the
shortest branch. Insertion and deletion in red-black trees are slightly more efficient than in AVL trees,
but the AVL trees are better balanced.
8.4 Self-Balancing Trees
[Diagram: tree rotations. A right rotation transforms a tree with root k2, left child k1 (over subtrees T1 and T2), and right subtree T3 into a tree with root k1, left subtree T1, and right child k2 (over T2 and T3); a left rotation is the inverse transformation.]
A left-to-right rotation decreases the height of a left child and increases the height
of a right child. Right-to-left rotations have the opposite effect. Pattern matching makes
it straightforward to implement tree rotations in the Node class:
Scala
def rotateRight: Node = left match
case Node(keyL, leftL, rightL) => Node(keyL, leftL, Node(key, rightL, right))
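The symmetric method rotateLeft can be written the same way (a sketch mirroring rotateRight):
Scala
def rotateLeft: Node = right match
  case Node(keyR, leftR, rightR) => Node(keyR, Node(key, left, leftR), rightR)
To decide when to rotate, each node measures the height difference between its children:
Scala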
// in class Node:
def imbalance = right.height - left.height
On balanced trees, this method returns a value that lies between −1 and 1 (included).
When a key is added or removed, a tree’s imbalance can reach −2 or 2. For instance,
node 71 has imbalance 1 in the tree from Figure 8.1. After key 60 is removed, the
imbalance becomes 2, an indicator that the tree needs to be rebalanced. When a tree
has an imbalance of 2, you need to consider two cases:
• The right child is perfectly balanced or “right-heavy” (its own imbalance is 1). In
that case, a single right-to-left rotation is enough to rebalance the tree:
110 Chapter 8 Case Study: Binary Search Trees
k1
k2
T1 k2
−→ k1 k3
T2 k3
T1 T2 T3 T4
T3 T4
• The right child is “left-heavy” (its own imbalance is −1). In that case, two rotations
are needed: first a left-to-right rotation of the right child, to get back to the
previous case, and then a right-to-left rotation as before:
[Diagram: the right child is first rotated left-to-right, which turns k2, the left child of k3, into the root of the right subtree; a right-to-left rotation then promotes k2 to the root, with k1 and k3 as its children.]
The case of a tree with a −2 imbalance is treated symmetrically. This strategy is
implemented in a method avl, relying again on the power of pattern matching:
Scala
def avl: Node = imbalance match
  case -2 => left.imbalance match
    case 0 | -1 => rotateRight
    case 1 => Node(key, left.rotateLeft, right).rotateRight
  case 2 => right.imbalance match
    case 0 | 1 => rotateLeft
    case -1 => Node(key, left, right.rotateRight).rotateLeft
  case 0 | 1 | -1 => this
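Method avl returns a rebalanced but otherwise equivalent tree. Modification methods can then rebalance a tree as they rebuild it: Insertion passes every node reconstructed along the insertion path through avl: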
Scala
def + (k: Int): Node =
if k < key then Node(key, left + k, right).avl
else if k > key then Node(key, left, right + k).avl
else this
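Deletion relies on the same mechanism. In a first version, every node built by the method goes through avl, including the node assembled from minRemoved when both children are non-empty (a sketch of this first version; the improved variant follows):
Scala
def - (k: Int): BinTree =
  if k < key then Node(key, left - k, right).avl
  else if k > key then Node(key, left, right - k).avl
  else if left.isEmpty then right
  else if right.isEmpty then left
  else
    val (minRight, othersRight) = right.minRemoved
    Node(minRight, left, othersRight).avl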
Listing 8.10: Key deletion in self-balancing binary search trees; see also Lis. 8.11.
Observe that removing a value from the right child can decrease its height by
at most one. In other words, othersRight.height is equal to right.height or
right.height - 1. So, if right.height is at least equal to left.height, the height
of left and the height of othersRight differ at most by one. Therefore, in this case,
Node(minRight, left, othersRight) is necessarily balanced, and the call to method
avl is unnecessary. Conversely, if left.height is at least equal to right.height, you
can build a balanced tree by removing the largest key in the left child instead of the
smallest key in the right child. By always removing from the tallest tree, you reduce
the number of rebalancing operations. This results in an improved implementation of
method “-”, in which the last call to avl is omitted:
Scala
def - (k: Int): BinTree =
if k < key then Node(key, left - k, right).avl
else if k > key then Node(key, left, right - k).avl
else if left.isEmpty then right
else if right.isEmpty then left
else if left.height > right.height then
val (maxLeft, othersLeft) = left.maxRemoved
Node(maxLeft, othersLeft, right) // no call to avl needed
else
val (minRight, othersRight) = right.minRemoved
Node(minRight, left, othersRight) // no call to avl needed
Listing 8.11: Key deletion in self-balancing binary search trees; see also Lis. 8.10.
Method maxRemoved is symmetrical to method minRemoved and is omitted.
To conclude this chapter, let us revisit function fromSet in Listing 8.6, which builds
balanced trees from an existing set. Although the earlier implementation nicely illus-
trates list recursion and pattern matching, it will be inefficient on large sets because
method length takes time proportional to the size of the list, and splitAt needs to
allocate new lists (see the discussion of method take in Section 7.4). You could reimple-
ment fromSet more efficiently by replacing the list with a sequence with fast indexing
(typically backed by an array):
Scala
def fromSet(keys: Set[Int]): BinTree =
  val keySeq = keys.toIndexedSeq.sorted

  def makeTree(from: Int, to: Int): BinTree =
    if from > to then Empty
    else
      val mid = (from + to) / 2
      Node(keySeq(mid), makeTree(from, mid - 1), makeTree(mid + 1, to))

  makeTree(0, keySeq.length - 1)
Listing 8.12: Conversion from set to binary search trees, improved from Lis. 8.6.
The function follows the same algorithm as before, but does not explicitly create sub-
lists. Instead, lists are represented as pairs of indices in a fixed sorted sequence. A helper
function uses two integer arguments to represent a sublist from which to build a tree and
relies on constant-time access to the middle element of the sequence. Function makeTree
still uses non-tail recursion, but the depth of the recursion will not exceed the binary
logarithm of the size of the set, which should be acceptable in practice.
8.5 Summary
Binary trees form a typical recursive structure on which recursive programming can
be demonstrated. Many operations, such as lookups and insertions, are implemented in
terms of applying the same operation to subtrees. Functional trees are defined in terms
of a terminal case, the empty tree Empty, and a constructor Node that creates a new tree
from two children and the contents of the root. This is reminiscent of functional lists,
which are built from the empty list Nil and a cons operator “::”. The fundamental
difference is that a tree node can have two children, while a non-empty list has only
one tail.
In languages that support pattern matching, trees can be processed by treating the
empty tree as a special case, and deconstructing a non-empty tree into a node value
and two children, in the same way a non-empty list is broken into its head and tail. Al-
though tree operations could be implemented entirely in terms of functions using pattern
matching, a functional–object-oriented style is often preferred in hybrid languages.
Functional trees cannot be mutated, and insertion and deletion operations produce
new trees instead. However, many subtrees can be shared between the tree before and
the tree after an insertion or deletion. This is again reminiscent of the data sharing
already discussed in the context of lists.
Some tree operations, such as deletion, are more complex than their list counterparts.
While removing the head of a list leaves a tail that is itself a list, the remaining children
of a tree after the root is removed do not form a tree and need to somehow be merged.
Binary search trees extend binary trees by maintaining a key ordering property
that can speed up lookups. This property needs to be preserved by all tree operations,
including the merging of children after a tree root has been removed.
Binary search trees offer efficient lookup only to the extent that a tree is well bal-
anced. While a balanced tree can easily be built from a list, further insertions and
deletions can result in unbalanced trees. Algorithms have been devised for trees that
remain balanced after every insertion and deletion. As an example, this case study ex-
tends basic binary search trees into self-balancing AVL trees. The implementation of
rebalancing operations relies heavily on recursion and pattern matching.
This page intentionally left blank
Chapter 9
Higher-Order Functions
It is natural for a programming paradigm that centers on functions to treat them as
first-class citizens. In functional programming, functions are values and can be stored in
variables and passed as arguments. Functions that consume or produce other functions
are said to be higher-order functions. Using higher-order functions, computations can
be parameterized by other computations in powerful ways.
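9.1 Functions as Values
The examples in this chapter search a list of temperatures (the values are those used in the searches below):
Scala
val temps = List(88, 91, 78, 69, 100, 98, 70)
A first function searches the list for a given target value:
Scala
def find[A](list: List[A], target: A): Option[A] = list match
  case Nil => None
  case h :: t => if h == target then Some(h) else find(t, target)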
A limitation of this function, however, is that you can only search for a target if
you already have a value equal to that target. For instance, you cannot search a list
of temperatures for a value greater than 90. Of course, you can easily write another
function for that:
Scala
def findGreaterThan90(list: List[Int]): Option[Int] = list match
case Nil => None
case h :: t => if h > 90 then Some(h) else findGreaterThan90(t)
findGreaterThan90(temps) // Some(91)
But what if you need to search for a temperature greater than 80 instead? You can
write another function, in which an integer argument replaces the hardcoded value 90:
Scala
def findGreaterThan(list: List[Int], bound: Int): Option[Int] = list match
case Nil => None
case h :: t => if h > bound then Some(h) else findGreaterThan(t, bound)
This is better, but the new function still cannot be used to search for a temperature
less than 90, or for a string that ends with "a", or for a project with identity 12345.
You will notice that functions find, findGreaterThan90, and findGreaterThan are
strikingly similar. The algorithm is the same in all three cases. The only part of the
implementation that changes is the test in the if-then-else, which is h == target in
the first function, h > 90 in the next, and h > bound in the third.
It would be nice to write a generic function find parameterized by a search criterion.
Criteria such as “to be greater than 90” or “to end with "a"” or “to have identity 12345”
could then be used as arguments. To implement the if-then-else part of this function,
you would apply the search criterion to the head of the list to produce a Boolean value.
In other words, you need the search criterion to be a function from A to Boolean.
Such a function find can be written. It takes another function as an argument,
named test:
Scala
def find[A](list: List[A], test: A => Boolean): Option[A] = list match
case Nil => None
case h :: t => if test(h) then Some(h) else find(t, test)
The type of argument test is A => Boolean, which in Scala denotes functions from A
to Boolean. As a function, test is applied to the head of the list h (of type A), and
produces a value of type Boolean (used as the if condition).
You can use this new function find to search a list of temperatures for a value
greater than 90 by first defining the “greater than 90” search criterion as a function:
Scala
def greaterThan90(x: Int): Boolean = x > 90
find(temps, greaterThan90) // Some(91)
In this last expression, you do not invoke function greaterThan90 on an integer argu-
ment. Instead, you use the function itself as an argument to find. To search for a project
with identity 12345, simply define a different search criterion:
Scala
def hasID12345(project: Project): Boolean = project.id == 12345L
find(projects, hasID12345) // project with identity 12345
In Scala, this search function is already available as a higher-order method on lists, also named find:
Scala
temps.find(greaterThan90)
projects.find(hasID12345)
From now on, code examples in this chapter use the standard method find instead of
the earlier user-defined function.
Method find is a higher-order function because it takes another function as an
argument. A function can also be higher-order by returning a value that is a function.
For example, instead of implementing greaterThan90, you can define a function that
builds a search criterion to look for temperatures greater than a given bound:
Scala
def greaterThan(bound: Int): Int => Boolean =
def greaterThanBound(x: Int): Boolean = x > bound
greaterThanBound
Listing 9.3: Example of a function that returns a function; see also Lis. 9.4 and 9.5.
Scala
temps.find(greaterThan(90))
temps.find(greaterThan(80))
In a similar fashion, you can define a function to generate search criteria for projects:
Scala
def hasID(identity: Long): Project => Boolean =
def hasGivenID(project: Project): Boolean = project.id == identity
hasGivenID
projects.find(hasID(12345L))
projects.find(hasID(54321L))
9.2 Currying
Functions that return other functions are common in functional programming, and many
languages define a more convenient syntax for them:
Scala
def greaterThan(bound: Int)(x: Int): Boolean = x > bound
def hasID(identity: Long)(project: Project): Boolean = project.id == identity
This style of definition, in which a function consumes its arguments one at a time, is known as
currying, after the logician Haskell Curry, and is the norm in functional
languages like Haskell and ML. For instance, we tend to think of addition as a function
of two arguments:
Scala
def plus(x: Int, y: Int): Int = x + y // a function of type (Int, Int) => Int
plus(5, 3) // 8
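In curried form, the same function takes its arguments one at a time:
Scala
def plus(x: Int)(y: Int): Int = x + y // a function of type Int => Int => Int

plus(5)(3) // 8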
Curried functions are so common in functional programming that the => that repre-
sents function types is typically assumed to be right-associative: Int => (Int => Int)
is simply written Int => Int => Int. For example, the function
Scala
def lengthBetween(low: Int)(high: Int)(str: String): Boolean =
str.length >= low && str.length <= high
has type Int => Int => String => Boolean. You can use it to produce a Boolean,
as in
Scala
lengthBetween(1)(5)("foo") // true
lengthBetween1AndBound(5)("foo") // true
lengthBetween1and5("foo") // true
Before closing this section on currying, we should consider a feature that is particular
to Scala (although other languages use slightly different tricks for the same purpose).
In Scala, a call f({...}), in which the single argument of a function is computed by a block of
code, can omit the extraneous parentheses and be written f{...} instead.
To use this syntax when multiple arguments are involved, you can rely on currying
to adapt a multi-argument function into a single-argument function. For instance, the
curried variant of function plus can be invoked as follows:
Scala
plus(5) {
val two = 2
two + 1
}
Functions can also be written directly as values, without being named, using function literals,
also called lambda expressions. Indeed, lambda expressions may well be the first
thing that comes to mind when you hear that a language has support for functional
programming.
In Scala, the syntax for lambda expressions is (v1: T1, v2: T2, ...) => expr.2
This defines a function with arguments v1, v2, . . . that returns the value produced by
expr. For instance, the following expression is a function, of type Int => Int, that
adds 1 to an integer:
Scala
(x: Int) => x + 1
Function literals can be used to simplify calls to higher-order functions like find:
Scala
temps.find((temp: Int) => temp > 90)
projects.find((proj: Project) => proj.id == 12345L)
The Boolean functions “to be greater than 90” and “to have identity 12345” are imple-
mented as lambda expressions, which are passed directly as arguments to method find.
You can also use function literals as return values of other functions. So, a third
way to define functions greaterThan and hasID, besides using named local functions
or currying, is as follows:
Scala
def greaterThan(bound: Int): Int => Boolean = (x: Int) => x > bound
def hasID(identity: Long): Project => Boolean = (p: Project) => p.id == identity
2 Lambda expressions can also be parameterized by types, though this is a more advanced feature not
used in this book. For instance, Listing 2.7 defines a function first of type (A, A) => A, parameterized
by type A. It could be written as the lambda expression [A] => (p: (A, A)) => p(0).
Today, many programming languages have a syntax for function literals, and the
Scala expression (temp: Int) => temp > 90 has close equivalents in most of them.
The argument (or arguments) of a lambda expression can be composite types. For
example, assume you have a list of pairs (date, temperature), and you need to find a
temperature greater than 90 in January, February, or March. You can use find with
a lambda expression on pairs:
Scala
val datedTemps: List[(LocalDate, Int)] = ...
datedTemps.find(dt => dt(0).getMonthValue <= 3 && dt(1) > 90)
The test checks that the first element of a pair (a date) is in the first three months of
the year, and that the second element of the pair (a temperature) is greater than 90.
Languages that support pattern matching often let you use it within a lambda
expression. In the preceding example, you can use pattern matching to extract the date
and temperature from a pair, instead of dt(0) and dt(1):
Scala
datedTemps.find((date, temp) => date.getMonthValue <= 3 && temp > 90)
This is a lot more readable than the variant that uses dt(0) and dt(1).
More complex patterns can be used. In Scala, a series of case patterns, enclosed in
curly braces, also define an anonymous function. For instance, if a list contains tem-
peratures with an optional date, and temperatures without a date are not eligible, you
can search for a temperature greater than 90 in the first three months with the follow-
ing code:3
3 Here, I must admit that the Scala syntax can be confusing at first. As with code blocks, a call
f({...}) can omit extraneous parentheses, and be written as f{...}. This example becomes clearer
once you understand that it is a call to a higher-order method on a function literal defined with pattern
matching and that a pair of unnecessary parentheses have been dropped.
Scala
val optionalDatedTemps: List[(Option[LocalDate], Int)] = ...
optionalDatedTemps.find {
case (Some(date), temp) => date.getMonthValue <= 3 && temp > 90
case _ => false
}
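9.4 Functions Versus Methods
In Scala, a function value is an object. You could, for instance, define a search criterion explicitly as an object (a sketch; the apply method is what makes the object applicable to arguments):
Scala
object GreaterThan90 extends (Int => Boolean):
  def apply(x: Int): Boolean = x > 90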
The notation Int => Boolean is syntactic sugar for the type Function[Int, Boolean].
This type defines a method apply, which is invoked when the function is applied. The ex-
pression temps.find(GreaterThan90) could replace temps.find(temp => temp > 90)
to perform the same computation.4 GreaterThan90 is an object—which defines a func-
tion—not a method. In contrast,
Scala
def greaterThan90(x: Int): Boolean = x > 90
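is a method, not an object: It is invoked on whatever entity contains its definition rather than existing as a value on its own.
Related to functions as objects are single abstract method (SAM) interfaces, which let object-oriented types be implemented by lambda expressions. Consider, for instance, a Formatter type of the following shape (a sketch: format is the single abstract method, and println is a concrete method built on it):
Scala
abstract class Formatter:
  def format(str: String): String
  def println(any: Any): Unit = Console.println(format(any.toString))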
Class Formatter defines only one abstract method format and is therefore a SAM
interface. It can be implemented using lambda expressions:
Scala
val f: Formatter = str => str.toUpperCase
f.println(someValue)
Note how method println is called on object f, which was defined as a lambda expres-
sion. This is possible only because f was declared with type Formatter; the expression
(str => str.toUpperCase).println("foo") would make no sense.
Many Java interfaces can be implemented as lambdas, even though they predate
Java’s syntax for lambda expressions and have little to do with functional programming:
Scala
val absComp: Comparator[Int] = (x, y) => x.abs.compareTo(y.abs)
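9.6 Partial Application
Suppose now that temperatures are stored in a list celsiusTemps, in degrees Celsius, and that you search for a value above 90 degrees Fahrenheit. With a lambda expression, the conversion takes place inside the search criterion:
Scala
celsiusTemps.find(temp => temp * 1.8 + 32 > 90)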
Instead of a lambda expression, you can build the desired function argument by replacing
temp with an underscore in the expression temp * 1.8 + 32 > 90:
Scala
celsiusTemps.find(_ * 1.8 + 32 > 90)
The expression _ * 1.8 + 32 > 90 represents a Boolean function that maps temp
to temp * 1.8 + 32 > 90, just like the function defined by the lambda expression
temp => temp * 1.8 + 32 > 90. Searches written earlier using lambda expressions
can use partial application instead:
Scala
temps.find(_ > 90)
projects.find(_.id == 12345L)
Aside on Scoping
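As an illustration, consider a program with the following structure (a sketch matching the description that follows):
Scala
val str = "functional"

def f(x: Int): Int =
  if x > 0 then
    val str = x + 1
    str * 2
  else
    val x = "negative"
    str.length + x.length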
The outermost scope defines a variable str, of type String. The body
of function f creates its own scope, in which a variable x, of type Int, is
defined (the argument to the function). The block of code that constitutes the
then part of the conditional has its own scope in which a new variable str, of
type Int, is declared. Similarly, the else block defines a new variable x, of type
String, in its own scope. Variables do not exist outside their scopes: Outside
function f, variable str is the string defined on the first line, and there is no
variable named x.
The variables str and x declared in the inner scopes shadow the variables
with the same names from the outer scopes. Java forbids such shadowing and
forces you to pick different names for the variables declared inside the then and
else blocks.
5 Be careful, because details vary from language to language. For instance, while “_ + _” is a two-
argument function in Scala, similar-looking notations in other languages may stand for something else entirely.
While it is often the case that every block of code defines its own scope, not
all languages adhere to this rule. JavaScript and Python, for instance, introduce
a new scope for the body of a function, but not for the then and else branches
of a conditional, or the body of a loop. This can be confusing when you are used
to the more mainstream scoping rules. As an illustration, consider the following
Python program:
Python
x = 1

def f():
    x = 2
    if x > 0:
        x = 3
        y = 4
    print(x) # prints 3
    print(y) # prints 4

f()
print(x) # prints 1
print(y) # error: name 'y' is not defined
The body of function f defines its own scope, but the block of code inside if
does not. Instead, x=3 is an assignment to the variable x declared in the scope of
the function (initialized with 2), and y=4 introduces a variable y inside that same
scope. In particular, this variable y continues to exist after the if statement and
has value 4. The print(x) statement inside function f prints the value of the
variable x in the scope of the function, which is 3, while the print(x) statement
outside function f prints the value of the variable x in the outermost scope,
which is 1. The final print(y) statement triggers an error, since no y variable
has been declared outside the scope of the function. The behavior would be the
same in JavaScript. Contrast this with Scala:
Scala
var x = 1

def f() =
  var x = 2
  if x > 0 then
    var x = 3
    var y = 4
  println(x) // prints 2
  println(y) // rejected at compile-time

f()
println(x) // prints 1
println(y) // rejected at compile-time
Most languages now use static scoping, with variations as to which program-
ming language constructs introduce a new scope. Dynamic scoping, in contrast,
is error-prone and has become less popular. It was used in the original Lisp and
remains available as an option in modern variants of that language. It is also
used in some scripting languages, most notably Perl and various Bourne Shell
implementations.
As an illustration, this Scala program follows static scoping rules:
Scala
var x = 1

def f() =
  x += 1
  println(x) // prints 2

def g() =
  var x = 10
  f()

g()
println(x) // prints 2
Function g defines a local variable x, then invokes f. The variable x used in-
side function f, however, is the one declared on the first line of the program,
which is the one in scope where function f is defined. The local variable with
the same name defined inside function g plays no part. The behavior would
be the same for an equivalent program written in Java, Kotlin, C, or any one of
a multitude of languages that use static scoping.
Contrast this with the following Bash implementation:
Bash
x=1

f() {
  (( x++ ))
  printf "%d\n" $x # prints 11
}

g() {
  local x=10
  f
}

g
printf "%d\n" $x # prints 1
In Bash, the variable x used inside function f is not the variable in scope where
f is defined, but the variable in scope where function f is invoked. This variable,
equal to 10, is incremented to 11. The variable x declared at the beginning of
the program was never modified.
Dynamic scoping can be used to override a global variable with a local vari-
able, thus changing the behavior of a function. This is sometimes useful, and
a similar behavior can be achieved in Scala through implicit arguments. For
instance, by reusing the Formatter type from Section 9.5, a function can be
defined to print an object with the default formatter in scope:
Scala
def printFormatted(any: Any)(using formatter: Formatter): Unit =
formatter.println(any)
Within a function—or any block of code that introduces its own scope—a
different formatter can be specified:
Scala
given UpperCaseFormatter: Formatter = str => str.toUpperCase
This technique brings back some of the flexibility of dynamic scoping but
is much safer: Function printFormatted is explicit in the fact that it allows a
locally defined formatter to impact its behavior.
The code examples in this book do not rely much on implicit arguments,
except on occasion in Part II. In particular, Scala tends to use implicit argu-
ments to specify the thread pool on which to execute concurrent activities.
130 Chapter 9 Higher-Order Functions
9.7 Closures
Recall our first implementation of function greaterThan, in Listing 9.3:
Scala
def greaterThan(bound: Int): Int => Boolean =
def greaterThanBound(x: Int): Boolean = x > bound
greaterThanBound
You can apply greaterThan to different values to produce different functions. For
example, greaterThan(5) is a function that tests if a number is greater than 5, while
greaterThan(100) is a function that tests if a number is greater than 100:
Scala
val gt5 = greaterThan(5)
val gt100 = greaterThan(100)
gt5(90) // true
gt100(90) // false
The function greaterThanBound refers to variable bound, which is defined outside the function
itself. A function value that captures part of its environment in this way is called a closure.6
6 The word closure comes from the fact that, in λ-calculus, a function like greaterThanBound
is represented by an open term that contains a free variable bound, and that needs to be closed to
represent an actual function.
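A closure can even capture, and update, local mutable state. For instance, a memoization function can wrap any given function with a cache (a sketch consistent with the discussion below; mutable refers to scala.collection.mutable):
Scala
def memo[A, B](f: A => B): A => B =
  val store = mutable.Map.empty[A, B]
  a => store.getOrElseUpdate(a, f(a))

val memoLength: String => Int = memo(str => str.length)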
Function memoLength is a function from strings to integers, like str => str.length. It
calculates the length of a string and stores it. The first time you call memoLength("foo"),
the function invokes method length on string "foo", stores 3, and returns 3. If you
call memoLength("foo") again, value 3 is returned directly, without invoking method
length of strings. Another invocation memo(str => str.length) would create a new
closure with its own store map.
What is captured by a closure is a lexical environment. This environment contains
function arguments, local variables, and fields of an enclosing class, if any:
Scala
def logging[A, B](name: String)(f: A => B): A => B =
  var count = 0
  val logger = Logger.getLogger("my.package")
  def g(a: A): B =
    count += 1
    logger.info(s"calling $name ($count) with $a")
    val b = f(a)
    logger.info(s"$name($a)=$b")
    b
  g
Like memo, function logging takes a function of type A => B as its argument and pro-
duces another function of the same type. The returned function is functionally equivalent
to the input function, but it adds logging information, including the input and output
of each call and the number of invocations:
Scala
val lenLog: String => Int = logging("length")(str => str.length)
lenLog("foo")
// INFO: calling length (1) with foo
// INFO: length(foo)=3
lenLog("bar")
// INFO: calling length (2) with bar
// INFO: length(bar)=3
For this to work, the returned closure g needs to maintain references to arguments name
and f, as well as to local variables count and logger.
Note that variable count is modified when the closure is called. Writing into closures
can be a powerful mechanism, but it is also fraught with risks:
Scala
// DON'T DO THIS!
val multipliers = Array.ofDim[Int => Int](10)
var n = 0
while n < 10 do
  multipliers(n) = x => x * n
  n += 1
This code attempts to create an array of multiplying functions: It fills the array with
functions of type Int => Int defined as x => x * n. The idea is that multipliers(i)
should then be x => x * i, a function that multiplies its argument by i. However, as
written, the implementation does not work:
Scala
val m3 = multipliers(3)
m3(100) // 1000, not 300
All the functions stored in the array close over variable n and share it. Since n is equal
to 10 at the end of the loop, all the functions in the array multiply their argument by 10
(at least, until n is modified). Some languages, including Java, emphasize safety over
flexibility and do not allow local variables captured in closures to be written.
As with other forms of implicit references (e.g., inner classes), you need to be aware
of closures to avoid tricky bugs caused by unintended sharing. This is especially true
when closing over mutable data. As always, emphasizing immutability tends to result
in safer code.
Whether you use recursion or a loop, the list is queried for its values (head and tail),
but the flow of control remains within function findGreaterThan90. If instead you use
the expression temps.find(greaterThan90), you no longer query the list for its values.
Function find is now responsible for the flow of control—and may use recursion or a
loop, depending on its own implementation. It makes callbacks to your code, namely
the test function greaterThan90.
This shift of control flow from application code into library code is one of the reasons
functional programming feels more abstract and declarative compared to imperative
programming. However, once higher-order functions are well understood, they become
convenient abstractions that can improve productivity and reduce the need for debug-
ging. By using a method like find, you not only save the time it takes to write the three
or four lines needed to implement the loop, but more importantly, eliminate the risk of
getting it wrong.
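To make the contrast concrete, here is a sketch of the two styles, using the temperatures and test function from the discussion:
Scala
// Explicit control flow: the function drives the traversal itself.
def findGreaterThan90(temps: List[Int]): Option[Int] = temps match
  case Nil => None
  case temp :: others => if temp > 90 then Some(temp) else findGreaterThan90(others)

// Inverted control flow: find drives the traversal and calls back the test.
val greaterThan90: Int => Boolean = _ > 90
List(88, 91, 78, 69, 100, 98, 70).find(greaterThan90) // Some(91)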
9.9 Summary
• A defining characteristic of functional programming is the use of functions as
values. Functions can be stored in data structures, passed as arguments to other
functions, or returned as values by other functions. Functions that take functions
as arguments or return functions as values are said to be higher-order functions.
Function exists corresponds to the existential quantifier in logic (∃). The dual universal
quantifier (∀) is available as a function forall:
Scala
temps.forall(temp => 32 <= temp && temp <= 100) // true
Keep in mind that on an empty structure, exists is always false and forall is
always true:
Scala
Some("foo").exists(_.endsWith("o")) // true
Some("foo").forall(_.endsWith("o")) // true
Some("bar").exists(_.endsWith("o")) // false
Some("bar").forall(_.endsWith("o")) // false
The last case is the one most likely to trip up a careless developer.
Functions exists and forall only tell you if some or all the elements have the
desired property. In scenarios that require knowing more precisely how many values
satisfy a condition, you can rely on function count instead:
Scala
temps.count(_ > 90) // 3
If the list temps represents temperatures over time, you can use count to calculate how
many times the temperature increased:
Scala
val ups = temps.sliding(2).count(pair => pair(1) > pair(0)) // 2
Function filter uses a predicate to keep only the elements of a collection that satisfy it:
Scala
temps.filter(_ > 75) // List(88, 91, 78, 100, 98)
Scala also has a method filterNot that reverses its test. To separate values with and
without a desired property, use partition to produce both collections, usually in a
more efficient single traversal than a call to filter followed by a call to filterNot:
Scala
temps.filterNot(_ > 75) // List(69, 70)
temps.partition(_ > 75) // (List(88, 91, 78, 100, 98), List(69, 70))
As an illustration, partition makes for a concise (if not in-place) quicksort:
Scala
def quickSort(list: List[Int]): List[Int] = list match
  case Nil => list
  case pivot :: others =>
    val (low, high) = others.partition(_ < pivot)
    quickSort(low) ::: pivot :: quickSort(high)
In the same vein, takeWhile and dropWhile are another pair of higher-order func-
tions that use a predicate. They are similar to, but different from, filter and filterNot:
Scala
temps.takeWhile(_ > 75) // List(88, 91, 78)
temps.dropWhile(_ > 75) // List(69, 100, 98, 70)
Function takeWhile takes elements from the front of the list, as long as they satisfy
the test. The function stops as soon as it encounters one element that does not pass the
test (or the list has been exhausted). When testing whether temperatures are greater
than 75, takeWhile produces 88, 91, and 78. It stops when it encounters 69. Contrast
this with the behavior of filter, which skips value 69, but continues with 100 and 98.
Like drop (see Listing 7.6), dropWhile does not need to allocate a new list in memory.
Accordingly, you could implement an efficient find using dropWhile:
Scala
def find[A](list: List[A], test: A => Boolean): Option[A] =
  list.dropWhile(!test(_)).headOption
The list returned by dropWhile starts with the first element that satisfies the test or is
empty if no such element is found. Function headOption returns the head of a list as
an option and handles the empty list by returning None.
In the same way that partition combines filter and filterNot, function span
can be used when you need the outputs of both takeWhile and dropWhile on the same
predicate. For instance, to build a list in which the first temperature not greater than 75
has been replaced with zero, you can use the following expression:
Scala
temps.span(_ > 75) match
  case (all, Nil) => all
  case (left, _ :: right) => left ::: 0 :: right
Recall that temps is the list [88,91,78,69,100,98,70]. The list left is the same
list that would be returned by takeWhile(_ > 75): [88,91,78]. The second list is
as if obtained with dropWhile(_ > 75) and starts with 69. Its tail, list right,
is [100,98,70], so the overall result is [88,91,78,0,100,98,70].
The return type of the function being applied determines the type of the elements
in the collection that is returned:
Scala
temps.map(temp => if temp > 72 then "high" else "low")
// List("high", "high", "high", "low", "high", "high", "low")
The strings in this example are returned as a list because temps is a list. If
instead the temperatures were stored in an array, temps.map would produce an array.
A method map is available on many types in Scala:
Scala
Set(0.12, 0.35, 0.6).map(1.0 - _) // Set(0.88, 0.65, 0.4)
Some("foo").map(_.toUpperCase) // Some("FOO")
Because some collections make no sense for some types, map may return a structure
of a different type:
Scala
"foo".map(_.toInt) // IndexedSeq(102, 111, 111)
BitSet(12, 35, 60).map(_ / 100.0) // SortedSet(0.12, 0.35, 0.6)
Function foreach applies the given function to all the elements in a structure, just like
map. The difference is that foreach does not return anything—its return type is Unit.
Note that temps.map(temp => out.writeInt(temp)) would write all the temperatures
to the output stream but would also produce a list of unit values—returned by method
writeInt, which is void in Java—one per temperature written. This list is useless but,
short of a compiler optimization, would have to be created and garbage collected.
You can also use foreach to apply functions that return meaningful values, but the
values are still ignored:
Scala
val writer: Writer = ...
temps.foreach(temp => writer.append(temp.toString).append('\n'))
Even though method append returns the writer itself, the call to foreach returns unit.
Using map instead of foreach would produce a useless list of references to the writer.
10.3 flatMap
Function flatMap is a variation of map that “flattens” the values being produced:
Scala
List(1, 2, 3).map(x => List(x, x, x))
// List(List(1, 1, 1), List(2, 2, 2), List(3, 3, 3))
List(1, 2, 3).flatMap(x => List(x, x, x))
// List(1, 1, 1, 2, 2, 2, 3, 3, 3)
What flatMap does is easy to understand. What can be harder to grasp is how
useful this behavior can be. With experience, you will come to realize that flatMap
is probably the most powerful and most useful of the standard higher-order functions.
Function flatMap is a fundamental operation. In particular, by using one-to-one and
one-to-zero mappings, you can express map and filter in terms of flatMap:
Scala
def map[A, B](list: List[A], f: A => B): List[B] = list.flatMap(x => List(f(x)))
def filter[A](list: List[A], test: A => Boolean): List[A] =
  list.flatMap(x => if test(x) then List(x) else List.empty)
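The running example composes three processing stages, introduced in a part of the chapter not shown here; in their initial form, each stage returns a plain value, and the stages can be composed directly:
Scala
// Assumed initial signatures of the three stages:
def parseRequest(request: Request): User = ...
def getAccount(user: User): Account = ...
def applyOperation(account: Account, op: Operation): Int = ...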
Suppose now that each stage is not guaranteed to succeed: It is possible that a
request cannot be parsed, an account is not found, or an operation is unsuccessful. To
handle these contingencies, you modify each function to return an option, and use None
to represent a failure:
Scala
def parseRequest(request: Request): Option[User] = ...
def getAccount(user: User): Option[Account] = ...
def applyOperation(account: Account, op: Operation): Option[Int] = ...
The problem now is to combine these three functions to parse a request, retrieve an
account, and apply an operation. The expression
applyOperation(getAccount(parseRequest(request)), op)
no longer works because each stage may return None, which prevents the computation
from continuing. You could use pattern matching to test options for emptiness and
extract contents into local variables:
Scala
parseRequest(request) match
  case None => None
  case Some(user) =>
    getAccount(user) match
      case None => None
      case Some(account) => applyOperation(account, op)
Listing 10.4: Processing options through pattern matching; contrast with Lis. 10.5.
This works but is definitely not as nice as the earlier function composition.
As an alternative, map can be used to apply a function to the value inside an option,
if any:
Scala
Some(42).map((x: Int) => x + 1) // Some(43)
None.map((x: Int) => x + 1) // None
You could use map to retrieve an account from a user (if any), then again to apply an
operation to the account (if any):
Scala
// DON'T DO THIS!
parseRequest(request)
  .map(user => getAccount(user).map(account => applyOperation(account, op)))
This approach is inadequate for several reasons. First, the type of this expression is
Option[Option[Option[Int]]]. In other words, if the last stage produces a value v,
the expression returns it as Some(Some(Some(v))), which is obviously far from ideal.
Furthermore, if a computation stage fails, the expression results in somewhat confusing
values. If the request cannot be parsed, the expression is None. If, however, the request
can be parsed but no account is found, it is Some(None), which is another way of
not having a value. If an account is found but the operation fails, the expression is
Some(Some(None)), yet another expression without a meaningful value.
All this can be avoided by replacing map with flatMap. When you apply a function
with an optional result to an optional input, flatMap flattens the “option of option”
into a simple option:
Scala
parseRequest(request)
  .flatMap(user => getAccount(user))
  .flatMap(account => applyOperation(account, op))
Listing 10.5: Example of using flatMap in a pipeline; see also Lis. 10.9.
This expression has type Option[Int]. It is Some(v) if an operation output v is pro-
duced and None if any stage of the computation is unsuccessful.
Some scenarios require more careful error handling—for instance, by triggering fall-
back computations. For those, you can still rely on flatMap, but use it on other types
like Either and Try, in addition to Option. Functional error handling is the topic of
Chapter 13.
In the code example, flatMap is used to apply transformations that might fail to
outputs of previous computations, which might also have failed. You can use a similar
strategy to transform asynchronous computations using transformations that can them-
selves be asynchronous by using flatMap to avoid nesting futures. This will be explored
in Part II of the book (contrast Listing 10.5 with Listing 26.9, for instance).
Another area where flatMap is handy is reactive programming organized in terms
of streams of events. While map is limited to a one-to-one mapping, flatMap is not.
In particular, when an event is processed, it can be entirely consumed (triggering no
further events), or it can trigger exactly one event, or it can trigger multiple events. By
using flatMap instead of map, you avoid nesting streams of events, in the same way you
avoid nesting options when dealing with missing values, or nesting futures when dealing
with asynchronous computations (see Listing 27.9 for an illustration).
Indeed, flatMap is such a fundamental operation that it has its own underlying
theory (monads), and Scala defines convenient syntax to organize flatMap-based com-
putations (see Section 10.9).
Function map must also satisfy the two following properties on every struct of type F[A]:
Scala
struct == struct.map(identity)
// or equivalently: struct == struct.map(x => x)
struct.map(f).map(g) == struct.map(f andThen g)
// or equivalently: struct.map(f).map(g) == struct.map(x => g(f(x)))
The first property states that mapping the identity function has no effect. The
second property requires that map preserves function composition: Mapping f
and then g produces the same value as mapping a single function that applies
f and then g. When implementing a map function on your own structure, strive
to maintain these properties. As a mental exercise, you can check that the map
functions used in the book’s code illustrations all satisfy these conditions.
Monads can be defined in terms of a function unit and a method flatMap.
The unit function has type A => M[A] and is used to build a monad—for
example, from x to List(x) or from x to Some(x). Method flatMap on M[A]
has the following signature:
Scala
def flatMap[B](f: A => M[B]): M[B] = ...
Monads have their own conditions to satisfy. First, the unit function main-
tains two properties with respect to flatMap:
Scala
struct.flatMap(unit) == struct
unit(x).flatMap(f) == f(x)
The first property states that “flat-mapping” unit does nothing (as map-
ping the identity does nothing). The second property corresponds to the funda-
mental intuition behind flatMap: Function unit places x into a container, and
flatMap(f) applies f to the contents of the container (see Section 10.9). Finally,
flatMap must satisfy a form of composition similar to map:
Scala
struct.flatMap(f).flatMap(g) == struct.flatMap(x => f(x).flatMap(g))
There is a lot more to functors and monads than is covered in this short
note. Indeed, entire books have been dedicated to the subject. You are likely to
see functors and monads (and other similar abstractions) explicitly in genuine
functional programming languages (like Haskell) and libraries (like Scala’s Cats).
Even if you intend to stay away from those, keep in mind that the various map
and flatMap functions you encounter share some fundamental properties and
lead to similar programming patterns.
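As a quick illustration (not one of the book's listings), the laws can be checked on Option, with Some playing the role of unit:
Scala
def unit[A](x: A): Option[A] = Some(x)
val f: Int => Option[Int] = n => if n > 0 then Some(n * 2) else None
val g: Int => Option[Int] = n => Some(n + 1)

Some(3).flatMap(unit) == Some(3) // true
unit(3).flatMap(f) == f(3) // true
Some(3).flatMap(f).flatMap(g) == Some(3).flatMap(x => f(x).flatMap(g)) // true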
Listing 10.6: Example of iteration, recursion, and fold for the same computation.
In Listing 10.6, all three calculate functions perform the same computation: one
iteratively, one with a tail recursive function, and one using foldLeft. In each variant,
a current value acc is combined with a list element x through the expression
3.0 * acc + x + 1.0.
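The listing itself is not reproduced here; a sketch of the three variants, with names and the initial accumulator value (1.0 below) assumed:
Scala
import scala.annotation.tailrec

def calculateLoop(list: List[Double]): Double =
  var acc = 1.0 // initial value assumed
  for x <- list do acc = 3.0 * acc + x + 1.0
  acc

def calculateRec(list: List[Double]): Double =
  @tailrec def loop(acc: Double, rest: List[Double]): Double = rest match
    case Nil => acc
    case x :: more => loop(3.0 * acc + x + 1.0, more)
  loop(1.0, list)

def calculateFold(list: List[Double]): Double =
  list.foldLeft(1.0)((acc, x) => 3.0 * acc + x + 1.0)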
Folding functions are versatile, and you can use them to implement many collection-
processing computations. For example:
Scala
def sum(list: List[Int]): Int = list.foldLeft(0)(_ + _)
// product, reverse, and filter reconstructed to match the listing's caption
def product(list: List[Int]): Int = list.foldLeft(1)(_ * _)
def reverse[A](list: List[A]): List[A] =
  list.foldLeft(List.empty[A])((rev, x) => x :: rev)
def filter[A](list: List[A], test: A => Boolean): List[A] =
  list.foldRight(List.empty[A])((x, kept) => if test(x) then x :: kept else kept)
Listing 10.7: sum, product, reverse, and filter implemented using fold.
Functional libraries often implement a simplified version of fold named reduce. The
difference with fold is that reduce uses an element from the collection as the starting
point of the computation. If abc is the list A, B, C, then abc.reduceLeft(f) is
f(f(A,B),C). The sum and product functions in Listing 10.7 can be written in terms
of reduce:
Scala
def sum(list: List[Int]): Int = list.reduce(_ + _)
def product(list: List[Int]): Int = list.reduce(_ * _)
Note that reduce is not defined on empty collections and is limited to a return type
that is the same as (or a supertype of) the collection’s elements. To reduce elements
into a value of a different type, you can sometimes apply map to first convert types, and
then reduce. Assume, for instance, a text file with one number per line and a task that
consists of adding the logarithms of all the absolute values of the numbers, ignoring
zeros. You can implement that with a combination of map, filter, and reduce:
Scala
lines.map(_.toDouble.abs).filter(_ != 0.0).map(math.log).reduce(_ + _)
This expression uses map to change strings into non-negative numbers, then filter to
ignore zeros, map again to apply the logarithm function, and finally reduce to calculate
the sum. A drawback of this approach is that it requires the creation of three interme-
diate lists, one with all the absolute values (the output of the first map), one with the
zeros removed (the output of filter), and one with all the logarithms (the output of
the second map). This could lead to performance issues if the file is large.
As an alternative, you can use foldLeft to perform the same computation without
the need for additional lists—arguably, at the cost of a minor loss in readability:1
Scala
lines.foldLeft(0.0) { (sum, line) =>
  val x = line.toDouble.abs
  if x != 0.0 then sum + math.log(x) else sum
}
The folded function leaves the accumulator unchanged when processing a zero. Other-
wise, the logarithm of the absolute value is added to the accumulator, thus achieving
the same result as the filter/map/reduce combination. Note how, in this last exam-
ple, the argument function is more elaborate than before and benefits from the fact that
foldLeft is defined using currying.
Function iterate is given an initial value and a function that can be reapplied on
its own output (its input and output types are the same): List.iterate(X,n)(f)
consists of n values [X, f(X), f(f(X)), f(f(f(X))), . . . ], where each value is obtained
by applying function f to the previous value.
Scala
List.iterate("", 5)(str => str + "X") // List("", "X", "XX", "XXX", "XXXX")
Finally, function unfold works as a kind of reverse fold. You give it an initial state
and a function that, from a state, produces a value and the next state, if any. Applied to
a state s, this function produces a pair (fVal(s), fNext(s)) with the next value and
the next state. The function can then be applied to fNext(s) to produce another value
fVal(fNext(s)) and a next state fNext(fNext(s)), and so forth, thus producing the
sequence of values [fVal(X), fVal(fNext(X)), fVal(fNext(fNext(X))), . . . ], given
an initial value X. The argument function produces an option, and the computation
terminates when it returns None:
1 A more readable alternative can be obtained through lazy evaluation; see Section 12.6.
Scala
List.unfold("XXXX")(str => if str.isEmpty then None else Some((str, str.tail)))
// List("XXXX", "XXX", "XX", "X")
Sorting strings by length could be done with sortWith((s1, s2) => s1.length < s2.length),
but a better alternative is to use sortBy. This function takes as its
argument a function that maps the elements to be sorted to arbitrary ordered values:
Scala
strings.sortBy(_.length)
In the preceding expression, strings are also sorted in increasing order of their lengths.
This is achieved by mapping strings to their lengths, and then relying on the default
ordering of integers. Note that the result is a list of strings, not a list of integers (as
strings.map(_.length).sorted would be). If 2D points are represented as pairs, you
can use sortBy to sort them in increasing order of Euclidean distance to origin:
Scala
val points: List[(Double, Double)] = ...
points.sortBy((x, y) => x * x + y * y)
Note that comparing dated temperatures, represented as (date, temperature) pairs, requires
an ordering on pairs. The default behavior, in Scala, is to compare them in lexicographic
order—that is, according to the first component, or
the second component if the first components are equal. As a result, datedTemps.max
finds the last (highest) date, and within this date, the highest temperature. To retrieve
the highest temperature overall, ignoring dates, use maxBy:
Scala
datedTemps.maxBy((_, temp) => temp) // highest temperature overall
Getting the highest temperature overall, but picking its earliest date if it occurs
multiple times, is a little more complicated:
Scala
val byTemp: Ordering[(LocalDate, Int)] = Ordering.by((_, temp) => temp)
val byDate: Ordering[(LocalDate, Int)] = Ordering.by((date, _) => date)
datedTemps.max(byTemp.orElse(byDate.reverse))
Note the use of the higher-order function Ordering.by to build ordering objects, which
are then combined to compare dated temperatures by temperature first and, for equal
temperatures, by reverse order of dates.
Like sortBy and maxBy, function groupBy takes a mapping function as its argument and groups together
all the elements that map to the same value:
Scala
List(2, 5, 4, 10, 7, 1, 20).groupBy(n => n % 2)
This expression maps even numbers to 0 and odd numbers to 1, resulting in a map of
type Map[Int,List[Int]] with two keys:
0 -> List(2, 4, 10, 20)
1 -> List(5, 7, 1)
You can group dated temperatures by date:
Scala
val tempsOn = datedTemps.groupBy((date, _) => date)
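The values of this map are still lists of pairs. To keep only the temperatures within each date, you can transform the map and its lists (a plausible reconstruction of the expression discussed next):
Scala
// Of type Map[LocalDate, List[Int]]
tempsOn.map((date, temps) => date -> temps.map((_, temp) => temp))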
This expression is a little hard to parse, but the first call to map is used to transform
each list of dated temperatures into a list of plain temperatures; the second map is the
transformation of a dated temperature into an integer. If you are lucky, your language
offers groupMap, which combines groupBy and map into a single function:
Scala
datedTemps.groupMap((d, _) => d)((_, t) => t) // of type Map[LocalDate, List[Int]]
Function groupMap takes two mapping functions as arguments (in curried style), one
for grouping (as in groupBy) and one for transforming (as in map).
2 If there are no temperatures recorded on that date, the lookup in the map will fail with an
exception. This can be avoided by using the map method withDefaultValue to create a map that
returns an empty list instead.
case class Node(key: Int, left: BinTree, right: BinTree) extends BinTree:
  def exists(test: Int => Boolean): Boolean =
    test(key) || left.exists(test) || right.exists(test)
Listing 10.8: Extending binary search trees with exists, foreach, and fold.
Method exists is false on an empty tree. On a node, it checks the node’s key with the
given predicate and continues checking each subtree if the key does not satisfy the test.
Method foreach does nothing on an empty tree. On a node, it applies the argument
function to all the values of the left tree, then on the node’s key, and finally on all the
values of the right tree. Given the ordering property of binary search trees, this in-order
traversal guarantees that the function is applied to all values in increasing order. In the
same way, method fold starts by folding the left tree, uses the resulting value to apply
the argument function to the node’s key, and finishes by folding the right tree, thus
guaranteeing that tree values are processed in increasing order. (See also Chapter 14 for
an example of user-defined map and flatMap implementations.)
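Based on this description, foreach and fold could be written as follows inside class Node (a sketch; the book's Listing 10.8 may differ):
Scala
// inside case class Node (sketch)
def foreach(f: Int => Unit): Unit =
  left.foreach(f)
  f(key)
  right.foreach(f)

def fold[A](init: A)(f: (A, Int) => A): A =
  right.fold(f(left.fold(init)(f), key))(f)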
An expression such as for load <- loads yield load.reduced
was used in Chapter 3 to create a list of updated load objects. It is compiled into
Scala
loads.map(load => load.reduced)
3 The choice of for as a keyword is somewhat unfortunate. It suggests a loop of some sort, but in
many cases, no such loop is needed. Here, for is used to process the contents of an option, and no loop
is involved. In Chapter 26, for is applied to a future, again without a loop.
Similarly,
Scala
for load <- loads do load.reduce()
is compiled into
Scala
loads.foreach(load => load.reduce())
When for-do and for-yield use conditions, a filtering step is inserted by the compiler.
The expression
Scala
for load <- loads if load.weight != 0 yield load.reduced
is compiled into4
Scala
loads.withFilter(load => load.weight != 0).map(load => load.reduced)
Listing 10.8 added a method foreach to binary trees. As a result, you can now
process trees using for-do:
Scala
val tree: BinTree = ...
for x <- tree do println(x) // prints all the values in the tree, in order
4 Method withFilter is semantically equivalent to filter, but uses a form of delayed evaluation to
avoid the creation of an intermediate list (see the discussion of views and lazy evaluation in Section 12.9).
In Java, which has higher-order functions only, you need to use filter and map directly:
Java
IntStream temps = ...
temps.filter(temp -> temp > 75).map(temp -> Math.round((temp - 32) / 1.8f))
NOTE
Given my intent to prepare developers for a variety of languages, I often use higher-order functions in
code examples instead of for-do or for-yield, especially outside of case studies. This is a choice
I made for pedagogical reasons, to better emphasize the similarities of patterns across languages,
but it tends to result in Scala code that is not always idiomatic. Readers with Scala experience will
forgive me.
10.10 Summary
• The functions presented in this chapter form a core set of higher-order functions
commonly found in functional libraries. They are often available on many types,
such as streams and collections, but also options and futures.
• Several functions use a predicate—a Boolean function—as their argument. They
are used to find, count, filter, or assert the existence of elements that satisfy this
predicate.
• Function map is used to apply an arbitrary transformation to elements within a
structure and produces a new structure that contains the transformed values. If
an operation is applied to elements for the purpose of side effects only, and no
resulting structure is needed, function foreach can be used instead of map.
• Function flatMap is similar to map but takes care of “flattening” nested structures,
like lists of lists or options of options. This function is powerful and can be used
to apply a computation with optional output to an optional input (to avoid an
option of option), or to handle an event that triggers a new stream of events (to
avoid a stream of stream), or to asynchronously transform a value that is itself
calculated asynchronously (to avoid a future of future).
• Fundamentally, foreach, map, flatMap, and filter (or withFilter) are used to
transform the contents of a structure (e.g., list, option, future) from outside the
structure. This common pattern is sometimes supported by syntax at the language
level, like Scala’s for-comprehension or Python’s list comprehension.
• Folding functions are used to reduce the elements of a structure into a single value
using a combining operator. Many computations that would process the elements
of a collection iteratively can be implemented in terms of a folding function.
The list of nodes in a directory should never contain two nodes with the same name.
The order of nodes in a list is irrelevant: Directories that contain the same files and sub-
directories in different orders are considered equivalent. Trees are immutable: Adding
or removing files and directories is achieved via functions that produce a new tree.
Higher-order method partition is used to partition a list of nodes between those that
have a given name (there is at most one) and those that have a different name. If there
is no node with the given name, the function returns None. Otherwise, it returns a pair
with the node that was extracted and a list of remaining nodes. As an example, if you
look for name B in a list of nodes that contains directories/files A, B, C, and D, the
method returns the pair: (B, [A,C,D]).
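This helper, named removeByName in the rest of the chapter, could be sketched as follows (assumed form):
Scala
// inside class Dir (sketch)
def removeByName(name: String): Option[(Node, List[Node])] =
  nodes.partition(_.name == name) match
    case (Nil, _) => None
    case (node :: _, others) => Some((node, others))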
Directories also define a method ls that lists their contents by name, and a method
lsFiles that only lists the names of files, ignoring subdirectories. You can implement
ls by using the higher-order method map to produce a list of strings from a list of nodes.
For lsFiles, you use flatMap to keep the names of files and skip subdirectories:2
Scala
// inside class Dir
def ls: List[String] = nodes.map(_.toString)

// lsFiles reconstructed from footnote 2: flatMap filters and maps in one step
def lsFiles: List[String] = nodes.flatMap {
  case file: File => Some(file.name)
  case _ => None
}
In addition to ls and lsFiles, you may want to build a string that represents
the entire contents of the directory, including subdirectories. To emphasize the tree
structure, file and directory names are indented according to their depth.3 Figure 11.1
shows a file system tree and its string representation.
[Figure 11.1: a file system tree (Root contains File1, Dir1, and Dir2; Dir1 contains File2 and Dir3; Dir2 contains File3) and its string representation:]
Root/
.File1
.Dir1/
..File2
..Dir3/
.Dir2/
..File3
To implement this method efficiently, you can create a string builder and use recur-
sion to add lines to it, one line for each file and directory in the tree. To achieve
indentation, each line starts with a prefix that consists of a repetition of a separator
(the separator is a single dot in Figure 11.1). A method mkString allocates a string
builder and an empty prefix and delegates the recursion to another method, also named
mkString, whose responsibility is to add lines to the builder:
Scala
// inside trait Node
def mkString(sep: String): String = mkString(sep, StringBuilder(), "").result()
def mkString(sep: String, builder: StringBuilder, prefix: String): StringBuilder
2 Method flatMap is used here to combine filtering (keep files, ignore directories) and mapping
(transform files into their names). Alternatively, lsFiles could be written in terms of filter and map.
3 This is similar to the tree command often found on Unix systems.
Scala
// inside class File
def mkString(sep: String, builder: StringBuilder, prefix: String): StringBuilder =
  builder ++= prefix ++= name += '\n'
Inside the Dir class, the method starts by adding a line for the directory itself. It
then proceeds by adding all the nodes (files and subdirectories) inside the directory. This
is achieved recursively by calling mkString on all the nodes, using a longer prefix and
the same string builder. You can perform these recursive calls by applying higher-order
method foreach on the list of nodes:
Scala
// inside class Dir
def mkString(sep: String, builder: StringBuilder, prefix: String): StringBuilder =
  builder ++= prefix ++= name ++= "/\n"
  val newPrefix = prefix + sep
  nodes.foreach(_.mkString(sep, builder, newPrefix))
  builder
Method mkString mixes higher-order functions and recursion: The action applied
by foreach invokes mkString, which uses foreach to invoke mkString, and so on. This
is a consequence of the fact that a node contains a list (hence the use of higher-order
functions) of nodes (hence the use of recursion). You will notice how this pattern is used
again in other tree functions in this chapter.
As mentioned earlier, method foreach makes sense only when the function being
applied has a side effect. Here, the side effect comes from modifying a mutable string
builder. Tree users, however, would call only the first mkString method—the only one
public in an actual implementation. From their standpoint, the function is pure and
produces a string from a tree, without any observable side effects.
A directory with no contents can be created through a factory method, defined in a companion object:
Scala
def apply(name: String): Dir = Dir(name, List.empty)
Then, you define methods mkFile and mkDir inside class Dir to add contents to an
existing directory. These methods take in a path as a list of names. They travel down
this path inside the tree and create a file or a directory at the end of the path. As a
design decision, I choose to create missing directories along the path. However, if a file
exists where a directory is needed, no further travel on the path is possible and creation
fails.
To avoid code duplication, mkFile and mkDir rely on the same method mkPath. In
addition to a path, this method takes an optional file name. It always creates the path—
and thus can be used to implement mkDir—and optionally creates a file at the end of
the path, so as to implement mkFile:
Scala
// inside class Dir
def mkPath(path: List[String], filename: Option[String]): Dir = path match
  case Nil =>
    filename match
      case None => this
      case Some(name) =>
        if nodes.exists(_.name == name) then
          throw FileSystemException(name, "cannot create file: node exists")
        else Dir(this.name, File(name) :: nodes)
  // second case reconstructed from the description that follows
  case dirname :: more =>
    removeByName(dirname) match
      case None =>
        Dir(this.name, Dir(dirname).mkPath(more, filename) :: nodes)
      case Some((node, otherNodes)) =>
        Dir(this.name, node.mkPath(more, filename) :: otherNodes)
The first case, Nil, corresponds to reaching the end of the path, where a file might
be added. If no file name is supplied, return the tree this as is. Otherwise, use higher-
order method exists to check whether a node already exists with the given name. If
so, the file cannot be created—throw an exception. Otherwise, replace this with a new
directory, using the same name, but with a new file added at the front of the list of
nodes.
The second case deals with the general path traversal. Take the first name in the
path, dirname, and use helper method removeByName to extract the current node
by that name, if it exists. If no such node is found (case None), create a subdirec-
tory (Dir(dirname)), add the remainder of the path (more) to it, and add this new
directory to the front of the current list of nodes. If instead the name corresponds
to an existing node (case Some), add the remainder of the path to it: node becomes
node.mkPath(more,filename) and is reinserted in the list of children.
You need to think functionally to follow this code. Trees are immutable, and node
modification is achieved by replacing nodes with new nodes to create new trees. For
instance, when adding a file, node this—which is Dir(this.name, nodes)—is replaced
with a new tree Dir(this.name, File(name) :: nodes). In the same way, you add a
new subdirectory by first creating it as (Dir(dirname).mkPath(more, filename))—
call it newDir—and by replacing the current tree Dir(this.name, nodes) with a new
tree Dir(this.name, newDir :: nodes).
When an existing node needs to change, you remove it from the current list of
children—nodes becomes otherNodes—and replace it with a new node—node becomes
node.mkPath(more, filename). In that case, the new node is inserted at the front of
the list instead of where the original node was. This is for performance reasons: List
insertion away from the head is costly. Furthermore, in the common case where a newly
created file or directory is used immediately, this strategy makes it more easily reachable
with the next operation, since a list of nodes is always traversed from the front.
The recursive call node.mkPath(more, filename) may call mkPath on a directory
node, thus continuing the traversal/construction, or on a file node, in which case the
path being followed does not exist and cannot be created because its creation would
require replacing an existing file with a directory with the same name. Accordingly,
method mkPath in class File simply throws an exception:
Scala
// inside class File
def mkPath(path: List[String], filename: Option[String]) =
  throw FileSystemException(name, "cannot create dir: file exists")
Once method mkPath is defined, you can implement the public methods mkFile and
mkDir in terms of it:
Scala
// inside class Dir
// mkFile reconstructed from the description that follows
def mkFile(name: String, names: String*): Dir =
  val allNames = name :: names.toList
  mkPath(allNames.init, Some(allNames.last))
def mkDir(name: String, names: String*): Dir = mkPath(name :: names.toList, None)
Both methods use variable-length arguments and require at least one name to oper-
ate. In the case of mkDir, all the names are directories, and the path to be added is
name :: names.toList. In mkFile, the last name is the file to add. The path of di-
rectories starts with name, followed by all the strings in names, except the last: This is
allNames.init (init returns all the elements of a list, except the last). As an illustra-
tion, the file system from Figure 11.1 can now be created with the following expression:
Scala
Dir("Root")
.mkFile("Dir2", "File3")
.mkDir("Dir1", "Dir3")
.mkFile("Dir1", "File2")
.mkFile("File1")
You can remove files and directories from a tree by following a similar strategy.
Method rmPath follows a path to its last element and removes it from the tree, whether
it is a file or a directory. A first difference with mkPath is that, instead of creating
missing directories, rmPath can stop traversing a branch as soon as a directory is missing
(nothing to remove). Another difference is that you need to stop following the path one
level above the last element so you can remove this element from the list of nodes
of its parent. In other words, you remove a node by replacing Dir(name, list) with
Dir(name, shorterList), not by replacing this with “nothing.” This results in a
recursion that ends with a single-element list instead of an empty list:
Scala
// inside class Dir
def rmPath(path: List[String]): Dir = path match
  case Nil => this // empty path: nothing to remove (case assumed, for exhaustiveness)
  case nodename :: more =>
    removeByName(nodename) match
      case None => this
      case Some((node, otherNodes)) =>
        if more.isEmpty then Dir(this.name, otherNodes)
        else Dir(this.name, node.rmPath(more) :: otherNodes)
In class File, rmPath simply returns the file unchanged, since a file encountered along the path has nothing below it to remove:
Scala
// inside class File
def rmPath(path: List[String]) = this
11.5 Querying
Section 11.4 focused on building and modifying trees that represent file systems. Once
a tree is built, you can define additional methods to query it. Because directories are
implemented in terms of a list of nodes, many querying methods can rely on standard
higher-order functions on lists. For instance, you can calculate the total number of files
and directories in a system by folding:
Scala
// inside class Dir
def fileCount: Int = nodes.foldLeft(0)(_ + _.fileCount)
def dirCount: Int = nodes.foldLeft(1)(_ + _.dirCount)

// inside class File (counterparts assumed)
def fileCount: Int = 1
def dirCount: Int = 0
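The folding methods fileFold and dirFold, used in the following examples, are defined in a part of the chapter not shown here. A plausible sketch, assuming fileFold folds over all the files of a tree and dirFold over all its directories, including the root:
Scala
// inside class Dir (sketch; assumed form)
def fileFold[A](init: A)(f: (A, File) => A): A =
  nodes.foldLeft(init) { (acc, node) =>
    node match
      case file: File => f(acc, file)
      case dir: Dir => dir.fileFold(acc)(f)
  }

def dirFold[A](init: A)(f: (A, Dir) => A): A =
  nodes.foldLeft(f(init, this)) { (acc, node) =>
    node match
      case dir: Dir => dir.dirFold(acc)(f)
      case _ => acc
  }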
Scala
def fileCount(dir: Dir): Int = dir.fileFold(0)((acc, _) => acc + 1)
def dirCount(dir: Dir): Int = dir.dirFold(0)((acc, _) => acc + 1)
You can also use fileFold to find the longest file name in a tree:5
Scala
def longestFilename(dir: Dir): String = dir.fileFold("") { (longest, file) =>
  if file.name.length > longest.length then file.name else longest
}
Likewise, you can use dirFold to build a list of all the file names in a file system:
Scala
def allFileNames(dir: Dir): List[String] =
  dir.dirFold(List.empty[String])((list, subdir) => subdir.lsFiles ::: list)
Be careful, however: Folding functions are not well suited to implementing searches
because they always traverse the entire tree,6 even after an element has been found. For
instance, this implementation of a fileFind method is undesirable:
Scala
// DON'T DO THIS!
// inside class Dir; a fold-based search (sketch consistent with the footnotes)
def fileFind(test: File => Boolean): Option[File] =
  fileFold(Option.empty[File]) { (found, file) =>
    found.orElse(if test(file) then Some(file) else None)
  }
The drawback of this method is that it continues to traverse the tree after a suitable
file has been found.7
To make fileFind stop once a file has been found, while keeping a functional
programming style, is not completely straightforward. Given that we implemented
fileExists by using exists on a list and fileFold by using foldLeft on a list,
you might be tempted to implement fileFind by using find on a list. The problem is
that applying find to a list of subtrees will produce a tree (if any) in which the desired
file can be found, but not the file itself. You would need to search this tree again, using
fileFind, to extract the file:
5 The body of the function could also be written as Seq(file.name, longest).maxBy(_.length).
6 Short of throwing an exception. Throwing (and catching) an exception is sometimes used as a
technique to prematurely terminate a folding computation.
7 Even though all the files are visited, this implementation stops applying the test function after
a file has been found. This is because, when orElse is applied to a non-empty option, the expression
used as its argument is not evaluated. This phenomenon is known as lazy or delayed evaluation, and is
explored in Chapter 12.
Scala
// DON'T DO THIS!
// inside class Dir; reconstructed from the description that follows
def fileFind(test: File => Boolean): Option[File] =
  nodes.find(_.fileFind(test).nonEmpty).map(node => node.fileFind(test).get)
This implementation uses find to search for a node that contains the desired file. If
found, map is invoked on the option to apply fileFind on this node to get the actual
file. This is undesirable because the node inside the option has already been searched
for the file and indeed contains it (the call to fileFind is guaranteed to succeed, hence
the use of get to get the actual file). The node is thus searched twice and, because
of the recursive nature of fileFind, inner nodes end up being searched multiple times
(the deeper the node, the more times it is searched).
Instead of using find to locate a subtree that contains a suitable file, a better
strategy is to extract the desired file from each subtree by using the
method map of lists. The expression nodes.map(_.fileFind(test)) has type
List[Option[File]] and contains an acceptable file (if any) for each subtree. Method
find can then be used to find a non-empty option from the list. The expression
nodes.map(_.fileFind(test)).find(_.nonEmpty) produces the desired file, but as
an Option[Option[File]], which can be flattened:
Scala
// DON'T DO THIS!
// inside class Dir (sketch): every node is searched, even after a file has been found
def fileFind(test: File => Boolean): Option[File] =
  nodes.map(_.fileFind(test)).find(_.nonEmpty).flatten
The problem with this approach is that map will process the entire list of nodes no
matter what, and all the nodes inside a directory are searched, even after a file has
been found. The standard technique to avoid this issue is to introduce a form of lazy
evaluation:
Scala
// OK, but can be improved
// inside class Dir (sketch): the view stops the search at the first file found
def fileFind(test: File => Boolean): Option[File] =
  nodes.view.map(_.fileFind(test)).find(_.nonEmpty).flatten
By calling higher-order method map on nodes.view instead of nodes, you prevent nodes
from being explored after a file has been found. Lazy evaluation is discussed in detail
in Chapter 12. For now, it is enough to know that the implementation in the example
stops when the first file is found (or traverses the entire tree if no file can be found).
You can replace the map/flatten combination with a call to flatMap to produce all
the files unwrapped instead of being inside options. In the end, the desired implemen-
tations of fileFind and dirFind are as follows:8
Scala
// inside class Dir
def fileFind(test: File => Boolean): Option[File] =
  nodes.view.flatMap(_.fileFind(test)).headOption

// dirFind follows the same pattern (exact form assumed)
def dirFind(test: Dir => Boolean): Option[Dir] =
  if test(this) then Some(this)
  else nodes.view.flatMap(_.dirFind(test)).headOption
11.6 Navigation
A last set of methods in classes Dir and File deal with navigation inside a file system.
You can define a method cd to enter subdirectories, which mimics the Unix command
by the same name:
Scala
// inside class Dir
def cd(dirname: String, dirnames: String*): Dir = // wrapper; signature assumed
  cdPath(dirname :: dirnames.toList)

def cdPath(path: List[String]): Dir = path match
  case Nil => this
  case dirname :: more =>
    nodes
      .find(_.name == dirname)
      .getOrElse(throw FileSystemException(dirname, "cannot change: no such dir"))
      .cdPath(more)
8 Instead of if test(this) then Some(this) else None, the body of fileFind in class File could
be written Some(this).filter(test). There are other places in the code where filter could also be
applied to options. Although this would illustrate the use of higher-order method filter, some code
readability might be lost. I chose to stick with if-then-else for clarity.
As was done earlier with mkFile, mkDir, and rm, method cd is implemented in terms
of a helper method cdPath that traverses a list of directories, specified by name. If
the list of names is empty, return the current directory. Otherwise, use find to find a
subdirectory with the specified name, if any. Using getOrElse, extract the directory
from the resulting option or throw an exception if the option is empty. The remainder
of the path is then applied to the subdirectory, recursively. If, at any point, the path
encounters a file instead of a directory, an exception is thrown.
Notice that the first call to method up, for instance, cannot simply go back to the tree
on which down("Dir2") was invoked, because this tree does not contain File3. In other
words, up does not necessarily bring you back to a previous tree, but also needs, in some
cases, to create a new tree.
A viable approach is to modify the tree type into a zipper. Section 5.6 implemented
a zipper to navigate lists leftward and rightward. The same idea can be used to move
up and down a tree.9 Zippers are immutable. A tree zipper combines a subtree with
the sequence of steps that led to that subtree. Those steps can be reapplied, in reverse
order, to go up. In contrast to returning to an existing tree, applying a step builds a
new tree, which contains updated subtrees that reflect modifications to the file system.
In the preceding code, the call to nav changes the type of the file system from a tree to
a tree zipper, which implements navigation via its up and down methods. The final call
to dir brings you back to a regular tree.
To go down a subdirectory, you create a step that contains the name of the parent
directory, as well as a list of all the other nodes in that directory. For instance, going
down into Dir1 in the file system from Figure 11.1 returns the subtree rooted at Dir1
but also creates a step that contains the name Root and a list of subtrees other than
Dir1:
[Figure: the original tree, with the step name Root labeled 1, the sibling subtrees File1 and Dir2 labeled 2a and 2b, and the subtree rooted at Dir1 (containing File2 and Dir3) labeled 3.]
You can see that from the subtree (labeled 3), the step name (labeled 1), and the step
trees (labeled 2a and 2b), the previous tree can be reconstructed to go up. If the subtree
has been changed—that is, replaced with a different tree—going up by reassembling will
produce a different tree (same 1 and 2 parts, but a different subtree 3).
The zipper is implemented in a class DirNav:
Scala
final class DirNav(val dir: Dir, steps: List[(String, List[Node])]):
...
A zipper consists of a current directory (dir) and a list of down steps that led to it
(steps). Each step is a pair that contains the name of the parent node as a String,
and a list of the siblings of the current directory, of type List[Node]. The list of steps
is empty when the current directory is the root of the file system.
The zipper implements two methods up and down. To move down to a subdirectory,
method down produces the same subdirectory as the earlier cd method, but adds a step
to the list of steps so the downward move can be reversed by method up to go back up:
9 This section considers only up and down movements. A more complex zipper could also be written to support sideways moves among siblings.
Scala
// inside class DirNav
def down(dirname: String): DirNav =
  dir.removeByName(dirname) match
    case None =>
      throw FileSystemException(dirname, "cannot change: no such directory")
    case Some((file: File, _)) =>
      throw FileSystemException(dirname, "cannot change: not a directory")
    case Some((subdir: Dir, otherNodes)) =>
      DirNav(subdir, (dir.name, otherNodes) :: steps)
To go down, search the target subdirectory by name and remove it from the list of
nodes. If it is not found, or if a file is found by that name instead, throw an exception.
Otherwise, if the subdirectory exists (subdir), create a zipper with it by adding a new
step. This step contains the name of the parent directory (dir.name) and all the siblings
of subdir (otherNodes). The new step is added at the front of the list of steps, which is
used like a stack. To go back up, method up pops the first step from the stack and uses
its name and list of nodes to construct a new tree Dir(name, dir :: nodes), which is
wrapped in a zipper with the remaining steps.
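A sketch of up, following this description (its behavior at the root is assumed):
Scala
// inside class DirNav (sketch)
def up: DirNav = steps match
  case Nil => this // already at the root (behavior assumed)
  case (name, nodes) :: rest => DirNav(Dir(name, dir :: nodes), rest)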
For convenience, you can also implement a method to go down multiple times and
a method to go back to the root of the tree:
Scala
// inside class DirNav
def down(dirname1: String, dirname2: String, dirnames: String*): DirNav =
  (dirname1 :: dirname2 :: dirnames.toList).foldLeft(this)(_.down(_))

@tailrec
def top: DirNav = if steps.isEmpty then this else up.top
To go down a path made of multiple subdirectories, you use foldLeft with the current
directory as the initial value to invoke the single-name variant of method down on all the
names in the path, one by one. To move all the way to the top, method top repeatedly
applies up until the root of the tree is reached.
All the other methods of class DirNav are implemented by forwarding calls to the
corresponding method in class Dir, except that creation methods (adding and removing
files and subdirectories) need to return a navigable directory instead of a plain directory:
Scala
// inside class DirNav
override def toString = dir.toString
export dir.{
mkString, ls, lsFiles,
dirCount, dirExists, dirFold, dirFind, fileCount, fileExists, fileFold, fileFind
}
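The creation methods themselves can be wrapped so that they return a navigable directory (a sketch for mkFile; mkDir and rm would follow the same pattern):
Scala
// inside class DirNav (sketch)
def mkFile(name: String, names: String*): DirNav =
  DirNav(dir.mkFile(name, names*), steps)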
Finally, a method nav is added to class Dir to make a plain directory navigable:
Scala
// inside class Dir
def nav: DirNav = DirNav(this, List.empty)
11.8 Summary
In this chapter, a file system is represented as a tree of files and directories. The contents
of a directory are stored as a list of nodes, and the higher-order methods of lists are used
to process these nodes effectively. The tree also defines its own higher-order methods.
They are implemented recursively, and recursive calls on subdirectories are performed
using the higher-order methods of lists on a directory’s contents. Finally, trees are
extended into zippers to become navigable, with methods to go down a subdirectory
and back up to the parent directory. In all (including the many footnotes), this file
system case study uses and/or reimplements standard higher-order methods exists,
find, flatMap, foldLeft, foreach, filter, map, maxBy, and partition.
Chapter 12
Lazy Evaluation
By using functions as values, it becomes possible to replace data with a function that
computes this data. With this approach, the evaluation of arguments can be delayed
until they are needed, making it possible to pass unevaluated code to methods and
functions. Combined with syntactic help from the language, it becomes possible to define
additional control structures from within a programming language, leading potentially
to the definition of an internal domain-specific language. Functions can also be stored
within data structures to delay the evaluation of parts of that structure. Streams are
a classic functional programming structure that implements lazily evaluated sequences.
Besides unevaluated arguments and lazy data structures, lazy evaluation can take a
few other forms, such as views that delay the evaluation of higher-order functions and
mechanisms for the lazy initialization of variables.
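Consider, for instance, logging with java.util.logging, where the message string is passed eagerly (a reconstruction of the call discussed next; logger and ip are assumed to be a Logger and an InetAddress):
Scala
// The message string is built before info is called, whether or not it is logged.
logger.info(s"incoming request from ${ip.getHostName} (${ip.getHostAddress})")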
A drawback of this method is that the string argument is always built, even when
logging at the INFO level is disabled. In this example, if logging is set at the WARNING
level, a potentially costly hostname lookup still takes place to create a string that is
immediately discarded.
Java 8 introduced a second info method to the Logger class. It replaces the string
argument with a function that returns a string:
Scala
logger.info(() => s"incoming request from ${ip.getHostName} (${ip.getHostAddress})")
You can invoke this info method just as easily as the previous one by using a lambda
expression for the function argument. The preceding expression does not perform a
hostname lookup. A lookup will happen only within the logger and only if logging has
been set to a level that includes INFO messages.
You can define your own methods that use function arguments to replace explicit
values that may not be needed. For instance, the Java Properties class defines a method
for property lookup with a second argument that specifies a default value if a key is not
found:
Scala
val properties: Properties = System.getProperties
properties.getProperty("hostname", "unknown")
This is sufficient when the default value is a constant, like "unknown". However, if
calculating the default value is expensive, this computation is wasted if a property key is
already associated with a value. There is no getProperty method that takes a function
as its second argument. However, you can easily add one as an extension by relying
on a single-argument getProperty method that returns null when a key is not found:
Scala
extension (properties: Properties)
  def getProperty(key: String, fallback: () => String): String =
    val prop = properties.getProperty(key)
    if prop ne null then prop else fallback()
The default value is specified as a fallback function, which is invoked only if the
property is not found. With this extension in scope, you can write
Scala
properties.getProperty("hostname", () => InetAddress.getLocalHost.getHostName)
without incurring any additional cost if the hostname property is already defined.
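By contrast, consider filling a list with random numbers using Scala's List.fill, whose second argument is evaluated once per element (rand is assumed to be a random generator):
Scala
val rand = scala.util.Random()
List.fill(4)(rand.nextInt(100)) // four independent draws, e.g., List(30, 59, 6, 3)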
If you are coming from languages like Java or C, this may look surprising. Indeed,
method fill from java.util.Collections behaves differently:
Java
List<Integer> numbers = ...
Collections.fill(numbers, rand.nextInt(100)); // sets the list to 30, 30, 30, 30, ...
In the Java variant, the expression rand.nextInt(100) produces the value 30, and
the list is filled with this value. In the Scala code, however, fill behaves as if the
thunk () => rand.nextInt(100) had been passed as its argument. This is because the
argument to Scala’s fill is passed by name, unevaluated. You could implement your
own fill function as follows:
Scala
def fill[A](len: Int)(value: => A): List[A] =
  val buffer = List.newBuilder[A]
  var i = 0
  while i < len do
    buffer += value
    i += 1
  buffer.result()
Scala
// inside class Option:
def getOrElse[B >: A](default: => B): B =
  if isEmpty then default else get

def orElse[B >: A](alternative: => Option[B]): Option[B] =
  if isEmpty then alternative else this
The alternative and default arguments are passed by name and are evaluated only
if the option is empty.
The getProperty extension could rely on getOrElse for its implementation:
Scala
def getProperty(key: String, fallback: () => String): String =
  Option(properties.getProperty(key)).getOrElse(fallback())
The predefined function Option wraps values inside options, mapping null to None. (For
an illustration of orElse in a branching algorithm, see Section 12.11.)
Similar to orElse and getOrElse, mutable maps have getOrElseUpdate(k,v), a
method that retrieves the value associated with key k or, if the key is not present,
updates the map with a new pair (k,v). The argument v is passed by name: If the key
is present, v is not needed and will not be evaluated. You can use getOrElseUpdate to
simplify the implementation of the higher-order function memo from Listing 9.6:
Scala
def memo[A, B](f: A => B): A => B =
  val store = mutable.Map.empty[A, B]
  x => store.getOrElseUpdate(x, f(x))
Listing 12.2: Memoization using a by-name argument, improved from Lis. 9.6.
As before, the function starts with a lookup in the map. If x is found in the map,
getOrElseUpdate does not evaluate its second argument. Only if the key is not found
will function f be called on x.
Java
<A> void writeToFile(Path file, Iterable<A> values) throws IOException {
  try (var out = new ObjectOutputStream(Files.newOutputStream(file))) {
    for (var item : values) out.writeObject(item);
  }
}
This function writes a collection of objects into a file. The try construct guarantees
that the file is closed after writing, even in the presence of exceptions. You can write a
similar method in Scala:
Scala
def writeToFile[A](file: Path, values: Iterable[A]): Unit =
  Using.resource(ObjectOutputStream(Files.newOutputStream(file))) { out =>
    for item <- values do out.writeObject(item)
  }
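In the same spirit, you can write a small construct that times the evaluation of arbitrary code. A minimal sketch of such a timeOf function, assuming nanosecond-based timing:
Scala
def timeOf[U](code: => U): Double =
  val start = System.nanoTime()
  code // evaluate the by-name argument
  (System.nanoTime() - start) / 1e9 // elapsed time, in seconds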
This function takes a by-name argument, keeps track of the amount of time needed
to evaluate it, and returns the duration in seconds. You can use it without explicitly
creating a thunk function:
Scala
val seconds: Double = timeOf {
  InetAddress.getLocalHost.getHostName // or any code for which you want the duration
}
The popular Scala library Scalactic defines a times construct, used as follows:
Scala
3 times {
  println("Beetlejuice!")
}
Although times looks like a keyword and a language construct, it is actually defined as
a regular function. You could implement it as an extension method:
Scala
extension (count: Int)
  infix def times[U](code: => U): Unit =
    var n = count
    while n > 0 do
      code
      n -= 1
In the same style, you can define a repeat-until construct, used as follows:
Scala
var n = 3
repeat {
  println("Beetlejuice!")
  n -= 1
} until (n == 0)
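The construct itself can be implemented with by-name arguments and a small helper class (a sketch; this exact shape is assumed):
Scala
// A repeat-until construct (hypothetical implementation): the body runs at
// least once, then again until the condition becomes true.
class Until(code: => Unit):
  @scala.annotation.tailrec
  final infix def until(cond: => Boolean): Unit =
    code
    if !cond then until(cond)

def repeat(code: => Unit): Until = Until(code)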
Methods and objects are carefully named it, should, in, a, be, and thrownBy so that
testing code flows naturally, and suitable error messages are produced when a test fails.
2 This example is taken quasi-verbatim from the Scalatest website.
A drawback of this approach is that the entire list of data has to be produced before
it can be searched, possibly resulting in situations where the list is large and only the
first few elements are needed for the search. As an alternative, you could use a lazily
evaluated argument:
Scala
def searchData(data: => List[Data]): Option[Data] = ...
But that does not help at all, because the entire list still has to be evaluated before the
search starts. What you want instead is to delay the evaluation of each list element.
You could do it with thunks:
Scala
def searchData(data: List[() => Data]): Option[Data] = ...
However, this is far from ideal because all the thunks need to be created before the
search starts. With this approach, you may not have to produce all the actual data, but
you still need to create as many thunk functions as there are potential pieces of data.
What you really want is to avoid creating anything beyond the data elements needed
for the search to complete. In Scala, you achieve this with a searching method with the
following signature:
Scala
def searchData(data: LazyList[Data]): Option[Data] = ...
LazyList is a type of linear sequence—like a list—in which elements are lazily evaluated.
Such sequences are often called streams. Scala streams are immutable. They are also
memoized:3 Once computed, each element is stored in the stream.
You can often use streams like regular lists. For instance, to search for the first piece
of data with a value greater than 10, you can define a tail recursive function:
3 This means that mutation actually occurs within streams, to replace unevaluated elements with ac-
tual values. As a sequence, though, streams are conceptually immutable and a stream always represents
the same sequence.
Scala
def searchData(data: LazyList[Data]): Option[Data] =
  if data.isEmpty then None
  else if data.head.value > 10 then Some(data.head)
  else searchData(data.tail)
Except for the method’s signature, this code is identical to a variant that uses lists. Note
that, due to memoization, the head of the stream is evaluated only once, even though
data.head is used twice in the code. Furthermore, if this first piece of data has a value
greater than 10, no other stream element is evaluated.
LazyList supports pattern matching, and you can also write the recursive searching
function as follows:
Scala
def searchData(data: LazyList[Data]): Option[Data] = data match
  case LazyList() => None
  case head #:: tail => if head.value > 10 then Some(head) else searchData(tail)
This implementation is similar to pattern matching and recursion on lists. The only
differences are the empty stream pattern—LazyList() instead of Nil—and the “cons”
operator—named “#::” on streams instead of “::” on lists. For this simple searching
task, though, your best strategy is to rely on the standard higher-order methods defined
on the LazyList type:
Scala
def searchData(data: LazyList[Data]): Option[Data] = data.find(_.value > 10)
Usually, list-creating code can be easily modified to produce streams instead. For
instance, the hanoi function, written in Chapter 6 to illustrate recursion, simply prints
each move as a string. To display moves graphically, it could be inconvenient to incor-
porate graphics code directly inside the solving function. Instead, you can separate the
creation of moves from their display by having a solving function create a list of moves
and a displaying function consume it:
Scala
def hanoi[A](n: Int, from: A, mid: A, to: A): List[(A, A)] =
  if n == 0 then List.empty
  else hanoi(n - 1, from, to, mid) ::: (from, to) :: hanoi(n - 1, mid, from, to)
Listing 12.6: Tower of Hanoi moves as a list; contrast with Lis. 12.7.
Instead of printing them, this function returns the moves as a list of pairs (from, to).
If you want a stream instead of a list, the necessary changes to the code are trivial:
Scala
def hanoi[A](n: Int, from: A, mid: A, to: A): LazyList[(A, A)] =
  if n == 0 then LazyList.empty
  else hanoi(n - 1, from, to, mid) #::: (from, to) #:: hanoi(n - 1, mid, from, to)
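This stream-producing variant corresponds to Listing 12.7, referenced later in the chapter. The definitions discussed in the next paragraph are not included in this excerpt; a plausible reconstruction, assuming a 100-disk tower and index-based access:

Scala
val allMoves = hanoi(100, 'A', 'B', 'C')
val oneMove = allMoves(999)    // the 1000th move
val anotherMove = allMoves(49) // the 50th move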
Note that allMoves is a stream of 2^100 − 1 elements, a number on the order of 10^30.
Obviously, such a collection would not fit in memory with a list-based implementation.
The computation of oneMove triggers the evaluation of the first 1000 elements of the
stream. The evaluation of anotherMove does not involve any additional calculation:
The 50th move, already in the stream, is retrieved.
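The next example, a longWords function built as a pipeline of higher-order functions, is not shown in this excerpt. A plausible list-based version, assuming a clean function that normalizes a word:

Scala
def longWords(lines: List[String], min: Int): List[String] =
  lines.flatMap(_.split(' ')).map(clean).filter(_.length >= min)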
Since the function’s first argument is a list of lines, and splitting each line produces a list of words, flatMap is used to avoid nested lists. Words are then cleaned, and only the long words are included in the final result.
This code produces the desired output, but suffers from a minor inefficiency: Each
stage of the computation produces a list—first a list of words, then a list of clean words,
and finally the desired list of long words. These intermediate lists have to be allocated
in memory, then garbage-collected, which is not free.
In Section 10.4, we discussed a possible approach to avoid this problem: Replace a
pipeline of higher-order functions with a single fold. You might recall that this change
did not improve code readability. In the case of longWords, it is even worse:
Scala
def longWords(lines: List[String], min: Int): List[String] =
  lines.foldRight(List.empty[String]) { (line, words) =>
    line.split(' ').foldRight(words) { (word, moreWords) =>
      val cleanWord = clean(word)
      if cleanWord.length >= min then cleanWord :: moreWords else moreWords
    }
  }
Don’t bother trying to understand this function. It is here to show you that the problem
can indeed be solved by using foldRight, but the resulting code is difficult to read—
certainly much harder than the flatMap/map/filter variant.
A better strategy that avoids the intermediate lists is to keep using flatMap, map,
and filter, but on streams instead of lists:
Scala
def longWords(lines: List[String], min: Int): List[String] =
  lines.to(LazyList)
    .flatMap(_.split(' '))
    .map(clean)
    .filter(_.length >= min)
    .toList
The initial list of lines is converted into a stream. Because streams are lazily evaluated,
calls to flatMap, map, and filter do not trigger any computation, and in particular,
do not create intermediate lists. The final call to toList forces an evaluation of the last
stream and produces the desired list without allocating any intermediate list.
This approach is particularly popular in Java, where the Stream class implements
all the necessary higher-order methods but collections like List do not. You would
write the longWords function in Java as follows:
Java
List<String> longWords(List<String> lines, int min) {
  return lines.stream()
      .flatMap(line -> line.replace(' ', '\n').lines())
      .map(word -> clean(word))
      .filter(word -> word.length() >= min)
      .toList();
}
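Streams, unlike lists, can even be infinite. The countUp function discussed next is not shown in this excerpt; given the naturals stream defined at the end of this section, it is presumably:

Scala
def countUp(from: Int): LazyList[Int] = from #:: countUp(from + 1)

val naturals = countUp(0)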
At first sight, function countUp makes no sense. It seems to violate one of the
principles of a good recursive function—namely, to terminate, a function should always
define at least one non-recursive branch. Here, countUp always invokes countUp. Indeed,
if you write this function using lists, it will never terminate. On streams, however, the
definition is valid. It produces a never-ending stream of numbers. Of course, any attempt
to evaluate the entire stream—for example, naturals.toList—would result in a non-
terminating computation.
You can sometimes use infinite streams to replace mutable data with values of a
more functional flavor. For instance, a (mutable) pseudo-random number generator can
be replaced with an infinite stream of pseudo-random numbers:
Scala
def randomNumbers: LazyList[Float] = Random.nextFloat() #:: randomNumbers
Infinite streams are often created using generative higher-order functions. For exam-
ple, the naturals and randomNumbers streams can be created in this way:
Scala
val naturals = LazyList.iterate(0)(_ + 1)
val randomNumbers = LazyList.continually(Random.nextFloat())
12.8 Iterators
The “3n + 1” problem defines the following sequence: If a natural number n is even,
its successor is n ÷ 2; otherwise, it is 3n + 1. This sequence, sometimes known as the
Collatz sequence, is conjectured to always reach 1 eventually, for any starting natural
number. For instance, if you start with 27, the sequence produces 111 numbers before
reaching 1 for the first time (and then continues forever as 4, 2, 1, 4, 2, 1, . . . ):
27 82 41 124 62 31 94 47 142 71 214 107 322 161 484 242 121 364 182 91 274 137
412 206 103 310 155 466 233 700 350 175 526 263 790 395 1186 593 1780 890 445
1336 668 334 167 502 251 754 377 1132 566 283 850 425 1276 638 319 958 479 1438
719 2158 1079 3238 1619 4858 2429 7288 3644 1822 911 2734 1367 4102 2051 6154
3077 9232 4616 2308 1154 577 1732 866 433 1300 650 325 976 488 244 122 61 184
92 46 23 70 35 106 53 160 80 40 20 10 5 16 8 4 2 1
You can write a function to calculate the number of steps needed to reach 1. An
imperative implementation could be:
Scala
def collatz(start: BigInt): Int =
  var count = 0
  var n = start
  while n != 1 do
    n = if n % 2 == 0 then n / 2 else 3 * n + 1
    count += 1
  count

collatz(27) // 111
collatz(BigInt("992774507105260663893249807781832616822016143650134730933270")) // 2632
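A more functional variant, discussed next, builds the sequence as a stream; its code is not included in this excerpt, but it presumably mirrors the iterator-based version shown below:

Scala
def collatz(start: BigInt): Int =
  LazyList
    .iterate(start)(n => if n % 2 == 0 then n / 2 else 3 * n + 1)
    .takeWhile(_ != 1)
    .length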
This implementation builds the sequence explicitly, as an infinite stream, then uses
takeWhile to look for the first occurrence of the number 1. The length of the sequence
from the starting number to 1 is the desired output of the function.
Although it produces the correct output for the two previous examples, this function
is inefficient and would actually fail on larger numbers. The reason is that the entire list
of numbers is allocated in memory—2632 numbers in the second test—before calculating
the length. Fortunately, there is an easy fix: Replace the lazy list with an iterator. Scala’s
Iterator type implements many standard higher-order methods, including iterate and
takeWhile:
Scala
def collatz(start: BigInt): Int =
  Iterator
    .iterate(start)(n => if n % 2 == 0 then n / 2 else 3 * n + 1)
    .takeWhile(_ != 1)
    .length
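Iterators can replace streams in other examples as well. The moves value discussed next is not defined in this excerpt; presumably, it is built from the stream-producing hanoi of Listing 12.7:

Scala
val moves = hanoi(100, 'A', 'B', 'C').iterator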
Value moves is an iterator, which can be used to retrieve the moves one by one, with
no actual sequence of moves expanded in memory. The iterator method is correctly
implemented in the LazyList class to not keep a reference to the head of the stream,
thus avoiding memory leaks.
As a last example of introducing laziness via iterators, consider the problem of
searching text files for lines that match a given criterion. First, consider the (unwieldy)
imperative implementation:
Scala
def read(file: Path): List[String] = ...
A function read is used to load the contents of a file into a list of lines. An outer loop traverses all the files. For each file, an inner loop examines all the lines, stopping as soon as a suitable line is found in any file.
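The imperative implementation itself is not shown in this excerpt; a sketch of what it could look like:

Scala
def searchFiles(files: List[Path], lineTest: String => Boolean): Option[String] =
  var found = Option.empty[String]
  var remainingFiles = files
  while found.isEmpty && remainingFiles.nonEmpty do
    val lines = read(remainingFiles.head)
    remainingFiles = remainingFiles.tail
    var i = 0
    while found.isEmpty && i < lines.length do
      if lineTest(lines(i)) then found = Some(lines(i))
      i += 1
  found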
The functional equivalent is remarkably simpler. First, you use flatMap to flatten
all the files into a single sequence of lines, on which you apply find to search for the
desired line:
Scala
// DON'T DO THIS!
def searchFiles(files: List[Path], lineTest: String => Boolean): Option[String] =
  files.flatMap(read).find(lineTest)
This implementation looks nice, but it actually opens and reads all the files, no
matter what. By contrast, the imperative program opens files only until a suitable
line is found; the remaining files are left untouched. To achieve this in a functional
programming style, use lazy evaluation:
Scala
def searchFiles(files: List[Path], lineTest: String => Boolean): Option[String] =
  files.iterator.flatMap(read).find(lineTest)
The call to method flatMap on an iterator produces another iterator; no files are opened yet. Method find searches this iterator by opening the files one by one, until a line is found. The function now performs the same computation as the imperative program. Iterators, however, can be traversed only once, and become unusable once consumed:
Scala
val iter1 = List(1, 2, 3).iterator
iter1.find(_ > 1) // Some(2)
iter1.next() // invalid call
A call to find invalidates the iterator. You cannot invoke any method on the iterator after that. Similarly, after you use method map, you need to rely on the new iterator that is returned; the old iterator on which map was applied is no longer valid.
In consequence, iterators are well suited to pipelined computations, such as in func-
tions collatz and searchFiles. For other uses, you should exercise extreme caution. If
you want to avoid wasted memory due to memoization but need something that is more
reusable than an iterator, other types are sometimes available. In Scala, for instance,
views behave like non-memoized streams, without the drawbacks of iterators.5
Let’s illustrate the differences between lists, streams, and views with a simple example. Assume a function times10 that multiplies its input by 10, but also prints multiplying on the terminal each time it is invoked—to keep track of which operations are performed lazily. Such a function could be written as follows (a trivial sketch):
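Scala
def times10(n: Int): Int =
  println("multiplying")
  n * 10

First, consider the case of plain lists: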
Scala
val list = List(1, 2, 3).map(times10)
println(list)
println(list.head)
println(list.last)
println(list.head)
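The output, not shown in this excerpt, would be:

multiplying
multiplying
multiplying
List(10, 20, 30)
10
30
10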
The call to method map eagerly applies function times10 on all the list elements before
the list is even displayed. As the list is queried, there is no further computation.
5 The details vary from language to language. For instance, Java’s Stream type is not memoized,
defines many methods that consume the stream, and is closer to Scala’s Iterator than Scala’s LazyList.
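The stream-based version of the experiment is not shown in this excerpt; given the output that follows, it is presumably:

Scala
val stream = List(1, 2, 3).to(LazyList).map(times10)
println(stream)
println(stream.head)
println(stream.last)
println(stream.head)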
LazyList(<not computed>)
multiplying
10
multiplying
multiplying
30
10
The stream is displayed first as not yet computed. Function times10 is not called at all
when method map is invoked. When the first element of the stream is needed, a single
call to times10 takes place and value 10 is printed. Querying the last value of the stream
triggers an evaluation of all its remaining elements, resulting in two calls to times10
before 30 is displayed. After that, if the head of the stream is queried again, no further
computation takes place, due to memoization.
Views behave differently from streams:
Scala
val view = List(1, 2, 3).view.map(times10)
println(view)
println(view.head)
println(view.last)
println(view.head)
SeqView(<not computed>)
multiplying
10
multiplying
multiplying
multiplying
30
multiplying
10
As with streams, the call to map does not invoke times10 at all, and the view is first dis-
played unevaluated. Displaying the first element triggers one call to times10. Displaying
the last element triggers three calls to times10—instead of two with streams—as the
head of the view is reevaluated (no memoization). For the same reason, displaying
the head one more time requires another call to function times10.
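Lazy evaluation also helps with initialization: an expensive object can be created only if and when it is first used. The manual version of this pattern is not shown in this excerpt; a classic sketch, assuming an ExpensiveType class with a create factory method:

Scala
class SomeClass:
  private var theValue: ExpensiveType = null

  def value: ExpensiveType =
    if theValue == null then theValue = ExpensiveType.create()
    theValue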
Users retrieve the object via the method value. The first time this method is called, the
object is created and stored in field theValue. Further calls simply return the stored ob-
ject. If method value is never called, the object is never created.
Some languages define mechanisms that facilitate the implementation of this pattern.
In Scala, for instance, you can declare a val field to be lazy:
Scala
class SomeClass:
  lazy val value: ExpensiveType = ExpensiveType.create()
This will trigger a single evaluation of ExpensiveType.create the first time value is
accessed, if any. If the field is never used, the object is not created. One way you can
think of it is that a lazy val behaves like a one-value stream—delayed evaluation and
memoization.6
Although it is less common, lazy initialization can also be applied to local variables.
Consider, for instance, the hanoi function from Listing 12.7, rewritten here to introduce
two local variables before and after:
Scala
def hanoi[A](n: Int, from: A, mid: A, to: A): LazyList[(A, A)] =
  if n == 0 then LazyList.empty
  else
    lazy val before = hanoi(n - 1, from, to, mid)
    lazy val after = hanoi(n - 1, mid, from, to)
    before #::: (from, to) #:: after
6 As an added benefit, lazy val is thread-safe: Even if multiple threads access value at the same
time, the expensive object is guaranteed to be created only once. This is notoriously not true of the
previous variant. If you don’t need it, this thread-safety can be turned off for increased performance.
Without lazy, the streams before and after would be fully evaluated before “#:::”
and “#::” can work their magic, and laziness would be entirely lost.
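12.11 Illustration: Subset-Sum

Given a list of integers and a target value, the subset-sum problem asks for a subset of the numbers that adds up to the target. This section's introduction, including the findSum function of Listing 12.10, is not included in this excerpt; based on the discussion that follows, findSum is presumably:

Scala
def findSum(target: Int, numbers: List[Int]): Option[List[Int]] =
  if target == 0 then Some(List.empty)
  else
    numbers match
      case Nil => None
      case first :: others =>
        findSum(target - first, others).map(first :: _)
          .orElse(findSum(target, others))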
The first recursive call looks for a solution that uses the number first; the second looks for a solution to the same problem without using first. It is important that the second recursive call only takes place if the first one fails to produce a solution. This behavior is guaranteed here by the fact that the argument to orElse is passed by name and is evaluated only if the first option is empty.
As a side note, notice that the implementation is simplified by the fact that lists are
immutable. In particular, even though the recursive calls are conceptually digging into
the list of numbers, the same list others is used as an argument in both recursive calls.
If lists were mutated, you would need to “undo” changes from the first computation
before reusing the list in a second call.
Subset-sum problems may have multiple solutions. For instance, 18 can also be
reached as 3 − 6 + 11 + 7 + 3 = 18 using the numbers from the example set. To compute
all the solutions, you can modify findSum to always make the second recursive call, even
if the first call succeeds:
Scala
def findAllSums(target: Int, numbers: List[Int]): Set[List[Int]] =
  if target == 0 then Set(List.empty)
  else
    numbers match
      case Nil => Set.empty
      case first :: others =>
        (findAllSums(target - first, others).map(first :: _)
          union findAllSums(target, others))

Listing 12.11: All the solutions to subset-sum; contrast with Listings 12.10 and 12.12.
The differences between findSum and findAllSums are minimal. Options are replaced
with sets, and orElse is replaced with union. Contrary to orElse, method union always
evaluates both of its arguments.
Given how similar they are, is there a way to avoid having to write both functions?
If you have all the solutions to a problem, you can always pick one, so you could be
tempted to implement findSum in terms of findAllSums:
Scala
// DON'T DO THIS!
def findSum(target: Int, numbers: List[Int]): Option[List[Int]] =
  findAllSums(target, numbers).headOption
The problem with this approach is that findSum now calculates all the solutions
even though it needs only one. By contrast, the function in Listing 12.10 stops once a
solution has been found. This can be remedied by writing a lazily evaluated variant of
findAllSums:
Scala
def lazyFindAllSums(target: Int, numbers: List[Int]): LazyList[List[Int]] =
  if target == 0 then LazyList(List.empty)
  else
    numbers match
      case Nil => LazyList.empty
      case first :: others =>
        lazyFindAllSums(target - first, others).map(first :: _)
          #::: lazyFindAllSums(target, others)

Listing 12.12: The solutions to subset-sum derived lazily; contrast with Lis. 12.11.
Again, the required changes to the code are minimal. Sets are replaced with streams,
and union is replaced with “#:::”, which brings back lazy evaluation. You can then
use this function as the basis for various subset-sum functions:
Scala
def findSum(target: Int, numbers: List[Int]): Option[List[Int]] =
  lazyFindAllSums(target, numbers).headOption
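The derived functions discussed next are not shown in this excerpt; plausible definitions, consistent with the text and with footnote 7 about sorting:

Scala
def findAllSums(target: Int, numbers: List[Int]): Set[List[Int]] =
  lazyFindAllSums(target, numbers.sorted).toSet

def findShortest(target: Int, numbers: List[Int]): Option[List[Int]] =
  lazyFindAllSums(target, numbers).minByOption(_.length)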
This findSum function evaluates only the first element of the stream, and stops com-
puting as soon as one solution is found, as in Listing 12.10. In findAllSums, the call to
toSet forces the evaluation of the entire stream to compute all the solutions.7 Function
findShortest also computes the entire stream to find a solution of minimal length. It
relies on minByOption—a variant of minBy that does not fail on an empty collection—to
compare solutions by their length.
12.12 Summary
• In languages that support higher-order functions, you can replace explicit argu-
ments with functions that compute them, thus delaying evaluation of an argument
until its value is needed. In particular, if the value of an argument is never needed,
it may never be computed.
• Some programming languages add syntax so arguments can be passed unevaluated
without explicitly creating a function—typically, the function is created by the
compiler and hidden from the user. Some functional programming languages, such as Miranda and Haskell, take this idea to its limit and pass all arguments unevaluated by default.
• Unevaluated arguments can sometimes be used to create abstractions that em-
bed reusable patterns and resemble standard programming language constructs.
7 The list of numbers is sorted beforehand so that solutions are always produced as sorted lists of
numbers. Otherwise, the set could contain “duplicates”—List(3,7) and List(7,3), for instance.
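Chapter 13

Handling Failures

The opening of this chapter, not included in this excerpt, introduces a list-searching function; based on the discussion that follows, its signature is presumably:

Scala
def search[A](values: List[A], target: A): Int = ... // index of target in values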
This function needs to return an integer, even when a target is not found in the list.
A common convention—used, for instance, by Scala’s indexWhere function—is to re-
turn -1 as a special value to represent a failed search.
Suppose now that you use this search function to extract from a list all the values
between two given targets:
Scala
// DON'T DO THIS!
def between[A](values: List[A], from: A, to: A): List[A] =
  val i = search(values, from)
  val j = search(values, to)
  values.slice(i min j, (i max j) + 1)
No error is reported, but the outputs make no sense. Of course, the problem is that
the implementation of function between is missing code to check that the targets were
indeed found in the list. However, you get no help from the compiler to let you know
that something is missing. A special value like -1, which can potentially be used as an
actual value, is particularly dangerous. In some languages, -1 is actually a valid index,
which refers to the last element of a sequence.
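The examples that follow use a list words that is not defined in this excerpt; judging from the outputs, it is presumably something like:

Scala
val words = List("one", "two", "three", "four")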
Another value frequently used to represent failures is null:1
Scala
def search[A](values: List[A], target: A): Integer = ... // returns null when not found
The null value cannot be used as a number, and the same implementation of between
does not produce nonsensical values when a target is not found:
Scala
between(words, "two", "four") // List("two", "three", "four")
between(words, "four", "two") // List("two", "three", "four")
between(words, "two", "five") // throws NullPointerException
between(words, "ten", "four") // throws NullPointerException
1 The return type is changed from Int to Integer because type Int does not contain null in Scala.
This is a mixed blessing. On the one hand, the risk of continuing the computation
with an incorrect list is eliminated. On the other hand, the exception, if it is not handled,
has the potential to cause a lot of damage—for instance, by stopping a server entirely
instead of only failing one transaction.
Instead of using special values, function search could rely on an exception to indicate
that an element is missing:
Scala
// throws NoSuchElementException when not found
def search[A](values: List[A], target: A): Int = ...
But there is still nothing that forces function between to handle the exception. An im-
plementation without checks for missing targets now throws NoSuchElementException
instead of NullPointerException, with the same potential for widespread damage.2
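13.2 Using Option

A better approach is to make the possibility of failure explicit in the function's return type. The Option-based search discussed next is not shown in this excerpt; its signature is presumably:

Scala
def search[A](values: List[A], target: A): Option[Int] = ... // None when not found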
The benefit of this definition is that the previous implementation of function between,
which is missing error-handling code, can no longer be compiled. Instead, you need to
handle failed searches explicitly:
Scala
def between[A](values: List[A], from: A, to: A): List[A] =
  (search(values, from), search(values, to)) match
    case (Some(i), Some(j)) => values.slice(i min j, (i max j) + 1)
    case _ => List.empty
2 Java uses the notion of a checked exception, which forces the calling code to either handle an
exception or declare it explicitly as being rethrown. The mechanism is intended to prevent developers
from ignoring possible exceptions, but it is unwieldy, especially when using lambda expressions. It
would not help in this illustration because you may prefer to ignore the exception if you know for sure
that the targets are in the list. Indeed, NoSuchElementException is an unchecked exception in Java. I
am not aware of another language today that uses checked exceptions.
The failure case could be handled in any way. Here, the chosen semantics is to return
an empty list.
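Options signal failure but carry no information about its cause. When that information is needed, you can use the Try type, which holds either a value or an exception. This part of the chapter, not included in this excerpt, illustrates Try with a file-reading function whose signature is presumably:

Scala
def readFile(file: Path): Try[List[String]] = ...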
Like Option, the Try type supports higher-order methods. You can use map to feed
the list of strings, if any, to function between. On Try, function map transforms a valid
value; failures are left unchanged:
Scala
readFile(wordFile).map(between(_, "two", "three")) // Success(List("two", "three"))
readFile(notFound).map(between(_, "two", "three")) // Failure(...)
These two expressions have type Try[List[String]]. If file wordFile contains the
sequence of words used earlier, the first expression is a success value. If file notFound
does not exist, the second expression is the failure returned by readFile, is unchanged
by map, and contains the relevant NoSuchFileException.
In Section 10.3, we saw how flatMap can be used to transform optional values with
computations that themselves produce options. The same is true of Try:
Scala
def compute(list: List[String]): Int = ...
def computeOrFail(list: List[String]): Try[Int] = ...
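The two expressions compared next are not shown in this excerpt; they are presumably:

Scala
readFile(wordFile).map(compute)            // Try[Int]
readFile(wordFile).flatMap(computeOrFail)  // Try[Int]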
If reading the file has failed, both expressions produce a failed value with the exception
that caused readFile to fail. If the file can be read, the first expression is a success.
The second expression can still be a success or a failure, depending on what is returned
by computeOrFail.
If you prefer to ignore the cause of a failure, you can change a Try type into an
option:
Scala
readFile(...).toOption // of type Option[List[String]]
The Try type is heavily used in concurrent programming to carry exceptions from one thread to another (several examples are provided in Part II). Contrast this with the use of a future in Java:
Java
future.whenComplete((value, error) -> {
  if (error == null) ... // use value
  else ... // handle error
});
Using Try is cleaner and safer than dealing with a pair (value,error) of which one
element is always null.
When the failure information is not naturally an exception, yet another type can be used: A value of type Either[A, B] holds either a left value of type A or a right value of type B. By convention, when Either is used for error handling, the right value carries the regular result and the left value describes the failure.
You can define a readFile function that returns the contents of a file as a list or an
error message if the file cannot be read:
Scala
def readFile(file: Path): Either[String, List[String]] = ...
You can then use the return value through map and flatMap as you would with Option
or Try:
Scala
readFile(wordFile).map(between(_, "two", "three")) // Right(List("two", "three"))
readFile(notFound).map(between(_, "two", "three")) // Left("not found: ...")
The Either type is versatile. If the left part of an Either value is an exception,
you can use toTry to change the Either into a Try. You can also ignore the left part
entirely with toOption. Conversely, an option can be turned into a value of type Either
by specifying an additional left or right part, which, in Scala, is lazily evaluated:
Scala
Some(42).toRight("no number") // Right(42)
None.toRight("no number") // Left("no number")
You can also use fold to extract the contents of a value of type Either[A,B] into
a value of type C by providing two functions, one for the left, of type A => C, and one
for the right, of type B => C:
Scala
def mkString(stringOrNumber: Either[String, Int]): String =
  stringOrNumber.fold(identity, n => if n < 0 then s"($n)" else s"$n")
mkString(Right(-42)) // "(-42)"
mkString(Left("no number")) // "no number"
When you need to return a special value of the same type as a regular value—say, an
error message in a string function—you can use Either with the same type on the left
and right. For instance, Java’s library function binarySearch searches a sorted array.
If the element is found, the corresponding index is returned. Otherwise, the function
returns a negative number n such that −n − 1 is the index where the element would be
if it were in the array. This way, you can easily access the element just before or just
after the missing element.
To eliminate the risk of using this negative value as an index, you could make
binarySearch return a value of type Either[Int,Int], using Right when the element is
found and Left to return useful information when it is not found. For a given number x,
the values Left(x) and Right(x) are distinct, and the calling code can handle them in
different ways.
13.5 Higher-Order Functions and Pipelines

Consider a data file that lists cities and their temperatures, in degrees Fahrenheit:
Austin: 101
Chicago: 88
Big Spring: 92
An application needs to read locations and temperatures from a file, focus on temper-
atures recorded in Texas, convert the Fahrenheit temperatures to Celsius, and produce
a list of strings formatted like the input file. You can implement it as a pipeline of
higher-order functions:
Scala
@throws[IOException]
def readFile(file: Path): List[String] = ...
@throws[ParseException] @throws[NumberFormatException]
def parse(line: String): (String, Int) = ...
@throws[NoSuchElementException]
def stateOf(city: String): String = ...
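The convert function that assembles these pieces is not shown in this excerpt. A plausible sketch, consistent with the description below (the exact Fahrenheit-to-Celsius expression is an assumption):

Scala
def convert(file: Path): List[String] =
  readFile(file)
    .view
    .map(parse)
    .filter((city, _) => stateOf(city) == "TX")
    .map((city, temp) => (city, (temp - 32) * 5 / 9))
    .map((city, temp) => s"$city: $temp")
    .toList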
Listing 13.1: Pipeline example with no handling of failures; contrast with Lis. 13.2.
You use map to parse lines into pairs, filter to eliminate cities outside of Texas, map
again to convert the temperatures to Celsius, and map one more time to generate the
final strings. The bracketing pair view/toList is not strictly necessary but is used to
avoid the creation of intermediate lists. On the sample file, function convert produces
the list ["Austin: 38", "Big Spring: 33"].
A number of things can go wrong in this computation: A file cannot be read, a city
is not found in the database, or a line cannot be parsed into a city name and a number.
For each of these situations, an exception is thrown. If it is not handled, any exception
terminates the entire pipeline. As an alternative, you might prefer to flag incorrect lines,
skip unknown cities, or replace missing temperatures with a default value. The problem
is that handling exceptions within the pipeline is not easy. Functions map and filter
implement generic behaviors, and you cannot tell them to skip or replace a value when
an exception happens without modifying the functions used as their arguments.
What you need are variants of readFile, parse, and stateOf that, instead of throw-
ing exceptions, return special values, for which you can use types like Option, Try, and
Either. As an exercise, let’s modify Listing 13.1 so that the possible failures mentioned
earlier are handled as follows:
• I/O errors while reading the input file are forwarded to the user.
• Lines that cannot be parsed as a name and a temperature are left unchanged but
enclosed in square brackets.
• Cities outside Texas or unknown to function stateOf are ignored.
• Temperatures that cannot be parsed as integer values are replaced with the word
"unknown".
Functions readFile, stateOf, and parse are modified to not use exceptions, and the
pipeline then relies on Option, Try, and Either methods to perform the desired error
handling:
Scala
def readFile(file: Path): Try[List[String]] = ...
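The remaining pieces of this listing, the modified parse and stateOf and the new convert, are not shown in this excerpt. A plausible reconstruction, based on the walkthrough that follows (the name badLineOrPair comes from that walkthrough; the conversion formula is an assumption):

Scala
def parse(line: String): Either[String, (String, Option[Int])] = ...
def stateOf(city: String): Option[String] = ...

def convert(file: Path): Try[List[String]] =
  readFile(file).map { lines =>
    lines.view
      .map(parse)
      .filter(lineOrPair => lineOrPair.forall((city, _) => stateOf(city).contains("TX")))
      .map { badLineOrPair =>
        badLineOrPair
          .map((city, temp) => (city, temp.map(f => (f - 32) * 5 / 9)))
          .fold(line => s"[$line]",
                (city, temp) => s"$city: ${temp.map(_.toString).getOrElse("unknown")}")
      }
      .toList
  }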
Listing 13.2: Pipeline example with failure handling; contrast with Lis. 13.1.
Let’s unpack this. Function readFile returns a Try that contains either a list of
lines or an I/O exception. Function parse returns an Either value. It produces a pair
when a line can be parsed (right) or leaves the line unchanged (left). Each pair contains
the name of a city and a temperature, but the temperature is wrapped in an option
to handle strings that cannot be converted to integer values. Finally, function stateOf
maps cities to states but uses options to deal with cities that are not in the database.
The very first thing convert does is apply map to the Try value returned by readFile.
If it is an exception, map leaves it unchanged, and convert returns a Failure value with
the I/O exception. Otherwise, we have a list of lines to work with: Make it lazy with
view, and apply parse to each line using map. Function filter is then applied to what
is now a list of Either values. Each value is either a line that could not be parsed or
a pair with a city name and an optional temperature. These values are tested using
forall, which is true on any Left value but applies a test on Right values.4 Thus,
unparsed lines are left as they are, but pairs are potentially eliminated. The test that
is used to eliminate them invokes stateOf on the city name and checks that the option
being returned contains the string "TX". This test is false on None (unknown cities) and
on options with a string other than "TX" (cities outside of Texas).
The next stage of the pipeline uses map to transform each remaining Either value
(variable badLineOrPair in the code). The first transformation converts Fahrenheit
temperatures to Celsius. Unparsed lines are again left unchanged because map only
transforms Right values. The conversion itself is achieved by using map on an option
that contains the temperature. The second and final transformation creates a string
from each Either value by adding square brackets to an unparsed line or by formatting
a city name with a temperature. The temperature, now in Celsius, is extracted from
its option, with getOrElse being used to replace missing temperatures with the string
"unknown".
This variant of convert produces the same output as before on the sample file. On
a file with defects such as:
Austin: hot
Chicago: 88
Big
Spring: 92
it produces the list ["Austin: unknown", "[Big]"]. Chicago is eliminated for being
out of Texas, and Spring for not being in the database.
Note that, in this programming style, errors tend to propagate down the pipeline
in the form of an empty Option, a failed Try, or a left Either value. By contrast, a
thrown exception would go up to the top of the pipeline, unhandled by the higher-order
functions above it and ignoring all the stages below it.
Compared to its previous version, the updated code for function convert may look
complicated, especially if you are new to functional programming, but error handling
is rarely easy. Alternative approaches, without Option, Try, Either, and their higher-
order methods, are unlikely to be simpler.
4 This line is tricky: filter is applied to a sequence, forall to a value of type Either, and contains
to an option.
13.6 Summary
• Functions can deal with failure by returning special values. However, choosing
these values within the same return type—null for an object, -1 for an integer—is
error-prone. It makes it too easy for code on the calling site to forget to check for
those values. Null values, in particular, have notoriously been abused as a cheap
form of error reporting.
• To produce error values, functional programs often rely on specific types, usually in
the form of alternatives. The simplest of those types is Option, which can represent
either a value or the absence of a value. When more failure-related information is
needed, other types can be used, including Try (which stores an exception) and
Either (which contains a substitute value).
• These types tend to be supported by higher-order functions that allow computa-
tions to keep progressing in the presence of failures. Higher-order functions are
used to transform a valid value, to propagate an error, or to recover from an error,
without explicit error checking.
• In general, exceptions are not well suited to a functional programming style. They
deviate from the core principle of value-returning functions and often disrupt
control flow embedded in higher-order functions.
Chapter 14
Case Study: Trampolines
The Scala compiler optimizes (most) tail recursive functions and generates code that
implements them as loops. Other languages, like Java, do no such thing. Even in Scala,
tail-calls outside tail recursive functions are not optimized. This case study implements
trampolines, a classic strategy used for tail-call optimization. It shows how trampolines
can be used to implement tail recursive calls without growing the execution stack, even
in Java. The strategy is then refined to handle non-tail-calls.
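The chapter's first example, a trivially tail recursive function zero, is not shown in this excerpt; it is presumably:

Scala
def zero(n: Int): Int = if n == 0 then 0 else zero(n - 1)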
zero(1_000_000) // 0
Function zero is implemented as a loop and can handle large input values.
The current Scala compiler, however, does not optimize all tail-calls. In particular,
tail-calls in mutually recursive functions grow the execution stack:
Scala
def isEven(n: Int): Boolean = if n == 0 then true else isOdd(n - 1)
def isOdd(n: Int): Boolean = if n == 0 then false else isEven(n - 1)
isEven(42) // true
isOdd(42) // false
isEven(1_000_000) // throws StackOverflowError
1 Throughout the chapter, integer arguments are assumed to be non-negative in all illustrations.
Even though the call to isOdd is the very last expression inside function isEven (and
vice versa), no optimization takes place and a call to isEven on a large enough number
causes a stack overflow.
2 A trampoline uses thunks to delay and sequence the steps of a computation instead of relying on nested function calls. Trampolines can be described in terms of continuations, but continuations are not covered in this book, and introducing trampolines and continuations together in this case study would be confusing. Instead, in this chapter, trampolines are viewed as a form of delayed evaluation.
3 The Computation trait should be sealed to prevent subtypes other than Done and Call from
being added.
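The Computation type used in this section is defined in an earlier part of the chapter that is not included here; based on its uses below and on footnote 3, the definitions are essentially:

Scala
trait Computation[A]:
  @tailrec
  final def result: A = this match
    case Done(value) => value
    case Call(thunk) => thunk().result

case class Done[A](value: A) extends Computation[A]
case class Call[A](thunk: () => Computation[A]) extends Computation[A]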
Two add-ons can be defined for a nicer syntax (and, from this point on, are assumed
to be in scope):
Scala
implicit def done[A](value: A): Computation[A] = Done(value)
def call[A](comp: => Computation[A]): Computation[A] = Call(() => comp)
The done function wraps a value inside an instance of the Done class to create a com-
pleted computation. It is defined as an implicit function, so the conversion can be
automatically triggered by the compiler, based on type analysis. Function call uses a
by-name argument (see Section 12.2) to hide the thunk from the trampoline user.
With these in place, you can rewrite the even–odd example as follows:
Scala
def isEven(n: Int): Computation[Boolean] = if n == 0 then true else call(isOdd(n - 1))
def isOdd(n: Int): Computation[Boolean] = if n == 0 then false else call(isEven(n - 1))
isEven(1_000_000).result // true
Listing 14.3: Optimized tail-calls, using a trampoline; contrast with Lis. 14.1.
When evaluated, the expression isEven(1_000_000) creates a computation object
Call(() => isOdd(999_999)) and stops there. The actual computation is triggered
by the call to result and, as mentioned earlier, is executed as a loop. It does not grow
the execution stack at all. In effect, the execution stack has been replaced with a lazily
evaluated sequence of thunks. Without the add-ons, the two functions would not look
as nice. You would have to write isEven in this way:
Scala
def isEven(n: Int): Computation[Boolean] =
  if n == 0 then Done(true) else Call(() => isOdd(n - 1))
14.3 Tail-Call Optimization in Java

The same trampoline can be implemented in Java. The isEven half of the pair is not shown in this excerpt; it presumably mirrors isOdd:

Java
Computation<Boolean> isEven(int n) {
  if (n == 0) return done(true);
  else return call(() -> isOdd(n - 1));
}

Computation<Boolean> isOdd(int n) {
  if (n == 0) return done(false);
  else return call(() -> isEven(n - 1));
}

isEven(1_000_000).result() // true
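14.4 Dealing with Non-Tail-Calls

Trampolining tail-calls is only half the battle: many recursive functions use the result of their recursive calls. The canonical example, whose discussion opens this section but is not included in this excerpt, is presumably the factorial function:

Scala
def factorial(n: Int): BigInt = if n == 0 then BigInt(1) else factorial(n - 1) * n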
A value of type Computation[A] is a box that will eventually produce a value of type A, one of those “boxes” from our earlier discussion of map in Section 10.9. Indeed, the computation that calculates the factorial of n is factorial(n - 1).map(x => x * n). So, what we need is to add a map method to the Computation type. It could be implemented naively as follows:
Scala
// DON'T DO THIS!
trait Computation[A]:
  @tailrec
  final def result: A = this match
    case Done(value) => value
    case Call(thunk) => thunk().result
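The naive map itself is not shown in this excerpt; based on the description below, it would be declared in Computation and implemented in the two subtypes roughly as:

Scala
// In trait Computation[A]: def map[B](f: A => B): Computation[B]

case class Done[A](value: A) extends Computation[A]:
  def map[B](f: A => B): Computation[B] = Done(f(value))

case class Call[A](thunk: () => Computation[A]) extends Computation[A]:
  def map[B](f: A => B): Computation[B] = Call(() => Done(f(result)))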
Invoking map(f) on a Done object simply applies f to the value inside. On a Call
object, map produces a new computation that will apply f to the result of the current
computation.
The problem with this approach is that the execution stack grows when you com-
pute the result of a computation. If c2 = c1.map(f), then the evaluation of c2.result
involves computing f(c1.result).result. In other words, before the tail-call to
result—which is still optimized as a loop—the execution reenters result to apply f. If
you evaluate a long enough chain of calls to map, you will still run out of stack space.
To avoid this, the implementation of Computation needs to be refined. The remainder
of this section focuses on the implementation of flatMap instead of map. The reason is
that flatMap is needed to handle functions with multiple recursive calls, and map can
always be derived from flatMap.
NOTE
The implementation developed here follows a strategy defined by Rúnar Óli Bjarnason in his arti-
cle “Stackless Scala with Free Monads.” The same strategy is implemented, more robustly, in the
standard Scala library as scala.util.control.TailCalls.
The implementation of Computation just given runs into problems because the map
method in class Call uses result to generate a suitable input for function f. This
can be avoided by delaying the creation of the computation that uses f until you are
inside the implementation of result. There, you can deal with chained calls to flatMap
explicitly, in ways that avoid growing the execution stack:
Scala
trait Computation[A]:
  @tailrec
  final def result: A = this match
    case Done(value) => value
    case Call(thunk) => thunk().result
    case FlatMap(f, arg) =>
      arg match
        case Done(v) => f(v).result
        case Call(thunk) => thunk().flatMap(f).result
        case FlatMap(g, arg2) => arg2.flatMap(x => g(x).flatMap(f)).result
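The FlatMap class and the flatMap and map methods are not shown in this excerpt; following the strategy described in the note above, they are presumably:

Scala
case class FlatMap[A, B](f: A => Computation[B], arg: Computation[A]) extends Computation[B]

// In trait Computation[A]:
//   def flatMap[B](f: A => Computation[B]): Computation[B] = FlatMap(f, this)
//   def map[B](f: A => B): Computation[B] = flatMap(x => Done(f(x)))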
With this refined implementation of Computation in place, you can use map to handle the non-tail recursive call inside the factorial function:
Scala
def factorial(n: Int): Computation[BigInt] =
  if n == 0 then BigInt(1) else call(factorial(n - 1)).map(_ * n)
Listing 14.7: Non-tail recursive factorial, as a trampoline; see also Lis. 14.10.
The factorial function makes a single recursive call, for which map is sufficient.
When a function makes multiple recursive calls, they can be combined with flatMap.
For instance, you can rewrite our earlier function size on binary trees in terms of a
trampoline:
Scala
def size[A](tree: BinTree[A]): Computation[Int] = tree match
  case Empty => 0
  case Node(_, left, right) =>
    call(size(left)).flatMap(ls => call(size(right)).map(rs => 1 + ls + rs))
Listing 14.8: Non-tail recursive size on trees, as a trampoline; see also Lis. 14.10.
The first recursive call, size(left), produces the size of the left child as a value ls. This
value is transformed using flatMap instead of map because the function being applied
in the transformation produces a Computation[Int], not an Int. This computation is
obtained by recursively calling size(right) to get a right size rs, then by transforming
rs into 1 + ls + rs, using map.
Function hanoi from Listing 12.6 can be similarly trampolined:
Scala
def hanoi[A](n: Int, from: A, middle: A, to: A): Computation[List[(A, A)]] =
  if n == 0 then List.empty
  else
    val call1 = call(hanoi(n - 1, from, to, middle))
    val call2 = call(hanoi(n - 1, middle, from, to))
    call1.flatMap(moves1 => call2.map(moves2 => moves1 ::: (from, to) :: moves2))
Listing 14.9: Non-tail recursive hanoi, as a trampoline; see also Lis. 14.10.
Recall from Section 10.9 that Scala’s for-yield construct is transformed at compile
time into suitable calls to higher-order methods. Instead of using flatMap directly, you
can write functions factorial, size, and hanoi very nicely as follows:
Scala
def factorial(n: Int): Computation[BigInt] =
  if n == 0 then BigInt(1) else for f <- call(factorial(n - 1)) yield f * n
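The corresponding for-yield versions of size and hanoi, which complete this listing (Listing 14.10 in the captions above), are not shown in this excerpt; they presumably read:

Scala
def size[A](tree: BinTree[A]): Computation[Int] = tree match
  case Empty => 0
  case Node(_, left, right) =>
    for
      ls <- call(size(left))
      rs <- call(size(right))
    yield 1 + ls + rs

def hanoi[A](n: Int, from: A, middle: A, to: A): Computation[List[(A, A)]] =
  if n == 0 then List.empty
  else
    for
      moves1 <- call(hanoi(n - 1, from, to, middle))
      moves2 <- call(hanoi(n - 1, middle, from, to))
    yield moves1 ::: (from, to) :: moves2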
14.5 Summary
The idea of using thunks to delay argument evaluation, which was explored in Chap-
ter 12, can also be used to delay the evaluation of recursive tail-calls, a technique some-
times known as a trampoline. Each call is represented as a thunk, and the execution
stack is replaced with a lazily built series of thunks, which requires very little memory.
These thunks can be evaluated one by one using a loop (or a tail recursive function in
a language that optimizes it into a loop). When a recursive call is not in tail position,
its value is used for further computation after the call. To maintain lazy evaluation,
the computation that uses this value needs to be embedded into the trampoline as a
transformation using either map (for a regular function call, using the execution stack) or
flatMap (for a transformation that requires further recursive calls). The implementation
of flatMap is delicate because it must ensure that the evaluation of long chains of calls
proceeds without growing the execution stack. Once method flatMap is implemented
efficiently, map can be derived from it.
A Brief Interlude
Chapter 15
Types (and Related Concepts)
Much of the power—but also much of the learning curve—associated with modern lan-
guages like Scala or Rust—or newer incarnations of earlier languages, like Java—revolves
around types. As an interlude between our exploration of functional and concurrent pro-
gramming language features, this chapter discusses several type-related concepts from
a developer’s perspective—no type theory here. Readers are likely to be familiar with
common features, such as static and dynamic type checking, abstract data types, sub-
typing, and polymorphism, but maybe less comfortable with others—for example, type
inference, type variance, type bounds, and type classes—that not all programming lan-
guages support.
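The Book class and values used in this section are not defined in this excerpt; presumably something like (titles and page counts are made up):

Scala
case class Book(title: String, author: String, pageCount: Int)

val book1 = Book("1984", "George Orwell", 328)
val book2 = Book("Brave New World", "Aldous Huxley", 311)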
An application creates a list of books and displays their titles but, by mistake, a number
is inserted in the list of books:
Scala
val books: List[Book] = List(book1, book2, 42) // rejected by the compiler
for book <- books do println(book.title)
The error message is clear: The number 42, of type Int, cannot be part of a list of
books.
If the compiler is left to infer the type of the list, the error moves to the second line:
Scala
val books = List(book1, book2, 42)
for book <- books do println(book.title) // rejected by the compiler
You want to put book1, book2, and 42 inside the same list, and the compiler is try-
ing to please you. It compiles the first line by inferring variable books to be of type
List[Matchable] because Matchable is the narrowest type that fits both integers and
books (see Section 15.5). Compilation then fails on the second line because not all values
of type Matchable define a title attribute. (Matchable is a very broad type: Strings,
lists, and options are all “matchable” and don’t have a title.)
In both scenarios—type declared or type inferred—the programmer’s mistake halts
compilation and must be dealt with before the program can be executed. Contrast
these compile-time errors with the behavior of a Python program:
Python
books = [book1, book2, 42]
for book in books:
    print(book.title)
You can compile and run this code, but the execution fails with the following error:
AttributeError: 'int' object has no attribute 'title'
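The next example, not included in this excerpt, contrasts variable typing in the two languages; presumably:

Scala
var bookOrString = book1   // inferred type: Book
bookOrString = "a string"  // rejected by the compiler

Python
book_or_string = book1
book_or_string = "a string" # accepted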
In Scala, variable bookOrString is given type Book by the compiler, and the sec-
ond assignment is rejected. Python, by contrast, does not assign a type to variable
book_or_string, and allows it to be assigned with values of different types at different
times. Accordingly, you could argue that Scala’s typing is stronger than Python’s.
However, if price is defined as a number, the expression
book.title + ": " + price
is accepted by Scala, but rejected by Python:
TypeError: can only concatenate str (not "float") to str
In this instance, Python is the fussier language, refusing to append a number to a string.
So, it is not necessarily the case that a language is always stronger, or always weaker.
Furthermore, what’s acceptable varies not only from language to language, but also over
time. As of today, the variant
price + ": " + book.title
is rejected by Python, triggers a deprecation warning in Scala, and is perfectly fine in
Java. Before concluding that Python is more strongly typed than Scala—itself more
strongly typed than Java—consider a different scenario:
Python
if book.pagecount:
    print(book.title)

In Python, any value can be used in a Boolean test: a page count of zero is treated as false, so this code silently skips zero-page books; whether or not that was the intent, the type checker has nothing to say. Now consider a Java program that inserts books into a set and counts how many were actually added:

Java
if (books.add(book))
  added += 1;
Method add returns true if an element is actually added (that is, if it was not already
in the set). As a result, this code correctly counts added books. However, don’t write
the same thing in Python:
Python
books = ... # a set of books
# DON'T DO THIS!
if books.add(book):
    added += 1
The add method in Python is “void”: It returns None,1 and None evaluates to false in a
Boolean test. Python’s type checking does not catch the mistake—the program can be
executed, but variable added is not incremented.
1 This is Python’s None, not the None of options in Scala. Python’s None is closer to Scala’s unit.
To further muddy the discussion, languages can also define backdoors to their own
type system, inviting weakness where the language is strong. The Python function
Python
def print_title(book):
print(book.title)
displays book titles but can also be called on an integer, in which case it will fail from not
finding a suitable title attribute (as described earlier). To better handle the mistake,
the function can be rewritten to test the type of its argument explicitly:
Python
def print_title(book):
    if isinstance(book, Book):
        print(book.title)
    else:
        raise TypeError("not a book")
This added code is not needed in Scala, since the compiler can be trusted to check that
the argument has the desired type. However, you can still write an equivalent to the
Python function:
Scala
// DON'T DO THIS!
def printTitle(item: Any) =
  if item.isInstanceOf[Book] then
    val book = item.asInstanceOf[Book]
    println(book.title)
  else throw IllegalArgumentException("not a book")
Type checking by the compiler is bypassed, and replaced with a form of dynamic check-
ing at runtime, thus weakening the overall type safety of the language.
Methods like isInstanceOf/asInstanceOf constitute a backdoor to the type sys-
tem. They are sometimes necessary, but should not be abused. Some languages have no
such loopholes. For example, the runtime type checking and casting used in function
printTitle is not possible in—more static? more strongly typed?—languages like SML
or Haskell.
Although compilers can sometimes use type information to improve performance,
the main motivation for types is to help programmers catch—and fix—mistakes. From
a developer’s standpoint, a useful type system is one that contributes to this goal,
whether it does so by being “strong” or by being “static.”
Three characteristics of types often matter more to you as a programmer than the
static/dynamic or strong/weak opposition. First, typing utility is determined by how
safe types are—that is, how effective they are at catching mistakes before these mistakes
trigger runtime bugs. Second, convenience is contingent on how flexible a type system
is: A flexible type system does not get in the way of the developer’s design. And third is
simplicity: Some type systems are easier to understand, while others are more complex.
Of safety, flexibility, and simplicity, a programming language typically favors two at
the expense of the third. How languages handle type variance, discussed in Section 15.8,
is a perfect example. A language could make all data structures covariant; it would be
simple and flexible, but unsafe. Or it can have them all be invariant, which is simple
and safe, but lacks flexibility. Or it can define mechanisms for users to specify vari-
ance through annotations, which is safe and flexible, but more complex. In the case
of variance, older languages (Java, C++) have favored simplicity and rigidity, but more
modern languages (C#, Scala, Kotlin) have improved flexibility at the cost of increased
complexity. The remainder of this chapter discusses several concepts that contribute to
type safety, flexibility, and complexity.
Smaller types are more precise and convey more information. The List(1,2,3)
value is in type List[Int], but also in types Seq[Int], List[Any], and Any. Type
List[Int] is the most informative: Seq[Int] includes non-list sequences, List[Any]
includes lists of non-integer values, and Any includes objects that are not lists at all.
2 Some languages, including Scala, also define kinds as sets of types. For instance, List is a kind
that contains the types List[String] and List[Int], among others. This chapter does not cover kinds.
3 “x ∈ S” denotes the fact that value x is an element of set S. “S ⊆ T” means that S is a subset of T.
In Scala, every variable x defines a small type x.type that contains only x—the
singleton {x} in terms of sets. For instance, the function
Scala
def doSomethingWithBook(book: Book): book.type = ...
does something with a book before returning it. Using Book as the return type would
allow the function to return a different book, possibly a copy of the input. The very
specific type book.type guarantees that the function returns the same book object used
as input.
This approach is often used in “builder” classes to chain method calls. For instance,
you can define a buffer with an append method:
Scala
class Buffer[A]:
  def append(value: A): Buffer[A] = ...
  ...

// used as:
val buffer = Buffer[String]()
buffer.append("foo").append("bar")
The intent is for method append to add an element to the buffer and to return the buffer itself so further operations can be applied. However, the Buffer[A] return type allows an implementation to return a different buffer. You can use a more specific type to make it clear that the buffer itself is being returned:
Scala
class Buffer[A]:
  def append(value: A): this.type = ...
  ...
Finally, some languages define a type Nothing that contains no value—the empty
set. In Scala, for instance, the empty list, Nil, has type List[Nothing]. A function that
specifies Nothing as its return type, such as Nil.head, cannot possibly return a value,
since type Nothing is empty. All it can do is keep running forever or throw an exception.
Because the empty set is a subset of every set, Nothing is a subtype of every type.
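Types can also be viewed as collections of services. The class definition the next sentence refers to is not shown in this excerpt; it is presumably the plain record:

Scala
case class Book(title: String, author: String, pageCount: Int)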
This defines a Book type with three functionalities: title, author, and page count. This
class implements books as simple records4 of two strings and an integer, but of course
other implementations are possible:
Scala
class Book(pages: Seq[String]):
  def title: String = pages(0)
  def author: String = pages(1)
  def pageCount: Int = pages.length
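Alternatively, a type can specify its services without implementing them. The interface-style definition discussed next is not shown in this excerpt; presumably:

Scala
trait Book:
  def title: String
  def author: String
  def pageCount: Int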
The functionalities offered by the Book type are specified, but not implemented.
When viewed as collections of services, undesirable types are not so much large sets—like Any—but rather interfaces that fail to include necessary information, or, contrariwise, bloated interfaces with too many methods, or even interfaces that leak irrelevant implementation-specific details.
4 This chapter uses case classes for the sake of compactness, but regular classes could be used instead.
5 Similarly, in mathematics, sets are augmented with laws, which are obeyed by their elements, to become algebraic structures, like groups and fields. The set of integers, for instance, must satisfy laws that state x + y = y + x, x + (y + z) = (x + y) + z, x + 0 = x, and x + (−x) = 0, for all x, y, and z, making it an abelian group.
ADTs use various mechanisms to define their operations, but a common approach
is to specify semantics as a set of axioms. As an example, consider functional lists,
introduced in Section 3.7 and used throughout Part I of the book. You can define a
functional list as an ADT by specifying axioms like these, for every value x and every
list L (empty denotes the empty list):
head(cons(x, L)) = x
tail(cons(x, L)) = L
length(empty) = 0
length(cons(x, L)) = length(L) + 1
The first axiom states that if you prepend x in front of a list L, using cons, the head
of the new list is x. The second axiom says that prepending x in front of list L, then
taking the tail of the new list, gives you back list L. The last two axioms define the
length of a list, recursively. From these axioms, you can then derive other list properties:
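For instance, the second and fourth axioms together give length(tail(cons(x, L))) = length(L) = length(cons(x, L)) − 1: removing the head of a non-empty list decreases its length by one. The chapter then approximates the ADT in Scala (the definition is not shown here); presumably along the lines of:

Scala
trait List[A]:
  def head: A
  def tail: List[A]
  def length: Int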
The Scala class is only an approximation of the ADT. It does not specify the behavior of
list operations, as defined by the ADT’s axioms. Such semantics are typically expressed
outside the type system through comments and other forms of documentation.
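15.5 Type Inference

This section's opening example is not included in this excerpt; given the sentence that follows, it is presumably an expression that mixes two sequence types, such as:

Scala
if x > 0 then List(1, 2, 3) else Vector(1, 2, 3)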
is given type Seq[Int], because both List and Vector are subtypes of Seq. In contrast,
Scala
if x > 0 then List(1, 2, 3) else "123"
is given a much larger type, broad enough to contain both lists and strings. In the same spirit, the expression List(Some(1), Some(2), None) is given the type List[Option[Int]]: the element type Option[Int] is preferred over Some[Int] because it can also accommodate the value None.
Strictly speaking, the types inferred by the compiler are not always the absolute
smallest sets given the constraints. Instead, user intent is taken into account. For
instance,
Scala
var n = 0
infers type Int for variable n, not type 0. Similarly, given the definitions
Scala
val book1, book2: Book = ...
val books = List(book1, book2)
variable books is given type List[Book]. Sometimes, what constitutes the “best” inferred type is not obvious. Given the expression
Scala
List(book1, book2, "book")
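A trickier case involves a variable v that is either a list or a set. Its definition is not shown in this excerpt, but judging from the inferred type and the results below, it is presumably:

Scala
val v = if x > 0 then List(1, 2, 3) else Set(4, 5, 6)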
With the current Scala compiler, variable v is given the following type:
Scala
scala.collection.immutable.Iterable[Int] & (Int => Int | Boolean) & Equals
Recall that “&” is type intersection. So, variable v is assigned three types. First it is
an iterable, immutable collection of integers, which both List[Int] and Set[Int] are.
But lists and sets are also functions: List[A] is a subtype of Int => A, and Set[A] is
a subtype of A => Boolean. Therefore, the expression can also be seen as a function of
type Int => Int | Boolean. Finally, both List and Set support equality comparison,
which is not the case of all iterables and functions—hence the type Equals.
Thanks to this complex type, you can use v as an iterable collection:
Scala
v.iterator.next() // 1 or 4, of type Int
or as a function:
Scala
v(1) // 2 or false, of type Int | Boolean
Sometimes, the type being inferred cannot even be expressed in the programming
language:
Java
var task = new Runnable() {
  public int result;
  public void run() { result = 42; }
};

task.run();
int r = task.result; // 42
Variable task is defined using var, without an explicit type. The type inferred by
the Java compiler is not Runnable—for which task.result on the last line would be
rejected—but a form of Runnable that also defines a public field result. This type
cannot be expressed within the Java language.6
When a compiler makes a decision that does not suit your needs, you can sometimes
specify a desired type explicitly, as long as you choose a type that contains the value.
For instance, this code cannot be compiled:
Scala
var books = List.empty
books ::= book1 // rejected by the compiler
The intent is for books to start as an empty list, to which books can then be added.
However, the compiler infers type List[Nothing] for variable books and refuses to
reassign it with a list of books. Instead, you can express your intent with an explicit
type declaration:
Scala
var books: List[Book] = List.empty
books ::= book1 // adds book1 to the list
The variants var books = List.empty[Book] (an explicit type parameter to func-
tion empty) and var books = List.empty: List[Book] (a type ascription on value
List.empty) would also work. Similarly,
Scala
var solution = None
if solutionIsFound then solution = Some(value) // rejected by the compiler
does not work because variable solution is given type None.type by the compiler.
Instead, you can specify a type explicitly, using, for instance, Option.empty[...]. Note
that, because compilers tend to infer types as small as possible, the point of an explicit
type declaration is always to widen a type to a larger set, not to specify a smaller type.7
7 In Scala, a type declaration can also trigger an implicit conversion: String is not a subtype of
Seq[Char], but you can still write val chars: Seq[Char] = "a string".
15.6 Subtypes
Thinking of types in terms of sets helps us understand the notion of subtype. If types
are sets, subtypes are basically subsets. If a type S is a subtype of a type T, all the
values in type S are also in type T and therefore implement (at least) the functionalities
defined by type T. When type S is a subtype of T, type T is called a supertype of S. Like
subsets, subtyping is a transitive relation: If S is a subtype of T, and T is a subtype of U,
then S is a subtype of U.
As an illustration, the Book type can be part of a type hierarchy:
Scala
trait Publication:
def title: String
def pageCount: Int
case class Book(title: String, author: String, pageCount: Int) extends Publication
case class Magazine(title: String, number: Int, pageCount: Int) extends Publication
Type Publication has two functionalities, title and pageCount. Types Book and
Magazine are both subtypes of Publication. They are therefore subsets: Every book
is a publication and every magazine is a publication. As a consequence, books and
magazines must implement the publication functionalities: title and pageCount. In
addition to these, books have an author (but magazines do not), and magazines have a
number (but books do not). Types Book and Magazine are not related to each other:
Book is not a subtype of Magazine, and Magazine is not a subtype of Book.
The most fundamental property of subtypes is that their values can be substituted
for values of a supertype. Thinking again in terms of sets, they are values of the super-
type. This is sometimes referred to as the Liskov substitution principle or as behavioral
subtyping. For instance, you can define a function to print the title of a publication:
Scala
def printTitle(publication: Publication): Unit = println(publication.title)
Because types Book and Magazine are subtypes of Publication, a book or a magazine
is guaranteed to have a title and can be used as argument to function printTitle. This
is enforced by the type system at the programming language level:
Scala
// rejected by the compiler
case class Book(author: String, pageCount: Int) extends Publication
This definition is rejected by the Scala compiler because values of this Book type have
no title, and therefore cannot be part of the set of publications.
As an alternative, you can define books without titles by taking them out of
the Publication type. Then, of course, these books cannot be used as arguments
of type Publication:
Scala
case class Book(author: String, pageCount: Int) // OK
val book: Book = ...
printTitle(book) // rejected by the compiler
8 Named after the saying that if something looks like a duck, quacks like a duck, and walks like a duck, then it must be a duck.
Function print_title does not specify a type for its book argument. You can use it
successfully on any object that defines a suitable title method:
Python
magazine = Magazine(title="Life", number=123, pagecount=45)
print_title(magazine) # prints "Life"
The person object can also be passed to function print_title because it happens
to have a title method, entirely unrelated to publications. Languages with nominal
subtyping can sometimes achieve the same flexibility in a more controlled way by using
type classes, which are discussed in Section 15.10.
The Liskov substitution principle is the main reason composition is often prefer-
able to inheritance. Consider, for instance, a type T with three services:
Scala
class T:
def service1: Int = ...
def service2(str: String): Int = ...
def service3(n: Int): String = ...
Using composition instead, a type S can reuse part of T without being a subtype of T:
Scala
class S:
   private val underlying: T = ...
   export underlying.service1
   def service3(n: Int): String = ... // calls underlying.service3 and adjusts its behavior
Type S is now independent from type T, and attempts to use an S value where
a T value is expected are rejected at compile time. There is no service2 in type S
and therefore no risk of it being invoked by mistake. Using export, method
service1 is available in type S and is forwarded unchanged to the underlying T
instance. By contrast, method service3 uses the underlying instance explicitly
to modify the behavior of the method by the same name defined in class T.
15.7 Polymorphism
Polymorphism is defined in the Oxford Dictionary as “the condition of occurring in
several different forms.” The idea behind polymorphism in programming languages is
to have a service exist in multiple forms, depending on the types of its arguments.
Polymorphism itself exists in different forms (!), and languages typically implement
some or all of three variants.
First is the notion of ad hoc polymorphism, implemented by overloading method or
function names:
Scala
def displayBook(book: Book) = println(s"${book.title} by ${book.author}")
def displayMagazine(mag: Magazine) = println(s"${mag.title} (${mag.pageCount} pages)")
Here, two distinct services are implemented, one for books and one for magazines. The
services use different names, and books should use the displayBook function while
magazines use the displayMagazine function. However, languages that support ad hoc
polymorphism can offer both services under the same name:
Scala
def display(book: Book) = println(s"${book.title} by ${book.author}")
def display(mag: Magazine) = println(s"${mag.title} (${mag.pageCount} pages)")
You can think of function display as two services accessed uniformly, or as a single
service that takes a different form for books and for magazines—a polymorphic service.
The second variant is parametric polymorphism, in which a single implementation is parameterized by one or more types:
Scala
def withHash[T](value: T): (T, Int) = (value, value.##)
This function combines a value with its hash code, as a pair. It is parameterized by a
type T and is polymorphic in the sense that it exists in multiple forms, one for each
possible type argument. This single parameterized function represents a collection of
type-specific functions—for instance:
Scala
def bookWithHash(book: Book): (Book, Int) = (book, book.##)
def magazineWithHash(magazine: Magazine): (Magazine, Int) = (magazine, magazine.##)
Instead of defining multiple functions, you invoke the same withHash on books and
magazines to obtain values of type (Book,Int) or (Magazine,Int) as necessary:
Scala
val hashedBook: (Book, Int) = withHash(book)
val hashedMagazine: (Magazine, Int) = withHash(magazine)
The third variant, subtype polymorphism, relies on types related by subtyping. Consider a function that operates on any publication:
Scala
def printTitles(pubs: List[Publication]) = for pub <- pubs do println(pub.title)
Function printTitles can be applied to lists of books and magazines because both types are subtypes of Publication; the invocation pub.title is dispatched at runtime, based on the actual type of pub. This is known as single dispatch: A call x.m(y) selects an implementation of method m based on the runtime type of x. Some languages, such as C# and some Lisp
variants, implement multiple dispatch to dynamically choose which method to invoke
based on arguments other than the target—deciding on an implementation of method m
using the types of both x and y in the preceding example. Because Scala supports only
single dispatch polymorphism, the following code cannot be compiled, even with both
display functions from Listing 15.1 defined:
Scala
// rejected by the compiler
def displayCollection(pubs: List[Publication]) = for pub <- pubs do display(pub)
The call display(pub) is resolved at compile time. It fails because there is no display
function defined for an argument of type Publication, even though at runtime, pub
will be of type Book or Magazine, for which such a function exists.
Of course, you could query types at runtime within the program to select a suitable
function:
Scala
// DON'T DO THIS!
def displayCollection(pubs: List[Publication]) =
for pub <- pubs do
pub match
case book: Book => display(book)
case magazine: Magazine => display(magazine)
The main drawbacks of this approach are its verbosity and rigidity. Not only do all
existing subtypes need to be enumerated explicitly, but a later addition of a new subtype
will call for an update everywhere this pattern is used. For instance, if you create a new
subtype of Publication, say Report, and displayCollection is called on a list that
contains reports, it will fail with a MatchError exception because the pattern-matching
code has no case to handle reports.
Instead, you should design your code to leverage subtype polymorphism. In the case
of displayCollection, this means replacing the display functions of Listing 15.1 with
a display method invoked on a target of type Publication:
Scala
trait Publication:
def title: String
def pageCount: Int
def display(): Unit
case class Book(title: String, author: String, pageCount: Int) extends Publication:
def display() = println(s"$title by $author")
case class Magazine(title: String, number: Int, pageCount: Int) extends Publication:
def display() = println(s"$title ($pageCount pages)")
Heavy use of explicit runtime type testing and casting—including through pattern
matching—is often a sign of a flawed design that should be modified to rely instead on
subtype polymorphism.
A Java version of printTitles behaves the same way; the type List refers this time to Java's List interface. As before, printTitles relies on
subtype polymorphism, and invokes the title method that corresponds to the runtime
type of object pub. You can call it on a mixed list of books and magazines:
Java
List<Publication> pubs = List.of(book, magazine);
printTitles(pubs); // prints both titles
If the Java compiler lets you call printTitles on a List<Book> value, it must also
let you call printTitlesAndAddMagazine on that same value—both functions have the
same signature, and an argument that is valid for one must be valid for the other. But
the compiler cannot allow a call to printTitlesAndAddMagazine on a List<Book> value
because this would cause a magazine to be added to a list of books, which is unsound
type-wise, since type Magazine is not a subtype of type Book. Type List<Book> is not
a subtype of type List<Publication> because it does not adhere to the substitution
principle: It does not support all the functionalities of the supertype. In particular, a
magazine can be added to a List<Publication> but not to a List<Book>.
The situation is different in Scala. You can apply function printTitles from List-
ing 15.3 to a List[Book] value:
Scala
def printTitles(pubs: List[Publication]) = for pub <- pubs do println(pub.title)
How come? If such a call is unsound in Java, why is it allowed in Scala? The key
difference is that Scala lists are immutable: You cannot write a Scala equivalent of the problematic printTitlesAndAddMagazine function—there is no method that adds a magazine to an immutable List[Book].
The relevant difference here is between [A], which specifies non-variance, and [+A],
which indicates covariance. Many immutable types in Scala use a covariant annotation:
Scala
class Tuple2[+T1, +T2] ...
class Option[+A] ...
class Vector[+A] ...
class HashMap[K, +V] ... // inside package immutable
class HashMap[K, V] ... // inside package mutable
class LazyList[+A] ...
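As a quick illustration of what covariance buys you—a minimal sketch reusing the books defined earlier:
Scala
val books: List[Book] = List(book1, book2)
val pubs: List[Publication] = books // accepted: List[Book] is a subtype of List[Publication]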
When you prepend a magazine to a list of books, for instance, no magazine is ever added to a List[Book]: The list of books is left untouched, and a new List[Publication] is created.
Covariance remains possible in the presence of some mutable state, as long as the type parameter never appears in an input position. Consider a reference that counts how many times it is read (the class around the get method is reconstructed here):
Scala
class Ref[+A](contents: A):
   private var count = 0

   def get(): A =
      count += 1
      contents
Methods in class Ref[+A] can return values of type A, but any method that declares
an argument of type A—like method set—will be rejected. If you want a resettable
reference, you will need to declare it as class Ref[A] and make it non-variant.
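Such a resettable, non-variant reference might look as follows (a sketch):
Scala
class Ref[A](private var contents: A): // A appears in both input and output positions
   def get(): A = contents
   def set(value: A): Unit = contents = value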
Covariance has a dual concept called contravariance: If C is contravariant and S is
a subtype of T, then C[T] is a subtype of C[S]. The contravariance annotation in Scala
is a minus sign:
Scala
class TrashCan[-A](log: Logger):
def trash(x: A): Unit = log.info(s"trashing: $x")
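Contravariance can then be observed directly—a sketch, assuming a Logger value named log:
Scala
val allBin = TrashCan[Publication](log)
val bookBin: TrashCan[Book] = allBin // accepted: TrashCan[Publication] is a subtype of TrashCan[Book]
bookBin.trash(book1) // fine: a book is a publication, which allBin can trash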
The same annotations appear in Scala's function types. Trait Function1[-A, +B]—one contravariant input, one covariant output—is the type of functions with a single argument of type A and a return value
of type B—also denoted A => B. If S is a subtype of T, type A => S is a subtype of
A => T. By returning a value of type S, a function of type A => S does indeed implement
the functionality of a function of type A => T: It consumes a value of type A and
produces a value of type T—since every value of type S is also of type T. But because of contravariance on the input side, type T => B is also a subtype of type S => B, by the same argument used for
class TrashCan—type TrashCan is basically A => Unit. Functions of multiple arguments
follow the same pattern:
Scala
trait Function2[-T1, -T2, +R] extends ...
trait Function3[-T1, -T2, -T3, +R] extends ...
...
Function types are thus contravariant in their input types
and covariant in their output type. Indeed, if you have a higher-order
function defined as
Scala
def higherOrder(f: Book => Publication) = ...
you can safely call it on an argument of type Publication => Magazine. As a function,
this argument can be applied to values of type Book (it can be applied to any publication
type) and will produce values of type Publication (specifically, magazines), so it is
consistent with the type requirements of argument f.
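Both directions can be checked in a single assignment (a sketch, using the document's ... placeholder for an arbitrary implementation):
Scala
val f: Publication => Magazine = ...
val g: Book => Publication = f // accepted: Book <: Publication (input), Magazine <: Publication (output)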
Peculiarities might arise when variance is considered. Here are two examples. First,
immutable sets in Scala are non-variant. The reason is that Set[A] is a subtype of
A => Boolean. The data structure view calls for covariance, but the function type would
require contravariance.
Second, while all other data structures are non-variant in Java, arrays are covariant.
This design choice predates the introduction of generics and was necessary to improve
flexibility. For instance, you can define a sorting function with signature Object[] and
apply it to a String[] value—String[] is a subtype of Object[]. Since arrays are
mutable, however, this covariance leads to type unsoundness:
Java
void printTitlesAndAddMagazine(Publication[] pubs) {
for (Publication pub : pubs) System.out.println(pub.title());
pubs[0] = new Magazine(...);
}
If you call this function on a Book[] value—which the compiler allows, since Book[] is a subtype of Publication[]—the assignment on its last line compiles but fails at runtime with an ArrayStoreException: The underlying array cannot actually store a magazine.
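15.9 Type Bounds
Instead of requiring exactly Set[Publication], a function like printTitles can accept any set of publications by constraining the element type with an upper bound. Two such variants, reconstructed here to match the discussion that follows (they are alternatives—after erasure, they could not be overloaded side by side):
Scala
def printTitles(pubs: Set[? <: Publication]): Unit = for pub <- pubs do println(pub.title)
def printTitles[A <: Publication](pubs: Set[A]): Unit = for pub <- pubs do println(pub.title)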
The last two variants of printTitles can be called on a value of type Set[Book], or
Set[Magazine], or Set[Publication].
The use of an explicit name A for the unknown type is usually preferred because
it helps with type inference. For instance, the current Scala compiler rejects the first
implementation of (useless) function f shown here but accepts the second:10
Scala
def f(pubs: Set[? <: Publication]) = pubs += pubs.head // cannot be compiled
def f[A <: Publication](pubs: Set[A]) = pubs += pubs.head // OK
10 The Set type is always non-variant in Scala. Some examples in this section use a mutable set,
which defines a method “+=”. Other examples leave the mutability of the set unspecified.
You can use “<:” to specify an upper bound, meaning that you require a type
variable to be a subtype of another type. You can use the opposite operator “>:” to
specify a lower bound, requiring a type variable to be a supertype of another type:
Scala
def addMagazine[A >: Magazine](pubs: Set[A]): Unit = pubs += Magazine(...)
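Bounds in one direction only do not suffice here. The two rejected attempts discussed next might look like this (a sketch, reconstructed from the discussion):
Scala
// rejected by the compiler
def printTitlesAndAddMagazine[A <: Publication](pubs: Set[A]): Unit =
   for pub <- pubs do println(pub.title)
   pubs += Magazine(...) // A could be Book

// rejected by the compiler
def printTitlesAndAddMagazine[A >: Magazine](pubs: Set[A]): Unit =
   for pub <- pubs do println(pub.title) // A could be AnyRef
   pubs += Magazine(...)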
The first function is rejected because it attempts to add a magazine to a Set[A] value,
but all you know is that type A is a subtype of Publication. There is no guarantee
that type A is a supertype of type Magazine—it could be Book, for instance. The second
function fails because it tries to invoke a title method on a value of type A, but A is only
known to be a supertype of Magazine, not necessarily a subtype of Publication—type A
could be AnyRef, for instance. To make it work, printTitlesAndAddMagazine needs
A to be both a subtype of Publication and a supertype of Magazine. You implement
it by specifying both a lower bound and an upper bound:
Scala
def printTitlesAndAddMagazine[A >: Magazine <: Publication](pubs: Set[A]): Unit =
for pub <- pubs do println(pub.title)
pubs += Magazine(...)
Listing 15.9: Example of a type parameter with both lower and upper bounds.
In more complex scenarios, you can combine type bounds with type intersections to
specify multiple upper or lower bounds:
Scala
def printTitlesInOrder[A <: Publication & Ordered[A]](pubs: Set[A]): Unit =
for pub <- SortedSet.from(pubs) do println(pub.title)
Listing 15.10: Example of a combined upper type bound; contrast with Lis. 15.14.
Publications can be gathered in a sorted set because type A is known to be an ordered
type, in addition to being a subtype of Publication.
Most of the functions used as illustrations in this section can be written in Java:
Java
<A extends Publication> void printTitles(Set<A> pubs) {
for (Publication pub : pubs) System.out.println(pub.title());
}
Java uses S extends T and ? extends T to express upper bounds and ? super T
to express lower bounds. Java has no S super T syntax, making it necessary to use a
wildcard in the definition of function addMagazine. As of this writing, lower bounds and
upper bounds cannot be combined in Java, preventing you from writing an equivalent
to function printTitlesAndAddMagazine from Listing 15.9.
When designing libraries that depend on non-variant types, it is good practice to rely
on type bounds for increased flexibility. Consider, for instance, a function that executes
tasks in parallel and returns a list of their outputs.11 Tasks are specified as no-argument
functions:
Java
public <A> List<A> runInParallel(List<Function0<A>> tasks) {...}
Suppose a type BookPublisher implements Function0<Book>, and that the signature of runInParallel is relaxed to List<? extends Function0<A>> so that lists of Function0 subtypes are accepted. You can then call the function on a List<BookPublisher>. However, it then returns a
value of type List<Book> specifically:
Java
List<BookPublisher> bookPublishers = ...
List<Book> books = runInParallel(bookPublishers); // OK
List<Publication> pubs = runInParallel(bookPublishers); // rejected by the compiler
For the last line to work, type A would have to be Publication. But BookPublisher
is a subtype of Function0<Book>, not a subtype of Function0<Publication>, and
function types are not covariant in Java: Function0<Book> is not a subtype of
Function0<Publication>.
You can further improve the runInParallel function by using a second type bound,
inside the function type:
Java
public <A> List<A> runInParallel(List<? extends Function0<? extends A>> tasks) {...}
Listing 15.11: Using type bounds on non-variant types for increased flexibility.
This second bound makes it possible to call runInParallel(bookPublishers) and have
the return value be of type List<Publication> because BookPublisher is a subtype
of Function0<? extends Publication>. You can define a similar function in Scala:12
Scala
def runInParallel[A](tasks: List[() => A]): List[A] = ...
The definition is simpler and needs no type bounds because, in Scala, lists are covariant
and functions are also covariant in their return type. Type A can be Publication
because BookPublisher is a subtype of () => Book, and () => Book is a subtype of
() => Publication.
12 An even more flexible function would use kinds (higher-order types) so a list is returned from a list of tasks, a vector from a vector of tasks, and so on.
Structural subtyping offers yet another form of flexibility: requiring only that an argument define certain members, whatever its declared type. Suppose a Report type is defined outside the Publication hierarchy. Even though reports have titles, you cannot use a function printTitle defined as
Scala
def printTitle(pub: Publication): Unit = println(pub.title)
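because Report is not a subtype of Publication. With a structural type, printTitle can instead require only the presence of a title method. A minimal sketch, using Scala 3's reflection-based member selection and an illustrative Report class:
Scala
import scala.reflect.Selectable.reflectiveSelectable

case class Report(title: String, author: String) // not a Publication

def printTitle(doc: { def title: String }): Unit = println(doc.title)

printTitle(Report("Annual Report", "M. Morrel")) // prints "Annual Report"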
Of course, this increased flexibility comes with the same risks discussed in Section 15.6
in the context of Python:13
Scala
val person = Noble(name = "Edmond Dantès", title = "Comte de Monte-Cristo")
printTitle(person) // prints "Comte de Monte-Cristo"
13 Additionally, the default implementation of this mechanism in Scala uses Java reflection, with a
non-negligible runtime performance cost.
There is a better alternative to structural subtyping, one that has its origins in the
world of functional programming: type classes. This concept appears complicated at
first glance, but it is quite powerful and used extensively in functional programming
libraries.
As with structural typing, the starting point is to define a type of values with a title:
Scala
trait Titled[A]:
def titleOf(document: A): String
Titled represents a type class, the class of all types that have a title. You can then use
an object of type Titled[A] to display the title of a value of type A:
Scala
def printTitle[A](doc: A, titledEvidence: Titled[A]): Unit =
println(titledEvidence.titleOf(doc))
Function printTitle takes a second argument as evidence that type A is “titled.” You
can apply printTitle on reports, as long as an object of type Titled[Report] is
available:
Scala
object ReportsAreTitled extends Titled[Report]:
def titleOf(report: Report): String = report.title
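With this evidence object in hand, the call looks as follows (a usage sketch, assuming a report value):
Scala
printTitle(report, ReportsAreTitled) // prints the report's title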
Note that you can create an evidence value like ReportsAreTitled for any existing
type, including one that predates function printTitle. In Scala, the evidence that
a type belongs to a type class is typically passed implicitly using a context bound:
Scala
def printTitle[A : Titled](doc: A): Unit =
val titledEvidence = summon[Titled[A]]
println(titledEvidence.titleOf(doc))
The syntax A : Titled indicates that you must pass an implicit value of type Titled[A]
to function printTitle, in addition to the document doc. This value can be retrieved
with the summon function. Once a value similar to object ReportsAreTitled is made
implicitly available, you can display report titles without mentioning the evidence:
Scala
given Titled[Report] with
def titleOf(report: Report): String = report.title
If you define “given” values of type Titled[Book] and Titled[Magazine], you can
display book and magazine titles as well.
For convenience, you can add an apply method to retrieve the implicit argument
more easily:
Scala
object Titled:
def apply[A : Titled]: Titled[A] = summon[Titled[A]]
You can also add an extension so “titled” values have a title method:
Scala
extension [A : Titled](document: A) def title: String = Titled[A].titleOf(document)
Let’s put all these elements together in a typical application of type classes:
Scala
// define a Titled type class, with convenience methods
trait Titled[A]:
   def titleOf(document: A): String

object Titled:
   def apply[A : Titled]: Titled[A] = summon[Titled[A]]

extension [A : Titled](document: A) def title: String = Titled[A].titleOf(document)

// a function that relies on the type class, and given instances for three types
def printTitle[A : Titled](doc: A): Unit = println(doc.title)
given Titled[Report] = _.title
given Titled[Book] = _.title
given Titled[Magazine] = _.title

printTitle(report) // OK
printTitle(book) // OK
printTitle(magazine) // OK
A shorter syntax, _.title, is used to define the given values. Since Titled is a SAM inter-
face, you can use a function to define a Titled object and thus rely on partial application
or lambda expressions.
When you define type classes, you rely on ad hoc polymorphism—various titleOf
methods with distinct signatures—instead of subtype polymorphism—the same
title method implemented differently in separate subtypes. This improves flexibility
because existing types can easily be added to a type class without modification: The
necessary code, like method titleOf, resides outside the type.
For instance, this definition of method printTitlesInOrder is preferable to that of
Listing 15.10, which uses a Publication & Ordered[A] bound:
Scala
def printTitlesInOrder[A <: Publication : Ordering](pubs: Set[A]): Unit =
for pub <- SortedSet.from(pubs) do println(pub.title)
Listing 15.14: Example of combining type bounds and type classes for flexibility.
This variant of the function is more flexible because you can use it on any publication
type A for which an Ordering[A] value is defined, including publication types that
are not subtypes of Ordered. For instance, thanks to implicit conversions from the
standard library, printTitlesInOrder can now be called on publications that extend
Java’s Comparable interface.
There are several benefits to using type classes instead of structural types. First, an
attempt to print the title of a noble person is now rejected:
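Assuming person is the Noble value from before, the call no longer compiles (a sketch):
Scala
printTitle(person) // rejected by the compiler: no given instance of Titled[Noble]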
Second, you can use printTitle on documents that conceptually have a title but
define no title method. You simply need to create a suitable evidence object:
Scala
case class Memo(header: String)
given Titled[Memo] = _.header

val memo = Memo("Famous Counts")
printTitle(memo) // prints "Famous Counts"
Even though memos have headers instead of titles, you can call printTitle on memos.
Third, values and methods can be associated with each type class:
Scala
trait Titled[A]:
def titleOf(document: A): String
def logger: Logger = Logger.getAnonymousLogger
This type class associates a logger with each titled type, so book-related activities, for
instance, are logged separately from other publications. This is something you cannot
do with plain subtype polymorphism. (See also the case of zero and fromInt in the
next example.)
In Scala, type classes are used extensively in the design of libraries. The standard
library defines several type classes like Ordering (used earlier), Numeric (numbers with
addition and multiplication), and Fractional (a subclass of Numeric with division). As
a final illustration, consider a function that averages a list of numbers:
Scala
def average(numbers: Seq[Double]): Double =
if numbers.isEmpty then 0.0 else numbers.sum / numbers.length.toDouble
This function returns zero if the sequence of numbers is empty. Otherwise, it divides
the sum of all the numbers by the length of the sequence to compute the average. You
can generalize this function to any type A that supports arithmetic operations:
Scala
def average[A : Fractional](numbers: Seq[A]): A =
if numbers.isEmpty then Fractional[A].zero
else Fractional[A].div(numbers.sum, Fractional[A].fromInt(numbers.length))
You can use this generalized average on any element type that has a Fractional instance—doubles as well as big decimals, for instance:
Scala
val doubles: List[Double] = List(1.2, 2.4)
val decimals: List[BigDecimal] = List(1.2, 2.4)
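Both lists then go through the same polymorphic code (a usage sketch):
Scala
average(doubles) // a Double
average(decimals) // a BigDecimal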
Note that even if Double and BigDecimal implemented a common numerical type—
say, Number—you could not rely on subtype polymorphism to implement the average
function because there would be no way to obtain a value zero and a function fromInt
for a given subtype A.
15.11 Summary
• Typing strategies vary from programming language to programming language.
They are often compared in terms of static (compile time) versus dynamic (run-
time), and/or strong (fussier) versus weak (looser). The terminology is somewhat
ambiguous and not universally agreed upon.
• A more important characteristic of type systems is that they vary in terms of safety
(how effective they are at catching mistakes), flexibility (the degree to which they
enable rather than prevent a preferred design), and simplicity (how easy they are
to understand and exercise). Languages typically favor two of these attributes at
the expense of the third.
• One possible view of types is that they are sets of values. Some sets, such as
String and List[Int], are large. Others, such as Unit and Boolean, are small.
If a variable has a given type, its value always belongs to the corresponding set.
The smaller the type, the more valuable this information will be.
• The dual view seeks to identify types with the services—for example, methods and
attributes—that they define. When defining a type as an interface in an object-
oriented language, for instance, the focus is not so much on the set of values this
interface can contain, but rather on the operations that can be applied to these
values.
• Abstract data types (ADTs) are a formal model of types that is based on the
dual view of sets and operations. ADTs define the semantics of all the operations
available on a type, typically in the form of axioms. Programming language types
are only an approximation of ADTs, as they usually specify the signatures of their
operations, but not their semantics.
15.11 Summary 251
• Typically, immutable data structures can safely be made covariant, while mutable
structures are only type-safe when they are non-variant. In languages that support
functional programming, function types are often covariant in their output types
and contravariant in their input type. For simplicity, languages may deliberately
introduce type unsoundness by making some mutable structures covariant, as is
the case with arrays in Java and C#.
• To increase flexibility, code that uses non-variant types can rely on type bounds—
also called use-site variance annotations—instead of specifying exact types. An
upper bound is used to constrain a type argument to be a subtype of another
type; a lower bound is used to require it to be a supertype of another type.
• Ad hoc and parametric polymorphism are sometimes packaged into the notion
of type class, especially in languages that have no support for subtype polymor-
phism. Type classes can be used to uniformly access services on values of different
types, even when those services are not related by a common supertype. In par-
ticular, types developed independently from a function that uses a type class can
be adapted and used as arguments to this function.
Part II
Concurrent Programming
Chapter 16
Concepts of Concurrent Programming
Part I started with the observation that there is no single definition of functional pro-
gramming. Unfortunately, there is no universally agreed-upon definition of concurrent
programming either, and defining concurrent programming is at least as difficult a task
as defining functional programming was in Part I of the book. The issue is similar: The
terminology is ambiguous, and different people choose to emphasize different aspects.
Nevertheless, from the indisputable premise that concurrent programs do not execute
in a purely sequential manner, several key concepts of concurrent programming emerge,
including synchronicity, atomicity, threads, synchronization, and nondeterminism.
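The chapter's starting point is a three-line sequential program, shown here as a Scala sketch equivalent to the pseudocode the text refers to:
Scala
println('A')
println('B')
println('C')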
This program is sequential and prints A, B, and C, in that order. You can break sequen-
tiality by letting the second print statement execute out of order. In Scala, it could be
written as follows:
Scala
println('A')
Future(println('B'))
println('C')
The second print statement is scheduled to be executed out of order, and the program
output is no longer guaranteed to be A followed by B followed by C. Out-of-order execu-
tion can also be triggered in other languages like Kotlin:
Kotlin
println('A')
launch { println('B') }
println('C')
or JavaScript:
JavaScript
console.log('A');
Promise.resolve().then(() => console.log('B'));
console.log('C');
In the initial pseudocode, println('B') is a synchronous call: It takes place here and
now, within the current run, between the printing of A and the printing of C. By contrast,
the Scala line Future(println('B')), the Kotlin line launch { println('B') },
and the JavaScript line Promise.resolve().then(() => console.log('B')) have
the similar effect of invoking the printing of B asynchronously, not here and/or not
now, out of the current flow of operations.
As a noncomputing analogy, if I order a sandwich and stand at the counter while
the sandwich is being prepared in front of me, that’s synchronous. If instead I place my
order, get a number, and don’t wait for the sandwich to be ready before leaving the
counter, that’s asynchronous.1
So, the three programs just shown are no longer sequential. Are they concurrent?
The JavaScript program uses a single thread and guarantees that B will be printed
after A and C, in an ACB order. The printing of B is not happening “now,” but in a
way it is still happening “here,” within the same thread of execution. Given that the
program is single-threaded, and that the ACB order is guaranteed, you can even argue
that the execution is still sequential, only in a different order.
Is concurrency, then, tied to the use of multiple threads? Depending on how they
are configured, the example Scala and Kotlin programs may or may not rely on an
additional thread to execute the printing of B. Can the same program be concurrent or
not, depending on external configuration? Even if we assume that the Scala and Kotlin
programs are configured to use multiple threads when Future or launch is invoked,
does that necessarily make them concurrent? Consider these variants:
Scala
println('A')
Await.ready(Future(println('B')), Duration.Inf)
println('C')
Kotlin
println('A')
launch { println('B') }.join()
println('C')
Because of added synchronization, these programs, even when they use multiple threads,
will print A, B, and C sequentially, in the ABC order. You could say that the printing of B
is not taking place “here,” in the current thread of execution, but it is still happening
“now,” between A and C. Does this make the programs sequential, or are they still
concurrent? Are there multithreaded sequential programs?2 Furthermore, even in the
earlier, non-synchronized versions of the programs, A, B, and C are bound to be printed
in some order, one at a time, sequentially. It’s not as if two letters could end up on
top of each other, or even on the same line. So, what is concurrent in these nominally
non-sequential programs?
2 You might argue that these last two examples are silly and artificial (they are). But more realistic
illustrations exist. For instance, the code that handles incoming messages in actors (Section 27.5) is
typically executed sequentially by multiple threads, one at a time.
Let’s modify the Scala program to use a slow printing function:
Scala
def slowPrint(x: Any) =
var n = BigInt("1000000000")
while n > 0 do n -= 1
println(x)
slowPrint('A')
Future(slowPrint('B'))
slowPrint('C')
Printing each letter now takes time because it requires counting down from a large num-
ber using big integers. Using a single-threaded configuration, on the multicore computer
used for typesetting this book, the program takes 7.6 seconds to complete and prints the
letters in the ACB order, like the JavaScript program. With a multithreaded configura-
tion, the running time is reduced to 5.2 seconds, about two-thirds of the single-threaded
time. This is because the slow printing of B and the slow printing of C are now running
concurrently, inside two separate threads of execution. The time is reduced from
(printing of A) + (printing of B) + (printing of C)
to
(printing of A) + max(printing of B, printing of C)
Two of the computer’s cores perform the slow countdown at the same time, in parallel,
even if the actual printing of the two letters on the terminal still happens sequentially,
B before C or C before B. In other words, there is concurrency at the level of two cores
executing big integer operations, but printing on a shared terminal remains sequential
(as it should), even when multiple threads are involved.
NOTE
At this point, you may expect the obligatory discussion of concurrency versus parallelism to take
place. In books and Internet blogs, this is often considered to be a fundamental point to be addressed
early on. I disagree. As definitive as the many explanations aim to be, they tend to be inconsistent
with one another. If you corner me, as students have, and insist that I make a distinction, my
view—inspired by the Association for Computing Machinery (ACM) Computer Science Curricula
2013—is that parallelism is the concurrency I want (for speed!), and concurrency is the parallelism
I need to deal with (all those things happening at the same time in my program). However, my
argument here is that it doesn’t matter much. By and large, you face the same programming chal-
lenges whether you attribute them to parallelism or to concurrency. For the purposes of this book,
it is enough to stick with basic dictionary definitions and to think of parallelism and concurrency as
synonymous, both representing the idea of several actions taking place at (about) the same time.
In the following chapters, I use both terms interchangeably.
• Synchronicity contrasts synchronous actions, which take place within the current flow of operations, with asynchronous actions that do not happen “here” and “now,” but instead elsewhere (another
thread) or later, or both.
• Threads of execution, or threads for short, are a common source of concurrency in
programs, but they are not the only one: Processes tend to run concurrently as
well, and external events such as user clicks or incoming server connections also in-
troduce concurrency. Furthermore, programmers often handle concurrency at non-
thread levels—for instance, in terms of tasks, or futures, or actors, or coroutines.
• Atomicity defines elementary program units, the “parts” that can be executed
out of order, or “at the same time.” In the examples in Section 16.1, for instance,
decrementing a BigInt by one is an atomic operation, but counting down from
1,000,000,000 to zero is not. Similarly, a letter is printed on the terminal, or it
is not; it cannot be “partially printed.” In contrast, printing multiple letters is
not necessarily atomic. The single-threaded JavaScript program guarantees that
the A and C lines form an atomic unit, with no possible B in between, while the
Scala and Kotlin programs—when configured to use multiple threads—do not. As
a programmer, you must be aware of what needs to be atomic in your program
and of the atomicity guarantees of the languages and libraries with which you are
programming.
• Synchronization is often needed to coordinate concurrent activities. You can
use synchronizers, such as locks and semaphores, to reduce the concurrency of
a program to a level that does not jeopardize the program’s correctness. When
synchronizers are misused—as I did earlier with Scala’s Await and Kotlin’s join—
they can turn a concurrent program into a sequential one, or even prevent the
program from running at all (see the discussion of deadlocks in Section 22.3).
• Nondeterminism is a natural consequence of concurrency and is the bane of concur-
rent programming. The most salient difference between the JavaScript program,
on the one hand, and the Scala and Kotlin programs, on the other hand—when
configured to be multithreaded, without additional synchronization—is that the
former can produce the three letters in only one possible sequence, ACB, while
the others may produce different sequences in different runs. The impact of such
nondeterministic behaviors on testing and debugging is momentous.
16.3 Summary
There is no need to delve into a philosophical discussion of what constitutes concurrent
programming to introduce several of the core concepts that underlie many programming
language constructs, such as threads of execution, asynchronous calls, atomic actions,
synchronization, and nondeterminism. The art of concurrent programming is to write
programs that properly balance multiple facets of concurrency: Too many threads (or
260 Chapter 16 Concepts of Concurrent Programming
not enough) may impact performance; not enough synchronization (or too much) may
impact correctness; atomicity is key to reasoning about concurrent systems, but non-
determinism makes such reasoning difficult; and so on. The following chapters explore
these intertwined concepts of concurrent programming through code illustrations in
Scala, Java, and Kotlin.
NOTE
All of the code examples in this book rely on a JVM implementation of the languages. Although
most examples from Part I would behave similarly on a different implementation—both Scala and
Kotlin define an alternative, JavaScript-based implementation—this is less true of the code illustra-
tions in Part II. Multithreading typically involves a close interaction with the operating system (OS).
Threads, in particular, are often managed by the OS as lightweight processes. In the remainder of
the book, the discussion of concurrent programming is predicated on a standard JVM running on a
Unix system. Some comments and explanations would need to be adjusted if code is compiled and
executed on a different platform, such as Android.
Chapter 17
Threads and Nondeterminism
Sequential programs have a single thread of execution. Programs can be made con-
current by creating additional threads so that multiple threads execute a program in
parallel. How threads access hardware computing resources is typically orchestrated by
a scheduler—often in the operating system—which is not under a programmer’s con-
trol. As a result, and in the absence of additional synchronization, the exact order in
which multiple threads jointly execute their instructions is unpredictable. Consequently,
multithreaded programs tend to exhibit nondeterministic behavior. This nondetermin-
ism vastly complicates testing and debugging.
Scala
println("START")
println("END")
This program relies on a println function that is defined to add thread and timing
information to the message being displayed.2 Here, a thread called “main,” created and
started by the JVM, executes both statements in sequence, within a few milliseconds,
at which point the program ends and the JVM terminates.
To create additional threads (besides main), Java defines a Thread class, and JVM languages rely on Java's mechanisms for thread creation. You instantiate the Thread class with a target—an object with a run method—and invoke start on the instance to execute the target in a new thread.
1 More precisely, user code is executed within a single thread of execution. Typically, a JVM uses
additional threads for garbage collection, just-in-time compilation, and other operations.
2 In the following chapters, many code examples use this modified println function—or its printf equivalent.
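The LetterPrinter runnables used next are not defined in this section; a minimal version consistent with their use might be (a sketch):
Scala
class LetterPrinter(letter: Char) extends Runnable:
   def run() = println(letter)

Three printer threads are then created and started as follows:
Scala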
val tA = Thread(LetterPrinter('A'))
val tB = Thread(LetterPrinter('B'))
val tC = Thread(LetterPrinter('C'))
tA.start()
tB.start()
tC.start()
println("END")
The START and END messages are printed by the main thread, as before. Letters A, B,
and C are printed by three different threads, named Thread-0, Thread-1, and Thread-2,
respectively. When you invoke start on tA, tB, and tC, you effectively introduce paral-
lelism in the system. After the three new threads are started, there are four running
threads in the program—main, Thread-0, Thread-1, and Thread-2—and all the print-
ing statements (A, B, C, and END) happen together at about the same time.
It is good practice to name threads as they are created. This makes it easier to read
logs and thread dumps (see Section 22.4). To name them, you could create the three
threads tA, tB, and tC as shown here:
Scala
val tA = Thread(LetterPrinter('A'), "printerA")
val tB = Thread(LetterPrinter('B'), "printerB")
val tC = Thread(LetterPrinter('C'), "printerC")
Because Runnable is a SAM interface, you can also create the threads from lambda expressions:
Scala
val tA = Thread(() => println('A'), "printerA")
val tB = Thread(() => println('B'), "printerB")
val tC = Thread(() => println('C'), "printerC")
In Java, the target of a thread can be given as a method reference to a small helper:
Java
void printB() {
println('B');
}
Of particular interest is the fact that the three letters are printed in CAB order in the
first run, but in ABC and BAC order in the next two. This is an illustration of the non-
determinism of multithreaded programs.
When the start method is called on a Thread object, the runtime needs to create
an actual thread—often a lightweight process managed by the OS—and schedule it for
execution. How the runtime schedules threads is not under a programmer’s control. A
scheduler allocates time quanta to threads in ways that are unknown at the level
of an application. When I ran the demo program, my computer reported 2985 active
threads—most of them outside the JVM—sharing eight processor cores. All the run-
ning threads are constantly swapped in and out of execution at times you don’t know
and can’t choose. As a consequence, you cannot know exactly when your threads execute
their instructions and for how long before they are suspended by the OS to let other
threads run. The exact order in which the letters are printed by the example program
is unpredictable and typically varies from run to run.
Notice also that in the last two runs, the main thread terminates before the printing
is complete. After the main thread invokes start on another thread, it continues its
own run, and the runnable target is executed later, inside a separate thread. Because
there are other active threads, the JVM keeps running after the main thread is finished.
It stops only after all the threads are terminated.3
3 Except for daemon threads, which do not prevent a JVM from terminating. Daemon threads are seldom used.
Threads terminate when they reach the end of their run method. A common pattern is a run method that loops until a termination condition is satisfied, then cleans up:
Scala
def run() =
while continue do
// perform task
end while
// cleanup
end run
Listing 17.1 can be modified to make the main thread pause until the three printing
threads are finished:
Scala
...
tA.join()
tB.join()
tC.join()
println("END")
Listing 17.4: Waiting for thread termination without a timeout; see also Lis. 17.5.
With the added calls to join, the main thread will not reach its final print statement
until the other three threads are terminated, thus guaranteeing that the END message
appears last—the A, B, and C messages can still appear in any order. Some of the
nondeterminism of the program execution is thus reduced by the use of additional
inter-thread synchronization, a topic that is discussed in more detail in Chapter 22.
A variant of join is also available with a timeout. You must use it in conjunction
with isAlive:
Scala
runner.join(500) // timeout in milliseconds
if runner.isAlive then
// handle timeout case
else
// runner is terminated
The invocation runner.join(500) blocks the calling thread until thread runner ter-
minates or 500 milliseconds has elapsed, whichever comes first.
Method join is often used to implement a pattern, known as fork-join or scatter-
gather, in which a thread creates several worker threads, starts them on their tasks
(scatter), and waits for their termination to assemble the results (gather). Variations of
this fundamental pattern are explored in later sections, notably when waiting on tasks
instead of threads in Section 22.2 and when scheduling a gathering of results without
waiting for task completion in Chapters 26 and 27.
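A minimal sketch of scatter-gather with raw threads (the helper name forkJoin and the task representation are illustrative):
Scala
def forkJoin[A](tasks: Seq[() => A]): Seq[A] =
   val results = new Array[Any](tasks.length)
   val workers = for (task, i) <- tasks.zipWithIndex yield
      Thread(() => results(i) = task(), s"worker-$i")
   workers.foreach(_.start()) // scatter: start every worker
   workers.foreach(_.join())  // gather: wait for all of them
   results.toSeq.asInstanceOf[Seq[A]]

Each worker writes into its own slot of results, and the calls to join guarantee that all these writes are visible before the results are assembled.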
A faulty multithreaded program may need to run many times to trigger an incorrect behavior. This situation creates a double headache for
developers:
• Testing: You can test a program successfully a million times, only to have it fail on
the next run of the tests—or even in production—because of an unfortunate timing
of the executions of threads. It can even be that the circumstances needed for
such an unfortunate timing exist only in the environment in which the application
is deployed, perhaps due to specific data (which impact running times) or the
presence of other activities running on the same hardware.
• Debugging: After you observe a failure (in testing or in production), it can be
difficult to reproduce it: The next execution of the exact same run may now
appear to be working. This complicates debugging tremendously. Even when you
can reproduce a failure, it is often the case that the faulty scenario involves high
levels of concurrency (e.g., hundreds of threads), which makes it very difficult to
track bugs. Even worse, any step you apply in the debugging process (breakpoints,
logging, etc.) impacts the synchronization and the timing of threads and can make
the bug undetectable (see the aside on “Heisenbugs” at the end of Section 18.3).
While there is no silver bullet for testing and debugging multithreaded programs, it
helps to keep a few basic principles in mind:
4 A failure observed only with a hundred threads does not mean that the program is correct with two threads, but only that the fault is easier to trigger and observe with a hundred threads.
5 This method has nothing to do with Scala's for-yield construct. However, because yield is a
keyword in Scala, the method must be invoked as Thread.`yield`(), enclosed in backticks.
• Use Thread.yield to increase interleavings.5 The method's documentation describes it as “a hint to the scheduler that the current thread is willing to yield
its current use of a processor.” While testing, you can insert calls to yield in a
program in an attempt to increase the number of ways actions are interleaved in
a run. (Section 18.3 on atomicity discusses examples of suitable uses of yield.)
Because threads can always be suspended at any time by the scheduler, using
yield does not produce scenarios that were not already possible, and therefore
added calls to yield should never cause a fault in a correct program.
• Name threads and log fine-grain activities. It is hard enough to cause an incorrect
concurrent program to fail using tests. It is even harder to figure out the sequence
of steps that caused the failure, especially when many threads are involved.
Detailed logging information—which unfortunately grows with the number of
threads—is often necessary to pinpoint a synchronization mistake. When logging,
you can access information on the thread that is executing a code fragment by
using the Java static method Thread.currentThread.
• Complement testing with static analysis tools. Static analysis techniques can be
applied at the design, source code, or compiled code level, and many such tech-
niques are primarily focused on concurrency issues. Because they do not rely on
the nondeterministic occurrence of a particular series of runtime steps, they can
often discover bugs that have escaped other forms of testing. (See the aside on
model checking at the end of Section 23.4.)
17.6 Summary
• By default, many programs are single-threaded. Concurrency/parallelism is intro-
duced by creating additional threads of execution. Threads can then run concur-
rently within the same program and interact with each other.
• JVM languages rely on Java’s Thread class to create threads. An instance of
this class is a regular object, with methods and fields. However, when its start
method is invoked, a new (real) thread of execution is created. This thread is
typically implemented as a lightweight process of the operating system, but could
also be handled directly by a JVM.
• The behavior of a new thread is specified by an object with a no-argument run
method, which contains the code to be executed by the thread. This object can
often be implemented as a function, using a lambda expression syntax.
• Threads terminate when they reach the end of their code, either normally or
abruptly through an exception. Threads can wait for the termination of other
threads via a method join, which is often used by a supervisor thread to wait for
completion of worker threads. A JVM terminates when all its application threads
are finished.
• Threads need to share computing resources with other computer activities, includ-
ing threads from other processes outside the JVM. A scheduling mechanism, typ-
ically part of the operating system, is in charge of allocating computing time to
all the threads. Threads are thus suspended—to let other threads run—and later
resumed in ways that are not under the control of the application. This unpre-
dictable scheduling of threads tends to make multithreaded programs nondeter-
ministic: The same program, on the same input, can exhibit different behaviors in
successive runs.
• Nondeterminism often makes testing and debugging of multithreaded programs
very tricky. A given test may succeed or fail unpredictably, and faulty scenarios
can be hard to reproduce when debugging. Bugs that are exhibited only rarely
and in the presence of many threads can be very hard to track.
Chapter 18
Atomicity and Locking
During the execution by a thread of a compound operation that consists of multiple
steps, other threads can observe and interfere with the state of an application as it
exists between these steps. By contrast, an atomic operation consists of a single step,
with no possibility of disruption during that step. Locking is a mechanism that can
be used to make a compound operation by a thread appear atomic to other threads,
thereby preventing them from interfering during its execution in ways that would be
detrimental to correctness. Different systems define their own locking mechanisms. In
particular, Java defines a basic form of locks that can be used with any JVM language.
18.1 Atomicity
Chapter 17 discussed a program in which three threads shared a terminal to output the
letters A, B, and C. The three letters could appear in different orders from run to run,
and this nondeterminism was due to the unpredictable scheduling of threads.
Consider this variant in which, instead of sharing a terminal, two threads (in addition
to main) share an integer value:1
Scala
// DON'T DO THIS!
var shared = 0

// two threads, each incrementing shared five times (thread definitions reconstructed)
val t1 = Thread(() => 5 times { shared += 1 }, "T1")
val t2 = Thread(() => 5 times { shared += 1 }, "T2")

t1.start(); t2.start()
t1.join(); t2.join()
println(shared)
1 Function times is used to repeat code n times. It was defined in Listing 12.4.
Two threads are started by the main thread. Each thread increments a shared integer
five times. The main thread waits for completion of all the increment operations and
then prints the value of the shared integer.
On its first run, the program’s output was
main at XX:XX:19.251: 10
In my experiment, the output was similar on the next run and, using a loop to repeat
execution, in the next 3189 runs. The 3190th run, however, produced this output:
main at XX:XX:26.180: 7
The remaining 96,810 runs all produced the value 10. In other words, two threads
performing five increments each increase the value of a shared integer by 10 most of the
time, but not always.
Looking at the bytecode2 resulting from the compilation of shared += 1 can be
informative:
bytecode
0: aload_1
1: getfield #92 // Field scala/runtime/IntRef.elem:I
4: iconst_1
5: iadd
6: istore_2
7: aload_1
8: iload_2
9: putfield #92 // Field scala/runtime/IntRef.elem:I
Field 92 corresponds to variable shared in Listing 18.1. The bytecode shows that
the field is read from memory (getfield), incremented by one (iconst_1/iadd),
and the new value stored in memory (putfield).
When threads T1 and T2 execute this code concurrently, the scenario depicted in
Figure 18.1 is possible. In this scenario, T2 reads the same value of shared read by T1
(say, 3), also increments it to 4 (as T1 did), and writes 4 back into shared. After both
threads have completed their execution of shared += 1, the value has changed from 3
to 4, and an increment has been lost. Because of overlapping read/write sequences by
separate threads during a run, you can end up with a final value of shared that is less
than 10, like the value 7 observed in the 3190th run.
This example illustrates that incrementing an integer variable is not an atomic oper-
ation. It is implemented using multiple instructions—at least reading, incrementing, and
writing back—and those steps can end up being interleaved among multiple threads in
arbitrary ways.
2 Of course, the bytecode could be different with another compiler and is often recompiled into
machine code by the JVM’s just-in-time compilers anyway. Nonetheless, looking at it helps emphasize
the non-atomicity of an integer increment.
Figure 18.1 An interleaved execution of shared += 1 by threads T1 and T2: Starting from shared = 3, both threads read the value 3, both compute 4, and both store 4 back—one increment is lost.
Scala
// DON'T DO THIS!
val shared = java.util.ArrayList[Int]() // t1 and t2 each add five elements (reconstructed)
val t1 = Thread(() => 5 times { shared.add(1) }, "T1")
val t2 = Thread(() => 5 times { shared.add(2) }, "T2")
t1.start(); t2.start()
t1.join(); t2.join()
println(shared.size)
Listing 18.2: Multiple threads share a list unsafely; fixed in Lis. 18.5.
As before, some runs display a size of 10 for the list at the end, while others finish with
fewer than ten elements. If you think about it, somewhere in the implementation of
ArrayList, an index needs to be incremented to fill the next slot in the array. Since
integer increments are not atomic and can produce incorrect values in a multithreaded
context, it is not surprising that multiple threads calling method add at the same time
can overwrite each other, resulting in lost values.
If you run the program enough times, it will also exhibit another interesting behavior.
The 6,402,829th run in my experiment produced an exception in thread T1:
Exception in thread "T1" java.lang.ArrayIndexOutOfBoundsException:
Index 3 out of bounds for length 0
at java.base/java.util.ArrayList.add(ArrayList.java:455)
...
An examination of the source code of the class ArrayList used in the test suggests an
explanation. Lists are created using an empty array with zero capacity. When a first
element is added to a list, this array is replaced with a newly allocated array (with a
capacity of 10). In the failed run shown here, thread T2 created this new array and used
it to insert three values into the list. Because class ArrayList does not include any
mechanism for synchronization, thread T1 then tried to insert its first value (4th list
element, at index 3) into the previous array, with zero capacity.
Functions typically consist of multiple statements, and most statements consist of
even more elementary instructions at the machine code level. These can be interleaved
in arbitrary ways by independent threads. As a result, code that was not designed to
handle such interleaving is likely to break when instances are shared among threads.
In particular, and with very few exceptions (Random is one), most classes in java.util
are not safe for multithreaded use without synchronization. If you improperly share
instances of classes such as HashMap and PriorityQueue between threads, they will
exhibit unpredictable behavior similar to what was demonstrated using ArrayList.
Note that the programs being tested in this experiment were designed to fail, and
they still can appear to behave correctly in thousands of successive runs. In these pro-
grams, threads interact in unrealistic ways, modifying shared data as often as possible,
which increases the chances for an undesirable interaction. In an actual application,
threads will perform independent work and only occasionally access a shared data struc-
ture at the same time. Failures will be more infrequent and harder to produce but still
potentially devastating.
Consider now a service that assigns a rank to each registering user. You can implement registration by incrementing a userCount value with each incoming user.
However, if you want it to work correctly when a service is shared among threads, you
cannot write the increment as userCount += 1, which is not atomic. Instead, you would
need to rely on a different type, one that implements an atomic increment. Java defines
a class AtomicInteger for this purpose:
Scala
private val userCount = AtomicInteger(0)
// DON'T DO THIS!
def getRank(): Int =
userCount.increment()
userCount.get
A single atomic integer is created and initialized with zero. The getRank function incre-
ments this integer using a method increment, which is designed to be safe when called
by multiple threads. This way, all the registrations are guaranteed to be recorded—no
lost increments as occurred earlier—and after ten users register, the value of userCount
will be 10.
However, the program is still incorrect. It does not satisfy the desired property
that all users are assigned unique ranks. A scenario is possible in which two users, say
the third and fourth, register at about the same time. Both threads execute userCount
.increment(), which brings the counter value to 3, then to 4. Both threads then execute
userCount.get, which returns the value 4 for both, and both users end up with identical
ranks (Figure 18.2).
userCount=2
Thread T1                  Thread T2
userCount.increment()
                           userCount.increment()
userCount.get → 4
return 4
                           userCount.get → 4
                           return 4
userCount=4
The increment that took multiple steps with a regular integer has become atomic,
but in getRank, incrementing and getting the incremented value are still two separate
steps. Indeed, Java’s class AtomicInteger does not define an increment method; I
added it only for the purposes of this illustration. Instead, the class defines a method
incrementAndGet, which first increments the integer, then returns the incremented
value, all in one atomic step. This is exactly what you need to implement getRank:
Scala
private val userCount = AtomicInteger(0)

def getRank(): Int = userCount.incrementAndGet()
Class AtomicInteger also defines a more general method, compareAndSet, which checks the current value and updates it in a single atomic step. It is reserved for advanced usage, and is illustrated in Chapter 27.
A test of this code succeeds sometimes, but sometimes fails. It fails when the atomic integer is accessed by another
thread between increment and get (first example), or between if and then (the second
example). You can make these failures more likely by inserting calls to Thread.yield
(see Section 17.5) in the code of function getRank:
Scala
def getRank(): Int =
userCount.increment()
Thread.`yield`()
userCount.get()
With added calls to Thread.yield, the test fails much more frequently. By contrast,
adding calls to Thread.yield in a correct program should never break it. It is often
a good testing strategy to insert such calls in places where a thread losing control of
the CPU is thought to potentially be unsafe; correct code should continue to behave
correctly.
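A related experiment shares a list between two threads that attempt to stay in lockstep. A sketch of the tasks involved (the class name and details are assumptions consistent with the discussion below):
Scala
val shared = ArrayList[String]()

class ThrottledAdder(str: String, n: Int) extends Runnable:
  def run(): Unit =
    var added = 0
    while added < n do
      if shared.size >= 2 * added then // don't run too far ahead of the other thread
        shared.add(str)
        added += 1

val t1 = Thread(ThrottledAdder("T1", 5), "T1")
val t2 = Thread(ThrottledAdder("T2", 5), "T2")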
t1.start(); t2.start()
t1.join(); t2.join()
println(shared.size)
In this variant, each thread checks that it is not too far ahead of the other thread
before adding a value into the shared list. A thread adds a new item to the list
only if the list is at least twice as long as the number of items already added by
that thread. When run on my computer, this program prints nothing and keeps
running.
In an attempt to help debug the problem, you can log the value of variable
added using a simple print statement:
Scala
...
while added < n do
println(added)
if shared.size >= 2 * added then
...
Remarkably, this modified program does terminate, with the expected output:
T1 at XX:XX:28.543: 0
T2 at XX:XX:28.543: 0
T1 at XX:XX:28.543: 1
T2 at XX:XX:28.543: 1
T1 at XX:XX:28.544: 2
T2 at XX:XX:28.544: 2
T1 at XX:XX:28.544: 3
T2 at XX:XX:28.544: 3
T2 at XX:XX:28.544: 4
T1 at XX:XX:28.544: 4
main at XX:XX:28.544: 10
The print statement is the likely culprit: Printing to the console involves internal locking, which has the side effect of making each thread's updates to the shared list visible to the other thread. Without any synchronization, nothing guarantees that a thread ever sees the elements added by the other thread, and its size test can keep failing forever. Visibility issues of this kind are governed by the Java Memory Model, discussed in Section 22.5.
18.4 Locking
The incorrect code examples so far in this chapter all make the same mistake: Threads
share mutable data without proper synchronization. There are different approaches
to solving this problem, some of which are discussed in later chapters, but locking
remains one of the most commonly used techniques to synchronize thread manipulation
of shared data.
The basic principle of locking is to guard portions of code (such as functions and
methods) with locks that need to be acquired before running the code. The most common form of lock is the exclusive lock, which can be acquired by only one thread at
a time. When a thread acquires an exclusive lock, the thread is said to own the lock.
Ownership ceases when the lock is released by the thread. While a thread owns an ex-
clusive lock, the lock is not available to other threads. Any attempt by another thread
to acquire the lock is denied, usually by blocking the requesting thread until the lock
becomes available.
In Listing 18.2, two threads add strings into a shared list, leading to undesirable
behaviors and incorrect output. To fix the problem, one strategy could be to have each
thread acquire an exclusive lock before it can add a string to the list:
pseudocode
exclusiveLock.lock()
shared.add(str)
exclusiveLock.unlock()
Because the lock is exclusive, you cannot have two threads inside the list’s add method
at the same time. Instead, one thread must complete the execution of the entire call to
add before it releases the lock and another thread starts the method again. In effect,
method add now appears to be atomic: It is not possible for a thread to partially run
the method and have another thread come in. In particular, the intermediate states
that the list can have while a thread is executing add are not visible to other threads—
assuming the list is not used elsewhere in the application.
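18.5 Intrinsic Locks
The JVM supports this pattern directly: Every object carries an intrinsic lock, also called a monitor. In Scala, a thread acquires the intrinsic lock of an object lock for the duration of a code fragment by invoking the object's synchronized method:
Scala
lock.synchronized(shared.add(str))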
or
Scala
lock.synchronized {
shared.add(str)
}
(Recall that Scala lets you invoke a single argument function using braces instead of
parentheses.) The lock is acquired upon entry into the specified code fragment and
released when the thread exits this code, either normally or by throwing an exception.
While a thread is inside the synchronized method, the lock is not available, and other
threads attempting to acquire it will be blocked. The value lock must be a reference to
an object. Any Java object can be used as a lock.4
4 When using locks, you should stick to objects that you have allocated. Some objects are created implicitly by the JVM—for instance, to box primitive values—and cannot reliably be used as locks.
The incorrect list sharing program of Listing 18.2 can be fixed using an intrinsic
lock:
Scala
val shared = ArrayList[String]()

// Adder as in Lis. 18.2, but with each call to add guarded by the list's lock:
class Adder(str: String) extends Runnable:
  def run(): Unit = 5 times shared.synchronized(shared.add(str))

val t1 = Thread(Adder("T1"), "T1")
val t2 = Thread(Adder("T2"), "T2")
t1.start(); t2.start()
t1.join(); t2.join()
println(shared.size)
Listing 18.5: Multiple threads share a list safely; fixed from Lis. 18.2.
Both threads need to lock the shared object before they can call its add method. Since
only one thread can own the lock at any given time, it is now impossible for multiple
threads to be inside the add method at the same time. The size of the list displayed at
the end of the program is now guaranteed to be 10.
In Listing 18.5, the lock is acquired and released before and after each call to add. As
an alternative, a thread could acquire the lock once and keep it while calling add mul-
tiple times:
Scala
def addStrings(n: Int, str: String): Unit = shared.synchronized {
n times shared.add(str)
}
In this case, locking makes the entire function addStrings appear atomic: One thread
adds all its strings to the list, followed by all the strings of the other thread. There will
be no interleaving of "T1" and "T2" in the list.
Which version of addStrings is right? It depends on the needs of your application.
If, for instance, a list of strings is shared for the purpose of logging, and threads call the
add method to add a logging message, locking each invocation of add separately might
be the right choice. Conversely, if method add is used to add one line of a multiline
logging message, you will probably have to synchronize a single block around multiple calls to add, since a log containing interleaved lines from different messages would be
undesirable. You could even offer both functions as a choice to the user:
Scala
def addStrings(n: Int, str: String): Unit =
  n times {
    shared.synchronized(shared.add(str))
  }

def addAllStrings(n: Int, str: String): Unit =
  shared.synchronized(addStrings(n, str))
Depending on their needs, users can now call addStrings for fine-grained parallelism
or addAllStrings for coarse-grained parallelism. Note that calls to addStrings that
originate from within addAllStrings already own the shared lock by the time they
reach shared.synchronized(shared.add(str)). This is fine. The JVM intrinsic locks
are reentrant, and a thread that owns a lock can enter other blocks of code synchronized
on the same lock. A thread will release a lock only when it exits the outer block—in
this example, when it finishes its call to addAllStrings.
You need to use locks judiciously to introduce enough atomicity for a program to
be correct. If the code chunks made atomic by locking are small, they can be inter-
leaved nondeterministically in many different ways, and all the possibilities need to be
considered when asserting correctness. If the chunks are too small—three atomic steps
to increment an integer, for instance—correctness might be lost. On the flip side, large
atomic steps reduce parallelism because these steps must be executed sequentially, one
thread at a time. This can have a negative impact on performance.
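18.6 Choosing Locking Targets
When relying on intrinsic locks, you must decide which object to lock. Consider the outline of a small application in which two threads add messages to a shared list: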
Scala
object SharedListApplication:
  @main def run(msg: String, count: Int) =
    val shared = ArrayList[String]()
    val msg1, msg2 = ... // two strings derived from msg (elided)

    class Adder(str: String) extends Runnable:
      def run(): Unit = count times lock.synchronized(shared.add(str)) // which object should lock be?

    val t1 = Thread(Adder(msg1), "T1")
    val t2 = Thread(Adder(msg2), "T2")
    t1.start(); t2.start()
    t1.join(); t2.join()
    println(shared.size)
end SharedListApplication
The object on which method synchronized is invoked is not specified. Several potential
objects could be used. The lock could be the list shared, as before. But you could also
use SharedListApplication, the object that defines the entire application, or msg1, the
string used by the first thread, or even scala.collection.immutable.Nil, the object
that represents an empty list. All these choices would result in a correct program.
Obviously though, some choices make more sense than others. In this example, us-
ing shared or SharedListApplication is reasonable. Using msg1 would be confusing.
Using Nil would be downright bizarre. The question of which objects to use as locks is
revisited in more detail in Chapter 19. Note that if you choose msg1, the program works
because both threads lock the same object, not each thread its own string. It is essential
that all threads compete for the same lock to achieve the desired mutual exclusion. A
possible mistake here would be to use this as the target of synchronization—that is, to
simply write synchronized(shared.add(str)). In that case, each thread would lock
its own Adder instance when writing into the shared list. Both instances could be locked
at the same time, allowing the threads to enter the add method of the list together. In
other words, setting code inside a synchronized block is not a guarantee that this code
cannot be executed by multiple threads at the same time. It all depends on which object
you use as the locking target.
In some cases, there is no obvious object to lock on. In these situations, you can
create an additional object for this purpose. The program in Listing 18.4 is incorrect
because of a non-atomic if-then-else. It can be fixed with an intrinsic lock:
Scala
private val lock = Object()
private var userCount = 0

def getRank(): Option[Int] = lock.synchronized {
  if userCount < maxUsers then // capacity check, as in Lis. 18.4; the exact condition is assumed
    userCount += 1
    Some(userCount)
  else None
}
Listing 18.7: Atomic check-then-act using locking; fixed from Lis. 18.4.
Because locking is necessary to make the if-then-else atomic, there is no issue with incrementing a regular integer, and AtomicInteger is not needed.
Depending on the circumstances, you can choose to protect data with a single lock,
or decide to split data in chunks guarded by separate locks. Assume, for instance, that
the registered users of Listing 18.7 are players, and a function play can be called by a
registered user to act in the game:
Scala
def getRank(): Option[Int] = X.synchronized {
  // register user
}

def play(): Unit = Y.synchronized {
  // act in the game
}
Depending on the internal design of the class, new user registration may or may not
interfere with playing. If X and Y are references to the same object—X eq Y is true—the
getRank and play functions are mutually exclusive: A thread inside one of the functions
prevents other threads from entering the other function. If, however, X and Y refer to
two distinct objects, a thread can be executing function getRank with the X lock owned,
while another thread executes function play with the Y lock owned. Whether X and Y
are the same object or two different objects, no two threads can execute getRank at
the same time, and no two threads can execute play at the same time. Our next case
study, in Chapter 20, explores different choices of the object used as a single lock and
also discusses a two-lock variant of the same structure.
18.7 Summary
• JVM objects are allocated in a shared heap that all threads can access. If multiple
threads have a reference to the same object, they can use it at the same time.
However, the nondeterministic scheduling of threads makes it possible for them to
be interrupted in the middle of an operation, possibly leaving data in a state that
is not suitable for other threads to use. Mutable structures that were not designed
with thread-safety in mind cannot be shared among threads without additional
coordination, or they will fail in unpredictable ways. In particular, this is true of
most collections from java.util.
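Chapter 19
Thread-Safe Objects
Immutable objects can be shared among threads without synchronization. As an opening example, consider three threads that each execute a list of tasks, with the duty of each thread described in a shared immutable map:
Scala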
val duties: Map[Int, List[Runnable]] = Map(1 -> tasksA, 2 -> tasksB, 3 -> tasksB)
Tasks are stored in immutable lists, and the lists in an immutable map. As they run,
all three threads access the map concurrently, and threads T2 and T3 also safely share
the list tasksB. This is done without any locking or other synchronization.
You can greatly simplify the design of your multithreaded programs by avoiding
shared mutable data, first by minimizing sharing—when possible, threads are better off
working on their own data—and second by making shared data immutable as often as
possible. In Part I of the book, we considered some of the advantages of immutability
in the context of functional programming. Immutable objects are also tremendously
beneficial to concurrent programming.
On the JVM, there is technically a difference between immutable objects and objects
that are never mutated, due to a special treatment of final fields. Non-final fields (var
fields in Scala) may require some form of initial synchronization before they can be
freely read by any thread. This synchronization tends to happen naturally in clean,
simple designs.1 Still, it is good practice, when dealing with threads, to prefer val over
var—or to add the final modifier in Java—for fields that are not intended for mutation.
To decide if a non-final field of a never-mutated object can safely be read by a thread
requires an understanding of the Java Memory Model (JMM), a somewhat advanced
topic (see Section 22.5).
1 For instance, the declarations of tasksA, tasksB, and duties could use var instead of val, as long as the map is never mutated and the variables never reassigned.
2 Indeed, some static analyzers do exactly this: When a value is almost always accessed with the
same object locked, any access to the value without this lock is flagged as suspicious code.
19.2 Encapsulating Synchronization Policies
Rather than requiring all code that accesses an object to apply a synchronization policy, you make the object itself encapsulate its own policy. Objects capable
of enforcing their policy internally are thread-safe: The required thread coordination
is implemented inside the object’s methods, and threads can freely call these methods
without further synchronization.
As an illustration, let’s rewrite the list-sharing example with an encapsulated locking
strategy:
Scala
class SafeStringList:
  private val contents = ArrayList[String]()

  def add(str: String): Unit = synchronized(contents.add(str))
  def size: Int = synchronized(contents.size)
Listing 19.2: Safe list that encapsulates synchronization; contrast with Lis. 18.5.
The array-based list, which is not thread-safe, is encapsulated inside a SafeStringList
class. The inner list contents is private and accessed only via methods add and size.
These methods implement the necessary synchronization by always locking this before
they use the array list. You can now write function addStrings without any explicit
synchronization.
Inside class SafeStringList, both methods use locking to access the inner list—not
only add, which modifies the list, but also size, which just queries its size. This is
extremely important: When mutable data is protected by a lock, all accesses to the
data must go through the lock. This includes both writing and reading accesses.3
There are two main reasons for this requirement. First, locking when reading prevents
undesirable read/write interactions. If read accesses are not locked, you could possibly
observe intermediate, inconsistent states of a structure. Though unlikely in an array-
based list, method size could be implemented by using a loop that traverses the list,
and a list that is in the middle of a modification by a thread may not be in a state that
is safe for another thread to perform this iteration. The second reason is more technical.
It is a consequence of the Java Memory Model: Without locking when reading, there is
no guarantee that a thread will see all the modifications to a structure that were made
earlier by other threads, including data written with locks owned. The details are not
essential, and the rule is simple: When locking, all accesses—writing and reading—must
go through the same lock.
3 Listing 18.5 correctly queries the size of the list at the end without locking, because the query is performed by the main thread only after it has joined both threads; joining a terminated thread provides the necessary synchronization.
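Data encapsulation is also essential. Consider a method that hands out the internal list, sketched here under assumptions (the method shape follows the surrounding discussion):
Scala
// DON'T DO THIS! The internal list escapes:
def getAll: List[String] = synchronized(contents)
A thread can then write, say, safeList.getAll.add(str) to add a string to the underlying list.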
A thread that executes this code acquires the lock to obtain a reference to the inner list
contents. However, the lock is released at the end of method getAll, before the thread
calls add. In other words, the thread is now adding to the array-based list without the
lock.
Of course, reference escape is often harmful in object-oriented designs and needs to
be avoided even in single-threaded programs. However, multithreading tends to compli-
cate the issue. This next variant of method getAll shares the contents list in a way
that would be harmless in a single-threaded context but will still cause problems when
multiple threads are involved:
Scala
// DON'T DO THIS!
def getAll: List[String] = Collections.unmodifiableList(synchronized(contents))
The wrapper prevents its users from modifying the list, but reading through the wrapper still accesses the underlying ArrayList without any locking, which is unsafe while other threads modify the list. In short, some of the techniques you could use to maintain data encapsulation in a single-threaded context are inadequate in the presence of multiple threads. You can still write a thread-safe variant of method getAll, but at the cost of a full copy of the list:
Scala
def getAll: List[String] = ArrayList(synchronized(contents))
Such defensive copies can often be avoided by implementing mutable objects in terms
of immutable data structures. This was discussed in Section 3.8, and Section 19.5 will
revisit the topic in the context of multiple threads.
First, a public lock becomes part of a class's contract, and needs to be documented. The documentation of Java's synchronized collections, for instance, requires “that the user manually synchronize on the returned list” when traversing it, thus indicating that the wrapper itself is the lock that is used internally.
Second, public locks involve a risk of being misused by code that locks the list when
it should not. A thread that executes code of the form
Scala
shared.synchronized {
// lengthy computation, or I/O, or other blocking calls...
}
prevents other threads from accessing the shared list in any way, even just to query its
size, possibly for a long time.
Third and finally, public locking is viable only when using simple synchronization
policies—SafeStringList uses a single lock. It ceases to be workable with more complex
strategies, such as lock striping (e.g., java.util.concurrent.ConcurrentHashMap)
and lock-free algorithms (e.g., java.util.concurrent.ConcurrentLinkedQueue).
To circumvent these difficulties, locks can be kept private:
Scala
class SafeStringList:
  private val contents = ArrayList[String]()

  def add(str: String): Unit = contents.synchronized(contents.add(str))
  def size: Int = contents.synchronized(contents.size)
Listing 19.4: Safe list based on a private lock; contrast with Lis. 19.2.
This implementation of SafeStringList is just as thread-safe as before but uses a pri-
vate object, the inner list, as a lock. The drawback, of course, is that external functions
that require an atomic iteration or an atomic check-then-act cannot be written. A func-
tion like addStringIfCapacity from Listing 19.3 is no longer possible because it has
no access to contents for locking. As an alternative, you can enrich SafeStringList
to offer additional atomic operations, but users will be limited to what the class defines.
Immutable lists can safely be shared among threads, and, conversely, mutable objects can be implemented in terms of immutable data stored in reassignable fields (see Section 4.1 for an example). This dual observation yields the following thread-safe SafeStringList implementation:4
Scala
class SafeStringList:
  private var contents = List.empty[String]

  def add(str: String): Unit = synchronized(contents ::= str)
  def size: Int = synchronized(contents.size)
  def getAll: List[String] = synchronized(contents)
Listing 19.5: Safe list based on an immutable list; contrast with Lis. 19.2 and 19.4.
The List type refers here to Scala’s lists, and contents is now an immutable list.
Instead of a val, the list is stored into a var field so it can be reassigned. You add
elements by setting contents to a new list—contents ::= str is the same as contents
= str :: contents. An interesting aspect of this implementation is that getAll simply
returns a reference to the internal list, avoiding the copy that would be needed if you
used a mutable list. The internal list ends up being shared among threads but, because
the list is immutable, this sharing is harmless.5
4 I assume here that threads obtain references to objects from other threads through proper synchronization, as is typically the case. Otherwise, the subtleties of the Java Memory Model again come into play, and the contents field would need to be initialized while owning the lock:
private var contents: List[String] = uninitialized
synchronized {
  contents = List.empty
}
5 Because values are always added to the front of the list, they are returned in reverse order. If order is relevant, an immutable type other than a list, such as an immutable queue, can be used.
Of course, locking is still needed to read and write variable contents. This variant
uses a public lock, the wrapper itself (this). If a private lock is preferred, locking on
the underlying list, as in Listing 19.4, is not an option:
Scala
// DON'T DO THIS!
class SafeStringList:
  private var contents = List.empty[String]

  def add(str: String): Unit = contents.synchronized(contents ::= str)
This approach does not work. A thread that calls add needs to read the value of
contents first, before it can use it as a lock. This reading is done without locking
and is incorrect.
A peek at the bytecode for method add helps make the issue more tangible:
bytecode
0: aload 0
1: getfield #38 // Field contents:Lscala/collection/immutable/List;
4: dup
5: astore 2
6: monitorenter
7: aload 0
8: aload 0
9: getfield #38 // Field contents:Lscala/collection/immutable/List;
12: aload 1
13: invokevirtual #51 // Method scala/collection/immutable/List.$colon$colon: ...
16: putfield #38 // Field contents:Lscala/collection/immutable/List;
...
Field contents is read on line 1, without any locking. It is then locked on line 6—
monitorenter is the JVM locking instruction—and read again on line 9. Between lines 1
and 6, variable contents could be written by another thread that invokes method add,
in which case the list that is read here on line 9 is different from the list that is locked
on line 6. A scenario, in simplified bytecode, is shown in Figure 19.1. After three calls
to method add, the list contains only two values. Note how, at some point, the two
threads end up locking two different objects—an empty list and a list that contains
only "B"—and proceed inside their synchronized code in parallel.
contents=list[]
Thread T1                          Thread T2
// add("A")                        // add("B")
getfield #38 → list[]              getfield #38 → list[]
                                   monitorenter: lock list[]
                                   getfield #38 → list[]
                                   putfield #38 ← list[B]
                                   monitorexit
                                   // add("C")
                                   getfield #38 → list[B]
monitorenter: lock list[]          monitorenter: lock list[B]
getfield #38 → list[B]             getfield #38 → list[B]
putfield #38 ← list[A,B]
                                   putfield #38 ← list[C,B]
monitorexit                        monitorexit
contents=list[C,B]
You can use a val field as a lock to guard itself, as in Listings 18.5 and 19.4, but you
cannot do that with a reassignable var field. (The same applies to final and non-final
fields in Java.) Instead, you need to create an additional locking target, as in Listing 18.7:
Scala
class SafeStringList:
  private val lock = Object()
  private var contents = List.empty[String]

  def add(str: String): Unit = lock.synchronized(contents ::= str)
  def size: Int = lock.synchronized(contents.size)
  def getAll: List[String] = lock.synchronized(contents)
Listing 19.6: Safe list with an immutable list/private lock; contrast with Lis. 19.5.
19.6 Thread-Safety
Both this chapter and Chapter 18 include examples of flawed programs that behave
incorrectly in the presence of multiple threads. We also saw how correctness can be
brought back through locking, resulting in thread-safe objects that can handle concur-
rent accesses. So far in this discussion, however, “safe” (or “unsafe”) and “correct” (or
“incorrect”) have only been loosely defined.
Some programs are clearly incorrect, like the program in Listing 18.2 that blows up with an exception when trying to add a string to a list. But what does it mean exactly
for a program to be correct in the presence of concurrency? The program in List-
ing 18.5 is correct because a list of strings has length 10 in the end. But why is the
value 10 expected? You could answer that running 5 times shared.add("T1") and 5 times shared.add("T2") in sequence produces a list of size 10, and thus that this should also be the case when the two computations are executed in parallel. But the
sequential execution also produces all five "T1" values together and all five "T2" values
together, and this is not expected of the parallel version, which can produce an arbitrary
interleaving of "T1" and "T2" values. So, a parallel program might produce a list that
is impossible in its sequential counterpart, yet not be considered incorrect because of
that. A more detailed explanation is needed.
When you add a string to a list, there is an expectation that method add is atomic
in its effects. The list first exists without the string, then with the entire string—there
is no state of the list where a string is “half added.” The very definition of a list—the
semantics of what it means to be a list—is expressed in terms of such atomic operations.
The “thread-safe” expectation is that these atomic operations are preserved—in other
words, that a list continues to be a list when accessed by multiple threads.
Concurrency can be formally defined in terms of interleaving of atomic operations.
If A and B are atomic operations on a shared structure—like add or get on a list—and
a thread executes A in parallel with another thread that executes B, denoted A ∥ B, the outcome, once both operations have completed, should be the same as that of one of the two sequential executions A; B or B; A.6 The individual steps of A and B may interleave arbitrarily in time, but only these two overall outcomes are acceptable.
The same principle can be generalized to more than two threads and to compound
operations. If A, B, C, and D are atomic operations, and three threads execute A,
(B; C), and D concurrently—A ∥ (B; C) ∥ D—the outcome, after all operations are
finished, should be the same as A; B; C; D or A; B; D; C or A; D; B; C or B; A; C; D
or B; A; D; C or B; C; A; D or B; C; D; A or B; D; A; C or B; D; C; A or D; A; B; C or
D; B; A; C or D; B; C; A. As long as a program produces one of these outcomes, it
is behaving correctly. Note how B must always come before C because B and C are
executed by the same thread, sequentially.
As an illustration, consider a list that contains only X and Y, and three threads that
call add(Z), remove(Z), and clear() in parallel.7 If the list is thread-safe, then after
all three operations are completed, it can either be empty or contain only Z. These are the only possible outcomes of the three calls in any order, and thus the only acceptable outcomes of a parallel execution.
6 Note that it could happen in an execution of A ∥ B that half (or more) of action A has already been executed by the time B starts, but B; A is still a valid outcome. Indeed, when non-locking strategies are involved, such as those explored in Sections 27.1 and 27.2, B; A can be the more likely outcome of a run of A ∥ B in which A starts first and B interferes.
7 I assume here that method clear is atomic, but no other outcome is acceptable, even under the weaker assumption that clear is implemented as a sequence of atomic single-element removals.
Going back to the program in Listing 18.5, if a thread calls add("T1") on a list
while another thread calls add("T2"), the list must behave as if both calls happened
sequentially, in some order. In particular, after both calls are completed, both strings
were added to the list and nothing else was added or removed. This is the reason you
can rightfully expect a size of 10 after two threads perform five add operations each:
The list should be the same as if all ten operations had happened sequentially, in some
order.
This discussion of thread-safety is based on atomic operations. What is and is not
atomic depends on the structure under consideration. For instance, many thread-safe
structures from java.util.concurrent include an addAll method. The method is
guaranteed to be safe—no value will be lost—but not necessarily atomic. Therefore, a
concurrent execution of add(T) and addAll(X,Y,Z) on an empty structure can poten-
tially produce [X,Y,T,Z], which is not equivalent to both methods invoked in sequence.
19.7 Summary
• Many—if not most—of the difficulties of concurrent programming arise from
threads sharing mutable data. In your concurrent applications, you should strive to
minimize sharing; when sharing is necessary, you should prefer sharing of immutable objects.
• Immutable objects are inherently thread-safe. Any thread that obtains a reference
to an immutable object can freely read this object in parallel with other threads.
The design of concurrent applications can be greatly simplified by maximizing the
use of immutable objects.
• All accesses to shared mutable data must go through the proper synchronization
steps. This includes not only code that modifies the data, but also all reading
accesses. Without the necessary synchronization, reading data that is modified by
other threads can lead to unpredictable behavior.
• You must also maintain data encapsulation by making sure that references to
internal data do not escape. Escaping references could allow threads to access
data without going through the proper synchronization steps. Be aware that some
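Chapter 20
Case Study: Thread-Safe Queue
Earlier in the book, a first-in-first-out queue is implemented functionally in terms of two lists: one that accumulates incoming values, and one, in reverse order, from which outgoing values are taken.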
In this chapter, the goal is to write a version of this queue that is safe for multi-
threaded usage. The resulting queue will be non-blocking: Adding elements is always
possible (unbounded capacity), and elements taken from the queue are returned as
options so that taking from an empty queue can return None.1
1 The alternative is to block a thread on a full queue until elements are removed by other threads and
on an empty queue until elements are added by other threads. This requires the use of synchronizers,
which are discussed in Chapters 22 and 23. See possible blocking queue implementations in Listings 23.6
to 23.8.
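20.2 Single Public Lock Implementation
The simplest strategy is to guard both lists with a single lock: the queue object itself. As a sketch of such a queue (details assumed, consistent with the discussion below):
Scala
class ConcurrentQueue[A]:
  private var in, out = List.empty[A]

  def isEmpty: Boolean = synchronized(in.isEmpty && out.isEmpty)

  def put(value: A): Unit = synchronized(in ::= value)

  def take(): Option[A] = synchronized {
    if isEmpty then None
    else
      if out.isEmpty then
        out = in.reverse
        in = List.empty
      val first = out.head
      out = out.tail
      Some(first)
  }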
Listing 20.1: Thread-safe queue (single public lock); see also Lis. 20.5 and 20.7.
A queue is empty when both lists are empty. You add a value to the queue by prepending
it to the in list—in ::= value is the same as in = value :: in. You remove a value
from the queue by reducing list out to its tail and returning its head. An empty list
out needs to be handled as a special case: Reverse list in, make it the new out, and
reset in to an empty list. The resulting code is basically a standard two-list queue
implementation, except that the entire body of each method is executed with a lock.
Note that isEmpty is called from within take with the lock already owned. There is
no harm in that, as intrinsic locks are reentrant. However, the following variant, which
calls isEmpty before acquiring the lock, is incorrect:
Scala
// DON'T DO THIS!
def take(): Option[A] =
if isEmpty then None
else
synchronized {
if out.isEmpty then
out = in.reverse
in = List.empty
val first = out.head
out = out.tail
Some(first)
}
This method always reads and writes both variables in and out with the lock owned,
but this is not sufficient. The mistake here is that take now relies on a non-atomic
check-then-act: The lock is acquired when entering isEmpty, released at the end of this
method, and acquired again in the else branch. Between the moment the queue is
unlocked and when it is locked again, it could be acquired by another thread, resulting
in the scenario depicted in Figure 20.2. You should always be wary of a non-atomic
check-then-act and take into account anything that might happen between releasing a
lock and reacquiring it.
in=list[], out=list[X]
Thread T1                  Thread T2
// take()                  // take()
lock
isEmpty → false
unlock
                           lock
                           isEmpty → false
                           unlock
lock
out.isEmpty → false
out ← list[]
return Some(X)
unlock
                           lock
                           out.isEmpty → true
                           out ← list[]
                           in ← list[]
                           out.head → NoSuchElementException!
Listing 20.1 uses the queue itself as the lock, enabling client-side locking. For instance,
you can write a function to add a batch of values to a queue without intervening
elements:
Scala
def putAll[A](queue: ConcurrentQueue[A], values: A*): Unit =
queue.synchronized {
for value <- values do queue.put(value)
}
This relies again on the fact that locks are reentrant, and the outer locking guarantees
that the queue is not modified between successive calls to put. Batch removal can be
implemented as a drain function that dumps the entire contents of a queue into a list:
Scala
def drain[A](queue: ConcurrentQueue[A]): List[A] =
val buffer = List.newBuilder[A]
queue.synchronized {
while !queue.isEmpty do buffer += queue.take().get
}
buffer.result()
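Another way to drain does not rely on isEmpty at all. A sketch of the idea, assuming details not shown here:
Scala
def drain[A](queue: ConcurrentQueue[A]): List[A] =
  val buffer = List.newBuilder[A]
  var next = queue.take()
  while next.nonEmpty do
    buffer += next.get
    next = queue.take()
  buffer.result()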
Listing 20.4: Batch extraction without synchronization; contrast with Lis. 20.3.
The strategy here is different: Instead of testing the queue for emptiness, you blindly
call take and test the resulting option. This implementation repeatedly calls take on
the queue until it returns None.
Observe that the two implementations of drain are not equivalent. If a thread
attempts to add (or remove) a value while another thread is in the middle of draining a
queue, this value is excluded from (or included in) the list according to the implemen-
tation in Listing 20.3, but not in the Listing 20.4 variant. In practice, this makes little
difference: If a call to drain runs concurrently with calls to put or take, whether the
drained list contains the added/removed values is unpredictable anyway.2
2 Another difference is that the first implementation locks the queue once for an entire drain, while
the second variant locks and unlocks the queue each time it removes a value. The impact on performance
is hard to predict. A standard compiler optimization—lock coarsening—could skip any number of
unlock/relock pairs when running the second variant.
20.3 Single Private Lock Implementation
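The same queue can synchronize on a private lock object instead. A sketch (details assumed, consistent with the description below):
Scala
class ConcurrentQueue[A]:
  private val lock = Object()
  private var in, out = List.empty[A]

  private def isEmpty: Boolean = lock.synchronized(in.isEmpty && out.isEmpty)

  def put(value: A): Unit = lock.synchronized(in ::= value)

  def take(): Option[A] = lock.synchronized {
    if isEmpty then None
    else
      if out.isEmpty then
        out = in.reverse
        in = List.empty
      val first = out.head
      out = out.tail
      Some(first)
  }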
Listing 20.5: Thread-safe queue (single private lock); see also Lis. 20.1 and 20.7.
Not much of the code is changed. Instead of this, a private object—created for this
purpose—is used as the lock.
Note that, as a design decision, method isEmpty has become private. Without client-
side locking, keeping isEmpty public is not essential because, in general, the value
returned by the method is useless in the presence of other threads calling put and
take on the queue. A public isEmpty method is an invitation to write incorrect code:
Scala
// DON'T DO THIS!
if queue.isEmpty
then ... // no guarantee that the queue is empty here
else ... // no guarantee that the queue is non-empty here
There are scenarios in which a public method isEmpty would be usable—for instance,
the value returned by isEmpty cannot switch from true to false if all the tasks liable
to add to the queue are terminated—but you can usually deal with these situations by
relying on take (as in Listing 20.4).
As before, isEmpty acquires the lock to access fields in and out. Given that the
method is private, this is not strictly necessary: isEmpty is called only from within method take, after the lock has been acquired, and therefore a thread that enters
isEmpty already has the necessary lock. You could write the method as follows:
Scala
private def isEmpty: Boolean =
assert(Thread.holdsLock(lock))
out.isEmpty && in.isEmpty
An assertion is used to ensure that the necessary lock is indeed held. It can help catch
a mistake, by which other paths to isEmpty have been overlooked. In practice, there
is little value in this alternative implementation. Going through a lock that is already
acquired is extremely fast.
Due to private locking, functions putAll and drain from Listings 20.2 and 20.3 can
no longer be written. If needed, you can implement them within class ConcurrentQueue:
Scala
// inside ConcurrentQueue
def putAll(values: A*): Unit = lock.synchronized {
for value <- values do in ::= value
}
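A drain method can be added in the same way (a sketch, with details assumed):
Scala
// inside ConcurrentQueue
def drain(): List[A] = lock.synchronized {
  val all = out ++ in.reverseIterator
  in = List.empty
  out = List.empty
  all
}
A more ambitious design guards each list with its own lock, so that put and take can often proceed in parallel. A sketch of this two-lock queue, consistent with the description that follows (details assumed):
Scala
class ConcurrentQueue[A]:
  private val inLock, outLock = Object()
  private var in, out = List.empty[A]

  def put(value: A): Unit = inLock.synchronized(in ::= value)

  def take(): Option[A] = outLock.synchronized {
    if out.isEmpty then
      val incoming = inLock.synchronized {
        val list = in
        in = List.empty
        list
      }
      out = incoming.reverse // the reversal happens after inLock is released
    if out.isEmpty then None
    else
      val first = out.head
      out = out.tail
      Some(first)
  }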
Listing 20.7: Thread-safe queue (two private locks); see also Lis. 20.1 and 20.5.
Two private objects, inLock and outLock, are created to serve as locks.3 Inside method
put, which is based solely on list in, you acquire inLock, but leave outLock free for
threads that call method take. Method take is a little more intricate. You start by using
outLock to access list out. If the list is non-empty, this is enough: outLock is the only
lock you need, and the call to take proceeds without any interference with concurrent
calls to put. If, however, list out is empty, you have to acquire the second lock, inLock,
to retrieve and reset list in. You then reverse it and use it as the new out list. The
code is written to hold inLock to read and update list in; the list is reversed after
the lock is released, so that calls to put can take place concurrently with this reversal.
You also want to avoid any initial isEmpty test, because it would require locking both
lists. Between occasional reversals, multiple calls to put and take can proceed fully in
parallel, without lock contention.4
3 In Scala, val x, y = expr evaluates expr twice, once to initialize x and once to initialize y.
In this implementation of ConcurrentQueue, the locks are private and not exported.
Client-side locking is not available, but you can implement addAll and drain as part
of the class if needed:
Scala
// inside ConcurrentQueue
def putAll(values: A*): Unit = inLock.synchronized {
for value <- values do in ::= value
}
def drain(): List[A] =
  val (in, out) = inLock.synchronized {
    outLock.synchronized {
      val i = this.in
      val o = this.out
      this.in = List.empty
      this.out = List.empty
      (i, o)
    }
  }
  out ++ in.reverseIterator
The only difference from method take is that inLock is acquired before outLock. The change may seem
innocent enough—after all, both locks are needed anyway—but could result in a major
failure if two threads call take and drain concurrently.
Consider the scenario from Figure 20.3. A thread T1 invokes method take at the same
time a thread T2 enters method drain. T1 acquires lock outLock and T2 acquires lock
inLock. If list out is empty, thread T1 now needs inLock—held by T2—and thread T2
needs outLock—held by T1. However, T1 will not release outLock until it locks inLock,
and T2 will not release inLock until it locks outLock. Both threads are stuck waiting
for a lock that will never be available. This situation is known as a deadlock—a set of
threads waiting for each other in a cyclical way. Multiple threads acquiring the same
set of locks in different orders is a common cause of deadlock. Deadlocks are discussed
in Section 22.3.
in=list[X], out=list[]
Thread T1                              Thread T2
// take()                              // drain()
lock outLock                           lock inLock
out.isEmpty → true
// blocks trying to lock inLock        // blocks trying to lock outLock
20.5 Summary
First-in-first-out queues can be implemented in terms of two lists and the implementa-
tion made thread-safe by locking. The simplest approach is to use a single lock to guard
both lists. The lock can be public—enabling client-side locking—or private. A possible
mistake is to release and then reacquire the lock in the middle of a compound operation,
like check-then-act, thus breaking atomicity. Instead, you want to hold the lock for the
duration of the entire operation.
However, unnecessarily holding locks that are not needed can prevent other threads
from performing operations that could potentially be done in parallel. In general, code
should strive to release locks as soon as they are not needed, especially before starting
lengthy operations.
Querying methods like isEmpty are of limited use on concurrent data structures
without client-side locking: By the time the value they return is used, the lock has been
released and the state of the structure may have changed. In particular, an isEmpty
test cannot guarantee, in general, that the structure is or is not empty.
Since a queue uses two lists internally, an alternative to using a single lock is to
have each list guarded by its own lock. A two-lock variant of the queue allows for more
parallelism between the put and take operations, albeit at the cost of increased code
complexity. In particular, operations that require acquiring both locks at the same time
must be implemented with care to avoid possible deadlocks.
Chapter 21
Thread Pools
For performance reasons, threads are often pooled. Pooling helps reduce thread creation
by reusing threads and makes it easier to place a bound on the number of active threads.
A thread pool consists of generic workers that collectively execute the tasks submitted to
the pool. Different types of thread pools exist, characterized by various properties: fixed
or flexible number of workers, bounded or unbounded queue of tasks, scheduling and
delaying facilities, and so on. Applications can use thread pools explicitly by submitting
tasks directly to workers. Additionally, some languages define structures that can process
their contents in parallel by submitting internal tasks to a thread pool. In functional and
hybrid languages, this typically takes the form of parallel implementations of higher-
order functions.
Scala
val exec = Executors.newFixedThreadPool(4)

println("START")
exec.execute(LetterPrinter('A'))
exec.execute(LetterPrinter('B'))
exec.execute(LetterPrinter('C'))
println("END")
Listing 21.1: Example concurrency from a thread pool; contrast with Lis. 17.1.
A thread pool is created with four worker threads. Letter-printing tasks are submitted
to it using method execute. The output is similar to that of Listing 17.1, except for
the names of the threads:
main at XX:XX:38.670: START
pool-1-thread-1 at XX:XX:38.694: A
pool-1-thread-2 at XX:XX:38.694: B
main at XX:XX:38.694: END
pool-1-thread-3 at XX:XX:38.694: C
A thread that calls execute with a task typically does not run the task but only
submits it for execution. Method execute returns immediately, and the task is executed
later, asynchronously. The task is actually run when one of the available workers grabs
it and starts its execution. If all the workers are busy, the task might be queued or
rejected, depending on the thread pool configuration.
Method execute does not return anything useful—it is a void method in Java.
Using it leads to a “fire-and-forget” programming style: Submit a task and let it run.
You can also use thread pools in more controlled ways by creating handles on running
tasks, known as futures. Futures are discussed in Chapter 25.
In the preceding run, notice that the main thread terminates before all the tasks are
finished. Earlier examples relied on the thread method join to force the main thread
to wait for the other threads to be done before terminating. When using thread pools,
you can achieve a similar behavior by waiting for all the worker threads to be done:
Scala
...
exec.shutdown()
exec.awaitTermination(5, MINUTES)
println("END")
Method shutdown is used to indicate to the pool that no new task will be submitted (it
disables method execute). Once a pool is shut down, you can wait for its termination.
This will happen after all remaining tasks have been executed1 and the worker threads
are terminated. Method awaitTermination waits for all the worker threads to finish,
or for a timeout to have elapsed, whichever happens first. This variant of the program
guarantees that the END print statement comes last, as in Listing 17.4.
In many scenarios, you need to wait for the completion of one or more specific
tasks instead of the termination of an entire thread pool. This requires the use of
synchronizers, which are discussed in Chapter 22.
You can use thread pools as a convenient mechanism to create threads, have them
run a set of tasks concurrently, and terminate them. For instance, class SafeStringList
from Listing 19.5 can be tested for thread-safety by using tasks run by a thread pool:
Scala
val exec = Executors.newCachedThreadPool()
val shared = SafeStringList()

// N numbered tasks; each adds its number five times to the shared list
for i <- 1 to N do exec.execute(() => 5 times shared.add(i.toString))

exec.shutdown()
exec.awaitTermination(5, MINUTES)
assert(shared.size == 5 * N)
assert((1 to N).forall(i => shared.getAll.count(_ == i.toString) == 5))
Listing 21.2: Using a thread pool to create threads and wait for their termination.
This program creates N numbered tasks. Each task writes its number five times into a
shared list. All the tasks are executed by a thread pool to introduce concurrency. After
all the tasks are completed, you can check the length of the list (it should be 5 * N) as
well as its contents (each task number should appear in the list exactly five times).
Method newCachedThreadPool creates a pool with no upper bound on the number
of threads: As tasks come in, the pool creates more threads if none is available. Given
that one reason for using thread pools is to keep a bound on the number of threads
created, such an unbounded pool is rarely well suited to production code.
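21.2 Illustration: Parallel Server
As an illustration, consider a server that handles network connections, starting with a sequential implementation: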
Scala
def handleConnection(socket: Socket): Unit = ...

val server = ServerSocket(port)
while true do
  val socket = server.accept()
  handleConnection(socket)
Listing 21.3: A server that processes requests sequentially; contrast with Lis. 21.4.
This server starts to listen on a given port number and processes all incoming requests
sequentially. There is only one thread involved. It is blocked on method accept until a
request comes in. When a client connects to the server, accept returns a socket, which
can be used as a bidirectional channel between the client and the server. While the
thread is inside method handleConnection, handling a request by reading from and
writing into the socket, no further connections are accepted.
You can easily make this server multithreaded to handle multiple clients in parallel:
Scala
val server = ServerSocket(port)
val exec = Executors.newFixedThreadPool(16)
while true do
val socket = server.accept()
exec.execute(() => handleConnection(socket))
Listing 21.4: A server that processes requests concurrently; contrast with Lis. 21.3.
In this variant, when a request comes in, the thread that was blocked on method accept
does not process it. Instead, it creates a task—as a lambda expression—and hands it
over to a thread pool. The call to method execute is instantaneous, and the thread
immediately goes back to method accept to listen for more incoming connections,
possibly starting new handling tasks in the thread pool, which can run in parallel.
Listening and accepting connections is all the thread does, which is why it is commonly
called the listening thread. All the actual work of processing requests from clients is
done in the thread pool.
An easy mistake you need to avoid is the temptation to “inline” the socket variable:
Scala
// DON'T DO THIS!
while true do
exec.execute(() => handleConnection(server.accept()))
Such an inlining would be harmless in the sequential server of Listing 21.3, but it is
unacceptable in the parallel server of Listing 21.4. The behavior would fundamentally
change. Because method execute is asynchronous—it does not stop to execute a task—
the loop now creates an unbounded number of tasks, all listening for connections. All the
calls to accept take place in the thread pool. There is nothing blocking in the body of
the loop, which will quickly exhaust resources and crash the server.
On multicore or multiprocessor hardware, the use of a thread pool improves server
throughput: By handling connections in parallel, more work is being done per unit of
time. Using a thread pool can also improve latency, even on single-processor systems.
A drawback of the sequential server is that a single lengthy request can delay multi-
ple smaller requests. By contrast, the thread pool makes it possible to process large
and small requests together on separate threads, even on a single processor. Of course,
these benefits don’t come for free. In particular, you now need handleConnection to
be implemented in a thread-safe manner, as it can end up being executed concurrently
by multiple threads from the pool.
In addition to offering better performance, the parallel variant of the server is more
robust than the sequential one. In a sequential implementation, requests that come
in while the thread is busy processing a connection are stored in one of the internal
queues managed by the operating system and/or the networking library. These queues
are typically small—up to a dozen requests stored—and further attempts to connect
to the server will be rejected with a “connection refused” error. By contrast, if all the
worker threads in Listing 21.4 are busy processing connections, new requests are placed
in the queue of the thread pool, which is stored in heap memory and typically quite
large.2
A parallel server based on a thread pool is also more robust than the following
variant:
Scala
// DON'T DO THIS!
while true do
val socket = server.accept()
Thread(() => handleConnection(socket)).start()
This variant creates a brand-new thread for every incoming connection, with no bound on the number of threads created. Under heavy load, it is liable to exhaust system resources, and it forgoes the thread reuse that makes pools efficient.
2 In principle, the tasks queued by the pool could accumulate until they fill up the entire memory. However, more tasks can normally be stored in memory than sockets can be created. When overloaded, the server would likely fail at the level of the listening thread calling accept.
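21.3 Different Types of Thread Pools
Standard JVM thread pools are instances of java.util.concurrent.ThreadPoolExecutor, which can be configured in many ways. As a sketch of the configuration described below (assumed code, not the original listing):
Scala
import java.util.concurrent.*

val exec = new ThreadPoolExecutor(
  4,                                     // core pool size: maintain at least 4 workers
  16,                                    // maximum pool size: up to 12 extra threads
  3, TimeUnit.MINUTES,                   // idle extra threads terminate after 3 minutes
  ArrayBlockingQueue[Runnable](128),     // bounded, array-based task queue
  ThreadPoolExecutor.CallerRunsPolicy()  // when saturated, the submitter runs the task
) {
  // log information every time a task starts to run
  override def beforeExecute(thread: Thread, task: Runnable): Unit =
    println(s"starting $task on $thread")
}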
The pool maintains a minimum of four worker threads, ready to run tasks. If all four
threads are busy, tasks are queued in an array-based queue with a capacity of 128. If
this queue gets full, up to 12 additional threads are created, for a maximum pool size
of 16. Later, these extra threads are terminated after they have been idle for 3 minutes.
If all 16 threads are busy, the queue is full, and a new task is submitted, the thread
that submits the tasks runs it. (CallerRunsPolicy is a convenient way to bring in an
extra worker and reduce task creation rate at the same time; in the server example,
this would make the listening thread run connection-handling code for a while.) Finally,
information is logged every time a task starts to run.
For convenience, a default thread pool is often made available to task-creating code.
You can access it as scala.concurrent.ExecutionContext.global in Scala and as
java.util.concurrent.ForkJoinPool.commonPool in Java. Its size is typically based
on (but not necessarily equal to) the number of available cores in the runtime. Because
this thread pool is not shut down, its threads are started in daemon mode so as not to prevent JVM termination (see Chapter 17, footnote 3).4
3 More accurately, the pool uses a zero-capacity queue, which is always empty.
4 This approach tends to make the default pool unsuitable for small illustrative examples due to early termination of the JVM. For this reason, I don’t use it much in this book and rely instead on customized pools of non-daemon threads.
In addition to standard thread pools, class ScheduledThreadPoolExecutor extends
ThreadPoolExecutor with mechanisms for scheduled tasks. It defines several methods
to delay and/or repeat task execution so it can be used as a timer. For instance, you
can modify the letter printing example to use a scheduled thread pool:
Scala
println("START")
val exec = Executors.newScheduledThreadPool(2)

// delays and periods inferred from the output discussed below
exec.schedule(LetterPrinter('A'): Runnable, 5, SECONDS)
exec.scheduleAtFixedRate(LetterPrinter('B'), 3, 10, SECONDS)
exec.scheduleWithFixedDelay(LetterPrinter('C'), 3, 10, SECONDS)
println("END")
5 A type ascription :Runnable is necessary because of another overloaded schedule method.
The execution starts at 09:23:41. Three seconds later, at 09:23:44, letters B and C
are printed. Two seconds later—5 seconds from the beginning—letter A is printed.
At 09:23:54, 10 seconds after they were first displayed, letters B and C are printed again,
and so on, repeatedly.
The slight time difference between the next display of B and C illustrates the dif-
ference between scheduleAtFixedRate and scheduleWithFixedDelay. A display of
letter B is initiated every 10 seconds (fixed rate), while a display of letter C is initiated
10 seconds after the previous occurrence of the same event (fixed delay). Since displaying
a letter takes time, letter C is printed a few milliseconds after letter B. This difference
accumulates over time. Letter B is always displayed at time 09:23:44 plus a multiple of
10 seconds; letter C is always printed 10 seconds after the previous display, with no tie to
the starting time. The output at a later time shows that the time difference between B
and C has increased:
...
pool-1-thread-2 at 10:15:14.524: B
pool-1-thread-1 at 10:15:15.526: C
pool-1-thread-2 at 10:15:24.521: B
pool-1-thread-1 at 10:15:25.531: C
...
Fifty-two minutes after the program started, letter C is a full second behind letter B.
You might also observe the difference between fixed-rate scheduling and fixed-delay
scheduling when a task takes an unusual amount of time to complete. If, for instance,
printing a single B was delayed for some reason and took a full minute to complete,
multiple displays of B would take place in rapid succession afterward to maintain the
desired rate. By contrast, if a printing of C is delayed, missed runs are skipped and
the next display takes place 10 seconds later.
21.4 Parallel Collections
Consider an application that counts the distinct words in each of a list of URL sources. In outline, with the word counting itself elided:
Scala
def distinctWordsCount(url: String): Int =
  println(s"start $url")
  val count = ... // fetch the source and build the set of its words
    .size
  println(s"end $url")
  count

val counts = urls.map(distinctWordsCount)
println(counts.max)
println("END")
All the sources are processed sequentially by the main thread. The entire computation
takes about 45 seconds. However, counting words in a URL is completely independent
6 This application is I/O-bound, and computers are fast. In the sample outputs of this section, word
processing has been artificially slowed down to observe more meaningful and predictable timings.
316 Chapter 21 Thread Pools
from other URLs, which suggests that all URLs could be processed in parallel. Indeed,
achieving this parallelism can be as easy as adding .par to a list:
Scala
import scala.collection.parallel.CollectionConverters.ImmutableSeqIsParallelizable

val counts = urls.par.map(distinctWordsCount)
Sources are now being processed by the eight7 threads of the default thread pool. Eight
computations start immediately, in parallel. The remaining two start after two worker
threads—numbered 20 and 17—have become available—at times 19.754 and 20.370,
respectively. Overall, the computation takes about 9 seconds.
Not apparent in the output is the fact that method map on ParSeq still computes the
same list: The numbers in the list are in the order of their URLs, not in the order in which
the parallel computations terminate. In other words, List(1, 2, 3).par.map(_ * 10)
is guaranteed to be the list [10, 20, 30], in this order.
7 This is the size of Scala’s default thread pool on an eight-core computer.
In Listing 21.6, sources are processed in parallel, but each source is still processed
sequentially. This could be changed by creating a parallel collection of lines and having
lines from the same source processed concurrently:
Scala
source
.getLines()
.to(ParSeq)
.flatMap(line => line.split("""\b"""))
...
By inserting the stage to(ParSeq), you create a parallel collection of lines, allowing later
computations triggered by flatMap, map, and filter to proceed on multiple threads.
In this word counting example, parallelization has been achieved at very little cost in
code complexity. In general, though, you need to be careful when using parallel collec-
tions. For instance, you cannot replace the call to counts.max with the following code:
Scala
// DON'T DO THIS!
var max = Int.MinValue
for count <- counts do if count > max then max = count
Some runs do produce value 29,154 in variable max, but others do not, in an unpre-
dictable manner. The issue here is that for-do in Scala is syntactic sugar for a call to
higher-order method foreach (see Section 10.9), and method foreach is not sequential
on a parallel collection. The if-then expression ends up being evaluated by multiple
threads from the pool at the same time, with the adverse consequences already dis-
cussed in Section 18.3 (non-atomic check-then-act). Instead, you should call method
max—correctly implemented on parallel sequences—directly on counts.
If you need to process a sequence one element at a time, for instance to compute
something more complicated than max, you can convert a parallel sequence into a sequen-
tial one using method seq, which has the opposite effect of method par. The following
computation of the maximum is correct, albeit sequential:
Scala
var max = Int.MinValue
for count <- counts.seq do if count > max then max = count
Another aspect of parallel collections you need to be aware of is that not all higher-
order methods are implemented in parallel. For instance, in the sequential word counting
application, the intermediate list of numbers can be avoided by using foldLeft (see
Section 10.4):
Scala
val max = urls.foldLeft(Int.MinValue)(_ max distinctWordsCount(_))
Method foldLeft is implemented sequentially, from left to right, even on parallel col-
lections.8 Avoiding the intermediate list of numbers while processing sources in parallel
calls for the slightly more complicated method aggregate:
Scala
val max = urls.par.aggregate(Int.MinValue)(_ max distinctWordsCount(_), _ max _)
Method aggregate takes two function arguments: the main computing function, as in
foldLeft, and a second function used to combine the intermediate results computed
in parallel.
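To see the two roles separately, here is a small, hypothetical example (not from the word-count application) that sums squares in parallel; the first function folds one element into a chunk's partial result, and the second combines the results of different chunks:

Scala
import scala.collection.parallel.CollectionConverters.*

val sumOfSquares = (1 to 1000).par.aggregate(0)(
  (acc, n) => acc + n * n, // fold one element into a chunk's partial sum
  _ + _                    // combine the partial sums of different chunks
)
// sumOfSquares == 333833500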
21.5 Summary
• Instead of creating a new thread whenever a task needs to be executed, pools of
generic worker threads are often set up first, and tasks are then submitted to the
pools for execution.
• Using thread pools has two major benefits: It makes it easier to keep the total
number of threads for an application under control, and it helps amortize the cost
of creating and terminating threads, which is typically non-negligible.
• Thread pools can be shut down, allowing their threads to terminate. Creating a
temporary thread pool to run tasks in parallel, shutting it down after all the tasks
have been submitted, and waiting for the pool to terminate can often be a simple
way to implement the fork-join pattern mentioned in Section 17.4, in which a
primary thread creates worker threads and then waits for them to complete their
tasks.
• Thread pools can be used in a fire-and-forget style, in which tasks are submitted
and let run to completion, with the submitting thread having no further interaction
with them. A server that processes independent or mostly independent requests
can be implemented as a single listening thread that submits request-handling
8 In contrast to foldLeft and foldRight, method fold is implemented in parallel but cannot be
used to count words from a list of URLs because of limitations in its signature.
tasks to a thread pool. Bursts of activity are then managed by temporarily storing
requests in the queue of the thread pool.
• Thread pools are often available in variants that allow task execution to be delayed
and/or repeated, as with a timer. Repeating tasks can be scheduled at a fixed rate,
or with a fixed delay between successive runs of the task.
• Thread pools are sometimes used implicitly by mechanisms that hide the paral-
lelism in their implementation. Scala, for instance, offers parallel collections that
execute some of their higher-order methods in parallel. The functional behavior is
the same as their sequential counterpart—map is still map, filter is still filter—
but parallel processing can produce a performance improvement. Java implements
a similar mechanism in its Stream class.
Chapter 22
Synchronization
Thread cooperation requires coordination, typically via synchronization. Waiting for
a thread or a thread pool to terminate is an example of such synchronization. More
generally, different types of synchronizers can be used to block one or more threads
until the state of an application allows them to make progress. When misused, synchro-
nization that unnecessarily blocks threads may cause a program to slow down, or even
to come to a halt in the case of a deadlock. Snapshots of the states of threads—whether
they are blocked, and on which synchronizer—can be used to debug these situations.
Synchronization is also leveraged by runtime systems to optimize the memory usage of
parallel tasks. These optimizations are specified in a memory model.
22.1 Illustration of the Need for Synchronization

Scala
...
exec.shutdown()
exec.awaitTermination(5, MINUTES)
assert(shared.size == 10)
There are numerous situations in actual applications where this approach will not be
practical. Perhaps the thread pool is shared with other components of the application,
so it cannot be shut down; or maybe you need to wait for completion of a subset of
tasks only, while leaving other activities running in the thread pool. What is needed is
a mechanism for threads to wait for completion of specific tasks, and more generally
to coordinate with other thread activities. Synchronization is such a mechanism, and
synchronizers are its building blocks.
Before applying a valid synchronization technique to the string-adding example,
let’s spend some time detailing two incorrect strategies that you should avoid in gen-
eral: sleeping and busy-waiting. Together—and they are often combined—they probably
constitute the most common mistake in designing concurrent programs.
Let’s start with sleeping. To guarantee that all strings have been added before using
the list, one of the first and simplest ideas that comes to mind is to just wait “long
enough”:
Scala
...
// DON'T DO THIS!
MILLISECONDS.sleep(10)
assert(shared.size == 10)
The obvious flaw in this approach is that the sleeping time is arbitrary. Why 10 milli-
seconds? Why not 5 or 30? You cannot choose a “long enough” sleeping time without
knowing beforehand how long the tasks will take. If it is too small, the tasks may not be
finished by the time you trigger the next computing stage—here, calling method size—
and the application will break. If it is too large, you end up waiting for no reason after
data is ready, which artificially slows down your application and makes it less responsive.
This is especially true of tasks with running times that follow a long-tailed distribu-
tion: Tasks are short most of the time, but can occasionally be very long. If you choose
a sleeping value large enough to accommodate the longest runs, much time is wasted
when runs are shorter. But if you reduce the sleeping time to improve performance, the
application will fail with the rare occurrences of unusually long tasks.
In short, the sleeping approach does not work because there is no right amount
of sleeping time in general. Still, it is enormously tempting,1 when you know that a
bug is caused by code that finishes a little too early, to want to solve it by inserting
a well-adjusted delay. Don’t do it.
1 As the joke goes, there are only two kinds of concurrent programmers: those who have used a
well-adjusted delay in their code and those who lie about it.
Now consider busy-waiting. Instead of sleeping, the waiting thread repeatedly checks a counter that each task increments as it finishes:

Scala
...
// DON'T DO THIS!
while terminated < 2 do ()
assert(shared.size == 10)
Based on our earlier discussion of shared accesses and locks, we already know that this
program is flawed: Mutable variable terminated is shared by multiple threads without
locking. In particular, both threads could finish their task at the same time and execute
terminated += 1 concurrently, which we know to be unsafe.
However, adding the necessary locking is not the solution:
Scala
val shared = SafeStringList()
var terminated = 0
val lock = Object()
...
// DON'T DO THIS!
while lock.synchronized(terminated) < 2 do ()
assert(shared.size == 10)
Variable terminated is now always accessed with a lock. As written, the program does
guarantee that the main thread waits for both tasks to finish before calling method
size on the shared list, but the implementation is still unacceptable. The real flaw
of the busy-waiting approach is that the main thread keeps running while the tasks
are being executed. This is wasteful of computing resources—CPU cycles—and cannot
scale to real-size applications, especially when waiting for long-running tasks. To make
things worse, the main thread is not only wasting CPU time, it is also interfering with
the running tasks by repeatedly acquiring the shared lock—about a hundred thousand
times in this small illustrative program—thus potentially delaying the completion of the
task it is waiting for!
Sleeping and busy-waiting are sometimes combined in a mixed approach:
Scala
// DON'T DO THIS!
while lock.synchronized(terminated) < 2 do MILLISECONDS.sleep(5)
While this alleviates some weaknesses of each individual approach—less CPU time is
wasted in the main thread, and smaller sleeping times can safely be used—it still won’t
make your code as efficient and responsive as a properly synchronized solution.
These approaches—sleeping, busy-waiting, or a combination thereof—are a major
source of bugs and poor performance in concurrent programs and should be avoided.
Instead, you want to rely on synchronizers, which can efficiently suspend threads (no
CPU wasted) until the exact moment a desired condition is established (no time wasted
sleeping).
22.2 Synchronizers
Fundamentally, synchronizers are stateful objects, capable of blocking threads. A syn-
chronizer has a mutable state and defines operations to modify it. Moreover, some of
these operations may block the threads that call them until the synchronizer is tran-
sitioned into an acceptable state by another thread. You can think of synchronizers as
smart traffic lights for thread coordination: They stop threads or let them go and use
the threads’ own actions to change the lights.
To help illustrate the concept, consider exclusive locks, which we discussed in Chap-
ters 18 and 19. Locks are synchronizers. A lock has a two-valued state—free or owned—
and defines operations to modify this state—lock, which changes the state from free
to owned, and unlock, which changes it from owned to free. The lock operation can be
blocking if a lock is in the owned state. The unlock operation never blocks: An owned
lock can always be freed, and freeing a free lock is invalid.
As another example, consider thread pools. In some of the earlier examples, they
were used as synchronizers. A thread pool can transition from one state where it is
active and accepting tasks, to another state where it is shut down and all its tasks
are terminated. Method awaitTermination is used to block a thread until a pool is in
the terminated state.
The focus of this chapter is inter-thread synchronization via shared memory. In sys-
tems that communicate via messages instead, a channel is also a synchronizer. Sending
and receiving messages are operations that modify the state of a channel, and that can
be blocking if a channel is full or empty. Synchronization of non-thread entities, such as
actors and coroutines, and the use of messages are briefly discussed in Chapter 27.
Locks are primarily used for non-interference, but other synchronizers are better
suited to cooperation. Chapter 23 gives an overview of the most common synchronizers.
For now, let’s solve the list-sharing problem with a simple but useful synchronizer:
the countdown latch. It is implemented in the java.util.concurrent library as a
CountDownLatch:
Scala
val shared = SafeStringList()
val latch = CountDownLatch(2)
latch.await()
assert(shared.size == 10)
Listing 22.2: Waiting for task completion using a countdown latch synchronizer.
The state of a countdown latch is a counter. Method countDown decrements this counter
if it is positive, and method await blocks any calling thread until the latch’s counter is 0.
In this example, a latch is created with a state equal to 2. The main thread waits for
the latch state to be 0 before querying the size of the list. The other two threads each
decrement the state by 1 as they finish, thus bringing it to 0 when they are both done.
By calling latch.await, you avoid the drawbacks of sleeping or busy-waiting: The main
thread is blocked efficiently and does not use CPU time while waiting, and it is notified
to resume execution as soon as the latch opens, without additional sleeping delay.
22.3 Deadlocks
When a program uses synchronizers, threads are blocked until the synchronizer reaches
a desired state. Synchronization carries a danger—namely, it makes threads wait for a
condition that may never be satisfied. Some errors are easy to catch. For instance, a
thread calls await on a latch, but nowhere is countDown ever called in the code. Other
mistakes can be trickier, such as waiting for more countdowns than can possibly happen
in a program.2
2 Though not the focus of this section, waiting for fewer countdowns is also a synchronization
mistake, resulting in a thread that uses data too early.
Deadlocks are a rather common situation in which threads end up waiting for
a condition that is never established. In a deadlock, multiple threads wait for each
other in a cycle. As an illustration, consider the following implementation of a thread-
safe box:
Scala
class SafeBox[A]:
private var contents = Option.empty[A]
private val filled = CountDownLatch(1)
// DON'T DO THIS!
def get: A = synchronized {
filled.await()
contents.get
}
[Figure: with contents = None, thread T1, inside get, acquires the lock on this and blocks in filled.await(); thread T2, inside set(value), then blocks waiting to lock this. DEADLOCK!]
Figure 22.1 Possible deadlock of the incorrect box implementation in Lis. 22.3.
This problem is easy to fix. There is no need for a thread inside method get to
access the contents of the box until the latch is open. Therefore, the thread should
wait on the latch first. (As a synchronizer, a latch is obviously thread-safe and can be
accessed without locking.) Then, after the latch is open, the thread can lock the box
to access its contents:
Scala
class SafeBox[A]:
  private var contents = Option.empty[A]
  private val filled = CountDownLatch(1)

  def get: A =
    filled.await()
    synchronized(contents.get)
Listing 22.4: Thread-safe box with a latch; fixed from Lis. 22.3; see also Lis. 27.2.
The implementation of set is left unchanged. Since countDown is a fast, non-blocking
operation, you can call it while holding the lock.
Developers who are new to concurrent programming might think it strange, but you
can also implement method set in this way:
Scala
def set(value: A): Boolean = synchronized {
  if contents.nonEmpty then false
  else
    filled.countDown()
    contents = Some(value)
    true
}
Listing 22.5: First variant for method set from Lis. 22.4.
This variant opens the latch before variable contents is set, while the box is still empty.
It might seem like a mistake, but thanks to locking, this latch opening is harmless: A
thread inside get might go through the latch at this point, but it will still be blocked
by the lock until the box contents are actually set.
If you want to access the latch outside the locked section, as in get, you can write
the set method slightly differently:
Scala
def set(value: A): Boolean =
  val setter = synchronized {
    if contents.nonEmpty then false
    else
      contents = Some(value)
      true
  }
  if setter then filled.countDown()
  setter
Listing 22.6: Second variant for method set from Lis. 22.4.
In this variant, you first lock the box to check that it is still empty. If so, the box
contents are set, and setter is true. You can then release the lock before opening the
latch. Releasing the lock is harmless: It simply lets another thread call set and find
the box already filled.
A Java virtual machine (JVM) can usually provide information on its threads, typ-
ically in the form of a thread dump. How you trigger a thread dump depends on the
JVM—the HotSpot JVM used in this book reacts to a SIGQUIT signal—and the exact
format of the output may vary.
As an illustration, you can cause the incorrect box implementation from Listing 22.3
to deadlock by using the following two-thread program:
Scala
val exec = Executors.newCachedThreadPool()
val box = SafeBox[Int]()
exec.execute(() => box.set(0))
println(box.get)
A secondary thread attempts to set a box with zero while the main thread invokes get.
On some runs, this program terminates properly. On others, it gets stuck. A thread
dump reveals the following information (slightly edited for clarity):
"pool-1-thread-1" #15 prio=5 os_prio=31 cpu=3.79ms elapsed=121.29s
java.lang.Thread.State: BLOCKED (on object monitor)
at chap22.Box1$SafeBox.set
- waiting to lock <0x000000061f1f6a68> (a chap22.Box1$SafeBox)
(The dump includes many more threads not displayed here. These are internal to
the JVM and are used for just-in-time compilation, garbage collection, and other
operations.)
The output shows that the two threads of interest are not running. The worker
thread from the pool, pool-1-thread-1, is blocked on an object monitor—JVM’s ter-
minology for its intrinsic locks—at the beginning of method set. It is trying to lock
object 0x000000061f1f6a68, an instance of SafeBox. Meanwhile, the main thread is
blocked on method await from CountDownLatch. (All the synchronizers in java.util
.concurrent are implemented in terms of an AbstractQueuedSynchronizer class and
use a low-level operation park to suspend threads.) You can see on the last line that
the main thread owns the lock on the same SafeBox instance that the worker thread is
waiting for, object 0x000000061f1f6a68. The deadlock state is exactly that shown in
Figure 22.1 (main is T1 and pool-1-thread-1 is T2).
22.5 The Java Memory Model

Single-threaded programs are sequentially consistent: Reading a memory location always produces the last value written into that location.
In the presence of multiple threads, however, memory models are often not sequen-
tially consistent—in other words, reading a variable may not produce the last value
written into it. The reason is that sequential consistency is typically not needed in
multithreaded applications, and hardware can be more efficient by not implement-
ing it. In particular, weaker consistency models facilitate hardware optimizations such
as core-level caching, instruction reordering, and speculative execution. To improve
performance, hardware manufacturers make consistency commitments that are not
sequential, and programmers need to be aware of them (especially when implementing
compilers and operating systems). To ensure portability across different hardware and
operating systems, Java defines its own memory model, the Java Memory Model (JMM),
common to all languages running on the JVM. And to better leverage hardware perfor-
mance, the JMM is not sequentially consistent.4
Figure 22.3 shows the same operations as Figure 22.2, but executed by two different
threads, T1 and T2. While sequential consistency would ensure that when variable x is
read by T1 its value is v3, there is no such guarantee under the JMM, even though v3
4 The memory model of .NET languages is very similar to the JMM; the C/C++ model is more
complex.
is the last value written into x. Similarly, the reading of variable y by T2 is not guaranteed
to produce v4.
[Figure 22.3: thread T1 executes Wx(v1) and Wy(v4), then reads y, producing v4, and reads x, producing an unspecified value; thread T2 executes the remaining writes and reads.]
So, what do we know about the reading of x in T1 and the reading of y in T2? In
the absence of synchronization, the JMM does not guarantee much. We know that the
reading of variable y by T1 does produce v4, the last value written into y by thread T1
itself. The reading of variable x by T1, however, may produce v1, or v3, or a mixture
of v1 and v3, such as an object in which some fields are as in v1 and others are as in v3.
Similarly, the value of variable y read by T2 could be v2, or v4, or a combination of
both. Clearly, writing correct programs under such assumptions is next to impossible.
Fortunately, threads tend to coordinate their actions via synchronization to ensure
that some operations take place before others, and the JMM makes stronger guaran-
tees when synchronization is involved. This is why a sequentially consistent model is not
needed: Most multithreaded programs require some form of synchronization, and a mem-
ory model that guarantees consistency for properly synchronized programs is enough.
The JMM is defined in terms of a happens-before ordering of events (thread actions).
Events include reading and writing variables, but also starting threads and synchroniza-
tion operations such as locking and unlocking. The precise semantics of the happens-
before relation are defined in Section 17.4 of the Java Language Specification and in the
documentation of concurrency libraries such as java.util.concurrent. The details are
beyond the scope of this book, and it is usually sufficient that you keep in mind three
characteristics of the happens-before relation:
• Intra-thread code is sequentially consistent: An action taken by a thread happens-
before a subsequent action by the same thread. This is why in Figure 22.3,
thread T1 is guaranteed to read v4 in variable y.
• Synchronization operations—locking, unlocking, inserting into a thread-safe queue,
opening a latch, and so on—connect events from different threads in terms of
happens-before.
• The happens-before relation is transitive: If A happens-before B and B happens-
before C, then A happens-before C.
For example, in Listing 17.3, we saw a pattern for thread termination that uses a
volatile Boolean variable to request termination of another thread. It is reproduced here:
Scala
@volatile private var continue = true

def terminate() = continue = false

def run() =
  while continue do
    // perform task
  end while
  // cleanup
The defining property of volatile variables is that writing a volatile variable happens-
before the next read of that variable. From this, you know that the first time a thread
in run reads the shared variable continue after method terminate has been called, it
will read the value false written in terminate. If the shared Boolean flag is not volatile,
there is no happens-before connection between its writing and subsequent reading, and
the runner thread is at risk of still reading true after false was written, and thus of
continuing to run.
As another illustration, consider a three-thread program:
Scala
def run1() = // behavior of thread t1
SECONDS.sleep(4)
println(x)
SECONDS.sleep(5)
y.synchronized {
SECONDS.sleep(1)
println(x)
SECONDS.sleep(2)
}
Listing 22.7: Example of events tied and not tied by happens-before; see Fig. 22.4.
The three threads start running at the same time, and the graph in Figure 22.4 shows
how their actions are connected according to the happens-before relation. Ly and Uy
represent the locking and unlocking of object y, respectively. Of particular interest are
the edges between t3.start and t3.Rx (starting a thread happens-before the first
action taken by that thread) and between t2.Uy and t1.Ly (releasing a lock happens-
before the next acquiring of that lock). The other edges are the consequence of in-thread
program ordering. Calls to method sleep do not create happens-before links—another
reason why you should not use sleeping for synchronization purposes in addition to
the reasons discussed in Section 22.1.
[Figure 22.4: happens-before graph of the three threads' events; t3's first read of x produces v1, t1's second read of x produces v2, and the reads marked "?" are unconstrained.]
The first read of variable x by thread t3 produces v1, the last value written into x,
because t2.Wx(v1) happens-before t3.Rx (by transitivity). Similarly, the second read
of variable x by thread t1 produces v2, the last value written into x, because t2.Wx(v2)
happens-before the second t1.Rx, by transitivity. Note that t2.Wx(v1) happens before
(no dash, as in “takes place earlier than”) the first t1.Rx in terms of real time, but it
does not happen-before it (with dash, according to the relation that defines the JMM).
This is also true of t2.Wx(v2) and the second t3.Rx. As a result, the first t1.Rx and
the second t3.Rx are free to produce values other than v1 and v2.
Roughly speaking, you can think of concurrent programs as belonging to one of three
groups:
• Group 1: fully synchronized programs in which all the necessary happens-before
relations exist and are obvious due to locking, submitting to thread pools, or other
forms of explicit synchronization.
• Group 2: fully synchronized programs in which the necessary happens-before
relations exist but are not necessarily obvious. For instance, the implementation of
22.6 Summary
• When parallel tasks are not fully independent, threads need to coordinate with
other threads. A common need is for a thread to pause its computation and wait
until other threads have made the state of a system suitable to continue.
• When facing this problem, two courses of actions are as tempting as they are
unsuitable in general. The first mistaken approach is to make a thread sleep for a
set amount of time. No matter how careful you are in the selection of a sleeping
time value, there is no correct choice when the activities of other threads have
unpredictable duration. Too short, and the state of an application may not be
ready by the time a thread resumes after sleeping. Too long, and time is wasted
sleeping while a thread could make progress instead, thus hurting an application’s
performance and responsiveness.
• The other misguided strategy is for a thread to constantly poll the state of the
system until it is suitable for that thread to continue, a programming style known
as spinning or busy-waiting. This approach is wasteful in several ways and does
not scale with large numbers of threads and/or lengthy activities. First, polling
threads continually use computing resources and need to repeatedly be scheduled
for execution, thus taking CPU time away from the threads they are waiting on.
Second, repeatedly querying the state of an application while other threads are
updating it necessitates thread-safe mechanisms—such as locks—which tend to be
costly when used frequently, as in a continuously checking loop.
• Instead of sleeping and busy-waiting, concurrent programs should rely on proper
synchronization. Synchronizers are designed to block threads efficiently (no busy-
waiting) until other threads establish a desired condition (no guessing of a suitable
sleeping duration). In shared memory systems, synchronizers maintain a state and
offer methods to block one or more threads until other threads set the synchronizer
to the required state.
• The best known synchronizer is the exclusive lock, which blocks acquiring threads
until the lock is freed by another thread and becomes available. Concurrency
libraries typically implement many other types of synchronizers, some of which
are discussed in Chapter 23.
• Using synchronizers is error-prone. Threads can be made to wait indefinitely for
conditions that will never be established, possibly causing an entire application
to halt. Deadlocks are a common cause of indefinite waiting. They are the result
of threads waiting for each other in a cyclical way. Invoking blocking methods on
synchronizers while holding locks is a sure way to cause a deadlock.
• Deadlocks are made easier to investigate by the fact that the threads involved
are blocked, so they can be examined at the programmer’s convenience. A JVM
can produce a snapshot of information on its threads, known as a thread dump.
This information typically includes whether threads are blocked or running, which
synchronizers they are blocked on, which locks they currently own, and so forth.
• Single-threaded programs are sequentially consistent: Reading a memory loca-
tion produces the last value written into that location. For performance reasons,
processors, operating systems, and virtual machines are often not sequentially
consistent in the presence of multiple threads. Though not sequentially consis-
tent in general, the Java Memory Model is defined in such a way that properly
synchronized programs can expect sequential consistency.
Chapter 23
Common Synchronizers
Concurrency libraries often contain extensive tools for inter-thread synchronization.
This chapter describes some of the most common synchronizers, along with typical
usage patterns.
23.1 Locks
The best-known synchronizer is the exclusive lock, which we have already seen in several
code illustrations in previous chapters. What was used through synchronized is the so-
called intrinsic locking mechanism of the JVM—every object can be used as a lock. The
Java standard library defines three types of locks in addition to intrinsic locks. Two
forms are of particular interest and are briefly discussed in this section. The third,
StampedLock, is rarely used and reserved for advanced usage.
First is the class ReentrantLock. This class implements an exclusive lock, like
synchronized, but one richer in features. For instance, threads can attempt to acquire
a lock with a timeout or allow themselves to be interrupted while waiting for a lock. You
can use ReentrantLock to write a variant of the getRank method from Listing 18.7 in
which threads give up trying to register after 100 milliseconds if the lock is not available:
Scala
private var userCount = 0
private val lock = ReentrantLock()
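The body of getRank is shown here only as a sketch; the rank logic (incrementing userCount) is an assumption consistent with the surrounding code, not a copy of Listing 18.7:

Scala
def getRank: Option[Int] =
  if lock.tryLock(100, MILLISECONDS) then
    try
      userCount += 1
      Some(userCount) // this thread's registration rank
    finally lock.unlock()
  else None // the lock was not acquired within 100 milliseconds; give up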
The result of tryLock is checked with an if-then-else before freeing the lock. You
can also use tryLock without a timeout to acquire a lock if it is available, or give up
immediately without blocking the thread.
ReentrantLock is typically used within a try-finally block to ensure that the
lock is properly released, even in the presence of exceptions. In this particular example,
not much could go wrong between tryLock and unlock, but a try-finally construct
makes it easier to release the lock when a value is returned, without having to store this
value in a local variable.
When acquiring a lock without a timeout, you can use lock to wait uninterruptibly,
or use lockInterruptibly to make the blocked thread responsive to interrupts. The
behavior of lock is the same as synchronized: Interrupting a thread does not make
the thread give up trying to acquire a lock. By contrast, lockInterruptibly throws
an InterruptedException when a thread is interrupted while waiting for a lock. Using
lockInterruptibly makes it easier to write lock-using tasks that can be canceled.
Besides its more flexible locking methods, the other major benefit of ReentrantLock
over intrinsic locks is that you can create multiple conditions on the same lock. Condi-
tions are discussed in Section 23.4, and Listing 23.8 uses a lock with two conditions.
In addition to this reimplementation of exclusive locks, a second class from the stan-
dard library, ReentrantReadWriteLock, implements non-exclusive locks. The defining
characteristic of these locks is that you can acquire them in two separate modes: write
(exclusive) and read (shared). When acquired in write mode, a lock is owned by a single
thread, as with exclusive locks. By contrast, a lock acquired in read mode can be shared
among multiple threads in that mode.
You can use read-write locks to guard structures that are frequently read but rarely
modified. Threads can read concurrently by acquiring the lock in read mode, whereas
writing the structure requires exclusive access and necessitates that threads acquire the
lock in write mode. Note, however, that the name read-write is historical and can be
somewhat misleading. A read-write lock is really a shared-exclusive lock, whether the
shared mode is used while writing or while reading.
Suppose, for instance, that an application relies on a thread-safe structure, with no
additional synchronization necessary to read or write, but also occasionally needs to
read the entire state of the structure to get a snapshot:
Scala
import scala.collection.concurrent.TrieMap // thread-safe
...
rlock.lock()
try if users.putIfAbsent(username, info).isEmpty then Some(info) else None
finally rlock.unlock()
...
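To make the pattern concrete, here is a self-contained sketch with hypothetical names UserRegistry, register, and snapshot. Note that the frequent operations lock in read (shared) mode, even though they write to the map, while the occasional snapshot locks in write (exclusive) mode:

Scala
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.concurrent.TrieMap

class UserRegistry[A]:
  private val users = TrieMap.empty[String, A]
  private val rwLock = ReentrantReadWriteLock()
  private val rlock = rwLock.readLock()
  private val wlock = rwLock.writeLock()

  // frequent operation: the map itself is thread-safe, so many threads
  // can run this in parallel under the shared ("read") mode
  def register(username: String, info: A): Option[A] =
    rlock.lock()
    try if users.putIfAbsent(username, info).isEmpty then Some(info) else None
    finally rlock.unlock()

  // rare operation: exclusive ("write") mode keeps all other operations
  // out while the entire map is copied
  def snapshot: Map[String, A] =
    wlock.lock()
    try users.readOnlySnapshot().toMap
    finally wlock.unlock()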
23.2 Latches and Barriers

Scala
val exec = Executors.newCachedThreadPool()
val shared = SafeStringList()
val start = CountDownLatch(N + 1)
val finish = CountDownLatch(N)

for i <- 1 to N do
  exec.execute { () =>
    start.countDown()
    start.await()
    5 times shared.add(i.toString)
    finish.countDown()
  }
start.countDown()
start.await()
val time1 = System.nanoTime()
finish.await()
val time2 = System.nanoTime()
exec.shutdown()
assert(shared.size == 5 * N)
assert((1 to N).forall(i => shared.getAll.count(_ == i.toString) == 5))
assert((time2 - time1) / 1E9 <= 0.01)
Listing 23.3: Add latches to Lis. 21.2 for increased concurrency and precise timing.
The purpose of the start latch, which is initialized with N + 1, is to make sure that
all the pool threads start adding to the list and the main thread starts recording time
at the same instant. First, this makes it more likely that threads will attempt to call
method add on the list at the same time, thus increasing your chances of finding a bug
if the list is not thread-safe. Second, it’s a way to measure how long it takes for all the
add operations to complete. By waiting on the start latch before you start keeping
time, you can ignore the time needed to activate threads. The second latch, finish,
opens when all the list insertions have taken place, at which point you can record the
end time. In addition to checking the final state of the shared list, the test asserts that
all the list operations were completed in less than 1/100th of a second.
Latches can transition from “closed” to “open” one time, and one time only. They
cannot transition from “open” to “close” and so are not reusable—countdown and await
have no effect on an open latch. When threads need to use a synchronization point
repeatedly, they can use a cyclic barrier instead. A cyclic barrier automatically opens
when the last thread reaches the barrier, and it closes again immediately. Instead of two
latches, you could write the list testing example with a single reusable barrier:
Scala
val exec = Executors.newCachedThreadPool()
val shared = SafeStringList()
val startEnd = CyclicBarrier(N + 1)
for i <- 1 to N do
  exec.execute { () =>
    startEnd.await()
    5 times shared.add(i.toString)
    startEnd.await()
  }
startEnd.await()
val time1 = System.nanoTime()
startEnd.await()
val time2 = System.nanoTime()
exec.shutdown()
assert(shared.size == 5 * N)
assert((1 to N).forall(i => shared.getAll.count(_ == i.toString) == 5))
assert((time2 - time1) / 1E9 <= 0.01)
Listing 23.4: Replace two latches from Lis. 23.3 with a single barrier.
In this variant of the test, all the threads reach the barrier a first time, at which point
the pool threads start adding to the list and the main thread starts keeping track
of time. After all the add operations are completed, all the threads reach the barrier
again, and the main thread records the finish time. Cyclic barriers are often used in the
implementation of iterative algorithms.
23.3 Semaphores
A semaphore is a synchronizer that maintains a count of virtual permits as its state.
Threads can acquire and release permits, and when no permit is available, the acquiring
method becomes blocking.
Semaphores are very versatile, but rather low-level synchronizers. You can implement
many other synchronizers using semaphores. (Concurrency libraries typically rely on
more efficient, non-semaphore-based implementations.) For instance, you can write a
simple, exclusive, non-reentrant lock using a semaphore:
Scala
class SimpleLock:
  private val semaphore = Semaphore(1)
  @volatile private var owner = Option.empty[Thread]
  ...
In this lock implementation, a semaphore is created with a single permit. The lock is
locked by acquiring this permit and unlocked by releasing the permit. The lock is not
reentrant: A thread that owns the lock and attempts to lock it again will get stuck on
an empty semaphore. To prevent threads from attempting to unlock a lock that they
do not own, a reference is kept on the current lock owner, as an option.
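Based on this description, the elided bodies of lock and unlock might look like the following sketch (the check against foreign unlocks, and the exception it throws, are assumptions):

Scala
def lock(): Unit =
  semaphore.acquire() // blocks if the single permit is taken
  owner = Some(Thread.currentThread)

def unlock(): Unit =
  if !owner.contains(Thread.currentThread) then
    throw IllegalStateException("not the lock owner")
  owner = None
  semaphore.release() // makes the permit available again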
Because of their notion of permits, semaphores are often used to enforce bounds.
For example, you can implement a bounded queue using two semaphores:
Scala
class BoundedQueue[A](capacity: Int):
  private val queue = mutable.Queue.empty[A]
  private val canTake = Semaphore(0)
  private val canPut = Semaphore(capacity)

  def take(): A =
    canTake.acquire()
    val element = synchronized(queue.dequeue())
    canPut.release()
    element
  ...
Listing 23.6: Bounded queue based on two semaphores; see also Lis. 23.7 and 23.8.
The number of permits in semaphore canTake equals the number of elements in the
queue, initially zero. A permit from canTake is needed to take an element out of
the queue, thus ensuring that the queue is not empty. If the queue is empty, method take
blocks the calling thread. A permit for semaphore canTake is created inside method put
when an element is added to the queue, so the number of permits always reflects the
number of elements in the queue. Semaphore canPut is used in a symmetrical way:
Its number of permits equals the number of available slots in the queue, initially set to
the queue’s capacity. Permits from canPut are acquired to insert elements—thus guar-
anteeing that the queue is not full—and created when elements are removed—in other
words, when empty slots are created.
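A put method along the lines just described would presumably look like this sketch:

Scala
def put(element: A): Unit =
  canPut.acquire()                     // wait for a free slot
  synchronized(queue.enqueue(element)) // lock only around the queue update
  canTake.release()                    // one more element is available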
Observe that, to avoid deadlocks, semaphore permits are acquired first, before lock-
ing the internal queue (see the discussion in Section 22.3). Note also that permits
acquired by some threads may be released by other threads (semaphore permits are
virtual), hence my choice of the word “created” instead of “released.”
This implementation of a bounded queue suffers from the fact that operations on a
non-empty and non-full queue—which are non-blocking—still need to go through two
semaphore operations. This inefficiency is avoided in a better bounded queue implemen-
tation in Listing 23.8.
23.4 Conditions
Conditions are another low-level synchronizer that you can use to implement other
synchronizers. Conceptually, a condition maintains a set of waiting threads, which are
blocked. A condition defines methods for a thread to notify another thread from the set
that it can continue, to notify all the threads in the set that they can continue, or to
add itself to the set. Notification has no effect if no thread is currently waiting.
Conditions have been available on the JVM since the beginning: Every Java object
is a condition, in the same way that every Java object is a lock. Conditions and locks
work hand-in-hand: You typically need to lock a shared state to decide whether waiting
is needed. If a thread needs to wait, it must release this lock before waiting so that other
threads can lock and modify the shared state. Note that an object needs to maintain
two separate sets of waiting threads: those waiting to acquire the lock and those waiting
to be notified from the condition.
Before the advent of java.util.concurrent in Java 5, you had to write all your
synchronizers in terms of these basic Java conditions. As an illustration, the blocking
queue from Section 23.3 could be reimplemented using a condition:
Scala
class BoundedQueue[A](capacity: Int):
  private val queue = mutable.Queue.empty[A]
  ...

Listing 23.7: Bounded queue with a single condition; see also Lis. 23.8.
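The methods of this queue are discussed below but not reproduced in full; by analogy with the two-condition variant in Listing 23.8, take and put presumably follow this sketch:

Scala
def take(): A = synchronized {
  while queue.isEmpty do wait() // releases the lock while waiting
  notifyAll()                   // the state of the queue is about to change
  queue.dequeue()
}

def put(element: A): Unit = synchronized {
  while queue.size == capacity do wait()
  notifyAll()
  queue.enqueue(element)
}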
The bounded queue itself is used as the lock and condition—all calls to methods
synchronized, wait, and notifyAll are made on this. In take, you make the calling
thread wait on the condition if the queue is empty; in put, you make it wait if the queue
is full. After any successful element insertion or removal, you notify the entire set of
waiting threads that the state of the queue has changed.
This code has a few peculiarities that deserve to be discussed in detail:
• The blocking method wait is called within the synchronized block of code, even
though I stated earlier that you should never invoke a blocking method while
holding a lock because it results in deadlocks. First, let’s understand why this is
necessary. The issue is a non-atomic check-then-act: In take, for instance, a thread
could see that a queue is empty and release the lock in anticipation to calling wait.
But before this thread reaches wait and is actually added to the set of waiting
threads, another thread invokes put and calls notifyAll. By the time the tak-
ing thread reaches the waiting set, notifyAll has already happened, and the
thread gets stuck, even though the queue is now non-empty. (Listing 23.6 does
not suffer from this problem because a semaphore permit can always be safely
created, whether a thread is waiting for it or not.)
Calling method wait within the synchronized block avoids this problem, but what
about the deadlock issue? Things work correctly only because wait internally
releases the lock after a thread is added to the waiting set (and it is invalid
for a thread to invoke wait on an object that is not locked by the thread). So,
the first rule to remember is simple: Code should always use X.wait inside an
X.synchronized block on the same object X.
• In both take and put, the check that a queue is not empty or not full is per-
formed inside a loop. This is essential, and it is a consequence of the locking issues
discussed earlier. When a thread calls wait, it owns a lock that is automatically
released. This lock needs to be reacquired after the thread is notified for the thread
to continue executing the synchronized block. Several threads—notified from the
set of waiting threads, or initiating new calls to method take or put—compete
for this lock, and any thread that ends up acquiring the lock can modify the state
of the queue. A taking thread, for instance, could be notified that a queue is non-
empty, but the queue could become empty again by the time the thread manages
to reacquire the lock. So, once the lock has been reacquired, a thread needs to
test the state of the queue again and go back to waiting again if the state is not
suitable. The second rule is as simple as the first: Always invoke wait in a loop
that reevaluates the condition you are waiting for.
• All the waiting threads are notified—using notifyAll—after a queue element is
added or removed. This is wasteful: If you are adding one element in an empty
queue, why notify ten threads to come and get it? However, method notify, which
notifies only one of the waiting threads, cannot be used in this example. When
multiple threads are waiting, method notify does not specify which thread is
notified. In this implementation, threads waiting to add to the queue and threads
waiting to take from the queue are waiting on the same condition. Therefore, a
call to notify within method take, for instance, would run the risk of notifying
another taking thread instead of a thread waiting to put an element into the
queue.1
By using notifyAll, more threads may end up being notified than can make
progress, but all the threads reevaluate the state of the queue in a loop anyway
and will wait again if needed. For instance, if an element is added to an empty
queue, all the threads waiting to take from the queue are notified, even though only
one will be able to get the element. All the others will observe that isEmpty is true
and block again. The implementation is correct but inefficient. See Listing 23.8
for an alternative implementation that avoids this problem.
Methods wait, notify, and notifyAll treat Java objects as intrinsic conditions
associated with intrinsic locks. If you use ReentrantLock as an alternative implemen-
tation of exclusive locks, you can use its own implementation of conditions. The main
advantage of ReentrantLock over intrinsic locks, at least in that regard, is that multiple
conditions can be associated with the same lock. You can write a better bounded queue
by having the threads that need the queue to not be empty, and the threads that need
the queue to not be full, wait on two separate conditions. This is difficult to do with
wait and notify because then two conditions would require two locks, and wait would
release only one of them. With ReentrantLock, you can create multiple conditions on
the same lock, using method newCondition:2
Scala
class BoundedQueue[A](capacity: Int):
  private val queue = mutable.Queue.empty[A]
  private val lock = ReentrantLock()
  private val canPut, canTake = lock.newCondition()

  def take(): A =
    lock.lock()
    try
      while queue.isEmpty do canTake.await()
      canPut.signal()
      queue.dequeue()
    finally lock.unlock()
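The matching put, not shown above, would mirror take, waiting on canPut and signaling canTake; a sketch:

Scala
def put(element: A): Unit =
  lock.lock()
  try
    while queue.size == capacity do canPut.await()
    canTake.signal() // wake up (at most) one taking thread
    queue.enqueue(element)
  finally lock.unlock()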
1 The issue is subtle. It is not immediately obvious that threads that add to the queue and threads
that take from the queue can end up in the set of waiting threads at the same time, but it can happen
if the number of threads sharing the queue is more than double the queue capacity (see the aside on
model checking at the end of this section). In such a scenario, replacing calls to notifyAll with notify
could result in a deadlock.
2 In Scala, val x, y = expr evaluates expr twice, once to initialize x and once to initialize y.
The details are not essential, but it can be seen that the model uses the same
strategy as in Listing 23.7, except that notify is used instead of notifyAll.a
In particular, if the queue is not full, an element is added and method notify
is called; otherwise, the thread calls method wait and the queue is unchanged.
Method take is modeled similarly.
a The models use IF instead of while because of an implicit loop in the TLA+ model, but the
condition is being rechecked as in Listing 23.7.
Methods wait, notify, and notifyAll can also be modeled according to their
Java semantics:
Wait(t)   ≜  waitSet′ = waitSet ∪ {t}

Notify    ≜  IF waitSet = {} THEN UNCHANGED waitSet
             ELSE ∃ t ∈ waitSet : waitSet′ = waitSet \ {t}

NotifyAll ≜  waitSet′ = {}

Again, the details are beyond the scope of this note, but you can see that wait
adds a thread t to the set of waiting threads—in TLA+, the notation waitSet′ =
· · · means "the new value of waitSet is . . . ." Similarly, notifyAll removes all
the threads from the waiting set, making the set empty. The representation
of method notify is a little more complex. It basically says that if the set of
waiting threads is empty, nothing happens and the set is unchanged. Otherwise,
a thread t is removed from the set. The existential quantifier is used to represent
the fact that method notify does not specify which thread is being removed—
some thread is taken out of the set.
Given these mathematical definitions—and a few other elements omitted
here—a model-checker can be run. A basic model-checker simply tries to enumer-
ate all possible ways threads might interleave their actions until all possibilities
have been checked or an error is found.
On the blocking queue, a model-checker can produce the deadlock scenario
shown in Figure 23.1. In this scenario, a queue has a capacity of two and is shared
among three producing threads—p1, p2, and p3—and two consuming threads—
c1 and c2. Initially, the queue is empty and all the threads are running. The
producers start calling method put until the queue is full (state 3). The next
three calls to put result in producers calling wait and being added to the set
of waiting threads (state 6). Consumers start calling method take, notifying a
producer each time, until the queue is empty (state 8). They keep calling take
until they are all added to the waiting set (state 10). Interestingly, at this point,
the set contains both producing and consuming threads.
In state 11, a producer—p2 or p3—adds an element into the queue, and as a
result calls notify. The intent of this call to notify is to let a consumer know
that a value has been added to the queue. However, thread p1—a producer—is
taken out of the set instead. The queue is again filled (state 12), resulting in
the producers being blocked in the waiting set (state 15). At this point, thread
c1 is the only running thread. It consumes a first element from the queue and
notifies p1 (state 16). It then takes the second element from the queue but notifies
consumer c2 instead of a producer (state 17). Both consumers then block on the
empty queue, leaving only thread p1 to run (state 19). This thread puts a value
in the queue but notifies p2 instead of a consumer.
23.5 Blocking Queues

Scala
...
N times exec.execute(searchTask)
In this code, the files to search are stored in a concurrent queue. The queue needs
to be thread-safe because all the searching tasks obtain files from it in parallel. Each
searching task keeps taking files from the queue until the queue is empty, at which
point the task terminates. The queue does not need to be blocking: It is filled with
files initially and is not used after it is empty. You need to use poll—which returns
null3 on an empty queue—to extract a file from the queue: if !files.isEmpty then
file = files.take() would not work because of a non-atomic check-then-act. Each
file is searched sequentially, and matching lines are added to a shared output file. You
need to lock accesses to this file because, even though instances of type Writer tend
to be thread-safe in Java, line contents and newline separators could be improperly
interleaved otherwise.4
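The searching task itself is only sketched here; a version consistent with this description, using hypothetical names files (the concurrent queue of files), target (the searched string), and printer (the shared output writer), could be:

Scala
import scala.io.Source

val searchTask: Runnable = () =>
  var file = Option(files.poll()) // poll returns null on an empty queue
  while file.nonEmpty do
    val source = Source.fromFile(file.get)
    try
      for line <- source.getLines() do
        if line.contains(target) then
          // lock the shared writer so lines are not interleaved
          printer.synchronized(printer.println(line))
    finally source.close()
    file = Option(files.poll())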
Instead of making all the searching tasks add their results to a shared output, you
could refactor the code into a producer–consumer pattern in which producers create
matching lines from files, and a single consumer stores these lines into the output file:
3 To deal with null, values are wrapped into options, as was done in Section 12.2.
4 Class PrintWriter defines an atomic println method, which I could have used here. However, it is
implemented as two calls to method write in a synchronized block, just as in my searchFile method.
Scala
val queue = ArrayBlockingQueue[String](capacity)
...
N times exec.execute(searchTask)
exec.execute(writeTask)
The blocking queue could be temporarily empty—all matching lines found so far have
been written into the output file—but that doesn’t mean the search is finished. Searching
tasks are still potentially opening and reading files and could find more matches. If the
writing task terminates because the queue was empty, these new matches will never be
saved to file.
The next idea is to keep track of how many searching tasks are still active. The
count of active tasks has to be thread-safe—it is decremented by the searching tasks
themselves and queried by the file writing task. If N searching tasks are used, you could
be tempted to modify the code as follows:
Scala
val active = AtomicInteger(N)
...
active.decrementAndGet() // at the end of each searching task
// DON'T DO THIS!
while active.get() > 0 do ...
However, this could still result in early termination of the loop: All the searching tasks
have terminated, the active count reaches zero, and the file writing task is allowed to
terminate without processing the lines that might possibly remain in the queue. This
would result again in matches missing from the output file.
Now you might think: No problem, let’s combine both termination conditions—no
active searching tasks, empty queue of matches—to make sure that the writing task
drains the queue after all searching tasks are done:
Scala
// DON'T DO THIS!
while active.get() > 0 || !queue.isEmpty do ...
This still doesn’t work, but the issue here is more subtle. Suppose you take the last
matching line from the queue but searching tasks are still running. At this point, the
queue is empty, but the condition active.get() > 0 is true, so the file writing task
does not terminate. Instead, it reenters its loop, calls queue.take, and blocks. However,
if no further matches are found, the searching tasks terminate without adding any more
lines to the queue. The active count becomes zero, and the termination condition
is established—the count is zero and the queue is empty. But the consuming thread
remains blocked on take indefinitely, with no opportunity to evaluate this condition
again.
A common technique to properly terminate producer–consumer applications is to
insert special values—called poison pills—into the queue as a way for producers to indi-
cate that consumers should terminate. There are two ways you can apply a poison
pill strategy to the file searching example. First, you can keep a thread-safe count of
running producers, as before, and make the last producer to terminate insert a single
poison pill into the queue. Alternatively, you can make each producer insert a special
value upon termination and have the consuming thread count these values until they
are all received. The first approach still requires a thread-safe count of active tasks; the
second approach, used in the following code, does not:
Scala
val queue = ArrayBlockingQueue[Option[String]](capacity)
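A sketch of how the pieces fit together, with each searching task inserting None as its poison pill and the single writing task counting pills until all N are received (names other than queue are assumptions):

Scala
// at the end of each searching task:
queue.put(None)

// the single consumer; no locking is needed on the writer anymore:
val writeTask: Runnable = () =>
  var remaining = N
  while remaining > 0 do
    queue.take() match
      case Some(line) => printer.println(line)
      case None       => remaining -= 1 // one more searching task is done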
23.6 Summary
• The Java standard library offers a flexible implementation of exclusive locks
through class ReentrantLock, which can be used as an alternative to intrinsic
locks. It provides users with the same basic locking properties but defines addi-
tional methods for threads to be interrupted or to time out (or even not block at
all) when a lock is not available.
• The library also defines non-exclusive locks, commonly known as read-write locks.
These locks can be acquired in shared mode or in exclusive mode; that is, a
lock that is not free can either have a single exclusive-mode owner or one or
more shared-mode owners. Concurrency can be increased by replacing a regular
exclusive lock with a read-write lock and having multiple threads work in parallel
with the lock acquired in shared mode.
• Latches are simple synchronizers that block threads until the latch is open. Java
implements countdown latches, which open after their countDown method is called
a specific number of times.
• Latches cannot be reused because once open, they cannot be closed again. By
contrast, cyclic barriers open and close repeatedly and automatically. Barriers are
set up for a fixed number of participating threads. They open—and close again
immediately—every time all the threads reach the barrier. (In other words, the
arrival of the last thread is the event that triggers opening.) Barriers are often used
in iterative algorithms to guarantee that no thread can begin its kth iteration until
all threads have completed k − 1 iterations.
• Semaphores are a classic type of synchronizer, defined in terms of virtual per-
mits. Permits can be acquired (or consumed) by threads and released (or created)
by other threads. When a semaphore runs out of permits, its acquiring method
becomes blocking. Semaphores are versatile, and they have been used historically
to implement other synchronizers.
• Conditions maintain a set of waiting threads and offer methods to block threads
(add them to the set) or notify threads (remove them from the set). Conditions
are always used within locked sections of code. Threads that block on a condi-
tion automatically—and atomically—release the corresponding lock. Threads that
are notified reacquire the necessary lock automatically, but this is not atomic—
other threads may run code with the lock after a thread has been notified and
before it can itself reacquire the lock. The JVM's intrinsic locks support only a
single condition; the later class ReentrantLock permits multiple conditions to be
created on the same lock.
• Blocking queues are both (thread-safe) data structures and synchronizers. In
addition to standard queue semantics, they define methods to synchronize threads
based on the state of the queue: Taking from an empty queue is blocking and,
if the queue is bounded, adding to a full queue is also blocking. Blocking queues
are often used in producer–consumer patterns to decouple data-producing activ-
ities from data-consuming activities. Useful patterns can be defined by using a
single producer (scattering data), a single consumer (gathering data), or multiple
producers and consumers on the same blocking queue.
Chapter 24
Case Study: Parallel Execution
This chapter investigates, as a case study, the problem of performing a collection of inde-
pendent tasks in parallel. Tasks are represented as executions of an impure function, run
for its side effects. (Value-returning tasks are the focus of the next few chapters.) Differ-
ent strategies are explored that rely on explicit thread creation (bounded or unbounded),
thread pools (bounded or unbounded, dedicated or shared), or parallel collections. The
last section uses conditions and semaphores to implement a variant in which additional
tasks can be submitted after the computation has already started.
A runner is created from a function of type A => Unit. Method run is given a sequence
of inputs and executes the function on each input in turn. To better demonstrate paral-
lelism, we will use a function sleepTask, of type Int => Unit, throughout the chapter.
This function takes the number of seconds to run as its input value and displays a mes-
sage on the terminal when it starts and when it stops. You can use it in a sequential run:
Scala
println("START")
...   // a sequential runner executes sleepTask on inputs 2, 1, and 3
println("END")
All the work is performed by the thread that calls method run. The entire execution
takes 6 seconds, the sum of the duration of the three tasks.
This chapter discusses several strategies for parallelization. Parallel runners are
required to run all the tasks to completion within a synchronous call to method run—
all the tasks must be finished before the END message—but they can use additional
threads to execute multiple tasks in parallel and speed up the computation. The tasks
are assumed to be independent and to not interfere with each other.
All three tasks run in parallel. The execution takes about 3 seconds, the duration of the
longest of the tasks.
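A minimal sketch of a runner with this behavior, creating one thread per input (the original listing may differ in details):

Scala
class Runner[A](comp: A => Unit):
  def run(inputs: Seq[A]): Unit =
    val threads = inputs.map(input => Thread(() => comp(input)))
    for thread <- threads do thread.start() // first, start all the threads
    for thread <- threads do thread.join()  // then wait for all of them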
The order in which you call join on the threads doesn’t matter—the code needs to
wait for all the threads to terminate—but it is essential that all the threads are started
before the calls to join begin. A possible mistake would be to try to combine the two loops into a single loop:
Scala
// DON'T DO THIS!
for thread <- threads do
thread.start()
thread.join()
Even though you still execute each task in a separate thread, the tasks now run sequen-
tially because you start a thread only after the previous thread has terminated. Pro-
cessing a 2, 1, 3 sequence takes 6 seconds, as with the sequential runner:
main at XX:XX:06.235: START
Thread-0 at XX:XX:06.263: begin 2
Thread-0 at XX:XX:08.264: end 2
Thread-1 at XX:XX:08.265: begin 1
Thread-1 at XX:XX:09.265: end 1
Thread-2 at XX:XX:09.266: begin 3
Thread-2 at XX:XX:12.266: end 3
main at XX:XX:12.267: END
To limit the number of threads created by a runner to a specified bound, you can
rely on a queue-based approach similar to Listing 23.9:
Scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.jdk.CollectionConverters.*

class Runner[A](bound: Int)(comp: A => Unit):
  def run(inputs: Seq[A]): Unit =
    val queue = ConcurrentLinkedQueue(inputs.asJava)
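    // sketch of the rest: `bound` worker threads drain the queue until it is empty
    def newWorker = Thread { () =>
      var input = queue.poll()              // null once the queue is empty
      while input != null do
        comp(input)
        input = queue.poll()
    }
    val workers = Seq.fill(bound)(newWorker)
    for worker <- workers do worker.start()
    for worker <- workers do worker.join()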
The runner uses two threads; they immediately pull inputs 2 and 1 from the queue.
After 1 second, Thread-1 completes the first task and pulls input 3 from the queue. One
second later—2 seconds into the run—Thread-0 finishes task 2, finds the queue empty,
and terminates. Two seconds later, Thread-1 finishes task 3 and also terminates. The
entire run takes about 4 seconds.
With a dedicated thread pool, method run can submit every task and then wait for completion by shutting the pool down:
Scala
exec.shutdown()
exec.awaitTermination(Long.MaxValue, NANOSECONDS)
With a shared pool, the tasks can instead count down a latch, and run waits with done.await(). Alternatively, a semaphore can bound the number of active tasks even when the pool itself is unbounded: each submission first acquires a permit, each task releases its permit when it finishes, and run waits until all the permits are back (a sketch):
Scala
def run(inputs: Seq[A]): Unit =
  val canStart = Semaphore(bound)       // at most `bound` tasks active at a time
  for input <- inputs do
    canStart.acquire()                  // blocks until a permit is available
    exec.execute { () => comp(input); canStart.release() }
  end for
  canStart.acquire(bound)               // all permits back: the run is over
The run ends up using three different threads—the thread pool is unlimited—but there
are never more than two tasks active at the same time. The overall run takes about 4
seconds, as before.
Allowing new inputs to be added to a computation that has already started raises several design issues:
• The thread that initiates a computation is stuck inside method run for the duration
of the run. Adding tasks can happen only in other threads. Accordingly, runner
instances now need to be safe when shared by multiple threads.
• There is an inherent race condition between a finishing run and an attempt to add
inputs to this run. A method to test the state of a runner—is it running?—would
be useless because of a non-atomic check-then-act: A runner could become passive
right after it is checked for being active. Instead, method addInput should be
expected to fail to add an input to the current run if it comes too late, and it
needs a way to indicate such a failure to its caller.
• Waiting for run completion is harder because new tasks can be added after waiting
has begun. In particular, the simple approach used earlier, based on a countdown
latch created from a known number of tasks, becomes insufficient.
• If runs are allowed to overlap—a new run can be started by another thread before
the previous run is finished—and further inputs are added while multiple runs are
ongoing, to which run should they be added?
• Alternatively, overlapping runs can be prohibited by making sure that method run
is not allowed to start until the previous run is completed. This choice requires that
the method behaves differently depending on the state of a runner—active versus
passive. All the runners implemented so far have been stateless—not counting the
thread pool, which is obviously thread-safe. A stateful runner must be designed
more carefully when shared among threads.
This section focuses on implementations that share two design choices. First, runs
cannot overlap. Method run blocks until the previous run is terminated, then submits
a new set of inputs (and blocks again until this new computation is finished). Second,
addInput indicates success or failure by returning a Boolean value. (Some of Java’s
concurrent collections use the same approach, such as method offer on queues.)
You can implement blocking in method run by combining conditions with a count
of active tasks. With this approach, a run finishes—and a new run may begin—when
all the tasks in a runner have completed, including tasks that were added after the run
started:
Scala
class Runner[A](exec: Executor)(comp: A => Unit):
private var active = 0
private var runs = 0
Method run waits for the active count to be zero to start a run, and again to terminate
the run. Consider now a scenario in which a thread T1 initiates a run and is blocked
waiting for the run to terminate and a thread T2 is blocked at the beginning of method
run waiting to start a second run. Both threads wait for active to be zero. When the
last task of the first run finishes, both threads are notified. At this point, they compete
to relock the runner so as to continue their execution of method run. If thread T2, which
is waiting to start a run, gets the lock first, it will submit new tasks to the executor and
make the active count non-zero. It will then release the lock and wait for this second run
to finish. Thread T1 can then obtain the lock, but it will find the condition active != 0 to be true and will go back to waiting—now for the completion of the second run, instead of simply being done with its own run.
This is a difficulty that is actually not uncommon in concurrent programming. You
often need to distinguish between separate repetitions of the same state—here, a run-
ner with a positive count of active tasks. For instance, cyclic barriers face a similar
challenge when they need to differentiate between two closed states in a closed–open–
closed sequence. A common strategy, used here, is to number iterations as a mechanism
to distinguish between states that otherwise would be equivalent: Closed–open–closed
becomes closed1 –open1 –closed2 , where closed2 is somehow different from closed1 .
In the runner example, each run is assigned a unique number,3 and threads that
have started a run wait for the active count of their own run to be zero. If a new run
has already started—which implies that the active count must have reached zero—those
threads are free to terminate their execution, even if the active count has already become
non-zero again.4
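A sketch of this design, using the runner’s intrinsic lock (method task is the wrapper applied to each input):
Scala
class Runner[A](exec: Executor)(comp: A => Unit):
  private var active = 0
  private var runs = 0

  private def task(input: A): Runnable = () =>
    comp(input)                            // apply the function without the lock
    synchronized {
      active -= 1
      if active == 0 then notifyAll()      // wake the finisher and any starters
    }

  def addInput(input: A): Boolean = synchronized {
    if active == 0 then false              // too late: no run in progress
    else
      active += 1
      exec.execute(task(input))
      true
  }

  def run(inputs: Seq[A]): Unit = synchronized {
    while active != 0 do wait()            // wait for the previous run to finish
    runs += 1
    val myRun = runs                       // this run's unique number
    active = inputs.length
    for input <- inputs do exec.execute(task(input))
    while active != 0 && runs == myRun do wait()  // wait for *this* run to finish
  }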
To conclude this section, I will make two final comments on this implementation.
First, locking is necessary to manipulate fields active and runs: You need to read and
write them within synchronized blocks. However, it is essential that when the tasks
apply the runner’s function, they do so without holding the lock. If the code comp(in) is
placed inside the synchronized block, all parallelism is lost.
Second, it is necessary to use notifyAll when you wake waiting threads because
you need to notify not only the thread that initiated the run that its run is finished,
but also possibly another thread that is waiting to start a new run. Two successive calls
to notify won’t work because they might notify two threads waiting to start a run,
and not the thread that is finishing. In contrast, by using notifyAll, you can notify all
the waiting threads, including multiple threads waiting to start a new run, even though
only one of them can initiate a run at a time.
Ideally, you would like to notify the thread waiting for its run to finish as well as
one of the threads waiting to start a new run. For this, you need to use two separate
conditions, one to start a run and another to finish a run:
3 Strictly speaking, runs can wrap and reuse a value, but the number of runs needed is so large that
it is not an issue in practice: A run will have terminated long before its number is being reused.
4 To avoid getting stuck because the active count quickly transitions from non-zero to zero and back
to non-zero, you might be tempted to implement method run as follows (assuming inputs is not empty):
// DON'T DO THIS!
def run(inputs: Seq[A]): Unit = synchronized {
while active != 0 do wait()
active = inputs.length
for input <- inputs do exec.execute(task(input))
wait()
}
This is incorrect because the JVM allows for spurious wake-ups: A thread blocked on wait may—
rarely—unblock and continue without having been notified. For this reason, it is important to always
stick to the pattern of invoking wait inside a loop that reevaluates the condition a thread is waiting for.
Scala
class Runner[A](exec: Executor)(comp: A => Unit):
private var active = 0
private var runs = 0
private val lock = ReentrantLock()
private val start, finish = lock.newCondition()
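  // sketch of the completion logic: the last task of a run signals both conditions
  private def taskDone(): Unit =
    lock.lock()
    try
      active -= 1
      if active == 0 then
        finish.signal()   // wake the thread that is finishing its run
        start.signal()    // wake one thread waiting to start a new run
    finally lock.unlock()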
When the active count reaches zero, the finishing task signals finish, for the thread that started the current run, and start, to indicate that a new run can begin. Although multiple threads could be waiting on the start
condition, only one is notified to begin a new run. The remainder of the implementation
is unchanged,5 except for the awkwardness of having to use ReentrantLock instead of
synchronized.
5 Run numbering is still needed for the subtle scenario in which a thread waiting to start a new run is notified when the current run finishes and gets the lock before the thread that is about to terminate its own run.
24.10 Summary
In most applications, concurrent programming need not take the form of individual
threads created on a per-task basis and coordinated using low-level synchronizers such as
locks. Abstractions such as thread pools and parallel collections (more will be explored
in the following chapters) can be used to hide intricate synchronization details and
achieve parallelism with relatively simple code. The old proposition that concurrent
programming is hard remains true in many cases, but parallelism can also often be
achieved with code that is safe and straightforward by leveraging suitable high-level
constructs.
When lower-level, do-it-yourself implementations are warranted—for instance, when implementing generic libraries—code complexity can increase rapidly. This case study developed several imple-
mentations of a parallel runner of independent tasks. Some, like Listing 24.6, are trivially
simple. Others, while not as trivial, can be written in terms of a single synchronizer,
without explicit threads or locks. Still others combine multiple synchronizers in complex
ways, and their correctness can be asserted only after all sneaky scenarios have been
taken into account.
6 As a result, and by contrast to the earlier variants, this implementation works properly only if the
Chapter 25
Futures and Promises
Scala
val exec = Executors.newCachedThreadPool()
// sketch: two tasks, each adding five strings to `shared`, a thread-safe list
// (`shared` and `makeStrings` are illustrative names)
exec.execute(() => for str <- makeStrings(5, "T1") do shared.add(str))
exec.execute(() => for str <- makeStrings(5, "T2") do shared.add(str))
exec.shutdown()
exec.awaitTermination(5, MINUTES)
assert(shared.size == 10)
To ensure this approach would work, the shared list was designed to be thread-
safe—in this case, by relying on locks. Note, however, that this thread-safety gives us
more than we need: The list remains in a valid state after each call to its add method
by any thread. In this particular example, these intermediate states, while valid, are
not needed. The list is used only after the tasks that added to it are finished—when it
already contains all ten strings.
If all you need is to build a collection of all the strings produced together by the two
tasks, you don’t care about the state of the list as it evolves during this computation.
Instead, you can have each task produce its own collection of strings, and then put the
two collections together at the end:
Scala
val exec = Executors.newCachedThreadPool()
var list1, list2: List[String] = null          // set later, by the tasks (sketch)
exec.execute(() => list1 = makeStrings(5, "T1"))
exec.execute(() => list2 = makeStrings(5, "T2"))
exec.shutdown()
exec.awaitTermination(5, MINUTES)
val all = list1 ::: list2                      // safe: both tasks have terminated
This variant of the program proceeds differently from before. Instead of sharing a single
list, each task creates its own list. These per-task lists are not shared and do not need
to be thread-safe.1 No locks are involved. After the tasks have terminated, the main
thread puts the two lists together, again without a need for synchronization.
The program works but lacks elegance. It uses two mutable variables, initialized
with arbitrary values, and runs the risk that these values may be used by mistake,
resulting in a NullPointerException. What brings about this convoluted structure is
a discrepancy between the purpose of each task—to create a list—and the fact that
the tasks are still implemented as actions—instances of Runnable that modify mutable
data. It would be more sensible for the tasks to be functions—value-producing tasks,
or functional tasks—instead of actions that set external variables.
All of the mechanisms used so far to create concurrent activities—the Runnable
interface, the constructor of Thread class, the execute method of a thread pool—focus
on actions. They are inadequate to handle functional tasks. This chapter and the next
discuss futures, an established device tailored to value-producing tasks.
1 As immutable lists, they are thread-safe in this example, but they don’t need to be.
NOTE
The Java standard library defines an interface Future, introduced in Java 5. The Scala standard
library also defines a trait by the same name. The Scala type includes many functionalities not
found in the Java interface. Various third-party libraries—most notably Google’s Guava—developed
types of future that were closer to Scala’s Future trait until a class CompletableFuture was in-
troduced in Java 8. Java’s CompletableFuture and Scala’s Future now offer similar mechanisms
(CompletableFuture has a bit more), which are explored in Chapter 26. This chapter focuses on
using futures as synchronizers, which you can do with all three types. Code illustrations in this
section and the next use Java’s earlier Future for simplicity. Section 25.4 introduces the other two
implementations.
So far, we have used thread pools through a method execute, which takes an argu-
ment of type Runnable. In addition to Runnable, Java defines a Callable interface.
In contrast to the run method in Runnable, the call method in Callable can return
values.2 Java thread pools define a method submit that takes a Callable argument.
While execute returns nothing—it’s a void method—submit returns a future:
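For instance (a sketch; the pool and task are illustrative):
Scala
val pool: ExecutorService = Executors.newFixedThreadPool(2)
val answer: Future[Int] = pool.submit(() => 6 * 7)  // a Callable[Int], not a Runnable
answer.get() // 42; blocks until the value is available, if necessary
As a running example, consider a server that handles incoming connections with tasks given to a thread pool: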
Scala
val exec: ExecutorService = ...
while true do
val socket = server.accept()
exec.execute(() => handleConnection(Connection(socket)))
This server uses a thread pool to handle incoming connections in parallel, up to 16 re-
quests concurrently. Each connection, however, is processed sequentially through a series
of operations: First read the request, then retrieve data from a database, then fetch a
customized ad, and so on.
Instead of happening in sequence, some of these steps could be executed concurrently.
For instance, the server could be made faster and more responsive by fetching the ad in
parallel with the other operations:
Scala
val exec1 = Executors.newFixedThreadPool(12)
val exec2 = Executors.newFixedThreadPool(8)
val server = ServerSocket(port)
while true do
val socket = server.accept()
exec1.execute(() => handleConnection(Connection(socket), exec2))
Listing 25.2: A parallel server with parallelism within responses; see also Lis. 26.8.
After a request has been read from a socket, a task is created to fetch a customized ad.
The call to method submit returns immediately with a future, and the thread handling
the request continues with the database lookup. After the data has been retrieved and
logged, the ad is needed to assemble the page. You obtain it by calling get on the future.
There are two possibilities at this point. If the ad-fetching task is still running, method
get blocks. Once the ad becomes available, get unblocks, and the page is assembled.
If, however, the ad-fetching task is already finished by the time you reach makePage,
method get does not block and simply returns the ad, which you use to assemble the
page. In other words, fetching the ad and querying the database now happen in parallel,
and the same code handles all cases, whether ad fetching is faster or slower than database
querying.
In addition to the no-argument method get, futures in Java define a method isDone
to query the status of a task without blocking. It can be used to implement more complex
strategies that involve combinations of polling and blocking. Note that the server in List-
ing 25.2 uses two thread pools. Using the same executor to run handleConnection and
fetchAd would run a risk of deadlock. This topic is discussed in detail in Section 26.1.
A call to cancel(true) is an attempt to terminate the execution. This is often implemented by interrupting the running thread,
which may or may not stop the task. If the Boolean is false, no attempt is made to stop a running
task, but the future is still marked as canceled. Java’s newer futures implement only cancel(false)
semantics, and Scala’s futures have no cancellation mechanism. (The rationale is that, as a cancellation
mechanism, interruption works adequately only with tasks designed to respond to it. Tasks that perform
non-interruptible I/O, for instance, will not react to cancel(true).) Canceling running tasks is difficult
in general. The case study in Chapter 28 needs to cancel tasks after they have begun to run
and implements an ad hoc cancellation strategy independent of futures.
Scala
val f1 = CompletableFuture.supplyAsync(() => makeStrings(5, "T1"), exec)
or
Scala
val f1 = CompletableFuture.supplyAsync(() => makeStrings(5, "T1"))
The second form uses a common thread pool instead of a user-specified thread pool.
Scala uses its own futures and specifies the thread pool to use as an implicit argu-
ment. This allows you to write
Scala
given ExecutionContext = exec
val f1 = Future(makeStrings(5, "T1")) // uses the default thread pool in scope
or
Scala
val f1 = Future(makeStrings(5, "T1"))(using exec) // thread pool passed explicitly
Note that Scala relies on an unevaluated, by-name argument, thereby avoiding the use
of a lambda expression as an explicit thunk (see the discussion in Section 12.2).
With both Scala’s Future and Java’s CompletableFuture, the future that is returned
implements a richer interface that supports a form of functional-concurrent program-
ming, as discussed in Chapter 26.
25.5 Promises
Promises are the internal mechanism by which futures are created. A promise represents
the (yet-to-come) value of a future. In some sense, it is the data part of the data/
synchronizer combination that constitutes a future.
A promise can be fulfilled with either a value or an error, at which point the future
is complete. Promises are often—but not always—created by one thread and completed
by another. For instance, this is how futures are created by thread pools: The thread
that submits a task creates a promise, and a worker from the pool fulfills it.
As an illustration, consider function apply, from the Future companion object. It
was used earlier to create future f1—recall that Future(makeStrings(5, "T1")) is
Future.apply(makeStrings(5, "T1")) in Scala. You could implement apply using a
promise:
Scala
import scala.concurrent.{ Future, Promise }
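import scala.util.Try

// a sketch consistent with the description below, assuming a thread pool in scope
def apply[A](code: => A)(using exec: ExecutionContext): Future[A] =
  val promise = Promise[A]()
  exec.execute(() => promise.complete(Try(code)))  // fulfill the promise on the pool
  promise.future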
Listing 25.4: Possible implementation of Future.apply; see also Lis. 25.5 and 25.6.
Function apply is curried and its first argument—the code to be executed—is passed
by name, unevaluated. The function creates a promise and returns the future associated
with it. It also schedules on the thread pool a task that evaluates the code argument
and fulfills the promise with its output. Method complete is used to fulfill the promise.
It uses an argument of type Try (see Section 13.3), which allows it to handle both the
successful and failure cases. As an alternative, Scala promises define methods success
and failure if you want to handle both cases separately.
If you want to write a similar apply function that produces a Java future instead,
you have a choice between using the older or newer implementation of Future:
Scala
import java.util.concurrent.{ CompletableFuture, Future }
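import java.util.concurrent.Executor

// a sketch: the CompletableFuture is both the promise and the future returned
def apply[A](code: => A, exec: Executor): Future[A] =
  val promise = CompletableFuture[A]()
  exec.execute { () =>
    try promise.complete(code)
    catch case e: Exception => promise.completeExceptionally(e)
  }
  promise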
This variant uses the newer CompletableFuture. As before, a promise is created and a
task is scheduled to fulfill it. One difference from Listing 25.4 is that CompletableFuture
doesn’t use a Try type, so you need to treat the successful and error cases sepa-
rately. Another difference is that the promise is itself the future being returned—type
CompletableFuture plays both parts.5
Alternatively, you could implement apply by using Java’s older futures:
Scala
import java.util.concurrent.{ Future, FutureTask }
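// a sketch using FutureTask, which is both a Runnable task and a future
def apply[A](code: => A, exec: Executor): Future[A] =
  val task = FutureTask[A](() => code)
  exec.execute(task)   // running the task computes the value and completes the future
  task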
5 Although it plays both parts, Java refers to it as a future. As it happens, because of the way promises are currently implemented in
Scala, promise.future is also the same object as promise—method future simply returns this—but
is used through two distinct types, Future and Promise.
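Before turning to a future-based version, consider memoization implemented with a plain concurrent map (a sketch):
Scala
import scala.collection.concurrent.TrieMap

def memo[A, B](f: A => B): A => B =
  val store = TrieMap.empty[A, B]
  x => store.getOrElseUpdate(x, f(x))   // compute and cache on first use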
Class TrieMap, which was used in Chapter 23 in a read-write lock example, implements
a thread-safe map. As a result, memo(f) is now thread-safe if function f is. However,
the implementation suffers from a major drawback.
Consider the following scenario: Suppose function f is such that it takes 10 seconds
to compute f(x0) for some value x0. Function g is created as memo(f). A thread T1 calls
g(x0) and starts to calculate f(x0). Nine seconds later, a thread T2 also calls g(x0). At
that time, thread T1 is still inside its computation of f(x0), and the store map is still
empty. Accordingly, method getOrElseUpdate triggers a second computation of f(x0)
within thread T2. It will take 10 seconds from this moment for thread T2 to obtain the
value f(x0), which is 9 seconds after it is added to the map by thread T1 (Figure 25.1).
Figure 25.1 Duplicated evaluation: T1 computes f(x0) for 10 seconds; T2, which calls g(x0) one second before that computation finishes, starts its own 10-second computation of f(x0).
To avoid such duplicated computations, you can store futures instead of values in the map: a thread that finds an entry already present—even one whose computation is still ongoing—simply waits for the corresponding value:
Scala
import scala.collection.concurrent.TrieMap
import java.util.concurrent.{ Future, FutureTask }

def memo[A, B](f: A => B): A => B =
  val store = TrieMap.empty[A, Future[B]]
x =>
val future = store.get(x) match
case Some(future1) => future1
case None =>
val task = FutureTask(() => f(x))
store.putIfAbsent(x, task) match
case Some(future2) => future2
case None =>
task.run()
task
future.get()
25.7 Summary
• Tasks that produce values can be implemented as actions—typically, instances of
interface Runnable. In that case, they need to modify shared data to store these
values, and concurrent access to shared mutable data requires that they implement
their own synchronization. Futures are an alternative synchronization mechanism
specifically designed for tasks to retrieve values produced by other tasks.
• Futures are well suited to functional tasks—that is, tasks that invoke a function
to produce a value. If the function is pure, all the synchronization that is needed
can be embedded into a future.
• Thread pools can create a future when a task is submitted for execution.
The future then acts as a handle on the task. It can be used to query the status
of the task or as a synchronizer to wait for task completion. After the task fin-
ishes, the future also serves as a container for the value—or the error—produced
by the task.
• Different types of futures vary in the functionalities they offer. This chapter
focused on the basic mechanisms defined by Java’s older Future type. Scala’s
Future is richer, and Java’s CompletableFuture is richest. Their additional fea-
tures are explored in Chapter 26.
• Futures are created from promises. A promise is a container, initially empty, that
can be fulfilled by setting a value or, in case of failure, an exception. A promise can
be fulfilled by the thread that created it (as in Listing 25.7), or by some other
thread, such as from a thread pool (as in Listings 25.4 to 25.6).
• Promises and futures are tightly coupled and are often the same object, playing
two roles. Terminology can be a bit confusing, as some sources refer to this object
as a promise and others as a future.
Chapter 26
Functional-Concurrent
Programming
Functional tasks can be handled as futures, which implement the basic synchroniza-
tion needed to wait for completion and to retrieve computed values (or exceptions).
As synchronizers, however, futures suffer from the same drawbacks as other blocking
operations, including performance costs and the risk of deadlocks. As an alternative,
futures are often enriched with higher-order methods that process their values asyn-
chronously, without blocking. Actions with side effects can be registered as callbacks,
but the full power of this approach comes from applying functional transformations to
futures to produce new futures, a coding style this book refers to as functional-concurrent
programming.
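As a first attempt at a parallel quick-sort, you can hand the sorting of low values to a thread pool and sort the high values in the current thread; a sketch:
Scala
def quickSort(list: List[Int], exec: ExecutorService): List[Int] = list match
  case Nil => Nil
  case pivot :: others =>
    val (low, high) = others.partition(_ < pivot)
    val lowFuture: java.util.concurrent.Future[List[Int]] =
      exec.submit(() => quickSort(low, exec))  // sort the low values in another task
    val sortedHigh = quickSort(high, exec)     // sort the high values in this thread
    lowFuture.get() ::: pivot :: sortedHigh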
This method follows the same pattern as the previous implementation of quick-sort in
Listing 10.1. The only difference is that the function uses a separate thread to sort
the low values, while the current thread sorts the high values, thus sorting both lists
in parallel. After both lists have been sorted, the sorted low values are retrieved using
method get, and the two lists are concatenated around the pivot as before. This is the
same pattern used in the server example to fetch a customized ad in the background
except that the task used to create the future is the function itself, recursively.
At first, the code appears to be working well enough. You can use quickSort to
successfully sort a small list of numbers:
Scala
val exec = Executors.newFixedThreadPool(3)
quickSort(List(1, 6, 8, 6, 1, 8, 2, 8, 9), exec) // List(1, 1, 2, 6, 6, 8, 8, 8, 9)
However, if you use the same three-thread pool in an attempt to sort the list
[5,4,1,3,2], the function gets stuck and fails to terminate. Looking at a thread dump
would show that all three threads are blocked on a call to lowFuture.get in a deadlock.
Figure 26.1 displays the state of the computation at this point as a tree. Each sorting
task is split into three branches: low, pivot, and high. The thread that first invokes
quickSort is called main here. It splits the list into a pivot (5), a low list ([4,1,3,2]),
and a high list ([]). It quickly sorts the empty list itself, and then blocks, waiting for
the sorting of list [4,1,3,2] to complete. Similar steps are taking place with the thread
in charge of sorting list [4,1,3,2], and so on, recursively.
Figure 26.1 The stuck computation as a tree of low/pivot/high splits: main waits on [4,1,3,2]; workers 1, 2, and 3 wait on [1,3,2], [2], and the empty list.
In the end, the three workers
from the thread pool are blocked on three sorting tasks: [1,3,2], [2], and []. This
last task—sorting the empty list at the bottom left of Figure 26.1—sits in the queue of
the thread pool, as there is no thread left to run it.
This faulty implementation of quick-sort works adequately on computations with
short low lists, even if the high lists are long. You can use it to sort the already sorted
list [1,2,...,100] on a three-thread pool, for instance. However, the function fails
when low lists are long, even if high lists are short. On the same three-thread pool, it
cannot sort the list [5,4,1,3,2], or even the list [4,3,2,1].
Tasks that recursively create more tasks on the same thread pool are a common
source of deadlocks. In fact, that problem is so prevalent that special thread pools were
designed to handle it (see Section 27.3). Recursive tasks will get you in trouble easily,
but as soon as tasks wait for completion of other tasks on the same thread pool, the risk
of deadlock is present, even without recursion. This is why the server in Listing 25.2
uses two separate thread pools. Its handleConnection function—stripped here of code
that is not relevant to the discussion—involves the following steps:
Scala
val futureAd: Future[Ad] = exec.submit(() => fetchAd(request)) // a Java future
val data: Data = dbLookup(request)
val page: Page = makePage(data, futureAd.get())
connection.write(page)
Listing 26.2: Ad-fetching example; contrast with Lis. 26.3, 26.6, and 26.7.
With a single pool of N threads, you could end up with N simultaneous connections,
and thus N concurrent runs of function handleConnection. Each run would submit an
ad-fetching task to the pool, with no thread to execute it. All the runs would then be
stuck, forever waiting on futureAd.get.
Avoiding these deadlock situations is typically not easy. You may have to add many
threads to a pool to make sure that deadlocks cannot happen, but large numbers
of threads can be detrimental to performance. It is quite possible—likely, even—that
most runs will block only a small subset of threads, nowhere near a deadlock situation,
and leave too many active threads that use CPU resources. Some situations are hope-
less: In the worst case, the naive quick-sort example would need as many threads as
there are values in the list to guarantee that a computation remains free of deadlock.
Even if you find yourself in a better situation and deadlocks can be avoided with a
pool of moderate size, waiting on futures still incurs a non-negligible cost. Blocking—on
any kind of synchronizer, including locks—requires parking a thread, saving its execution
stack, and later restoring the stack and restarting the thread. A parked thread also tends
to see its data in a processor-level cache overwritten by other computations, resulting
in cache misses when the thread resumes execution. This can have drastic consequences
on performance.
Avoiding deadlocks should, of course, be your primary concern, but these perfor-
mance costs cannot always be ignored. Thus, they constitute another incentive to reduce
thread blocking. Several strategies have been proposed to minimize blocking, and some
are described in detail in Chapter 27. For now, we will focus on a functional-concurrent
programming style that uses futures through higher-order functions without blocking. It
departs from the more familiar reliance on synchronizers and as such takes some getting
used to. Once mastered, it is a powerful way to arrange concurrent programs.
26.2 Callbacks
NOTE
The code illustrations in this chapter rely mostly on Scala’s futures for the same reason the book
uses Scala in the first place: They tend to be cleaner than (though not always as rich as) Java’s
CompletableFuture. Listings 26.10 and 27.9 and some of the code in Chapter 28 make use
of CompletableFuture, with more examples in Appendix A. Note also that callback mechanisms
and other higher-order functions on futures often require an execution context—typically a thread
pool in concurrent applications. How this context is specified varies from language to language. It
can also become a distraction when presenting more important concepts. Most functions in this
chapter assume a global execution context, which is left unspecified. One exception is Listing 26.5
(for consistency with Listing 26.1); other functions could use a similar pattern—that is, add a
“(using ExecutionContext)” argument instead of assuming a global context.
A callback is a piece of code that is registered for execution, often later (asynchronous
callback). Modern implementations of futures—including Scala’s Future and Java’s
CompletableFuture—offer a callback-registration mechanism. On a Scala future, you
register a callback using method onComplete, which takes as its argument an action to
apply to the result of the future. Because a future can end up with an exception instead
of a value, the input of a callback action is of type Try (see Section 13.3).
On a future whose task is still ongoing, a call to onComplete returns immediately.
The action will run when the future finishes, typically on a thread pool specified as
execution context:
Scala
println("START")
given ExecutionContext = ... // a thread pool
val future1: Future[Int] = ... // a future that succeeds with 42 after 1 second
val future2: Future[String] = ... // a future that fails with NPE after 2 seconds
future1.onComplete(println)
future2.onComplete(println)
println("END")
This example starts a 1-second task and a 2-second task and registers a simple callback
on each. It produces an output of the following form:
main at XX:XX:33.413: START
main at XX:XX:33.465: END
pool-1-thread-3 at XX:XX:34.466: Success(42)
pool-1-thread-3 at XX:XX:35.465: Failure(java.lang.NullPointerException)
You can see that the main thread terminates immediately—callback registration takes
almost no time. One second later, the first callback runs and prints a Success value.
One second after that, the second callback runs and prints a Failure value. In this
output, both callbacks ran on the same thread, but there is no guarantee that this will
always be the case.
You can use a callback in the ad-fetching scenario. Instead of waiting for a customized
ad to assemble a page, as in Listing 26.2, you specify as an action what is to be done
with the ad once it becomes available:
Scala
val futureAd: Future[Ad] = Future(fetchAd(request))
val data: Data = dbLookup(request)
futureAd.onComplete { ad =>
val page = makePage(data, ad.get)
connection.write(page)
}
If a value produced asynchronously is itself used asynchronously, you may even need callbacks within callbacks, which are difficult to write and even more difficult to debug. A better solution would be to bring the
non-blocking nature of callbacks into code that maintains a more functional style.
Before we revisit the ad-fetching example in Section 26.5, consider this callback-
based function:
Scala
def multiplyAndWrite(futureString: Future[String], count: Int): Unit =
futureString.onComplete {
case Success(str) => write(str * count)
case Failure(e) => write(s"exception: ${e.getMessage}")
}
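Instead of writing the result, the function can return a future of the multiplied string, fulfilled through a promise; a sketch:
Scala
import scala.util.{ Failure, Success }

def multiply(futureString: Future[String], count: Int): Future[String] =
  val promise = Promise[String]()
  futureString.onComplete {
    case Success(str) => promise.success(str * count)  // fulfill with the multiplied string
    case Failure(e)   => promise.failure(e)            // no string to multiply: fail the promise
  }
  promise.future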
You create a promise to hold the multiplied string and a callback action to fulfill the
promise. If the future futureString produces a string, the callback multiplies it and
fulfills the promise successfully. Otherwise, the promise is failed, since no string was
available for multiplication.
Now comes the interesting part. Conceptually, the preceding code has little to do
with strings and multiplication. What it really does is transform the value produced by
a future so as to create a new future. Of course, we have seen this pattern before—for
instance, to apply a function to the contents of an option—in the form of the higher-
order function map. Instead of focusing on the special case of string multiplication, you
could write a generic map function on futures:
Scala
def map[A, B](future: Future[A], f: A => B): Future[B] =
val promise = Promise[B]()
future.onComplete {
case Success(value) => promise.complete(Try(f(value)))
case Failure(e) => promise.failure(e)
}
promise.future
This function is defined for generic types A and B instead of strings. The only mean-
ingful difference from function multiply is that f might fail and is invoked inside Try.
Consequently, the promise could be failed for one of two reasons: No value of type A is
produced on which to apply f, or the invocation of function f itself fails.
The beauty of bringing up map is that it takes us to a familiar world, that of higher-
order functions, as discussed in Chapters 9 and 10 and throughout Part I. Indeed, the
Try type itself has a method map, which you can use to simplify the implementation of
map on futures:
Scala
def map[A, B](future: Future[A], f: A => B): Future[B] =
val promise = Promise[B]()
future.onComplete(tryValue => promise.complete(tryValue.map(f)))
promise.future
A thread that calls multiply does not block to wait for the input string to become
available. Nor does it create any new string itself. It only makes sure that the input
string will be multiplied once it is ready, typically by a worker from a thread pool.
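If the count itself comes from a future, flatMap and map combine the two futures (a sketch):
Scala
def multiply(futureString: Future[String], futureCount: Future[Int]): Future[String] =
  futureString.flatMap(str => futureCount.map(count => str * count))
A method zip also pairs two futures: futureString.zip(futureCount) has type Future[(String, Int)], which can then be mapped.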
You can even use zipWith, which combines zip and map into a single method:
Scala
def multiply(futureString: Future[String], futureCount: Future[Int]): Future[String] =
futureString.zipWith(futureCount)((str, count) => str * count)
The last two functions may be easier to read than the flatMap/map variant. Never-
theless, you should keep in mind the fundamental nature of flatMap. Indeed, zip and
zipWith can be implemented using flatMap.
An experienced Scala programmer might write multiply as follows:
Scala
def multiply(futureString: Future[String], futureCount: Future[Int]): Future[String] =
for str <- futureString; count <- futureCount yield str * count
The naive parallel quick-sort keeps creating futures for very small lists, down to the empty list. More realistic implementations would stop the parallelization
at some point and sort short lists within the current thread instead. Java’s Arrays.parallelSort
function, for instance, stops distributing subarrays to separate threads once they have 8192 or fewer
elements.
Scala
val futureAd: Future[Ad] = Future(fetchAd(request))
val data: Data = dbLookup(request)
val futurePage: Future[Page] = futureAd.map(ad => makePage(data, ad))
futurePage.foreach(page => connection.write(page))
You use map to transform an ad into a full page by combining the ad with data already
retrieved from the database. You now have a Future[Page], which you can use wherever
the page is needed. In particular, sending the page back to the client is a no-value action,
for which a callback fits naturally. For illustration purposes, this code registers the
callback with foreach instead of onComplete. The two methods differ in that foreach
does not deal with errors: Its action is not run if the future fails.
In Listing 26.6, the database is queried by the connection-handling thread while
a customized ad is fetched in the background. Alternatively, database lookup can be
turned over to another thread, resulting in a value of type Future[Data]. You can then
combine the two futures using flatMap/map:
Scala
val futureAd: Future[Ad] = Future(fetchAd(request))
val futureData: Future[Data] = Future(dbLookup(request))
val futurePage: Future[Page] =
futureData.flatMap(data => futureAd.map(ad => makePage(data, ad)))
futurePage.foreach(page => connection.write(page))
What is interesting about this code is that the thread that executes it does not
perform any database lookup, ad fetching, or page assembling and writing. It simply
creates futures and invokes non-blocking higher-order methods on them. If you write
the remainder of the connection-handling code in the same style, you end up with a
handleConnection function that is entirely asynchronous and non-blocking:
Scala
given exec: ExecutionContextExecutorService =
ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(16))
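A sketch of such a function, consistent with the description that follows:
Scala
def handleConnection(connection: Connection): Unit =
  val requestF: Future[Request] = Future(connection.read())
  val adF: Future[Ad] = requestF.map(fetchAd)
  val dataF: Future[Data] = requestF.map(dbLookup)
  val pageF: Future[Page] = dataF.flatMap(data => adF.map(ad => makePage(data, ad)))
  dataF.foreach(addToLog)              // logging callback
  pageF.foreach(updateStats)           // statistics callback
  pageF.foreach { page => connection.write(page); connection.close() }  // reply to the client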
The handleConnection function starts by submitting to the thread pool a task that
reads a request from a socket and produces a future, requestF. From then on, the code
proceeds by calling higher-order functions on futures. First, using map, an ad-fetching
task is scheduled to run once the request has been read. This call produces a future adF.
A database lookup future, dataF, is created in the same way. The two futures dataF
and adF are combined into a future pageF using flatMap and map, as before. Finally,
three callback actions are registered: one on dataF for logging, and two on pageF for
statistics recording and to reply to the client.
No actual connection-handling work is performed by the thread that runs func-
tion handleConnection. The thread simply creates futures and invokes non-blocking
functions on them. The time it takes to run the entire body of handleConnection is
negligible. In particular, you could make the listening thread itself do it, in contrast to
Listing 25.2, where a separate task is created for this purpose.
The various computations that need to happen when handling a request depend on
each other, as depicted in Figure 26.2. The server implemented in Listing 26.8 executes
a task as soon as its dependencies have been completed, unless all 16 threads in the pool
are busy. Indeed, the 16 threads jump from computation to computation—fetching ads,
logging, building pages, and so on—as the tasks become eligible to run, across request
boundaries. They never block, unless there is no task at all to run. This implementation
maximizes parallelism and is deadlock-free.
Figure 26.2 Dependencies among the request-handling tasks: read precedes fetchAd and dbLookup; makePage needs both the ad and the data; addToLog follows dbLookup; updateStats, write, and close follow makePage.
Instead of Scala futures, you could implement the server of Listing 26.8 using
Java’s CompletableFuture (see Appendix A.14 for a pure Java implementation).
Note, however, that CompletableFuture tends to use less standard names:
thenApply, thenCompose, and thenAccept are equivalent to map, flatMap, and
foreach, respectively.
There is one aspect of concurrent programming that a non-blocking approach tends
to make more difficult: handling timeouts. For instance, you could decide that it is
undesirable to have the server wait for more than 0.5 second for a customized ad after
data has been retrieved from the database. In a blocking style, you can achieve this
easily by adding a timeout argument when invoking futureAd.get, as in Listing 25.3.
It can be somewhat more challenging when using a non-blocking style.
Here, CompletableFuture has the advantage over Scala’s Future. It defines a method
completeOnTimeout to complete a future with an alternative value after a given time-
out. If the future is already finished, completeOnTimeout has no effect. You can use it
to fetch a default ad:
Scala
val adF: CompletableFuture[Ad] = ...
...
adF.completeOnTimeout(timeoutAd, 500, MILLISECONDS)
Scala’s Future type has no such method, which makes the implementation of a
timeout ad more difficult. You can follow a do-it-yourself approach by creating a promise
and relying on an external timer to complete it if needed. First, you create a timer as
a scheduling thread pool:
Scala
val timer = Executors.newScheduledThreadPool(1)
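Then, when the database data becomes available, you fall back on a default ad if the customized ad does not arrive within 0.5 second; a sketch:
Scala
val pageF: Future[Page] = dataF.flatMap { data =>
  val quickAdF: Future[Ad] =
    if adF.isCompleted then adF
    else
      val promise = Promise[Ad]()
      val timerTask: Runnable = () => { promise.trySuccess(timeoutAd); () }
      val timerF = timer.schedule(timerTask, 500, MILLISECONDS)  // default ad after 0.5 s
      adF.foreach(ad => if promise.trySuccess(ad) then timerF.cancel(false))
      promise.future
  quickAdF.map(ad => makePage(data, ad))
}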
This code runs when future dataF completes—when data from the database becomes
available. If, at that point, the ad is ready—adF.isCompleted is true—you can use
it. Otherwise, you need to make sure that an ad will be available quickly. For this
purpose, you create a promise, and you use the timer to fulfill this promise with a
default ad after 0.5 second. You also add a callback action to adF, which tries to fulfill
the same promise. Whichever runs first—the timer task or the customized ad task—will
set the promise with its value.2 The call timerF.cancel is not strictly necessary, but
it is used to avoid creating an unnecessary default ad if the customized ad is available
in time.
As part of the last case study, Listing 28.4 uses a similar strategy to extend Scala
futures with a completeOnTimeout method.
As an illustration, the three optional functions used in Section 10.3 can be changed
to represent asynchronous steps:
Scala
def parseRequest(request: Request): Future[User] = ...
def getAccount(user: User): Future[Account] = ...
def applyOperation(account: Account, op: Operation): Future[Int] = ...
You can chain these asynchronous steps with flatMap, exactly as you chained their optional counterparts:
Scala
parseRequest(request)
.flatMap(user => getAccount(user))
.flatMap(account => applyOperation(account, op))
The expression in Listing 26.9 is exactly the same as that in Listing 10.5, except that
it produces a value of type Future[Int] instead of Option[Int].
This style also accommodates functions that are not actually asynchronous. If all the accounts are already known, for instance, getAccount can return an already fulfilled future:
Scala
val allAccounts: Map[User, Account] = ...
def getAccount(user: User): Future[Account] = Future.successful(allAccounts(user))
This function returns an already completed future and does not involve any additional
thread. If a need to fetch accounts asynchronously then arises, you can reimplement the
function without modifying its signature, and leave all the code that uses it—such as
Listing 26.9—unchanged.
If the lookup itself can fail—allAccounts(user) may throw—you can capture the outcome in a Try and build an already completed, successful or failed, future from it:
Scala
def getAccount(user: User): Future[Account] = Future.fromTry(Try(allAccounts(user)))
Higher-order functions also help with failures. For instance, method recover substitutes a fallback value when a future fails with a matching exception:
Scala
val safePageF: Future[Page] = pageF.recover { case ex: PageException => errorPage(ex) }
You can also react to failures directly: method failed yields a Future[Throwable] that succeeds when the original future fails:
Scala
pageF.failed.foreach { ex =>
connection.write(errorPage(ex))
connection.close()
}
Either the callback actions specified using pageF.foreach or those specified using
pageF.failed.foreach will run, but not both.
Scala
val f1: Future[Int] = ...
val f2: Future[String] = ...
val f3: Future[Double] = ...
// e.g., combine the three values into a tuple (sketch)
val f123: Future[(Int, String, Double)] =
  for x <- f1; s <- f2; d <- f3 yield (x, s, d)
This won’t scale to larger numbers of futures, though. An interesting and not uncom-
mon case is to combine N futures of the same type into a single one, for an arbitrary
number N. In the server example, a client might obtain data from N database queries,
which are executed in parallel:
Scala
def queryDB(requests: List[Request]): Future[Page] =
val futures: List[Future[Data]] = requests.map(request => Future(dbLookup(request)))
val dataListF: Future[List[Data]] = Future.sequence(futures)
dataListF.map(makeBigPage)
The first line uses map to create a list of database-querying tasks, one for each request.
These tasks, which run in parallel, form a list of futures. The key step in queryDB is the
call to Future.sequence. This function uses an input of type List[Future[A]] to
produce an output of type Future[List[A]]. The future it returns is completed when
all the input futures are completed, and it contains all their values as a list (assuming no
errors). Invoking Future.sequence serves the same purpose as the “join” part of a fork-
join pattern, but does so without blocking. The last step uses a function makeBigPage
from List[Data] to Page to build the final page.
As of this writing, there is no standard sequence function for CompletableFuture,
but you can implement your own using thenCompose (equivalent to flatMap) and
thenApply (equivalent to map):
Scala
def sequence[A](futures: List[CompletableFuture[A]]): CompletableFuture[List[A]] =
futures match
case Nil => CompletableFuture.completedFuture(List.empty)
case future :: more =>
future.thenCompose(first => sequence(more).thenApply(others => first::others))
This function uses recursion to nest calls to thenCompose (flatMap). In the recur-
sive branch, sequence(more) is a future that will contain the values of all the input
futures, except the first. This future and the first input future are then combined using
thenCompose and thenApply (flatMap and map), according to the pattern used earlier
to merge two futures (as in Listings 26.5, 26.7, and 26.8).
Instead of working on a list of futures, as sequence does, function traverse uses a list
of inputs and a function from input to future output. It “forks” a collection of tasks by
applying the function to all inputs, and then “joins” the tasks into a single future, as
in sequence, without blocking.
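For instance, queryDB can be written in one step (a sketch):
Scala
def queryDB(requests: List[Request]): Future[Page] =
  Future.traverse(requests)(request => Future(dbLookup(request))).map(makeBigPage)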
26.7 Summary
• When used as a synchronizer, a future requires a thread to potentially block and
wait to access the value being computed. Parking and unparking threads has
a non-negligible performance cost. Worse yet, tasks that block to wait on other
tasks running on the same group of threads can easily result in deadlocks. It is not
always possible to avoid these deadlocks by increasing the number of threads in a
pool, and larger pool sizes tend to cause inefficiencies even when it can be done.
• Callbacks can be used as an alternative to blocking. They trigger a computation
when a future is ready, without having to explicitly wait for this future to finish.
Callbacks can be complex—future values can be used in arbitrary ways—and can
lead to intricate code, especially when callbacks within callbacks are involved.
• By defining non-blocking higher-order functions on futures, you can bring to the
world of concurrent programming the same shift from actions to functions that is at
the core of functional programming. Instead of using effect-based callbacks, future
values are handled functionally, as when using functions, but asynchronously.
• The resulting functional-concurrent programming style does not use futures as
synchronizers—thereby avoiding many deadlock scenarios and performance costs
associated with blocking—and also sidesteps the inherent complexity of callbacks.
• Higher-order functions on futures can be used to transform values, combine multi-
ple computations asynchronously, or recover from failures. The same higher-order
functions that proved hugely beneficial to functional programming—particularly,
map, flatMap, foreach, and filter—provide developers with tools to orchestrate
complex concurrent computations according to patterns that maximize concur-
rency while avoiding blocking.
• Functions flatMap and map, in particular, can be used to combine in a uniform
way computations that may be synchronous or asynchronous, failed or successful.
They can also be used to implement, without blocking threads, patterns that
(conceptually) wait for tasks to finish, such as fork-join.
• Adjusting to functional-concurrent programming requires a shift in program design,
away from locks and synchronizers. This can initially require some effort, similar
to casting aside assignments and loops when moving from imperative to functional
programming. Once you become accustomed to it, though, this programming style
is often easier and less error-prone than the alternatives.
Chapter 27
Minimizing Thread Blocking
Chapter 26 advocates the functional use of futures in a programming style that min-
imizes thread blocking. Over the years, much effort has been spent trying to avoid
unnecessary thread blocking, often for performance reasons.1 This chapter provides a
quick overview of some popular techniques that strive not to block threads and often
abstain from using locks and other synchronizers.
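Consider, as a simple example, an integer counter whose increment is made atomic with a lock (a sketch of what a synchronized variant of AtomicInteger could look like):
Scala
class SyncInteger:
  private var value = 0
  def incrementAndGet(): Int = synchronized {
    value += 1
    value
  }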
This code is correct and thread-safe, but not necessarily efficient. If two threads share
such an integer, and both invoke incrementAndGet at the same time, one thread suc-
cessfully acquires the object’s intrinsic lock and the other thread is blocked. This is
undesirable because it will take a lot more time to park and unpark this thread than
is needed for the other thread to execute value += 1. When the lock is not available, it
1 This process is still ongoing. On the JVM, in particular, see https://openjdk.org/jeps/425, which introduces virtual threads.
would be better for a thread to waste a few CPU cycles spinning instead of blocking,3
until another thread is done incrementing the integer.
The actual class AtomicInteger does not use a lock. Conceptually, it is implemented
as follows:
Java
public class AtomicInteger {
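    private volatile int value;

    // sketch of the rest, following the description below
    // (updater is a java.util.concurrent.atomic.AtomicIntegerFieldUpdater)
    private static final AtomicIntegerFieldUpdater<AtomicInteger> updater =
        AtomicIntegerFieldUpdater.newUpdater(AtomicInteger.class, "value");

    public int incrementAndGet() {
        while (true) {
            int current = value;
            int next = current + 1;
            if (updater.compareAndSet(this, current, next)) return next;
            // CAS failed: value was modified concurrently; read it again and retry
        }
    }
}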
A compareAndSet (CAS) expression sets its target to a new value, but only if the target equals an ex-
pected value. If the target is different from the expected value, the CAS operation does
not modify it. Semantically, this is equivalent to if (target == expected) target =
newValue, except that the check-then-act is performed atomically. A CAS returns true
if the target is successfully updated, and false otherwise.4
3 For this reason, modern JVM implementations often do not park a thread right away when a lock
is not available, but instead spin for a while first, in case the lock becomes available quickly.
4 Compare-and-set is sometimes also called compare-and-swap. The difference is that compare-and-
swap returns the value of a variable before the swap, while compare-and-set returns a Boolean.
The role of updater in Listing 27.1 is to provide a CAS operation on the value
field of the class.5 This gives you an atomic check-then-act that is sufficient to safely
implement incrementAndGet. First, you read the number in value into a local variable
current. Then, you use CAS to update value: If it is still equal to current, it is
replaced with next, which is current + 1. If the CAS fails and returns false, it is an
indication that another thread has modified value after it was read. In that case, you
use a loop to read the new contents of value, and you attempt the increment operation
again.
One way to think about this implementation is that, instead of using locks to guard
against interference, a thread proceeds under an optimistic assumption that no other
thread will interfere with its operation. If, however, interference does take place, it is
detected from the failure of CAS, and the operation is restarted.
When two threads modify an atomic integer at the same time, it is possible that a
thread ends up computing next = current + 1 twice, but the cost of the additional
increment is negligible compared to the time it takes to park and unpark a thread. Even
if more than two operations are involved and some threads end up repeating the loop
multiple times before a CAS is successful, this non-locking strategy tends to outperform
a lock-based one, unless the number of threads is extreme.
The Java standard library defines other types with atomic operations, such as
AtomicLong and AtomicBoolean. As an illustration of the latter, you can rewrite the
thread-safe box from Listing 22.4 without using any lock:
Scala
class SafeBox[A]:
private var contents: A = uninitialized
private val filled = CountDownLatch(1)
private val isSet = AtomicBoolean(false)
def get: A =
filled.await()
contents
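  // sketch of the missing setter, following the description below
  def set(value: A): Boolean =
    if isSet.get then false                   // already set: fail fast
    else if isSet.getAndSet(true) then false  // another thread set the box first
    else
      contents = value
      filled.countDown()                      // release the threads blocked in get
      true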
Method getAndSet atomically sets the Boolean and returns its previous value, which guarantees that if several threads call getAndSet(true) on a false Boolean, the method
returns false for exactly one thread and true for all the other threads. Internally,
getAndSet is implemented as a loop, using CAS.
The test isSet.get is not strictly necessary: It’s a performance optimization to
avoid the more costly getAndSet if the Boolean is already true. The nonEmpty test on
an option, which was necessary in Listing 22.4, is no longer used. The useless option
is removed for performance reasons, and variable contents now has type A instead of Option[A].
Consider now a stack implemented as a linked list of nodes hanging from a top pointer. You need locking to bring atomicity to push and pop. Without synchronization, two con-
current calls to push, for instance, could read the same top value and create two nodes,
V1 and V2 , that both refer to it (Figure 27.1). Then, if you update top to point to
node V1 , you lose the V2 value, and vice versa.
Figure 27.1 Possible loss of value from concurrent calls to push on a stack.
To avoid locking, you can change top to be an atomic reference and update it
using CAS. This way, a failed CAS can be used to detect attempts to modify the stack
concurrently:
Scala
import java.util.concurrent.atomic.AtomicReference
import scala.annotation.tailrec

@tailrec
def pop(): Option[A] = top.get match
case null => None
case node => if top.compareAndSet(node, node.next)
then Some(node.value) else pop()
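The other operations follow the same approach; a sketch, assuming a node class with a value field and a mutable next field, and a field top: AtomicReference[Node[A]]:
Scala
def peek: Option[A] =
  val node = top.get
  if node == null then None else Some(node.value)

def push(value: A): Unit =
  val node = Node(value, top.get)            // the new node's next is the current top
  while !top.compareAndSet(node.next, node) do
    node.next = top.get                      // CAS failed: refresh next and retry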
Method peek is not much different from before: It reads the top node and, if it is not
null, returns its value. No atomicity is needed, and top is used as if it were a plain
volatile variable. The other two methods are more involved.
In push, you start by creating a new node that refers to the current value of top as
its own next pointer. You then need to update top to point to this new node (move top
pointer up in Figure 27.2). You do this with a CAS, in case another thread is modifying
top concurrently (inside push or pop). If the CAS fails, you update the node’s next
pointer to refer to the new value of top and attempt to update top again. Similarly,
pop uses a CAS to try to update top to top.get.next (move the top pointer down in
Figure 27.2) and retries the pop operation if the CAS fails.6
Figure 27.2 Lock-free push and pop on a stack: each operation moves the top pointer with a CAS.
In this stack example, you can see that the lock-free algorithm is noticeably more
complex than one that uses locks. Many thread-safe data structures in java.util
.concurrent rely on lock-free algorithms, including structures used internally to imple-
ment synchronizers, such as queues of waiting threads. Some of them are quite tricky
due to the fact that an operation needs to update multiple references, which is not an
atomic operation, even when using compareAndSet on instances of AtomicReference.
A limitation of CAS is that it can be successful, even if the target value has been
modified concurrently, as long as it was modified back to its expected value—a scenario
known as the ABA problem. For instance, compareAndSet(false, true) on an
AtomicBoolean can fail to notice concurrent interactions that changed the Boolean
from false to true, then back to false. If it is necessary for an algorithm to detect such a
change, class AtomicStampedReference can be used. This class adds an integer counter
to references as a way to distinguish between identical values.7
6 Method pop uses tail recursion for retries; method push uses a loop because it makes it easier to
reuse the same node across CAS attempts—using recursion would require introducing an additional
method.
7 Modern processors offer an alternative to CAS known as load-linked/store-conditional (LL/SC), a
pair of bracketed instructions that detect interference between a load and a store independently from
the values being loaded and stored, and, therefore, are not sensitive to the ABA problem.
NOTE
Sections 27.3 to 27.6 rely on the ad-fetching example from Chapter 26 to illustrate various frame-
works. This example is not necessarily the best illustration of the strengths of these frameworks.
Instead, the goal here is to make it easier to compare approaches and to emphasize that they can
all be used to achieve the same non-blocking computation.
8 The ForkJoinPool class implements a work-stealing pattern in which each worker thread maintains
its own queue of tasks and “steals” from other workers when it runs out of work. In contrast to a single
queue model, this approach has the benefit that newly created tasks can be added to the queue of the
current worker for later processing without interfering with other threads. This became such a popular
feature that Java later introduced a newWorkStealingPool method that makes this pattern available
independently from the fork/join mechanism.
The calls futureData.join() and futureAd.join() do not block the thread, even if
the ad or the data is not yet available. Contrast this implementation with Listing 25.2,
in which method get of a regular future is used to retrieve the ad, possibly causing a
worker thread to block if the ad-fetching task is still running.
For all practical purposes, the implementation in Listing 27.4 is equivalent to List-
ing 26.7, which uses flatMap and map to combine the ad with database data in a
non-blocking way. Its main benefit is that it is closer in style to familiar (but blocking)
variants like that seen in Listing 25.2.
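In the spirit of Listing 27.4, the computation could be sketched with RecursiveTask; fetchAd, dbLookup, makePage, and the Request, Ad, Data, and Page types are assumed from the Chapter 26 example:
Scala
import java.util.concurrent.RecursiveTask

class PageTask(request: Request) extends RecursiveTask[Page]:
  protected def compute(): Page =
    val adTask = new RecursiveTask[Ad] { protected def compute(): Ad = fetchAd(request) }
    val dataTask = new RecursiveTask[Data] { protected def compute(): Data = dbLookup(request) }
    adTask.fork()
    dataTask.fork()
    // join waits for a subtask's value without blocking the worker thread
    makePage(dataTask.join(), adTask.join())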
With fork and join, you have a mechanism for a task to wait for another task
without blocking a thread. However, if you use a lock, or a semaphore, or a blocking
queue, you can still block a thread. To truly leverage the advantages of ForkJoinPool,
you need to refrain from using synchronizers. If you absolutely need to block a thread,
the framework defines a ManagedBlocker interface, which you can use to indicate that a
thread is about to block. This makes it possible for the pool to create additional threads
appropriately to maintain a desired level of parallelism.
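As a sketch, blocking code could be wrapped as follows (the helper name is an assumption, not a standard function):
Scala
import java.util.concurrent.ForkJoinPool

def managedBlocking[A](code: => A): A =
  var result: Option[A] = None
  ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker {
    // signals the pool that this thread may block, so that it can
    // add a worker to maintain the desired level of parallelism
    def block(): Boolean = { result = Some(code); true }
    def isReleasable(): Boolean = result.nonEmpty
  })
  result.get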
Scala
async {
  val futureAd: Future[Ad] = Future(fetchAd(request))
  val futureData: Future[Data] = Future(dbLookup(request))
  val ad: Ad = await(futureAd)
  val data: Data = await(futureData)
  val page: Page = makePage(data, ad)
}
Method await suspends the execution of the async block until its target future
is completed, but it does not block the thread, which continues to run code else-
where. The implementation again follows a conventional blocking style but is pretty
much equivalent in behavior to Listing 26.7 (with flatMap/map) or Listing 27.4 (with
ForkJoinPool).9
9 This example uses a non-standard library by Ruslan Shevchenko, which actually transforms the code of async blocks at compilation time.
Kotlin
coroutineScope {
  val futureAd: Deferred<Ad> = async { fetchAd(request) }
  val futureData: Deferred<Data> = async { dbLookup(request) }
  val data: Data = futureData.await()
  val ad: Ad = futureAd.await()
  val page: Page = makePage(data, ad)
}
27.5 Actors
The actor model is a classic message-based approach to concurrency that has been revived by languages such as Erlang and libraries such as Akka. Actors do not typically share mutable data, but
instead communicate by sending and receiving (immutable) messages. Within an actor,
all messages are processed sequentially, allowing actors to rely on internal data structures
that are not thread-safe. When an actor has no message to process, it remains passive,
but no thread is blocked. Instead, threads are available to run other actors, making
it possible for a small number of threads to handle a large number of actors, as with
coroutines.
You could use Akka actors to reimplement the parallel ad-fetching and database
lookup example:
Scala
import akka.actor.typed.{ ActorRef, Behavior }
import akka.actor.typed.scaladsl.Behaviors
Behaviors.receiveMessagePartial {
  case AdMsg(ad) =>
    val page = makePage(data, ad)
    replyTo ! PageMsg(page)
    Behaviors.stopped
}
The requestHandling actor spawns two new actors: adFetching, to fetch a customized ad, and dbQuerying, to query the database. You give adFetching
the address of dbQuerying so it can send the ad after it has been fetched. (The method
“!” is used to send a message in Akka, a notation that has its origins in CSP.) The
behavior of dbQuerying starts with a database query and continues by handling the mes-
sage sent by adFetching, which contains the ad. The actor then assembles the page and
sends it back to the entity that made the initial request (replyTo). Figure 27.3 shows
the flow of messages in this actor system. By running actors on a pool with multiple
threads, you achieve the same parallel, non-blocking ad fetching and database querying
as in Listings 26.7, 27.4, 27.5, and 27.6.
Figure 27.3 Flow of messages in the ad-fetching actor system: the request goes from requestHandling to adFetching and dbQuerying, the ad from adFetching to dbQuerying, and the finished page back to the requester.
After it spawns two new actors—which takes very little time—the requestHandling
actor is finished with the request. However, it continues to exist, with the same
behavior as before, Behaviors.same, remaining ready to handle more requests. By
contrast, adFetching and dbQuerying are transient actors. They terminate, using
Behaviors.stopped, after they have completed their task.
You can achieve great flexibility by choosing to reuse one or more existing actors,
or by deciding instead to spawn new ones. If you are implementing the full server
from Listing 26.8, for instance, you could decide to use a single actor for logging, thus
guaranteeing that all logging is done sequentially. Instead of spawning a new actor to
fetch an ad for each request, you could dedicate a fixed number of actors to the task of
fetching ads, and even dynamically change this number based on server activity.
Finally, relying on message-based protocols makes it easier to deploy actors in a
distributed system—as long as messages can be serialized for network communication.
Akka is often used to implement computations over a network and offers many features
to support distributed computing, including actor migration and load balancing.
As a side note, actors have the interesting property that they can switch behaviors,
and thus process subsequent incoming messages differently. This can be leveraged to
implement state machines in terms of actors:
Scala
trait Command
case class Start(replyTo: ActorRef[Number]) extends Command
case class Number(value: Int) extends Command
case object Stop extends Command
[State diagram: the actor starts in a reset state; a Start message moves it to an add state, and a Stop message returns it to reset.]
Method add is implemented in an imperative style, with a mutable variable sum. This
is done on purpose, to emphasize the fact that, within an actor, messages are processed
sequentially. The actor may receive numbers concurrently and still add them correctly,
even though no lock is used and the assignment sum += value is not thread-safe.
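As a sketch consistent with this description, using the message types from the listing above, the two behaviors could look like this:
Scala
// the reset state waits for Start; the add state accumulates numbers
// in a mutable variable until Stop
def reset: Behavior[Command] = Behaviors.receiveMessagePartial[Command] {
  case Start(replyTo) => add(replyTo)
}

def add(replyTo: ActorRef[Number]): Behavior[Command] =
  var sum = 0
  Behaviors.receiveMessagePartial[Command] {
    case Number(value) =>
      sum += value // safe: an actor processes its messages sequentially
      Behaviors.same
    case Stop =>
      replyTo ! Number(sum)
      reset
  }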
27.6 Reactive Streams
Scala
val pages: Flux[Page] = ...
Such a stream of pages can be transformed, using overlapping windows, to report every 5 minutes the length of the
longest page created in the previous hour, as a stream of numbers. Think of how you
would implement this directly in terms of individual futures of pages, without ever
blocking a thread. This is a nontrivial problem.
Consider now a program in which two futures, created from promises, each need a value produced by the other. When executed on a pool with multiple threads, this program prints the two statements
"future 1 starts" and "future 2 starts", and then hangs. To complete, future2
needs a number from future1, and future1 needs a string from future2. This is a
deadlock of futures, not of threads. A thread dump would reveal that all the threads
from the pool are idle, waiting for tasks to execute.
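A minimal reconstruction of the kind of program described (the promise names are assumptions):
Scala
import scala.concurrent.{ ExecutionContext, Future, Promise }

given ExecutionContext = ExecutionContext.global

val p1 = Promise[Int]()
val p2 = Promise[String]()

val future1 = Future { println("future 1 starts") }
  .flatMap(_ => p2.future)              // needs a string from future2
  .map(str => p1.success(str.length))

val future2 = Future { println("future 2 starts") }
  .flatMap(_ => p1.future)              // needs a number from future1
  .map(num => p2.success(num.toString))

Neither promise can ever be fulfilled, so both futures remain forever incomplete, while the pool's threads sit idle.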
You might object that this example is artificial (it is), but complex applications,
especially when you use promises explicitly, can face this type of situation. It is also
easy to end up deadlocked in a message-oriented framework by having a set of entities
wait for messages in a cycle. For instance, you can use Kotlin coroutines to wait for a
message on a channel without blocking a thread:
Kotlin
val strings = Channel<String>()
val ints = Channel<Int>()
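// two coroutines (not shown) each receive on one of these channels
// before sending on the other, forming a waiting cycle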
This program also prints two messages and then stops, even though method receive is
not blocking threads and all threads are idle. You face the same difficulty with actors.
It is a common pattern for an actor to need N messages before it can issue a response—
similar to using a countdown latch in a blocking variant. If you wait for N − 1 messages
instead, you may produce an incomplete response; if you wait for N + 1 messages, you
may end up in a deadlock, with no response being sent at all.
Besides performance considerations, which are important, a major benefit of
non-blocking strategies is that they help programmers shift their attention from non-
interference to cooperation. Non-interference synchronization—typically using locks—
is not needed, but the patterns that would require a cooperating synchronizer—such
as a blocking queue or a countdown latch—are still there. You just implement them
differently—for instance, as an actor or a coroutine waiting for messages, or by com-
bining several futures through zip or flatMap. The fact that threads are not blocked,
however, does not necessarily prevent these patterns from being implemented incor-
rectly. Concurrent programming is still hard.
27.8 Summary
• When programming with synchronizers, such as locks, latches, and futures, threads
can be temporarily blocked while other threads perform an operation, which often
results in non-negligible performance loss. Chapter 26 presented a programming
style that uses futures in a functional way, without blocking threads. Some other
techniques minimize thread blocking, as discussed in this chapter.
• Compare-and-set (CAS) is a hardware instruction that implements an elemen-
tary atomic check-then-act operation. It is also available on language-level types
like AtomicInteger and AtomicReference as a compareAndSet method. It can
be used to implement other thread-safe operations, such as incrementAndGet and
getAndSet in class AtomicInteger.
• In particular, class AtomicReference and atomic updaters make it possible to
manipulate pointers atomically without going through locks. This is the basis for
various lock-free algorithms that implement thread-safe data structures without
locking, albeit at the cost of increased code complexity.
• Lock-free algorithms tend to follow a similar pattern. Computation proceeds with-
out locking, and thus with no guarantee that other threads will not interfere. Work
is performed speculatively and then committed using CAS. A failed CAS is an in-
dication that interference from other threads did take place, in which case the
operation can be reattempted.
• Functional-concurrent programming avoids blocking threads by using futures
through callbacks and higher-order functions. An alternative approach is to make
it possible for code to explicitly wait for completion of a future, but without block-
ing a thread. Many different implementations of this principle are possible. Java’s
ForkJoinTask, for instance, implements a kind of future with a join method that
suspends code execution until the future terminates, but does not block the run-
ning thread, leaving it free to execute other tasks in the pool. This results in a
programming style that often feels more natural—resembling a blocking variant—
than one that relies on callbacks and scheduled transformations.
• This idea of waiting without blocking is sometimes extended beyond futures and
made available in other flavors, such as async/await constructs and coroutines.
In addition to waiting for a value produced by a task, these constructs offer non-
blocking ways to wait for a message from a channel, or to acquire an exclusive
lock, for instance.
• Actors are a model of concurrent and distributed programming centered on messages. Actors send and receive messages, which are handled sequentially within an actor, with no locking needed on the actor's internal state.

Chapter 28 Case Study: Parallel Strategies
NOTE
This case study was inspired by a problem I faced, a long time ago, while working on static analysis
of programs. I was implementing a tool that worked by generating proof obligations—that is, logical
formulas that need to be valid to show that a program satisfies a desirable property. The formulas
were typically large, numerous, and costly to verify. My tool used automatic theorem provers to
check the validity of the formulas, but different theorem provers—and different configurations of
the same theorem prover—would often perform very differently on the same formula. With no way
to predict which prover would work best, several theorem provers were started in parallel on each
formula and stopped as soon as one prover found the formula to be valid or invalid. Some of the
assumptions made in this chapter reflect the particulars of this theorem-proving scenario.
This last case study investigates a scenario in which a series of problems need to be solved
using a variety of heuristics. For a given problem, you don’t know which heuristic will
work, so you would like to try several heuristics in parallel until one succeeds. In this
chapter, heuristics are represented in terms of a Strategy interface:
Scala
trait Strategy[A, B]:
  def apply(input: A): Task // creates a task for the given input

  trait Task:
    def run(): Option[B]
    def cancel(): Unit
Strategies are parameterized by an input type A and an output type B. (In the program
verification problem, type A would represent logical formulas, and type B would be
Boolean.) When applied to a given input, a strategy produces a task, which can be run
in an attempt to calculate the corresponding output. A task may succeed or fail, and thus
returns an option. A task can also be canceled—for instance, because it has exceeded
its time quota, or because another strategy has already solved the problem. The code
developed in this chapter hinges on several assumptions:
• A strategy can only succeed with a value, fail by returning an empty option, or
keep running forever. In particular, we assume that strategies do not fail with
exceptions. This is not a significant limitation. You can always replace a func-
tion f of type A => Option[B] that might throw an exception with the function
x => Try(f(x)).getOrElse(None), which returns None instead.
• For a given input, all the strategies that produce a successful value produce the
same value. (In my original context, if a theorem prover can show that a formula
is valid, other provers cannot show that it is not, and vice versa.) There is no
notion of “better” value. This makes it possible—and even desirable—to ignore
the output of all the other strategies once a strategy has terminated successfully.
• Tasks are canceled by invoking their cancel method, which is customized for
each strategy. This makes it easier for strategies to use code that is not responsive
to thread interrupts—the default cancellation mechanism used by Java thread
pools—such as socket I/O. (My verification tool ran theorem provers on remote
computers.)
• Cancellation may be implemented on a “best effort” basis: A task may continue
to run for some time after it has been canceled. Calling cancel on a task does not
guarantee that a thread inside its run method completes this call immediately.
• Canceling a task before it runs prevents this task from ever running; if invoked, its
run method returns None immediately. Canceling a task that is already finished
has no effect.
For reference, a simple sequential implementation would try all the strategies in turn
until either one produces a non-empty option or all the strategies have been exhausted:
Scala
class Runner[A, B](strategies: Seq[Strategy[A, B]]):
  def compute(input: A): Option[B] =
    strategies.view.flatMap(strategy => strategy(input).run()).headOption
A runner is created from a collection of strategies. Its compute method iterates over all
the strategies, using a view. For each strategy, it creates and runs a task, until either one
task produces a non-empty option or all the strategies fail. The use of a view guarantees
that compute terminates as soon as a strategy is successful, without needlessly evaluating
the remaining strategies.1 (See Sections 12.8 and 12.9 for an introduction to using views
and iterators in this programming style.)
After you catch a timeout exception but before compute returns None, it can be desirable to cancel the run that is still ongoing, as well as all the runs that follow, since they would otherwise waste computing resources calculating data that will never be used.
The easiest way to do so is simply to cancel all the tasks, including tasks that have
already completed with a failure: In this case study, we assume that canceling a finished
task is harmless. This approach to cancellation is used in all runner implementations
throughout the chapter.2
Despite the use of a second thread, the runner implementation in Listing 28.1 remains
sequential: One thread is blocked while the other thread applies strategies one at a time.
In particular, the (single) computing thread might potentially spend all the allowed
time running an early strategy, even though other strategies further down the list could
produce a result more quickly.
With no a priori knowledge of which strategies are more likely to be successful, and
assuming enough computing resources, a better approach is to start all the strategies
in parallel and use the result of the first strategy that succeeds. It is straightforward
to modify Listing 28.1 to use a multithreaded pool and to create multiple tasks that
each evaluate a single strategy. The difficulty is to wait exactly until the first successful
strategy terminates or the elapsed time has reached the timeout value.
A “do-it-yourself” solution is certainly possible (a sketch follows the list). For instance, you could use an approach that combines:
• A shared mutable variable to store the result of the first successful strategy, if any.
• A counter of how many strategies are actively running.
• A latch to wait, with a timeout, for the first available result. The latch is opened
by a task if it is successful or if it is the last task to finish.
• A lock, to guard the variable used to store the result and the count of active tasks.
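Assembling these ingredients, a minimal sketch could look like this; the class name and details are illustrative, not taken from Chapter 24:
Scala
import java.util.concurrent.{ CountDownLatch, ExecutorService, TimeUnit }

class DIYRunner[A, B](strategies: Seq[Strategy[A, B]], exec: ExecutorService):
  def compute(input: A, timeout: Double): Option[B] =
    val lock = new Object
    var result: Option[B] = None    // first successful result, if any
    var active = strategies.length  // strategies still running
    val latch = CountDownLatch(1)   // opened on success, or when all strategies fail
    val tasks = strategies.map(strategy => strategy(input))
    for task <- tasks do
      exec.execute { () =>
        val outcome = task.run()
        lock.synchronized {
          if outcome.nonEmpty && result.isEmpty then result = outcome
          active -= 1
          if result.nonEmpty || active == 0 then latch.countDown()
        }
      }
    latch.await((timeout * 1E9).round, TimeUnit.NANOSECONDS)
    for task <- tasks do task.cancel()
    lock.synchronized(result)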
Instead of a low-level programming style, which we exercised in Chapter 24, the remain-
der of this chapter uses futures—and related standard mechanisms—to derive simpler
or richer implementations.
2 Note also that a task you are about to cancel might terminate at any time, and therefore the possibility that cancel is applied to a finished task needs to be addressed anyway.
28.4 Parallel Implementation Using CompletionService

A completion service lets you submit value-producing tasks to a thread pool, and wait for them in the order in which they terminate. You can use it to implement a runner on a thread pool specified as an instance of Executor:
Scala
import java.util.concurrent.{ Executor, ExecutorCompletionService }
import java.util.concurrent.TimeUnit.NANOSECONDS
import scala.annotation.tailrec

class Runner[A, B](strategies: Seq[Strategy[A, B]], exec: Executor):
  def compute(input: A, timeout: Double): Option[B] =
    val deadline = System.nanoTime() + (timeout * 1E9).round
    val queue = ExecutorCompletionService[Option[B]](exec)
    val tasks = strategies.map(strategy => strategy(input))
    for task <- tasks do queue.submit(() => task.run()) // one pool task per strategy

    @tailrec
    def loopQueue(pending: Int): Option[B] =
      if pending == 0 then None
      else
        queue.poll(deadline - System.nanoTime(), NANOSECONDS) match
          case null => None
          case future if future.get().nonEmpty => future.get()
          case _ => loopQueue(pending - 1)

    try loopQueue(tasks.length)
    finally for task <- tasks do task.cancel()
Scala
def compute(input: A, timeout: Double): scala.concurrent.Future[Option[B]] = ...
A call to compute returns immediately. Later, the future is completed when a strategy
succeeds, all the strategies have failed, or the specified timeout is reached, whichever
comes first. An asynchronous runner is more flexible than the blocking variant: You can
choose to use the returned future asynchronously, by means of higher-order functions,
or to block until the future is complete, which brings you back to the behavior of the
previous (blocking) compute methods.
The implementation of an asynchronous runner using Scala futures faces two diffi-
culties. First, you need a mechanism to complete a future after a timeout. You can reuse
the approach described in Section 26.5, based on a promise and a timer, to implement
a generic completeOnTimeout extension to futures:
Scala
extension [A](future: Future[A])
  def completeOnTimeout(timeout: Long, unit: TimeUnit)(fallbackCode: => A)(
      using exec: ExecutionContext, timer: ScheduledExecutorService
  ): Future[A] =
    if future.isCompleted then future
    else
      val promise = Promise[A]()
      val complete = (() => promise.completeWith(Future(fallbackCode))): Runnable
      val completion = timer.schedule(complete, timeout, unit)
      // the future's own result also completes the promise; whichever
      // happens first wins, and the timer task is canceled in the end
      promise.completeWith(future)
      promise.future.onComplete(_ => completion.cancel(false))
      promise.future
The initial test is only an optimization: If the future completes right after you test it to be incomplete, the else branch is applied to a
finished future and unnecessarily creates a promise and a timer task. But everything
still works.
The default value, fallbackCode, is passed by name, unevaluated, since it is not needed unless the timeout triggers. If the timeout is triggered, fallbackCode is evaluated
on the thread pool exec to avoid running unknown code in a timer thread—timer tasks
are typically expected to be short. This is the reason completeWith is used instead
of tryComplete to complete the promise from the timer. Except for the fact that the
fallback code is passed unevaluated, this extension behaves like the completeOnTimeout
method of Java’s CompletableFuture.
The second difficulty in implementing a non-blocking runner is the need to com-
plete the future returned by method compute exactly when the first successful strategy
terminates—assuming no timeout. Scala defines a function firstCompletedOf, which
is a kind of non-blocking invokeAny: Given a list of futures, it creates a future that is
completed when the first future from the list terminates. At first glance, this would
seem to be exactly what we need. However, while invokeAny returns the first future
to terminate successfully, firstCompletedOf returns (a future equivalent to) the first
future that terminates either successfully or unsuccessfully. This is a problem because
a direct application of firstCompletedOf on a list of running strategies will produce a
future that contains the result of the first strategy that terminates, even if this strategy
returns None. That’s not what you want.
What you do want is a findFirst function that finds from a list the earliest future
to terminate with a value that satisfies a given predicate—a non-empty option, for
instance. You can use firstCompletedOf to implement findFirst,3 with a mixture of
flatMap and recursion, reminiscent of Listing 26.10:
Scala
def findFirst[A](futures: Seq[Future[A]], test: A => Boolean)(
    using ExecutionContext
): Future[Option[A]] =
  if futures.isEmpty then Future.successful(None)
  else
    Future.firstCompletedOf(futures).flatMap { _ =>
      val (finished, running) = futures.partition(_.isCompleted)
      finished.flatMap(_.value.get.toOption).find(test) match
        case None => findFirst(running, test)
        case found => Future.successful(found)
    }
Listing 28.5: Earliest Scala future that satisfies a condition; see also Lis. 28.7.
Function findFirst does nothing blocking. Its responsibility is to create a future that
will eventually contain the value you are looking for. It achieves this by using flatMap
to trigger computations when a future finishes. First, you use firstCompletedOf to
create a future that is completed when the first future from the list terminates. At this
3 You don’t have to. You can also implement findFirst directly with callbacks and a promise, as
in Listing 28.7.
point—when this future terminates—you know that at least one future is completed.
You separate finished futures from those that are still running and look inside the
finished futures for a successful outcome that passes the test.4 If one is found, it is
the value you are looking for. You return it as a completed future, and you are done.
Otherwise, you keep waiting for the remaining running futures with a recursive call. If
the list runs out of futures without finding one with a value that satisfies the test, you
return None as a completed future. Note that finished is necessarily non-empty (one
future has to finish to trigger the evaluation of the body of flatMap), so running is a
smaller list than futures, as is needed for recursion to work.
With methods completeOnTimeout and findFirst implemented, you can finally
write an asynchronous runner of strategies:
Scala
class Runner[A, B](strategies: Seq[Strategy[A, B]], exec: ExecutionContext)
    (using ScheduledExecutorService)(using ExecutionContext):

  def compute(input: A, timeout: Double): Future[Option[B]] =
    val deadline = System.nanoTime() + (timeout * 1E9).round
    val tasks = strategies.map(strategy => strategy(input))
    val futures = tasks.map(task => Future(task.run())(using exec)) // one future per strategy
    findFirst(futures, _.nonEmpty)                 // earliest future with a non-empty option
      .map(_.flatten)                              // Option[Option[B]] becomes Option[B]
      .completeOnTimeout(deadline - System.nanoTime(), NANOSECONDS)(None)
      .andThen { case _ => for task <- tasks do task.cancel() }
The result from the earliest successful strategy is thus returned as an option of option, which the call to flatten reduces to a simple option.
(In the CompletableFuture variant of findFirst, shown in Listing 28.7, the code stops updating the active counter after the promise has been fulfilled, since it will not be needed at this point.)
The action used to create the callbacks begins with a non-atomic check-then-act. It is
harmless for the same reason that the initial, non-atomic test in Listing 28.4 is harmless:
The test if !promise.isDone is included here only as a performance optimization. The
case of multiple threads attempting to fulfill the promise at the same time is handled
by method complete, which internally uses an atomic if-then. The worst thing that
can happen is that threads needlessly evaluate test(value) or decrement active after
the promise has already been completed.
With method findFirst written, you can easily implement an asynchronous runner:
Scala
class Runner[A, B](strategies: Seq[Strategy[A, B]], exec: Executor):
  def compute(input: A, timeout: Double): CompletableFuture[Option[B]] =
    val deadline = System.nanoTime() + (timeout * 1E9).round
    val tasks = strategies.map(strategy => strategy(input))
    val futures = tasks.map(task => CompletableFuture.supplyAsync(() => task.run(), exec))
    findFirst(futures, _.nonEmpty)                 // the CompletableFuture-based findFirst
      .thenApply(_.flatten)
      .completeOnTimeout(None, deadline - System.nanoTime(), NANOSECONDS)
      .whenComplete((_, _) => for task <- tasks do task.cancel())
(The higher-order methods also exist in Async variants, which take a thread pool argument, for an implementation closer to the Scala variants.)
This caching design takes the specified timeouts into account. It is based on the assumption that failed strategy runs might have succeeded with more time, but not with less.7 As before, the cache is organized as a
thread-safe mapping from inputs to futures of outputs (see the rationale in Section 25.6)
but also stores a timeout value associated with each future.
When a thread calls compute with an input value and a timeout, the mapping is
searched for the input, and the outcome of this search is handled as follows:
1. The input is not found. In this case, the design from Listing 25.7 is reused: Put a
promise into the mapping, and invoke strategies to complete it.
2. A completed future with a value other than None is found. This implies that a
strategy was successful in the past. The desired output is taken from the future,
and no further computation is necessary.
3. A future is found but has no usable output. This is either an incomplete future or
a future completed with None. In this situation, there are two cases to consider:
(a) If the timeout associated with this earlier computation is larger than the
timeout now specified, you have no hope of improving the result, due to
the assumption stated earlier; don’t start a new computation, and use the
future from the mapping.
(b) If the future in the mapping corresponds to a computation with a smaller
timeout, you could use it in hopes that it produces a successful outcome,
but there is also a chance that it has failed (or will fail) where a lengthier
computation would have succeeded.
This is the interesting case. If the cached future is completed (with None),
you can ignore it. But if it is still running, you need to consider it: It may
soon produce a successful output. However, you cannot commit to it because
it may still fail. Furthermore, the future can also switch from incomplete to
complete at any moment while you are considering it.
You handle this case by initiating a new computation, with a larger timeout,
and by using the first successful result available, either from this new com-
putation or from the future already in the cache. Note that this approach
works whether the cached future is completed or not, so it correctly handles
the case where the cached future suddenly finishes in the middle of these
steps, successfully or not.
Using Scala futures for a change—Listing 25.7 uses CompletableFuture—a possible
implementation is as follows:
Scala
class Runner[A, B](strategies: Seq[Strategy[A, B]], exec: ExecutionContext)
(using ScheduledExecutorService)(using ExecutionContext):
7 This may not be true in practice, as computing resources may fluctuate, making it possible for
shorter runs to use more actual CPU time than longer runs did.
      findFirst(futures, _.nonEmpty)
        .map(_.flatten)
        .completeOnTimeout(deadline - System.nanoTime(), NANOSECONDS)(None)
        .onComplete { result =>
          promise.complete(result)
          for task <- tasks do task.cancel()
        }
      promise.future
    end doCompute

    searchCache()
  end compute
This is the very last code illustration of the book, but certainly not its simplest. You
can follow it step by step. A cache is allocated as a thread-safe mapping from inputs to
pairs. Each pair contains a future and the timeout associated with it. Method compute
simply returns the future produced by searchCache, which proceeds with the cases
outlined earlier, in the same order.
In the first case, no future is found in the cache. Therefore, you must initiate a new
computation. You create a promise and try to add it to the cache. If the promise is
added, you use doCompute to start the computation. The else branch deals with the
case where another thread added a promise to the cache before you could (see discussion
of Listing 25.7 in Section 25.6), in which case you need to consider it with a recursive
call to searchCache [the same thing happens in case 3(b)].
The next case in searchCache finds a successful completed future and returns it
unchanged. (Function hasResult is hard to read: It returns true if a future is finished,
its computation didn’t throw an exception, and the option it produced is non-empty.)
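One way to write such a function:
Scala
import scala.concurrent.Future

// true when the future is complete, did not fail, and holds a non-empty option
def hasResult[B](future: Future[Option[B]]): Boolean =
  future.value.exists(_.toOption.exists(_.nonEmpty))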
The third case deals with a future that is either failed or incomplete. If the future
is associated with a timeout value larger than what the current invocation of compute is
allowed [case 3(a)], you use it without starting a new computation. You simply add
a timeout to the existing future in case its remaining run is longer than the timeout
argument used in the call to compute.
This leaves case 3(b): You find a future—failed or incomplete, it doesn’t matter—but
its timeout is smaller than what you can afford to spend in the current call to compute.
In this case, you create a promise for a new computation, as in case 1, and use it to
replace the existing cached future. Once the new future is in the cache, you initiate its
computation. The only difference from case 1 is that, to the list of futures created from
the strategies, you add the older future that you found in the cache (this is done with
++ future at the beginning of function doCompute).
To implement function doCompute, used in cases 1 and 3(b), you proceed as in
Listing 28.6: Use findFirst, flatten, add a timeout, and cancel all remaining tasks
upon completion. However, instead of returning the future produced by findFirst
directly, you use it to complete the promise that is already in the cache.
In summary, this caching runner treats existing futures in the cache in one of two
ways. If a future cannot possibly be improved—it is finished with a successful value
or it is based on a larger timeout—it is used. Otherwise, it is replaced by a future
that cannot possibly be worse.8 This new future is completed as soon as the existing
future terminates successfully or later based on new strategy calls. It is no worse than
the previous future, because it produces a successful outcome at the same time if the
original future is successful or keeps running and may still produce a successful value
after the original future has failed.
8 Ignoring the possibility that starting new computations slows down existing ones given limited
computing resources.
28.8 Summary
While the case study in Chapter 24 focused on tasks as actions, executed for their
side effects, this variant considered functional—value-producing—tasks instead. It used
futures, a common handler of functional tasks, either as synchronizers (Lis. 28.1),
or jointly with other synchronizers (Lis. 28.2 and 28.3), or through non-blocking
higher-order functions (Lis. 28.4 to 28.9). Special attention was given to issues of time-
outs and cancellation. To contrast two standard future implementations, we imple-
mented the asynchronous runner twice, first with Scala futures and then with Java’s
CompletableFuture. Finally, a thread-safe, timeout-aware cache was added to asyn-
chronous runners using Scala futures and promises.
Appendix A
Features of Java and Kotlin
This appendix reviews many of the concepts and features discussed in the book, using an
older language, Java, and a newer language, Kotlin, instead of Scala. Several examples
written earlier in Scala are rewritten in Java or in Kotlin (or both). Code illustrations
use Kotlin 1.7 and Java 19 (with some “preview” features enabled).
Kotlin
fun abs(x: Int): Int = if (x > 0) x else -x
You can define functions locally inside functions in Kotlin but not in Java:
Kotlin
fun abs(x: Int): Int {
fun max(a: Int, b: Int) = if (a > b) a else b
return max(x, -x)
}
Kotlin can use a simple expression as a function body, as in the first abs imple-
mentation. If a block is used instead, you need to use return—blocks are not values
as they are in Scala. You can also see from the examples here that Kotlin, like Scala,
omits semicolons, uses the variable: Type syntax, and has a functional if-then-else
construct.
In Kotlin, you can define functions as extensions and invoke them as methods. Also, a
Kotlin method declared infix can be invoked in infix notation. This example combines
both features to define an infix extension max to integers:
Kotlin
infix fun Int.max(that: Int): Int = if (this > that) this else that
Given this definition, you can write expressions like x.max(y) or x max y on integers x
and y. There is no support for these features in Java.
Neither language supports symbolic names, but several operators can be defined in
Kotlin by implementing a corresponding method:
Kotlin
operator fun String.times(count: Int): String {
val builder = StringBuilder(this.length * count)
for (i in 1..count) builder.append(this)
return builder.toString()
}
"A" * 3 // "AAA"
"foo"(2) // 'o'
Note that this[i] is actually compiled into a method call this.get(i). By defining
your own get, you can set up a square bracket syntax on your own types, something that
is not possible in Scala. There is no support for user-defined operators or for functions
applied implicitly in Java.
In both Java and Kotlin, you can parameterize functions by types. For instance, the
following function returns the first and last elements of a list as a short list:
Java
<A> List<A> firstLast(List<A> list) {
return List.of(list.get(0), list.get(list.size() - 1));
}
Kotlin
fun <A> firstLast(list: List<A>): List<A> = list.slice(listOf(0, list.size - 1))
Kotlin
fun average(first: Double, vararg others: Double): Double =
(first + others.sum()) / (1 + others.size)
A.2 Immutability
Java and Kotlin functions can be either pure or impure. For actions—functions with
no return value, applied for side effects—Java has a keyword void, whereas Kotlin
uses a Unit return type, like Scala. Kotlin uses var and val to define reassignable and
non-reassignable variables, like Scala. In Java, variables are reassignable by default but
become non-reassignable when the keyword final is used.
Kotlin lets you defer the initialization of a val variable, a feature Scala does
not have:
Kotlin
fun parseVerbosity(arg: String): Int {
val verbosity: Int
if (arg == "-v") verbosity = 1
else if (arg == "-vv") verbosity = 2
else verbosity = 0
return verbosity
}
Most Java collections are mutable. Kotlin defines immutable collection types, but
they are mostly read-only views of underlying mutable implementations:
Kotlin
val readOnly: List<Int> = listOf(1, 2, 3)
val cheating = readOnly as MutableList<Int>
cheating[1] = 20
val x = readOnly[1] // 20
In this example, variable readOnly is of type List, an immutable type in Kotlin. In-
ternally, however, it is implemented (on the JVM) as an old-fashioned array-based list.
It can even be type cast to a mutable type, and modified: The second element of list
readOnly is changed from 2 to 20.
This makes Kotlin lists fundamentally different from lists in Scala (or in any func-
tional programming language). In particular, Kotlin lists have no constant-time head
and tail operations:
Scala
val a: List[String] = List("A","B","C")
val b = a.tail // constant time, lists a and b share all their data
Kotlin
val a: List<String> = listOf("A","B","C")
val b = a.drop(1) // requires a full copy, no data sharing
Immutable structures in Kotlin do not implement the data sharing scheme described
in Section 3.7 (and shown in Figure 3.1 with Scala lists). They are more like the unmodi-
fiable views available in Java. One difference is that they define their own types, without
methods for mutation. In contrast, in Java, calling a mutating method on an unmodi-
fiable structure throws an exception:
Java
List<Integer> nums = List.of(1, 2, 3);
nums.add(4); // throws UnsupportedOperationException
Kotlin
val nums = listOf(1, 2, 3)
nums.add(4) // rejected at compile time
In both Java and Kotlin, there is only one list implementation internally. In Kotlin,
you can view it as mutable or immutable, under two different types—List and
MutableList. In Java, there is only one type—List, mutable—but, on some lists,
mutating methods are not available at runtime.
Here, you use the whole switch expression as the value returned by the function. Later,
Java added type testing and logical guards:
Java
<A> String listInfo(List<A> list) {
return switch (list) {
case null -> "no list";
case List<A> empty when empty.isEmpty() -> "an empty list";
case RandomAccess seq -> "a random access list";
default -> "some other list";
};
}
Java also recently introduced pattern matching of records, which are similar to Scala’s
case classes (see Section 5.2):1
Java
record TempRecord(String city, int temperature) {}
Kotlin's when construct supports neither arbitrary guards nor the deconstruction of records as in Java and Scala.
1 The last two Java examples in this section and the Java code in Section A.4 use a pattern matching syntax that requires preview features to be enabled in Java 19.
The tailrec keyword works differently from the @tailrec annotation in Scala. In
Kotlin, optimization takes place only if tailrec is specified (and the function is indeed
tail recursive).
You can define and process recursive structures in both languages, as shown here:
Java
sealed public interface BinTree {
BinTree Empty = new Empty(); // empty tree singleton
}
record Empty() implements BinTree {}
record Node(int key, BinTree left, BinTree right) implements BinTree {}
Kotlin
sealed interface BinTree
object Empty : BinTree
data class Node(val key: Int, val left: BinTree, val right: BinTree) : BinTree
The main difference with the Scala variant in Listing 6.7 is that Kotlin’s when is more
limited than pattern matching: You cannot use a pattern of the form Node(...) to
extract the left and right children of a node, as you would in Java or Scala.
Kotlin
fun <A> negate(f: (A) -> Boolean): (A) -> Boolean = { x -> !f(x) }
Java and Kotlin offer bridges between methods and functions via method references.
Kotlin also defines a shorter form for anonymous functions, where x -> f(x) is replaced
with f(it), similar to Scala’s use of “_” for partial application:
Kotlin
fun pos(x: Int): Boolean = x > 0

val numbers = listOf(1, -2, 3)  // hypothetical sample list
numbers.filter { it > 0 }       // shorter lambda form, using it
numbers.filter(::pos)           // method reference to pos
If pos were a method of an object ref, you would replace ::pos with ref::pos in the
last expression. In Java, methods are always defined within a class, so the “::” operator
used for method reference always has a left-hand-side argument:
Java
class Math {
public boolean pos(int x) {
return x > 0;
}
}
Math m = new Math();
Java
Comparator<String> byLength = (a, b) -> Integer.compare(a.length(), b.length());
By contrast, in Kotlin, you often need to mention an interface’s name explicitly for a
lambda expression to implement it:
Kotlin
// rejected by the compiler:
val byLength: Comparator<String> = { a, b -> a.length.compareTo(b.length) }

// OK: the interface is named explicitly (one possible fix)
val byLength2 = Comparator<String> { a, b -> a.length.compareTo(b.length) }
In Java, lambda expressions actually require that you use a SAM interface:
Java
ToIntFunction<String> len = (String str) -> str.length(); // OK
Kotlin
val len: Any = { str: String -> str.length } // OK
val len = { str: String -> str.length } // type (String) -> Int inferred
Recall that, in Scala, functions can be curried, and single-argument calls can use
curly braces instead of parentheses:
Scala
def existsOrEmpty[A](list: List[A])(test: A => Boolean): Boolean =
list.isEmpty || list.exists(test)
Kotlin has no curried functions but uses a different “trick” for the same purpose. It lets
you move a lambda expression—which includes its own pair of braces—outside the list
of arguments if it is the last argument in a call:
Kotlin
fun <A> existsOrEmpty(list: List<A>, test: (A) -> Boolean): Boolean =
list.isEmpty() || list.any(test)
Notice how, in the last expression, the lambda expression { num -> num > 1 } has
been moved outside the list of existsOrEmpty arguments.
Kotlin’s any, used in the preceding example, is equivalent to the Scala method
exists. Kotlin collections define most standard higher-order functions, including
forEach, filter, map, and flatMap. For instance, you can convert Fahrenheit tem-
peratures into Celsius using map:
Kotlin
temps.map { ((it - 32) / 1.8f).roundToInt() }
This example uses the shorter syntax based on it. Kotlin tends to use null where
Scala would use options—for instance, toIntOrNull instead of toIntOption. Method
mapNotNull combines mapping and filtering steps, ignoring null outputs of the mapped
function.
In Java, the standard collections implement very few higher-order functions. Instead,
collections are bridged into streams, which implement all the higher-order functions. A
simple conversion into Celsius could be written as follows:
Java
List<Integer> temps = ...
List<Integer> celsius = // one possible pipeline for the conversion
    temps.stream().map(temp -> Math.round((temp - 32) / 1.8f)).toList();
This calculation forces boxing and unboxing of primitive int values to and from Integer
values. To avoid it, Java also defines specialized stream classes for its primitive types:
Java
int[] temps = ...
int[] celsius = // the IntStream variant avoids boxing
    IntStream.of(temps).map(temp -> Math.round((temp - 32) / 1.8f)).toArray();
You can also implement in Java the more complex transformation shown earlier, from
a list of lines into a list of Celsius temperatures. There is no standard Java function to
parse a string into an optional integer (or even into a nullable integer), so you need
to write one first:
Java
Optional<Integer> parse(String str) {
try {
return Optional.of(Integer.valueOf(str));
} catch (NumberFormatException ex) {
return Optional.empty();
}
}
Using this function, the code for the transformation is similar to what it would be in
Scala. The key difference is that an option needs to be converted into a stream explicitly
in the second call to flatMap (Scala uses an implicit conversion there):
Java
Pattern SPACES = Pattern.compile("\\s+");
List<String> strings = ...
List<Integer> temps = strings.stream()
    .flatMap(SPACES::splitAsStream)
    .flatMap(str -> parse(str).stream()) // explicit conversion of the option into a stream
    .map(temp -> Math.round((temp - 32) / 1.8f))
    .toList();
You could also write a variant that processes an array of strings String[] into an array
of integers int[] by using OptionalInt instead of Optional and IntStream instead of
Stream.
When higher-order functions return functions, closures are created in both Java and
Kotlin. The memoization example of Listing 12.2 can be written in either language:
Java
<A, B> Function<A, B> memo(Function<A, B> f) {
Map<A, B> store = new HashMap<>();
return x -> store.computeIfAbsent(x, f);
}
Kotlin
fun <A, B> memo(f: (A) -> B): (A) -> B {
val store = mutableMapOf<A, B>()
return { x -> store.computeIfAbsent(x, f) }
}
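In Kotlin, such a single-use wrapper can be written with a mutable flag captured in the closure (a sketch):
Kotlin
fun <A, B> single(f: (A) -> B): (A) -> B {
    var called = false
    return { x ->
        if (called) throw IllegalStateException()
        called = true
        f(x)
    }
}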
This example creates a variant of a function f that can be invoked only once before it
throws an exception—for instance, for the purpose of testing. This requires writing the
Boolean variable called from the closure. By contrast, Java does not allow closures to
reassign variables. This code is rejected by the compiler:
Java
<A, B> Function<A, B> single(Function<A, B> f) {
var called = false;
return x -> {
if (called) throw new IllegalStateException();
called = true; // rejected at compile time
return f.apply(x);
};
}
Instead, you can replace the lambda expression with an anonymous class that defines a
reassignable Boolean field:
Java
<A, B> Function<A, B> single(Function<A, B> f) {
return new Function<>() {
private boolean called = false;
public B apply(A x) {
if (called) throw new IllegalStateException();
called = true;
return f.apply(x);
}
};
}
The argument f is still captured in a closure, but it is only read and not reassigned.
You can use the same function List where you need two functions, tabulate and fill,
in Scala. The Kotlin function used in this example is equivalent to tabulate. You could
write the last line in the preceding example as
Kotlin
List(5) { _ -> Random.nextInt(1, 11) }
It is used in exactly the same way as the Scala variant, with no explicit thunk visible:
Kotlin
val time = timeOf {
computeSomething()
}
Java is more limited than Scala or Kotlin. Even its Stream class does not define a
tabulate function, and Collections.fill can only fill a list with a repetition of the
same value. You can still create the two lists used at the beginning of this section, but
in a more roundabout way:
Java
RandomGenerator random = RandomGenerator.getDefault();
List<Integer> randoms = // a possible equivalent of Kotlin's List(5) { Random.nextInt(1, 11) }
    Stream.generate(() -> random.nextInt(1, 11)).limit(5).toList();
Kotlin doesn’t use options, but you can achieve a similar behavior—lazily evaluated
alternates—with its special handling of null (see Section A.7).
Scala’s LazyList—a lazily evaluated, memoized sequence—has no direct equivalent
in the Java or Kotlin standard libraries. Java’s Stream and Kotlin’s Sequence are more
like Scala’s views and iterators; they provide delayed evaluation without memoization.
The iterator-based function from Listing 12.9 can be written in Java or in Kotlin:2
Java
int collatz(BigInteger start) {
return (int) Stream.iterate(start, n ->
n.mod(BigInteger.TWO).equals(BigInteger.ZERO) ? n.divide(BigInteger.TWO)
: n.multiply(BigInteger.valueOf(3)).add(BigInteger.ONE))
.takeWhile(n -> !n.equals(BigInteger.ONE))
.count();
}
Kotlin
fun collatz(start: BigInteger): Int =
generateSequence(start) { n ->
if (n % BigInteger.TWO == BigInteger.ZERO) n / BigInteger.TWO
else n * 3.toBigInteger() + BigInteger.ONE
}
.takeWhile { n -> n != BigInteger.ONE }
.count()
Like Scala’s iterators and views, Stream and Sequence stack and delay transformations
until the sequence is consumed, avoiding the creation of intermediate structures. The
two functions shown here do not allocate the list of numbers from start to ONE.
Finally, Kotlin—but not Java—offers a mechanism similar to Scala’s lazy for lazy
initialization of variables. Scala’s
Scala
lazy val variable: Int = someComputation()
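is equivalent to Kotlin's
Kotlin
val variable: Int by lazy { someComputation() }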
In both cases, someComputation is triggered only the first time variable is accessed,
if it is accessed at all. As in Scala, lazy is thread-safe by default, but thread-safety can
be turned off when it is not needed.
2 Both implementations could be written more simply without a takeWhile step. In Java, you could
use a predefined “iterate while” function; in Kotlin, you could rely on the fact that generateSequence
stops when its function argument returns null. The code here is written to mimic the Scala variant.
The second expression is simpler in Scala, but Kotlin’s Result type does not (yet)
implement a flatMap method.
Java only has its Optional type to offer, but at least it defines a flatMap method:
Java
List<String> someListComputation() { ... }
int compute(List<String> list) { ... }
Optional<Integer> computeOrFail(List<String> list) { ... }

// possible usage (illustrative): map applies a plain function,
// flatMap avoids a nested Optional<Optional<Integer>>
Optional<Integer> n1 = Optional.of(someListComputation()).map(list -> compute(list));
Optional<Integer> n2 = Optional.of(someListComputation()).flatMap(list -> computeOrFail(list));
Kotlin, as noted earlier, tends to use nullable types where Scala uses options. Scala's Option[String] is roughly equivalent to Kotlin's
Kotlin
val maybeString: String? = ...
Kotlin types Int and String do not contain null, but types Int? and String?
do. Therefore, maybeString could be either a string or null. The call maybeString
.uppercase() is rejected at compile time, requiring that you add a check for null in
your code. Instead of relying on if-then-else, Kotlin defines additional syntax that
makes it easier to handle null values. The operator “?.” applies a method to a non-null
reference, or returns null, but does not throw NullPointerException. It replaces the
use of map and flatMap on options in Scala.
On its Option type, Scala also defines methods like orElse and getOrElse that take
an unevaluated argument, to be used only when an option is empty:
Scala
def someStringComputation: String = ...
val maybeString: Option[String] = ...
def maybeOtherString: Option[String] = ...

val string: String = maybeString.getOrElse(someStringComputation) // argument evaluated only if the option is empty
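In Kotlin, the "elvis" operator "?:" plays the same role:
Kotlin
val string: String = maybeString ?: someStringComputation()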
The right-hand side of “?:” is evaluated only if the left-hand side is null.
A.8 Types
Types in Scala, Java, and Kotlin are similar in many ways. All three languages are
statically typed but allow for type testing and casting at runtime. They rely on interfaces
(or traits) and classes as the primary mechanism to introduce user-defined types. You
can rely on type inference to various degrees—Java does not infer the return type of
methods, for instance. Function names can be overloaded for ad hoc polymorphism,
classes and functions can be parameterized by types for parametric polymorphism, and
dynamic binding of methods implements subtype polymorphism.
Both Scala and Kotlin let you define type aliases—the same type under different
names. Java does not. Scala can define opaque types—separate types with identical
implementations:
Scala
// inside a package
opaque type Length = Double
3 If you wonder about the name, tilt your head left and look again at the symbol. Does it remind
you of someone?
Type Length is implemented as a double value but is incompatible with type Double.
Outside the defining package, you cannot call wholeMeters on a Double value.
Kotlin achieves the same type safety using inline classes (which was the approach
used in earlier versions of Scala):
Kotlin
@JvmInline
value class Length(val meters: Double)
This code successfully prints the titles of both books. Recall that a similar printTitles
function in Java could not be invoked on a List<Book> value because lists are non-
variant in Java. In Kotlin, the List type is covariant. It is defined as follows:
Kotlin
public interface List<out E> : Collection<E> { ... }
The variance annotation out plays the same role as “+” in Scala and makes the immut-
able List type covariant. As in Scala, mutable types are non-variant. Types such as
Array and MutableList are defined in this way
Kotlin
public class Array<T> { ... }
public interface MutableList<E> : List<E>, MutableCollection<E> { ... }
without an out annotation. A variance annotation in plays the role of “-” in Scala. It
is used to define contravariant types:
Kotlin
public interface Comparable<in T> { ... }
A.9 Threads
NOTE
Part II of the book illustrates concurrent programming concepts using mostly Scala and Java. Like
Scala, Kotlin targets platforms other than the JVM. The discussion in this appendix pertains to
Kotlin’s JVM incarnation.
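For instance, a minimal sketch using kotlin.concurrent.thread:
Kotlin
import kotlin.concurrent.thread

val tA = thread { println("in thread A") }                // created and started
val tB = thread(start = false) { println("in thread B") } // created, not started
tB.start()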
By default, threads are automatically started, as with thread tA in the example. How-
ever, this can be disabled—for instance, thread tB is started after creation, explicitly.
OptionalInt getRank() {
synchronized (lock) {
if (userCount < 5) {
userCount += 1;
return OptionalInt.of(userCount);
}
return OptionalInt.empty();
}
}
Kotlin
private val lock = Object()
private var userCount = 0

fun getRank(): OptionalInt = synchronized(lock) {
    if (userCount < 5) {
        userCount += 1
        OptionalInt.of(userCount)
    } else OptionalInt.empty()
}
The Scala variant of this example, which relies on an immutable set, has no exact equivalent in Java or Kotlin. In the Scala implementation, both “+=” and all are efficient: “+=” creates a new set that shares much of its data with the current set, and all simply returns a pointer to the current set.
4 If the method is static, the reflection object associated with the enclosing class is used as the lock instead of this.
By contrast, code that relies on the Java or Kotlin standard libraries requires a
full copy of the set, either when adding or when returning the entire collection. In the
following Java and Kotlin variants, for instance, you need to copy the entire set—with
the lock owned—inside method all:
Java
public class SafeSet<A> {
    private final Set<A> elements = new java.util.HashSet<>();

    public synchronized void add(A elem) {
        elements.add(elem);
    }

    public synchronized Set<A> all() {
        return new java.util.HashSet<>(elements); // full copy, with the lock owned
    }
}
Kotlin
class SafeSet<A> {
    private val elements = mutableSetOf<A>()

    @Synchronized fun add(elem: A) {
        elements.add(elem)
    }

    @Synchronized fun all(): Set<A> = elements.toSet() // full copy, with the lock owned
}
Kotlin
// DON'T DO THIS!
@Synchronized fun all(): Set<A> = elements
Because the set being returned is (or contains) a reference to the internal set, these
implementations would potentially result in unsafe interactions between a writing thread
(with the lock owned) and a reading thread (without the lock).
In contrast, if you use Kotlin’s immutable sets, you can make all efficient—but now
add requires a full copy of the entire set:
Kotlin
class SafeSet<A> {
    private var elements = setOf<A>() // an immutable set

    @Synchronized fun add(elem: A) {
        elements = elements + elem // copies the entire set
    }

    fun all(): Set<A> = elements // efficient: returns a reference to the current set
}
Scala’s variant uses an efficient “+” method on immutable sets—the sets elements and
elements + elem share data—that is not available in Kotlin’s standard library.
A.12 Thread Pools

Tasks are submitted to thread pools in much the same way in all three languages; for instance, in Kotlin:
Kotlin
exec.execute {
handleConnection(socket)
}
We saw earlier that, instead of using Java thread pools directly, Scala code typically
defines execution contexts that make it easier to create futures and apply higher-order
methods on them:
Scala
given ExecutionContext = ExecutionContext.fromExecutor(exec)
Future {
// can use context implicitly to create futures
}
In a similar way, Kotlin tends to define dispatchers that you can use to run coroutines
(coroutines were discussed in Section 27.4 and are revisited in Section A.15):
Kotlin
val dispatcher = exec.asCoroutineDispatcher()
withContext(dispatcher) {
// can use context implicitly to run coroutines
}
Conversely, you can use Scala contexts and Kotlin dispatchers as regular Java thread
pools if needed:
Scala
ExecutionContext.global.execute(() => handleConnection(socket))
Kotlin
Dispatchers.Default.asExecutor().execute {
handleConnection(socket)
}
You can use thread pools implicitly in Scala to speed up the execution of higher-order
functions on parallel collections. For instance, Listing 21.6 processes URLs in parallel
by first transforming a regular list into a parallel sequence, using par:
Scala
val urls: List[URL] = ...
val pages = urls.par.map(url => fetch(url)) // fetch: a hypothetical URL => Page function
The same mechanism exists in Java. (Kotlin has no direct equivalent in its standard
library.) Java streams can be parallel or sequential, and parallel streams implement their
higher-order functions on top of thread pools:
Java
List<URL> urls = ...;
List<Page> pages = urls.parallelStream().map(url -> fetch(url)).toList(); // fetch: hypothetical
Thread pools are used transparently in a few other places in Java. For instance, class
java.util.Arrays defines functions parallelSetAll and parallelSort that rely on
the common thread pool to initialize and sort arrays, respectively.
A.13 Synchronization
The synchronizers discussed earlier are implemented in the Java standard library and
thus are available in Java, Scala, and Kotlin. As an illustration, you can implement the
simple lock from Listing 23.5 in Java or in Kotlin using code very similar to the Scala
variant:
Java
public class SimpleLock {
    private final Semaphore semaphore = new Semaphore(1);
    volatile private Thread owner;

    // lock and unlock mirror the Kotlin variant shown next
    public void lock() throws InterruptedException {
        semaphore.acquire();
        owner = Thread.currentThread();
    }

    public void unlock() {
        if (owner != Thread.currentThread())
            throw new IllegalStateException("not the lock owner");
        owner = null;
        semaphore.release();
    }
}
Kotlin
class SimpleLock {
    private val semaphore = Semaphore(1)
    @Volatile private var owner: Thread? = null

    fun lock() {
        semaphore.acquire()
        owner = Thread.currentThread()
    }

    fun unlock() {
        if (owner != Thread.currentThread())
            throw IllegalStateException("not the lock owner")
        owner = null
        semaphore.release()
    }
}
Calls to thenApply, thenCompose, and thenAccept are non-blocking and are used to
schedule future computations as data becomes available. They correspond to Scala calls
to map, flatMap, and foreach, respectively. The thread that invokes handleConnection
does no actual processing, and you can use the listening thread of the server.
On CompletableFuture, higher-order methods exist in two flavors: with or without
the “Async” suffix. With the suffix, you can specify a thread pool on which to run the
argument function; without the suffix, the code keeps running in an existing thread.
Scala offers no such choice. (A plain thenCompose is used in the example because the
code to run is only a call to thenApplyAsync, which takes no time.)
You could write a similar server in Kotlin. However, Kotlin tends to favor an asyn-
chronous programming style that uses coroutines instead. See Listing 27.6 for a Kotlin
coroutine variant of a (simpler) server.
In this example, a pool is created with N threads, and M tasks are submitted for
execution. These tasks share a countdown latch, created with an initial count equal
to M . Each task decrements the count and then waits for the latch to open.
As long as N > M , the tasks are able to complete. However, if M > N , the program
gets stuck in a deadlock. After the latch count is decremented N times, all the worker
threads are blocked waiting for the latch to open, and no worker is available to run the
remaining decrementing tasks. The behavior would be the same in Scala or Kotlin.
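A sketch of such a program, written here in Scala (N and M are assumed constants):
Scala
import java.util.concurrent.{ CountDownLatch, Executors }

val exec = Executors.newFixedThreadPool(N)
val latch = CountDownLatch(M)

for id <- 1 to M do
  exec.execute { () =>
    latch.countDown()
    latch.await() // blocks a worker thread until the latch opens
  }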
In Kotlin, however, you can define a countdown latch that suspends coroutines with-
out blocking a thread:
Kotlin
class CountDownLatch(count: Int) {
    private val remaining = AtomicInteger(count)
    private val semaphore = Semaphore(permits = 1, acquiredPermits = 1)

    suspend fun await() {
        if (remaining.get() > 0) {
            semaphore.acquire()
            semaphore.release()
        }
    }

    fun countDown() {
        if (remaining.get() > 0 && remaining.decrementAndGet() == 0)
            semaphore.release()
    }
}
This implementation uses a semaphore. The semaphore has no permit initially, making
method await blocking. Once the latch count reaches zero, a permit is created. This
permit is then acquired by one of the tasks blocked on await and released again for the
next task, thus allowing all the blocked tasks to go through the latch.
The semaphore used in the preceding example is an instance of kotlinx.coroutines
.sync.Semaphore, and its acquire method is implemented to suspend a calling corou-
tine without blocking the corresponding thread. Therefore, await in the countdown
latch can also suspend a task without blocking a thread. You can use the latch to imple-
ment a Kotlin coroutine variant of the Java program introduced earlier in this section:
Kotlin
val exec = Executors.newFixedThreadPool(N)
val latch = CountDownLatch(M)
withContext(exec.asCoroutineDispatcher()) {
    for (id in 1..M) {
        launch {
            latch.countDown()
            latch.await()
        }
    }
}
This program does not deadlock, even when M > N. When a thread reaches await
on a closed latch, the corresponding coroutine is suspended, but the thread remains
available to run another coroutine, which will perform another countdown. Eventually,
all the countdowns are executed, and the latch opens.
Glossary
action An impure subroutine that relies on side effects to modify the state of an
application but does not return a (meaningful) value; sometimes also called a
procedure. See also pure. Discussed in Chapter 3.
algebraic data type A type that combines existing types through a combination of
alternatives (sum) and aggregation (product). Many standard types, including tu-
ples, options, lists, and trees, can be defined as algebraic data types. Discussed
in Chapter 5.
anonymous function See function literal.
argument An input to a function or method, such as a value or a type. Also called
parameter.
asynchronous, asynchronously The opposite of synchronous, synchronously.
callback 1. A piece of code registered for (single, multiple, or optional) execution.
Callbacks may run synchronously (as in Chapter 9) or asynchronously (as in
Chapter 26). 2. The action of running such code.
CAS Compare-and-set.
class In object-oriented programming, a template for the creation of objects.
concurrent Said of multiple code executions that happen at the same time (“concur-
rently”). In this book, synonymous with parallel. Discussed in Chapter 16.
curried, currying A curried function consumes its first argument (or argument
list) and returns another function that will use the remaining arguments (or argu-
ment lists). By currying, a function that uses a list of multiple arguments can be
transformed into a function that uses multiple lists of fewer arguments. See also
higher-order. Discussed in Section 9.2.
deadlock A situation in which several entities perpetually wait for each other in a cycle
due to faulty synchronization. Discussed in Sections 22.3 and 27.7.
exception A disruption in the flow of program execution, typically caused by a failure
and possibly handled by an exception handler. Exceptions are said to be thrown
(or raised) and caught (or handled). Uncaught exceptions can cause a thread to
terminate its execution.
execution stack A stack that keeps track of the current nesting of subroutines exe-
cuted by a thread: entering a subroutine adds to the stack; exiting it removes
from the stack. Also called call stack.
function 1. A mathematical abstraction that maps each value from a set to a unique
value from another set. 2. A programming language subroutine, parameterized by
zero or more arguments, that produces a value. For disambiguation, see
pure, side effect. Discussed in Chapter 2.
function literal An expression that denotes an unnamed function; also called anony-
mous function. Lambda expressions are a common syntax for function literals.
Discussed in Section 9.3.
happens-before The partial order that defines the Java Memory Model. Note that
“happens before” and “happens-before” have different meanings in this book.
Discussed in Section 22.5.
immutable That which cannot be changed. This term can apply to an object state
(immutable object) or a non-reassignable variable. See also mutable. Discussed
in Chapter 3.
infix Said of a notation in which an operator appears between its two arguments, as
in x + y. See also prefix, postfix.
lambda expression A common syntax for function literals, named after λ-calculus,
a theory of computable functions. Discussed in Section 9.3.
lazy Refers to various forms of delayed evaluation: lazy evaluation, lazy initialization,
etc. Discussed in Chapter 12.
list 1. A generic term for an ordered collection of values, which can be mutable or not,
support efficient access by indexing or not, etc. 2. A specific data structure used
in functional programming, and characterized by its immutability and head/tail
structure; sometimes referred to as functional list for disambiguation. Discussed
in Section 3.7 and Chapter 7.
mutable That which can be changed. For instance, a mutable object has a state that can
be modified; a mutable variable can be reassigned with a new value. See also
immutable. Discussed in Chapter 3.
operator A function typically of one or two arguments, often with a symbolic name,
and invoked in prefix, infix, or postfix notation.
option Used to represent a value that may or may not exist. An option either is
empty or contains exactly one value. Options are often used as the return type of
functions that do not always have a valid value to return. Discussed in Section 5.3.
parallel Said of multiple code executions that happen at the same time (“in parallel”).
In this book, synonymous with concurrent. Discussed in Chapter 16.
postfix Said of a notation in which an operator follows its arguments, as in x++. See
also infix, prefix.
prefix Said of a notation in which an operator precedes its arguments, as in ++x. See
also infix, postfix.
pure Said of a programming function whose behavior depends only on its input, and
which has no side effects. Pure functions are used to represent true (mathe-
matical) functions. See also action. Discussed in Chapter 3.
recursive Said of a function that invokes itself in its computation, one or more times,
directly or indirectly, and of a programming style that relies on such functions.
Discussed in Chapter 6.
runtime The period during which code executes, in contrast to compile time, when
code is compiled.
side effect A modification of the state of a system, intentional or not. See also
pure, action. Discussed in Chapter 3.
stack 1. A data structure that stacks elements on top of each other, typically with only
the top element accessible. Sometimes referred to as a last-in-first-out (LIFO)
queue. 2. The execution stack.
stream A sequence whose values are created over time. A stream can potentially be
endless. Discussed in Chapter 12.
switch A programming language construct used to pick among several alternatives. It
can be seen as a generalization of if-then-else, which is basically a switch with
two branches. See also pattern matching.
synchronization A generic term for mechanisms and techniques used by threads to
coordinate. Discussed in Chapters 22 and 23.
synchronous, synchronously In this book, refers to a computation that takes place
within the program flow (“now”), as opposed to an asynchronous computation that
does not interfere with the current flow of execution. Discussed in Chapter 16.
tail recursion A particular form of recursion susceptible to compiler optimizations.
Discussed in Section 6.5.
thread Short for thread of execution. A thread represents the execution (or run) of a
program. A running program contains at least one thread, but programs can also
be multithreaded. Discussed in Chapter 16.
tree A connected, acyclic graph, typically undirected. A rooted tree identifies a vertex
as the root of the tree; an ordered tree maintains an ordering among the children
of a node (e.g., left and right in a binary tree).
tuple An ordered aggregate of several values, possibly of different types. A 2-tuple is
referred to as a pair; a 3-tuple as a triple.
Index
closures, 130–134, 377. See also scoping
collections. See data structures
compare-and-set (CAS), 400–404, 414. See also atomic operations
composition
    of actions, 10–11
    of functions, 10–12
    of objects, 172, 231–232
comprehension
    for-comprehension, 35–37, 153–155, 212–213, 317, 362, 389, 407. See also higher-order functions
    list comprehension, 154–155
concat list function (:::), 59, 85–86
condition synchronizer, 343–346, 354, 364–366. See also locks
cons (::), 31. See also functional lists
consumer. See producer-consumer pattern
contains list function, 80–81
contravariance, 238–240, 251–252. See also variance
    annotation (-), 238–239
    restrictions on, 239
control abstraction, 176–179, 193
coroutines, 407, 412–413. See also async/await
count higher-order function, 138
countdown latch synchronizer, 325–328, 334, 339–340, 354
covariance, 222, 237–240, 244, 251–252. See also variance
    annotation (+), 237–239
    restrictions on, 238
currying, 5, 118–120, 134, 148. See also partial application
cyclic barrier synchronizer, 340–341, 354
D
daemon threads, 264, 313
data structures. See also functional lists; trees; streams; thread-safety
    lock-free, 401–404
    mutable versus immutable, 29–30
    parallel, 314–319, 362
    recursive, 6, 53–54, 61, 69–71
deadlocks, 259, 304–306, 325–327, 335–336, 342, 374, 382–383, 391
    debugging with thread dumps, 328–329, 336
    of tasks and not threads, 412–415
default arguments. See arguments
defensive copies, 33–34, 37, 56, 289, 291, 296
domain specific language (DSL), 179, 194. See also control abstraction
drop list function, 82–84, 89
dropWhile higher-order function, 139
DSL. See domain specific language
duck typing. See structural subtyping
dynamic binding. See subtype polymorphism
dynamic dispatch. See subtype polymorphism
E
error handling. See failure handling
exceptions, 48, 177, 195–197, 199, 201–204
exists higher-order function, 137–138, 152, 164–165
expressions, 25–28, 36
    code block as, 14
    if-then-else as, 10, 26, 28, 36
    statements versus, 25–26
    switch as, 28, 47, 60
extension methods, 13–14, 178, 247. See also methods
extractors, 59–61. See also pattern matching
F
failure handling, 6, 195–204
    of futures, 199, 374, 394–395
    using higher-order functions, 198, 200–204
filter, filterNot higher-order functions, 138, 142, 154–155
find higher-order function, 116–118, 167–168
flatMap higher-order function, 141–146, 153, 155, 211–213, 388–389, 393–394
flatten
    list function, 87
    on futures, 389
fold, foldLeft, foldRight higher-order functions, 146–148, 152, 155, 165–166
forall higher-order function, 137–138
foreach higher-order function, 140–141, 152–155
P
pairs. See tuples
parallel data structures, 314–319, 362
parallel server illustration, 309–311, 372–374, 383, 385, 390–392, 405–408, 411
parametric polymorphism, 17–20, 233, 235–244, 251. See also polymorphism
partial application, 125–126, 134, 248. See also function literals
partition higher-order function, 138–139
pattern matching, 6, 47–61
    on algebraic data types, 48–56, 60–61
    to define function literals, 122–123
    for runtime type checking, 48
    for switching, 47–48, 60
performance, 6, 29, 45, 85–86, 90–92, 94–96, 147–148, 173–174, 281, 303, 311, 316, 322, 324, 330, 335, 343, 383, 397, 414–415
pipelines
    asynchronous, 394, 411–412
    handling failures in, 201–203
    lazily evaluated, 182–183, 186–187, 201–202
poison pills, 352–353
polymorphism, 232–244, 247–252
    ad hoc, 232, 248, 251–252
    parametric, 17–20, 233, 235–244, 251
    subtype, 233–235, 249–251
procedures. See actions
producer. See producer-consumer pattern
producer-consumer pattern
    through blocking queues, 349–354
    through lazy evaluation, 180–181
product types. See algebraic data types
promises, 375–380. See also futures
pure functions, 21–23
    versus actions, 24–25
Q
queues, 86
    blocking, 311–312, 342–346, 349–354
    thread-safe, 297–306, 350–351
R
reactive streams, 411–412, 415
read-write locks, 338–339, 353–354
recursive data structures, 53, 69–71, 77. See also functional lists; trees
recursive functions, 54
    designing, 67–69, 77, 80
    as equalities, 79–88, 96
    with tail recursion, 71–77, 81–82, 94–96, 205–209
    termination, 67–69, 93
    versus loops, 6, 63–65, 74, 77, 82, 95, 133
reduce higher-order function, 147
referential transparency, 27–28. See also functional variables
reverse list function, 90–91, 96
S
SAM. See single-abstract-method interfaces
scatter-gather. See fork-join pattern
scoping, 126–129. See also closures
    static (lexical) vs dynamic (late binding), 126
semaphore synchronizer, 341–343, 354, 407
shadowing of variables, 126. See also scoping
Shevchenko, Ruslan, 406
side effects, 21–23
    due to assignments, 27–30
single-abstract-method interfaces (SAM), 124–125, 134
    implemented as lambda expressions, 124–125
sleeping for synchronization, 322, 324, 335. See also busy-waiting
sortBy, sortWith higher-order functions, 149–150, 156
sorting illustration, 92–94, 139, 381–383, 389
span higher-order function, 139
splitAt list function, 89
statements versus expressions, 25–27. See also side effects
streams, 7, 180–184. See also lazy evaluation
    as infinite data structures, 184, 194
    as lazy data structures, 180–182, 189, 192–194
    for pipelines, 182–183, 194
string interpolation, 60
structural subtyping, 230, 245–248, 251. See also subtyping
subset-sum illustration, 191–193
V
val/var keywords. See functional variables
varargs. See repeated arguments
variable-length arguments. See arguments
variance, 235–240. See also types
views. See lazy evaluation
virtual methods. See subtype polymorphism
volatile variables. See Java memory model
Z
zip list function, 87–88
zipper illustration, 56–59
    on trees, 169–172
zipWith higher-order function, 388–389, 411