Functional Programming Ideas for the Curious Kotliner
Alejandro Serrano Mena
Cover design:
Blanca Vielva Gómez (@BlancaVielva)
Elena Vielva Gómez (@ElenaVielva)
Technical reviewers:
Andrei Bechet (@goosebumps4)
Oliver Eisenbarth (@alfhir80)
Pedro Félix (@pmhsfelix)
Garth Gilmour (@GarthGilmour)
Kasper Janssens (@JanssensKasper)
Raúl Raja (@raulraja)
Ron Spannagel
Dan Wallach (@danwallach)
Jörg Winter (@jwin)
1 Introduction
1.1 The DEDE principles
1.2 Overview of the book
I Everyday techniques
3 Containers
3.1 Mapping, filtering, and folding
3.1.1 The same, but faster
3.1.2 Effectful mappings
3.1.3 Fold is a generalized fold
3.2 Functors
3.2.1 Deep recursive functions
3.3 Custom control structures
4.2.1 Semi-structured data
4.2.2 Hierarchy of optics
4.3 Building a DSL for copies
10.2 Testing binary trees
10.2.1 Custom generators
10.2.2 Laws
10.3 Testing services
1 Introduction
• it makes your code auto-magically better (and maybe saves a cat on its
way home), or
• it’s impossible to understand (or should I even say “decipher”?), and only
for those with 10 PhDs in abstract categorical programming.
As usual, truth lies in the middle. One of the main goals of FP is to be
declarative – roughly, focusing on “what to achieve” rather than “how to
achieve” – and this tends to produce code which is shorter and more
understandable. Familiarity with functional idioms is a plus when working on
such codebases, but the main concepts are easily grasped by developers. Think
of the map function, which applies a function to each element in a list, often
considered a prime example of functional style. Yet Kotliners use it on a
daily basis, many without even knowing they are doing FP!
Functional programming is considered one of the four “main” programming
paradigms, alongside imperative, object-oriented, and logic programming. The
time of purity in programming languages is long gone, and it’s pretty common
to see languages that mix the best of several paradigms. Kotlin is no
exception, taking the good parts from Java (its main predecessor), adding a
bit of functional salt, and a bit of its own pepper. In fact, this book exists
because Kotlin is a great vehicle for functional idioms.
[1] FP, from now on.
One of the stumbling blocks when approaching FP is that a great deal of
the literature fixates on first principles,[2] leaving aside the question of how
functional programming benefits the code the developer is actually writing.
To avoid this trap, this book approaches functional programming from two
complementary points of view:
At the end of the day, our goal is to write fewer lines and more correct code.
FP is definitely the hammer you were missing in your toolbox all this time.
Since catchy acronyms are much easier to remember than long lists –
programming has its own tradition, see the SOLID and GRASP principles or the
ACID guarantees – we are going to give one to our FP principles: DEDE. DEDE
stands for “domain, explicit, data, and effects.”
Although one can follow these principles in any programming language,
doing so is simpler in some languages than in others. Several features in
Kotlin are enablers for FP, like higher-order functions or when expressions.
In turn, by using those features we can more easily apply some functional
idioms or techniques, like functional validation.
Note that the way in which these principles are made concrete differs by
language. For example, the “explicitness” rule in Kotlin translates to very
informative types, whereas dynamic languages such as Racket introduce
(run-time checked) contracts. In both cases, the developer has made explicit
what would otherwise remain implicit.
[2] The interested reader should search for “λ-calculus” or “lambda calculus”.
[3] The concept of FP is not set in stone; people still debate what FP amounts to. This list
summarizes what the author believes to be the most useful ideas stemming from the community.
Developers should share the language used by other stakeholders,[4] and not
the other way around. This powerful idea is the basis of Domain-Driven Design
(DDD), one of the most popular software design methodologies.[5] FP provides
enough tools to make that dream a reality. It’s fairly common in the FP arena to
create small sub-languages for each particular domain, giving rise to
Domain-Specific Languages (DSLs for short).
Take Ktor, a popular HTTP library for Kotlin. When you use Ktor to write
a web server, you describe your server using a vocabulary taken from that
domain: routes, sessions, and so on.
routing {
    get("/") {
        call.respondText("Hello, world!")
    }
}
[4] The stakeholders may be the developers themselves, as in a language used to describe web servers.
[5] Domain Modeling Made Functional by Scott Wlaschin is a great introduction to DDD from the
perspective of FP.
context(DatabaseService)
fun User.save(): Result<UserId> { ... }
the second one is more explicit: the Result wrapping UserId indicates that
the operation may fail, and the context declares the services required by
the function.[6] This principle goes hand-in-hand with the previous one, since
a good domain language is one in which our function signatures sharply
describe what the corresponding body implements.
Apart from its rich type language, the combination of data classes to form
a sealed hierarchy is another powerful enabler of this principle within Kotlin.
An important part of our journey in this book is realizing how many other basic
building blocks of programming, not only validation and errors, can be turned
into simple data manipulation. Or, simply put, how when can become the
absolute king/queen[7] of your code.
Looping in Kotlin provides another example of this principle. Instead of the
for statement with an initialization, update, and stop condition triple from
the C tradition, developers are asked to create some iterable including all
the values – a piece of data – which is then manipulated. The language
designers could have even gone further and scrapped for completely, since
higher-order functions allow us to write,
[6] context declares a context receiver, a Kotlin feature available since version 1.6.20. We describe
the usage of this extension when discussing Services and dependencies.
[7] Whether when is a masculine or feminine control structure is left to the reader’s discretion.
(1..3).forEach { /* do something */ }
The interesting point here is that the construction of the list (1..3) can be
defined as a regular (recursive) function. If, instead of incrementing by one
at each step, we now want to increment by 2, we just write a different generation
function, and keep the forEach part which manipulates the data untouched.
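As a minimal sketch of that separation, the hypothetical helper rangeBy below (not part of the standard library, and assuming a positive step) generates the data recursively, while the consumer built on forEach stays the same for both step sizes. In everyday code the built-in 1..3 and 1..5 step 2 cover these cases.

```kotlin
// Hypothetical generator: builds [from, from + step, ...] while values are <= to.
// Assumes step > 0; written recursively to mirror the text.
fun rangeBy(from: Int, to: Int, step: Int): List<Int> =
    if (from > to) emptyList()
    else listOf(from) + rangeBy(from + step, to, step)

// The consumer does not care how the data was generated.
fun printAll(values: List<Int>) = values.forEach { println("value $it") }

val byOne = rangeBy(1, 3, 1)   // same elements as (1..3)
val byTwo = rangeBy(1, 5, 2)   // same elements as (1..5 step 2)
```

Calling printAll(byOne) and printAll(byTwo) reuses the data manipulation untouched; only the generation function changed.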
Functions have an explicit behavior in that they take some arguments and
return some resulting value. On top of that, some functions have additional
behavior; for example, when you call print(1) the effect of writing to the
console is part of the usefulness of this function. Once again, FP tries to
make that information explicit. In this case, though, we don’t stop there: we
also try to compartmentalize the different kinds of effects happening in your
application, and ensure that only the minimal set required is present as
arguments.
Context receivers serve this need incredibly well within Kotlin code. Reading
the following piece of code,
context(LoggingService, DatabaseService)
fun User.save(): Result<UserId> { ... }
gives us a very precise idea of what the function is doing in its body. Note
that caring about effects is closely related to the goal of reducing the coupling
between components and explicitly declaring our dependencies.
There’s one effect which is nowadays recognized as potentially problematic:
mutability. In Kotlin we try to use val instead of var as much as possible.
The focus on immutability is also linked to the previous principle: control
flow which depends on data mutated elsewhere is harder to track, since we
don’t have a direct caller-callee dependency, but an ordering dependency
between elements of the program. If there’s one simple rule to follow in FP
style, it is to always program with val.
1.2 Overview of the book
The book is divided into two parts. The first one, Everyday techniques, introduces
“older” techniques, which have stood the test of time in other languages and
communities, and for which Kotlin provides good support. The second part,
Advanced techniques, discusses a few techniques on the near horizon, some even
outside the strict coding process, like modeling. The ideas in the second part
are described more independently from each other, whereas the ideas in the
first part are in many ways deeply interrelated.
We’ve already shown some snippets of Kotlin code. To follow this book
you need at least version 1.6.20 of the Kotlin toolchain,[8] but you can
otherwise follow it in a JVM, Native, or Multiplatform project (or any other
target we cannot foresee). There’s also a chapter on how to tackle FP style
from modern Java: newer versions of the language include many of the features
pioneered by Scala, Kotlin, and others.
We often refer to Arrow in the coming pages. Arrow is a set of libraries
whose goals align very much with the principles outlined here (disclaimer: the
author has worked on several of those packages). In particular, Arrow Core
extends Kotlin’s standard library with new types and abstractions brought from
the FP community. Its website, arrow-kt.io, has detailed instructions on how
to add Arrow to your project, although in most cases it’s as easy as adding an
io.arrow-kt:arrow-core dependency to your Gradle file.
Concrete code examples require concrete domains, and this section introduces
the one used throughout this book. Trading card games (TCGs) form an
interesting mix between the player and collector mindsets: on the one hand, you
can build decks out of a huge pool of cards, in many cases with new ones
coming every few months. On the other hand, you usually don’t buy exactly the
cards you want; rather you buy boosters with a random allocation of cards.
Some cards occur less often than others, making them a target for collectors.
Examples of popular trading card games are Magic: The Gathering™,
Pokémon™, and Yu-Gi-Oh™.
TCGs usually have complicated sets of rules, and cards may produce
virtually any effect on the game. We’ll be focusing on a simplified TCG instead,
to avoid both the complications of an overly complex domain and being sued
for copyright infringement. Our first goal is to model the cards, just one turn
of a page away.
[8] That is the version that introduced context receivers.
Part I
Everyday techniques
2 Our domain language
In this chapter, we are going to look at how to model a domain using FP id-
ioms, and how it differs from object-oriented modeling. After reading this
chapter, you’ll become acquainted with the usage of sealed hierarchies of
data classes, writing functions by pattern matching, and how immutability
changes the way we deal with data.
Without further ado, here’s an example of a monster card featuring the Loch
Ness monster, an all-time favorite for old and young alike.
┌────────────────────────┐
│ Loch Ness Monster │
├────────────────────────┤
│ Body: 100 points │
│ │
│ Attacks: │
│ [*] Roar 10 │
│ [WW] Tsunami 50 │
│ │
│ ID: A─04 │
└────────────────────────┘
Monster cards are one type of card found in our TCG; more will be introduced
later in the chapter. The basic elements of a monster card are its identifier, its
name, its body points (how much attack power it can “take”), and a list of
attacks. We model this by using a bare data class,
Attacks can be modeled in a similar way, each of them being defined by its
name, a power cost, and some amount of damage. We’ll define what Power is
in the next section.
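The two data classes just described can be sketched as follows. The field names and their order for MonsterCard are taken from the constructor calls used later in this chapter; the fields of Attack, the placeholder Power enumeration, and the reading of [WW] as two water powers are assumptions made so that the sketch compiles on its own.

```kotlin
// Placeholder so the sketch compiles on its own; the chapter refines it later.
enum class Power { WATER, FIRE, AIR, GROUND }

// An attack is defined by its name, a power cost, and some amount of damage.
data class Attack(val name: String, val cost: List<Power>, val damage: Int)

// Field order mirrors the constructor calls in this chapter: id, name, body, attacks.
data class MonsterCard(
    val id: String,
    val name: String,
    val body: Int,
    val attacks: List<Attack>
)

// The Tsunami attack of the Loch Ness Monster; Roar is left out because its
// [*] cost needs the wildcard introduced later in the chapter.
val lochNess = MonsterCard(
    id = "A-04",
    name = "Loch Ness Monster",
    body = 100,
    attacks = listOf(Attack("Tsunami", listOf(Power.WATER, Power.WATER), 50))
)
```

Note that neither class has any behavior of its own; equality, hashing, and printing come for free from the data modifier.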
The key point to notice here is that even though we think of these as one
piece of our domain model, they are implemented as mere repositories or
containers of data. This is in sharp contrast to object-oriented programming,
where classes also include behavior, often mutating or accessing some private
data. In fact, the OOP community refers to this kind of class as anemic;
the name already hints at the fact that they are not held in high esteem. However,
once we drop mutability, richer objects have less reason to exist.
A common fear at this point is losing the ability to think of a function as
“belonging” to a class, and thus to use the . and ?. operators. After all,
monster.maxAttackWithPower(powers)?.name
reads much better than its counterpart where all arguments are defined as
“regular” arguments,
maxAttackWithPower(monster, powers)?.name
Fortunately, Kotlin separates the ability to use . from the requirement of
functions being defined within the class, using extension functions. The declaration
of maxAttackWithPower may appear anywhere, not only inside MonsterCard.
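A sketch of such an extension function could look as follows. The body is an assumption for illustration only: here the cost of an attack is simplified to a plain number, and maxAttackWithPower returns the most damaging attack that can be paid for, or null if there is none.

```kotlin
// Simplified model for this sketch: cost is a plain number of power cards.
data class Attack(val name: String, val cost: Int, val damage: Int)
data class MonsterCard(
    val id: String, val name: String, val body: Int, val attacks: List<Attack>
)

// Declared outside MonsterCard, yet callable with dot syntax.
// Assumed semantics: the most damaging attack whose cost fits in `powers`.
fun MonsterCard.maxAttackWithPower(powers: Int): Attack? =
    attacks.filter { it.cost <= powers }.maxByOrNull { it.damage }

val monster = MonsterCard(
    "A-04", "Loch Ness Monster", 100,
    listOf(Attack("Roar", 1, 10), Attack("Tsunami", 2, 50))
)
val bestWithOne = monster.maxAttackWithPower(1)?.name
val bestWithNone = monster.maxAttackWithPower(0)?.name
```

Since the result may be null, the ?. operator is what lets us keep chaining without an explicit check.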
Alas, these classes don’t yet describe the domain as sharply as they could.
We can define a monster so weak that its body points are negative!
val weakMonster =
    MonsterCard("BOOH!", "Weak Monster", -10, emptyList())
Following our principle of explicitness, we want to make very clear in the code
that we don’t expect such negative values in the body field! The right thing to
do is to introduce yet another class, using an initialization block to check that
the positivity constraint is satisfied.
Many coders seem to develop a fear of introducing too many types. But re-
member, we have “domain” and “explicitness” as guiding principles, and as a
consequence:
We can hear you mumbling, though: won’t this result in a performance loss?
Fortunately, Kotlin has our back once again with value classes. If we define
the aforementioned class as a value class, and add the @JvmInline
annotation,[1]
@JvmInline
value class Points(val points: Int) {
    init { require(points > 0) }
}
the compiler takes care of erasing Points from the generated code wherever it
can, leaving us with super-fast integers. We can even overload the + operator to
keep the same syntax we had for integers. This is a great example of how a
compiler can enforce an invariant without developers having to suffer
inconveniences in return. If you want to go a step further, Arrow Analysis[2] is
able to turn that require call from a run-time to a compile-time check.
[1] kotlinlang.org/docs/inline-classes.html contains more information about inline
classes.
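The operator overloading mentioned above can be sketched as follows, assuming a JVM target (where @JvmInline applies). The error message and the total and rejected values are only there for illustration.

```kotlin
@JvmInline
value class Points(val points: Int) {
    // The invariant is checked every time a Points value is created.
    init { require(points > 0) { "points must be positive, got $points" } }

    // Keeps the familiar + syntax, and re-checks the invariant on the result.
    operator fun plus(other: Points): Points = Points(points + other.points)
}

val total = Points(100) + Points(50)

// require throws IllegalArgumentException, captured here as a failed Result.
val rejected = runCatching { Points(-10) }
```

With this definition the weak monster from before can no longer be built: the constructor call itself fails.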
2.1.1 Immutability
We have been very careful to always introduce the fields in our classes using
val, which means that they are immutable. Let me stress again that we want
to leave out mutability in our search for better control of effects, which in turn
removes the (time) coupling between components.
Imagine that we want to write a small function which doubles the amount
of body points in a monster card. Since cards are immutable, we need to
produce a new MonsterCard value,
fun MonsterCard.duplicateBody() =
    MonsterCard(id, name, body * 2, attacks)
That code is not great, though. First of all, we had to repeat the name of all
the fields, even though we were changing only one. Second, it’s not very main-
tainable, as any change in MonsterCard (for example, a new field pointing
to a picture) requires changing duplicateBody, even though the change has
nothing to do with it.
Here comes a nicety of data classes: the copy method which is auto-
matically generated. Using it, we can be explicit about the changes, without
the need for repeating other fields.
[2] arrow-kt.io/docs/analysis
fun MonsterCard.duplicateBody() = this.copy(body = body * 2)
Unfortunately, this is not the end of the game. Modifying nested values quickly
becomes cumbersome, as happens when we want to duplicate attacks.
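To get a feeling for the problem, here is a sketch under the assumption that “duplicating attacks” means doubling the damage of every attack; Attack is reduced to two fields and duplicateAttacks is a hypothetical name.

```kotlin
data class Attack(val name: String, val damage: Int)
data class MonsterCard(
    val id: String, val name: String, val body: Int, val attacks: List<Attack>
)

// One copy per level of nesting: the card, and then each attack inside the list.
fun MonsterCard.duplicateAttacks(): MonsterCard =
    copy(attacks = attacks.map { attack -> attack.copy(damage = attack.damage * 2) })

val original = MonsterCard("A-04", "Loch Ness Monster", 100, listOf(Attack("Roar", 10)))
val stronger = original.duplicateAttacks()
```

Two levels are still manageable, but each additional level of nesting adds another copy call wrapped around the previous one.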
Power cards are the other type of card in our game. These are simply identified
by their power type, which could be water, fire, air, or ground. To accommodate
this new fact we define a new common parent interface which also includes
MonsterCard.
Since there are only two types of cards in our domain, namely monster and
power cards, we mark the interface as sealed. Note that we have also split
the notion of “power card” from that of “power type”, since those concepts are
different in the domain; for example, attacks mention PowerType.
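A sketch of the resulting hierarchy could look as follows. The names Card, MonsterCard, PowerCard, and PowerType come from the text; the fields of Attack are assumptions.

```kotlin
// The four power types named in the text.
enum class PowerType { WATER, FIRE, AIR, GROUND }

data class Attack(val name: String, val cost: List<PowerType>, val damage: Int)

// Sealed: every implementation of Card is known at compile time.
sealed interface Card

data class MonsterCard(
    val id: String, val name: String, val body: Int, val attacks: List<Attack>
) : Card

// A power card is identified just by its power type.
data class PowerCard(val type: PowerType) : Card

val hand: List<Card> = listOf(
    MonsterCard("A-04", "Loch Ness Monster", 100, emptyList()),
    PowerCard(PowerType.WATER)
)
```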
[3] This is a great example of why defaults actually matter. Simpler syntax for mutability steers
programmers in that direction.
This pattern of having one sealed interface with a few immutable data
classes inheriting from it is the main form of modeling in FP style. In other
communities this pattern is called Algebraic Data Types (ADTs). Generally, us-
ing ADTs means that you model each type in your domain as:
This is again a common pattern in FP code, and it sometimes goes by the name
of pattern matching. In other languages, pattern matching is more powerful –
it can also “extract” data from the fields – but the general rule still applies: if
you define data as ADTs, you need a (preferably simple) way to detect which
choice you’ve taken.
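In Kotlin that detection is done with when and is. The sketch below uses a trimmed-down version of the card hierarchy, and the function displayName is a hypothetical example rather than something defined in the text.

```kotlin
enum class PowerType { WATER, FIRE, AIR, GROUND }

sealed interface Card
data class MonsterCard(val id: String, val name: String) : Card
data class PowerCard(val type: PowerType) : Card

// One branch per choice; inside each branch `this` is smart-cast,
// so the fields of that particular choice are directly available.
fun Card.displayName(): String = when (this) {
    is MonsterCard -> name
    is PowerCard -> "$type power"
}

val monsterLabel = MonsterCard("A-04", "Loch Ness Monster").displayName()
val powerLabel = PowerCard(PowerType.FIRE).displayName()
```

Because Card is sealed, the compiler accepts this when without an else branch, and complains if a new kind of card is added later without a matching branch.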
You may have noticed that the first attack of the Loch Ness Monster card
at the beginning of the chapter shows a * symbol. In our game this means
that such a power requirement can be met by any power card, regardless of
its type. A first go at accommodating this fact is extending the enumeration,
sealed interface PowerType
Now it becomes possible to distinguish the needs from Attack from the needs
of PowerCard,
When using sealed hierarchies, you should not limit yourself to one level.[4]
Here we are using two levels; the first one separates Asterisk from the rest
of the power types, and then ActualPowerType splits into four choices. Note
that the use of enum is just circumstantial in this example; we could have also
defined it using a hierarchy of objects.
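The two-level hierarchy can be sketched as below. PowerType, Asterisk, and ActualPowerType are the names used in the text; the fields of Attack and the helper isPaidBy are assumptions for illustration.

```kotlin
// First level: either the wildcard or an actual power type.
sealed interface PowerType
object Asterisk : PowerType

// Second level: the four concrete choices.
enum class ActualPowerType : PowerType { WATER, FIRE, AIR, GROUND }

// Attacks may ask for any PowerType, including the wildcard...
data class Attack(val name: String, val cost: List<PowerType>, val damage: Int)

// ...whereas a power card always has an actual type.
data class PowerCard(val type: ActualPowerType)

// Hypothetical helper: can this requirement be met by the given card?
fun PowerType.isPaidBy(card: PowerCard): Boolean = when (this) {
    is Asterisk -> true
    is ActualPowerType -> this == card.type
}
```

The types now rule out a power card of type *, something the extended enumeration could not prevent.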
As a final example for this chapter, let’s define a function to check whether
there are enough power cards in a set of cards to “pay” for the cost of an attack.
The reason this is not a straight comparison is that * can match with any power
card. In any case, when doing so we are not interested in monster cards, so
we can just filter those out.
We do not want a yes/no answer, but rather a list of those power types that
we are missing. Our first approach at modeling this fact is returning a nullable
type. This works, but we can do even better with respect to explicitness if we
introduce our own result type.
This may seem like going too far (and as usual with small examples, maybe it
is). Introducing a new result type alleviates the Boolean blindness problem, in
which you make your output binary – in this case, whether you have missing
power or not – just because you want to reuse the Boolean or null in your
programming language. Very often, though, the domain or the requirement
changes, leading to a third option which is hard to add.
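One possible shape for such a result type is sketched below. The names PowerCheck, EnoughPower, MissingPower, and checkPower are hypothetical, and so is the matching strategy: typed requirements are paid first, and each * then takes any leftover card. The function uses local mutation for brevity.

```kotlin
sealed interface PowerType
object Asterisk : PowerType
enum class ActualPowerType : PowerType { WATER, FIRE, AIR, GROUND }

// Hypothetical result type: it says what happened instead of returning null.
sealed interface PowerCheck
object EnoughPower : PowerCheck
data class MissingPower(val missing: List<PowerType>) : PowerCheck

fun checkPower(cost: List<PowerType>, available: List<ActualPowerType>): PowerCheck {
    val pool = available.toMutableList()
    val missing = mutableListOf<PowerType>()
    // Pay typed requirements first, so wildcards only consume leftovers.
    for (need in cost.filterIsInstance<ActualPowerType>()) {
        if (!pool.remove(need)) missing.add(need)
    }
    // Each * can be paid by any remaining power card.
    repeat(cost.count { it is Asterisk }) {
        if (pool.isEmpty()) missing.add(Asterisk) else pool.removeAt(pool.lastIndex)
    }
    return if (missing.isEmpty()) EnoughPower else MissingPower(missing)
}
```

If the requirements change, say to report power cards that are left unused, we add a third implementation of PowerCheck and the compiler points to every when that needs updating.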
In the context of Algebraic Data Types (ADTs), one often talks about product
and sum (or union, or coproduct) types. Those names refer back to counting
how many elements live in a particular type. That problem isn’t very relevant
in the context of programming,[5] but the names have stuck.
Let’s begin with a simple enumeration, like the built-in Boolean or PowerType
defined above. In those cases counting how many different elements of those
types exist is as simple as counting the possible values of the enumeration.
In the case of Boolean we have 2 – true and false – and in the case of
PowerType we have 5 – Asterisk and the four options for ActualPowerType.
Note that we are considering only “proper” values; the following is not taken
as part of the elements of Boolean.
Let’s think about how many possible values we have for the following type,
We have 5 choices for the type field, times 2 possible choices for lastsOneTurn,
so 10 in total. In general, to count how many elements we could have for a data
class we multiply the number of choices of every field. This is the reason we
also refer to them as product types.
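We can let the computer do the counting. In the sketch below the class name ExtendedPower is an assumption (only the field names type and lastsOneTurn appear in the text); the code enumerates every combination of the two fields.

```kotlin
sealed interface PowerType
object Asterisk : PowerType
enum class ActualPowerType : PowerType { WATER, FIRE, AIR, GROUND }

// Assumed name for the type under discussion.
data class ExtendedPower(val type: PowerType, val lastsOneTurn: Boolean)

// 1 wildcard + 4 actual power types = 5 elements.
val allPowerTypes: List<PowerType> =
    listOf<PowerType>(Asterisk) + ActualPowerType.values().toList()

// Every choice for `type` combined with every choice for `lastsOneTurn`.
val allExtendedPowers: List<ExtendedPower> =
    allPowerTypes.flatMap { type ->
        listOf(true, false).map { lastsOneTurn -> ExtendedPower(type, lastsOneTurn) }
    }
```

The size of allExtendedPowers is the product of the sizes of the two lists we combined, 5 × 2.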
Let’s refine our domain of PowerCards by introducing the regular cards
we’ve described in the previous section along with the new extended cards. In
summary, PowerCard is now defined as,
two sets. Disjointness refers to the fact that we remember which set values
came from, so we completely separate them. This is not always obvious; imag-
ine a type defined as follows.
Even though both OneWay and OrAnother have a single Boolean field, the
type system doesn’t confuse one with the other. We have exactly 4 possible
values of type OneWayOrAnother, 2 coming from each choice. If we want to be
a bit pedantic, we say that class declarations in Kotlin are generative: every
definition introduces a new type distinct from any other (as far as the
type system is concerned).
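A sketch of such a type, where the field name flag is an assumption, together with the complete list of its values:

```kotlin
sealed interface OneWayOrAnother
data class OneWay(val flag: Boolean) : OneWayOrAnother
data class OrAnother(val flag: Boolean) : OneWayOrAnother

// 2 values from OneWay plus 2 values from OrAnother.
val allValues: List<OneWayOrAnother> = listOf(
    OneWay(true), OneWay(false),
    OrAnother(true), OrAnother(false)
)
```

Even though OneWay(true) and OrAnother(true) wrap the same Boolean, they are different values; this is why the count is a sum, 2 + 2.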
Other programming languages support union types directly, in contrast to
Kotlin, where a hierarchy is used to model it. For example, the following is
valid TypeScript; note the number | string in the signature.
interface CardVisitor<R> {
    fun visitMonster(id: String, name: String, ...): R
    fun visitPower(type: ActualPowerType): R
}
From the perspective of the visitor the hierarchy is in fact sealed, even if the
language doesn’t support that notion. The available choices for Card are de-
fined in CardVisitor. We’ll shortly explore this idea in the Expression prob-
lem section.
We can now swap the definition of printName above, where we used when,
with one using the visitor. The body for each choice now lives in different meth-
ods. We’re using Kotlin’s ability to create an implementation of an interface
on the spot.
class needs to override. Even if the original Card didn’t have a visitor func-
tion, we can implement one without changing the body of MonsterCard and
PowerCard at all.
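The following sketch puts those pieces together for a trimmed-down hierarchy in which MonsterCard only has an identifier and a name. The visit function is defined outside the classes using when, and displayName (a hypothetical stand-in for printName which returns the text instead of printing it) creates the visitor on the spot.

```kotlin
enum class ActualPowerType { WATER, FIRE, AIR, GROUND }

sealed interface Card
data class MonsterCard(val id: String, val name: String) : Card
data class PowerCard(val type: ActualPowerType) : Card

interface CardVisitor<R> {
    fun visitMonster(id: String, name: String): R
    fun visitPower(type: ActualPowerType): R
}

// Defined outside the hierarchy: `when` dispatches to the right method.
fun <R> Card.visit(visitor: CardVisitor<R>): R = when (this) {
    is MonsterCard -> visitor.visitMonster(id, name)
    is PowerCard -> visitor.visitPower(type)
}

// The body for each choice lives in a different method of an anonymous object.
fun Card.displayName(): String = visit(object : CardVisitor<String> {
    override fun visitMonster(id: String, name: String): String = name
    override fun visitPower(type: ActualPowerType): String = "$type power"
})
```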
The possibility of defining a visitor interface and a visit method using when
is by no means restricted to Card. We can follow such a pattern for every type
defined as a sealed hierarchy; the result is called a (generalized) fold, or if you
prefer Greek-inspired words, a catamorphism. The only difference is that when
applied in FP-land, one usually introduces the methods directly as arguments
instead of bundling them in an interface.
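As a sketch of that last remark, here is a fold for a trimmed-down Card in which each visitor method becomes a function argument; the parameter names monster and power are our own choice.

```kotlin
enum class ActualPowerType { WATER, FIRE, AIR, GROUND }

sealed interface Card
data class MonsterCard(val id: String, val name: String) : Card
data class PowerCard(val type: ActualPowerType) : Card

// One function argument per choice in the sealed hierarchy.
fun <R> Card.fold(
    monster: (id: String, name: String) -> R,
    power: (type: ActualPowerType) -> R
): R = when (this) {
    is MonsterCard -> monster(id, name)
    is PowerCard -> power(type)
}

val card: Card = PowerCard(ActualPowerType.GROUND)
val label = card.fold(
    monster = { _, name -> name },
    power = { type -> "$type power" }
)
```

Named arguments keep the call site readable, and play the role that method names had in the visitor interface.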
The API provided by visit and fold is in fact the same; one can implement
the former in terms of the latter, and vice versa. This relation stretches even
further; in the same way that for every ADT a fold function can be defined,
one can get rid of ADTs altogether and define types by means of higher-order
functions in the shape of a fold. Unfortunately, to use that technique the type
system needs to feature some constructions not (currently) available in Kotlin.
You might still be wondering, though, what is the relation between this
fold and the fold method defined on Iterable; the next chapter provides
the answer…
We began the chapter explaining that data classes are the preferred way of
modeling in FP style; this leads to anemic models in OOP jargon. And we
don’t stop there: the insistence on sealed hierarchies seems to go against
the open-closed principle, which states:
There’s a lot to gain by taking the ADTs route. One visible improvement
is the exhaustiveness check: for every when statement, the compiler checks
whether we have covered every choice, or forces us to write an else branch.
One less visible benefit is that the control flow of our program gets more linear,
since there are no methods which could have been overridden in a subclass and
which subvert some of the invariants.[8]
This does not mean that modeling really open hierarchies is impossible.
To begin with, Kotlin allows you to mark a class as open, giving access to the
OOP style of extensibility. Restricting ourselves to (sealed) ADTs, we can use a
function in a field to provide an extension point.
object Cat: Animal
data class Other(
    val name: String, val sound: () -> String
): Animal
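To see the extension point at work, here is a self-contained sketch. Cat and Other follow the declarations above; the Dog case, the concrete sounds, and the makeSound function are assumptions added for illustration.

```kotlin
sealed interface Animal
object Dog : Animal   // assumed case; the declarations above only show Cat and Other
object Cat : Animal
data class Other(val name: String, val sound: () -> String) : Animal

// The hierarchy stays sealed: there are exactly three branches to cover...
fun Animal.makeSound(): String = when (this) {
    is Dog -> "Woof"
    is Cat -> "Meow"
    // ...but the last one delegates to whatever function the value carries.
    is Other -> this.sound()
}

// New animals are new values of Other, not new subclasses.
val cow: Animal = Other("Cow") { "Moo" }
```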
Modeling data using anemic classes and sealed hierarchies is a big depar-
ture from mainstream OOP, and takes time to master. Never forget our end
goal: being close to the actual domain, and being explicit about the shape of
the data.
In fact, the Visitor pattern discussed above reverses the extensibility
properties in OOP. Since the visitor has one method per possible case,
adding a new case requires modifying that interface. That implies, in turn,
that every implementor of the interface must be updated to support the new
method. We are back in the FP-style square.
Since the Expression problem was first discussed in the 1970s, many so-
lutions have been proposed. Extension methods in Kotlin are a way to add
methods to existing classes which are outside of our control, adding a bit
of FP-style to an OOP-based model. Multimethods in Clojure allow refining
an existing method by different means, adding a bit of OOP to an FP-based
model. Tagless final, a popular technique in the Scala community, provides
yet another solution by using the powerful type system in combination with
interfaces.
3 Containers
Some parts of what we classically know as FP have permeated the whole
developer experience. The most successful is definitely the way working with
containers has shifted from an iterator-based API to one based on higher-order
functions. This is, in fact, how kotlin.collections looks. In this chapter we
look at that interface through the lens of the DEDE principles, and learn to
recognize a few patterns commonly used in FP-style programming.
listOf(1, 2, 3)
mapOf(1 to "a", 2 to "b")
the database – although by the end of this book you’ll know that most of the
time you should use parMap with a suspended function instead.
The map function is an example of a higher-order function, that is, a func-
tion which takes another function as argument. However, most simple usages
of map don’t feel that way, since there’s not even a parenthesis. For example,
this builds a list by incrementing each value of [1, 2, 3] by 1.
Kotlin plays a couple of tricks here. First of all, instead of defining a function
to give as argument, you can write the function inline by giving the body inside
curly braces – these are often called anonymous functions or lambdas.[1] On
top of that, if the lambda is the last argument, you can drop the parentheses;
these are called trailing lambdas.
Compare with the version in which the increment function is defined separately.
There we need the parentheses, and also :: to create a function reference from
the name of the function. References are rarely seen; many people would still
write a lambda, even for simply calling the function.
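The different spellings discussed so far can be put side by side. This is a sketch using only the standard library; the names of the values are ours.

```kotlin
// A separately defined function, to be passed by reference.
fun increment(x: Int): Int = x + 1

// Lambda written inline as a trailing lambda: no parentheses needed.
val withTrailingLambda = listOf(1, 2, 3).map { x -> x + 1 }

// Same lambda, using the implicit name `it` for its single argument.
val withImplicitIt = listOf(1, 2, 3).map { it + 1 }

// Function reference: parentheses and :: are both required.
val withReference = listOf(1, 2, 3).map(::increment)

// A lambda which simply calls the function, as many people would write it.
val withWrappingLambda = listOf(1, 2, 3).map { increment(it) }
```

All four values hold the same list.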
Since we’re unraveling all the syntactic sugar provided by Kotlin, the
preceding code is actually shorthand for the variant with parentheses around the
lambda.
val twoThreeFour = listOf(1, 2, 3).map({ x -> x + 1 })
[1] This is a reference to the mathematical theory of λ-calculus, initially developed by Alonzo
Church in the 1930s, which first introduced the idea of higher-order functions.
As mentioned above, those parentheses are not required if the lambda ap-
pears in final position, also known as a trailing lambda.
The next item in our toolbox is the filter function, which keeps only those
elements which satisfy a predicate, and its companion filterNot, which drops
those which satisfy the predicate.
If you need both results – in other words, if you need to split elements in a list
depending on a condition – then partition is your friend. The result is similar
to applying filter and filterNot, but with the performance improvement
coming from iterating over the container only once.
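A short sketch of the three functions on a list of numbers, using evenness as the predicate:

```kotlin
val numbers = listOf(1, 2, 3, 4, 5)

// Keeps the elements which satisfy the predicate.
val evens = numbers.filter { it % 2 == 0 }

// Drops the elements which satisfy the predicate.
val odds = numbers.filterNot { it % 2 == 0 }

// Both at once, in a single traversal: a Pair of (matching, non-matching).
val split = numbers.partition { it % 2 == 0 }
```

Inside a function the pair is usually taken apart right away with a destructuring declaration, as in val (even, odd) = numbers.partition { it % 2 == 0 }.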
This notion of how many times we go over a collection of elements is an
important one, once performance enters the picture. We’ll go deeper in the
Sequences section, but for the time being let’s introduce a function which
performs both filtering and mapping, mapNotNull. The type signature looks as
follows,
Notice that transform must be a function which returns possibly null values.
The mapNotNull operation drops every element for which the transformation
returns null, and keeps the other results. Using this function we can
reimplement filter by returning null when the condition doesn’t hold, and the
unchanged value otherwise.
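The sketch below first shows mapNotNull at work, and then the reimplementation of filter just described. The name filterVia is ours, and the bound T : Any is an assumption the sketch needs: if the list could contain null elements, we could not tell them apart from the null used to signal a dropped element.

```kotlin
// For reference, the standard library declares (for Iterable):
//   inline fun <T, R : Any> Iterable<T>.mapNotNull(transform: (T) -> R?): List<R>

// Filtering and mapping in one go: keep only the strings which parse as numbers.
val parsed = listOf("1", "two", "3").mapNotNull { it.toIntOrNull() }

// filter in terms of mapNotNull: null when the condition fails,
// the unchanged value otherwise.
fun <T : Any> Iterable<T>.filterVia(predicate: (T) -> Boolean): List<T> =
    mapNotNull { x -> if (predicate(x)) x else null }

val bigOnes = listOf(1, 2, 3, 4).filterVia { it > 2 }
```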
The final set of operations to discuss are those related to aggregation, that
is, returning from a whole collection of items one single value which summa-
rizes some property of those. The standard library provides many examples
like summing all values, or taking the maximum.
listOf(1, 2, 3).sum() // 6
listOf(1, 2, 3).max() // 3
Those are particular cases of the generic aggregation operation called a fold.
To define one such fold, one specifies an initial value, and then how that value
is updated with each subsequent element in the collection. For example, sum
starts with the value 0, and updates this accumulator each round by adding
the value.
The function fold goes through the collection starting with the element at the
beginning of it. For some types, like lists, a foldRight operation goes in the
opposite direction. There are cases, like sum, in which both functions return
the same value; some in which both can be applied but only one gives the right
result; and some others in which the algorithm dictates the right direction.
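A sketch of both directions using the standard fold and foldRight. Note that the order of the arguments of the lambda is swapped in foldRight: the accumulator comes second.

```kotlin
// sum as a fold: start at 0, add each element to the accumulator.
val sum = listOf(1, 2, 3).fold(0) { acc, x -> acc + x }

// For addition the direction makes no difference.
val sumFromRight = listOf(1, 2, 3).foldRight(0) { x, acc -> x + acc }

// For concatenation it does: elements are visited in a different order.
val leftToRight = listOf("a", "b", "c").fold("") { acc, s -> acc + s }
val rightToLeft = listOf("a", "b", "c").foldRight("") { s, acc -> acc + s }
```

The first concatenation yields "abc", whereas the second one visits "c" first and yields "cba".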
The other example aggregator, max, belongs to a family of functions which
require at least one element to return a sensible value. In this case, we can
only compare one value to another in the list, so we really need that first
element.[2] Instead of fold, the right choice here is reduce, which doesn’t require
an initial value. Remember, though, that using reduce with an empty list
results in an exception; use reduceOrNull if you want to cover your back for
those cases.
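A sketch of max written with reduce, along with the two ways of dealing with the empty list:

```kotlin
// The first element plays the role of the initial value.
val largest = listOf(1, 2, 3).reduce { best, x -> if (x > best) x else best }

// With no elements there is nothing to start from: reduceOrNull returns null...
val largestOfNone = emptyList<Int>().reduceOrNull { best, x -> maxOf(best, x) }

// ...whereas reduce throws, captured here as a failed Result.
val failure = runCatching { emptyList<Int>().reduce { best, x -> maxOf(best, x) } }
```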
Using combinations of map, filter, and fold, one can express pipelines
that manipulate a collection and ultimately return a single value. We have
created a language – a domain-specific one – to express how to work with a
container. Note that all the operations described in this section can be
implemented using different usages of for:
• map corresponds to creating a new mutable collection, and adding the
transformed elements one by one;
• filter works in a similar fashion, but we only add those elements which
satisfy a condition;
• for fold one creates a mutable accumulator, which is updated on each
round of the for.
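The three correspondences above can be sketched as follows. The function names are ours, chosen so as not to clash with the standard library.

```kotlin
// map: a new mutable collection, filled with the transformed elements.
fun <A, B> mapWithFor(items: List<A>, transform: (A) -> B): List<B> {
    val result = mutableListOf<B>()
    for (item in items) result.add(transform(item))
    return result
}

// filter: same idea, but we only add elements which satisfy the condition.
fun <A> filterWithFor(items: List<A>, predicate: (A) -> Boolean): List<A> {
    val result = mutableListOf<A>()
    for (item in items) if (predicate(item)) result.add(item)
    return result
}

// fold: a mutable accumulator, updated on each round of the for.
fun <A, R> foldWithFor(items: List<A>, initial: R, operation: (R, A) -> R): R {
    var accumulator = initial
    for (item in items) accumulator = operation(accumulator, item)
    return accumulator
}
```

In each case the mutable state stays local to the function, so callers only ever see immutable results.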
By using these operations instead, we gain explicitness; our code “says” what
the intention is. Instead of an all-encompassing for whose goal one needs to
unravel on each usage, a pipeline built upon maps and filters clearly states
what’s happening. Furthermore, for usually brings mutability into the code
– at the very least, the iterator has some mutable state to keep track of the
current position – which goes against tight control of side effects.
[2] We could use −∞ as initial value, since max(−∞, x) is always equal to x. Unfortunately, we
cannot express infinity using the Int type.
Kotlin’s standard library is also an example of how data manipulation is
preferred over control flow. In most languages from the C tradition, to loop
over some numbers one uses the three-place for,
The idiomatic translation of that piece of code is to create a Range, which can
be used as any Iterable,
(0 until max).forEach {
    // do something with it
}
3.1.1 The same, but faster
list.map { it * 2 }.map { it + 1 }
list.map { (it * 2) + 1 }
The first option traverses the list twice. Not only that, in order for the second
map to kick in, the result of the first map has to be materialized. In other words,
we are creating a whole intermediate data structure, which is of no use after-
wards. Alas, this intermediate data structure requires initialization, allocating
some memory, and ultimately work from the garbage collector.
One possibility is to combine several steps into one. For example, mapNotNull
fuses together a filter and map into a single traversal. In general we can play
the trick of fusing together two maps into a single one by making use of func-
tion composition. After importing the Arrow Core library, you can write
list.map({ x: Int -> x + 1 } compose { x: Int -> x * 2 })
Those are small functions and we can easily inline their definitions, but the
compose trick works for any two functions. Graphically, the composition looks
as follows,
It takes a while to get used to the “reverse” order in which compose expects
its arguments. One trick to remember is that (f compose g)(x) is the same
as f(g(x)) – the innermost operation goes in the rightmost position.
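To see the argument order at work without adding a dependency, here is a small sketch in which compose is defined locally. This local definition is an assumption standing in for the function of the same name in Arrow Core; it only follows the rule (f compose g)(x) = f(g(x)) stated above.

```kotlin
// Local stand-in for Arrow Core's compose (assumption: same argument order).
infix fun <A, B, C> ((B) -> C).compose(g: (A) -> B): (A) -> C =
    { a -> this(g(a)) }

fun main() {
    val list = listOf(1, 2, 3)
    val addOne = { x: Int -> x + 1 }
    val double = { x: Int -> x * 2 }

    val twoPasses = list.map(double).map(addOne) // builds an intermediate list
    val onePass = list.map(addOne compose double) // doubles first, then adds one

    println(twoPasses) // [3, 5, 7]
    println(onePass)   // [3, 5, 7]
}
```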
3.1.2 Effectful mappings
Getting ourselves slightly ahead of the following chapters, let’s consider a vari-
ant of mapNotNull commonly used for validation, in which we want to get back
a list only if the result of every transformation is non-null.3
For example, this collection may be a list of card identifiers to query for the
database, and the validation function may check that they comply with the
right format. Since these are identifiers which are later sent to the database,
we are only interested in the case in which every identifier is valid.
The key point here is not so much the specific usage, which we’ll discuss
in depth in their respective chapters, but rather the general structure of those
two functions. They are quite similar to map, but their transform functions
look “funny”: in the case of traverse the function may return a null value, in
the case of parMap the function is suspended. This “funny part” is transported
to the result of the function.
This general pattern is called a traversal or an effectful map, and it’s ubiq-
uitous in functional programming. The variation from the bare function type
is known as an effect, and represents additional behavior that a function may
have on top of pure computation – not returning a value when the result type
3
The functions mentioned in this section are available in Arrow Core and Arrow Fx.
contains ?, dispatching tasks in the case of suspend. Some programming lan-
guages, like Haskell or Scala, are powerful enough to define traverse once
and for all for every possible effect;4 in Kotlin this is not possible, but we still
can talk of the general pattern.
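For the nullable effect, the pattern can be written by hand with the standard library alone. The following is only a sketch: traverseOrNull and validateId are names invented here (they are not the Arrow functions), and the identifier format is an assumption.

```kotlin
// Sketch of a traversal for the "may be null" effect:
// the whole result is null as soon as one transformation returns null.
fun <A, B : Any> List<A>.traverseOrNull(transform: (A) -> B?): List<B>? {
    val result = ArrayList<B>(size)
    for (element in this) {
        result.add(transform(element) ?: return null)
    }
    return result
}

// Hypothetical validation: identifiers look like "A-04".
fun validateId(raw: String): String? =
    raw.takeIf { Regex("[A-Z]-\\d{2}").matches(it) }

fun main() {
    println(listOf("A-04", "B-12").traverseOrNull(::validateId)) // [A-04, B-12]
    println(listOf("A-04", "oops").traverseOrNull(::validateId)) // null
}
```

Compare with mapNotNull, which would silently drop "oops" and return a shorter list.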
3.1.3 Fold is a generalized fold
In the previous chapter we left a question in the air: what is the relation
between the fold described in this chapter for collections, and the notion of
generalized fold discussed back there? In a nutshell, a generalized fold is the
FP counterpart of a visitor: a function which applies some behavior depending
on the choice of a sealed hierarchy. The answer lies in the fact that linked
lists provide a general model for any collection.
Linked lists get their name because of their structure of related cells. Each
cell holds one value of the list, and a reference (a pointer when implemented
in C) to the next element of the list.5 The only missing ingredient is a marker
for the end of the list; in C or Java you would usually use a null reference, but
for this discussion we introduce this marker explicitly as End.
What does the generalized fold for LinkedList look like? Let’s start by
writing the visitor interface first, with one method per choice.
4
Because of its generality, there’s an on-going meme in the FP community about traverse
being the answer to every programming question.
5
The out in the generics part of LinkedList describes the variance of the type. In partic-
ular, we state that if Cat is a subtype of Animal, then LinkedList<Cat> is a subtype of
LinkedList<Animal>.
fun <A, R> LinkedList<A>.visit(
    visitor: LinkedListVisitor<A, R>
): R = when (this) {
    is End -> visitor.visitEnd()
    is Cell -> visitor.visitCell(value, next)
}
This is not the visitor we are looking for, though. This simple visitor only visits
one cell, but we want one which visits every single cell. The solution is to im-
plement a variation of the Visitor pattern, namely the Recursive Visitor pattern.
In this variation the recursiveVisit function calls itself recursively, which in
turn means that the signature of visitCell changes slightly, as it takes the
result of the recursive call over next as argument, instead of the next value
itself.
Using this API we can implement sumAll for linked lists in a very similar fash-
ion to what we’ve done above for List<Int>. The recursive visitor has to
implement two methods, one that defines the initial value, and another which
accumulates the partial sum.
override fun visitCell(value: Int, nextResult: Int) =
value + nextResult
})
The final touch is to move from the API defined in terms of an explicit visitor
interface into one where the two methods are given directly as arguments.
This is exactly the signature of fold defined over collections! Well, almost…
the actual fold takes a value R directly instead of a function () -> R computing
it. At the end of the day those are two representations of the same value,
the only difference being whether you compute it sooner or later.
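The whole journey can be condensed into a short sketch. The definition of LinkedList below is an assumption reconstructed from the names used in the text (End, Cell, value, next); the parameter names end and cell are invented.

```kotlin
// Assumed shape of the linked list discussed in the text.
sealed interface LinkedList<out A>
object End : LinkedList<Nothing>
data class Cell<out A>(val value: A, val next: LinkedList<A>) : LinkedList<A>

// The recursive visitor with its two methods passed directly as arguments.
// `cell` receives the result of folding the rest of the list, not the rest itself.
fun <A, R> LinkedList<A>.fold(end: () -> R, cell: (A, R) -> R): R = when (this) {
    is End -> end()
    is Cell -> cell(value, next.fold(end, cell))
}

fun LinkedList<Int>.sumAll(): Int =
    fold({ 0 }, { value, nextResult -> value + nextResult })

fun main() {
    val list = Cell(1, Cell(2, Cell(3, End)))
    println(list.sumAll()) // 6
}
```

Note that this version is not stack-safe for very long lists, since each cell adds a frame to the call stack.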
In summary, fold is nothing more than the visitor function, in which we
take the very broad view that every collection is an ordered sequence of ele-
ments. This view works because other collections can be shoehorned into it:
sets can be given an order, and maps can be seen as a list of pairs.
3.2 Functors
This BinaryTree type supports a map operation which works in pretty much
the same way as the list, set, or map one. This map takes a function to trans-
form the elements, while keeping the structure of the tree intact. The corre-
sponding structure in the case of list is the order in which the elements appear,
and in the case of maps, to which key they are paired.
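A possible shape for such a tree and its map is sketched below. The constructor names Leaf and Branch and their fields follow the code shown later in this chapter; everything else is an assumption made for illustration.

```kotlin
// Assumed definition: values live in the leaves, branches only give structure.
sealed interface BinaryTree<out A>
data class Leaf<out A>(val value: A) : BinaryTree<A>
data class Branch<out A>(
    val left: BinaryTree<A>,
    val right: BinaryTree<A>
) : BinaryTree<A>

// map transforms every element and rebuilds exactly the same shape.
fun <A, B> BinaryTree<A>.map(transform: (A) -> B): BinaryTree<B> = when (this) {
    is Leaf -> Leaf(transform(value))
    is Branch -> Branch(left.map(transform), right.map(transform))
}

fun main() {
    val tree = Branch(Leaf(1), Branch(Leaf(2), Leaf(3)))
    println(tree.map { it * 10 })
    // Branch(left=Leaf(value=10), right=Branch(left=Leaf(value=20), right=Leaf(value=30)))
}
```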
We can describe a pattern for all the aforementioned types. Types that
follow such a pattern are called functors.6
• Those types have at least one type parameter, which describes which
kind of elements are contained within the structure.
As with traverse, Kotlin is not powerful enough for us to describe a generic interface
for a functor, as one could do in Haskell or Scala. Still, this design
pattern is so common in containers used in FP code that being aware of it helps
in navigating and understanding the language of those APIs.
There’s a corresponding notion of functor, but using traverse instead of
map. In several languages types that support traversals are called – no surprise
here – traversable functors, or simply traversables. Our BinaryTree above is
an example of a traversable: you can implement traverse for functions of the
form (A) -> B? or parMap for suspended functions, as we have for List.7
6
The name comes from a branch of mathematics called category theory, where functors describe
a generalized notion of a mapping. In the realm of FP, however, functor is only used in the
(narrower) sense of a type with a map.
7
Time to stop reading, open your editor of choice, copy the BinaryTree definition, and work
hard on implementing those functions!
3.2.1 Deep recursive functions
The definition of map for BinaryTree is concise and direct, but comes with a
big problem: because of its recursive nature, it may overflow the stack if the
tree to transform is too deep. Fortunately, Kotlin’s standard library brings an
idiomatic solution to the table. Although not directly related to our discussion
of functors, these kinds of deep recursive traversals are often found when transforming
immutable data structures, so it’s good to be aware of the solution,
should the time come for you to implement your own.
The key point is that instead of a function with the signature
(BinaryTree<A>) -> BinaryTree<B>, we are going to use DeepRecursiveFunction with the
same type arguments. Unfortunately we need to switch the order of arguments,
since DeepRecursiveFunction should only be applied to those arguments
which change between function calls, and the transform function
is uniform over the whole map body. DeepRecursiveFunctions are created
via their constructor, which takes a lambda defining the body of the function.
That body reads pretty much as the regular version, except that recursive calls
are marked explicitly with callRecursive, as done below.
@kotlin.ExperimentalStdlibApi
fun <A, B> mapD(
    transform: (A) -> B
): DeepRecursiveFunction<BinaryTree<A>, BinaryTree<B>> =
    DeepRecursiveFunction {
        when (it) {
            is Leaf -> Leaf(transform(it.value))
            is Branch -> Branch(
                callRecursive(it.left), callRecursive(it.right)
            )
        }
    }
We can still provide the same interface to map as we had before by calling the
version with DeepRecursiveFunction.
@kotlin.ExperimentalStdlibApi
fun <A, B> BinaryTree<A>.map(transform: (A) -> B): BinaryTree<B> =
    mapD(transform)(this)
The code looks a bit funny, with those two lists of arguments between paren-
theses. The reason is that the first one calls mapD with the transform function,
which returns a DeepRecursiveFunction. And then we call that deep recur-
sive function by providing the BinaryTree argument.
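To get a feeling for what this buys us, the sketch below maps over a tree that is 100,000 levels deep, a depth at which a plainly recursive map would most likely exhaust the default JVM stack. It assumes Kotlin 1.7 or later, where DeepRecursiveFunction no longer needs the experimental annotation. The helpers leftSpine and leftmost are invented for the example, and the tree uses regular (non-data) classes so that no generated toString or equals recurses over it.

```kotlin
sealed interface BinaryTree<out A>
class Leaf<out A>(val value: A) : BinaryTree<A>
class Branch<out A>(val left: BinaryTree<A>, val right: BinaryTree<A>) : BinaryTree<A>

fun <A, B> mapD(transform: (A) -> B): DeepRecursiveFunction<BinaryTree<A>, BinaryTree<B>> =
    DeepRecursiveFunction { tree ->
        when (tree) {
            is Leaf -> Leaf(transform(tree.value))
            // callRecursive keeps the pending work on the heap, not on the call stack
            is Branch -> Branch(callRecursive(tree.left), callRecursive(tree.right))
        }
    }

fun <A, B> BinaryTree<A>.map(transform: (A) -> B): BinaryTree<B> =
    mapD(transform)(this)

// Hypothetical helper: builds a left-leaning tree with a loop.
fun leftSpine(depth: Int): BinaryTree<Int> {
    var tree: BinaryTree<Int> = Leaf(0)
    repeat(depth) { tree = Branch(tree, Leaf(it + 1)) }
    return tree
}

// Hypothetical helper: walks down the left side iteratively.
fun <A> BinaryTree<A>.leftmost(): A {
    var current = this
    while (true) {
        when (val node = current) {
            is Leaf -> return node.value
            is Branch -> current = node.left
        }
    }
}

fun main() {
    val deep = leftSpine(100_000)
    println(deep.map { it + 1 }.leftmost()) // 1
}
```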
3.3 Custom control structures
We’ve already mentioned how Kotlin can get away without a three-piece for,
by using higher-order functions over Ranges instead. Through the same glasses
we see that list.forEach { e -> ... } is a rewording of the typical for
(e in list) { ... }. In fact, with higher-order functions we can simulate
almost every control structure, which means that you can create your own if
needed!
Let’s begin with conditionals. There are three pieces in if ... then ...
else ..., which we shall refer to as the condition, the then branch (the one
executed when the condition is true), and the else branch (to be executed
when the condition is false). The types of the two branches need to coincide,
but have no other constraint; this leads us to our first attempt.8
then both A and B appear on the screen. This is definitely not the expected
behavior for something which attempts to replace if. The reason for this be-
havior is that whenever we call a function – and ifThenElse is a function like
any other – then all the arguments are executed – in this case, leading to the
execution of both printlns.
8
We can also implement ifThenElse with the regular if, but it feels a bit like cheating. Trying
to stick to our principles, we use when over Booleans.
In order to delay the execution of the branches until we decide, the trick
is to make them lambdas without parameters. Arguments which are functions
cannot be executed right away, so the compiler just passes them to the body;
to execute them we provide the missing arguments, which in this case are
none.
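A sketch of this delayed version follows. The parameter names then and otherwise are the ones used in this section; the exact signature is an assumption.

```kotlin
// Both branches are lambdas without parameters, so nothing runs until we call them.
fun <A> ifThenElse(condition: Boolean, then: () -> A, otherwise: () -> A): A =
    when (condition) {
        true -> then()
        false -> otherwise()
    }

fun main() {
    val evaluated = mutableListOf<String>()
    val result = ifThenElse(
        1 < 2,
        { evaluated += "then"; "A" },
        { evaluated += "otherwise"; "B" }
    )
    println(result)    // A
    println(evaluated) // [then]
}
```

Only the chosen branch leaves a trace in evaluated, which is the behavior we expect from a replacement for if.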
Haskell is a lazy programming language because nothing is evaluated by de-
fault – like then and otherwise in ifThenElse – until some piece of infor-
mation in a value is required for another computation; and even then compu-
tation only proceeds as much as required, not until the full value is known.
4 Immutable data transformation
We have already hinted in the previous chapter at the main problem we are
about to tackle below: working with immutable data is quite challenging once
data classes become nested. To better showcase the problem, let’s update the
MonsterCard class from the previous chapter by adding information about the
location of images.
but also quite painful to read and maintain, due to the convoluted copy with
other copy inside. If MonsterCard used mutable variables instead, readability
would be greatly improved.
The question is then: can we get the benefits of immutable values, without the
pain of writing complex, yet boilerplate, code whenever we need to transform
that data? Fortunately, the answer is “yes”, if we bring in the right tools. We
want to still create new copies of every value we transform, yet keep the nice
syntax that comes with vars.
4.1 Lenses
The @optics annotation creates a lens (or functional reference) for each field
in the data class, available through the companion object. For MonsterCard
above, we get MonsterCard.id, MonsterCard.name, and so on. With one
such lens we can get the value from a specific MonsterCard,
MonsterCard.name.get(lochNessMonster)
// "Loch Ness Monster"
MonsterCard.name.set(lochNessMonster, "Nessie")
// MonsterCard(id=A-04, name=Nessie, body=100, ...)
and yet another option is to modify the existing value instead of providing a
completely new one,
MonsterCard.body.modify(lochNessMonster) { it * 2 }
// MonsterCard(id=A-04, name=Loch Ness Monster, body=200, ...)
We may have gained a slightly nicer syntax for creating a copy of the value by
modifying a field, at the expense of an awkward ordering: instead of value.field,
we now write lens.get(value). The real benefit is only noticeable when
we start composing lenses. If we have a lens which points to the image field
in MonsterCard, giving back a value of type Image, and another lens which
points to the small field of that Image value, then we can create another (com-
posed) lens which points directly to the small field within the image field of
a MonsterCard.
As you can see, lenses have their own type Lens, which records the “container”
and the “contained” types. The fact that lenses are values that we can compose
and manipulate brings a lot of additional power – similar to the advantages of
higher-order functions, but in the realm of fields and properties – but this is
not something we explore here. From a practical point of view, we’re just
interested in composing the lenses right before calling one of the modification
functions. For example, our replaceSmallUrl function can be rewritten as
follows.
This code should be read as: focus on the image field and then on the small
field within it, take the string stored in that place, and create a copy of the
entire MonsterCard where that nested field has been modified as described.
Once you introduce lenses for all the fields in your domain language, manipulating
them becomes as succinct as with mutable variables. In fact, if you know
how to create them using @optics, how to compose them, and the basic op-
erations get, set, and modify, you’re more than ready to benefit from nicer
immutable data manipulation. Yet, the design of Lens is very elegant, and
deserves a small explanation.
Feel free to jump over this section, it is not required for understand-
ing the rest of the book.
interface Lens<S, A> {
fun get(x: S): A
fun set(x: S, newValue: A): S
}
The real power, as we have discussed above, comes from the ability of
composing lenses to give access to nested pieces of information. Even that
part is not that long, a testament to the elegance of lenses.
The compose function describes how to get a field nested within another: we
simply chain the calls to get coming from each lens. Setting is a bit more
involved, as we first need to get the value from the first level, which we then
modify using the second lens; the result then becomes the new value of the
first level.
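The description above translates almost word for word into code. The following is a self-contained sketch and not the Arrow Optics implementation: the Lens interface repeats the one shown earlier with a derived modify added, while the lenses imageLens and smallLens and the simplified MonsterCard are written by hand for the example.

```kotlin
interface Lens<S, A> {
    fun get(x: S): A
    fun set(x: S, newValue: A): S
    // modify can be derived from the two primitive operations
    fun modify(x: S, f: (A) -> A): S = set(x, f(get(x)))
}

infix fun <S, A, B> Lens<S, A>.compose(other: Lens<A, B>): Lens<S, B> {
    val outer = this
    return object : Lens<S, B> {
        // getting: chain the calls to get coming from each lens
        override fun get(x: S): B = other.get(outer.get(x))
        // setting: get the first level, update it with the second lens,
        // and store the result back as the new value of the first level
        override fun set(x: S, newValue: B): S =
            outer.set(x, other.set(outer.get(x), newValue))
    }
}

// Simplified domain model (assumption: only the fields needed here).
data class Image(val small: String, val large: String)
data class MonsterCard(val name: String, val image: Image)

val imageLens: Lens<MonsterCard, Image> = object : Lens<MonsterCard, Image> {
    override fun get(x: MonsterCard) = x.image
    override fun set(x: MonsterCard, newValue: Image) = x.copy(image = newValue)
}

val smallLens: Lens<Image, String> = object : Lens<Image, String> {
    override fun get(x: Image) = x.small
    override fun set(x: Image, newValue: String) = x.copy(small = newValue)
}

fun main() {
    val card = MonsterCard("Nessie", Image("http://s", "http://l"))
    val secured = (imageLens compose smallLens).modify(card) { it.replace("http:", "https:") }
    println(secured)
    // MonsterCard(name=Nessie, image=Image(small=https://s, large=http://l))
}
```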
And that’s it! By changing the way we look at data manipulation, and mak-
ing functional references a concept of its own, we’ve been able to provide a
comfortable and small API to work with immutable data.
Lenses solve the problem of modifying nested fields without the code turning into an
incomprehensible blob of copys. Our domain model, though, also includes
collections in several places: the list of attacks in a card, the cost of a single
attack. We defined back then functions such as duplicateAttack,
With lenses we could remove the top-level and the nested calls to copy, but
there’s still a map in between. If we bring a new type of optic to the table –
namely traversals – the code can be written quite succinctly:
fun MonsterCard.duplicateAttacks() =
(MonsterCard.attacks + Every.list() + Attack.damage)
.modify(this) { it * 2 }
The example above also shows that optics of different kinds – lenses and
traversals – can be composed together. The kind of the new optic reflects the
number of elements it focuses on. Every.list() focuses on several values,
and for each of those Attack.damage focuses on one single nested field. The
result still focuses on several values, so we get back a traversal.
Another useful use case for traversals involves obtaining a list of all possible
paths within nested containers. For example, we can obtain all the different
power types used in all attacks from a single card by focusing on those two
lists one after the other,
fun MonsterCard.usedPowers() =
(MonsterCard.attacks + Every.list() + Attack.cost + Every.list())
.getAll(this).distinct()
We cannot use get as we did for lenses, because that focuses on a single
element; we need to use the getAll method instead.
fun MonsterCard.duplicateFirstAttack() =
(MonsterCard.attacks + Index.list().index(0) + Attack.damage)
.modify(this) { it * 2 }
Let’s think for a second about what kind of optic Index.list().index(0) is. It cannot
be a lens, since the first element of the list may not be present. Being a
traversal works, but we can be more precise in this case: the result of index
focuses on either 0 or 1 elements, depending on whether the element is there
or not. We call them optionals.
One important characteristic of optionals is that they do not blow up when
you call modify and the element is not present. Instead, the modification is
completely ignored, since there’s no target for it. If a MonsterCard passed as
argument to the function above has no attack, then the copy returned from it
is identical to the original card. You can think of them as using ?. – if at some
point the value is not there, nothing is done.
4.2.1 Semi-structured data
The name semi-structured data refers to those values which have some structure,
but don’t follow a strict model. XML and JSON documents are good examples:
the format specifies some hierarchical structure with either tags or
objects, but doesn’t specify which particular tags or keys should be present
in a document. To define a stricter model one usually resorts to XML or JSON
schemas. Dealing with these kinds of documents often requires a lot of checking
whether a certain key is present, or whether some value is of a certain
primitive type. Optionals provide a nice API for dealing with those, as exemplified
by the kotlinx-serialization-jsonpath library we’re going to briefly
describe.
The name of the library already suggests that we’ll be working with the
classes defined by KotlinX Serialization, one of the most common packages for
(de)serialization in the Kotlin space. In particular, we focus on the JsonElement
class, which can represent any JSON value (that is, an entire document or part
of it); an instance of JsonElement can be obtained by decoding a string using
the methods in the utility object Json. According to the JSON specification,
a JSON value may take 6 different forms; the library includes an optional for
each of them:
Note that these optionals are different in nature from those introduced in the
previous section. Back then optionals focused on a particular field whose
value may be absent; in any case a piece of the whole value. Here optionals are
used to represent which of the possible choices has been used to construct a
value. These kinds of optics arise in fact every time there’s a sealed hierarchy (either
concretely or conceptually, as in this case), and are called prisms. Arrow
Optics creates prisms if the annotated class is the root of a sealed hierarchy.
The library also leverages the most common behavior of optionals for ac-
cessing keys in an object, or positions in an array. These optics have a dual
purpose: they only focus on a value if both the current value is an object or
an array, respectively, and the specified selector or index exists within it.
To make this explanation more concrete, let’s try to dig some information
out of the following JSON document, which represents the very first card we
modeled.
{
  "id": "A-04",
  "name": "Loch Ness Monster",
  "body": 100,
  "attacks": [
    { "name": "Roar", "damage": 10 },
    { "name": "Tsunami", "damage": 50 }
  ]
}
select("attacks").get(0).select("damage")
Note however that this won’t give us a number, but a JsonElement which can
still represent any of the 6 choices. We need to apply a final prism to focus
only on the value if it’s an integer,
select("attacks").get(0).select("damage").int
4.2.2 Hierarchy of optics
Feel free to jump over this section, it is not required for understanding
the rest of the book.
At the top of the hierarchy we have Traversal<S, A>, which focuses on
an unknown number of values and modifies all of them in bulk. Following the
lead of the discussion on lenses, we describe a traversal by directly saving the
related functions as fields.
As an example, here’s the Every.list traversal which we’ve used above. This
is maybe the simplest traversal, as getting simply returns the list itself, and
modifying maps the given function over all the elements.
Staying within the topic of lists, a useful optional is that which focuses on a
particular element within a list. A possible definition is given below; for the
modification we need to check whether we are in fact changing the right index,
and leave the value untouched otherwise.
    modify = { x, f -> x.mapIndexed { index, a ->
        if (index == i) f(a) else a
    } }
)
The final step is a lens, in which the focus is on exactly 1 value. We’ve
discussed how a getter and a setter are enough to implement modify, an idea
we re-use in the implementation below.
The only missing part for a nice optics API are the various compositions of
optics. We’ve already described the compose function for lenses, and how the
number of values to focus on determines the result of combining two
different optics.
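Putting the three levels side by side may help. What follows is a rough sketch of such a hierarchy of open classes, written for this summary and not taken from Arrow Optics. The constructor parameters, the helper names everyList, index and doubleEvery, and the restriction to non-null targets (A : Any) are all assumptions.

```kotlin
// Any number of targets: we can list them and modify them in bulk.
open class Traversal<S, A : Any>(
    val getAll: (S) -> List<A>,
    val modify: (S, (A) -> A) -> S
)

// Zero or one target.
open class Optional<S, A : Any>(
    val getOrNull: (S) -> A?,
    modify: (S, (A) -> A) -> S
) : Traversal<S, A>(
    getAll = { s -> listOfNotNull(getOrNull(s)) },
    modify = modify
)

// Exactly one target: a getter and a setter are enough to implement modify.
open class Lens<S, A : Any>(
    val get: (S) -> A,
    val set: (S, A) -> S
) : Optional<S, A>(
    getOrNull = get,
    modify = { s, f -> set(s, f(get(s))) }
)

fun <A : Any> everyList(): Traversal<List<A>, A> =
    Traversal(getAll = { it }, modify = { xs, f -> xs.map(f) })

fun <A : Any> index(i: Int): Optional<List<A>, A> =
    Optional(
        getOrNull = { it.getOrNull(i) },
        modify = { xs, f -> xs.mapIndexed { index, a -> if (index == i) f(a) else a } }
    )

data class Attack(val name: String, val damage: Int)

val damage: Lens<Attack, Int> =
    Lens(get = { it.damage }, set = { attack, d -> attack.copy(damage = d) })

// Asks only for a Traversal, so any of the three kinds of optic is accepted.
fun <S> doubleEvery(focus: Traversal<S, Int>, source: S): S =
    focus.modify(source, { it * 2 })

fun main() {
    println(doubleEvery(everyList(), listOf(1, 2, 3))) // [2, 4, 6]
    println(doubleEvery(index(5), listOf(1, 2, 3)))    // [1, 2, 3]
    println(doubleEvery(damage, Attack("Roar", 10)))   // Attack(name=Roar, damage=20)
}
```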
Subtyping?
At first sight, it seems that the way we modeled optics in this section – with
open classes extending each other – is at odds with our preference for ADTs.
Let’s unravel a bit what’s going on.
First of all, we are still defining anemic classes: Lens, Optional and Traversal
only hold their corresponding operations; they don’t define further methods
inside the class, nor hold any hidden or private state. Alas, the Kotlin compiler
doesn’t allow for open data classes; data classes are inherently final.
The second consideration is that in this case we really want to make use of
the subtyping mechanism in the language. It’s just so useful to be able to use
a Lens when only a Traversal is required, without further complication. So
useful that in fact Haskell implementations of optics have created elaborate
workarounds3 to simulate this mechanism in a language without it. The question
is still whether a feature which is useful in a few cases – optics, collections
– needs to take over the entire way we model our types.
3
The most common are known as van Laarhoven lenses – introduced by Twan van Laarhoven
in his blog post CPS based functional references, and popularized by the lens package – and
profunctor optics.
The final reason why optics benefit from subtyping is that in fact we’re
defining a bunch of interfaces, and at the same time providing a single implementation.
But we could have gone the extra mile and defined them separately:
    companion object {
        fun <S, A> invoke(
            getAll_: (x: S) -> List<A>,
            modify_: (x: S, f: (A) -> A) -> S
        ) = object : Traversal<S, A> {
            override fun getAll(x: S) = getAll_(x)
            override fun modify(...) = ...
        }
    }
}
interface Optional<S, A> : Traversal<S, A> { ... }
interface Lens<S, A> : Optional<S, A> { ... }
Arguably, this is just a harder way to encode what we did previously with a few
open classes. Having said so, in an ideal world we would seal the hierarchy of
classes, since we don’t really want to expose the ability to extend Traversal in
more ways. Alas, in the case of classes, Kotlin only gives the choice of being
fully open, or completely sealed – as opposed to interfaces where sealed
includes the classes defined in the same module.
4.3 Building a DSL for copies
We’ve already poked at the fact that, even though optics provide a more powerful
interface to manipulate immutable values, the Kotlin language still provides
a more intuitive interface for mutability. In this section we describe a
small domain-specific language inspired by the built-in copy for data classes,
but using optics to describe what to modify. This journey introduces the Builder
pattern used in many Kotlin DSLs, and provides the single accepted example
of mutability in FP-style.
To make things concrete, the end goal is to provide another overload of
the copy methods. Instead of field = newValue pairs, this variation takes
a block. Inside that block, we can use lens setTo newValue to specify the
new value for the field pointed to by that lens, or traversal transform f to
indicate that the values pointed to by the traversal ought to be updated follow-
ing the instructions in f. For example, here’s a function which both updates
the name and the images in a MonsterCard.
The Builder pattern consists of three moving parts. The first one is an
interface which defines the new operations available within the block. In our
case, we want setTo and transform to be available.4
interface Copy<S> {
    infix fun <A> Lens<S, A>.setTo(newValue: A)
    infix fun <A> Traversal<S, A>.transform(f: (A) -> A)
}
These two elements are enough to provide the syntax we’re looking for. First
of all, Kotlin doesn’t require parentheses for the last argument if it’s given as
4
The infix modifier allows calling these functions without parentheses, increasing the feeling
that those operators are part of the language, instead of implemented in our own library.
a block – a so-called trailing lambda. Second, the methods of an extension
receiver are implicitly available within the corresponding block, without any
additional qualification. This means that we can use setTo and transform
directly, as we’ve done in the example.
The missing third leg of this stool is an implementation of Copy which
we can use to write copy. This is where mutability enters the game: we are
going to start with an initial value, which changes with each call to setTo or
transform.
The final current value is inspected at the end of each call to copy. Note
how this definition makes essential use of side-effects, because we are by no
means using the return value of f(copy).
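The three moving parts can be assembled into one self-contained sketch. The Copy interface is the one shown above; the class name CopyImpl, the minimal Traversal and Lens interfaces, and the hand-written lenses nameLens and bodyLens are assumptions made so that the example compiles on its own.

```kotlin
interface Traversal<S, A> {
    fun modify(x: S, f: (A) -> A): S
}

interface Lens<S, A> : Traversal<S, A> {
    fun get(x: S): A
    fun set(x: S, newValue: A): S
    override fun modify(x: S, f: (A) -> A): S = set(x, f(get(x)))
}

interface Copy<S> {
    infix fun <A> Lens<S, A>.setTo(newValue: A)
    infix fun <A> Traversal<S, A>.transform(f: (A) -> A)
}

// The only mutable piece: the current value, replaced by each operation.
private class CopyImpl<S>(var current: S) : Copy<S> {
    override infix fun <A> Lens<S, A>.setTo(newValue: A) {
        current = this.set(current, newValue)
    }
    override infix fun <A> Traversal<S, A>.transform(f: (A) -> A) {
        current = this.modify(current, f)
    }
}

fun <S> S.copy(f: Copy<S>.() -> Unit): S {
    val copy = CopyImpl(this)
    f(copy)              // run only for its side effects on copy.current
    return copy.current  // inspect the final value
}

data class MonsterCard(val name: String, val body: Int)

val nameLens: Lens<MonsterCard, String> = object : Lens<MonsterCard, String> {
    override fun get(x: MonsterCard) = x.name
    override fun set(x: MonsterCard, newValue: String) = x.copy(name = newValue)
}

val bodyLens: Lens<MonsterCard, Int> = object : Lens<MonsterCard, Int> {
    override fun get(x: MonsterCard) = x.body
    override fun set(x: MonsterCard, newValue: Int) = x.copy(body = newValue)
}

fun main() {
    val card = MonsterCard("Loch Ness Monster", 100)
    val updated = card.copy {
        nameLens setTo "Nessie"
        bodyLens transform { it * 2 }
    }
    println(updated) // MonsterCard(name=Nessie, body=200)
    println(card)    // MonsterCard(name=Loch Ness Monster, body=100)
}
```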
The book has been preaching for pages how sinful mutability is; what makes
it suddenly acceptable in this case? The answer is that mutability here is local,
confined to the frontiers of the copy function. To the outside world copy is
a perfectly pure function which doesn’t change the values given as arguments,
nor has other behavior than computing a new result.
Local mutability is often used when translating imperative algorithms into
the functional realm, while still keeping the same computational cost. Take
for example Dijkstra’s algorithm to compute the shortest path in a graph: the
traditional algorithm uses a mutable set of visited nodes, which prevents a
node from being visited more than once. In this case we could implement the algorithm
with an immutable set of visited nodes which is refined on each iteration,
but that would be quite costly (unless specific compiler optimizations kick in).
Keeping the mutable version is also fine, though, since for the outside world
a function with signature
is perfectly pure: the result only depends on the input, and doesn’t have any
other behavior apart from the computation. Local mutability in this case is
just an implementation detail, and doesn’t jeopardize the composability guar-
antees from FP-style.
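As an illustration, here is one way such a function could look. The graph representation (a map from each node to its outgoing edges with non-negative weights) and the name shortestDistances are assumptions made for this sketch; the mutable map, set, and queue never escape the function.

```kotlin
import java.util.PriorityQueue

// Hypothetical representation: node -> list of (neighbor, non-negative weight)
typealias Graph = Map<String, List<Pair<String, Int>>>

fun shortestDistances(graph: Graph, start: String): Map<String, Int> {
    val distances = mutableMapOf(start to 0)   // best distance found so far
    val visited = mutableSetOf<String>()       // nodes whose distance is final
    val queue = PriorityQueue(compareBy<Pair<String, Int>> { it.second })
    queue.add(start to 0)
    while (queue.isNotEmpty()) {
        val (node, distance) = queue.poll()
        if (!visited.add(node)) continue       // already settled, skip it
        for ((next, weight) in graph[node].orEmpty()) {
            val candidate = distance + weight
            if (candidate < (distances[next] ?: Int.MAX_VALUE)) {
                distances[next] = candidate
                queue.add(next to candidate)
            }
        }
    }
    return distances.toMap()                   // hand out a read-only snapshot
}

fun main() {
    val graph: Graph = mapOf(
        "a" to listOf("b" to 1, "c" to 4),
        "b" to listOf("c" to 2),
        "c" to emptyList<Pair<String, Int>>()
    )
    println(shortestDistances(graph, "a")) // {a=0, b=1, c=3}
}
```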
Optics show us that immutable data is not at odds with comfortable manipulation.
Furthermore, by introducing the idea of focusing on several values
at once, we can very succinctly describe modifications to elements within a
container, something which is not possible with the regular value.field =
newValue syntax. Although it takes some time to get used to them, optics are
one of the techniques with the best return on investment in FP-style programming.
5 Errors and validation
The world is far from perfect, and our card game is no exception. At some
point somebody is going to try to introduce a card in the card database with
an empty name; or maybe the card is completely right, but the connection to
the database has been lost. Hence, our software must deal with those
not-so-happy paths.
Not all error conditions are created equal, though. Roughly speaking, we
can divide problems into three big groups, depending on the point of the ar-
chitecture in which they are found.
3. Exceptions describe problems coming from effects: failing connections,
wrong file permissions, and so on. In most cases an exception results
in aborting an operation altogether, maybe trying an alternative path of
execution. Most exceptions are also transient, that is, their status changes
over time: a lost connection can be regained, the permissions on a file can
be changed. This means that retrying is a possible strategy for recovering
from exceptions, as opposed to validation or domain errors (the chapter
on suspensions and concurrency contains further information about this
topic.)
Kotlin comes with a built-in way to declare absence, by means of nullable types,
which we can spot by their final question mark, as in Int?. Nullable types can
be seen as a rudimentary approach to error handling in FP style: whenever
there’s a problem in the execution of the function – domain, validation or ex-
ceptional circumstance – we return the special value null to indicate so.
fun buildMonsterCard(
    id: String, name: String, body: Int, attacks: List<Attack>
): MonsterCard? =
    if (id.isNotEmpty() && name.isNotEmpty() && body > 0)
        MonsterCard(id, name, body, attacks)
    else
        null
This pattern is a very common one, dubbed smart constructors, and we can
make it even nicer by simulating that the MonsterCard constructor can fail if
we introduce an invoke method in the companion object.
        operator fun invoke(...): MonsterCard? =
            if (id.isNotEmpty() && name.isNotEmpty() && body > 0)
                MonsterCard(id, name, body, attacks)
            else
                null
    }
}
As simple as they are, nullable types already turn control flow into data ma-
nipulation. Instead of throwing an exception which needs to be caught, check-
ing whether the MonsterCard was created correctly is a matter of matching
on the resulting value:
when (MonsterCard(...)) {
    null -> /* something is wrong */
    else -> /* we know everything is fine */
}
One of the great innovations that Kotlin has brought into the mainstream programming
languages arena is a powerful nullability analysis. The compiler
knows that in the else branch of the when right above, the result of the constructor
is not null – otherwise execution would have gone through the first branch – so
in the body afterwards there’s no need for further null checks. Nullable types
are so important in Kotlin that there are even specialized operators to work
with them: the safe call ?. and the Elvis operator ?:.
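A short sketch shows the analysis and both operators at work. It reuses buildMonsterCard from above with simplified data classes; the functions displayName and describe are invented for the illustration.

```kotlin
data class Attack(val name: String, val damage: Int)
data class MonsterCard(
    val id: String, val name: String, val body: Int, val attacks: List<Attack>
)

fun buildMonsterCard(
    id: String, name: String, body: Int, attacks: List<Attack>
): MonsterCard? =
    if (id.isNotEmpty() && name.isNotEmpty() && body > 0)
        MonsterCard(id, name, body, attacks)
    else
        null

// ?. reads the name only if the card exists; ?: supplies a fallback otherwise
fun displayName(card: MonsterCard?): String = card?.name ?: "<invalid card>"

fun describe(id: String, name: String, body: Int): String =
    when (val card = buildMonsterCard(id, name, body, emptyList())) {
        null -> "something is wrong"
        // here the compiler knows card is a MonsterCard, no further checks needed
        else -> "created ${card.name} with body ${card.body}"
    }

fun main() {
    println(describe("A-04", "Loch Ness Monster", 100)) // created Loch Ness Monster with body 100
    println(describe("", "Loch Ness Monster", 100))     // something is wrong
    println(displayName(null))                          // <invalid card>
}
```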
Leaving aside the out and Nothings required for this type to work in good
terms with the type checking process, Either is a sealed hierarchy with two
choices. Right represents that the happy path has been taken and a value of
type A has been computed, whereas Left1 is used to signal a problem.
The Either type is often used in combination with an enumeration or
sealed interface which defines the possible errors for a particular function or
set of functions. For example, here are the possible validation problems for
Cards: empty fields or non-positive numbers when one greater than zero is
expected.
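The definitions this passage refers to can be sketched as follows. The property names error and value match the fragments used later in the chapter; the exact members of CardError beyond EMPTY_ID and the trimmed-down MonsterCard are assumptions made for this example.

```kotlin
// A sealed hierarchy with two choices. The out modifiers and the
// Nothing arguments let a Left<E> be used wherever an Either<E, A>
// is expected, whatever A is (and likewise for Right).
sealed interface Either<out E, out A> {
  data class Left<out E>(val error: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

// Assumed set of validation problems for cards.
enum class CardError { EMPTY_ID, EMPTY_NAME, NON_POSITIVE_BODY }

// Trimmed-down card (attacks are left out of this sketch).
data class MonsterCard(val id: String, val name: String, val body: Int)

fun buildMonsterCard(
  id: String, name: String, body: Int
): Either<CardError, MonsterCard> = when {
  id.isEmpty() -> Either.Left(CardError.EMPTY_ID)
  name.isEmpty() -> Either.Left(CardError.EMPTY_NAME)
  body <= 0 -> Either.Left(CardError.NON_POSITIVE_BODY)
  else -> Either.Right(MonsterCard(id, name, body))
}
```

Compared with the nullable version, the caller now learns which check failed, at the price of wrapping every result in Left or Right.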
Libraries such as Arrow Core include several extension functions to make
working with Either a bit more pleasant. You can write
CardError.EMPTY_ID.left(), for example, reducing the noise introduced by the
mandatory constructors, which were not required when using nullable types.
1
This pun was introduced in Haskell’s standard library, and has since then spread to other FP
communities.
The code above is quite straightforward, but you may already see some
problems with modularity. If the checks for correct identifiers or allowed
names were more involved – not only checking that they are not empty, but
ensuring they follow a particular format or only use a subset of all Unicode
characters – we would like to make them their own functions. A first step
towards a better solution is to turn those checks into functions,
However, we are still coupling the production of the particular error message
to the constructor function, instead of taking that step into the validField
family of functions. A better approach is to make them return Either with the
error information, instead of plain Booleans. Using Either forces us to return
some information in the happy path too, so we just reuse the provided argu-
ment. Note that this is a great place to check other invariants and introduce
wrapper inline types, if the codebase uses them.
The question now becomes how we can combine all those Either values into
a single one. In particular, we want to build the MonsterCard only if the
validation of every single field goes through. The answer is the zip function,
which applies a given transformation to a sequence of Eithers only if all of
them are Right. The version below this text shows the preferred way to
describe validations using FP style.
  id, name, body -> MonsterCard(id, name, body, attacks)
}
The definition of zip itself is not that complicated – just a nested sequence
of when expressions looking at each argument in order.
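The nested when described here can be sketched for three arguments as follows. The validator names (validId, validName, validBody) and the members of CardError are assumptions chosen for this example; Arrow Core ships its own overloads.

```kotlin
sealed interface Either<out E, out A> {
  data class Left<out E>(val error: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

// Fail-fast zip: look at each argument in order and return the
// first Left found; apply transform only if all three are Right.
fun <E, A, B, C, R> zip(
  x: Either<E, A>, y: Either<E, B>, z: Either<E, C>,
  transform: (A, B, C) -> R
): Either<E, R> = when (x) {
  is Either.Left -> x
  is Either.Right -> when (y) {
    is Either.Left -> y
    is Either.Right -> when (z) {
      is Either.Left -> z
      is Either.Right -> Either.Right(transform(x.value, y.value, z.value))
    }
  }
}

// Hypothetical validators returning the validated value itself.
enum class CardError { EMPTY_ID, EMPTY_NAME, NON_POSITIVE_BODY }

fun validId(id: String): Either<CardError, String> =
  if (id.isNotEmpty()) Either.Right(id) else Either.Left(CardError.EMPTY_ID)

fun validName(name: String): Either<CardError, String> =
  if (name.isNotEmpty()) Either.Right(name) else Either.Left(CardError.EMPTY_NAME)

fun validBody(body: Int): Either<CardError, Int> =
  if (body > 0) Either.Right(body) else Either.Left(CardError.NON_POSITIVE_BODY)
```

Note that with an empty identifier and an empty name, only EMPTY_ID is reported: the first Left wins.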
However, one often wants to have versions from two to a dozen arguments.
That big set of overloads of zip is usually provided by libraries such as Arrow
Core.
The zip utility function is very useful when dealing with validations, because
in that case we often combine several validations into a single one. You may
have already noticed a problem with the code above, though: we only get the
very first error when the two Eithers happen to be Left. This means that we
are not giving back all the possible information required to fix the problem,
where in fact the validation of the identifier and the name are independent
and we could do so. There are in fact two strategies to deal with several Left
Eithers: one is bailing out on the first problem, as done above, the other is
accumulating or collecting all the results.
To implement the accumulation idea, we need to switch the error type inside
Either to be a list of errors, instead of a single one. This combination often
goes by the name Validated in FP-style libraries.2 We also define a small
invalid helper to create error cases with a single problem inside.
2
It’s possible to be more precise and require the list of problems to be non-empty. For example,
Arrow Core uses a specific NonEmptyList type where we use regular Lists. We keep that
requirement implicit in the following pages for the sake of clarity.
typealias Validated<E, A> = Either<List<E>, A>
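A minimal sketch of the accumulation strategy for two arguments follows. The name zipOrAccumulate is an assumption chosen to keep it apart from the fail-fast zip; the invalid helper follows the description in the text.

```kotlin
sealed interface Either<out E, out A> {
  data class Left<out E>(val error: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

typealias Validated<E, A> = Either<List<E>, A>

// Helper creating an error case with a single problem inside.
fun <E> E.invalid(): Validated<E, Nothing> = Either.Left(listOf(this))

// Accumulating variant: when both sides are Left, the two lists
// of problems are concatenated instead of dropping the second.
fun <E, A, B, R> zipOrAccumulate(
  x: Validated<E, A>, y: Validated<E, B>,
  transform: (A, B) -> R
): Validated<E, R> = when (x) {
  is Either.Left -> when (y) {
    is Either.Left -> Either.Left(x.error + y.error)
    is Either.Right -> x
  }
  is Either.Right -> when (y) {
    is Either.Left -> y
    is Either.Right -> Either.Right(transform(x.value, y.value))
  }
}

// Both inputs are invalid, so both problems are reported.
val bothBad: Validated<String, Int> =
  zipOrAccumulate<String, Int, Int, Int>(
    "empty id".invalid(), "empty name".invalid()
  ) { a, b -> a + b }
```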
Keeping a list of errors also works great when we have a list of elements to
process, and each of them may fail independently of the others. In our example
this corresponds to having a validator or smart constructor for a single attack,
a few chapters ago we introduced a name for this type of map-like function
(also known as traverse) which also embodies one effect, in this case the
possibility of failure.
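One possible shape for such an effectful map over a list is sketched below, using the name mapOrAccumulate that appears later in the chapter. The implementation and the positive validator are assumptions made for this example, not the library's code.

```kotlin
sealed interface Either<out E, out A> {
  data class Left<out E>(val error: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

typealias Validated<E, A> = Either<List<E>, A>

// Apply transform to every element; collect all the problems if any
// element fails, otherwise return the list of transformed values.
fun <E, A, B> List<A>.mapOrAccumulate(
  transform: (A) -> Validated<E, B>
): Validated<E, List<B>> {
  val errors = mutableListOf<E>()
  val values = mutableListOf<B>()
  for (element in this) {
    when (val result = transform(element)) {
      is Either.Left -> errors.addAll(result.error)
      is Either.Right -> values.add(result.value)
    }
  }
  return if (errors.isEmpty()) Either.Right(values) else Either.Left(errors)
}

// Hypothetical element validator.
fun positive(n: Int): Validated<String, Int> =
  if (n > 0) Either.Right(n) else Either.Left(listOf("$n is not positive"))
```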
  is Either.Left -> Either.Left(transform(error))
  is Either.Right -> Either.Right(value)
}
For this code to work we also need to introduce ATTACK_ERROR as a new mem-
ber of CardError. Since this needs to hold information about the specific er-
ror in the attack, this unfortunately requires us to move from the concise enum
syntax into a full-fledged hierarchy where CardError is an interface, and each
possible error a data class or object.
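Putting the fragment above together with the hierarchy just described gives roughly the following. The members of AttackError and the field name problem are assumptions for this sketch; the upper-case names follow the identifiers used in the text.

```kotlin
sealed interface Either<out E, out A> {
  data class Left<out E>(val error: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

// Transform only the error side, leaving a Right untouched.
fun <E, F, A> Either<E, A>.mapError(transform: (E) -> F): Either<F, A> =
  when (this) {
    is Either.Left -> Either.Left(transform(error))
    is Either.Right -> Either.Right(value)
  }

// Assumed problems for a single attack.
enum class AttackError { EMPTY_NAME, NEGATIVE_DAMAGE }

// CardError as an interface: simple cases are objects, while
// ATTACK_ERROR carries the specific problem found in the attack.
sealed interface CardError
object EMPTY_ID : CardError
object EMPTY_NAME : CardError
data class ATTACK_ERROR(val problem: AttackError) : CardError

val attackResult: Either<AttackError, Int> =
  Either.Left(AttackError.NEGATIVE_DAMAGE)

// The constructor reference wraps the attack problem as a CardError.
val cardResult: Either<CardError, Int> =
  attackResult.mapError(::ATTACK_ERROR)
```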
The functions introduced up to now are enough to define most validations.
Unfortunately, there’s quite some scaffolding involved – one needs to call zip
and mapError, even though they are not really part of the validation logic.
One common mistake right after being introduced to Either is to use it
everywhere an error condition may occur. However, one should try to resist
that urge and think twice about whether the “success or failure” model fits
the problem well. To make it more concrete, let’s think of a function looking
up a card in the database; a first signature could be
where the absence of a card in the database with the given identifier is repre-
sented as Either.Right(null). Is this the best approach?
One could argue that there are not two, but three (or even four) outcomes
possible for getCardById:
1. Everything works: the card is in the database and we can retrieve it.
2. There’s a problem with the database connection.
3. The connection works well, but a card with that identifier is not found.
4. If we perform an additional check to see whether the given identifier sat-
isfies any shape restriction, failing that validation step is another possi-
ble outcome.
This solution has the added benefit that we don’t overuse null to signal
problems. If we interoperate with a language without built-in null tracking,
such as Java, this can save us from incorrectly treating a Right value as “card
was found” when null is returned there.
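One way to give each of the outcomes listed above its own name, without resorting to null, is sketched below. Every identifier here (LookupError and its cases, CardDatabase, the Map standing in for a real database) is hypothetical.

```kotlin
sealed interface Either<out E, out A> {
  data class Left<out E>(val error: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

data class Card(val id: String, val name: String)

// Every way the lookup can go wrong is a separate case.
sealed interface LookupError
object ConnectionProblem : LookupError
data class MalformedId(val id: String) : LookupError
data class CardNotFound(val id: String) : LookupError

// Stand-in for a real database: cards is null while we are offline.
class CardDatabase(private val cards: Map<String, Card>?) {
  fun getCardById(id: String): Either<LookupError, Card> {
    if (id.isBlank()) return Either.Left(MalformedId(id))
    val table = cards ?: return Either.Left(ConnectionProblem)
    val card = table[id] ?: return Either.Left(CardNotFound(id))
    return Either.Right(card)
  }
}
```

A Right now always contains an actual card, so there is no null for a Java caller to misread.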
You can transport most of what we’ve discussed about Either to Kotlin’s
Result bundled in the standard library. The goal of Result is to transform
the “success or exception” model inherited from Java into the data manipula-
tion model we prefer. In fact, you can think of Result<A> as being a synonym
for Either<Throwable, A>: a “success or failure” type where the problem
type is hard-coded to Throwable, the parent of the exception hierarchy.
The function bridging the two worlds is called runCatching. As the name
suggests, any exception thrown in the body won’t bubble up, but is instead
saved as failure within the Result.
// initialization function
val result: Result<String> = runCatching {
val file = File(CONFIG_FILE_PATH)
if (!file.exists()) file.writeText(INITIAL_CONFIG)
file.readText()
}
To inspect that result, one option is to use when. Note that Result doesn’t
define the two options as different types; you are forced instead to use the
isFailure and isSuccess predicates, and to unwrap the contents using either
getOrThrow or getOrNull.
when {
  result.isFailure -> /* not going well */
  result.isSuccess -> {
    val config = parseConfigFile(result.getOrThrow())
    ...
  }
}
A nicer interface using functions as parameters is provided by fold. Instead
of matching, the two arguments represent what to do in each case, and you
get direct access to the exception or the resulting value.3 The code above
can be rewritten as follows.
result.fold(
  onFailure = { /* not going well */ },
  onSuccess = { s ->
    val config = parseConfigFile(s)
    ...
  }
)
When compared with Either, Result lacks functions to work with a col-
lection of them, such as zip or mapOrAccumulate. One could argue that in
this way Kotlin doesn’t push too hard into avoiding exceptions altogether –
this would make things like interoperability with Java, or error propagation in
coroutines too cumbersome. The middle ground approach works pretty well
in this argument: keep using exception-based mechanisms for those parts
of your system which are mainly about interfacing with external systems, but
wrap them with runCatching at the top-level, when you really need to handle
them, to obtain the pure interface we prefer.
5.2.1 Dependencies
There’s a final function related to Either we haven’t introduced yet, which
solves this problem quite neatly. The key is abstracting away the matching on
the Left case, while still allowing the transform function to fail.
The code above becomes much shorter, and we don’t have to use result1.value
to access the result of the happy path of step1 anymore.
We still have some amount of annoying nesting, but we’ve removed a bit of the
boilerplate. The Arrow Core library defines flatMap also for Result, leading
to a nicer shared interface between both types. If one prefers to stay within
the bounds of Kotlin’s standard library, the mapCatching function is a good
substitute. That way any exception thrown in the next step is also transformed
to a Result, without having to call runCatching over and over.
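For reference, flatMap for our own Either can be sketched as below, followed by the standard-library alternative for Result. The two steps, parseAmount and halve, are hypothetical stand-ins for step1 and step2.

```kotlin
sealed interface Either<out E, out A> {
  data class Left<out E>(val error: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

// A Left is passed through untouched; on a Right the next step
// runs, and that step is itself allowed to fail.
fun <E, A, B> Either<E, A>.flatMap(
  transform: (A) -> Either<E, B>
): Either<E, B> = when (this) {
  is Either.Left -> this
  is Either.Right -> transform(value)
}

// Two dependent steps: the second needs the result of the first.
fun parseAmount(text: String): Either<String, Int> {
  val n = text.toIntOrNull()
  return if (n != null) Either.Right(n) else Either.Left("'$text' is not a number")
}

fun halve(n: Int): Either<String, Int> =
  if (n % 2 == 0) Either.Right(n / 2) else Either.Left("$n is odd")

val chained: Either<String, Int> = parseAmount("12").flatMap(::halve)

// With Result, mapCatching captures exceptions thrown by the next step.
val viaResult: Result<Int> = runCatching { "12" }.mapCatching { it.toInt() / 2 }
```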
There’s one similarity between this flatMap and the first zip we defined
in this chapter; in both cases if the first element is a Left, we immediately
return it. In the case of zip this was not desired, as we were composing in-
dependent computations and we would like to get as much information from
their validation as possible. In the case of flatMap, though, stopping on first
failure is inherent to the problem: since the next step depends on the previous
one, we cannot really execute the next one if the previous one has failed.
Using the approach described in this chapter to errors and validation imposes
quite some ceremony; the developer needs to sprinkle zips, maps, and flatMaps,
and getting used to them takes time. The general guidelines are:
• Validation functions should always return the validated value, not only
a Boolean which states whether the value is correct. This approach is
often called “Parse, don’t validate”.4
• One defines the top-level validation function as a zip of the different
components. Usually one is building a value of a particular (data) class,
and validation applies to each field of that value independently.
• If the type of the field we are validating is a collection, a map- or
traverse-like operation is often required. Most of the time the validation
happens element-by-element, so accumulation is the right approach.
• Leave those validations which depend on previously validated values
until the end – where you need to introduce flatMap. This ensures that
a failed validation can still visit as many independent parts as possible,
which leads to better error reporting overall.
Note that those constraints introduce a dependency in the list of attacks – the
validation of the second one depends on the value of the first one – but don’t
involve the card as a whole. For that reason, we’ll introduce the flatMap
operation not on the entirety of the card, but on the attacks. We can represent
the entire chain of validation as follows:
4
The blog post with the same title by Alexis King, lexi-lambda.github.io/blog/2019/11/
05/parse-don-t-validate/, highlights how this idea can be applied to domains other than
strictly input validation.
The diagram shows that validation of a MonsterCard is made of four inde-
pendent validations of each field. This means, once again, that the top-level
validation uses a zip.
Now let’s work out validAttacks. We’ve already discussed that there’s some
validation which happens independently. However, instead of mapping the
AttackError type directly on mapOrAccumulate, we are doing so as a final
step, so we can more easily introduce intermediate steps. The current version
of that validation looks as follows.
fun validAttacks(
  attacks: List<Triple<String, List<PowerType>, Int>>
): Either<CardError, List<Attack>> =
  attacks
    .mapOrAccumulate { (n, c, d) -> Attack(n, c, d) }
    .mapError { it.map(::ATTACK_ERROR) }

fun validAttacks(
  attacks: List<Triple<String, List<PowerType>, Int>>
): Either<CardError, List<Attack>> =
  attacks
    .mapOrAccumulate { (n, c, d) -> Attack(n, c, d) }
    .flatMap { attacks -> when (attacks.size) {
      0 -> TOO_FEW_ATTACKS.invalid()
      1 -> Either.Right(attacks)
      2 -> when {
        attacks[0].damage > attacks[1].damage ->
          SECOND_ATTACK_IS_SMALLER.invalid()
        else -> Either.Right(attacks)
      }
      else -> TOO_MANY_ATTACKS.invalid()
    } }
    .mapError { it.map(::ATTACK_ERROR) }
Another approach would have been to separate the checking of the two
constraints into separate steps. However, given that the size also defines
whether the damage checks need to be done, it makes sense to keep them
together in this code.
Using more advanced Kotlin it’s possible to remove almost all this cere-
mony, and use regular function calls and map, instead of manual zip and
mapOrAccumulate. The chapter on Errors and resources describes a tech-
nique which solves the problem by re-using the context receivers machinery
built in the Kotlin compiler.
6
Ϙ Services and dependencies
1
Sometimes referred to as hexagonal architecture, even though the number 6 is completely
irrelevant for the concept.
problem has been solved by using Dependency Injection frameworks. However,
those tend to rely too much on unsafe features such as run-time reflection; we
prefer more typed (and thus compiler-checked) approaches.
interface LoggerService {
fun log(level: LogLevel, subject: String, message: String)
}
The second step is adding the interface to the context of each function
making use of it, which is done using context receivers. As a consequence, the
methods of the interface become available as part of this in the body of the
function – although most of the time this is implicit. In the code below we
use both LoggerService and an HttpService.
context(LoggerService, HttpService)
fun login(username: String, password: String): User? {
  ... // compute password hash
  val params = mapOf("username" to username, "password" to hash)
  val r = httpPost("/login", params).getOrNull()
  return r?.let { result ->
    log(LogLevel.INFO, "login", "successful")
    parseUser(result)
  }
}
What makes context receivers so nice in practice is that those within the
body of a function are implicitly available for others. Let’s say the
initialization step in our game reads the configuration and then logs in the
user, which we can implement as follows.
The call to login doesn’t require anything else, since the LoggerService and
HttpService are already part of the context of initialize.
This is the moment in which our principle of controlling effects kicks in, in a
different fashion than bare immutability. Each of the elements in the context
declaration should be thought of as bringing one new kind of effect, or scope
into the function body. No effect should be allowed if it doesn’t come from
the context in front of the function type (unfortunately we cannot always
guarantee this at compile time). There are two benefits to being so strict:
The fact that Kotlin uses the context keyword, this book uses “effect”, the
architectural style speaks of “ports”, is a witness of the many different views
one can adopt when talking about this pattern.
Or in other words, the context keyword brings additional methods and
functions into scope.
• If we think about behavior, each interface delimits some subset of
behaviors or effects that the function may show. That is, if we have
HttpService in our context, making an HTTP call may be part of the
behavior of the function.
• If we look at the function from the outside, these interfaces define the
ports that the caller needs to provide; in other words, the dependencies
that the function requires to operate.
interface UserService {
fun User.getAvatar(): URL?
fun User.getFriends(): List<Friend>?
}
interface CardService {
fun cardById(id: String): Card?
}
This is just a re-packaging of the Rule of least power:2 one should strive to
declare the most restrictive set of effects possible. This again helps both in
documenting what the function can do, and also makes it easier to provide an
alternative implementation. If you depend on DatabaseService you may be
tying yourself to a relational database, when this is not strictly required for
the application.
2
en.wikipedia.org/wiki/Rule_of_least_power
The easiest way to provide a value for the context is using with. This function
creates a new scope in which the value is now implicitly available, and can thus
be “taken” by a function which requires it. For example, ConsoleLogger
defines an adapter for LoggerService which prints the log messages to screen.
with(ConsoleLogger()) {
  // now we can call functions
  // which require LoggerService
}
Adapters are chosen by their type, so you need to ensure that the object you
provide implements the interface defining that port.
class ConsoleLogger(
  val to: PrintStream = System.err
): LoggerService {
  override fun log(
    level: LogLevel, subject: String, message: String
  ) = to.println("[$level] $subject: $message") // one message per line
}
This covers a great percentage of the use cases. Context receivers are a very
powerful tool, and we can play some interesting tricks with them.
It’s fairly common to include some logging in development mode for actual
HTTP calls. Using context receivers at the class level we can express that a
particular implementation of HttpService depends on the LoggerService,
which provides a nice generic interface for this task.
context(LoggerService)
class NetworkHttpService(): HttpService {
  override fun httpPost(route: String, params: Map<String, String>)
    : Result<String> {
    log(LogLevel.INFO, "http", "doing stuff")
    ... // do the real stuff
  }
}
Now whenever we want to create a new instance of NetworkHttpService,
we need to have a LoggerService in context. This means that we need to be
careful about the nesting of with, ensuring that the required dependencies
for each adapter are available.
with(ConsoleLogger()) {
  with(NetworkHttpService()) {
    login("me", "1234")
  }
}
Using with we can change part of the context for a particular section of the
code. This can be useful, for example, to change the log level in a particular
module, or use an in-memory database during development. As an example,
here’s a class which wraps an existing LoggerService and makes it ignore
the less important messages:
class OnlyImportantLogger(
  val logger: LoggerService
): LoggerService {
  override fun log(
    level: LogLevel, subject: String, message: String
  ) = when (level) {
    LogLevel.INFO, LogLevel.WARN -> { }
    else -> logger.log(level, subject, message)
  }
}
Now we use with with an instance of this new class. We could have made
the LoggerService also part of the context of OnlyImportantLogger, but
in this case the code reads better by making it explicit. To obtain the value of
an element in the context we can use the this@Class syntax.
context(LoggerService)
fun User.logOut() {
  ... // at this point we want to log all levels
  with(OnlyImportantLogger(this@LoggerService)) {
    ... // auxiliary behavior, we are only
        // interested in critical failures
  }
}
However, using it can be slightly annoying, because you need to extract the
different services in the call to with,
Now a single call with(app) brings in scope values for all the implemented
interfaces, so you don’t need to extract those individual components anymore.
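One way to build such an object is Kotlin's interface delegation, sketched below. The App, RecordingLogger and FakeHttpService classes and the ping function are hypothetical. Because context receivers need an experimental compiler flag, this sketch uses a generic extension receiver with two upper bounds in their place.

```kotlin
enum class LogLevel { INFO, WARN, ERROR }

interface LoggerService {
  fun log(level: LogLevel, subject: String, message: String)
}

interface HttpService {
  fun httpPost(route: String, params: Map<String, String>): Result<String>
}

// Test adapters: one keeps the log lines in memory, the other
// answers every request without touching the network.
class RecordingLogger : LoggerService {
  val lines = mutableListOf<String>()
  override fun log(level: LogLevel, subject: String, message: String) {
    lines.add("[$level] $subject: $message")
  }
}

class FakeHttpService : HttpService {
  override fun httpPost(route: String, params: Map<String, String>) =
    Result.success("ok from $route")
}

// A single object implementing every port by delegation.
class App(logger: LoggerService, http: HttpService) :
  LoggerService by logger, HttpService by http

// Requires both services; stands in for context(LoggerService, HttpService).
fun <T> T.ping(): String? where T : LoggerService, T : HttpService {
  log(LogLevel.INFO, "ping", "calling server")
  return httpPost("/ping", emptyMap()).getOrNull()
}

val logger = RecordingLogger()
val app = App(logger, FakeHttpService())

// One call to with brings both services into scope.
val answer = with(app) { ping() }
```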
can say by providing you with an extended vocabulary – each “word” being a
method. You cannot use those words freely, though; the types limit the way
in which you can combine them. Or, from another perspective, the types tell us
which code is “grammatically correct.” Similarly, a piece of code using a service
is called a description. The true meaning (semantics) of each word (and of the
entire description) comes through each implementation, in the same way that
a true meaning for “this cow” in a given context is a milk-producing animal
raised on a farm.
Another pair of terms in use is algebra and interpretation, especially in
the Scala community. In this case, the former term comes from mathematics:
an algebra describes a set of operations over some mathematical object. This
corresponds to interfaces in our account, as they describe the methods that
must be present. The latter term, interpretation, stems from the fact that we
are saying how each of those abstract operations maps to particular behaviors.
Sometimes we even find a mix of all this terminology, and people talk
about “syntax and interpretation”, or “semantics of effects.” This is definitely
confusing, but also a result of historically separate communities realizing that
they were building on top of the same foundation.
7
Ķ Suspensions and concurrency
In the previous chapter we discussed effects, and how contexts help us delin-
eate more clearly what particular effects or services are required by a piece
of code. In this chapter we’re going to dig a bit deeper, up to the point in
which we actually perform the duties, the actual side effects like reading a file
or from console. In the same way that context allows us to inject different
interpretations of the same interface, we’ll see here that suspend allows us
to manipulate the actual execution of those side effects.
In other words, our final goal is to delay the actual interpretation of our
code until the very last moment, until the edges of our application. However,
an expression such as readLine() goes against that policy: whenever we call
the function the behavior is performed now. The caller of the code has no
control over when it should be executed, or whether several actions should
be performed in parallel. As we are going to discuss later in this chapter, this
is an important step for building larger blocks such as schedules or circuit
breakers.
Here’s where one of Kotlin’s main departures from Java makes its shiny
appearance: suspend functions, tightly related to the language’s coroutine
support.1 The effects within the body of a suspended function are not
executed until runBlocking, async, launch, or any of the other available
runners is invoked.
1
Many of these ideas stem from Why suspend over IO, available at arrow-kt.io/docs/
effects/io.
suspend provides the ultimate effect control.
This line of thought results in an important rule of thumb for defining
services: if you foresee that some implementations will perform actual side
effects, you should mark the methods with the suspend keyword. For example,
LoggerService should have been,
interface LoggerService {
suspend fun log(level: LogLevel, subject: String, msg: String)
}
context(LoggerService)
suspend fun logHttpError(code: HttpStatusCode) =
  log(LogLevel.INFO, "HTTP ERROR", code.toString())
This is A Good Thing: the compiler is taking care of deferring side effects
correctly. In fact, the Kotlin compiler is really good at creating new
suspensions from smaller ones, something which requires a non-trivial amount
of boilerplate if done by hand.
The conclusion of this journey is that an application following the DEDE
principles and its implementation in the Kotlin language is built, interpreted,
and executed in three steps.
1. The code is written declaring its ports (or required services) using inter-
faces and context receivers.
2. The implementation of those services is injected. The result is a suspended
function.
3. That suspension is executed, performing the actual work.
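The three steps can be observed with nothing but the standard library, as in the sketch below. Here createCoroutine and resume from kotlin.coroutines stand in for a runner such as runBlocking, and an extension receiver stands in for a context receiver; MemoryLogger, greet and runProgram are hypothetical names.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.createCoroutine
import kotlin.coroutines.resume

// Step 1: the port, and code written against it.
interface LoggerService {
  suspend fun log(message: String)
}

suspend fun LoggerService.greet(name: String): Int {
  log("hello, $name")
  return name.length
}

// An adapter which records the messages in memory.
class MemoryLogger : LoggerService {
  val messages = mutableListOf<String>()
  override suspend fun log(message: String) { messages.add(message) }
}

val logger = MemoryLogger()

// Step 2: inject the adapter. The result is a suspended function.
val program: suspend () -> Int = { logger.greet("Kotlin") }

var outcome: Result<Int>? = null

// Creating the coroutine does not start it.
val ready: Continuation<Unit> =
  program.createCoroutine(Continuation(EmptyCoroutineContext) { outcome = it })

// Recorded before anything runs: no message has been logged yet.
val messagesBeforeRunning = logger.messages.size

// Step 3: execute the suspension (may be called only once).
fun runProgram(): Result<Int>? {
  ready.resume(Unit)
  return outcome
}
```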
7.1.1 Concurrency as a service
For the final twist on effects and suspensions, let’s have a closer look at the
type of runBlocking:2
fun CoroutineScope.launch(
  ..., block: suspend CoroutineScope.() -> Unit
): Job
This is the exact same mechanism we have been using for other services. Here
the algebra or port is defined by CoroutineScope, and runBlocking
corresponds to the top-level with call. We have turned something usually
thought of as different from any other component in a software system –
concurrency – into a regular service.
2
The expect keyword indicates that the implementation of this function is platform-dependent.
3
The library uses the old receiver syntax, which was overloaded for extension
functions and additional receivers. If re-written, block would possibly get the type
context(CoroutineScope) () -> T.
Other languages (and their communities) have also played with the idea of
separating the description of side-effectful computations from the execution.
One approach pioneered by Haskell, and then implemented for the JVM in Scala
libraries such as Cats Effect, is marking side-effectful computations by
assigning them a different type. For example, the function reading a line from
the console,
states that once we run the side-effects – this is the IO part of the type – we
get a String result. In Haskell only the runtime system can execute IO values
which stem from the main function; Cats Effect provides unsafeRun to start
the execution.
The main disadvantage is that you can no longer manipulate those values
in the same way as regular values. For example, you cannot do
readLine().capitalize()
because capitalize only applies to String values and the result of readLine
is IO[String]. Both languages solve this problem by introducing special syn-
tax for these wrapped types – for comprehensions in the case of Scala, and
do notation in the case of Haskell.
for {
s <- readLine()
} yield s.capitalize()
7.2 Abstractions on top of suspensions
launch { downloadImages(id) }
async { queryDatabase(id) }
The main difference between launch and async is that the latter also contains
a result, which can be awaited for and eventually obtained.
Those functions are really powerful – and the Coroutines mechanism provides
a seemingly endless number of knobs to control evaluation – but they are too
low-level. We are going to leave aside our library-agnosticism for a section,
and describe some of the interesting tools one can build on top of Kotlin’s
suspend mechanism. These tools are part of Arrow, in particular of their Fx
library; the dependency is called io.arrow-kt:arrow-fx-coroutines.
Our first stop is the set of functions whose names start with par. The general
idea behind all of them is executing several computations in parallel. Note that
a correct handling of parallelism is harder than it seems, because coroutines
may terminate with an exception. In that case, these functions take care of
releasing the resources associated with the rest of the spawned computations
in the right way.
The first of those functions is parZip, which is given a fixed number of
computations to perform in parallel, and how the results ought to be combined
once all of them finish. The combination results in a new suspended function,
so threads are only created once executed.
in advance. In that case you should look at parMap, which operates on an
Iterable, executing a computation over each element. For example, we can
run the function getUserInfo defined above over a list of friends.
Spawning several threads and then caring about the result of all of them
is one way to handle those. Another possibility is to care only about the first
which completes its job. In that case you should use raceN instead. For exam-
ple, we can redefine the function above to look also in the cache.
The last call to merge is required because raceN tells us which of the com-
putations was chosen by returning an Either, which provides the Left and
Right choices, which here correspond to the first and second computations.
Since we want to return the obtained value regardless of the choice, we merge
the Either<List<User>, List<User^> into a single List<User>.
The case of raceN is even more worthwhile to have in a library than parZip
is. Correctly handling all possibilities of any of the two computations finishing
first, correctly disposing of the other one (who might have finished also in the
meanwhile!), and bubbling exceptions in the right way, is a hard task.
7.2.1 Resilience
Let’s focus on the downloadImages function we’ve been using above as an
example. Instead of failing when the server cannot be reached, we want
to try several times. The simplest schedule consists of a fixed number of
repetitions, built using recurs. This schedule defines how the code in the
block within retry should be re-attempted.
Directly retrying after failure is not the best schedule, though, because we risk
overloading the server once it becomes available again. Instead, we can use
exponential back-off, which spaces retries with increasing time.
@ExperimentalTime
val downloadSchedule =
Schedule.exponential<Throwable>(10.milliseconds)
You can even express combinations of basic schedules, like attempting a
maximum of three times, but with exponentially growing delays.
val downloadSchedule =
Schedule.recurs<Throwable>(3)
.and(Schedule.exponential<Throwable>(10.milliseconds))
Or try three times directly, and only then begin with exponential backoff.
val downloadSchedule =
Schedule.recurs<Throwable>(3)
.andThen(Schedule.exponential<Throwable>(10.milliseconds))
The key point to stress is, again, that we are creating a language to express
the concept we want – a retry schedule – instead of hiding it as complex
method flow. This language is highly compositional: given two schedules
we can create new ones by running them at the same time, or sequentially.
As we’ve discussed, we even have multiple interpretations, because the same
Schedule can be used with retry to attempt a possibly-failing computation,
but also with repeat to execute the same action multiple times.
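To make the compositional idea concrete without depending on Arrow, here is a deliberately simplified model in which a schedule is just the sequence of delays (in milliseconds) to wait before each retry. The names mirror those used above, but this is only a sketch under that assumption and not Arrow's Schedule implementation.

```kotlin
// A schedule is modelled as the sequence of delays before each retry.
typealias Delays = Sequence<Long>

// Retry n times without waiting.
fun recurs(n: Int): Delays = generateSequence(0L) { 0L }.take(n)

// Retry forever, doubling the delay each time.
fun exponential(baseMillis: Long): Delays =
  generateSequence(baseMillis) { it * 2 }

// Both schedules at once: stop as soon as either stops,
// and wait for the longer of the two delays.
infix fun Delays.and(other: Delays): Delays =
  zip(other) { a, b -> maxOf(a, b) }

// Sequentially: exhaust the first schedule, then follow the second.
infix fun Delays.andThen(other: Delays): Delays = this + other

// One interpretation of a schedule: re-attempt a failing action.
// The sleep function is injected so that callers control real waiting.
fun <A> retry(schedule: Delays, sleep: (Long) -> Unit, action: () -> A): A {
  for (delay in schedule) {
    try { return action() } catch (e: Exception) { sleep(delay) }
  }
  return action() // last attempt: a failure now propagates
}

// Example: the action fails twice and then succeeds.
var attempts = 0
val waited = mutableListOf<Long>()
val image = retry(recurs(3) and exponential(10), sleep = { waited.add(it) }) {
  attempts++
  if (attempts < 3) error("server down") else "image"
}
```

Injecting sleep keeps the sketch testable; passing Thread::sleep would give the blocking behavior.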
These are just a few goodies in the Arrow Fx library. We recommend that the
interested reader check CircuitBreaker for the next step in failure
management, in which the system is aware of this failure and goes through
different phases while recovering from the problem.
8
Ċ Errors and resources
This is a small chapter in which we go back to our discussion on Errors and
validation, and look at it through the new lenses we’ve obtained from discussing
contexts and effects. In a weird turn of events, we’re going to discover how
abstracting from ways to handle errors takes us to a programming style closer
to exceptions, but keeping all the principles from the FP style. When dealing
with errors we should never forget about proper resource management –
acquiring and releasing them in a timely manner – a topic we also cover in
this section.
In this chapter we use the names for the upcoming Arrow 2.0.
In the 1.x series Raise was called Effect.
we see that all of them provide the ability to finish with the “success case”,
whose type is referred to as T above, or to return an “error case”, for which
Either and Validated give the choice of the type E to the programmer, and
Result hardcodes it to Throwable. Since returning a value is the normal way
of things for functions, if we want to introduce a new service or effect for errors,
what we need is an operation for error returns.
The naïve encoding of that idea is the following interface. If we have a
Raise in the context, we can at any point call raise with the error to return.
Following the lead of Either and Validated, we introduce a type parameter
E to represent the particular type of errors.
interface Raise<E> {
fun raise(error: E): Nothing
}
Note that this interface does not need to refer to the type of the happy path, as
Either and Validated require with T. Here the raise method refers solely to
the possibility of error; the successful finish of the function is still given as
the return value of the function.
At this point we can make explicit that a function may end up with an error
by introducing a Raise in the context. For example, our queryDatabase
function from our discussion on concurrency could be defined as below, in
which we use the raise function from the context to signal errors, such as an
invalid identifier.
context(Raise<DbError>)
suspend fun queryDatabase(id: UserId): User {
  if (!id.isValid()) raise(ValidationError("invalid id $id"))
  ... // do the actual work
}
Once we have the syntax for the effect – in the form of Raise – we can think
of different ways to interpret it. In this case we can define one for each
different data type related to errors. As an example, here’s the one resolving
any problem into Either.
fun <E, A> either(
  action: context(Raise<E>) () -> A
): Either<E, A> = ...
How to fill the ... in the function above is quite a challenging task, though.
The reason is that whenever there’s a call to raise, none of the remaining
computation may be executed. This requires cooperating with the internals
of the coroutine system, working with precision with the low-level primitives.
Fortunately, Arrow Core includes such an implementation as part of
arrow.core.continuations.
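To get a feel for what such a runner has to do, here is a minimal sketch that sidesteps coroutines entirely and unwinds the stack with an exception instead. It is not how Arrow implements either: the Either type is a stand-in, Raise is passed as an extension receiver rather than a context receiver, and RaiseSignal is a made-up name.

```kotlin
// Stand-in for Arrow's Either, just enough for this sketch.
sealed interface Either<out E, out A> {
  data class Left<out E>(val value: E) : Either<E, Nothing>
  data class Right<out A>(val value: A) : Either<Nothing, A>
}

interface Raise<E> {
  fun raise(error: E): Nothing
}

// Hypothetical helper: carries the raised error up the stack. The token
// records which runner created the scope, so nested runners don't mix up.
class RaiseSignal(val token: Any, val error: Any?) : RuntimeException()

fun <E, A> either(action: Raise<E>.() -> A): Either<E, A> {
  val token = Any()
  val scope = object : Raise<E> {
    override fun raise(error: E): Nothing = throw RaiseSignal(token, error)
  }
  return try {
    Either.Right(scope.action())
  } catch (signal: RaiseSignal) {
    // A signal with a different token belongs to an enclosing runner.
    if (signal.token !== token) throw signal
    Either.Left(signal.error as E) // unchecked, but only this scope throws with our token
  }
}
```

With these definitions, either<String, Int> { raise("boom") } evaluates to Left("boom"), and a block that never calls raise comes back wrapped in Right. An exception-based version like this one is easy to break by catching Throwable inside the block, which is one reason a production implementation needs more care.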
If you come from the Java world, or at least interoperate with libraries tar-
geting Java, you’ve surely written a @Throws annotation somewhere in your
code. One of the Java-the-language features which almost no programming
language has copied after it is checked exceptions, that is, being explicit about
which exceptions may be thrown in that piece of code.
@Throws(DbError::class)
suspend fun queryDatabase(id: UserId): User {
  if (!id.isValid()) throw ValidationError("invalid id $id")
  ... // do the actual work
}
Describing errors as effects takes the part of this approach which goes well
with the DEDE principles: explicitness. But it refrains from adding complex
control flow – as done internally in the JVM by throw and catch – and uses
instead simple data types to encode the path that has been taken. Furthermore,
we provide different interpretations, freeing the programmer from the endless
boilerplate of converting from Result to Either (or any of the other 11
combinations) if the library you want to use and the interface you want to
provide don’t agree on how to encode the problematic path.
Going back again with our examples for concurrency, let’s assume that we
have the following two functions,
context(Raise<NetworkProblem>)
suspend fun downloadImages(id: UserId): Image = ...
and we want to execute them in parallel with parZip, as we did back then.
How can we encode that such a combined function can raise not one, but two
different kinds of errors? The simplest solution is to introduce both scopes.
context(Raise<DbError>, Raise<NetworkProblem>)
suspend fun getUserInfo(id: UserId): User { ... }
The main drawback of this approach is that you need two calls to either
to completely interpret the two Raises. The type of the resulting value –
Either<DbError, Either<NetworkProblem, User>> – expresses clearly
the output you got, but working with such values is cumbersome.
Another possibility is to turn the different kinds of errors into a hierarchy.
The idea is that instead of requiring two Raises in the context, we get away
with a single one for the common Problem interface.
context(Raise<Problem>)
suspend fun getUserInfo(id: UserId): User { ... }
This hints that the solution is to make Raise contravariant, which means that
within the interface that value is only consumed but never produced. This is
satisfied by our definition, so we can add in in front of the generic parameter.
interface Raise<in E> {
  fun raise(error: E): Nothing
}
Now the required relationship between types holds, and the Kotlin compiler
gladly accepts our code.
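The effect of that single in can be checked with a small, self-contained sketch. It uses extension receivers instead of context receivers and drops suspend to stay within the standard library; the error classes and the Int and String stand-ins for UserId, User, and Image are placeholders.

```kotlin
interface Raise<in E> {
  fun raise(error: E): Nothing
}

// Hypothetical error hierarchy with a common Problem supertype.
sealed interface Problem
data class DbError(val message: String) : Problem
data class NetworkProblem(val message: String) : Problem

fun Raise<DbError>.queryDatabase(id: Int): String =
  if (id < 0) raise(DbError("invalid id $id")) else "user-$id"

fun Raise<NetworkProblem>.downloadImages(id: Int): String =
  if (id > 1000) raise(NetworkProblem("no image for $id")) else "image-$id"

// A Raise<Problem> can stand in for both Raise<DbError> and
// Raise<NetworkProblem>, because Raise is declared with `in E`.
fun Raise<Problem>.getUserInfo(id: Int): Pair<String, String> =
  queryDatabase(id) to downloadImages(id)
```

If you delete the in modifier, the two calls inside getUserInfo stop compiling: an invariant Raise<Problem> is unrelated to Raise<DbError>.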
Just being able to short-circuit a computation using Raise already gets you
a long way, since at any point you may call either (or any other runner) to
turn this behavior into data. Using that approach we can implement a function
pretty similar to what a catch block is for exceptions,
catch({ // Raise<DbError> in context
  queryDatabase(id)
}, { dbError -> // wrap dbError as DbProblem
  raise(DbProblem(dbError))
})

context(outerScope@Raise<E>)
fun <R, E, A> catch(
  action: context(Raise<R>) () -> A,
  handler: context(Raise<E>) (R) -> A
): A = when (val result = either(action)) {
  is Either.Left -> handler(this@outerScope, result.value)
  is Either.Right -> result.value
}
Notice that we are using two different Raises here. There’s one with the ability
to raise errors of type R in which the main action works. If it’s successful we
return the value as the complete result. However, if the error path is taken, we
get Either.Left as the result of either, and then we have an error R over
which to run the handler; in this handler we have the ability to raise errors of
a different type E – we refer to that ability using the context label outerScope.
Using this catch we can implement a version of mapError – similar to the one
in the Errors and validation chapter – but now working entirely with Raise.
context(Raise<E>)
fun <R, E, A> mapError(
  action: context(Raise<R>) () -> A,
  handler: (R) -> E
): A = catch(action) { raise(handler(it)) }
8.1.2 A worked-out example, redux
context(Raise<CardError>)
fun validId(id: String): String = when {
  id.isEmpty() -> raise(CardError.EMPTY_ID)
  else -> id
}
Note that we no longer need to wrap the happy path with Right, we just return
the value. This pattern of checking a property of a value is quite common, so
Arrow Core defines an ensure function which raises if a condition is false,
context(Raise<CardError>)
fun validId(id: String): String {
  ensure(id.isNotEmpty()) { CardError.EMPTY_ID }
  return id
}
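There is no magic in ensure. A plausible definition, sketched here with an extension receiver instead of a context receiver, is a one-liner that builds the error lazily and raises it only when the condition fails. The CardError enum below is cut down to a single case for the example.

```kotlin
interface Raise<in E> {
  fun raise(error: E): Nothing
}

// The error is produced by a lambda, so nothing is allocated on the happy path.
inline fun <E> Raise<E>.ensure(condition: Boolean, error: () -> E) {
  if (!condition) raise(error())
}

enum class CardError { EMPTY_ID }

fun Raise<CardError>.validId(id: String): String {
  ensure(id.isNotEmpty()) { CardError.EMPTY_ID }
  return id
}
```

Because raise returns Nothing, any code after a failed ensure is never reached, so the final return id can take the check for granted.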
}
return attacks
}
As hinted, the first thing to notice is what’s not there, namely any
function related to manipulating Either values. Instead, we use a regular map,
and use the results in the next line without further complication. The check for
problems only needs to raise when there’s a problem – when there’s only one
attack there’s nothing to be done. If we get to the last line when we return
attacks, that means that no error has been raised.
The last step is putting everything together in the smart constructor for
MonsterCard. We’ve decided to keep the same interface as before with Either,
and for that reason we have a top-level call to the either runner; another pos-
sibility would be to expose the Raise variant. Inside that block we just call the
validation functions for the different fields, again without any additional zip.
The only place where we still need a bit of mangling is on the attacks: those
returned AttackError, so we need to apply the ATTACK_ERROR constructor
to get a CardError.
We’ve gone a long way, introducing context receivers and wrapping our
head around effects. The result scores really high on the DEDE principles scale:
we define a language to describe potential problems in our code, Raise, which
we make explicit using context. Furthermore, we control how those effects
map to the actual data: in the same way we have an either runner we can
define runners for nullable or Result types.
8.2 Resource management
If you compare the more traditional exception handling with the Raise intro-
duced in this chapter, you may notice that something is missing.
• Kotlin contains a use function, which ensures that after executing a block
a Closeable resource is disposed;
• Java does the same with Automatic Resource Management, which ex-
tends the try syntax with initializers.
  suspend fun close() { ... }
}
class CacheConnection: Closeable { ... }
context(ResourceScope)
suspend fun initialize(params: DbParams): Connections {
  val dbConn = install(
    { DbConnection(params).also { it.start() } },
    { db, _ -> db.close() }
  )
  val cache = autoCloseable { CacheConnection() }
  return Connections(dbConn, cache)
}
The next question is how do we run those initializers and finalizers; in other
words, how do we provide the ResourceScope required to execute a function
like initialize. The answer is resourceScope, which guarantees that resources
are correctly disposed of at the end of the block. Most of the time you
call resourceScope at the top level of your application: the main function in
a desktop application, or the main activity in Android.
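To see why a scope can make that guarantee, here is a stripped-down sketch of the idea: the scope remembers a finalizer for every installed resource and runs them in reverse order when the block ends, whether it finishes normally or throws. SimpleResourceScope and simpleResourceScope are made-up names; Arrow's real ResourceScope is suspend-aware and also tells each finalizer how the block exited.

```kotlin
class SimpleResourceScope {
  // Most recently installed finalizer sits at the front.
  private val finalizers = ArrayDeque<() -> Unit>()

  fun <A> install(acquire: () -> A, release: (A) -> Unit): A {
    val resource = acquire()
    finalizers.addFirst { release(resource) }
    return resource
  }

  // Runs every finalizer even if an earlier one fails, then rethrows the first failure.
  fun closeAll() {
    val failures = finalizers.mapNotNull { runCatching(it).exceptionOrNull() }
    finalizers.clear()
    failures.firstOrNull()?.let { throw it }
  }
}

fun <A> simpleResourceScope(block: SimpleResourceScope.() -> A): A {
  val scope = SimpleResourceScope()
  try {
    return scope.block()
  } finally {
    scope.closeAll()
  }
}
```

Releasing in reverse order matters when resources depend on each other: a cache that talks to the database should be closed before the database connection it relies on.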
the documentation: sometimes you only need to create an instance, some-
times you need to call a few methods in a particular order. The result of this
packaging is a value of type Resource<A>.
In this chapter we’ve discussed how two other important elements in pro-
gramming – error and resource handling – can be described using the same
features as any other service. Remember that services may be freely mixed
by putting more than one in the context block; that way we can express, for
example, that a particular function uses asynchronous computations and may
fail by writing context(Raise<DbError>, CoroutineScope).
9 Mutability done well
Let’s envision for a moment that our trading card game becomes so popular
that people start exchanging cards all over the world. We want to capitalize on
that success, and decide to build a website in which players can publish which
cards they have and want, and the system puts matching players in contact.1
You may assume that we are using Ktor or some other web framework, but the
details are not important.
If the outcome of the matching algorithm between two sets of cards is
positive, we want to assign a unique identifier. To do so we go the easy route,
and create a new mutable variable holding the state:
var nextMatchId = 0
If this doesn’t read as the beginning of a horror novel, let me stress the
fact that matches may happen concurrently, so obtainMatchId may be called
from multiple threads coming from multiple clients. The higher the number of
users, the higher the chance that two of them read nextMatchId at the
same time, obtain the same identifier, and update the counter incorrectly.
1 Are we building a dating site?
This is a well-known problem when dealing with concurrent data, and both
Kotlin and the JVM have no shortage of solutions. But many of them either
require manual handling (locks or semaphores), which can lead to deadlocks if
not used correctly, or provide a limited set of operations, like AtomicInteger.
But what if we guarantee that the work from one thread doesn’t “contaminate”
the other?
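To make the race concrete, the following sketch hammers two counters from several threads and counts how many distinct identifiers come out. UnsafeIds, AtomicIds, and distinctIds are names invented for this example; only obtainMatchId echoes the function mentioned above.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger
import kotlin.concurrent.thread

// Unsafe: `next++` is a read followed by a write, so two threads can
// observe the same value before either of them stores the increment.
class UnsafeIds {
  private var next = 0
  fun obtainMatchId(): Int = next++
}

// Safe, but limited to the operations AtomicInteger offers.
class AtomicIds {
  private val next = AtomicInteger(0)
  fun obtainMatchId(): Int = next.getAndIncrement()
}

// Hands out ids from several threads and counts how many were distinct.
fun distinctIds(threads: Int, perThread: Int, obtain: () -> Int): Int {
  val seen = ConcurrentHashMap.newKeySet<Int>()
  val workers = List(threads) {
    thread { repeat(perThread) { seen.add(obtain()) } }
  }
  workers.forEach { it.join() }
  return seen.size
}
```

With AtomicIds the count always equals threads times perThread. With UnsafeIds it may come out lower, and by a different amount on every run, which is exactly what makes this kind of bug hard to reproduce.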
Following our own principles from previous chapters, let’s abstract this no-
tion of “identifier provider” into its own interface. This allows us to program
against the same API regardless of the underlying implementation. The only
method should be marked as suspend since implementations potentially ex-
ecute effects, even when several variables are involved.
interface IdProvider {
  suspend fun getNextId(): Int
}
In this chapter we discuss two different options for sharing mutable state
safely. Actors build upon the coroutine support in the language, and provide
a simple way to exchange messages and responses. Software Transactional
Memory introduces the idea of transactions to mutable variables, guaranteeing
atomic and consistent updates.
9.1 Actors
The actor model is one of the most successful solutions to shared mutable
state. Entire stacks, like the BEAM virtual machine on top of which Erlang and
Elixir are built, use this model. The core idea is to split the application into in-
dependent actors which communicate with one another using messages. The
underlying platform ensures that messages are correctly handled; in particu-
lar if more than one is received, they are saved into a queue, from which the
actors can handle one at a time. If we ensure that a piece of mutable state is
only accessed through an actor, we prevent any inconsistencies.
The coroutines library bundled with Kotlin provides a nice API to build
actors, on top of which we can implement a safe IdProvider. Actors in Kotlin
must declare the type of the messages they handle; in our case there’s a single
one, corresponding to obtaining the next identifier.
object NextId
The code above already points out something missing in our implementation:
how do we return the identifier back? The solution is CompletableDeferred,
which works as a box which handles the exchange of data.
This box must be given as part of the message, since otherwise the actor
doesn’t have knowledge of it. This means we need to change the message
type,
We are ready to implement IdProvider using an actor. Let’s leave aside
for a moment how we create the actor, and focus on how we communicate with
it. As you can see in the code below, you send messages to the actor, and then
wait for the deferred to be done. When time comes for the actor to handle
the message, the deferred is completed, and then the identifier is ready to be
queried.
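The same request-and-response dance can be sketched without any coroutine library, using a plain thread as the actor, a blocking queue as its mailbox, and CompletableFuture in the role of CompletableDeferred. This is an analogy built on the JDK, not the kotlinx.coroutines actor API: MailboxIdProvider is a hypothetical name, and getNextId blocks instead of suspending.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.LinkedBlockingQueue
import kotlin.concurrent.thread

// The message carries the "box" in which the actor places its answer.
class NextId(val response: CompletableFuture<Int>)

class MailboxIdProvider {
  private val mailbox = LinkedBlockingQueue<NextId>()

  init {
    // The actor: a single daemon thread that handles one message at a time.
    thread(isDaemon = true) {
      var nextId = 0 // only this thread ever touches the counter
      while (true) {
        val message = mailbox.take()
        message.response.complete(nextId++)
      }
    }
  }

  fun getNextId(): Int {
    val response = CompletableFuture<Int>()
    mailbox.put(NextId(response)) // send the message...
    return response.get()         // ...and wait until the actor fills the box
  }
}
```

No lock appears anywhere in the class: the mutable counter is safe because it is confined to the actor's thread, and callers only ever talk to it through the queue.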
In order to create the ActorIdProvider we need the actor. However, the cre-
ation must happen in a suspended scope to have access to the CoroutineScope,
so we cannot call the constructor directly. We can simulate it by providing an
invoke function on the companion object, to which we can add whatever mod-
ifiers we need.
The actors provided by the Kotlin coroutines library are quite basic, but
other libraries provide more powerful implementations. Erlang and Elixir, for
example, introduce the concept of supervision trees to ensure that actors are
correctly re-started when any of them crashes.
9.2 Software Transactional Memory
Another interesting solution to shared mutable state is STM, short for Software
Transactional Memory. This name describes the goal of the technique:
bring the guarantees usually coming from database transactions into the realm
of regular programming. STM guarantees three of the four ACID properties:
transactions are run atomically, they have a consistent view of the world, and
are executed in isolation one from another. Only durability is left out, because
the memory is not backed up by any storage.
STM is not part of Kotlin’s standard library. The API discussed in this section
corresponds to Arrow Fx, Arrow’s companion library to the coroutine system.
Let’s implement IdProvider using STM. The first step is creating a trans-
actional variable, or TVar, which is protected by all the guarantees mentioned
above. Alas, we cannot simply do,
class STMIdProvider (
val nextId: TVar<Int> = TVar.new(0)
): IdProvider
3 Yet another example of this philosophy of marking everything which may have side effects with
suspend, to give the control over execution back to the programmer.
The other side of the coin is manipulating this transactional variable. The
important concept here is a transactional boundary, a block in which STM is
in context. A transaction is executed using atomically, and such a block is
taken as a whole transaction to which the ACI properties apply. This STM class
provides the required methods for reading and writing TVars but, more
importantly, outside of that context no changes are allowed.
4 Not taking overflow into consideration.
The key question to ask yourself when using STM is: what is a set of changes
which moves from a correct state into another correct one? In other words,
which changes must be performed atomically to guarantee that at every step
in your application the data is not corrupted? Those linked changes should
then be written as a single transaction, to ensure that the STM runtime executes
them with ACI guarantees.
Imagine that our system for trading cards grows to the point in which we need
to introduce some kind of caching. In particular, we want to keep an in-memory
view of the identifiers of the cards that each user is offering, to make matches
happen as quickly as possible. One very naïve approach is keeping a couple
of maps Map<UserId, List<CardId>> and Map<CardId, List<UserId>>
– listing the cards each user has and the users that have each card, respectively
– as transactional variables. Using STM we can guarantee that, even though
we need to update two variables for each insertion or deletion of a (user, card)
pair, those two maps never run out of sync with each other.
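For contrast, here is what "the two maps never run out of sync" can look like without STM at all: both maps live in one immutable snapshot, and an AtomicReference swaps whole snapshots with a compare-and-set retry loop. This is not Arrow's STM. CardCache, OfferedCards, and this pure addPair are hypothetical, with String standing in for UserId and CardId.

```kotlin
import java.util.concurrent.atomic.AtomicReference

// Both views are part of the same immutable value, so they change together or not at all.
data class CardCache(
  val cardsByUser: Map<String, List<String>> = emptyMap(),
  val usersByCard: Map<String, List<String>> = emptyMap()
)

// Returns a new map in which `value` has been appended to the list stored under `key`.
fun <K, V> Map<K, List<V>>.addPair(key: K, value: V): Map<K, List<V>> =
  this + (key to (this[key].orEmpty() + value))

class OfferedCards {
  private val state = AtomicReference(CardCache())

  // updateAndGet re-runs the lambda if another thread changed the state in between.
  fun offer(user: String, card: String) {
    state.updateAndGet { cache ->
      CardCache(
        cardsByUser = cache.cardsByUser.addPair(user, card),
        usersByCard = cache.usersByCard.addPair(card, user)
      )
    }
  }

  fun snapshot(): CardCache = state.get()
}
```

The price of this simplicity is that every update, no matter which user or card it concerns, competes for the same reference and copies the maps. Transactional variables let independent updates proceed without stepping on each other.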
Alas, keeping the guarantees of STM is not free. If too many people update
the list of offered cards at the same time, they all have to fight to get the grasp
on the transactional variables, and be the ones executing their transaction
until the end. To minimize this problem of contention, we can switch to a
different transactional data structure, TMap in this case. A transactional map
“knows” that if two transactions work on different keys in the map, there’s no
problem in executing them concurrently; and this opens the door to a massive
improvement in performance.
Looking at the problem a bit more generally, we are making an insertion
into two different TMap<K, List<V>>, but the code to do so is the same in
both cases. Thus we can split that part into its own function.
For insertion we need to proceed in two steps: first we ensure that the key
is there, and then we add the new element to the already-existing list for that
value. We could have made the logic a bit more involved, by adding a singleton
list with the value when the key was not present, instead of adding an empty
list and then unconditionally modifying it. What you ought to remember is that,
regardless of the choice, STM guarantees that we never get into an inconsistent
state.
We can finally implement the cache by calling addPair twice,
Actors and STM bring to the small scale of development abilities which are
usually provided at higher levels. As discussed above, the guarantees provided
by STM mirror those from the database world. In fact, if you want mutable and
durable state, a database is the way to go.
Actors are closely related to messaging queues, such as Apache Kafka5 or
RabbitMQ,6 in which a collection of messages is handled by several processors.
If you want to move from the application level and have different services
handling messages, with guarantees of delivery and ordered handling, scaling
actors to full-fledged processors is the way to go.
Very close to channels, as used by actors, Kotlin introduces the notion of
flow. Later in the book we discuss how flows provide yet another way to handle
state in an application.
5 kafka.apache.org
6 rabbitmq.com
10 Property-based testing
For this first section we’ll use a simplified version of the Attack type defined
in Our domain language. The main change is that PowerType is no longer
defined by a two-level hierarchy; instead every value is directly part of an enu-
meration.
val name: String,
val cost: List<PowerType>,
val damage: Int
)
As part of our job, we’ve received a database of cards where attack names
have different capitalization, or their cost sometimes appears as “water, fire,
water”. We want to normalize the data by making all attack names Title Case,
and ordering the power types in their cost.
Unit tests, by definition, only test one particular example. The usual proce-
dure is to write several of those, trying to cover “regular” cases, and different
sorts of corner cases. In any case, you need to handcraft both the inputs and
the outputs. Property-based testing (PBT) has a different take: let’s generate
lots of different inputs, and explain when the output is correct. We call such
an explanation a property of the code under test, hence the name of the tech-
nique.
1 kotest.io
Here are two different properties that we expect normalize to satisfy. The
first one states that the letters in the attack name don’t change by putting
everything in lowercase; the second one states that the types of energies don’t
change. In Kotest properties are introduced by checkAll, indicating what is
the type of inputs. When the test is run, Kotest generates randomized values
of that type, and executes the block afterwards; it’s a bit like running a ton of
unit tests.
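Stripped of the framework, a property check is little more than a loop. The sketch below hand-rolls one with kotlin.random so that the moving parts are visible. The normalize implementation, the extra PowerType values, and the helpers randomAttack and firstCounterexample are all assumptions made for this example, and none of it is Kotest's API.

```kotlin
import kotlin.random.Random

// FIRE and WATER appear in the text; the other values are placeholders.
enum class PowerType { FIRE, WATER, AIR, GROUND }
data class Attack(val name: String, val cost: List<PowerType>, val damage: Int)

// Hypothetical normalize: Title Case for the name, sorted cost.
fun normalize(attack: Attack): Attack = attack.copy(
  name = attack.name.split(" ").joinToString(" ") { word ->
    word.lowercase().replaceFirstChar { it.uppercase() }
  },
  cost = attack.cost.sorted()
)

fun randomAttack(random: Random): Attack {
  val alphabet = ('a'..'z') + ('A'..'Z') + ' '
  val name = CharArray(random.nextInt(0, 11)) { alphabet.random(random) }.concatToString()
  val cost = List(random.nextInt(0, 5)) { PowerType.values().random(random) }
  return Attack(name, cost, random.nextInt(0, 200))
}

// Runs the property on many random attacks; returns the first counterexample, if any.
fun firstCounterexample(
  iterations: Int = 1000,
  random: Random = Random(42),
  property: (Attack) -> Boolean
): Attack? {
  repeat(iterations) {
    val attack = randomAttack(random)
    if (!property(attack)) return attack
  }
  return null
}
```

The two properties from the text then read firstCounterexample { normalize(it).name.lowercase() == it.name.lowercase() } and firstCounterexample { normalize(it).cost.sorted() == it.cost.sorted() }, and both should come back null. What a real framework adds on top of this loop is better generators, edge cases, and readable failure reports.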
Note that none of the properties are perfect definitions of what the function
ought to do. This is not a bad thing, we developers often fall prey to over-
specifying the results. In the unit test above we’ve specified what the result
of normalize should exactly be, but maybe an implementation which doesn’t
order the energy types but ensures that equal types appear consecutively is
equally good. PBT forces us to take a higher-level view on the desired behavior.
There’s a risk in the opposite direction, though: making properties repeat parts
of the actual implementation instead of finding a different specification.
The same property can be tested in two different ways. Randomized test-
ing is the most common approach, and the one used by default by Kotest. An-
other possibility is to use exhaustive testing, in which all values of a particular
type are tested. Obviously, the latter cannot be used for types with an infinite
amount of variation, like a list of unbounded length. Note also that random-
ized testing is not fully random; most (good) PBT frameworks try to exercise
known corner cases in every run. For example, empty lists, or integers which
are very close to overflowing.
It’s not surprising that PBT emerged in the FP community, since many of the
features of those languages come together in most PBT frameworks. First of
all, for PBT we need to have the ability to run a function multiple times, and a
pure function is the best-case scenario. Second, types help generating values
to run the test over; in the example above we wrote checkAll<Attack>, and
Kotest took care of the rest.
Let’s break the test for a moment, for example by stating that after normaliza-
tion the name should stay the same as before. When you run that property,
Kotest (rightly) screams at you:
The second attack name has purposely been left to run off the page, to
emphasize that random strings can get pretty wild. On the other hand, the very
first attack reported by Kotest shows a very small argument: a single-letter name,
no power cost. This smaller attack still fails the property test; the failure is
explained as part of the AssertionFailedError. This is a part of a process
called shrinking, available in most PBT frameworks, in which the framework
takes a failing test and tries to prune unnecessary data, with the goal of
removing noise and helping the developer identify the root cause of the
problem. This process is directed by heuristics like removing elements from a list,
decreasing the value of integers, and so on.
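One of those heuristics, removing elements from a list, fits in a dozen lines. The sketch below is a greedy toy version, far simpler than what a real framework does, and shrinkList is a name made up for the occasion: it keeps dropping single elements for as long as the smaller list still fails the property.

```kotlin
// Greedy shrinking: repeatedly drop one element as long as the result still fails.
fun <T> shrinkList(failing: List<T>, stillFails: (List<T>) -> Boolean): List<T> {
  var current = failing
  var progress = true
  while (progress) {
    progress = false
    for (i in current.indices) {
      val candidate = current.filterIndexed { index, _ -> index != i }
      if (stillFails(candidate)) {
        current = candidate
        progress = true
        break // restart the scan on the smaller list
      }
    }
  }
  return current
}
```

For example, if the property under test is "no element is greater than 10", then shrinkList(listOf(1, 50, 3, 20)) { list -> list.any { it > 10 } } ends with listOf(20): every other element was noise as far as this failure is concerned.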
There’s still a question in the air: are those weird names really
representative of the actual input we’ll give to normalize? Maybe this function
runs after some validation stage in which we ensure that damage is positive,
and names only contain English characters. Moving those constraints into the
tests helps us (1) treat more realistic scenarios, and (2) generate better
random values. For that we need to move from the default (random) generators
to custom generators; in Kotest they are specified by a different overload of
checkAll.
In the example above we are mixing both modes of generation. The first
checkAll uses type parameters to tell Kotest to generate random lists
of powers for the cost. The second checkAll generates names and damages:
for the former we want strings from 1 to 10 characters, whose characters come
from the ASCII A-to-Z block. The second argument to this second checkAll
specifies that only positive integers should be generated for the damage. It’s
customary in Kotest to have those custom generators live in the companion
object of Arb – with the important exception of Codepoint for random
characters.
Generators expose an API very close to collections and sequences. That
way we can further refine the values being generated. In our TCG, damages may
only be multiples of five; we achieve this by replacing Arb.positiveInt()
with
Arb.positiveInt().filter { it % 5 == 0 }
Producing better values is the best option to make your test inputs as close
as possible to the realistic ones. Alas, it’s not always possible or easy to specify
constraints as part of generation; in that case we can add assumptions, which
are tests that must be true for the test to run. The following code brings the
“multiple of 5” constraint as one such assumption.
Note however the difference between filter and assume. With the latter we
produce values which we later discard if the assumption is false. In the
worst-case scenario, almost every generated value is discarded later, leading to
problematic property suites. For example, Arb.constant(234) always generates
the value 234 as input. But if we instead do assume(value == 234), the test
almost never runs: integers have a wide range, and the probability of exactly
producing 234 is extremely low.
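A quick simulation makes the difference tangible. The sketch below assumes values drawn uniformly from 1 to 1000, a much narrower range than a real integer generator, and uses a hypothetical acceptanceRate helper to measure how many generated values survive a check applied after generation.

```kotlin
import kotlin.random.Random

// Fraction of generated values that survive a check applied after generation.
fun acceptanceRate(samples: Int, random: Random, accept: (Int) -> Boolean): Double {
  val kept = (1..samples).count { accept(random.nextInt(1, 1001)) }
  return kept.toDouble() / samples
}
```

Roughly one value in five survives it % 5 == 0, while only about one in a thousand survives it == 234, and that is with a range capped at 1000. Building the value directly, as in random.nextInt(1, 201) * 5 for multiples of five, discards nothing at all.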
Normalizing a few fields is a good testbed for Kotest properties, but things
get more interesting when the implementation is a bit more involved. In this
section we look at testing binary search trees (BSTs), a data structure which
holds values in such a way that looking for an element can be performed in
logarithmic time, instead of linear time as is the case with lists. The skeleton
of the type is quite similar to the BinaryTree we introduced back when talking
about functors.
val left: BST<A>, val value: A, val right: BST<A>
): BST<A>
The defining property of BSTs is that on every Branch the elements in the left
subtree are smaller than the value, and those in the right subtree are larger.
This property can be used later on to guide search, potentially halving the
search space at each Branch, since you know whether you should go left or
right. Insertion in the BST should be engineered to respect this property.
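As a point of reference, one straightforward way to write such an insertion is sketched below. The Leaf and Branch shapes follow the text, but the exact declarations, the choice to drop duplicates, and the contains and elements helpers are assumptions made for this example.

```kotlin
sealed interface BST<out A>
object Leaf : BST<Nothing>
data class Branch<out A>(val left: BST<A>, val value: A, val right: BST<A>) : BST<A>

// Smaller values go left, larger ones go right, duplicates are dropped.
fun <A : Comparable<A>> BST<A>.insert(x: A): BST<A> = when (this) {
  is Leaf -> Branch(Leaf, x, Leaf)
  is Branch -> when {
    x < value -> Branch(left.insert(x), value, right)
    x > value -> Branch(left, value, right.insert(x))
    else -> this
  }
}

// Search follows a single path, discarding one subtree at every Branch.
fun <A : Comparable<A>> BST<A>.contains(x: A): Boolean = when (this) {
  is Leaf -> false
  is Branch -> when {
    x < value -> left.contains(x)
    x > value -> right.contains(x)
    else -> true
  }
}

// In-order traversal; for a valid BST the result is sorted.
fun <A> BST<A>.elements(): List<A> = when (this) {
  is Leaf -> emptyList()
  is Branch -> left.elements() + value + right.elements()
}
```

Inserting 5, 3, 8, 1, 4 and then 5 again into an empty tree yields a tree whose elements() are listOf(1, 3, 4, 5, 8). "The in-order traversal is sorted" is itself a nice candidate for a property.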
Using insert we can make BST implement the Collection interface from
the standard library. We also introduce a union extension method that puts
together two BSTs.
@Test suspend fun `union with an empty tree`() {
  checkAll<BST<Int>> { t ->
    (t union Leaf) shouldBe t
  }
}
Alas, running this test usually leads to a stack overflow. The default genera-
tor doesn’t know when to stop using Branch, leading to more subtrees being
generated, each of them making the stack grow bigger. The solution is to write
our own custom generator.
The source of the problem is that generation of BSTs doesn’t know when to
stop. We are going to fix the problem by writing a custom generator with a
depth. We follow the conventions in Kotest and make this new bst generator
an extension of Arb’s companion object.
The Arb includes all the facilities to specify how random generation should
proceed. Our bst generator has three cases:
1. When the maximum depth is smaller than 0, that means that we shouldn’t
grow more trees, but only return Leafs. The Arb.constant method in-
volves no randomness: it always returns the same value.
2. When 0 is an allowed depth – in other words, when empty trees may be
generated – we want to randomly generate Leafs and Branches. The
Arb.choose combinator allows us to specify the relative probabilities
of a list of cases: in our code we generate Leafs twice as often as we
generate Branches.
3. When the depth to generate is larger than 0, we only generate Branches.
Since this generation happens in both this and the previous case, we’ve
extracted the common code as the bstBranch function.
We’ve already discussed how generators expose a very similar API to col-
lections; the good news is that the rest of their API is very close to Either’s.
The implementation of bstBranch uses bind, which is just another name for
our zip. We recursively generate subtrees with smaller sizes, and a value in
between both. This is the reason we had a gen argument in the first place: we
need to randomly generate values of a generic type A.
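The essence of a depth-bounded generator can be shown with nothing but kotlin.random. The sketch below is not written against Kotest's Arb: randomTree and depth are made-up names, it only takes a maximum depth, and it randomizes the shape of the tree without trying to keep the values ordered.

```kotlin
import kotlin.random.Random

sealed interface BST<out A>
object Leaf : BST<Nothing>
data class Branch<out A>(val left: BST<A>, val value: A, val right: BST<A>) : BST<A>

// Random tree whose depth never exceeds maxDepth. Below the limit, a Leaf is
// picked twice as often as a Branch (weights 2 and 1).
fun <A> randomTree(random: Random, maxDepth: Int, gen: (Random) -> A): BST<A> =
  if (maxDepth <= 0 || random.nextInt(3) < 2) Leaf
  else Branch(
    randomTree(random, maxDepth - 1, gen),
    gen(random),
    randomTree(random, maxDepth - 1, gen)
  )

fun <A> BST<A>.depth(): Int = when (this) {
  is Leaf -> 0
  is Branch -> 1 + maxOf(left.depth(), right.depth())
}
```

The decreasing maxDepth argument is what guarantees termination: every recursive call gets a strictly smaller budget, so the stack can never grow past the initial depth.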
The custom generator is ready to go! The final step is using it in the test,
changing from the type-directed variant of checkAll to the one where the
generator is explicitly given.
Data structures like BSTs are interesting case studies for property-based
testing because of the many interesting properties we can write, without over-
specifying the behavior. Another invariance property is that elements con-
tained in a couple of trees and their union should be the same.
@Test suspend fun `union keep the same elements`() {
  checkAll(Arb.bst(Arb.int()), Arb.bst(Arb.int())) { t1, t2 ->
    // the infix shouldBe must stay on the same line as its left operand
    (t1.elements().toSet() union t2.elements().toSet()) shouldBe
      (t1 union t2).elements().toSet()
  }
}
In many cases we need a few auxiliary definitions to better express our proper-
ties, like elements in the code above. Try to keep those definitions obviously
correct, that is, so simple that a code review could catch problematic behavior.
10.2.2 Laws
You can define a map function for BSTs, very similarly to what we did for regular
binary trees,
fun <A, B> BST<A>.map(f: (A) -> B): BST<B> = when (this) {
  is Leaf -> Leaf
  is Branch -> Branch(left.map(f), f(value), right.map(f))
}
Actually, we expect this property to hold for any implementation of a map
function, for any type which is a functor. For that reason we enshrine it as a law, a
property which is shared by many different types. In particular, map respecting
identity is one of the two “functor laws”; the other is that mapping twice is the
same as mapping once with the two functions composed.
interface Semigroup<A> {
  infix fun combine(other: A): A
}
The single law for semigroups is called associativity, and states that if you
combine three elements parentheses don’t matter. We can express this law
more concretely in code, completely generic over the semigroup we are testing.
@Test
suspend inline fun <reified A: Semigroup<A>> associativity() {
  checkAll<A, A, A> { x, y, z ->
    // the infix shouldBe must stay on the same line as its left operand
    ((x combine y) combine z) shouldBe
      (x combine (y combine z))
  }
}
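To see the law accept one instance and reject another, here is a small sketch without any testing framework. Max, Minus, and associativeOn are invented for this example; the Semigroup interface is the one from the text.

```kotlin
interface Semigroup<A> {
  infix fun combine(other: A): A
}

// Hypothetical instances: taking the maximum is associative, subtraction is not.
data class Max(val value: Int) : Semigroup<Max> {
  override infix fun combine(other: Max): Max = Max(maxOf(value, other.value))
}

data class Minus(val value: Int) : Semigroup<Minus> {
  override infix fun combine(other: Minus): Minus = Minus(value - other.value)
}

// The associativity law, checked on one particular triple of values.
fun <A : Semigroup<A>> associativeOn(x: A, y: A, z: A): Boolean =
  ((x combine y) combine z) == (x combine (y combine z))
```

For Minus the triple 1, 2, 3 is already a counterexample: (1 - 2) - 3 is -4, whereas 1 - (2 - 3) is 2. The compiler happily accepts Minus as a Semigroup, and that is the gap law testing is meant to close.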
10.3 Testing services
interface CardService {
  suspend fun cardById(id: String): Card?
}
This might be more than enough if you’re just testing transformations of data
where the actual values in the card don’t matter that much. However, in
another test you might actually be testing for some known identifiers to be found;
in that case a different mock is more useful.
5 Testcontainers, which use containers – as in Docker – to bring up real services as part of tests,
are becoming more popular in the JVM world.
class MapCardService(val cards: Map<String, Card>): CardService {
  override suspend fun cardById(id: String): Card? =
    cards.get(id)
}
The general rule is to try to make your mock exhibit the behavior under test,
and nothing more. Since creating those classes is (usually) cheap, don’t fall
into the trap of having a complex, multi-parametrizable, mock class.
We haven’t described how to use those fake services in our tests, but the
truth is that there’s nothing different from injecting them in any other situation.
Your tests are now wrapped with a first layer of with,
Mocking faults
This allows us to cover all those branches in the application which only exe-
cute when a problem arises. If we have more than one service, we can test
different combinations of failures to check the resilience of our application;
this is usually harder to achieve if you are communicating with real systems
like databases.
Here it comes, the only other time in which using (a bit of) mutability is accept-
able. Imagine that as part of our TCG platform we want to keep track of which
cards are banned.6 We want our service to allow both banning and un-banning
a card.
interface BanService {
  suspend fun MonsterCard.ban(): Unit
  suspend fun MonsterCard.unban(): Unit
  suspend fun MonsterCard.banned(): Boolean
}
class InMemoryBanService(
  val bannedCards: MutableSet<String> = mutableSetOf()
): BanService {
  override suspend fun MonsterCard.ban() {
    bannedCards.add(id)
  }
  override suspend fun MonsterCard.unban() {
    bannedCards.remove(id)
  }
  override suspend fun MonsterCard.banned(): Boolean =
    id in bannedCards
}
Since the class exposes its bannedCards field, we can perform the desired set
of steps, and then check that the final state has the properties we want.
6 Almost every competitive TCG has a ban list: a list of “buggy” cards which are accidentally too
powerful. Since including any of those cards gives so big an advantage, players are often
forbidden to use them in official tournaments. Even more, if they weren’t banned, the game
would turn boring.
Part II Advanced techniques
11 Actions as data
The trading card game outlined at the beginning of the book is a very simple
one. More realistic games usually involve flipping coins for some amount of
randomness, attacks or abilities to draw additional cards, and many other ac-
tions over the game. This poses an interesting question: how do we model
those actions? This chapter provides an answer based on sealed hierarchies,
and compares it to one based on interfaces.
┌────────────────────────┐
│ Yeti │
├────────────────────────┤
│ Body: 150 points │
│ │
│ Attacks: │
│ [G] Sight 50 │
│ Flip a coin. If │
│ tails, the attack │
│ does nothing. │
│ │
│ ID: A─17 │
└────────────────────────┘
Let’s start by adding one new action to our attacks, coin flips. The Yeti –
shown above – is an example which uses that action to decide
whether the attack does any damage.
The bulk of the Attack type defined at the beginning of the book remains
the same; we just replace the damage field with a more complex action. The
goal of this chapter is understanding how that latter piece of data is repre-
sented.
Damage must be one such action – every attack definable in the previous
version of Attack must still be possible to write.
For the coin flip we need to consider two cases: the coin lands heads, or it lands tails.
Those are represented as two different fields, describing each possibility.
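A minimal sketch of the hierarchy described so far could look as follows. The names ifHeads and ifTails match the usages in this chapter, whereas the amount field of Damage is an assumption.

```kotlin
// The root of the sealed hierarchy of actions.
sealed interface Action

// An attack that simply deals a fixed amount of damage
// (the field name `amount` is assumed).
data class Damage(val amount: Int) : Action

// A coin flip, with one follow-up action per possible outcome.
data class FlipCoin(val ifHeads: Action, val ifTails: Action) : Action

// A single flip: 50 points on heads, nothing on tails.
val singleFlip: Action = FlipCoin(ifHeads = Damage(50), ifTails = Damage(0))
```

Since every branch is itself an Action, flips can be nested as deep as needed, and values of this type naturally form a tree.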
We are now ready to define the Sight attack (here only the action part is pro-
vided).
val sightAction: Action =
    FlipCoin(ifHeads = FlipCoin(ifHeads = Damage(50),
                                ifTails = Damage(0)),
             ifTails = Damage(0))
In fact, data types which define a behavior using a tree are called abstract
syntax trees (ASTs). The same technique is used within compilers to represent
programs; if you look at the source code of the Kotlin compiler, there’s a (big)
sealed hierarchy, with one choice for each element in the language.
The next step is defining functions which execute such a trace. The most
direct one is to simulate the coin flip by randomly generating a Boolean value.1
1
This function is tail recursive, that is, it calls itself (recursive) but when it does so it’s the last
operation in the function (the tail). By annotating the function with the tailrec modifier, we instruct
Kotlin to generate optimized bytecode which doesn’t suffer from stack overflows.
This version, however, poses quite some challenges for testing, due to its inherent
randomness. We need a way to express properties such as “if the coin lands
tails in the first outcome, then the result is 0.” This can be achieved by making
the future no longer an unknown, but a parameter to a simulation function.
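To make both ideas concrete, here is a sketch of the two interpreters side by side, under the assumption that Action has the two cases discussed so far. The names execute and simulate, and the encoding of outcomes as Booleans (true meaning heads), are choices made for this example.

```kotlin
import kotlin.random.Random

sealed interface Action
data class Damage(val amount: Int) : Action
data class FlipCoin(val ifHeads: Action, val ifTails: Action) : Action

// Random interpreter: each flip is decided by a freshly generated Boolean.
tailrec fun execute(action: Action): Int = when (action) {
    is Damage -> action.amount
    is FlipCoin -> execute(if (Random.nextBoolean()) action.ifHeads else action.ifTails)
}

// Deterministic interpreter: the outcomes (true = heads) are given up front.
// It throws NoSuchElementException if the list runs out of outcomes.
tailrec fun simulate(action: Action, flips: List<Boolean>): Int = when (action) {
    is Damage -> action.amount
    is FlipCoin -> simulate(
        if (flips.first()) action.ifHeads else action.ifTails,
        flips.drop(1)
    )
}
```

With simulate, the property from the text becomes a plain assertion: for the Sight action, simulate(sightAction, listOf(false)) should be 0, no matter how often we run it.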
There are a few patterns which are commonly found when describing actions
or behavior as data. In this section we take a close look at some of those.
Let’s have a look at another card, which features not only conditionals in the
form of coin flips, but also a loop.
┌────────────────────────┐
│ San Fermin Bull │
├────────────────────────┤
│ Body: 70 points │
│ │
│ Attacks: │
│ [GA] Stump │
│ Flip coins until you │
│ get tails. Damage is │
│ 20 times the amount │
│ of heads. │
│ │
│ ID: A─02 │
└────────────────────────┘
Following our intuition of values of type Action being trees, this attack
requires an infinite tree, as we need to account for every potential streak
of heads. To build such a tree we are going to use a recursive function which
accumulates the value up to that point.
Kotlin’s syntax helps us here, since the only change required in the usage sites
is to wrap the action in curly braces to create the functions.
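One way to sketch this, assuming the fields of FlipCoin become functions without arguments, is shown below. The stump function and its accumulated parameter are names chosen for this example; the simulate helper only exists to observe the result.

```kotlin
sealed interface Action
data class Damage(val amount: Int) : Action

// Each branch is now a function, so it is only built when actually needed.
data class FlipCoin(val ifHeads: () -> Action, val ifTails: () -> Action) : Action

// Conceptually an infinite tree: every heads adds 20 points and flips again.
fun stump(accumulated: Int = 0): Action = FlipCoin(
    ifHeads = { stump(accumulated + 20) },
    ifTails = { Damage(accumulated) }
)

// Deterministic interpreter; true stands for heads.
tailrec fun simulate(action: Action, flips: List<Boolean>): Int = when (action) {
    is Damage -> action.amount
    is FlipCoin -> simulate(
        if (flips.first()) action.ifHeads() else action.ifTails(),
        flips.drop(1)
    )
}
```

Without the curly braces the call to stump inside ifHeads would recurse forever while building the value; with them, only the branch selected by each outcome is ever unfolded.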
Flipping a coin is not the only additional action in our game; we may also
draw or discard cards as a result of an attack. Those actions are different from
flipping a coin in that we need to add additional information about which card
to discard, or reflect on the card that has been drawn, so we can implement
actions such as the following.
┌────────────────────────┐
│ Leprechaun │
├────────────────────────┤
│ Body: 30 points │
│ │
│ Attacks: │
│ [] Treasure Hunt │
│ Draw a card. If it │
│ is not a Power card, │
│ discard it. │
│ │
│ ID: A─11 │
└────────────────────────┘
data class Discard(val card: Card, val next: () -> Action): Action
Drawing a card, on the other hand, gives back some piece of data – which
card has been drawn. We cannot use the approach described for FlipCoin
of having one field for every possible outcome of the action, since they are
potentially infinite. Instead, we represent this range of possibilities using a
function.
In fact, any action which produces a value can be modeled as a data class
with a function which consumes said value. We could even rewrite FlipCoin
to follow this pattern.
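The following sketch puts the three primitive actions in this continuation-based shape. The card model is a hypothetical, heavily simplified one, and FlipOutcome is assumed to be an enumeration with the two obvious values.

```kotlin
enum class FlipOutcome { HEADS, TAILS }

// Hypothetical card model, only detailed enough for the example.
sealed interface Card
data class MonsterCard(val name: String) : Card
data class PowerCard(val type: String) : Card

sealed interface Action
data class Damage(val amount: Int) : Action

// Actions which produce a value hold a function consuming that value.
data class FlipCoin(val next: (FlipOutcome) -> Action) : Action
data class DrawCard(val next: (Card) -> Action) : Action

// Discarding produces nothing, so its continuation takes no arguments.
data class Discard(val card: Card, val next: () -> Action) : Action

// A single flip in the new style: 50 points on heads, nothing on tails.
val singleFlip: Action = FlipCoin { outcome ->
    when (outcome) {
        FlipOutcome.HEADS -> Damage(50)
        FlipOutcome.TAILS -> Damage(0)
    }
}
```

The two fields ifHeads and ifTails have turned into the two branches of a when inside the function, which carries the same information in a different shape.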
Taking advantage of the fact that final arguments which are functions can
be placed outside the parentheses in Kotlin, we arrive at a quite readable translation
of the attack.
val treasureHunt: Action =
    DrawCard { card ->
        when(card) {
            is PowerCard -> Damage(0)
            else -> Discard(card) { Damage(0) }
        }
    }
Here the continuation is the next function you pass around to describe the
following operation. Although not so common in code written by humans, code
in this style is often produced by compilers; the transformation of suspend
functions into JVM ones is done in a very similar fashion.
Derived actions
fun flipUntil(
    stop: (FlipOutcome) -> Boolean,
    next: (List<FlipOutcome>) -> Action
): Action
fun flipUntil(
    stop: (FlipOutcome) -> Boolean,
    next: (List<FlipOutcome>) -> Action
): Action {
    fun worker(accumulatedOutcomes: List<FlipOutcome>): Action =
        FlipCoin { outcome ->
            if (stop(outcome))
                next(accumulatedOutcomes)
            else
                worker(accumulatedOutcomes + listOf(outcome))
        }
    return worker(emptyList())
}
The only caveat is that the syntax becomes a bit convoluted once the actions
become bigger. Fortunately, there’s a way to get a nicer language using our
favorite tool, suspend!
11.2.1 More generic, more monadic
The current shape of Action forces every instance to end with a Damage value,
that is, with some integral amount. It’s useful, though, to be able to describe
Actions with other return types; this gives us the ability to abstract common
patterns in different cards and compose larger Actions with them. Code-wise,
the change is quite minor: we simply add a type parameter to Action and
“upgrade” the Damage class to accept any type of value.
The rest of the cases stay as they were before, except for the introduction of
type parameters.
One simple example is creating a derived action which flips a coin and
returns whether it lands heads.
Note the difference with the flipUntil derived action defined above: here
we don’t have a continuation as argument, we directly “return” the value. Alas,
in the current incarnation there’s nothing you can do with such a heads sub-
action, since there’s no way to thread the Boolean you get to another compu-
tation.
The solution is to introduce a function which does exactly that: combine
one Action and a continuation to form a larger Action.
Before discussing the implementation, let’s see the potential this function
brings to the table by looking at the “Stump” attack from “San Fermín Bull”.
Here we are able to compose any step in a uniform fashion using then. This
includes stumpAction itself,2 so we no longer need complicated recursion
with an accumulator.
To understand the implementation of then, we should think of an Action
in a similar way to how we think of a linked list:3 we either have a primitive action
(flip coin, draw card, or discard) followed by the rest of the action, or we’re
finished with Done. In that sense, then is like concatenating two actions:
• If we are at the end of the list, described by Done, then we need to con-
tinue with the second action. At that point we have a value we can feed
to other to obtain the action.
• In any other case, we copy the current primitive action and then keep
concatenating recursively.
is DrawCard -> DrawCard { card -> next(card) then other }
is Discard -> Discard(card) { next() then other }
}
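To see all the pieces of this section working together, here is a self-contained sketch restricted to coin flips. The result field of Done, the exact signature of then, and the simulate helper are assumptions; the DrawCard and Discard cases would follow the same pattern as the branches shown above.

```kotlin
enum class FlipOutcome { HEADS, TAILS }

sealed interface Action<A>
// Final step of an action, carrying its result (field name assumed).
data class Done<A>(val result: A) : Action<A>
data class FlipCoin<A>(val next: (FlipOutcome) -> Action<A>) : Action<A>

// Concatenates an action with a continuation for its result.
infix fun <A, B> Action<A>.then(other: (A) -> Action<B>): Action<B> = when (this) {
    is Done -> other(result)
    is FlipCoin -> FlipCoin { outcome -> next(outcome) then other }
}

// Derived action: flips a coin and "returns" whether it landed heads.
val heads: Action<Boolean> =
    FlipCoin { outcome -> Done(outcome == FlipOutcome.HEADS) }

// Stump: 20 points per heads, until the first tails.
fun stumpAction(): Action<Int> = heads then { isHeads ->
    if (isHeads) {
        stumpAction() then { rest -> Done(rest + 20) }
    } else {
        Done(0)
    }
}

// Deterministic interpreter used to observe the result.
tailrec fun <A> simulate(action: Action<A>, flips: List<FlipOutcome>): A = when (action) {
    is Done -> action.result
    is FlipCoin -> simulate(action.next(flips.first()), flips.drop(1))
}
```

Note how stumpAction calls itself through then, with no accumulator in sight: the running total is rebuilt on the way back, one Done(rest + 20) per heads.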
Once you know about them, monads pop up everywhere. Lists feature a flatMap
function with the correct signature, and you can write similar functions for
nullable types and Deferreds.
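As a small illustration, the sketch below contrasts the flatMap from the standard library with a hand-written counterpart for nullable types. The name flatMapNullable is made up for this example, to avoid clashing with the operation on collections.

```kotlin
// Assumed name: a flatMap-like operation for nullable values.
fun <A : Any, B : Any> A?.flatMapNullable(next: (A) -> B?): B? =
    if (this == null) null else next(this)

// A computation which may fail: halving only works for even numbers.
fun half(n: Int): Int? = if (n % 2 == 0) n / 2 else null

// Lists: every element produces a list, and the results are concatenated.
val pairs: List<String> = listOf(1, 2).flatMap { n -> listOf("a$n", "b$n") }

// Nullable types: the chain stops at the first null.
val quarter: Int? = 12.flatMapNullable(::half).flatMapNullable(::half)
val failed: Int? = 6.flatMapNullable(::half).flatMapNullable(::half)
```

In both cases the second argument plays the role of the continuation given to then: it receives the value produced by the first computation and decides how to go on.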
Several languages in the FP space feature special syntax to work with mon-
ads; the basic idea being that you don’t need to write thens explicitly. This
alleviates the boilerplate while keeping the doors open to any new implemen-
tation. Haskell (with do notation) and Scala (with for comprehensions) are
the main examples. The caveat to monads is that you need to “duplicate”
many functions; for example in Haskell map iterates over a list applying a pure
function, but if you need a monadic one you need to switch to traverse.
12
ˎ Suspended state machines
If you have the feeling that the encoding of actions – or in general syntax – as
data described in the previous chapter is unidiomatic, you’re completely
right. In the rest of the book we’ve tried to use interfaces and contexts to
describe services or effects, and we would like to keep the same approach
here. Fortunately, the suspend mechanism in Kotlin gives us enough power to
bridge both worlds, by encoding actions as state machines.
Before diving into the actual implementation, let’s describe the interface we
want to provide to describe our actions. This is literally an interface, with a
method corresponding to each of the primitive actions in our language.
interface ActionScope {
    suspend fun flipCoin(): FlipOutcome
    suspend fun drawCard(): Card?
    suspend fun discard(card: Card): Unit
}
context(ActionScope) suspend fun sightAction(): Int =
    when (flipCoin()) {
        FlipOutcome.HEADS -> 50
        FlipOutcome.TAILS -> 0
    }
The change is more noticeable for more complex actions. For example, we
can implement “Stump” from the “San Fermín Bull” using loops and variables,
instead of nesting constructors and recursion.
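A version along those lines could look like the sketch below. To avoid depending on the experimental context receivers feature, it uses ActionScope as an extension receiver instead; the ScriptedScope class and the stumpDamage runner are hypothetical helpers which replay a fixed list of outcomes.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

enum class FlipOutcome { HEADS, TAILS }

// Only the coin flip is needed here; the interface in the text has more methods.
interface ActionScope {
    suspend fun flipCoin(): FlipOutcome
}

// Extension receiver used instead of context(ActionScope), so that no
// experimental compiler flag is required.
suspend fun ActionScope.stumpAction(): Int {
    var heads = 0
    while (flipCoin() == FlipOutcome.HEADS) heads++
    return heads * 20
}

// Hypothetical scope which replays a fixed list of outcomes.
class ScriptedScope(outcomes: List<FlipOutcome>) : ActionScope {
    private val remaining = outcomes.iterator()
    override suspend fun flipCoin(): FlipOutcome = remaining.next()
}

// Runs the action against the scripted scope and returns the damage.
fun stumpDamage(outcomes: List<FlipOutcome>): Int {
    var result: Result<Int>? = null
    val block: suspend ActionScope.() -> Int = { stumpAction() }
    block.startCoroutine(
        ScriptedScope(outcomes),
        Continuation(EmptyCoroutineContext) { result = it }
    )
    return checkNotNull(result) { "the action suspended" }.getOrThrow()
}
```

The body of stumpAction reads like the text on the card: keep flipping while you get heads, then multiply.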
However, we can also leverage the buildList function, which creates a scope
where we can add values, which make up a complete list at the end. Internally
buildList uses a much faster implementation than repeatedly concatenating
elements at the front, as we did above.
context(ActionScope) suspend fun flipUntil(
    stop: (FlipOutcome) -> Boolean
): List<FlipOutcome> = buildList {
    var outcome = flipCoin()
    while (!stop(outcome)) {
        add(outcome)
        outcome = flipCoin()
    }
}
The fact that we can encode actions or behaviors in two ways – as data using
classes, and as functions using interfaces and contexts – is well-known in FP
communities. In many cases the former implementation is referred to as the initial
encoding, whereas the implementation using interfaces is known as the final encoding. This
raises the question: when is one preferable over the other?
The usual wisdom is that you should use data whenever you
need to inspect, analyze, or manipulate the description of the behavior, as
opposed to merely executing the behavior. For example, if we define a DSL
to describe SQL queries, we may want to optimize those before sending them
to the database. In that case, an initial encoding of the query language is the
perfect tool for the job.
On the other hand, final encodings usually perform better for execution.
Initial encodings always involve the creation of some intermediate data struc-
ture, and that costs both computation time and memory. By contrast, final
encodings take advantage of the built-in mechanisms for abstracting over be-
havior: interfaces in the Kotlin world, implicits or type classes in Scala and
Haskell, respectively. After all, executing stumpAction is done by simply in-
jecting the corresponding ActionScope, which may perform all the operations
on the spot.
In many cases, final encodings are also superior in the syntax they provide:
the actions in this chapter are defined in a more direct way than their counterparts
in the previous one. This is at least the case in Kotlin, but it’s very
language-dependent: in Haskell you can barely notice the difference between
both kinds of encoding if you use do notation. However, as we shall see in
the following section, you can get the best of both worlds: provide an interface
based on ActionScope, yet obtain an initial encoding as result.
Roughly, the idea is to implement a state machine, where the state represents
the next primitive operation to execute in the computation. The type of states
follows the same structure as Action, but replacing the continuations in the
form of (T) -> Action<A> with the actual Continuation type.
The way you get your hands on such a Continuation is using the suspendCoroutine
operation in the standard library.
1
In its full generality, a Continuation can be resumed both with a value and with an exception.
We are not using that latter feature, but it becomes more relevant when interfacing with side
effectful operations where exceptions may be thrown.
{ flipOutcome ->
    when (flipOutcome) {
        FlipOutcome.HEADS -> 50
        FlipOutcome.TAILS -> 0
    }
}
This is not entirely correct, because in the coroutine system the result of a
computation is also passed to the next continuation, so 50 and 0 would be
wrapped in those. But intuitively, continuations give us the ability to decide
when and how to execute the following steps in a computation.
The state machine works as an interplay of two different elements. On the
one hand we keep the current state in a variable, which we update on each
call to a primitive action. As discussed above, we capture the continuation as
part of the state.
}
}
Let me emphasize that execute is tail-recursive, since the very last thing we do
to interpret each primitive action is to call execute again. In practical terms
that means that the compiler turns this code into a machine loop, instead of
using real method calls, which would risk overflowing the stack.
As you can see, the interaction between the different components is quite
complex. The next figure summarizes this interaction for “Stump”, which flips
coins repeatedly.
The leftmost line represents stumpAction. At some point the flipCoin prim-
itive action is called, leading to a change in the current variable. On the other
side of the diagram we have execute, which repeatedly reads current and
generates the corresponding instance from the initial encoding. The circuit is
closed whenever the continuation within FlipCoin is required; at that point
we go back to the point of execution we were in stumpAction when the first
flipCoin was called. Execution continues until a new suspension point (in
this case, corresponding to another coin flip), and then we repeat the process:
inspect current and generate the next primitive action.
The final piece of the puzzle is how we start the stumpAction computation
(or any other), so that execute knows its first primitive action, and how
we detect that the suspended function representing the action has finished
execution with a result. The solution comes from another operation in the
standard library.
This function is a close relative of the scope function run, which puts a value
as the receiver – the difference is that run takes the receiver first and the block
second, whereas startCoroutine uses the reverse order. The final argument
to startCoroutine is what to do when the function finishes execution. In
our case, we build a Continuation which calls done,
where done is the method responsible for setting the state to Done.
For the sake of simplicity we are assuming that the suspended function repre-
senting an action will never throw an exception, so it’s safe to call getOrThrow.
A better implementation would handle that other possible end state for an ac-
tion in a safer way.
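The pieces described in this section can be assembled into a compact sketch, restricted to coin flips to keep it short. Treat it as one possible arrangement rather than the definitive implementation: the names Machine, Finished, Flipping, and action are assumptions, and this execute builds each step on demand instead of being tail-recursive. Only suspendCoroutine, startCoroutine, Continuation, and resume come from the standard library.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

enum class FlipOutcome { HEADS, TAILS }

// Final encoding: the interface used to write actions.
interface ActionScope {
    suspend fun flipCoin(): FlipOutcome
}

// Initial encoding: the data we want to obtain.
sealed interface Action<A>
data class Done<A>(val result: A) : Action<A>
data class FlipCoin<A>(val next: (FlipOutcome) -> Action<A>) : Action<A>

// States mirror Action, but hold an actual Continuation.
sealed interface State<A>
data class Finished<A>(val result: A) : State<A>
data class Flipping<A>(val k: Continuation<FlipOutcome>) : State<A>

class Machine<A> : ActionScope {
    var current: State<A>? = null

    // Each primitive action captures the continuation as the new state.
    override suspend fun flipCoin(): FlipOutcome =
        suspendCoroutine { k -> current = Flipping<A>(k) }

    fun done(result: A) { current = Finished(result) }

    // Reads the current state and produces the matching Action.
    fun execute(): Action<A> = when (val state = current) {
        is Finished -> Done(state.result)
        is Flipping -> FlipCoin { outcome ->
            state.k.resume(outcome) // runs the block up to its next suspension
            execute()
        }
        null -> error("the block has not been started")
    }
}

// Turns a block written against ActionScope into its initial encoding.
// As in the text, we assume the block never throws.
fun <A> action(block: suspend ActionScope.() -> A): Action<A> {
    val machine = Machine<A>()
    block.startCoroutine(
        machine,
        Continuation(EmptyCoroutineContext) { machine.done(it.getOrThrow()) }
    )
    return machine.execute()
}

// Stump, written with loops, yet obtained as data.
fun stump(): Action<Int> = action {
    var heads = 0
    while (flipCoin() == FlipOutcome.HEADS) heads++
    heads * 20
}
```

Keep in mind that each FlipCoin obtained this way can only be followed once, because calling next resumes the captured continuation.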
@InitialStyleDSL
sealed interface Action<A>
Limitations. The coroutine system is a powerful tool, but not every effect can
be implemented in the manner described above. The reason is that the same
Continuation cannot be resumed more than once; this kind of continuation
is known as one-shot. However, certain effects need to execute the same
suspension more than once.
Lists (or in general, collections) are the main example of such an effect. We’ve
already hinted at them when describing the notion of monad, where flatMap
connects two computations by iterating over all the possible values taken by
the first action. In the same way that we think of Int? as the effect of maybe
not having a number, we can think of List<Int> as the effect of having mul-
tiple values. For that reason people talk about the non-determinism effect.
If we were to describe non-determinism in this framework, the main oper-
ation would take some kind of collection, and execute the rest of the compu-
tation over every element of that collection. In both initial and final style:
2
serranofp.com/inikio. The library has been developed by the author of this book.
interface NonDeterminismScope {
    suspend fun <A> forEvery(elements: Iterable<A>): A
}
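For comparison, an initial-style counterpart could be sketched as follows. The names NonDet, ForEvery, and results are assumptions made for this example; the important detail is that the interpreter calls the continuation once per element, which is exactly what a one-shot Continuation does not allow.

```kotlin
sealed interface NonDet<A> {
    // Collects every possible result of the computation.
    fun results(): List<A>
}

data class Done<A>(val result: A) : NonDet<A> {
    override fun results(): List<A> = listOf(result)
}

data class ForEvery<E, A>(
    val elements: Iterable<E>,
    val next: (E) -> NonDet<A>
) : NonDet<A> {
    // The continuation is called once per element: it is multi-shot.
    override fun results(): List<A> = elements.flatMap { next(it).results() }
}

// Every sum of one element from each list.
val sums: NonDet<Int> =
    ForEvery(listOf(1, 2)) { x ->
        ForEvery(listOf(10, 20)) { y -> Done(x + y) }
    }
```

Here the inner continuation runs twice, once for each value of x, whereas a captured Continuation from the coroutine system could only be resumed for the first of them.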
To finish this chapter we’re going to look back at the Errors as effect section,
and implement Raise and either, which we then left unexplored. In this case
our initial encoding is given by Either.
If you compare Either to Action, you can see that Right is nothing more
than the Done subclass, but applied to the effect of erroring out. The Left case
corresponds to the only primitive action; however in this case we don’t have
a continuation field, because we know that such continuation would never be
called. If we were to spell it out completely, Left would look as follows.
But if you think about it, how are you ever going to call next? You cannot
create an instance of the Nothing type!3 In any case, this reading provides a
bridge to the final encoding,
3
Unless you throw an exception or go into an infinite loop, but that defeats our purpose.
interface Raise<E> {
    fun raise(error: E): Nothing
}
Interestingly enough, if you follow the steps required to obtain the State
type for the implementation of Raise, you get a type with the exact same
structure as Either. In other words, we can get away without defining a new
type, and use Either directly.
This implementation is quite naïve, though. It’s fine if you just need Raise
on pure computations, but in more complex scenarios you need to cooperate
better with the coroutine system. Exceptions and cancellations are some of
the interactions you need to care about.
13
Ă Composable flows
Throughout this book we’ve talked mostly about how to model data, and how
to transform this data, keeping tight control of the side effects. This is no sur-
prise, as it’s one of the pillars of FP. At first sight, though, this model doesn’t
fit interactive applications – think of the user interface (UI) of any mobile ap-
plication –, since they usually hold some state which changes over time.
We’ve already hinted at some solutions in Mutability done well; in this chap-
ter we look at flows as a third solution to this problem. Flows model sequences
of values, which we could think of as evolving snapshots of the application
state. We’ll briefly discuss Flow, StateFlow, and @Composable: three important
pieces of the Kotlin ecosystem which help us design our applications
around this concept. And, yes, we mean @Composable as in the Jetpack Compose
libraries of Android fame, a witness of how these concepts have slowly permeated
into the front-end world.
Let’s focus on a concrete problem, to see how to move our point of view away
from a stateful design. As in many TCGs, players of our game build decks –
sets of cards – which they then use to play the game with. We want to provide
an interactive tool to create those decks, in which the player chooses cards
from the screen to add, or removes cards from the decks.
We model a deck using a data class, and a few operations to add and re-
move cards.
data class Deck(val cards: Set<Card>) {
    companion object {
        val EMPTY = Deck(emptySet())
    }
}
This model is immutable. We keep the same property when modeling the ac-
tions or events which may occur in the application.
It seems impossible to live without that var. After all, mutable fields change
over time, exactly as our current deck does!
The key change to “FP-ify” this design is modeling the actions explicitly
as a sequence. Instead of having a process method which takes care of a
single DeckAction, we write a function which turns sequences of actions into
consecutive snapshots of Decks. The Flow type is perfect for this task, as it
describes ordered sequences of values.
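Before bringing flows into the picture, the idea can be sketched with a plain Sequence from the standard library standing in for Flow. The one-field Card, the step function, and the exact shape of the DeckAction hierarchy are assumptions; the names NoAction, AddCard, and RemoveCard follow the usages later in this chapter.

```kotlin
// Hypothetical, minimal card model.
data class Card(val id: String)

data class Deck(val cards: Set<Card>) {
    fun add(card: Card): Deck = copy(cards = cards + card)
    fun remove(card: Card): Deck = copy(cards = cards - card)
    companion object {
        val EMPTY = Deck(emptySet())
    }
}

sealed interface DeckAction
object NoAction : DeckAction
data class AddCard(val card: Card) : DeckAction
data class RemoveCard(val card: Card) : DeckAction

// A single step: the deck after applying one action.
fun Deck.step(action: DeckAction): Deck = when (action) {
    is NoAction -> this
    is AddCard -> add(action.card)
    is RemoveCard -> remove(action.card)
}

// Turns a sequence of actions into consecutive snapshots of the deck,
// starting with the empty one.
fun snapshots(actions: Sequence<DeckAction>): Sequence<Deck> =
    actions.runningFold(Deck.EMPTY) { deck, action -> deck.step(action) }
```

No var is needed: the current deck is simply the latest element of the resulting sequence. Flow offers a runningFold operator of the same shape, so the translation should be almost mechanical.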
We’ll get to writing this function in a moment, but let’s stretch this idea for
a while. In the same way that the sequence of actions produces snapshots of
decks, the sequence of snapshots of decks can be turned into descriptions of
the UI. In turn, these descriptions can be turned into the actual interface by a UI
framework. The final step to close the loop is for the UI framework to generate
a flow of UI actions, which are then turned into DeckActions, which produce
the next version of Deck, and so on.
This may sound familiar if you’ve used Jetpack Compose, React, or SwiftUI. The
only difference is those frameworks hide the fact that you’re creating a se-
quence of UI snapshots, because you only care about mapping the latest ver-
sion of the state – the Deck here – to the new version of the UI. In any case,
you don’t have to manually add, remove, or modify components on the screen;
that job is handled by the framework, usually after diff-ing the previous and
the new UI.
Kotlin provides a handful of flow types with its coroutines library; the most
important are Flow and StateFlow. The difference between both is usually
described as the former being cold – new values are only generated upon re-
quest – and the latter being hot – new values are pushed to the next element
in the chain.
Working directly with flows is hard, though, as it requires using specific
operators to combine them. To give a small example, let’s model our UI as a
combination of two elements: a string representing the summary of the deck,
and a gradient which represents the background color to apply. The former
can be represented using map, which as in the case of lists, maps each value
to a transformed one.
The computation of the background takes into account the main power
type in the deck.
To build the snapshots of the UI we need to use combine, which turns sequen-
tial values of both Flows into a single value by applying a function.1
The amount of maps and combines scales up very quickly. If you break
the definition of the UI into small pieces, the actual logic is drowned in a sea of
operators. Jetpack Compose developers recognized this problem early, and de-
veloped a solution based on @Composable functions, which are then handled
by a compiler plug-in. Much in the same way as suspend is compiler magic to
describe effectful functions, @Composable is magic for describing flows.
If you’re developing a Compose application, either using the Android toolkit
by Google or the platform-independent one by JetBrains, @Composable is al-
ready available to you. In the examples below we’ll make use of the Molecule
project,2 which packs the compiler plug-in in a self-contained way.
1
The Background and Label classes emulate those from UI frameworks, but do not correspond
to any specific framework.
2
github.com/cashapp/molecule/
13.2.1 @Composable functions
@Composable
fun deckSummary(deck: Deck): String = "${deck.cards.count()}"
@Composable
fun background(deck: Deck): Color { ... }
In fact, in the code above you could even turn those functions into regular
(non-@Composable) functions, since you’re not using any additional feature
from flows.
Molecule’s magic is encapsulated in a couple of functions. The first two
generate a hot StateFlow or a cold Flow from a @Composable function. The
third one, collectAsState, goes in the other direction: it converts a Flow into a
source of values, which are now treated individually.
Background(color = background(deck)) {
Label(text = deckSummary(deck))
}
}
There are a few technicalities required to understand this piece of code.
The first one is that after val deck we don’t have an equals sign, but the by
keyword. The consequence is that any attempt to read the value of deck goes
through collectAsState instead of directly reading a field. These delegated
properties can be used to implement many behaviors on top of values, includ-
ing laziness.3
You can see that launchMolecule takes a first argument representing the
frame clock. This specifies when new values are computed. If you’re working
outside of an Android application, then you need to use Immediate, because
no other clock is available. The other possibility is ContextClock, which uses
the clock from the surrounding CoroutineScope, and should be used when
integrating Molecule in an Android application.
Note also that launchMolecule requires a CoroutineScope. The reason
is that a hot flow needs to exist separately from its consumers, so the coroutine
system needs to be aware of it. In a cold flow the consumer drives the process,
and no such separate instance is required.
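@Composable
fun currentDeck(actions: Flow<DeckAction>): Deck {
    var deck by remember { mutableStateOf(Deck.EMPTY) }
    val action by actions.collectAsState(NoAction)
    when (val action = action) {
        is NoAction -> { }
        is AddCard -> { deck = deck.add(action.card) }
        is RemoveCard -> { deck = deck.remove(action.card) }
    }
    return deck
}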
The use of delegated properties obscures a bit the moment when new ele-
ments of currentDeck are emitted:
Putting it all together, that means that currentDeck emits a new value every
time actions pushes one.
Although this first example returns the same value we are remembering,
this doesn’t have to be the case. One common use case is to return information
for the next step of the pipeline, which could be another transformation or the
UI. Let’s refine our currentDeck to take into account a maximum deck size,
and give back an error if exceeded. This could be used to show some form of
dialog or message to the user in the UI.
The currentDeck function doesn’t change a lot. The main difference is the
additional check in AddCard – removing cards never takes us over the maxi-
mum –, with an early return when exceeded. In that case we’ve also decided
to not change the saved deck.
@Composable
fun currentDeck(actions: Flow<DeckAction>): DeckState {
    var deck by remember { mutableStateOf(Deck.EMPTY) }
    val action by actions.collectAsState(NoAction)
    when (val action = action) {
        is NoAction -> { }
        is AddCard -> {
            if (deck.cards.size >= Deck.MAXIMUM_SIZE)
                return DeckState.TooManyCards
            deck = deck.add(action.card)
        }
        is RemoveCard -> { deck = deck.remove(action.card) }
    }
    return DeckState.Ok(deck)
}
Reifying state change as a flow is very useful if we control the inputs, be-
cause then we know what outputs are expected. This becomes handy to test
the behavior of our system: we can create an input flow which represents the
actions to be taken,
pass it to our @Composable function and check that the elements emitted
there correspond to the expectation. Projects like Turbine,4 a library for testing
Flows, help a lot in this respect.
We’ve just scratched the surface of what flows offer. The combination highlighted
here – a flow of input actions, and a remembered piece of data – allows
us to model state machines easily. The current state is the data wrapped by
remember, but we don’t have to expose it fully, as we’ve seen here.
For the sake of readability, the examples in this chapter have all been pure
flows. But another advantage of this technique is that the Kotlin ecosystem
provides a lot of connectors to other systems which use Flows as the input
or output channel. Message queues, like Apache Kafka5 or RabbitMQ, fit this
model particularly well; we could imagine DeckAction and DeckState being
linked to a couple of queues in a distributed system.
4
github.com/cashapp/turbine/
5
nomisrev.github.io/kotlin-kafka/ exposes this interface.
Part III
C Appendices
A
ϩ FP in Modern Java
Kotlin is a great programming language, built upon the strong roots of the
Java ecosystem. Modern Java doesn’t look at all like the Java of 20 years ago;
both the language and the ecosystem have improved, making huge parts of
the book also applicable.
In recent years Java-the-language has made huge efforts to catch up with
more modern features,1 many of them from the FP line of thinking. In this
section we describe several of those, to realize that many of the patterns de-
scribed in this book are also applicable to the venerable language.
Long gone are the days in which one had to write the name of each type twice:
once to declare the type of the variable, and once more to call the constructor.
In many cases, Java can infer the type of the variable, so var is enough; this
gives us almost the same syntax as Kotlin’s.
1
en.wikipedia.org/wiki/Java_version_history contains a thorough description of the
evolution of Java over the years.
var cardDb = new HashMap<String, Card>();
By default every variable defined in Java code is mutable, but this can be
changed by prepending the final modifier.
In essence, Java’s final var is equivalent to Kotlin’s val, just a few characters
longer.
Higher-order functions
numbers.stream()
    .filter(e -> e < 0)
    .map(e -> e * 2)
    .toList()
The code above also shows that modern Java has lambdas. There are a few
differences with Kotlin lambdas, though:
public sealed interface BinaryTree<A>
permits Leaf, Branch { }
Pattern matching
Java). With the passing of the years, Java’s switch has become much stronger,
allowing us to write code in a similar style to Kotlin’s when.
The first new power gained by switch is the ability to return an expression,
instead of changing the control flow. Notice how we use -> instead of : after
the definition of each case; this tells the compiler that the next expression
should be the result of the switch when that branch is taken. As a result,
we can directly use return switch; when using switch as control flow one
would include the return in each of the cases.
The second power is the ability to mention a type followed by a variable.
This works like an instanceof check, plus a cast from the inspected value into
that same class, available in the new variable. In the example above, when
this is of type Branch, then b is equal to (Branch) this. Kotlin’s smart
casting feature is similar, but instead of forcing the developer to introduce a
new variable name, it reuses the one from when. Note that neither version is better
than the other; Java’s design helps when matching on a long expression,
New versions of Java promise even more powerful matching for records,
giving access to the fields in the pattern itself. That means that instead of:
This gets the syntax closer to full pattern matching as found in functional lan-
guages such as Scala or Haskell. However, the fact that these nicer patterns
only work with classes defined as records means that they won’t be directly
available for older code.
Some of the language features not available directly in Java can be obtained
using Project Lombok,3 a compiler plug-in. Marking a class (or a record) with
the @With annotation creates a bunch of methods, each with the with prefix
and the name of the field afterwards, which create a copy of the value with
that field changed.
This is very similar to the copy mechanism in Kotlin, but with a different method
per field.
Lombok also backports some of the newer Java features to older versions.
If we mark a class with the @Value annotation, the compiler plug-in makes
all fields read-only, and generates the corresponding constructor. The Person
record defined above could be obtained also as,
2
Lombok is part of the Sunda Islands, the same group to which the island of Java belongs.
3
projectlombok.org
@Value @With class Person {
    String name;
    int age;
}
By default the getter uses the more traditional getField nomenclature, but
the new fluent one is also available if you add @Accessors(fluent = true)
to the list of annotations.
The standard library bundled alongside the virtual machine has grown in
recent versions with many FP-oriented features. Most of them live within the
java.util package.
java.util.concurrent is the place to look for classes related to concur-
rency. This is the area with the most differences with Kotlin, since the latter
uses coroutines to describe asynchronous programming.
The java.util package is a great step in the FP direction for Java, but still
misses many of the important types. Fortunately, the Vavr library4 is there to
fill the remaining gaps.
Resilience4j
We’ve briefly touched on the topic of resilience when discussing abstractions
built on top of suspended functions. When we are able to operate with functions,
techniques such as retries or circuit breakers become much more natural to
express. The Resilience4j library6 provides a set of built-in abstractions similar
to Arrow Fx, but targeting the functional types in Java.
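To see why operating with functions makes retries natural to express, consider the following hand-rolled sketch. It is not Resilience4j's API: Retry and retrying are hypothetical names, and a real library adds waiting strategies, metrics, and much more. The point is only that a retry policy can be written as a function from functions to functions.

```java
import java.util.function.Supplier;

final class Retry {
    // Wraps an action so that it is attempted up to maxAttempts times.
    static <T> Supplier<T> retrying(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return action.get();
                } catch (RuntimeException e) {
                    last = e; // remember the failure and try again
                }
            }
            throw last; // every attempt failed
        };
    }
}
```

Since the result is again a Supplier, the wrapped action can be passed around, or wrapped once more by another policy, without the caller knowing about it.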
4
docs.vavr.io
5
Java’s standard library contains immutable collections, but those are almost never what you
want. In particular, any attempt to modify such a collection results in an exception, whereas
the desired outcome is to have a (cheap) copy while leaving the original untouched.
6
resilience4j.readme.io
B
Formal modeling
Alloy’s code looks pretty much like Kotlin’s, except that class is swapped
with sig (from signature), and the val for fields is dropped. Another important
difference is that every field has a cardinality, that is, a particular number of
elements which might be related through that field. The default is one, but
you can use set to specify that any number is allowed. Here’s the translation
of the Attack from our domain language back at the beginning of the book.
open util/integer
sig Name { }
sig Attack {
name: Name,
cost: set Power,
damage: Int
}
1
alloytools.org. Both alloy.readthedocs.io and haslab.github.io/formal-software-design/
are good references for further exploration.
If you copy this text into Alloy and click the Execute button, you should get a
message similar to the following.
If you click Instance, a new window appears, usually with one square per
Power. This is an instance of the problem, a possible world which satisfies
all the constraints in the model. That’s a very boring instance, but if you click
Magic Layout (and answer “yes” to the question) and New a few times, you
should eventually come to an instance with one attack. The image below rep-
resents a single attack with a name, no power cost, and a damage of 7.
This brings us to an important idea when modeling in Alloy: fields whose
value we don’t really care about, other than for knowing whether two of them
are equal. In our particular example, our attacks have names, but those are not
important for the model we are developing. For that reason we make Name
an empty signature – this means we can still ask the question of whether two
Names are equal, but nothing more. In fact, in this case we don’t even care
about names, so we can drop the field from Attack altogether.
This model of attacks is still not precise; we are declaring damage as an Int,
and that means that we may have negative damages. We can instruct Alloy
about an invariant of the instances of a signature by including an additional
block after the definition of the fields.2
sig Attack {
cost: set Power,
damage: Int
} {
gte[damage, 0]
}
If we want to ensure that more than two attacks are generated in each
instance, we can include constraints for the process. The following piece of code
declares a predicate stating that the number of Attack instances should be
greater than 2. Then we ask – by means of run – to generate instances which
satisfy that predicate, and give 5 as the upper bound for their “size”. Run
Execute once again, and every instance you get in the other window contains
more than two attacks; take this moment to explore the model a bit.
pred show {
#Attack > 2
}
run show for 5
The notion of “size” is defined by Alloy in a precise way, but at the intuitive
level all you need to know is that bigger numbers allow bigger instances to be
generated. This is important because as more invariants enter the game, edge
or problematic cases tend to appear only in bigger instances.
2
The syntax for working with Int is slightly cumbersome.
B.2 Relations and facts
Let’s introduce cards, both for monsters and powers. Alloy follows the same
modeling style as we’ve discussed for Kotlin: an abstract signature represent-
ing the type as a whole, with subclasses representing each of the possibilities.
As before we are going to drop the name field, as those values are not impor-
tant in our model.
To make our model a bit more interesting, we are going to include a notion
of monster mutation in our game. The idea is present in several trading card
games, and essentially means that if you have a monster which mutates from
another monster, you can put the former on top of the latter to make it
mutate – other popular names are evolving or strengthening. Not every monster
mutates from another, so we should make that field optional; this is achieved
in Alloy using the lone modifier, which restricts the cardinality to 0 or 1.
It doesn’t take long after clicking Execute to come to a weird model in which
a monster mutates into itself.
To eliminate these problematic cases from our model, we need to impose further
constraints. This time, though, the constraint involves more than one
value, so we need to include it in a separate fact block.
fact NoSelfMutation {
no m: MonsterCard | m = m.mutatesFrom
}
In general you can use quantifiers like no, all, or some, to express invariants
which should hold for every value of a signature. After the | sign you define
the predicate which should (or shouldn’t) hold.
Monsters may still be quite weird. Even with the NoSelfMutation fact satisfied,
we could build an instance in which the Water Monster mutates from the
Ice Monster, and the Ice Monster mutates from the Water Monster. We want
to forbid this situation too – essentially we want to forbid cycles made from
mutatesFrom arrows in the instances.
fact NoMutationCycles {
no m: MonsterCard | m in m.^(mutatesFrom)
}
The key operator here is ^, which follows a particular relation any number of
times (at least once) and collects all the reached values. In other words,
m.^(mutatesFrom) is equivalent to the potentially infinite set of monster cards
{ m.mutatesFrom, m.mutatesFrom.mutatesFrom
, m.mutatesFrom.mutatesFrom.mutatesFrom, ... }
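For intuition, the collection of reached values can be imitated in ordinary code. The Java sketch below assumes monsters are identified by plain strings and that mutatesFrom is given as a map with at most one entry per monster (matching the lone cardinality); Closure, reachable, and inCycle are hypothetical names. Because an instance only contains finitely many cards, the loop stops as soon as the chain ends or a value repeats.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class Closure {
    // Every monster reached by following mutatesFrom one or more times from m.
    static Set<String> reachable(String m, Map<String, String> mutatesFrom) {
        Set<String> seen = new LinkedHashSet<>();
        String current = mutatesFrom.get(m);
        // Stop at the end of the chain, or when a value repeats (a cycle).
        while (current != null && seen.add(current)) {
            current = mutatesFrom.get(current);
        }
        return seen;
    }

    // The condition forbidden by NoMutationCycles, for one monster:
    // m belongs to the set of values reached from m itself.
    static boolean inCycle(String m, Map<String, String> mutatesFrom) {
        return reachable(m, mutatesFrom).contains(m);
    }
}
```

With the Water and Ice monsters mutating from each other, inCycle returns true for both, which is exactly the situation the fact rules out.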
fact NoSameMutationFromDifferentMonsters {
no m1: MonsterCard, m2: (MonsterCard - m1) |
m1.mutatesFrom = m2.mutatesFrom
}
In the block above we use the fact that the name of a signature implicitly refers
to the set of potential values of that signature. By writing MonsterCard - m1
we build a new set containing every value in MonsterCard except m1.
The true power of Alloy is unleashed when we start modeling not only data, but
also time. Within the same tool we can define how data may evolve over time,
look at the traces in the same way we looked at instances, and ask for coun-
terexamples when an invariant is not satisfied. To make things more concrete,
let’s model how the deck and cards in play evolve through a game.
The one in front of Board indicates that it’s a singleton – pretty much like
object in Kotlin. Different games don’t interact with each other, so focusing
on a single one helps us make better sense of our trace. The other difference
is that we mark the fields with var, to allow them to evolve over time.
Before we move on, let’s fix a small problem. Since we are using set, we
won’t allow duplicates in the deck or in the cards in play, but this doesn’t reflect
the real world. The solution is to introduce an intermediate signature
which holds the Card,
sig UniqueCard {
card: Card
}
Values of UniqueCard may be different even if they hold the same Card inside.
That way we can represent duplicates in both sets.
Let’s not focus for the time being on where the c: UniqueCard comes
from; rather, think about how we model the effect of drawing that particular
card. The block has two distinct parts:
1. First we find two preconditions, stating that the card to draw must be in
the deck and not in play.
2. Then we define the change of state. The syntax for “new value of field F”
is F', thus here we are saying that
It’s very important to remember that we need to define the new state for every
var field. Otherwise, Alloy assumes that it may change in any way during that
step. This makes sense since in Alloy you always declare constraints, so not
declaring a new value actually means attaching no constraints to such new
value.
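For readers more at home with code than with constraints, here is a loose Java analogy of a draw step, assuming cards are identified by plain strings. Board, deck, play, and draw mirror the names in the model, but the code is our own sketch. Keep the difference in mind: an Alloy predicate constrains which pairs of states are allowed, whereas this method computes the next state directly.

```java
import java.util.HashSet;
import java.util.Set;

record Board(Set<String> deck, Set<String> play) {
    // One "draw" step: returns the next state, or fails on a broken precondition.
    Board draw(String c) {
        // Preconditions: the card is in the deck and not in play.
        if (!deck.contains(c) || play.contains(c)) {
            throw new IllegalArgumentException("cannot draw " + c);
        }
        // New state: both fields are given explicitly, as with deck' and play'.
        Set<String> newDeck = new HashSet<>(deck);
        newDeck.remove(c);
        Set<String> newPlay = new HashSet<>(play);
        newPlay.add(c);
        return new Board(Set.copyOf(newDeck), Set.copyOf(newPlay));
    }
}
```

In Java, forgetting to mention a field is a compile error because the constructor needs both arguments; in Alloy, leaving out deck' or play' is accepted and silently allows that field to take any value in the next state.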
We need to provide a bit more information if we want Alloy to generate
traces. There’s a bit of boilerplate involved: first we need to declare a “do
nothing” step.
pred skip {
Board.deck' = Board.deck
Board.play' = Board.play
}
And then a fact which specifies the initial state (in our case, any deck with
three cards and none in play), and that at every step we may apply one of
the predicates defined above. This is where the magic UniqueCard makes its
appearance: one of the possible steps is to choose one such card and draw it.
Note that the preconditions on draw restrict the cards to be drawn from those
in the deck.
fact trace {
// initial state
#Board.deck = 3
#Board.play = 0
// steps
always (skip or (some c : UniqueCard | draw[c]))
}
The instance view now looks a bit different. There’s a sequence of numbers
on top; by clicking on one of those numbers, you see below the instance
at that time and how it evolves to the next one. For example, after clicking
New trace a few times on the model above, we get to the following image.
If you look closely, you see that UniqueCard1 moves from the deck to play; this
means that the draw action has been performed.
You may have also noticed that in the image above there’s a small loop
above the number 2 in the sequence; this means that the state evolves
indefinitely into itself. This is the reason why we needed a skip step: Alloy is
only able to produce lasso traces, which are infinite traces in which there’s
some sort of loop. The usual trick to turn those infinite traces into finite ones
is to “do nothing” from a certain point until eternity.
The final example in this super-quick walkthrough of Alloy showcases the
ability of the tool to find counterexamples of a fact that one expects to hold.
Note the difference between a fact – which you impose over the model – and
a property to be checked – which you expect to follow from those facts. For
example, any trace in our model keeps the number of cards constant throughout
the game; this is not explicitly stated, but follows from the fact that any
draw subtracts one card from deck and adds one to play.
Let’s begin by adding yet another possible step to our model, which corre-
sponds to attacking. In this case the player chooses a card, which ought to be
in play, and one attack within this card. On top of that, there must be at least
as many power cards in play with the right type as the cost required for the
attack.
all p : a.cost |
some pc : Board.play.card & PowerCard |
pc.type = p
Board.deck' = Board.deck
Board.play' = Board.play
}
In the code above we use a couple more niceties from the Alloy language. Since
Board.play returns a set of UniqueCard, using the dot yet another time with
.card obtains the value of that field for every element in the set. The . in
Alloy works as regular field access, mapping, and even flatMapping over sets.
The result of Board.play.card is a set of Card, but we are only interested in
the PowerCard values; for that we use an intersection with &.
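The same two operations can be pictured with Java streams. This is only an analogy under our own assumptions: the label field on MonsterCard and the Joins helper are hypothetical, and UniqueCard is written as a plain class so that two of them holding equal cards remain distinct values, as in the model.

```java
import java.util.Set;
import java.util.stream.Collectors;

interface Card {}
record MonsterCard(String label) implements Card {}
record PowerCard(String type) implements Card {}

// A plain class: equality is identity, so duplicates of a Card can coexist.
final class UniqueCard {
    final Card card;
    UniqueCard(Card card) { this.card = card; }
}

final class Joins {
    // Board.play.card: apply the field to every element, collect into a set.
    static Set<Card> cards(Set<UniqueCard> play) {
        return play.stream().map(u -> u.card).collect(Collectors.toSet());
    }

    // Board.play.card & PowerCard: keep only the power cards.
    static Set<PowerCard> powerCards(Set<UniqueCard> play) {
        return cards(play).stream()
            .filter(c -> c instanceof PowerCard)
            .map(c -> (PowerCard) c)
            .collect(Collectors.toSet());
    }
}
```

Note how the result of cards is a set, so two UniqueCard values holding the same Card contribute a single element, just as Board.play.card does in Alloy.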
In the definition of trace above we used always. This is an example of
a temporal operator, something which states a property with respect to the
passing of time. For the property we want to check we use eventually, which
encodes the fact that a certain property holds at some moment in the trace. In
our case, we check that at some point in the trace we’ll be able to attack with
some card.
assert CanAttack {
eventually (some c: UniqueCard, a: Attack | attack[c, a])
}
check CanAttack
To check the assertion, click Execute (remove any run commands if the search
for counterexamples doesn’t begin). In this case, the tool tells us that a
counterexample is found.
This is the initial state of one of the traces. There are only monster cards
and no power cards, so there’s no possibility to execute an attack step. A
counterexample, thus, of the assertion that “at some point in the trace attack
is possible”.
Formal modeling has developed greatly in recent years. Tools like Alloy
provide a language quite close to what most software developers know – we’ve
stressed in this chapter how ideas like sealed hierarchies map from Kotlin to
Alloy.
Most of the benefits of formal modeling follow from the fact that such a model
is a documented artifact which is understood by a tool. This means that the
model can reside in version control, as any other code artifact, removing much
of the resistance that paper or diagrammatic models have when an update
needs to happen. A pull request updating the model file is a strong signal
that other parts of the code may need to be updated, too.
It doesn’t make sense to formally model the entire application, though.
It’s quite a costly process, and you always have the doubt of whether the
implementation follows the model correctly. A more pragmatic and productive
approach is to create several models of your core domain (or domains), each of
them exposing an important characteristic of the domain. Exploration and
refinement of the invariants usually translate to handling of corner cases which
would otherwise be ignored, or found later in the development process.
One particular area in which a tool like Alloy makes a great difference is
any protocol or interaction which involves concurrent parties. Those parts
of a system are usually hard to test – there are too many possible orderings of
actions, even discounting problems related to dropped messages or connections.
On the other hand, in Alloy it’s very easy to simulate a few concurrent users,
and check automatically that the expected invariant still holds.
To finish, remember that models – like any other part of the code – evolve
over time. Requirements change, and our understanding of the model becomes
better. Don’t think of formal modeling as something you do at the very
beginning of the development process, and then forget about; keeping the models
up-to-date ensures that the shared understanding of the domain is also
up-to-date.