Commit 4bda01a

Add My OTP-21 Highlights
1 parent 623fd58 commit 4bda01a

4 files changed

+146
-0
lines changed

@@ -0,0 +1,146 @@
---
layout: post
title: My OTP 21 Highlights
tags: otp 21 release
author: Lukas Larsson
---

OTP-21 Release Candidate 1 has just been released. I thought that I would go
through the changes that I am the most excited about. This will mostly mean
features in erts and the core libraries, as those are the changes that I am
the most familiar with.

You can download the readme describing the changes here: [OTP 21-RC1 Readme](http://erlang.org/download/otp_src_21.0-rc1.readme).
Or, as always, look at the release notes of the application you are interested in.
For instance here: [OTP 21-RC1 Erts Release Notes](http://erlang.org/doc/apps/erts/notes.html).

# Compiler / Interpreter #

Björn Gustavsson has been doing a lot of work on the compiler and interpreter
over the last year while I have been sitting next to him cheering. The largest change
is part of the OTP-14626 ticket. While working on the
[BEAMJIT](https://www.youtube.com/watch?v=PtgD5WRzcy4) development
I've been looking a lot at the [luajit](http://luajit.org/) project and what
Mike Pall has done both in the JIT and in the interpreter. Inspired by
this and some other ideas that we got from the BEAMJIT project, we decided it was time
to do a major overhaul of the way that the BEAM interpreter is created. Most of the
changes boil down to decreasing the size of beam code in memory, thus making more
code fit in the L1/L3 caches and, by extension, making code run faster. We've
decreased the loaded code size by about 20% using our optimizations. This has
translated to about a 5% performance increase for most Erlang code,
which is quite amazing. Björn or I will most likely write more about exactly
what this has entailed in a future blogpost.

Another compiler change that has had quite a large impact (at least in our benchmarks)
is OTP-14505 contributed by José Valim in [PR 1080](http://github.com/erlang/otp/pull/1080).
The change makes the compiler rewrite:

    example({ok, Val}) -> {ok, Val}.

to

    example({ok, Val} = Tuple) -> Tuple.

eliminating the creation of a new, identical tuple. As it turns out, this is quite a
common pattern in Erlang code, so this will be good for all programs.
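To make the rewrite concrete, here is a minimal sketch of the two forms side by side. The module and function names (`tuple_reuse_sketch`, `rebuild/1`, `reuse/1`) are made up for illustration and are not part of OTP. Both functions return the same value; the second spells out by hand what the compiler is described as doing for the first.

```erlang
-module(tuple_reuse_sketch).
-export([rebuild/1, reuse/1]).

%% Written the "common" way: the body constructs a tuple that is
%% identical to the one that was just matched in the head.
rebuild({ok, Val}) -> {ok, Val};
rebuild(Other) -> Other.

%% Written by hand in the rewritten form: the matched tuple is bound
%% to a variable and returned as is, so no new tuple is built.
reuse({ok, _Val} = Tuple) -> Tuple;
reuse(Other) -> Other.
```

Since the two forms are observably equivalent, code written in the first style does not need to be changed to benefit.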

An example of this performance gain can be seen in the estone benchmark suite
below. OTP-14626 together with some other compiler and erts improvements has
increased the number of stones from 370000 in OTP-20.3 (the green line) to
400000 in OTP-21 (the blue line). That is an increase of about 8%.

![Estone OTP-21 benchmark](../images/estone_otp21_benchmark.png)

# Erlang run-time system #

There are many changes in the run-time system.

## File handling ##

All file IO has traditionally been handled through a port. In OTP-21 all of the
file IO has been rewritten to use NIFs instead, OTP-14256. This was mainly done
in order to run file operations on the dirty IO schedulers. It also had the nice
side-effect of significantly increasing the throughput of certain operations.

![File tiny reads OTP-21 benchmark](../images/file_tiny_reads_otp21_benchmark.png)

For instance, in the tiny reads benchmark OTP-21 (the blue line) is about 2.8 times
faster than OTP-20.3 (the green line).

Also, it is now possible to open device files using file:open, see OTP-11462.
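As a small sketch of what opening a device file can look like, the following reads a few bytes from `/dev/urandom`. The module name, the function name and the choice of device are assumptions made for illustration; the path only exists on Unix-like systems, and the `raw` option is a choice made here rather than something the ticket is known to require.

```erlang
-module(device_file_sketch).
-export([random_bytes/1]).

%% Read N bytes from a device file using the ordinary file API.
%% Assumes a Unix-like system where /dev/urandom exists.
random_bytes(N) when is_integer(N), N > 0 ->
    {ok, Fd} = file:open("/dev/urandom", [read, binary, raw]),
    try
        {ok, Bytes} = file:read(Fd, N),
        Bytes
    after
        %% Always release the file descriptor, even if the read fails.
        ok = file:close(Fd)
    end.
```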

## I/O Polling ##

The entire underlying mechanism for checking for I/O on sockets has been rewritten
and optimized for modern OS kernel polling features. See OTP-14346 and
[I/O polling options in OTP 21]({{ site.baseurl }}/IO-Polling) for more details.

## Distribution ##

It has always been possible to write your own distribution carrier if,
for instance, you wanted to use [RFC-2549](https://tools.ietf.org/html/rfc2549)
to send your distributed Erlang messages. However, you have had to implement it as
a linked-in driver. With the introduction of OTP-14459 you can now use a process
or port as the distribution carrier. So now you can use gen_pigeon instead of having
to call the boost equivalent.

The ability to use processes as distribution carriers is now used by the TLS
distribution. This means that we no longer have to jump through several hoops as was done
before, which increases the throughput of TLS distribution significantly.

## Process signals ##

When running benchmarks using cowboy and hammering it with connections that
do not use keep-alive, one of the SMP scalability bottlenecks that pops up
is the link lock of the supervisor that supervises all the connections.
The reason why this lock pops up is that when you have a lot of linked
processes, the rb-tree in which the links are stored becomes very large, so
the insertion and deletion time increases. In OTP-14589 this has been
changed so that all link and monitor requests are now sent as messages
for the receiving process to take care of. This means that the lock has been
completely removed. Now all signals (be they messages, links, monitors,
process\_info, group\_leader etc) are handled through the same queue.

In addition, OTP-14901 now makes it so that monitor + send signals
are merged into one signal. So the contention is reduced even further
for gen_server:call like functions.
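For readers who have not looked inside a call-style function, the monitor + send sequence in question looks roughly like the sketch below. This is a simplified stand-in and not the actual gen_server implementation; the module name, the message shapes and the echo server are all invented for illustration.

```erlang
-module(call_sketch).
-export([start/0, call/2]).

%% A trivial echo server standing in for a gen_server process.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {call, From, Ref, Request} ->
            From ! {Ref, {echo, Request}},
            loop();
        stop ->
            ok
    end.

%% The call pattern: a monitor immediately followed by a send to the
%% same process, then a selective receive on the monitor reference.
call(Pid, Request) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {call, self(), Ref, Request},
    receive
        {Ref, Reply} ->
            %% Drop the monitor and flush any 'DOWN' already queued.
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, Pid, Reason} ->
            {error, Reason}
    end.
```

The monitor and the send in `call/2` target the same process back to back, which is the pair of signals the text describes as being merged.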

![GenStress OTP-21 benchmark](../images/genstress_otp21_benchmark.png)

The performance difference is quite significant. In the genstress benchmark
seen above, OTP-21 (the blue line) has almost double the throughput
of OTP-20.3 (the green line).

# Logger

OTP-13295 adds a completely new logging framework for Erlang/OTP. It is
inspired by the way that [lager](https://github.com/erlang-lager/lager),
the [Elixir Logger](https://hexdocs.pm/logger/Logger.html) and the [Python
logger](https://docs.python.org/3/howto/logging.html) work.
With logger, the logging handlers can intercept the logging
call in the process that does the actual call instead of having to
wait for a message. This opens up all sorts of possibilities for early
rejection of log messages in case of an overload, see [Logger User's Guide](http://erlang.org/documentation/doc-10.0-rc1/lib/kernel-6.0/doc/html/logger_chapter.html#protecting-the-handler-from-overload)
for more details. The user can also add special purpose filters that are run
before the handler is invoked in order to silence or amend log messages in the system.
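A filter of the kind mentioned above could be sketched as follows. This is based on the Logger API as it looks in the final OTP 21 release (`logger:add_primary_filter/2`), so details may differ in RC1. The module name, the filter id `drop_noisy` and the `noisy` domain are hypothetical and only there for illustration.

```erlang
-module(logger_filter_sketch).
-export([install/0, drop_noisy/2]).

%% Filter callback. It is run in the process that makes the log call,
%% before any handler sees the event. Returning 'stop' silences the
%% event; returning the (possibly amended) event lets it through.
drop_noisy(#{meta := #{domain := [noisy | _]}}, _Extra) ->
    stop;
drop_noisy(LogEvent, _Extra) ->
    LogEvent.

%% Install the filter for all handlers. 'drop_noisy' is a made-up id.
install() ->
    logger:add_primary_filter(drop_noisy, {fun ?MODULE:drop_noisy/2, []}).
```

Because the callback is an ordinary function, it can be tested on hand-built log event maps without installing it.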

# Misc

HiPE has finally been fixed by Magnus Lång to use the receive reference optimization
that beam has had for a long time, OTP-14785.

The ftp and tftp parts of inets have been separated into their own applications
instead of being bundled, OTP-14113.

The rand module has seen a lot of work, adding new features. I'm not sure when or
how the differences are useful, but the theory around this is fascinating, OTP-13764.

The maps module now has maps:iterator/1 and maps:next/1, OTP-14012.
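As a quick sketch of how the two functions fit together, the following walks a map one association at a time and sums its values. The module and function names are invented for this example.

```erlang
-module(maps_iter_sketch).
-export([sum_values/1]).

%% Sum all values in a map by stepping through it with an iterator.
sum_values(Map) when is_map(Map) ->
    sum(maps:next(maps:iterator(Map)), 0).

%% maps:next/1 returns 'none' when the iterator is exhausted,
%% otherwise {Key, Value, NextIterator}.
sum(none, Acc) ->
    Acc;
sum({_Key, Value, NextIter}, Acc) ->
    sum(maps:next(NextIter), Acc + Value).
```

The order in which associations are visited should not be relied upon, which is why the example uses an order-independent operation.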

io_lib:format/3 has been added to limit the output of the function. This is especially
useful when building logging frameworks as you may get arbitrarily large terms to
format and may want to cut them in order to not overwhelm the system, OTP-14983.
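A minimal sketch of limiting the output, assuming the `chars_limit` option of io_lib:format/3 as documented for OTP 21. The module and function names are made up. Note that the limit is a soft one, so the result is not guaranteed to be exactly `Limit` characters long.

```erlang
-module(format_limit_sketch).
-export([short/2]).

%% Format any term with ~p, asking io_lib to cut the output at
%% roughly Limit characters. Elided parts are replaced by "...".
short(Term, Limit) when is_integer(Limit), Limit > 0 ->
    lists:flatten(io_lib:format("~p", [Term], [{chars_limit, Limit}])).
```

For example, `format_limit_sketch:short(lists:seq(1, 1000), 40)` returns a string much shorter than the unlimited rendering of the same list.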

As a final note, I'm not sure if anyone noticed, but as of OTP-20.3, processes that
are in the state GARBING when your system crashes now have stack traces in the
crash dump!!!

images/estone_otp21_benchmark.png

32.8 KB
36.7 KB

images/genstress_otp21_benchmark.png

43.7 KB
