Concurrency - Basic Experiments With Python and Async IO
shortingoptions <[email protected]>
Historically, in Python, we had two main ways of concurrent programming: threads and
multiple processes.
Starting in Python 3.5, Python added the "async" and "await" keywords, thus making
asynchronous programming an integrated part of Python. Note that there were async
libraries that predated Python 3.5 (Twisted and Tornado, for example). Additionally,
there was limited Python asyncio support built into Python 3.4.
In asynchronous programming, the events are no longer sequential, and without doing
special things, you no longer control the order of events. Event-7 might happen before
event-2, or vice versa. You generally won't know the order, and it could easily change.
https://mail.google.com/mail/u/0/?ik=f14f839995&view=pt&search…=thread-f:1785813430203793837&simpl=msg-f:1785813430203793837 Page 1 of 10
Gmail - [PyNet] - Basic Experiments with Python and Async IO 13/03/25, 3:14 PM
A thread is a single execution unit inside a given Python process. For example, when I
type "python my_prog.py", I will be starting a single Python process, and inside this
single process there will be a single execution unit (a single thread). Here, I am
assuming that I am not doing anything special inside my Python program that would
cause multiple Python threads or processes to start.
So, by default, we have one of these execution units, but we are trying to get
concurrent work done.
And by concurrent work, I mean that if we look at any humanly significant time slice
(say greater than one-tenth of a second), then work from different tasks can be
executed. And here I don't mean one task fully completing and the next task starting,
but instead I mean time-multiplexing, i.e., task-1 is executing for part of this time slice,
and then task-2 is switched in, and then task-3, etc.
Note that concurrent work does not necessarily imply parallelism. In other words, you
can have concurrency even though there are never two things executing
simultaneously. Concurrency reflects the fact that computers are much, much faster
than humans, and computers can rapidly switch tasks in and out (such that, for us
super-slow humans, they appear to be happening at the same time).
In the diagram above, if we look at the time slice from t0 to t1, then work from job1,
job2, job3, and job4 are all being performed. And from the perspective of human
beings, it looks like the work happened at the same time.
So we are trying to accomplish concurrent work using async (and for the rest of this
article, I will use "async" to mean single-threaded, non-blocking async).
To illustrate this, let's first talk about executing a task synchronously. Let's say I have
this code here:
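(The original code block was an image and is not reproduced here. A minimal, runnable sketch of that synchronous pattern, with time.sleep standing in for Netmiko's SSH connect and "show version" call — the device names are illustrative:)

```python
import time

# Illustrative stand-ins for real devices; the article's actual loop used
# Netmiko's ConnectHandler and send_command("show version") per device.
DEVICES = ["cisco1", "cisco2", "cisco3", "cisco4"]

def show_version(device: str) -> str:
    """Simulate a blocking SSH connect + command (real code waits on the network)."""
    time.sleep(0.1)  # stand-in for SSH connect/login/command latency
    return f"{device}: show version output"

start = time.time()
results = [show_version(device) for device in DEVICES]
elapsed = time.time() - start

# Sequential: the total time is roughly the SUM of the per-device times.
print(f"{len(results)} devices in {elapsed:.1f}s")
```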
With synchronous programming, the for-loop will be executed one device after the
other.
The very first device here will establish an SSH connection using Netmiko and then
execute "show version".
Once this data is returned, we will print out the corresponding result. The first device
gave me the following output:
From the output, we can see that I have 19 total devices in the "devices" list, and the
first device completed its work after 1.2 seconds. If we continue on to the fifth device,
then we see the following:
Cisco2 did not start its work until the work for Cisco1 was complete. Similarly, Cisco5
did not start its work until the first four devices had been completed.
If we look at the very last device, device number 18, we see the following:
So "nxos2" didn't finish until over 52 seconds had elapsed (mostly because it had to
wait for the first eighteen devices to complete before it started its work).
Now, one thing we observe about network automation and our current script is that
there is a lot of waiting for IO.
It takes a bit over one second to connect to the very first device and to execute the
"show version" command. Some of the other devices are much slower than this.
From the perspective of our Python script, we spend the vast majority of this time just
waiting on the remote device(s): connect and then wait, and then wait, and then finally,
login. Now execute "show version" and then wait some more.
Instead of just sitting there and waiting for the remote device to respond, what if we
had our program do other work (in these essentially idle, wait times)?
With async, we are going to create an event loop—basically, a loop that keeps
checking on various tasks. In async, the various tasks that we poll from the event loop
are generally asynchronous functions (also known as coroutines). In diagram form, our
event loop looks as follows:
We keep polling these various tasks (asynchronous functions) to see if they have more
work that is currently ready.
These asynchronous functions also have a mechanism for relinquishing control (the
"await" keyword). The "await" keyword indicates that the coroutine (async function) can
yield program control here.
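A tiny sketch of what this yielding looks like in practice (the task names are illustrative): each coroutine runs until it hits an "await", at which point the event loop is free to run the other one.

```python
import asyncio

order = []

async def task(name: str) -> None:
    order.append(f"{name}-start")
    # "await" is the point where this coroutine yields control back to
    # the event loop, letting the other coroutine run.
    await asyncio.sleep(0)
    order.append(f"{name}-end")

async def main() -> None:
    await asyncio.gather(task("a"), task("b"))

asyncio.run(main())
print(order)  # the two tasks interleave at the await point
```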
Consequently, if we recreate our original Python SSH code, but this time using async,
we could construct something similar to the following:
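(The original async code block was an image and is not reproduced here. A runnable sketch of the same pattern, simulating the SSH waits with asyncio.sleep — the device names are illustrative, and the article's actual code used Scrapli's async driver over AsyncSSH against real devices:)

```python
import asyncio
import time

# Illustrative device names standing in for the real devices list.
DEVICES = ["cisco1", "cisco2", "cisco3", "nxos1"]

async def show_version(device: str) -> str:
    # Stand-in for the SSH connect/login/command waits. Like real async
    # IO, asyncio.sleep yields control to the event loop while waiting.
    await asyncio.sleep(0.2)
    return f"{device}: show version output"

async def main() -> list:
    # Create the coroutines and run them all concurrently.
    remote_work = (show_version(device) for device in DEVICES)
    return await asyncio.gather(*remote_work)

start = time.time()
results = asyncio.run(main())
elapsed = time.time() - start

# Concurrent: the total time is roughly the SLOWEST device, not the sum.
print(f"{len(results)} devices in {elapsed:.1f}s")
```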
Now the first part of this code just reads a YAML file and converts the devices from
Netmiko format to Scrapli format.
Scrapli is a Python library created by Carl Montanari, and one very nice feature of
Scrapli is that it has asynchronous SSH support.
One aspect of working with async (single-threaded, non-blocking async) is that the
code must be async all the way down. Consequently, the asynchronous parts of Scrapli
must be written in a special way, and the underlying SSH library that async-Scrapli
uses must also be async. In this case, the underlying SSH library is AsyncSSH.
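To see why the stack must be async all the way down, here is a small sketch contrasting a blocking call (time.sleep) with its async equivalent (asyncio.sleep) inside coroutines: a single synchronous call anywhere in the stack stalls every other coroutine.

```python
import asyncio
import time

async def blocking_task() -> None:
    # A synchronous sleep does NOT yield to the event loop: every other
    # coroutine is stuck until it returns. This is why the whole SSH
    # stack (Scrapli -> AsyncSSH) must itself be async.
    time.sleep(0.2)

async def friendly_task() -> None:
    # An async sleep yields control, so the other coroutines can run.
    await asyncio.sleep(0.2)

async def timed(coros) -> float:
    start = time.time()
    await asyncio.gather(*coros)
    return time.time() - start

blocking = asyncio.run(timed([blocking_task() for _ in range(3)]))
friendly = asyncio.run(timed([friendly_task() for _ in range(3)]))
print(f"blocking: {blocking:.1f}s, async: {friendly:.1f}s")
```

The three blocking coroutines take roughly the sum of their sleeps; the three async ones take roughly the time of a single sleep.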
Now the first line of this code doesn't actually call the "show_version()" function.
Instead, this line just creates a generator expression that will yield coroutines for each
of the devices.
The very next line, the "await asyncio.gather(*remote_work)" line, is where most of the
excitement happens.
What does this line do? First, the "*remote_work" component is going to unpack the
generator expression and create coroutines for each item. Note that these coroutines
have not started executing—they are just fed into asyncio.gather.
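This "nothing runs until it is awaited" behavior is easy to verify directly (the device names and show_version coroutine here are illustrative):

```python
import asyncio

started = []

async def show_version(device: str) -> str:
    started.append(device)  # records when the coroutine body actually runs
    return f"{device}: ok"

# Calling the async function only CREATES coroutine objects; no body runs yet.
remote_work = [show_version(d) for d in ("cisco1", "cisco2")]
assert asyncio.iscoroutine(remote_work[0])
assert started == []  # nothing has executed so far

async def main():
    # gather schedules the coroutines on the event loop and awaits them all.
    return await asyncio.gather(*remote_work)

results = asyncio.run(main())
print(started)  # now both coroutine bodies have run
```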
The asyncio.gather() call will then schedule each of these coroutines on the
event loop and also create an event loop task for itself.
Both the main() coroutine and the asyncio.gather coroutine will then be paused
("awaited"). The main() coroutine is "awaiting" on the asyncio.gather and the
asyncio.gather is "awaiting" on all of the "show version" coroutines.
Now all of the show_version coroutines do their work, with the event loop and
asyncio switching tasks in and out as they wait for IO.
Note the above is complex, and this is my best understanding of the mechanics of
what happens here.
First, once "asyncio.gather()" executes, I then observe the outbound SSH connections
from the admin computer to the devices (using 'netstat -an | grep ":22"' at the system
level).
After this time, I get all of the results printed on the screen and can observe that it took
about ten seconds (10.3 seconds) for "show version" to be gathered from all nineteen
devices.
These ten seconds are the "gather" step, i.e., the place where all of the "show version"
coroutines are executing and where asyncio.gather() is "awaiting" all of them to
complete.
The net effect of all of this is that we have a single thread and a single process working
on nineteen concurrent SSH connections (coroutines). Asyncio is rapidly switching
between them to accomplish work.
The total time for this work to complete is basically the time it takes for the slowest
device to complete. In my lab, "nxos1" and "nxos2" are the two slowest devices, and
both take about ten seconds to complete.
Note that the performance improvement here is not predominantly a function of Scrapli
vs. Netmiko, but instead of sequential versus async. We can achieve similar slow
results using Scrapli and a sequential pattern (or similar fast results in Netmiko using
threads or multiple processes).
Regards,
Kirk Byers