Understanding Python's asyncio
How to get started using Python's asyncio.
Earlier this year, I attended PyCon, the international Python conference. One topic, presented at numerous talks and discussed informally in the hallway, was the state of threading in Python—which is, in a nutshell, neither ideal nor as terrible as some critics would argue.
A related topic that came up repeatedly was that of "asyncio", a relatively new approach to concurrency in Python. Not only were there formal presentations and informal discussions about asyncio, but a number of people also asked me about courses on the subject.
I must admit, I was a bit surprised by all the interest. After all, asyncio isn't a new addition to Python; it's been around for a few years. And, it doesn't solve all of the problems associated with threads. Plus, many people find it confusing to get started with.
And yet, there's no denying that after a number of years when people ignored asyncio, it's starting to gain steam. I'm sure part of the reason is that asyncio has matured and improved over time, thanks in no small part to much dedicated work by countless developers. But, it's also because asyncio is an increasingly good and useful choice for certain types of tasks—particularly tasks that work across networks.
So with this article, I'm kicking off a series on asyncio—what it is, how to use it, where it's appropriate, and how you can and should (and also can't and shouldn't) incorporate it into your own work.
What Is asyncio?
Everyone's grown used to computers being able to do more than one thing at a time—well, sort of. Although it might seem as though computers are doing more than one thing at a time, they're actually switching, very quickly, across different tasks. For example, when you ssh in to a Linux server, it might seem as though it's only executing your commands. But in actuality, you're getting a small "time slice" from the CPU, with the rest going to other tasks on the computer, such as the systems that handle networking, security and various protocols. Indeed, if you're using SSH to connect to such a server, some of those time slices are being used by sshd to handle your connection and even allow you to issue commands.
All of this is done, on modern operating systems, via "pre-emptive multitasking". In other words, running programs aren't given a choice of when they will give up control of the CPU. Rather, they're forced to give up control and then resume a little while later. Each process running on a computer is handled this way. Each process can, in turn, use threads, lighter-weight units of execution that subdivide the time slice given to their parent process.
So on a hypothetical computer with five processes (and one core), each process would get about 20% of the time. If one of those processes were to have four threads, each thread would get 5% of the CPU's time. (Things are obviously more complex than that, but this is a good way to think about it at a high level.)
Python works just fine with processes via the "multiprocessing" library. The problem with processes is that they're relatively large and bulky, and you cannot use them for certain tasks, such as running a function in response to a button click, while keeping the UI responsive.
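For instance, here's a minimal sketch (the function and numbers are my own, purely for illustration) of handing CPU-heavy work to separate processes with the "multiprocessing" library:

from multiprocessing import Pool

def square(n):
    # Stand-in for a CPU-heavy calculation
    return n * n

if __name__ == '__main__':
    with Pool(4) as pool:                    # four separate worker processes
        print(pool.map(square, range(10)))   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Each worker here is a full operating-system process, which is exactly why this approach carries the weight described above.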
So, you might want to use threads. And indeed, Python's threads work, and they work well, for many tasks. But they aren't as good as they might be, because of the GIL (the global interpreter lock), which ensures that only one thread runs at a time. So sure, Python will let you run multithreaded programs, and they'll even work well when they're doing lots of I/O. That's because I/O is slow compared with the CPU and memory, and Python can take advantage of this to service other threads. If you're using threads to perform serious calculations, though, Python's threads are a bad idea, and they won't get you anywhere. Even with many cores, only one thread will execute at a time, meaning that you're no better off than running your calculations serially.
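To make that concrete, here's a small sketch (the URLs are placeholders I've chosen for illustration) of the kind of I/O-bound work where Python's threads do shine, because each thread spends most of its time waiting on the network rather than holding the GIL:

import threading
import urllib.request

def fetch(url):
    # While this thread waits for the network, the GIL is released,
    # so the other threads get a chance to run.
    with urllib.request.urlopen(url) as response:
        print(url, len(response.read()))

urls = ['https://example.com', 'https://example.org']   # placeholder URLs
threads = [threading.Thread(target=fetch, args=(url,)) for url in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()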
The asyncio additions to Python offer a different model for concurrency. As with threads, asyncio is not a good solution to problems that are CPU-bound (that is, that need lots of CPU time to crunch through calculations). Nor is it appropriate when you absolutely must have things truly running in parallel, as happens with processes.
But if your programs are working with the network, or if they do extensive I/O, asyncio just might be a good way to go.
The good news is that if it's appropriate, asyncio can be much easier to work with than threads.
The bad news is that you'll need to think in a new and different way to work with asyncio.
Cooperative Multitasking and Coroutines
Earlier, I mentioned that modern operating systems use "pre-emptive multitasking" to get things done, forcing processes to give up control of the CPU in favor of another process. But there's another model, known as "cooperative multitasking", in which the system waits until a program voluntarily gives up control of the CPU. Hence the word "cooperative"—if a function decides to perform oodles of calculations and never gives up control, then there's nothing the system can do about it.
This sounds like a recipe for disaster; why would you write, let alone run, programs that give up the CPU? The answer is simple. When your program uses I/O, you can pretty much guarantee that you'll be waiting around idly until you get a response, given how much slower I/O is than programs running in memory. Thus, you can voluntarily give up the CPU whenever you do something with I/O, knowing that soon enough, other programs similarly will invoke I/O and give up the CPU, returning control to you.
In order for this to work, you're going to need all of the programs within this cooperative multitasking universe to agree to some ground rules. In particular, you'll need them to agree that all I/O goes through the multitasking system, and that none of the tasks will hog the CPU for an extended period of time.
But wait, you'll also need a bit more. You'll need to give tasks a way to stop executing voluntarily for a little bit, and then restart from where they left off.
This last bit actually has existed in Python for some time, albeit with slightly different syntax. Let's start the journey and exploration of asyncio there.
A normal Python function, when called, executes from start to finish. For example:
def foo():
    print("a")
    print("b")
    print("c")
If you call this, you'll see:
a
b
c
Of course, it's usually good for functions not just to print something, but also to return a value:
def hello(name):
    return f'Hello, {name}'
Now when you invoke the function, you'll get something back. You can grab that returned value and assign it to a variable:
s = hello('Reuven')
But there's a variation on return that will prove central to what you're doing here, namely yield. The yield statement looks and acts much like return, but it can be used multiple times in a function, even within a loop:
def hello(name):
    for i in range(5):
        yield f'[{i}] Hello, {name}'
Because it uses yield, rather than return, this is known as a "generator function". And when you invoke it, you don't get back a string, but rather a generator object:
>>> g = hello('Reuven')
>>> type(g)
<class 'generator'>
A generator is a kind of object that knows how to behave inside a Python for loop. (In other words, it implements the iteration protocol.) When put inside such a loop, the function will start to run. However, each time the generator function encounters a yield statement, it will return the value to the loop and go to sleep. When does it wake up again? When the for loop asks for the next value from the iterator:
for s in g:
    print(s)
Generator functions thus provide the core of what you need: a function that runs normally, until it hits a certain point in the code. At that point, it returns a value to its caller and goes to sleep. When the for loop requests the next value from the generator, the function continues executing from where it left off (that is, just after the yield statement), as if it hadn't ever stopped.
The thing is that generators as described here produce output, but can't get any input. For example, you could create a generator to return one Fibonacci number per iteration, but you couldn't tell it to skip ten numbers ahead. Once the generator function is running, it can't get inputs from the caller.
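Here's a sketch of what such a Fibonacci generator might look like (my own example, not one you'll need later); notice that once the loop is consuming it, there's no way to tell it to jump ahead:

import itertools

def fib():
    a, b = 0, 1
    while True:
        yield a          # hand one Fibonacci number to the caller, then go to sleep
        a, b = b, a + b

for n in itertools.islice(fib(), 10):
    print(n)             # 0, 1, 1, 2, 3, 5, 8, 13, 21, 34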
It can't get such inputs via the normal iteration protocol, that is. Generators support a send method, allowing the outside world to send any Python object to the generator. In this way, generators now support two-way communication. For example:
def hello(name):
    while True:
        name = yield f'Hello, {name}'
        if not name:
            break
Given the above generator function, you now can say:
>>> g = hello('world')
>>> next(g)
'Hello, world'
>>> g.send('Reuven')
'Hello, Reuven'
>>> g.send('Linux Journal')
'Hello, Linux Journal'
In other words, first you run the generator function to get a generator object ("g") back. You then have to prime it with the next function, running up to and including the first yield statement. From that point on, you can submit any value you want to the generator via the send method. Until you run g.send(None), you'll continue to get output back.
Used in this way, the generator is known as a "coroutine"—that is, it has state and executes. But, it executes in tandem with the main routine, and you can query it whenever you want to get something from it.
Python's asyncio uses these basic concepts, albeit with slightly different syntax, to accomplish its goals. And although it might seem like a trivial thing to be able to send data into generators, and get things back on a regular basis, that's far from the case. Indeed, this provides the core of an entire infrastructure that allows you to create efficient network applications that can handle many simultaneous users, without the pain of either threads or processes.
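Just as a taste of where this is going, here's a minimal sketch of modern asyncio syntax, with asyncio.sleep standing in for real network I/O; async def and await play roughly the roles that def and yield played above, handing control back to a scheduler (the event loop) whenever a task would otherwise sit and wait:

import asyncio

async def hello(name):
    await asyncio.sleep(1)     # voluntarily give up control while "waiting on I/O"
    return f'Hello, {name}'

async def main():
    # Both coroutines wait concurrently, so this takes about one second, not two.
    results = await asyncio.gather(hello('Reuven'), hello('Linux Journal'))
    print(results)

asyncio.run(main())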
In my next article, I plan to start to look at asyncio's specific syntax and how it maps to what I've shown here. Stay tuned.