
An Introduction to Python Async IO


By default, a Python program runs in a single thread, which means its tasks are queued and run one after another. That is fine for small programs that perform little or no I/O. But when a program has to fetch data from an API or a remote database, each I/O operation may take a considerable amount of time depending on the network speed, and the program sits idle while it waits. For those programs, we need asynchronous I/O (Async I/O) to keep the application responsive.

Async I/O is a concurrent programming paradigm that has had dedicated support in Python since version 3.4, when the asyncio package was added to the standard library. It lets a program start further I/O operations while earlier ones are still waiting for their responses.

At this point, you may be wondering how concurrency, parallelism, threading and multiprocessing differ, and where Async I/O fits in. Let’s break these down into simple terms. For more details about multithreading and multiprocessing in Python, I recommend my previous article.

Concurrency vs Parallelism

Fig: Concurrency

Source: http://tutorials.jenkov.com/java-concurrency/concurrency-vs-parallelism.html

Concurrency is an abstraction that lets a computer with a single CPU make progress on more than one task at the same time, or at least seemingly at the same time. To execute several tasks concurrently, the CPU switches between them during execution, as illustrated in the figure above.

Fig: Parallelism

Source: http://tutorials.jenkov.com/java-concurrency/concurrency-vs-parallelism.html

Parallelism, by contrast, applies to computers with multiple CPUs or cores, where each CPU executes a different task at the same instant. In the figure above, 2 CPUs are processing 2 different tasks in parallel.

While both multithreading and multiprocessing allow tasks to run concurrently, in standard CPython builds only multiprocessing lets Python code run truly in parallel, because the global interpreter lock (GIL) allows only one thread at a time to execute Python bytecode. Async I/O is a technique for running more than one task concurrently in a single-threaded application, without creating new threads or processes manually.
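To make the single-threaded claim concrete, here is a minimal sketch that records the thread ID seen by several coroutines. The names worker and seen_threads are made up for this illustration, and the asyncio calls it uses (asyncio.run, asyncio.gather, asyncio.sleep) are introduced later in this article.

```python
import asyncio
import threading


async def worker(name, seen_threads):
    # Record which OS thread this coroutine runs on, before and after a pause.
    seen_threads.add(threading.get_ident())
    await asyncio.sleep(0.01)
    seen_threads.add(threading.get_ident())
    return name


async def main():
    seen_threads = set()
    # Run three workers concurrently; gather returns results in argument order.
    names = await asyncio.gather(
        *(worker(f"task-{i}", seen_threads) for i in range(3))
    )
    return names, seen_threads


names, seen_threads = asyncio.run(main())
print(names)
print(f"distinct threads used: {len(seen_threads)}")
```

All three workers should report the same thread ID, the one of the main thread, even though they overlap in time.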

The asyncio Package

According to the Python documentation, asyncio is a library to write concurrent code using the async/await syntax. With this syntax we declare native coroutines, which are the building blocks of concurrent programming with asyncio.

Here we will discuss native coroutines only, as they are now the preferred way to declare coroutines.

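Note: Generator-based coroutines (declared with the @asyncio.coroutine decorator) are deprecated. The Python documentation originally scheduled their removal for Python 3.10, and the decorator was removed in Python 3.11.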

async/await and Coroutines

Python coroutines are functions whose execution can be paused (suspended) at chosen points before they return, and resumed later from the same point.

So, to make a coroutine, we need a keyword that inserts a checkpoint into a function, pauses its execution there, and returns control to whoever is running it.

In Python, we use async def to declare a native coroutine, and the await keyword inside it to suspend the coroutine and hand control back until the awaited operation completes. Let’s understand this with a simple example:

import asyncio
from datetime import datetime

async def print_hello_world():
    print(f'{datetime.now().strftime("%H:%M:%S")} -> Hello')
    await asyncio.sleep(4)
    print(f'{datetime.now().strftime("%H:%M:%S")} -> World')

if __name__ == '__main__':
    print(f'Program Started -> {datetime.now().strftime("%H:%M:%S")}')
    asyncio.run(print_hello_world())
    print(f'Program Completed -> {datetime.now().strftime("%H:%M:%S")}')

Output:

Here, when await asyncio.sleep(4) executes, the coroutine print_hello_world() is suspended for 4 seconds and control is transferred to the event loop (discussed below). The event loop checks whether any other coroutines are ready to run and, if so, runs them in the meantime. In this program there are none, so the loop simply waits. After 4 seconds the event loop resumes print_hello_world(), and the rest of the coroutine is executed.
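To see the pause and resume without any event loop involved, the following sketch drives a coroutine by hand. Pause and steps are hypothetical names invented for this illustration. In real programs the event loop makes these send() calls for us, so treat this as a look under the hood rather than a pattern to copy.

```python
class Pause:
    """A tiny awaitable whose only job is to suspend the awaiting coroutine once."""

    def __await__(self):
        # The bare yield is the suspension point: control returns to whoever
        # is driving the coroutine (normally the event loop).
        yield


async def steps(log):
    log.append("step 1")
    await Pause()          # execution is suspended here
    log.append("step 2")
    return "done"


log = []
coro = steps(log)          # creates a coroutine object, runs nothing yet
log_before_start = list(log)

coro.send(None)            # run until the first suspension point
log_after_first_send = list(log)

try:
    coro.send(None)        # resume after the await and run to the end
except StopIteration as stop:
    result = stop.value    # the coroutine's return value travels in StopIteration

print(log_before_start, log_after_first_send, log, result)
```

Calling steps(log) only creates the coroutine object. The body runs in two separate slices, one per send() call, which is exactly the gap an event loop uses to run other coroutines.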

Event Loop

According to the Python documentation:

The event loop is the core of every asyncio application. Event loops run asynchronous tasks and callbacks, perform network IO operations, and run subprocesses.

Application developers should typically use the high-level asyncio functions, such as asyncio.run(), and should rarely need to reference the loop object or call its methods.
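For the rare cases where the loop object is needed, here is a small sketch of what working with it looks like. It assumes we want the loop to run a plain callback after a short delay. The 0.05 second delay and the "callback fired" message are arbitrary choices for this illustration.

```python
import asyncio


async def main():
    # asyncio.run() created the event loop; this returns that running loop.
    loop = asyncio.get_running_loop()

    # A Future is a placeholder for a result that will arrive later.
    future = loop.create_future()

    # Ask the loop to call future.set_result("callback fired") in about 0.05 s.
    loop.call_later(0.05, future.set_result, "callback fired")

    # Suspend main() until the callback has filled in the future.
    result = await future
    return result, loop.is_running()


result, loop_was_running = asyncio.run(main())
print(result, loop_was_running)
```

asyncio.run() creates the loop, runs main() to completion and closes the loop again, which is why most application code never has to touch the loop directly.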

Tasks

An asyncio Task is a subclass of Future that wraps a coroutine. When a coroutine is wrapped into a Task with a function like asyncio.create_task(), it is automatically scheduled to run soon. Tasks also let us track the execution of the wrapped coroutine: we can cancel a task, check whether it is done, or add a callback to be run when the task is done.
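The three operations just mentioned can be sketched as follows. slow_job and quick_job are hypothetical coroutines standing in for real work, and the sleep durations are arbitrary.

```python
import asyncio


async def slow_job():
    await asyncio.sleep(10)    # stands in for long-running work
    return "slow"


async def quick_job():
    await asyncio.sleep(0.01)
    return "quick"


async def main():
    finished = []

    quick = asyncio.create_task(quick_job())
    slow = asyncio.create_task(slow_job())

    # Run a callback as soon as the quick task is done.
    quick.add_done_callback(lambda task: finished.append(task.result()))

    await quick
    await asyncio.sleep(0)     # one more loop iteration so the callback has run

    # We no longer need the slow task: request cancellation and wait for it.
    slow.cancel()
    try:
        await slow
    except asyncio.CancelledError:
        pass

    return finished, quick.done(), slow.cancelled()


finished, quick_done, slow_cancelled = asyncio.run(main())
print(finished, quick_done, slow_cancelled)
```

Note that cancel() only requests cancellation. The task is actually cancelled the next time it gets a chance to run, which is why the sketch awaits it afterwards.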

Running Tasks Concurrently

import asyncio
from datetime import datetime

# Coroutine 1
async def coroutine1():
    print(f'{datetime.now().strftime("%H:%M:%S")} -> Coroutine 1 Started...')
    await asyncio.sleep(4)
    print(f'{datetime.now().strftime("%H:%M:%S")} -> Coroutine 1 Completed...')

# Coroutine 2
async def coroutine2():
    print(f'{datetime.now().strftime("%H:%M:%S")} -> Coroutine 2 Started...')
    await asyncio.sleep(2)
    print(f'{datetime.now().strftime("%H:%M:%S")} -> Coroutine 2 Completed...')

# Coroutine 3
async def coroutine3():
    print(f'{datetime.now().strftime("%H:%M:%S")} -> Coroutine 3 Started...')
    await asyncio.sleep(3)
    print(f'{datetime.now().strftime("%H:%M:%S")} -> Coroutine 3 Completed...')

async def main():
    # Wrap coroutines into Tasks
    tasks = [
        asyncio.create_task(coroutine1()),
        asyncio.create_task(coroutine2()),
        asyncio.create_task(coroutine3())
    ]

    # Wait for all tasks to complete
    await asyncio.gather(*tasks)

if __name__ == '__main__':
    print(f'Program Started -> {datetime.now().strftime("%H:%M:%S")}')

    # Run the main coroutine
    asyncio.run(main())

    print(f'Program Completed -> {datetime.now().strftime("%H:%M:%S")}')

Output:

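Here all three coroutines run concurrently without blocking each other. They all start at almost the same moment, then Coroutine 2 completes after about 2 seconds, Coroutine 3 after about 3 seconds and Coroutine 1 after about 4 seconds. The whole program therefore takes roughly 4 seconds, the length of the longest sleep, instead of the 9 seconds (4 + 2 + 3) it would need if the coroutines ran one after another.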
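The difference between running coroutines one after another and running them concurrently can also be measured directly. The sketch below is a scaled-down variant of the program above: pause, run_sequentially and run_concurrently are names invented for this illustration, and the delays are ten times shorter so that it finishes quickly.

```python
import asyncio
import time


async def pause(seconds):
    await asyncio.sleep(seconds)
    return seconds


async def run_sequentially(delays):
    # Await each coroutine in turn: the waits add up.
    start = time.perf_counter()
    for delay in delays:
        await pause(delay)
    return time.perf_counter() - start


async def run_concurrently(delays):
    # Hand all coroutines to gather(): the waits overlap.
    start = time.perf_counter()
    await asyncio.gather(*(pause(delay) for delay in delays))
    return time.perf_counter() - start


delays = [0.4, 0.2, 0.3]
sequential = asyncio.run(run_sequentially(delays))
concurrent = asyncio.run(run_concurrently(delays))
print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

The sequential time should come out close to the sum of the delays and the concurrent time close to the largest single delay, though the exact figures vary from run to run.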

Before wrapping up this tutorial, I want to mention a few more things. Concurrency does not imply parallelism, and parallelism does not imply concurrency, but the two can be combined. For example, a program can run several threads or processes in parallel, and each of them may or may not be interleaving several tasks of its own. Concurrency and parallelism are vast topics, beyond the scope of this article, but I’ll try to discuss them in detail in future articles. Until then, stay safe and happy coding!
