Multithreading

Multi threading in python

Uploaded by

saiteja1235

Prepared by P. SATHISH MCA, M.TECH (PhD)

Multithreading in Python
Thread
In computing, a process is an instance of a computer program that is being executed. Any
process has 3 basic components:
• An executable program.
• The associated data needed by the program (variables, work space, buffers, etc.)
• The execution context of the program (State of process)
A thread is an entity within a process that can be scheduled for execution. Also, it is the
smallest unit of processing that can be performed in an OS.

A thread is a sequence of instructions within a program that can be executed
independently of other code. For simplicity, you can assume that a thread is simply a subset
of a process!
A thread contains all this information in a Thread Control Block (TCB):
• Thread Identifier: Unique id (TID) is assigned to every new thread
• Stack pointer: Points to thread’s stack in the process. Stack contains the local
variables under thread’s scope.
• Program counter: a register which stores the address of the instruction
currently being executed by thread.
• Thread state: can be running, ready, waiting, start or done.
• Thread’s register set: registers assigned to thread for computations.
• Parent process Pointer: A pointer to the Process control block (PCB) of the
process that the thread lives on.
Consider the diagram below to understand the relation between process and its thread:


Multithreading
Multiple threads can exist within one process where:

• Each thread contains its own register set and local variables (stored in stack).
• All threads of a process share the global variables, the heap-allocated data and the
program code.
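The two points above can be illustrated with a minimal sketch. The names shared_results and worker are hypothetical and chosen only for this illustration: each thread keeps its own local variable, while both threads write into one shared global list.

```python
import threading

# Module-level (global) object: visible to every thread of the process.
shared_results = []

def worker(label):
    # 'local_total' is a local variable, so each thread has its own copy.
    local_total = 0
    for value in range(1, 4):
        local_total += value
    # Publish this thread's result to the shared list.
    shared_results.append((label, local_total))

threads = [threading.Thread(target=worker, args=(name,)) for name in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_results))   # [('A', 6), ('B', 6)]
```

Both threads compute 6 independently in their own local variable, and both results end up in the same global list.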
Consider the diagram below to understand how multiple threads exist in memory:

Multithreading is defined as the ability of a processor to execute multiple threads
concurrently.
In a simple, single-core CPU, it is achieved using frequent switching between threads. This
is termed context switching. In context switching, the state of a thread is saved and the state
of another thread is loaded whenever an interrupt (due to I/O or manually set) takes place.
Context switching takes place so frequently that all the threads appear to be running
in parallel (this is termed multitasking).

Consider the diagram below in which a process contains two active threads:


Multithreading in Python
In Python, the threading module provides a very simple and intuitive API for spawning
multiple threads in a program.

Let us consider a simple example using threading module:


# Python program to illustrate the concept of threading
import threading

def print_cube(num):
    """Print the cube of the given number."""
    print("Cube: {}".format(num * num * num))

def print_square(num):
    """Print the square of the given number."""
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating the threads
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))

    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()

    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()

    # both threads completely executed
    print("Done!")
output:
Square: 100
Cube: 1000
Done!
Let us try to understand the above code:

• To import the threading module, we do:
    import threading

• To create a new thread, we create an object of the Thread class. It takes the following
  arguments:
    • target: the function to be executed by the thread
    • args: the arguments to be passed to the target function
  In the above example, we created 2 threads with different target functions:
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))

• To start a thread, we use the start method of the Thread class:
    t1.start()
    t2.start()

• Once the threads start, the current program (you can think of it as the main thread)
  also keeps on executing. In order to stop execution of the current program until a
  thread is complete, we use the join method:
    t1.join()
    t2.join()


As a result, the current program will first wait for the completion of t1 and then t2.
Once they are finished, the remaining statements of the current program are executed.
Consider the diagram below for a better understanding of how above program works:

Consider the python program given below in which we print thread name and corresponding
process for each task:

# Python program to illustrate the concept of threading
import threading
import os

def task1():
    print("Task 1 assigned to thread: {}".format(threading.current_thread().name))
    print("ID of process running task 1: {}".format(os.getpid()))

def task2():
    print("Task 2 assigned to thread: {}".format(threading.current_thread().name))
    print("ID of process running task 2: {}".format(os.getpid()))

if __name__ == "__main__":
    # print ID of current process
    print("ID of process running main program: {}".format(os.getpid()))

    # print name of main thread
    print("Main thread name: {}".format(threading.current_thread().name))

    # creating threads
    t1 = threading.Thread(target=task1, name='t1')
    t2 = threading.Thread(target=task2, name='t2')

    # starting threads
    t1.start()
    t2.start()

    # wait until all threads finish
    t1.join()
    t2.join()

output:
ID of process running main program: 11758
Main thread name: MainThread
Task 1 assigned to thread: t1
ID of process running task 1: 11758
Task 2 assigned to thread: t2
ID of process running task 2: 11758
Let us try to understand the above code:


• We use the os.getpid() function to get the ID of the current process:
    print("ID of process running main program: {}".format(os.getpid()))
  As the output shows, the process ID remains the same for all threads.

• The threading.main_thread() function returns the main thread object. In normal
  conditions, the main thread is the thread from which the Python interpreter was
  started. The name attribute of a thread object is used to get the name of the thread.
  The main program above calls threading.current_thread() instead, which returns the
  same object there because the call is made from the main thread:
    print("Main thread name: {}".format(threading.current_thread().name))

• We use the threading.current_thread() function to get the current thread object:
    print("Task 1 assigned to thread: {}".format(threading.current_thread().name))

The diagram given below clears the above concept:

Why use Multithreading?

Multithreading allows you to break down an application into multiple sub-tasks and run these
tasks concurrently. If we use multithreading properly, our application's speed,
responsiveness, and rendering can all be improved.

Python MultiThreading

Python supports constructs for both multiprocessing as well as multithreading.

There are two main modules which can be used to handle threads in Python:

1. The thread module, and
2. The threading module


The Thread and Threading modules

The two modules that you will learn about in this tutorial are the thread module and
the threading module.

However, the thread module has long been deprecated. Starting with Python 3, it has been
designated as obsolete and is only accessible as _thread for backward compatibility.

The Thread Module

The syntax to create a new thread using this module is as follows:

_thread.start_new_thread(function_name, arguments)

For example:

import _thread
import time

def name(n):
    time.sleep(0.5)
    print("my name is:", n)

def country(m):
    time.sleep(0.5)
    print("my country is:", m)

r = time.time()
_thread.start_new_thread(name, ("vaag",))
_thread.start_new_thread(country, ("India",))
print("time taken from two function:", time.time() - r)

# _thread has no join(); keep the main thread alive so that both threads can print
time.sleep(1)

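OUTPUT:
time taken from two function: 0.0045555
my name is: vaag
my country is: India

The elapsed time is printed first because start_new_thread() returns immediately, while each
thread sleeps for 0.5 seconds before printing. The order of the last two lines may vary.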


The Threading Module

This module is the high-level implementation of threading in python and the de facto
standard for managing multithreaded applications. It provides a wide range of features when
compared to the thread module.

Structure of Threading module

Here is a list of some useful functions defined in this module:

Function Name       Description

active_count()      Returns the number of Thread objects which are still alive
                    (older spelling: activeCount()).
current_thread()    Returns the Thread object of the calling thread
                    (older spelling: currentThread()).
enumerate()         Returns a list of all Thread objects that are currently alive.

Thread Class methods

Method Name         Description

is_alive()          Returns True if the thread is still alive. The older spelling
                    isAlive() was removed in Python 3.9.
isDaemon()          Returns True if the thread is a daemon. Newer code reads the
                    daemon attribute instead.
start()             Starts the activity of a thread. It must be called only once for each
                    thread because it raises a RuntimeError if called multiple times.
run()               This method denotes the activity of a thread and can be overridden by
                    a class that extends the Thread class.
join()              Blocks the calling thread until the thread on which the
                    join() method was called terminates.
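As a rough illustration of the functions and methods listed above, the following sketch starts one thread and inspects it. The function nap and the thread name "napper" are hypothetical, and the thread count printed depends on the environment in which the script runs.

```python
import threading
import time

def nap():
    # Keep the thread alive long enough to be inspected.
    time.sleep(0.2)

t = threading.Thread(target=nap, name="napper")

before = t.is_alive()            # False: the thread has not been started yet
t.start()
during = t.is_alive()            # True: nap() is still sleeping

# Names of all threads that are currently alive (includes the main thread).
names = [th.name for th in threading.enumerate()]
print("current thread:", threading.current_thread().name)
print("alive threads:", names)
print("active count:", threading.active_count())

t.join()
after = t.is_alive()             # False: the thread has finished
print(before, during, after)     # False True False
```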

Backstory: The Thread Class

Before you start coding multithreaded programs using the threading module, it is crucial to
understand the Thread class. The Thread class is the primary class which defines the
template and the operations of a thread in Python.

The most common way to create a multithreaded Python application is to declare a class
which extends the Thread class and overrides its run() method.

The Thread class, in summary, signifies a code sequence that runs in a separate thread of
control.

So, when writing a multithreaded app, you will do the following:

1. Define a class which extends the Thread class
2. Override the __init__ constructor
3. Override the run() method

Once a thread object has been made, the start() method can be used to begin the execution of
this activity and the join() method can be used to block the calling thread until the activity
finishes.
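The three steps above can be sketched as follows. SquareThread is a hypothetical class name used only for this illustration, and storing the result on the object is one possible design, not the only one.

```python
import threading

class SquareThread(threading.Thread):
    """Hypothetical example: computes the square of a number in its own thread."""

    def __init__(self, number):
        # Step 2: override __init__, and initialise the base Thread class first.
        super().__init__()
        self.number = number
        self.result = None

    def run(self):
        # Step 3: run() holds the thread's activity; start() arranges for it to be called.
        self.result = self.number * self.number

worker = SquareThread(7)
worker.start()     # begins executing run() in a new thread
worker.join()      # wait for run() to finish
print(worker.result)   # 49
```

Note that the code calls start(), not run(). Calling run() directly would execute the method in the calling thread instead of a new one.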

import threading
import time

def square(n):
    print("square of a number is:", n)
    for i in n:
        time.sleep(0.2)
        print("square is:", i * i)

def cube(n):
    print("cube of a number is:", n)
    for i in n:
        time.sleep(0.2)
        print("cube is:", i * i * i)

r = time.time()
l = [1, 2, 3, 4, 5]

t1 = threading.Thread(target=square, args=(l,))
t2 = threading.Thread(target=cube, args=(l,))

t1.start()
t2.start()

t1.join()
t2.join()

print("time taken to execute two fun:", time.time() - r)

OUTPUT
square of a number is: [1, 2, 3, 4, 5]
cube of a number is: [1, 2, 3, 4, 5]
cube is: 1
square is: 1
square is: 4
cube is: 8
square is: 9
cube is: 27
square is: 16
cube is: 64
square is: 25
cube is: 125
time taken to execute two fun: 1.3097057342529297

The exact interleaving of the "square" and "cube" lines can vary from run to run.


Synchronizing threads

To deal with race conditions and similar thread-related issues, the threading module
provides the Lock object. The idea is that when a thread wants access to a specific resource,
it first acquires a lock for that resource. Once a thread holds the lock, no other thread that
follows the same rule can access the resource until the lock is released. As a result, each
change to the resource happens as one uninterrupted step, and race conditions are averted.

A lock is a low-level synchronization primitive implemented by the _thread module. At any
given time, a lock can be in one of 2 states: locked or unlocked. It supports two methods:

1. acquire()

When the lock-state is unlocked, calling the acquire() method will change the state to
locked and return. However, if the state is locked, the call to acquire() blocks until
the release() method is called by some other thread.

2. release()

The release() method is used to set the state to unlocked, i.e., to release a lock. It can
be called by any thread, not necessarily the one that acquired the lock.

import threading

lock = threading.Lock()

def first_function():
    for i in range(5):
        lock.acquire()
        print('lock acquired')
        print('Executing the first function')
        lock.release()

def second_function():
    for i in range(5):
        lock.acquire()
        print('lock acquired')
        print('Executing the second function')
        lock.release()

if __name__ == "__main__":
    thread_one = threading.Thread(target=first_function)
    thread_two = threading.Thread(target=second_function)

    thread_one.start()
    thread_two.start()

    thread_one.join()
    thread_two.join()
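A lock is most useful when several threads update the same data. The sketch below is an illustration with hypothetical names (counter, counter_lock, increment): four threads increment a shared counter. It uses the with statement, which calls acquire() on entry and release() on exit, even if an exception is raised inside the block.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # 'with' acquires the lock on entry and releases it on exit.
        with counter_lock:
            counter += 1     # read-modify-write protected by the lock

threads = [threading.Thread(target=increment, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 40000
```

Because every update happens while the lock is held, no increment is lost and the final value is always 4 x 10000.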


Apart from locks, python also supports some other mechanisms to handle thread
synchronization as listed below:

1. RLocks
2. Semaphores
3. Conditions
4. Events, and
5. Barriers
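As one example from this list, a semaphore lets a limited number of threads into a section of code at the same time. The following is a minimal sketch with hypothetical names (slots, limited_task, peak); the limit of 2 and the sleep duration are arbitrary choices for the illustration.

```python
import threading
import time

slots = threading.Semaphore(2)    # at most 2 threads may hold a slot at once
state_lock = threading.Lock()     # protects the two counters below
active = 0                        # threads currently inside the guarded section
peak = 0                          # highest value 'active' ever reached

def limited_task():
    global active, peak
    with slots:                   # blocks while 2 other threads hold a slot
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)          # stand-in for real work
        with state_lock:
            active -= 1

threads = [threading.Thread(target=limited_task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrency:", peak)  # never more than 2
```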

What is GIL in Python?

The Global Interpreter Lock (GIL) in Python is a process-wide lock (a mutex) inside the
interpreter. It makes sure that only one thread executes Python bytecode at a time, which
prevents threads from modifying the interpreter's objects simultaneously. This keeps
single-threaded programs fast, and the GIL is simple and easy to implement.

A lock can be used to make sure that only one thread has access to a particular resource at a
given time.

One of the features of Python is that it uses a global lock on each interpreter process, which
means that every process treats the python interpreter itself as a resource.

For example, suppose you have written a python program which uses two threads to perform
both CPU and 'I/O' operations. When you execute this program, this is what happens:

1. The python interpreter creates a new process and spawns the threads
2. When thread-1 starts running, it will first acquire the GIL and lock it.


3. If thread-2 wants to execute now, it will have to wait for the GIL to be released even
if another processor is free.
4. Now, suppose thread-1 is waiting for an I/O operation. At this time, it will release the
GIL, and thread-2 will acquire it.
5. After completing the I/O ops, if thread-1 wants to execute now, it will again have to
wait for the GIL to be released by thread-2.
6. Due to this, only one thread can access the interpreter at any time, meaning that there
will be only one thread executing python code at a given point of time.
7. This is alright in a single-core processor because it would be using time slicing (see
the first section of this tutorial) to handle the threads. However, in case of multi-core
processors, a CPU-bound function executing on multiple threads will have a
considerable impact on the program's efficiency since it won't actually be using all the
available cores at the same time.
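The effect described in the last step can be observed with a rough timing sketch. The function count_down and the value of N are hypothetical choices for this illustration, and the measured times depend on the machine and the Python build, so no particular numbers are claimed here.

```python
import threading
import time

def count_down(n):
    # Pure Python loop: CPU-bound work that needs the GIL the whole time.
    while n > 0:
        n -= 1

N = 2_000_000

# Run the work twice, one call after the other, in the main thread.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two calls in two threads.
start = time.perf_counter()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

print("sequential: {:.2f}s, two threads: {:.2f}s".format(sequential, threaded))
```

On an interpreter with the GIL, the two-thread version is typically not much faster than the sequential one, even on a multi-core machine.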

What does the Python Global Interpreter Lock (GIL) do?

The Global Interpreter Lock (GIL) of Python allows only one thread to execute Python code
at a time.

What is GIL?

The Global Interpreter Lock (GIL) is a Python process lock. As you can guess, it "locks"
something from happening. The something here is the parallel execution of threads. The GIL
does not stop you from creating multiple threads, but it stops them from running Python
code at the same moment, which can sometimes be considered a disadvantage.

Why does python need GIL?

So far, we have seen that the GIL restricts parallel execution and can reduce efficiency.
Despite these drawbacks, Python uses the GIL. Why?

Unlike many other programming languages, Python uses a "reference counter" for
memory management. When an object is created in Python, a reference-counter
variable is dedicated to it. It keeps track of the number of references that point to that
particular object. When the count drops to zero, the object's memory is released.

You can get the reference count through sys.getrefcount() function.
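A small sketch of sys.getrefcount() follows. The helper name show_refcounts is hypothetical. The absolute numbers returned are an implementation detail (the call itself may hold a temporary reference), so the example only looks at how the count changes when one more reference is created.

```python
import sys

def show_refcounts():
    data = []                      # the object whose references we count
    holder = []
    before = sys.getrefcount(data)
    holder.append(data)            # the list 'holder' now also refers to 'data'
    after = sys.getrefcount(data)
    return before, after

before, after = show_refcounts()
print(after - before)   # 1: one additional reference was created
```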

What will happen to the reference counter in case of MultiThreading ?

With multithreading, there is a possibility that two threads might increase
or decrease the counter's value at the same time. Because of this, an object might
be incorrectly released from memory while a reference to it still exists, or might
never be released at all.

This can cause leaked memory, crashes, or other hard-to-find bugs. Hence, the GIL
protects the reference counters by letting only one thread execute Python code at a time.

Why GIL is chosen as the solution?

Some of the reasons were:

1. Python is used extensively because of the variety of packages it provides.
   Many of these packages are written in C or C++. These C extensions were
   prone to inconsistent changes. The GIL provides the thread-safe memory
   management that they required.
2. It is a simple design, as only one lock has to be managed.
3. The GIL also provides a performance boost to single-threaded programs.
4. It makes it possible to integrate many C libraries with Python. This is one of
   the main reasons Python became popular.

Impact of GIL on Multi-threaded problems

We have already seen that the GIL prevents threads from running Python code in parallel,
which reduces efficiency. Let's look at this in more detail. First, there are two types of
programs: CPU-bound and I/O-bound.

What are CPU-bound and I/O-bound programs?

CPU-bound means that the majority of the time taken to complete the program depends
on the CPU (central processing unit), i.e. the CPU is the bottleneck.

Computations such as matrix multiplication, searching, and image processing
fall under CPU-bound.

I/O-bound means that the program is bottlenecked by input/output (I/O). This
includes tasks such as reading from or writing to disk, processing inputs, and network
communication. I/O-bound programs depend on the data source and the user. Python's GIL
mainly impacts CPU-bound programs.

For CPU-bound programs, multithreading could in principle save a great deal of time and
resources: if you have multiple CPU cores, each thread could execute on a separate core.
But the GIL prevents this. Python threads cannot run in parallel on
multiple CPU cores due to the global interpreter lock (GIL).
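I/O-bound work is affected much less, because a thread releases the GIL while it waits. The sketch below illustrates this under a simplifying assumption: the hypothetical function fake_download uses time.sleep() as a stand-in for waiting on a network or disk operation.

```python
import threading
import time

def fake_download(delay):
    # Stand-in for an I/O wait; the GIL is released while the thread sleeps.
    time.sleep(delay)

start = time.perf_counter()
threads = [threading.Thread(target=fake_download, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Four waits of 0.2 s overlap, so the total is far below the 0.8 s
# that running them one after another would take.
print("elapsed: {:.2f}s".format(elapsed))
```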


How to deal with GIL?

The previous sections described the problems the GIL creates, especially for CPU-bound
programs. There have been attempts to remove the GIL from Python, but they broke some
of the C extensions, which caused more problems. Other solutions decreased the
efficiency and performance of single-threaded programs. Hence, the GIL has not been removed.
So, let's discuss some ways you could deal with it.

The most common way is to use a multiprocessing approach instead of
multithreading. We use multiple processes instead of multiple threads. In this case,
Python provides a separate interpreter for each process. In short, there are multiple
processes, but each process has a single thread.

Each process gets its own Python interpreter and memory space which means GIL
won’t stop it.
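A minimal sketch of the multiprocessing approach is shown below, using Pool from the standard multiprocessing module. The function names sum_of_squares and run_in_processes, the inputs, and the number of worker processes are hypothetical choices for this illustration.

```python
from multiprocessing import Pool

def sum_of_squares(n):
    # CPU-bound work: runs inside a worker process with its own interpreter and GIL.
    return sum(i * i for i in range(n))

def run_in_processes(inputs, workers=2):
    # Distribute the inputs over a pool of worker processes.
    with Pool(processes=workers) as pool:
        return pool.map(sum_of_squares, inputs)

if __name__ == "__main__":
    # The guard prevents worker processes from re-running this part of the script.
    print(run_in_processes([10, 100, 1000]))
```

The if __name__ == "__main__" guard matters here: on platforms that start worker processes by importing the main module again, unguarded code would be executed in every worker.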

Multithreading in Python
The threading module is used for multithreading in Python. It provides several
functions and methods that make multithreading easy to implement.
Before we start using the threading module, we first introduce the time module. It
provides functions such as time() and ctime(), which we will use often to get the
current system time, and another crucial function, sleep(), which suspends the
execution of the current thread for a given number of seconds.

For example

import time

print("Current time is: " + time.ctime())
print("Going to sleep for 3 seconds...")
time.sleep(3)
print("I am awake!")

output:
Current time is: Sat Jan 16 11:39:03 2021
Going to sleep for 3 seconds...
I am awake!

Now consider a program that uses sleep() inside a thread:

import time
import threading

def thread1(i):
    time.sleep(3)
    print('No. printed by Thread 1: %d' % i)

def thread2(i):
    print('No. printed by Thread 2: %d' % i)

if __name__ == '__main__':
    t1 = threading.Thread(target=thread1, args=(10,))
    t2 = threading.Thread(target=thread2, args=(12,))

    # start the threads
    t1.start()
    t2.start()

    # wait for both threads to finish
    t1.join()
    t2.join()

    print('Execution completed.')

output:
No. printed by Thread 2: 12
No. printed by Thread 1: 10
Execution completed.

Let's try to understand the above code:


We imported the threading module using the import threading statement, and we also imported
the time module. To create a new thread, we create an object of the Thread class. It takes the
following arguments:
target: The function which will be executed by the thread.
args: The arguments to be passed to the target function, given as a tuple. Multiple
arguments are separated by commas.
In the example above, we created 2 threads with different target functions,
i.e. thread1(i) and thread2(i).
To start a thread, we used the start() method of the Thread class.
We also used the sleep() function of the time module to pause the execution
of thread1 for 3 seconds.
Once the threads start, the current program (you can think of it as the main thread) also keeps
on executing. To prevent the main program from continuing until the threads have
finished, we use the join() method.
As a result, the current program waits for the completion of t1 and t2, and only after they
have finished are the remaining statements of the current program executed, i.e.
the statement print('Execution completed.').
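The description of args above can be extended with a short sketch that passes more than one argument. The function greet and the list messages are hypothetical names for this illustration. Positional arguments go in the args tuple, and keyword arguments can be supplied through the kwargs parameter of Thread.

```python
import threading

messages = []

def greet(greeting, name, punctuation="."):
    # Build a message from two positional arguments and one keyword argument.
    messages.append("{}, {}{}".format(greeting, name, punctuation))

t = threading.Thread(
    target=greet,
    args=("Hello", "world"),          # positional arguments, as a tuple
    kwargs={"punctuation": "!"},      # keyword arguments, as a dict
)
t.start()
t.join()

print(messages[0])   # Hello, world!
```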
import time
import threading

def thread1(i):
    time.sleep(3)
    # print('No. printed by Thread 1: %d' % i)

def thread2(i):
    time.sleep(3)
    # print('No. printed by Thread 2: %d' % i)

if __name__ == '__main__':
    t1 = threading.Thread(target=thread1, args=(10,))
    t2 = threading.Thread(target=thread2, args=(12,))
    t1.start()
    t2.start()
    print("Current thread is: ", threading.current_thread())
    print("No. of active threads: ", threading.active_count())
    t1.join()
    t2.join()

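output:
Current thread is: <_MainThread(MainThread, started 7264)>
No. of active threads: 4

The count includes the main thread. The two threads created here plus the main thread
account for 3, so the reported value of 4 suggests that the environment used for this run
had one additional thread of its own.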
