Sorting and Hashing (1)
Code: CSPC-213
By
Dr Kunwar Pal
Sorting Algorithms
• Sorting is the process of arranging the elements of an array so
that they are placed either in ascending or descending order.
For example, consider an array A = {A1, A2, A3, A4, …, An};
the array is said to be in ascending order if the elements of
A are arranged like A1 < A2 < A3 < A4 < A5 < … < An.
• Consider an array;
• int A[10] = { 5, 4, 10, 2, 30, 45, 34, 14, 18, 9 }
• The Array sorted in ascending order will be given as;
A[] = { 2, 4, 5, 9, 10, 14, 18, 30, 34, 45 }
05/23/2025 2
Sorting Terminologies
What is in-place sorting?
• An in-place sorting algorithm uses constant extra space for producing the
output (modifies the given array only).
• It sorts the list only by modifying the order of the elements within the list.
For example, Insertion Sort and Selection Sort are in-place sorting
algorithms, as they do not use any additional space for sorting the list. A
typical implementation of Merge Sort is not in-place, and the usual
implementation of Counting Sort is also not an in-place sorting algorithm.
• Insertion Sort
• Bubble Sort
• Selection Sort
• Merge Sort
• Quick Sort
• Radix Sort.
• Heap Sort
Sorting
Merge Sort Approach
• Merge Sort is a Divide and Conquer algorithm. It divides the input array into two
halves, calls itself for the two halves, and then merges the two sorted halves.
• The merge() function is used for merging two halves. The merge(arr, l, m, r) is a
key process that assumes that arr[l..m] and arr[m+1..r] are sorted and merges the
two sorted sub-arrays into one.
MergeSort(arr[], l, r)
If r > l
1. Find the middle point to divide the array into two halves:
middle m = l+ (r-l)/2
2. Call mergeSort for first half:
Call mergeSort(arr, l, m)
3. Call mergeSort for second half:
Call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in step 2 and 3:
Call merge(arr, l, m, r)
Sorting
Merge Sort
Alg.: MERGE-SORT(A, p, r)
if p < r
then q ← ⌊(p + r)/2⌋
MERGE-SORT(A, p, q)
MERGE-SORT(A, q + 1, r)
MERGE(A, p, q, r)
Sorting
Merge Sort
Alg.: MERGE-SORT(A, p, r)
if p < r
then q ← ⌊(p + r)/2⌋
MERGE-SORT(A, p, q)
MERGE-SORT(A, q + 1, r)
MERGE(A, p, q, r)
Initial call:
MERGE-SORT(A, 1, n)
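The MERGE-SORT pseudocode above translates directly into C. This is a minimal sketch using 0-based indices instead of the pseudocode's 1-based ones; the temporary buffer `tmp` is why a typical merge sort is not in-place:

```c
#include <assert.h>
#include <string.h>

/* Merge the sorted halves arr[l..m] and arr[m+1..r]. */
static void merge(int arr[], int l, int m, int r) {
    int tmp[r - l + 1];
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r)                 /* take the smaller head */
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= m) tmp[k++] = arr[i++];      /* drain the leftover pile */
    while (j <= r) tmp[k++] = arr[j++];
    memcpy(&arr[l], tmp, sizeof tmp);        /* copy back into arr */
}

/* Sort arr[l..r], inclusive bounds as in the pseudocode. */
void mergeSort(int arr[], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;             /* middle point */
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);
        merge(arr, l, m, r);
    }
}
```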
Sorting
Example – n Power of 2
Divide (q = 4 at the top level):
[5 2 4 7 1 3 2 6]
[5 2 4 7] [1 3 2 6]
[5 2] [4 7] [1 3] [2 6]
[5] [2] [4] [7] [1] [3] [2] [6]
Sorting
Example – n Power of 2
Conquer and Merge:
[5] [2] [4] [7] [1] [3] [2] [6]
[2 5] [4 7] [1 3] [2 6]
[2 4 5 7] [1 2 3 6]
[1 2 2 3 4 5 6 7]
Sorting
Example – n Not a Power of 2
Divide (q = 6 at the top level; then q = 3 and q = 9):
[4 7 2 6 1 4 7 3 5 2 6]
[4 7 2 6 1 4] [7 3 5 2 6]
[4 7 2] [6 1 4] [7 3 5] [2 6]
[4 7] [2] [6 1] [4] [7 3] [5] [2] [6]
[4] [7] [2] [6] [1] [4] [7] [3] [5] [2] [6]
Sorting
Conquer and Merge:
[4] [7] [2] [6] [1] [4] [7] [3] [5] [2] [6]
[4 7] [2] [1 6] [4] [3 7] [5] [2] [6]
[2 4 7] [1 4 6] [3 5 7] [2 6]
[1 2 4 4 6 7] [2 3 5 6 7]
[1 2 2 3 4 4 5 6 6 7 7]
Sorting
Merging
A[p..r] = [2 4 5 7 | 1 2 3 6], with p = 1, q = 4, r = 8
Sorting
Merging
• Idea for merging: two piles of sorted cards (e.g. A[p..q] = [2 4 5 7] and A[q+1..r] = [1 2 3 6]).
– Choose the smaller of the two top cards.
– Remove it and place it in the output pile.
– Repeat the process until one pile is empty.
– Take the remaining input pile and place it face-down onto the output pile.
• A1 = A[p..q] and A2 = A[q+1..r] are merged into A[p..r].
Time complexity of Merge Sort
As we saw with Binary Search, whenever we halve the input at every step, the number of
halving steps can be represented by a logarithmic function, log n, and the number of
levels of recursion is log n + 1 (at most).
Finding the middle of any subarray is a single-step operation, i.e. O(1).
And merging all the subarrays at one level of the recursion, across the n elements of the
original array, requires O(n) running time.
Hence the total time for the mergeSort function becomes n(log n + 1), which gives us a
time complexity of O(n log n).
Quick Sort
• Quick Sort is also based on the concept of Divide and Conquer, just like merge sort.
• But in quick sort all the heavy lifting (major work) is done while dividing the array into
subarrays, while in merge sort all the real work happens during merging the subarrays.
• In quick sort, the combine step does absolutely nothing.
• It is also called partition-exchange sort. This algorithm divides the list into three main
parts: the elements smaller than the pivot, the pivot itself, and the elements greater
than the pivot.
• The pivot can be any element from the array: the first element, the last element, or
any random element.
• There are many different versions of quickSort that pick the pivot in different ways:
– Always pick the first element as pivot.
– Always pick the last element as pivot (implemented below).
– Pick a random element as pivot.
– Pick the median as pivot.
Quick Sort
• In the array {52, 37, 63, 14, 17, 8, 6, 25}, we take 25 as pivot.
• So after the first pass, the list will be changed like this.
• {6 8 17 14 25 63 37 52}
• Hence after the first pass, the pivot is set at its final position, with
all the elements smaller than it on its left and all the elements
larger than it on its right.
• Now {6 8 17 14} and {63 37 52} are treated as two separate
subarrays, the same recursive logic is applied to them, and we
keep doing this until the complete array is sorted.
Quick Sort
• The key process in quickSort is partition().
• The goal of partition() is: given an array and an element x of the array as pivot, put
x at its correct position in the sorted array, put all smaller elements (smaller than x)
before x, and put all greater elements (greater than x) after x.
• All this should be done in linear time.
Pseudo Code for recursive QuickSort function :
/* low --> Starting index, high --> Ending index */
quickSort(arr[], low, high)
{
if (low < high)
{
/* pi is partitioning index, arr[pi] is now
at right place */
pi = partition(arr, low, high);
quickSort(arr, low, pi - 1); // Before pi
quickSort(arr, pi + 1, high); // After pi
}
}
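The pseudocode above can be completed into a runnable C sketch. The `partition` shown here is the Lomuto scheme, matching the "last element as pivot" variant described earlier:

```c
#include <assert.h>

/* Lomuto partition: last element as pivot; returns the pivot's
   final index, with smaller elements on its left. */
static int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;                       /* end of the "smaller" region */
    for (int j = low; j < high; j++) {
        if (arr[j] < pivot) {
            i++;
            int t = arr[i]; arr[i] = arr[j]; arr[j] = t;
        }
    }
    int t = arr[i + 1]; arr[i + 1] = arr[high]; arr[high] = t;
    return i + 1;                          /* partitioning index pi */
}

void quickSort(int arr[], int low, int high) {
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);       /* before pi */
        quickSort(arr, pi + 1, high);      /* after pi */
    }
}
```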
Quick Sort
Partitioning Algorithm: this function takes the last element as pivot, places the
pivot element at its correct position in the sorted array, and places all smaller
elements (smaller than the pivot) to the left of the pivot and all greater elements to its right.
Final step of a partition trace: arr[] = {10, 30, 40, 50, 70, 90, 80} // 80 and 70 swapped
Complexity Analysis
Quick Sort Analysis- (Best and Average Case)
To find the location of an element that splits the array into two parts, O(n)
operations are required.
This is because every element in the array is compared to the partitioning
element.
After the division, each section is examined separately.
If the array is split approximately in half (which is not always the case), then there
will be log2 n splits.
Complexity Analysis
Worst Case-
The worst case occurs when the pivot always lands at one end of the subarray (e.g. an
already sorted input with the first or last element as pivot): one partition is empty and
the other holds n − 1 elements, giving the recurrence T(n) = T(n − 1) + O(n), i.e. O(n²) time.
Heap Sort Algorithm
Heap: A Heap is a special Tree-based data structure in which the tree is a complete
binary tree. Generally, Heaps can be of two types:
Max-Heap: in a Max-Heap the key present at the root node must be the greatest among
the keys of all of its children. The same property must be recursively true for
all sub-trees in that binary tree.
Min-Heap: in a Min-Heap the key present at the root node must be the minimum among
the keys of all of its children. The same property must be recursively true for
all sub-trees in that binary tree.
Heap Sort Algorithm
Max Heap example:
        19
       /  \
     12    16
    /  \   /
   1    4 7
Array A = [19, 12, 16, 1, 4, 7]
Heap Sort Algorithm
Min heap example
4 16
7 12 19
1 4 16 7 12 19
Array A
05/23/2025 28
Heap Sort Algorithm
MAX-HEAPIFY(A,i)
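A minimal C sketch of MAX-HEAPIFY, using 0-based array indices; a `heapSort` driver is added to show how building the heap and repeatedly extracting the maximum sorts the array:

```c
#include <assert.h>

/* Sink A[i] until the subtree rooted at i satisfies the
   max-heap property; n is the current heap size. */
void maxHeapify(int A[], int n, int i) {
    int largest = i;
    int l = 2 * i + 1, r = 2 * i + 2;     /* children in array layout */
    if (l < n && A[l] > A[largest]) largest = l;
    if (r < n && A[r] > A[largest]) largest = r;
    if (largest != i) {
        int t = A[i]; A[i] = A[largest]; A[largest] = t;
        maxHeapify(A, n, largest);        /* fix the affected subtree */
    }
}

/* Heap sort: build a max-heap, then repeatedly move the max to the end. */
void heapSort(int A[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)  /* bottom-up heap construction */
        maxHeapify(A, n, i);
    for (int i = n - 1; i > 0; i--) {
        int t = A[0]; A[0] = A[i]; A[i] = t;   /* extract the maximum */
        maxHeapify(A, i, 0);                   /* restore heap on the rest */
    }
}
```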
Heap Sort Algorithm
Process of Deletion:
Since deleting an element at an intermediate position in the heap can be costly, we
simply replace the element to be deleted with the last element and then delete the last
element of the heap:
• Replace the root (or the element to be deleted) with the last element.
• Delete the last element from the heap.
• The last element is now placed at the position of the root node, so it may not
satisfy the heap property. Therefore, heapify the node now placed at the root
position.
Hashing
• Hashing is the technique whereby keys are stored in a hash table using a hash function.
• The objective of hashing is to access a key in the hash table in O(1) time.
• If distinct key values map to distinct locations, the hash function is called a
good hash function. In that case we can access a key in O(1) time.
• If distinct key values map to the same location, the hash function is called a
bad hash function. In that case a collision occurs, and in the worst case we
access a key in O(n) time.
Hashing
Hash Function
• The hash function:
– must be simple to compute.
– must distribute the keys evenly among the cells.
• If we know which keys will occur in advance we
can write perfect hash functions, but we don’t.
Hashing
Hash Functions
• If the input keys are integers then simply
Key mod TableSize is a general strategy.
– Unless key happens to have some undesirable properties.
(e.g. all keys end in 0 and we use mod 10)
• If the keys are strings, hash function needs more care.
– First convert it into a numeric value.
Hashing
Types of hash function
There are various types of hash function which are used to
place the data in a hash table,
1. Division method
The key is divided by the size of the table, and the remainder is used as the index:
h(key) = key mod table size.
Hashing
2. Mid square method
In this method the key is first squared, and then the middle part of the result is taken as
the index. For example, to place a record with key 3101 in a table of size 1000:
3101 * 3101 = 9616201, so h(3101) = 162 (the middle 3 digits).
3. Folding method
In this method the key is divided into separate parts, and these parts are combined with
some simple operation to produce a hash key. For example, the record 12465512 is
divided into the parts 124, 655, 12, and these parts are combined by adding them:
H(key) = 124 + 655 + 12
       = 791
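The three methods above can be sketched in C; the function names are illustrative. Note that the folding sketch groups digits from the right, so the example key 12465512 folds as 512 + 465 + 12 = 989 rather than the left-grouped 791 on the slide:

```c
#include <assert.h>

/* Division method: h(k) = k mod table size. */
int divisionHash(long key, int size) { return (int)(key % size); }

/* Mid-square method for a table of size 1000: square the key,
   drop trailing digits until 3 middle digits remain. */
int midSquareHash(long key) {
    long sq = key * key;                       /* e.g. 3101*3101 = 9616201 */
    int digits = 0;
    for (long t = sq; t > 0; t /= 10) digits++;
    for (int i = 0; i < (digits - 3) / 2; i++) /* drop trailing digits */
        sq /= 10;
    return (int)(sq % 1000);                   /* keep the middle 3 digits */
}

/* Folding method: split the key into 3-digit groups (from the
   right in this sketch) and add the groups together. */
int foldingHash(long key) {
    int sum = 0;
    while (key > 0) { sum += (int)(key % 1000); key /= 1000; }
    return sum;
}
```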
Hashing
Characteristics of a good hashing function
1. The hash function should generate different hash values for similar strings.
2. The hash function is easy to understand and simple to compute.
3. The hash function should produce keys which get distributed uniformly over the array.
4. The number of collisions should be small while placing the data in the hash table.
5. The hash function is a perfect hash function when it uses all the input data.
Collision
It is a situation in which the hash function returns the same hash key for
more than one record. Sometimes, when we try to resolve a collision, it may
lead to an overflow condition; these overflow and collision conditions make
a poor hash function.
Hashing
Separate Chaining
• The idea is to keep a list of all elements that hash to the same
value.
– The array elements are pointers to the first nodes of the
lists.
– A new item is inserted to the front of the list.
• Advantages:
– Better space utilization for large items.
– Simple collision handling: searching linked list.
– Overflow: we can store more items than the hash table
size.
– Deletion is quick and easy: deletion from the linked
list.
Hashing
1) Chaining
Chaining is a method in which an additional field, the chain, is kept with the data. A chain
is maintained at the home bucket: when a collision occurs, a linked list of the colliding
records is maintained there.
Example: consider a hash table of size 10 and the hash function H(key) = key % table size.
Let the keys to be inserted be 31, 33, 77, 61. Both 31 and 61 hash to bucket 1, so bucket 1
holds two records maintained by a linked list, i.e. by the chaining method.
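A minimal chaining sketch in C for the example above (table size 10, `H(key) = key % 10`; the names `chainInsert` and `chainSearch` are illustrative). New items go to the front of the chain, so insertion is O(1):

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 10                 /* table size from the example */

struct Node { int key; struct Node *next; };
static struct Node *table[TABLE_SIZE];  /* buckets start out NULL */

static int hash(int key) { return key % TABLE_SIZE; }

/* Insert at the front of the home bucket's chain. */
void chainInsert(int key) {
    struct Node *n = malloc(sizeof *n);
    n->key = key;
    n->next = table[hash(key)];
    table[hash(key)] = n;
}

/* Return 1 if key is present, 0 otherwise (walk the chain). */
int chainSearch(int key) {
    for (struct Node *p = table[hash(key)]; p; p = p->next)
        if (p->key == key) return 1;
    return 0;
}
```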
Hashing
Example
Keys 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 hashed with h(k) = k mod 10:
bucket 0: 0
bucket 1: 81 → 1
bucket 4: 64 → 4
bucket 5: 25
bucket 6: 36 → 16
bucket 9: 49 → 9
Hashing
Open Addressing
• In open addressing, all elements are stored in the hash table itself: when a collision
occurs, alternative cells are probed until an empty cell is found. The cells
h0(x), h1(x), h2(x), … are tried in turn, where hi(x) = (hash(x) + f(i)) mod TableSize
and f(0) = 0.
Hashing
Linear Probing
• In linear probing, collisions are resolved by sequentially
scanning an array (with wraparound) until an empty cell is
found.
– i.e. f is a linear function of i, typically f(i)= i.
• Example:
– Insert items with keys: 89, 18, 49, 58, 9 into an empty hash
table.
– Table size is 10.
– Hash function is hash(x) = x mod 10.
• f(i) = i;
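A sketch of linear-probing insertion for this example (table size 10, hash(x) = x mod 10, f(i) = i). The name `linearInsert` is illustrative; it returns the cell used, or -1 if the table is full:

```c
#include <assert.h>

#define SIZE  10
#define EMPTY (-1)

static int ltable[SIZE];

void linearInit(void) { for (int i = 0; i < SIZE; i++) ltable[i] = EMPTY; }

/* Probe hash(x), hash(x)+1, hash(x)+2, ... with wraparound. */
int linearInsert(int key) {
    for (int i = 0; i < SIZE; i++) {
        int idx = (key % SIZE + i) % SIZE;   /* f(i) = i, wrap around */
        if (ltable[idx] == EMPTY) { ltable[idx] = key; return idx; }
    }
    return -1;                               /* table full */
}
```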
Hashing
Linear probing hash table after each insertion: 89 goes to cell 9; 18 to cell 8;
49 collides at 9 and wraps around to cell 0; 58 probes 8, 9, 0 and lands in cell 1;
9 probes 9, 0, 1 and lands in cell 2.
Hashing
Quadratic Probing
• In quadratic probing, the collision is resolved by a quadratic probe function,
typically f(i) = i²: the cells hash(x), hash(x) + 1, hash(x) + 4, hash(x) + 9, …
(mod TableSize) are probed in turn.
Hashing
A quadratic probing hash table after each insertion (note that the table size was
poorly chosen because it is not a prime number).
Hashing
Quadratic Probing
• Problem:
– We may not be sure that we will probe all locations in the
table (i.e. there is no guarantee to find an empty cell if table
is more than half full.)
– If the hash table size is not prime, this problem will be much
more severe.
• However, there is a theorem stating that:
– If the table size is prime and load factor is not larger than
0.5, all probes will be to different locations and an item can
always be inserted.
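A quadratic-probing sketch in C. Following the theorem above, the table size here is a prime (11); the keys 22, 33, 44 all hash to cell 0, so the probe sequence 0, 0+1, 0+4, … is exercised. `quadInsert` is an illustrative name:

```c
#include <assert.h>

#define QSIZE 11        /* prime table size, per the theorem above */
#define EMPTY (-1)

static int qtable[QSIZE];

void quadInit(void) { for (int i = 0; i < QSIZE; i++) qtable[i] = EMPTY; }

/* Quadratic probing: f(i) = i*i, so probe h, h+1, h+4, h+9, ... */
int quadInsert(int key) {
    for (int i = 0; i < QSIZE; i++) {
        int idx = (key % QSIZE + i * i) % QSIZE;
        if (qtable[idx] == EMPTY) { qtable[idx] = key; return idx; }
    }
    return -1;          /* no empty cell found */
}
```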
Hashing
Double Hashing
• A second hash function is used to drive the collision
resolution.
– f(i) = i * hash2(x)
• We apply a second hash function to x and probe at a distance
hash2(x), 2*hash2(x), … and so on.
• The function hash2(x) must never evaluate to zero.
– e.g. Let hash2(x) = x mod 9 and try to insert 99 in the previous example.
• A function such as hash2(x) = R – ( x mod R) with R a prime
smaller than TableSize will work well.
– e.g. try R = 7 for the previous example: hash2(x) = 7 − (x mod 7).
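A double-hashing sketch for the same keys as the linear-probing example (table size 10, hash2(x) = 7 − (x mod 7) with R = 7, so hash2 never evaluates to zero). `doubleInsert` is an illustrative name:

```c
#include <assert.h>

#define DSIZE 10
#define R     7          /* prime smaller than the table size */
#define EMPTY (-1)

static int dtable[DSIZE];

static int hash2(int x) { return R - (x % R); }  /* never zero */

void doubleInit(void) { for (int i = 0; i < DSIZE; i++) dtable[i] = EMPTY; }

/* Probe hash(x), hash(x)+hash2(x), hash(x)+2*hash2(x), ... */
int doubleInsert(int key) {
    for (int i = 0; i < DSIZE; i++) {
        int idx = (key % DSIZE + i * hash2(key)) % DSIZE;
        if (dtable[idx] == EMPTY) { dtable[idx] = key; return idx; }
    }
    return -1;           /* no empty cell found */
}
```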
Hashing
Hashing Applications
Counting Sort Algorithm
This sorting technique doesn't perform sorting by comparing
elements. It performs sorting by counting objects having distinct key
values like hashing. After that, it performs some arithmetic
operations to calculate each object's index position in the output
sequence. Counting sort is not used as a general-purpose sorting
algorithm.
Counting sort is effective when the range of key values is not significantly greater
than the number of objects to be sorted.
countingSort(array, n)   // 'n' is the size of the array
  max = find the maximum element in the given array
  create a count array of size max + 1
  initialize the count array with all 0's
  for i = 0 to n - 1
    count[array[i]] = count[array[i]] + 1   // count of each unique element
  for j = 1 to max
    count[j] = count[j] + count[j - 1]      // cumulative counts
  for i = n - 1 down to 0                   // place elements stably
    count[array[i]] = count[array[i]] - 1
    output[count[array[i]]] = array[i]
Working of counting sort Algorithm
The count of each array element is stored at its corresponding index in the count
array; the running (cumulative) sums of these counts then give each element's final
position in the output array.
A[n]      // input array of n values in the range 0..k
B[n]      // output sorted array
C[] = {0}; // count array of size k+1, initialized to 0
Count_Sort(A[], C[], B[], n, k) {
    for (int i = 0; i < n; i++)
        ++C[A[i]];                 // count occurrences of each value
    for (int i = 1; i <= k; i++)
        C[i] = C[i-1] + C[i];      // cumulative counts = final positions
    for (int i = n-1; i >= 0; i--) { // backwards pass keeps the sort stable
        C[A[i]] = C[A[i]] - 1;
        B[C[A[i]]] = A[i];
    }
}
Sorting
Radix Sort
The lower bound for comparison-based sorting algorithms (Merge Sort,
Heap Sort, Quick Sort, etc.) is Ω(n log n), i.e., they cannot do better than
n log n.
Counting sort is a linear-time sorting algorithm that sorts in O(n + k) time
when the elements are in the range from 1 to k.
Sorting
In the array [121, 432, 564, 23, 1, 45, 788], the largest number is
788, which has 3 digits. Therefore, the loop should go up to the
hundreds place (3 passes).
Sorting
radixSort(array)
d <- number of digits in the largest element
create 10 buckets (for digits 0-9)
for i <- 1 to d
sort the elements according to the ith place digits using countingSort
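The pseudocode above can be sketched in C: one stable counting-sort pass per decimal digit, least significant digit first. The helper name `countingSortByDigit` is illustrative:

```c
#include <assert.h>

/* Stable counting sort on one decimal digit (exp = 1, 10, 100, ...). */
static void countingSortByDigit(int arr[], int n, int exp) {
    int output[n], count[10] = {0};
    for (int i = 0; i < n; i++)
        count[(arr[i] / exp) % 10]++;           /* count digit occurrences */
    for (int d = 1; d < 10; d++)
        count[d] += count[d - 1];               /* cumulative counts */
    for (int i = n - 1; i >= 0; i--) {          /* backwards => stable */
        int d = (arr[i] / exp) % 10;
        output[--count[d]] = arr[i];
    }
    for (int i = 0; i < n; i++) arr[i] = output[i];
}

/* LSD radix sort: one counting-sort pass per digit of the maximum. */
void radixSort(int arr[], int n) {
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max) max = arr[i];
    for (int exp = 1; max / exp > 0; exp *= 10)
        countingSortByDigit(arr, n, exp);
}
```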
Faculty Video Links, Youtube & NPTEL
Video Links and Online Courses Details
• Self Made Video Link:
https://www.youtube.com/watch?v=Yg-bbg8MQDU&t=5s
https://www.youtube.com/watch?v=3DV8GO9g7B4
https://www.youtube.com/watch?v=BO145HIUHRg
https://www.youtube.com/watch?v=sfmaf4QpVTw
Daily Quiz
Q2. A simple graph with ‘n’ vertices and ‘k’ components can have at most:
A. n edges
B. n-k edges
C. (n-k)(n-k-1) edges
D. (n-k)(n-k+1)/2 edges
Daily Quiz
Q5. A selection sort is one in which the previous elements are selected in order
and placed into their proper sorted position.
A. True
B. False
Q7. An insertion sort is one that sorts a set of records by inserting records into
an existing sorted file.
A. True
B. False
Daily Quiz
Q8. A complete binary tree is said to satisfy the 'heap condition' if the key
of each node is ______ the keys of its children.
Answer: greater than or equal to.
Weekly Assignment
MCQs
3. What is the worst case complexity of bubble sort?
a) O(nlogn)
b) O(logn)
c) O(n)
d) O(n²)
MCQs
5. Which of the following is not an advantage of optimised bubble sort
over other sorting techniques in case of sorted elements?
a) It is faster
b) Consumes less memory
c) Detects whether the input is already sorted
d) Consumes less time
6. The given array is arr = {1, 2, 4, 3}. Bubble sort is used to sort the
array elements. How many iterations will be done to sort the array?
a) 4
b) 2
c) 1
d) 0
MCQs
8. The given array is arr = {1,2,4,3}. Bubble sort is used to sort the
array elements. How many iterations will be done to sort the array with
improvised version?
a) 4
b) 2
c) 1
d) 0
MCQs
9. What is recurrence for worst case of QuickSort and what is the time
complexity in Worst case?
A. Recurrence is T(n) = T(n-2) + O(n) and time complexity is O(n^2)
B. Recurrence is T(n) = T(n-1) + O(n) and time complexity is O(n^2)
C. Recurrence is T(n) = 2T(n/2) + O(n) and time complexity is
O(nLogn)
D. Recurrence is T(n) = T(n/10) + T(9n/10) + O(n) and time
complexity is O(nLogn)
Q1. What do you mean by hashing and collision? Discuss the advantages
and disadvantages of hashing over other searching techniques.
Q2. What do you understand by stable and in place sorting?
Q3. Write the algorithm for Quick sort. Trace your algorithm on the following
data to sort the list: 2, 13, 4, 21, 7, 56, 51, 85, 59, 1, 9, 10. How does the
choice of pivot element affect the efficiency of the algorithm?
Q4. Classify the Hashing Functions based on the various methods by
which the key value is found. What are the types of Collision Resolution
Techniques and the methods used in each of the type?
Q5. How does merge sort work? Explain it with a suitable example. For what
type of data set is the method suitable?
Q6. Define heap. How is a priority queue implemented with the help of a heap?
Old Question Papers
https://drive.google.com/open?id=15fAkaWQ5c
cZRZPzwlP4PBh1LxcPp4VAd
Summary
References
Thank you