
Lecture 07 - Advanced Sorting

The document outlines a course on Data Structures and Algorithms at the Vietnam National University of HCMC, focusing on advanced sorting techniques including Shell sort, Quick sort, and Radix sort. It provides an overview of each sorting algorithm, their implementations, efficiencies, and practical exercises for students. The course emphasizes understanding the mechanisms behind these algorithms and their applications in programming.

Vietnam National University of HCMC

International University
School of Computer Science and Engineering

Data Structures and Algorithms


★ Advanced Sorting ★

Dr Vi Chi Thanh - [email protected]


https://vichithanh.github.io
Week by week topics (*)

1. Overview, DSA, OOP and Java
2. Arrays
3. Sorting
4. Queue, Stack
5. List
6. Recursion
Mid-Term
7. Advanced Sorting
8. Binary Tree
9. Hash Table
10. Graphs
11. Graphs Adv.
Final-Exam
10 LABS

SATURDAY, 11 NOVEMBER 2023 2


Today's objectives

• Shell sort
• Partitioning
• Quick sort
• Radix sort



Shell sort



Introduction

• Based on insertion sort


• Is good for medium-size arrays
• Faster than the O(N^2) simple sorts (selection, insertion)
• Often recommended as the first algorithm to try in a sorting project



Review insertion sort

• Sort the following array

100 34 51 61 73 0

• How many copies have been made?


• → Too many copies
• → Can be improved
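One way to answer the question above is to instrument insertion sort. The Java sketch below counts only the element shifts made in the inner loop; the per-pass copies to and from the temporary variable are not counted. The class and method names are illustrative and do not come from the lecture.

```java
import java.util.Arrays;

class InsertionCopyCount {
    // Sorts a in place with insertion sort and returns how many
    // element shifts (a[in] = a[in - 1]) were performed.
    static int sortAndCountShifts(int[] a) {
        int shifts = 0;
        for (int out = 1; out < a.length; out++) {
            int temp = a[out];          // item to insert
            int in = out;
            while (in > 0 && a[in - 1] > temp) {
                a[in] = a[in - 1];      // shift a larger item one cell right
                in--;
                shifts++;
            }
            a[in] = temp;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int[] data = {100, 34, 51, 61, 73, 0};
        int shifts = sortAndCountShifts(data);
        System.out.println(Arrays.toString(data) + " shifts=" + shifts);
    }
}
```

For the six-element array on this slide the sketch reports 9 shifts, 5 of them caused by the final item 0 travelling the full length of the array.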



N-sorting

• Apply insertion sort to widely spaced elements


• Increment: spacing between elements (h)





4-sorting

• The array is thought of as 4 subarrays (listed by index):


• (0, 4, 8), (1, 5, 9), (2, 6), (3, 7)



4-sorted arrays

• All sub-arrays are sorted


• No item is more than 3 cells from where it should be (in our case)
• → The array is “almost” sorted
• → This is the secret of Shellsort
• Finish with a 1-sort (ordinary insertion sort)
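The h-sorting step can be sketched as an insertion sort that compares and moves items h cells apart. The sketch below is a minimal illustration in Java; the ten-element array is made up for the example and the names `HSortSketch` and `hSort` are assumptions, not the lecture's code.

```java
import java.util.Arrays;

class HSortSketch {
    // Insertion sort over elements that are h cells apart.
    // After the call, a[i] <= a[i + h] holds for every valid i.
    static void hSort(int[] a, int h) {
        for (int out = h; out < a.length; out++) {
            int temp = a[out];
            int in = out;
            while (in >= h && a[in - h] > temp) {
                a[in] = a[in - h];      // move the larger item h cells right
                in -= h;
            }
            a[in] = temp;
        }
    }

    public static void main(String[] args) {
        int[] data = {7, 10, 1, 9, 2, 5, 8, 6, 4, 3};   // illustrative values
        hSort(data, 4);   // sorts subarrays (0,4,8), (1,5,9), (2,6), (3,7)
        System.out.println(Arrays.toString(data));
        hSort(data, 1);   // ordinary insertion sort finishes the job
        System.out.println(Arrays.toString(data));
    }
}
```

With this input, the 4-sort yields 2 3 1 6 4 5 8 9 7 10, where every item is already close to its final position, so the closing 1-sort has little left to do.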



Animation

• https://opendsa-server.cs.vt.edu/embed/shellsortAV



Diminishing gap

• For array of 10 elements:


• 4-sort then 1-sort
• For array of 1000 elements?
• 364-sort, 121-sort, 40-sort, 13-sort, 4-sort and then 1-sort
• What is the interval sequence or gap sequence?
• How would you calculate it?



Knuth gap sequence

h = 3 * h + 1
• First value: 1
• Apply the formula until h > size of array, then start with the previous (largest fitting) value
• Example:
• Generate the gap sequence for 1100-element array



Knuth gap sequence

• What is the next gap?

h = (h - 1) / 3

• Until h = 1
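The two formulas can be combined into a small generator. The Java sketch below grows h with h = 3 * h + 1 while the next value is still smaller than the array size, then walks back down with h = (h - 1) / 3. The stopping rule "largest gap below the array size" is one reading of the slide, and the names `KnuthGaps` and `knuthGaps` are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class KnuthGaps {
    // Returns the Knuth gaps for an array of n elements,
    // from the largest gap below n down to 1.
    static List<Integer> knuthGaps(int n) {
        int h = 1;
        while (3 * h + 1 < n) {
            h = 3 * h + 1;              // 1, 4, 13, 40, 121, ...
        }
        List<Integer> gaps = new ArrayList<>();
        while (h >= 1) {
            gaps.add(h);
            h = (h - 1) / 3;            // inverse formula; reaches 0 after h = 1
        }
        return gaps;
    }

    public static void main(String[] args) {
        System.out.println(knuthGaps(10));    // gaps for a 10-element array
        System.out.println(knuthGaps(1000));  // gaps for a 1000-element array
        System.out.println(knuthGaps(1100));  // gaps for a 1100-element array
    }
}
```

For 10 and 1000 elements this reproduces the sequences on the "Diminishing gap" slide (4, 1 and 364, 121, 40, 13, 4, 1); for 1100 elements the sequence starts one step higher, at 1093.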



Implementation

• Find the initial value of h (gap)
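The code for this slide did not survive in the text, so the following is a minimal Java sketch of Shell sort with the Knuth gap sequence rather than the lecture's own listing. It assumes an `int[]` input; the class name `ShellSortSketch` is hypothetical.

```java
import java.util.Arrays;

class ShellSortSketch {
    static void shellSort(int[] a) {
        int n = a.length;

        // Find the initial value of h: the largest Knuth gap below n.
        int h = 1;
        while (3 * h + 1 < n) {
            h = 3 * h + 1;
        }

        // h-sort the array for each gap, down to the final 1-sort.
        while (h > 0) {
            for (int out = h; out < n; out++) {
                int temp = a[out];
                int in = out;
                while (in >= h && a[in - h] > temp) {
                    a[in] = a[in - h];
                    in -= h;
                }
                a[in] = temp;
            }
            h = (h - 1) / 3;            // next smaller gap
        }
    }

    public static void main(String[] args) {
        int[] data = {100, 34, 51, 61, 73, 0, 12, 88, 5, 47};
        shellSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The inner loops are the same as in insertion sort with every step of 1 replaced by a step of h, which is why the last pass (h = 1) is an ordinary insertion sort on an almost sorted array.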

Other interval sequences

• Original paper:
• h = h / 2
• Not the best approach: sometimes degenerates to O(N^2) running time
• Another variation:
• h = h / 2.2 (e.g., n = 100 gives h = 45, 20, 9, 4, 1)
• Another possibility (Flamig):
• if h < 5, then h = 1
• otherwise h = (5 * h - 1) / 11
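The first two sequences are easy to generate and compare. The Java sketch below assumes integer truncation after each division and that the sequence always ends with a gap of 1; the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class GapSequences {
    // Original sequence: keep halving the gap.
    static List<Integer> halving(int n) {
        List<Integer> gaps = new ArrayList<>();
        for (int h = n / 2; h >= 1; h /= 2) {
            gaps.add(h);
        }
        return gaps;
    }

    // Variation: divide by 2.2 and truncate, forcing a final gap of 1.
    static List<Integer> dividedBy2Point2(int n) {
        List<Integer> gaps = new ArrayList<>();
        int h = (int) (n / 2.2);
        while (h > 1) {
            gaps.add(h);
            h = (int) (h / 2.2);
        }
        gaps.add(1);
        return gaps;
    }

    public static void main(String[] args) {
        System.out.println(halving(100));
        System.out.println(dividedBy2Point2(100));
    }
}
```

For n = 100, halving gives 50, 25, 12, 6, 3, 1, while dividing by 2.2 gives the 45, 20, 9, 4, 1 quoted on the slide.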



Efficiency of Shell sort

• Estimates range from
• O(N^(3/2)) down to O(N^(7/6))
• → Better than the simple sorts, which are O(N^2)



Quick sort

• Efficient, general-purpose sorting algorithm


• Developed by British computer scientist Tony Hoare in 1959
• Example of Divide and Conquer algorithm
• Two phases
• Partition phase: divides the work in half
• Sort phase: Conquers the halves!



Partitioning - Introduction

• Is the underlying mechanism of Quick sort


• Is a useful operation
• Partitions data: divides the data into 2 groups
• Based on a pivot value
• > pivot value
• <= pivot value



Partition

• Partition index:
• Index of the leftmost item of the right sub-array
• Is returned from the partitioning method
• Indicates where the division is



Implementation

• Find an item (a)


• in the left, pointed by leftPtr
• and bigger than pivot
• Find an item (b)
• in the right, pointed by rightPtr
• and smaller than pivot
• Swap them
• Repeat until the two pointers meet

Implementation

• Input: an array with
• Index of left-most item
• Index of right-most item
• Pivot value
• Output:
• Partitioned array
• Index of the partition element (where the division is)
• Example: partition this array
• [5, 10, 3, 8, 6, 9, 2]
• Pick a pivot value (e.g., 7)
• https://liveexample.pearsoncmg.com/liang/animation/web/QuickSortPartition.html
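The partitioning steps described above can be sketched in Java as follows. This is a minimal two-pointer version written for the slide's example; the names `PartitionSketch`, `partition` and `swap` are assumptions, and items equal to the pivot may end up on either side of the division.

```java
import java.util.Arrays;

class PartitionSketch {
    // Rearranges a[left..right] around the pivot value and returns
    // the index of the leftmost item of the right sub-array.
    static int partition(int[] a, int left, int right, int pivot) {
        int leftPtr = left - 1;         // moves right after ++
        int rightPtr = right + 1;       // moves left after --
        while (true) {
            // (a) find an item on the left that is not smaller than the pivot
            while (leftPtr < right && a[++leftPtr] < pivot) { }
            // (b) find an item on the right that is not bigger than the pivot
            while (rightPtr > left && a[--rightPtr] > pivot) { }
            if (leftPtr >= rightPtr) {
                break;                  // pointers met: partition is done
            }
            swap(a, leftPtr, rightPtr);
        }
        return leftPtr;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = {5, 10, 3, 8, 6, 9, 2};
        int index = partition(data, 0, data.length - 1, 7);
        System.out.println(Arrays.toString(data) + " partition index=" + index);
    }
}
```

With pivot value 7 the example array becomes 5 2 3 6 8 9 10 and the returned index is 4: everything left of index 4 is below the pivot and everything from index 4 onward is above it.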



Implementation –
Find (a), (b)



Efficiency of Partition

• Two pointers start from two ends of array


• Move toward each other
• When they meet, partition is complete
→ O(N)



Quick sort



Introduction

• Most popular sorting algorithm


• Is the fastest in most cases
• On average: O(N*logN)



Main idea

• Partition an array into two sub-arrays


• Then call itself recursively to quicksort each of these sub-arrays



Implementation



Main idea

• Three steps:
1. Partition the array or subarray into left (smaller keys) and right (larger keys) groups.
2. Call ourselves to sort the left group.
3. Call ourselves again to sort the right group.



After first partitioning



Choosing a Pivot value

• Should be the value of an actual data item


• Can be picked from a random position in the array
• For our algorithm: the rightmost item
• After partition,
• Partition item is at BOUNDARY between left and right subarray
• Swap the pivot item with partition item.
• The pivot item will be in its FINAL position
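Putting the three steps and the rightmost-item pivot together gives the following Java sketch. It is an illustration under the assumptions stated in the comments, not the lecture's own listing; the class name `QuickSortSketch` is hypothetical.

```java
import java.util.Arrays;

class QuickSortSketch {
    static void quickSort(int[] a) {
        recQuickSort(a, 0, a.length - 1);
    }

    static void recQuickSort(int[] a, int left, int right) {
        if (right - left <= 0) {
            return;                              // 0 or 1 item: already sorted
        }
        int pivot = a[right];                    // pivot: the rightmost item
        int p = partition(a, left, right, pivot);
        recQuickSort(a, left, p - 1);            // sort the left group
        recQuickSort(a, p + 1, right);           // sort the right group
    }

    // Partitions a[left..right-1] around the pivot stored at a[right],
    // then swaps the pivot into the boundary position and returns it.
    static int partition(int[] a, int left, int right, int pivot) {
        int leftPtr = left - 1;
        int rightPtr = right;                    // the pivot itself is skipped
        while (true) {
            // No bounds check needed: this scan stops at a[right] at the latest.
            while (a[++leftPtr] < pivot) { }
            while (rightPtr > left && a[--rightPtr] > pivot) { }
            if (leftPtr >= rightPtr) {
                break;
            }
            swap(a, leftPtr, rightPtr);
        }
        swap(a, leftPtr, right);                 // pivot reaches its final position
        return leftPtr;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = {5, 10, 3, 8, 6, 9, 2};
        quickSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Because the pivot ends up in its final position after each partition, the two recursive calls can exclude index p and work only on the items on either side of it.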



Update the implementation

The improvement

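• With the pivot at the right end, the left scan always stops at the pivot
• So the while loop does not need to check for the end of the array
• The test leftPtr < right can be dropped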



Step-by-step sort

• https://opendsa-server.cs.vt.edu/embed/quicksortAV


Degenerates to O(N^2)

• The pivot divides the list into two sublists of size 0 and n-1



Degenerates to O(N^2)

• Ideally, the pivot should be the MEDIAN of the items
• The worst case: after each partition, we have
• 1 element & N-1 elements
• → More recursive calls (the recursion depth grows to about N)
• → Slow
• → Risk of stack overflow
• → Need a better approach for selecting the pivot



Quick sort
with Median-Of-Three Partitioning



Median-Of-Three Partitioning

• Ideally, examine all items to find the true median (too expensive)
• Compromise solution: take the median of (Left, Right, Center)
• In addition, sort Left, Right and Center in place



Median-Of-Three Partitioning



Implementation



MedianOf3



Partition (p. 349)



Cutoff point

• This version can be used only if the (sub)array size is > 3
• Otherwise, sort the small (sub)array manually or use insertion sort
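The median-of-three idea and the cutoff can be sketched together in Java. The listings for the preceding slides are not in the text, so this is a reconstruction under assumptions: the pivot is parked at index right - 1 after the three items are sorted, and subarrays of 3 or fewer items are sorted by hand. All names are illustrative.

```java
import java.util.Arrays;

class MedianOf3QuickSort {
    static void quickSort(int[] a) {
        recQuickSort(a, 0, a.length - 1);
    }

    static void recQuickSort(int[] a, int left, int right) {
        int size = right - left + 1;
        if (size <= 3) {
            manualSort(a, left, right);          // cutoff: too small to partition
            return;
        }
        int pivot = medianOf3(a, left, right);
        int p = partition(a, left, right, pivot);
        recQuickSort(a, left, p - 1);
        recQuickSort(a, p + 1, right);
    }

    // Sorts a[left], a[center], a[right], parks the median at right - 1
    // and returns its value.
    static int medianOf3(int[] a, int left, int right) {
        int center = (left + right) / 2;
        if (a[left] > a[center]) swap(a, left, center);
        if (a[left] > a[right]) swap(a, left, right);
        if (a[center] > a[right]) swap(a, center, right);
        swap(a, center, right - 1);
        return a[right - 1];
    }

    // a[left] <= pivot and a[right] >= pivot act as sentinels,
    // so neither scan needs a bounds check.
    static int partition(int[] a, int left, int right, int pivot) {
        int leftPtr = left;
        int rightPtr = right - 1;
        while (true) {
            while (a[++leftPtr] < pivot) { }
            while (a[--rightPtr] > pivot) { }
            if (leftPtr >= rightPtr) break;
            swap(a, leftPtr, rightPtr);
        }
        swap(a, leftPtr, right - 1);             // restore the pivot
        return leftPtr;
    }

    // Sorts a subarray of 1, 2 or 3 items directly.
    static void manualSort(int[] a, int left, int right) {
        int size = right - left + 1;
        if (size <= 1) return;
        if (size == 2) {
            if (a[left] > a[right]) swap(a, left, right);
            return;
        }
        if (a[left] > a[right - 1]) swap(a, left, right - 1);
        if (a[left] > a[right]) swap(a, left, right);
        if (a[right - 1] > a[right]) swap(a, right - 1, right);
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] data = {5, 10, 3, 8, 6, 9, 2};
        quickSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The cutoff exists because median-of-three needs three distinct positions plus at least one more item to partition; replacing `manualSort` with an insertion sort and raising the cutoff is the other option the slide mentions.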



Efficiency of Quick sort

• O(N * logN) on average; O(N^2) in the worst case
• Is a divide-and-conquer algorithm



Radix Sort



Radix sort

• Radix Sort is a clever and intuitive little sorting algorithm. Radix Sort
puts the elements in order by examining the digits of the numbers, one
digit position at a time, rather than comparing whole numbers.
We will explain with an example.



Radix Sort

• Consider the following scheme


• Given the numbers
16 31 99 59 27 90 10 26 21 60 18 57 17
• If we first sort the numbers based on their last digit only, we get:
90 10 60 31 21 16 26 27 57 17 18 99 59
• Now sort according to the first digit:
10 16 17 18 21 26 27 31 57 59 60 90 99
Radix Sort

• Notice that the numbers were added onto the list in the order
that they were found, which is why the numbers appear to be
unsorted in each of the sublists.
Radix Sort

• Thus, consider the following algorithm:


• Suppose we are sorting decimal numbers
• Create an array of 10 queues
• For each digit, starting with the least significant
• Place the ith number into the bin corresponding with the current digit
• Remove all numbers from the bins, taking the bins in order and, within each bin, the numbers in the order they were placed
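The algorithm above can be sketched in Java with one queue per decimal digit. The sketch assumes non-negative `int` keys; the class name `RadixSortSketch` and the use of `ArrayDeque` as the queue are choices made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class RadixSortSketch {
    // Least-significant-digit radix sort for non-negative integers.
    static void radixSort(int[] a) {
        int max = 0;
        for (int v : a) {
            max = Math.max(max, v);
        }

        // Create an array (here: a list) of 10 queues, one per digit.
        List<Queue<Integer>> bins = new ArrayList<>();
        for (int d = 0; d < 10; d++) {
            bins.add(new ArrayDeque<>());
        }

        // One pass per digit, starting with the least significant.
        for (long place = 1; max / place > 0; place *= 10) {
            for (int v : a) {
                int digit = (int) ((v / place) % 10);
                bins.get(digit).add(v);          // enqueue into its bin
            }
            int i = 0;
            for (Queue<Integer> bin : bins) {    // bins in order 0..9
                while (!bin.isEmpty()) {
                    a[i++] = bin.remove();       // dequeue in arrival order
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {16, 31, 99, 59, 27, 90, 10, 26, 21, 60, 18, 57, 17};
        radixSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Using queues matters: each pass keeps numbers with the same digit in their previous relative order, which is what lets the earlier, less significant passes survive the later ones.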
Radix Sort

• Suppose that two n-digit numbers are equal for the first m digits:
$a = a_n a_{n-1} a_{n-2} \ldots a_{n-m+1} a_{n-m} \ldots a_1 a_0$
$b = a_n a_{n-1} a_{n-2} \ldots a_{n-m+1} b_{n-m} \ldots b_1 b_0$
where $a_{n-m} < b_{n-m}$
• For example, 103574 < 103892 because 1 = 1, 0 = 0, 3 = 3 but 5 < 8
• Then, on iteration n – m, a will be placed in a lower bin than b
• When they are taken out, a will precede b in the list
Radix Sort

• For all subsequent iterations, a and b will be placed in the same bin,
and will therefore continue to be taken out in the same order

• Therefore, in the final list, a must precede b


Example

• Sort the following decimal numbers:


86 198 466 709 973 981 374 766 473 342
• First, interpret 86 as 086



Example

• Next, create an array of 10 queues:


0
1
2
3
4
5
6
7
8
9
Example
• Enqueue according to the 3rd (least significant) digit:
086 198 466 709 973 981 374 766 473 342

0
1 981
2 342
3 973 473
4 374
5
6 086 466 766
7
8 198
9 709

and dequeue: 981 342 973 473 374 086 466 766 198 709
Example
• Enqueue according to the 2nd digit:
981 342 973 473 374 086 466 766 198 709
0 709
1
2
3
4 342
5
6 466 766
7 973 473 374
8 981 086
9 198

and dequeue: 709 342 466 766 973 473 374 981 086 198
Example
• Enqueue according to the 1st digit:
709 342 466 766 973 473 374 981 086 198
0 086
1 198
2
3 342 374
4 466 473
5
6
7 709 766
8
9 973 981

and dequeue: 086 198 342 374 466 473 709 766 973 981
Example

• The numbers
086 198 342 374 466 473 709 766 973 981
are now in order

• The next example uses the binary representation of numbers, which is even easier to follow
Java code

• Google is the magic!


• Some examples to read:
• https://www.geeksforgeeks.org/radix-sort/
• https://www.javatpoint.com/radix-sort
• https://www.tutorialspoint.com/design_and_analysis_of_algorithms/design_and_analysis_of_algorithms_radix_sort.htm
• https://www.programiz.com/dsa/radix-sort



Complexity of Radix sort

• The time complexity of radix sort is given by the formula:
• T(n) = O(d*(n+b))
• d: the number of digits in the longest key in the list
• n: the number of elements in the list
• b: the base (number of buckets) used, which is normally 10 for decimal representation
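As a rough worked instance of the formula, take the ten three-digit decimal numbers from the example earlier in this lecture, so d = 3, n = 10 and b = 10. Constant factors are ignored, so the result only indicates the order of growth:

```latex
T(n) = O\bigl(d\,(n + b)\bigr), \qquad d\,(n + b) = 3 \times (10 + 10) = 60
```

Each of the d passes enqueues and dequeues all n numbers and visits all b bins, which is where the n + b term comes from.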



Practice

• QuickSort1App.java
• QuickSort2App.java
• QuickSort3App.java
• Add counters for the number of comparisons, swaps, and recursive calls, and display them after sorting.
• Compute the average number of comparisons, swaps, and recursive calls over 100 runs.



Practice

• ShellSortApp.java
• Generate an array of 50 random elements
• Run the code to “shell sort” the array
• For each change of h
• Print out h value
• Print out the array




THANK YOU
