Medians and Order Statistics
Prof. Prateek Vishnoi
August 29, 2024
Definition and Terminology
Let X ⊆ N and |X| = n.
The ith order statistic of X is the ith smallest element of the set.
The minimum of X is the first order statistic.
The maximum of X is the nth order statistic.
The median of X is the ⌊(n + 1)/2⌋th order statistic of X.
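The definitions above can be illustrated with a minimal Python sketch. The function names `order_statistic` and `median` are hypothetical, chosen for illustration, and the sketch simply sorts a copy of the input rather than doing anything clever.

```python
def order_statistic(xs, i):
    """Return the i-th smallest element of xs (1-indexed), by sorting a copy."""
    if not 1 <= i <= len(xs):
        raise ValueError("i must satisfy 1 <= i <= n")
    return sorted(xs)[i - 1]


def median(xs):
    """Return the floor((n + 1) / 2)-th order statistic (the lower median)."""
    return order_statistic(xs, (len(xs) + 1) // 2)


X = [7, 2, 9, 4]
print(order_statistic(X, 1))       # minimum: 2
print(order_statistic(X, len(X)))  # maximum: 9
print(median(X))                   # 2nd order statistic, since (4 + 1) // 2 == 2: 4
```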
Find the ith order statistic of X, where 1 ≤ i ≤ n
First Approach
Sort X and return the ith element.
Time Complexity = Θ(n log n) with a comparison sort such as merge sort
Second Approach
Divide and conquer approach.
Apply RANDOMISED PARTITION from quicksort to the array, selecting the pivot at random.
Once the partition ends, the pivot sits at position k, with (k − 1) elements in the left subarray and (n − k) in the right subarray.
If k = i, return the pivot.
If k < i, recurse on the right subarray, looking for its (i − k)th order statistic.
Otherwise, recurse on the left subarray, looking for its ith order statistic.
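The second approach can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the name `randomized_select` is hypothetical, the partition scheme (Lomuto, with the random pivot swapped to the end) is one common choice, and the recursion is written as a loop that narrows the range `a[lo..hi]`.

```python
import random


def randomized_select(xs, i, rng=random):
    """Return the i-th smallest element of xs (1-indexed)."""
    if not 1 <= i <= len(xs):
        raise ValueError("i must satisfy 1 <= i <= n")
    a = list(xs)              # work on a copy so the caller's list is untouched
    lo, hi = 0, len(a) - 1    # current subarray is a[lo..hi], inclusive
    while True:
        # RANDOMISED PARTITION: move a random pivot to the end, then partition.
        r = rng.randint(lo, hi)
        a[r], a[hi] = a[hi], a[r]
        pivot, store = a[hi], lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[store], a[j] = a[j], a[store]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        k = store - lo + 1    # rank of the pivot within a[lo..hi]
        if k == i:
            return a[store]
        if k < i:             # answer lies in the right part; adjust the rank
            lo, i = store + 1, i - k
        else:                 # answer lies in the left part; rank is unchanged
            hi = store - 1
```

Each pass through the loop performs one partition and then either returns the pivot or continues on a strictly smaller subarray, mirroring the three cases k = i, k < i, and k > i.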
Complexity Analysis
Best Case
The best case occurs when the first pivot selected is the ith order statistic.
Recurrence Relation
T(n) = O(1) + Θ(n)
Θ(n) is required for the partitioning.
O(1) is required for returning the element.
Time Complexity = Θ(n)
Worst Case
The worst case occurs when every pivot selected divides the array into two parts of size (n − 1) and 0, and the pivot is not the ith order statistic.
Recurrence Relation
T(n) = T(n − 1) + Θ(n)
Θ(n) is required for the partitioning.
T(n − 1) is required for the recursive call.
Time Complexity = Θ(n²)
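To see where the quadratic bound comes from, the worst-case recurrence can be unrolled. The derivation below is a sketch that writes the Θ(n) partitioning cost as cn for an assumed constant c > 0.

```latex
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= T(1) + c\sum_{j=2}^{n} j \\
     &= T(1) + c\left(\frac{n(n+1)}{2} - 1\right) = \Theta(n^2).
\end{align*}
```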
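Average Case
Recurrence Relation
T(n) = (n − 1) + T(X)
where (n − 1) is the number of comparisons made by the partition and X is a random variable s.t. 0 ≤ X ≤ (n − 1), the size of the subarray recursed on.
Taking expectations, with T(n) now denoting the expected running time:
T(n) = (n − 1) + E[T(X)]
Possible splits of the array:
(0, n−1), (1, n−2), (2, n−3), ..., (n/2−2, n/2+1), (n/2−1, n/2)
Expected size of the larger part = 3n/4
With probability at least 1/2 the larger part has at most 3n/4 elements; otherwise it has at most n − 1. Hence
E[T(X)] ≤ (1/2) T(3n/4) + (1/2) T(n − 1) ≤ (1/2) T(3n/4) + (1/2) T(n)
Substituting this bound into T(n) = (n − 1) + E[T(X)] and solving for T(n):
T(n) ≤ T(3n/4) + 2(n − 1) = O(n)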
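The final step, from the recurrence to O(n), follows by unrolling it into a geometric series. The derivation below is a sketch that ignores rounding of 3n/4 to an integer and absorbs the base case into the constant.

```latex
\begin{align*}
T(n) &\le 2(n-1) + T\!\left(\tfrac{3n}{4}\right) \\
     &\le 2n + 2\left(\tfrac{3}{4}\right)n + 2\left(\tfrac{3}{4}\right)^{2} n + \cdots \\
     &\le 2n \sum_{j=0}^{\infty} \left(\tfrac{3}{4}\right)^{j}
      = 2n \cdot \frac{1}{1 - 3/4} = 8n = O(n).
\end{align*}
```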
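Deterministic Algorithm for Selection
Historical Note
In 1972, a deterministic linear-time selection algorithm was developed by Manuel Blum, Bob Floyd, Vaughan Pratt, Ron Rivest, and Bob Tarjan. Four of the five authors (Blum, Floyd, Rivest, and Tarjan) are Turing Award winners, although the awards were not given for this paper specifically.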
Algorithm
I/P: Array A of size n and positive integer k ≤ n
Step 1: Split the array into n/5 groups of 5 elements each (a final group may be smaller if n is not a multiple of 5) and find the median of each group.
Step 2: Recursively find the true median of the medians. Call this p.
Step 3: Use p as a pivot to partition the array.
Step 4: Recurse on the appropriate piece, chosen in the same way as in the randomised algorithm.
Example
A = {1,2,3,10,11,4,5,6,12,13,7,8,9,14,15}
Groups: {1,2,3,10,11}, {4,5,6,12,13}, {7,8,9,14,15}
Medians: {3,6,9}
p = 6
After partitioning around p, there are 5 elements on the left and 9 elements on the right.
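The algorithm above can be sketched in Python as follows. This is an illustrative version under stated assumptions rather than a definitive implementation: the name `select` is hypothetical, elements are assumed distinct, inputs of at most 5 elements are handled by sorting, and the partition builds new lists instead of working in place.

```python
def select(a, k):
    """Return the k-th smallest element of a (1-indexed), assuming distinct elements."""
    if not 1 <= k <= len(a):
        raise ValueError("k must satisfy 1 <= k <= n")
    if len(a) <= 5:                      # small base case: sort directly
        return sorted(a)[k - 1]
    # Group into chunks of 5 and take each chunk's (lower) median.
    groups = [sorted(a[j:j + 5]) for j in range(0, len(a), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # Recursively find the median of the medians.
    p = select(medians, (len(medians) + 1) // 2)
    # Partition around p.
    left = [x for x in a if x < p]
    right = [x for x in a if x > p]
    # Recurse on the piece that contains the answer.
    rank = len(left) + 1                 # rank of p in a
    if k == rank:
        return p
    if k < rank:
        return select(left, k)
    return select(right, k - rank)


A = [1, 2, 3, 10, 11, 4, 5, 6, 12, 13, 7, 8, 9, 14, 15]
print(select(A, 8))  # the median of A: 8
```

On the example array, the first call computes the group medians 3, 6, 9 and picks p = 6, leaving 5 elements on the left and 9 on the right, as in the worked example.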
Complexity Analysis
Worst Case
Step 1 takes O(n) time.
Step 2 takes at most T(n/5) time.
Step 3 (partitioning) takes O(n) time.
Assume for now that the pivot leaves a piece of size at most 7n/10 to recurse on, so the final recursive call takes at most T(7n/10) time.
T(n) ≤ cn + T(n/5) + T(7n/10)
T(n) = O(n)
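The linear bound can be checked by induction. The sketch below guesses T(m) ≤ 10cm for all m < n, with c the constant from the recurrence, and ignores rounding of n/5 and 7n/10; the key fact is that n/5 + 7n/10 = 9n/10 < n.

```latex
\begin{align*}
T(n) &\le cn + T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{7n}{10}\right) \\
     &\le cn + 10c \cdot \tfrac{n}{5} + 10c \cdot \tfrac{7n}{10} \\
     &= cn + 2cn + 7cn = 10cn.
\end{align*}
```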
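Justification for assumption
Let there be n elements in the array.
Let g = n/5 be the number of groups.
In at least ⌈g/2⌉ of the groups (those whose median is ≤ p), at least 3 of the 5 elements are ≤ p.
So the total number of elements ≤ p is at least 3⌈g/2⌉ ≥ 3n/10.
By the same argument, at least 3n/10 elements are ≥ p, so each piece of the partition has at most 7n/10 elements.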
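As a quick sanity check, the bound can be tested on the 15-element example array with pivot p = 6. The snippet below is only an illustration on that one input; it also counts the elements ≥ p, for which the symmetric argument gives the same 3n/10 lower bound.

```python
A = [1, 2, 3, 10, 11, 4, 5, 6, 12, 13, 7, 8, 9, 14, 15]
n, p = len(A), 6                      # p is the median of medians from the example

at_most_p = sum(1 for x in A if x <= p)
at_least_p = sum(1 for x in A if x >= p)

# Both counts should be at least 3n/10, which is 4.5 for n = 15.
print(at_most_p, at_least_p, 3 * n / 10)  # 6 10 4.5
```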