Randomized Algorithms
Randomized algorithms in data structures and algorithms (DSA) are algorithms that use
randomness in their computations to achieve a desired outcome. These algorithms introduce
randomness to improve efficiency or simplify the algorithm design. By incorporating random
choices into their processes, randomized algorithms can often provide faster solutions or better
approximations compared to deterministic algorithms. They are particularly useful in situations
where exact solutions are difficult to find or when a probabilistic approach is acceptable.
For example, in Randomized Quick Sort, we use a random number to pick the next pivot (or
we randomly shuffle the array). Typically, this randomness is used to reduce time complexity
or space complexity in other standard algorithms.
Random Variables
A random variable in statistics is a function that assigns a real value to an outcome in the
sample space of a random experiment. For example: if you roll a die, you can assign a number
to each possible outcome.
In this article, we will learn about random variables in Statistics, their types, examples, and
others in detail.
A random variable is considered a discrete random variable when it takes specific, distinct
values within an interval. Conversely, if it can take any value in a continuous range, it is
classified as a continuous random variable.
Random variables are generally represented by capital letters like X and Y. This is explained
by the example below:
Example
If two unbiased coins are tossed then find the random variable associated with that event.
Solution:
We define a random variable as a function that maps from the sample space of an experiment
to the real numbers. Mathematically, a random variable is expressed as

X: S → R

where S is the sample space. Here the sample space is S = {HH, HT, TH, TT}, and we let X be
the number of heads. Then X takes the values {0, 1, 2}, so m = 3, and each value xi has a
probability

P(X = xi) = pi, where 1 ≤ i ≤ m and 0 ≤ pi ≤ 1

P(X = 0) = P(TT) = 1/2 × 1/2 = 1/4
P(X = 1) = P(HT or TH) = 1/2 × 1/2 + 1/2 × 1/2 = 1/2
P(X = 2) = P(HH) = 1/2 × 1/2 = 1/4
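The probabilities above can be checked by enumerating the sample space directly; the following Python sketch counts heads over the four equally likely outcomes:

```python
from itertools import product

# Sample space of two unbiased coin tosses: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

# X = number of heads; accumulate P(X = x) over equally likely outcomes
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, 0) + 1 / len(outcomes)

print(pmf[1])  # P(X = 1) = 0.5
```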
For example,
Suppose a die is thrown (X = outcome of the die). Here, the sample space S = {1, 2, 3, 4, 5,
6}. The output of the function will be:
P(X=1) = 1/6
P(X=2) = 1/6
P(X=3) = 1/6
P(X=4) = 1/6
P(X=5) = 1/6
P(X=6) = 1/6
Variate
A variate has the same properties as a random variable and is denoted by a capital letter
(commonly X). The possible values a random variable X can take form its range, denoted R_X.
Individual values within this range are called quantiles, and the probability of X taking a
specific value x is written as P(X = x).
Example: A discrete random variable X has the probability distribution below; find P(X = 0).

xi    0     1     2
pi    p1    0.3   0.5

Solution:
Since each pi satisfies 0 ≤ pi ≤ 1 and the probabilities must sum to 1,
p1 + 0.3 + 0.5 = 1
p1 = 0.2
Then, P(X = 0) is 0.2
A Continuous Random Variable takes on an infinite number of values. The probability function
associated with it is called the PDF (Probability Density Function).

Example: Find the value of k, and then P(1 ≤ x ≤ 2), for the density

f(x) = kx³ for 0 ≤ x ≤ 3, and f(x) = 0 otherwise

Solution:
If a function f is a density function, then the total probability is equal to 1.
∫₀³ f(x) dx = 1
∫₀³ kx³ dx = 1
k[x⁴/4]₀³ = 1
k(3⁴ − 0⁴)/4 = 1
k(81/4) = 1
k = 4/81
Thus,
P(1 ≤ x ≤ 2) = ∫₁² (4/81)x³ dx = (4/81) × (2⁴ − 1⁴)/4 = (4/81) × (16 − 1)/4
P(1 ≤ x ≤ 2) = 15/81
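The value k = 4/81 and the probability 15/81 can be verified with exact rational arithmetic, using the antiderivative kx⁴/4 from the solution above (a small Python sketch):

```python
from fractions import Fraction

def F(k, x):
    # Antiderivative of f(x) = k*x^3 is k*x^4/4
    return k * Fraction(x) ** 4 / 4

k = Fraction(4, 81)
# Total probability over [0, 3] must be 1
assert F(k, 3) - F(k, 0) == 1
# P(1 <= X <= 2) = 15/81 = 5/27
print(F(k, 2) - F(k, 1))  # 5/27
```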
For any random variable X, where P is its respective probability, we define its mean as

Mean(μ) = E(X) = ∑ xi·P(xi)

where the sum runs over all values xi that X can take.

The variance of a random variable tells us how the random variable is spread about the mean
value of the random variable. The variance of a random variable is calculated using the formula

Var(X) = E(X²) − [E(X)]²

where,
E(X²) = ∑ x²·P(x)
E(X) = ∑ x·P(x)
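As a quick check of these formulas, the following Python sketch computes the mean and variance of a fair die roll exactly:

```python
from fractions import Fraction

# Fair die: X takes values 1..6, each with probability 1/6
p = Fraction(1, 6)
mean = sum(x * p for x in range(1, 7))        # E(X) = sum of x*P(x)
e_x2 = sum(x * x * p for x in range(1, 7))    # E(X^2) = sum of x^2*P(x)
var = e_x2 - mean ** 2                        # Var(X) = E(X^2) - [E(X)]^2
print(mean, var)  # 7/2 35/12
```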
For any random variable X that assumes the values x1, x2, …, xn, where the probability
corresponding to each value is P(x1), P(x2), …, P(xn), the expected value of the variable is

E(X) = x1·P(x1) + x2·P(x2) + … + xn·P(xn) = ∑ xi·P(xi)

Now for any new random variable Y obtained by applying a function g to X, i.e. Y = g(X),
the cumulative distribution function of Y is

F_Y(y) = P(g(X) ≤ y)
For a random variable, its probability distribution can be described in three ways: by the
probability mass function (discrete case), by the probability density function (continuous
case), or by the cumulative distribution function. The probability of a random variable X
taking the value x is defined using the probability function of X, denoted f(x) = P(X = x).
Common probability distributions include:
Binomial Distribution
Poisson Distribution
Bernoulli’s Distribution
Exponential Distribution
Normal Distribution
Here are some solved examples on random variables. Learn random variables by practicing
these solved examples.
Example 1
Find the mean value for the continuous random variable f(x) = x², 1 ≤ x ≤ 3.
Solution:
Given,
f(x) = x², 1 ≤ x ≤ 3
E(X) = ∫₁³ x·f(x) dx = ∫₁³ x³ dx
E(X) = [x⁴/4]₁³
E(X) = (3⁴ − 1⁴)/4 = (1/4)(80) = 20
Example 2
Find the mean value for the continuous random variable f(x) = eˣ, 1 ≤ x ≤ 3.
Solution:
Given,
f(x) = eˣ, 1 ≤ x ≤ 3
E(X) = ∫₁³ x·eˣ dx
E(X) = [x·eˣ − eˣ]₁³ = (3e³ − e³) − (e − e)
E(X) = 2e³
P1. Find the mean value for the continuous random variable, f(x) = 3x³, 0 ≤ x ≤ 9
P2. Find the mean value for the continuous random variable, f(x) = x + sin x, 0 ≤ x ≤ π/4
P3. Find the variance value for the continuous random variable, f(x) = 2eˣ + x, −2 ≤ x ≤ 2
P4. Find the variance value for the continuous random variable, f(x) = 5 + x·tan x, −π/4 ≤ x ≤ π/4
A random variable in statistics is a variable that represents the possible outcomes of a
random experiment as real numbers.
The expected value of a random variable is the weighted average of all possible values of the
variable, where the weight of each value is the probability of the random variable taking that
value.
What are Continuous Random Variables?
Continuous random variables are a type of random variable in probability theory and statistics
that take values in a continuous range and are described by a probability density function.
Mathematical Notations
n = number of trials
p = probability of success in each trial
k = number of successes
Since all n trials are independent, the probability of k successes in n trials is the product
of the probabilities for each trial.
For any one particular way to achieve k successes and n − k failures, the probability is
p^k · (1 − p)^(n−k)
There are C(n, k) such ways, hence the final probability is
P(X = k) = C(n, k) · p^k · (1 − p)^(n−k)
Let X be a binomial random variable with the number of trials n and probability of success in
each trial p.
The expected number of successes is given by
E[X] = np
Var[X] = np(1 − p)
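These formulas can be sanity-checked numerically; the sketch below builds the full binomial PMF for n = 10 and p = 1/3 and recovers np and np(1 − p):

```python
from math import comb, isclose

n, p = 10, 1 / 3
# P(X = k) = C(n, k) p^k (1 - p)^(n - k) for k = 0..n
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k * k * pk for k, pk in enumerate(pmf)) - mean**2

assert isclose(mean, n * p)            # E[X] = np
assert isclose(var, n * p * (1 - p))   # Var[X] = np(1-p)
print(round(mean, 4), round(var, 4))   # 3.3333 2.2222
```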
Example 1: Consider a random experiment in which a biased coin (probability of head = 1/3)
is thrown 10 times. Find the probability that the number of heads appearing will be 5.
Solution:
P(X = 5) = C(10, 5) · (1/3)⁵ · (2/3)⁵ ≈ 0.136565
Here is the implementation for the same
C++
#include <iostream>
#include <cmath>
using namespace std;

// Computes n choose r
int nCr(int n, int r)
{
    if (r > n / 2)
        r = n - r;
    int answer = 1;
    for (int i = 1; i <= r; i++) {
        answer *= (n - r + i);
        answer /= i;
    }
    return answer;
}

// Probability of k heads in n tosses when probability of a head is p
float probability(int k, int n, float p)
{
    return nCr(n, k) * pow(p, k) * pow(1 - p, n - k);
}

// Driver code
int main()
{
    int n = 10;
    int k = 5;
    float p = 1.0 / 3;

    float prob = probability(k, n, p);
    cout << "Probability of " << k << " heads when a coin is tossed " << n
         << " times where probability of each head is " << p << endl;
    cout << " is = " << prob << endl;
    return 0;
}
Output:
Probability of 5 heads when a coin is tossed 10 times where probability of each head is
0.333333
is = 0.136565
Conditional Probability: Conditional probability P(A | B) indicates the probability of event A
happening given that event B happened.

P(A | B) = P(A ∩ B) / P(B)

We can easily understand the above formula using the diagram below. Since B has already
happened, the sample space reduces to B. So the probability of A happening becomes P(A ∩ B)
divided by P(B).

Below is Bayes's formula for conditional probability:

P(A | B) = P(B | A) · P(A) / P(B)

The formula provides the relationship between P(A|B) and P(B|A). It is mainly derived from the
conditional probability formula discussed above.
Consider the formulas for the conditional probabilities P(A|B) and P(B|A):
P(A | B) = P(A ∩ B) / P(B)
P(B | A) = P(B ∩ A) / P(A)
Since P(B ∩ A) = P(A ∩ B), we can replace P(A ∩ B) in the first formula with P(B|A)·P(A).
After replacing, we get Bayes's formula as given above.
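As a worked illustration of Bayes's formula (the numbers here are hypothetical, chosen only for this sketch): suppose 1% of items are defective (event A), and a test flags an item (event B) with probability 0.9 if it is defective and 0.05 if it is not.

```python
# Hypothetical numbers for illustration only
p_a = 0.01              # P(A): item is defective
p_b_given_a = 0.9       # P(B|A): test flags a defective item
p_b_given_not_a = 0.05  # P(B|not A): false positive rate

# Total probability of B
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes's formula: P(A|B) = P(B|A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 4))  # 0.1538
```

Even though the test is fairly accurate, a flagged item is defective only about 15% of the time, because defective items are rare to begin with.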
Random Variables:
A random variable is actually a function that maps the outcome of a random event (like a coin
toss) to a real value.

Example:
A coin is tossed once. Let a random variable R be defined as below:
R = +50 if the toss is Head
R = −50 if the toss is Tail

Expected Value:
The expected value is basically the sum, over all possible events, of the product of the
following two terms:
a) the probability of an event, and
b) the value of R at that event.

Example 1:
E[R] = 50 × (1/2) + (−50) × (1/2)
= 0

Example 2:
For a fair die with X = outcome of the roll,
E[X] = (1 + 2 + 3 + 4 + 5 + 6) × (1/6)
= 3.5

Linearity of Expectation:
Let R1 and R2 be two discrete random variables on some probability space; then
E[R1 + R2] = E[R1] + E[R2]
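Linearity holds even when the variables are dependent; the sketch below verifies it exactly for the sum of two fair dice:

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 36)
# E[R1 + R2] computed directly over all 36 equally likely outcomes
e_sum = sum((a + b) * p for a, b in product(range(1, 7), repeat=2))

# E[R1] and E[R2] computed separately (each die has mean 7/2)
e_one = sum(x * Fraction(1, 6) for x in range(1, 7))

print(e_sum, e_one + e_one)  # 7 7
```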
What is a Randomized Algorithm?
An algorithm that uses random numbers to decide what to do next anywhere in its logic is called
a Randomized Algorithm. For example, in Randomized Quick Sort, we use a random number to
pick the next pivot (or we randomly shuffle the array). And in Karger’s algorithm, we randomly
pick an edge.
The important thing in our analysis is that the time taken by step 2 (partitioning around a
randomly chosen pivot) is O(n).
How many times does the while loop run before finding a central pivot?
The probability that the randomly chosen element is a central pivot (rank between n/4 and
3n/4) is 1/2.
Therefore, the expected number of times the while loop runs is 2.
Since each run of the loop takes O(n) time, the expected time complexity of step 2 is O(n).
What is the overall Time Complexity in the Worst Case?
In the worst case, each partition divides the array such that one side has n/4 elements and
the other side has 3n/4 elements. The worst-case height of the recursion tree is log₄⁄₃ n,
which is O(log n).
Note that the above randomized algorithm is not the best way to implement randomized Quick
Sort. The idea here is to simplify the analysis, as it is simple to analyse.
Typically, randomized Quick Sort is implemented by randomly picking a pivot (no loop), or by
shuffling the array elements. The expected worst-case time complexity of this algorithm is
also O(n log n), but the analysis is more complex; the MIT professor himself mentions the same
in his lecture.
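A minimal sketch of the pivot-picking variant in Python (not in-place; the pivot is chosen uniformly at random on each call):

```python
import random

def randomized_quicksort(arr):
    # Arrays of length 0 or 1 are already sorted
    if len(arr) <= 1:
        return arr
    # Pick the pivot uniformly at random -- no central-pivot loop
    pivot = random.choice(arr)
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return randomized_quicksort(left) + middle + randomized_quicksort(right)

print(randomized_quicksort([3, 6, 1, 8, 2, 9, 4]))  # [1, 2, 3, 4, 6, 8, 9]
```

No input, sorted or not, is consistently a worst case here, because the recursion shape depends only on the random pivot choices.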
Example :
C
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// randomly select a number between 1 and n and return it as the solution
int find_solution(int n) {
    srand(time(0));
    return rand() % n + 1;
}

int main() {
    int n = 10;
    printf("Solution: %d\n", find_solution(n));
    return 0;
}
C++
#include <iostream>
#include <stdlib.h>
#include <time.h>

int find_solution(int n) {
    srand(time(0));
    // randomly select a number between 1 and n and return it as the solution
    return rand() % n + 1;
}

int main() {
    int n = 10;
    std::cout << "Solution: " << find_solution(n) << std::endl;
    return 0;
}
Java
import java.util.Random;

class Main {
    static int findSolution(int n) {
        // use the nextInt method to generate a random number between 0 and n-1,
        // then add 1 to get a number between 1 and n
        int solution = new Random().nextInt(n) + 1;
        return solution;
    }

    public static void main(String[] args) {
        int n = 10;
        System.out.println("Solution: " + findSolution(n));
    }
}
Python3
import random
import time
def find_solution(n):
random.seed(time.time())
return random.randint(1, n)
def main():
    n = 10
    print("Solution:", find_solution(n))
if __name__ == '__main__':
main()
Javascript
function findSolution(n) {
    // randomly select a number between 1 and n and return it as the solution
    return Math.floor(Math.random() * n) + 1;
}

function main() {
    const n = 10;
    console.log(`Solution: ${findSolution(n)}`);
}

main();
C#
// C# equivalent
using System;

class Program {
    static int FindSolution(int n) {
        // use the Next method to generate a random number between 0 and n-1,
        // then add 1 to get a number between 1 and n
        Random rand = new Random();
        int solution = rand.Next(n) + 1;
        return solution;
    }

    public static void Main(string[] args)
    {
        int n = 10;
        Console.WriteLine("Solution: " + FindSolution(n));
    }
}
Output
Solution: 10
(the printed value varies from run to run)
Las Vegas:
A Las Vegas algorithm is an algorithm which uses randomness but guarantees that the solution
obtained for the given problem is correct. It takes a risk with the resources used. The quick
sort algorithm is a simple example of a Las Vegas algorithm. To sort a given array of n numbers
quickly we use quick sort, in which we find a central element, also called the pivot element,
and compare each element with this pivot. Whether sorting takes less time or more time depends
on how we select the pivot element; to pick the pivot element randomly, we can use a Las Vegas
algorithm.
Definition:
A randomized algorithm that always produces a correct result, with the only variation from one
run to another being its running time, is known as a Las Vegas algorithm.
OR
A randomized algorithm which always produces a correct result, or informs about the failure,
is known as a Las Vegas algorithm.
OR
A Las Vegas algorithm takes a risk with the resources used for computation, but it does not
take a risk with the result, i.e. it gives a correct and expected output for the given problem.
Let us consider the above example of the quick sort algorithm, in which we choose the pivot
element randomly. The result of this algorithm is always a sorted array. A Las Vegas algorithm
has one restriction: the solution for the given problem can be found in finite time. In such
an algorithm the number of possible solutions is limited. The actual solution may be complex
in nature or complicated to calculate, but it is easy to verify the correctness of a candidate
solution.
These algorithms always produce a correct or optimum result. The time complexity of these
algorithms is based on a random value, so time complexity is evaluated as an expected value.
For example, Randomized Quick Sort always sorts an input array, and the expected worst-case
time complexity of Randomized Quick Sort is O(n log n).
Las Vegas algorithms can be contrasted with Monte Carlo algorithms, in which the resources
used to find the solution are bounded, but there is no guarantee that the solution obtained
is accurate.
Complexity Analysis:
The complexity class of problems that a Las Vegas algorithm solves with zero error probability
in expected polynomial time is called ZPP (zero-error probabilistic polynomial time), which is
obtained as follows:

ZPP = RP ∩ co-RP

A randomized polynomial time (RP) algorithm always provides the correct output when the
correct answer is "no"; when the answer is "yes", it answers "yes" with a probability bounded
away from zero. Decision problems of this kind form the class RP, i.e. randomized polynomial
time. That is how a given problem can be solved in expected polynomial time by using a Las
Vegas algorithm. Generally there is no upper bound on the worst-case running time of a Las
Vegas algorithm.
Monte Carlo:
Computational algorithms which rely on repeated random sampling to compute their results are
called Monte Carlo algorithms.
OR
A randomized algorithm is a Monte Carlo algorithm if it may sometimes give the wrong answer.
Whenever the existing deterministic algorithm fails, or it is impossible to compute the
solution for a given problem, Monte Carlo algorithms or methods are used. Monte Carlo methods
are built on repeated computation with random numbers, and that is why these algorithms are
used for solving physical simulation systems and mathematical systems.
Monte Carlo algorithms are especially useful for disordered materials, fluids, and cellular
structures. In mathematics these methods are used to calculate definite integrals, in
particular multidimensional integrals with complicated boundary conditions. Compared to other
methods, Monte Carlo is also a successful approach to risk analysis.
There is no single Monte Carlo method; rather, the term describes a large and widely used
class of approaches that follow this pattern:
1. Define a domain of possible inputs.
2. By using a certain specified probability distribution, generate the inputs randomly from
the domain.
3. Perform a deterministic computation on the inputs.
4. Aggregate the results.
Monte Carlo algorithms produce a correct or optimum result with some probability. These
algorithms have deterministic running time, and it is generally easier to find out the
worst-case time complexity. For example, this implementation of Karger's Algorithm produces a
minimum cut with probability greater than or equal to 1/n² (n is the number of vertices) and
has worst-case time complexity O(E). Another example is the Fermat Method for Primality
Testing.
Example to Understand Classification:
Consider a binary array in which half the elements are 0 and half are 1, and the task is to
find the index of any 1. A Las Vegas algorithm for this task is to keep picking a random
element until we find a 1. A Monte Carlo algorithm for the same task is to keep picking a
random element until we either find a 1 or we have tried the maximum allowed number of times,
say k. The Las Vegas algorithm always finds an index of 1, but its time complexity is
determined as an expected value: the expected number of trials before success is 2, therefore
the expected time complexity is O(1). The Monte Carlo algorithm finds a 1 with probability
[1 − (1/2)^k]. The time complexity of the Monte Carlo algorithm is O(k), which is
deterministic.
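The classification example above can be sketched in Python (the array contents and the bound k are illustrative):

```python
import random

def las_vegas_find_one(arr):
    # Always returns an index of a 1; only the running time varies
    while True:
        i = random.randrange(len(arr))
        if arr[i] == 1:
            return i

def monte_carlo_find_one(arr, k):
    # Runs in deterministic O(k) time, but may fail
    # with probability (1/2)^k on a half-ones array
    for _ in range(k):
        i = random.randrange(len(arr))
        if arr[i] == 1:
            return i
    return -1  # failure reported after k tries

arr = [0, 1] * 8  # half zeros, half ones
print(arr[las_vegas_find_one(arr)])  # always 1
```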
In general, Monte Carlo optimization techniques are based on random walks. The program for a
Monte Carlo algorithm moves a marker around in multidimensional space, tending to move toward
lower values of the objective function, but sometimes moving against the gradient.
In numerical optimization, numerical simulation is an effective, efficient, and popular
application of random numbers. The travelling salesman problem is one of the best examples of
an optimization problem. There are various optimization techniques available for Monte Carlo
algorithms, such as evolution strategies, genetic algorithms, parallel tempering, etc.
Monte Carlo methods have a wide range of applications, in areas like physical science, design
and visuals, finance and business, telecommunications, etc. In general, Monte Carlo methods
are used in mathematics: by generating random numbers we can solve various problems,
particularly problems which are complex in nature or difficult to solve. Monte Carlo
integration is the most common application of Monte Carlo algorithms.
A deterministic algorithm may provide a correct solution but take a long time, i.e. its
runtime is large. This runtime can be improved by using Monte Carlo integration algorithms.
Various methods are used for integration with Monte Carlo, such as random-walk Monte Carlo
(used to compute the integral for a given problem) and Gibbs sampling.
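A minimal Monte Carlo integration sketch: estimate the integral of x² over [0, 1] (exact value 1/3) by averaging f at uniformly random sample points. The sample count and seed are arbitrary choices:

```python
import random

def monte_carlo_integrate(f, a, b, samples=100_000, seed=0):
    # Average of f at uniform random points in [a, b],
    # scaled by the interval length
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples

estimate = monte_carlo_integrate(lambda x: x * x, 0, 1)
print(estimate)  # close to 1/3
```

The error shrinks like 1/√samples regardless of dimension, which is why this approach pays off for multidimensional integrals with complicated boundaries.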
Consider a tool that basically does sorting. Let the tool be used by many users, and suppose
there are a few users who always use the tool on an already sorted array. If the tool uses
simple (not randomized) QuickSort, then those few users always face the worst-case situation.
On the other hand, if the tool uses Randomized QuickSort, then no user always gets the worst
case; everybody gets expected O(n log n) time.
Algebraic identities: Polynomial and matrix identity verification. Interactive proof systems.
Probabilistic existence proofs: Show that a combinatorial object arises with non-zero
probability among objects drawn from a suitable probability space.
Randomized algorithms are algorithms that use randomness as a key component in their
operation. They can be used to solve a wide variety of problems, including optimization,
search, and decision-making. Some examples of applications of randomized algorithms
include:
1. Monte Carlo methods: These are a class of randomized algorithms that use random
sampling to solve problems that may be deterministic in principle, but are too complex to
solve exactly. Examples include estimating pi, simulating physical systems, and solving
optimization problems.
2. Randomized search algorithms: These are algorithms that use randomness to search for
solutions to problems. Examples include genetic algorithms and simulated annealing.
3. Randomized data structures: These are data structures that use randomness to improve their
performance. Examples include skip lists and hash tables.
4. Randomized load balancing: These are algorithms used to distribute load across a network
of computers, using randomness to avoid overloading any one computer.
5. Randomized encryption: These are algorithms used to encrypt and decrypt data, using
randomness to make it difficult for an attacker to decrypt the data without the correct key.
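As an instance of item 1, a classic Monte Carlo sketch estimates π from the fraction of random points in the unit square that land inside the quarter circle (sample count and seed are arbitrary):

```python
import random

def estimate_pi(samples=200_000, seed=1):
    rng = random.Random(seed)
    # A point (x, y) with x^2 + y^2 <= 1 lies inside the quarter circle,
    # whose area is pi/4 of the unit square
    inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1
                 for _ in range(samples))
    return 4 * inside / samples

print(estimate_pi())  # close to 3.14159
```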
Example 1:
C++
#include <iostream>
#include <algorithm>
#include <random>

// Mersenne Twister engine seeded from a hardware random device
std::mt19937 rng(std::random_device{}());

// Randomly permute the contents of the array
void random_permutation(int array[], int size) {
    std::shuffle(array, array + size, rng);
}

int main() {
    int array[] = { 1, 2, 3, 4, 5 };
    int size = 5;
    random_permutation(array, size);
    for (int i = 0; i < size; i++)
        std::cout << array[i];
    std::cout << std::endl;
    return 0;
}
Output
51423
(one possible output; the permutation varies between runs)
Example 2 :
C++
#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

// Returns the median of the vector; when the number of elements is even,
// one of the two middle elements is chosen at random.
int find_median(std::vector<int> numbers)
{
    int n = numbers.size();
    if (n == 0) {
        return -1; // no median for an empty vector
    }
    if (n == 1) {
        return numbers[0];
    }

    std::random_device rd;
    std::mt19937 g(rd());

    std::sort(numbers.begin(), numbers.end());
    if (n % 2 == 1)
        return numbers[n / 2];

    // Even number of elements: pick one of the two middle values at random
    std::uniform_int_distribution<int> pick(0, 1);
    return numbers[n / 2 - 1 + pick(g)];
}

int main()
{
    std::vector<int> numbers1 = { 1, 2, 3, 4, 5 };
    std::vector<int> numbers2 = { 1, 2, 3, 4, 5, 6 };
    std::vector<int> numbers4 = { 7 };

    // Example usage
    std::cout << find_median(numbers1) << std::endl; // Output: 3
    std::cout << find_median(numbers2)
              << std::endl; // Output: 3 or 4 (randomly chosen)
    std::cout << find_median(numbers4) << std::endl; // Output: 7
    return 0;
}
Output
3
3
7
(for the even-length vector, the middle line is 3 or 4, chosen at random)
Problem Statement: Given an unsorted array A[] of n numbers and ε > 0, compute an element
whose rank (position in sorted A[]) is in the range [(1 − ε)n/2, (1 + ε)n/2].
For the ½ Approximate Median Algorithm, ε is 1/2 => the rank should be in the range [n/4,
3n/4].
We can find the k'th smallest element in O(n) expected time and O(n) worst-case time.
What if we want less than O(n) time, with a low probability of error allowed?
The following steps represent an algorithm that runs in O((log n) × (log log n)) time and
produces an incorrect result with probability less than or equal to 2/n².
1. Randomly choose k elements from the array, where k = c log n (c is some constant).
2. Insert the chosen elements into a set; the set keeps its elements in sorted order.
3. Return the median of the set, i.e. the (k/2)th element of the set.
C++
#include <bits/stdc++.h>
using namespace std;

random_device rand_dev;
mt19937 generator(rand_dev());

// Returns an approximate median of arr[]
int randApproxMedian(int arr[], int n)
{
    if (n == 0)
        return 0;

    int k = 10 * log2(n); // Taking c as 10

    // Pick k elements at random and insert them into a set,
    // which keeps them in sorted order
    set<int> s;
    for (int i = 0; i < k; i++) {
        int index = generator() % n;
        s.insert(arr[index]);
    }

    // Report the middle element of the set as the approximate median
    set<int>::iterator itr = s.begin();
    advance(itr, s.size() / 2);
    return *itr;
}

int main()
{
    int arr[] = { 1, 3, 2, 4, 5, 6, 8, 7 };
    int n = sizeof(arr) / sizeof(arr[0]);
    cout << "Approximate median is " << randApproxMedian(arr, n) << endl;
    return 0;
}
Time Complexity:
We use a set provided by the STL in C++. In an STL set, insertion of each element takes
O(log k). So for k insertions, the time taken is O(k log k).
Now replacing k with c log n:
=> O(c log n · log(c log n)) => O(log n · log log n)
It is quite easy to visualize this statement: the median we report will be the (k/2)th
element, so if at least k/2 of the chosen elements come from the left quarter (or the right
quarter), the reported median will lie in the left quarter (or the right quarter).
An array can be divided into 4 quarters, each of size n/4. So P(selecting from the left
quarter) is 1/4. So what is the probability that at least k/2 elements are from the left
quarter or the right quarter?
This probability problem is the same as the one below:
Given a coin which gives HEADS with probability 1/4 and TAILS with probability 3/4, the coin
is tossed k times. What is the probability that we get at least k/2 HEADS?
Explanation:
With k = c log n (for a large enough constant c), this probability satisfies
P ≤ (1/2)^(2 log n)
P ≤ (1/2)^(log n²)
P ≤ n⁻²
P(selecting at least k/2 elements from the left quarter) ≤ 1/n²
P(selecting at least k/2 elements from the left or right quarter) ≤ 2/n²
Therefore the algorithm produces an incorrect result with probability less than or equal
to 2/n².
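The bound above can be compared against the exact binomial tail; the sketch below computes P(at least k/2 HEADS) exactly for a coin with HEADS probability 1/4, showing how quickly it shrinks as k grows:

```python
from math import comb

def prob_at_least_half_heads(k, p=0.25):
    # Exact Binomial(k, p) upper tail: P(X >= ceil(k/2))
    return sum(comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range((k + 1) // 2, k + 1))

# The tail probability drops rapidly as k (i.e. c log n) grows
print(prob_at_least_half_heads(20))
```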