A probability distribution is a mathematical function or rule that describes how the probabilities of different outcomes are assigned to the possible values of a random variable. It provides a way of modeling the likelihood of each outcome in a random experiment.
While a Frequency Distribution shows how often outcomes occur in a sample or dataset, a Probability Distribution assigns probabilities to outcomes theoretically, independent of any specific dataset. These probabilities represent the likelihood of each outcome occurring.
Common types of probability distributions include the Binomial, Poisson, Uniform, and Normal distributions, each of which is discussed later in this article.
Properties of a probability distribution include:
- The probability of each outcome is greater than or equal to zero.
- The sum of the probabilities of all possible outcomes equals 1.
In this article, we will cover the key concepts of probability distributions, their types, and their applications in computer science.
Probability Distribution of a Random Variable
Now the question arises: how do we describe the behavior of a random variable?
Suppose that our random variable X takes only finitely many values x1, x2, x3, ..., xn; that is, the range of X is the set of n values {x1, x2, x3, ..., xn}.
The behavior of X is completely described by giving probabilities for all the values of the random variable X.
| Event | Probability |
|---|---|
| x1 | P(X = x1) |
| x2 | P(X = x2) |
| x3 | P(X = x3) |
| ... | ... |
| xn | P(X = xn) |
The probability function of a discrete random variable X is the function p(x) satisfying
p(x) = P(X = x)

Example: We draw two cards successively with replacement from a well-shuffled deck of 52 cards. Find the probability distribution of finding aces.
Answer:
Let X be the random variable denoting the number of aces drawn.
Since we draw two cards with replacement from a deck of 52 cards, X can only take the values 0, 1, or 2. Because the cards are drawn with replacement, the two draws are independent.
Calculating the probabilities:
P(X = 0) = P(both cards are non-aces)
= P(non-ace) x P(non-ace)
= \dfrac{48}{52} \times \dfrac{48}{52} = \dfrac {144}{169}
P(X = 1) = P(exactly one of the cards is an ace)
= P(non-ace and then ace) + P(ace and then non-ace)
= P(non-ace) x P(ace) + P(ace) x P(non-ace)
= \dfrac{48}{52} \times \dfrac{4}{52} + \dfrac{4}{52} \times \dfrac{48}{52} = \dfrac{24}{169}
P(X = 2) = P(Both the cards are aces)
= P(ace) x P(ace)
= \dfrac{4}{52} \times \dfrac{4}{52} = \dfrac{1}{169}
Now we have the probability distribution for the discrete random variable X.
It can be represented in the following table:
| X | 0 | 1 | 2 |
|---|---|---|---|
| P(X = x) | 144/169 | 24/169 | 1/169 |
It should be noted here that each value of P(X = x) is greater than zero, and the sum of all P(X = x) is equal to 1.
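As a quick check, the same distribution can be reproduced with a short Python sketch using only the standard `fractions` module (illustrative only):

```python
from fractions import Fraction

# Single-draw probabilities (drawing with replacement from 52 cards)
p_ace = Fraction(4, 52)
p_non = Fraction(48, 52)

# Distribution of X = number of aces in two independent draws
dist = {
    0: p_non * p_non,                   # both cards are non-aces
    1: p_non * p_ace + p_ace * p_non,   # exactly one ace (two possible orders)
    2: p_ace * p_ace,                   # both cards are aces
}

for x, px in dist.items():
    print(f"P(X = {x}) = {px}")          # 144/169, 24/169, 1/169
print(sum(dist.values()))                # 1 -- probabilities sum to one
```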
Types of Probability Distributions
We have seen what probability distributions are; now let's look at their different types. The type of a probability distribution is determined by the type of the random variable. There are two types of probability distributions:
- Discrete Probability Distributions for Discrete Variables
- Continuous Probability Distributions for Continuous Variables
We will study several distributions of each type in detail below.
Discrete Probability Distributions
Discrete probability distributions apply to discrete random variables, which take countable values (e.g., 0, 1, 2, …). These distributions assign probabilities to individual outcomes. They include the Bernoulli, Binomial, and Poisson distributions, which model outcomes that can be counted, as explained below:
Bernoulli Trials
Trials of a random experiment are known as Bernoulli trials if they satisfy the following conditions:
- The number of trials is finite.
- All trials are independent (the outcome of any trial does not affect the outcome of any other trial).
- Every trial has exactly two outcomes: success or failure.
- The probability of success remains constant across all trials.
Example: Can throwing a fair die 50 times be considered an example of 50 Bernoulli trials if we define:
- Success as getting an even number (2, 4, or 6),
- Failure as getting an odd number (1, 3, or 5)?
Answer:
Yes, this can be considered an example of 50 Bernoulli trials:
- There are 3 even numbers out of 6 possible outcomes, so p = 3/6 = 1/2.
- There are 3 odd numbers out of 6, so q = 3/6 = 1/2.
So, throwing a fair die 50 times with this definition is a classic example of 50 Bernoulli trials, with p = 1/2 and q = 1/2.
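As an illustrative sketch (plain Python, standard library only), this die experiment can be simulated as 50 Bernoulli trials; the observed number of successes will hover around n · p = 25:

```python
import random

# Simulate 50 Bernoulli trials: success = rolling an even number on a fair die
n_trials = 50
outcomes = [1 if random.randint(1, 6) % 2 == 0 else 0 for _ in range(n_trials)]

# Each trial is independent with constant success probability p = 1/2
print("successes:", sum(outcomes), "out of", n_trials)
```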
Binomial Distribution
The binomial distribution models the number of successes (x) in n independent Bernoulli trials, each with success probability p.
For example,
for exactly 1 success in 6 trials, there are 6 possible sequences (PQQQQQ, QPQQQQ, ..., QQQQQP, where P denotes a success and Q a failure), each with probability p(1-p)^5.
Therefore, the total probability is 6 \cdot p(1-p)^5.
Generalizing this idea, if Y is a binomial random variable, then the probability function of the binomial distribution for n trials is:
P(Y = x) = \binom{n}{x} p^x (1-p)^{n-x}
where
- p is the probability of success in a given trial,
- x is the number of successes, x = 0, 1, 2, ..., n
Example: When a fair coin is tossed 10 times, find the probability of getting i. exactly six heads. ii. at least six heads.
Answer:
Each coin toss can be considered a Bernoulli trial. Let X be the number of heads in this experiment.
We know n = 10 and p = 1/2, so
P(X = x) = \binom{10}{x} p^x (1-p)^{10-x}
When x = 6,
(i) P(X = 6) = \binom{10}{6} p^6 (1-p)^4 = \dfrac{10!}{6!\,4!}\left(\dfrac{1}{2}\right)^{6}\left(\dfrac{1}{2}\right)^{4} = \dfrac{7\times8\times9\times10}{2\times3\times4}\times\dfrac{1}{64}\times\dfrac{1}{16} = \dfrac{105}{512}
(ii) P(at least 6 heads) = P(X >= 6) = P(X = 6) + P(X=7) + P(X=8)+ P(X=9) + P(X=10)
= \binom{10}{6} p^6 (1-p)^4 + \binom{10}{7} p^7 (1-p)^3 + \binom{10}{8} p^8 (1-p)^2 + \binom{10}{9} p^9 (1-p) + \binom{10}{10} p^{10}
=\dfrac{10!}{6!4!}(\dfrac{1}{2})^{10} + \dfrac{10!}{7!3!}(\dfrac{1}{2})^{10} + \dfrac{10!}{8!2!}(\dfrac{1}{2})^{10} + \dfrac{10!}{9!1!}(\dfrac{1}{2})^{10} + \dfrac{10!}{10!}(\dfrac{1}{2})^{10}\\ \hspace{0.5cm} = (\dfrac{10!}{6!4!} + \dfrac{10!}{7!3!}+ \dfrac{10!}{8!2!} + \dfrac{10!}{9!1!}+ \dfrac{10!}{10!})(\dfrac{1}{2})^{10} \\ \hspace{0.5cm} = \dfrac{193}{512}
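The same answers can be verified numerically. A minimal Python sketch using `math.comb` (Python 3.8+, no external libraries assumed):

```python
from math import comb

n, p = 10, 0.5  # 10 tosses of a fair coin

def binom_pmf(x, n, p):
    """P(X = x) for a Binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# (i) exactly six heads
print(binom_pmf(6, n, p))                              # 0.205078125 = 105/512

# (ii) at least six heads
print(sum(binom_pmf(x, n, p) for x in range(6, 11)))   # 0.376953125 = 193/512
```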
Negative Binomial Distribution
Negative binomial distribution models the number of trials (n) needed to get k successes, where successes are fixed, but trials vary.
P(X=n)=\binom{n-1}{k-1}p^k(1-p)^{n-k}
Where:
- n = total trials (including the k-th success),
- k = required successes (fixed),
- p = probability of success on a single trial,
- \binom{n-1}{k-1} is the number of ways to arrange (k−1) successes among the first (n−1) trials.
For example,
The probability that the 3rd coupon arrives with the 10th pizza, given that the probability of a coupon with any single pizza is p = 0.3:
k = 3, p = 0.3, n = 10
Therefore, the total probability is P(X=10)=\binom{9}{2}(0.3)^3(0.7)^7 \approx 0.08 (about 8%)
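A minimal Python sketch of the same calculation, with the PMF written out directly from the formula above (illustrative only):

```python
from math import comb

def neg_binom_pmf(n, k, p):
    """P(the k-th success occurs on the n-th trial), success probability p."""
    return comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

# 3rd coupon arrives with the 10th pizza, p = 0.3 per pizza
print(neg_binom_pmf(10, 3, 0.3))   # ≈ 0.080
```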
Poisson Probability Distribution
The Poisson distribution models the number of times an event occurs in a fixed interval of time or space. It is expressed as
f(x; \lambda) = P(X = x) = \dfrac{\lambda^x e^{-\lambda}}{x!}
where,
- x is the number of times the event occurs,
- e ≈ 2.718 (Euler's number),
- λ is the mean number of occurrences in the interval.
Example: A bakery sells an average of 5 cupcakes per hour. What’s the probability they sell exactly 3 cupcakes in the next hour?
λ = 5 (average rate), x = 3 (desired number of events).
P(X = x) = \dfrac{e^{-\lambda}\lambda^x}{x!}, \quad P(X = 3) = \dfrac{e^{-5}\,5^3}{3!} \approx 0.14
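The same value can be checked with a few lines of Python (standard library only):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

# Bakery: average of 5 cupcakes/hour, exactly 3 sold in the next hour
print(poisson_pmf(3, 5))   # ≈ 0.1404
```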
Continuous Probability Distributions
Probability distributions for continuous random variables (uncountable outcomes, e.g., time, height, temperature), such as Uniform and Normal distributions, are explained below.
Uniform Distribution
The uniform distribution models equally likely outcomes over a closed interval [a, b], where the probability density is constant.
Probability Density Function (PDF) of a Uniform Distribution is given by,
f(x) = \begin{cases} \frac{1}{b - a} & \text{if } a \leq x \leq b, \\ 0 & \text{otherwise.} \end{cases}
Cumulative Distribution Function (CDF) of a Uniform Distribution is given by,
F(x) = \begin{cases} 0 & \text{for } x < a, \\ \frac{x - a}{b - a} & \text{for } x \in [a, b], \\ 1 & \text{for } x > b. \end{cases}
Mean (μ): \mu = \frac{a + b}{2}
Variance (σ²): \sigma^2 = \frac{(b - a)^2}{12}
Example:
Random number generator between 0 and 1.
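A small Python sketch of the PDF and CDF above, evaluated for the unit interval [0, 1] (an illustrative sketch, not a library implementation):

```python
def uniform_pdf(x, a, b):
    """Density of the Uniform(a, b) distribution at x."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """P(X <= x) for a Uniform(a, b) random variable."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 0, 1
print(uniform_cdf(0.75, a, b) - uniform_cdf(0.25, a, b))   # P(0.25 <= X <= 0.75) = 0.5
print((a + b) / 2, (b - a)**2 / 12)                        # mean 0.5, variance ≈ 0.0833
```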
Normal (Gaussian) Distribution
Normal distribution models symmetric, bell-shaped data around a mean (μ) with a spread (σ). It describes data that clusters around a central value, with probabilities decreasing exponentially as values deviate from the mean.
PDF of Normal Distribution is given by,
f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}}
CDF of Normal Distribution is given by,
F(x) = \frac{1}{2} \left[ 1 + \text{erf}\left( \frac{x - \mu}{\sigma \sqrt{2}} \right) \right]
Mean, Median, Mode: μ
Variance: σ²
Example:
Heights of adults in a population (μ = 170 cm, σ = 10 cm).
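A minimal Python sketch of the normal PDF and CDF (using `math.erf`, no external libraries assumed), applied to the heights example:

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of the Normal(mu, sigma) distribution at x."""
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu, sigma):
    """P(X <= x), expressed via the error function erf."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 170, 10
# Probability that a height falls within one standard deviation of the mean
print(normal_cdf(180, mu, sigma) - normal_cdf(160, mu, sigma))   # ≈ 0.683
```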
Chi-Square Distribution
The chi-square distribution is used in hypothesis testing, especially for goodness-of-fit and independence tests. It takes only non-negative values and is positively skewed.
Degrees of freedom refer to the number of independent values or quantities that can vary in the calculation of a statistic.
- For simple experiments, k = Number of Categories - 1
- In a contingency table, k = (Rows - 1) × (Columns - 1)
Mean: k
Variance: 2k, where k is the degrees of freedom
Critical values are used in hypothesis testing to determine whether observed frequencies in a contingency table differ significantly from expected frequencies.
Example:
Observed data (O_i): 55 heads, 45 tails in 100 flips.
Expected under a fair coin (E_i): 50 heads, 50 tails.
Null Hypothesis (H0): The coin is fair (P(Heads)=0.5).
Alternative Hypothesis (Ha): The coin is biased.
Chi-Square Statistic: \chi^2 = \sum \frac{{(O_i-E_i)}^2}{E_i} = \frac{{(55-50)}^2}{50} +\frac{{(45-50)}^2}{50} = 1.0
Degrees of freedom: k = 2 − 1 = 1. (since there are 2 categories: heads/tails).
Critical value (α=0.05): {\chi} ^2_{0.95}{(1)}=3.84
Since 1.0 < 3.84, we fail to reject H0 (the coin may be fair). The data does not show significant evidence of bias.
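The whole test fits in a few lines of Python; in this sketch the critical value 3.84 is taken from the chi-square table rather than computed:

```python
observed = [55, 45]   # heads, tails in 100 flips
expected = [50, 50]   # counts expected under a fair coin

# Chi-square statistic: sum of (O - E)^2 / E over all categories
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))
print(chi_sq)          # 1.0

critical = 3.84        # chi-square critical value for df = 1, alpha = 0.05
print("reject H0" if chi_sq > critical else "fail to reject H0")
```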
Application of Probability Distribution in Computer Science
Probability distributions are used in many areas of computer science:
- In machine learning, they help make predictions and deal with uncertainty.
- In natural language processing, they are used to model how often words appear.
- In computer vision, they help understand image data and remove noise.
- In networking, distributions like Poisson are used to study how data packets arrive.
- Cryptography uses random numbers based on probability.
- Software testing and reliability also use distributions to predict bugs and failures.
Overall, probability distributions help in building smarter, more reliable, and efficient computer systems.
Solved Questions on Probability Distribution
Question 1: A box contains 4 blue balls and 3 green balls. Find the probability distribution of the number of green balls in a random draw of 3 balls.
Solution:
The total number of balls is 7, out of which 3 are drawn at random. On drawing 3 balls, the possibilities are: all 3 are green, exactly 2 are green, exactly 1 is green, or none is green. Hence X, the number of green balls, takes the values 0, 1, 2, 3.
- P(No ball is green) = P(X = 0) = 4C3/7C3 = 4/35
- P(1 ball is green) = P(X = 1) = 3C1 × 4C2 / 7C3 = 18/35
- P(2 balls are green) = P(X = 2) = 3C2 × 4C1 / 7C3 = 12/35
- P(All 3 balls are green) = P(X = 3) = 3C3 / 7C3 = 1/35
Hence, the probability distribution for this problem is given as follows
| X | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| P(X) | 4/35 | 18/35 | 12/35 | 1/35 |
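As a sanity check, the distribution can be recomputed with `math.comb` and exact fractions (an illustrative sketch, not required for the solution):

```python
from fractions import Fraction
from math import comb

# X = number of green balls when drawing 3 balls from 4 blue + 3 green
total = comb(7, 3)                                   # 35 equally likely draws
dist = {x: Fraction(comb(3, x) * comb(4, 3 - x), total) for x in range(4)}

for x, px in dist.items():
    print(f"P(X = {x}) = {px}")                      # 4/35, 18/35, 12/35, 1/35
print(sum(dist.values()))                            # 1
```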
Question 2: From a lot of 10 bulbs containing 3 defective ones, 4 bulbs are drawn at random. If X is a random variable that denotes the number of defective bulbs. Find the probability distribution of X.
Solution:
Since X denotes the number of defective bulbs and there are at most 3 defective bulbs, X can take the values 0, 1, 2, and 3. Since 4 bulbs are drawn at random, the total number of ways of drawing 4 bulbs is 10C4 = 210.
- P(Getting No defective bulb) = P(X = 0) = 7C4 / 10C4 = 1/6
- P(Getting 1 Defective Bulb) = P(X = 1) = 3C1 × 7C3/10C4 = 1/2
- P(Getting 2 defective Bulb) = P(X = 2) = 3C2 × 7C2/10C4 = 3/10
- P(Getting 3 Defective Bulb) = P(X = 3) = 3C3 × 7C1/10C4 = 1/30
Hence, the probability distribution table is given as follows:
| X | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| P(X) | 1/6 | 1/2 | 3/10 | 1/30 |
Practice Problem Based on Probability Distribution Function
Question 1. A coin is flipped 8 times. What is the probability of getting exactly 5 heads? (Assume the coin is fair.)
Question 2. A dice is rolled until a 4 is rolled. If the first success (rolling a 4) occurs on the 6th roll, how many failures occurred before the success?
Question 3. A customer service center receives an average of 3 calls per hour. What is the probability that they receive exactly 5 calls in an hour?
Question 4. The heights of adult women in a certain population follow a normal distribution with a mean of 64 inches and a standard deviation of 3 inches. What is the probability that a randomly selected woman has a height greater than 66 inches?
Question 5. For a continuous uniform distribution between 2 and 8, find the probability that the random variable is between 4 and 6.
Question 6. A researcher performs a chi-square test to examine if there is a relationship between gender and voting preference in a survey of 150 people. The degrees of freedom for this test are 3. What is the critical value for the chi-square statistic at a 0.05 significance level?
Question 7. A sample of 12 students was taken from a population to test their exam scores. The sample mean is 78, and the sample standard deviation is 5. Test if the sample mean significantly differs from a population mean of 75 at a 0.05 significance level.
Question 8. In a factory, 95% of the machines work well, and 5% are defective. If a machine is randomly selected and found to be defective, what is the probability that it was not properly maintained, given that 20% of the machines are poorly maintained? Use Bayes' Theorem to calculate this.
Answers:
1. 0.21875
2. 5
3. 0.1009
4. 0.2546
5. 0.3333
6. 7.81
7. There is no significant difference between the sample mean and the population mean at the 0.05 significance level.
8. 11.11%