Last Minute Notes (LMNs) - Probability and Statistics

Last Updated: 23 Jul, 2025

Probability refers to the likelihood of an event occurring. For example, when an event like throwing a ball or picking a card from a deck occurs, there is a certain probability associated with that event, which quantifies the chance of it happening. This "Last Minute Notes" article provides a quick and concise revision of the key concepts in Probability and Statistics.

Counting

Permutation:

Arrangement of items where order matters.

Formula:

P(n, r) = \frac{n!}{(n-r)!}

Example:

1) Arranging 2 out of 3 letters (A, B, C): P(3, 2) = 6 (AB, BA, AC, CA, BC, CB).

2) The number of ways to arrange 3 books out of 5: P(5, 3) = \frac{5!}{(5-3)!} = 60.

Combination:

Selection of items where order does not matter.

Formula: C(n, r) = \frac{n!}{r! \times (n-r)!}

Example:

1) The number of ways to select 2 items from 5: C(5,2) = \frac{5!}{2! \times (5-2)!} =10.

2) Selecting 2 out of 3 letters (A, B, C): C(3,2) = 3 (AB, AC, BC).

Differences Between Permutations and Combinations:

| Permutation | Combination |
|---|---|
| Order is important. | Order is not important. |
| Formula: P(n, r). | Formula: C(n, r). |
| Example: AB ≠ BA. | Example: AB = BA. |

Read more about Permutations and Combinations.
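Both formulas are available directly in Python's standard library (3.8+); a minimal sketch verifying the examples above:

```python
import math
from itertools import combinations, permutations

# P(3, 2): arrangements of 2 letters out of {A, B, C} -- order matters
print(math.perm(3, 2))                               # 6
print([''.join(p) for p in permutations('ABC', 2)])  # AB, AC, BA, BC, CA, CB

# C(3, 2): selections of 2 letters out of {A, B, C} -- order ignored
print(math.comb(3, 2))                               # 3
print([''.join(c) for c in combinations('ABC', 2)])  # AB, AC, BC

# Larger example from the notes: arranging 3 books out of 5
print(math.perm(5, 3))  # 60
```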

Basics of Probability

Sample Space (S):

The set of all possible outcomes of a random experiment.
For example, tossing two coins has S = {HH, HT, TH, TT}.

Events:

A subset of the sample space. For example, getting two heads is the event A = {HH}.

Compound Event:

A compound event is an event that consists of two or more outcomes.

Mutually Exclusive Events:

Events that cannot happen simultaneously.

Mathematically: P(A \cap B) = 0

Key Rules:

  • Addition Rule: P(A \cup B) = P(A) + P(B).
  • P(A∣B) = 0 (if B occurs, A cannot).

Examples:

1) Coin toss: Getting Heads (A) and Tails (B) are mutually exclusive events.

2) Rolling a die: Getting Odd (A) and Even (B) number are mutually exclusive events.

Independent Events:

Events where the occurrence of one does not affect the other.

Mathematically: P(A∩B)=P(A)⋅P(B).

Key Rules:

  • Multiplication Rule: P(A \cap B) = P(A) \cdot P(B)
  • P(A∣B)=P(A); P(B∣A)=P(B).

Examples:

1) Two coin tosses: Heads on the first toss (A) and Tails on the second (B).
2) Rolling two dice: Rolling a 6 (A) on one die and a 4 (B) on the other.

Important Rules:

  • Addition Rule:
    • For two mutually exclusive events A and B: P(A∪B) = P(A) + P(B).
    • For two non-mutually exclusive events A and B: P(A∪B) = P(A) + P(B) − P(A∩B).
  • Multiplication Rule:
    • For independent events A and B: P(A∩B) = P(A) ⋅ P(B).
    • For conditional probability: P(A∣B) = P(A∩B)/P(B)​, provided P(B) > 0.
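As a quick sanity check of these rules, here is a minimal Python sketch using a single fair die; the particular events A (odd) and B (greater than 3) are illustrative choices, not from the notes:

```python
from fractions import Fraction

# Single fair die: A = "odd number", B = "number > 3"
sample_space = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}
B = {4, 5, 6}

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

# Addition rule for non-mutually exclusive events:
# P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Conditional probability: P(A | B) = P(A ∩ B) / P(B)
print(prob(A & B) / prob(B))  # 1/3 -- only 5 is odd among {4, 5, 6}
```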

Joint, Marginal and Conditional Probability

Joint Probability:

Joint probability represents the likelihood of two or more events occurring simultaneously. It is denoted as P(A \cap B), where A and B are events.

If A and B are independent, the formula simplifies to:

P(A \cap B) = P(A) \cdot P(B)

Marginal Probability:

Marginal probability is the probability of a single event regardless of the outcomes of other events. It is obtained by summing or integrating joint probabilities over all possible values of the other event.

P(A) = \sum_B P(A, B) \quad \text{(for discrete variables)}

Conditional Probability:

Conditional probability calculates the probability of event A given that event B has already occurred. It is denoted as P(A∣B) and is computed using:

P(A|B) = \frac{P(A \cap B)}{P(B)}, \, \text{provided } P(B) > 0.
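A small sketch of all three quantities on a made-up joint distribution of two binary variables (the probabilities are illustrative assumptions, not from the article):

```python
# Joint distribution P(A, B) stored as a dict; entries sum to 1
joint = {('a1', 'b1'): 0.10, ('a1', 'b2'): 0.30,
         ('a2', 'b1'): 0.20, ('a2', 'b2'): 0.40}

# Marginal: P(A = a1), summing the joint over all values of B
p_a1 = sum(p for (a, b), p in joint.items() if a == 'a1')

# Conditional: P(A = a1 | B = b1) = P(a1, b1) / P(b1)
p_b1 = sum(p for (a, b), p in joint.items() if b == 'b1')
p_a1_given_b1 = joint[('a1', 'b1')] / p_b1

print(p_a1)                     # 0.4
print(round(p_a1_given_b1, 3))  # 0.1 / 0.3 ≈ 0.333
```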

Read more about Joint, Marginal and Conditional Probability.

Bayes’ Theorem:

Bayes’ Theorem provides a way to calculate the conditional probability of an event A, given that another event B has already occurred. It uses prior knowledge about related events to update the probability of A.

Formula:

P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}

Here:

  • P(A∣B) is the probability of event A occurring given that event B has occurred.
  • P(B∣A) is the probability of event B occurring given that event A has occurred.
  • P(A) and P(B) are the probabilities of events A and B occurring, respectively.
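A worked sketch with hypothetical diagnostic-test numbers (the 1% prior and the 95%/5% rates are assumptions made for illustration):

```python
# A = "has condition", B = "test is positive"
p_a = 0.01              # prior P(A)
p_b_given_a = 0.95      # P(B|A), the true-positive rate
p_b_given_not_a = 0.05  # P(B|A'), the false-positive rate

# Total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 4))  # ≈ 0.161 -- the updated (posterior) probability
```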

Descriptive Statistics

Descriptive statistics involves summarizing and organizing data to make it easier to understand. It includes measures like mean, median, mode, standard deviation, and variance.

Mean (Average):

The mean is the central value of a dataset, calculated by summing all data points and dividing by the number of points.

\mu = \frac{1}{n} \sum_{i=1}^{n} x_i

Example: For {4,8,6,5,3,7} :

\mu = \frac{4 + 8 + 6 + 5 + 3 + 7}{6} = 5.5

Median:

The median is the middle value of a sorted dataset. If the dataset has an odd number of elements, it’s the middle value; if even, it’s the average of the two middle values.

Example: For {3,4,5,6,7,8} the median is:

\text{Median} = \frac{5 + 6}{2} = 5.5

Mode:

The mode is the value(s) that appear most frequently in the dataset.

Example: For {3,7,7,19,24}, the mode is: 7

Variance:

Variance is a measure of how much the values in a dataset deviate from the mean (average). It quantifies the spread or dispersion of the data.

Formula: For a dataset x_1, x_2, ..., x_n​, the variance \sigma^2 is calculated as:

\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2

Example: Consider the dataset: 3,7,8,10,12.

  • Mean: μ=8
  • Variance: \sigma^2 = \frac{1}{5}[(3-8)^2 + (7-8)^2 + (8-8)^2 + (10-8)^2 + (12-8)^2] = 9.2

Standard Deviation (SD):

The standard deviation measures the spread of the data from the mean. It is the square root of the variance.

\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}

Example: For {4,8,6,5,3,7}, with μ= 5.5:

\sigma^2 = \frac{(4-5.5)^2 + (8-5.5)^2 + \dots + (7-5.5)^2}{6} = 2.92, \quad \sigma = \sqrt{2.92} \approx 1.71
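All five measures are available in Python's statistics module; a sketch reproducing the numbers above (the population variants pvariance/pstdev divide by n, matching the formulas in this section):

```python
import statistics

data = [4, 8, 6, 5, 3, 7]  # dataset used in the examples above

print(statistics.mean(data))               # 5.5
print(statistics.median(data))             # 5.5
print(statistics.mode([3, 7, 7, 19, 24]))  # 7

# Population variance/SD; statistics.variance/stdev would instead
# give the sample versions, which divide by n - 1
print(round(statistics.pvariance(data), 2))  # 2.92
print(round(statistics.pstdev(data), 2))     # 1.71
```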

Covariance:

Measures how two variables vary together.

Range: -\infty to +\infty.

Types:

  • Positive: Both variables move in the same direction.
  • Negative: Variables move in opposite directions.
  • Zero: No linear relationship.


Formula:

\text{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})


Correlation:

Standardized measure of the strength and direction of the linear relationship.

Range: −1 to +1.

  • +1: Perfect positive correlation.
  • −1: Perfect negative correlation.
  • 0: No linear relationship.

Formula:

\rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}
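A minimal sketch computing both quantities by hand on a small paired dataset (the values are illustrative assumptions):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Population covariance: average product of deviations from the means
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n

# Pearson correlation: covariance scaled by both standard deviations
sx = math.sqrt(sum((xi - mx) ** 2 for xi in x) / n)
sy = math.sqrt(sum((yi - my) ** 2 for yi in y) / n)
rho = cov / (sx * sy)

print(cov)            # 1.2  (positive: the variables move together)
print(round(rho, 3))  # ≈ 0.775
```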

Random Variable

Random Variable:

A random variable is a function that maps outcomes of a random experiment to real numbers. It helps quantify uncertainty and calculate probabilities.

Example: If two unbiased coins are tossed, let the random variable X represent the number of heads.

The sample space is S = {HH, HT, TH, TT}, and X can take the values {0, 1, 2}.

Cumulative Distribution Function (CDF):

The Cumulative Distribution Function (CDF), F(x), represents the probability that a random variable X takes a value less than or equal to x. It provides an accumulated probability up to a certain point x.

  1. For Discrete Random Variables: F(x) = P(X \leq x) = \sum_{x_0 \leq x} P(x_0) Here, P(x_0) is the probability of X being equal to x_0​.
  2. For Continuous Random Variables: F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t) \, dt Here, f is the Probability Density Function (PDF) of the random variable X.
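For the discrete case, a short sketch using the two-coin example from the previous section (X = number of heads):

```python
# PMF of X for two fair coins: P(X=0) = 1/4, P(X=1) = 1/2, P(X=2) = 1/4
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def cdf(x):
    """F(x) = P(X <= x): accumulate the PMF up to x."""
    return sum(p for value, p in pmf.items() if value <= x)

print(cdf(0))  # 0.25
print(cdf(1))  # 0.75
print(cdf(2))  # 1.0
```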

Joint Random Variable

Conditional Expectation:

Conditional expectation is the expected value (mean) of a random variable Y, given that another random variable X has a specific value or distribution. It provides the average value of Y, considering the information provided by X.

Conditional Variance:

Conditional variance measures the spread or variability of a random variable Y, given that another random variable X takes a specific value.

Conditional Probability Density Function:

The Conditional PDF describes the probability distribution of a random variable X, given that another random variable Y is known to take a specific value.

Mathematically:

f_{X|Y}(x|y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}

  • f_{X,Y}(x, y): Joint PDF of X and Y.
  • f_Y(y): Marginal PDF of Y.

Probability Distributions

Discrete Probability Distribution:

Applies to discrete random variables, which can only take specific, countable values (e.g., integers). The probabilities of these outcomes are represented by the Probability Mass Function (PMF).

Continuous Probability Distribution:

Applies to continuous random variables, which can take any value within a range or interval. Probabilities are described using the Probability Density Function (PDF).

Uniform Distribution:

The Uniform Distribution, also called the Rectangular Distribution, is a type of Continuous Probability Distribution. It represents a scenario where a continuous random variable X is uniformly distributed over a finite interval [a, b]. This means that every value within [a, b] is equally likely, and the probability density function f(x) is constant over this range.

The probability density function (PDF) of a uniform distribution is defined as:

f(x) = \begin{cases} \frac{1}{b-a}, & a \leq x \leq b \\ 0, & \text{otherwise} \end{cases}

This constant density ensures that the total probability over the interval [a, b] is 1.

Mean: μ = (a+b)/2

Variance: \sigma^2 = \frac{(b - a)^2}{12}
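A quick empirical check of these two formulas by sampling with Python's random module (the interval [2, 10] is an arbitrary choice):

```python
import random
import statistics

a, b = 2, 10
print((a + b) / 2)        # theoretical mean: 6.0
print((b - a) ** 2 / 12)  # theoretical variance: ≈ 5.33

# random.uniform draws from U[a, b]; sample moments should match
samples = [random.uniform(a, b) for _ in range(100_000)]
print(round(statistics.mean(samples), 2))       # ≈ 6.0
print(round(statistics.pvariance(samples), 2))  # ≈ 5.33
```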

Binomial Distribution:

A probability distribution that models the number of successes in n independent Bernoulli trials.

Key Parameters:

  • n: Total number of trials.
  • p: Probability of success.
  • q=1−p: Probability of failure.

Probability Mass Function:

P(X = r) = \binom{n}{r} \cdot p^r \cdot q^{n-r}, \, r = 0, 1, 2, \dots, n

Mean: np
Variance: np(1 − p) = npq

Bernoulli Trials:

  • A Bernoulli trial is an experiment with two possible outcomes: success (A) or failure (A′).
  • The probability of success is P(A)=p, and failure is P(A')=q=1−p.

Examples: Tossing a coin (Head = success, Tail = failure).

Theorem:

Probability of r successes in n trials is: P(X = r) = \,^{n}C_{r} \cdot p^r \cdot q^{n-r}
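The PMF translates directly into a few lines of Python; a sketch for the (assumed) example of exactly 3 heads in 5 fair coin tosses:

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) = C(n, r) * p^r * (1 - p)^(n - r)"""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 5, 0.5
print(binomial_pmf(3, n, p))  # 0.3125

# Mean and variance of the distribution
print(n * p)            # 2.5
print(n * p * (1 - p))  # 1.25
```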

Exponential Distribution:

The Exponential Distribution models the time between events in a process where events occur continuously and independently at a constant average rate.

For a positive real number \lambda, the Probability Density Function (PDF) of an exponentially distributed random variable X is:

f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x \in R_X = [0, \infty) \\ 0, & x \notin R_X \end{cases}

Mean: E[X] = \frac{1}{\lambda}

Variance: \text{Var}[X] = \frac{1}{\lambda^2}​

Poisson Distribution:

The Poisson distribution is a discrete probability distribution used to model the number of occurrences of an event in a fixed interval of time, space, or volume, where:

  • The events occur independently.
  • The average rate λ of occurrences is constant.

Probability Mass Function (PMF):

P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \, x = 0, 1, 2, \dots

where λ is the mean (expected number of events).

Mean: E[X] = \lambda

Variance: \text{Var}[X] = \lambda
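A minimal sketch of the PMF; the rate λ = 4 (say, calls per hour) is an illustrative assumption:

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = e^(-λ) * λ^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 4
print(round(poisson_pmf(2, lam), 4))  # P(exactly 2 events) ≈ 0.1465

# The PMF sums to 1 over x = 0, 1, 2, ...
print(round(sum(poisson_pmf(x, lam) for x in range(50)), 6))  # 1.0
```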

Normal Distribution:

The Normal Distribution is a continuous probability distribution that models many natural and real-world phenomena. It is characterized by its symmetric, bell-shaped curve, where:

  • The highest point (mean) represents the most probable value.
  • Probabilities decrease as you move away from the mean.

Probability Density Function (PDF) is:

f_X(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}

Mean (\mu): E[X] = \mu

Variance : V[X] = \sigma^2

Standard Normal Distribution:

The Standard Normal Distribution, also called the Z-distribution, is a special case of the normal distribution where:

  • Mean (μ)=0
  • Standard deviation (σ)=1

It is used to compare and analyze data by standardizing values using the z-score:

Z = \frac{X - \mu}{\sigma}

Probability Density Function (PDF)

f(Z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{Z^2}{2}}, \quad -\infty < Z < \infty
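A small sketch of standardization and the standard-normal density; the test-score numbers (x = 85, μ = 70, σ = 10) are assumptions for illustration:

```python
import math

def z_score(x, mu, sigma):
    """Z = (x - μ) / σ"""
    return (x - mu) / sigma

def standard_normal_pdf(z):
    """f(Z) = (1 / sqrt(2π)) * e^(-Z²/2)"""
    return math.exp(-z**2 / 2) / math.sqrt(2 * math.pi)

print(z_score(85, 70, 10))               # 1.5 SDs above the mean
print(round(standard_normal_pdf(0), 4))  # peak density ≈ 0.3989
```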

t-Distribution:

The t-distribution (Student's t-distribution) is used in statistics to infer population means when:

  • Sample size is small (n ≤ 30).
  • Population standard deviation σ is unknown.

Key Formula:

The t-score: t = \frac{\bar{x} - \mu}{s / \sqrt{n}}

Where:

  • \bar{x}: Sample mean
  • μ: Population mean
  • s: Sample standard deviation

Chi-Squared Distribution:

The Chi-Squared distribution represents the sum of the squares of k independent standard normal random variables:

X^2 = Z_1^2 + Z_2^2 + \dots + Z_k^2

  • k: Degrees of freedom (df).
  • As k increases, the distribution becomes more symmetric and approaches a normal distribution.

Probability Density Function (PDF):

f(x; k) = \frac{1}{2^{k/2} \Gamma(k/2)} x^{(k/2)-1} e^{-x/2}, \quad x \geq 0

where \Gamma(\cdot) is the gamma function.

Mean: k

Variance: 2k

Inferential Statistics

Inferential statistics makes predictions or inferences about a population based on sample data.

Sampling Distribution: A sampling distribution is the probability distribution of a statistic (such as the sample mean) obtained through repeated sampling from a population. It shows how the statistic varies across different samples.

Central limit theorem:

The Central Limit Theorem (CLT) states that for a sufficiently large sample size (n > 30), the distribution of the sample mean approaches a normal distribution, regardless of the shape of the population distribution. The population must have a finite variance.

Formula: For a random variable X with:

  • Mean (μ)
  • Standard deviation (σ)

The sample mean \bar{X} follows:

\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \quad \text{i.e., with standard error } \frac{\sigma}{\sqrt{n}}

The Z-score for the sample mean is given by:

Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
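A sketch simulating the CLT with a fair die as the (clearly non-normal) population; μ = 3.5 and σ² = 35/12 are the die's known mean and variance:

```python
import random
import statistics

n = 30  # sample size
sample_means = [
    statistics.mean(random.randint(1, 6) for _ in range(n))
    for _ in range(10_000)
]

# The sample means cluster normally around μ with variance σ²/n
print(round(statistics.mean(sample_means), 2))       # ≈ 3.5
print(round(statistics.pvariance(sample_means), 3))  # ≈ 35/12/30 ≈ 0.097
```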

Confidence Interval:

A Confidence Interval (CI) is a range of values within which the true population parameter (e.g., mean) lies with a certain confidence level (e.g., 95%).

Key Formula:

\text{CI} = \text{Point Estimate} \pm \text{Critical Value} \times \text{Standard Error}

  • Point Estimate: Sample mean/proportion.
  • Critical Value: From z-table or t-table.
  • Standard Error: Depends on the statistic (e.g., \frac{s}{\sqrt{n}} for the mean).
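A sketch computing a 95% CI for a mean using the z critical value 1.96 (appropriate for large samples; a t critical value would be used for small n). The sample values are made up for illustration:

```python
import statistics

sample = [52, 48, 50, 55, 49, 51, 53, 47, 50, 54, 46, 52,
          49, 51, 50, 53, 48, 52, 50, 49, 51, 54, 47, 50,
          52, 48, 51, 49, 53, 50, 52, 49]

n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / n**0.5  # standard error s/√n

ci = (mean - 1.96 * se, mean + 1.96 * se)
print(round(mean, 2), tuple(round(v, 2) for v in ci))
```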

Z-Test:

A statistical test used to determine if a sample mean differs significantly from a population mean, applicable when:

  • Sample size n > 30
  • Population standard deviation (σ) is known.

Formula:

Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}

T-Test :

A t-test is a statistical method to compare the means of two groups and determine if the difference is statistically significant. It is used when:

  • Sample size is small (n < 30).
  • Population variance is unknown.

Key Types:

One-Sample T-Test:

  • Compares a sample mean to a known population mean.
  • Formula: t = \frac{\bar{x} - \mu}{s / \sqrt{n}}

Independent T-Test: Compares means of two independent groups.

Paired T-Test: Compares means from the same group at two different times.
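A sketch of the one-sample statistic computed by hand; the sample and the claimed mean of 100 are assumptions (scipy.stats.ttest_1samp would give the same t plus a p-value):

```python
import statistics

def one_sample_t(sample, mu):
    """t = (x̄ - μ) / (s / √n), with df = n - 1"""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD (divides by n - 1)
    return (statistics.mean(sample) - mu) / (s / n**0.5)

sample = [102, 98, 105, 97, 101, 99, 104, 103]
print(round(one_sample_t(sample, 100), 3))  # ≈ 1.097
# Compare with the t critical value for df = 7 (±2.365 at α = 0.05):
# 1.097 < 2.365, so the difference is not significant here
```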

Chi-Square Test:

A chi-square (χ²) test assesses whether there is a significant relationship between two categorical variables. It compares observed frequencies against expected frequencies to determine whether the results are likely to have occurred by chance.

Example: When tossing a coin, the test can show if heads or tails appear disproportionately often, suggesting that the result isn't just random.

Formula:

\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}

where:

  • O_i = observed frequency
  • E_i = expected frequency
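A sketch for the coin example above, assuming 100 tosses with 60 heads observed (the counts are illustrative):

```python
# Observed: 60 heads, 40 tails; a fair coin would give 50/50
observed = [60, 40]
expected = [50, 50]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)  # 4.0 -- exceeds 3.841 (the χ² critical value for
             # df = 1, α = 0.05), so the coin looks biased
```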
