Probability refers to the likelihood of an event occurring. For example, when an event like throwing a ball or picking a card from a deck occurs, there is a certain probability associated with it that quantifies the chance of it happening. This "Last Minute Notes" article provides a quick and concise revision of the key concepts in Probability and Statistics.
Counting
Permutation:
Arrangement of items where order matters.
Formula:
P(n, r) = \frac{n!}{(n-r)!}
Example:
1) Arranging 2 out of 3 letters (A, B, C): P(3, 2) = 6 (AB, BA, AC, CA, BC, CB).
2) The number of ways to arrange 3 books out of 5: P(5, 3) = \frac{5!}{(5-3)!} = 60.
Combination:
Selection of items where order does not matter.
Formula: C(n, r) = \frac{n!}{r! \times (n-r)!}
Example:
1) The number of ways to select 2 items from 5: C(5,2) = \frac{5!}{2! \times (5-2)!} =10.
2) Selecting 2 out of 3 letters (A, B, C): C(3,2) = 3 (AB, AC, BC).
Differences Between Permutations and Combinations:
| Permutation | Combination |
|---|---|
| Order is important. | Order is not important. |
| Formula: P(n, r). | Formula: C(n, r). |
| Example: AB ≠ BA. | Example: AB = BA. |
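As a quick check, both formulas can be evaluated with Python's standard library (math.perm and math.comb, available from Python 3.8); this is just a sketch verifying the examples above:

```python
import math

# P(3, 2): arranging 2 out of 3 letters -> 6
print(math.perm(3, 2))   # 6
# P(5, 3): arranging 3 books out of 5 -> 60
print(math.perm(5, 3))   # 60
# C(5, 2): selecting 2 items from 5 -> 10
print(math.comb(5, 2))   # 10
# C(3, 2): selecting 2 out of 3 letters -> 3
print(math.comb(3, 2))   # 3
```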
Read more about Permutations and Combinations.
Basics of Probability
Sample Space (S):
The set of all possible outcomes of a random experiment.
For example, tossing two coins has S = {HH, HT, TH, TT}.
Events:
A subset of the sample space. For example, getting two heads is the event A = {HH}.
Compound Event:
A compound event is an event that consists of two or more outcomes.
Mutually Exclusive Events:
Events that cannot happen simultaneously.
Mathematically: P(A \cap B) = 0
Key Rules:
- Addition Rule: P(A \cup B) = P(A) + P(B).
- P(A∣B)=0 (If B happens, A cannot).
Examples:
1) Coin toss: Getting Heads (A) and Tails (B) are mutually exclusive events.
2) Rolling a die: Getting Odd (A) and Even (B) number are mutually exclusive events.
Independent Events:
Events where the occurrence of one does not affect the other.
Mathematically: P(A∩B)=P(A)⋅P(B).
Key Rules:
- Multiplication Rule: P(A \cap B) = P(A) \cdot P(B)
- P(A∣B)=P(A); P(B∣A)=P(B).
Examples:
1) Two coin tosses: Heads on the first toss (A) and Tails on the second (B).
2) Rolling two dice: Rolling a 6 (A) on one die and a 4 (B) on the other.
Important Rules:
- Addition Rule:
- For two mutually exclusive events A and B: P(A∪B) = P(A) + P(B).
- For two non-mutually exclusive events A and B: P(A∪B) = P(A) + P(B) − P(A∩B).
- Multiplication Rule:
- For independent events A and B: P(A∩B) = P(A) ⋅ P(B).
- For conditional probability: P(A∣B) = P(A∩B)/P(B), provided P(B) > 0.
Joint, Marginal and Conditional Probability
Joint Probability:
Joint probability represents the likelihood of two or more events occurring simultaneously. It is denoted as P(A \cap B), where A and B are events.
If A and B are independent, the formula simplifies to:
P(A \cap B) = P(A) \cdot P(B)
Marginal Probability:
Marginal probability is the probability of a single event regardless of the outcomes of other events. It is obtained by summing or integrating joint probabilities over all possible values of the other event.
P(A) = \sum_B P(A, B) \quad \text{(for discrete variables)}
Conditional Probability:
Conditional probability calculates the probability of event A given that event B has already occurred. It is denoted as P(A∣B) and is computed using:
P(A|B) = \frac{P(A \cap B)}{P(B)}, \, \text{provided } P(B) > 0.
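A minimal sketch of these three quantities on a small, made-up joint distribution over two binary events (the numbers are purely illustrative and sum to 1):

```python
# Illustrative joint distribution P(A, B) over two binary events
joint = {
    (0, 0): 0.30, (0, 1): 0.20,
    (1, 0): 0.10, (1, 1): 0.40,
}

# Joint probability: P(A = 1, B = 1)
p_a1_b1 = joint[(1, 1)]

# Marginal probability: P(A = 1) = sum over B of P(A = 1, B)
p_a1 = sum(p for (a, b), p in joint.items() if a == 1)

# Conditional probability: P(A = 1 | B = 1) = P(A = 1, B = 1) / P(B = 1)
p_b1 = sum(p for (a, b), p in joint.items() if b == 1)
p_a1_given_b1 = p_a1_b1 / p_b1

print(p_a1_b1, p_a1, round(p_a1_given_b1, 3))   # 0.4 0.5 0.667
```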
Read more about Joint, Marginal and Conditional Probability.
Bayes’ Theorem:
Bayes’ Theorem provides a way to calculate the conditional probability of an event A, given that another event B has already occurred. It uses prior knowledge about related events to update the probability of A.
Formula:
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
Here:
- P(A∣B) is the probability of event A occurring given that event B has occurred.
- P(B∣A) is the probability of event B occurring given that event A has occurred.
- P(A) and P(B) are the probabilities of events A and B occurring, respectively.
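As a worked illustration, the sketch below applies Bayes' Theorem to a hypothetical diagnostic-test scenario; the prevalence and test accuracies are assumed values chosen only for the example:

```python
# Bayes' Theorem on a hypothetical diagnostic test:
# A = person has the condition, B = test is positive.
p_a = 0.01               # prior P(A), assumed prevalence
p_b_given_a = 0.95       # P(B | A), assumed sensitivity
p_b_given_not_a = 0.05   # P(B | not A), assumed false-positive rate

# Total probability of a positive test, P(B)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Posterior P(A | B) = P(B | A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))   # ~0.161
```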
Descriptive Statistics
Descriptive statistics involves summarizing and organizing data to make it easier to understand. It includes measures like mean, median, mode, standard deviation, and variance.
Mean (Average):
The mean is the central value of a dataset, calculated by summing all data points and dividing by the number of points.
\mu = \frac{1}{n} \sum_{i=1}^{n} x_i
Example: For {4, 8, 6, 5, 3, 7}:
\mu = \frac{4 + 8 + 6 + 5 + 3 + 7}{6} = 5.5
Median:
The median is the middle value of a sorted dataset. If the dataset has an odd number of elements, it’s the middle value; if even, it’s the average of the two middle values.
Example: For {3,4,5,6,7,8} the median is:
\text{Median} = \frac{5 + 6}{2} = 5.5
Mode:
The mode is the value(s) that appear most frequently in the dataset.
Example: For {3,7,7,19,24}, the mode is: 7
Variance:
Variance is a measure of how much the values in a dataset deviate from the mean (average). It quantifies the spread or dispersion of the data.
Formula: For a dataset x_1, x_2, ..., x_n, the variance \sigma^2 is calculated as:
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2
Example: Consider the dataset: 3,7,8,10,12.
- Mean: μ=8
- Variance: \sigma^2 = \frac{1}{5}[(3-8)^2 + (7-8)^2 + (8-8)^2 + (10-8)^2 + (12-8)^2] = 9.2
Standard Deviation (SD):
The standard deviation measures the spread of the data from the mean. It is the square root of the variance.
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2}
Example: For {4,8,6,5,3,7}, with μ= 5.5:
\sigma^2 = \frac{(4-5.5)^2 + (8-5.5)^2 + \dots + (7-5.5)^2}{6} = 2.92,
\quad \sigma = \sqrt{2.92} \approx 1.71
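The worked examples above can be reproduced with Python's statistics module; note that pvariance and pstdev divide by n, matching the 1/n formulas used here:

```python
import statistics

data = [4, 8, 6, 5, 3, 7]
print(statistics.mean(data))                # 5.5
print(statistics.median(data))              # 5.5
print(statistics.mode([3, 7, 7, 19, 24]))   # 7

# Population variance and standard deviation (divide by n, as in the formulas above)
print(statistics.pvariance([3, 7, 8, 10, 12]))   # 9.2
print(round(statistics.pstdev(data), 2))         # 1.71
```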
Covariance:
Measures how two variables vary together.
Range: -\infty\ to +\infty.
Types:
- Positive: Both variables move in the same direction.
- Negative: Variables move in opposite directions.
- Zero: No linear relationship.
Formula:
\text{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
Correlation:
Standardized measure of the strength and direction of the linear relationship.
Range: −1 to +1.
- +1: Perfect positive correlation.
- −1: Perfect negative correlation.
- 0: No linear relationship.
Formula:
\rho(X, Y) = \frac{\text{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}
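A minimal sketch computing covariance (1/n convention, as in the formula above) and correlation for two small illustrative datasets:

```python
# Covariance and Pearson correlation computed directly from the definitions
x = [2, 4, 6, 8]
y = [1, 3, 5, 7]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Cov(X, Y) = (1/n) * sum of (x_i - mean_x)(y_i - mean_y)
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

# Population standard deviations (1/n convention)
sd_x = (sum((xi - mean_x) ** 2 for xi in x) / n) ** 0.5
sd_y = (sum((yi - mean_y) ** 2 for yi in y) / n) ** 0.5

corr_xy = cov_xy / (sd_x * sd_y)
print(cov_xy, corr_xy)   # 5.0 1.0 (perfect positive linear relationship)
```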
Random Variable
Random Variables:
A random variable is a function that maps outcomes of a random experiment to real numbers. It helps quantify uncertainty and calculate probabilities.
Example: If two unbiased coins are tossed, let X (X is a random variable or function) represent the number of heads.
The sample space is S = {HH, HT, TH, TT}, and X can take the values {0, 1, 2}.
Cumulative Distribution Function (CDF):
The Cumulative Distribution Function (CDF), F(x), represents the probability that a random variable X takes a value less than or equal to x. It provides an accumulated probability up to a certain point x.
- For Discrete Random Variables: F(x) = P(X \leq x) = \sum_{x_0 \leq x} P(x_0) Here, P(x_0) is the probability of X being equal to x_0.
- For Continuous Random Variables: F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t) \, dt Here, f is the Probability Density Function (PDF) of the random variable X.
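For the two-coin example above, the sketch below enumerates the sample space, builds the PMF of X (the number of heads), and accumulates it into the CDF:

```python
from itertools import product
from collections import Counter

# Sample space of two fair coin tosses; X = number of heads
sample_space = list(product("HT", repeat=2))          # ('H','H'), ('H','T'), ...
x_values = [outcome.count("H") for outcome in sample_space]

# PMF: each outcome is equally likely (probability 1/4)
pmf = {x: c / len(sample_space) for x, c in sorted(Counter(x_values).items())}

# CDF: F(x) = P(X <= x), accumulated from the PMF
cdf, running = {}, 0.0
for x, p in pmf.items():
    running += p
    cdf[x] = running

print(pmf)   # {0: 0.25, 1: 0.5, 2: 0.25}
print(cdf)   # {0: 0.25, 1: 0.75, 2: 1.0}
```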
Joint Random Variable
Conditional Expectation:
Conditional expectation is the expected value (mean) of a random variable Y, given that another random variable X has a specific value or distribution. It provides the average value of Y, considering the information provided by X.
Conditional Variance:
Conditional variance measures the spread or variability of a random variable Y, given that another random variable X takes a specific value.
Conditional Probability Density Function:
The Conditional PDF describes the probability distribution of a random variable X, given that another random variable Y is known to take a specific value.
Mathematically:
f_{X|Y}(x|y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}
- f_{X,Y}(x, y): Joint PDF of X and Y.
- f_Y(y): Marginal PDF of Y
Probability Distributions
Discrete Probability Distribution :
Applies to discrete random variables, which can only take specific, countable values (e.g., integers). The probabilities of these outcomes are represented by the Probability Mass Function (PMF).
Continuous Probability Distribution:
Applies to continuous random variables, which can take any value within a range or interval. Probabilities are described using the Probability Density Function (PDF).
Uniform Distribution:
The Uniform Distribution, also called the Rectangular Distribution, is a type of Continuous Probability Distribution. It represents a scenario where a continuous random variable X is uniformly distributed over a finite interval [a, b]. This means that every value within [a, b] is equally likely, and the probability density function f(x) is constant over this range.
The probability density function (PDF) of a uniform distribution is defined as:
f(x) = \begin{cases} \frac{1}{b-a}, & a \leq x \leq b \\ 0, & \text{otherwise} \end{cases}
This constant density ensures that the total probability over the interval [a, b] is 1.
Mean: μ = (a+b)/2
Variance : \sigma^2 = \frac{(b - a)^2}{12}
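A quick empirical check of these formulas, assuming the interval [a, b] = [2, 6] (values chosen only for illustration; the printed numbers depend on the random seed):

```python
import random

# Theoretical values for Uniform(2, 6): mean (a + b) / 2 = 4, variance (b - a)^2 / 12 ≈ 1.333
a, b = 2, 6
random.seed(0)
samples = [random.uniform(a, b) for _ in range(100_000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 3), round(var, 3))   # close to 4 and 1.333
```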
Binomial Distribution:
A probability distribution that models the number of successes in n independent Bernoulli trials.
Key Parameters:
- n: Total number of trials.
- p: Probability of success.
- q=1−p: Probability of failure.
Probability Mass Function:
P(X = r) = \binom{n}{r} \cdot p^r \cdot q^{n-r}, \, r = 0, 1, 2, \dots, n
Mean = np
Variance = np(1 - p)
Bernoulli Trials:
- A Bernoulli trial is an experiment with two possible outcomes: success (A) or failure (A′).
- The probability of success is P(A)=p, and failure is P(A')=q=1−p.
Examples: Tossing a coin (Head = success, Tail = failure).
Theorem:
Probability of r successes in n trials is: P(X = r) = \,^{n}C_{r} \cdot p^r \cdot q^{n-r}
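A minimal sketch of the binomial PMF using math.comb; the example of 5 fair coin tosses is illustrative:

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) = C(n, r) * p^r * (1 - p)^(n - r)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# Example: probability of exactly 3 heads in 5 fair coin tosses
n, p = 5, 0.5
print(binomial_pmf(3, n, p))                              # 0.3125
print(sum(binomial_pmf(r, n, p) for r in range(n + 1)))   # 1.0 (PMF sums to 1)
print(n * p, n * p * (1 - p))                             # mean = 2.5, variance = 1.25
```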
Exponential Distribution:
The Exponential Distribution models the time between events in a process where events occur continuously and independently at a constant average rate.
For a positive real number \lambda, the Probability Density Function (PDF) of an exponentially distributed random variable X is:
f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x \in R_X = [0, \infty) \\ 0, & x \notin R_X \end{cases}
Mean: E[X] = \frac{1}{\lambda}
Variance: \text{Var}[X] = \frac{1}{\lambda^2}
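A small sketch of the exponential PDF and its CDF, assuming a rate λ = 2 chosen for illustration (the CDF F(x) = 1 − e^{−λx} follows by integrating the PDF):

```python
import math

lam = 2.0   # assumed rate parameter

def exp_pdf(x):
    """PDF f(x) = lam * exp(-lam * x) for x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x):
    """CDF F(x) = 1 - exp(-lam * x) for x >= 0."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0

print(round(exp_pdf(0.5), 4))    # ~0.7358
print(round(exp_cdf(1.0), 4))    # ~0.8647
print(1 / lam, 1 / lam**2)       # mean = 0.5, variance = 0.25
```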
Poisson Distribution:
The Poisson distribution is a discrete probability distribution used to model the number of occurrences of an event in a fixed interval of time, space, or volume, where:
- The events occur independently.
- The average rate λ of occurrences is constant.
Probability Mass Function (PMF):
P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \, x = 0, 1, 2, \dots
where λ is the mean (expected number of events).
Mean: E[X] = \lambda
Variance: \text{Var}[X] = \lambda
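A minimal sketch of the Poisson PMF, assuming an average rate λ = 3 chosen for illustration:

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 3   # assumed average rate of events per interval
print(round(poisson_pmf(0, lam), 4))   # ~0.0498
print(round(poisson_pmf(3, lam), 4))   # ~0.224
# For the Poisson distribution, mean = variance = lam = 3
```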
Normal Distribution:
The Normal Distribution is a continuous probability distribution that models many natural and real-world phenomena. It is characterized by its symmetric, bell-shaped curve, where:
- The highest point (mean) represents the most probable value.
- Probabilities decrease as you move away from the mean.
Probability Density Function (PDF) is:
f_X(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}
Mean (\mu): E[X] = \mu
Variance : V[X] = \sigma^2
Standard Normal Distribution:
The Standard Normal Distribution, also called the Z-distribution, is a special case of the normal distribution where:
- Mean (μ)=0
- Standard deviation (σ)=1
It is used to compare and analyze data by standardizing values using the z-score:
Z = \frac{X - \mu}{\sigma}
Probability Density Function (PDF)
f(Z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{Z^2}{2}}, \quad -\infty < Z < \infty
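The standard normal PDF, CDF, and z-score can be checked with the standard library's statistics.NormalDist; the N(μ = 100, σ = 15) values below are an assumed illustration:

```python
from statistics import NormalDist

# Standard normal distribution (mu = 0, sigma = 1)
z = NormalDist(mu=0, sigma=1)
print(round(z.pdf(0), 4))      # 0.3989 = 1 / sqrt(2 * pi)
print(round(z.cdf(1.96), 4))   # ~0.975

# Standardizing a value from N(mu = 100, sigma = 15) via the z-score
x, mu, sigma = 130, 100, 15
print((x - mu) / sigma)        # z = 2.0
```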
t-Distribution:
The t-distribution (Student's t-distribution) is used in statistics to infer population means when:
- Sample size is small (n ≤ 30).
- Population standard deviation σ is unknown.
Key Formula:
The t-score: t = \frac{\bar{x} - \mu}{s / \sqrt{n}}
Where:
- \bar{x}: Sample mean
- μ: Population mean
- s: Sample standard deviation
Chi-Squared Distribution:
The Chi-Squared distribution represents the sum of the squares of k independent standard normal random variables:
X^2 = Z_1^2 + Z_2^2 + \dots + Z_k^2
- k: Degrees of freedom (df).
- As k increases, the distribution becomes more symmetric and approaches a normal distribution.
Probability Density Function (PDF):
f(x; k) = \frac{1}{2^{k/2} \Gamma(k/2)} x^{(k/2)-1} e^{-x/2}, \, x \geq 0, where \Gamma(\cdot) is the gamma function.
Mean: E[X] = k
Variance: \text{Var}[X] = 2k
Inferential Statistics
Inferential statistics makes predictions or inferences about a population based on sample data.
Sampling Distribution: A sampling distribution is the probability distribution of a statistic (such as the sample mean) obtained through repeated sampling from a population. It shows how the statistic varies across different samples.
Central limit theorem:
The Central Limit Theorem (CLT) states that for a sufficiently large sample size (n > 30), the distribution of the sample mean approaches a normal distribution, regardless of the shape of the population distribution, provided the population has a finite variance.
Formula: For a random variable X with:
- Mean (μ)
- Standard deviation (σ)
The sample mean \bar{X} follows:
\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), i.e., with mean \mu and standard error \frac{\sigma}{\sqrt{n}}
The Z-score for the sample mean is given by:
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
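A rough simulation of the CLT, drawing repeated samples from a non-normal Uniform(0, 1) population; the exact printed values depend on the random seed:

```python
import random
import statistics

# Sample means of a uniform population cluster around mu,
# with spread close to sigma / sqrt(n).
random.seed(1)
n, num_samples = 50, 2000
sample_means = [
    statistics.mean(random.uniform(0, 1) for _ in range(n))
    for _ in range(num_samples)
]

mu, sigma = 0.5, (1 / 12) ** 0.5   # mean and SD of Uniform(0, 1)
print(round(statistics.mean(sample_means), 3))    # close to 0.5
print(round(statistics.stdev(sample_means), 3))   # close to sigma / sqrt(n) ≈ 0.041
```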
Confidence Interval:
A Confidence Interval (CI) is a range of values within which the true population parameter (e.g., mean) lies with a certain confidence level (e.g., 95%).
Key Formula:
\text{CI} = \text{Point Estimate} \pm \text{Critical Value} \times \text{Standard Error}
- Point Estimate: Sample mean/proportion.
- Critical Value: From z-table or t-table.
- Standard Error: Depends on the statistic (e.g., \frac{s}{\sqrt{n}} for the mean).
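A sketch of a 95% confidence interval for a mean using a z critical value; the sample values are illustrative, and with a small n and unknown σ a t critical value would normally be used instead:

```python
from statistics import NormalDist, mean, stdev
from math import sqrt

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]   # illustrative data
n = len(sample)
point_estimate = mean(sample)
standard_error = stdev(sample) / sqrt(n)

# Critical value from the standard normal distribution (~1.96 for 95%)
z_crit = NormalDist().inv_cdf(0.975)

ci = (point_estimate - z_crit * standard_error,
      point_estimate + z_crit * standard_error)
print(point_estimate, ci)
```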
Z-Test:
A statistical test used to determine if a sample mean differs significantly from a population mean, applicable when:
- Sample size n > 30
- Population standard deviation (σ) is known.
Formula:
Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
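A minimal one-sample z-test sketch with assumed numbers (σ known, n > 30), including a two-tailed p-value from the standard normal CDF:

```python
from statistics import NormalDist
from math import sqrt

# Is the sample mean significantly different from the population mean mu?
# (illustrative values)
x_bar, mu, sigma, n = 52.0, 50.0, 8.0, 64

z = (x_bar - mu) / (sigma / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed p-value
print(round(z, 2), round(p_value, 4))          # 2.0, ~0.0455
```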
T-Test:
A t-test is a statistical method to compare the means of two groups and determine if the difference is statistically significant. It is used when:
- Sample size is small (n < 30).
- Population variance is unknown.
Key Types:
One-Sample T-Test:
- Compares a sample mean to a known population mean.
- Formula: t = \frac{\bar{x} - \mu}{s / \sqrt{n}}
Independent T-Test: Compares means of two independent groups.
Paired T-Test: Compares means from the same group at two different times.
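A sketch of the one-sample t statistic on illustrative data; the result is compared against a t-table with n − 1 degrees of freedom (if SciPy is available, scipy.stats.ttest_1samp returns the statistic and p-value directly):

```python
from statistics import mean, stdev
from math import sqrt

# One-sample t statistic (small n, sigma unknown); illustrative data
sample = [4.8, 5.2, 5.1, 4.9, 5.3, 5.0, 4.9]
mu_0 = 5.0   # hypothesized population mean

n = len(sample)
t = (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))
print(round(t, 3))   # compare against a t-table with n - 1 = 6 degrees of freedom
```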
Chi-Square Test:
A chi-square (χ²) test assesses whether there is a significant relationship between two categorical variables. It compares observed data against expected frequencies to identify if the results are likely to occur by chance.
Example: When tossing a coin, the test can show if heads or tails appear disproportionately often, suggesting that the result isn't just random.
Formula:
\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}
where:
- O_i = observed frequency
- E_i= expected frequency
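For the coin-toss example, the statistic can be computed directly; the observed counts below (60 heads and 40 tails in 100 tosses) are assumed for illustration:

```python
# Chi-square statistic: observed 60 heads / 40 tails vs. an expected 50 / 50 split
observed = [60, 40]
expected = [50, 50]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_sq)   # 4.0, compared against a chi-square table with 1 degree of freedom
```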