
Lecture Notes

Inferential Statistics
Exploratory data analysis helped you understand how to discover patterns in data using various
techniques and approaches. As you've learnt, EDA is one of the most important parts of the data analysis
process. It is also the part on which data analysts spend most of their time.

However, sometimes your analysis may require a very large amount of data, which would take too much time and too many resources to acquire. In such situations, you are forced to work with a smaller sample of the data instead of the entire data set.

Situations like these arise all the time at big companies like Amazon. For example, say the Amazon QC
department wants to know what proportion of the products in its warehouses are defective. Instead of
going through all of its products (which would be a lot!), the Amazon QC team can just check a small
sample of 1,000 products and then find, for this sample, the defect rate (i.e. the proportion of defective
products). Then, based on this sample's defect rate, the team can "infer" what the defect rate is for all the
products in the warehouses.

This process of “inferring” insights from sample data is called “Inferential Statistics”.

Random Variables

Before performing any kind of statistical analysis on a problem, it is advisable to quantify its outcomes using random variables.

A random variable X converts the outcomes of an experiment into something measurable.

For example, recall that we quantified the colours of the balls we would get after playing our game by
assigning a value of X to each outcome. We did so by defining X as the number of red balls we would get
after playing the game once.

Figure 1 - Quantifying Using Random Variables

Probability Distribution
A probability distribution for X is any form of representation that tells us the probability of every possible value of X. It could be a table, a chart or an equation.

A probability distribution looks like the frequency distribution, just with a different scale. For example,
here are the probability distribution and the frequency distribution (histogram) for our UpGrad red ball
game –

Figure 2 – Frequency Distribution (Left) vs Probability Distribution (Right)

Expected Value
The expected value for a variable X is the value of X we would “expect” to get after performing the
experiment once. It is also called the expectation, average, and mean value. Mathematically speaking, for a
random variable X that can take values x1, x2, x3, …… xn, the expected value (EV) is given by:

EV(X) = x1*P(X = x1) + x2*P(X = x2) + x3*P(X = x3) + … + xn*P(X = xn)

Where, P(X=xi) denotes the probability that the random variable will take the value xi.

For example, suppose you’re trying to find the expected value of the number of red balls in our UpGrad
game. The random variable X, which is the number of red balls the player gets after playing the game once,
can take values 0, 1, 2, 3 and 4. So, the expected value for the number of red balls would be –

EV(X) = 0*P(X = 0) + 1*P(X = 1) + 2*P(X = 2) + 3*P(X = 3) + 4*P(X = 4)


EV(X) = 0*(0.027) + 1*(0.160) + 2*(0.347) + 3*(0.333) + 4*(0.133) = 2.385
Note that you can never get 2.385 red balls in one game, as the number of balls will be an integer, like 2 or 3. However, the expected value, i.e., the value you would “expect” to get after one experiment, does not have to be a value that can actually turn up in the experiment/game.

Rather, the expected value is the average value of X that you would get after playing the game an infinite number of times.
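The expected-value computation above can be reproduced in a few lines of Python, using the probabilities quoted for the UpGrad game:

```python
# Expected value EV(X) = sum of x * P(X = x) over all values of X,
# using the UpGrad game probabilities quoted above.
values = [0, 1, 2, 3, 4]
probs = [0.027, 0.160, 0.347, 0.333, 0.133]

ev = sum(x * p for x, p in zip(values, probs))
print(round(ev, 3))  # 2.385
```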

Probability Without Experiment

Using basic rules of probability, i.e., addition rule and multiplication rule, you saw how you could find the
probabilities for our UpGrad red ball game, without even playing the game once.

The probability distribution thus achieved (theoretical probability distribution), was very similar to the
distribution achieved earlier via experiment (observed probability distribution).

Figure 3 – Observed Probability Distribution (Left) vs Theoretical Probability Distribution (Right)

Notice that the values of P(X = 0) are very close in both graphs, as are the values of P(X = 1), P(X = 2), P(X = 3) and P(X = 4). Had the number of experiments conducted been more than 75, the values would have been even closer. In fact, for an infinite number of experiments, the values would be exactly the same in both graphs.

Binomial Distribution
The binomial distribution can be used to calculate the probability of an event if the following conditions hold –

1. The total number of trials is fixed


2. Each trial is binary, i.e. has only two possible outcomes, success and failure
3. The probability of success is the same for all the trials

Basically, it should be a series of yes-or-no questions, with the probability of a yes remaining the same for all questions. Examples of such situations are –

1. Finding the probability of 5 out of the next 10 cars having an even-numbered license plate
2. Finding the probability of 3 of the next 4 balls picked out from the bag being red (UpGrad game, where balls are put back after drawing)
3. Finding the probability of 9 out of the next 20 coin tosses resulting in heads

For such a situation, the probability of r successes is given by –

P(X = r) = nCr * p^r * (1 − p)^(n−r)

Where,
n is the total number of trials/questions
p is the probability of success in 1 trial
r is the number of successes after n trials

For example, in our UpGrad game –


Total number of trials, n = 4
Probability of getting a red ball in 1 trial, p = 0.6

So, the probability of getting r red balls is given by –

P(X = r) = 4Cr * (0.6)^r * (0.4)^(4−r)

Using this, we get P(X = 0) = 4C0 * (0.6)^0 * (0.4)^4 = 0.0256. Also, P(X = 1) = 4C1 * (0.6)^1 * (0.4)^3 = 0.1536. Similarly, we can find P(X = 2), P(X = 3) and P(X = 4).
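As a sketch, the same probabilities can be computed with Python’s standard library, where math.comb gives nCr:

```python
import math

# Binomial probability P(X = r) = nCr * p^r * (1 - p)^(n - r)
# for the UpGrad game: n = 4 trials, p = 0.6 probability of red per trial.
def binom_pmf(r, n=4, p=0.6):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

print(round(binom_pmf(0), 4))  # 0.0256
print(round(binom_pmf(1), 4))  # 0.1536
print(round(binom_pmf(2), 4))  # 0.3456
```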

Cumulative Probability
The cumulative probability of x, generally denoted by F(x), is the probability of the random variable X taking a value less than or equal to x. Mathematically speaking, we’d say –

F(x) = P(X ≤ x)

For example, for our UpGrad game,

F(2) = P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.0256 + 0.1536 + 0.3456 = 0.5248.
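Since F(x) is just the sum of the probabilities of all values up to x, it can be computed directly from the binomial formula:

```python
import math

# Binomial pmf for the UpGrad game (n = 4 trials, p = 0.6).
def binom_pmf(r, n=4, p=0.6):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

# Cumulative probability F(x) = P(X <= x): sum the pmf
# over every value from 0 up to and including x.
def binom_cdf(x, n=4, p=0.6):
    return sum(binom_pmf(r, n, p) for r in range(x + 1))

print(round(binom_cdf(2), 4))  # 0.5248
```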

Probability Density Functions


For a continuous random variable, the probability of getting any one exact value is zero. Hence, when talking about the probability of continuous random variables, you can only talk in terms of intervals.

For example, for a particular company, the probability of an employee’s commute time being exactly equal
to 35 minutes was zero, but the probability of an employee having a commute time between 35 and 40
minutes was 0.2.

Hence, for continuous random variables, probability density functions (PDFs) and cumulative distribution
functions (CDFs) are used, instead of the bar chart type of distribution used for the probability of discrete
random variables. These functions are preferred because they talk about probability in terms of intervals.

Figure 4 – PDFs vs. CDFs (X = commute time)

To find the cumulative probability using a CDF, you just have to read the value off the graph. For example, F(28), i.e., the probability of an employee having a commute time less than or equal to 28 minutes, is given by the value of the CDF at X = 28. In the PDF, it is given by the area under the graph between X = 20 (the lowest value) and X = 28.
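The notes do not specify the commute-time PDF, so purely as an illustration, assume it is uniform on [20, 45] minutes (an assumption that happens to be consistent with the quoted P(35 < X < 40) = 0.2). F(28) is then the area under the PDF from 20 to 28, which a simple numerical sum can approximate:

```python
# Hypothetical commute-time PDF: uniform on [20, 45] minutes.
# (Assumed for illustration only; the actual PDF is not given in the notes.)
def pdf(x):
    return 1 / 25 if 20 <= x <= 45 else 0.0

# CDF F(x) = area under the PDF from the lowest value (20) up to x,
# approximated here with a simple Riemann sum.
def cdf(x, step=0.001):
    n = int((x - 20) / step)
    return sum(pdf(20 + i * step) * step for i in range(n))

print(round(cdf(28), 3))  # 0.32
```

Under this assumed distribution, F(28) = (28 − 20)/25 = 0.32, which the numerical sum recovers.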

Normal Distribution
A very commonly used probability density function is the normal distribution. It is a symmetric
distribution, and its mean, median and mode lie at the center.
Figure 5 – Normal Distribution

Also, a variable that is normally distributed, follows the 1-2-3 rule, which states that there is a –

1. 68% probability of the variable lying within 1 standard deviation of the mean
2. 95% probability of the variable lying within 2 standard deviations of the mean
3. 99.7% probability of the variable lying within 3 standard deviations of the mean

Figure 6 – 1-2-3 Rule for Normal Distribution
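The 1-2-3 rule can be checked numerically using the standard normal CDF, which the Python standard library exposes via the error function:

```python
import math

# Standard normal CDF, written in terms of the error function.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Probability that a normal variable lies within k standard
# deviations of its mean: Phi(k) - Phi(-k).
def within(k):
    return phi(k) - phi(-k)

print(round(within(1), 3))  # 0.683
print(round(within(2), 3))  # 0.954
print(round(within(3), 3))  # 0.997
```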

Standard Normal Distribution


In order to find the probability for a normal variable, you actually do not need to know the value of the mean or the standard deviation; it is enough to know the number of standard deviations away from the mean your random variable is. That is given by:

Z = (X − μ) / σ

This is called the Z score, or the standard normal variable.

In fact, you can use the Z table to find the cumulative probability for various values of Z. For example, say,
you want to find the cumulative probability for Z = 0.68 using the Z table.
Figure 7 – Z Table

The intersection of row “0.6” and column “0.08” is 0.7517, which is our answer.
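The same lookup can be done without a printed table, by computing the standard normal CDF directly:

```python
import math

# Cumulative probability for a Z score, computed directly
# instead of being read off a printed Z table.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(phi(0.68), 4))  # 0.7517
```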
Samples
Instead of finding the mean and standard deviation for the entire population, it is sometimes beneficial to
find the mean and standard deviation for only a small representative sample. You may have to do this
because of time and/or money constraints.
For example, for an office of 30,000 employees, we wanted to find the average commute time. So, instead
of asking all employees, we asked only 100 of them and found that for them, the mean was equal to 36.6
minutes and the standard deviation was equal to 10 minutes.
However, we said that it would not be fair to infer that the population mean is exactly equal to the sample mean. This is because flaws in the sampling process must have led to some error. Hence, the sample mean’s value has to be reported with some margin of error.
For example, the mean commute time for the office of 30,000 employees would be reported as 36.6 ± 3 minutes, 36.6 ± 1 minute, 36.6 ± 10 minutes or, for that matter, 36.6 minutes ± some margin of error.
However, in order to find this margin, it is necessary to understand what sampling distributions are, as their properties help in finding it.

Sampling Distributions & Central Limit Theorem

The sampling distribution, which is basically the distribution of the sample means of a population, has some interesting properties that are collectively called the central limit theorem. It states that no matter how the original population is distributed, the sampling distribution will follow these three properties –
1. Sampling distribution’s mean (μ_X̄) = population mean (μ)
2. Sampling distribution’s standard deviation (standard error) = σ/√n, where σ is the population’s standard deviation and n is the sample size
3. For n > 30, the sampling distribution becomes a normal distribution
To verify these properties, we performed sampling using the data collected for our UpGrad game from Session 1. The values for the sampling distribution thus created (μ_X̄ = 2.348, S.E. = 0.4248) were pretty close to the values predicted by theory (μ_X̄ = 2.385, S.E. = 0.44).
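A quick simulation illustrates these properties. Modelling one game as 4 independent draws, each red with probability 0.6 (the binomial model used earlier, here an assumption about the game), the mean and spread of many sample means come out close to μ = 4 × 0.6 = 2.4 and σ/√n:

```python
import random
import statistics

random.seed(42)

# One play of the game, modelled as 4 independent draws,
# each red with probability 0.6 (an assumption for illustration).
def play():
    return sum(random.random() < 0.6 for _ in range(4))

# Sampling distribution: the means of many samples of size n.
n = 50
sample_means = [statistics.mean(play() for _ in range(n))
                for _ in range(5000)]

# CLT predicts: mean close to mu = 4 * 0.6 = 2.4, and standard
# error close to sigma / sqrt(n) = sqrt(0.96) / sqrt(50), about 0.139.
print(round(statistics.mean(sample_means), 2))
print(round(statistics.stdev(sample_means), 3))
```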
To summarise, the notation related to samples, populations and sampling distributions is –
Population: mean μ, standard deviation σ
Sample: mean X̄, standard deviation S, size n
Sampling distribution: mean μ_X̄ = μ, standard error σ/√n

Mean Estimation Using CLT

Using CLT, you can estimate the population mean from the sample mean and standard deviation.
For example, to estimate the mean commute time of 30,000 employees of an office, you took a sample of
100 employees and found their mean commute time. For this sample, the sample mean 𝑋̅ = 36.6 minutes,
sample standard deviation S = 10 minutes.
Using CLT, you can say that the sampling distribution for the mean commute time will have –
1. Mean = μ {unknown}
2. Standard error = σ/√n ≈ S/√n = 10/√100 = 1
3. Since n (100) > 30, the sampling distribution is a normal distribution
Using these properties, you can claim that the probability that the population mean μ lies between 34.6
(36.6-2) mins and 38.6 (36.6+2) mins, is 95.4%.
Also, there is some terminology related to the claim -
1. Probability associated with the claim is called confidence level (Here it is 95.4%)
2. Maximum error made in sample mean is called margin of error (Here it is 2 minutes)
3. Final interval of values is called confidence interval {Here it is the range – (34.6, 38.6)}
In fact, you can generalise the entire process. Let’s say you have a sample with sample size n, mean 𝑋̅ and
standard deviation S. Now, the y% confidence interval (i.e., confidence interval corresponding to y%
confidence level) for μ will be given by the range –

Confidence interval = (X̄ − Z* × S/√n, X̄ + Z* × S/√n)

Where, Z* is the Z-score associated with a y% confidence level.


For example, the 90% confidence interval for the mean commute time will be –

μ = (X̄ − Z* × S/√n, X̄ + Z* × S/√n)

Here,
𝑋̅ = 36.6 minutes
S = 10 minutes
n = 100
Z* = 1.65 (Z* corresponding to 90% confidence level)
So, the confidence interval is –
μ = (34.95 mins, 38.25 mins)
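Putting these numbers together in code:

```python
import math

# 90% confidence interval for the mean commute time,
# using the sample statistics quoted above.
x_bar = 36.6   # sample mean (minutes)
s = 10         # sample standard deviation (minutes)
n = 100        # sample size
z_star = 1.65  # Z* corresponding to a 90% confidence level

margin = z_star * s / math.sqrt(n)   # 1.65 * 10 / 10 = 1.65 minutes
ci = (round(x_bar - margin, 2), round(x_bar + margin, 2))
print(ci)  # (34.95, 38.25)
```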
