Module 2 Hypothesis Testing

Here are the null and alternative hypotheses for this scenario:
Ho: The average commute time is greater than or equal to 35 minutes
Ha: The average commute time is less than 35 minutes
The claim is that the average commute time is at least 35 minutes. So the null hypothesis states that the average commute time is greater than or equal to 35 minutes. The alternative hypothesis challenges this by stating that the average commute time is less than 35 minutes.

IMT-PG Programme in Management

Decision Sciences

Doubt Resolution on Hypothesis Testing

Presented by: Dr. Anuja Shukla

https://learn.upgrad.com/course/1260
Agenda

Topic Time (mins.)


Quiz 1. Framing of hypothesis https://forms.gle/JaUNBYKtgd7QgNor6 5
What is Hypothesis? Need for hypothesis in business 5
Converting of business problem into a hypothesis statement: Null and Alternate 10
Types of tail in test 10
Quiz 2. Testing of Hypothesis https://forms.gle/jLbszZKDucYstzyy5 5
Step-by-step process of hypothesis testing 5
Testing of Hypothesis: Critical Value method 15
Testing of Hypothesis: p value method 15
Types of Errors 10
Quiz3. Practice https://forms.gle/NmAufBEXStaRagRr8 HW
Q&A https://www.caradvice.com.au/955136/mitsubishi-loses-supreme-court-appeal-on-misleading-and-deceptive-fuel-consumption-claim/ 10
Total 90
Module 2: Hypothesis Testing
Hypothesis testing Z distribution
✓ Two tailed test
✓ Left tailed test
✓ Right tailed test
Hypothesis testing t distribution
✓ One sample
✓ Two sample
✓ Paired
✓ Unpaired
✓ A/B testing

Quiz 1. Framing of hypothesis https://forms.gle/JaUNBYKtgd7QgNor6


Hypothesis
• A research hypothesis is a specific, clear, and testable proposition or predictive
statement about the possible outcome of a scientific research study based on a
particular property of a population, such as presumed differences between groups
on a particular variable or relationships between variables.
• Decision-makers often face situations wherein they are interested in testing
hypotheses on the basis of available information and then take decisions on the
basis of such testing.
• In social science, where direct knowledge of population parameter(s) is rare,
hypothesis testing is an often-used strategy for deciding whether sample data
offer enough support for a hypothesis that a generalisation can be made.
• A hypothesis may be defined as a proposition or a set of propositions set forth as an
explanation for the occurrence of some specified group of phenomena, either
asserted merely as a provisional conjecture to guide some investigation or accepted
as highly probable in the light of established facts.
Need for hypothesis in business

• An airline company claims that 90% of its flights are on time.


• A consultant claims that using just-in-time production can reduce your inventory
cost per unit by ₹10.
• A tyre manufacturer claims its tyres last 50% longer than its competitors’. You are
probably left wondering how many of these are actually true.
• For instance, suppose that the fallout rates of samples drawn from two different
groups are 15% and 10%, respectively. Declaring one group better than the other
from these figures alone would be a premature judgment.
• Hypothesis testing is designed to detect significant differences: differences that
did not occur by random chance.

• Additional Reading: https://towardsdatascience.com/how-to-interpret-p-value-with-covid-19-data-edc19e8483b
Types of Hypothesis

Null Hypothesis
- A null hypothesis is a statement of the status quo, one of no difference or no effect. If the null
hypothesis is not rejected, no changes will be made.
- Represented by H0
- Always contains an equality sign (=, ≥, ≤)

Alternate Hypothesis
- An alternative hypothesis is one in which some difference or effect is expected. Accepting the
alternative hypothesis will lead to changes in opinions or actions.
- Represented by Ha
- Never contains an equality sign (≠, <, >)

The null hypothesis refers to a specified value of the population parameter, not of a sample statistic.
A null hypothesis may be rejected, but it can never be accepted based on a single test.
Tails of test

What we want to test                          Null             Alternate        Type of test
The population mean is different from 10      H0: Mean = 10    Ha: Mean ≠ 10    Two tail
The population mean is less than 10           H0: Mean ≥ 10    Ha: Mean < 10    Left tail
The population mean is greater than 10        H0: Mean ≤ 10    Ha: Mean > 10    Right tail

Significance Level (⍺) Two-Tailed Test One-Tailed Test (Left) One-Tailed Test (Right)
0.01 ±2.58 -2.326 +2.326
0.05 ±1.96 -1.645 +1.645
0.10 ±1.645 -1.282 +1.282
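
The critical values in the table above can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the function name `critical_z` and the tail labels 'two', 'left' and 'right' are choices made for this illustration, not part of the session material.

```python
from statistics import NormalDist


def critical_z(alpha, tail):
    """Return the critical z value(s) for a significance level alpha.

    tail is 'two', 'left' or 'right' (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # alpha is split equally between the two tails
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")


for alpha in (0.01, 0.05, 0.10):
    lower, upper = critical_z(alpha, "two")
    print(
        f"alpha={alpha:.2f}  two-tailed=±{upper:.3f}  "
        f"left={critical_z(alpha, 'left'):.3f}  "
        f"right=+{critical_z(alpha, 'right'):.3f}"
    )
```

The two-tailed value at ⍺ = 0.01 comes out as 2.576, which the table rounds to 2.58.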
Steps Involved in Hypothesis Testing
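1. Formulate H0 and Ha
2. Select Appropriate Test
3. Choose Level of Significance (α)
4. Collect Data and Calculate Test Statistic
5. Use either of the two methods:
   - p value method: Determine Probability Associated (p value) with Test Statistic and
     compare it with the Level of Significance; if p value ≤ alpha, reject H0
   - Critical value method: Determine Critical Value of Test Statistic and compare the
     calculated value with the tabulated value; if Zcal is at least as extreme as Ztab in
     the direction of the tail being tested (|Zcal| ≥ |Ztab|), reject H0
6. Reject or Do not Reject H0
7. Draw Statistical Conclusion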


A Broad Classification of Hypothesis Tests

Hypothesis Tests
- Tests of Association
- Tests of Differences
  - Distributions
  - Means
  - Proportions
  - Median/Rankings
Formulating the Hypotheses
• A well-known car-maker claims that one of its cars has mileage of at least 17
kilometres per litre. You want to challenge this claim. Define the null and alternative
hypotheses for the problem.

• Hypothesis Statement
Null hypothesis: The mileage is greater than or equal to 17
(as this is the default claim made by the brand )
Alternative hypothesis: The mileage is less than 17
(as this challenges the null hypothesis)

• Mathematically
Ho: Mileage (mean) ≥ 17
Hα: Mileage (mean) < 17
Formulating the Hypotheses

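S.No.   Statement in question   Null Hypothesis statement
1       At least                More than or equal to (≥)
2       More than               Less than or equal to (≤)
3       Less than               More than or equal to (≥)
4       At most                 Less than or equal to (≤)

Convention: Include the equal sign in the null hypothesis statement. A strict claim ("more than",
"less than") therefore becomes the alternative hypothesis, and the null hypothesis is its complement.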


Formulating the Hypotheses
• Let’s say you are the COO of a shoe-manufacturing company. An employee has
developed a new sole and claims that incorporating it will decrease the wear after
three years of use by more than 9%. Now, suppose you want to test this claim.
• What will be the null and alternative hypotheses for the sole developed by the
employee in this scenario?

Ho: Decrease in wear after 3 years ≤ 9%; Ha: Decrease in wear after 3 years > 9%
Formulating the Hypotheses
• Mr. Mohan of the Civil Engineering Department wants to test the load-bearing
capacity of an old bridge, which must be more than 10 tons. In that case, he can
state his hypotheses as follows:
• Null hypothesis H0: µ ≤ 10 tons
• Alternative hypothesis Ha: µ > 10 tons
Formulating the Hypotheses
• The average score in an aptitude test administered at the national level
is 80. To evaluate a state’s education system, the average score of 100
of the state’s students selected on random basis was 75. The state
wants to know if there is a significant difference between the local
scores and the national scores.

• Null hypothesis H0 : µ = 80
• Alternative Hypothesis Ha : µ ≠ 80
Formulating the Hypotheses

• It is believed that the average commute time for an employee to and
from their office in Hyderabad is at least 35 minutes. Now, suppose you
want to test this claim.
• What will be the null and alternative hypotheses in this case if the
average commute time is represented by μ?
• Ho: μ ≥ 35 minutes; Ha: μ < 35 minutes
Formulating the Hypotheses
• Goodyear has launched a new tyre, which, it claims, can travel more
than 7,500 miles before it needs any replacement.
• Assuming that the ‘average distance travelled before replacement’ is
given by μ, what would be the null and alternative hypotheses in this
case?

Ho: μ ≤ 7500 miles; Ha: μ > 7500 miles


Formulating the Hypotheses
• Aashirwad company packages flour as per weight and a particular size of package is
supposed to average 10 kg. Suppose the manufacturer wants to test to determine
whether their packaging process is out of control as determined by the weight of
the flour packages.
• H0: Mean = 10 kg
• Ha: Mean ≠ 10 kg

• The null hypothesis for this experiment is that the average weight of the flour
packages is 10 kg (no problem). The alternative hypothesis is that the average is not
10 kg (process is out of control).
Formulating the Hypotheses: Practice sets
• A financial investment firm wants to test to determine whether the average hourly
change in the Dow Jones Average over a 10-year period is +0.25.
• A manufacturing company wants to test to determine whether the average
thickness of a plastic bottle is 2.4 millimeters.
• A retail store wants to test to determine whether the average age of its customers
is less than 40 years.
Testing of Hypothesis
• Quiz 2. Testing of Hypothesis https://forms.gle/jLbszZKDucYstzyy5
Type of Test
Test (normality assumption)

One sample (Parameter of measurement: mean)
  Population standard deviation is known, N>30          Z test
  Population standard deviation is not known, N<30      One-sample t test
  Population standard deviation is not known, N>30      One-sample t test

Two sample (Parameter of measurement: mean)
  Measuring same population before and after            Paired two-sample means test
  Measuring two different populations                   Unpaired two-sample means test

Comparing two versions (Parameter of measurement: population proportion)
                                                        A/B Testing
Results of Hypothesis
Critical value method
• Calculated z lies within the critical range: Fail to reject null hypothesis
• Calculated z lies outside the critical range: Reject null hypothesis
• In the direction of the tail being tested: |Zcal| ≥ |Ztab|, reject H0; |Zcal| < |Ztab|, fail to reject H0

P value method
If p<=alpha , Reject Ho
If p> alpha, Fail to Reject Ho
Confidence level
• Confidence interval is a range of values, derived from sample statistics, which is likely to contain
the value of an unknown population parameter
• The confidence level is defined for the hypothesis test according to the accuracy needed.
• A higher confidence level indicates that more evidence is needed to reject the null hypothesis.
• Therefore, increasing the confidence level makes it harder to reject the null hypothesis.
• Inversely, a low confidence level indicates that the null hypothesis can be rejected easily.
✓ The confidence level or reliability is the expected percentage of times that the actual value will
fall within the stated precision limits.
✓ Thus, if we take a confidence level of 95%, then we mean that there are 95 chances in 100 (or
.95 in 1) that the sample results represent the true condition of the population within a specified
precision range against 5 chances in 100 (or .05 in 1) that it does not.
✓ Confidence level indicates the likelihood that the answer will fall within that range, and the
significance level indicates the likelihood that the answer will fall outside that range.
✓ We can always remember that if the confidence level is 95%, then the significance level will be
(100 – 95) i.e., 5%; if the confidence level is 99%, the significance level is (100 – 99) i.e., 1%.
✓ We should also remember that the area of normal curve within precision limits for the specified
confidence level constitute the acceptance region and the area of the curve outside these limits
in either direction constitutes the rejection regions.
Level of Significance
• The significance level is the probability of
rejecting the null hypothesis when it is
true. For example, a significance level of
0.05 indicates a 5% risk of concluding
that a difference exists when there is no
actual difference. Lower significance
levels indicate that you require stronger
evidence before you will reject the null
hypothesis.
• LoS (α) = 1 − confidence level (for example, 1 − 0.95 = 0.05)
Z Score

✓If the z-score of the sample lies further away from the center than the critical z-
values, the null hypothesis is rejected.
✓Otherwise, the test fails to reject the hypothesis.
✓The only two possible outcomes of a hypothesis test are ‘reject the null
hypothesis’ or ‘fail to reject the null hypothesis’. This hypothesis can never be
‘accepted’.
Commonly used critical z scores
[Figure: rejection regions for the left tail test, two tail test and right tail test]
Two tail test

• Example: OnePlus

• You need to verify whether the OnePlus 6 takes 30 minutes to reach 60% charge,
since this is the popular sentiment.
• First, if the time taken is less than 30 minutes, you want to revise your claim to
boast about the better figure.
• Second, if the time taken is more, you want the engineers to fix this issue.
• What will your null and alternative hypotheses be?
• Hypothesis Statement
• Null hypothesis: The time needed to charge to 60 percent is equal to 30 minutes.
• Alternative hypothesis: The time needed to charge to 60 percent is not equal to 30
minutes.
Assumptions
• Population is normal
• Sample size is large (n>30)
• Z score, two tail hypothesis test
For testing the hypothesis
• Ho: Mean = 30
• Hα: Mean ≠ 30
Testing hypothesis : Z score

• z score is distance of point from centre (in terms of std dev)
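
As a rough sketch of the z score described above, assuming the usual one-sample form z = (sample mean − hypothesised mean) / (σ / √n). The function name and the numbers below (a sample mean charging time of 30.6 minutes, σ = 2 minutes, n = 100) are made up for illustration and do not come from the session.

```python
from math import sqrt


def z_score(sample_mean, hypothesised_mean, population_sd, n):
    """Distance of the sample mean from the hypothesised mean,
    measured in standard errors (population_sd / sqrt(n))."""
    standard_error = population_sd / sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error


# Hypothetical figures: H0 mean = 30 minutes, observed mean = 30.6 minutes
z = z_score(sample_mean=30.6, hypothesised_mean=30, population_sd=2, n=100)
print(f"z = {z:.2f}")  # about 3 standard errors above the hypothesised mean
```

A z score of about 3 lies beyond the two-tailed critical values of ±1.96 at a 5% significance level, so in this made-up case the null hypothesis would be rejected.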


Testing hypothesis : P Value
✓ An alternative way of obtaining the test result is by calculating the p-value.
✓ The p-value can be calculated from the z-score, using a z-table or by inserting
the z-score into a p-value calculator.
✓ The null hypothesis can be rejected at all confidence levels below 1-p.
✓ The p-value is the probability of observing a test statistic at least as extreme as
the one obtained from the sample, assuming that the null hypothesis is true. It is
not the probability of the null hypothesis being true.
✓ It directly tells the confidence level at which the null hypothesis can be rejected.
✓ If the p-value is less than or equal to the significance level (α), then you can
reject the null hypothesis.

P value method
If p<=alpha, Reject Ho
If p> alpha, Fail to Reject Ho
http://courses.atlas.illinois.edu/spring2016/STAT/STAT200/pnormal.html
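
Instead of a z-table or an online calculator, the p-value can also be computed directly from the z-score. This is a minimal sketch with Python's standard library; the function name `p_value_from_z` and the tail labels are assumptions made for this illustration.

```python
from statistics import NormalDist


def p_value_from_z(z, tail):
    """p-value of a z statistic for a 'left', 'right' or 'two' tailed test."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "left":
        return cdf(z)                   # area to the left of z
    if tail == "right":
        return 1 - cdf(z)               # area to the right of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))    # both tails beyond |z|
    raise ValueError("tail must be 'left', 'right' or 'two'")


alpha = 0.05
for z, tail in ((1.96, "two"), (-1.0, "left"), (2.5, "right")):
    p = p_value_from_z(z, tail)
    decision = "Reject Ho" if p <= alpha else "Fail to Reject Ho"
    print(f"z={z:+.2f} ({tail} tail): p={p:.4f} -> {decision}")
```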
Right Tail test
• Example: Hypothesis test from the perspective of a OnePlus 6
customer
• Will you care if the OnePlus 6 makes the claim of “a day’s power in half
an hour” and then overperforms by taking less time to charge?
• I would care only if the phone was underperforming. Therefore, it is
often sufficient to perform the hypothesis test on only one side of the
curve, depending on the context.
• Null hypothesis : The time needed ≤ 30 minutes.
• Alternative hypothesis : The time needed is > 30 minutes.
Hypothesis test: OnePlus customer
[Figure: rejection regions for the two tail test and the one tail test (right tail)]

Example 2: MS Excel
[Figure: worked example in MS Excel; result: fail to reject null hypothesis]

Left Tail test

• Imagine you’re the owner of a pizza company, and you claim that your pizzas
are more than 9 inches in diameter. But you’ve been receiving complaints
from some of your customers, who say that the pizzas are actually smaller.
Your task is to now find out whether your chefs are producing smaller pizzas.
In this case, you will conduct a ‘left-tailed test’ by checking whether your
sample mean is significantly less than 9 inches, since you’re checking
whether the complaints about smaller pizzas are true.
• Hypothesis Statement
• Null hypothesis : Pizza size is at least 9 inches (i.e. 9 or more).
• Alternative hypothesis : Pizza size is less than 9 inches
• Mathematically
• Null hypothesis : Pizza size ≥ 9.
• Alternative hypothesis : Pizza size is < 9.
Hypothesis testing –One sample t test
• The One Sample t Test examines whether the mean of a population is
statistically different from a known or hypothesized value. The One
Sample t Test is a parametric test.
• Applicable when population standard deviation is unknown
• Sample size is less than 30
Two Sample t Test
• When there is a need to compare the means of two samples, a two-
sample t-test is conducted. In such a case, the formula for the t-statistic
becomes

https://learn.upgrad.com/course/1260/segment/10349/64185/187901/997831
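
The formula shown on the slide is not recoverable from this text, so the following is only a sketch of two commonly used forms of the two-sample t statistic: the pooled (equal-variance) form and Welch's unequal-variance form. Which of these the session used is an assumption left open here; the function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(sample_a, sample_b, equal_variance=True):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # n-1 denominators
    if equal_variance:
        # Pooled variance: weighted average of the two sample variances
        pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
        standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
        df = n_a + n_b - 2
    else:
        # Welch's form: each sample keeps its own variance
        standard_error = sqrt(var_a / n_a + var_b / n_b)
        df = standard_error ** 4 / (
            (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
        )
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


# Hypothetical measurements for two independent groups
group_a = [12, 14, 16, 18, 20]
group_b = [10, 12, 14, 16, 18]
print(two_sample_t(group_a, group_b))
print(two_sample_t(group_a, group_b, equal_variance=False))
```

The resulting t value is compared with the critical value from a t-table at the returned degrees of freedom.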
Types of two sample test
Paired t test
• Paired t test - Paired means that both samples consist of the same test subjects.
• Paired t-tests are used when the same item or group is tested twice, which is known as a repeated
measures t-test. Some examples of instances for which a paired t-test is appropriate include:
• The before and after effect of a pharmaceutical treatment on the same group of people.
• Body temperature using two different thermometers on the same group of participants.
• Standardized test results of a group of students before and after a study prep course.

Unpaired t test
• An unpaired t-test is used to compare the mean between two independent groups. You
use an unpaired t-test when you are comparing two separate groups with equal variance.
• Unpaired t test - Unpaired means that both samples consist of distinct test subjects.
• Research, such as a pharmaceutical study or other treatment plan, where ½ of the subjects
are assigned to the treatment group and ½ of the subjects are randomly assigned to the
control group.
• Comparing the average commuting distance traveled by New York City and San Francisco
residents using 1,000 randomly selected participants from each city.

https://www.technologynetworks.com/informatics/articles/paired-vs-unpaired-t-test-differences-assumptions-and-hypotheses-330826
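
For the paired case, the test reduces to a one-sample t test on the per-subject differences. The sketch below assumes that standard formulation; the function name and the before/after scores are hypothetical and only illustrate the "study prep course" style of example.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per subject: after minus before
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)  # stdev uses n-1
    return mean(differences) / standard_error, n - 1


# Hypothetical test scores of the same four students before and after a course
scores_before = [70, 65, 80, 75]
scores_after = [74, 68, 82, 80]
t, df = paired_t(scores_before, scores_after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Because each subject acts as their own control, only the spread of the differences enters the standard error, not the spread between subjects.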
Summary
1. Define the hypothesis statements: Your test will either ‘reject’ or ‘fail to reject’ the null hypothesis.
2. Collect as many data points as possible: The data points you collect will produce one sample. The size
of this single sample will depend on how many data points you take.
3. Measure the sample mean and the sample standard deviation: The standard deviation should be
calculated using the ‘n-1’ method. The STDEV function in Excel takes care of this.
4. Identify the distribution of the sample means: If the sample size is larger than 30, the distribution
of the sample means will be approximately normal (We’re only focusing on normal distributions for now).
5. Define the confidence level: This is the level of surety that you demand from a hypothesis test. The
higher the confidence level, the harder it is to reject the null hypothesis.
6. Find the critical z-scores of the confidence level and the test statistic or the z-score of the sample: The
z-score of the sample can be calculated by subtracting the hypothesised mean from the sample mean
and dividing the difference by the standard error, that is, the population standard deviation divided by
the square root of the sample size.
7. Compare the sample test statistic with the critical z-scores: Here, you check whether the sample
statistic is more extreme than the z-scores.
8. If the sample test statistic is more extreme than the critical z-scores, you will reject the null
hypothesis. Otherwise, you will fail to reject it.
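
The eight steps above can be sketched end to end as follows. This is an illustration under stated assumptions: the sample is large (n > 30), so the sample standard deviation (n-1 method) stands in for the population standard deviation, and the data are generated charging times in minutes, not real measurements. The function name `one_sample_z_test` is made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z_test(data, hypothesised_mean, alpha=0.05, tail="two"):
    """Return (z, critical z, reject?) using the critical value method."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)       # steps 3 and 6
    z = (mean(data) - hypothesised_mean) / standard_error
    std_normal = NormalDist()
    if tail == "two":                            # steps 5 to 8
        z_critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) >= z_critical
    elif tail == "right":
        z_critical = std_normal.inv_cdf(1 - alpha)
        reject = z >= z_critical
    elif tail == "left":
        z_critical = std_normal.inv_cdf(alpha)
        reject = z <= z_critical
    else:
        raise ValueError("tail must be 'two', 'left' or 'right'")
    return z, z_critical, reject


# Step 2: forty generated (hypothetical) charging times, in minutes
charging_times = [29.5 + 0.5 * (i % 5) for i in range(40)]

# Step 1: Ho: Mean = 30, Ha: Mean ≠ 30
z, z_critical, reject = one_sample_z_test(charging_times, 30)
print(f"z = {z:.2f}, critical z = ±{z_critical:.2f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```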
Summary
✓When the test needs to check only positive or negative deviation from the null
hypothesis, a one-tailed test is performed.
✓When the test needs to check for deviation on either side of the null hypothesis, a
two-tailed test is performed.
✓When the sample size is small (less than 30), a t-test is performed.
✓A t-test is also preferred over a z-test when the population standard deviation is
unknown.
✓When two sample means need to be checked for equality, a two-sample t-test is
performed.
✓When there is a need to check whether an entire distribution is similar to another,
a goodness of fit test is performed.
✓Hypothesis testing also carries some probability of committing errors. The errors
can be of two types: Type I and Type II.
A/B testing
✓An A/B test tells you whether there is a statistical difference in the performance of
the two options.
✓Data driven decision making system
✓A/B tests are used whenever there is a need to compare two alternatives.
✓The A/B test can be considered the most basic kind of randomized controlled
experiment
✓You will now learn about ‘A/B tests’, which are used in the industry when there is a
need to make a choice between two options. An A/B test tells you whether there
is a statistical difference in the performance of the two options.
A/B testing : History
• In the 1920s statistician and biologist Ronald Fisher discovered the most
important principles behind A/B testing and randomized controlled
experiments in general.
• Fisher ran agricultural experiments, asking questions such as, What happens if I
put more fertilizer on this land? The principles persisted and in the early 1950s
scientists started running clinical trials in medicine.
• In the 1960s and 1970s the concept was adapted by marketers to evaluate
direct response campaigns (e.g., would a postcard or a letter to target
customers result in more sales?).
Areas of application
• Medicine, to understand if a drug works or not
• Economics, to understand human behaviour
• Foreign aid and charitable work (the reputable ones at least), to
understand which interventions are most effective at alleviating
problems (health, poverty, etc)
• Comparing two versions of a website
• Comparing two colours/tabs/page designs
Example: A/ B testing
• Let’s say John builds a website for a free e-book and is testing out two colour
variations — red and blue. On the red website, 45 out of 100 visitors downloaded
the e-book. But on the blue website, 47 out of 100 visitors downloaded the e-book.
Based on this, John may conclude that the blue website is performing better.

• However, John’s method can backfire. This is because he did not bother to check for
statistical significance. The difference in performance observed may be due to plain
old randomness. Thus, there’s a high probability that he may end up with an
inferior website colour.

• You will tackle this problem through an A/B test


• Null hypothesis (H0): Visitors that receive Layout B will not have higher end-of-visit
conversion rates compared to visitors that receive Layout A
• Alternative hypothesis (H1): Visitors that receive Layout B will have higher end-of-
visit conversion rates compared to visitors that receive layout A
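
John's red and blue counts can be checked with a one-sided two-proportion z-test, a common way of running an A/B test on conversion rates. The sketch below assumes that test with a pooled standard error and treats red as Layout A and blue as Layout B; the function name is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def ab_test_p_value(conversions_a, visitors_a, conversions_b, visitors_b):
    """One-sided p-value for Ha: conversion rate of B > conversion rate of A."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under H0 (no difference between the two versions)
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / standard_error
    return 1 - NormalDist().cdf(z)


# Red website: 45 of 100 visitors downloaded; blue website: 47 of 100
p = ab_test_p_value(45, 100, 47, 100)
print(f"p-value = {p:.3f}")
print("Reject H0" if p <= 0.05 else "Fail to reject H0")
```

With only 100 visitors per colour, a two-point gap gives a p-value well above 0.05, so the data do not show that blue performs better.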
Hypothesis: A/ B testing

• A/B testing at Amazon

[Figure: “Buy Now” and “Shop Now” buttons]
H0: Performance of “Buy Now” = Performance of “Shop Now”
Ha: Performance of “Buy Now” ≠ Performance of “Shop Now”

• A/B testing at Upgrad

[Figure: ‘Apply Now’ and ‘Enrol Now’ buttons]
H0: ‘Apply now’ button gets more than or equal number of clicks as the ‘Enrol now’ button
Ha: ‘Apply now’ button gets fewer clicks than the ‘Enrol now’ button
Example: A/ B testing
Ola launched new coupon codes for its new users. Two coupons were provided to
facilitate the commuters.
• Coupon A: Get Rs 100 off the first ride. Book online now!
• Coupon B: Get an additional Rs 100 off. Book online now.
Test the claim that option B will result in a higher Click through rate or
test the claim that option B will be liked more by customers.
              Control group   Test Group
Impressions   50000           65000
Clicks        2400            2770
CTR           4.80%           4.26%

Variant B’s conversion rate (4.26%) was 11.22% lower than variant A’s conversion rate (4.80%). You can
be 95% confident that variant B will perform worse than variant A.
Power 0.00% p value 1.0000
Example: A/ B testing
Tanishq launched two ads during Diwali on YouTube to promote its
products. Both ads were measured in terms of how many people
watched the ad and how many clicked on them to visit the Tanishq store.
Using the following data, calculate whether advertisement 2 is more effective in directing
the traffic.

Advertisement 1 Advertisement 2
Impressions 343490 344200
Clicks 96720 97535
CTR 28.16% 28.34%

Variant B’s conversion rate (28.34%) was 0.63% higher than variant A’s conversion rate (28.16%). You
can be 95% confident that variant B will perform better than variant A.
Power 75.27% p value 0.0499
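
The figures reported for the Ola and Tanishq examples can be approximately reproduced with a one-sided two-proportion z-test. This is a sketch under that assumption; the method used by the online calculator is not stated in the session, and the function name is hypothetical. For Ola's Test Group, 65000 impressions are used because that is the figure consistent with the reported CTR of 4.26% (2770 / 65000).

```python
from math import sqrt
from statistics import NormalDist


def compare_ctr(clicks_a, impressions_a, clicks_b, impressions_b):
    """Return (relative change of B over A in %, one-sided p-value for B > A)."""
    ctr_a = clicks_a / impressions_a
    ctr_b = clicks_b / impressions_b
    pooled = (clicks_a + clicks_b) / (impressions_a + impressions_b)
    standard_error = sqrt(
        pooled * (1 - pooled) * (1 / impressions_a + 1 / impressions_b)
    )
    z = (ctr_b - ctr_a) / standard_error
    relative_change = (ctr_b - ctr_a) / ctr_a * 100
    return relative_change, 1 - NormalDist().cdf(z)


examples = {
    # 65000 is the impression count implied by the reported 4.26% CTR
    "Ola coupons": (2400, 50000, 2770, 65000),
    "Tanishq ads": (96720, 343490, 97535, 344200),
}
for name, counts in examples.items():
    change, p = compare_ctr(*counts)
    print(f"{name}: B vs A = {change:+.2f}%, p value = {p:.4f}")
```

Under these assumptions the Tanishq p-value falls just below 0.05, so the small uplift is barely significant at the 5% level, while the Ola p-value is close to 1 because variant B performed worse than variant A.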
Errors in Hypothesis test
              Decision
              Fail to reject H0             Reject H0
H0 (True)     Correct decision              Type I error (Alpha error)
H0 (False)    Type II error (Beta error)    Correct decision

Framing of error: Left tail test


Null hypothesis : Pizza size ≥ 9.
Alternative hypothesis : Pizza size is < 9.

• Type 1 error: the null hypothesis was true but was rejected (pizza size was ≥ 9, but H0 was rejected)
• Type 2 error: the null hypothesis was false but was not rejected (pizza size was not ≥ 9, but H0 was not rejected)
Handling Error
There are two ways of handling error:
1. Increasing confidence level of the test
a. Reduces Type 1 error
b. Increases Type 2 error
2. Increasing sample size
a. Reduces Type 2 error
b. Doesn’t affect Type 1 error
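
The trade-off listed above can be illustrated numerically. The sketch below assumes a right-tailed z-test with known population standard deviation, where the probability of a Type 2 error is β = Φ(z_critical − shift·√n / σ) and `shift` is the true amount by which the population mean exceeds the hypothesised mean. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_2_error(shift, population_sd, n, alpha):
    """Probability of failing to reject H0 in a right-tailed z-test
    when the true mean exceeds the hypothesised mean by `shift`."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha)  # fixes the Type 1 error at alpha
    return std_normal.cdf(z_critical - shift * sqrt(n) / population_sd)


# Hypothetical setting: true shift of 0.5 units, population sd of 2 units
for alpha in (0.05, 0.01):
    for n in (25, 100, 400):
        beta = type_2_error(shift=0.5, population_sd=2, n=n, alpha=alpha)
        print(f"alpha={alpha:.2f}  n={n:3d}  beta={beta:.3f}")
```

For a fixed sample size, moving from α = 0.05 to α = 0.01 (a higher confidence level) raises β, while for a fixed α a larger sample lowers β.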
• P value calculator
• http://courses.atlas.illinois.edu/spring2016/STAT/STAT200/pnormal.html

• A/B testing
• https://www.surveymonkey.com/mp/ab-testing-significance-calculator/

• Quiz3. Practice
https://forms.gle/NmAufBEXStaRagRr8
Doubts?
All the Best!

https://www.youtube.com/watch?v=Z9Gw9dIJGiA&t=86s&ab_channel=upGrad_Gmba
