Lecture_10
Statistical Methods
• Descriptive Statistics
• Inferential Statistics
  – Estimation
  – Hypothesis Testing
Hypothesis Testing
I believe the population mean age is 50 (hypothesis).
[Figure: a random sample is drawn from the population; its mean is X̄ = 20.]
Sample value is very far from claimed value. Reject hypothesis?
What’s a Hypothesis?
A statistical hypothesis is a statement/claim/assertion about the numerical value of a population parameter.
I believe:
- the average for ST11 is 12.7
- 13% of students are vegan
- there is no relationship between tattoo and hepatitis C
Null Hypothesis
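H0 : μ = 30 (correct: a hypothesis is about the population parameter μ), not H0 : X̄ = 30 (the sample statistic X̄ is observed, not hypothesized)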
Alternative Hypothesis
The analogy with the jury trial (where the initial assumption is
innocence) would be:
I. convicting an innocent person
II. letting a guilty person go free.
• Type I Error
  – Reject a true null hypothesis
  – Considered a serious type of error
  – The probability of a Type I Error is α
    – Called level of significance of the test
    – Set by researcher in advance
• Type II Error
  – Failure to reject a false null hypothesis
  – The probability of a Type II Error is β
    – We will not calculate it
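To see what "the probability of a Type I Error is α" means in practice, here is a minimal simulation sketch. It repeatedly draws samples from a population where H0 is actually true and counts how often a two-tailed z-test rejects it. The population values (mean 50, standard deviation 10), the sample size, the number of trials and the function name are all illustrative assumptions, not values from the lecture.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(mu0=50.0, sigma=10.0, n=25, alpha=0.05,
                      trials=20_000, seed=1):
    """Share of samples for which a TRUE H0: mu = mu0 is rejected.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # The sample really comes from a population with mean mu0, so H0 is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z_stat = (fmean(sample) - mu0) / standard_error
        if abs(z_stat) > z_crit:                   # lands in a rejection region
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Share of true null hypotheses rejected: {rate:.3f}")
```

The printed share should be close to the chosen α, which is why α is said to be set by the researcher in advance.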
The Hypothesis Testing Process
[Figure: a sample is drawn from the population; its mean, 20, is located on the sampling distribution of sample means under H0, which is centered at μ = 50.]
It is unlikely that we would get a sample mean of this value ... therefore, we reject the hypothesis that μ = 50.
Taking a decision
If the sample mean is close enough to the stated population mean, the null
hypothesis is not rejected.
If the sample mean is too far from the stated population mean, the null
hypothesis is rejected.
Then, we take a decision using either the critical value(s) method which
defines rejection region(s), or the p-value method.
Rejection Regions (Two-Tailed Test)
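[Figure: sampling distribution under H0 with two critical values. The rejection regions are the two tails, each of area α/2; the central area 1 - α between the critical values is the fail-to-reject-H0 region.]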
Rejection Region (One-Tailed Test)
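[Figure: sampling distribution under H0 with one critical value. The rejection region is a single tail of area α; the remaining area 1 - α is the fail-to-reject-H0 region.]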
p-value
2. Check assumptions/conditions
this ensures that the sampling distribution follows a given model (Normal, Student or other)
3. Find the test statistic from your data
z-score, t-score or other
5. Conclude
Reject or fail to reject null hypothesis
Interpret in context
Draw a picture! Show the sampling distribution based on the null hypothesis and where the statistic lies with respect to this distribution.
Hypothesis Tests for the Mean
Hypothesis Tests for µ
• z-test: σ known, or σ unknown but large sample
• t-test: σ unknown and small sample
Hypothesis for the Mean: z-test
The test statistic is:
$$Z_{STAT} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$
Test α 10% 5% 1%
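The critical z values for these usual significance levels can be computed rather than read from a table. The sketch below uses the standard normal inverse CDF from Python's standard library; the helper name `z_critical` and its `tails` parameter are illustrative choices, not notation from the lecture.

```python
from statistics import NormalDist

def z_critical(alpha, tails=2):
    """Positive critical z value for a one-tailed (tails=1) or
    two-tailed (tails=2) test at significance level alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test puts alpha/2 in each tail; a one-tailed test puts alpha in one.
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}   "
          f"two-tailed: ±{z_critical(alpha, 2):.3f}   "
          f"one-tailed: {z_critical(alpha, 1):.3f}")
```

For a lower-tail test the critical value is the negative of the one-tailed value; for an upper-tail test it is used as is.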
Example 1
1. State the appropriate null and alternative hypotheses
H0: µ = 30 the manufacturer’s claim holds
Ha: µ ≠ 30 the manufacturer’s claim doesn’t hold
(we will use a two-tailed test, since no direction is explicitly
indicated in the question)
$$Z_{STAT} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{29.84 - 30}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$
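The same arithmetic can be checked with a few lines of code. This is a minimal sketch using the numbers shown above (sample mean 29.84, claimed mean 30, σ = 0.8, n = 100); the function name `z_statistic` is a hypothetical helper.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """z test statistic: distance between the sample mean and the
    hypothesized mean, measured in standard errors (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

z_stat = z_statistic(sample_mean=29.84, mu0=30, sigma=0.8, n=100)
print(f"ZSTAT = {z_stat:.2f}")
```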
Example 1
4. Critical Values Method:
For α = 0.05, the critical z values are ±1.96
5. Since ZSTAT = -2.0 < -1.96, the test statistic is in the
rejection region, so we reject the null hypothesis and conclude
there is sufficient evidence that the mean diameter of a
manufactured bolt is significantly different from 30.
Example 1
4. p-value Method: how likely is it to get a ZSTAT of -2 or
something further from the mean, in either direction, if H0 is
true?
p-value = P(Z < -2.0) + P(Z > 2.0) = 0.0228 + 0.0228 = 0.0456
5. Since p-value = 0.0456 < α = 0.05, we reject the null hypothesis.
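The two-tailed p-value for Example 1 can also be obtained from the standard normal CDF instead of a z-table. The sketch below is one way to do it; the function name is a hypothetical helper. Because it does not round each tail to four decimals first, it gives about 0.0455 rather than the table-based 0.0456.

```python
from statistics import NormalDist

def two_tailed_p_value(z_stat):
    """P(|Z| >= |z_stat|) under the standard normal distribution."""
    # Area in the lower tail beyond -|z|, doubled by symmetry.
    return 2 * NormalDist().cdf(-abs(z_stat))

alpha = 0.05
p_value = two_tailed_p_value(-2.0)
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```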
Hypothesis for the Mean: t-test
The test statistic is:
$$t_{STAT} = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$$
The critical value(s) or p-value are found using a t-table, based on the
degrees of freedom df = n - 1 and the level of significance α.
The conditions required for inference are the same as for the t-interval
for the mean.
Example 2
1. State the appropriate null and alternative hypotheses
H0: µ = 168 the average cost of a hotel room in NY is $168
Ha: µ > 168 the average cost of a hotel room in NY is more
than $168
(we will use a one-tailed test, upper tail)
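4. Critical Values Method: for α = 0.05 and df = 24, the upper-tail critical value is tα = 1.711; the test statistic is tSTAT = 1.46.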
5. Since tSTAT = 1.46 < 1.711, the test statistic is NOT in the
rejection region, so we FAIL to reject the null hypothesis and
conclude there isn’t sufficient evidence that the average cost
of a hotel room in NY is more than $168.
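A rough sketch of the critical values method for this example follows. The sample mean ($172.50) and df = 24 (so n = 25) appear elsewhere in the lecture, but the sample standard deviation is not given in these notes: the value s = 15.40 below is an assumption, chosen only because it reproduces tSTAT ≈ 1.46. The critical value 1.711 is taken from the slide, not computed.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s, n):
    """t test statistic: same form as the z statistic, but with the
    sample standard deviation s in place of the population sigma."""
    return (sample_mean - mu0) / (s / sqrt(n))

# s = 15.40 is an ASSUMED value (not stated in the notes); see the lead-in.
t_stat = t_statistic(sample_mean=172.50, mu0=168, s=15.40, n=25)
t_crit = 1.711  # upper-tail critical value for alpha = 0.05, df = 24 (from the slide)

# Upper-tail test: reject only if the statistic exceeds the critical value.
decision = "reject H0" if t_stat > t_crit else "fail to reject H0"
print(f"tSTAT = {t_stat:.2f}, critical value = {t_crit} -> {decision}")
```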
Example 2
4. p-value Method: how likely is it to get a tSTAT of 1.46 or
more, if H0 is true?
For df=24, p-value is between 5% and 10%
(using a statistical package we get 7.85%)
5. Since p-value > α, we FAIL to reject the null hypothesis and
conclude there isn’t sufficient evidence that the average cost of a
hotel room in NY is more than $168.
Note about finding the p-value for a t-test
Finding the p-value using our t-table is complicated and inexact
because our t-table is incomplete.
This means that we cannot read an exact p-value for a calculated tSTAT.
We can only approximate the p-value by bounding it between two α values.
You have an example with detailed explanation on p.415 of the ebook.
To make our life simpler, we agree that it’s easier to use the critical
value(s) method with rejection region(s) for a Student t-test, unless we
can read the exact p-value using software.
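For readers curious how software gets an exact p-value, here is a minimal sketch that integrates the Student t density numerically with Simpson's rule, using only the standard library. It is an illustration under stated assumptions, not the method of any particular statistical package; the function names and the number of integration steps are arbitrary choices.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t_stat, df, steps=2000):
    """P(T > t_stat), from Simpson's rule on [0, |t_stat|] (steps must be even)."""
    width = abs(t_stat)
    h = width / steps
    total = t_pdf(0.0, df) + t_pdf(width, df)
    for i in range(1, steps):
        # Simpson weights alternate 4, 2, 4, ... on the interior points.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3            # area between 0 and |t_stat|
    # By symmetry, half of the distribution lies above 0.
    return 0.5 - area if t_stat >= 0 else 0.5 + area

p_value = t_upper_tail(1.46, df=24)
print(f"Upper-tail p-value for tSTAT = 1.46, df = 24: {p_value:.4f}")
```

The result falls between 5% and 10%, in line with the bound read from the t-table and close to the 7.85% reported by a statistical package.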
Thinking Challenge #3
A random sample of 25
boxes had a mean of 372.5
and a standard deviation of
12 grams.
Example 3
Following a marketing campaign for a new
product, a company claims that it receives
8% responses from its client mailing list.
To test this claim, a random sample of 500
clients were surveyed and 25 of them said
they responded to the mail.
Can we conclude that the client response
rate is less than claimed?
Example 3
1. State the appropriate null and alternative hypotheses
H0: p = 8% the client response rate is 8%
Ha: p < 8% the client response rate is less than 8%
(we will use a one-tailed test, lower-tail)
Example 3
4. Critical Values Method:
For α = 0.05, the critical z value is -Zα = -1.645
5. Since ZSTAT = -2.47 < -1.645, the test statistic is in the
rejection region, so we reject the null hypothesis and conclude
there is sufficient evidence that the client response rate is less
than claimed.
Example 3
4. p-value Method: how likely is it to get a ZSTAT of -2.47 or
less, if H0 is true?
p-value = P(Z < -2.47) = 0.0068
5. Since p-value = 0.0068 < α = 0.05, we reject the null hypothesis.
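The test statistic used in Example 3 can be reproduced from the problem data (25 responses out of 500 clients, claimed rate 8%). The sketch below uses the usual one-proportion z-test, with the standard error computed under H0; this formula is not written out in these notes, so treat it as an assumption that happens to reproduce ZSTAT = -2.47. The function name is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_statistic(successes, n, p0):
    """z statistic for H0: p = p0, with the standard error evaluated at p0."""
    p_hat = successes / n                      # sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)   # assumes H0 is true
    return (p_hat - p0) / standard_error

z_stat = proportion_z_statistic(successes=25, n=500, p0=0.08)
p_value = NormalDist().cdf(z_stat)             # lower-tail test: P(Z <= z_stat)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"ZSTAT = {z_stat:.2f}, p-value = {p_value:.4f} -> {decision}")
```

The p-value comes out marginally below the table value of 0.0068 because the statistic is not rounded to -2.47 before the lookup.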
When the sample data are consistent with the value of the null
hypothesis, the p-value is high and we are unable to reject the null
hypothesis.
p-value = 4.56%
Interpretation:
There is a 4.56% chance of finding a test statistic equal to or
more extreme than the one obtained from our observed sample mean
diameter of 29.84 mm, IF H0: µ = 30 is true.
p-value = 7.85%
Interpretation:
There is a 7.85% chance of finding a test statistic equal to or
more extreme than the one obtained from our observed sample mean
cost of $172.50, IF H0: µ = 168 is true.
Of course you would expect the null hypothesis value to lie OUTSIDE
this confidence interval as this is more or less equivalent to rejecting
the null hypothesis.
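As an illustration of this link, the sketch below builds a 95% z confidence interval from the Example 1 numbers (sample mean 29.84, σ = 0.8, n = 100) and checks whether the hypothesized value 30 falls inside it. Using Example 1 here is an assumption for illustration; it is not necessarily the interval the slide refers to. The function name is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z confidence interval for a population mean (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

mu0 = 30
low, high = z_confidence_interval(sample_mean=29.84, sigma=0.8, n=100)
inside = low <= mu0 <= high
print(f"95% CI: ({low:.3f}, {high:.3f}); contains mu0 = {mu0}? {inside}")
```

Here the hypothesized mean lies outside the 95% interval, which agrees with rejecting H0 in the two-tailed test at α = 0.05.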