Data Science Hypothesis Testing

This document discusses statistical hypothesis testing and inference. It provides an example of testing whether a coin is fair by flipping it multiple times and counting the heads. It explains how to set up the null and alternative hypotheses and use the binomial distribution and normal approximation to calculate p-values and confidence intervals. Bayesian inference is also briefly mentioned. Key concepts covered include statistical significance, type I and II errors, and the power of statistical tests.


Hypothesis and Inference

Lecture 7
Centre for Data Science
Institute of Technical Education and Research
Siksha ‘O’ Anusandhan (Deemed to be University)
Bhubaneswar, Odisha, 751030

Overview

1. Statistical Hypothesis Testing
2. Example: Flipping a Coin
3. p-Values
4. Confidence Intervals
5. p-Hacking
6. Example: Running an A/B Test
7. Bayesian Inference
8. References
Statistical Hypothesis Testing

Hypotheses are assertions like "this coin is fair" or "data scientists prefer Python to R" that can be translated into statistics about data.
Those statistics can be thought of as observations of random variables from known distributions.
Hypothesis testing lets us make statements about how likely those assertions are to hold.
In the classical setup:
The null hypothesis, H0, represents some default position.
The alternative hypothesis, H1, is what we'd like to compare it against.
We use statistics to decide whether we can reject H0 as false or not.
Example: Flipping a Coin

Imagine we have a coin and we want to test whether it's fair.
We'll assume the coin has some probability p of landing heads.
Our null hypothesis is that the coin is fair; that is, p = 0.5.
In particular, our test will involve flipping the coin some number of times, n, and counting the number of heads, X.
Example: Flipping a Coin

Each coin flip is a Bernoulli trial, which means that X is a Binomial(n, p) random variable, which we can approximate using the normal distribution.

```python
from typing import Tuple
import math

def normal_approximation_to_binomial(n: int, p: float) -> Tuple[float, float]:
    """Returns mu and sigma corresponding to a Binomial(n, p)"""
    mu = p * n
    sigma = math.sqrt(p * (1 - p) * n)
    return mu, sigma
```
Example: Flipping a Coin

Whenever a random variable follows a normal distribution, we can use normal_cdf to figure out the probability that its realized value lies within or outside a particular interval:

```python
from scratch.probability import normal_cdf

# The normal_cdf is the probability the variable is below a threshold
normal_probability_below = normal_cdf

# It's above the threshold if it's not below the threshold
def normal_probability_above(lo: float, mu: float = 0, sigma: float = 1) -> float:
    """The probability that an N(mu, sigma) is greater than lo."""
    return 1 - normal_cdf(lo, mu, sigma)
```
Example: Flipping a Coin

```python
# It's between if it's less than hi, but not less than lo
def normal_probability_between(lo: float, hi: float, mu: float = 0, sigma: float = 1) -> float:
    """The probability that an N(mu, sigma) is between lo and hi."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

# It's outside if it's not between
def normal_probability_outside(lo: float, hi: float, mu: float = 0, sigma: float = 1) -> float:
    """The probability that an N(mu, sigma) is not between lo and hi."""
    return 1 - normal_probability_between(lo, hi, mu, sigma)
```
Example: Flipping a Coin

We can also do the reverse: find either the nontail region or the (symmetric) interval around the mean that accounts for a certain level of likelihood.
For example, if we want to find an interval centered at the mean and containing 60% probability, then we find the cutoffs where the upper and lower tails each contain 20% of the probability (leaving the 60% in between).

```python
from scratch.probability import inverse_normal_cdf

def normal_upper_bound(probability: float, mu: float = 0, sigma: float = 1) -> float:
    """Returns the z for which P(Z <= z) = probability"""
    return inverse_normal_cdf(probability, mu, sigma)

def normal_lower_bound(probability: float, mu: float = 0, sigma: float = 1) -> float:
    """Returns the z for which P(Z >= z) = probability"""
    return inverse_normal_cdf(1 - probability, mu, sigma)
```
Example: Flipping a Coin

```python
def normal_two_sided_bounds(probability: float, mu: float = 0, sigma: float = 1) -> Tuple[float, float]:
    """Returns the symmetric (about the mean) bounds that contain the specified probability"""
    tail_probability = (1 - probability) / 2

    # upper bound should have tail_probability above it
    upper_bound = normal_lower_bound(tail_probability, mu, sigma)

    # lower bound should have tail_probability below it
    lower_bound = normal_upper_bound(tail_probability, mu, sigma)

    return lower_bound, upper_bound
```
Example: Flipping a Coin

Let's say that we choose to flip the coin n = 1,000 times.
If our hypothesis of fairness is true, X should be distributed approximately normally with mean 500 and standard deviation 15.8.

```python
mu_0, sigma_0 = normal_approximation_to_binomial(1000, 0.5)
```

We need to make a decision about significance: how willing we are to make a type 1 error ("false positive"), in which we reject H0 even though it's true.
Example: Flipping a Coin

Consider the test that rejects H0 if X falls outside the bounds given by:

```python
lower_bound, upper_bound = normal_two_sided_bounds(0.95, mu_0, sigma_0)
```

Assuming p really equals 0.5 (i.e., H0 is true), there is just a 5% chance we observe an X that lies outside this interval.
If H0 is true, then, approximately 19 times out of 20, this test will give the correct result.
Example: Flipping a Coin

We are also often interested in the power of a test, which is the probability of not making a type 2 error ("false negative"), in which we fail to reject H0 even though it's false.
In particular, let's check what happens if p is really 0.55, so that the coin is slightly biased toward heads.

```python
# 95% bounds based on assumption p is 0.5
lo, hi = normal_two_sided_bounds(0.95, mu_0, sigma_0)

# actual mu and sigma based on p = 0.55
mu_1, sigma_1 = normal_approximation_to_binomial(1000, 0.55)

# a type 2 error means we fail to reject the null hypothesis,
# which will happen when X is still in our original interval
type_2_probability = normal_probability_between(lo, hi, mu_1, sigma_1)
power = 1 - type_2_probability  # 0.887
```
p-Values

Instead of choosing bounds based on some probability cutoff, we compute the probability, assuming H0 is true, that we would see a value at least as extreme as the one we actually observed.
For our two-sided test of whether the coin is fair, we compute:

```python
def two_sided_p_value(x: float, mu: float = 0, sigma: float = 1) -> float:
    if x >= mu:
        # x is greater than the mean, so the tail is everything greater than x
        return 2 * normal_probability_above(x, mu, sigma)
    else:
        # x is less than the mean, so the tail is everything less than x
        return 2 * normal_probability_below(x, mu, sigma)
```
p-Values

If we were to see 530 heads, we would compute:

```python
two_sided_p_value(529.5, mu_0, sigma_0)  # 0.062
```

Note
Why did we use a value of 529.5 rather than 530? This is what's called a continuity correction. It reflects the fact that normal_probability_between(529.5, 530.5, mu_0, sigma_0) is a better estimate of the probability of seeing 530 heads than normal_probability_between(530, 531, mu_0, sigma_0) is.
p-Values

One way to convince yourself that this is a sensible estimate is with a simulation:

```python
import random

extreme_value_count = 0
for _ in range(1000):
    num_heads = sum(1 if random.random() < 0.5 else 0  # count # of heads
                    for _ in range(1000))              # in 1000 flips,
    if num_heads >= 530 or num_heads <= 470:           # and count how often
        extreme_value_count += 1                       # the # is 'extreme'

# p-value was 0.062 => ~62 extreme values out of 1000
assert 59 < extreme_value_count < 65, f"{extreme_value_count}"
```

Since the p-value is greater than our 5% significance level, we don't reject the null.
If we instead saw 532 heads, the p-value would be:

```python
two_sided_p_value(531.5, mu_0, sigma_0)  # 0.0463
```
p-Values

The p-value computed above is smaller than the 5% significance level, which means we would reject the null.
It's the exact same test as before; it's just a different way of approaching the statistics.
For our one-sided test, if we saw 525 heads we would compute:

```python
upper_p_value(524.5, mu_0, sigma_0)  # 0.061
```

This means we wouldn't reject the null.
If we saw 527 heads, the computation would be:

```python
upper_p_value(526.5, mu_0, sigma_0)  # 0.047
```
It means we would reject the null.
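The helper upper_p_value isn't defined on these slides; in the book's code it is simply an alias for the tail-probability helpers. A self-contained sketch, using math.erf for the normal CDF in place of the scratch.probability import (the 0.061 figure matches the slide above):

```python
import math

def normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:
    """Normal CDF via the standard library's error function."""
    return (1 + math.erf((x - mu) / (math.sqrt(2) * sigma))) / 2

def normal_probability_above(lo: float, mu: float = 0, sigma: float = 1) -> float:
    """The probability that an N(mu, sigma) is greater than lo."""
    return 1 - normal_cdf(lo, mu, sigma)

# One-sided p-values are just tail probabilities:
upper_p_value = normal_probability_above  # for H1: coin biased toward heads
lower_p_value = normal_cdf                # for H1: coin biased toward tails

mu_0, sigma_0 = 500, math.sqrt(1000 * 0.5 * 0.5)  # Binomial(1000, 0.5) approximation
print(round(upper_p_value(524.5, mu_0, sigma_0), 3))  # 0.061
```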

Confidence Intervals

We've been testing hypotheses about the value of the heads probability p, which is a parameter of the unknown "heads" distribution.
When this is the case, a third approach is to construct a confidence interval around the observed value of the parameter.
Example: we can estimate the probability of the unfair coin by looking at the average value of the Bernoulli variables corresponding to each flip: 1 if heads, 0 if tails.
If we observe 525 heads out of 1,000 flips, then we estimate p equals 0.525.
How confident can we be about this estimate?
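A minimal sketch of that estimate, using the fixed z-value 1.96 for a two-sided 95% interval in place of the normal_two_sided_bounds helper defined earlier:

```python
import math

p_hat = 525 / 1000                             # observed fraction of heads
mu = p_hat
sigma = math.sqrt(p_hat * (1 - p_hat) / 1000)  # standard error, ~0.0158

# 95% interval: mu +/- 1.96 * sigma
lo, hi = mu - 1.96 * sigma, mu + 1.96 * sigma
print(f"[{lo:.4f}, {hi:.4f}]")  # [0.4940, 0.5560]
```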

Confidence Intervals

Using the normal approximation (with mu = 0.525 and sigma = 0.0158 from this estimate), we conclude that we are "95% confident" that the following interval contains the true parameter p:

```python
normal_two_sided_bounds(0.95, mu, sigma)  # [0.4940, 0.5560]
```

Note
This is a statement about the interval, not about p. You should understand it as the assertion that if you were to repeat the experiment many times, 95% of the time the "true" parameter would lie within the observed confidence interval.

We do not conclude that the coin is unfair, since 0.5 falls within our confidence interval.
Confidence Intervals

If instead we'd seen 540 heads, then we'd have:

```python
p_hat = 540 / 1000
mu = p_hat
sigma = math.sqrt(p_hat * (1 - p_hat) / 1000)  # 0.0158
normal_two_sided_bounds(0.95, mu, sigma)       # [0.5091, 0.5709]
```

Here, "fair coin" doesn't lie in the confidence interval.
p-Hacking

A procedure that erroneously rejects the null hypothesis only 5% of the time will, by definition, 5% of the time erroneously reject the null hypothesis.
Test enough hypotheses against your dataset, and one of them will almost certainly appear significant.
Remove the right outliers, and you can probably get your p-value below 0.05.
This is sometimes called p-hacking, and it is in some ways a consequence of the "inference from p-values" framework.
p-Hacking

```python
from typing import List

def run_experiment() -> List[bool]:
    """Flips a fair coin 1000 times, True = heads, False = tails"""
    return [random.random() < 0.5 for _ in range(1000)]

def reject_fairness(experiment: List[bool]) -> bool:
    """Using the 5% significance levels"""
    num_heads = len([flip for flip in experiment if flip])
    return num_heads < 469 or num_heads > 531

random.seed(0)
experiments = [run_experiment() for _ in range(1000)]
num_rejections = len([experiment
                      for experiment in experiments
                      if reject_fairness(experiment)])

assert num_rejections == 46
```
Example: Running an A/B Test

Assume one of your advertisers has developed a new energy drink targeted at data scientists, and the VP of Advertisements wants your help choosing between advertisement A ("tastes great!") and advertisement B ("less bias!").
You decide to run an experiment by randomly showing site visitors one of the two advertisements and tracking how many people click on each one.
If 990 out of 1,000 A-viewers click their ad, while only 10 out of 1,000 B-viewers click theirs, you can be pretty confident that A is the better ad.
But what if the differences are not so stark?
Example: Running an A/B Test

Let's say that N_A people see ad A, and that n_A of them click it.
We can think of each ad view as a Bernoulli trial where p_A is the probability that someone clicks ad A.
Then n_A/N_A is approximately a normal random variable with mean p_A and standard deviation σ_A = √(p_A(1 − p_A)/N_A).
Similarly, n_B/N_B is approximately a normal random variable with mean p_B and standard deviation σ_B = √(p_B(1 − p_B)/N_B).

```python
def estimated_parameters(N: int, n: int) -> Tuple[float, float]:
    p = n / N
    sigma = math.sqrt(p * (1 - p) / N)
    return p, sigma
```
Example: Running an A/B Test

If we assume those two normals are independent, then their difference should also be normal, with mean p_A − p_B and standard deviation √(σ_A² + σ_B²).
This means we can test the null hypothesis that p_A and p_B are the same (i.e., p_A − p_B = 0):

```python
def a_b_test_statistic(N_A: int, n_A: int, N_B: int, n_B: int) -> float:
    p_A, sigma_A = estimated_parameters(N_A, n_A)
    p_B, sigma_B = estimated_parameters(N_B, n_B)
    return (p_B - p_A) / math.sqrt(sigma_A ** 2 + sigma_B ** 2)
```
Example: Running an A/B Test

For example, if "tastes great" gets 200 clicks out of 1,000 views and "less bias" gets 180 clicks out of 1,000 views, the statistic equals:

```python
z = a_b_test_statistic(1000, 200, 1000, 180)  # -1.14
```

The probability of seeing such a large difference if the means were actually equal would be:

```python
two_sided_p_value(z)  # 0.254
```

which is large enough that we can't conclude there's much of a difference.
On the other hand, if "less bias" only got 150 clicks, we'd have:

```python
z = a_b_test_statistic(1000, 200, 1000, 150)  # -2.94
two_sided_p_value(z)  # 0.003
```

which means there's only a 0.003 probability we'd see such a large difference if the ads were equally effective.
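The 200-vs-180 case can be checked end to end with a self-contained sketch, substituting math.erf for the scratch.probability normal CDF:

```python
import math

def normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:
    """Normal CDF via the standard library's error function."""
    return (1 + math.erf((x - mu) / (math.sqrt(2) * sigma))) / 2

def estimated_parameters(N: int, n: int):
    """Click-through rate and its standard error for n clicks out of N views."""
    p = n / N
    return p, math.sqrt(p * (1 - p) / N)

def a_b_test_statistic(N_A: int, n_A: int, N_B: int, n_B: int) -> float:
    """z-statistic for the difference in click-through rates."""
    p_A, sigma_A = estimated_parameters(N_A, n_A)
    p_B, sigma_B = estimated_parameters(N_B, n_B)
    return (p_B - p_A) / math.sqrt(sigma_A ** 2 + sigma_B ** 2)

z = a_b_test_statistic(1000, 200, 1000, 180)
p_value = 2 * normal_cdf(z)  # z is negative, so twice the lower tail
print(round(z, 2), round(p_value, 3))  # -1.14 0.254
```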
Bayesian Inference

An alternative approach to inference involves treating the unknown parameters themselves as random variables.
The analyst starts with a prior distribution for the parameters and then uses the observed data and Bayes's theorem to get an updated posterior distribution for the parameters.
Rather than making probability judgments about the tests, you make probability judgments about the parameters.
Bayesian Inference

For example, when the unknown parameter is a probability, we often use a prior from the Beta distribution, which puts all its probability between 0 and 1:

```python
def B(alpha: float, beta: float) -> float:
    """A normalizing constant so that the total probability is 1"""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)

def beta_pdf(x: float, alpha: float, beta: float) -> float:
    if x <= 0 or x >= 1:  # no weight outside of [0, 1]
        return 0
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / B(alpha, beta)
```

Generally speaking, this distribution centers its weight at alpha / (alpha + beta), and the larger alpha and beta are, the "tighter" the distribution is.
Bayesian Inference

For example, if alpha and beta are both 1, it's just the uniform distribution.
If alpha is much larger than beta, most of the weight is near 1.
If alpha is much smaller than beta, most of the weight is near 0.
Bayesian Inference

[Figure: Beta distribution pdfs for Beta(10,10), Beta(1,1), Beta(4,16), and Beta(16,4), plotted over values of the random variable X in [0, 1].]
Bayesian Inference

Maybe we don't want to take a stand on whether the coin is fair, and we choose alpha and beta to both equal 1.
Then we flip our coin a bunch of times and see h heads and t tails.
Bayes's theorem tells us that the posterior distribution for p is again a Beta distribution, but with parameters alpha + h and beta + t.
Let's say you flip the coin 10 times and see only 3 heads:
If you started with the uniform Beta(1, 1), your posterior distribution would be a Beta(4, 8) = Beta(1+3, 1+7), centered around 0.33.
If you started with a Beta(20, 20), your posterior distribution would be a Beta(23, 27) = Beta(20+3, 20+7), centered around 0.46.
If you started with a Beta(30, 10), your posterior distribution would be a Beta(33, 17) = Beta(30+3, 10+7), centered around 0.66.
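The conjugate update described above is a one-liner; a small sketch (helper names are ours, not from the slides) reproducing these three posteriors:

```python
from typing import Tuple

def beta_update(alpha: float, beta: float, heads: int, tails: int) -> Tuple[float, float]:
    """Beta(alpha, beta) prior plus coin-flip data gives a Beta posterior."""
    return alpha + heads, beta + tails

def beta_mean(alpha: float, beta: float) -> float:
    """Where the Beta(alpha, beta) distribution centers its weight."""
    return alpha / (alpha + beta)

# 10 flips, 3 heads / 7 tails, under each of the three priors:
for prior in [(1, 1), (20, 20), (30, 10)]:
    a, b = beta_update(*prior, heads=3, tails=7)
    print(prior, "->", (a, b), round(beta_mean(a, b), 2))
```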

Bayesian Inference

[Figure: Posterior pdfs from the three different priors after 3 heads in 10 flips: Beta(4,8), Beta(23,27), and Beta(33,17), plotted over [0, 1].]
Bayesian Inference

If you flipped the coin more and more times, the prior would matter less and less, until eventually you'd have (nearly) the same posterior distribution no matter which prior you started with.
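A quick way to see the prior wash out is to simulate many flips (here 10,000 flips of an assumed p = 0.55 coin, our example) and compare the posterior means under the three very different priors above:

```python
import random

random.seed(0)
flips = 10_000
heads = sum(1 for _ in range(flips) if random.random() < 0.55)
tails = flips - heads

# With this much data, all three posterior means land almost on top of each other:
for a0, b0 in [(1, 1), (20, 20), (30, 10)]:
    a, b = a0 + heads, b0 + tails
    print((a0, b0), "->", round(a / (a + b), 3))
```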
Using Bayesian inference to test hypotheses is considered somewhat controversial, in part because the mathematics can get somewhat complicated and in part because of the subjective nature of choosing a prior.
References

[1] Joel Grus. Data Science from Scratch: First Principles with Python. Second Edition. O'Reilly, May 2019.
Thank You
