m09-inference

The document discusses inference and hypothesis testing, covering concepts such as test statistics, p-values, and performance evaluation metrics like false positive/negative rates and power. It also addresses the challenges of repeated tests in high-dimensional data and methods for multiple hypothesis correction, including Bonferroni and FDR adjustments. Additionally, it emphasizes the distinction between statistical significance and biological relevance.


Inference and Hypothesis Testing
Curtis Huttenhower ([email protected])
Jason Lloyd-Price ([email protected])

http://huttenhower.sph.harvard.edu/bst281
Topics
• Basic idea

• Test statistics and p-values

• Performance evaluation

• Repeated tests and high-dimensional data

03/26/2025 2
Example

• Eric has dark hair

• On the phone: The lecturer for BST 281 has blond hair

⇒ The lecturer is not Eric

Example

Test statistic: weight

• Null hypothesis: this is a chicken
• Null distribution: adult chickens weigh 6.2 ± 0.8 lb
• Observation (on the phone): this weighs 10 lb

⇒ This is probably not a chicken

• p-value: the probability of a weight at least this extreme if this is a chicken

Null distributions
• The null hypothesis is the statement tested for possible rejection
◦ Usually that the results are due to chance
◦ I.e. there is no effect/bias/relationship/etc.
◦ E.g. There is no bias towards Heads/Tails, “this is a chicken”
◦ Denoted H0

• The alternative hypothesis (HA) is everything else

• The null distribution is the distribution of the test statistic if the null
hypothesis is true
◦ E.g. binomial distribution, known distribution of chicken weights
◦ For test statistic T: P(T = t | H0)
p-values
• The p-value is the probability of observing an equal or more extreme value of
the test statistic than observed, assuming the null hypothesis
◦ For test statistic T, this is P(T ≥ t | H0)
◦ Quantifies “surprise”
◦ Lower -> observation is more unlikely -> more surprised

• When the null hypothesis is true, p-values have a uniform distribution

• p-values < α are considered “significant”

◦ I.e. reject the null hypothesis
◦ α is the fraction of tests that will be significant when the null hypothesis is true
◦ Usually 0.05 or 0.01
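
As a minimal sketch of the definition P(T ≥ t | H0), the coin-bias example can be worked exactly with a binomial null distribution. The function name `binomial_p_value` and the 16-heads-in-20-flips numbers are illustrative assumptions, not part of the lecture.

```python
from math import comb

def binomial_p_value(heads, flips, p_heads=0.5):
    """One-sided p-value P(T >= heads | H0) for a coin with P(heads) = p_heads.

    Hypothetical helper: sums the binomial null distribution over every
    outcome at least as extreme as the one observed.
    """
    return sum(
        comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )

# Illustrative data (an assumption): 16 heads in 20 flips of a supposedly fair coin.
p = binomial_p_value(16, 20)
print(f"p = {p:.6f}")          # small p -> surprising under H0
print("significant at alpha = 0.05:", p < 0.05)
```

Here the tail sums 6196 of the 2^20 equally likely flip sequences, so the p-value is about 0.0059 and the null would be rejected at α = 0.05 or 0.01.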
z-test

• If the test statistic is normally-distributed under the null

◦ z = (x - µ) / σ

• Chicken example:
◦ Adult chickens weigh 6.2 ± 0.8 lb
◦ Observed mass is 10 lb
◦ z = (10 – 6.2) / 0.8 = 4.75
◦ P(|z| ≥ 4.75 | This is a chicken) = 2.03 × 10⁻⁶
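
The chicken calculation can be reproduced with the standard library alone. This is a sketch that assumes the ± 0.8 lb is the standard deviation of a normal null distribution; the function names are hypothetical.

```python
import math

def z_score(x, mu, sigma):
    """Standardise an observation against a normal null distribution."""
    return (x - mu) / sigma

def two_sided_p(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2))

# Numbers from the slide: chickens weigh 6.2 +/- 0.8 lb, observation is 10 lb.
z = z_score(10, 6.2, 0.8)
p = two_sided_p(z)
print(f"z = {z:.2f}, two-sided p = {p:.3g}")
```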

One-tail and two-tail tests
• What observations are considered “extreme”?

• One-sided/one-tail tests test whether the test statistic is higher or lower than chance
◦ H0: µ ≥ 0, HA: µ < 0

• Two-sided/two-tail tests test whether the test statistic differs from chance in either direction
◦ H0: µ = 0, HA: µ ≠ 0
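
To make the difference between the tails concrete, the sketch below computes all three p-values for one z-score. It assumes a standard normal null distribution; `tail_p_values` is a hypothetical helper name.

```python
import math

def tail_p_values(z):
    """Return (lower, upper, two_sided) p-values for a z-score.

    lower     = P(Z <= z), used when HA says the statistic is lower than chance
    upper     = P(Z >= z), used when HA says the statistic is higher than chance
    two_sided = P(|Z| >= |z|), used when HA allows either direction
    """
    lower = 0.5 * math.erfc(-z / math.sqrt(2))
    upper = 0.5 * math.erfc(z / math.sqrt(2))
    two_sided = math.erfc(abs(z) / math.sqrt(2))
    return lower, upper, two_sided

lower, upper, two_sided = tail_p_values(1.96)
print(f"lower = {lower:.3f}, upper = {upper:.3f}, two-sided = {two_sided:.3f}")
```

For the same observation the two-sided p-value is twice the smaller one-sided p-value, so the direction of the test should be chosen before looking at the data.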

Simple parametric tests
• When the population mean + standard deviation are unknown
◦ Data is still assumed normal
◦ Null distribution is “Student’s t” distribution
◦ t-test

• Testing Pearson correlations

◦ Usually H0: ρ = 0, HA: ρ ≠ 0
◦ tanh⁻¹(r) is approximately normal (Fisher transformation)
◦ -> z-test

• Often make very strong assumptions about the data
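
A sketch of the correlation test via the Fisher transformation: under H0: ρ = 0 with roughly bivariate normal data, tanh⁻¹(r)·√(n − 3) is approximately standard normal. The function name and the example values r = 0.5, n = 28 are illustrative assumptions; for small samples an exact t-based test may give a noticeably different p-value.

```python
import math

def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 given sample correlation r.

    Uses the Fisher transformation: atanh(r) has standard error 1 / sqrt(n - 3),
    so atanh(r) * sqrt(n - 3) is treated as a z-score.
    """
    if n <= 3:
        raise ValueError("need more than 3 samples")
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical example: correlation of 0.5 measured on 28 samples.
print(f"p = {fisher_z_p_value(0.5, 28):.4f}")
```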

Nonparametric tests
• Test statistic is insensitive to the distribution (e.g. by rank transform)
◦ “Non-parametric t-test”: Mann-Whitney U test
◦ Cost is decreased sensitivity

• Permutation test
◦ p-value is calculated by repeated randomizations of the data
◦ Cost is a significant increase in computation time
◦ Can also be used to test how well your test statistic fits a specific assumption
  → Quantile-Quantile (Q-Q) plot

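
The permutation test described above can be sketched as follows. The choice of test statistic (absolute difference in group means), the function name, and the example data are assumptions made for illustration.

```python
import random

def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Group labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose statistic is at least as extreme as the observed one.
    The +1 terms count the observed labelling itself, so p is never 0.
    """
    rng = random.Random(seed)

    def statistic(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = statistic(x, y)
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:n_x], pooled[n_x:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical measurements for two groups.
control = [4.1, 3.9, 4.3, 4.0, 4.2]
treated = [5.0, 5.2, 4.9, 5.3, 5.1]
print(f"p = {permutation_test(control, treated):.4f}")
```

The computational cost is visible in the loop: every additional digit of p-value resolution needs roughly ten times more shuffles.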
Performance evaluation
• How do you assess the accuracy of a hypothesis test?

• Compare with data for which the answer is known: Gold standard
◦ Negatives: null hypothesis should not be rejected (drawn from H0)
◦ Positives: null hypothesis expected to be rejected (drawn from HA)

• Perform the test on the gold standard data. Possible outcomes:

                    H0 True                   H0 False
H0 Not Rejected     True Negative             False Negative (Type II)
H0 Rejected         False Positive (Type I)   True Positive

Performance evaluation
• Probability of an incorrect call
◦ False positive rate: FPR = P(reject H0 | H0)
◦ False negative rate: FNR = P(!reject H0 | HA)

                    H0    HA
H0 Not Rejected     TN    FN
H0 Rejected         FP    TP

• Power: probability of detecting a true effect
◦ P(reject H0 | HA)
◦ Also called recall, true positive rate (TPR), sensitivity
• Precision: probability a detected effect is true
◦ P(HA | reject H0)
• Specificity: probability that a true null is not rejected
◦ P(!reject H0 | H0)
◦ Also called true negative rate (TNR)
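
As a sketch, each of these quantities can be estimated from the four counts obtained on a gold standard. The function name and the example counts are hypothetical.

```python
def performance_metrics(tp, fp, tn, fn):
    """Estimate the rates above from gold-standard counts.

    tp, fn: gold-standard positives (HA) that were / were not rejected
    fp, tn: gold-standard negatives (H0) that were / were not rejected
    """
    return {
        "fpr": fp / (fp + tn),          # P(reject H0 | H0)
        "fnr": fn / (fn + tp),          # P(!reject H0 | HA)
        "power": tp / (tp + fn),        # P(reject H0 | HA), recall, TPR
        "precision": tp / (tp + fp),    # P(HA | reject H0)
        "specificity": tn / (tn + fp),  # P(!reject H0 | H0), TNR
    }

# Hypothetical gold standard: 100 positives and 100 negatives.
for name, value in performance_metrics(tp=80, fp=10, tn=90, fn=20).items():
    print(f"{name}: {value:.3f}")
```

Note that power = 1 − FNR and specificity = 1 − FPR, while precision also depends on how many positives and negatives the gold standard contains.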
Precision/recall plots (PR)

• Most tests have a parameter that adjusts their sensitivity
◦ E.g. α

• Precision vs recall
◦ Upper-right is good

Receiver Operating Characteristic (ROC)

• Sensitivity/specificity plots
◦ TPR vs FPR
◦ Upper left is good
◦ Well-defined behavior when guessing

• Area Under the Curve (AUC)

◦ Perfect = 1
◦ Random = 0.5
◦ Perfectly wrong = 0
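
One way to compute the AUC without drawing the curve uses the fact that it equals the probability that a randomly chosen gold-standard positive scores higher than a randomly chosen negative, with ties counted as half. The sketch below assumes each item has a score where higher means "more confidently positive" (for example 1 − p-value); the function name and scores are hypothetical.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve by pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    has the higher score; tied pairs contribute 0.5.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

print(auc([0.9, 0.8], [0.1, 0.2]))   # perfect separation
print(auc([0.5, 0.5], [0.5, 0.5]))   # uninformative scores
print(auc([0.1, 0.2], [0.9, 0.8]))   # perfectly wrong
```

The pairwise loop is quadratic in the number of items, which is fine for a small gold standard; a rank-based formulation would be preferable for large ones.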

Testing many hypotheses
• Recall that under the null hypothesis, p-values follow a uniform distribution on [0, 1]
◦ Each test rejects with probability α, even if there’s no biological effect

• Consider 20,000 genes measured at once by microarray/RNA-seq

◦ Test each gene for differences
◦ How many false positives are expected?
◦ If no gene has a true effect and α = 0.05: 20,000 × 0.05 = 1,000

• What can we do?

◦ Can change the test statistic to “maximum difference over the dataset”
◦ Null hypothesis: “no effect for any of the features”
◦ Can instead “adjust” the p-value to account for multiple hypothesis testing
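
The scale of the problem can be checked by simulation. The sketch below assumes 20,000 genes with no true effect, two groups of 5 samples per gene drawn from a standard normal, and a z-test with known σ = 1; all of these settings and the function name are illustrative assumptions.

```python
import math
import random

def count_false_positives(n_genes=20_000, n_per_group=5, alpha=0.05, seed=0):
    """Count 'significant' genes when every null hypothesis is true."""
    rng = random.Random(seed)
    # Standard error of a difference of two means, each of n samples with sigma = 1.
    se = math.sqrt(2.0 / n_per_group)
    hits = 0
    for _ in range(n_genes):
        mean_a = sum(rng.gauss(0, 1) for _ in range(n_per_group)) / n_per_group
        mean_b = sum(rng.gauss(0, 1) for _ in range(n_per_group)) / n_per_group
        z = (mean_a - mean_b) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
        if p < alpha:
            hits += 1
    return hits

print("false positives:", count_false_positives())
print("expected:", 20_000 * 0.05)
```

Because null p-values are uniform, the count lands close to α times the number of tests regardless of the sample size per gene.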
Multiple hypothesis correction

• Bonferroni correction
◦ Multiply p-values by the number of tests
◦ Very strict/conservative

• Control False Discovery Rate (FDR)

◦ FDR: expected fraction of significant tests that are false positives
◦ FDR q-value = (# tests) * (p-value) / (p-value rank)
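
Both corrections can be sketched in a few lines. The q-value function applies the rank formula above and then takes a running minimum from the largest p-value down, which is the usual Benjamini-Hochberg adjustment and keeps q-values monotone in p; that extra step and the function names are additions not stated on the slide.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def fdr_q_values(pvalues):
    """Benjamini-Hochberg q-values: (# tests) * p / rank, made monotone."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0  # also caps q-values at 1
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values from four tests.
p = [0.01, 0.04, 0.03, 0.5]
print("Bonferroni:", bonferroni(p))
print("FDR q:     ", fdr_q_values(p))
```

With 5000 tests, a top-ranked p-value of 0.000252 gives 5000 × 0.000252 / 1 = 1.26, which is capped to a q-value of 1.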

A simple experiment
• Create a 5000 x 15 matrix of random values
◦ Genes and samples
◦ Each cell is =RAND()

Feature   Sample1    Sample2    Sample3    Sample4    …
Age       0.150953   0.821741   0.158316   0.898557
Gene1     0.801808   0.491848   0.608064   0.583391
Gene2     0.146415   0.922903   0.547641   0.344042
Gene3     0.219748   0.844069   0.39559    0.314302
Gene4     0.613934   0.685797   0.512878   0.986288
Gene5     0.322305   0.149985   0.106558   0.659032
Gene6     0.667529   0.508563   0.57018    0.803856
…

• Find gene most correlated with Age
• Does this gene truly predict longevity?

[Figure: scatter plot of Age (y-axis) against Gene989 (x-axis)]
◦ Pearson correlation = 0.81
◦ p-value = 0.000252
◦ q-value = 1
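
The experiment is easy to repeat in code. The sketch below draws uniform random values (mirroring =RAND()) for an Age row and 5000 gene rows over 15 samples, then reports the gene with the largest absolute Pearson correlation with Age. The seed and function names are assumptions, so the winning gene and its correlation will differ from the slide's Gene989.

```python
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def best_random_gene(n_genes=5000, n_samples=15, seed=0):
    """Return (gene index, correlation) for the random gene most correlated with Age."""
    rng = random.Random(seed)
    age = [rng.random() for _ in range(n_samples)]
    best_gene, best_r = None, 0.0
    for gene in range(n_genes):
        values = [rng.random() for _ in range(n_samples)]
        r = pearson(values, age)
        if abs(r) > abs(best_r):
            best_gene, best_r = gene, r
    return best_gene, best_r

gene, r = best_random_gene()
print(f"Gene{gene + 1}: Pearson correlation with Age = {r:.2f}")
```

Even though every value is noise, the best of 5000 candidates shows a strong correlation, which is why the adjusted q-value rather than the raw p-value is the number to trust here.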
Statistical significance vs biological significance

• Statistically significant ⇏ biologically significant

◦ Something can be statistically significant, but biologically irrelevant
◦ E.g. when sample size is very large or uncertainty is very low

• Not statistically significant ⇏ not biologically significant

◦ Especially when sample size is small
◦ “There is no difference at the effect size we’re powered to test”
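
Both directions can be illustrated with a one-sample z-test, where z = shift / (σ / √n). The shift sizes, sample sizes, and function name below are hypothetical, and σ is assumed known.

```python
import math

def mean_shift_p_value(shift, sigma, n):
    """Two-sided z-test p-value for an observed mean shift with known sigma."""
    z = shift / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

# A negligible shift (1% of a standard deviation) measured on a million samples.
tiny_effect_huge_n = mean_shift_p_value(shift=0.01, sigma=1.0, n=1_000_000)

# A sizeable shift (half a standard deviation) measured on only ten samples.
large_effect_small_n = mean_shift_p_value(shift=0.5, sigma=1.0, n=10)

print(f"tiny effect, n = 1,000,000: p = {tiny_effect_huge_n:.3g}")
print(f"large effect, n = 10:       p = {large_effect_small_n:.3g}")
```

The first case is overwhelmingly significant yet probably irrelevant in practice; the second is not significant at α = 0.05 even though the underlying shift could matter.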

Accepting the null

• Consider:
◦ Adult chickens weigh 6.2 ± 0.8 lb
◦ On the phone: This weighs 6.1 lb
◦ ⇒ This is a chicken

• Equivalent to a high p-value

• “Guilty” vs “not guilty”

Summary
• Test statistics, null hypothesis and distribution, and p-values

• Performance evaluation
◦ False positive/negative rates, power, precision and specificity
◦ Precision/recall plots
◦ Receiver Operating Characteristic (ROC) and the Area Under the Curve (AUC)

• Repeated tests and high-dimensional data

◦ Bonferroni and FDR corrections
