1. Data Science
Time: Sep 9, 2024
Every week on Monday, until Dec 23, 2024, 16 occurrence(s)
Join Zoom Meeting
https://zoom.us/j/97619662629?pwd=vgjXmYMLObqb6NbVKeUIASv417YrGB.1
Meeting ID: 976 1966 2629
Passcode: 869227
Revised schedule:
Week 1: 2024-09-09: Chapter 1: Introduction to Data Science (Joel_1)
Week 2: 2024-09-16: Chapter 2: Python Foundations - Libraries (Joel_2,3, Wes_1,2)
Week 3: 2024-09-23: ibid. (Joel_3,4, Wes_1,2)
Week 4: 2024-09-30: Chapter 3: Statistics Foundations (Joel_4,5,7, Wes_4)
Week 5: 2024-10-07: ibid.
Week 6: 2024-10-14: Chapter 4: Probability (Joel_6,7)
Week 7: 2024-10-21: ibid.
Week 8: 2024-10-28: Chapter 5: Getting Data (Joel_9)
Week 9: 2024-11-04: Chapter 6: Working with Data (Joel_10)
Week 10: 2024-11-11: Chapter 7: Machine Learning Algorithms (Joel_11,…,19)
Week 11: 2024-11-18: ibid.
Week 12: 2024-11-25: Chapter 8: Network Analysis (Joel_21)
Week 13: 2024-12-02: Chapter 9: Recommender Systems (Joel_22)
Week 14: 2024-12-09: Chapter 10: Databases and SQL (Joel_23)
Week 15: 2024-12-16: Chapter 11: MapReduce (Joel_24)
Course Details
- Chapter 1: Introduction to Data Science
- Chapter 2: Python Foundations - Libraries
Pandas, NumPy, Arrays and Matrix handling, Data Visualization, Exploratory Data Analysis (EDA)
- Chapter 3: Statistics Foundations
Basic/Descriptive Statistics, Distributions (Binomial, Poisson, etc.), Bayes, Inferential Statistics
- Chapter 4: Probability
Dependence and Independence, Conditional Probability, Bayes’s Theorem, Random Variables, The Normal Distribution
- Chapter 5: Getting Data
Reading Files, Scraping the Web, Using APIs
- Chapter 6: Working with Data
Exploring Your Data, Cleaning and Munging, Manipulating Data, Rescaling, Dimensionality Reduction
- Chapter 7: Machine Learning Algorithms
k-Nearest Neighbors, Naive Bayes, Linear Regression, Multiple Regression, Logistic Regression, Decision Trees, Neural
Networks, Clustering.
- Chapter 8: Network Analysis
Examples (data as a network versus a network used to represent dependence among variables), determining important nodes and edges
in a network, clustering in a network
- Chapter 9: Recommender Systems
Manual Curation, Recommending What's Popular, User-Based Collaborative Filtering, Item-Based Collaborative Filtering
- Chapter 10: Databases and SQL
Create Table and Insert, Update, Delete, Select, Group By, Order By, Join, Subqueries, Indexes, Query Optimization, NoSQL.
- Chapter 11: MapReduce
Why MapReduce, Examples in Analyzing Status Updates, Examples in Matrix Multiplication
- Chapter 12: Examples from research and business cases
Week 7: 2024-10-21: (Joel_7)
- Chapter 3: Statistics Foundations
Basic/Descriptive Statistics, Distributions (Binomial, Poisson, etc.), Bayes, Inferential Statistics
[0-1], [0-2] Joel:
7. Hypothesis and Inference
Statistical Hypothesis Testing / Example: Flipping a Coin / p-Values / Confidence Intervals / p-Hacking / Example: Running an A/B Test / Bayesian Inference
https://www.scribbr.com/statistics/type-i-and-type-ii-errors/
Figure 8.2 (drawn under the assumption that H₀ is true, so that the curve centers at μ₀) [4]
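A minimal sketch of the chapter's "flipping a coin" test (assuming 1000 flips under H₀: p = 0.5 and the normal approximation to the binomial; the observed count of 530 heads is illustrative):

```python
from scipy import stats

def normal_approximation_to_binomial(n, p):
    """mu and sigma of the normal approximating Binomial(n, p)."""
    mu = n * p
    sigma = (n * p * (1 - p)) ** 0.5
    return mu, sigma

# H0: the coin is fair (p = 0.5), flipped n = 1000 times.
mu_0, sigma_0 = normal_approximation_to_binomial(1000, 0.5)

# Suppose we observe 530 heads; two-sided p-value under H0.
z = (530 - mu_0) / sigma_0
p_value = 2 * stats.norm.sf(abs(z))
print(f"p-value = {p_value:.3f}")  # about 0.06: not significant at the 5 % level
```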
p-Values [0-2]
Confidence Intervals [0-2]
p-Hacking
https://www.iro.umontreal.ca/~dift3913/cours/papers/cohen1994_The_earth_is_round.pdf
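In the spirit of the book's p-hacking discussion, a sketch of why testing many hypotheses invites spurious findings: simulate 1000 fair coins, test each at the 5 % level, and count how many look "unfair" by chance alone (all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def run_experiment(n_flips=1000):
    """Flip a fair coin n_flips times; return the number of heads."""
    return int(rng.integers(0, 2, size=n_flips).sum())

def reject_fairness(num_heads, n_flips=1000):
    """Two-sided 5 % test via the normal approximation:
    reject if the head count falls outside mu +/- 1.96*sigma."""
    mu = 0.5 * n_flips
    sigma = (n_flips * 0.25) ** 0.5
    return abs(num_heads - mu) > 1.96 * sigma

false_positives = sum(reject_fairness(run_experiment()) for _ in range(1000))
print(false_positives)  # around 50: 5 % of perfectly fair coins get "rejected"
```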
Example: Running an A/B Test
A/B testing (also known as bucket testing, split-run testing, or split testing) is
a user experience research method. A/B tests consist of a randomized
experiment that usually involves two variants (A and B), although the concept
can be also extended to multiple variants of the same variable. It includes
application of statistical hypothesis testing or "two-sample hypothesis testing"
as used in the field of statistics. A/B testing is a way to compare multiple
versions of a single variable, for example by testing a subject's response to
variant A against variant B, and determining which of the variants is more
effective.
Example of A/B testing on a website: by randomly serving visitors two versions of a website that differ only in the design of a single button element, the relative efficacy of the two designs can be measured.
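A minimal sketch of the two-sample proportion test behind such an experiment (the visitor and click counts are hypothetical; SciPy's normal distribution supplies the two-sided p-value):

```python
import math
from scipy import stats

def two_sample_proportion_z(n_a, clicks_a, n_b, clicks_b):
    """z statistic and two-sided p-value for the difference of two rates."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    # standard error of (p_b - p_a), using each group's own variance
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = (p_b - p_a) / se
    return z, 2 * stats.norm.sf(abs(z))

# Hypothetical data: 1000 visitors per variant, 200 vs. 180 clicks.
z, p = two_sample_proportion_z(1000, 200, 1000, 180)
print(f"z = {z:.3f}, p = {p:.3f}")  # p well above 0.05: no significant difference
```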
From Greenland et al. (2016), "Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations":
Abstract: Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been
decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts
that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics
requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand
has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously
so—and yet these misinterpretations dominate much of the scientific literature. In light of this problem, we
provide definitions and a discussion of basic statistics that are more general and critical than typically found in
traditional introductory expositions. Our goal is to provide a resource for instructors, researchers, and consumers
of statistics whose knowledge of statistical theory and technique may be limited but who wish to avoid and spot
misinterpretations. We emphasize how violation of often unstated analysis protocols (such as selecting analyses
for presentation based on the P values they produce) can lead to small P values even if the declared test
hypothesis is correct, and can lead to large P values even if that hypothesis is incorrect. We then provide an
explanatory list of 25 misinterpretations of P values, confidence intervals, and power. We conclude with
guidelines for improving statistical interpretation and reporting.
Common misinterpretations of single P values
1. The P value is the probability that the test hypothesis is true; for example, if a test of the null
hypothesis gave P = 0.01, the null hypothesis has only a 1 % chance of being true; if instead it gave P =
0.40, the null hypothesis has a 40 % chance of being true. No!
2. The P value for the null hypothesis is the probability that chance alone produced the observed
association; for example, if the P value for the null hypothesis is 0.08, there is an 8 % probability that
chance alone produced the association. No!
3. A significant test result (P ≤ 0.05) means that the test hypothesis is false or should be rejected. No!
4. A nonsignificant test result (P > 0.05) means that the test hypothesis is true or should be accepted.
No!
5. A large P value is evidence in favor of the test hypothesis. No!
6. A null-hypothesis P value greater than 0.05 means that no effect was observed, or that absence of an
effect was shown or demonstrated. No!
7. Statistical significance indicates a scientifically or substantively important relation has been detected.
No!
8. Lack of statistical significance indicates that the effect size is small. No!
9. The P value is the chance of our data occurring if the test hypothesis is true; for example, P = 0.05
means that the observed association would occur only 5 % of the time under the test hypothesis. No!
10. If you reject the test hypothesis because P ≤ 0.05, the chance you are in error (the chance your
‘‘significant finding’’ is a false positive) is 5 %. No!
11. P = 0.05 and P ≤ 0.05 mean the same thing. No!
12. P values are properly reported as inequalities (e.g., report ‘‘P < 0.02’’ when P = 0.015 or report
‘‘P > 0.05’’ when P = 0.06 or P = 0.70). No!
13. Statistical significance is a property of the phenomenon being studied, and thus statistical tests detect
significance. No!
14. One should always use two-sided P values. No!
Common misinterpretations of P value comparisons and predictions
15. When the same hypothesis is tested in different studies and none or a minority of the tests are
statistically significant (all P > 0.05), the overall evidence supports the hypothesis. No!
16. When the same hypothesis is tested in two different populations and the resulting P values are on
opposite sides of 0.05, the results are conflicting. No!
17. When the same hypothesis is tested in two different populations and the same P values are obtained,
the results are in agreement. No!
18. If one observes a small P value, there is a good chance that the next study will produce a P value at
least as small for the same hypothesis. No!
Common misinterpretations of confidence intervals
19. The specific 95 % confidence interval presented by a study has a 95 % chance of containing the true
effect size. No!
20. An effect size outside the 95 % confidence interval has been refuted (or excluded) by the data. No!
21. If two confidence intervals overlap, the difference between two estimates or studies is not significant.
No!
22. An observed 95 % confidence interval predicts that 95 % of the estimates from future studies will fall
inside the observed interval. No!
23. If one 95 % confidence interval includes the null value and another excludes that value, the interval
excluding the null is the more precise one. No!
24. If you accept the null hypothesis because the null P value exceeds 0.05 and the power of your test is
90 %, the chance you are in error (the chance that your finding is a false negative) is 10 %. No!
25. If the null P value exceeds 0.05 and the power of this test is 90 % at an alternative, the results support
the null over the alternative. This claim seems intuitive to many, but counterexamples are easy to construct. No!
(The authors add the emphatic "No!" to underscore statements that are not only fallacious but
also not "true enough for practical purposes.")
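To see why item 19 fails, a small simulation sketch (assuming normally distributed data): the 95 % describes the long-run coverage of the interval-constructing procedure, not the chance that one specific interval contains the truth.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mu, sigma, n, trials = 5.0, 2.0, 30, 10_000

covered = 0
for _ in range(trials):
    sample = rng.normal(true_mu, sigma, size=n)
    # 95 % t-interval for the mean, computed from this one sample
    lo, hi = stats.t.interval(0.95, df=n - 1,
                              loc=sample.mean(), scale=stats.sem(sample))
    covered += (lo <= true_mu <= hi)

print(f"coverage: {covered / trials:.3f}")  # ~0.95 over repeated samples;
# any single interval either contains true_mu or it does not
```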
Bayesian Inference
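The chapter treats the same coin with a Beta prior, where updating is just bookkeeping; a minimal sketch (the uniform prior and the flip counts are illustrative):

```python
def bayesian_update(alpha, beta, heads, tails):
    """Beta(alpha, beta) prior + binomial data -> Beta posterior."""
    return alpha + heads, beta + tails

# Uniform prior Beta(1, 1); observe 530 heads and 470 tails.
alpha_post, beta_post = bayesian_update(1, 1, 530, 470)
posterior_mean = alpha_post / (alpha_post + beta_post)
print(f"posterior mean for p(heads) = {posterior_mean:.3f}")  # ~0.530
```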
Today's quiz, Week 7, 2024/10/21
[0] Comments on this lecture are welcome.
[1] Simulation of the Central Limit Theorem (a solution sketch follows below):
[1-1] Choose and set an initial population by combining distribution functions from SciPy.
[1-2] Draw sets of samples of various sizes, compute the sample means, and plot their distribution, similar to the figure on the next slide.
[1-3] Discuss the result and show that the CLT works.
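One possible solution sketch for item [1] (the exponential-plus-uniform mixture population and the plotting details are assumptions, not the required answer):

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

def draw_population(size):
    """Initial population: a 50/50 mixture of an exponential and a
    uniform distribution from SciPy, deliberately non-normal."""
    a = stats.expon(scale=2.0).rvs(size=size, random_state=rng)
    b = stats.uniform(loc=0.0, scale=10.0).rvs(size=size, random_state=rng)
    return np.where(rng.random(size) < 0.5, a, b)

n_samples = 5000             # repeated samples per sample size
sample_sizes = [2, 10, 50]   # sample sizes to compare

fig, axes = plt.subplots(1, len(sample_sizes), figsize=(12, 3))
for ax, n in zip(axes, sample_sizes):
    # draw n_samples samples of size n and keep each sample's mean
    means = draw_population((n_samples, n)).mean(axis=1)
    ax.hist(means, bins=50, density=True)
    # overlay the normal density the CLT predicts for large n
    mu, sd = means.mean(), means.std(ddof=1)
    grid = np.linspace(means.min(), means.max(), 200)
    ax.plot(grid, stats.norm(mu, sd).pdf(grid))
    ax.set_title(f"sample size n = {n}")
plt.tight_layout()
plt.show()
```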
[4] Douglas S. Shafer and Zhiyi Zhang, Introductory Statistics
The Central Limit Theorem (revisited)
Definition (characteristic function)
Properties (characteristic function)
The Central Limit Theorem (revisited)
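For reference, the standard definition and the CLT argument that these slide titles point to (a sketch; the slides' own notation may differ):

```latex
% Characteristic function of a random variable X:
\varphi_X(t) = \mathbb{E}\!\left[e^{itX}\right], \qquad t \in \mathbb{R}.

% CLT sketch: for i.i.d. X_1, X_2, \dots with mean \mu and variance \sigma^2,
% let Z_n = \frac{1}{\sigma\sqrt{n}} \sum_{i=1}^{n} (X_i - \mu). Then
\varphi_{Z_n}(t)
  = \left[ \varphi_{(X_1-\mu)/\sigma}\!\left( \tfrac{t}{\sqrt{n}} \right) \right]^{n}
  = \left( 1 - \frac{t^2}{2n} + o\!\left( \tfrac{1}{n} \right) \right)^{n}
  \longrightarrow e^{-t^2/2},
% the characteristic function of N(0, 1); convergence of characteristic
% functions gives Z_n \Rightarrow N(0, 1) by Lévy's continuity theorem.
```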