Speakers
Statistical Models,
Explored and Explained
Sara Vafi, Stats Expert, Optimizely
Shana Rusonis, Product Marketing, Optimizely
Today’s Speakers
Sara Vafi Shana Rusonis
Housekeeping
• We’re recording!
• Slides and recording will be emailed to you tomorrow
• Time for questions at the end
Agenda
• Bayesian & Frequentist Statistics
• Error Control - Average vs. All Error Control
• Bayes Rule
• Benefits & Risks
• Optimizely Stats Engine
• Q&A
Why Do We Experiment?
● Experimentation is essential for learning
● Try new ideas without fear of failure
● Give your business a signal to act on in a sea of noisy data
What’s Most Important to You?
● Running experiments quickly
● But also reporting on results accurately
● Not all statistical solutions are created equal
Types of Statistical Methods
Bayesian
OR
Frequentist
Bayesian Statistics
● Bayesian statistics takes a bottom-up approach to data analysis
● The parameters are treated as unknown
● The observed data are fixed
● Analysis starts from a prior probability
● Often described as “opinion-based”
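The prior idea can be made concrete with a conjugate Beta-Binomial update, a minimal sketch (not Optimizely's implementation; the 10% prior and the visitor counts are illustrative assumptions):

```python
# A Beta(a, b) prior over an unknown conversion rate; observing k conversions
# in n visitors yields a Beta(a + k, b + n - k) posterior (conjugate update).
def update_beta(a, b, conversions, visitors):
    return a + conversions, b + visitors - conversions

def beta_mean(a, b):
    return a / (a + b)

# A weak, illustrative prior centered on a 10% conversion rate.
a, b = 1, 9
print(beta_mean(a, b))  # prior mean: 0.1

# After observing 30 conversions in 200 visitors, belief shifts toward the data.
a, b = update_beta(a, b, 30, 200)
print(beta_mean(a, b))
```

This captures the bullets above: the parameter (the conversion rate) is unknown, the data are taken as fixed, and the prior encodes an opinion before any data arrive.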
“A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.”
Frequentist Statistics
● Frequentist arguments are counterfactual in nature: assume there is no effect, then ask how surprising the data would be
● Parameters remain constant during the repeatable sampling process
● The reasoning resembles the logic lawyers use in court
● ‘Is this variation different from the control?’ is the basic building block of this approach.
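That building block is, in its simplest form, a two-proportion z-test. A minimal standard-library sketch (the conversion numbers are hypothetical):

```python
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test: is the variation different from control?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical numbers: 10.0% vs 13.0% conversion on 1,000 visitors each.
z, p = two_proportion_z(100, 1000, 130, 1000)
print(round(z, 2), round(p, 4))
```

Note the counterfactual framing in the p-value: it answers "how often would a gap this large appear if the two rates were actually the same?", not "how likely is the variation to be better."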
Example
Dan & Pete Rolling a 6-Sided Die
Scenario:
● Pete will roll a die and the outcome can either be 1, 2, 3, 4, 5, or 6
● If Pete rolls a 4, he will give Dan $1 million
If Dan was a Bayesian statistician, how would he react?
If Dan was a Frequentist statistician, how would he react?
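One hedged way to make Dan's Bayesian reaction concrete: give him a prior suspicion that the die is loaded so a 4 never comes up (so Pete never pays), and update that belief with Bayes' rule as 4-free rolls accumulate. The 10% prior is an illustrative assumption, not from the slides:

```python
def posterior_loaded(prior_loaded, rolls_without_four):
    """P(die is loaded to never show 4 | k rolls observed, none of them a 4)."""
    like_fair = (5 / 6) ** rolls_without_four  # P(no 4 in k rolls | fair die)
    like_loaded = 1.0                          # P(no 4 in k rolls | loaded die)
    num = prior_loaded * like_loaded
    return num / (num + (1 - prior_loaded) * like_fair)

# Starting from a 10% suspicion, watch it grow as 4-free rolls accumulate.
for k in (0, 6, 24):
    print(k, round(posterior_loaded(0.1, k), 3))
```

A Frequentist Dan, by contrast, would treat the die's bias as a fixed unknown and ask how improbable a 4-free run of that length would be under a fair die.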
Example
Probability of the sun exploding
Error Control
Error Control Explained
● The chance that the observed result of an experiment happened by random chance, rather than because of a change that you introduced
● Setting statistical significance to 90% means accepting a 10% chance of a statistical error: a 1 in 10 chance that the result happened by chance
Average Error Control
● Corresponds to Bayesian A/B Testing
● Less useful for iterating on test results
● Harder to learn from individual experiments with confidence
All Error Control
● Corresponds to Frequentist A/B Testing
● Every individual experiment has less than a 10% chance of a mistake
● The error rate is capped at 1 in 10 for each experiment, not just on average
Average Error Control vs. All Error Control
● Average error control loses accuracy on experiments with small improvements
● All error control remains accurate for everyone, on every experiment
● There are still cases where average error control is an appropriate alternative
Error Rates for Experiments
Bayes Rule
Average Error Control & Bayesian A/B Testing
● Requires two sources of randomness
○ Randomness or “noise” in the data
○ The makeup of the “typical” experiment group
● Distribution over experiment improvements
Different Beliefs in Composition of ‘Typical’ Experiments
Bayes Rule
Bayes Rule & Bayesian A/B Testing
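The figure from this slide is not reproduced here, but the way Bayes' rule enters A/B testing can be sketched numerically: given a prior belief about how often tested ideas are genuinely better, what fraction of declared winners are real? The 0.8 power and 0.10 significance level are illustrative assumptions:

```python
def p_real_given_winner(prior_real, power=0.8, alpha=0.10):
    """Bayes' rule: P(real improvement | declared a winner), given the base
    rate of genuinely better variations, test power, and significance level."""
    true_pos = power * prior_real
    false_pos = alpha * (1 - prior_real)
    return true_pos / (true_pos + false_pos)

# The rarer real improvements are, the more cautious you should be
# about any single 'significant' winner.
for prior in (0.1, 0.3, 0.5):
    print(prior, round(p_real_given_winner(prior), 3))
```

This is why the composition of the "typical" experiment matters so much: the same significant result means very different things under different priors.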
Bayes Rule & Average Error Value
Recap Average Error Control
Bayesian A/B Testing
Prior Distributions
Bayes Rule
All Error Control is Frequentist A/B Testing
● All error control corresponds to Frequentist A/B testing
● The aim is to control the false positive rate
● That is, the chance a no-difference experiment is called a winner or loser
Benefits & Risks
Benefits of Bayesian A/B Testing
● Average error control can be very attractive
● Helps solve the “peeking” problem
● Average error control is fast
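The "peeking" problem can be illustrated with a small simulation: run A/A data through an ordinary fixed-horizon z-test, but check after every batch of visitors and stop at the first significant result. A sketch under simplified assumptions:

```python
import random
from math import sqrt

random.seed(1)  # reproducible illustration

def peeking_aa(batches=20, batch=100, rate=0.1, z_crit=1.645):
    """A/A data checked after every batch with a fixed-horizon z-test;
    returns True if any peek (spuriously) crosses 90% significance."""
    ca = cb = n = 0
    for _ in range(batches):
        ca += sum(random.random() < rate for _ in range(batch))
        cb += sum(random.random() < rate for _ in range(batch))
        n += batch
        pooled = (ca + cb) / (2 * n)
        if pooled in (0.0, 1.0):
            continue  # no variance yet, nothing to test
        se = sqrt(pooled * (1 - pooled) * (2 / n))
        if abs(cb / n - ca / n) / se > z_crit:
            return True  # stopped early on a spurious 'winner'
    return False

trials = 500
fp_rate = sum(peeking_aa() for _ in range(trials)) / trials
print(fp_rate)  # well above the nominal 0.10
```

Checking repeatedly gives chance many opportunities to cross the threshold, which is exactly the failure mode that methods built for continuous monitoring are designed to avoid.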
Risks of Bayesian A/B Testing
● Appealing in theory, but risky in practice
● Experiments with small improvements and fast results carry high risk
● The realized error rate can be higher than the method suggests
Benefits of Frequentist A/B Testing
● This type of test will make fewer mistakes on experiments with non-zero improvements
● The rate of errors will be less than 1 in 10
● Option to speed up experimentation by using a prior
Learning from A/B Tests
Learning from A/B Tests
Risk Involved with Typical Realistic Experiments
Realistic Bayesian A/B Tests vs. Stats Engine
● The hardest experiments to call correctly are those with small improvements
● A/B testing in the wild is not easy
● We need more and more data to achieve average error control on realistic experiments
So what does this mean?
Stats Engine
Stats Engine™
Results are valid whenever you check
Avoid costly statistics errors
Measure real-time results with confidence
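One way to see how results can stay valid whenever you check is a mixture sequential probability ratio test (mSPRT), which yields a p-value that only moves downward as data accumulate. This is a generic sketch of the idea for normally distributed observations; the choices of sigma and tau are illustrative assumptions, and this is not Optimizely's actual implementation:

```python
from math import sqrt, exp

def always_valid_p(xs, sigma=1.0, tau=1.0):
    """Always-valid p-value from a mixture SPRT for the mean of normal
    observations (null: mean 0), with a N(0, tau^2) mixing distribution."""
    p, total = 1.0, 0.0
    v = sigma ** 2
    for n, x in enumerate(xs, start=1):
        total += x
        mean = total / n
        lam = sqrt(v / (v + n * tau ** 2)) * exp(
            (n ** 2) * (tau ** 2) * mean ** 2 / (2 * v * (v + n * tau ** 2))
        )
        p = min(p, 1.0 / lam)  # monotone: safe to peek at any time
    return p

print(always_valid_p([0.0] * 100))  # no effect: p stays at 1.0
print(always_valid_p([0.5] * 100))  # clear effect: p drops toward 0
```

Because the p-value is monotone in the amount of evidence, the reader can check it after every visitor without inflating the error rate, which is the property the bullet points above describe.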
Key Takeaways
● Bayesian vs. Frequentist methods
● All error control vs. average error control
● Blended approach leads to greater confidence
QUESTIONS?
THANK YOU!
