Speakers
Statistical Models,
Explored and Explained
Sara Vafi, Stats Expert, Optimizely
Shana Rusonis, Product Marketing, Optimizely
Today’s Speakers
Sara Vafi Shana Rusonis
Housekeeping
• We’re recording!
• Slides and recording will be emailed to you tomorrow
• Time for questions at the end
Agenda
• Bayesian & Frequentist Statistics
• Error Control - Average vs. All Error Control
• Bayes Rule
• Benefits & Risks
• Optimizely Stats Engine
• Q&A
Why Do We Experiment?
● Experimentation is essential for learning
● Try new ideas without fear of failure
● Give your business a signal to act on in a sea of noisy data
What’s Most Important to You?
● Running experiments quickly
● But also reporting on results accurately
● Not all statistical solutions are created equal
Types of Statistical Methods
Bayesian
OR
Frequentist
Bayesian Statistics
● Bayesian statistics takes a bottom-up approach to data analysis
● The parameters are treated as unknown
● The observed data are fixed
● Analysis starts from a prior probability
● Often described as “opinion-based”
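The prior idea can be made concrete with a conjugate Beta-Binomial update, a minimal sketch (not Optimizely's implementation; the 10% prior and the visitor counts are illustrative assumptions):

```python
# A Beta(a, b) prior over an unknown conversion rate; observing k conversions
# in n visitors yields a Beta(a + k, b + n - k) posterior (conjugate update).
def update_beta(a, b, conversions, visitors):
    return a + conversions, b + visitors - conversions

def beta_mean(a, b):
    return a / (a + b)

# A weak, illustrative prior centered on a 10% conversion rate.
a, b = 1, 9
print(beta_mean(a, b))  # prior mean: 0.1

# After observing 30 conversions in 200 visitors, belief shifts toward the data.
a, b = update_beta(a, b, 30, 200)
print(beta_mean(a, b))
```

This captures the bullets above: the parameter (the conversion rate) is unknown, the data are taken as fixed, and the prior encodes an opinion before any data arrive.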
“A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.”
Frequentist Statistics
● Frequentist arguments are counterfactual in nature: assume there is no effect, then ask how surprising the data would be
● Parameters remain constant during the repeatable sampling process
● The reasoning resembles the logic lawyers use in court
● ‘Is this variation different from the control?’ is the basic building block of this approach.
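That building block is, in its simplest form, a two-proportion z-test. A minimal standard-library sketch (the conversion numbers are hypothetical):

```python
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test: is the variation different from control?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical numbers: 10.0% vs 13.0% conversion on 1,000 visitors each.
z, p = two_proportion_z(100, 1000, 130, 1000)
print(round(z, 2), round(p, 4))
```

Note the counterfactual framing in the p-value: it answers "how often would a gap this large appear if the two rates were actually the same?", not "how likely is the variation to be better."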
Example
Dan & Pete Rolling a 6-Sided Die
Scenario:
● Pete will roll a die and the outcome can either be 1, 2, 3, 4, 5, or 6
● If Pete rolls a 4, he will give Dan $1 million
If Dan was a Bayesian statistician, how would he react?
If Dan was a Frequentist statistician, how would he react?
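One hedged way to make Dan's Bayesian reaction concrete: give him a prior suspicion that the die is loaded so a 4 never comes up (so Pete never pays), and update that belief with Bayes' rule as 4-free rolls accumulate. The 10% prior is an illustrative assumption, not from the slides:

```python
def posterior_loaded(prior_loaded, rolls_without_four):
    """P(die is loaded to never show 4 | k rolls observed, none of them a 4)."""
    like_fair = (5 / 6) ** rolls_without_four  # P(no 4 in k rolls | fair die)
    like_loaded = 1.0                          # P(no 4 in k rolls | loaded die)
    num = prior_loaded * like_loaded
    return num / (num + (1 - prior_loaded) * like_fair)

# Starting from a 10% suspicion, watch it grow as 4-free rolls accumulate.
for k in (0, 6, 24):
    print(k, round(posterior_loaded(0.1, k), 3))
```

A Frequentist Dan, by contrast, would treat the die's bias as a fixed unknown and ask how improbable a 4-free run of that length would be under a fair die.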
Example
Probability of the sun exploding
Error Control
Error Control Explained
● The chance that the observed result of an experiment happened by random chance, rather than because of a change that you introduced
● Setting statistical significance to 90% means accepting a 10% chance of a statistical error: a 1 in 10 chance that the result happened by chance
Average Error Control
● Corresponds to Bayesian A/B Testing
● Less useful for iterating on test results
● Harder to learn from individual experiments with confidence
All Error Control
● Corresponds to Frequentist A/B Testing
● Every individual experiment has less than a 10% chance of a mistake
● The error rate is capped at 1 in 10 for each experiment, not just on average
Average Error Control vs. All Error Control
● Average error control loses accuracy on experiments with small improvements
● All error control remains accurate for everyone, on every experiment
● There are still cases where average error control is an appropriate alternative
Error Rates for Experiments
Bayes Rule
Average Error Control & Bayesian A/B Testing
● Requires two sources of randomness
○ Randomness or “noise” in the data
○ The makeup of the “typical” experiment group
● Distribution over experiment improvements
Different Beliefs in Composition of ‘Typical’ Experiments
Bayes Rule
Bayes Rule & Bayesian A/B Testing
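The figure from this slide is not reproduced here, but the way Bayes' rule enters A/B testing can be sketched numerically: given a prior belief about how often tested ideas are genuinely better, what fraction of declared winners are real? The 0.8 power and 0.10 significance level are illustrative assumptions:

```python
def p_real_given_winner(prior_real, power=0.8, alpha=0.10):
    """Bayes' rule: P(real improvement | declared a winner), given the base
    rate of genuinely better variations, test power, and significance level."""
    true_pos = power * prior_real
    false_pos = alpha * (1 - prior_real)
    return true_pos / (true_pos + false_pos)

# The rarer real improvements are, the more cautious you should be
# about any single 'significant' winner.
for prior in (0.1, 0.3, 0.5):
    print(prior, round(p_real_given_winner(prior), 3))
```

This is why the composition of the "typical" experiment matters so much: the same significant result means very different things under different priors.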
Bayes Rule & Average Error Value
Recap Average Error Control
Bayesian A/B Testing
Prior Distributions
Bayes Rule
All Error Control is Frequentist A/B Testing
● All error control corresponds to Frequentist A/B testing
● The aim is to control the false positive rate
● That is, the chance a no-difference experiment is called a winner or loser
Benefits & Risks
Benefits of Bayesian A/B Testing
● Average error control can be very attractive
● Helps solve the “peeking” problem
● Average error control is fast
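The "peeking" problem can be illustrated with a small simulation: run A/A data through an ordinary fixed-horizon z-test, but check after every batch of visitors and stop at the first significant result. A sketch under simplified assumptions:

```python
import random
from math import sqrt

random.seed(1)  # reproducible illustration

def peeking_aa(batches=20, batch=100, rate=0.1, z_crit=1.645):
    """A/A data checked after every batch with a fixed-horizon z-test;
    returns True if any peek (spuriously) crosses 90% significance."""
    ca = cb = n = 0
    for _ in range(batches):
        ca += sum(random.random() < rate for _ in range(batch))
        cb += sum(random.random() < rate for _ in range(batch))
        n += batch
        pooled = (ca + cb) / (2 * n)
        if pooled in (0.0, 1.0):
            continue  # no variance yet, nothing to test
        se = sqrt(pooled * (1 - pooled) * (2 / n))
        if abs(cb / n - ca / n) / se > z_crit:
            return True  # stopped early on a spurious 'winner'
    return False

trials = 500
fp_rate = sum(peeking_aa() for _ in range(trials)) / trials
print(fp_rate)  # well above the nominal 0.10
```

Checking repeatedly gives chance many opportunities to cross the threshold, which is exactly the failure mode that methods built for continuous monitoring are designed to avoid.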
Risks of Bayesian A/B Testing
● Appealing in theory, but risky in practice
● Experiments with small improvements and fast results carry high risk
● The realized error rate can be higher than the method suggests
Benefits of Frequentist A/B Testing
● This type of test will make fewer mistakes on experiments with non-zero improvements
● The rate of errors will be less than 1 in 10
● Option to speed up experimentation by using a prior
Learning from A/B Tests
Learning from A/B Tests
Risk Involved with Typical Realistic Experiments
Realistic Bayesian A/B Tests vs. Stats Engine
● The hardest experiments to call correctly are those with small improvements
● A/B testing in the wild is not easy
● We need more and more data to achieve average error control on realistic experiments
So what does this mean?
Stats Engine
Stats Engine™
Results are valid whenever you check
Avoid costly statistics errors
Measure real-time results with confidence
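One way to see how results can stay valid whenever you check is a mixture sequential probability ratio test (mSPRT), which yields a p-value that only moves downward as data accumulate. This is a generic sketch of the idea for normally distributed observations; the choices of sigma and tau are illustrative assumptions, and this is not Optimizely's actual implementation:

```python
from math import sqrt, exp

def always_valid_p(xs, sigma=1.0, tau=1.0):
    """Always-valid p-value from a mixture SPRT for the mean of normal
    observations (null: mean 0), with a N(0, tau^2) mixing distribution."""
    p, total = 1.0, 0.0
    v = sigma ** 2
    for n, x in enumerate(xs, start=1):
        total += x
        mean = total / n
        lam = sqrt(v / (v + n * tau ** 2)) * exp(
            (n ** 2) * (tau ** 2) * mean ** 2 / (2 * v * (v + n * tau ** 2))
        )
        p = min(p, 1.0 / lam)  # monotone: safe to peek at any time
    return p

print(always_valid_p([0.0] * 100))  # no effect: p stays at 1.0
print(always_valid_p([0.5] * 100))  # clear effect: p drops toward 0
```

Because the p-value is monotone in the amount of evidence, the reader can check it after every visitor without inflating the error rate, which is the property the bullet points above describe.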
Key Takeaways
● Bayesian vs. Frequentist methods
● All error control vs. average error control
● Blended approach leads to greater confidence
QUESTIONS?
THANK YOU!
