Visualizing Abstract Concepts in
Machine Learning 
Alexandra Johnson
Software Engineer @ SigOpt
#MachineLearning #MLViz
Visualizing Abstract Concepts in Machine Learning | 1
What is Machine Learning?
Versicolor
Setosa
Virginica
Training Data + Model -> Labels (Classification)
or Numbers (Regression)
Why is this so Intimidating?
In-browser deep neural net from playground.tensorflow.org
Hyperparameters = your
model's magic numbers
Examples: learning rate, ratio
of train to test data, number
of hidden layers, neurons per
hidden layer
Hyperparameter values must
be set before training
Solution: Hyperparameter Optimization
And four visualization challenges
Values you choose for your
hyperparameters have a
direct effect on the
performance of your model
Hard to capture interactions
of 20 hyperparameters
20 Dimensional Math is Hard
[Figure: Accuracy (y-axis, 0.2 to 1) vs. log_C (x-axis, -15 to 5)]
20 Dimensional Math is Hard
First try: graph model
performance vs
hyperparameter value
For every hyperparameter
Good for understanding
individual hyperparameters,
bad for understanding
interactions
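The "first try" above can be sketched in a few lines. This is a minimal illustration, not the code behind the slides: `toy_accuracy` is a made-up stand-in for training a model, and `gamma` is a hypothetical second hyperparameter that is simply held fixed, which is exactly why this kind of plot says nothing about interactions.

```python
def toy_accuracy(log_c, gamma):
    """Hypothetical stand-in for 'train the model, return its accuracy'."""
    # Highest (1.0) at log_c = -2 and gamma = 0.5; falls off away from that point.
    return 1.0 / (1.0 + (log_c + 2.0) ** 2 / 25.0 + (gamma - 0.5) ** 2)


def sweep(objective, name, values, fixed):
    """Vary one hyperparameter while holding the others at fixed values."""
    curve = []
    for value in values:
        params = dict(fixed)
        params[name] = value
        curve.append((value, objective(**params)))
    return curve


# One curve per hyperparameter: here log_c varies and gamma stays at 0.5.
log_c_curve = sweep(toy_accuracy, "log_c", [-15, -10, -5, 0, 5], {"gamma": 0.5})
for value, accuracy in log_c_curve:
    print(f"log_c={value:>4}  accuracy={accuracy:.3f}")
```

Repeating the sweep for every hyperparameter gives one curve each, which is the set of plots the slide describes.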
[Figure: Accuracy, scale 0.3 to 0.9]
20 Dimensional Math is Hard
Graph up to 4 dimensions at
once: x, y, z axis + color
Hard to visualize 4
dimensions at once, imagine
20!
Maybe you want to use an
algorithm to handle
hyperparameter optimization
Hyperparameter Optimization
Strategies are Different
Grid Search Random Search Bayesian Optimization
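To make the difference between the first two strategies concrete, here is a rough sketch of grid search and random search over two hyperparameters. Everything in it is assumed for illustration: `toy_accuracy`, the hyperparameter names `log_c` and `log_gamma`, and their ranges are invented. Bayesian optimization is left out because it needs a surrogate model and would not fit in a short example.

```python
import itertools
import random


def toy_accuracy(log_c, log_gamma):
    """Hypothetical stand-in for training a model; best value 1.0 at (1, -3)."""
    return 1.0 / (1.0 + (log_c - 1.0) ** 2 + (log_gamma + 3.0) ** 2)


def grid_search(objective, grid):
    """Evaluate every combination of the listed values; return (score, params)."""
    names = list(grid)
    best = None
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = objective(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


def random_search(objective, bounds, iterations, rng):
    """Sample each hyperparameter uniformly within its bounds; return (score, params)."""
    best = None
    for _ in range(iterations):
        params = {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
        score = objective(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


# Both strategies get the same budget of 25 evaluations.
grid = {"log_c": [-4, -2, 0, 2, 4], "log_gamma": [-4, -2, 0, 2, 4]}
bounds = {"log_c": (-4, 4), "log_gamma": (-4, 4)}
grid_best = grid_search(toy_accuracy, grid)
random_best = random_search(toy_accuracy, bounds, 25, random.Random(0))
print("grid search best:  ", grid_best)
print("random search best:", random_best)
```

Grid search always returns the same answer for a given grid, while random search depends on the seed.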
Some Strategies Produce
Better Results
[Figure: Distribution of Best Found Values over Experiments of 25 Iterations. x-axis: Maximum Accuracy; y-axis: Experiments]
Experiment = optimizing
hyperparameters of your
model, results in some
maximum performance
Some hyperparameter
optimization strategies are
stochastic, so you can't just
look at one experiment
Look at distribution of
maximum performance over
many experiments optimizing
hyperparameters of the same
model
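The idea of looking at a distribution rather than a single run might be sketched as follows. The objective `toy_accuracy`, the choice of random search as the stochastic strategy, and the 100 experiments are all assumptions made for this example; only the 25 iterations per experiment come from the figure title.

```python
import random
from collections import Counter


def toy_accuracy(x):
    """Hypothetical stand-in for training a model; x is one hyperparameter in [0, 1]."""
    return 1.0 - (x - 0.3) ** 2


def run_experiment(rng, iterations=25):
    """One experiment: random search for `iterations` steps, keep the best accuracy."""
    return max(toy_accuracy(rng.random()) for _ in range(iterations))


def best_value_distribution(num_experiments, seed, iterations=25):
    """Repeat the experiment many times and collect each run's best accuracy."""
    rng = random.Random(seed)
    return [run_experiment(rng, iterations) for _ in range(num_experiments)]


best_values = best_value_distribution(num_experiments=100, seed=0)

# Text histogram: number of experiments per best accuracy, rounded to 2 decimals.
histogram = Counter(round(value, 2) for value in best_values)
for accuracy in sorted(histogram):
    print(f"{accuracy:.2f} {'#' * histogram[accuracy]}")
```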
Some Strategies Produce
Better Results
[Figure: Distribution of Best Found Values over Experiments of 25 Iterations. x-axis: Maximum Accuracy; y-axis: Experiments; series: Random Search, Grid Search, Bayesian Optimization]
Use the Mann-Whitney U Test to compare distributions of
maximum performance
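For readers unfamiliar with the test, the U statistic counts, over all pairs drawn from the two samples, how often a value from the first sample beats a value from the second. The sketch below computes it directly and adds a two-sided p-value from the large-sample normal approximation; it assumes no tie correction and no continuity correction, and the sample values are invented. A library routine such as `scipy.stats.mannwhitneyu` is the more careful choice in practice.

```python
import math


def mann_whitney_u(xs, ys):
    """U statistic for xs: pairs where x > y count 1, ties count 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def normal_approx_p_value(u, n, m):
    """Two-sided p-value via the normal approximation (assumes few or no ties)."""
    mean = n * m / 2.0
    std = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean) / std
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical best-found accuracies from two strategies (made-up numbers).
strategy_a = [0.971, 0.975, 0.980, 0.982, 0.985, 0.988]
strategy_b = [0.960, 0.964, 0.969, 0.972, 0.974, 0.979]

u = mann_whitney_u(strategy_a, strategy_b)
p = normal_approx_p_value(u, len(strategy_a), len(strategy_b))
print(f"U = {u}, approximate p-value = {p:.4f}")
```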
Some Strategies Produce
Better Results, Faster
[Figure: Best Seen Trace. x-axis: Timestep; y-axis: BestSeenAccuracy]
How much time do you have
for optimization?
Strategies that reliably
produce better results faster
can optimize the
hyperparameters of your
model in less time
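A best seen trace is the running maximum of the accuracies observed so far, so it never decreases. A minimal sketch, with made-up accuracy values:

```python
from itertools import accumulate


def best_seen_trace(accuracies):
    """Running maximum: entry t is the best accuracy seen up to timestep t."""
    return list(accumulate(accuracies, max))


# Hypothetical accuracies observed at successive optimization timesteps.
observed = [0.3, 0.5, 0.4, 0.9, 0.7]
print(best_seen_trace(observed))
```

Reading the trace at timestep t tells you what you would have ended up with had you stopped optimizing at that point.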
Some Strategies Produce
Better Results, Faster
[Figure: Interquartile Range of Best Seen Traces. x-axis: Timestep; y-axis: BestSeenAccuracy]
Again, consider a distribution
of optimization experiments
25th to 75th percentile of the
performance our model
could achieve if we stopped
early
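One way to compute such a band, assuming every experiment's best seen trace has the same length, is to take the 25th and 75th percentiles across experiments at each timestep. The traces below are invented, and `statistics.quantiles` is used with its default interpolation method, which may differ slightly from the method used for the slides.

```python
import statistics


def interquartile_band(traces):
    """For each timestep, return (25th percentile, 75th percentile) across traces."""
    band = []
    for column in zip(*traces):  # all experiments' values at one timestep
        q1, _median, q3 = statistics.quantiles(column, n=4)
        band.append((q1, q3))
    return band


# Hypothetical best seen traces from five experiments, four timesteps each.
traces = [
    [0.40, 0.55, 0.70, 0.90],
    [0.35, 0.60, 0.75, 0.85],
    [0.50, 0.50, 0.80, 0.95],
    [0.45, 0.65, 0.65, 0.88],
    [0.30, 0.58, 0.72, 0.92],
]
for timestep, (q1, q3) in enumerate(interquartile_band(traces)):
    print(f"t={timestep}: 25th={q1:.3f}, 75th={q3:.3f}")
```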
Some Strategies Produce
Better Results, Faster
[Figure: Interquartile Ranges of Best Seen Traces. x-axis: Timestep; y-axis: BestSeenAccuracy; series: Grid Search, Random Search, Bayesian Optimization]
Compare the area under the
curve of different strategies
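The area under a best seen trace can be approximated with the trapezoid rule. This sketch assumes evenly spaced timesteps of width 1 and uses invented traces; a larger area means the strategy reached good accuracy earlier and held it.

```python
def area_under_trace(trace):
    """Trapezoid-rule area under a trace sampled at unit-spaced timesteps."""
    return sum((left + right) / 2.0 for left, right in zip(trace, trace[1:]))


# Hypothetical best seen traces for two strategies over the same timesteps.
fast_strategy = [0.5, 0.9, 0.95, 0.95, 0.95]
slow_strategy = [0.5, 0.6, 0.7, 0.85, 0.95]

print("fast:", area_under_trace(fast_strategy))
print("slow:", area_under_trace(slow_strategy))
```

Both traces end at the same accuracy, yet the first has the larger area because it improved sooner.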
Further reading at
sigopt.com/research
Takeaways
Hyperparameter optimization is an invaluable part of any modern
machine learning pipeline
Concepts like comparing hyperparameter optimization strategies
are extremely abstract and difficult to understand 
Visualizations are in their infancy, but are an important part of
explaining these ideas
Thank You!
Email: alexandra@sigopt.com
Twitter: @alexandraj777
www.sigopt.com

Plotcon 2016 Visualization Talk by Alexandra Johnson
