Bayesian inference and big data: are we there yet? by Jose Luis Hidalgo at Big Data Spain 2017
Bayesian statistics and big data: are we there yet?
Jose Luis Hidalgo
BigData Spain 2017
Clarification of some concepts
Bayesian
- "Bayes rule"
- "Bayesian statistics" (vs. frequentist statistics)
- "Inverse probability", Fisher definition
- "Bayesian models"!
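For reference, the rule that all of these senses of "Bayesian" build on can be written as below. The notation is an assumption added here, not taken from the slides: θ stands for the model parameters and D for the observed data.

```latex
p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)},
\qquad
p(D) = \int p(D \mid \theta)\, p(\theta)\, d\theta
```

Here p(θ) is the prior, p(D | θ) the likelihood, and p(θ | D) the posterior. The integral in the denominator is the part that is rarely available in closed form.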
Inference
- In classic statistics: "inferential" vs "descriptive"
- In machine learning: "inference" vs "training"
- In Bayesian statistics: estimation of parameters from data
- … to make predictions
- … to validate the model
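As a minimal sketch of "estimation of parameters from data ... to make predictions", consider a coin-like process with an unknown success rate. The Beta prior, the counts (7 successes, 3 failures) and the class name `BetaPosterior` are illustrative assumptions, not something from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Beta(alpha, beta) belief about an unknown success rate (hypothetical helper)."""
    alpha: float
    beta: float

    def update(self, successes: int, failures: int) -> "BetaPosterior":
        # Conjugate update: observed counts are simply added to the prior parameters.
        return BetaPosterior(self.alpha + successes, self.beta + failures)

    @property
    def mean(self) -> float:
        # Posterior mean of the success rate.
        return self.alpha / (self.alpha + self.beta)

    def predict_next_success(self) -> float:
        # Posterior predictive probability that the next trial succeeds;
        # for the Beta-Bernoulli model this equals the posterior mean.
        return self.mean


prior = BetaPosterior(alpha=1.0, beta=1.0)         # flat prior (assumed)
posterior = prior.update(successes=7, failures=3)  # made-up data
print(posterior, posterior.predict_next_success())
```

The same posterior serves both purposes on the slide: predictions come from `predict_next_success`, and comparing such predictions with held-out data is one way to validate the model.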
Big Data
- As many definitions as there are vendors interested in
selling you something!
- Incremental vs. something new
- In our case: "big data" as the fact that we use increasingly
larger amounts of data to get to some information/insight (we
manage to extract weaker signals from oceans of noise)
A bit of history
Early Bayesian models
- Treated analytically
- Limited to what can be treated analytically... duh!
Nineties: MCMC
- Offers (the promise of) generic inference algorithms
- Very hard and computationally expensive
- Variational inference as an (even harder) alternative
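To make the idea of a "generic inference algorithm" concrete, here is a sketch of random-walk Metropolis, one of the simplest MCMC methods. The target (a success rate with a flat prior and 7 successes, 3 failures), the step size and the function names are assumptions chosen for illustration; the sampler only needs a function returning the unnormalized log posterior.

```python
import math
import random


def log_posterior(theta, successes=7, failures=3):
    # Unnormalized log posterior for a success rate under a flat prior (assumed model).
    if not 0.0 < theta < 1.0:
        return -math.inf
    return successes * math.log(theta) + failures * math.log(1.0 - theta)


def metropolis(log_density, start, n_samples, step, seed=0):
    """Random-walk Metropolis: propose a Gaussian move, accept or reject it."""
    rng = random.Random(seed)
    current, current_lp = start, log_density(start)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, exp(proposal_lp - current_lp)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


draws = metropolis(log_posterior, start=0.5, n_samples=20000, step=0.3)
kept = draws[2000:]                  # discard burn-in
estimate = sum(kept) / len(kept)     # should be close to the exact mean 8/12
print(round(estimate, 3))
```

Nothing in `metropolis` depends on the particular model, which is the "promise" on the slide. The cost is that every draw needs a fresh evaluation of the log density over the data, which is why the approach is computationally expensive.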
Noughties: Probabilistic programming
- Standard ways to explain probabilistic models to a
computer
- Bayesian models are a subset of probabilistic models
- JAGS, BUGS, Stan...
- Further developments: HMC, Gibbs sampling...
- Becomes quite popular in academic circles
Tens: “Practical” Probabilistic programming
- Further advances in inference: NUTS, ADVI...
- New technologies to speed up computations
- GPU parallelization
- Automatic Differentiation
- "Tall" datasets (very large number of cases)
- "Wide" datasets (very large number of features)
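Gradient-based methods such as HMC, NUTS and ADVI need derivatives of the log posterior, and automatic differentiation supplies them without hand-written calculus. The sketch below uses forward-mode differentiation with dual numbers, a deliberately simple variant chosen for brevity (production tools typically use reverse mode). The `Dual` class, the `grad` helper and the example model are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__


def _lift(x):
    # Treat plain numbers as constants (derivative zero).
    return x if isinstance(x, Dual) else Dual(float(x))


def log(x):
    # Chain rule for the natural logarithm.
    return Dual(math.log(x.value), x.deriv / x.value)


def log_posterior(theta):
    # Same assumed model as a flat-prior success rate with 7 successes, 3 failures.
    return 7 * log(theta) + 3 * log(1 - theta)


def grad(f, x):
    # Seed the input with derivative 1 and read the derivative of the output.
    return f(Dual(x, 1.0)).deriv


print(grad(log_posterior, 0.5))  # analytic value: 7/0.5 - 3/0.5
```

The model is written once as ordinary code and the derivative comes out of the same evaluation, which is what lets a probabilistic programming tool accept arbitrary user models.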
Some sample applications
From cognitive science
- Exactly the opposite of what our NN friends are trying to do!
- Models of human memory, of language understanding, etc.
- Bayesian models are very well suited to this kind of study
From fin-tech
- Large copula models become tractable using (Bayesian)
inference algorithms
From AI
- Generative image recognition systems
From business operations
- The inventory information problem
- Probabilistic model of inventory
- Enables operational optimization
Conclusions
If you are a data science practitioner
- Familiarize yourself with these kinds of models
- Learn about tools and libraries: Stan, PyMC3, Edward, etc.
If you are responsible for technical infrastructure
- Leveraging big data will require big compute...
- ... and not only for neural networks!
If you are responsible for a business
- Ask for more
- Then ask again!
Thank you!





