Data Science Apps: Beyond Notebooks
Natalino Busa
2 Natalino Busa - @natbusa
LinkedIn + Twitter + GitHub:
@natbusa
DBS
Teradata
Cognitive Finance
ING Group
O’Reilly
Philips
Icons made by Gregor Cresnar
from www.flaticon.com is licensed by CC
Learning: The Scientific Method
Ørsted's "First Introduction to General Physics" (1811)
https://en.m.wikipedia.org/wiki/History_of_scientific_method
observation hypothesis deduction synthesis
Hans Christian Ørsted
experiment
Data Scientist Experience
CloudTools Math Humans
The Jupyter Project
http://jupyter.org
Jupyter notebook: what is it?
The Jupyter Notebook
The Jupyter Notebook is a web application that
allows you to create and share documents that
contain live code, equations, visualizations and
explanatory text.
Uses include: data cleaning and
transformation, numerical simulation,
statistical modeling, machine learning and
much more.
credit : Jupyter project
extracted from http://jupyter.org/index.html
Jupyter notebook: why?
Language of choice
The Notebook has support for
over 40 programming
languages, including those
popular in Data Science such as
Python, R, Julia and Scala.
Share notebooks
Notebooks can be shared with
others using email, Dropbox,
GitHub and the Jupyter
Notebook Viewer.
Interactive widgets
Code can produce rich output
such as images, videos, LaTeX,
and JavaScript. Interactive
widgets can be used to
manipulate and visualize data in
realtime.
Big data integration
Leverage big data tools, such as
Apache Spark, from Python, R
and Scala. Explore that same
data with pandas, scikit-learn,
ggplot2, dplyr, etc.
credit : Jupyter project
extracted from http://jupyter.org/index.html
Text Cell
Code Cell
Cell Input
Cell Output
Edit, Run, Kernel, Widgets menus
Kernel Type
Cell output: ASCII, HTML, image, etc.
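The cell anatomy labelled on this slide (text cell, code cell, cell input, cell output) maps onto the JSON stored in a notebook file. The following is a minimal sketch of that layout using only the standard library; the helper names, the example cell contents, and the choice of `nbformat_minor` are illustrative assumptions, not part of the original deck.

```python
import json


def make_notebook(cells):
    """Wrap a list of cell dicts in a minimal nbformat-4 notebook document."""
    # nbformat_minor=2 is an assumed value for illustration.
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


def text_cell(markdown):
    """A text (Markdown) cell: it has a source but no outputs."""
    return {"cell_type": "markdown", "metadata": {}, "source": markdown}


def code_cell(source, stdout=None):
    """A code cell: 'source' is the cell input, 'outputs' holds the cell output."""
    outputs = []
    if stdout is not None:
        # Plain-text output is stored as a 'stream' output on stdout.
        outputs.append({"output_type": "stream", "name": "stdout", "text": stdout})
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": outputs,
        "source": source,
    }


# Hypothetical two-cell notebook: one text cell, one code cell with its output.
notebook = make_notebook([
    text_cell("# Hypothetical example notebook"),
    code_cell("print(1 + 1)", stdout="2\n"),
])
notebook_json = json.dumps(notebook, indent=1)
```

Writing `notebook_json` to a file with an `.ipynb` extension should give something the Notebook web app can open; HTML and image outputs would use other output types not shown here.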
Architecture of a Jupyter Notebook
[Diagram: Web Browser / Jupyter Notebook Web App <-> (HTTP, Websockets) <-> Jupyter Notebook Server <-> (∅MQ) <-> Kernel; the Notebook Server also reads and writes the Notebook files]
https://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html
Architecture of a Jupyter Notebook
• Modular architecture:
Web App, Server, Kernel
• Kernels:
Python, R, Scala, Bash, SQL
• Web App:
Asynchronous, rich editing, syntax highlighting, export and share
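What travels between the server and a kernel over ∅MQ is a small structured message. As a rough sketch, assuming the layout described in the Jupyter messaging documentation, a request to run code could be assembled as below; the function name, the `demo` username, and the session value are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def execute_request(code, session_id, username="demo"):
    """Build the dict form of an 'execute_request' message for a kernel."""
    header = {
        "msg_id": uuid.uuid4().hex,        # unique id for this message
        "session": session_id,             # id shared by all messages of a session
        "username": username,              # hypothetical user name
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",                  # messaging protocol version
    }
    content = {
        "code": code,                      # source to run in the kernel
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    # Every message carries the same four parts.
    return {"header": header, "parent_header": {}, "metadata": {}, "content": content}


message = execute_request("print('hello')", session_id="hypothetical-session")
```

The kernel answers with messages whose `parent_header` echoes this header, which is how a front end can match outputs to the cell that produced them.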
Jupyter Notebook
● Narratives and Use Cases
Narratives are collaborative, shareable, publishable, and reproducible. We believe that
Narratives help both yourself and other researchers by sharing your use of Jupyter
projects, technical specifics of your deployment, and installation and configuration tips so
that others can learn from your experiences.
From https://jupyter.readthedocs.io/en/latest/use-cases/content-user.html
Jupyter is more than Notebooks
“ What if I told you that the notebook
is NOT the only sort of narrative that
you can create with the Jupyter
project? ”
Examples of Jupyter-powered narratives
● O’Reilly Orioles
● Examples - build your own!
Orioles: A powerful educational narrative
Geolocated clustering and prediction
services with scikit-learn
Learn how to build a venue
recommender and a geofencing
alerting engine using geolocated data,
ML clustering algorithms, and
scikit-learn
Build your own narrative!
What do you need?
1. Understand how to communicate with the Jupyter server
Two ways: websockets or HTTP API endpoints
2. Build your own web application
Many ways: e.g. Angular, Polymer, Dart, etc.
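The two ways of talking to the server can be sketched from the client side. The example below assumes the standard Jupyter REST routes (`/api/kernels` to start a kernel, `/api/kernels/<id>/channels` for the websocket); the helper names and the `localhost:8888` address are hypothetical, and `start_kernel` is defined but not called because it needs a running server.

```python
import json
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen


def kernels_url(base_url):
    """HTTP endpoint for listing and starting kernels."""
    return base_url.rstrip("/") + "/api/kernels"


def channels_url(base_url, kernel_id):
    """Websocket endpoint carrying kernel messages for one kernel."""
    parts = urlsplit(base_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = parts.path.rstrip("/") + f"/api/kernels/{kernel_id}/channels"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


def start_kernel(base_url, kernel_name="python3"):
    """POST to the kernels endpoint and return the new kernel id."""
    body = json.dumps({"name": kernel_name}).encode("utf-8")
    request = Request(
        kernels_url(base_url),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)["id"]


# Hypothetical server address and kernel id, used only to show the URL shapes.
http_endpoint = kernels_url("http://localhost:8888/")
ws_endpoint = channels_url("http://localhost:8888", "hypothetical-kernel-id")
```

A web application would open `ws_endpoint` with the browser's websocket client, which is where Angular, Polymer, or Dart come in.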
Demos: kernel gateway
Purpose:
- Understand how to expose API endpoints
- Build your own narrative!
- Productivity gain: faster app prototyping
Jupyter Gateway: expose API endpoints
Declare the endpoint
Declare the MIME type, headers, status
GET http://localhost:8800/counters/my_counter
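In the kernel gateway's notebook-http mode, an endpoint is declared by an annotation comment on the first line of a cell, the request arrives as a JSON string named `REQUEST`, and the response body is whatever the cell prints. The sketch below mimics that flow outside the gateway so it can run on its own: the `counters` data, the function names, and the stubbed `REQUEST` are assumptions, and in a real notebook each function body would sit directly in a cell under the annotation shown in its docstring.

```python
import json

# Hypothetical in-memory state; in a seed notebook this would live in an ordinary cell.
counters = {"my_counter": 3}


def get_counter(request_json):
    """Cell body for the annotation: # GET /counters/:name"""
    request = json.loads(request_json)
    name = request["path"]["name"]          # value bound to :name in the URL
    return json.dumps({"name": name, "value": counters.get(name, 0)})


def get_counter_response_info():
    """Cell body for the annotation: # ResponseInfo GET /counters/:name"""
    # Declares MIME type, headers and status for the endpoint above.
    return json.dumps({"headers": {"Content-Type": "application/json"}, "status": 200})


# Stand-in for the REQUEST string the gateway would inject before running the cell.
REQUEST = json.dumps({"path": {"name": "my_counter"}, "args": {}, "body": "", "headers": {}})

# In the gateway, printing is what produces the HTTP response.
print(get_counter(REQUEST))
print(get_counter_response_info())
```

With the gateway serving such a notebook, the GET request on this slide would be answered from the first cell, with the content type and status taken from the second.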
Jupyter: Docker Stacks
Docker container:
Jupyter Notebook + Apache Toree
https://github.com/jupyter/docker-stacks
Dockerize your Jupyter gateway API
# Build the image and run it, publishing the gateway port (shell)
IMAGE=demos/kernel_gateway_demo
docker build -t "${IMAGE}" .
docker run -p 8888:8888 "${IMAGE}"
# Command run inside the container to start the gateway
jupyter kernelgateway \
  --KernelGatewayApp.ip=0.0.0.0 \
  --KernelGatewayApp.port=8888 \
  --KernelGatewayApp.api=notebook-http \
  --KernelGatewayApp.seed_uri=/srv/notebooks/autoscience.ipynb
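The gateway options above can also be assembled from Python, for instance when a launcher script starts the gateway. This is only a sketch: the option values are copied from the slide, while the `GATEWAY_OPTIONS` name and the `gateway_command` helper are hypothetical.

```python
import shlex

# Option values copied from the slide; the dict and helper names are hypothetical.
GATEWAY_OPTIONS = {
    "ip": "0.0.0.0",
    "port": 8888,
    "api": "notebook-http",
    "seed_uri": "/srv/notebooks/autoscience.ipynb",
}


def gateway_command(options):
    """Return the `jupyter kernelgateway` argv for the given KernelGatewayApp options."""
    argv = ["jupyter", "kernelgateway"]
    for name, value in options.items():
        argv.append(f"--KernelGatewayApp.{name}={value}")
    return argv


command = gateway_command(GATEWAY_OPTIONS)
print(shlex.join(command))

# To launch for real (assumes jupyter_kernel_gateway is installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

Keeping the options in one dictionary makes it easy to change the port or the seed notebook per environment without editing the command string by hand.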
Big Data apps:
Dockerize your Jupyter gateway API with Toree
[Diagram: Web Browser / your own Web App <-> (HTTP REST API) <-> Jupyter Kernel Gateway <-> (∅MQ) <-> Toree Kernel; Notebook files attached to the gateway; packaged in Docker containers]
One web session = one server on a cloud
Summary
• Jupyter Notebook is a great way to create and share
data-driven use cases and projects
• Jupyter is more than notebooks
– gateway, kernels, hub, etc.
• Narratives powered by Jupyter
– O’Reilly Orioles
– build your own narrative
Resources
Jupyter
http://jupyter.org/index.html
https://jupyter.readthedocs.io/en/latest/index.html#
Jupyter Kernel Gateway
https://github.com/jupyter/kernel_gateway
http://jupyter-kernel-gateway.readthedocs.io/en/latest/
JupyterCon (first of its kind!)
https://conferences.oreilly.com/jupyter/jup-ny
Apache Toree (Spark Kernel)
https://toree.apache.org/
Web application dev
https://angular.io/
https://www.polymer-project.org/1.0/
Docker
https://github.com/jupyter/docker-stacks
https://www.docker.com/
LinkedIn and Twitter:
@natbusa
