SlideShare a Scribd company logo
The evolution of array
computing in Python
Ralf Gommers

PyData Amsterdam 2019
whoami
Maintainer of NumPy & SciPy (2010 -- )
NumFOCUS board member (2012 -- 2018)
Director of Quansight Labs (2019 -- )
You can find me at:
https://github.com/rgommers
rgommers@quansight.com
2
Public benefit division of Quansight,
providing a home for a “PyData Core
Team” and growing the community
A very brief history of array computing in Python
3
Numeric
Numarray
1995
2003
2006
2008
2012
2015 -- today
Sparse
… ?
A motivating example
4
Another motivating example
5
Do try this at home
6
$ conda create -n pydata-ams
$ conda activate pydata-ams
$ conda install numpy dask jupyterlab
$ # If you have a NVIDIA GPU. Needs CUDA installed.
$ # CuPy 6.0.0 will be out soon, then conda-installable.
$ pip install --pre cupy-cuda100 # or cupy-90 for CUDA 9
$ python -c "import numpy; print(numpy.__version__)"
1.16.3
$ python -c "import cupy; print(cupy.__version__)"
6.0.0rc1
$ python -c "import dask; print(dask.__version__)"
1.2.0
$ export NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1
The NumPy array protocols - goals
7
Separate NumPy API from NumPy “execution engine”
Allow other libraries (Dask, CuPy, PyTorch, …) to reuse
the NumPy API
Bigger picture: avoid or reduce ecosystem
fragmentation (we don’t want to see a reimplementation of SciPy for
PyTorch, SciPy for Tensorflow, etc.)
The NumPy array protocols - goals
Current state of N-dimensional arrays in Python
8
The NumPy array protocols - goals
What we’re aiming for
9
The NumPy array protocols - concept
10
Function body

(the “implementation”, defines
semantics)
Function signature (the API)
Example for one function - this can be a ufunc or a regular function.
The NumPy array protocols - concept
11
Function body
Function signature (the API)
In short: use the NumPy API, bring your own implementation
Function body
Function signature
Function signature (the API)
if input arg has __array_function__:
execute other_function
Function body
Function signature
==
Using array protocols in your own code
Suitable for code that uses NumPy functions and ndarray
methods.
Try CuPy if you need more performance on large arrays,
and Dask if you want a distributed array.
Let’s play with this in a notebook!
12
Limits to these array protocols
13
Only functions can be overridden. And not even all functions -- only the
ones with an array_like parameter. Important exceptions:
np.array, np.asarray, np.linspace, np.concatenate
NumPy’s roadmap - what’s next?
Interoperability
Roll out __array_function__,
handle subclasses better,
new protocols?
Extensibility
Easier custom dtypes
Performance
Ufunc optimizations,

more SIMD instructions,
...
np.random rewrite
Is about to be merged
Indexing
NEP 21 (oindex/vindex), for
more intuitive behavior.
Type annotations
PEP 484 / mypy compatible
annotations, see numpy-
stubs repo.
14
The array computing landscape
15
maturity
n-D arrays (a.k.a. tensors)
dataframes
CPU
GPU
NumPy
XND
CuPy
PyTorch
xtensor
CPU
GPU
Pandas
MXNet
DyND
cuDF
Apache Arrow
Tensorflow
Uarray
TensorLy
xframe
Xarray
XND
Recreates the foundations of NumPy as a number of
smaller libraries. Plus:
Variable length strings
Ragged arrays
Categorical type
Missing data support
Easy custom dtypes
Automatic multi-threading
JIT compilation (via Numba)
16
XND
17
Variable length strings:
Ragged arrays:
xtensor
A C++ library for n-D arrays. Plus:
Lazy evaluation
Performance - very fast
Can operate on NumPy arrays
Python, Julia and R bindings
JIT compilation (via Pythran)
Built on top: rray (NumPy-like arrays for R)
18
xtensor
19
Uarray
A more general solution than the NumPy array protocols
for building APIs with multiple backends.
Override any object: functions, classes, ufuncs, dtypes,
context managers, and more
Uses multiple dispatch rather than protocols.
20
Uarray
21
Thanks!
Any questions?
22

More Related Content

What's hot (20)

Scaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUsScaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUs
Travis Oliphant
 
Numba
NumbaNumba
Numba
Travis Oliphant
 
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
MLconf
 
Array computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyDataArray computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyData
Travis Oliphant
 
Python libraries
Python librariesPython libraries
Python libraries
Venkat Projects
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019
Travis Oliphant
 
PyCon Estonia 2019
PyCon Estonia 2019PyCon Estonia 2019
PyCon Estonia 2019
Travis Oliphant
 
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
MLconf
 
Standardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft PresentationStandardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft Presentation
Travis Oliphant
 
SciPy 2019: How to Accelerate an Existing Codebase with Numba
SciPy 2019: How to Accelerate an Existing Codebase with NumbaSciPy 2019: How to Accelerate an Existing Codebase with Numba
SciPy 2019: How to Accelerate an Existing Codebase with Numba
stan_seibert
 
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
MLconf
 
Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009
Enthought, Inc.
 
Pycon tw 2013
Pycon tw 2013Pycon tw 2013
Pycon tw 2013
show you
 
Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16
Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16
Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16
MLconf
 
A Map of the PyData Stack
A Map of the PyData StackA Map of the PyData Stack
A Map of the PyData Stack
Peadar Coyle
 
Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...
Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...
Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...
MLconf
 
PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...
PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...
PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...
Edureka!
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
MLconf
 
Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.
Peadar Coyle
 
Five python libraries should know for machine learning
Five python libraries should know for machine learningFive python libraries should know for machine learning
Five python libraries should know for machine learning
Naveen Davis
 
Scaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUsScaling Python to CPUs and GPUs
Scaling Python to CPUs and GPUs
Travis Oliphant
 
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016
MLconf
 
Array computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyDataArray computing and the evolution of SciPy, NumPy, and PyData
Array computing and the evolution of SciPy, NumPy, and PyData
Travis Oliphant
 
Keynote at Converge 2019
Keynote at Converge 2019Keynote at Converge 2019
Keynote at Converge 2019
Travis Oliphant
 
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
Chris Fregly, Research Scientist, PipelineIO at MLconf ATL 2016
MLconf
 
Standardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft PresentationStandardizing arrays -- Microsoft Presentation
Standardizing arrays -- Microsoft Presentation
Travis Oliphant
 
SciPy 2019: How to Accelerate an Existing Codebase with Numba
SciPy 2019: How to Accelerate an Existing Codebase with NumbaSciPy 2019: How to Accelerate an Existing Codebase with Numba
SciPy 2019: How to Accelerate an Existing Codebase with Numba
stan_seibert
 
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
Rajat Monga, Engineering Director, TensorFlow, Google at MLconf 2016
MLconf
 
Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009Scientific Computing with Python Webinar --- August 28, 2009
Scientific Computing with Python Webinar --- August 28, 2009
Enthought, Inc.
 
Pycon tw 2013
Pycon tw 2013Pycon tw 2013
Pycon tw 2013
show you
 
Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16
Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16
Jake Mannix, Lead Data Engineer, Lucidworks at MLconf SEA - 5/20/16
MLconf
 
A Map of the PyData Stack
A Map of the PyData StackA Map of the PyData Stack
A Map of the PyData Stack
Peadar Coyle
 
Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...
Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...
Mathias Brandewinder, Software Engineer & Data Scientist, Clear Lines Consult...
MLconf
 
PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...
PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...
PyTorch vs TensorFlow: The Force Is Strong With Which One? | Which One You Sh...
Edureka!
 
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
Erin LeDell, Machine Learning Scientist, H2O.ai at MLconf ATL 2016
MLconf
 
Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.Introduction to Spark: Or how I learned to love 'big data' after all.
Introduction to Spark: Or how I learned to love 'big data' after all.
Peadar Coyle
 
Five python libraries should know for machine learning
Five python libraries should know for machine learningFive python libraries should know for machine learning
Five python libraries should know for machine learning
Naveen Davis
 

Similar to The evolution of array computing in Python (20)

Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
Ian Ozsvald
 
carrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-APIcarrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-API
Yoni Davidson
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
PeterAndreasEntschev
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts
Ralf Gommers
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Databricks
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
Stijn Decubber
 
Hadoop 31-frequently-asked-interview-questions
Hadoop 31-frequently-asked-interview-questionsHadoop 31-frequently-asked-interview-questions
Hadoop 31-frequently-asked-interview-questions
Asad Masood Qazi
 
Scale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataScale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyData
Travis Oliphant
 
Getting started with Hadoop, Hive, and Elastic MapReduce
Getting started with Hadoop, Hive, and Elastic MapReduceGetting started with Hadoop, Hive, and Elastic MapReduce
Getting started with Hadoop, Hive, and Elastic MapReduce
obdit
 
Querying Network Packet Captures with Spark and Drill
Querying Network Packet Captures with Spark and DrillQuerying Network Packet Captures with Spark and Drill
Querying Network Packet Captures with Spark and Drill
Vince Gonzalez
 
A Java Implementer's Guide to Better Apache Spark Performance
A Java Implementer's Guide to Better Apache Spark PerformanceA Java Implementer's Guide to Better Apache Spark Performance
A Java Implementer's Guide to Better Apache Spark Performance
Tim Ellison
 
Strata Beijing 2017: Jumpy, a python interface for nd4j
Strata Beijing 2017: Jumpy, a python interface for nd4jStrata Beijing 2017: Jumpy, a python interface for nd4j
Strata Beijing 2017: Jumpy, a python interface for nd4j
Adam Gibson
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
Travis Oliphant
 
Making Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of RustMaking Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of Rust
ScyllaDB
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Databricks
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College London
Vitthal Gogate
 
Intro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco VasquezIntro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco Vasquez
MapR Technologies
 
Apache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and PythonApache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Christian Perone
 
NYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKeeNYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKee
Rizwan Habib
 
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
MLconf
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
Ian Ozsvald
 
carrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-APIcarrow - Go bindings to Apache Arrow via C++-API
carrow - Go bindings to Apache Arrow via C++-API
Yoni Davidson
 
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
PeterAndreasEntschev
 
__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts__array_function__ conceptual design & related concepts
__array_function__ conceptual design & related concepts
Ralf Gommers
 
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...
Databricks
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
Stijn Decubber
 
Hadoop 31-frequently-asked-interview-questions
Hadoop 31-frequently-asked-interview-questionsHadoop 31-frequently-asked-interview-questions
Hadoop 31-frequently-asked-interview-questions
Asad Masood Qazi
 
Scale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataScale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyData
Travis Oliphant
 
Getting started with Hadoop, Hive, and Elastic MapReduce
Getting started with Hadoop, Hive, and Elastic MapReduceGetting started with Hadoop, Hive, and Elastic MapReduce
Getting started with Hadoop, Hive, and Elastic MapReduce
obdit
 
Querying Network Packet Captures with Spark and Drill
Querying Network Packet Captures with Spark and DrillQuerying Network Packet Captures with Spark and Drill
Querying Network Packet Captures with Spark and Drill
Vince Gonzalez
 
A Java Implementer's Guide to Better Apache Spark Performance
A Java Implementer's Guide to Better Apache Spark PerformanceA Java Implementer's Guide to Better Apache Spark Performance
A Java Implementer's Guide to Better Apache Spark Performance
Tim Ellison
 
Strata Beijing 2017: Jumpy, a python interface for nd4j
Strata Beijing 2017: Jumpy, a python interface for nd4jStrata Beijing 2017: Jumpy, a python interface for nd4j
Strata Beijing 2017: Jumpy, a python interface for nd4j
Adam Gibson
 
Making Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of RustMaking Python 100x Faster with Less Than 100 Lines of Rust
Making Python 100x Faster with Less Than 100 Lines of Rust
ScyllaDB
 
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache SparkRunning Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Databricks
 
Apache Spark Introduction @ University College London
Apache Spark Introduction @ University College LondonApache Spark Introduction @ University College London
Apache Spark Introduction @ University College London
Vitthal Gogate
 
Intro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco VasquezIntro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco Vasquez
MapR Technologies
 
Apache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and PythonApache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Apache Spark - Intro to Large-scale recommendations with Apache Spark and Python
Christian Perone
 
NYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKeeNYAI - Scaling Machine Learning Applications by Braxton McKee
NYAI - Scaling Machine Learning Applications by Braxton McKee
Rizwan Habib
 
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
MLconf
 
Ad

More from Ralf Gommers (8)

Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdf
Ralf Gommers
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based program
Ralf Gommers
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
Ralf Gommers
 
Building SciPy kernels with Pythran
Building SciPy kernels with PythranBuilding SciPy kernels with Pythran
Building SciPy kernels with Pythran
Ralf Gommers
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond code
Ralf Gommers
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decade
Ralf Gommers
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_session
Ralf Gommers
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and Code
Ralf Gommers
 
Reliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdfReliable from-source builds (Qshare 28 Nov 2023).pdf
Reliable from-source builds (Qshare 28 Nov 2023).pdf
Ralf Gommers
 
Parallelism in a NumPy-based program
Parallelism in a NumPy-based programParallelism in a NumPy-based program
Parallelism in a NumPy-based program
Ralf Gommers
 
The road ahead for scientific computing with Python
The road ahead for scientific computing with PythonThe road ahead for scientific computing with Python
The road ahead for scientific computing with Python
Ralf Gommers
 
Building SciPy kernels with Pythran
Building SciPy kernels with PythranBuilding SciPy kernels with Pythran
Building SciPy kernels with Pythran
Ralf Gommers
 
Strengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond codeStrengthening NumPy's foundations - growing beyond code
Strengthening NumPy's foundations - growing beyond code
Ralf Gommers
 
Inside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decadeInside NumPy: preparing for the next decade
Inside NumPy: preparing for the next decade
Ralf Gommers
 
NumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_sessionNumFOCUS_Summit2018_Roadmaps_session
NumFOCUS_Summit2018_Roadmaps_session
Ralf Gommers
 
SciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and CodeSciPy 1.0 and Beyond - a Story of Community and Code
SciPy 1.0 and Beyond - a Story of Community and Code
Ralf Gommers
 
Ad

Recently uploaded (20)

Micro-Metrics Every Performance Engineer Should Validate Before Sign-Off
Micro-Metrics Every Performance Engineer Should Validate Before Sign-OffMicro-Metrics Every Performance Engineer Should Validate Before Sign-Off
Micro-Metrics Every Performance Engineer Should Validate Before Sign-Off
Tier1 app
 
Integration Ignited Redefining Event-Driven Architecture at Wix - EventCentric
Integration Ignited Redefining Event-Driven Architecture at Wix - EventCentricIntegration Ignited Redefining Event-Driven Architecture at Wix - EventCentric
Integration Ignited Redefining Event-Driven Architecture at Wix - EventCentric
Natan Silnitsky
 
How to purchase, license and subscribe to Microsoft Azure_PDF.pdf
How to purchase, license and subscribe to Microsoft Azure_PDF.pdfHow to purchase, license and subscribe to Microsoft Azure_PDF.pdf
How to purchase, license and subscribe to Microsoft Azure_PDF.pdf
victordsane
 
Essentials of Resource Planning in a Downturn
Essentials of Resource Planning in a DownturnEssentials of Resource Planning in a Downturn
Essentials of Resource Planning in a Downturn
OnePlan Solutions
 
Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3
Gaurav Sharma
 
Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...
Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...
Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...
BradBedford3
 
Generative Artificial Intelligence and its Applications
Generative Artificial Intelligence and its ApplicationsGenerative Artificial Intelligence and its Applications
Generative Artificial Intelligence and its Applications
SandeepKS52
 
Key AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence CompaniesKey AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence Companies
Mypcot Infotech
 
FME for Climate Data: Turning Big Data into Actionable Insights
FME for Climate Data: Turning Big Data into Actionable InsightsFME for Climate Data: Turning Big Data into Actionable Insights
FME for Climate Data: Turning Big Data into Actionable Insights
Safe Software
 
Best Inbound Call Tracking Software for Small Businesses
Best Inbound Call Tracking Software for Small BusinessesBest Inbound Call Tracking Software for Small Businesses
Best Inbound Call Tracking Software for Small Businesses
TheTelephony
 
Boost Student Engagement with Smart Attendance Software for Schools
Boost Student Engagement with Smart Attendance Software for SchoolsBoost Student Engagement with Smart Attendance Software for Schools
Boost Student Engagement with Smart Attendance Software for Schools
Visitu
 
Revolutionize Your Insurance Workflow with Claims Management Software
Revolutionize Your Insurance Workflow with Claims Management SoftwareRevolutionize Your Insurance Workflow with Claims Management Software
Revolutionize Your Insurance Workflow with Claims Management Software
Insurance Tech Services
 
Agile Software Engineering Methodologies
Agile Software Engineering MethodologiesAgile Software Engineering Methodologies
Agile Software Engineering Methodologies
Gaurav Sharma
 
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
WSO2
 
Leveraging Foundation Models to Infer Intents
Leveraging Foundation Models to Infer IntentsLeveraging Foundation Models to Infer Intents
Leveraging Foundation Models to Infer Intents
Keheliya Gallaba
 
Topic 26 Security Testing Considerations.pptx
Topic 26 Security Testing Considerations.pptxTopic 26 Security Testing Considerations.pptx
Topic 26 Security Testing Considerations.pptx
marutnand8
 
Artificial Intelligence Applications Across Industries
Artificial Intelligence Applications Across IndustriesArtificial Intelligence Applications Across Industries
Artificial Intelligence Applications Across Industries
SandeepKS52
 
Scalefusion Remote Access for Apple Devices
Scalefusion Remote Access for Apple DevicesScalefusion Remote Access for Apple Devices
Scalefusion Remote Access for Apple Devices
Scalefusion
 
Bonk coin airdrop_ Everything You Need to Know.pdf
Bonk coin airdrop_ Everything You Need to Know.pdfBonk coin airdrop_ Everything You Need to Know.pdf
Bonk coin airdrop_ Everything You Need to Know.pdf
Herond Labs
 
IBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - IntroductionIBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - Introduction
Gaurav Sharma
 
Micro-Metrics Every Performance Engineer Should Validate Before Sign-Off
Micro-Metrics Every Performance Engineer Should Validate Before Sign-OffMicro-Metrics Every Performance Engineer Should Validate Before Sign-Off
Micro-Metrics Every Performance Engineer Should Validate Before Sign-Off
Tier1 app
 
Integration Ignited Redefining Event-Driven Architecture at Wix - EventCentric
Integration Ignited Redefining Event-Driven Architecture at Wix - EventCentricIntegration Ignited Redefining Event-Driven Architecture at Wix - EventCentric
Integration Ignited Redefining Event-Driven Architecture at Wix - EventCentric
Natan Silnitsky
 
How to purchase, license and subscribe to Microsoft Azure_PDF.pdf
How to purchase, license and subscribe to Microsoft Azure_PDF.pdfHow to purchase, license and subscribe to Microsoft Azure_PDF.pdf
How to purchase, license and subscribe to Microsoft Azure_PDF.pdf
victordsane
 
Essentials of Resource Planning in a Downturn
Essentials of Resource Planning in a DownturnEssentials of Resource Planning in a Downturn
Essentials of Resource Planning in a Downturn
OnePlan Solutions
 
Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3
Gaurav Sharma
 
Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...
Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...
Maintaining + Optimizing Database Health: Vendors, Orchestrations, Enrichment...
BradBedford3
 
Generative Artificial Intelligence and its Applications
Generative Artificial Intelligence and its ApplicationsGenerative Artificial Intelligence and its Applications
Generative Artificial Intelligence and its Applications
SandeepKS52
 
Key AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence CompaniesKey AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence Companies
Mypcot Infotech
 
FME for Climate Data: Turning Big Data into Actionable Insights
FME for Climate Data: Turning Big Data into Actionable InsightsFME for Climate Data: Turning Big Data into Actionable Insights
FME for Climate Data: Turning Big Data into Actionable Insights
Safe Software
 
Best Inbound Call Tracking Software for Small Businesses
Best Inbound Call Tracking Software for Small BusinessesBest Inbound Call Tracking Software for Small Businesses
Best Inbound Call Tracking Software for Small Businesses
TheTelephony
 
Boost Student Engagement with Smart Attendance Software for Schools
Boost Student Engagement with Smart Attendance Software for SchoolsBoost Student Engagement with Smart Attendance Software for Schools
Boost Student Engagement with Smart Attendance Software for Schools
Visitu
 
Revolutionize Your Insurance Workflow with Claims Management Software
Revolutionize Your Insurance Workflow with Claims Management SoftwareRevolutionize Your Insurance Workflow with Claims Management Software
Revolutionize Your Insurance Workflow with Claims Management Software
Insurance Tech Services
 
Agile Software Engineering Methodologies
Agile Software Engineering MethodologiesAgile Software Engineering Methodologies
Agile Software Engineering Methodologies
Gaurav Sharma
 
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
WSO2
 
Leveraging Foundation Models to Infer Intents
Leveraging Foundation Models to Infer IntentsLeveraging Foundation Models to Infer Intents
Leveraging Foundation Models to Infer Intents
Keheliya Gallaba
 
Topic 26 Security Testing Considerations.pptx
Topic 26 Security Testing Considerations.pptxTopic 26 Security Testing Considerations.pptx
Topic 26 Security Testing Considerations.pptx
marutnand8
 
Artificial Intelligence Applications Across Industries
Artificial Intelligence Applications Across IndustriesArtificial Intelligence Applications Across Industries
Artificial Intelligence Applications Across Industries
SandeepKS52
 
Scalefusion Remote Access for Apple Devices
Scalefusion Remote Access for Apple DevicesScalefusion Remote Access for Apple Devices
Scalefusion Remote Access for Apple Devices
Scalefusion
 
Bonk coin airdrop_ Everything You Need to Know.pdf
Bonk coin airdrop_ Everything You Need to Know.pdfBonk coin airdrop_ Everything You Need to Know.pdf
Bonk coin airdrop_ Everything You Need to Know.pdf
Herond Labs
 
IBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - IntroductionIBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - Introduction
Gaurav Sharma
 

The evolution of array computing in Python

  • 1. The evolution of array computing in Python Ralf Gommers
 PyData Amsterdam 2019
  • 2. whoami Maintainer of NumPy & SciPy (2010 -- ) NumFOCUS board member (2012 -- 2018) Director of Quansight Labs (2019 -- ) You can find me at: https://github.com/rgommers [email protected] 2 Public benefit division of Quansight, providing a home for a “PyData Core Team” and growing the community
  • 3. A very brief history of array computing in Python 3 Numeric Numarray 1995 2003 2006 2008 2012 2015 -- today Sparse … ?
  • 6. Do try this at home 6 $ conda create -n pydata-ams $ conda activate pydata-ams $ conda install numpy dask jupyterlab $ # If you have a NVIDIA GPU. Needs CUDA installed. $ # CuPy 6.0.0 will be out soon, then conda-installable. $ pip install --pre cupy-cuda100 # or cupy-90 for CUDA 9 $ python -c "import numpy; print(numpy.__version__)" 1.16.3 $ python -c "import cupy; print(cupy.__version__)" 6.0.0rc1 $ python -c "import dask; print(dask.__version__)" 1.2.0 $ export NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1
  • 7. The NumPy array protocols - goals 7 Separate NumPy API from NumPy “execution engine” Allow other libraries (Dask, CuPy, PyTorch, …) to reuse the NumPy API Bigger picture: avoid or reduce ecosystem fragmentation (we don’t want to see a reimplementation of SciPy for PyTorch, SciPy for Tensorflow, etc.)
  • 8. The NumPy array protocols - goals Current state of N-dimensional arrays in Python 8
  • 9. The NumPy array protocols - goals What we’re aiming for 9
  • 10. The NumPy array protocols - concept 10 Function body
 (the “implementation”, defines semantics) Function signature (the API) Example for one function - this can be a ufunc or a regular function.
  • 11. The NumPy array protocols - concept 11 Function body Function signature (the API) In short: use the NumPy API, bring your own implementation Function body Function signature Function signature (the API) if input arg has __array_function__: execute other_function Function body Function signature ==
  • 12. Using array protocols in your own code Suitable for code that uses NumPy functions and ndarray methods. Try CuPy if you need more performance on large arrays, and Dask if you want a distributed array. Let’s play with this in a notebook! 12
  • 13. Limits to these array protocols 13 Only functions can be overridden. And not even all functions -- only the ones with an array_like parameter. Important exceptions: np.array, np.asarray, np.linspace, np.concatenate
  • 14. NumPy’s roadmap - what’s next? Interoperability Roll out __array_function__, handle subclasses better, new protocols? Extensibility Easier custom dtypes Performance Ufunc optimizations,
 more SIMD instructions, ... np.random rewrite Is about to be merged Indexing NEP 21 (oindex/vindex), for more intuitive behavior. Type annotations PEP 484 / mypy compatible annotations, see numpy- stubs repo. 14
  • 15. The array computing landscape 15 maturity n-D arrays (a.k.a. tensors) dataframes CPU GPU NumPy XND CuPy PyTorch xtensor CPU GPU Pandas MXNet DyND cuDF Apache Arrow Tensorflow Uarray TensorLy xframe Xarray
  • 16. XND Recreates the foundations of NumPy as a number of smaller libraries. Plus: Variable length strings Ragged arrays Categorical type Missing data support Easy custom dtypes Automatic multi-threading JIT compilation (via Numba) 16
  • 18. xtensor A C++ library for n-D arrays. Plus: Lazy evaluation Performance - very fast Can operate on NumPy arrays Python, Julia and R bindings JIT compilation (via Pythran) Built on top: rray (NumPy-like arrays for R) 18
  • 20. Uarray A more general solution than the NumPy array protocols for building APIs with multiple backends. Override any object: functions, classes, ufuncs, dtypes, context managers, and more Uses multiple dispatch rather than protocols. 20