boost-histogram and hist
Henry Schreiner
April 15, 2019
Histograms in Python
1/27Henry Schreiner boost-histogram and hist April 15, 2019
Current state of histograms in Python Histograms in Python
Core library: numpy
• Historically slow
• No histogram object
• Plotting is separate
Other libraries
• Narrow focus: speed, plotting, or language
• Many are abandoned
• Poor design, backends, distribution
HistBook
Histogrammar
pygram11
rootplotlib
PyROOT
YODA
physt
fast-histogram
qhist
Vaex
hdrhistogram
multihist
matplotlib-hep
pyhistogram
histogram
SimpleHist
paida
theodoregoetz
numpy
What is needed? Histograms in Python
Design
• A histogram should be an object
• Manipulation and plotting should be easy
Performance
• Fast single threaded filling
• Multithreaded filling (since it’s 2019)
Flexibility
• Axes options: sparse, growing, labels
• Storage: integers, weights, errors…
Distribution
• Easy to use anywhere, pip or conda
• Should have wheels, be easy to build, etc.
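As an illustration of the first design point, here is a minimal pure-Python sketch of what "a histogram should be an object" means in practice. The class `Hist1D` and its methods are hypothetical names invented for this sketch, not the boost-histogram API; it only shows filling, combining, and scaling living on a single value.

```python
class Hist1D:
    """Toy 1D histogram with regular bins (hypothetical, illustration only)."""

    def __init__(self, bins, start, stop):
        self.bins, self.start, self.stop = bins, start, stop
        self.counts = [0.0] * bins

    def fill(self, values):
        width = (self.stop - self.start) / self.bins
        for v in values:
            if self.start <= v < self.stop:  # out-of-range values are dropped in this sketch
                index = min(int((v - self.start) / width), self.bins - 1)
                self.counts[index] += 1
        return self

    def _like(self, counts):
        # Build a new histogram with the same binning and the given contents.
        new = Hist1D(self.bins, self.start, self.stop)
        new.counts = counts
        return new

    def __add__(self, other):
        # Assumes both histograms share the same binning.
        return self._like([a + b for a, b in zip(self.counts, other.counts)])

    def __mul__(self, factor):
        return self._like([c * factor for c in self.counts])

    def sum(self):
        return sum(self.counts)


h1 = Hist1D(4, 0.0, 1.0).fill([0.1, 0.2, 0.6])
h2 = Hist1D(4, 0.0, 1.0).fill([0.9])
total = (h1 + h2) * 2.0
```

Because the binning travels with the counts, combining and scaling need no extra bookkeeping from the caller.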
Future of histograms in Python Histograms in Python
Core histogramming libraries boost-histogram ROOT
Universal adaptor Aghast
Front ends (plotting, etc) hist mpl-hep physt others
Boost::Histogram (C++14)
Intro to Boost::Histogram Boost::Histogram (C++14)
• Multidimensional templated header-only histogram library: /boostorg/histogram
• Designed by Hans Dembinski, inspired by ROOT, GSL, and histbook
Histogram
• Axes
• Storages
• Accumulators
Axes types
• Regular, Circular
• Variable
• Integer
• Category
[Figure: a histogram combines axes (for example a regular axis, or a regular axis with a log transform, each with optional underflow and overflow bins), a storage (static or dynamic), and an accumulator per bin (int, double, unlimited, ...).]
Boost 1.70 now released with Boost::Histogram!
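The axis concepts on this slide (regular bins, optional underflow and overflow, a log transform) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Boost::Histogram code: `regular_index` is a hypothetical helper, and the convention of returning -1 for underflow and `bins` for overflow is chosen here to mark the two flow bins.

```python
import math


def regular_index(x, bins, start, stop, transform=None):
    """Return the bin index of x on a regular axis (hypothetical helper).

    -1 means underflow and `bins` means overflow. If `transform` is given
    (for example math.log, which requires positive inputs), it is applied to
    the value and to both edges, so bins have equal width in transformed space.
    """
    if transform is not None:
        x, start, stop = transform(x), transform(start), transform(stop)
    if x < start:
        return -1
    if x >= stop:
        return bins
    return min(int((x - start) / (stop - start) * bins), bins - 1)


# Linear axis: 4 bins on [0, 1)
inside = regular_index(0.3, 4, 0.0, 1.0)
below = regular_index(-0.1, 4, 0.0, 1.0)
above = regular_index(1.0, 4, 0.0, 1.0)

# Log axis: 3 bins on [1, 1000), one bin per decade
decade = regular_index(50.0, 3, 1.0, 1000.0, transform=math.log)
```

The transform changes only where the edges fall; the index arithmetic is the same as for the plain regular axis.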
boost-histogram (Python)
Intro to the Python bindings boost-histogram (Python)
• Boost::Histogram developed with Python in mind
• Original bindings based on Boost::Python
▶ Hard to build and distribute
▶ Somewhat limited
• New bindings: /scikit-hep/boost-histogram
▶ 0-dependency build (C++14 only)
▶ State-of-the-art PyBind11
Design Flexibility Speed Distribution
Design boost-histogram (Python)
• Supports Python 2.7 and 3.4+
• 260+ unit tests run on Azure on Linux, macOS, and Windows
• Up to 16 axes supported (may go up or down)
• 1D, 2D, and ND histograms all have the same interface
Tries to stay close to the original Boost::Histogram where possible.
C++
#include <boost/histogram.hpp>
namespace bh = boost::histogram;
auto hist = bh::make_histogram(
    bh::axis::regular<>{2, 0, 1, "x"},
    bh::axis::regular<>{4, 0, 1, "y"});
hist(.2, .3);
Python
import boost.histogram as bh
hist = bh.make_histogram(
    bh.axis.regular(2, 0, 1, metadata="x"),
    bh.axis.regular(4, 0, 1, metadata="y"))
hist(.2, .3)
Design: Manipulations boost-histogram (Python)
Combine two histograms
hist1 + hist2
Scale a histogram
hist * 2.0
Project a 3D histogram to 2D
hist.project(0,1) # select axis
Sum a histogram contents
hist.sum()
Access an axis
axis0 = hist.axis(0)
axis0.edges() # The edges array
axis0.bin(1) # The bin accessors
Fill 2D histogram with values or arrays
hist(x, y)
Fill copies in 4 threads, then merge
hist.fill_threaded(4, x, y)
Fill in 4 threads (atomic storage only)
hist.fill_atomic(4, x, y)
Convert to Numpy, 0-copy
hist.view()
# Or
np.asarray(hist)
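The "fill copies in threads, then merge" strategy can be sketched with the standard library alone. This is a structural illustration, not the library's implementation: `fill_counts` and the free function `fill_threaded` are hypothetical names, and plain lists of counts stand in for histogram objects. Pure-Python threads are not expected to reproduce the speedups reported in this talk; only the split, fill, and merge structure carries over.

```python
from concurrent.futures import ThreadPoolExecutor


def fill_counts(values, bins, start, stop):
    """Fill a fresh list of counts for a regular axis (out-of-range values dropped)."""
    counts = [0] * bins
    scale = bins / (stop - start)
    for v in values:
        if start <= v < stop:
            counts[min(int((v - start) * scale), bins - 1)] += 1
    return counts


def fill_threaded(threads, values, bins, start, stop):
    """Give each thread its own copy to fill, then merge the copies bin by bin."""
    chunks = [values[i::threads] for i in range(threads)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        partials = list(pool.map(lambda chunk: fill_counts(chunk, bins, start, stop), chunks))
    # Merging is the same operation as hist1 + hist2: add matching bins.
    return [sum(column) for column in zip(*partials)]


values = [i / 1000 for i in range(1000)]
merged = fill_threaded(4, values, 10, 0.0, 1.0)
```

Each worker writes only to its own counts, so no locking is needed during the fill. The atomic variant instead shares one storage across threads and relies on atomic increments.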
Flexibility: Axis boost-histogram (Python)
• bh.axis.regular
▶ bh.axis.regular_uoflow
▶ bh.axis.regular_noflow
▶ bh.axis.regular_growth
• bh.axis.circular
• bh.axis.regular_log
• bh.axis.regular_sqrt
• bh.axis.regular_pow
• bh.axis.integer
• bh.axis.integer_noflow
• bh.axis.integer_growth
• bh.axis.variable
• bh.axis.category_int
• bh.axis.category_int_growth
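Examples (axis diagrams):
• bh.axis.regular(10,0,1): ten equal bins from 0 to 1
• bh.axis.circular(8,0,2*np.pi): eight bins around a circle, with 0 and 2𝜋 at the same point
• bh.axis.variable([0,.3,.5,1]): bin edges at 0, 0.3, 0.5, and 1
• bh.axis.integer(0,5): one bin for each of the integers 0, 1, 2, 3, 4
• bh.axis.category_int([2,5,8,3,7]): one bin for each listed category, in the order 2, 5, 8, 3, 7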
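How the non-regular axis types map a value to a bin can be sketched as follows. These are hypothetical helper functions written for illustration, not the boost-histogram implementation; the flow-bin conventions (-1 for underflow, one past the last bin for overflow or for an unknown category) are assumptions of this sketch.

```python
import bisect
import math


def variable_index(x, edges):
    """Variable-width axis: -1 is underflow, len(edges) - 1 is overflow."""
    return bisect.bisect_right(edges, x) - 1


def circular_index(x, bins, start, stop):
    """Circular axis: values wrap around the period, so no flow bins are needed."""
    period = stop - start
    wrapped = (x - start) % period
    return min(int(wrapped / period * bins), bins - 1)


def category_index(x, categories):
    """Category axis: position in the list, or len(categories) if x is not listed."""
    return categories.index(x) if x in categories else len(categories)


edges = [0, 0.3, 0.5, 1]
categories = [2, 5, 8, 3, 7]

middle_bin = variable_index(0.4, edges)
wrapped_bin = circular_index(2 * math.pi + 0.1, 8, 0.0, 2 * math.pi)
negative_bin = circular_index(-0.1, 8, 0.0, 2 * math.pi)
known = category_index(8, categories)
unknown = category_index(4, categories)
```

The variable axis uses a binary search over its edges, which costs more per fill than the arithmetic of a regular axis.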
Flexibility: Storage types boost-histogram (Python)
• bh.storage.int
• bh.storage.double
• bh.storage.unlimited (WIP)
• bh.storage.atomic_int
• bh.storage.weight (WIP)
• bh.storage.profile (WIP, needs sampled fill)
• bh.storage.weighted_profile (WIP, needs sampled fill)
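To make the difference between these storages concrete, here is a sketch of two per-bin accumulators: one for weighted fills and one for profiles. The class names `WeightedSum` and `Mean` are hypothetical stand-ins chosen for this illustration, not the library's classes. The sketch also suggests why profiles need a "sampled fill": every fill carries an extra sample value in addition to the coordinates that select the bin.

```python
class WeightedSum:
    """Per-bin accumulator for weighted fills: sum of weights and of squared weights."""

    def __init__(self):
        self.value = 0.0
        self.variance = 0.0

    def fill(self, weight=1.0):
        self.value += weight
        self.variance += weight * weight


class Mean:
    """Per-bin accumulator for a profile: running mean and variance of samples."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations (Welford's method)

    def fill(self, sample):
        self.count += 1
        delta = sample - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (sample - self.mean)

    @property
    def variance(self):
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


weighted_bin = WeightedSum()
for w in (0.5, 2.0):
    weighted_bin.fill(w)

profile_bin = Mean()
for s in (1.0, 2.0, 3.0):
    profile_bin.fill(s)
```

A plain int or double storage keeps one number per bin, while these accumulators keep two or three, which is what allows errors to be computed afterwards.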
Performance boost-histogram (Python)
The following measurements are with:
1D
• 100 regular bins
• 10,000,000 entries
2D
• 100x100 regular bins
• 1,000,000 entries
See my histogram performance post for measurements of other libraries.
Performance: macOS, dual core, 1D boost-histogram (Python)
Type     Storage  Fill threads  Time      Speedup
Numpy    uint64   -             149.4 ms  1x
Any      int      -             236 ms    0.63x
Regular  int      -             86.23 ms  1.7x
Regular  aint     1             132 ms    1.1x
Regular  aint     2             168.2 ms  0.89x
Regular  aint     4             143.6 ms  1x
Regular  int      1             84.75 ms  1.8x
Regular  int      2             51.6 ms   2.9x
Regular  int      4             42.39 ms  3.5x
Performance: CentOS7, 24 core, 1D (anaconda) boost-histogram (Python)
Type     Storage  Fill threads  Time      Speedup
Numpy    uint64   -             121 ms    1x
Any      int      -             261.5 ms  0.46x
Regular  int      -             142.2 ms  0.85x
Regular  aint     1             319.1 ms  0.38x
Regular  aint     48            272.9 ms  0.44x
Regular  int      1             243.4 ms  0.5x
Regular  int      6             94.76 ms  1.3x
Regular  int      12            71.38 ms  1.7x
Regular  int      24            52.26 ms  2.3x
Regular  int      48            43.01 ms  2.8x
Performance: KNL, 64 core, 1D (anaconda) boost-histogram (Python)
Type     Storage  Fill threads  Time      Speedup
Numpy    uint64   -             716.9 ms  1x
Any      int      -             1418 ms   0.51x
Regular  int      -             824 ms    0.87x
Regular  aint     1             871.7 ms  0.82x
Regular  aint     4             437.1 ms  1.6x
Regular  aint     64            198.8 ms  3.6x
Regular  aint     128           186.8 ms  3.8x
Regular  aint     256           195.2 ms  3.7x
Regular  int      1             796.9 ms  0.9x
Regular  int      2             430.6 ms  1.7x
Regular  int      4             247.6 ms  2.9x
Regular  int      64            88.77 ms  8.1x
Regular  int      128           98.08 ms  7.3x
Regular  int      256           112.2 ms  6.4x
Performance: macOS, dual core, 2D boost-histogram (Python)
Type     Storage  Fill threads  Time      Speedup
Numpy    uint64   -             121.1 ms  1x
Any      int      -             37.12 ms  3.3x
Regular  int      -             18.5 ms   6.5x
Regular  aint     1             20.21 ms  6x
Regular  aint     2             14.17 ms  8.5x
Regular  aint     4             10.23 ms  12x
Regular  int      1             17.86 ms  6.8x
Regular  int      2             9.41 ms   13x
Regular  int      4             6.854 ms  18x
Performance: CentOS7, 24 core, 2D (anaconda) boost-histogram (Python)
Type     Storage  Fill threads  Time      Speedup
Numpy    uint64   -             87.27 ms  1x
Any      int      -             41.42 ms  2.1x
Regular  int      -             21.67 ms  4x
Regular  aint     1             38.61 ms  2.3x
Regular  aint     6             19.89 ms  4.4x
Regular  aint     24            9.556 ms  9.1x
Regular  aint     48            8.518 ms  10x
Regular  int      1             36.5 ms   2.4x
Regular  int      6             8.976 ms  9.7x
Regular  int      12            5.318 ms  16x
Regular  int      24            4.388 ms  20x
Regular  int      48            5.839 ms  15x
Performance: KNL, 64 core, 2D (anaconda) boost-histogram (Python)
Type     Storage  Fill threads  Time      Speedup
Numpy    uint64   -             439.5 ms  1x
Any      int      -             250.6 ms  1.8x
Regular  int      -             135.6 ms  3.2x
Regular  aint     1             142.2 ms  3.1x
Regular  aint     4             52.71 ms  8.3x
Regular  aint     32            12.05 ms  36x
Regular  aint     64            16.5 ms   27x
Regular  aint     256           43.93 ms  10x
Regular  int      1             141.1 ms  3.1x
Regular  int      2             70.78 ms  6.2x
Regular  int      4             36.11 ms  12x
Regular  int      64            18.93 ms  23x
Regular  int      128           36.09 ms  12x
Regular  int      256           55.64 ms  7.9x
Performance: Summary boost-histogram (Python)
System         1D max speedup  2D max speedup
macOS 1 core   1.7x            6.5x
macOS 2 core   3.5x            18x
Linux 1 core   0.85x           4x
Linux 24 core  2.8x            20x
KNL 1 core     0.87x           3.2x
KNL 64 core    8.1x            36x
• Note that Numpy 1D is well optimized (last few versions)
• Anaconda versions may provide a few more optimizations to Numpy
• Mixing axes types in boost-histogram can reduce performance by 2-3x
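The speedup figures appear to be the Numpy time divided by the fill time, rounded to two significant figures; that reading reproduces the published values. The short sketch below recomputes a few entries from the macOS tables. The helper name `speedup_label` is hypothetical.

```python
def speedup_label(numpy_ms, fill_ms):
    """Format numpy_ms / fill_ms with two significant figures, e.g. '1.7x'."""
    return f"{numpy_ms / fill_ms:.2g}x"


# macOS, 1D: Numpy 149.4 ms vs. Regular/int single fill 86.23 ms
macos_1d_single = speedup_label(149.4, 86.23)
# macOS, 1D: Numpy 149.4 ms vs. Any/int 236 ms (slower than Numpy)
macos_1d_any = speedup_label(149.4, 236)
# macOS, 2D: Numpy 121.1 ms vs. Regular/int with 4 threads 6.854 ms
macos_2d_threads = speedup_label(121.1, 6.854)
```

Values below 1x mean the fill was slower than Numpy for that configuration.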
Distribution boost-histogram (Python)
• We must provide excellent distribution.
▶ If anyone writes pip install boost-histogram and it fails, we have failed.
• Docker ManyLinux1 GCC 8.3: /scikit-hep/manylinuxgcc
Wheels
• manylinux1 32, 64 bit (ready)
• manylinux2010 64 bit (planned)
• macOS 10.9+ (wip)
• Windows 32, 64 bit, Python 3.6+ (wip)
▶ Is Python 2.7 Windows needed?
Source
• SDist (ready)
• Build directly from GitHub (done)
Conda
• conda package (planned, easy)
python -m pip install git+https://github.com/scikit-hep/boost-histogram.git@develop
Plans boost-histogram (Python)
• Add shortcuts for axis types, fill out axis types
• Allow view access into unlimited storage histograms
• Add from_numpy and numpy style shortcut(s)
• Filling
▶ Samples
▶ Weights
▶ Non-numerical fill (if possible)
• Add profile, weighted_profile histograms
• Add reduce operations
• Release to PyPI
• Add some docs and read the docs support
First alpha
Release planned this week
Bikeshedding (API discussion) boost-histogram (Python)
Let’s discuss API! (On GitHub issues or gitter)
• Download: pip install boost-histogram (WIP)
• Use: import boost.histogram as bh
• Create: hist = bh.histogram(bh.axis.regular(12,0,1))
• Fill: hist(values)
• Access values, convert to numpy, etc.
Documentation
• The documentation will also need useful examples, feel free to contribute!
hist
A slide about hist hist
hist is the ‘wrapper’ piece that does plotting and interacts with the rest of the ecosystem.
Plans
• Easy plotting adaptors (mpl-hep)
• Serialization formats (ROOT, HDF5)
• Auto-multithreading
• Statistical functions (Like TEfficiency)
• Multihistograms (HistBook)
• Interaction with fitters (ZFit, GooFit, etc)
• Bayesian Blocks algorithm from SciKit-HEP
• Command line histograms for stream of numbers
Call for contributions
• What do you need?
• What do you want?
• What would you like?
Join in the development! This should combine the best features of other packages.
Questions?
Backup Questions?
• Supported by IRIS-HEP, NSF OAC-1836650
27/27Henry Schreiner boost-histogram and hist April 15, 2019

More Related Content

What's hot (20)

Pybind11 - SciPy 2021
Pybind11 - SciPy 2021Pybind11 - SciPy 2021
Pybind11 - SciPy 2021
Henry Schreiner
 
ROOT 2018: iminuit and MINUIT2 Standalone
ROOT 2018: iminuit and MINUIT2 StandaloneROOT 2018: iminuit and MINUIT2 Standalone
ROOT 2018: iminuit and MINUIT2 Standalone
Henry Schreiner
 
PyHEP 2019: Python 3.8
PyHEP 2019: Python 3.8PyHEP 2019: Python 3.8
PyHEP 2019: Python 3.8
Henry Schreiner
 
2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decays2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decays
Henry Schreiner
 
PEARC17: Modernizing GooFit: A Case Study
PEARC17: Modernizing GooFit: A Case StudyPEARC17: Modernizing GooFit: A Case Study
PEARC17: Modernizing GooFit: A Case Study
Henry Schreiner
 
RDM 2020: Python, Numpy, and Pandas
RDM 2020: Python, Numpy, and PandasRDM 2020: Python, Numpy, and Pandas
RDM 2020: Python, Numpy, and Pandas
Henry Schreiner
 
Mixing C++ & Python II: Pybind11
Mixing C++ & Python II: Pybind11Mixing C++ & Python II: Pybind11
Mixing C++ & Python II: Pybind11
corehard_by
 
Massively Parallel Processing with Procedural Python (PyData London 2014)
Massively Parallel Processing with Procedural Python (PyData London 2014)Massively Parallel Processing with Procedural Python (PyData London 2014)
Massively Parallel Processing with Procedural Python (PyData London 2014)
Ian Huston
 
Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...
Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...
Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...
PyData
 
Pypy is-it-ready-for-production-the-sequel
Pypy is-it-ready-for-production-the-sequelPypy is-it-ready-for-production-the-sequel
Pypy is-it-ready-for-production-the-sequel
Mark Rees
 
PyPy - is it ready for production
PyPy - is it ready for productionPyPy - is it ready for production
PyPy - is it ready for production
Mark Rees
 
Scientific visualization with_gr
Scientific visualization with_grScientific visualization with_gr
Scientific visualization with_gr
Josef Heinen
 
20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi
Kohei KaiGai
 
Move from C to Go
Move from C to GoMove from C to Go
Move from C to Go
Yu-Shuan Hsieh
 
GPars in Saga Groovy Study
GPars in Saga Groovy StudyGPars in Saga Groovy Study
GPars in Saga Groovy Study
Naoki Rin
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw
Kohei KaiGai
 
High scalable applications with Python
High scalable applications with PythonHigh scalable applications with Python
High scalable applications with Python
Giuseppe Broccolo
 
Python kansai2019
Python kansai2019Python kansai2019
Python kansai2019
Yuta Kashino
 
Apache spark session
Apache spark sessionApache spark session
Apache spark session
knowbigdata
 
PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)
Hansol Kang
 
ROOT 2018: iminuit and MINUIT2 Standalone
ROOT 2018: iminuit and MINUIT2 StandaloneROOT 2018: iminuit and MINUIT2 Standalone
ROOT 2018: iminuit and MINUIT2 Standalone
Henry Schreiner
 
2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decays2019 IRIS-HEP AS workshop: Particles and decays
2019 IRIS-HEP AS workshop: Particles and decays
Henry Schreiner
 
PEARC17: Modernizing GooFit: A Case Study
PEARC17: Modernizing GooFit: A Case StudyPEARC17: Modernizing GooFit: A Case Study
PEARC17: Modernizing GooFit: A Case Study
Henry Schreiner
 
RDM 2020: Python, Numpy, and Pandas
RDM 2020: Python, Numpy, and PandasRDM 2020: Python, Numpy, and Pandas
RDM 2020: Python, Numpy, and Pandas
Henry Schreiner
 
Mixing C++ & Python II: Pybind11
Mixing C++ & Python II: Pybind11Mixing C++ & Python II: Pybind11
Mixing C++ & Python II: Pybind11
corehard_by
 
Massively Parallel Processing with Procedural Python (PyData London 2014)
Massively Parallel Processing with Procedural Python (PyData London 2014)Massively Parallel Processing with Procedural Python (PyData London 2014)
Massively Parallel Processing with Procedural Python (PyData London 2014)
Ian Huston
 
Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...
Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...
Massively Parallel Processing with Procedural Python by Ronert Obst PyData Be...
PyData
 
Pypy is-it-ready-for-production-the-sequel
Pypy is-it-ready-for-production-the-sequelPypy is-it-ready-for-production-the-sequel
Pypy is-it-ready-for-production-the-sequel
Mark Rees
 
PyPy - is it ready for production
PyPy - is it ready for productionPyPy - is it ready for production
PyPy - is it ready for production
Mark Rees
 
Scientific visualization with_gr
Scientific visualization with_grScientific visualization with_gr
Scientific visualization with_gr
Josef Heinen
 
20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi
Kohei KaiGai
 
GPars in Saga Groovy Study
GPars in Saga Groovy StudyGPars in Saga Groovy Study
GPars in Saga Groovy Study
Naoki Rin
 
20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw20181025_pgconfeu_lt_gstorefdw
20181025_pgconfeu_lt_gstorefdw
Kohei KaiGai
 
High scalable applications with Python
High scalable applications with PythonHigh scalable applications with Python
High scalable applications with Python
Giuseppe Broccolo
 
Apache spark session
Apache spark sessionApache spark session
Apache spark session
knowbigdata
 
PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)PyTorch 튜토리얼 (Touch to PyTorch)
PyTorch 튜토리얼 (Touch to PyTorch)
Hansol Kang
 

Similar to IRIS-HEP: Boost-histogram and Hist (20)

PyCon Estonia 2019
PyCon Estonia 2019PyCon Estonia 2019
PyCon Estonia 2019
Travis Oliphant
 
Python Linters at Scale.pdf
Python Linters at Scale.pdfPython Linters at Scale.pdf
Python Linters at Scale.pdf
Jimmy Lai
 
PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat
PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat
PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat
Pôle Systematic Paris-Region
 
Guider: An Integrated Runtime Performance Analyzer on AGL
Guider: An Integrated Runtime Performance Analyzer on AGLGuider: An Integrated Runtime Performance Analyzer on AGL
Guider: An Integrated Runtime Performance Analyzer on AGL
Peace Lee
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big Data
DESMOND YUEN
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
Travis Oliphant
 
Scale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataScale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyData
Travis Oliphant
 
Elasticwulf Pycon Talk
Elasticwulf Pycon TalkElasticwulf Pycon Talk
Elasticwulf Pycon Talk
Peter Skomoroch
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Codemotion
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
AyushmanTiwari11
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
AyushmanTiwari11
 
Python* Scalability in Production Environments
Python* Scalability in Production EnvironmentsPython* Scalability in Production Environments
Python* Scalability in Production Environments
Intel® Software
 
Intel python 2017
Intel python 2017Intel python 2017
Intel python 2017
DESMOND YUEN
 
PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python Extensions
Henry Schreiner
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
Matthew Broberg
 
New Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemNew Capabilities in the PyData Ecosystem
New Capabilities in the PyData Ecosystem
Turi, Inc.
 
Teaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learnedTeaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learned
Martin Christen
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user group
Adam Doyle
 
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
Insight Technology, Inc.
 
Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)
Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)
Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)
Cheer Chain Enterprise Co., Ltd.
 
Python Linters at Scale.pdf
Python Linters at Scale.pdfPython Linters at Scale.pdf
Python Linters at Scale.pdf
Jimmy Lai
 
PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat
PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat
PyParis 2017 / Writing a C Python extension in 2017, Jean-Baptiste Aviat
Pôle Systematic Paris-Region
 
Guider: An Integrated Runtime Performance Analyzer on AGL
Guider: An Integrated Runtime Performance Analyzer on AGLGuider: An Integrated Runtime Performance Analyzer on AGL
Guider: An Integrated Runtime Performance Analyzer on AGL
Peace Lee
 
Intel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big DataIntel Distribution for Python - Scaling for HPC and Big Data
Intel Distribution for Python - Scaling for HPC and Big Data
DESMOND YUEN
 
Scale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyDataScale up and Scale Out Anaconda and PyData
Scale up and Scale Out Anaconda and PyData
Travis Oliphant
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Codemotion
 
Python* Scalability in Production Environments
Python* Scalability in Production EnvironmentsPython* Scalability in Production Environments
Python* Scalability in Production Environments
Intel® Software
 
PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python Extensions
Henry Schreiner
 
Intro to open source telemetry linux con 2016
Intro to open source telemetry   linux con 2016Intro to open source telemetry   linux con 2016
Intro to open source telemetry linux con 2016
Matthew Broberg
 
New Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemNew Capabilities in the PyData Ecosystem
New Capabilities in the PyData Ecosystem
Turi, Inc.
 
Teaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learnedTeaching with JupyterHub - lessons learned
Teaching with JupyterHub - lessons learned
Martin Christen
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user group
Adam Doyle
 
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
Insight Technology, Inc.
 
Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)
Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)
Intel Parallel Studio XE 2016 網路開發工具包新版本功能介紹(現已上市,歡迎詢價)
Cheer Chain Enterprise Co., Ltd.
 
Ad

More from Henry Schreiner (20)

Princeton RSE: Building Python Packages (+binary)
Princeton RSE: Building Python Packages (+binary)Princeton RSE: Building Python Packages (+binary)
Princeton RSE: Building Python Packages (+binary)
Henry Schreiner
 
Tools to help you write better code - Princeton Wintersession
Tools to help you write better code - Princeton WintersessionTools to help you write better code - Princeton Wintersession
Tools to help you write better code - Princeton Wintersession
Henry Schreiner
 
Learning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - PrincetonLearning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - Princeton
Henry Schreiner
 
The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024
Henry Schreiner
 
Modern binary build systems - PyCon 2024
Modern binary build systems - PyCon 2024Modern binary build systems - PyCon 2024
Modern binary build systems - PyCon 2024
Henry Schreiner
 
Software Quality Assurance Tooling - Wintersession 2024
Software Quality Assurance Tooling - Wintersession 2024Software Quality Assurance Tooling - Wintersession 2024
Software Quality Assurance Tooling - Wintersession 2024
Henry Schreiner
 
Princeton RSE Peer network first meeting
Princeton RSE Peer network first meetingPrinceton RSE Peer network first meeting
Princeton RSE Peer network first meeting
Henry Schreiner
 
Software Quality Assurance Tooling 2023
Software Quality Assurance Tooling 2023Software Quality Assurance Tooling 2023
Software Quality Assurance Tooling 2023
Henry Schreiner
 
Princeton Wintersession: Software Quality Assurance Tooling
Princeton Wintersession: Software Quality Assurance ToolingPrinceton Wintersession: Software Quality Assurance Tooling
Princeton Wintersession: Software Quality Assurance Tooling
Henry Schreiner
 
What's new in Python 3.11
What's new in Python 3.11What's new in Python 3.11
What's new in Python 3.11
Henry Schreiner
 
Everything you didn't know you needed
Everything you didn't know you neededEverything you didn't know you needed
Everything you didn't know you needed
Henry Schreiner
 
SciPy22 - Building binary extensions with pybind11, scikit build, and cibuild...
SciPy22 - Building binary extensions with pybind11, scikit build, and cibuild...SciPy22 - Building binary extensions with pybind11, scikit build, and cibuild...
SciPy22 - Building binary extensions with pybind11, scikit build, and cibuild...
Henry Schreiner
 
SciPy 2022 Scikit-HEP
SciPy 2022 Scikit-HEPSciPy 2022 Scikit-HEP
SciPy 2022 Scikit-HEP
Henry Schreiner
 
PyCon 2022 -Scikit-HEP Developer Pages: Guidelines for modern packaging
PyCon 2022 -Scikit-HEP Developer Pages: Guidelines for modern packagingPyCon 2022 -Scikit-HEP Developer Pages: Guidelines for modern packaging
PyCon 2022 -Scikit-HEP Developer Pages: Guidelines for modern packaging
Henry Schreiner
 
boost-histogram / Hist: PyHEP Topical meeting
boost-histogram / Hist: PyHEP Topical meetingboost-histogram / Hist: PyHEP Topical meeting
boost-histogram / Hist: PyHEP Topical meeting
Henry Schreiner
 
Digital RSE: automated code quality checks - RSE group meeting
Digital RSE: automated code quality checks - RSE group meetingDigital RSE: automated code quality checks - RSE group meeting
Digital RSE: automated code quality checks - RSE group meeting
Henry Schreiner
 
CMake best practices
CMake best practicesCMake best practices
CMake best practices
Henry Schreiner
 
HOW 2019: Machine Learning for the Primary Vertex Reconstruction
HOW 2019: Machine Learning for the Primary Vertex ReconstructionHOW 2019: Machine Learning for the Primary Vertex Reconstruction
HOW 2019: Machine Learning for the Primary Vertex Reconstruction
Henry Schreiner
 
HOW 2019: A complete reproducible ROOT environment in under 5 minutes
HOW 2019: A complete reproducible ROOT environment in under 5 minutesHOW 2019: A complete reproducible ROOT environment in under 5 minutes
HOW 2019: A complete reproducible ROOT environment in under 5 minutes
Henry Schreiner
 
ACAT 2019: A hybrid deep learning approach to vertexing
ACAT 2019: A hybrid deep learning approach to vertexingACAT 2019: A hybrid deep learning approach to vertexing
ACAT 2019: A hybrid deep learning approach to vertexing
Henry Schreiner
 
Princeton RSE: Building Python Packages (+binary)
Princeton RSE: Building Python Packages (+binary)Princeton RSE: Building Python Packages (+binary)
Princeton RSE: Building Python Packages (+binary)
Henry Schreiner
 
Tools to help you write better code - Princeton Wintersession
Tools to help you write better code - Princeton WintersessionTools to help you write better code - Princeton Wintersession
Tools to help you write better code - Princeton Wintersession
Henry Schreiner
 
Learning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - PrincetonLearning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - Princeton
Henry Schreiner
 
The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024
Henry Schreiner
 
Modern binary build systems - PyCon 2024
Modern binary build systems - PyCon 2024Modern binary build systems - PyCon 2024
Modern binary build systems - PyCon 2024
Henry Schreiner
 
Software Quality Assurance Tooling - Wintersession 2024
IRIS-HEP: Boost-histogram and Hist

  • 1. boost-histogram and hist Henry Schreiner April 15, 2019
  • 2. Histograms in Python
  • 3. Current state of histograms in Python
      Core library: numpy
        • Historically slow
        • No histogram object
        • Plotting is separate
      Other libraries
        • Narrow focus: speed, plotting, or language
        • Many are abandoned
        • Poor design, backends, distribution
      Libraries shown: HistBook, Histogrammar, pygram11, rootplotlib, PyROOT, YODA, physt, fast-histogram, qhist, Vaex, hdrhistogram, multihist, matplotlib-hep, pyhistogram, histogram, SimpleHist, paida, theodoregoetz, numpy
  • 4. What is needed?
      Design
        • A histogram should be an object
        • Manipulation and plotting should be easy
      Performance
        • Fast single-threaded filling
        • Multithreaded filling (since it’s 2019)
      Flexibility
        • Axes options: sparse, growing, labels
        • Storage: integers, weights, errors…
      Distribution
        • Easy to use anywhere, pip or conda
        • Should have wheels, be easy to build, etc.
  • 5. Future of histograms in Python
      Core histogramming libraries: boost-histogram, ROOT
      Universal adaptor: Aghast
      Front ends (plotting, etc.): hist, mpl-hep, physt, others
  • 6. Boost::Histogram (C++14)
  • 8. Intro to Boost::Histogram
        • Multidimensional templated header-only histogram library: /boostorg/histogram
        • Designed by Hans Dembinski, inspired by ROOT, GSL, and histbook
      Histogram
        • Axes
        • Storages
        • Accumulators
      Axes types
        • Regular, Circular
        • Variable
        • Integer
        • Category
      [Diagram: a histogram made of a storage (static or dynamic) and axes (a regular axis and a regular axis with log transform), with optional underflow and overflow bins; accumulator: int, double, unlimited, ...]
      Boost 1.70 now released with Boost::Histogram!
  • 9. boost-histogram (Python)
  • 10. Intro to the Python bindings
        • Boost::Histogram developed with Python in mind
        • Original bindings based on Boost::Python
          ▶ Hard to build and distribute
          ▶ Somewhat limited
        • New bindings: /scikit-hep/boost-histogram
          ▶ 0-dependency build (C++14 only)
          ▶ State-of-the-art PyBind11
      Design, Flexibility, Speed, Distribution
  • 11. Design
        • Supports Python 2.7 and 3.4+
        • 260+ unit tests run on Azure on Linux, macOS, and Windows
        • Up to 16 axes supported (may go up or down)
        • 1D, 2D, and ND histograms all have the same interface
      Tries to stay close to the original Boost::Histogram where possible.
      C++
          #include <boost/histogram.hpp>
          namespace bh = boost::histogram;

          auto hist = bh::make_histogram(
              bh::axis::regular<>{2, 0, 1, "x"},
              bh::axis::regular<>{4, 0, 1, "y"});
          hist(.2, .3);
      Python
          import boost.histogram as bh

          hist = bh.make_histogram(
              bh.axis.regular(2, 0, 1, metadata="x"),
              bh.axis.regular(4, 0, 1, metadata="y"))
          hist(.2, .3)
  • 12. Design: Manipulations
      Combine two histograms
          hist1 + hist2
      Scale a histogram
          hist * 2.0
      Project a 3D histogram to 2D
          hist.project(0,1) # select axis
      Sum a histogram’s contents
          hist.sum()
      Access an axis
          axis0 = hist.axis(0)
          axis0.edges() # The edges array
          axis0.bin(1) # The bin accessors
      Fill 2D histogram with values or arrays
          hist(x, y)
      Fill copies in 4 threads, then merge
          hist.fill_threaded(4, x, y)
      Fill in 4 threads (atomic storage only)
          hist.fill_atomic(4, x, y)
      Convert to Numpy, 0-copy
          hist.view() # Or np.asarray(hist)
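The "fill copies in N threads, then merge" idea from the manipulations slide can be sketched in plain Python. This is only an illustration of the pattern, not the boost-histogram implementation: the functions `fill_counts` and `fill_threaded` below are hypothetical stand-ins written for this example, the histogram is a bare list of counts, and pure-Python threads will not show the speedups reported later because of the interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor


def fill_counts(values, nbins, lo, hi):
    """Fill a fresh list of bin counts; values outside [lo, hi) are dropped."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v < hi:
            # Clamp to guard against floating-point rounding at the top edge.
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    return counts


def fill_threaded(values, nbins, lo, hi, threads):
    """Hypothetical stand-in: one private copy per chunk, merged at the end."""
    chunk = max(1, -(-len(values) // threads))  # ceiling division, at least 1
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        copies = list(pool.map(lambda p: fill_counts(p, nbins, lo, hi), parts))
    merged = [0] * nbins
    for copy in copies:
        # Merging is the same bin-wise addition as "hist1 + hist2".
        merged = [a + b for a, b in zip(merged, copy)]
    return merged


values = [i / 1000 for i in range(1000)]
single = fill_counts(values, 4, 0.0, 1.0)
threaded = fill_threaded(values, 4, 0.0, 1.0, threads=4)
```

Because each worker writes only to its own copy, no locking is needed during the fill; the cost is one extra histogram per thread plus the final merge. The atomic variant on the slide avoids the copies by having all threads increment shared bins atomically.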
  • 13. Flexibility: Axis
        • bh.axis.regular
          ▶ bh.axis.regular_uoflow
          ▶ bh.axis.regular_noflow
          ▶ bh.axis.regular_growth
        • bh.axis.circular
        • bh.axis.regular_log
        • bh.axis.regular_sqrt
        • bh.axis.regular_pow
        • bh.axis.integer
        • bh.axis.integer_noflow
        • bh.axis.integer_growth
        • bh.axis.variable
        • bh.axis.category_int
        • bh.axis.category_int_growth
      Axis diagrams (tick labels, then the constructor shown):
        • 0, 0.5, 1: bh.axis.regular(10,0,1)
        • 0 (= 2π), π/2, π, 3π/2: bh.axis.circular(8,0,2*np.pi)
        • 0, 0.3, 0.5, 1: bh.axis.variable([0,.3,.5,1])
        • 0, 1, 2, 3, 4: bh.axis.integer(0,5)
        • 2, 5, 8, 3, 7: bh.axis.category_int([2,5,8,3,7])
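What distinguishes the regular, circular, and variable axes is how a value is mapped to a bin index. The sketch below illustrates that mapping in plain Python; the three functions are hypothetical helpers written for this example, and the index convention used here (-1 for underflow, the bin count for overflow) is an assumption of the sketch.

```python
import math
from bisect import bisect_right


def regular_index(x, bins, lo, hi):
    """Equal-width bins on [lo, hi); -1 is underflow, `bins` is overflow."""
    if x < lo:
        return -1
    if x >= hi:
        return bins
    return min(int((x - lo) / (hi - lo) * bins), bins - 1)


def circular_index(x, bins, lo, hi):
    """Wrap x into [lo, hi) first, so there is never under- or overflow."""
    frac = ((x - lo) / (hi - lo)) % 1.0
    return int(frac * bins) % bins


def variable_index(x, edges):
    """Bins defined by a sorted list of edges; found by binary search."""
    if x < edges[0]:
        return -1
    if x >= edges[-1]:
        return len(edges) - 1
    return bisect_right(edges, x) - 1


inside = regular_index(0.55, 10, 0.0, 1.0)
wrapped = circular_index(-0.1, 8, 0.0, 2 * math.pi)
uneven = variable_index(0.4, [0, 0.3, 0.5, 1])
```

The regular axis needs only arithmetic per value, while the variable axis needs a search over its edges.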
  • 14. Flexibility: Storage types
        • bh.storage.int
        • bh.storage.double
        • bh.storage.unlimited (WIP)
        • bh.storage.atomic_int
        • bh.storage.weight (WIP)
        • bh.storage.profile (WIP, needs sampled fill)
        • bh.storage.weighted_profile (WIP, needs sampled fill)
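The slide names the storages without saying what each bin holds. As a rough sketch of the idea behind a weighted storage, each bin can keep both the sum of weights and the sum of squared weights, so that an error estimate is available after weighted fills. The `WeightedSum` class below is a hypothetical illustration written for this example, not the boost-histogram storage class.

```python
import math


class WeightedSum:
    """Hypothetical per-bin accumulator for weighted fills."""

    def __init__(self):
        self.value = 0.0     # sum of weights
        self.variance = 0.0  # sum of squared weights

    def fill(self, weight=1.0):
        self.value += weight
        self.variance += weight * weight

    @property
    def error(self):
        """Square root of the summed squared weights."""
        return math.sqrt(self.variance)


cell = WeightedSum()
cell.fill(2.0)
cell.fill(0.5)
```

With unit weights the two sums are equal and the error reduces to the familiar square root of the count, which is why a plain integer storage suffices for unweighted histograms.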
  • 15. Performance
      The following measurements are with:
      1D
        • 100 regular bins
        • 10,000,000 entries
      2D
        • 100x100 regular bins
        • 1,000,000 entries
      See my histogram performance post for measurements of other libraries.
  • 16. Performance: macOS, dual core, 1D
      Type     Storage  Fill (threads)  Time      Speedup
      Numpy    uint64                   149.4 ms  1x
      Any      int                      236 ms    0.63x
      Regular  int                      86.23 ms  1.7x
      Regular  aint     1               132 ms    1.1x
      Regular  aint     2               168.2 ms  0.89x
      Regular  aint     4               143.6 ms  1x
      Regular  int      1               84.75 ms  1.8x
      Regular  int      2               51.6 ms   2.9x
      Regular  int      4               42.39 ms  3.5x
  • 17. Performance: CentOS7, 24 core, 1D (anaconda)
      Type     Storage  Fill (threads)  Time      Speedup
      Numpy    uint64                   121 ms    1x
      Any      int                      261.5 ms  0.46x
      Regular  int                      142.2 ms  0.85x
      Regular  aint     1               319.1 ms  0.38x
      Regular  aint     48              272.9 ms  0.44x
      Regular  int      1               243.4 ms  0.5x
      Regular  int      6               94.76 ms  1.3x
      Regular  int      12              71.38 ms  1.7x
      Regular  int      24              52.26 ms  2.3x
      Regular  int      48              43.01 ms  2.8x
  • 18. Performance: KNL, 64 core, 1D (anaconda)
      Type     Storage  Fill (threads)  Time      Speedup
      Numpy    uint64                   716.9 ms  1x
      Any      int                      1418 ms   0.51x
      Regular  int                      824 ms    0.87x
      Regular  aint     1               871.7 ms  0.82x
      Regular  aint     4               437.1 ms  1.6x
      Regular  aint     64              198.8 ms  3.6x
      Regular  aint     128             186.8 ms  3.8x
      Regular  aint     256             195.2 ms  3.7x
      Regular  int      1               796.9 ms  0.9x
      Regular  int      2               430.6 ms  1.7x
      Regular  int      4               247.6 ms  2.9x
      Regular  int      64              88.77 ms  8.1x
      Regular  int      128             98.08 ms  7.3x
      Regular  int      256             112.2 ms  6.4x
  • 19. Performance: macOS, dual core, 2D
      Type     Storage  Fill (threads)  Time      Speedup
      Numpy    uint64                   121.1 ms  1x
      Any      int                      37.12 ms  3.3x
      Regular  int                      18.5 ms   6.5x
      Regular  aint     1               20.21 ms  6x
      Regular  aint     2               14.17 ms  8.5x
      Regular  aint     4               10.23 ms  12x
      Regular  int      1               17.86 ms  6.8x
      Regular  int      2               9.41 ms   13x
      Regular  int      4               6.854 ms  18x
  • 20. Performance: CentOS7, 24 core, 2D (anaconda)
      Type     Storage  Fill (threads)  Time      Speedup
      Numpy    uint64                   87.27 ms  1x
      Any      int                      41.42 ms  2.1x
      Regular  int                      21.67 ms  4x
      Regular  aint     1               38.61 ms  2.3x
      Regular  aint     6               19.89 ms  4.4x
      Regular  aint     24              9.556 ms  9.1x
      Regular  aint     48              8.518 ms  10x
      Regular  int      1               36.5 ms   2.4x
      Regular  int      6               8.976 ms  9.7x
      Regular  int      12              5.318 ms  16x
      Regular  int      24              4.388 ms  20x
      Regular  int      48              5.839 ms  15x
  • 21. Performance: KNL, 64 core, 2D (anaconda)
      Type     Storage  Fill (threads)  Time      Speedup
      Numpy    uint64                   439.5 ms  1x
      Any      int                      250.6 ms  1.8x
      Regular  int                      135.6 ms  3.2x
      Regular  aint     1               142.2 ms  3.1x
      Regular  aint     4               52.71 ms  8.3x
      Regular  aint     32              12.05 ms  36x
      Regular  aint     64              16.5 ms   27x
      Regular  aint     256             43.93 ms  10x
      Regular  int      1               141.1 ms  3.1x
      Regular  int      2               70.78 ms  6.2x
      Regular  int      4               36.11 ms  12x
      Regular  int      64              18.93 ms  23x
      Regular  int      128             36.09 ms  12x
      Regular  int      256             55.64 ms  7.9x
  • 22. Performance: Summary
      System         1D max speedup  2D max speedup
      macOS 1 core   1.7x            6.5x
      macOS 2 core   3.5x            18x
      Linux 1 core   0.85x           4x
      Linux 24 core  2.8x            20x
      KNL 1 core     0.87x           3.2x
      KNL 64 core    8.1x            36x
        • Note that Numpy 1D is well optimized (last few versions)
        • Anaconda versions may provide a few more optimizations to Numpy
        • Mixing axes types in boost-histogram can reduce performance by 2-3x
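The speedup figures in these tables appear to be the Numpy time divided by the boost-histogram time for the same fill, rounded to two significant figures. The short check below reproduces three of the macOS entries from the times quoted on the slides; the helper names `speedup` and `as_label` are made up for this example, and the rounding rule is an assumption inferred from the printed values.

```python
def speedup(baseline_ms, fill_ms):
    """Ratio of baseline time to fill time; above 1 means faster than Numpy."""
    return baseline_ms / fill_ms


def as_label(ratio):
    """Format with two significant figures, as the slides appear to do."""
    return f"{ratio:.2g}x"


# (Numpy time, boost-histogram time) in ms, copied from the macOS tables.
macos_1d_int_4_threads = as_label(speedup(149.4, 42.39))
macos_2d_int_4_threads = as_label(speedup(121.1, 6.854))
macos_1d_any_axis = as_label(speedup(149.4, 236.0))
```

A value below 1x, as in the last case, means the boost-histogram fill was slower than Numpy for that configuration.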
  • 23. Distribution
        • We must provide excellent distribution.
          ▶ If anyone writes pip install boost-histogram and it fails, we have failed.
        • Docker ManyLinux1 GCC 8.3: /scikit-hep/manylinuxgcc
      Wheels
        • manylinux1 32, 64 bit (ready)
        • manylinux2010 64 bit (planned)
        • macOS 10.9+ (wip)
        • Windows 32, 64 bit, Python 3.6+ (wip)
          ▶ Is Python 2.7 Windows needed?
      Source
        • SDist (ready)
        • Build directly from GitHub (done)
      Conda
        • conda package (planned, easy)
      python -m pip install git+https://github.com/scikit-hep/boost-histogram.git@develop
  • 24. Plans
        • Add shortcuts for axis types, fill out axis types
        • Allow view access into unlimited storage histograms
        • Add from_numpy and numpy style shortcut(s)
        • Filling
          ▶ Samples
          ▶ Weights
          ▶ Non-numerical fill (if possible)
        • Add profile, weighted_profile histograms
        • Add reduce operations
        • Release to PyPI
        • Add some docs and read the docs support
      First alpha release planned this week
  • 25. Bikeshedding (API discussion)
      Let’s discuss API! (On GitHub issues or gitter)
        • Download: pip install boost-histogram (WIP)
        • Use: import boost.histogram as bh
        • Create: hist = bh.histogram(bh.axis.regular(12,0,1))
        • Fill: hist(values)
        • Access values, convert to numpy, etc.
      Documentation
        • The documentation will also need useful examples, feel free to contribute!
  • 26. hist
  • 27. A slide about hist
      hist is the ‘wrapper’ piece that does plotting and interacts with the rest of the ecosystem.
      Plans
        • Easy plotting adaptors (mpl-hep)
        • Serialization formats (ROOT, HDF5)
        • Auto-multithreading
        • Statistical functions (like TEfficiency)
        • Multihistograms (HistBook)
        • Interaction with fitters (ZFit, GooFit, etc.)
        • Bayesian Blocks algorithm from SciKit-HEP
        • Command line histograms for stream of numbers
      Call for contributions
        • What do you need?
        • What do you want?
        • What would you like?
      Join in the development! This should combine the best features of other packages.
  • 29. Backup
      Questions?
        • Supported by IRIS-HEP, NSF OAC-1836650