Nature | Vol 585 | 17 September 2020 | 357
Review
Array programming with NumPy
Charles R. Harris1, K. Jarrod Millman2,3,4 ✉, Stéfan J. van der Walt2,4,5 ✉, Ralf Gommers6 ✉,
Pauli Virtanen7,8, David Cournapeau9, Eric Wieser10, Julian Taylor11, Sebastian Berg4,
Nathaniel J. Smith12, Robert Kern13, Matti Picus4, Stephan Hoyer14, Marten H. van Kerkwijk15,
Matthew Brett2,16, Allan Haldane17, Jaime Fernández del Río18, Mark Wiebe19,20,
Pearu Peterson6,21,22, Pierre Gérard-Marchant23,24, Kevin Sheppard25, Tyler Reddy26,
Warren Weckesser4, Hameer Abbasi6, Christoph Gohlke27 & Travis E. Oliphant6
Array programming provides a powerful, compact and expressive syntax for
accessing, manipulating and operating on data in vectors, matrices and
higher-dimensional arrays. NumPy is the primary array programming library for the
Python language. It has an essential role in research analysis pipelines in fields as
diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials
science, engineering, finance and economics. For example, in astronomy, NumPy was
an important part of the software stack used in the discovery of gravitational waves1
and in the first imaging of a black hole2. Here we review how a few fundamental array
concepts lead to a simple and powerful programming paradigm for organizing,
exploring and analysing scientific data. NumPy is the foundation upon which the
scientific Python ecosystem is constructed. It is so pervasive that several projects,
targeting audiences with specialized needs, have developed their own NumPy-like
interfaces and array objects. Owing to its central position in the ecosystem, NumPy
increasingly acts as an interoperability layer between such array computation
libraries and, together with its application programming interface (API), provides a
flexible framework to support the next decade of scientific and industrial analysis.
Two Python array packages existed before NumPy. The Numeric package
was developed in the mid-1990s and provided array objects and
array-aware functions in Python. It was written in C and linked to standard
fast implementations of linear algebra3,4. One of its earliest uses was
to steer C++ applications for inertial confinement fusion research at
Lawrence Livermore National Laboratory5. To handle large astronomical
images coming from the Hubble Space Telescope, a reimplementation
of Numeric, called Numarray, added support for structured arrays,
flexible indexing, memory mapping, byte-order variants, more efficient
memory use, flexible IEEE 754-standard error-handling capabilities, and
better type-casting rules6. Although Numarray was highly compatible
with Numeric, the two packages had enough differences that it divided
the community; however, in 2005 NumPy emerged as a ‘best of both
worlds’ unification7, combining the features of Numarray with the
small-array performance of Numeric and its rich C API.
Now, 15 years later, NumPy underpins almost every Python library
that does scientific or numerical computation8–11, including SciPy12,
Matplotlib13, pandas14, scikit-learn15 and scikit-image16. NumPy is a
community-developed, open-source library, which provides a multidimensional
Python array object along with array-aware functions
that operate on it. Because of its inherent simplicity, the NumPy array
is the de facto exchange format for array data in Python.
NumPy operates on in-memory arrays using the central processing
unit (CPU). To utilize modern, specialized storage and hardware, there
has been a recent proliferation of Python array packages. Unlike with
the Numarray–Numeric divide, it is now much harder for these new
libraries to fracture the user community, given how much work is
already built on top of NumPy. However, to provide the community with
access to new and exploratory technologies, NumPy is transitioning
into a central coordinating mechanism that specifies a well defined
array programming API and dispatches it, as appropriate, to specialized
array implementations.
NumPy arrays
The NumPy array is a data structure that efficiently stores and accesses
multidimensional arrays17 (also known as tensors), and enables a wide
variety of scientific computation. It consists of a pointer to memory,
along with metadata used to interpret the data stored there, notably
‘data type’, ‘shape’ and ‘strides’ (Fig. 1a).
https://doi.org/10.1038/s41586-020-2649-2
Received: 21 February 2020
Accepted: 17 June 2020
Published online: 16 September 2020
Open access
1 Independent researcher, Logan, UT, USA.
2 Brain Imaging Center, University of California, Berkeley, Berkeley, CA, USA.
3 Division of Biostatistics, University of California, Berkeley, Berkeley, CA, USA.
4 Berkeley Institute for Data Science, University of California, Berkeley, Berkeley, CA, USA.
5 Applied Mathematics, Stellenbosch University, Stellenbosch, South Africa.
6 Quansight, Austin, TX, USA.
7 Department of Physics, University of Jyväskylä, Jyväskylä, Finland.
8 Nanoscience Center, University of Jyväskylä, Jyväskylä, Finland.
9 Mercari JP, Tokyo, Japan.
10 Department of Engineering, University of Cambridge, Cambridge, UK.
11 Independent researcher, Karlsruhe, Germany.
12 Independent researcher, Berkeley, CA, USA.
13 Enthought, Austin, TX, USA.
14 Google Research, Mountain View, CA, USA.
15 Department of Astronomy and Astrophysics, University of Toronto, Toronto, Ontario, Canada.
16 School of Psychology, University of Birmingham, Edgbaston, Birmingham, UK.
17 Department of Physics, Temple University, Philadelphia, PA, USA.
18 Google, Zurich, Switzerland.
19 Department of Physics and Astronomy, The University of British Columbia, Vancouver, British Columbia, Canada.
20 Amazon, Seattle, WA, USA.
21 Independent researcher, Saue, Estonia.
22 Department of Mechanics and Applied Mathematics, Institute of Cybernetics at Tallinn Technical University, Tallinn, Estonia.
23 Department of Biological and Agricultural Engineering, University of Georgia, Athens, GA, USA.
24 France-IX Services, Paris, France.
25 Department of Economics, University of Oxford, Oxford, UK.
26 CCS-7, Los Alamos National Laboratory, Los Alamos, NM, USA.
27 Laboratory for Fluorescence Dynamics, Biomedical Engineering Department, University of California, Irvine, Irvine, CA, USA.
✉e-mail: millman@berkeley.edu; stefanv@berkeley.edu; ralf.gommers@gmail.com
The data type describes the nature of elements stored in an array.
An array has a single data type, and each element of an array occupies
the same number of bytes in memory. Examples of data types include
real and complex numbers (of lower and higher precision), strings,
timestamps and pointers to Python objects.
The shape of an array determines the number of elements along
each axis, and the number of axes is the dimensionality of the array.
For example, a vector of numbers can be stored as a one-dimensional
array of shape N, whereas colour videos are four-dimensional arrays
of shape (T, M, N, 3).
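The data type and shape metadata can be inspected directly on an array. The following is a minimal sketch; the sizes T, M and N and the variable names are illustrative assumptions, not values from the text.

```python
import numpy as np

# A one-dimensional array (a vector) of N = 5 double-precision floats.
v = np.zeros(5, dtype=np.float64)

# A hypothetical colour video: T frames of M x N pixels, 3 colour channels,
# one unsigned byte per channel (sizes chosen only for illustration).
T, M, N = 10, 48, 64
video = np.zeros((T, M, N, 3), dtype=np.uint8)

# Every element of an array shares one data type and one element size.
print(v.dtype, v.itemsize, v.shape, v.ndim)
print(video.dtype, video.itemsize, video.shape, video.ndim)

# Total memory is the number of elements times the bytes per element.
print(video.nbytes == T * M * N * 3 * video.itemsize)
```

Note that a one-dimensional shape is reported as the tuple (5,), and that the number of axes (ndim) equals the length of the shape tuple.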
Strides are necessary to interpret computer memory, which stores
elements linearly, as multidimensional arrays. They describe the number
of bytes to move forward in memory to jump from row to row, column
to column, and so forth. Consider, for example, a two-dimensional
array of floating-point numbers with shape (4, 3), where each element
occupies 8 bytes in memory. To move between consecutive columns,
we need to jump forward 8 bytes in memory, and to access the next row,
3 × 8 = 24 bytes. The strides of that array are therefore (24, 8). NumPy
can store arrays in either C or Fortran memory order, iterating first over
either rows or columns. This allows external libraries written in those
languages to access NumPy array data in memory directly.
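The stride arithmetic above can be checked on a small array. This sketch assumes 8-byte floats (float64) and uses illustrative variable names; it also shows the Fortran-order layout of the same shape for comparison.

```python
import numpy as np

# 4 x 3 array of 8-byte floats in C (row-major) order:
# the next column is 8 bytes away, the next row 3 * 8 = 24 bytes away.
c = np.zeros((4, 3), dtype=np.float64, order="C")
print(c.strides)  # (24, 8)

# The same shape in Fortran (column-major) order:
# the next row is 8 bytes away, the next column 4 * 8 = 32 bytes away.
f = np.zeros((4, 3), dtype=np.float64, order="F")
print(f.strides)  # (8, 32)

# Transposing only swaps shape and strides; no data are moved.
t = c.T
print(t.shape, t.strides)  # (3, 4) (8, 24)
print(np.shares_memory(c, t))  # True
```

The byte offset of element (i, j) from the start of the buffer is i * strides[0] + j * strides[1], which is why one block of memory can be read in either order.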
Users interact with NumPy arrays using ‘indexing’ (to access subarrays
or individual elements), ‘operators’ (for example, +, − and ×
for vectorized operations and @ for matrix multiplication), as well
as ‘array-aware functions’; together, these provide an easily readable,
expressive, high-level API for array programming while NumPy deals
with the underlying mechanics of making operations fast.
Indexing an array returns single elements, subarrays or elements
that satisfy a specific condition (Fig. 1b). Arrays can even be indexed
using other arrays (Fig. 1c). Wherever possible, indexing that retrieves a
subarray returns a ‘view’ on the original array such that data are shared
between the two arrays. This provides a powerful way to operate on
subsets of array data while limiting memory usage.
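The difference between a view and a copy can be observed with np.shares_memory. The sketch below reuses the 4 × 3 example array of Fig. 1; the variable names are illustrative.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

# Slicing returns a view: no data are copied.
view = x[:, 1:]

# Indexing with a boolean mask or with index arrays returns a copy.
mask_copy = x[x > 9]             # elements greater than 9
fancy_copy = x[[0, 1], [1, 2]]   # elements x[0, 1] and x[1, 2]

print(np.shares_memory(x, view))        # True
print(np.shares_memory(x, mask_copy))   # False
print(np.shares_memory(x, fancy_copy))  # False

# Writing through the view changes the original array ...
view[0, 0] = 100
print(x[0, 1])  # 100

# ... whereas writing to a copy leaves the original untouched.
fancy_copy[0] = -1
print(x[0, 1])  # still 100
```

Because views share data, an in-place change to a slice is visible in the parent array; calling .copy() on the slice is the usual way to avoid this when independent data are needed.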
To complement the array syntax, NumPy includes functions that
perform vectorized calculations on arrays, including arithmetic,
statistics and trigonometry (Fig. 1d). Vectorization, that is, operating on
entire arrays rather than their individual elements, is essential to array
programming. This means that operations that would take many tens
of lines to express in languages such as C can often be implemented as
a single, clear Python expression. This results in concise code and frees
users to focus on the details of their analysis, while NumPy handles
looping over array elements near-optimally, for example, taking
strides into consideration to best utilize the computer’s fast cache
memory.
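As an illustration of vectorization, the sketch below computes a root mean square twice: once with an explicit Python loop and once as a single array expression. The function names and the choice of calculation are assumptions made for this example only.

```python
import numpy as np


def rms_loop(values):
    """Root mean square computed element by element."""
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5


def rms_vectorized(values):
    """The same calculation as one expression on the whole array."""
    return np.sqrt(np.mean(values ** 2))


data = np.array([3.0, 4.0, 12.0])
print(rms_loop(data), rms_vectorized(data))
```

Both functions return the same value up to floating-point rounding; in the vectorized form the loop over elements runs inside NumPy rather than in the Python interpreter.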
When performing a vectorized operation (such as addition) on two
arrays with the same shape, it is clear what should happen. Through
‘broadcasting’ NumPy allows the dimensions to differ, and produces
results that appeal to intuition. A trivial example is the addition of a
scalar value to an array, but broadcasting also generalizes to more complex
examples such as scaling each column of an array or generating
a grid of coordinates. In broadcasting, one or both arrays are virtually
duplicated (that is, without copying any data in memory), so that the
shapes of the operands match (Fig. 1d). Broadcasting is also applied
when an array is indexed using arrays of indices (Fig. 1c).
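The following sketch reproduces the multiplication shown in the broadcasting panel of Fig. 1 and the two cases named above (adding a scalar, scaling each column). The scale factors and variable names are illustrative assumptions.

```python
import numpy as np

# A (4, 1) column times a (2,) row broadcasts to a (4, 2) result.
col = np.array([[0], [3], [6], [9]])
row = np.array([1, 2])
product = col * row
print(product)

# Adding a scalar and scaling each column of a (4, 3) array.
x = np.arange(12).reshape(4, 3)
plus_one = x + 1
scaled = x * np.array([1, 10, 100])
print(scaled)

# The "virtual duplication" is visible in the strides: the broadcast
# axis has stride 0, so every row re-reads the same memory.
expanded = np.broadcast_to(row, (4, 2))
print(expanded.strides[0])  # 0
print(np.shares_memory(row, expanded))  # True
```

Shapes are compared from the trailing axis backwards, and two axis lengths are compatible when they are equal or when one of them is 1.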
Other array-aware functions, such as sum, mean and maximum,
perform element-by-element ‘reductions’, aggregating results across
one, multiple or all axes of a single array. For example, summing an
n-dimensional array over d axes results in an array of dimension n − d
(Fig. 1f).
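The reductions of Fig. 1f can be reproduced on the same 4 × 3 example array. This is a small sketch with illustrative variable names; it checks that the dimensionality drops by the number of axes summed over.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)  # n = 2 dimensions

row_sums = x.sum(axis=1)     # reduce over 1 axis  -> 1-dimensional
col_sums = x.sum(axis=0)     # reduce over 1 axis  -> 1-dimensional
total = x.sum(axis=(0, 1))   # reduce over 2 axes  -> 0-dimensional

print(row_sums)  # [ 3 12 21 30]
print(col_sums)  # [18 22 26]
print(total)     # 66
print(np.ndim(row_sums), np.ndim(col_sums), np.ndim(total))  # 1 1 0
```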
NumPy also includes array-aware functions for creating, reshaping,
concatenating and padding arrays; searching, sorting and counting
data; and reading and writing files. It provides extensive support for
generating pseudorandom numbers, includes an assortment of probability
distributions, and performs accelerated linear algebra, using
one of several backends such as OpenBLAS18,19 or Intel MKL optimized
for the CPUs at hand (see Supplementary Methods for more details).
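A few of these utility functions are sketched below. The particular functions, input values and random seed are choices made for illustration; the text does not single them out.

```python
import numpy as np

# Creating, concatenating and padding.
a = np.arange(6).reshape(2, 3)
b = np.ones((1, 3), dtype=a.dtype)
stacked = np.concatenate([a, b], axis=0)  # shape (3, 3)
padded = np.pad(a, pad_width=1)           # zero border, shape (4, 5)

# Sorting and counting.
values = np.array([3, 1, 2, 3, 1, 3])
ordered = np.sort(values)
uniq, counts = np.unique(values, return_counts=True)

# Pseudorandom numbers drawn from a probability distribution.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

# Linear algebra: solve A @ sol = rhs.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([3.0, 5.0])
sol = np.linalg.solve(A, rhs)

print(stacked.shape, padded.shape)
print(ordered, uniq, counts)
print(samples.shape, sol)
```

Seeding the generator makes the pseudorandom samples reproducible from run to run.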
Altogether, the combination of a simple in-memory array representation,
a syntax that closely mimics mathematics, and a variety
of array-aware utility functions forms a productive and powerfully
expressive array programming language.
In [1]: import numpy as np
In [2]: x = np.arange(12)
In [3]: x = x.reshape(4, 3)
In [4]: x
Out[4]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
In [5]: np.mean(x, axis=0)
Out[5]: array([4.5, 5.5, 6.5])
In [6]: x = x - np.mean(x, axis=0)
In [7]: x
Out[7]:
array([[-4.5, -4.5, -4.5],
       [-1.5, -1.5, -1.5],
       [ 1.5,  1.5,  1.5],
       [ 4.5,  4.5,  4.5]])
[Figure 1. Panels: a, Data structure (for the 4 × 3 example array x: data type 8-byte integer, shape (4, 3), strides (24, 8)); b, Indexing (view), with slices and with steps, where slices are start:end:step, any of which can be left blank; c, Indexing (copy), with scalars, masks and arrays; d, Vectorization; e, Broadcasting; f, Reduction (sum along axis 0, axis 1 and axes (0, 1)); g, Example.]
Fig. 1 | The NumPy array incorporates several fundamental array concepts.
a, The NumPy array data structure and its associated metadata fields.
b, Indexing an array with slices and steps. These operations return a ‘view’ of
the original data. c, Indexing an array with masks, scalar coordinates or other
arrays, so that it returns a ‘copy’ of the original data. In the bottom example, an
array is indexed with other arrays; this broadcasts the indexing arguments
before performing the lookup. d, Vectorization efficiently applies operations
to groups of elements. e, Broadcasting in the multiplication of two-dimensional
arrays. f, Reduction operations act along one or more axes. In this example,
an array is summed along select axes to produce a vector, or along two axes
consecutively to produce a scalar. g, Example NumPy code, illustrating some of
these concepts.
Scientific Python ecosystem
Python is an open-source, general-purpose interpreted programming
language well suited to standard programming tasks such as cleaning
data, interacting with web resources and parsing text. Adding fast array
operations and linear algebra enables scientists to do all their work
within a single programming language, one that has the advantage of
being famously easy to learn and teach, as witnessed by its adoption
as a primary learning language in many universities.
Even though NumPy is not part of Python’s standard library, it benefits
from a good relationship with the Python developers. Over the
years, the Python language has added new features and special syntax
so that NumPy would have a more succinct and easier-to-read array
notation. However, because it is not part of the standard library, NumPy
is able to dictate its own release policies and development patterns.
SciPy and Matplotlib are tightly coupled with NumPy in terms of history,
development and use. SciPy provides fundamental algorithms for
scientific computing, including mathematical, scientific and engineering
routines. Matplotlib generates publication-ready figures and visualizations.
The combination of NumPy, SciPy and Matplotlib, together
with an advanced interactive environment such as IPython20 or Jupyter21,
provides a solid foundation for array programming in Python. The
scientific Python ecosystem (Fig. 2) builds on top of this foundation to
provide several widely used technique-specific libraries15,16,22 that in
turn underlie numerous domain-specific projects23–28. NumPy, at the
base of the ecosystem of array-aware libraries, sets documentation
standards, provides array testing infrastructure and adds build support
for Fortran and other compilers.
Many research groups have designed large, complex scientific libraries
that add application-specific functionality to the ecosystem. For
example, the eht-imaging library29, developed by the Event Horizon
Telescope collaboration for radio interferometry imaging, analysis
and simulation, relies on many lower-level components of the scientific
Python ecosystem. In particular, the EHT collaboration used this library
for the first imaging of a black hole. Within eht-imaging, NumPy arrays
are used to store and manipulate numerical data at every step in the
processing chain: from raw data through calibration and image reconstruction.
SciPy supplies tools for general image-processing tasks such
as filtering and image alignment, and scikit-image, an image-processing
library that extends SciPy, provides higher-level functionality such
as edge filters and Hough transforms. The ‘scipy.optimize’ module
performs mathematical optimization. NetworkX22, a package for complex
network analysis, is used to verify image comparison consistency.
Astropy23,24 handles standard astronomical file formats and computes
time–coordinate transformations. Matplotlib is used to visualize data
and to generate the final image of the black hole.
The interactive environment created by the array programming foundation
and the surrounding ecosystem of tools (inside of IPython or
Jupyter) is ideally suited to exploratory data analysis. Users can fluidly
inspect, manipulate and visualize their data, and rapidly iterate to refine
programming statements. These statements are then stitched together
into imperative or functional programs, or notebooks containing both
computation and narrative. Scientific computing beyond exploratory
work is often done in a text editor or an integrated development environment
(IDE) such as Spyder. This rich and productive environment
has made Python popular for scientific research.
To complement this facility for exploratory work and rapid prototyping,
NumPy has developed a culture of using time-tested software
engineering practices to improve collaboration and reduce error30. This
culture is not only adopted by leaders in the project but also enthusiastically
taught to newcomers. The NumPy team was early to adopt
distributed revision control and code review to improve collaboration
on code, and continuous testing that runs an extensive battery of automated
tests for every proposed change to NumPy. The project also
has comprehensive, high-quality documentation, integrated with the
source code31–33.
[Figure 2. Layered diagram of the ecosystem.
Foundation: Python (language); NumPy (arrays); IPython / Jupyter (interactive environments).
Technique-specific: SciPy (algorithms); Matplotlib (plots); scikit-learn (machine learning); NetworkX (network analysis); pandas, statsmodels (statistics); scikit-image (image processing).
Domain-specific: cantera (chemistry); Biopython (biology); Astropy (astronomy); simpeg (geophysics); NLTK (linguistics); QuantEcon (economics).
Application-specific: PsychoPy, khmer, Qiime2, FiPy, deepchem, librosa, PyWavelets, SunPy, QuTiP, yt, nibabel, yellowbrick, mne-python, scikit-HEP, eht-imaging, MDAnalysis, iris, cesium, PyChrono.
Also labelled: NumPy API, array protocols, new array implementations.]
Fig. 2 | NumPy is the base of the scientific Python ecosystem. Essential libraries and projects that depend on NumPy’s API gain access to new array
implementations that support NumPy’s array protocols (Fig. 3).
This culture of using best practices for producing reliable scientific
software has been adopted by the ecosystem of libraries that build on
NumPy. For example, in a recent award given by the Royal Astronomical
Society to Astropy, they state: “The Astropy Project has provided
hundreds of junior scientists with experience in professional-standard
software development practices including use of version control, unit
testing, code review and issue tracking procedures. This is a vital skill
set for modern researchers that is often missing from formal university
education in physics or astronomy”34. Community members explicitly
work to address this lack of formal education through courses and
workshops35–37.
The recent rapid growth of data science, machine learning and artificial
intelligence has further and dramatically boosted the scientific
use of Python. Examples of its important applications, such as the
eht-imaging library, now exist in almost every discipline in the natural
and social sciences. These tools have become the primary software
environment in many fields. NumPy and its ecosystem are commonly
taught in university courses, boot camps and summer schools, and
are the focus of community conferences and workshops worldwide.
NumPy and its API have become truly ubiquitous.
Array proliferation and interoperability
NumPy provides in-memory, multidimensional, homogeneously typed
(that is, single-pointer and strided) arrays on CPUs. It runs on machines
ranging from embedded devices to the world’s largest supercomputers,
with performance approaching that of compiled languages. For most
of its existence, NumPy addressed the vast majority of array computation
use cases.
However, scientific datasets now routinely exceed the memory capacity
of a single machine and may be stored on multiple machines or in
the cloud. In addition, the recent need to accelerate deep-learning and
artificial intelligence applications has led to the emergence of specialized
accelerator hardware, including graphics processing units (GPUs),
tensor processing units (TPUs) and field-programmable gate arrays
(FPGAs). Owing to its in-memory data model, NumPy is currently unable
to directly utilize such storage and specialized hardware. However,
both distributed data and the parallel execution of GPUs, TPUs
and FPGAs map well to the paradigm of array programming. This
leaves a gap between available modern hardware architectures and
the tools necessary to leverage their computational power.
The community’s efforts to fill this gap led to a proliferation of new
array implementations. For example, each deep-learning framework
created its own arrays; the PyTorch38, TensorFlow39, Apache MXNet40
and JAX arrays all have the capability to run on CPUs and GPUs in a
distributed fashion, using lazy evaluation to allow for additional performance
optimizations. SciPy and PyData/Sparse both provide sparse
arrays, which typically contain few non-zero values and store only those
in memory for efficiency. In addition, there are projects that build on
NumPy arrays as data containers, and extend its capabilities. Distributed
arrays are made possible that way by Dask, and labelled arrays
(referring to dimensions of an array by name rather than by index for
clarity, compare x[:, 1] versus x.loc[:, 'time']) by xarray41.
Such libraries often mimic the NumPy API, because this lowers the
barrier to entry for newcomers and provides the wider community with
a stable array programming interface. This, in turn, prevents disruptive
schisms such as the divergence between Numeric and Numarray. But
exploring new ways of working with arrays is experimental by nature
and, in fact, several promising libraries (such as Theano and Caffe) have
already ceased development. And each time that a user decides to try a
new technology, they must change import statements and ensure that the
new library implements all the parts of the NumPy API they currently use.
Ideally, operating on specialized arrays using NumPy functions or
semantics would simply work, so that users could write code once,
and would then benefit from switching between NumPy arrays, GPU
arrays, distributed arrays and so forth as appropriate. To support array
operations between external array objects, NumPy therefore added
the capability to act as a central coordination mechanism with a well
specified API (Fig. 2).
To facilitate this interoperability, NumPy provides ‘protocols’ (or
contracts of operation), that allow for specialized arrays to be passed to
NumPy functions (Fig. 3). NumPy, in turn, dispatches operations to the
originating library, as required. Over four hundred of the most popular
NumPy functions are supported. The protocols are implemented by
widely used libraries such as Dask, CuPy, xarray and PyData/Sparse.
Thanks to these developments, users can now, for example, scale their
computation from a single machine to distributed systems using Dask.
The protocols also compose well, allowing users to redeploy NumPy
code at scale on distributed, multi-GPU systems via, for instance, CuPy
arrays embedded in Dask arrays. Using NumPy’s high-level API, users
can leverage highly parallel code execution on multiple systems with
millions of cores, all with minimal code changes42.
These array protocols are now a key feature of NumPy, and are
expected to only increase in importance. The NumPy developers—
many of whom are authors of this Review—iteratively refine and add
protocol designs to improve utility and simplify adoption.
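The dispatch mechanism can be illustrated with a toy array type. The text does not name the protocols, so this sketch assumes the `__array_function__` protocol, which NumPy enables by default from version 1.17 onwards. The class LoggedArray is hypothetical and exists only for this example; it is far simpler than what Dask or CuPy implement.

```python
import numpy as np


class LoggedArray:
    """Hypothetical array wrapper that records NumPy API calls made on it."""

    calls = []  # names of the NumPy functions dispatched to this class

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this method instead of running its own implementation
        # whenever a LoggedArray is passed to a supported NumPy function.
        LoggedArray.calls.append(func.__name__)
        # Unwrap top-level arguments, compute with plain NumPy, re-wrap.
        # (Assumption: no LoggedArray is nested inside lists or tuples.)
        unwrapped = [a.data if isinstance(a, LoggedArray) else a for a in args]
        result = func(*unwrapped, **kwargs)
        return LoggedArray(result) if isinstance(result, np.ndarray) else result


x = LoggedArray(np.arange(12).reshape(4, 3))
m = np.mean(x, axis=0)       # dispatched to LoggedArray.__array_function__
r = np.reshape(x, (3, 4))    # likewise

print(type(m).__name__, m.data)
print(type(r).__name__, r.data.shape)
print(LoggedArray.calls)
```

As in Fig. 3, the result of calling a NumPy function on the specialized array is again an instance of that array type. A complete implementation would also inspect the `types` argument, return NotImplemented for unsupported functions, and define arithmetic operators, none of which this sketch attempts.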
[Figure 3. Diagram labels: input arrays; output arrays; NumPy API (np.stack, np.reshape, np.transpose, np.argmin, np.mean, np.std, np.max, np.cos, np.arctan, np.log, np.cumsum, np.diff, ...); NumPy array protocols; array implementation (NumPy, Dask, CuPy, PyData/Sparse, ...). Example code from the figure:]
In [1]: import numpy as np
In [2]: import dask.array as da
In [3]: x = da.arange(12)
In [4]: x = np.reshape(x, (4, 3))
In [5]: x
Out[5]: dask.array<..., shape=(4, 3), ...>
In [6]: np.mean(x, axis=0)
Out[6]: dask.array<..., shape=(3,), ...>
In [7]: x = x - np.mean(x, axis=0)
In [8]: x
Out[8]: dask.array<..., shape=(4, 3), ...>
Fig. 3 | NumPy’s API and array protocols expose new arrays to the
ecosystem. In this example, NumPy’s ‘mean’ function is called on a Dask array.
The call succeeds by dispatching to the appropriate library implementation (in
this case, Dask) and results in a new Dask array. Compare this code to the
example code in Fig. 1g.
Discussion
NumPy combines the expressive power of array programming, the
performance of C, and the readability, usability and versatility of Python
in a mature, well tested, well documented and community-developed
library. Libraries in the scientific Python ecosystem provide fast implementations
of most important algorithms. Where extreme optimization
is warranted, compiled languages can be used, such as Cython43,
Numba44 and Pythran45; these languages extend Python and transparently
accelerate bottlenecks. Owing to NumPy’s simple memory
model, it is easy to write low-level, hand-optimized code, usually in C
or Fortran, to manipulate NumPy arrays and pass them back to Python.
Furthermore, using array protocols, it is possible to utilize the full
spectrum of specialized hardware acceleration with minimal changes
to existing code.
NumPy was initially developed by students, faculty and researchers
to provide an advanced, open-source array programming library for
Python, which was free to use and unencumbered by license servers and
software protection dongles. There was a sense of building something
consequential together for the benefit of many others. Participating
in such an endeavour, within a welcoming community of like-minded
individuals, held a powerful attraction for many early contributors.
These user–developers frequently had to write code from scratch
to solve their own or their colleagues’ problems, often in low-level
languages that preceded Python, such as Fortran46 and C. To them,
the advantages of an interactive, high-level array library were evident.
The design of this new tool was informed by other powerful interactive
programming languages for scientific computing such as Basis47–50,
Yorick51, R52 and APL53, as well as commercial languages and environments
such as IDL (Interactive Data Language) and MATLAB.
What began as an attempt to add an array object to Python became
the foundation of a vibrant ecosystem of tools. Now, a large amount of
scientific work depends on NumPy being correct, fast and stable. It is
no longer a small community project, but core scientific infrastructure.
The developer culture has matured: although initial development was
highly informal, NumPy now has a roadmap and a process for proposing
and discussing large changes. The project has formal governance
structures and is fiscally sponsored by NumFOCUS, a nonprofit that
promotes open practices in research, data and scientific computing.
Over the past few years, the project attracted its first funded development,
sponsored by the Moore and Sloan Foundations, and received
an award as part of the Chan Zuckerberg Initiative’s Essentials of Open
Source Software programme. With this funding, the project was (and
is) able to have sustained focus over multiple months to implement
substantial new features and improvements. That said, the development
of NumPy still depends heavily on contributions made by graduate
students and researchers in their free time (see Supplementary
Methods for more details).
NumPy is no longer merely the foundational array library underlying
the scientific Python ecosystem, but it has become the standard API for
tensor computation and a central coordinating mechanism between
array types and technologies in Python. Work continues to expand on
and improve these interoperability features.
Over the next decade, NumPy developers will face several challenges.
New devices will be developed, and existing specialized hardware will
evolve to meet diminishing returns on Moore’s law. There will be more,
and a wider variety of, data science practitioners, a large proportion of
whom will use NumPy. The scale of scientific data gathering will continue
to increase, with the adoption of devices and instruments such
as light-sheet microscopes and the Large Synoptic Survey Telescope
(LSST)54. New generation languages, interpreters and compilers, such as
Rust55, Julia56 and LLVM57, will create new concepts and data structures,
and determine their viability.
Through the mechanisms described in this Review, NumPy is poised
to embrace such a changing landscape, and to continue playing a
leading part in interactive scientific computation, although to do so
will require sustained funding from government, academia and industry.
But, importantly, for NumPy to meet the needs of the next decade
of data science, it will also need a new generation of graduate students
and community contributors to drive it forward.
1.	 Abbott, B. P. et al. Observation of gravitational waves from a binary black hole merger.
Phys. Rev. Lett. 116, 061102 (2016).
2.	 Chael, A. et al. High-resolution linear polarimetric imaging for the Event Horizon
Telescope. Astrophys. J. 286, 11 (2016).
3.	 Dubois, P. F., Hinsen, K. & Hugunin, J. Numerical Python. Comput. Phys. 10, 262–267 (1996).
4.	 Ascher, D., Dubois, P. F., Hinsen, K., Hugunin, J. & Oliphant, T. E. An Open Source Project:
Numerical Python (Lawrence Livermore National Laboratory, 2001).
5.	 Yang, T.-Y., Furnish, G. & Dubois, P. F. Steering object-oriented scientific computations. In
Proc. TOOLS USA 97. Intl Conf. Technology of Object Oriented Systems and Languages
(eds Ege, R., Singh, M. & Meyer, B.) 112–119 (IEEE, 1997).
6.	 Greenfield, P., Miller, J. T., Hsu, J. & White, R. L. numarray: a new scientific array package
for Python. In PyCon DC 2003 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.112.9899 (2003).
7.	 Oliphant, T. E. Guide to NumPy 1st edn (Trelgol Publishing, 2006).
8.	 Dubois, P. F. Python: batteries included. Comput. Sci. Eng. 9, 7–9 (2007).
9.	 Oliphant, T. E. Python for scientific computing. Comput. Sci. Eng. 9, 10–20 (2007).
10.	 Millman, K. J. & Aivazis, M. Python for scientists and engineers. Comput. Sci. Eng. 13, 9–12
(2011).
11.	 Pérez, F., Granger, B. E. & Hunter, J. D. Python: an ecosystem for scientific computing.
Comput. Sci. Eng. 13, 13–21 (2011).
Explains why the scientific Python ecosystem is a highly productive environment for
research.
12.	 Virtanen, P. et al. SciPy 1.0—fundamental algorithms for scientific computing in Python.
Nat. Methods 17, 261–272 (2020); correction 17, 352 (2020).
Introduces the SciPy library and includes a more detailed history of NumPy and SciPy.
13.	 Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
14.	 McKinney, W. Data structures for statistical computing in Python. In Proc. 9th Python in
Science Conf. (eds van der Walt, S. & Millman, K. J.) 56–61 (2010).
15.	 Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12,
2825–2830 (2011).
16.	 van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
17.	 van der Walt, S., Colbert, S. C. & Varoquaux, G. The NumPy array: a structure for efficient
numerical computation. Comput. Sci. Eng. 13, 22–30 (2011).
Discusses the NumPy array data structure with a focus on how it enables efficient
computation.
18.	 Wang, Q., Zhang, X., Zhang, Y. & Yi, Q. AUGEM: automatically generate high performance
dense linear algebra kernels on x86 CPUs. In SC’13: Proc. Intl Conf. High Performance
Computing, Networking, Storage and Analysis 25 (IEEE, 2013).
19.	 Xianyi, Z., Qian, W. & Yunquan, Z. Model-driven level 3 BLAS performance optimization
on Loongson 3A processor. In 2012 IEEE 18th Intl Conf. Parallel and Distributed Systems
684–691 (IEEE, 2012).
20.	 Pérez, F. & Granger, B. E. IPython: a system for interactive scientific computing. Comput.
Sci. Eng. 9, 21–29 (2007).
21.	 Kluyver, T. et al. Jupyter Notebooks—a publishing format for reproducible computational
workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas
(eds Loizides, F. & Schmidt, B.) 87–90 (IOS Press, 2016).
22.	 Hagberg, A. A., Schult, D. A. & Swart, P. J. Exploring network structure, dynamics, and
function using NetworkX. In Proc. 7th Python in Science Conf. (eds Varoquaux, G.,
Vaught, T. & Millman, K. J.) 11–15 (2008).
23.	 Astropy Collaboration et al. Astropy: a community Python package for astronomy. Astron.
Astrophys. 558, A33 (2013).
24.	 Price-Whelan, A. M. et al. The Astropy Project: building an open-science project and
status of the v2.0 core package. Astron. J. 156, 123 (2018).
25.	 Cock, P. J. et al. Biopython: freely available Python tools for computational molecular
biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).
26.	 Millman, K. J. & Brett, M. Analysis of functional magnetic resonance imaging in Python.
Comput. Sci. Eng. 9, 52–55 (2007).
27.	 The SunPy Community et al. SunPy—Python for solar physics. Comput. Sci. Discov. 8,
014009 (2015).
28.	 Hamman, J., Rocklin, M. & Abernathy, R. Pangeo: a big-data ecosystem for scalable Earth
system science. In EGU General Assembly Conf. Abstracts 12146 (2018).
29.	 Chael, A. A. et al. ehtim: imaging, analysis, and simulation software for radio
interferometry. Astrophysics Source Code Library https://ascl.net/1904.004 (2019).
30.	 Millman, K. J. & Pérez, F. Developing open source scientific practice. In Implementing
Reproducible Research (eds Stodden, V., Leisch, F. & Peng, R. D.) 149–183 (CRC Press, 2014).
Describes the software engineering practices embraced by the NumPy and SciPy
communities with a focus on how these practices improve research.
31.	 van der Walt, S. The SciPy Documentation Project (technical overview). In Proc. 7th Python
in Science Conf. (SciPy 2008) (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 27–28 (2008).
32.	 Harrington, J. The SciPy Documentation Project. In Proc. 7th Python in Science
Conference (SciPy 2008) (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 33–35 (2008).
33.	 Harrington, J. & Goldsmith, D. Progress report: NumPy and SciPy documentation in 2009.
In Proc. 8th Python in Science Conf. (SciPy 2009) (eds Varoquaux, G., van der Walt, S. &
Millman, K. J.) 84–87 (2009).
34.	 Royal Astronomical Society Report of the RAS ‘A’ Awards Committee 2020: Astropy
Project: 2020 Group Achievement Award (A)
https://ras.ac.uk/sites/default/files/2020-01/Group%20Award%20-%20Astropy.pdf (2020).
35.	 Wilson, G. Software carpentry: getting scientists to write better code by making them
more productive. Comput. Sci. Eng. 8, 66–69 (2006).
36.	 Hannay, J. E. et al. How do scientists develop and use scientific software? In Proc. 2009
ICSE Workshop on Software Engineering for Computational Science and Engineering 1–8
(IEEE, 2009).
37.	 Millman, K. J., Brett, M., Barnowski, R. & Poline, J.-B. Teaching computational
reproducibility for neuroimaging. Front. Neurosci. 12, 727 (2018).
38.	 Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In
Advances in Neural Information Processing Systems 32 (eds Wallach, H. et al.) 8024–8035
(Neural Information Processing Systems, 2019).
39.	 Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In OSDI’16: Proc.
12th USENIX Conf. Operating Systems Design and Implementation (chairs Keeton, K. &
Roscoe, T.) 265–283 (USENIX Association, 2016).
40.	 Chen, T. et al. MXNet: a flexible and efficient machine learning library for heterogeneous
distributed systems. Preprint at http://www.arxiv.org/abs/1512.01274 (2015).
41.	 Hoyer, S. & Hamman, J. xarray: N–D labeled arrays and datasets in Python. J. Open Res.
Softw. 5, 10 (2017).
42.	 Entschev, P. Distributed multi-GPU computing with Dask, CuPy and RAPIDS. In EuroPython
2019 https://ep2019.europython.eu/media/conference/slides/fX8dJsD-distributed-multi-gpu-computing-with-dask-cupy-and-rapids.pdf (2019).
43.	 Behnel, S. et al. Cython: the best of both worlds. Comput. Sci. Eng. 13, 31–39 (2011).
44.	 Lam, S. K., Pitrou, A. & Seibert, S. Numba: a LLVM-based Python JIT compiler. In Proc.
Second Workshop on the LLVM Compiler Infrastructure in HPC, LLVM ’15 7:1–7:6 (ACM, 2015).
45.	 Guelton, S. et al. Pythran: enabling static optimization of scientific Python programs.
Comput. Sci. Discov. 8, 014001 (2015).
46.	 Dongarra, J., Golub, G. H., Grosse, E., Moler, C. & Moore, K. Netlib and NA-Net: building a
scientific computing community. IEEE Ann. Hist. Comput. 30, 30–41 (2008).
47.	 Barrett, K. A., Chiu, Y. H., Painter, J. F., Motteler, Z. C. & Dubois, P. F. Basis System, Part I:
Running a Basis Program—A Tutorial for Beginners UCRL-MA-118543, Vol. 1 (Lawrence
Livermore National Laboratory, 1995).
48.	 Dubois, P. F. & Motteler, Z. Basis System, Part II: Basis Language Reference Manual
UCRL-MA-118543, Vol. 2 (Lawrence Livermore National Laboratory, 1995).
49.	 Chiu, Y. H. & Dubois, P. F. Basis System, Part III: EZN User Manual UCRL-MA-118543, Vol. 3
(Lawrence Livermore National Laboratory, 1995).
50.	 Chiu, Y. H. & Dubois, P. F. Basis System, Part IV: EZD User Manual UCRL-MA-118543, Vol. 4
(Lawrence Livermore National Laboratory, 1995).
51.	 Munro, D. H. & Dubois, P. F. Using the Yorick interpreted language. Comput. Phys. 9,
609–615 (1995).
52.	 Ihaka, R. & Gentleman, R. R: a language for data analysis and graphics. J. Comput. Graph.
Stat. 5, 299–314 (1996).
53.	 Iverson, K. E. A programming language. In Proc. 1962 Spring Joint Computer Conf.
345–351 (1962).
54.	 Jenness, T. et al. LSST data management software development practices and tools. In
Proc. SPIE 10707, Software and Cyberinfrastructure for Astronomy V 1070709 (SPIE and
International Society for Optics and Photonics, 2018).
55.	 Matsakis, N. D. & Klock, F. S. The Rust language. Ada Letters 34, 103–104 (2014).
56.	 Bezanson, J., Edelman, A., Karpinski, S. & Shah, V. B. Julia: a fresh approach to numerical
computing. SIAM Rev. 59, 65–98 (2017).
57.	 Lattner, C. & Adve, V. LLVM: a compilation framework for lifelong program analysis and
transformation. In Proc. 2004 Intl Symp. Code Generation and Optimization (CGO’04)
75–88 (IEEE, 2004).
Acknowledgements We thank R. Barnowski, P. Dubois, M. Eickenberg, and P. Greenfield, who
suggested text and provided helpful feedback on the manuscript. K.J.M. and S.J.v.d.W. were
funded in part by the Gordon and Betty Moore Foundation through grant GBMF3834 and by
the Alfred P. Sloan Foundation through grant 2013-10-27 to the University of California,
Berkeley. S.J.v.d.W., S.B., M.P. and W.W. were funded in part by the Gordon and Betty Moore
Foundation through grant GBMF5447 and by the Alfred P. Sloan Foundation through grant
G-2017-9960 to the University of California, Berkeley.
Author contributions K.J.M. and S.J.v.d.W. composed the manuscript with input from
others. S.B., R.G., K.S., W.W., M.B. and T.R. contributed text. All authors contributed
substantial code, documentation and/or expertise to the NumPy project. All authors
reviewed the manuscript.
Competing interests The authors declare no competing interests.
Additional information
Supplementary information is available for this paper at https://doi.org/10.1038/s41586-020-2649-2.
Correspondence and requests for materials should be addressed to K.J.M., S.J.v.d.W. or R.G.
Peer review information Nature thanks Edouard Duchesnay, Alan Edelman and the other,
anonymous, reviewer(s) for their contribution to the peer review of this work.
Reprints and permissions information is available at http://www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution
4.0 International License, which permits use, sharing, adaptation, distribution
and reproduction in any medium or format, as long as you give appropriate
credit to the original author(s) and the source, provide a link to the Creative Commons license,
and indicate if changes were made. The images or other third party material in this article are
included in the article’s Creative Commons license, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons license and your
intended use is not permitted by statutory regulation or exceeds the permitted use, you will
need to obtain permission directly from the copyright holder. To view a copy of this license,
visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2020

DevOps for AI: running LLMs in production with Kubernetes and KubeFlow
Aarno Aukia
 
COBOL Programming with VSCode - IBM Certificate
COBOL Programming with VSCode - IBM CertificateCOBOL Programming with VSCode - IBM Certificate
COBOL Programming with VSCode - IBM Certificate
VICTOR MAESTRE RAMIREZ
 
Software Engineering Process, Notation & Tools Introduction - Part 4
Software Engineering Process, Notation & Tools Introduction - Part 4Software Engineering Process, Notation & Tools Introduction - Part 4
Software Engineering Process, Notation & Tools Introduction - Part 4
Gaurav Sharma
 
Key AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence CompaniesKey AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence Companies
Mypcot Infotech
 
Async-ronizing Success at Wix - Patterns for Seamless Microservices - Devoxx ...
Async-ronizing Success at Wix - Patterns for Seamless Microservices - Devoxx ...Async-ronizing Success at Wix - Patterns for Seamless Microservices - Devoxx ...
Async-ronizing Success at Wix - Patterns for Seamless Microservices - Devoxx ...
Natan Silnitsky
 
Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...
Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...
Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...
Alluxio, Inc.
 
IMAGE CLASSIFICATION USING CONVOLUTIONAL NEURAL NETWORK.P.pptx
IMAGE CLASSIFICATION USING CONVOLUTIONAL NEURAL NETWORK.P.pptxIMAGE CLASSIFICATION USING CONVOLUTIONAL NEURAL NETWORK.P.pptx
IMAGE CLASSIFICATION USING CONVOLUTIONAL NEURAL NETWORK.P.pptx
usmanch7829
 
14 Years of Developing nCine - An Open Source 2D Game Framework
14 Years of Developing nCine - An Open Source 2D Game Framework14 Years of Developing nCine - An Open Source 2D Game Framework
14 Years of Developing nCine - An Open Source 2D Game Framework
Angelo Theodorou
 
Bonk coin airdrop_ Everything You Need to Know.pdf
Bonk coin airdrop_ Everything You Need to Know.pdfBonk coin airdrop_ Everything You Need to Know.pdf
Bonk coin airdrop_ Everything You Need to Know.pdf
Herond Labs
 
Best Inbound Call Tracking Software for Small Businesses
Best Inbound Call Tracking Software for Small BusinessesBest Inbound Call Tracking Software for Small Businesses
Best Inbound Call Tracking Software for Small Businesses
TheTelephony
 
Porting Qt 5 QML Modules to Qt 6 Webinar
Porting Qt 5 QML Modules to Qt 6 WebinarPorting Qt 5 QML Modules to Qt 6 Webinar
Porting Qt 5 QML Modules to Qt 6 Webinar
ICS
 
Code and No-Code Journeys: The Coverage Overlook
Code and No-Code Journeys: The Coverage OverlookCode and No-Code Journeys: The Coverage Overlook
Code and No-Code Journeys: The Coverage Overlook
Applitools
 
IBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - IntroductionIBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - Introduction
Gaurav Sharma
 
Who will create the languages of the future?
Who will create the languages of the future?Who will create the languages of the future?
Who will create the languages of the future?
Jordi Cabot
 
Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3
Gaurav Sharma
 
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
Build Smarter, Deliver Faster with Choreo - An AI Native Internal Developer P...
WSO2
 
How the US Navy Approaches DevSecOps with Raise 2.0
How the US Navy Approaches DevSecOps with Raise 2.0How the US Navy Approaches DevSecOps with Raise 2.0
How the US Navy Approaches DevSecOps with Raise 2.0
Anchore
 
Top 11 Fleet Management Software Providers in 2025 (2).pdf
Top 11 Fleet Management Software Providers in 2025 (2).pdfTop 11 Fleet Management Software Providers in 2025 (2).pdf
Top 11 Fleet Management Software Providers in 2025 (2).pdf
Trackobit
 
Agentic Techniques in Retrieval-Augmented Generation with Azure AI Search
Agentic Techniques in Retrieval-Augmented Generation with Azure AI SearchAgentic Techniques in Retrieval-Augmented Generation with Azure AI Search
Agentic Techniques in Retrieval-Augmented Generation with Azure AI Search
Maxim Salnikov
 
Providing Better Biodiversity Through Better Data
Providing Better Biodiversity Through Better DataProviding Better Biodiversity Through Better Data
Providing Better Biodiversity Through Better Data
Safe Software
 
DevOps for AI: running LLMs in production with Kubernetes and KubeFlow
DevOps for AI: running LLMs in production with Kubernetes and KubeFlowDevOps for AI: running LLMs in production with Kubernetes and KubeFlow
DevOps for AI: running LLMs in production with Kubernetes and KubeFlow
Aarno Aukia
 
COBOL Programming with VSCode - IBM Certificate
COBOL Programming with VSCode - IBM CertificateCOBOL Programming with VSCode - IBM Certificate
COBOL Programming with VSCode - IBM Certificate
VICTOR MAESTRE RAMIREZ
 
Software Engineering Process, Notation & Tools Introduction - Part 4
Software Engineering Process, Notation & Tools Introduction - Part 4Software Engineering Process, Notation & Tools Introduction - Part 4
Software Engineering Process, Notation & Tools Introduction - Part 4
Gaurav Sharma
 
Key AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence CompaniesKey AI Technologies Used by Indian Artificial Intelligence Companies
Key AI Technologies Used by Indian Artificial Intelligence Companies
Mypcot Infotech
 

Array programming with NumPy

  • 1. Nature | Vol 585 | 17 September 2020 | 357
Review
Array programming with NumPy
Charles R. Harris1, K. Jarrod Millman2,3,4 ✉, Stéfan J. van der Walt2,4,5 ✉, Ralf Gommers6 ✉, Pauli Virtanen7,8, David Cournapeau9, Eric Wieser10, Julian Taylor11, Sebastian Berg4, Nathaniel J. Smith12, Robert Kern13, Matti Picus4, Stephan Hoyer14, Marten H. van Kerkwijk15, Matthew Brett2,16, Allan Haldane17, Jaime Fernández del Río18, Mark Wiebe19,20, Pearu Peterson6,21,22, Pierre Gérard-Marchant23,24, Kevin Sheppard25, Tyler Reddy26, Warren Weckesser4, Hameer Abbasi6, Christoph Gohlke27 & Travis E. Oliphant6
Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves [1] and in the first imaging of a black hole [2]. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.
Two Python array packages existed before NumPy. The Numeric package was developed in the mid-1990s and provided array objects and array-aware functions in Python. It was written in C and linked to standard fast implementations of linear algebra [3,4]. One of its earliest uses was to steer C++ applications for inertial confinement fusion research at Lawrence Livermore National Laboratory [5]. To handle large astronomical images coming from the Hubble Space Telescope, a reimplementation of Numeric, called Numarray, added support for structured arrays, flexible indexing, memory mapping, byte-order variants, more efficient memory use, flexible IEEE 754-standard error-handling capabilities, and better type-casting rules [6]. Although Numarray was highly compatible with Numeric, the two packages had enough differences that it divided the community; however, in 2005 NumPy emerged as a ‘best of both worlds’ unification [7], combining the features of Numarray with the small-array performance of Numeric and its rich C API.
Now, 15 years later, NumPy underpins almost every Python library that does scientific or numerical computation [8–11], including SciPy [12], Matplotlib [13], pandas [14], scikit-learn [15] and scikit-image [16]. NumPy is a community-developed, open-source library, which provides a multidimensional Python array object along with array-aware functions that operate on it. Because of its inherent simplicity, the NumPy array is the de facto exchange format for array data in Python.
NumPy operates on in-memory arrays using the central processing unit (CPU). To utilize modern, specialized storage and hardware, there has been a recent proliferation of Python array packages. Unlike with the Numarray–Numeric divide, it is now much harder for these new libraries to fracture the user community, given how much work is already built on top of NumPy. However, to provide the community with access to new and exploratory technologies, NumPy is transitioning into a central coordinating mechanism that specifies a well defined array programming API and dispatches it, as appropriate, to specialized array implementations.
NumPy arrays
The NumPy array is a data structure that efficiently stores and accesses multidimensional arrays [17] (also known as tensors), and enables a wide variety of scientific computation. It consists of a pointer to memory, along with metadata used to interpret the data stored there, notably ‘data type’, ‘shape’ and ‘strides’ (Fig. 1a).
https://doi.org/10.1038/s41586-020-2649-2
Received: 21 February 2020. Accepted: 17 June 2020. Published online: 16 September 2020. Open access.
1 Independent researcher, Logan, UT, USA. 2 Brain Imaging Center, University of California, Berkeley, Berkeley, CA, USA. 3 Division of Biostatistics, University of California, Berkeley, Berkeley, CA, USA. 4 Berkeley Institute for Data Science, University of California, Berkeley, Berkeley, CA, USA. 5 Applied Mathematics, Stellenbosch University, Stellenbosch, South Africa. 6 Quansight, Austin, TX, USA. 7 Department of Physics, University of Jyväskylä, Jyväskylä, Finland. 8 Nanoscience Center, University of Jyväskylä, Jyväskylä, Finland. 9 Mercari JP, Tokyo, Japan. 10 Department of Engineering, University of Cambridge, Cambridge, UK. 11 Independent researcher, Karlsruhe, Germany. 12 Independent researcher, Berkeley, CA, USA. 13 Enthought, Austin, TX, USA. 14 Google Research, Mountain View, CA, USA. 15 Department of Astronomy and Astrophysics, University of Toronto, Toronto, Ontario, Canada. 16 School of Psychology, University of Birmingham, Edgbaston, Birmingham, UK. 17 Department of Physics, Temple University, Philadelphia, PA, USA. 18 Google, Zurich, Switzerland. 19 Department of Physics and Astronomy, The University of British Columbia, Vancouver, British Columbia, Canada. 20 Amazon, Seattle, WA, USA. 21 Independent researcher, Saue, Estonia. 22 Department of Mechanics and Applied Mathematics, Institute of Cybernetics at Tallinn Technical University, Tallinn, Estonia. 23 Department of Biological and Agricultural Engineering, University of Georgia, Athens, GA, USA. 24 France-IX Services, Paris, France. 25 Department of Economics, University of Oxford, Oxford, UK. 26 CCS-7, Los Alamos National Laboratory, Los Alamos, NM, USA. 27 Laboratory for Fluorescence Dynamics, Biomedical Engineering Department, University of California, Irvine, Irvine, CA, USA.
✉ e-mail: [email protected]; [email protected]; [email protected]
  • 2. The data type describes the nature of elements stored in an array. An array has a single data type, and each element of an array occupies the same number of bytes in memory. Examples of data types include real and complex numbers (of lower and higher precision), strings, timestamps and pointers to Python objects.
The shape of an array determines the number of elements along each axis, and the number of axes is the dimensionality of the array. For example, a vector of numbers can be stored as a one-dimensional array of shape N, whereas colour videos are four-dimensional arrays of shape (T, M, N, 3).
Strides are necessary to interpret computer memory, which stores elements linearly, as multidimensional arrays. They describe the number of bytes to move forward in memory to jump from row to row, column to column, and so forth. Consider, for example, a two-dimensional array of floating-point numbers with shape (4, 3), where each element occupies 8 bytes in memory. To move between consecutive columns, we need to jump forward 8 bytes in memory, and to access the next row, 3 × 8 = 24 bytes. The strides of that array are therefore (24, 8). NumPy can store arrays in either C or Fortran memory order, iterating first over either rows or columns. This allows external libraries written in those languages to access NumPy array data in memory directly.
Users interact with NumPy arrays using ‘indexing’ (to access subarrays or individual elements), ‘operators’ (for example, +, − and × for vectorized operations and @ for matrix multiplication), as well as ‘array-aware functions’; together, these provide an easily readable, expressive, high-level API for array programming while NumPy deals with the underlying mechanics of making operations fast.
Indexing an array returns single elements, subarrays or elements that satisfy a specific condition (Fig. 1b). Arrays can even be indexed using other arrays (Fig. 1c). Wherever possible, indexing that retrieves a subarray returns a ‘view’ on the original array such that data are shared between the two arrays. This provides a powerful way to operate on subsets of array data while limiting memory usage.
To complement the array syntax, NumPy includes functions that perform vectorized calculations on arrays, including arithmetic, statistics and trigonometry (Fig. 1d). Vectorization (operating on entire arrays rather than their individual elements) is essential to array programming. This means that operations that would take many tens of lines to express in languages such as C can often be implemented as a single, clear Python expression. This results in concise code and frees users to focus on the details of their analysis, while NumPy handles looping over array elements near-optimally, for example, taking strides into consideration to best utilize the computer’s fast cache memory.
When performing a vectorized operation (such as addition) on two arrays with the same shape, it is clear what should happen. Through ‘broadcasting’ NumPy allows the dimensions to differ, and produces results that appeal to intuition. A trivial example is the addition of a scalar value to an array, but broadcasting also generalizes to more complex examples such as scaling each column of an array or generating a grid of coordinates. In broadcasting, one or both arrays are virtually duplicated (that is, without copying any data in memory), so that the shapes of the operands match (Fig. 1e). Broadcasting is also applied when an array is indexed using arrays of indices (Fig. 1c).
Other array-aware functions, such as sum, mean and maximum, perform element-by-element ‘reductions’, aggregating results across one, multiple or all axes of a single array. For example, summing an n-dimensional array over d axes results in an array of dimension n − d (Fig. 1f).
NumPy also includes array-aware functions for creating, reshaping, concatenating and padding arrays; searching, sorting and counting data; and reading and writing files.
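The strides, broadcasting and reduction rules described above can be checked directly in an interpreter. The following is a minimal sketch, assuming a standard NumPy installation; the variable names (`x`, `scale`, `grid`, `video`) and the scale factors are illustrative choices, not taken from the article. The dtype is set explicitly to 8-byte floats so that the strides match the (4, 3) example in the text.

```python
import numpy as np

# A (4, 3) array of 8-byte floats, as in the strides example in the text.
x = np.arange(12, dtype=np.float64).reshape(4, 3)

# C (row-major) order: 24 bytes to the next row, 8 bytes to the next column.
c_strides = x.strides                                      # (24, 8)

# Fortran (column-major) order stores each column contiguously instead.
f_strides = np.asfortranarray(x).strides                   # (8, 32)

# Broadcasting: scale each column by its own factor. The length-3 vector is
# virtually repeated along the rows; a stride of 0 shows nothing is copied.
scale = np.array([1.0, 10.0, 100.0])
scaled = x * scale                                         # shape (4, 3)
virtual_strides = np.broadcast_to(scale, x.shape).strides  # (0, 8)

# Broadcasting a (4, 1) column against a (3,) row yields a coordinate grid.
grid = np.arange(4).reshape(4, 1) * 10 + np.arange(3)      # shape (4, 3)

# Reductions: summing over d axes of an n-dimensional array leaves n - d axes.
col_sums = x.sum(axis=0)                                   # [18., 22., 26.]
row_sums = x.sum(axis=1)                                   # [ 3., 12., 21., 30.]
total = x.sum(axis=(0, 1))                                 # 66.0
video = np.zeros((2, 4, 5, 3))                             # (T, M, N, 3), n = 4
frame_means = video.mean(axis=(1, 2))                      # d = 2, shape (2, 3)
```

The zero stride reported for the broadcast operand is what the text means by virtual duplication: every row of the broadcast view points at the same three values in memory.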
NumPy provides extensive support for generating pseudorandom numbers, includes an assortment of probability distributions, and performs accelerated linear algebra, using one of several backends such as OpenBLAS [18,19] or Intel MKL optimized for the CPUs at hand (see Supplementary Methods for more details).
Altogether, the combination of a simple in-memory array representation, a syntax that closely mimics mathematics, and a variety of array-aware utility functions forms a productive and powerfully expressive array programming language.
[Fig. 1 image not reproduced. Panels: a, Data structure; b, Indexing (view); c, Indexing (copy); d, Vectorization; e, Broadcasting; f, Reduction; g, Example. Panel a annotates x with data type 8-byte integer, shape (4, 3) and strides (24, 8). Panel b notes that slices are start:end:step, any of which can be left blank. Panel g shows the following session.]
    In [1]: import numpy as np
    In [2]: x = np.arange(12)
    In [3]: x = x.reshape(4, 3)
    In [4]: x
    Out[4]:
    array([[ 0,  1,  2],
           [ 3,  4,  5],
           [ 6,  7,  8],
           [ 9, 10, 11]])
    In [5]: np.mean(x, axis=0)
    Out[5]: array([4.5, 5.5, 6.5])
    In [6]: x = x - np.mean(x, axis=0)
    In [7]: x
    Out[7]:
    array([[-4.5, -4.5, -4.5],
           [-1.5, -1.5, -1.5],
           [ 1.5,  1.5,  1.5],
           [ 4.5,  4.5,  4.5]])
Fig. 1 | The NumPy array incorporates several fundamental array concepts. a, The NumPy array data structure and its associated metadata fields. b, Indexing an array with slices and steps. These operations return a ‘view’ of the original data. c, Indexing an array with masks, scalar coordinates or other arrays, so that it returns a ‘copy’ of the original data. In the bottom example, an array is indexed with other arrays; this broadcasts the indexing arguments before performing the lookup. d, Vectorization efficiently applies operations to groups of elements. e, Broadcasting in the multiplication of two-dimensional arrays. f, Reduction operations act along one or more axes. In this example, an array is summed along select axes to produce a vector, or along two axes consecutively to produce a scalar. g, Example NumPy code, illustrating some of these concepts.
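The view-versus-copy distinction in Fig. 1b and 1c can be observed with `np.shares_memory`. This is a small sketch, assuming `x` is the same 4 × 3 array used in Fig. 1; the variable names are illustrative and do not come from the article.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

# Fig. 1b: slicing (with or without steps) returns views onto x's memory.
right_cols = x[:, 1:]
outer_cols = x[:, ::2]
views_share = np.shares_memory(x, right_cols) and np.shares_memory(x, outer_cols)

# Fig. 1c: masks and index arrays return copies.
large = x[x > 9]                  # array([10, 11])
pairs = x[[0, 1], [1, 2]]         # elements (0, 1) and (1, 2): array([1, 5])

# Index arrays are broadcast against each other before the lookup:
# a (2, 1) row index combined with a (2,) column index gives a (2, 2) result.
block = x[np.array([[1], [2]]), np.array([1, 0])]   # array([[4, 3], [7, 6]])
copies_share = np.shares_memory(x, large) or np.shares_memory(x, pairs)

# Writing through a view changes the original; the copies are unaffected.
right_cols[0, 0] = -1
```

After the final assignment, `x[0, 1]` holds -1 because `right_cols` shares memory with `x`, while `pairs[0]` still holds the value 1 that was copied earlier.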
  • 3. Scientific Python ecosystem
Python is an open-source, general-purpose interpreted programming language well suited to standard programming tasks such as cleaning data, interacting with web resources and parsing text. Adding fast array operations and linear algebra enables scientists to do all their work within a single programming language, one that has the advantage of being famously easy to learn and teach, as witnessed by its adoption as a primary learning language in many universities.
Even though NumPy is not part of Python’s standard library, it benefits from a good relationship with the Python developers. Over the years, the Python language has added new features and special syntax so that NumPy would have a more succinct and easier-to-read array notation. However, because it is not part of the standard library, NumPy is able to dictate its own release policies and development patterns.
SciPy and Matplotlib are tightly coupled with NumPy in terms of history, development and use. SciPy provides fundamental algorithms for scientific computing, including mathematical, scientific and engineering routines. Matplotlib generates publication-ready figures and visualizations. The combination of NumPy, SciPy and Matplotlib, together with an advanced interactive environment such as IPython [20] or Jupyter [21], provides a solid foundation for array programming in Python. The scientific Python ecosystem (Fig. 2) builds on top of this foundation to provide several widely used technique-specific libraries [15,16,22] that in turn underlie numerous domain-specific projects [23–28]. NumPy, at the base of the ecosystem of array-aware libraries, sets documentation standards, provides array testing infrastructure and adds build support for Fortran and other compilers.
Many research groups have designed large, complex scientific libraries that add application-specific functionality to the ecosystem. For example, the eht-imaging library [29], developed by the Event Horizon Telescope collaboration for radio interferometry imaging, analysis and simulation, relies on many lower-level components of the scientific Python ecosystem. In particular, the EHT collaboration used this library for the first imaging of a black hole. Within eht-imaging, NumPy arrays are used to store and manipulate numerical data at every step in the processing chain: from raw data through calibration and image reconstruction. SciPy supplies tools for general image-processing tasks such as filtering and image alignment, and scikit-image, an image-processing library that extends SciPy, provides higher-level functionality such as edge filters and Hough transforms. The ‘scipy.optimize’ module performs mathematical optimization. NetworkX [22], a package for complex network analysis, is used to verify image comparison consistency. Astropy [23,24] handles standard astronomical file formats and computes time–coordinate transformations. Matplotlib is used to visualize data and to generate the final image of the black hole.
The interactive environment created by the array programming foundation and the surrounding ecosystem of tools (inside of IPython or Jupyter) is ideally suited to exploratory data analysis. Users can fluidly inspect, manipulate and visualize their data, and rapidly iterate to refine programming statements. These statements are then stitched together into imperative or functional programs, or notebooks containing both computation and narrative. Scientific computing beyond exploratory work is often done in a text editor or an integrated development environment (IDE) such as Spyder. This rich and productive environment has made Python popular for scientific research.
To complement this facility for exploratory work and rapid prototyping, NumPy has developed a culture of using time-tested software engineering practices to improve collaboration and reduce error [30]. This culture is not only adopted by leaders in the project but also enthusiastically taught to newcomers. The NumPy team was early to adopt distributed revision control and code review to improve collaboration on code, and continuous testing that runs an extensive battery of automated tests for every proposed change to NumPy. The project also has comprehensive, high-quality documentation, integrated with the source code [31–33].
[Fig. 2 image not reproduced. The diagram stacks four layers of the ecosystem: Foundation, Technique-specific, Domain-specific and Application-specific.]
Fig. 2 | NumPy is the base of the scientific Python ecosystem. Essential libraries and projects that depend on NumPy’s API gain access to new array implementations that support NumPy’s array protocols (Fig. 3).
  • 4. This culture of using best practices for producing reliable scientific software has been adopted by the ecosystem of libraries that build on NumPy. For example, in a recent award given to Astropy, the Royal Astronomical Society states: “The Astropy Project has provided hundreds of junior scientists with experience in professional-standard software development practices including use of version control, unit testing, code review and issue tracking procedures. This is a vital skill set for modern researchers that is often missing from formal university education in physics or astronomy” [34]. Community members explicitly work to address this lack of formal education through courses and workshops [35–37].
The recent rapid growth of data science, machine learning and artificial intelligence has further and dramatically boosted the scientific use of Python. Examples of its important applications, such as the eht-imaging library, now exist in almost every discipline in the natural and social sciences. These tools have become the primary software environment in many fields. NumPy and its ecosystem are commonly taught in university courses, boot camps and summer schools, and are the focus of community conferences and workshops worldwide. NumPy and its API have become truly ubiquitous.
Array proliferation and interoperability
NumPy provides in-memory, multidimensional, homogeneously typed (that is, single-pointer and strided) arrays on CPUs. It runs on machines ranging from embedded devices to the world’s largest supercomputers, with performance approaching that of compiled languages. For most of its existence, NumPy addressed the vast majority of array computation use cases.
However, scientific datasets now routinely exceed the memory capacity of a single machine and may be stored on multiple machines or in the cloud. In addition, the recent need to accelerate deep-learning and artificial intelligence applications has led to the emergence of specialized accelerator hardware, including graphics processing units (GPUs), tensor processing units (TPUs) and field-programmable gate arrays (FPGAs). Owing to its in-memory data model, NumPy is currently unable to directly utilize such storage and specialized hardware. However, both distributed data and the parallel execution of GPUs, TPUs and FPGAs map well to the paradigm of array programming, which leaves a gap between available modern hardware architectures and the tools necessary to leverage their computational power.
The community’s efforts to fill this gap led to a proliferation of new array implementations. For example, each deep-learning framework created its own arrays; the PyTorch [38], TensorFlow [39], Apache MXNet [40] and JAX arrays all have the capability to run on CPUs and GPUs in a distributed fashion, using lazy evaluation to allow for additional performance optimizations. SciPy and PyData/Sparse both provide sparse arrays, which typically contain few non-zero values and store only those in memory for efficiency. In addition, there are projects that build on NumPy arrays as data containers, and extend its capabilities. Distributed arrays are made possible that way by Dask, and labelled arrays (referring to dimensions of an array by name rather than by index for clarity, compare x[:, 1] versus x.loc[:, 'time']) by xarray [41].
Such libraries often mimic the NumPy API, because this lowers the barrier to entry for newcomers and provides the wider community with a stable array programming interface. This, in turn, prevents disruptive schisms such as the divergence between Numeric and Numarray. But exploring new ways of working with arrays is experimental by nature and, in fact, several promising libraries (such as Theano and Caffe) have already ceased development. And each time that a user decides to try a new technology, they must change import statements and ensure that the new library implements all the parts of the NumPy API they currently use.
Ideally, operating on specialized arrays using NumPy functions or semantics would simply work, so that users could write code once, and would then benefit from switching between NumPy arrays, GPU arrays, distributed arrays and so forth as appropriate. To support array operations between external array objects, NumPy therefore added the capability to act as a central coordination mechanism with a well specified API (Fig. 2).
To facilitate this interoperability, NumPy provides ‘protocols’ (or contracts of operation) that allow specialized arrays to be passed to NumPy functions (Fig. 3). NumPy, in turn, dispatches operations to the originating library, as required. Over four hundred of the most popular NumPy functions are supported. The protocols are implemented by widely used libraries such as Dask, CuPy, xarray and PyData/Sparse. Thanks to these developments, users can now, for example, scale their computation from a single machine to distributed systems using Dask. The protocols also compose well, allowing users to redeploy NumPy code at scale on distributed, multi-GPU systems via, for instance, CuPy arrays embedded in Dask arrays. Using NumPy’s high-level API, users can leverage highly parallel code execution on multiple systems with millions of cores, all with minimal code changes [42].
These array protocols are now a key feature of NumPy, and are expected to only increase in importance. The NumPy developers, many of whom are authors of this Review, iteratively refine and add protocol designs to improve utility and simplify adoption.
[Fig. 3 image not reproduced. The diagram shows input arrays from several array implementations (NumPy, Dask, CuPy, PyData/Sparse and others) passing through the NumPy API (np.stack, np.reshape, np.transpose, np.argmin, np.mean, np.std, np.max, np.cos, np.arctan, np.log, np.cumsum, np.diff, ...) and the NumPy array protocols to produce output arrays. It is accompanied by the following session.]
    In [1]: import numpy as np
    In [2]: import dask.array as da
    In [3]: x = da.arange(12)
    In [4]: x = np.reshape(x, (4, 3))
    In [5]: x
    Out[5]: dask.array<..., shape=(4, 3), ...>
    In [6]: np.mean(x, axis=0)
    Out[6]: dask.array<..., shape=(3,), ...>
    In [7]: x = x - np.mean(x, axis=0)
    In [8]: x
    Out[8]: dask.array<..., shape=(4, 3), ...>
Fig. 3 | NumPy’s API and array protocols expose new arrays to the ecosystem. In this example, NumPy’s ‘mean’ function is called on a Dask array. The call succeeds by dispatching to the appropriate library implementation (in this case, Dask) and results in a new Dask array. Compare this code to the example code in Fig. 1g.
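The dispatch step in Fig. 3 can be illustrated without Dask by writing a toy array type that implements one of NumPy's protocols, `__array_function__`. This is a minimal sketch, assuming NumPy 1.17 or later, where this protocol is enabled by default. `TracedArray` and `dispatched` are hypothetical names invented for illustration, and the wrapper simply forwards to NumPy's own implementation rather than doing anything specialized.

```python
import numpy as np

dispatched = []  # names of NumPy functions routed through the protocol


class TracedArray:
    """Hypothetical stand-in for a specialized array type (not a real library)."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this hook instead of running its own implementation
        # whenever a TracedArray is passed to a supported NumPy function.
        dispatched.append(func.__name__)
        # Unwrap top-level TracedArray arguments, run the plain NumPy
        # implementation, and wrap the result again so the type is preserved.
        plain = tuple(a.data if isinstance(a, TracedArray) else a for a in args)
        return TracedArray(func(*plain, **kwargs))


x = TracedArray(np.arange(12))
x = np.reshape(x, (4, 3))          # dispatched to TracedArray
col_means = np.mean(x, axis=0)     # dispatched to TracedArray
```

As with the Dask session in Fig. 3, the results of `np.reshape` and `np.mean` are instances of the specialized type rather than plain NumPy arrays. The sketch does not cover the subtraction `x - np.mean(x, axis=0)`; a real array library would also implement arithmetic (for example through `__array_ufunc__` or Python's operator methods) and would handle arrays nested inside lists, as in `np.stack`.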
  • 5. Nature | Vol 585 | 17 September 2020 | 361 Discussion NumPy combines the expressive power of array programming, the performanceofC,andthereadability,usabilityandversatilityofPython inamature,welltested,welldocumentedandcommunity-developed library.LibrariesinthescientificPythonecosystemprovidefastimple- mentations of most important algorithms. Where extreme optimiza- tion is warranted, compiled languages can be used, such as Cython43 , Numba44 and Pythran45 ; these languages extend Python and trans- parently accelerate bottlenecks. Owing to NumPy’s simple memory model, it is easy to write low-level, hand-optimized code, usually in C orFortran,tomanipulateNumPyarraysandpassthembacktoPython. Furthermore, using array protocols, it is possible to utilize the full spectrumofspecializedhardwareaccelerationwithminimalchanges to existing code. NumPywasinitiallydevelopedbystudents,facultyandresearchers to provide an advanced, open-source array programming library for Python,whichwasfreetouseandunencumberedbylicenseserversand softwareprotectiondongles.Therewasasenseofbuildingsomething consequential together for the benefit of many others. Participating in such an endeavour, within a welcoming community of like-minded individuals, held a powerful attraction for many early contributors. These user–developers frequently had to write code from scratch to solve their own or their colleagues’ problems—often in low-level languages that preceded Python, such as Fortran46 and C. To them, theadvantagesofaninteractive,high-levelarraylibrarywereevident. Thedesignofthisnewtoolwasinformedbyotherpowerfulinteractive programming languages for scientific computing such as Basis47–50 , Yorick51 , R52 and APL53 , as well as commercial languages and environ- ments such as IDL (Interactive Data Language) and MATLAB. WhatbeganasanattempttoaddanarrayobjecttoPythonbecame thefoundationofavibrantecosystemoftools.Now,alargeamountof scientific work depends on NumPy being correct, fast and stable. 
It is no longer a small community project, but core scientific infrastructure.
The developer culture has matured: although initial development was highly informal, NumPy now has a roadmap and a process for proposing and discussing large changes. The project has formal governance structures and is fiscally sponsored by NumFOCUS, a nonprofit that promotes open practices in research, data and scientific computing. Over the past few years, the project attracted its first funded development, sponsored by the Moore and Sloan Foundations, and received an award as part of the Chan Zuckerberg Initiative’s Essentials of Open Source Software programme. With this funding, the project was (and is) able to have sustained focus over multiple months to implement substantial new features and improvements. That said, the development of NumPy still depends heavily on contributions made by graduate students and researchers in their free time (see Supplementary Methods for more details).
NumPy is no longer merely the foundational array library underlying the scientific Python ecosystem, but it has become the standard API for tensor computation and a central coordinating mechanism between array types and technologies in Python. Work continues to expand on and improve these interoperability features.
Over the next decade, NumPy developers will face several challenges. New devices will be developed, and existing specialized hardware will evolve to meet diminishing returns on Moore’s law. There will be more, and a wider variety of, data science practitioners, a large proportion of whom will use NumPy. The scale of scientific data gathering will continue to increase, with the adoption of devices and instruments such as light-sheet microscopes and the Large Synoptic Survey Telescope (LSST)54. New generation languages, interpreters and compilers, such as Rust55, Julia56 and LLVM57, will create new concepts and data structures, and determine their viability.
Through the mechanisms described in this Review, NumPy is poised to embrace such a changing landscape, and to continue playing a leading part in interactive scientific computation, although to do so will require sustained funding from government, academia and industry. But, importantly, for NumPy to meet the needs of the next decade of data science, it will also need a new generation of graduate students and community contributors to drive it forward.
1. Abbott, B. P. et al. Observation of gravitational waves from a binary black hole merger. Phys. Rev. Lett. 116, 061102 (2016).
2. Chael, A. et al. High-resolution linear polarimetric imaging for the Event Horizon Telescope. Astrophys. J. 286, 11 (2016).
3. Dubois, P. F., Hinsen, K. & Hugunin, J. Numerical Python. Comput. Phys. 10, 262–267 (1996).
4. Ascher, D., Dubois, P. F., Hinsen, K., Hugunin, J. & Oliphant, T. E. An Open Source Project: Numerical Python (Lawrence Livermore National Laboratory, 2001).
5. Yang, T.-Y., Furnish, G. & Dubois, P. F. Steering object-oriented scientific computations. In Proc. TOOLS USA 97. Intl Conf. Technology of Object Oriented Systems and Languages (eds Ege, R., Singh, M. & Meyer, B.) 112–119 (IEEE, 1997).
6. Greenfield, P., Miller, J. T., Hsu, J. & White, R. L. numarray: a new scientific array package for Python. In PyCon DC 2003 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.112.9899 (2003).
7. Oliphant, T. E. Guide to NumPy 1st edn (Trelgol Publishing, 2006).
8. Dubois, P. F. Python: batteries included. Comput. Sci. Eng. 9, 7–9 (2007).
9. Oliphant, T. E. Python for scientific computing. Comput. Sci. Eng. 9, 10–20 (2007).
10. Millman, K. J. & Aivazis, M. Python for scientists and engineers. Comput. Sci. Eng. 13, 9–12 (2011).
11. Pérez, F., Granger, B. E. & Hunter, J. D. Python: an ecosystem for scientific computing. Comput. Sci. Eng. 13, 13–21 (2011). Explains why the scientific Python ecosystem is a highly productive environment for research.
12. Virtanen, P. et al.
SciPy 1.0—fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020); correction 17, 352 (2020). Introduces the SciPy library and includes a more detailed history of NumPy and SciPy.
13. Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
14. McKinney, W. Data structures for statistical computing in Python. In Proc. 9th Python in Science Conf. (eds van der Walt, S. & Millman, K. J.) 56–61 (2010).
15. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
16. van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
17. van der Walt, S., Colbert, S. C. & Varoquaux, G. The NumPy array: a structure for efficient numerical computation. Comput. Sci. Eng. 13, 22–30 (2011). Discusses the NumPy array data structure with a focus on how it enables efficient computation.
18. Wang, Q., Zhang, X., Zhang, Y. & Yi, Q. AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs. In SC’13: Proc. Intl Conf. High Performance Computing, Networking, Storage and Analysis 25 (IEEE, 2013).
19. Xianyi, Z., Qian, W. & Yunquan, Z. Model-driven level 3 BLAS performance optimization on Loongson 3A processor. In 2012 IEEE 18th Intl Conf. Parallel and Distributed Systems 684–691 (IEEE, 2012).
20. Pérez, F. & Granger, B. E. IPython: a system for interactive scientific computing. Comput. Sci. Eng. 9, 21–29 (2007).
21. Kluyver, T. et al. Jupyter Notebooks—a publishing format for reproducible computational workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas (eds Loizides, F. & Schmidt, B.) 87–90 (IOS Press, 2016).
22. Hagberg, A. A., Schult, D. A. & Swart, P. J. Exploring network structure, dynamics, and function using NetworkX. In Proc. 7th Python in Science Conf. (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 11–15 (2008).
23. Astropy Collaboration et al.
Astropy: a community Python package for astronomy. Astron. Astrophys. 558, A33 (2013).
24. Price-Whelan, A. M. et al. The Astropy Project: building an open-science project and status of the v2.0 core package. Astron. J. 156, 123 (2018).
25. Cock, P. J. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).
26. Millman, K. J. & Brett, M. Analysis of functional magnetic resonance imaging in Python. Comput. Sci. Eng. 9, 52–55 (2007).
27. The SunPy Community et al. SunPy—Python for solar physics. Comput. Sci. Discov. 8, 014009 (2015).
28. Hamman, J., Rocklin, M. & Abernathy, R. Pangeo: a big-data ecosystem for scalable Earth system science. In EGU General Assembly Conf. Abstracts 12146 (2018).
29. Chael, A. A. et al. ehtim: imaging, analysis, and simulation software for radio interferometry. Astrophysics Source Code Library https://ascl.net/1904.004 (2019).
30. Millman, K. J. & Pérez, F. Developing open source scientific practice. In Implementing Reproducible Research (eds Stodden, V., Leisch, F. & Peng, R. D.) 149–183 (CRC Press, 2014). Describes the software engineering practices embraced by the NumPy and SciPy communities with a focus on how these practices improve research.
31. van der Walt, S. The SciPy Documentation Project (technical overview). In Proc. 7th Python in Science Conf. (SciPy 2008) (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 27–28 (2008).
32. Harrington, J. The SciPy Documentation Project. In Proc. 7th Python in Science Conference (SciPy 2008) (eds Varoquaux, G., Vaught, T. & Millman, K. J.) 33–35 (2008).
33. Harrington, J. & Goldsmith, D. Progress report: NumPy and SciPy documentation in 2009. In Proc. 8th Python in Science Conf. (SciPy 2009) (eds Varoquaux, G., van der Walt, S. & Millman, K. J.) 84–87 (2009).
34.
Royal Astronomical Society Report of the RAS ‘A’ Awards Committee 2020: Astropy Project: 2020 Group Achievement Award (A) https://ras.ac.uk/sites/default/files/2020-01/Group%20Award%20-%20Astropy.pdf (2020).
35. Wilson, G. Software carpentry: getting scientists to write better code by making them more productive. Comput. Sci. Eng. 8, 66–69 (2006).
36. Hannay, J. E. et al. How do scientists develop and use scientific software? In Proc. 2009 ICSE Workshop on Software Engineering for Computational Science and Engineering 1–8 (IEEE, 2009).
37. Millman, K. J., Brett, M., Barnowski, R. & Poline, J.-B. Teaching computational reproducibility for neuroimaging. Front. Neurosci. 12, 727 (2018).
38. Paszke, A. et al. Pytorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32 (eds Wallach, H. et al.) 8024–8035 (Neural Information Processing Systems, 2019).
39. Abadi, M. et al. TensorFlow: a system for large-scale machine learning. In OSDI’16: Proc. 12th USENIX Conf. Operating Systems Design and Implementation (chairs Keeton, K. & Roscoe, T.) 265–283 (USENIX Association, 2016).
40. Chen, T. et al. MXNet: a flexible and efficient machine learning library for heterogeneous distributed systems. Preprint at http://www.arxiv.org/abs/1512.01274 (2015).
41. Hoyer, S. & Hamman, J. xarray: N–D labeled arrays and datasets in Python. J. Open Res. Softw. 5, 10 (2017).
42. Entschev, P. Distributed multi-GPU computing with Dask, CuPy and RAPIDS. In EuroPython 2019 https://ep2019.europython.eu/media/conference/slides/fX8dJsD-distributed-multi-gpu-computing-with-dask-cupy-and-rapids.pdf (2019).
43. Behnel, S. et al. Cython: the best of both worlds. Comput. Sci. Eng. 13, 31–39 (2011).
44. Lam, S. K., Pitrou, A. & Seibert, S. Numba: a LLVM-based Python JIT compiler. In Proc. Second Workshop on the LLVM Compiler Infrastructure in HPC, LLVM ’15 7:1–7:6 (ACM, 2015).
45. Guelton, S. et al. Pythran: enabling static optimization of scientific Python programs. Comput. Sci. Discov. 8, 014001 (2015).
46. Dongarra, J., Golub, G. H., Grosse, E., Moler, C. & Moore, K. Netlib and NA-Net: building a scientific computing community. IEEE Ann. Hist. Comput.
30, 30–41 (2008).
47. Barrett, K. A., Chiu, Y. H., Painter, J. F., Motteler, Z. C. & Dubois, P. F. Basis System, Part I: Running a Basis Program—A Tutorial for Beginners UCRL-MA-118543, Vol. 1 (Lawrence Livermore National Laboratory, 1995).
48. Dubois, P. F. & Motteler, Z. Basis System, Part II: Basis Language Reference Manual UCRL-MA-118543, Vol. 2 (Lawrence Livermore National Laboratory, 1995).
49. Chiu, Y. H. & Dubois, P. F. Basis System, Part III: EZN User Manual UCRL-MA-118543, Vol. 3 (Lawrence Livermore National Laboratory, 1995).
50. Chiu, Y. H. & Dubois, P. F. Basis System, Part IV: EZD User Manual UCRL-MA-118543, Vol. 4 (Lawrence Livermore National Laboratory, 1995).
51. Munro, D. H. & Dubois, P. F. Using the Yorick interpreted language. Comput. Phys. 9, 609–615 (1995).
52. Ihaka, R. & Gentleman, R. R: a language for data analysis and graphics. J. Comput. Graph. Stat. 5, 299–314 (1996).
53. Iverson, K. E. A programming language. In Proc. 1962 Spring Joint Computer Conf. 345–351 (1962).
54. Jenness, T. et al. LSST data management software development practices and tools. In Proc. SPIE 10707, Software and Cyberinfrastructure for Astronomy V 1070709 (SPIE and International Society for Optics and Photonics, 2018).
55. Matsakis, N. D. & Klock, F. S. The Rust language. Ada Letters 34, 103–104 (2014).
56. Bezanson, J., Edelman, A., Karpinski, S. & Shah, V. B. Julia: a fresh approach to numerical computing. SIAM Rev. 59, 65–98 (2017).
57. Lattner, C. & Adve, V. LLVM: a compilation framework for lifelong program analysis and transformation. In Proc. 2004 Intl Symp. Code Generation and Optimization (CGO’04) 75–88 (IEEE, 2004).
Acknowledgements
We thank R. Barnowski, P. Dubois, M. Eickenberg, and P. Greenfield, who suggested text and provided helpful feedback on the manuscript. K.J.M. and S.J.v.d.W. were funded in part by the Gordon and Betty Moore Foundation through grant GBMF3834 and by the Alfred P.
Sloan Foundation through grant 2013-10-27 to the University of California, Berkeley. S.J.v.d.W., S.B., M.P. and W.W. were funded in part by the Gordon and Betty Moore Foundation through grant GBMF5447 and by the Alfred P. Sloan Foundation through grant G-2017-9960 to the University of California, Berkeley.
Author contributions
K.J.M. and S.J.v.d.W. composed the manuscript with input from others. S.B., R.G., K.S., W.W., M.B. and T.R. contributed text. All authors contributed substantial code, documentation and/or expertise to the NumPy project. All authors reviewed the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Supplementary information is available for this paper at https://doi.org/10.1038/s41586-020-2649-2.
Correspondence and requests for materials should be addressed to K.J.M., S.J.v.d.W. or R.G.
Peer review information Nature thanks Edouard Duchesnay, Alan Edelman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Reprints and permissions information is available at http://www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material.
If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2020