Python Book
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system
or transmitted in any form or by any means, electronic, mechanical, photocopying, recording
or otherwise, without the prior permission of the publisher, or as expressly permitted by law or
under terms agreed with the appropriate rights organization. Multiple copying is permitted in
accordance with the terms of licences issued by the Copyright Licensing Agency, the Copyright
Clearance Centre and other reproduction rights organisations.
DOI 10.1088/978-1-6270-5620-5
Version: 20150601
Preface
Acknowledgements
About the author
Python and Matplotlib Essentials for Scientists and Engineers
11 Applications
11.1 Fits to data
11.1.1 Linear least squares: fitting a polynomial
11.1.2 Non-linear least squares
11.1.3 Linear systems of equations
11.2 Numerical integration
11.3 Integrating ordinary differential equations
11.4 Fourier transforms
11.5 Writing sound files
Preface
Python and Matplotlib Essentials for Scientists and Engineers is intended to provide a
starting point for scientists or engineers (or students of either discipline) who want to
explore using Python and Matplotlib to work with data and/or simulations, and
to make publication-quality plots. The active user base of Python and Matplotlib
has been growing rapidly in recent years as people realize these packages have a very
high level of functionality, are freely available for any likely operating system and
are relatively simple to learn and use compared to similar software solutions.
No previous programming experience is needed before beginning this book, as my
aim is to make this a stand-alone introduction to Python and Matplotlib. Indeed,
my hope is that you the reader can take this introduction and discover for yourself in
just a few hours whether Python and Matplotlib provide most if not all of the tools
you need to get your work done and your publication-quality plots rendered.
The examples given in this book are available for download at the companion
website pythonessentials.com.
Acknowledgements
I would like to thank first of all my wife Janie, for her encouragement and support.
I am grateful to Pim Schellart of the Astrophysics Department of Radboud
University, Nijmegen, The Netherlands, for first introducing me to the Python
language, and to Martin D Still of the Science Mission Directorate at NASA for
introducing me to more advanced Matplotlib capabilities and helping me to get
up to speed. Finally, thank you to the students and SARA colleagues who have
commented on earlier versions of this manuscript.
About the author
Matt A Wood
Matt A Wood graduated with a BS degree in physics from Iowa
State University, and Master’s and PhD degrees in astronomy from
the University of Texas at Austin. He spent a year as a NATO
postdoctoral fellow at the Université de Montreal in Quebec before
accepting a position as assistant professor at The Florida Institute
of Technology. He spent the 2008–2009 academic year on sabbatical
at Radboud University in Nijmegen, The Netherlands, where he
was first introduced to the Python programming language. In 2012 he joined the
Department of Physics and Astronomy at Texas A&M University-Commerce as
department head. His current research focuses on mass-transfer binary star
systems known as cataclysmic variables. He has been an author on more than
80 peer-reviewed publications and a similar number of non-refereed publications.
He lives in Greenville, Texas, and when not doing astronomy or administrative
tasks he enjoys playing guitar and bass, walking his doberman Dexter and exploring
the world with his wife Janie.
IOP Concise Physics
Chapter 1
Introduction: why Python and Matplotlib?
[1] www.mathworks.com/products/matlab
[2] www.exelisvis.com/IDL
$500 per license, with additional charges for the various ‘toolboxes’ that may be
required.
In the fields of physics and astronomy, the package SuperMongo [3] (SM) is fairly
widely used for making publication-quality plots. It is a powerful interactive plotting
package that allows the user to generate beautiful plots with a minimum number of
simple commands or user-defined macros, to easily include LaTeX symbols in
strings, and to save the plots as postscript files. If you know what this all means then
you know this is very useful for technical publishing, and if not, do not worry—
Python/Matplotlib can do all this and more. I still use SM for some of my plots and,
while it also is not free, it is reasonably priced at $300 for a departmental site license
with unlimited free upgrades and personal technical support from the authors.
SM also only really runs on Unix-like systems (e.g. Linux and MacOS), so it is not
cross-platform.
There are several good plotting packages available that are open source (i.e. free),
including Gnuplot [4] and GNU Octave [5]. Gnuplot is a command-line driven plotting
utility that is cross-platform and widely used, but which typically requires more time
and effort on the part of the user to prepare publication-quality plots. The Octave
language is very similar to MATLAB, such that code developed for MATLAB will
typically run under Octave with little modification required. However, Octave uses
Gnuplot to render output, with results that may not have the polish that MATLAB
yields by default.
There are many, many other choices as well, and it is probably safe to say that
any plotting package that has been around for more than a few years is capable of
producing publication-quality plots, given enough time and effort on the part of the
user. And this is really the key—you are busy, and if you are using a plotting
package that requires an hour or more of your time to produce a publication-quality
plot, and then another hour to produce a similar plot, and so on, then that package is
keeping you from getting other important tasks done.
[3] www.astro.princeton.edu/~rhl/sm
[4] www.gnuplot.info
[5] www.gnu.org/software/octave
Python was developed in the early 1990s by Guido van Rossum while at Stichting
Mathematisch Centrum in the Netherlands. He is fondly known in the Python
community as the Benevolent Dictator for Life (BDFL), but countless others have
contributed to the development of the language and community packages. He chose
the name Python because he was feeling irreverent and was a big fan of Monty
Python’s Flying Circus.
Python is a well-executed, object-oriented programming language comparable with
Perl, Ruby, or Java. It uses a syntax that renders programs easy to read as well as to
write. It comes with a large (and growing) standard library with extensive capabilities
and can call modules that were written in a compiled language such as C/C++ or
Fortran. Python can be used in interactive mode to test short pieces of code and includes
a bundled development environment IDLE (Integrated DeveLopment Environment).
Python is free to download and use, and runs on all major OSs. The language is
copyrighted, but is freely re-distributable after modification as all releases of the
language are open source [6].
In addition to basic data types such as numbers (integer, floating point,
complex), strings and lists, Python also supports object-oriented programming
with classes, allowing you to define your own object types. Exception handling
of errors is cleanly implemented and the language is well suited to grouping code
into modules and packages. Data types can be dynamically assigned and Python
implements automatic memory management so you do not have to worry
about allocating and freeing memory in your code. Some say entering import
antigravity at the Python prompt allows one to fly [7], but that may be stretching
reality just a bit.
To sum up this introduction: if you are a scientist or engineer looking for a
numerical analysis and plotting system that is easy to learn and use, cross-platform
and free, the combination of Python and Matplotlib is the grail you seek.
1.3 Resources
There are now countless books and websites devoted to Python and associated
packages. The webpage wiki.python.org/moin/PythonBooks includes a long list of
book titles sorted by category, as well as links to reviews. Some of the specific books
that I have found useful in preparing this monograph include:
• Langtangen H P 2012 A Primer on Scientific Programming with Python 3rd
edn (Berlin: Springer)
• Downey A 2012 Think Python: How to Think Like a Computer Scientist
(Needham, MA: Green Tea)
• Fangohr H 2014 Introduction to Python for Computational Science and
Engineering (free download at www.southampton.ac.uk/~fangohr/software).
[6] See www.opensource.org for the open source definition.
[7] Source: xkcd.com/353.
Many online resources exist as well. To list just a few that I have found useful:
• python.org is of course the definitive Python resource on the web. A good starting
point is the Beginner’s Guide at wiki.python.org/moin/BeginnersGuide.
• stackoverflow.com is a great question and answer site for programmers. Users
vote up the best answers, so they show up first and are easiest to find.
• The Python course available at www.python-course.eu/index.php is very
comprehensive and includes many tutorials and examples.
• Google has a Python class available at developers.google.com/edu/python/.
The class includes text, lecture videos and many coding examples.
Chapter 2
Downloading and installation
[1] Visit store.continuum.io/cshop/anaconda. Another popular choice is the Enthought Canopy Express Python
distribution available from the site store.enthought.com/downloads.
Chapter 3
First steps
If you are using Python 3.x, you would enter print("Hello, World!"). You
might have noticed when you invoked the Python shell in interactive mode that it
prints the version number and copyright notice before the primary prompt, which
defaults to the chevron (>>>). Continuation lines begin with the secondary prompt,
which is three dots (...) by default.
Note that you can use either single or double quotes, allowing you to have an
unmatched quote in your string:
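As a minimal sketch of the point above (the example strings are illustrative, not taken from the original listing), either quote style can delimit a string, so the other style can appear unmatched inside it:

```python
# Either quote style delimits a string; pick the one that does not
# clash with a quote character inside the text.
s1 = "I'm not dead"          # apostrophe inside double quotes
s2 = 'He said "Ni!"'         # double quotes inside single quotes
print(s1)
print(s2)
```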
[1] In this book, text that is in monospaced font represents text that you enter or that is returned by the
computer. Text that is entered is colored black, and text that is returned is colored dark blue. Code snippets are
in ivory-colored boxes with light gray borders and complete standalone programs are in ivory-colored boxes
with gold borders.
If you need to, you can always escape quote marks with a backslash (\), and if
you need to include comments, just lead with a hash sign # at the prompt or in your
programs:
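A short sketch of both ideas, using an illustrative string: the backslash escapes a quote that would otherwise end the string, and everything after a hash sign on a line is ignored by the interpreter.

```python
# A backslash escapes a quote that matches the string delimiters.
s = 'I\'m not dead'          # same text as "I'm not dead"
print(s)                     # text after a hash sign is a comment
```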
We can also join() the list elements back into a single string with spaces as a
separator:
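The split and join steps might look like the following sketch (the sentence is an illustrative choice). Note that join() is called on the separator string, with the list passed as its argument:

```python
line = "and now for something completely different"
words = line.split()         # split on whitespace into a list of words
rejoined = " ".join(words)   # join the list using a single space
print(words)
print(rejoined)
```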
[2] The functions split() and join() are methods of the string class. Methods are called using dot notation, for
example str.join(). Methods, including how to define them, are discussed more fully in chapter 9.
simulation results) we first need to know how to convert an integer to a string type so
we can concatenate it into the file name. This is as simple as str(i) where i is a
variable referring to an integer:
>>> i = 42
>>> type(i)
<type 'int'>
>>> s = str(i)
>>> print s
42
>>> type(s)
<type 'str'>
>>> i = 42
>>> outfile = 'output'+str(i)+'.txt'
>>> print outfile
output42.txt
>>> i = 42
>>> s = str(i).zfill(3)
>>> print s
042
Next, we need our list of integers, which we obtain with the range() function [3].
[3] Python 2.x includes the function xrange(), which returns a lazy object that yields the integers one at a time
and so conserves memory when the calling argument is a very large integer. In Python 3.x, range()
behaves like xrange() in Python 2.x, and the original list-returning range() has been removed.
The form of range() is range([start,] stop[, step]), where start, if
omitted, defaults to 0 and step, if omitted, defaults to 1. If all three arguments are
present, the function returns a list of integers [start, start + step, start
+ 2*step, ...]. If step is positive, the last element is the largest start +
i*step less than stop. If step is negative, then the last element is the smallest
start + i*step greater than stop. For example
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(1,11)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> range(1,11,2)
[1, 3, 5, 7, 9]
>>> range(0,-11,-2)
[0, -2, -4, -6, -8, -10]
Finally, we put it all together with a for loop. Note that when entering these
commands, you will need to indent the lines beginning with outfile and print to
indicate they are within the for loop. Python simply uses the indentation level to
indicate which lines of code belong in a given block—curly brackets or endif
statements are therefore not required. The standard indentation is four spaces, but
you can use whatever you like, as long as you are consistent within a given block:
output001.txt
output002.txt
output003.txt
.
.
.
output100.txt
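A loop that produces the file names listed above can be sketched as follows. It combines str(), zfill() and range() as described in the text; the list names is an addition made here so the result can be inspected afterwards:

```python
names = []
for i in range(1, 101):
    # the indented lines form the body of the for loop
    outfile = 'output' + str(i).zfill(3) + '.txt'
    names.append(outfile)
    print(outfile)
```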
>>> a = 'spam'
>>> b = a + 1
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
>>>
We use the type() function to confirm that quest and num are both string
variables. Note that Python version 2.7 and earlier contains the function input(),
which expects a valid Python expression, and evaluates it. This could be a
(potentially complex) number (2 or 12.3 or 1 + 2j), or an expression (3 + 2),
or a quote-enclosed string ("I'm not dead"). Thus, the statement
x = input("Enter something: ") could result in x being of type int, float,
complex, or string! If you are writing code just for your own use and want results
as quickly as possible, you might opt to use input() rather than raw_input(),
but the extra overhead for using raw_input() is small and offers protection
against what should be invalid inputs to your programs. In addition, in Python 3.x
the function input() behaves as raw_input() does in Python 2.7, so you might
as well use the current standard from the outset.
If we need our input to be assigned to a variable of type int or float we can use
the int() or float() functions to perform the conversion for us:
If instead of an integer you wanted a floating point (decimal) number, you could
use instead myfloat = float(raw_input("Enter number: ")). If the user
enters an integer when prompted, the integer will be promoted to a floating point
number (e.g., entering either 2 or 2.0 will set myfloat to 2.0).
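A minimal sketch of the conversions, with a literal string standing in for what raw_input() (or input() in Python 3.x) would return, so that the example runs without a keyboard:

```python
raw = "42"               # stands in for the string typed by the user
myint = int(raw)         # string -> integer
myfloat = float("2")     # an integer-looking string becomes 2.0
print(myint)
print(myfloat)
```

Note that int("2.0") raises a ValueError, so a string with a decimal point has to go through float() first.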
If you want to write user-tolerant code, you might include a test and let the user
try again if he/she fumbles when typing the input:
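One way to write such a retry loop is sketched below. The function name ask_int and its read parameter are hypothetical, introduced here so the loop can be exercised without a keyboard; the sketch is written for Python 3.x, so under Python 2.x you would pass raw_input as read.

```python
def ask_int(prompt, read=input):
    """Keep asking until the reply can be converted to an integer."""
    while True:
        reply = read(prompt)
        try:
            return int(reply)
        except ValueError:
            # int() could not parse the reply, so ask again
            print("'%s' is not an integer, please try again." % reply)
```

Calling ask_int("Enter number: ") then returns only once the user has typed a valid integer.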
[4] This code snippet also gives a sneak peek at the while statement and flow control, discussed further in
chapter 7.
[5] The link defaults to Python 3.x documentation, but the 2.x tutorial is just a click away.
[6] code.google.com/p/spyderlib/
[7] See ipython.org/notebook.html.
Our first program file (figure 3.1, top panel) simply prints Hello, World!
and exits when it is run. Our program file could simply contain the single line
print "Hello, World!", and the result would be the same. The triple quotes
set off a multi-line string that serves as a documentation string (docstring). This is used at
the top of programs and functions to describe the workings of the routine, as we will
discuss further below. The very first line #!/usr/bin/env python is a ‘shebang’ line that
tells a Unix-like shell to run the script with the first Python interpreter found through your
PATH environment variable, which should make the program portable between different
machines and different Unix-like OSs.
Our program as it stands is a bit inflexible. We can make it a little more versatile if
we add a command line argument. In general, command line arguments allow us to
input file names, variables, etc, into our program (hello.py) without having to
prompt for them within the program itself. For example:
#!/usr/bin/python
"""
Prints "Hello, <username>" where <username> is an optional command
line argument.
Args:
sys.argv[1]: string (optional)
Examples:
>>> python hello.py
Hello, Stranger!
>>> python hello.py Bob
Hello, Bob!
"""
import sys
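The listing above stops at the import statement. A minimal sketch of a body that matches the behavior described in the docstring follows; the helper name greeting is hypothetical and is used only to keep the logic easy to test.

```python
import sys

def greeting(argv):
    """Return the greeting for an argv-style list of strings."""
    if len(argv) > 1:        # argv[0] is the program name
        name = argv[1]
    else:
        name = "Stranger"
    return "Hello, " + name + "!"

if __name__ == "__main__":
    print(greeting(sys.argv))
```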
Note that here we are importing the sys module to pass the command line argu-
ments. sys.argv[0] is the program name and sys.argv[1] is our command line
argument. The total number of command line arguments is given by len(sys.argv).
Now would be a good time to save this code to a file called hello.py [8], after
which you can run it using the command
% python hello.py
but for a program you will use frequently, it will be more convenient to make it an
executable. In Unix-like systems, this is accomplished with the chmod (change
mode) command, where chmod +x makes a script executable directly from the
command line:
% chmod +x hello.py
% hello.py
Hello, Stranger!
% hello.py Merlin
Hello, Merlin!
[8] Example codes named in the text are available for download at pythonessentials.com.
% ipython
[version etc. information clipped]
In [1]: run hello.py Arthur # .py extension is optional within IPython
Hello, Arthur!
In [2]: run hello.py
Hello, Stranger!
The IPython shell numbers your commands for later use. In much of the rest of
this book, I will continue to show the standard Python prompt >>> for examples,
but when you are actually doing your own work, my strong recommendation is that
you use IPython for everything (or IDLE or an IPython Notebook) whenever you
are developing code.
You can also run an external program from the Python prompt (>>>) if IPython
is not available. If there are no command line arguments, then you can use the
execfile() function
>>> execfile("hello.py")
Hello, Stranger!
If you need to pass command line arguments, then things are not so straightforward:
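One possible approach, offered as a sketch rather than as the original example, is to set sys.argv by hand before running the script. The version below uses the standard-library runpy module, which also works in Python 3.x where execfile() no longer exists; the function name run_with_args is hypothetical.

```python
import runpy
import sys

def run_with_args(path, *args):
    """Run the script at `path` as if `args` were given on the command line."""
    saved = sys.argv
    sys.argv = [path] + list(args)
    try:
        runpy.run_path(path, run_name="__main__")
    finally:
        sys.argv = saved     # restore the interpreter's own arguments
```

With hello.py in the current directory, run_with_args("hello.py", "Arthur") would then behave like the command python hello.py Arthur.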
Chapter 4
Working with numbers
>>> 2 + 2
4
>>> 4 / 2
2
>>> 1/2
0
>>> -1/2
-1
Note the first two lines give the expected results, but that 1 / 2 yields 0 when in
the Python 2.x interpreter, because the result was rounded down to the nearest
integer (floor). This results because Guido ‘BDFL’ van Rossum adopted the typical
rule from C (also true in Fortran) that the result of an arithmetic operation is always of the
same type as the operands. So, divide a float by a float, you obtain a float; divide an
integer by an integer, you obtain an integer. The former case is fine, but the BDFL
now considers the latter to be a design bug (see
python-history.blogspot.com/2009/03/problem-with-integer-division.html). Although there is
much code out there that relies on this behavior, for the rest of us it is an annoyance we
have to know about and avoid. The good news is that Python 3.x implements true division for both
integers and floats [2], so 1/2 will return 0.5. Python 2.x allows you to execute the
command from __future__ import division to yield this same behavior. If
using Python 2.x and you have not run this command, you can simply put a decimal
point after at least one of the integers in a division operation, which will promote the
result to a floating point number:
>>> 1/2.
0.5
>>> 10. * 1/2 # 10.*1 -> 10., then 10./2 -> 5.
5.0
>>> 10. * (1/2) # (1/2) evaluated first -> 0, then 10. * 0 -> 0.0
0.0
>>> from __future__ import division
>>> 10. * (1/2)
5.0
>>> a = 4.0
is an assignment statement. What happens when this is entered into the interpreter is
that Python creates the float object 4.0 and binds the name a to that object.
In computer programs it is very common to iterate and so lines of code of the form
>>> a = a + 1
are common. As an algebraic statement, this makes no sense, but what this line of code
is telling the interpreter (or compiler in a compiled language), is to evaluate the
expression on the right-hand side of the equals sign and then assign the resulting value to
the variable name on the left-hand side of the equation. Because this is such a common
operation, Python has available the compact notation a += 1 so, for example,
>>> a = 3
>>> a += 1
>>> a
4
[2] Integers and floating point numbers (floats) are stored differently in the computer memory.
>>> a *= 2
>>> a
8
>>> a = 4
>>> a
4
>>> b = 3
>>> a + b
7
>>> a / b # Integer division!
1
>>> a / float(b)
1.3333333333333333
>>> c = 3.
>>> a / c
1.3333333333333333
When in interactive mode, the previous result is available through the variable _
(a single underscore). This can be very useful when using Python as a calculator.
>>> secperday = 24 * 60 * 60
>>> secperday
86400
>>> secperyear = _ * 365
>>> secperyear
31536000
>>> a = b = c = 5
>>> a
5
>>> b
5
>>> c
5
The modulus operator % can be very useful in certain situations (for example,
printing diagnostics every 100 time steps of a simulation):
>>> 8 % 2
0
>>> 8 % 3
2
>>> 8 % 3.1
1.7999999999999998
>>>
output010.txt
output020.txt
.
.
.
output100.txt
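Output like that listed above can be produced by testing the loop counter with the modulus operator. The following is a sketch under the assumption that the loop runs from 1 to 100 and acts on every tenth step; the list printed is an addition made here so the result can be inspected.

```python
printed = []
for i in range(1, 101):
    if i % 10 == 0:          # true only for 10, 20, ..., 100
        outfile = 'output' + str(i).zfill(3) + '.txt'
        printed.append(outfile)
        print(outfile)
```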
It is probable that you will often want to use the value of π in your programs.
You can enter it yourself, but if you execute from math import pi at the
command line or at the top of your program, you will have it available to machine
precision when you need it. math is an example of a module and modules are
extensions to Python that can be imported to extend the capabilities of the base
language:
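A brief sketch of importing and using pi; the circle radius is an arbitrary illustrative value:

```python
from math import pi

radius = 2.0
area = pi * radius**2        # area of a circle of radius 2
print(pi)
print(area)
```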
The math module provides access to the standard mathematical functions defined
by the C standard and you have the option to simply include everything in the
module with from math import * at the top of your programs, although as
discussed below it is generally safer to just import what you need to avoid conflicts in
the namespace:
Here is a selected list of a few useful functions from the math module:
Function      Operation
fabs(x)       absolute value of x
log(x)        base-e (natural) logarithm of x
log10(x)      base-10 logarithm of x
pow(x,y)      x raised to the power y
sqrt(x)       square root of x
acos(x)       arc cosine of x, in radians
atan2(y,x)    arctan(y/x), in the range [-pi, pi]
degrees(x)    converts x from radians to degrees
radians(x)    converts x from degrees to radians
Complex numbers are also straightforward to work with, where j indicates the
imaginary part of the value:
>>> a = 2 + 3j
>>> b = 4 - 9j
>>> a
(2+3j)
>>> b
(4-9j)
>>> a + b
(6-6j)
>>> a * b
(35-6j)
(3.2 + 5.4j)
>>> a.real
3.2
>>> a.imag
5.4
Note that the elements of a list do not have to all be of the same type—in the
example, we have strings, an integer and a float all in the same list.
You can add elements on to the end of your list, replace elements within your list,
remove items from your list, insert some, reverse the list, or clear the entire list:
>>> a
['spam', 'a newt', 123]
>>> # insert an item
>>> a.insert(1,'spamalot')
>>> a
['spam', 'spamalot', 'a newt', 123]
>>> # reverse the order
>>> a.reverse()
>>> a
[123, 'a newt', 'spamalot', 'spam']
>>> # clear the list
>>> a = []
>>> a
[]
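The append, replace and remove operations mentioned earlier are not among the steps shown above. A minimal sketch of them follows, with illustrative starting values ('eggs' and 4.5 are assumptions, not taken from the original listing):

```python
a = ['spam', 'eggs', 123]
a.append(4.5)                # add an element to the end of the list
a[1] = 'a newt'              # replace the element at index 1
a.remove(4.5)                # remove the first element equal to 4.5
print(a)
```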
As noted briefly in the previous chapter, strings can be concatenated with the + sign,
and the same is true of lists:
>>> a = ['and','now','for']
>>> b = ['something','completely','different']
>>> c = a+b
>>> c
['and', 'now', 'for', 'something', 'completely', 'different']
Lists can also contain other lists, which is useful for some applications:
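As a small sketch with illustrative values, a list of lists can be indexed twice: the first index selects an inner list and the second selects an element within it.

```python
matrix = [[1, 2, 3], [4, 5, 6]]
print(matrix[1])             # the second inner list
print(matrix[1][2])          # third element of the second inner list
```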
which is equivalent to
and if we only wanted to keep the even values, we can add a conditional statement:
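The listings that belong with these sentences are not present here. The sketch below illustrates the idea under the assumption that the example builds a list of squares: a list comprehension, the equivalent explicit loop, and a comprehension with a condition that keeps only the even values.

```python
# List comprehension
squares = [i**2 for i in range(10)]

# The same result built with an explicit for loop
squares_loop = []
for i in range(10):
    squares_loop.append(i**2)

# Adding a condition keeps only the even values
even_squares = [i**2 for i in range(10) if i**2 % 2 == 0]
print(even_squares)
```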
4.2.4 Tuples
Python also includes a data type called a tuple (in fact, our most recent example
returned a list of tuples). Tuples, like lists, are sequences. What is special about
tuples is that they cannot be changed—they are immutable (lists are mutable). You
might think of a tuple as a ‘constant list’. Tuples are indicated by simply separating
some values with commas and are often enclosed in parentheses:
>>> 3, 2, 1
(3, 2, 1)
>>> (1, 2, 3)
(1, 2, 3)
The empty tuple is written as (). You can slice tuples just as you can lists and
you can convert a list to a tuple or vice versa if need be:
>>> a = [3, 2, 1]
>>> tuple(a)
(3, 2, 1)
>>> b = list(a)
>>> b
[3, 2, 1]
Tuples are used behind the scenes in Python and you may use them when calling
functions. Again, they work very much like lists, with the exception that they cannot
be changed.
The zip() function iterates over two or more sequences or iterables in parallel.
Most commonly, it takes two or more lists and returns a list of tuples, where the ith
tuple contains the ith element from each of the argument lists:
>>> a
[1, 2, 3]
>>> b
[4, 5, 6]
>>> zip(a,b)
[(1, 4), (2, 5), (3, 6)] # list of tuples
The zip() function is commonly used in list comprehensions when two or more
lists are involved in the expression:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> [i*j for i,j in zip(a,b)]
[4, 10, 18]
>>> a = [1, 2, 3]
>>> b = a
>>> b
[1, 2, 3]
>>> b[2] = 10
>>> b
[1, 2, 10]
>>> a
[1, 2, 10]
>>> a is b # tests if a and b refer to the same object
True
To make a copy of a list that does not refer to the same object, you can use b =
list(a) or b = copy.copy(a) for simple lists [3]:
>>> a = [1, 2, 3]
>>> b = list(a)
>>> b[2] = 10
>>> b
[1, 2, 10]
>>> a
[1, 2, 3]
>>> import copy
>>> b = copy.copy(a)
>>> b[2] = 10
>>> b
[1, 2, 10]
>>> a
[1, 2, 3]
There may be situations where your list itself contains other objects like lists or
class instances. If you have such a situation, you can use
new_list = copy.deepcopy(old_list). See section 9.3 below for more on when you would need
to make a deep copy.
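The difference between a shallow and a deep copy can be seen in a short sketch with a nested list (values are illustrative):

```python
import copy

old_list = [[1, 2], [3, 4]]
shallow = copy.copy(old_list)        # new outer list, same inner lists
deep = copy.deepcopy(old_list)       # inner lists are copied as well

old_list[0][0] = 99
print(shallow[0][0])                 # inner list is shared with old_list
print(deep[0][0])                    # unaffected by the change
```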
>>> a = [1, 2, 3]
>>> b = 3 * a
>>> b
[1, 2, 3, 1, 2, 3, 1, 2, 3]
[3] For more information on the copy module, see docs.python.org/2/library/copy.html.
That is probably not what you expected! What you obtained was three copies of a
concatenated together. You can achieve the behavior you want with the following
list comprehension:
>>> a = [1, 2, 3]
>>> b = [3*i for i in a]
>>> b
[3, 6, 9]
This works, and you can use similar constructions for other arithmetic operations,
but is a bit cumbersome. In science and engineering disciplines, we are mostly going
to be dealing with arrays of numbers, which are not included as a core feature in
Python. So, let us introduce the package NumPy, which you will probably import at
the beginning of nearly all of your codes that work with numerical data.
Chapter 5
NumPy arrays
This is a 2D (rank 2) array. In this example, the first dimension (axis) has a length of
4 and the second has a length of 3.
We can access rows, columns and individual elements in the array just as we did
for lists using indexing and slicing:
>>> xpos[0]
array([ 0.1, 0.2, 1. ]) # first row
>>> xpos[:,2]
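The definition of xpos is not shown here, so the following sketch uses a hypothetical 4 x 3 array in which only the first row is taken from the output above. It illustrates row, column and single-element access:

```python
import numpy as np

# Hypothetical 4 x 3 array; only the first row comes from the text.
xpos = np.array([[0.1, 0.2, 1.0],
                 [0.3, 0.4, 2.0],
                 [0.5, 0.6, 3.0],
                 [0.7, 0.8, 4.0]])
print(xpos.shape)      # (number of rows, number of columns)
print(xpos[0])         # first row
print(xpos[:, 2])      # third column
print(xpos[1, 0])      # single element: second row, first column
```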
The numpy array class is called ndarray. You can create an array from an
existing (numerical-only) list a using
>>> x = np.array(a)
While it may be more convenient to import everything from the numpy module,
Arrays must be homogeneous, meaning all values have the same data type. So if
you enter a mix of integers and floating point numbers the integers will be converted
to floating point values and if there is a single complex number input all entries will
be converted to complex numbers:
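A brief sketch of this type promotion, using illustrative values and inspecting the dtype attribute of each array:

```python
import numpy as np

a = np.array([1, 2, 3])          # all integers: integer array
b = np.array([1, 2.5, 3])        # one float promotes every entry to float
c = np.array([1, 2.5, 3 + 1j])   # one complex promotes every entry to complex
print(a.dtype)
print(b.dtype)
print(c.dtype)
```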
>>> np.arange(1, 5)
array([1, 2, 3, 4])
>>> np.arange(5, 1, -1)
array([5, 4, 3, 2])
>>> np.arange(0, 1.0, 0.2) # but see note below
array([ 0. , 0.2, 0.4, 0.6, 0.8])
>>> np.linspace(0,1,5)
array([ 0. , 0.25, 0.5 , 0.75, 1. ])
>>> np.linspace(0,1,5,endpoint=False)
array([ 0. , 0.2, 0.4, 0.6, 0.8])
>>> a = np.array([1,2,3],dtype=complex)
>>> a
array([ 1.+0.j, 2.+0.j, 3.+0.j])
>>> np.zeros(6)
array([ 0., 0., 0., 0., 0., 0.])
>>> np.zeros((2,3))
array([[ 0., 0., 0.],
[ 0., 0., 0.]])
>>> np.zeros((2,3),dtype=int)
array([[0, 0, 0],
[0, 0, 0]])
>>> a = np.ones((3,2))
>>> a
array([[ 1., 1.],
[ 1., 1.],
[ 1., 1.]])
>>> a.shape
(3, 2)
array([0, 1, 2, 3])
>>> c = a-b
>>> c
array([10, 19, 28, 37])
>>> b**2
array([0, 1, 4, 9])
>>> 5*np.sqrt(a)
array([ 15.8113883, 22.36067977, 27.38612788, 31.6227766 ])
>>> a < 25
array([True, True, False, False], dtype=bool)
Note you could use the final operation to create a mask for another operation.
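A minimal sketch of using such a comparison as a mask. The array a is assumed to be [10, 20, 30, 40], which is consistent with the results shown above, and b is the output of arange(4):

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.arange(4)
mask = a < 25                # boolean array: True where the test holds
print(b[mask])               # elements of b at positions where a < 25
print(a[mask].sum())         # sum of the selected elements of a
```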
The * product operator operates elementwise on NumPy arrays; it is not standard
matrix multiplication [1]. To obtain a matrix product when using NumPy arrays, use the
dot() function. For example,
>>> a = np.arange(4.).reshape((2,2))
>>> a
array([[ 0., 1.],
[ 2., 3.]])
>>> b = a + 2
>>> b
array([[ 2., 3.],
[ 4., 5.]])
>>> a * b
array([[ 0., 3.], # elementwise multiplication
[ 8., 15.]])
>>> np.dot(a,b)
array([[ 4., 5.], # matrix multiplication
[ 16., 21.]])
where the last statement dot(a,b) gives standard matrix multiplication if a and b
are 2D arrays. If instead a and b are 1D arrays (i.e., vectors) then dot(a,b) returns
the standard inner product of the vectors (without complex conjugation). The
function cross(a,b) returns the cross product of vectors a and b:
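For 1D arrays, a short sketch with two orthogonal unit vectors (chosen for illustration) shows both products:

```python
import numpy as np

x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
print(np.dot(x, y))      # inner product of orthogonal unit vectors
print(np.cross(x, y))    # vector perpendicular to both x and y
```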
[1] MATLAB, for example, calculates the matrix product when the * operator is used.
You may sometimes want to perform operations and overwrite the original array:
>>> a = np.ones((3,2))
>>> b = np.arange(6).reshape(3,2)
>>> a
array([[ 1., 1.],
[ 1., 1.],
[ 1., 1.]])
>>> b
array([[0, 1],
[2, 3],
[4, 5]])
>>> b += 1
>>> b
array([[1, 2],
[3, 4],
[5, 6]])
>>> b += 3*a # Note: array a is converted to integer type here!
>>> b
array([[4, 5],
[6, 7],
[8, 9]])
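A caution for readers on newer software: to the best of my knowledge, NumPy releases from 1.10 onward refuse the silent float-to-integer conversion in the last step above and raise a TypeError instead. The sketch below tries the in-place addition on a copy and then performs the update with an explicit cast, which behaves the same way in old and new versions:

```python
import numpy as np

a = np.ones((3, 2))                  # float array
b = np.arange(6).reshape(3, 2)       # integer array

trial = b.copy()
try:
    trial += 3 * a                   # float result into an integer array
    refused = False
except TypeError:
    refused = True                   # newer NumPy versions end up here

b += (3 * a).astype(b.dtype)         # explicit cast states the intent
print(refused)
print(b)
```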
It will often be useful to find the minimum or maximum value of an array. The array
class provides methods max() and min() that return these. Having already imported
numpy, we will use the random.random() function to return an array of pseudo-random
numbers in the half-open interval [0.0, 1.0) with a uniform distribution [2]:
>>> a = np.random.random((3,2))
>>> a
array([[ 0.93143352, 0.23216296],
[ 0.4620674 , 0.68159093],
[ 0.6383356 , 0.58551648]])
[2] If you run this example, your numbers will differ from what is shown.
>>> a.min()
0.23216295741014037
>>> a.max()
0.93143351615165648
>>> a.sum()
3.5311068835151014
It might be that you need the maximum or minimum of a given row or column, in
which case you could specify which axis to search:
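A sketch of the axis argument on a small array with illustrative integer values: axis=0 works down the rows and returns one result per column, while axis=1 works across the columns and returns one result per row.

```python
import numpy as np

a = np.array([[1, 8],
              [5, 2],
              [3, 6]])
print(a.max(axis=0))     # maximum of each column
print(a.max(axis=1))     # maximum of each row
print(a.min(axis=0))     # minimum of each column
```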
>>> x = np.arange(3)
>>> x
array([0, 1, 2])
>>> y = x.copy()
>>> y
array([0, 1, 2])
>>> y[2] = 10
>>> y
array([ 0, 1, 10])
>>> x
array([0, 1, 2])
>>> # or
>>> y = np.array(x)
>>> y is x # tests if x and y refer to the same object
False
If you simply use y = x, then changing any element of y also changes the corresponding
element in x since they both refer to the same object. Be careful out there.
>>> x = np.arange(3)
>>> x
array([0, 1, 2])
>>> y = x
>>> y is x
True
>>> y[2] = 10 # this changes x[2] as well
>>> y
array([ 0, 1, 10])
>>> x
array([ 0, 1, 10])
5.3 Dictionaries
The Python dictionary object provides a very flexible means of storing information.
Perhaps you have a list that has the mass densities in units of g cm⁻³ for selected
substances:
For this to be useful we need to know what substance each of these list items
represents. This is where a dictionary can be useful. A dictionary object can be
created as follows using curly brackets {} and key-value pairs (or simply items) each
separated by a colon : where, in our example, the substance name is the key:
Note that the printed order of the key-value pairs is not the same as what we input,
because the information is stored as a hashtable. If you enter the same statements on
your computer, the order may be different from that above. This behavior is not a
problem because the dictionary is not accessed using an index as a sequence is, but
rather by the key.
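The idea can be sketched as follows, assuming Python 3 and using the substance names and densities that appear in this section; the name densities is introduced only for illustration. Note that from Python 3.7 onward a dictionary preserves insertion order, so the reordering described above applies to older interpreters.

```python
# A bare list carries no record of which substance each value belongs to
densities = [1.0, 19.3, 0.93]

# A dictionary pairs each density (g cm^-3) with its substance name
rho = {'water': 1.0, 'gold': 19.3, 'ice': 0.93}

print(rho['ice'])         # look up a value by key rather than by position

rho['aluminium'] = 2.7    # add a new key-value pair
print(rho)
print(len(rho))           # number of items in the dictionary
```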
With the above definition for rho, we can retrieve the density of ice using a statement
>>> rho['ice']
0.93
We can add to the dictionary and print it, and can return the length of the
dictionary using len():
The keys and values can each be extracted into new lists:
>>> rho.keys()
['water', 'aluminium', 'gold', 'ice']
>>> rho.values()
[1.0, 2.7, 19.3, 0.93]
We can sort a dictionary by the keys in alphabetical order using the sorted()
function:
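A short sketch of these operations, assuming Python 3 (where keys() and values() return views rather than lists, so they are wrapped in list() here); by_density is an illustrative extra that sorts by value instead of by key:

```python
rho = {'water': 1.0, 'aluminium': 2.7, 'gold': 19.3, 'ice': 0.93}

keys = list(rho.keys())
values = list(rho.values())

# sorted() iterates over the keys and returns them in alphabetical order
for name in sorted(rho):
    print(name, rho[name])

# Sorting the (key, value) pairs by value instead
by_density = sorted(rho.items(), key=lambda item: item[1])
```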
As with lists and arrays, to make an independent copy of a dictionary use .copy():
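A minimal sketch of the difference between a second name and a copy (the names alias and rho_copy are illustrative):

```python
rho = {'water': 1.0, 'ice': 0.93}

alias = rho              # second name for the same dictionary
rho_copy = rho.copy()    # independent (shallow) copy

rho_copy['gold'] = 19.3  # changes only the copy
alias['steam'] = 0.0006  # changes rho as well

print(sorted(rho))
print(sorted(rho_copy))
```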
>>> x = np.random.random(10)
>>> x
array([ 0.55996936, 0.38046019, 0.62875143,
0.12421172, 0.93876425, 0.80777689,
0.53750113, 0.73750162, 0.61521331,
0.65951292])
>>> np.median(x)
0.62198237032837067
>>> np.average(x) # can specify weights for this
0.59896628192996249
>>> np.mean(x)
0.59896628192996249
>>> x.std() # standard deviation
0.2148276320926009
>>> x.var() # normalized with N (not N - 1)
0.046150911510513891
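If the N - 1 normalization is wanted, the ddof ("delta degrees of freedom") argument of var() and std() selects it. A small sketch with illustrative numbers, which also shows the weights argument of np.average():

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_var = x.var()             # divides by N
sample_var = x.var(ddof=1)    # divides by N - 1
sample_std = x.std(ddof=1)

# Weighted average: (3*1 + 1*2 + 0*3) / 4
weighted = np.average([1.0, 2.0, 3.0], weights=[3, 1, 0])
```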
Round-off error is a fact of life when programming and is the reason why it is best to
avoid comparing floats as equal in conditional statements. The following example code
would seem to print the numbers from 0.0 to 0.9 in increments of 0.1 and then stop when
t = 1.0. The actual behavior is that the conditional expression t != 1.0 never tests as
False and so the loop is infinite. The built-in function repr() returns a string containing
the full (‘official’) string representation of an object, whereas the str() function
returns an ‘informal’ (potentially less accurate) string representation of the object:
>>> t = 0.
>>> while t != 1.0:
...     print repr(t)
...     t += 0.1
...
0.0
0.1
0.2
0.30000000000000004
0.4
0.5
0.6
0.7
0.7999999999999999
0.8999999999999999
0.9999999999999999
1.0999999999999999
1.2
1.3
.
.
. # Infinite loop!
Rather than testing for equality, it is much safer to check that you have reached the
target value within some tolerance. For example, the following code terminates at
0.8999..., as intended:
>>> t = 0.
>>> epsilon = 1.e-6
>>> while abs(t - 1.0) > epsilon:
...     print repr(t)
...     t += 0.1
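Two further ways to avoid the equality test can be sketched as follows, assuming Python 3.5 or later for math.isclose(); the names printed and steps are illustrative. The second option sidesteps accumulated round-off entirely by counting with integers.

```python
import math

# Option 1: let the standard library do the tolerance comparison
t = 0.0
printed = []
while not math.isclose(t, 1.0):
    printed.append(t)
    t += 0.1

# Option 2: count with integers and compute each float from the counter
steps = [i * 0.1 for i in range(10)]

print(len(printed), len(steps))
```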
>>> type(am_row)
<class 'numpy.matrixlib.defmatrix.matrix'>
The numpy module contains the linalg routines, which are optimized for linear
algebra. For example, it is trivial to solve a matrix equation of the form ax = b for
vector x.
>>> a = np.matrix(np.linspace(2.,5.,4).reshape((2,2)))
>>> a # another route to matrix a above
matrix([[2., 3.],
[4., 5.]])
>>> b = np.matrix([[2,3]]).T # transpose
>>> b
matrix([[2],
[3]])
>>> x = np.linalg.solve(a,b)
>>> x
matrix([[-0.5],
[1. ]])
>>> np.allclose(np.dot(a,x),b) # check that solution is correct
True
If you are working with very large matrices, you should consider instead using the
scipy.linalg module, because typically SciPy is built using the optimized
ATLAS³ LAPACK and BLAS libraries, which results in very fast linear
algebra performance. However, in this case you will need to use the array class
instead of the matrix class.
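For reference, here is a sketch of the same system solved with plain arrays rather than the matrix class, assuming Python 3.5 or later for the @ operator. It uses np.linalg.solve(); scipy.linalg.solve() accepts the same two arguments if SciPy is installed.

```python
import numpy as np

a = np.array([[2.0, 3.0],
              [4.0, 5.0]])
b = np.array([2.0, 3.0])

x = np.linalg.solve(a, b)       # solves a x = b for the vector x
print(x)

# Check the solution; @ is the matrix product for arrays
print(np.allclose(a @ x, b))
```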
We have not discussed SciPy up to this point, but it is worth mentioning that
essentially everything available in NumPy is also available in SciPy. Often the
routines are identical, but when they differ the SciPy routines are usually faster. To
quote the SciPy FAQ⁴:
In an ideal world, NumPy would contain nothing but the array data type and
the most basic operations: indexing, sorting, reshaping, basic elementwise
functions, et cetera. All numerical code would reside in SciPy. However, one of
NumPy's important goals is compatibility, so NumPy tries to retain all features
supported by either of its predecessors. Thus NumPy contains some
linear algebra functions, even though these more properly belong in SciPy. In
any case, SciPy contains more fully featured versions of the linear algebra
modules, as well as many other numerical algorithms. If you are doing scientific
computing with Python, you should probably install both NumPy and
SciPy. Most new features belong in SciPy rather than NumPy.
3
math-atlas.sourceforge.net
4
www.scipy.org/scipylib/faq.html
IOP Concise Physics
Chapter 6
File input and output
The read() command will return the entire file as a single string (including the
newline character \n) if no argument is passed, or just the number of bytes passed as
an argument, which for example can be useful for reading binary files. The seek(0)
command resets the current position back to the beginning of the file and close()
closes the file:
>>> f = open('data.dat','r')
>>> f.name # returns the filename
'data.dat'
>>> f.read()
'1.0 black hole\n2.0 red dwarfs\n3.0 blue planets\n'
>>> f.seek(0) # reset to beginning of file
>>> f.read(10) # read and return 10 bytes
'1.0 black '
>>> f.close()
The file can alternatively be read using readlines(), which returns a list
containing the lines in the file:
>>> f = open('data.dat','r')
>>> f.readlines()
['1.4 blue planets\n', '3.2 red dwarfs\n',
'2.1 green giants\n', '1.0 black holes\n']
#!/usr/bin/env python
print mydat
% read_data.py
[[1.0, 'black', 'hole'], [2.0, 'red', 'dwarfs'], [3.0, 'blue', 'planets']]
This method is both simple and general, and can be used for essentially any
kind of file. What this code does is to open the file and then step through line by
line using the statement for line in f.readlines():. For each line, the
newline character is stripped with the string method strip(), the elements
are split into a list assuming a space for the separation character using method
split() and the first element of each line is converted to a floating point value
using float(). Finally, the new rank 1 list is appended onto the end of the nested
list mydat.
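The steps just described can be sketched as follows. This is a reconstruction from the description, written for Python 3, and not necessarily the original program; it first writes a small sample file to a temporary directory so that it is self-contained.

```python
import os
import tempfile

# Create a small sample file (assumed contents, matching the output shown above)
path = os.path.join(tempfile.mkdtemp(), 'data.dat')
with open(path, 'w') as out:
    out.write('1.0 black hole\n2.0 red dwarfs\n3.0 blue planets\n')

mydat = []
f = open(path, 'r')                      # note: no explicit close() here
for line in f.readlines():
    fields = line.strip().split(' ')     # strip the newline, split on spaces
    fields[0] = float(fields[0])         # first element becomes a float
    mydat.append(fields)                 # append the new list to the nested list
print(mydat)
```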
Now, it turns out we can improve on the above code. First, we did not have an
explicit close() statement in our example. The file will be closed when we exit
the program, but in a more complicated code that retrieves data from hundreds of
files it is possible that, even with the close statements, the files might not be closed
‘quickly enough’ and lead to a ‘too many files open’ error from the OS. If instead
we use a with block as shown in the following example, then the file is closed
immediately after the contents are retrieved. Using a with block is now considered
the preferred method of accessing files. Next, it turns out that readlines() is
not even needed and can slow down the execution of your code significantly
because it results in the entire file being stored in the memory, which can be a
problem if your data files are huge. Instead, you can just iterate on the file object
itself, as it is already an iterable object! This is memory efficient, fast and yields
simpler code.
So, if we wanted to read in a data file that we knew contained (any number of)
columns of numbers, we could use the following program (read_numdata.py)
which makes use of a list comprehension:
#!/usr/bin/env python
import sys
mydat = []
with open(sys.argv[1]) as f:
    for line in f:
        mydat.append([float(x) for x in line.split()])
print mydat
This program takes the file name from the command line, iterates on the file itself
and iterates within the append line. Indeed, this can be made even more compact
with the use of a nested list comprehension:
mydat = []
with open(sys.argv[1]) as f:
    mydat = [[float(x) for x in line.split()] for line in f]
and for example if we know there are two columns of data in our file and we want to
put these into numpy array objects, we could add the lines
a = np.array(mydat)
x = a[:,0] # first column
y = a[:,1] # second column
Now that we have discussed the ‘hard’ way to accomplish this, let us discuss the
easier path that numpy affords for reading files containing numerical data.
# decimal CO2
# date ppm
1980.042 337.80
1980.125 338.28
1980.208 340.04
. .
. .
. .
2014.792 395.93
2014.875 397.13
2014.958 398.78
where you will notice that the first two lines are comments serving as column headers.
1
www.esrl.noaa.gov/gmd/ccgg/trends/
Note that because the first two rows of the file start with the comment symbol #
they are ignored by loadtxt(). If you have some number of rows at the top of
your file that you want to skip but that do not start with #, you can simply include
the skiprows keyword when calling loadtxt(). When using skiprows, comment
lines are included in the count of skipped lines, so it will behave as you want it
to, no matter if the rows you want to skip begin with # or not.
For example, given our CO2 data file of monthly averages, if we wanted to not
read in the first ten years of data, we would need to skip the two header lines and
120 data lines, for a total of 122 lines:
>>> a = np.loadtxt('co2data.txt',skiprows=122)
>>> a
array([[1990.125, 354.88 ],
[1990.208, 355.65 ],
[1990.292, 356.27 ],
.
.
.
[2014.958, 398.78 ]])
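The counting rule can be checked on a tiny example. This sketch assumes Python 3 and reads from an in-memory io.StringIO object (which loadtxt() accepts in place of a file name) holding the first few lines of the listing above:

```python
import io
import numpy as np

text = """# decimal CO2
# date ppm
1980.042 337.80
1980.125 338.28
1980.208 340.04
"""

a = np.loadtxt(io.StringIO(text))               # the two comment lines are ignored
b = np.loadtxt(io.StringIO(text), skiprows=3)   # skip 2 header lines + 1 data line

print(a.shape, b.shape)
```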
The actual data file from NOAA we are referring to (co2_mm_mlo.txt) contains
682 rows of measurements (at the time of writing) dating back to 1958, where a
typical line looks like
Here column 1 is the year, column 2 is the month, column 3 is the decimal date,
columns 4, 5 and 6 are different estimations of the CO2 concentration averages and
the last column is the number of days going into the monthly average.
If we want to read this file directly, we can simply read everything with
>>> a = np.loadtxt('co2_mm_mlo.txt')
>>> np.shape(a)
(682, 7)
We could then copy the data of interest (the 3rd and 4th columns) to two 1D
arrays as follows:
x = a[:,2]
y = a[:,3]
but loadtxt() provides a more direct solution. If we just want the decimal date
and the direct average concentration, we can use the keyword usecols to specify
which of these columns we want to read, where the index starts at zero for the first
column. Even more useful, we can unpack the data and load them directly into 1D
arrays that we can later pass to (for example) the Matplotlib plot() functions:
>>> x, y = np.loadtxt('co2_mm_mlo.txt',usecols=(2,3),unpack=True)
>>> x
array([1958.208, 1958.292, ... 2014.792, 2014.875, 2014.958])
>>> y
array([315.71, 317.45, 317.5, ... 395.93, 397.13, 398.78])
The unpack parameter, if set True, transposes the returned array allowing
a statement of the form x, y, z = loadtxt(...) to be used. The function
loadtxt() has additional parameters that may be useful in special circumstances,
but the above will probably work for most files you will need to read in
practice.
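As a self-contained sketch of usecols and unpack, assuming Python 3, the following reads a simplified four-column layout (year, month, decimal date, ppm) from memory. The layout is an assumption made for illustration and is not the full NOAA format.

```python
import io
import numpy as np

# Simplified, assumed layout: year, month, decimal date, ppm
text = """2014 10 2014.792 395.93
2014 11 2014.875 397.13
2014 12 2014.958 398.78
"""

# Column indices start at zero; unpack=True returns one 1D array per column
x, y = np.loadtxt(io.StringIO(text), usecols=(2, 3), unpack=True)

print(x)
print(y)
```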
If your data file contains missing data and/or if you just want more control over
how your data file is read, the NumPy function genfromtxt() is a good choice².
2
See docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html.
It can take missing data into account because it loops twice over the data. On the
first pass, it converts each line into a sequence of strings and on the second it
converts to the appropriate data type.
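A minimal sketch of the missing-data handling, assuming Python 3 and a comma-separated layout chosen for illustration: an empty field becomes nan by default, or the value given by filling_values.

```python
import io
import numpy as np

# The second row has no ppm value
text = """1980.042,337.80
1980.125,
1980.208,340.04
"""

data = np.genfromtxt(io.StringIO(text), delimiter=',')
filled = np.genfromtxt(io.StringIO(text), delimiter=',', filling_values=-999.0)

print(data)
print(filled)
```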
Then you can use methods of the datetime module to return useful information
or to reformat the date and time using .strftime()⁴:
The datetime module can return the current date and time (datetime.now()),
can calculate the difference between two dates, etc, but this is beyond the scope of
this book. For more information, see docs.python.org/2/library/datetime.html and
also section 6.1.4 below.
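A short sketch of these operations using only the standard library, assuming Python 3; the timestamp string is an illustrative value and the names stamp and elapsed are introduced here:

```python
from datetime import datetime

# Parse an ISO 8601 style string into a datetime object
stamp = datetime.strptime('2015-01-27T16:14:49', '%Y-%m-%dT%H:%M:%S')
print(stamp.year, stamp.month, stamp.day)

# Reformat the date with strftime() format codes
print(stamp.strftime('%Y/%m/%d %H:%M'))

# Subtracting two datetime objects gives a timedelta
elapsed = datetime(2015, 2, 1) - stamp
print(elapsed.days)
```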
3
See, e.g., en.wikipedia.org/wiki/ISO_8601 and perhaps also xkcd.com/1179.
4
For a list of all the .strftime() format codes, see strftime.org.
5
Visit www.astropy.org. If using Anaconda Python, installation is as simple as typing conda install
astropy at a terminal command prompt.
data and attempts to guess the format by trying the known supported formats
(which include basic ASCII, HTML, LaTeX, CSV, FITS, HDF5 and many
others). For example, if you have a file named co2data.dat that contains the
last three lines of the CO2 data with month columns added as both text and
integer:
you can read this file and examine the results using the following:
Note that the ascii.read() function was able to determine that the first
line contains names and that the data types of the four columns were float,
float, string and int, respectively. The full utility of the astropy data table
object is beyond the scope of this text, but note that the column names can be
accessed via
>>> data.colnames
['date', 'ppm', 'month', 'mon_i']
Tables, like lists, are mutable so data in them can be changed in place, and rows
and columns can be deleted or added as needed. As noted above, ascii.read()
from the Astropy library can also read LaTeX tables directly, so if we had found our
data in the LaTeX source code of a paper on arxiv.org and saved it to a file on
our local disk as co2data.tex
\begin{table}
\begin{tabular}{cccc}
date & ppm & month & mon_i \\
2014.792 & 395.93 & Oct & 10 \\
2014.875 & 397.13 & Nov & 11 \\
2014.958 & 398.78 & Dec & 12 \\
\end{tabular}
\end{table}
we could read that file using the following and obtain exactly the same table object
as we obtained above. A related function discussed below that may be useful to you
is ascii.write(), which can write your data as a LaTeX table⁶.
data = ascii.read('co2data.tex')
6
In the current example, the table above was created with the command
ascii.write(data, 'co2data.tex', format='latex').
7
See astropy.readthedocs.org/en/latest/time. From the documentation, ‘All time manipulations and arithmetic
operations are done internally using two 64-bit floats to represent time… [T]he Time object maintains sub-
nanosecond precision over times spanning the age of the universe’.
with a specific focus on time formats (e.g., ISO 8601 and Julian date) and time
standards (e.g., UTC, TAI, etc) used in astronomy.
Let us assume that we have read the time string in ISO 8601 format with nanosecond
accuracy. We can then convert it to an Astropy Time object:
>>> s = '2015-01-27T16:14:49.123456789Z'
>>> t = Time(s)
>>> print s
2015-01-27T16:14:49.123456789Z
>>> t
<Time object: scale='utc' format='isot' value=2015-01-27T16:14:49.123>
and then can easily print out the equivalent Julian date, modified Julian date, or
convert to another time scale. If you do not need this functionality, then jump to the
next section, but if you do need this functionality, then something like this module
may have been on your wish list for years. See the documentation for more information
and send your thanks to the developers.
You have control over the precision of the printed output with the precision
attribute, which gives the number of digits after the decimal point when outputting a
value that includes seconds. The default is three and the maximum precision is nine:
>>> t = Time('2015-01-27T16:14:49.123456789Z')
>>> print t
2015-01-27T16:14:49.123
>>> t.precision = 6
>>> print t
2015-01-27T16:14:49.123457
>>> t.precision = 9
>>> print t
2015-01-27T16:14:49.123456789
Finally, note that the Astropy module can also read and write files using the
binary HDF5 and FITS formats, as we discuss in section 6.2.4 below.
If you are saving measurements that are only significant to four digits (e.g., 101.2,
3.002, 1.734 × 10⁵) then the default behavior will not only take up unnecessary disk
space, but will be misleading since someone opening that file at a later time would
not know the true precision of the measurements from the file contents alone.
If you have programmed in any other languages, the format codes for this method
should be easily understandable. The general form is
%[flag]width[.precision]specifier, where
Flags
- : left justify
+ : a sign character (+ or -) will precede the number
0 : left pad the number with zeros instead of space
Width
Minimum number of characters to be printed.
Precision
For integer specifier (d, i, o, x) the minimum number of digits.
For e, E and f, the number of digits after the decimal point.
For g and G, the maximum number of significant digits.
For s, the maximum number of characters.
Specifiers:
c : character
d or i : signed decimal integer
e or E : scientific notation with e or E
f : decimal floating point
g or G : use the shorter of e, E, or f
s : string of characters
u : unsigned decimal integer
So, the code %4.2f means print a floating point number with a width of four
characters in the format N.NN (the decimal point counts as one character). The code
%3i means print an integer with three character spaces, including a sign if negative.
There are additional specifiers not listed here for binary, octal and hexadecimal
numbers.
Although it might appear that the formatting is part of the print function, this is
not the case. Instead the string object is acted upon by the modulo operator, which
returns another string, and it is this returned string that is passed to the print function:
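A few format codes in action, as a sketch assuming Python 3 (the variable names are illustrative). Each line builds a string with the % operator first; print() only displays the finished string.

```python
fixed = '%4.2f' % 3.14159        # width 4, two digits after the decimal point
padded = '%05d' % 42             # left pad with zeros to width 5
signed = '%+.1e' % 12345.678     # force a sign, scientific notation
left = '%-6s|' % 'ab'            # left justify in a field of width 6

# Several values are passed as a tuple
line = '%3i items at %6.2f each' % (20, 1.17234)

for s in (fixed, padded, signed, left, line):
    print(s)
```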
which perhaps is not terribly clear, so let us convert our examples from above to the
new method:
As before with the string modulo operator, we again have a format string on
the left which has fields that will be replaced, however, here we indicate these
fields with curly brackets {}. The curly brackets and any format codes within
will be replaced by the formatted value of one of the arguments to the .format()
method. In the examples above, the positional arguments {0}, {1} and {2} were
explicitly stated, along with format codes. If the arguments are in the same order as
you want things printed, then you can leave them out. Similarly, if you do not care
about the exact formatting of the arguments, you can also leave out those specifiers:
However, if you want to use the arguments in a different order, or if you want to
use an argument more than once, then you do need to specify the positional
parameters
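These variations can be sketched as follows, assuming Python 3; the values are illustrative and the names s1 to s4 are introduced here:

```python
# Explicit positions and format codes
s1 = '{0:d} {1:s} at ${2:4.2f} each'.format(20, 'liters', 1.17234)

# Positions omitted: arguments are used in order
s2 = '{:d} {:s} at ${:4.2f} each'.format(20, 'liters', 1.17234)

# Format codes omitted as well: default formatting is used
s3 = '{} {} at ${} each'.format(20, 'liters', 1.17234)

# Reordering and reusing arguments requires explicit positions
s4 = '{1} come in lots of {0}; that is {0} {1}'.format(20, 'liters')

for s in (s1, s2, s3, s4):
    print(s)
```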
You may have noticed that the general syntax for .format() allowed keyword
arguments. This feature could be quite useful for complicated print statements, as it
makes it easier to map from the arguments back to the format string:
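A sketch of keyword arguments with .format(), assuming Python 3; the field names n, unit and price are chosen here for illustration. A dictionary can supply the same keywords through ** unpacking.

```python
s = '{n} {unit} at ${price:4.2f} each'.format(n=20, unit='liters', price=1.17234)
print(s)

# The same fields filled from a dictionary
params = {'n': 20, 'unit': 'liters', 'price': 1.17234}
s_from_dict = '{n} {unit} at ${price:4.2f} each'.format(**params)
```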
Option  Effect
'<'     Field is left-aligned within specified space (default for strings).
'>'     Field is right-aligned within specified space (default for numbers).
'='     Valid for numeric types. Force zero padding after sign but before digits.
'^'     Field is centered within the available space.
Here is a slightly more complex example where we have a list we would like to
print centered one item per line between vertical bars. We use the join() method
from section 3.1 above, using the newline character \n as the separator and iterating
over the elements in the list:
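One way this could look, as a sketch assuming Python 3; the list contents and the field width of 12 are illustrative choices:

```python
names = ['star', 'galaxy', 'universe']

# '^12' centers each item in a field 12 characters wide
text = '\n'.join('|{:^12}|'.format(name) for name in names)
print(text)
```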
a = 5.34 +- 0.02
but if you employ unicode characters you can output the ‘±’ to the terminal (or file).
The unicode character for the ‘±’ symbol is \u00B1. If you include this in your
string with a ‘u’ in front of the string to tell Python to interpret the string as unicode,
you can obtain the result in this form using
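A minimal sketch, assuming Python 3, where every string is already unicode and the u prefix is accepted but optional; the names a and da are illustrative:

```python
a, da = 5.34, 0.02

line = u'a = {} \u00B1 {}'.format(a, da)
print(line)
```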
>>> f = open('output.txt','w')
>>> f.write('This is the first line of the file.\n')
>>> f.write("{} {} at ${:4.2f} each\n".format(20, 'liters', 1.17234))
>>> f.close()
After executing either of these statement blocks from the interpreter (or a file),
your working directory will contain the file output.txt with the following lines:
If you have numerical data that you want to write in columns to a file, there are
several ways you can do this, four of which are shown in the example below
(fwrite_demo.py). All four give identical output. The first example is the most
straightforward and perhaps the first thing you would think of if coming to Python
with previous experience in C/C++ or Fortran. Example 2 brings the for loop
inside a list comprehension, saving one line (‘Flat is better than nested’). Examples 3
and 4 both use the with statement, saving another line. These use f.write() and
f.writelines(), respectively, where f.write() writes a single string to a file and
f.writelines() writes a sequence of strings to the file. The writelines()
method requires the entire sequence to be created in memory before writing to the
file and so example 4 is less memory efficient than example 3. Therefore, of the
examples shown, I recommend example 3 as the best, however, NumPy provides
the savetxt() function, which is in practice what you will probably use to write
columns of numbers to a file.
import numpy as np
x = np.arange(5.)
y = x**2
# Example 1
f = open('mydat1.txt','w')
for i in range(len(x)):
    f.write("{} {}\n".format(x[i],y[i]))
f.close()
# Example 2
f = open('mydat2.txt','w')
[f.write("{} {}\n".format(x[i],y[i])) for i in range(len(x))]
f.close()
# Example 3
with open('mydat3.txt','w') as f:
    [f.write("{} {}\n".format(i,j)) for i,j in zip(x,y)]
# Example 4
with open('mydat4.txt','w') as f:
    f.writelines(["{} {}\n".format(i,j) for i,j in zip(x,y)])
which will output a file containing the x values on the first line of the file and the y
values on the second line of the file, with all values by default printed in exponential
format to machine precision, which is not convenient:
1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00 4.000000000000000000e+00
1.000000000000000000e+00 4.000000000000000000e+00 9.000000000000000000e+00 1.600000000000000000e+01
1.00 1.0
2.00 4.0
3.00 9.0
4.00 16.0
If you would like to include header or footer lines, you can pass strings to the
header and footer keyword arguments. This example also demonstrates that if
you only include a single format specifier, it will be used for all values:
If you would like to include multiple comment lines at the top of your file, you can
do something similar to the following:
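The following sketch, assuming Python 3, pulls these savetxt() options together: data written in columns, a single format code applied to every value, a footer, and a multi-line header. The file name and the header text are illustrative; savetxt() puts the comment prefix "# " in front of each header and footer line.

```python
import os
import tempfile
import numpy as np

x = np.arange(1.0, 5.0)
y = x**2
path = os.path.join(tempfile.mkdtemp(), 'mydat.txt')

# A newline inside the header string gives several comment lines
header = 'x and y = x**2\nx y'

# column_stack puts x and y side by side so each row holds one (x, y) pair
np.savetxt(path, np.column_stack((x, y)), fmt='%6.2f',
           header=header, footer='end of data')

with open(path) as f:
    lines = f.read().splitlines()
print('\n'.join(lines))
```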
x y
1 1
2 4
3 9
As noted above, the Astropy write() function is very flexible and can also write
your data in several useful formats. If you want to write your file with the column
headings written as a comment line (so the file could be read directly with
np.loadtxt()):
# x y
1 1
2 4
3 9
\begin{table}
\begin{tabular}{cc}
x & y \\
1 & 1 \\
2 & 4 \\
3 & 9
\end{tabular}
\end{table}
>>> x = np.linspace(1,2,4)
>>> x
array([1. , 1.33333333, 1.66666667, 2. ])
>>> y = x**2
>>> y
array([1. , 1.77777778, 2.77777778, 4. ])
>>> data = Table([x,y],names=['x','y'])
>>> ascii.write(data, 'values.dat', formats={'x':'%4.2f', 'y':'%4.2f'})
x y
1.00 1.00
1.33 1.78
1.67 2.78
2.00 4.00
8
See www.hdfgroup.org/HDF5.
9
FITS stands for Flexible Image Transport System. The FITS format is widely used by astronomers and
although a binary file format, has the advantage that the metadata are included in a human-readable (ASCII)
header. See fits.gsfc.nasa.gov/fits_home.html.
defined on the write statement to label the table within the file. The example is
contained in the file apy_write.py.
import numpy as np
from astropy.table import Table
x = np.array([1, 2, 3])
y = x**2
data = Table([x,y],names=['x','y'])
data.write('values.hdf5',path='step1',overwrite=True)
x *= 2
y = x**2
data = Table([x,y],names=['x','y'])
data.write('values.hdf5',path='step2',append=True)
import numpy as np
from astropy.table import Table
Step1:
x y
--- ---
1 1
2 4
3 9
Step2:
x y
--- ---
2 4
4 16
6 36
There are other available packages for reading and writing HDF5 files in Python,
including the h5py package available at www.h5py.org. Quoting from the website,
‘[t]he h5py package provides a Pythonic interface to the HDF5 binary data format’
and there exists the book Python and HDF5 written by Andrew Collette, the lead
author of the h5py package. The h5py package provides a lower-level interface to
HDF5 files, but may include features you need that the Astropy package does not.
Chapter 7
Simple programming: flow control
To write more advanced programs, you will need to use basic flow control commands,
some of which we have used in previous examples. In this chapter, we will
first discuss conditionals, then if-elif-else blocks, for blocks, while loops
and related items. In the following chapter, we will introduce functions. As noted
above, Python uses the indentation level to indicate which lines of code belong in
a block, obviating, for example, the need for brackets as used in, for example,
the C/C++ programming languages or the end <whatever> statement used in
Fortran and other languages. The convention among Python coders is to indent four
white spaces per new block of code.
7.1 Conditionals
Python uses the boolean True and False objects in conditional statements:
>>> a = True
>>> b = False
>>> a
True
>>> b
False
>>> type(a)
<type 'bool'>
The True and False objects behave as expected with and, or and not boolean
logic statements:
In our programs, we often need to test whether some condition is True or False
before executing some block of code:
The statement elif is short for else if and these, as well as the else statement,
are optional.
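A compact sketch of boolean logic and an if-elif-else block, assuming Python 3; the function classify() and its temperature thresholds are hypothetical and serve only to illustrate the branching:

```python
a, b = True, False
print(a and b, a or b, not a)

def classify(temperature_c):
    """Return a rough phase label for water at the given temperature (deg C)."""
    if temperature_c < 0.0:
        return 'ice'
    elif temperature_c < 100.0:
        return 'water'
    else:
        return 'steam'

print(classify(20.0))
```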
it may be that you want to loop over the elements in a list of strings:
>>> a = ['star','galaxy','universe']
>>> for s in a:
...     print s
...
star
galaxy
universe
If you need to iterate over the indices of a sequence, you can do so by using the
range() and len() functions:
>>> a = ['star','galaxy','universe']
>>> for i in range(len(a)):
...     print i, a[i]
...
0 star
1 galaxy
2 universe
You can also accomplish this behavior with the enumerate() function:
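A sketch of the enumerate() version, assuming Python 3; each iteration yields an (index, item) pair, which is unpacked directly in the for statement:

```python
a = ['star', 'galaxy', 'universe']

for i, s in enumerate(a):
    print(i, s)

pairs = list(enumerate(a))   # the same (index, item) pairs collected in a list
```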
The enumerate() function can greatly simplify the situation when you need to
have a collection of items and want to know all the unique pairs. For example, to
calculate the gravitational potential energy of N point masses, we use the formula
$$U_{\mathrm{grav}} = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} U_{ij} = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{G M_i M_j}{r_{ij}} \qquad (7.1)$$
where G is the gravitational constant and r_ij is the distance between masses M_i and
M_j. So given some object particles that contains the mass and position vectors
for all particles in the system (see section 9.1 below), we could implement this double
sum in Python and hence identify all the unique pairs using the enumerate()
function as in the following example (where we assume N = 3 particles):
0 3
1 2
1 3
2 3
Should you actually want to implement this, you may find it useful to know that you can
find the distance between two vector positions using np.linalg.norm():
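Putting the pieces together, here is a sketch of the double sum over unique pairs, assuming Python 3. The container particles is a hypothetical stand-in (a list of mass and position tuples with made-up values), and the sum is accumulated with the sign convention of equation (7.1) as printed.

```python
import numpy as np

G = 6.674e-11   # gravitational constant in SI units

# Hypothetical stand-in for `particles`: (mass, position vector) tuples
particles = [
    (5.0, np.array([0.0, 0.0, 0.0])),
    (3.0, np.array([3.0, 4.0, 0.0])),
    (2.0, np.array([0.0, 0.0, 2.0])),
]

pair_sum = 0.0
pairs = []
for i, (m_i, r_i) in enumerate(particles):
    # start the inner loop at i + 1 so that each pair is visited once
    for j, (m_j, r_j) in enumerate(particles[i + 1:], start=i + 1):
        r_ij = np.linalg.norm(r_i - r_j)    # distance between the two positions
        pair_sum += G * m_i * m_j / r_ij
        pairs.append((i, j))

print(pairs)
```

The start argument of enumerate() keeps j equal to the index in the full list even though the inner loop runs over a slice.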
>>> i = 0
>>> while i <= 3:
...     print i
...     i = i + 1
...
0
1
2
3
The potential problem is that, if we are not careful, the condition might not be
met and we then have an infinite loop. Generally, it is safer to use for loops.
input, using the statement break to break out of the loop. Here is a trivial example
using the break statement:
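One possible trivial example, as a sketch assuming Python 3 (the threshold of 50 is arbitrary): the loop stops at the first integer whose square exceeds 50.

```python
for n in range(2, 100):
    if n * n > 50:
        break        # leave the loop immediately; n keeps its current value

print(n)
```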
The continue statement skips the rest of the statements in the current loop
block and continues to the next iteration of the loop. The following example program
testwhile.py demonstrates the use of the while, break and continue
statements:
#! /usr/bin/env python
while True:
    s = raw_input("What is your favorite color?: ")
    if s == 'blue':
        break
    else:
        print 'Try again.'
        continue
print 'Right. Off you go!'
% testwhile.py
What is your favorite color?: red
Try again.
What is your favorite color?: green
Try again.
What is your favorite color?: blue
Right. Off you go!
%
Chapter 8
Functions and modules
Some blocks of code will be useful in multiple projects. An example would be the
standard math routines (e.g., log10(), sqrt(), sin(), etc.). These of course are
so useful that they are part of the distribution of any language. But perhaps your
simulation program writes output files in a specific format and you have several
different programs that you use to analyze and visualize the results. You could cut
and paste the lines of code that read your file format from program 1 to program 2,
to program 3, etc., but then if you change the output format of your simulation
program you have to update all of your other programs. Instead, it is more efficient
to put your read_file() code into a module that you can import into any program.
Then if you change the file format, you only need to change the code in the
module—not in each program separately.
Tim Peters posted the following to the python-list on June 4, 1999, with
the title The Python Way¹. It has since come to be known as The Zen of Python
and is an ‘easter egg’ available at any time using import this at the interpreter
prompt:
1
Source: www.wefearchange.org/2010/06/import-this-and-zen-of-python.html.
but of course you will generally want to save your code in files for later use or simply
for a more efficient code development process. A module is simply a file that contains
one or more function definitions and associated statements.
The simple form of a function definition is
The function takes an argument(s) arg1 (etc), does something with it (them) and
returns a result. The returned result is called the return value. Note that it is possible
for a function to take no arguments and return no explicit result, but when defining a
function you must include the empty parentheses after the function name, even if
you do not pass any arguments to the function.
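A sketch of both cases, assuming Python 3; the function names hypotenuse() and greet() are illustrative:

```python
def hypotenuse(a, b):
    """Return the hypotenuse of a right triangle with legs a and b."""
    c = (a**2 + b**2) ** 0.5
    return c

def greet():
    """Takes no arguments and has no explicit return statement."""
    print('hello')

print(hypotenuse(3.0, 4.0))
result = greet()      # the empty parentheses are still required in the call
print(result)         # a function without an explicit return gives None
```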
The following example shows a common use of the pass statement introduced in
the previous chapter. A valid function definition must have at least one statement
following the definition line, so when developing a program or module, you may use
the pass statement as a placeholder statement for a function you have not yet
written (sometimes called a program stub), or instead you might opt to include a
comment that describes what the function will eventually do:
def donothing():
pass
def donothing2():
""" Compute the slacker coefficient """
Functions without an explicit return statement still return the object None.
None is an object with its own type; it is not the string 'None'.
As another example, here is a function that converts miles per hour (mph) to
meters per second (mps), saved in a file named conv.py:
def mph2mps(mph):
    """
    Converts miles per hour to meters per second
    """
    mps = mph * 0.44704
    return mps
If we import and call this function interactively, the interpreter prints the result to the
terminal, but no variable is actually set. To store the result in a variable, we
have to call the function with something of the form x = func_name(arg).
The special variable __name__ is set to the module name if the file is imported and
to '__main__' if it is run as a program. For example, consider the single-line module file
namedemo.py:
Notice how the output changes between importing the code and running the code
from the IPython interpreter with run or the shell prompt:
% ipython
In [1]: import namedemo
I am namedemo
In [2]: run namedemo.py
I am __main__
In [3]: quit
% python namedemo.py
I am __main__
def mph2mps(mph):
    """
    Converts miles per hour to meters per second
    """
    mps = mph * 0.44704
    return mps

def mps2mph(mps):
    """
    Converts meters per second to miles per hour
    """
    mph = mps / 0.44704
    return mph

def main():
    """
    Code to demonstrate the functions in this module
    mph2mps(1) -> 0.44704
    mps2mph(10) -> 22.369
    """
    print "Module conv: convert between mph <-> mps\n"
    print "Examples"
Accessing the module conv via import and run results in the following, where
we also demonstrate how to access the docstrings using
print module.function.__doc__:
% ipython
In [1]: run conv.py
Module conv: convert between mph <-> mps
Examples
1 mile per hour = 0.44704 meters per second
10 meters per second = 22.3693629205 miles per hour
$y = \frac{1}{2} a t^2$
for time
$t = \sqrt{\frac{2y}{a}}$.
def pos_vel_vs_time(y):
    from math import sqrt
    a = 9.8 # m/s^2
    t = sqrt(2*y/a)
    v = a * t
    return t, v
We can now call this function without any arguments and the defaults will be
used. We can also call it with positional arguments or with keyword arguments;
when keyword arguments are used, their order does not matter:
>>> pos_vel_vs_time()
(0.4517539514526256, 4.427188724235731)
>>> pos_vel_vs_time(1, 9.8)
(0.4517539514526256, 4.427188724235731)
>>> pos_vel_vs_time(y=5, a=0.1)
(10.0, 1.0)
>>> pos_vel_vs_time(a=0.1, y=5)
(10.0, 1.0)
Keyword arguments are used a great deal in calls to the Matplotlib plotting
functions discussed in chapter 10 below.
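The calls above are consistent with a definition along the following lines. This is a sketch reconstructed from the outputs shown; the default values y=1.0 and a=9.8 are inferred from the first call rather than copied from the original listing.

```python
from math import sqrt


def pos_vel_vs_time(y=1.0, a=9.8):
    """Return (t, v): the fall time and final speed for a drop of height y."""
    t = sqrt(2.0 * y / a)   # time to fall a distance y starting from rest
    v = a * t               # speed reached after time t
    return t, v


t_default, v_default = pos_vel_vs_time()        # both defaults used
t_kw, v_kw = pos_vel_vs_time(a=0.1, y=5)        # keywords in any order
```

Parameters with defaults must follow any parameters without defaults in the def line.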
[2] For a more in-depth treatment of the topics in this subsection, see docs.python.org/2/howto/functional.html.
where the brackets '[' and ']' help remind us that the result will be a list object.
A simple example using a list comprehension is
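A short sketch of the idea (the variable names are illustrative and not from the original listing):

```python
# [expression for item in sequence] builds a new list
squares = [i * i for i in range(5)]

# an optional trailing 'if' keeps only the items that pass the test
evens = [i for i in range(10) if i % 2 == 0]
```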
Here is a more complex example using a nested list comprehension that returns a
list of the prime numbers below 50. The list noprimes contains all the numbers
from 4 to 49 that are multiples of 2, 3, 4, …, 7 (many of them more than once, e.g., 12
occurs four times). The primes list is created by finding the integers between 2 and
49 that are not contained in the noprimes list:
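One way to write this, offered as a sketch that matches the description above (the names noprimes and primes come from the text; the exact ranges are an assumption chosen so that 12 appears four times):

```python
# multiples 2*i, 3*i, ... below 50 for every i from 2 to 7
noprimes = [j for i in range(2, 8) for j in range(2 * i, 50, i)]

# whatever is left between 2 and 49 has no divisor up to 7, so it is prime
primes = [x for x in range(2, 50) if x not in noprimes]
```

Checking divisors up to 7 is sufficient here because 7 is the largest integer whose square is below 50.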
This method is reasonably efficient for finding small primes, but for finding very
large primes the above code would quickly fill all available system memory with the
noprimes list. In such a case, a generator comprehension would be more appropriate.
Generator objects simplify the task of writing iterators and maintain their
state between calls. That is, instead of storing the entire list, a generator returns one
item each time it is called, so it is more efficient than a list comprehension
when the list is just an intermediate step and does not actually need to be stored.
Here is a trivial example of a generator comprehension statement. Note that the
surrounding parentheses indicate the statement returns a generator object:
>>> x.next()
1
>>> x.next()
4
>>> x.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
def gen_isquared(N):
    for i in range(N):
        yield i*i
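The two forms can be compared side by side. The sketch below uses the built-in next() function, which works in Python 2.6+ and Python 3, in place of the Python 2 only x.next() method shown above; the variable names are illustrative.

```python
def gen_isquared(n):
    """Yield the squares of the integers below n, one at a time."""
    for i in range(n):
        yield i * i


x = (i * i for i in range(3))   # generator expression: note the parentheses
first = next(x)
second = next(x)
third = next(x)                 # a further next(x) would raise StopIteration

# a generator function can feed sum() without building an intermediate list
total = sum(gen_isquared(4))
```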
There is a lot more to say about generators, but they are not essential here and are
arguably beyond the scope of this book. For more on generators, see
www.python.org/dev/peps/pep-0255 and the jeffknupp.com blog.
functions, and can improve both the conciseness and readability of code if used well.
Lambda functions can take any number of arguments but return just one value as
the result of evaluating a single expression. Lambda functions also have their own
local namespace; besides their parameters, they can read variables from any enclosing
function scope and from the global namespace.
The syntax of a lambda function is
lambda arg1, arg2, ...: expression
For example,
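A minimal sketch of two lambda functions (the names square and hypot2 are chosen for illustration):

```python
# one argument, one expression
square = lambda x: x * x

# several arguments are separated by commas, as in a def line
hypot2 = lambda x, y: x * x + y * y

a = square(3)
b = hypot2(3, 4)
```

Binding a lambda to a name, as done here, is only for demonstration; the form is most useful when the function is passed directly as an argument.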
The lambda function can be useful in situations where a simple function needs to
be passed as an argument to an equation solver or minimizer. For example, the
scipy.optimize module contains many useful optimization algorithms. Here is a
simple example using Brent's method as implemented in minimize_scalar to
find the minimum of the univariate function $f(x) = (x + 4)(x - 2)^2$, which is
shown in figure 8.1:
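A sketch of such a call is given below. It assumes SciPy is installed and relies on the default starting bracket of minimize_scalar; note that this cubic is unbounded below for large negative x, so what the search returns is the local minimum at x = 2.

```python
from scipy.optimize import minimize_scalar

# The lambda is passed directly as the function to minimize.
res = minimize_scalar(lambda x: (x + 4) * (x - 2) ** 2, method='brent')

x_min = res.x      # location of the local minimum
f_min = res.fun    # value of the function at that point
```

The returned object also carries diagnostic fields such as the number of function evaluations.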
Another use of the lambda function is for those who use Tkinter or
wxPython (www.wxpython.org) to make GUIs. Tkinter is the most commonly used GUI
programming toolkit for Python, but there are others; see wiki.python.org/moin/TkInter
for links to more information and tutorials. This example (lambda_tkinter.py), written by
Michael Driscoll (www.blog.pythonlibrary.org/2010/07/19/the-python-lambda),
demonstrates how useful the lambda function can be in this
context. A full discussion of Tkinter is beyond the scope of this book, but the key
lines in this example are the first btn22 = ... and btn44 = ... lines. Note that
these statements create a tk.Button() instance and bind to the printNum()
function in a single line. The lambda function is assigned to the button's command
parameter, calling printNum(). In this case, the button object is said to 'call back'
to the function object specified as its command. It turns out that using the lambda
function to implement so-called callbacks to the Tkinter (or wxPython) GUI
frameworks is one of the most frequent uses of the lambda function:
import Tkinter as tk

class App:
    """"""
    #---------------------------------------------------
    def __init__(self, parent):
        """Constructor"""
        frame = tk.Frame(parent)
        frame.pack()
    #---------------------------------------------------
    def printNum(self, num):
        """"""
        print "You pressed the %s button" % num
calls function(element) for each of the sequence's elements and assigns the
resulting list to the variable result.
So recalling our conversion functions, we have the following example that uses
map() with lambda and avoids the function definition for mph2mps() altogether:
And this can be reduced to a single line using a list comprehension as follows:
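Both variants might look like the following sketch. The list speeds_mph is hypothetical data; list() is wrapped around map() so that the result is a list under Python 3 as well, where map() returns an iterator.

```python
speeds_mph = [10, 30, 60]

# map() with a lambda: no separate mph2mps() definition is needed
speeds_mps = list(map(lambda mph: mph * 0.44704, speeds_mph))

# the same conversion as a one-line list comprehension
speeds_mps_lc = [mph * 0.44704 for mph in speeds_mph]
```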
The map() function can also be applied to more than one list at a time with the
proviso that both lists have to have the same length:
>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> map(lambda a, b: a*b, a, b)
[4, 10, 18]
>>> a = range(-4,4)
>>> a
[-4, -3, -2, -1, 0, 1, 2, 3]
>>> filter((lambda x: x > 0), a)
[1, 2, 3]
To extend this, we might have a data set for which we would like to remove data
points that are more than some number of standard deviations away from the mean, a
process called sigma clipping. For a true normal (i.e., Gaussian) distribution, we
expect 99.7% of the samples to fall within ±3σ of the mean. For 1000 numbers drawn from a
normal distribution, we expect roughly three samples to fall outside this range. In the
following example, we draw 1000 samples from a normal distribution with mean 0.0
and σ = 1.0, then filter those to return the values that are more than 3σ from the mean:
>>> import numpy as np
>>> a = np.random.normal(0,1,1000)
>>> outliers = filter((lambda x: x<-3 or x>3),a)
>>> for i in outliers:
...     print "{:5.2f}".format(i),
...
 3.03 -3.02  3.12 -3.44
The comma at the end of the print statement suppresses the default newline. Note
that the same result can be obtained with a list comprehension which is arguably
easier to read:
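A sketch of the list-comprehension version follows. The seeded RandomState is an addition made here so that repeated runs give the same sample; it is not part of the original example.

```python
import numpy as np

rng = np.random.RandomState(42)      # fixed seed for repeatability
a = rng.normal(0.0, 1.0, 1000)       # mean 0, sigma 1, 1000 samples

# keep only the values more than 3 sigma from the mean
outliers = [x for x in a if x < -3 or x > 3]
```

For large arrays, NumPy boolean indexing such as a[np.abs(a) > 3] performs the same selection without a Python-level loop.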
Should you actually need to sigma clip your data, you can use the function
sigmaclip() from scipy.stats. Using the same array a from the previous
example, sigmaclip() will return an array with the four outliers removed. Note
that this routine is iterative, so after having removed outliers, the mean and standard
deviation of the culled sample are again computed and any outliers removed. This
continues until no outliers remain—use with caution:
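A sketch of such a call is shown below, using freshly generated data rather than the array from the session above. One detail worth noting is that the low and high limits of sigmaclip() both default to 4 standard deviations, so they are set to 3 explicitly here.

```python
import numpy as np
from scipy.stats import sigmaclip

rng = np.random.RandomState(42)
a = rng.normal(0.0, 1.0, 1000)

# clipped: the surviving values; lower/upper: the final clipping thresholds
clipped, lower, upper = sigmaclip(a, low=3.0, high=3.0)

n_removed = a.size - clipped.size
```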
Chapter 9
Classes and class methods
9.1 Introduction
Up to this point, we have used many of Python's built-in object types (strings, lists,
tuples, etc.). The class object lets us define our own object type. This is a very powerful
and flexible feature, but it is also a bit more involved than what we have discussed so
far [1]. As we have discussed, Python supports object-oriented programming. Everything
in Python is an object, and the program executes by operating on the objects.
For our example, let us consider what object type would be useful for a code that
computes the time evolution of a system of point particles subject to physical forces
(typically pairwise between the particles). Such a code is called an N-body code and
these are used widely in physics and astrophysics. For our specific example, let us
assume that we have N point masses interacting via the gravitational force [2].
Conceptually, the method is straightforward. A single step in time $\delta t$ in the
simulation consists of:
1. Calculate the net gravitational force $\vec{F}_i$ for each particle $i$ resulting from the
other $N - 1$ particles.
2. Update each particle's velocity using $\vec{v}_i^{\,\mathrm{new}} = \vec{v}_i^{\,\mathrm{old}} + \frac{\vec{F}_i}{m_i}\,\delta t$.
3. Update each particle's position using $\vec{r}_i^{\,\mathrm{new}} = \vec{r}_i^{\,\mathrm{old}} + \vec{v}_i^{\,\mathrm{new}}\,\delta t$.
Once the particle positions are updated, the code again calculates the forces
and so on.
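The three steps can be sketched with plain NumPy arrays as follows. This is an illustration only: the function name nbody_step, the choice G = 1 (code units) and the direct double loop are assumptions made here, not the implementation developed in this chapter.

```python
import numpy as np

G = 1.0  # gravitational constant in code units (an assumption of this sketch)


def nbody_step(r, v, m, dt):
    """Advance positions r (N x 3) and velocities v (N x 3) by one step dt."""
    n = len(m)
    force = np.zeros_like(r)
    for i in range(n):                       # step 1: net force on particle i
        for j in range(n):
            if i != j:
                d = r[j] - r[i]              # vector pointing from i to j
                dist = np.sqrt(np.dot(d, d))
                force[i] += G * m[i] * m[j] * d / dist**3
    v_new = v + force / m[:, np.newaxis] * dt    # step 2: update velocities
    r_new = r + v_new * dt                       # step 3: update positions
    return r_new, v_new


# two equal masses at rest, one unit apart
r0 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
v0 = np.zeros((2, 3))
m0 = np.array([1.0, 1.0])
r1, v1 = nbody_step(r0, v0, m0, dt=0.01)
```

After one step the two particles have acquired equal and opposite velocities toward each other, so the total momentum remains zero.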
[1] For an excellent discussion of Python classes and their implementation, see Downey A 2012 Think Python:
How to Think Like a Computer Scientist (Needham, MA: Green Tea) www.greenteapress.com/thinkpython.
[2] In many N-body codes, the masses of all particles are identical, which reduces computation time. For more
on N-body methods (and more), see Bodenheimer P, Laughlin G P, Różyczka M and Yorke H W 2007
Numerical Methods in Astrophysics: An Introduction (Boca Raton, FL: Taylor and Francis).
Thus, for each particle, we must keep track of the position, velocity and mass.
In a procedural programming language, this is typically accomplished with
seven arrays, each of length N, which hold the positions ($x_i$, $y_i$, $z_i$), velocities
($v_{x,i}$, $v_{y,i}$, $v_{z,i}$) and masses ($m_i$) of the N particles. However, making use of the
object-oriented programming features of Python, we can define a new data type
Particle. A user-defined type is also called a class. We can define this new class
and create an object of that type as follows:
The new object is called an instance of the class and creating the new object is
called instantiation.
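A minimal sketch of such a definition and instantiation is given here; the wording of the docstring is an assumption, since the original listing is not reproduced.

```python
class Particle(object):
    """Represents a point particle with position, velocity and mass."""


p = Particle()   # instantiation: p is an instance of the class Particle
```

A class body may consist of nothing but a docstring. Attributes can then be attached to the instance with dot notation.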
>>> p.x = 3.
>>> p.y = 4.
>>> p.z = 0.
>>> p.vx = 1.
>>> p.vy = 1.
>>> p.vz = 0.
>>> p.m = 1.0
The values of the attributes can be retrieved again with dot notation and we can
use them in any valid expression:
def print_p(p):
    print "Position: {} {} {}\nVelocity: {} {} {}\nMass: {}"\
        .format(p.x,p.y,p.z,p.vx,p.vy,p.vz,p.m)
>>> print_p(p1)
Position: 3.0 4.0 0.0
Velocity: 1.0 1.0 0.0
Mass: 1.0
Note that even without having implemented a function to print the values of the
attributes of our instance, we can always print all the attributes of our object using
the pprint module, which ‘pretty prints’ any Python data structure in a form which
could be used as input to the interpreter:
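One way to do this is sketched below, under the assumption that the attribute dictionary returned by the built-in vars() is what we want to display. The small Particle stand-in is defined only to keep the example self-contained.

```python
from pprint import pprint


class Particle(object):
    """Minimal stand-in for the Particle class used in this chapter."""


p = Particle()
p.x, p.y, p.z = 3.0, 4.0, 0.0
p.m = 1.0

attrs = vars(p)   # the instance's attribute dictionary (same as p.__dict__)
pprint(attrs)     # prints the dictionary in a readable form
```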
def collision_inelastic(p1,p2):
    """ returns center of mass position, velocity, sum(mass) """
    pcm = Particle()
    pcm.m = p1.m + p2.m
When we run this we obtain the expected result and it is an instance of the
Particle() class:
Like lists, class instances are mutable, so for example we could have a function
accrete() that adds to the mass of one of our particles:
def accrete(p,dm):
    p.m += dm
>>> p1.m = 1.
>>> accrete(p1,0.5)
>>> p1.m
1.5
>>> p1 = Particle()
>>> p1.x = 3. ; p1.y = 4. ; p1.z = 0.
>>> p1.vx = 1. ; p1.vy = 1. ; p1.vz = 0.
>>> p1.m = 1.
>>> p2 = p1 # Both p1 and p2 refer to the same object!
>>> p1 is p2
True
>>> import copy
>>> p2 = copy.copy(p1)
>>> print_p(p1)
Position: 3.0 4.0 0.0
Velocity: 1.0 1.0 0.0
Mass: 1.0
>>> print_p(p2)
Position: 3.0 4.0 0.0
Velocity: 1.0 1.0 0.0
Mass: 1.0
Instead of having the position and velocity attributes defined as we did above,
we might have opted to create the individual classes Position and Velocity
for each of these vector quantities and then used them within the Particle
container object:
class Position(object):
    """ Represents particle position """

class Velocity(object):
    """ Represents particle velocity """

class Particle(object):
    """ Represents a particle """
>>> p1 = Particle()
>>> p1.m = 1.
In such a case, we must make use of the deepcopy function available in the
copy module. This function copies all levels of objects and so does return a com-
pletely independent copy of the object. It is slower than copy, but there are times
when it is unavoidable:
>>> p2 = copy.deepcopy(p1)
>>> p2 is p1
False
>>> p2.pos is p1.pos
False
>>> p2.vel is p1.vel
False
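To make the difference between the two kinds of copy concrete, here is a small sketch with a Particle that holds a Position object. The attribute names pos and x follow the session above; the values are arbitrary.

```python
import copy


class Position(object):
    """Represents particle position."""


class Particle(object):
    """Represents a particle that holds a Position object."""


p1 = Particle()
p1.pos = Position()
p1.pos.x = 3.0

shallow = copy.copy(p1)       # new Particle, but it shares p1's Position
deep = copy.deepcopy(p1)      # new Particle with its own Position

shallow.pos.x = 99.0          # this also changes p1.pos.x
```

After the last line, p1.pos.x is 99.0 while deep.pos.x is still 3.0, which is why deepcopy is needed for objects that contain other objects.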
9.4 Methods
Methods are functions associated with a particular class. For example, upper() is a
method associated with the string class:
>>> s = "ni".upper()
>>> print s
NI
Returning to our original definition of the Particle class [3], we can bring our
print function into the class definition and make it a print method:
class Particle(object):
    def print_p(self):
        print "Position: {} {} {}\nVelocity: {} {} {}\nMass: {}"\
            .format(self.x,self.y,self.z,self.vx,self.vy,self.vz,self.m)
The convention is to use self as the first parameter of a method. Note that a
method is invoked using dot notation, as in the upper() example above:
>>> p.print_p()
Position: 0.0 0.0 0.0
Velocity: 0.0 0.0 0.0
Mass: 1.0
def accrete(self,dm):
    self.m += dm
[3] Note: the complete definition of the Particle class as discussed in this chapter is available at the companion
website pythonessentials.com in the file particle_class.py.
>>> p.accrete(0.5)
>>> print p.m
1.5
class Particle(object):
    def __init__(self, x=0., y=0., z=0., vx=0., vy=0., vz=0., m=1.):
        self.x = x
        self.y = y
        self.z = z
        self.vx = vx
        self.vy = vy
        self.vz = vz
        self.m = m
Now when we create p, the values of the attributes are set, even if we call the method
with no arguments:
We can create p using some or all of the arguments. We can call using positional
arguments or keyword arguments:
>>> p = Particle(1.,2.,3.,4.,5.,6.,7.)
>>> p.print_p()
Position: 1.0 2.0 3.0
>>> p = Particle(1.,2.,3.)
>>> p.print_p()
Position: 1.0 2.0 3.0
Velocity: 0.0 0.0 0.0
Mass: 1.0
>>> p = Particle(vx=3.,vy=4.,vz=5.)
>>> p.print_p()
Position: 0.0 0.0 0.0
Velocity: 3.0 4.0 5.0
Mass: 1.0
>>> p = Particle(1.,2.,3.,4.,5.,6.,7.)
>>> print p
Position: 1.0 2.0 3.0
Velocity: 4.0 5.0 6.0
Mass: 7.0
pcm = Particle()
pcm.m = self.m + other.m
>>> p1 = Particle(vy=10.)
>>> p2 = Particle(x=1.,m=3.)
>>> print p1
Position: 0.0 0.0 0.0
Velocity: 0.0 10.0 0.0
Mass: 1.0
>>> print p2
Position: 1.0 0.0 0.0
Velocity: 0.0 0.0 0.0
Mass: 3.0
>>> pcm = p1 + p2
>>> print pcm
Position: 0.75 0.0 0.0
Velocity: 0.0 2.5 0.0
Mass: 4.0
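The pieces discussed in this chapter can be assembled into a single sketch that reproduces the output above. The bodies of __str__ and __add__ are reconstructions under the assumption that the merged particle sits at the mass-weighted mean position and velocity; they are not copied from particle_class.py.

```python
class Particle(object):
    """Point particle; attribute names follow the examples in this chapter."""

    def __init__(self, x=0., y=0., z=0., vx=0., vy=0., vz=0., m=1.):
        self.x, self.y, self.z = x, y, z
        self.vx, self.vy, self.vz = vx, vy, vz
        self.m = m

    def __str__(self):
        # called by print and str(); returns the text instead of printing it
        return ("Position: {} {} {}\nVelocity: {} {} {}\nMass: {}"
                .format(self.x, self.y, self.z,
                        self.vx, self.vy, self.vz, self.m))

    def __add__(self, other):
        """Perfectly inelastic merger: mass-weighted position and velocity."""
        pcm = Particle()
        pcm.m = self.m + other.m
        for name in ("x", "y", "z", "vx", "vy", "vz"):
            value = (self.m * getattr(self, name)
                     + other.m * getattr(other, name)) / pcm.m
            setattr(pcm, name, value)
        return pcm


p1 = Particle(vy=10.)
p2 = Particle(x=1., m=3.)
pcm = p1 + p2          # Python translates this to p1.__add__(p2)
```

Defining __add__ is what allows the + operator to be used between two Particle instances.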
This ends our discussion of Python class objects but, as you can imagine, we have
barely scratched the surface of the discussion of classes or object-oriented
programming. For more information, see chapter 9 of The Python Tutorial
(docs.python.org/2/tutorial/classes.html).
Chapter 10
Making plots with Matplotlib
and you should obtain a plot that looks something like figure 10.1.
For a simple point plot (see figure 10.2), the code file (point_plot_demo.py)
might be
Note that to leave a little white space between the plotted points and the plot frame
we have specified the plot axes with the plt.axis([xmin,xmax,ymin,ymax])
command, which expects a list as an argument (you do not pass the values directly).
The complete list of arguments to plt.plot() for setting the line/point type
and color is too long to include here, but a useful subset includes
Character    Description
'-'          solid line
'--'         dashed line
'-.'         dash-dot line
':'          dotted line
'.'          point marker
','          pixel marker
'o'          circle marker
's'          square marker
'^'          triangle marker
'v'          upside down triangle
'*'          star marker
'+'          plus marker
'D'          diamond marker
'd'          thin diamond marker

Character    Color
'b'          blue
'g'          green
'r'          red
'k'          black
'w'          white
'c'          cyan
'm'          magenta
'y'          yellow
You can also specify grayscale intensities as a string (‘0.6’), or specify the color as
a hex string (‘#5D0603’). You can also specify the markersize.
To plot multiple data sets on the same axis, just call plt.plot() multiple times.
The axes will autoscale to fit all of the data as shown in figure 10.3, but because
plt.plot() does not automatically insert white space between the data sets and
the axes, you will usually want to tweak your final plot ranges manually, as in the
previous example. Note this code (multiple_data_demo.py) also demonstrates
the use of the legend() function:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1,19,.4)
y1 = np.log10(x)
y2 = 0.01 *x**2
y3 = 0.9*np.sin(x)
plt.plot(x,y1,'r-',label='y1')
plt.plot(x,y2,'b^',label='y2')
plt.plot(x,y3,'go',label='y3')
plt.plot(x,y1+y2+y3,'+',label='y1+y2+y3')
plt.legend(loc=2)
plt.show()
# errorbar example
import numpy as np
import matplotlib.pyplot as plt
plt.errorbar(x,y,xerr=xerr,yerr=yerr,fmt='bo')
plt.show()
If you only had error bars in your y values and it was the same value (e.g., σ = 0.2)
for all data points, you could simply use
plt.errorbar(x,y,yerr=0.2,fmt='bo')
# subplot example
import numpy as np
import matplotlib.pyplot as plt
def func(x):
    return np.sin(2*np.pi*x)
x1 = np.arange(0.0,4.0,0.1)
x2 = np.arange(0.0,4.0,0.01)
y1 = func(x1)
y2 = func(x2)
y1n = y1 + 0.1*np.random.randn(len(x1))
plt.figure()
plt.subplot(211)
plt.plot(x1,y1,'bo',x2,y2,'r:')
plt.subplot(212)
plt.plot(x1,y1n,'bo',x2,y2,'r:')
plt.show()
Note that here we first call plt.figure() to initialize our figure space. This is
optional but good practice. We next specify plt.subplot(211), where the
argument 211 requests a grid of two rows and one column of plots and selects the
first (top) plot as the current one until we give a new subplot() command. If you
had, for example, a 2×2 grid of subplots, then the order 221, 222, 223 and 224
would be top left, top right, bottom left and bottom right, respectively.
bins = np.arange(histmin,histmax,width)
plt.hist(x,bins=bins)
plt.show()
#!/usr/bin/env python
"""
Plot 2 column data file using a line to connect points
"""
import sys
import numpy as np
import matplotlib.pyplot as plt

if (len(sys.argv) >1):
    infile = sys.argv[1]
    x, y = np.loadtxt(infile,unpack=True,usecols=(0,1))
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(x,y)
    # pad the axis limits by 5% of the data range on each side
    xspan = max(x) - min(x)
    yspan = max(y) - min(y)
    minx = min(x)-0.05*xspan
    maxx = max(x)+0.05*xspan
    miny = min(y)-0.05*yspan
    maxy = max(y)+0.05*yspan
    ax.axis([minx,maxx,miny,maxy])
    if (len(sys.argv) >2):
        plt.title(sys.argv[2])
    plt.show()
else:
    print "syntax: pltxy <infile>[title]"
[1] I also keep a similar code poixy.py that plots points instead of lines.
import sys
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import numpy as np
dpi = 300
labelsize = 18
ticksize = 16
lcolor = '#0000ff'
lwidth = 1.0
fcolor = '#ffff00'
infile1 = 'V344Lyr_demo.dat'
plotfile = 'inset+label.pdf'
x1, y1 = np.loadtxt(infile1,unpack=True,usecols=(0,1))
fig = plt.figure(figsize=[xsize,ysize])
ax = plt.axes([0.16,0.2,0.75,0.7])
# first panel
plt.plot(x1,y1,'k,')
xrange = max(x1) - min(x1)
yrange = max(y1) - min(y1)
minx = min(x1)-0.05*xrange
maxx = max(x1)+0.05*xrange
miny = min(y1)-0.05*yrange
maxy = max(y1)+0.05*yrange
plt.axis([minx,maxx,miny,maxy])
plt.xlabel('Time (BJD - 2455000)',labelpad=10)
plt.ylabel(r'SAP Flux ($\rm e^-\ s^{-1}$)')
# Inset
ax.add_patch(Rectangle((279.9,9800),.4,200,alpha=0.3))
plt.plot([279.3,279.9],[6430,9800],'b',alpha=0.3)
plt.plot([280.97,280.3],[6430,9800],'b',alpha=0.3)
ax1 = plt.axes([0.45,0.31,0.15,0.2])
plt.axis([279.9,280.3,10000,15000])
plt.plot(x1,y1,'k,')
plt.xticks([280,280.2],('280','280.2'))
plt.yticks([10000,14000])
plt.savefig(plotfile,dpi=dpi)
plt.show()
fig = plt.figure()
ax = fig.add_subplot(111)
z = -np.sin(x)*np.sin(y)
ax = plt.imshow(z,cmap=cm.jet,origin='lower')
cbar = fig.colorbar(ax)
cbar.ax.get_yaxis().labelpad=15
cbar.ax.set_ylabel('Z Value',rotation=270)
plt.xlabel('X Label')
plt.ylabel('Y Label')
plt.show()
10.8 3D plots
10.8.1 3D scatter plots
If you have data of three variables you would like to plot in 3D, use functions
within the mplot3d module [2]. Here is a simple example (3DPoints.py) where
200 random points are colored by their distance from the origin:
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import matplotlib.cm as cm
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
n = 200
xs = np.random.rand(n)
ys = np.random.rand(n)
zs = np.random.rand(n)
rs = np.sqrt(xs*xs + ys*ys + zs*zs)
# color each point by its distance from the origin
ax.scatter(xs, ys, zs, c=rs, cmap=cm.jet)
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
plt.show()
When this file is run, the result will be something that looks like figure 10.9. The
plot shown on the screen is interactive—you can use the mouse (click–drag) to
change the viewing angle.
[2] For more examples, see the mplot3d tutorial at matplotlib.org/mpl_toolkits/mplot3d/tutorial.html.
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
plt.show()
Note that we use the NumPy function meshgrid() in the previous example.
This function makes it easy to generate a mesh of x–y values from 1D arrays
that can then be used in statements that assign values to a 2D z array. For
example:
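A small sketch with hypothetical 1D arrays shows the shapes involved:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])
y = np.array([10.0, 20.0])

# X repeats x along the rows and Y repeats y down the columns;
# both have shape (len(y), len(x))
X, Y = np.meshgrid(x, y)

Z = X + Y   # z evaluated at every (x, y) pair in one vectorized statement
```

Because X and Y have the same shape, any elementwise expression in X and Y yields a 2D z array ready for the wireframe, surface or contour functions.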
For the surface plot, let us step up the complexity just a bit by using a color
map to shade the surface according to the z values, making the surface slightly
transparent with alpha=0.7, adding a contour plot on the base plane and adding a
colorbar to the right of the plot (see figure 10.11). The following lines of code replace
the line beginning ax.plot_wireframe(...) in the wire3D.py example:
Figure 10.11. A 3D surface plot with a contour plot base and semi-transparent surface.
Chapter 11
Applications
import numpy as np
import matplotlib.pyplot as plt
p1 = np.poly1d(c1)
p2 = np.poly1d(c2)
p3 = np.poly1d(c3)
p4 = np.poly1d(c4)
plt.plot(x,y,'ko',label='Data')
plt.plot(x,p1(x),'g-',label='1st order')
plt.plot(x, p2(x),'b-',label='2nd order')
plt.plot(x,p3(x),'c-',label='3rd order')
plt.plot(x,p4(x),'r-',label='4th order')
plt.legend()
plt.show()
When we execute the code, it prints out the fitted coefficients. As expected the final
fit returns the original coefficients and a perfect fit (see figure 11.1):
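The fitting step itself can be sketched as follows. The generating coefficients and the x grid are illustrative assumptions, not the values used in the book's example, and the plotting is omitted.

```python
import numpy as np

# Noise-free data from a known 4th-order polynomial
# (coefficients listed highest power first, as np.polyfit returns them)
true_coeffs = [0.5, -1.0, 2.0, 0.0, 1.0]
x = np.linspace(-2.0, 2.0, 21)
y = np.polyval(true_coeffs, x)

c1 = np.polyfit(x, y, 1)   # straight-line fit
c4 = np.polyfit(x, y, 4)   # 4th-order fit recovers the generating coefficients

p4 = np.poly1d(c4)         # callable polynomial built from the coefficients
```

Evaluating p4(x) then gives the fitted curve at the data points, as in the plotting calls above.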
values for the parameters and the covariance matrix. The standard errors are given
by the square-root of the diagonal elements of the covariance matrix:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

def func(x, a, b, c):
    # model function y = a + b x exp(-c x), as labeled on the plot
    return a + b*x*np.exp(-c*x)

x = np.linspace(0, 6, 20)
noise = 0.05 * np.random.normal(size=len(x))
y = func(x,1,2,1) + noise
sigma = np.ones(len(x))
p0 = np.ones(3) # initial guesses for a, b, c
p, pcov = curve_fit(func,x,y,p0,sigma)
perr = np.sqrt(np.diag(pcov))
print 'Best-fit:'
print u'a = {:g} \u00B1 {:g}'.format(p[0],perr[0])
print u'b = {:g} \u00B1 {:g}'.format(p[1],perr[1])
print u'c = {:g} \u00B1 {:g}'.format(p[2],perr[2])
xfit = np.linspace(0,6,100)
plt.plot(x,y,'ko',label='Data')
plt.plot(xfit, func(xfit,p[0],p[1],p[2]),'b-',label='Fit')
plt.legend()
plt.text(4,1.5,"$y = a + bx e^{-cx}$",fontsize=20)
plt.axis([-0.5,6.5,0.92,1.85])
plt.show()
When we run the code, it prints the best fit solution and plots the fit over the
generated data (see figure 11.2):
SciPy optimize.leastsq()
Next we present an example of fitting a sine curve to a generated curve with added
noise using the function leastsq() (see figure 11.3). In this example
(singen+fit.py), all of the data have the same weight, but it should be clear from the
code how to include point-by-point weighting if appropriate for your data. The
code for this is a bit long.
Figure 11.3. An example of a non-linear least squares fit to sinusoidal data with noise.
#!/usr/bin/env python
"""
SinGen+Fit
Generate sine curve with noise, then fit using Levenberg-Marquardt
method.
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
from scipy.optimize import leastsq
t = np.linspace(0.,50.,npts)
y = amp*np.sin(2*np.pi/per*(t-T0))+noise_amp*np.random.randn(npts)
chisq=sum(info["fvec"]*info["fvec"])
dof = len(t) - len(p1)
rmsred = np.sqrt(chisq/dof)
print "Converged with chi squared ",chisq
print "degrees of freedom, dof ",dof
print "sqrt(chisq/dof) ",rmsred
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.legend(('Data','Fit','Init Params'))
ax = plt.axes()
s = 'Best Fit:\n P = %.3f\n A = %.3f\n T0 = %.3f'%(p1[1],p1[0],p1[2])
plt.text(2.0, 3.5,s, family='monospace',fontsize=12)
plt.show()
We can solve this by initializing a nested NumPy array A to the coefficients on the
left-hand side and B to the values on the right-hand side of the equals signs. The code
(lineq.py) to solve this particular problem is
import numpy as np
from scipy import linalg
x = linalg.solve(A,B)
print x
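Since the original system of equations is not reproduced here, the sketch below uses a hypothetical 3×3 system to show how A and B are set up before calling linalg.solve().

```python
import numpy as np
from scipy import linalg

# Hypothetical system (not the one from the original listing):
#    2x +  y -  z =   8
#   -3x -  y + 2z = -11
#   -2x +  y + 2z =  -3
A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])   # one row of coefficients per equation
B = np.array([8.0, -11.0, -3.0])   # right-hand sides

x = linalg.solve(A, B)             # solution vector (x, y, z)
```

Multiplying A by the solution with np.dot(A, x) reproduces B, which is a quick check worth making in practice.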
The linalg package contains many more linear algebra functions, including
functions for matrix inversion, banded matrices, triangular matrices,
eigenvalue problems, decompositions, etc. Low-level BLAS functions are available
using scipy.linalg.blas and low-level LAPACK functions are available using
scipy.linalg.lapack.
import numpy as np
from scipy import integrate
print y, yerr
% python quad_demo.py
0.5 7.7350316838e-11
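The integrand used in quad_demo.py is not shown above, so the following sketch uses a hypothetical one, chosen only because its exact integral over [0, 1] is also 0.5.

```python
from scipy import integrate

# quad(func, a, b) returns the integral and an estimate of its absolute error
y, yerr = integrate.quad(lambda x: x, 0.0, 1.0)
```

The error estimate depends on the integrand, so it will generally differ from the value printed above.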
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from mpl_toolkits.mplot3d import Axes3D
x = pos[:, 0]
y = pos[:, 1]
z = pos[:, 2]
fig = plt.figure(figsize=(12,10))
ax = fig.add_subplot(111, projection='3d')
ax.plot(x, y, z)
plt.show()
[1] See, e.g., mathworld.wolfram.com/LorenzAttractor.html.
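The integration step that produces the pos array can be sketched as follows. The function name lorenz, the classic parameter values (σ = 10, ρ = 28, β = 8/3), the initial condition and the time grid are assumptions of this sketch rather than values taken from the original listing.

```python
import numpy as np
from scipy.integrate import odeint


def lorenz(state, t, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations; odeint calls it as f(y, t)."""
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]


t = np.linspace(0.0, 20.0, 2001)
pos = odeint(lorenz, [1.0, 1.0, 1.0], t)   # one row per time; columns x, y, z
```

The columns of pos can then be unpacked and plotted exactly as in the listing above.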
where the coefficients $c_k$ provide an estimate of the power in the time series at the
corresponding frequency. The direct coding of the above results in the following example, where in
addition we have applied the normalization such that an input sine curve with an
amplitude of A will return a peak in the Fourier transform with an amplitude of A.
In the example, the DFT is implemented in the function dft(), and we use this to
demonstrate the calculation of the amplitude spectrum on a time series consisting of
two sinusoids where we have randomly deleted 80% of the points. For large data
sets, the DFT can be very slow, as it requires $\mathcal{O}(N^2)$ operations. If your data are
equally spaced, then you will be able to use a fast Fourier transform (FFT) to
compute your estimate of the amplitude spectrum. The FFT requires only
$\mathcal{O}(N \log N)$ operations.
In the example (ft_demo.py) we include a call to the NumPy function
rfft() from the fft module (using the gapless data set), as well as the
rfftfreq() function which returns the frequencies of the calculated amplitudes
(see figure 11.5):
import numpy as np
from numpy.fft import rfft, rfftfreq
import matplotlib.pyplot as plt
from cmath import exp, pi

def dft(t,y,freq):
    nt = len(t)
    nf = len(freq)
    c = np.zeros(nf,complex)
    for k in range(nf):
        f = freq[k]
        for i in range(nt):
            c[k] += y[i]*exp(-2j*pi*f*t[i])
    return 2.*abs(c)/nt

npts = 1000
twopi = 2.* pi
A1 = 1.5 ; P1 = 10.
A2 = 1. ; P2 = 3.
t = np.linspace(0.,100.,npts)
y = A1*np.sin((twopi/P1)*t) + A2*np.sin((twopi/P2)*t)
frac_points = 0.2 # keep 20% of the points (80% deleted)
r = np.random.rand(y.shape[0])
t_subset = t[r >= (1.0 - frac_points)]
y_subset = y[r >= (1.0 - frac_points)]
nfreq = 1000
freq_dft = np.linspace(0.,1.,nfreq)
amp_dft = dft(t_subset,y_subset,freq_dft)
amp_fft = 2.*abs(rfft(y))/npts
freq_fft = rfftfreq(npts,t[1]-t[0])
fig = plt.figure()
ax = plt.subplot(311) # time series
plt.plot(t_subset,y_subset,'bo',ms=3)
plt.plot(t,y,'r',alpha=0.2)
ax = plt.subplot(312) # DFT
plt.plot(freq_dft, amp_dft)
plt.xlim([0,1])
ax = plt.subplot(313) # FFT
plt.plot(freq_fft, amp_fft)
plt.xlim([0,1])
plt.show()
[2] A classic and comprehensive text introducing Fourier transforms is Bracewell R 1999 The Fourier Transform &
Its Applications (New York: McGraw-Hill).
[3] See chapter 7 of Newman M 2012 Computational Physics (Scotts Valley, CA: CreateSpace) for a thorough
discussion of applications of Fourier transforms.
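As a numerical check of the normalization, the following sketch compares the FFT amplitude spectrum with a direct DFT on an evenly sampled sine. The sampling (1000 points at spacing 0.1) is chosen here so that the signal frequency falls exactly on a frequency bin; the direct sum is vectorized with np.outer rather than written as the double loop in dft().

```python
import numpy as np
from numpy.fft import rfft, rfftfreq

npts = 1000
dt = 0.1
t = np.arange(npts) * dt                 # evenly spaced samples, span of 100
A1, P1 = 1.5, 10.0
y = A1 * np.sin(2.0 * np.pi * t / P1)

# FFT amplitude spectrum with the same 2/N normalization used in dft()
amp_fft = 2.0 * np.abs(rfft(y)) / npts
freq_fft = rfftfreq(npts, d=dt)          # frequencies k / (npts * dt)

# direct DFT at the same frequencies: an O(N^2) sum written as a matrix product
phase = np.exp(-2j * np.pi * np.outer(freq_fft, t))
amp_dft = 2.0 * np.abs(phase.dot(y)) / npts

peak = np.argmax(amp_fft)                # index of the strongest component
```

With this sampling the peak sits at frequency 1/P1 with amplitude A1, and the two spectra agree to rounding error.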
Figure 11.5. A demonstration of a Fourier transform. The upper panel shows the original time series (red) and
the randomly selected points (blue points). The middle panel shows the DFT amplitude spectrum of the
randomly selected points and the lower panel shows the FFT amplitude spectrum of the red curve.
SciPy's signal module provides a long list of useful functions, including
lombscargle(), which returns an estimate of the power spectral density using
the Lomb–Scargle periodogram. To call this function,
we only need to calculate the angular frequencies and call as follows using our
previous definitions:
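A minimal sketch of such a call is given here. It assumes frac_points = 0.5, a fixed random seed and a frequency grid that starts just above zero; none of these values come from the original listing. The square-root scaling used to convert the periodogram to an amplitude is the conventional one for the default (unnormalized) output and should also be treated as an assumption.

```python
import numpy as np
from scipy.signal import lombscargle

# Two-sine test signal, as in the DFT example
npts = 1000
t = np.linspace(0., 100., npts)
y = 1.5*np.sin(2.*np.pi*t/10.) + 1.0*np.sin(2.*np.pi*t/3.)

# Randomly keep a fraction of the points (frac_points is an assumed value)
rng = np.random.RandomState(0)
frac_points = 0.5
keep = rng.rand(npts) >= (1.0 - frac_points)
t_subset, y_subset = t[keep], y[keep]

# lombscargle() expects angular frequencies; start above zero
freq = np.linspace(0.01, 1., 1000)
ang_freq = 2.*np.pi*freq

pgram = lombscargle(t_subset, y_subset, ang_freq)

# Conventional conversion of the unnormalized periodogram to an amplitude
amp_ls = np.sqrt(4.*pgram/len(t_subset))

f_peak = freq[np.argmax(pgram)]
print("strongest peak at f = %.3f" % f_peak)
```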
4
Based on codingmess.blogspot.com/2010/02/how-to-make-wav-file-with-python.html.
normalizes to the range -16384 to +16384, then uses the wave module functions to
create and write a WAV file:
#!/usr/bin/env python
"""
Program: dat2wav
Description
Read in time series data (2-col) and normalize.
Write sound (.wav) file
Based on http://codingmess.blogspot.com/2010/02/
how-to-make-wav-file-with-python.html
"""
import numpy as np
import wave
import sys
class SoundFile:
    def write(self):
        self.file.setparams((1, 2, self.sr, samplerate*4, 'NONE',
                             'noncompressed'))
        self.file.writeframes(self.signal)
        self.file.close()

def np2str(y):
    """Convert NumPy vector to string (h => data formatted as 16-bit ints)"""
    signal = ''.join(wave.struct.pack('h', item) for item in y)
    return signal
outfile = sys.argv[2]
x, y = np.loadtxt(infile,usecols=(0,1),unpack=True)
ssignal = np2str(ydata)
f = SoundFile(ssignal)
f.write()
else:
print "syntax: dat2wav <infile> <outfile> [samplerate]"
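The listing above is written for Python 2. As a hedged illustration of the same idea in Python 3, the sketch below scales a series to the range -16384 to +16384 and writes it as a mono 16-bit WAV file using only the standard library. The function name write_wav, the 440 Hz test tone, the 8000 Hz sample rate and the output path are all choices made for this sketch, not part of dat2wav.

```python
import math
import os
import struct
import tempfile
import wave

def write_wav(path, samples, samplerate=44100, peak=16384):
    """Scale samples to +/-peak and write them as a mono 16-bit WAV file."""
    largest = max(abs(s) for s in samples) or 1.0   # avoid dividing by zero
    # '<h' packs each value as a little-endian signed 16-bit integer
    frames = b"".join(
        struct.pack("<h", int(round(peak * s / largest))) for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 2 bytes per sample
        wav.setframerate(samplerate)
        wav.writeframes(frames)

# One second of a 440 Hz tone as stand-in data
samplerate = 8000
tone = [math.sin(2.0*math.pi*440.0*n/samplerate) for n in range(samplerate)]
out_path = os.path.join(tempfile.gettempdir(), "dat2wav_sketch.wav")
write_wav(out_path, tone, samplerate)
```

In Python 3 the packed samples are bytes objects, so they are joined with b"" rather than with an empty text string.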
IOP Concise Physics
Chapter 12
Visualization and animations
12.1 VPython
VPython1 is based on the visual 3D graphics module contributed by David
Scherer in 2000. The visual package allows you to place 3D objects in a scene.
The scene is updated many times per second, allowing animations and user-
controlled scene rotations and scalings that are fluid. The vpython.org website
contains links to tutorial videos that provide a good introduction. Here is a simple
example2 (ball+box.py) that simulates a ball bouncing between two walls.
Figure 12.1 shows a screen capture from a slightly modified version of this code
that creates a new sphere every 15 time steps. In the code, the visual module is
imported on the first line. Next, the scene is initialized, where center=(0,3,0)
indicates the coordinates to which the camera points. The next three lines initialize
the wall and floor objects. The ball object is initialized next at position (-4.5,
4, 0), radius = 0.5 and with a red color. The next line initializes the vector
velocity. The while loop is infinite, so the program is terminated by closing the
plot window. The command rate(100) indicates that no more than 100 time
steps should be taken per second, which is needed for fast computers to prevent the
entire simulation from zipping by too fast to register. The ball object’s position is
updated with the velocity as a vector statement. Normal velocity components are
mirrored upon collision with the walls or floor and the y velocity is updated using
$v_{y,\mathrm{new}} = v_{y,\mathrm{old}} - g\,dt$ on all time steps except when there is a floor collision. Note that
there are no graphics calls in the while loop, just calculations updating positions
and velocities:
while 1:
    rate(100)
    ball.pos = ball.pos + ball.velocity*dt
    if ball.y < ball.radius:
        ball.velocity.y = abs(ball.velocity.y)
    else:
        ball.velocity.y = ball.velocity.y - 9.8*dt
    if ball.x < -5+ball.radius or ball.x > 5-ball.radius:
        ball.velocity.x = -ball.velocity.x
1
Tagline: 3D Programming for Ordinary Mortals. VPython is available from vpython.org. To install VPython in
Anaconda, type conda install -c mwcraig vpython at a terminal prompt. To build VPython and
dependencies from source, see the instructions at github.com/mwcraig/conda-vpython-recipes.
2
Based on the bounce example from vpython.org/contents/bounce_example.html.
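The update rule in the while loop can be exercised without any graphics. The sketch below applies the same position and velocity updates to plain numbers and records the trajectory. The initial velocity, time step and number of steps are assumed values, and the function name bounce is introduced only for this illustration.

```python
def bounce(nsteps=5000, dt=0.01, g=9.8, radius=0.5, half_width=5.0):
    """Return the x and y history of a ball bouncing between two walls."""
    x, y = -4.5, 4.0          # starting position used in the text
    vx, vy = 2.0, 0.0         # assumed initial velocity
    xs, ys = [], []
    for _ in range(nsteps):
        # Position update uses the current velocity
        x += vx*dt
        y += vy*dt
        if y < radius:
            vy = abs(vy)      # floor collision: send the ball upward
        else:
            vy -= g*dt        # otherwise apply gravity
        if x < -half_width + radius or x > half_width - radius:
            vx = -vx          # wall collision: mirror the x velocity
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = bounce()
print("x range: %.2f to %.2f" % (min(xs), max(xs)))
print("lowest y: %.2f" % min(ys))
```

Because the collision test happens after the position update, the ball can overshoot a wall or the floor by up to one step of travel before its velocity is reversed.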
Rsun = 7.e8
Msun = 2.e30
G = 6.67e-11
M = M1+M2
a = 5*Rsun
a1 = a*M2/M
a2 = a*M1/M
Porb = np.sqrt(4*np.pi**2 / (G*M) * a**3)
v1circ = 2 * np.pi * a1 / Porb
v2circ = 2 * np.pi * a2 / Porb
v1 = 0.8 * v1circ
v2 = v1 * M1/M2
while 1:
    rate(100)
    # Separation vector from star 1 to star 2 and its unit vector
    r12 = s2.pos - s1.pos
    r = mag(r12)
    r12hat = r12 / mag(r12)
    # Gravitational accelerations are equal and opposite in direction
    accel1 = G * M2 / r**2 * r12hat
    accel2 = G * M1 / r**2 * -r12hat
    s1.velocity = s1.velocity + accel1*dt
    s2.velocity = s2.velocity + accel2*dt
    s1.pos = s1.pos + s1.velocity*dt
    s2.pos = s2.pos + s2.velocity*dt
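The same integration loop can be checked numerically with NumPy arrays in place of VPython objects. In the sketch below the masses M1 and M2, the time step dt and the number of steps are assumed values, since they are not visible in the listing above; the initial speeds follow the v1 and v2 definitions given there. Because the two accelerations are equal and opposite when weighted by mass, the total momentum should stay at zero to within rounding error.

```python
import numpy as np

G = 6.67e-11
Rsun = 7.e8
Msun = 2.e30
M1, M2 = 1.0*Msun, 0.5*Msun      # assumed masses
M = M1 + M2
a = 5*Rsun
a1, a2 = a*M2/M, a*M1/M          # distances from the center of mass
Porb = np.sqrt(4*np.pi**2/(G*M)*a**3)
v1 = 0.8 * 2*np.pi*a1/Porb       # 80% of the circular speed
v2 = v1*M1/M2                    # balances the momentum of star 1

pos1 = np.array([-a1, 0.0])
pos2 = np.array([a2, 0.0])
vel1 = np.array([0.0, v1])
vel2 = np.array([0.0, -v2])
dt = Porb/2000.                  # assumed time step

for _ in range(4000):
    r12 = pos2 - pos1
    r = np.linalg.norm(r12)
    r12hat = r12/r
    # Update velocities first, then positions with the new velocities
    vel1 = vel1 + G*M2/r**2*r12hat*dt
    vel2 = vel2 - G*M1/r**2*r12hat*dt
    pos1 = pos1 + vel1*dt
    pos2 = pos2 + vel2*dt

p_total = M1*vel1 + M2*vel2
print("relative total momentum:", np.linalg.norm(p_total)/(M1*v1))
```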
3
For download, installation, documentation and a gallery of examples, visit code.enthought.com/projects/
mayavi.
import numpy as np
from mayavi import mlab
x = np.linspace(1,4,4)
x, y, z = np.meshgrid(x, x, x)
r = np.sqrt(x*x + y*y + z*z)
mlab.figure(bgcolor=(1,1,1),size=(1200,1200))
mlab.points3d(x,y,z,r,resolution=16,opacity=0.7)
mlab.show()
argument scalars=r instructs Mayavi to use the r values to scale the color map
on the surface (see figure 12.4):
a = -3./256*sqrt(1001/pi)
dp = pi/250.0
[theta,phi] = mgrid[0:pi+dp:dp,0:2*pi+dp:dp]
x = r*sin(theta)*cos(phi)
y = r*sin(theta)*sin(phi)
z = r*cos(theta)
fig = mlab.figure(bgcolor=(1.,1.,1.),size=(1200,1200))
s = mlab.mesh(x, y, z,scalars=r,figure=fig)
#mlab.savefig('Y10.5_mayavi.png')
mlab.show()
12.3 Animations
Matplotlib contains the animation module which can be useful for several purposes.
As discussed above, animations are excellent for visualizing physical phenomena. In
addition, animations can be very effectively used during public talks or for web
publications. Even a simple value versus time plot can be brought to life by animation.
For example, the following code4 reads in the CO2 data discussed in section 6.1.2 (see
figure 12.5). The line object is created with l, = plt.plot([], [], 'r-') and is
of type Line2D. It is the only element that changes during the animation. The line
object is updated by the function update_line(), which is called repeatedly to
generate the animation. Here, the ellipsis ‘...’ serves as a placeholder for
however many ‘:’ slices are needed, and :num selects only the first num values
(indices 0 to num-1) along the last axis. As a result, each subsequent call sets
the data for the line object to contain one additional data pair, up to
the complete data set for the final call. The interval keyword argument specifies a
delay of 20 ms between frames. The blit keyword, if set to True, tells FuncAnimation
to redraw only the pixels of the plot that have changed, which can speed up the
animations considerably5. Finally, the animation is saved in the file anim_co2.mp46.
4
Based on basic_example.py (anim_co2.py) from the matplotlib.org animation examples page
matplotlib.org/1.4.2/examples/animation/index.html. A nice tutorial is available at jakevdp.github.io/blog/
2012/08/18/matplotlib-animation-tutorial.
5
If using OS X, you will need to specify blit=False due to the way in which the OS works. This has been a
known issue for several years now and apparently is not a simple fix. For more information, see github.com/
matplotlib/matplotlib/issues/531.
6
Available at pythonessentials.org/anim_co2.mp4.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
fig = plt.figure()
data = np.loadtxt('co2_mm_mlo0.txt',usecols=(2,3),unpack=True)
nlines = data.shape[1]
plt.show()
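The listing above shows only the first and last lines of the program. The sketch below illustrates the update_line() and FuncAnimation pattern described in the text, using a synthetic sine curve in place of the CO2 file so that it runs without any data. The Agg backend, the axis limits and the frame count are assumptions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")   # off-screen backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

def update_line(num, data, line):
    # Show only the first num (x, y) pairs of the 2 x N data array
    line.set_data(data[..., :num])
    return line,

# Synthetic stand-in for the two data columns, shape (2, 200)
x = np.linspace(0., 10., 200)
data = np.vstack((x, np.sin(x)))

fig = plt.figure()
l, = plt.plot([], [], 'r-')
plt.xlim(0., 10.)
plt.ylim(-1.1, 1.1)

# One frame per data pair, plus the initial empty frame
anim = animation.FuncAnimation(fig, update_line, frames=data.shape[1] + 1,
                               fargs=(data, l), interval=20, blit=True)
```

To view the result interactively, remove the matplotlib.use("Agg") line and finish with plt.show(); saving to a movie file with anim.save() additionally requires an external writer such as ffmpeg.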
Chapter 13
Interfacing with other languages
It is possible to call C/C++ and Fortran routines from within a Python program and
vice versa. You might want to do this if you have a working and often-used
computationally intensive routine in one of these languages, but you prefer the more
user-friendly interface that Python can present. Perhaps surprisingly, calling a
Fortran routine is less involved than interfacing with C/C++ as long as we use
f2py1, which is now part of the NumPy distribution and is what we will explore in
this chapter. If you have the need to interface with C/C++, see the documentation at
docs.python.org/2/extending/extending.html.
Here we will use the specific example of a Fortran subroutine (dft in file
dftsub.f) that calculates a simple DFT, as discussed in section 11.4. The subroutine
calculates the normalized amplitudes, such that when passed a noise-free
sine curve with amplitude 1.0, the calculated amplitude will equal 1.0 at the
appropriate frequency. The implementation here does not require that the input data
be equally spaced, as is required when using an FFT.
      subroutine dft(t,y,freq,amp,numt,nfreq)
      twopi = 2 * 3.141592653589793
      do ifreq = 1,nfreq
         f = freq(ifreq)
         fr = 0.
         fi = 0.
         do i = 1,numt
            a = twopi * f * t(i)
            c = cos(a)
            s = sin(a)
            fr = fr + y(i) * c
            fi = fi + y(i) * s
         end do
         fr = fr/numt
         fi = fi/numt
         ff = fr*fr + fi*fi
         amp(ifreq) = 2.*sqrt(ff)
      end do
      return
      end
1
See, for example, docs.scipy.org/doc/numpy/user/c-info.python-as-glue.html.
def dft(t,y,freq):
    from numpy import zeros
    from cmath import exp, pi
    nt = len(t)
    nf = len(freq)
    c = zeros(nf,complex)
    for k in range(nf):
        f = freq[k]
        for i in range(nt):
            c[k] += y[i]*exp(-2j*pi*f*t[i])
    return 2.*abs(c)/nt
where both implementations return the same values to six decimal places.
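The agreement between the two formulations can be checked in Python alone. The sketch below is not the Fortran code itself: dft_cos_sin() is a NumPy port of the cosine and sine sums in the subroutine, and dft_complex() is the complex-exponential form with the inner loop vectorized. Both function names and the test signal are assumptions made for this illustration.

```python
import numpy as np

def dft_complex(t, y, freq):
    """Complex-exponential form, vectorized over the time samples."""
    amp = np.empty(len(freq))
    for k, f in enumerate(freq):
        amp[k] = 2.*abs(np.sum(y*np.exp(-2j*np.pi*f*t)))/len(t)
    return amp

def dft_cos_sin(t, y, freq):
    """Cosine/sine form following the loop structure of the Fortran routine."""
    amp = np.empty(len(freq))
    for k, f in enumerate(freq):
        a = 2.*np.pi*f*t
        fr = np.sum(y*np.cos(a))/len(t)
        fi = np.sum(y*np.sin(a))/len(t)
        amp[k] = 2.*np.sqrt(fr*fr + fi*fi)
    return amp

# Noise-free sine of amplitude 1.0 and period 10
t = np.linspace(0., 300., 3000)
y = np.sin(2.*np.pi*t/10.)
freq = np.linspace(0., 1., 301)

amp_a = dft_complex(t, y, freq)
amp_b = dft_cos_sin(t, y, freq)
print("largest difference:", np.max(np.abs(amp_a - amp_b)))
print("peak amplitude:", amp_a.max(), "at f =", freq[np.argmax(amp_a)])
```

The sign of the sine sum differs between the two forms, but only its square enters the amplitude, so the results coincide.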
Given our Fortran subroutine, we can use f2py to create a module that can be
imported and used:
This creates a file on your system with the basename dftsub and an extension
that is the appropriate extension for a Python extension module on your platform
(e.g., .so, .pyd, etc). The module dftsub is now importable, but all array
dimensions must be declared in the calling function.
In this example (dft-compare.py) we generate a time series data set of 10 000
points consisting of two periods, of ten and three seconds, of different amplitudes.
We calculate the DFT at 1000 frequency points using both the Python code and the
code in the Fortran-derived module (dftsub.so). We use the time() function
from the time module to determine how much time is spent in each of the two DFT
routines and find that the Python DFT function takes some 150 times longer to
complete than the Fortran subroutine:
#!/usr/bin/env python
"""
DFT Comparison (Python vs. Fortran)
Generate 2-sine curve, then call Python and Fortran DFT function.
"""
from numpy import *
import matplotlib.pyplot as plt
import time
import dftsub
import dft_py
npts = 10000
twopi = 2.* pi
A1 = 2. ; P1 = 10.
A2 = 1. ; P2 = 3.
t = linspace(0.,300.,npts)
y = A1*sin((twopi/P1)*t) + A2*sin((twopi/P2)*t)
nfreq = 1000
freq = linspace(0.,1.,nfreq)
amp_f = zeros(nfreq,dtype='float')
t0p = time.time()
amp_py = dft_py.dft(t,y,freq)
t1p = time.time()
t0f = time.time()
dftsub.dft(t,y,freq,amp_f,npts,nfreq)
t1f = time.time()
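The timing pattern used above can be tried without compiling anything. In the sketch below a NumPy-vectorized function stands in for the compiled dftsub module, which is an assumption of this illustration and not the book's benchmark; the helper timed() and the reduced array sizes are likewise chosen only to keep the run short. The measured ratio will depend on the machine and should not be compared with the factor quoted in the text.

```python
import time
import numpy as np

def dft_loop(t, y, freq):
    """Pure-Python double loop, as in the dft() function."""
    nt = len(t)
    c = np.zeros(len(freq), complex)
    for k in range(len(freq)):
        for i in range(nt):
            c[k] += y[i]*np.exp(-2j*np.pi*freq[k]*t[i])
    return 2.*abs(c)/nt

def dft_vectorized(t, y, freq):
    """Stand-in for the compiled routine: one matrix-vector product."""
    phase = np.exp(-2j*np.pi*np.outer(freq, t))
    return 2.*np.abs(phase.dot(y))/len(t)

def timed(func, *args):
    """Return the function result and the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

npts, nfreq = 400, 100
t = np.linspace(0., 300., npts)
y = 2.*np.sin(2.*np.pi*t/10.) + np.sin(2.*np.pi*t/3.)
freq = np.linspace(0., 1., nfreq)

amp_loop, dt_loop = timed(dft_loop, t, y, freq)
amp_vec, dt_vec = timed(dft_vectorized, t, y, freq)
print("loop: %.4f s, vectorized: %.4f s" % (dt_loop, dt_vec))
```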