Python in High Performance Computing
Dr. Jussi Enkovaara and Dr. Martti Louhivuori have over ten years of experience working with
Python on some of the biggest supercomputers in the world. You may follow us during the course.
Please introduce yourself in the comments section below, and tell us why you are attending the course!
This course is brought to you by the Partnership for Advanced Computing in Europe (PRACE).
PRACE is an international non-profit association with its seat in Brussels. The PRACE Research
Infrastructure provides a persistent world-class high performance computing service for scientists and
researchers from academia and industry in Europe. The computer systems and their operations
accessible through PRACE are provided by 5 PRACE members (BSC representing Spain, CINECA
representing Italy, CSCS representing Switzerland, GCS representing Germany and GENCI
representing France). The Implementation Phase of PRACE receives funding from the EU’s Horizon
2020 Research and Innovation Programme (2014-2020) under grant agreements 730913 and 823767.
For more information, see www.prace-ri.eu.
https://training.prace-ri.eu/
https://prace-ri.eu/
1.2. Prerequisites and structure of the course
In order to succeed in this course, you need to know a few things beforehand.
First, this course is aimed at people who already know how to program in Python. We will not be
teaching Python programming as such. That said, you do not need to be an expert in Python
programming either.
We expect you to be familiar (and comfortable) with:
• Python syntax
• Basic built-in data structures (lists, tuples and dictionaries)
• Control structures (if-else, while, for)
• How to write and use functions and modules
• File I/O
You can test if you have the prerequisite level of skills and knowledge by going through the quiz in the
next step.
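As a rough self-check (not part of the official quiz), the short script below touches each of the listed prerequisites: a tuple, a list, a dictionary, a for loop, an if statement, functions and file I/O. The function names and the file name are made up for illustration. If you can read it and predict what it prints, you most likely have the expected background.

```python
import os
import tempfile


def word_lengths(words):
    """Return a dictionary that maps each word to its length."""
    lengths = {}
    for word in words:
        if word not in lengths:
            lengths[word] = len(word)
    return lengths


def write_and_read(lines):
    """Write the lines to a temporary file and read them back as a list."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example.txt")  # hypothetical file name
        with open(path, "w") as f:
            for line in lines:
                f.write(line + "\n")
        with open(path) as f:
            return [line.strip() for line in f]


words = ("numpy", "mpi", "cython")       # a tuple
lengths = word_lengths(words)            # a dictionary
roundtrip = write_and_read(list(words))  # a list written to and read from a file

print(lengths)
print(roundtrip)
```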
Second, previous knowledge of high-performance computing is not needed. We will discuss the
relevant topics as we proceed.
Third, the course duration is four weeks. Each week we discuss a different topic related to Python
performance. The main aspects of each topic are covered in articles and videos, but most importantly,
these are followed by hands-on exercises on the topic. At least half of your time should be devoted to
the exercises, since getting your hands dirty and doing it yourself is simply the best way to learn
programming.
We provide simple instructions for setting up a virtual machine with the programming environment for
the course on your own laptop/workstation. The virtual machine is based on Linux, so some
familiarity with the Linux command line is useful for carrying out the exercises.
5. You should now see the HPC Python image listed in VirtualBox. You can start it either by
double-clicking it or via the Start button.
6. The system should now boot up. Once you are greeted with a login screen, log in with the following
credentials:
• username: Monty Python
• password: hpc1python
7. Hands-on exercises are carried out in the command line terminal which you can open from the
launcher panel on the left.
• There are several standard text editors (gedit, nano, emacs, vim) available. If you are not
familiar with any of these, we recommend starting with gedit.
Exercise material
In addition to installing the required software (see above), you also need to download some material for
the hands-on exercises.
Exercise material is hosted on GitHub at: https://github.com/csc-training/hpc-python
Download
You have three options for downloading the material to your own Linux system or to the Virtual
Machine:
1. Recommended approach: Fork the GitHub repository and clone your fork
• You need to have a GitHub user account for this option
• Go to the course repository on GitHub and sign in
• Next, Fork the repository via the button in the top right corner.
• After forking the repository, open the Terminal and clone your fork with the command (using
your own GitHub username):
git clone https://github.com/my-github-username/hpc-python.git
An easy way to get the URL for cloning is to copy it from the green Clone or download button
on the GitHub page of your fork.
• No further use of git is required in the course, but if you are familiar with it, we strongly
recommend committing often and pushing your work back to your own fork on GitHub.
2. Clone the repository directly
• Open the Terminal and clone the repository directly with the command:
git clone https://github.com/csc-training/hpc-python.git
• However, with this option, any changes you make will only be available locally and cannot be
pushed back to GitHub.
3. (Not recommended): If needed, you can also download all the material as a Zip file via the Clone or
download button. However, this option means you will lose all the benefits of version
control.
Overall structure
Skeleton code snippets and model solutions to hands-on exercises are in separate directories for each
exercise. The exercises are organised in subdirectories (mpi/, numpy/ etc.) for each topic under the
main directory (hpc-python/).
The file README.md contains a list of all exercises (with links), which is also shown as the default
page on GitHub, and can be an easy way to navigate to an exercise.
Each exercise also has a README.md file that contains the instructions for the exercise and a solution/
directory that contains a model solution. Additional files (skeleton code, input data etc.) may also be
present.
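To get an overview of a local copy, the layout described above can be walked with a few lines of Python. This is only a sketch: it assumes every exercise directory sits exactly one level below a topic directory (topic/exercise/README.md) and that the material was cloned into hpc-python/ in the current working directory. The function name find_exercises is made up for illustration and is not part of the course material.

```python
from pathlib import Path


def find_exercises(root):
    """Return (exercise path, has_solution) pairs found under root.

    Assumes the layout root/<topic>/<exercise>/README.md.
    """
    root = Path(root)
    exercises = []
    for readme in sorted(root.glob("*/*/README.md")):
        exercise_dir = readme.parent
        # A model solution is expected in a solution/ subdirectory
        has_solution = (exercise_dir / "solution").is_dir()
        exercises.append((exercise_dir.relative_to(root).as_posix(), has_solution))
    return exercises


if __name__ == "__main__":
    # Assumed location of the cloned repository
    for name, has_solution in find_exercises("hpc-python"):
        status = "solution available" if has_solution else "no solution directory"
        print(f"{name}: {status}")
```

If the repository is located elsewhere, pass its path to find_exercises instead. An empty result simply means that no directory matched the assumed layout.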