Manual
Summer 2025
1.1 Requirements
1. A C/C++ compiler that supports the C++17 standard (gcc or clang).
3. Running/Testing
• Local First: Test thoroughly on your machine before using the cluster.
– Use Standard Mode: ./allgather_merge [options] -c.
– Use Measurements Mode: ./allgather_merge -c.
• Hydra Cluster (for Report):
– See the instructions on how to use Hydra.
– Use Measurements Mode (./allgather_merge without -m and without -c) for final
results.
– Run experiments using the provided job-run.sh script (check paths inside it first).
1.3 Configure
Configuration prepares the project for compilation. We use the CMake build system to build the project.
Two configurations are available: Debug and Release. The Debug configuration is optional; it
is meant to help you debug your code. The main components of the project are the source files (.cpp,
.c and .h) and CMakeLists.txt, which is used by CMake. When you configure your project, you
specify a build folder.
• Run the configuration commands in the top folder of the project, which is where CMakeLists.txt
is located.
• The configuration commands create a folder named “build”, where the project will be built next.
• You can use the Release configuration for testing the correctness of your code locally.
• Use the Release configuration to perform the experiments and collect measurements on Hydra once
your code is correct.
• The Debug configuration (-DCMAKE_BUILD_TYPE=Debug) activates the DEBUG directives, allowing
you to print debug information in specific parts of your code.
• You can add DEBUG directives in specific parts of your code using #ifdef DEBUG; see the examples
in the main.cpp file.
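The #ifdef DEBUG pattern can be sketched as follows. This is a minimal illustration and not the code from main.cpp: the helper names build_mode and debug_print are hypothetical, and the sketch assumes that the Debug configuration defines the DEBUG macro at compile time.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the project sources): reports whether
// the DEBUG macro was defined when this file was compiled.
std::string build_mode() {
#ifdef DEBUG
    return "debug";
#else
    return "release";
#endif
}

// Hypothetical helper: prints a buffer only when DEBUG is defined.
// Returns the number of elements printed so the behaviour can be checked.
int debug_print(const int *buf, int count) {
    int printed = 0;
#ifdef DEBUG
    for (int i = 0; i < count; ++i) {
        std::printf("buf[%d] = %d\n", i, buf[i]);
        ++printed;
    }
#else
    (void)buf;   // avoid unused-parameter warnings when DEBUG is not defined
    (void)count;
#endif
    return printed;
}
```

When DEBUG is not defined, the printing loop is removed by the preprocessor, so it adds no overhead to the measurements.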
1.4 Build
The build phase compiles the code into an executable. After a successful Release configuration you
can build the project:
• Run the build command again to rebuild your project after you change the source code. You do
not need to reconfigure the project.
2 Development
• The algorithms.h file contains the function signatures of the three algorithms you will need to
implement for the project. DO NOT edit this file.
• For simplicity, you can assume that the recvbuf and sendbuf data types are tuwtype_t, which
is defined in algorithm.h as MPI_INT.
This simplifies declaring the types of internal buffers that you may use.
• Write your implementations in the following files:
– baseline.cpp: The Baseline algorithm.
– algorithm1.cpp: Distributed merging with Bruck’s algorithm.
– algorithm2.cpp: Distributed merging with a circulant algorithm.
• The algorithms are called by the main function in main.cpp. You can add code for debugging
(DEBUG directives) or correctness checks, and you can also adjust the WARMUP and REPEAT
parameters, but you should not change the main structure of main.cpp.
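The note on tuwtype_t and internal buffers can be illustrated with a small, MPI-free sketch. It is not one of the three algorithms: the alias below is a stand-in that assumes tuwtype_t is a plain int (the counterpart of MPI_INT), and merge_blocks is a hypothetical helper name. In the project, use the definition from the provided header instead.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Stand-in for the project's type; assumes tuwtype_t is a plain int
// (the counterpart of MPI_INT). Do not redefine it in the project itself.
using tuwtype_t = int;

// Hypothetical helper: merges two sorted blocks into a new sorted buffer.
std::vector<tuwtype_t> merge_blocks(const std::vector<tuwtype_t> &a,
                                    const std::vector<tuwtype_t> &b) {
    std::vector<tuwtype_t> out;            // internal buffer of the same type
    out.reserve(a.size() + b.size());      // one allocation for the result
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(out));   // requires both inputs to be sorted
    return out;
}
```

Because the element type is fixed, the internal buffer can be declared directly as std::vector<tuwtype_t>, and a simple check such as std::is_sorted on the result can serve as a local correctness test.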
3 Run/Test
• To run on your local machine use the mpirun provided by the MPI installation. For example,
mpirun -np <number of processes> ./executable args.
• There are two major run modes: Standard Mode and Measurements Mode.
Usage: ./allgather_merge [-m <value>] [-a <value>] [-t <value>] [-c <value>]
-m, --msgsize: Message size (default: 10)
-a, --algo: Algorithm (0-baseline, 1-Bruck, 2-Circulant) (default: 0)
-t, --type: Input type (0-2) (default: 0)
-c, --check: Verify results (default: false)
• You can also run debug builds (-DCMAKE_BUILD_TYPE=Debug) in Standard Mode.
• Run in Standard Mode and make sure your solutions produce no errors and are correct
before running in Measurements Mode.
• Measurements Mode is activated if you do not specify any message size (-m, --msgsize).
• Measurements Mode runs all three algorithms, each with seven message sizes (see main.cpp) and
three types of input.
• Run in Measurements Mode only after performing some basic correctness checks in Standard
Mode and after ensuring that the project is configured with -DCMAKE_BUILD_TYPE=Release.
• You can use -c to verify your results for multiple message sizes in Measurements Mode.
However, because this can take very long with many processes, you should do it only on
your local machine.
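The rule that separates the two modes (Measurements Mode is active when no message size is given) can be sketched with a small option parser. This is an illustration under stated assumptions, not the parser used in main.cpp: the Options struct, its field names, and parse_options are hypothetical, and -c is treated here as a flag without a value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical option container; defaults follow the usage text.
struct Options {
    int  msgsize = 10;         // -m, --msgsize
    int  algo = 0;             // -a, --algo (0-baseline, 1-Bruck, 2-Circulant)
    int  type = 0;             // -t, --type (0-2)
    bool check = false;        // -c, --check (assumed to take no value)
    bool measurements = true;  // stays true unless a message size is given
};

// Hypothetical parser over the arguments that follow the program name.
Options parse_options(const std::vector<std::string> &args) {
    Options opts;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string &a = args[i];
        const bool has_value = i + 1 < args.size();
        if ((a == "-m" || a == "--msgsize") && has_value) {
            opts.msgsize = std::stoi(args[++i]);
            opts.measurements = false;  // an explicit size selects Standard Mode
        } else if ((a == "-a" || a == "--algo") && has_value) {
            opts.algo = std::stoi(args[++i]);
        } else if ((a == "-t" || a == "--type") && has_value) {
            opts.type = std::stoi(args[++i]);
        } else if (a == "-c" || a == "--check") {
            opts.check = true;
        }
    }
    return opts;
}
```

With this sketch, the arguments -m 100 -c select Standard Mode with verification, while -c alone leaves Measurements Mode active with verification enabled.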
4 General Workflow
We strongly suggest that you run the code locally before running on the Hydra cluster;
otherwise, the system will be overloaded. If the system is overloaded, all student experiments
will take longer to complete, because Hydra is shared by all students.
• Build the project and do not forget to rebuild it anytime you modify the source code.
• Run and Test on your local machine and check the following:
1. The code does not produce runtime errors, e.g., segmentation faults.
2. The code is correct, meaning that the output is as expected.
• You are advised to run in both Standard Mode and Measurements Mode on your local machine.
1. Ensure that you can connect to hydra: Instructions for connecting to hydra. The link works only
from the internal TU Wien network.
2. Copy only your sources to Hydra: Instructions for transferring data on hydra. Do not copy
binaries; you need only the CMakeLists.txt and the source files that are in the sources folder.
4. Load the environment: spack load openmpi/4.1.5.
6. Before running the experiments read the following: Slurm commands and Running MPI jobs on
Hydra.
7. We use the sbatch utility to run jobs on Hydra; we do not use mpirun.
8. Use the provided SBATCH script job-run.sh in the sources folder to run your experiments in
Measurements Mode on Hydra: sbatch job-run.sh
• Before executing the sbatch command, examine the job-run.sh and ensure that OUTFOLDER
and EXEC_PATH are correct.
• In case your code is very slow, modify the for loop in job-run.sh to run with only one
ntasks, e.g., for ntasks in 32; do.
9. If successful, the script creates a folder with three files in it, each containing the results of
the respective Nodes x Processes Per Node (PPN) configuration.
• The form of the file names is NodesxPPN_JOBID.out. The .out files contain the results,
which you can copy into your report.
• The script will also produce a slurm log (.log) and error files (.err) for each configuration
in the OUTFOLDER.
• Check the error files for potential errors.