Unit V
Case studies - n-Body solvers – Tree Search – OpenMP and MPI implementations and comparison.
1. Case studies
Case Study 1: Parallel Sorting Using MPI
Step 1: Choosing Pivots to Define Buckets
The first step of the algorithm is to select P-1 pivots that define the P buckets. (Bucket i will
contain elements between pivot[i-1] and pivot[i].) To do this, your code should randomly select
S samples from the entire array A, and then choose P-1 pivots from the selection using the
following process
Step 2: Bucketing Elements of the Input Array
The second step is to place every element of A into one of P buckets: element A[i] goes into
bucket j if pivot[j-1] <= A[i] < pivot[j]. (Bucket 0 contains all elements less than pivot[0], and
bucket P-1 contains all elements greater than or equal to pivot[P-2].) The randomized
choice of pivots ensures that, in expectation, the number of elements in each bucket is well
balanced. (This is important because it leads to good workload balance in Step 4!)
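The bucketing rule can be illustrated with a small serial sketch. It assumes the pivots are already sorted in ascending order; the function name bucket_elements and the sample data are hypothetical, chosen only for this illustration.

```python
from bisect import bisect_right

def bucket_elements(a, pivots):
    """Place each element of `a` into one of len(pivots) + 1 buckets.

    ASSUMPTION: `pivots` is sorted in ascending order. Element x belongs to
    bucket j when pivots[j-1] <= x < pivots[j]; bisect_right returns exactly
    that j, because it counts the pivots that are <= x.
    """
    buckets = [[] for _ in range(len(pivots) + 1)]
    for x in a:
        buckets[bisect_right(pivots, x)].append(x)
    return buckets

# P = 3 buckets, so P - 1 = 2 pivots.
example = bucket_elements([7, 1, 4, 9, 3, 4], pivots=[4, 8])
print(example)  # [[1, 3], [7, 4, 4], [9]]
```

A binary search over the pivots costs O(log P) per element, which matters once P grows beyond a handful of processes.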
Step 3: Redistributing Elements
• Now that the bucket containing each array element is known, redistribute the data elements
such that each process i holds all the elements in bucket i.
Step 4: Final Local Sort
• Finally, each process uses a fast sequential sorting algorithm to sort its own bucket. Because
every element in bucket i is less than or equal to every element in bucket i+1, the distributed array is now sorted!
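The four steps can be put together in a serial simulation, where a list of lists stands in for the data held by each process. This is only a sketch under stated assumptions. The pivot-selection rule (sort the samples and take evenly spaced ones) is one common choice and is an assumption here, not the process the notes refer to. In a real MPI program Step 3 would be an all-to-all exchange rather than a list comprehension. The names choose_pivots and sample_sort are hypothetical.

```python
import random
from bisect import bisect_right

def choose_pivots(a, num_buckets, num_samples, rng):
    # ASSUMPTION: sort the random samples and pick evenly spaced ones.
    samples = sorted(rng.choice(a) for _ in range(num_samples))
    step = num_samples / num_buckets
    return [samples[int(i * step)] for i in range(1, num_buckets)]

def sample_sort(local_chunks, num_samples=16, seed=0):
    """local_chunks[r] is the unsorted data initially held by process r."""
    p = len(local_chunks)
    rng = random.Random(seed)

    # Step 1: choose P - 1 pivots from samples of the whole array.
    everything = [x for chunk in local_chunks for x in chunk]
    pivots = choose_pivots(everything, p, num_samples, rng)

    # Step 2: every process buckets its own chunk.
    outgoing = []
    for chunk in local_chunks:
        buckets = [[] for _ in range(p)]
        for x in chunk:
            buckets[bisect_right(pivots, x)].append(x)
        outgoing.append(buckets)

    # Step 3: redistribution; process i receives bucket i from every process.
    received = [[x for src in range(p) for x in outgoing[src][i]]
                for i in range(p)]

    # Step 4: each process sorts the bucket it now holds.
    return [sorted(bucket) for bucket in received]

chunks = [[9, 2, 7], [4, 4, 1], [8, 3, 6]]
result = sample_sort(chunks)
print(result)
```

Concatenating the per-process results in rank order gives the fully sorted array, whatever pivots were drawn; the pivots only affect how evenly the work is spread.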
2. n-Body solvers
The n-body problem
• Find the positions and velocities of a collection of interacting particles over a period of time.
• An n-body solver is a program that finds the solution to an n-body problem by simulating the
behavior of the particles.
[Figure: the n-body solver takes each particle's mass, its position at time 0, and its velocity at time 0 as input, and produces the position and velocity at time x.]
Simulating motion of planets
• Determine the positions and velocities:
– Newton’s second law of motion.
– Newton’s law of universal gravitation.
Serial pseudo-code
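As a stand-in illustration (not the original pseudo-code), here is a minimal serial sketch of one timestep in two dimensions. It assumes the forces come from Newton's law of universal gravitation and that positions and velocities are advanced with a simple Euler step. The names compute_forces and step, and the choice of Euler's method, are assumptions made for this sketch.

```python
import math

G = 6.673e-11  # gravitational constant, m^3 / (kg * s^2)

def compute_forces(masses, pos):
    """Total gravitational force on each particle (basic O(n^2) version)."""
    n = len(masses)
    forces = [[0.0, 0.0] for _ in range(n)]
    for q in range(n):
        for k in range(n):
            if k == q:
                continue
            dx = pos[q][0] - pos[k][0]
            dy = pos[q][1] - pos[k][1]
            dist = math.hypot(dx, dy)
            # Magnitude G*m_q*m_k/dist^2 along the unit vector (dx, dy)/dist.
            scale = G * masses[q] * masses[k] / dist ** 3
            forces[q][0] -= scale * dx  # attraction pulls q toward k
            forces[q][1] -= scale * dy
    return forces

def step(masses, pos, vel, dt):
    """Advance all particles by one timestep (ASSUMPTION: Euler's method)."""
    forces = compute_forces(masses, pos)
    for q in range(len(masses)):
        for d in range(2):
            pos[q][d] += dt * vel[q][d]
            # Newton's second law: acceleration = force / mass.
            vel[q][d] += dt * forces[q][d] / masses[q]

masses = [1.0, 1.0]
pos = [[0.0, 0.0], [1.0, 0.0]]
vel = [[0.0, 0.0], [0.0, 0.0]]
f = compute_forces(masses, pos)
step(masses, pos, vel, dt=1.0)
```

The two forces in the example are equal and opposite, as Newton's third law requires. A reduced solver would exploit that symmetry to halve the number of force computations.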
Parallelizing the Basic Solver Using Pthreads
• Another difference between the Pthreads and the OpenMP versions has to do with barriers.
• OpenMP places an implied barrier at the end of a parallel for directive.
• In the Pthreads version, we must add explicit barriers after the inner loops wherever a race condition could arise.
• The Pthreads standard includes a barrier.
• On systems where that barrier is not available, we must define our own function that implements a barrier
using a Pthreads condition variable.
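The idea of building a barrier from a condition variable can be sketched as follows. The sketch uses Python's threading.Condition in place of a Pthreads mutex and condition variable, so it illustrates the technique rather than the Pthreads code itself. The class name ConditionBarrier and the demo function are hypothetical. (Python also ships a ready-made threading.Barrier; the hand-written version is shown only to expose the mechanism.)

```python
import threading

class ConditionBarrier:
    """Reusable barrier built from a lock plus a condition variable."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.count = 0        # threads that have arrived in this round
        self.generation = 0   # incremented each time the barrier opens
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            my_generation = self.generation
            self.count += 1
            if self.count == self.num_threads:
                # Last thread to arrive: reset and wake everyone up.
                self.count = 0
                self.generation += 1
                self.cond.notify_all()
            else:
                # Loop guards against spurious wakeups.
                while self.generation == my_generation:
                    self.cond.wait()

def run_demo(num_threads=4):
    barrier = ConditionBarrier(num_threads)
    partial = [0] * num_threads
    totals = [0] * num_threads

    def worker(rank):
        partial[rank] = rank + 1     # phase 1: write own entry
        barrier.wait()               # no thread reads until all have written
        totals[rank] = sum(partial)  # phase 2: read everyone's entries

    threads = [threading.Thread(target=worker, args=(r,))
               for r in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals

print(run_demo())  # [10, 10, 10, 10]
```

Without the barrier, a fast thread could sum partial before slower threads had written their entries, which is the kind of race the explicit barriers in the solver prevent.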
Parallelizing the Basic Solver Using MPI
• Choices with respect to the data structures:
– Each process stores the entire global array of particle masses.
– Each process only uses a single n-element array for the positions.
– Each process uses a pointer loc_pos that refers to the start of its block of pos.
– So on process 0 loc_pos = pos, on process 1 loc_pos = pos + loc_n, and so on.
– Pseudo-code for the MPI version of the basic n-body solver
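The data layout described above can be illustrated with a serial simulation (it is not the original pseudo-code). Each simulated process keeps its own full copy of pos, updates only its own block through a view that plays the role of loc_pos, and then all blocks are copied everywhere, mimicking what MPI_Allgather would do. To keep the focus on the layout, the force computation is replaced by a constant velocity. The function names, the one-dimensional positions, and the constant velocity are all assumptions of this sketch.

```python
from array import array

def simulate_allgather(all_pos, loc_n):
    """Mimic MPI_Allgather: copy every rank's block into every rank's pos."""
    comm_sz = len(all_pos)
    blocks = [all_pos[r][r * loc_n:(r + 1) * loc_n] for r in range(comm_sz)]
    for pos in all_pos:
        for r, block in enumerate(blocks):
            pos[r * loc_n:(r + 1) * loc_n] = block

def run_timestep(n=8, comm_sz=4, dt=0.5, vel=2.0):
    loc_n = n // comm_sz  # ASSUMPTION: comm_sz evenly divides n
    initial = [float(i) for i in range(n)]
    # One n-element position array per simulated process.
    all_pos = [array('d', initial) for _ in range(comm_sz)]

    for rank, pos in enumerate(all_pos):
        # View into this rank's block, analogous to loc_pos = pos + rank*loc_n.
        loc_pos = memoryview(pos)[rank * loc_n:(rank + 1) * loc_n]
        for i in range(loc_n):
            loc_pos[i] += dt * vel  # update local particles only
        loc_pos.release()

    # After the local updates, every process needs all positions again.
    simulate_allgather(all_pos, loc_n)
    return [list(pos) for pos in all_pos]

result = run_timestep()
print(result[0])  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

Storing the whole pos array on every process costs memory, but it means the gathered positions land directly where the next force computation expects them.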
3. Tree Search
A graph (not to be confused with a graph in calculus) is a collection of vertices and edges, or line
segments, joining pairs of vertices.
In a directed graph, or digraph, the edges are oriented: one end of each edge is the tail, and
the other is the head.
A graph or digraph is labeled if the vertices and/or edges have labels.
Tree search example: the travelling salesperson problem (TSP)
• TSP asks for a minimum-cost tour: a route that starts at a home city, visits every other city exactly once, and returns home.
• TSP is an NP-complete problem.
• No known algorithm for TSP is better in all cases than exhaustive search.
• A Four-City TSP
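A depth-first tree search for a four-city instance can be sketched as below. The digraph is stored as a cost matrix, where COST[i][j] is the label on the edge from city i to city j. The cost values are illustrative, chosen for this sketch rather than taken from the notes, and the function name best_tour is hypothetical. Each tree node is a partial tour; a branch is abandoned as soon as its cost can no longer beat the best complete tour found so far.

```python
import math

# ILLUSTRATIVE edge costs for a four-city digraph (not from the notes).
COST = [
    [0, 1, 3, 8],
    [5, 0, 2, 6],
    [1, 18, 0, 10],
    [7, 4, 12, 0],
]

def best_tour(cost, home=0):
    """Exhaustive depth-first search over partial tours, with pruning."""
    n = len(cost)
    best = {"cost": math.inf, "tour": None}

    def dfs(tour, tour_cost):
        # Prune: costs are non-negative, so this branch cannot improve.
        if tour_cost >= best["cost"]:
            return
        city = tour[-1]
        if len(tour) == n:
            # All cities visited: close the tour by returning home.
            total = tour_cost + cost[city][home]
            if total < best["cost"]:
                best["cost"], best["tour"] = total, tour + [home]
            return
        for nxt in range(n):
            if nxt not in tour:
                dfs(tour + [nxt], tour_cost + cost[city][nxt])

    dfs([home], 0)
    return best["tour"], best["cost"]

tour, total = best_tour(COST)
print(tour, total)  # [0, 3, 1, 2, 0] 15
```

With n cities the tree has (n-1)! leaves, which is why the search is practical only for small instances and why dividing the subtrees among threads or processes is attractive.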