Case Studies - N-Body Solvers - Tree Search - OpenMP and MPI Implementations and Comparison

This document summarizes parallel programming techniques for n-body solvers and tree search algorithms. It describes OpenMP and MPI implementations of n-body solvers that distribute particle data across processes. It also discusses mapping tree search problems to parallel processes using work stealing and distributing partial tours. Dynamic load balancing is important for efficient parallel tree searches.

Uploaded by

Monika


UNIT – 5 : PARALLEL PROGRAM DEVELOPMENT

Case studies – n-Body solvers – Tree Search – OpenMP and MPI implementations and comparison.

THE N-BODY SOLVERS

 The n-body problem is one of the most famous problems in mathematical physics and
molecular dynamics.
 The goal is to find the positions and velocities of a collection of interacting particles over a
period of time.
 An n-body solver is a program that finds the solution to an n-body problem by simulating
the behavior of the particles.
 For example,
 An astrophysicist might want to know the positions and velocities of a collection
of stars.
 A chemist might want to know the positions and velocities of a collection of molecules or
atoms.

For n = 2, the problem has been solved completely.

For n = 3, solutions are known only in special cases.
In general, numerical methods must be used to simulate such systems.

Problem Formulation
 The task is to determine the positions and velocities of the particles.
 The computation is based on Newton’s second law of motion and Newton’s law of universal
gravitation.
Newton’s second law of motion

The acceleration of an object is dependent upon two variables –
1. the net force acting upon the object
2. the mass of the object
The acceleration of an object depends directly upon the net force acting upon the object,
and inversely upon the mass of the object.
Newton’s law of universal gravitation
A particle attracts every other particle in the universe using a force that is directly
proportional to the product of their masses and inversely proportional to the square of the
distance between them.
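Suppose we have two particles, q and k, with
Masses - $m_q$ and $m_k$
Positions - $\mathbf{s}_q(t)$ and $\mathbf{s}_k(t)$ at time t
The force on particle q exerted by particle k is given by

$$\mathbf{f}_{qk}(t) = -\frac{G\, m_q m_k}{\left|\mathbf{s}_q(t) - \mathbf{s}_k(t)\right|^{3}} \left[\mathbf{s}_q(t) - \mathbf{s}_k(t)\right]$$

where G is the gravitational constant. The magnitude is proportional to the product of the masses and inversely proportional to the square of the distance; the remaining factor is the unit vector pointing from particle k to particle q, and the minus sign makes the force attractive.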

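The total force on any particle is the sum of the forces exerted on it by all the other particles.
If our n particles are numbered 0, 1, 2, … , n − 1, and $\mathbf{f}_{qk}(t)$ is the force on particle q exerted by particle k, then the total force on particle q is given by

$$\mathbf{F}_q(t) = \sum_{\substack{k=0 \\ k \ne q}}^{n-1} \mathbf{f}_{qk}(t) = -G\, m_q \sum_{\substack{k=0 \\ k \ne q}}^{n-1} \frac{m_k}{\left|\mathbf{s}_q(t) - \mathbf{s}_k(t)\right|^{3}} \left[\mathbf{s}_q(t) - \mathbf{s}_k(t)\right]$$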

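The acceleration of particle q is the second derivative of its position. By Newton’s second law it equals the total force on q, $\mathbf{F}_q(t)$, divided by its mass $m_q$, so it is given by the formula

$$\mathbf{s}_q''(t) = \frac{\mathbf{F}_q(t)}{m_q} = -G \sum_{\substack{k=0 \\ k \ne q}}^{n-1} \frac{m_k}{\left|\mathbf{s}_q(t) - \mathbf{s}_k(t)\right|^{3}} \left[\mathbf{s}_q(t) - \mathbf{s}_k(t)\right]$$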

Thus Newton’s laws give us a system of differential equations — equations involving
derivatives.
Our job is to find the position and the velocity of each particle at each time t of interest.

Basic Algorithm for Computing N-Body Forces

Computation of the forces

We’re assuming that the forces and the positions of the particles are stored as two-dimensional
arrays, forces and pos, respectively.
The x-component of the force on particle q is forces[q][X] and the y-component is forces[q][Y].
Similarly, the components of the position are pos[q][X] and pos[q][Y].

A Reduced Algorithm for Computing N-Body Forces

The individual forces

Euler’s Method
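
Euler's method advances the solution by following the tangent line over a small timestep Δt: the new position is approximately the old position plus Δt times the velocity, and the new velocity is approximately the old velocity plus Δt times the acceleration (force divided by mass). The following is a minimal sketch of one such step, assuming the forces for the current timestep have already been computed; the names `euler_step`, `vel`, and `delta_t` are illustrative.

```c
#include <assert.h>

#define DIM 2
#define X 0
#define Y 1

typedef double vect_t[DIM];

/* One Euler timestep for all n particles.
 * Positions are advanced with the old velocities,
 * velocities with the acceleration forces[q] / masses[q]. */
void euler_step(int n, double delta_t, const double masses[],
                vect_t forces[], vect_t pos[], vect_t vel[]) {
    assert(delta_t > 0.0);
    for (int q = 0; q < n; q++) {
        pos[q][X] += delta_t * vel[q][X];
        pos[q][Y] += delta_t * vel[q][Y];
        vel[q][X] += delta_t / masses[q] * forces[q][X];
        vel[q][Y] += delta_t / masses[q] * forces[q][Y];
    }
}
```

For example, a particle of mass 2 at the origin with velocity (1, 0) and force (4, 0) moves to (0.5, 0) and reaches velocity (2, 0) after a step of Δt = 0.5.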

PARALLELIZING THE N-BODY SOLVERS USING OPENMP
 Apply Foster’s methodology.
 Initially, we want a lot of tasks.
 Start by making our tasks the computations of the positions, the velocities, and the total
forces at each timestep.
Communications Among Tasks in the Basic N-Body Solver

Communications Among Agglomerated Tasks in the Basic N-Body Solver

Communications Among Agglomerated Tasks in the Reduced N-Body Solver

PARALLELIZING THE BASIC SOLVER USING OPENMP

PARALLELIZING THE REDUCED SOLVER USING OPENMP

PARALLELIZING THE BASIC SOLVER USING MPI

 Each process stores the entire global array of particle masses.
 Each process uses only a single n-element array, pos, for the positions.
 Each process uses a pointer loc_pos that refers to the start of its block of pos.
 So on process 0 loc_pos = pos; on process 1 loc_pos = pos + loc_n; and so on.
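
The pointer arithmetic in the last two points can be sketched in plain C. This is only an illustration: the helper name `local_block` is hypothetical, `my_rank` is assumed to be the value a real program would obtain from `MPI_Comm_rank`, and `loc_n` is assumed to be n divided by the number of processes with no remainder.

```c
#include <assert.h>

#define DIM 2
#define X 0
#define Y 1

typedef double vect_t[DIM];

/* Start of the block of pos owned by process my_rank (block partition). */
vect_t *local_block(vect_t pos[], int my_rank, int loc_n) {
    assert(my_rank >= 0 && loc_n > 0);
    return pos + my_rank * loc_n;
}
```

With `loc_pos = local_block(pos, my_rank, loc_n)`, the element `loc_pos[i]` is the same storage as `pos[my_rank * loc_n + i]`, so a process can update its own particles through `loc_pos` while still reading every position through `pos`.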

Communication In A Possible MPI Implementation of the N-Body Solver

PARALLELIZING THE REDUCED SOLVER USING MPI

Run-Times for OpenMP and MPI Versions of N-Body Solvers

TREE SEARCH
Many problems can be solved using a tree search. As a simple example, consider the traveling
salesperson problem, or TSP. In TSP, a salesperson is given a list of cities to visit and a cost
for traveling between each pair of cities. The problem is to visit each city exactly once,
returning to the starting city, with the least possible cost. Thus, the TSP is to find a
minimum-cost tour.

TSP is what’s known as an NP-hard problem (its decision version is NP-complete). This means that
there is no algorithm known for solving it that, in all cases, is significantly better than
exhaustive search. Exhaustive search means examining all possible solutions to the problem and
choosing the best. The number of possible solutions to TSP grows factorially, that is, faster
than any exponential, as the number of cities is increased.

For example, an n-city problem with a fixed starting city has (n − 1)! possible tours, so if we
add one additional city to an n-city problem, we’ll increase the number of possible solutions by
a factor of n. Thus, although there are only six possible solutions to a four-city problem, there
are 4*6 = 24 to a five-city problem, 5*24 = 120 to a six-city problem, 6*120 = 720 to a
seven-city problem, and so on.
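
The counts above follow from fixing the starting city and ordering the remaining n − 1 cities, which gives (n − 1)! tours. A small sketch that reproduces the figures; the function name `count_tours` is illustrative, and the upper bound on the argument is there only to keep the result within a 64-bit integer.

```c
#include <assert.h>

/* Number of tours of an n-city TSP with a fixed starting city: (n - 1)!. */
unsigned long long count_tours(int n_cities) {
    assert(n_cities >= 1 && n_cities <= 21);   /* 20! still fits in 64 bits */
    unsigned long long tours = 1;
    for (int i = 2; i <= n_cities - 1; i++)
        tours *= (unsigned long long) i;
    return tours;
}
```

Here `count_tours(4)` is 6, `count_tours(5)` is 24, and `count_tours(7)` is 720, matching the text.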

Example: Consider a four-city TSP

Solution :
 Start at the origin, here city 0
 Do the depth-first search
 Maintain the current best tour, that is, the complete tour with minimum cost found so far
 If a node is reached with cost larger than current minimum cost, do not go deeper

Search tree for four-city TSP

In the example, we’ll start at the root and branch left until we reach the leaf labeled

Then we back up to the tree node labeled 0→ 1, since it is the deepest ancestor node
with unvisited children, and we’ll branch down to get to the leaf labeled

Continuing, we’ll back up to the root and branch down to the node labeled 0→2.
When we visit its child, labeled

we’ll go no further in this subtree, since we’ve already found a complete tour with cost less than
21. We’ll back up to 0→2 and branch down to its remaining unvisited child. Continuing in this
fashion, we eventually find the least-cost tour.

Algorithm
 Cities are numbered 0, 1, . . . , n − 1
 A tour stores the number of cities it contains, the cities themselves, and its cost

 The number of cities in a tour is citycount(tour)
 Initially, the tour contains only the first city, 0, and has cost 0
 Besttour(tour) checks whether tour is the best tour so far
 Updatebesttour(tour) updates the best tour
 Feasible(tour, city) checks whether city has already been visited and, if not, whether it can be
added to tour so that the cost up to city is less than cost(best tour)
 Add(tour, city) adds city to tour; city must be feasible
 Removelast(tour, city) removes the last city from tour
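
The operations listed above can be combined into a recursive depth-first search. The following is a sketch rather than the notes' own listing: the edge costs in `cost_matrix` are hypothetical values chosen for a four-city example, and the names `tour_t`, `add_city`, `remove_last_city`, and `solve_tsp` are illustrative.

```c
#include <assert.h>
#include <limits.h>

#define N_CITIES 4

typedef struct {
    int cities[N_CITIES + 1];   /* room for the return to city 0 */
    int count;                  /* number of cities in the tour */
    int cost;                   /* cost of the tour so far */
} tour_t;

/* Hypothetical edge costs: cost_matrix[i][j] is the cost of travelling i -> j. */
const int cost_matrix[N_CITIES][N_CITIES] = {
    {0,  1,  3,  8},
    {5,  0,  2,  6},
    {1, 18,  0, 10},
    {7,  4, 12,  0}
};

tour_t best_tour;

/* City is feasible if unvisited and adding it keeps the cost below the best so far. */
int feasible(const tour_t *tour, int city) {
    for (int i = 0; i < tour->count; i++)
        if (tour->cities[i] == city) return 0;
    int last = tour->cities[tour->count - 1];
    return tour->cost + cost_matrix[last][city] < best_tour.cost;
}

void add_city(tour_t *tour, int city) {
    int last = tour->cities[tour->count - 1];
    tour->cost += cost_matrix[last][city];
    tour->cities[tour->count++] = city;
}

void remove_last_city(tour_t *tour) {
    int removed = tour->cities[--tour->count];
    int last = tour->cities[tour->count - 1];
    tour->cost -= cost_matrix[last][removed];
}

void depth_first_search(tour_t *tour) {
    if (tour->count == N_CITIES) {
        /* All cities visited: close the tour by returning to city 0. */
        int last = tour->cities[tour->count - 1];
        if (tour->cost + cost_matrix[last][0] < best_tour.cost) {
            best_tour = *tour;
            add_city(&best_tour, 0);
        }
        return;
    }
    for (int city = 1; city < N_CITIES; city++) {
        if (feasible(tour, city)) {
            add_city(tour, city);
            depth_first_search(tour);
            remove_last_city(tour);    /* backtrack */
        }
    }
}

/* Runs the search from city 0 and returns the cost of the best tour. */
int solve_tsp(void) {
    tour_t tour = { {0}, 1, 0 };
    best_tour.count = 0;
    best_tour.cost = INT_MAX;
    depth_first_search(&tour);
    return best_tour.cost;
}
```

With these assumed costs the search first finds the tour 0→1→2→3→0 with cost 20, prunes the partial tour 0→2→1 because its cost of 21 is already too large, and finishes with the best tour 0→3→1→2→0 of cost 15.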

PARALLELIZING TREE SEARCH USING OPENMP

Mapping
 Assume p processes
 One process could run until there are p tours in the stack
 Assign them to processes
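
The mapping step can be sketched as a breadth-first expansion of the search tree that stops as soon as at least p partial tours are available. This is one possible reading of the bullets above, not the notes' own code; the names `partial_tour_t` and `generate_partial_tours`, the fixed four-city problem size, and the queue capacity are assumptions made for the illustration.

```c
#include <assert.h>

#define N_CITIES 4
#define MAX_TOURS 64    /* ample here: a four-city tree has only 16 partial tours */

typedef struct {
    int cities[N_CITIES];
    int count;
} partial_tour_t;

/* Returns 1 if the tour already contains the city. */
int visits(const partial_tour_t *tour, int city) {
    for (int i = 0; i < tour->count; i++)
        if (tour->cities[i] == city) return 1;
    return 0;
}

/* Expand the tree breadth-first from the tour {0} until the queue holds
 * at least p partial tours or no tour can be extended further.
 * Copies the queued tours to out and returns how many there are. */
int generate_partial_tours(int p, partial_tour_t out[], int max_out) {
    partial_tour_t queue[MAX_TOURS];
    partial_tour_t root = { {0}, 1 };
    int head = 0, tail = 0;
    queue[tail++] = root;
    while (tail - head < p && queue[head].count < N_CITIES) {
        partial_tour_t parent = queue[head++];
        for (int city = 1; city < N_CITIES; city++) {
            if (visits(&parent, city)) continue;
            partial_tour_t child = parent;
            child.cities[child.count++] = city;
            assert(tail < MAX_TOURS);
            queue[tail++] = child;
        }
    }
    int n_tours = tail - head;
    assert(n_tours <= max_out);
    for (int i = 0; i < n_tours; i++)
        out[i] = queue[head + i];
    return n_tours;
}
```

For p = 3 this yields the three tours 0→1, 0→2, and 0→3. The resulting tours can then be handed out, for example tour i to process i mod p, and each process pushes its share onto its own stack.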

Best Tour
 Processes work independently until each finds its local best tour
 Do a global reduction on process 0 to find the best tour
 This is simple, but a process may search through partial tours that cannot lead to the global
best tour

Dynamic Mapping
 When a process runs out of work, it gets more work from a process that still has some
 Each stack entry is a partial tour

 A process can get a partial tour and work on it
 The order in which nodes are visited does not matter

When a block of code must be executed by only one thread, we use the OpenMP directive
# pragma omp single

This will ensure that the following structured block of code is executed by exactly one thread in
the team, and that the other threads in the team wait in an implicit barrier at the end of the
block until the executing thread is finished.

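When the code must be executed by thread 0, a test on the thread rank can also be replaced by the
OpenMP directive

# pragma omp master

Unlike single, the master directive has no implied barrier at the end of its block.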
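
A small sketch of the `single` directive in use; the function name `count_single_executions` is hypothetical. However many threads are in the team, the block after `single` runs once, and the implicit barrier at its end guarantees that every thread sees the update afterwards.

```c
#include <assert.h>

/* Returns how many times the single block was executed inside one parallel region. */
int count_single_executions(void) {
    int executions = 0;     /* shared by all threads in the team */
#   pragma omp parallel
    {
#       pragma omp single
        executions++;       /* executed by exactly one thread */
        /* implicit barrier: the other threads wait here until it is done */
        assert(executions == 1);
    }
    return executions;
}
```

The function returns 1 regardless of the number of threads. Compiled without OpenMP support, the directives are ignored and the single remaining thread runs the block once.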

PARALLELIZING TREE SEARCH USING MPI

 Process 0 generates and sends partial tours to p processes.
 When a process finds a best tour, it sends its cost to all other processes.

 The destination can check periodically using

MPI_Recv(&received_cost, 1, MPI_INT, MPI_ANY_SOURCE, NEW_COST_TAG, comm, &status);
 But the receiving process will block until a message arrives.

 We can use MPI_Iprobe to check, without blocking, whether a message from src with tag tag in
communicator comm is available
 If such a message is available, *msg is 1 and status->MPI_SOURCE contains the source;
otherwise *msg is 0

 To check if there is a message from any source
MPI_Iprobe(MPI_ANY_SOURCE, NEW_COST_TAG, comm, &msg, &status);

 If msg = 1, we can receive with

MPI_Recv(&received_cost, 1, MPI_INT, status.MPI_SOURCE, NEW_COST_TAG, comm,
MPI_STATUS_IGNORE);

When the call to MPI_Allreduce returns, we have two alternatives:

(1) If process 0 already has the best tour, we simply return.
(2) Otherwise, the process owning the best tour sends it to process 0.
