KINGDOM OF SAUDI ARABIA
Ministry of Higher Education
Taibah University
College of Computer Science & Engineering
CS424 Introduction to Parallel Computing Semester II 2018-2019
LAB 8: Introduction to OpenMP
Objective
To learn how to parallelise a program using OpenMP directives such as parallel
and parallel for.
To learn how to ensure data consistency in OpenMP programs using the critical
directive.
To learn collective operation directives such as reduction.
Lab Activities
1. Using the sequential “hello, world” program from Lab 5, make the necessary
changes so that the program runs in parallel using the OpenMP parallel directive.
A sample with 4 threads is given in the code listings below.
2. Compile and run the “OpenMP-hello, world” program with a varying number of
threads (e.g. 1, 2, 4 and 8 threads). Observe the output. Is the ordering of the
output as expected, i.e. does the first thread print its output first, followed by
the second, and so on?
Dr. Mutaz & Dr. Fazilah Page 1
3. Modify your sequential vector summation program into a parallel program.
A sample is given in the code listings below.
4. Run your program using 1000 numbers (with 10, and 100 threads). Observe the
output. Do you always get the expected results?
A sample is given in the code listings below.
5. Use omp critical to solve the data inconsistency problem in (4).
A sample is given in the code listings below.
6. Modify the program in (5) by using reduction.
7. Modify the program (6) using parallel for.
Exercises
1. Compare and contrast the OpenMP, Java and Pthreads “hello, world”
programs. For Pthreads, refer to page 154 of Pacheco.
2. Include error-checking code in your “hello, world” program (in Lab Activity A)
which avoids errors if your compiler does not support OpenMP. Compile and run the
program.
3. Write a program in which the master thread prints the following environment
information:
The number of processors available
The number of threads being used
The maximum number of threads available
Whether you are in a parallel region
4. Write a program which only performs the summation of the numbers in arrays a
and b.
#include <omp.h>
#include <stdio.h>

void printHello(void);

int main(void) {
#pragma omp parallel
    printHello();
    return 0;
}  // main

void printHello(void) {
    int my_rank = omp_get_thread_num();
    int thread_count = omp_get_num_threads();
    printf("Hello World! from %d of %d\n", my_rank, thread_count);
}  // printHello
#include <omp.h>
#include <stdio.h>

void printHello(void);

int main(void) {
#pragma omp parallel num_threads(3)  // use this clause when a specific number of threads is required
    printHello();
    return 0;
}  // main

void printHello(void) {
    int my_rank = omp_get_thread_num();
    int thread_count = omp_get_num_threads();
    printf("Hello World! from %d of %d\n", my_rank, thread_count);
}  // printHello
#include <omp.h>
#include <stdio.h>

void Vector_sum(int x[], int y[], int z[], int n);

int main()
{
    const int n = 8;
    int x[8] = { 0, 0, 1, 1, 2, 2, 3, 3 };
    int y[8] = { 0, 0, 1, 1, 2, 2, 3, 3 };
    int z[8];
#pragma omp parallel num_threads(4)
    Vector_sum(x, y, z, n);
    for (int i = 0; i < n; i++)
        printf("%d ", z[i]);
    return 0;
}

void Vector_sum(int x[], int y[], int z[], int n) {
    int i;
    int my_rank = omp_get_thread_num();
    int thread_count = omp_get_num_threads();
    // each thread works on its own contiguous block of the arrays
    int local_n = n / thread_count;
    int first = my_rank * local_n;
    int last = first + local_n;
    for (i = first; i < last; i++)
        z[i] = x[i] + y[i];
}
#include <omp.h>
#include <stdio.h>

int Vector_sum(int x[], int n);

int main()
{
    const int n = 1000;
    int x[1000];
    for (int i = 0; i < n; i++) {
        x[i] = 1;  // an array of 1000 elements, all set to 1
    }
    int sum = 0;
#pragma omp parallel num_threads(4)
    {
        int local_sum = Vector_sum(x, n);
#pragma omp critical
        sum += local_sum;
    }  // pragma
    printf("Sum is %d ", sum);
    return 0;
}

int Vector_sum(int x[], int n) {
    int i;
    int my_rank = omp_get_thread_num();
    int thread_count = omp_get_num_threads();
    // these three lines are the standard block partitioning
    int local_n = n / thread_count;
    int first = my_rank * local_n;
    int last = first + local_n;
    // the function's job: sum the elements of its block
    int local_sum = 0;
    for (i = first; i < last; i++)
        local_sum += x[i];
    return local_sum;
}
#include <omp.h>
#include <stdio.h>

int Vector_sum(int x[], int n);

int main()
{
    const int n = 1000;
    int x[1000];
    for (int i = 0; i < n; i++)
        x[i] = 1;  // an array of 1000 elements, all set to 1
    int sum = 0;
    // with reduction, sum has a shared copy plus a private copy for each thread
#pragma omp parallel num_threads(4) reduction(+:sum)
    {
        sum += Vector_sum(x, n);
    }  // pragma
    printf("Sum is %d ", sum);
    return 0;
}

int Vector_sum(int x[], int n) {
    int i;
    int my_rank = omp_get_thread_num();
    int thread_count = omp_get_num_threads();
    // these three lines are the standard block partitioning
    int local_n = n / thread_count;
    int first = my_rank * local_n;
    int last = first + local_n;
    // the function's job: sum the elements of its block
    int local_sum = 0;
    for (i = first; i < last; i++)
        local_sum += x[i];
    return local_sum;
}
#include <omp.h>
#include <stdio.h>

int main()
{
    const int n = 1000;
    int x[1000];
    for (int i = 0; i < n; i++)
        x[i] = 1;  // an array of 1000 elements, all set to 1
    int sum = 0;
    // parallel for can be used when the loop iterations are independent
#pragma omp parallel for num_threads(4) reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i];
    printf("Sum is %d ", sum);
    return 0;
}
LAB 9: Advanced OpenMP Programming
Objectives
To learn how to use the schedule clause and the different types of scheduling.
To analyse the performance of OpenMP programs.
Lab Activities
1. Use your parallel summation program to examine the effects of a cyclic schedule
with different chunk sizes. What can you say about the default and cyclic
schedules, and the effect of chunk size (see section 5.7)?
2. Modify your sequential Pi program into an OpenMP-Pi program (see section 5.5.4).
3. Obtain the speedup and efficiency of your Pi program using 2, 4 and 8 threads.
Plot graphs of the speedup and efficiency.
4. What can you say about the performance of OpenMP-Pi compared to MPI-Pi?
Exercises
1. Write, compile and run the following programs using OpenMP and analyse their
performance (using 2, 4, 8 threads).
(a) The trapezoid rule program (section 5.4)
(b) Parallel matrix multiplication (section 5.9).
(Screenshots: enabling OpenMP support in the compiler settings, and debugging an OpenMP program.)