Parallel Programming Basics
Jimmy Hu
Target Audience
 People interested in the topic of parallel programming
 People who want to know how to improve the performance of their code
 People who want to know how to reach the (possible) peak performance of their computer
(There are many techniques / methods available for reaching peak performance, and that
topic is out of the scope of our discussion)
 Anyone who wants to know the way that I am using my computers / servers (X
2
Outline
 Why parallel programming?
 What is parallel programming?
 How to perform parallel programming (in C++ / Matlab / C#)
 Conclusion / Further Discussions
3
Why Parallel Programming?
4
 Please check the following C++ code. What is its output?
#include <iostream>
#include <ranges>
#include <vector>

int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
Why Parallel Programming?
5
 The answer to the question “Please check the following C++ code. What is its output?”
// https://godbolt.org/z/fb7TdT495
// https://godbolt.org/z/3Kj1azb4h
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
4, 5, 6,
Why Parallel Programming?
6
 Code Structure
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
Part 1: Variable Initialization
Why Parallel Programming?
7
 Code Structure
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
Part 2: Data Processing / Calculation
Why Parallel Programming?
8
 Code Structure
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
Part 3: Output
Why Parallel Programming?
9
 In this simple example, the calculation part is a simple add operation
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
4, 5, 6,
Why Parallel Programming?
10
 What happens in the case of a more complicated operation?
int main()
{
std::vector<Frame> image_frames =
{img1, img2, img3};
auto results = std::vector<Features>(3);
for(int i = 0; i < std::ranges::size(image_frames); ++i)
{
results[i] = feature_extraction(image_frames[i]);
}
…
return 0;
}
This example calls a function
named “feature_extraction”.
Why Parallel Programming?
11
Without Parallel Programming With Parallel Programming
Dish A
Dish B
Dish C
…
Dish A Dish B Dish C
Icon is from https://www.hiclipart.com/free-transparent-background-png-clipart-iuxpq/download
Why Parallel Programming?
12
Without Parallel Programming With Parallel Programming
Task A
Task B
Task C
…
Task A Task B Task C
Icon is from https://www.flaticon.com/free-icon/cpu_1250593
The Steps of Execution
13
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
 Let’s review the previous simple case. How is the program executed?
The Steps of Execution
14
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
1 2 3
test_vector
The Steps of Execution
15
1 2 3
test_vector
3
a
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
The Steps of Execution
16
1 2 3
test_vector
3
a
 Then the execution runs sequentially:
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
The Steps of Execution
17
1 2 3
test_vector
3
a
 Then the execution runs sequentially:
= 4
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
The Steps of Execution
18
4 2 3
test_vector
3
a
 Then the execution runs sequentially:
= 5
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
The Steps of Execution
19
4 5 3
test_vector
3
a
 Then the execution runs sequentially:
= 6
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
The Steps of Execution
20
4 5 6
test_vector
3
a
 Then the execution runs sequentially:
int main()
{
std::vector<int> test_vector = {1, 2, 3};
int a = 3;
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < std::ranges::size(test_vector); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
The Concept of Parallelization
21
1 2 3
test_vector
3
 Instead of running sequentially like this, is it possible to speed up?
 Why not make the program run in parallel (let the operations run simultaneously)?
a 3
3
New
test_vector 4 5 6
The Concept of Parallelization
22
1 2 3
test_vector
3
 How can this be done in our program? Solution: parallel programming!
a 3
3
New
test_vector 4 5 6
The Concept of Parallelization
23
 Enabling parallelization
 Tools in C++:
- OpenMP
- TBB (Threading Building Blocks)
- std::thread
- Execution Policy in STL
 Tools in Matlab
 Tools in C#
Parallelization Implementation
24
 Enabling parallelization with OpenMP
#include <omp.h>
int main()
{
auto test_vector = std::vector<int>{1, 2, 3};
int a = 3;
#pragma omp parallel for
for(int i = 0; i < test_vector.size(); i++)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < test_vector.size(); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
auto test_vector = std::vector<int>{1, 2, 3};
int a = 3;
test_vector[0] =
test_vector[0] + a;
test_vector[1] =
test_vector[1] + a;
test_vector[2] =
test_vector[2] + a;
for(int i = 0; i < test_vector.size(); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
Parallelization Implementation
25
 Enabling parallelization with OpenMP / TBB
// https://godbolt.org/z/szMc4jbqn
// https://godbolt.org/z/haM1qd6eY
#include <omp.h>
int main()
{
auto test_vector = std::vector<int>{1, 2, 3};
int a = 3;
#pragma omp parallel for
for(int i = 0; i < test_vector.size(); ++i)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < test_vector.size(); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
// https://godbolt.org/z/dcssoWj8K
#include <tbb/parallel_for.h>
int main()
{
auto test_vector = std::vector<int>{1, 2, 3};
int a = 3;
tbb::parallel_for( tbb::blocked_range<int>(0,test_vector.size()),
[&](tbb::blocked_range<int> r)
{
for (int i=r.begin(); i<r.end(); ++i)
{
test_vector[i] = test_vector[i] + a;
}
});
for(int i = 0; i < test_vector.size(); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
Parallelization Implementation
26
 Enabling parallelization with OpenMP / TBB
// https://godbolt.org/z/haM1qd6eY
#include <omp.h>
int main()
{
auto test_vector = std::vector<int>{1, 2, 3};
int a = 3;
#pragma omp parallel for
for(int i = 0; i < test_vector.size(); i++)
{
test_vector[i] = test_vector[i] + a;
}
for(int i = 0; i < test_vector.size(); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
// https://godbolt.org/z/dcssoWj8K
#include <tbb/parallel_for.h>
int main()
{
auto test_vector = std::vector<int>{1, 2, 3};
int a = 3;
tbb::parallel_for( tbb::blocked_range<int>(0,test_vector.size()),
[&](tbb::blocked_range<int> r)
{
for (int i=r.begin(); i<r.end(); ++i)
{
test_vector[i] = test_vector[i] + a;
}
});
for(int i = 0; i < test_vector.size(); ++i)
{
std::cout << test_vector[i] << ", ";
}
return 0;
}
A lambda function is here!
Parallelization Methods Comparison
27
 Comparing OpenMP / TBB and std::thread
// The following code is an example of std::thread
// https://stackoverflow.com/a/11229853/6667035
// https://godbolt.org/z/YeY9d4EeP
#include <string>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>
void task1(std::vector<int> input) // The function we want to execute on the new thread.
{
for(int i = 0; i < input.size(); i++)
{
std::cout << "output from task1 function: " << input[i];
}
}
void function1()
{
auto test_vector1 = std::vector<int>(100);
std::iota(test_vector1.begin(), test_vector1.end(), 1);
int sum = 0;
for(int i = 0; i < test_vector1.size(); i++)
{
sum += test_vector1[i];
}
std::cout << sum << "\n";
}
int main()
{
auto test_vector = std::vector<int>(100);
std::iota(test_vector.begin(), test_vector.end(), 1);
std::thread t1(task1, test_vector);
function1();
t1.join();
return 0;
}
std::thread Concept
28
 Comparing OpenMP / TBB and std::thread
main function
task1 function
function1 function
Parallel Part
end
Execution Policy in STL
29
 When it comes to Execution Policy after C++17…
 std::execution::par
 std::execution::seq
// https://en.cppreference.com/w/cpp/algorithm/transform
// https://godbolt.org/z/bY14q1z3K
#include <algorithm>
#include <execution>
#include <iomanip>
#include <iostream>
#include <string>
#include <thread>
int main()
{
std::string g {"hello"};
std::for_each(std::execution::par, g.begin(), g.end(), [](char& c) // modify in-place
{
c = std::toupper(static_cast<unsigned char>(c));
});
std::cout << "g = " << std::quoted(g) << '\n';
return 0;
}
Parallelization in Matlab
30
 Document of parfor function usage
Parallelization in Matlab
31
 parfor function usage example
Program without parfor Program with parfor
% https://www.mathworks.com/help/parallel-computing/parfor.html
tic
n = 200;
A = 500;
a = zeros(1,n);
for i = 1:n
a(i) = max(abs(eig(rand(A))));
end
toc
Elapsed time is 31.935373 seconds.
tic
n = 200;
A = 500;
a = zeros(1,n);
parfor i = 1:n
a(i) = max(abs(eig(rand(A))));
end
toc
Elapsed time is 10.760068 seconds.
Parallelization in C#
32
 Documentation of Parallel.For usage: https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/data-parallelism-task-parallel-library
Parallelization in C#
33
 Parallel.For function usage example
Program without Parallel.For Program with Parallel.For
using System;
using System.Threading.Tasks;
public class ParallelTest
{
public static void Main(string[] args)
{
for(int i = 0; i < 10; i++)
{
Console.WriteLine(i + "\n");
};
}
}
using System;
using System.Threading.Tasks;
public class ParallelTest
{
public static void Main(string[] args)
{
Parallel.For(0, 10, i =>
{
Console.WriteLine(i + "\n");
}); // Parallel.For
}
}
A lambda function is here!
Parallelization in C#
34
Concept of Parallelable
35
 Think about it: what is the limitation of parallelization?
Concept of Parallelable
36
 Think about it: what is the limitation of parallelization?
Answer: the operations to be parallelized must be independent!
What does “independent” mean?
Concept of Parallelable
37
 Think about it: what is the limitation of parallelization?
Answer: the operations to be parallelized must be independent!
What does “independent” mean?
Let’s look at a dependent case first:
A B C
Concept of Parallelable
38
 Think about it: what is the limitation of parallelization?
Answer: the operations to be parallelized must be independent!
What does “independent” mean?
Let’s look at a dependent case first:
Operations A, B, and C cannot be parallelized,
because operation B needs the output of A and
operation C needs the output of B!
A B C
Conclusion / Further Discussions
39
 Parallelization techniques can bring a performance increase when you use them properly
 Parallelization enables higher utilization of computers / computing devices
 Are there any disadvantages to using parallelization?
Conclusion / Further Discussions
40
 Parallelization techniques can bring a performance increase when you use them properly
 Parallelization enables higher utilization of computers / computing devices
 Are there any disadvantages to using parallelization?
 Memory usage issue

More Related Content

Similar to ParallelProgrammingBasics_v2.pdf (20)

Options and trade offs for parallelism and concurrency in Modern C++
Options and trade offs for parallelism and concurrency in Modern C++Options and trade offs for parallelism and concurrency in Modern C++
Options and trade offs for parallelism and concurrency in Modern C++
Satalia
 
Complier design
Complier design Complier design
Complier design
shreeuva
 
The Style of C++ 11
The Style of C++ 11The Style of C++ 11
The Style of C++ 11
Sasha Goldshtein
 
Evgeniy Muralev, Mark Vince, Working with the compiler, not against it
Evgeniy Muralev, Mark Vince, Working with the compiler, not against itEvgeniy Muralev, Mark Vince, Working with the compiler, not against it
Evgeniy Muralev, Mark Vince, Working with the compiler, not against it
Sergey Platonov
 
Example : parallelize a simple problem
Example : parallelize a simple problemExample : parallelize a simple problem
Example : parallelize a simple problem
MrMaKKaWi
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Language
mspline
 
In-class slides with activities
In-class slides with activitiesIn-class slides with activities
In-class slides with activities
SERC at Carleton College
 
The Inner Secrets of Compilers
The Inner Secrets of CompilersThe Inner Secrets of Compilers
The Inner Secrets of Compilers
IT MegaMeet
 
Introduction to C ++.pptx
Introduction to C ++.pptxIntroduction to C ++.pptx
Introduction to C ++.pptx
VAIBHAVKADAGANCHI
 
Isorerism in relative strength in inline functions in pine script.ppt
Isorerism in relative strength in inline functions in pine script.pptIsorerism in relative strength in inline functions in pine script.ppt
Isorerism in relative strength in inline functions in pine script.ppt
DjangoVijay
 
parellel computing
parellel computingparellel computing
parellel computing
katakdound
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
Andrey Karpov
 
Lecture#5-Arrays-oral patholohu hfFoP.ppt
Lecture#5-Arrays-oral patholohu hfFoP.pptLecture#5-Arrays-oral patholohu hfFoP.ppt
Lecture#5-Arrays-oral patholohu hfFoP.ppt
SamanArshad11
 
Lecture6
Lecture6Lecture6
Lecture6
tt_aljobory
 
Using c++Im also using a the ide editor called CodeLiteThe hea.pdf
Using c++Im also using a the ide editor called CodeLiteThe hea.pdfUsing c++Im also using a the ide editor called CodeLiteThe hea.pdf
Using c++Im also using a the ide editor called CodeLiteThe hea.pdf
fashiongallery1
 
Modern c++
Modern c++Modern c++
Modern c++
Jorge Martinez de Salinas
 
Modern C++
Modern C++Modern C++
Modern C++
Michael Clark
 
Task based Programming with OmpSs and its Application
Task based Programming with OmpSs and its ApplicationTask based Programming with OmpSs and its Application
Task based Programming with OmpSs and its Application
Facultad de Informática UCM
 
Autovectorization in llvm
Autovectorization in llvmAutovectorization in llvm
Autovectorization in llvm
ChangWoo Min
 
Loops
LoopsLoops
Loops
International Islamic University
 
Options and trade offs for parallelism and concurrency in Modern C++
Options and trade offs for parallelism and concurrency in Modern C++Options and trade offs for parallelism and concurrency in Modern C++
Options and trade offs for parallelism and concurrency in Modern C++
Satalia
 
Complier design
Complier design Complier design
Complier design
shreeuva
 
Evgeniy Muralev, Mark Vince, Working with the compiler, not against it
Evgeniy Muralev, Mark Vince, Working with the compiler, not against itEvgeniy Muralev, Mark Vince, Working with the compiler, not against it
Evgeniy Muralev, Mark Vince, Working with the compiler, not against it
Sergey Platonov
 
Example : parallelize a simple problem
Example : parallelize a simple problemExample : parallelize a simple problem
Example : parallelize a simple problem
MrMaKKaWi
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Language
mspline
 
The Inner Secrets of Compilers
The Inner Secrets of CompilersThe Inner Secrets of Compilers
The Inner Secrets of Compilers
IT MegaMeet
 
Isorerism in relative strength in inline functions in pine script.ppt
Isorerism in relative strength in inline functions in pine script.pptIsorerism in relative strength in inline functions in pine script.ppt
Isorerism in relative strength in inline functions in pine script.ppt
DjangoVijay
 
parellel computing
parellel computingparellel computing
parellel computing
katakdound
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
Andrey Karpov
 
Lecture#5-Arrays-oral patholohu hfFoP.ppt
Lecture#5-Arrays-oral patholohu hfFoP.pptLecture#5-Arrays-oral patholohu hfFoP.ppt
Lecture#5-Arrays-oral patholohu hfFoP.ppt
SamanArshad11
 
Using c++Im also using a the ide editor called CodeLiteThe hea.pdf
Using c++Im also using a the ide editor called CodeLiteThe hea.pdfUsing c++Im also using a the ide editor called CodeLiteThe hea.pdf
Using c++Im also using a the ide editor called CodeLiteThe hea.pdf
fashiongallery1
 
Task based Programming with OmpSs and its Application
Task based Programming with OmpSs and its ApplicationTask based Programming with OmpSs and its Application
Task based Programming with OmpSs and its Application
Facultad de Informática UCM
 
Autovectorization in llvm
Autovectorization in llvmAutovectorization in llvm
Autovectorization in llvm
ChangWoo Min
 

More from Chen-Hung Hu (12)

淺談電腦檔案系統概念
淺談電腦檔案系統概念淺談電腦檔案系統概念
淺談電腦檔案系統概念
Chen-Hung Hu
 
【智慧核心-CPU】第三節:負數、小數的修正機制
【智慧核心-CPU】第三節:負數、小數的修正機制【智慧核心-CPU】第三節:負數、小數的修正機制
【智慧核心-CPU】第三節:負數、小數的修正機制
Chen-Hung Hu
 
【智慧核心-CPU】第二節:正整數進位制的轉換-編碼
【智慧核心-CPU】第二節:正整數進位制的轉換-編碼【智慧核心-CPU】第二節:正整數進位制的轉換-編碼
【智慧核心-CPU】第二節:正整數進位制的轉換-編碼
Chen-Hung Hu
 
漫談七段顯示器
漫談七段顯示器漫談七段顯示器
漫談七段顯示器
Chen-Hung Hu
 
BJT Transistor分壓偏壓電路分析
BJT Transistor分壓偏壓電路分析BJT Transistor分壓偏壓電路分析
BJT Transistor分壓偏壓電路分析
Chen-Hung Hu
 
淺談類比-數位轉換器
淺談類比-數位轉換器淺談類比-數位轉換器
淺談類比-數位轉換器
Chen-Hung Hu
 
感光元件及其相關迴路之研究 --以光敏電阻為例
感光元件及其相關迴路之研究 --以光敏電阻為例感光元件及其相關迴路之研究 --以光敏電阻為例
感光元件及其相關迴路之研究 --以光敏電阻為例
Chen-Hung Hu
 
穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例
穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例
穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例
Chen-Hung Hu
 
Adc0804及其相關迴路之研究
Adc0804及其相關迴路之研究Adc0804及其相關迴路之研究
Adc0804及其相關迴路之研究
Chen-Hung Hu
 
可調式電源供應器之研究
可調式電源供應器之研究可調式電源供應器之研究
可調式電源供應器之研究
Chen-Hung Hu
 
HC 05藍芽模組連線
HC 05藍芽模組連線HC 05藍芽模組連線
HC 05藍芽模組連線
Chen-Hung Hu
 
自動功因改善裝置之研究
自動功因改善裝置之研究自動功因改善裝置之研究
自動功因改善裝置之研究
Chen-Hung Hu
 
淺談電腦檔案系統概念
淺談電腦檔案系統概念淺談電腦檔案系統概念
淺談電腦檔案系統概念
Chen-Hung Hu
 
【智慧核心-CPU】第三節:負數、小數的修正機制
【智慧核心-CPU】第三節:負數、小數的修正機制【智慧核心-CPU】第三節:負數、小數的修正機制
【智慧核心-CPU】第三節:負數、小數的修正機制
Chen-Hung Hu
 
【智慧核心-CPU】第二節:正整數進位制的轉換-編碼
【智慧核心-CPU】第二節:正整數進位制的轉換-編碼【智慧核心-CPU】第二節:正整數進位制的轉換-編碼
【智慧核心-CPU】第二節:正整數進位制的轉換-編碼
Chen-Hung Hu
 
漫談七段顯示器
漫談七段顯示器漫談七段顯示器
漫談七段顯示器
Chen-Hung Hu
 
BJT Transistor分壓偏壓電路分析
BJT Transistor分壓偏壓電路分析BJT Transistor分壓偏壓電路分析
BJT Transistor分壓偏壓電路分析
Chen-Hung Hu
 
淺談類比-數位轉換器
淺談類比-數位轉換器淺談類比-數位轉換器
淺談類比-數位轉換器
Chen-Hung Hu
 
感光元件及其相關迴路之研究 --以光敏電阻為例
感光元件及其相關迴路之研究 --以光敏電阻為例感光元件及其相關迴路之研究 --以光敏電阻為例
感光元件及其相關迴路之研究 --以光敏電阻為例
Chen-Hung Hu
 
穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例
穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例
穩壓元件及其相關迴路之研究 --以可調式輸出電源供應器為例
Chen-Hung Hu
 
Adc0804及其相關迴路之研究
Adc0804及其相關迴路之研究Adc0804及其相關迴路之研究
Adc0804及其相關迴路之研究
Chen-Hung Hu
 
可調式電源供應器之研究
可調式電源供應器之研究可調式電源供應器之研究
可調式電源供應器之研究
Chen-Hung Hu
 
HC 05藍芽模組連線
HC 05藍芽模組連線HC 05藍芽模組連線
HC 05藍芽模組連線
Chen-Hung Hu
 
自動功因改善裝置之研究
自動功因改善裝置之研究自動功因改善裝置之研究
自動功因改善裝置之研究
Chen-Hung Hu
 

Recently uploaded (20)

Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...
Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...
Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...
BeHappy728244
 
Influence line diagram for truss in a robust
Influence line diagram for truss in a robustInfluence line diagram for truss in a robust
Influence line diagram for truss in a robust
ParthaSengupta26
 
UNIT-5-PPT Computer Control Power of Power System
UNIT-5-PPT Computer Control Power of Power SystemUNIT-5-PPT Computer Control Power of Power System
UNIT-5-PPT Computer Control Power of Power System
Sridhar191373
 
"The Enigmas of the Riemann Hypothesis" by Julio Chai
"The Enigmas of the Riemann Hypothesis" by Julio Chai"The Enigmas of the Riemann Hypothesis" by Julio Chai
"The Enigmas of the Riemann Hypothesis" by Julio Chai
Julio Chai
 
[HIFLUX] Lok Fitting&Valve Catalog 2025 (Eng)
[HIFLUX] Lok Fitting&Valve Catalog 2025 (Eng)[HIFLUX] Lok Fitting&Valve Catalog 2025 (Eng)
[HIFLUX] Lok Fitting&Valve Catalog 2025 (Eng)
하이플럭스 / HIFLUX Co., Ltd.
 
Axial Capacity Estimation of FRP-strengthened Corroded Concrete Columns
Axial Capacity Estimation of FRP-strengthened Corroded Concrete ColumnsAxial Capacity Estimation of FRP-strengthened Corroded Concrete Columns
Axial Capacity Estimation of FRP-strengthened Corroded Concrete Columns
Journal of Soft Computing in Civil Engineering
 
Proposed EPA Municipal Waste Combustor Rule
Proposed EPA Municipal Waste Combustor RuleProposed EPA Municipal Waste Combustor Rule
Proposed EPA Municipal Waste Combustor Rule
AlvaroLinero2
 
Software Engineering Project Presentation Tanisha Tasnuva
Software Engineering Project Presentation Tanisha TasnuvaSoftware Engineering Project Presentation Tanisha Tasnuva
Software Engineering Project Presentation Tanisha Tasnuva
tanishatasnuva76
 
ISO 4548-7 Filter Vibration Fatigue Test Rig Catalogue.pdf
ISO 4548-7 Filter Vibration Fatigue Test Rig Catalogue.pdfISO 4548-7 Filter Vibration Fatigue Test Rig Catalogue.pdf
ISO 4548-7 Filter Vibration Fatigue Test Rig Catalogue.pdf
FILTRATION ENGINEERING & CUNSULTANT
 
Webinar On Steel Melting IIF of steel for rdso
Webinar  On Steel  Melting IIF of steel for rdsoWebinar  On Steel  Melting IIF of steel for rdso
Webinar On Steel Melting IIF of steel for rdso
KapilParyani3
 
Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...
Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...
Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...
ManiMaran230751
 
ISO 4548-9 Oil Filter Anti Drain Catalogue.pdf
ISO 4548-9 Oil Filter Anti Drain Catalogue.pdfISO 4548-9 Oil Filter Anti Drain Catalogue.pdf
ISO 4548-9 Oil Filter Anti Drain Catalogue.pdf
FILTRATION ENGINEERING & CUNSULTANT
 
MODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDING
MODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDINGMODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDING
MODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDING
Dr. BASWESHWAR JIRWANKAR
 
9aeb2aae-3b85-47a5-9776-154883bbae57.pdf
9aeb2aae-3b85-47a5-9776-154883bbae57.pdf9aeb2aae-3b85-47a5-9776-154883bbae57.pdf
9aeb2aae-3b85-47a5-9776-154883bbae57.pdf
RishabhGupta578788
 
Highway Engineering - Pavement materials
Highway Engineering - Pavement materialsHighway Engineering - Pavement materials
Highway Engineering - Pavement materials
AmrutaBhosale9
 
Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)
elelijjournal653
 
world subdivision.pdf...................
world subdivision.pdf...................world subdivision.pdf...................
world subdivision.pdf...................
bmmederos12
 
Introduction of Structural Audit and Health Montoring.pptx
Introduction of Structural Audit and Health Montoring.pptxIntroduction of Structural Audit and Health Montoring.pptx
Introduction of Structural Audit and Health Montoring.pptx
gunjalsachin
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 
Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...
Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...
Direct Current circuitsDirect Current circuitsDirect Current circuitsDirect C...
BeHappy728244
 
Influence line diagram for truss in a robust
Influence line diagram for truss in a robustInfluence line diagram for truss in a robust
Influence line diagram for truss in a robust
ParthaSengupta26
 
UNIT-5-PPT Computer Control Power of Power System
UNIT-5-PPT Computer Control Power of Power SystemUNIT-5-PPT Computer Control Power of Power System
UNIT-5-PPT Computer Control Power of Power System
Sridhar191373
 
"The Enigmas of the Riemann Hypothesis" by Julio Chai
"The Enigmas of the Riemann Hypothesis" by Julio Chai"The Enigmas of the Riemann Hypothesis" by Julio Chai
"The Enigmas of the Riemann Hypothesis" by Julio Chai
Julio Chai
 
Proposed EPA Municipal Waste Combustor Rule
Proposed EPA Municipal Waste Combustor RuleProposed EPA Municipal Waste Combustor Rule
Proposed EPA Municipal Waste Combustor Rule
AlvaroLinero2
 
Software Engineering Project Presentation Tanisha Tasnuva
Software Engineering Project Presentation Tanisha TasnuvaSoftware Engineering Project Presentation Tanisha Tasnuva
Software Engineering Project Presentation Tanisha Tasnuva
tanishatasnuva76
 
Webinar On Steel Melting IIF of steel for rdso
Webinar  On Steel  Melting IIF of steel for rdsoWebinar  On Steel  Melting IIF of steel for rdso
Webinar On Steel Melting IIF of steel for rdso
KapilParyani3
 
Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...
Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...
Android basics – Key Codes – ADB – Rooting Android – Boot Process – File Syst...
ManiMaran230751
 
MODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDING
MODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDINGMODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDING
MODULE 5 BUILDING PLANNING AND DESIGN SY BTECH ACOUSTICS SYSTEM IN BUILDING
Dr. BASWESHWAR JIRWANKAR
 
9aeb2aae-3b85-47a5-9776-154883bbae57.pdf
9aeb2aae-3b85-47a5-9776-154883bbae57.pdf9aeb2aae-3b85-47a5-9776-154883bbae57.pdf
9aeb2aae-3b85-47a5-9776-154883bbae57.pdf
RishabhGupta578788
 
Highway Engineering - Pavement materials
Highway Engineering - Pavement materialsHighway Engineering - Pavement materials
Highway Engineering - Pavement materials
AmrutaBhosale9
 
Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)Electrical and Electronics Engineering: An International Journal (ELELIJ)
Electrical and Electronics Engineering: An International Journal (ELELIJ)
elelijjournal653
 
world subdivision.pdf...................
world subdivision.pdf...................world subdivision.pdf...................
world subdivision.pdf...................
bmmederos12
 
Introduction of Structural Audit and Health Montoring.pptx
Introduction of Structural Audit and Health Montoring.pptxIntroduction of Structural Audit and Health Montoring.pptx
Introduction of Structural Audit and Health Montoring.pptx
gunjalsachin
 
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
IRJET Journal
 

ParallelProgrammingBasics_v2.pdf

  • 2. Target Audience  People interests parallel programming topic  People wants to know how to improve the performance of their code  People wants to know how to acquire the (possible) peak performance from their computer (There are a bunch of techniques / methods available for reaching peak performance and this kind of things is out of the range of our discussion)  Someone wants to know the way that I am using my computers / servers (X 2
  • 3. Outline  Why parallel programming?  What is parallel programming?  How to perform parallel programming (in C++ / Matlab / C#)  Conclusion / Further Discussions 3
  • 4. Why Parallel Programming? 4  Please check the following C++ code, what’s the output? int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 5. Why Parallel Programming? 5  Answer of the question “Please check the following C++ code, what’s the output?” // https://godbolt.org/z/fb7TdT495 // https://godbolt.org/z/3Kj1azb4h int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; } 4, 5, 6,
  • 6. Why Parallel Programming? 6  Code Structure int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; } Part 1: Variable Initialization
  • 7. Why Parallel Programming? 7  Code Structure int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; } Part 2: Data Processing / Calculation
  • 8. Why Parallel Programming? 8  Code Structure int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; } Part 3: Output
  • 9. Why Parallel Programming? 9  In the simple example above, the calculating part is a simple add operation int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; } 4, 5, 6,
  • 10. Why Parallel Programming? 10  What happens in the case of a more complicated operation? int main() { std::vector<Frame> image_frames = {img1, img2, img3}; auto results = std::vector<Features>(3); for(int i = 0; i < std::ranges::size(image_frames); ++i) { results[i] = feature_extraction(image_frames[i]); } … return 0; } This example calls a function named “feature_extraction”.
  • 11. Why Parallel Programming? 11 Without Parallel Programming With Parallel Programming Dish A Dish B Dish C … Dish A Dish B Dish C Icon is from https://www.hiclipart.com/free-transparent-background-png-clipart-iuxpq/download
  • 12. Why Parallel Programming? 12 Without Parallel Programming With Parallel Programming Task A Task B Task C … Task A Task B Task C Icon is from https://www.flaticon.com/free-icon/cpu_1250593
  • 13. The Steps of Execution 13 int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }  Let’s review the previous simple case. How is the program executed?
  • 14. The Steps of Execution 14 int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; } 1 2 3 test_vector
  • 15. The Steps of Execution 15 1 2 3 test_vector 3 a int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 16. The Steps of Execution 16 1 2 3 test_vector 3 a  Then, does the execution run sequentially? int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 17. The Steps of Execution 17 1 2 3 test_vector 3 a  Yes, the execution runs sequentially: = 4 int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 18. The Steps of Execution 18 4 2 3 test_vector 3 a  Yes, the execution runs sequentially: = 5 int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 19. The Steps of Execution 19 4 5 3 test_vector 3 a  Yes, the execution runs sequentially: = 6 int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 20. The Steps of Execution 20 4 5 6 test_vector 3 a  Yes, the execution runs sequentially: int main() { std::vector<int> test_vector = {1, 2, 3}; int a = 3; for(int i = 0; i < std::ranges::size(test_vector); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < std::ranges::size(test_vector); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 21. The Concept of Parallelization 21 1 2 3 test_vector 3  Instead of this sequential execution, is it possible to speed things up?  Why not make the program run in parallel (enable the operations to run simultaneously)? a 3 3 New test_vector 4 5 6
  • 22. The Concept of Parallelization 22 1 2 3 test_vector 3  How can this be done in our program? Solution: Parallel Programming! a 3 3 New test_vector 4 5 6
  • 23. The Concept of Parallelization 23  Enabling parallelization  Tools in C++: - OpenMP - TBB (Threading Building Blocks) - std::thread - Execution Policy in STL  Tools in Matlab  Tools in C#
  • 24. Parallelization Implementation 24  Enabling parallelization with OpenMP #include <omp.h> int main() { auto test_vector = std::vector<int>{1, 2, 3}; int a = 3; #pragma omp parallel for for(int i = 0; i < test_vector.size(); i++) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < test_vector.size(); ++i) { std::cout << test_vector[i] << ", "; } return 0; } auto test_vector = std::vector<int>{1, 2, 3}; int a = 3; test_vector[0] = test_vector[0] + a; test_vector[1] = test_vector[1] + a; test_vector[2] = test_vector[2] + a; for(int i = 0; i < test_vector.size(); ++i) { std::cout << test_vector[i] << ", "; } return 0;
  • 25. Parallelization Implementation 25  Enabling parallelization with OpenMP / TBB // https://godbolt.org/z/szMc4jbqn // https://godbolt.org/z/haM1qd6eY #include <omp.h> int main() { auto test_vector = std::vector<int>{1, 2, 3}; int a = 3; #pragma omp parallel for for(int i = 0; i < test_vector.size(); ++i) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < test_vector.size(); ++i) { std::cout << test_vector[i] << ", "; } return 0; } // https://godbolt.org/z/dcssoWj8K #include <tbb/parallel_for.h> int main() { auto test_vector = std::vector<int>{1, 2, 3}; int a = 3; tbb::parallel_for( tbb::blocked_range<int>(0,test_vector.size()), [&](tbb::blocked_range<int> r) { for (int i=r.begin(); i<r.end(); ++i) { test_vector[i] = test_vector[i] + a; } }); for(int i = 0; i < test_vector.size(); ++i) { std::cout << test_vector[i] << ", "; } return 0; }
  • 26. Parallelization Implementation 26  Enabling parallelization with OpenMP / TBB // https://godbolt.org/z/haM1qd6eY #include <omp.h> int main() { auto test_vector = std::vector<int>{1, 2, 3}; int a = 3; #pragma omp parallel for for(int i = 0; i < test_vector.size(); i++) { test_vector[i] = test_vector[i] + a; } for(int i = 0; i < test_vector.size(); ++i) { std::cout << test_vector[i] << ", "; } return 0; } // https://godbolt.org/z/dcssoWj8K #include <tbb/parallel_for.h> int main() { auto test_vector = std::vector<int>{1, 2, 3}; int a = 3; tbb::parallel_for( tbb::blocked_range<int>(0,test_vector.size()), [&](tbb::blocked_range<int> r) { for (int i=r.begin(); i<r.end(); ++i) { test_vector[i] = test_vector[i] + a; } }); for(int i = 0; i < test_vector.size(); ++i) { std::cout << test_vector[i] << ", "; } return 0; } A lambda function is here!
  • 27. Parallelization Methods Comparison 27  Comparing OpenMP / TBB and std::thread // The following code is an example of std::thread // https://stackoverflow.com/a/11229853/6667035 // https://godbolt.org/z/YeY9d4EeP #include <string> #include <iostream> #include <numeric> #include <thread> #include <vector> void task1(std::vector<int> input) // The function we want to execute on the new thread. { for(int i = 0; i < input.size(); i++) { std::cout << "output from task1 function: " << input[i]; } } void function1() { auto test_vector1 = std::vector<int>(100); std::iota(test_vector1.begin(), test_vector1.end(), 1); int sum = 0; for(int i = 0; i < test_vector1.size(); i++) { sum += test_vector1[i]; } std::cout << sum << "\n"; } int main() { auto test_vector = std::vector<int>(100); std::iota(test_vector.begin(), test_vector.end(), 1); std::thread t1(task1, test_vector); function1(); t1.join(); return 0; }
  • 28. std::thread Concept 28  Comparing OpenMP / TBB and std::thread main function task1 function function1 function Parallel Part end
  • 29. Execution Policy in STL 29  When it comes to Execution Policy after C++17…  std::execution::par  std::execution::seq // https://en.cppreference.com/w/cpp/algorithm/transform // https://godbolt.org/z/bY14q1z3K #include <algorithm> #include <execution> #include <iomanip> #include <iostream> #include <string> #include <thread> int main() { std::string g {"hello"}; std::for_each(std::execution::par, g.begin(), g.end(), [](char& c) // modify in-place { c = std::toupper(static_cast<unsigned char>(c)); }); std::cout << "g = " << std::quoted(g) << '\n'; return 0; }
  • 30. Parallelization in Matlab 30  Document of parfor function usage
  • 31. Parallelization in Matlab 31  parfor function usage example Program without parfor Program with parfor % https://www.mathworks.com/help/parallel-computing/parfor.html tic n = 200; A = 500; a = zeros(1,n); for i = 1:n a(i) = max(abs(eig(rand(A)))); end toc Elapsed time is 31.935373 seconds. tic n = 200; A = 500; a = zeros(1,n); parfor i = 1:n a(i) = max(abs(eig(rand(A)))); end toc Elapsed time is 10.760068 seconds.
  • 32. Parallelization in C# 32  Document of Parallel.For function usage: https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/data-parallelism-task-parallel-library
  • 33. Parallelization in C# 33  Parallel.For function usage example Program without Parallel.For Program with Parallel.For using System; using System.Threading.Tasks; public class ParallelTest { public static void Main(string[] args) { for(int i = 0; i < 10; i++) { Console.WriteLine(i + "\n"); }; } } using System; using System.Threading.Tasks; public class ParallelTest { public static void Main(string[] args) { Parallel.For(0, 10, i => { Console.WriteLine(i + "\n"); }); // Parallel.For } } A lambda function is here!
  • 35. Concept of Parallelizability 35  Please think about the limitations of parallelization
  • 36. Concept of Parallelizability 36  Please think about the limitations of parallelization Answer: The limitation of parallelization is that the operations to be parallelized must be independent! What’s the meaning of independent?
  • 37. Concept of Parallelizability 37  Please think about the limitations of parallelization Answer: The limitation of parallelization is that the operations to be parallelized must be independent! What’s the meaning of independent? Let’s check the dependent case first: A B C
  • 38. Concept of Parallelizability 38  Please think about the limitations of parallelization Answer: The limitation of parallelization is that the operations to be parallelized must be independent! What’s the meaning of independent? Let’s check the dependent case first: The A, B, and C operations cannot be parallelized, because operation B needs the output from A and operation C needs the output from B! A B C
  • 39. Conclusion / Further Discussions 39  Parallelization techniques can bring a real performance improvement when used properly  Parallelization can achieve higher utilization of computers / computing devices  Is there any disadvantage to using parallelization methods?
  • 40. Conclusion / Further Discussions 40  Parallelization techniques can bring a real performance improvement when used properly  Parallelization can achieve higher utilization of computers / computing devices  Is there any disadvantage to using parallelization methods?  Memory usage issues