Physically Aware HW/SW Partitioning for Reconfigurable Architectures with Partial Dynamic Reconfiguration
By: Vibrant Technologies & Computers
Outline
• Introduction
• Target System Architecture
• Placement Issues
• Proposed Approach
• Placement Example
• Experimental Results
• Conclusions
Introduction
• Hardware / Software Partitioning
o Used in systems with reconfigurable hardware (FPGA) operated in
conjunction with a software processor
o Hardware and Software tasks can execute concurrently
o Partitioning divides task graph into HW executed and SW executed tasks
to reduce time to completion
Introduction
• Partial Reconfiguration
o ‘Columns’ of FPGA can be configured independently
o Hardware mapped to other columns continues to run during
reconfiguration
• Partial Dynamic Reconfiguration
o Allows reuse of FPGA resources
o However, feasibility of placement no longer guaranteed
Target System Architecture
[Block diagram: General Purpose Memory, Software, Hardware (Partial RTR), Shared Memory]
• Software: A processor running
software tasks
• Hardware: An FPGA accelerator
that supports partial
reconfiguration
• Shared Memory: Dedicated
memory used to transfer
input/output data between tasks
Target System Architecture
• Shared Memory can be implemented as on-chip or
off-chip dedicated memory
• Tasks mapped to the same device have negligible
communication overhead
• Tasks mapped to different devices incur a HW/SW
communication overhead
• Primary advantage: FPGA task placement reduces
to simple linear placement
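The communication rule above can be sketched in a few lines of Python. This is only an illustration: the edge costs, the task names and the function name `communication_overhead` are hypothetical and do not come from the slides.

```python
from typing import Dict, Tuple

def communication_overhead(
    edges: Dict[Tuple[str, str], int],
    side: Dict[str, str],
) -> int:
    """Sum the transfer cost of every edge whose endpoints sit on different devices."""
    # Edges between tasks on the same device are treated as free (negligible overhead).
    return sum(cost for (src, dst), cost in edges.items() if side[src] != side[dst])

# Hypothetical three-task chain T1 -> T2 -> T3 with made-up transfer costs.
edges = {("T1", "T2"): 2, ("T2", "T3"): 1}
side = {"T1": "HW", "T2": "SW", "T3": "SW"}
overhead = communication_overhead(edges, side)
print(overhead)
```

Only the T1 -> T2 edge crosses the HW/SW boundary here, so it is the only one that contributes to the total.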
Criticality of Task Placement
• Each HW task occupies one or more adjacent
FPGA columns
• Placement feasibility is not guaranteed even with
an exact algorithm
• An infeasible implementation can result if
scheduling conflicts are not considered during
placement
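A minimal sketch of why adjacency matters, assuming a simple first-fit search over a row of columns. The occupancy list and the helper name `first_fit` are hypothetical, and the slides do not say which search strategy is used.

```python
from typing import List, Optional

def first_fit(free: List[bool], width: int) -> Optional[int]:
    """Return the leftmost start index of `width` adjacent free columns, or None.

    Assumes width >= 1.
    """
    run = 0  # length of the current run of free columns
    for col, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == width:
            return col - width + 1
    return None

# Hypothetical occupancy: four columns are free, but never three in a row.
columns = [True, False, True, True, False, True]
print(first_fit(columns, 2))
print(first_fit(columns, 3))
```

The second call finds no slot even though four columns are free in total, which is the kind of fragmentation that can make a partition infeasible when placement is ignored.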
Criticality of Task Placement
[Figures: task graph examples showing an infeasible placement and a feasible placement]
Heterogeneous Implementations
• FPGAs contain heterogeneous components:
o Memory Blocks
o Hardware Multipliers
o Embedded Processors
• Placement should consider multiple hardware
implementations of tasks
• Problem: Resources are limited and available in
specific locations on FPGA
Configuration Prefetch
• Reconfiguration can take place as other HW tasks
execute
• Prefetch of configuration data should be
considered while scheduling tasks
Proposed Approach
• Exact Algorithm: Integer Linear Programming
o Optimization technique for problems with linear constraints
o Constraints: Traditional HW/SW partitioning + Contiguous placement +
Configuration Prefetch
o Implementation on a commercial ILP solver (CPLEX) is very slow
• Heuristic Formulation:
o Modified KLFM approach
Basic KLFM Heuristic
KLFM Loop:
  while (more unlocked tasks)
    select best task to switch between HW/SW
    move & lock best task
    update best partition if new partition is better
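The loop above could be written roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: `toy_cost` is a made-up stand-in that ignores dependencies and reconfiguration, the task numbers are borrowed from the Placement Example table, and the six-column limit is assumed from the C1-C6 labels in that slide's figure.

```python
from typing import Callable, Dict

Partition = Dict[int, str]  # task id -> "HW" or "SW"

def flip(partition: Partition, task: int) -> Partition:
    """Return a copy of the partition with one task moved to the other side."""
    trial = dict(partition)
    trial[task] = "SW" if trial[task] == "HW" else "HW"
    return trial

def klfm_pass(start: Partition, cost: Callable[[Partition], float]) -> Partition:
    """One KLFM pass: move and lock every task once, remember the best partition seen."""
    current = dict(start)
    best, best_cost = dict(current), cost(current)
    unlocked = set(current)
    while unlocked:
        # Select the unlocked task whose move gives the lowest cost (ties: lowest id).
        task = min(sorted(unlocked), key=lambda t: cost(flip(current, t)))
        current = flip(current, task)  # move the task ...
        unlocked.remove(task)          # ... and lock it
        if cost(current) < best_cost:
            best, best_cost = dict(current), cost(current)
    return best

# (HW time, SW time, HW area) per task, taken from the Placement Example table.
TASKS = {1: (5, 23, 3), 2: (2, 9, 3), 3: (2, 11, 2), 4: (3, 14, 1), 5: (2, 10, 2), 6: (3, 7, 4)}
COLUMNS = 6  # assumption: no reconfiguration, so all HW tasks must fit at once

def toy_cost(p: Partition) -> float:
    """Hypothetical cost: HW and SW each run serially, side by side; no dependencies."""
    hw = [t for t, s in p.items() if s == "HW"]
    if sum(TASKS[t][2] for t in hw) > COLUMNS:
        return float("inf")  # does not fit on the FPGA
    hw_time = sum(TASKS[t][0] for t in hw)
    sw_time = sum(TASKS[t][1] for t in p if t not in hw)
    return max(hw_time, sw_time)

best = klfm_pass({t: "SW" for t in TASKS}, toy_cost)
print(best, toy_cost(best))
```

The pass keeps moving tasks even when the cost gets worse, which is what lets KLFM-style heuristics climb out of local minima; only the best partition seen along the way is returned.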
Modified KLFM Heuristic
KLFM Loop:
  while (more unlocked tasks)
    for (each unlocked task)
      for (each alternate implementation)
        calculate makespan by physically aware list scheduling
    select & lock best (task, implementation point)
    update best partition if new partition is better
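A sketch of the loop above, in which every unlocked task is tried with each alternate implementation. The `makespan` callback stands in for the physically aware list scheduler, which the slides do not spell out; `toy_makespan`, the implementation names and all the numbers in the example are hypothetical.

```python
from typing import Callable, Dict, List

Assignment = Dict[str, str]  # task -> chosen implementation point

def modified_klfm_pass(
    start: Assignment,
    options: Dict[str, List[str]],
    makespan: Callable[[Assignment], float],
) -> Assignment:
    """One pass that picks the best (task, implementation point) pair each iteration."""
    current = dict(start)
    best, best_len = dict(current), makespan(current)
    unlocked = set(current)
    while unlocked:
        candidates = []
        for task in sorted(unlocked):
            for impl in options[task]:
                if impl != current[task]:
                    trial = {**current, task: impl}
                    candidates.append((makespan(trial), task, impl))
        if not candidates:
            break  # no unlocked task has an alternate implementation
        length, task, impl = min(candidates)
        current[task] = impl   # select the best pair ...
        unlocked.remove(task)  # ... and lock the task
        if length < best_len:
            best, best_len = dict(current), length
    return best

# Hypothetical implementation points: (execution time, FPGA columns); 0 columns = software.
IMPL = {
    ("A", "sw"): (10, 0), ("A", "hw_small"): (4, 1), ("A", "hw_fast"): (2, 3),
    ("B", "sw"): (3, 0), ("B", "hw"): (1, 2),
}
OPTIONS = {"A": ["sw", "hw_small", "hw_fast"], "B": ["sw", "hw"]}
COLUMNS = 3  # hypothetical FPGA width

def toy_makespan(a: Assignment) -> float:
    """Stand-in for the scheduler: serial HW and serial SW running side by side."""
    chosen = [IMPL[(task, impl)] for task, impl in a.items()]
    if sum(area for _, area in chosen) > COLUMNS:
        return float("inf")
    hw_time = sum(time for time, area in chosen if area > 0)
    sw_time = sum(time for time, area in chosen if area == 0)
    return max(hw_time, sw_time)

best = modified_klfm_pass({"A": "sw", "B": "sw"}, OPTIONS, toy_makespan)
print(best, toy_makespan(best))
```

Passing the scheduler in as a callback keeps the search loop separate from the placement model, so a more faithful scheduler could be swapped in without touching the loop.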
Placement Example
Task  HW time  SW time  HW area
1     5        23       3
2     2        9        3
3     2        11       2
4     3        14       1
5     2        10       2
6     3        7        4
[Figure: schedule chart of time (steps 1-10) against FPGA columns C1-C6 and the processor (Proc) for tasks T1-T6; chart labels: E1-E5, R3-R5, P6, C65, Gap]
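A quick arithmetic check on the table shows why this example needs partial dynamic reconfiguration. The six-column FPGA width is an assumption read off the C1-C6 labels in the figure; the slides do not state it as a number.

```python
# (HW time, SW time, HW area in columns) for each task, copied from the table.
tasks = {
    1: (5, 23, 3),
    2: (2, 9, 3),
    3: (2, 11, 2),
    4: (3, 14, 1),
    5: (2, 10, 2),
    6: (3, 7, 4),
}
FPGA_COLUMNS = 6  # assumption: the figure labels columns C1..C6

total_sw_time = sum(sw for _, sw, _ in tasks.values())
total_hw_time = sum(hw for hw, _, _ in tasks.values())
total_hw_area = sum(area for _, _, area in tasks.values())

print(f"all-software run time: {total_sw_time}")
print(f"sum of HW times:       {total_hw_time}")
print(f"HW area demanded:      {total_hw_area} columns, available: {FPGA_COLUMNS}")
```

Running everything in software takes 74 time units, while mapping everything to hardware at once would need 15 columns. Under the six-column assumption, the tasks can only share the FPGA by reusing columns over time.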
Experiments on Feasibility
            Placement-unaware     Placement-aware
Test        T_ILP   Feasibility   T_ILP   T_HEU
tg1         10      Yes           10      11
tg5         25      No            26      26
Mean-value  21      Yes           21      21
tg7         20      Yes           20      20
tg10        27      No            28      29
FFT         25      Yes           25      25
tg11        36      No            38      41
tg12        14      No            15      18
4-band eq   27      Yes           27      27
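The table can be summarised with a short script. The numbers are copied from the table; the relative-gap metric and the variable names are illustrative choices, not something the slides define.

```python
# (test, placement-unaware ILP, feasible?, placement-aware ILP, heuristic)
results = [
    ("tg1", 10, True, 10, 11),
    ("tg5", 25, False, 26, 26),
    ("Mean-value", 21, True, 21, 21),
    ("tg7", 20, True, 20, 20),
    ("tg10", 27, False, 28, 29),
    ("FFT", 25, True, 25, 25),
    ("tg11", 36, False, 38, 41),
    ("tg12", 14, False, 15, 18),
    ("4-band eq", 27, True, 27, 27),
]

# Relative gap of the heuristic over the placement-aware ILP result.
gaps = {name: (heu - ilp) / ilp for name, _, _, ilp, heu in results}
exact_matches = [name for name, gap in gaps.items() if gap == 0]
worst = max(gaps, key=gaps.get)
infeasible = [name for name, _, ok, _, _ in results if not ok]

print(f"heuristic matches the ILP on {len(exact_matches)} of {len(results)} tests")
print(f"largest gap: {worst} ({gaps[worst]:.0%})")
print(f"placement-unaware result infeasible for: {', '.join(infeasible)}")
```

In these rows the placement-aware ILP value exceeds the placement-unaware one exactly where the unaware solution was infeasible, and equals it everywhere else.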
Case Study: JPEG Encoder
• Resource constraint of 8 columns
• Total area occupied by tasks: 11 columns
• Data collected for a 256x256 color image
Experiment                                              Schedule length (ms)
HW-SW partitioning, no partial RTR                      16.74
HW-SW, partial RTR                                      9.9
HW-SW, partial RTR, perfect prefetch                    9.04
Finer-grain graph                                       7.21
Multiple implementations, single heterogeneous column   6.82
Best implementation points only                         9.58
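To put the schedule lengths in perspective, the speedup of each configuration over the first row can be computed directly. Treating "HW-SW partitioning, no partial RTR" as the baseline is an illustrative choice rather than something the slide states.

```python
# Schedule lengths in milliseconds, copied from the table.
schedule_ms = {
    "HW-SW partitioning, no partial RTR": 16.74,
    "HW-SW, partial RTR": 9.9,
    "HW-SW, partial RTR, perfect prefetch": 9.04,
    "Finer-grain graph": 7.21,
    "Multiple implementations, single heterogeneous column": 6.82,
    "Best implementation points only": 9.58,
}

baseline = schedule_ms["HW-SW partitioning, no partial RTR"]
speedup = {name: baseline / ms for name, ms in schedule_ms.items()}

for name, factor in speedup.items():
    print(f"{name:55s} {factor:.2f}x")
```

Partial RTR alone gives roughly a 1.69x improvement over the baseline, and the multiple-implementation configuration reaches roughly 2.45x.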
Conclusions
• Current techniques fail to consider one or more of the following placement and
scheduling issues:
o Configuration prefetch
o Feasibility of partition
o Single reconfiguration controller bottleneck
o Multiple Implementations
o Heterogeneous Architecture
• Integer Linear Programming: Exact solution, but very long run-time
• Modified KLFM Heuristic: Almost ideal solution, run-time of minutes for task graphs with
hundreds of nodes
Issues in Placement
• Resource bottleneck of a single reconfiguration
controller
• May not be possible to hide reconfiguration
overhead for all tasks
• Cannot apply rectangular packing algorithms due
to gaps in schedule (caused by dependencies)
EST Computation Algorithm
find earliest time slot where task can be placed
reconfig start = earliest time instant that space and controller are available together
if ((reconfig start + reconfig time) < dependency time)
  EST = earliest time parent dependencies are satisfied
else
  EST = end of reconfiguration
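The EST rule above can be written as a small function. This is a simplified sketch: it assumes the column space and the reconfiguration controller each stay available from a single known time onward, so the "available together" instant is just the later of the two. The parameter names are hypothetical.

```python
def earliest_start_time(
    space_free: int,
    controller_free: int,
    reconfig_time: int,
    dependency_time: int,
) -> int:
    """Earliest time a HW task can start executing.

    space_free:      time at which the chosen columns become free
    controller_free: time at which the reconfiguration controller becomes free
    reconfig_time:   duration of loading the task's configuration
    dependency_time: time at which all parent dependencies are satisfied
    """
    # Assumption: both resources remain free once they become free.
    reconfig_start = max(space_free, controller_free)
    reconfig_end = reconfig_start + reconfig_time
    if reconfig_end < dependency_time:
        # Reconfiguration is fully hidden by prefetch; dependencies decide.
        return dependency_time
    # Reconfiguration finishes last, so it decides the start time.
    return reconfig_end

print(earliest_start_time(space_free=2, controller_free=4, reconfig_time=3, dependency_time=10))
print(earliest_start_time(space_free=2, controller_free=4, reconfig_time=3, dependency_time=5))
```

In the first call the configuration is loaded by time 7, before the parents finish at time 10, so the overhead is hidden. In the second call the task has to wait for reconfiguration to end at time 7.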
Thank You!
For more information, see the link below:
http://vibranttechnologies.co.in/embedded-system-classes-in-mumbai.html
