Embedded-System-Partitioning-Cosynthesis
System Partitioning
EE8205: Embedded Computer Systems
http://www.ee.ryerson.ca/~courses/ee8205/
Dr. Gul N. Khan
http://www.ee.ryerson.ca/~gnkhan
Electrical and Computer Engineering
Ryerson University
Overview
• Hardware-Software Codesign
• Task Graph Representations
• Scheduling for Partitioning
• GDL Scheduling and Partitioning
• DADGP-based Partitioning
Introductory articles on hardware-software partitioning are available at the course webpage.
See also part of Chapter 7, 5 of the text by Wayne Wolf.
switch (state) {
    case IDLE:   if (seat) { state = SEATED; timer_on = TRUE; } break;
    case SEATED: if (belt) state = BELTED;
                 else if (timer) state = BUZZER; break;
    ………
}
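The fragment above can be wrapped into a small step function for experimentation. The sketch below is only an illustration: the IDLE and SEATED cases follow the fragment, while the BELTED and BUZZER transitions, the `Controller` struct and the `step` function are assumptions introduced here, not part of the original slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { IDLE, SEATED, BELTED, BUZZER } State;

typedef struct {
    State state;
    bool  timer_on;   /* mirrors timer_on in the fragment above */
} Controller;

/* Advance the controller by one step.
 *   seat  : a person is sitting
 *   belt  : the belt is fastened
 *   timer : the timer has expired
 * IDLE and SEATED follow the fragment above; the BELTED and BUZZER
 * cases are assumptions made only for this sketch. */
void step(Controller *c, bool seat, bool belt, bool timer)
{
    assert(c != NULL);
    switch (c->state) {
    case IDLE:
        if (seat) { c->state = SEATED; c->timer_on = true; }
        break;
    case SEATED:
        if (belt) c->state = BELTED;
        else if (timer) c->state = BUZZER;
        break;
    case BELTED:                      /* assumed transitions */
        if (!seat) c->state = IDLE;
        else if (!belt) c->state = SEATED;
        break;
    case BUZZER:                      /* assumed transitions */
        if (belt) c->state = BELTED;
        else if (!seat) c->state = IDLE;
        break;
    }
}
```

Calling `step` once per sampling period with the current sensor values moves the controller through its states; a state machine written this way maps directly onto either a software task or a small hardware FSM.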
©G. Khan EE8205: Embedded Computer Systems, HW-SW Partitioning Page: 8
Data Flow Graph
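x = a + b;
y = c - d;
z = x * y;
y1 = b + d;
[Figure: DFG with inputs a, b, c, d. A "+" node produces x and a "-" node produces y; both feed a "*" node producing z. A second "+" node produces y1.]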
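A data flow graph like the one above can be stored as a table of nodes and evaluated in topological order. The following is a minimal sketch under that assumption; the `Node` type, the `N_` node names and both functions are hypothetical names chosen for this illustration.

```c
#include <assert.h>

typedef enum { OP_INPUT, OP_ADD, OP_SUB, OP_MUL } Op;

typedef struct {
    Op  op;
    int lhs, rhs;   /* indices of the operand nodes; -1 for inputs */
} Node;

enum { N_A, N_B, N_C, N_D, N_X, N_Y, N_Z, N_Y1, NUM_NODES };

/* Nodes are listed in topological order: operands precede their users. */
static const Node dfg[NUM_NODES] = {
    [N_A]  = { OP_INPUT, -1, -1 },
    [N_B]  = { OP_INPUT, -1, -1 },
    [N_C]  = { OP_INPUT, -1, -1 },
    [N_D]  = { OP_INPUT, -1, -1 },
    [N_X]  = { OP_ADD, N_A, N_B },   /* x  = a + b */
    [N_Y]  = { OP_SUB, N_C, N_D },   /* y  = c - d */
    [N_Z]  = { OP_MUL, N_X, N_Y },   /* z  = x * y */
    [N_Y1] = { OP_ADD, N_B, N_D },   /* y1 = b + d */
};

/* value[] holds the inputs a..d on entry; x, y, z and y1 are filled in. */
void dfg_eval(int value[NUM_NODES])
{
    for (int i = 0; i < NUM_NODES; i++) {
        const Node *n = &dfg[i];
        if (n->op == OP_INPUT)
            continue;
        assert(n->lhs >= 0 && n->lhs < i && n->rhs >= 0 && n->rhs < i);
        int l = value[n->lhs], r = value[n->rhs];
        value[i] = (n->op == OP_ADD) ? l + r
                 : (n->op == OP_SUB) ? l - r
                 : l * r;
    }
}

/* Depth of a node: 0 for inputs, otherwise 1 + the deeper operand.
 * Operations with the same depth have no data dependence on each other. */
int dfg_level(int i)
{
    assert(i >= 0 && i < NUM_NODES);
    if (dfg[i].op == OP_INPUT)
        return 0;
    int l = dfg_level(dfg[i].lhs), r = dfg_level(dfg[i].rhs);
    return 1 + (l > r ? l : r);
}
```

Here x, y and y1 all sit at depth 1, so they could run in parallel if enough functional units exist, while z must wait for both x and y.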
Control Data Flow Graph
CDFG: represents control and data.
• Uses data flow graphs as components.
• Two types of nodes:
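  Data flow nodes: each encapsulates a DFG, e.g. x = a + b; y = c + d
  Decision nodes
[Figure: two equivalent forms of a decision node. One is a "cond" node with T and F branches; the other is a "value" node with branches v1, v2, v3, v4.]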
Control Data Flow Graph Example
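if (cond1) bb1();
else bb2();
bb3();
switch (test1) {
    case c1: bb4(); break;
    case c2: bb5(); break;
    case c3: bb6(); break;
}
[Figure: CDFG for the code above. Decision node cond1 leads to bb1( ) on T and bb2( ) on F; both continue to bb3( ). Decision node test1 then branches on c1, c2, c3 to bb4( ), bb5( ), bb6( ).]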
• Represent variable
execution order of tasks
T1 and T2
[Figure: scheduling example on a graph with operations 1 to 11 (operation 4 is u := a - b) and an IF at node 3 with branches a and a'. Several operations are annotated 10ns.]
Path-1 = 1 2 3 4 5 6 7 10 11
Path-2 = 1 2 3 8 9 10 11
Resources: one adder and one subtractor.
Constraints: 15ns state cycle.
State groupings shown in the first schedule: 1,2,3; 4(a), 8(a'); 5, 6, 7; 9, 10, 11; 10, 11
State groupings shown in the second schedule: 1,2,3; 4(a), 7(a), 8(a'); 5, 6, 10, 11; 9, 10, 11
Partitioning Approaches
A simple architecture with one CPU and one ASIC is the most
common target.
• Early approaches (mainly heuristic): initially assume
all tasks are mapped to software (the hardware is one CPU only).
• Tasks are then moved to HW incrementally until the system
requirements (system or individual task execution
time) are met.
• Other early approaches: initially all tasks are mapped
to dedicated hardware.
• Tasks are then moved incrementally to SW (CPU) until the system
requirements (system or individual task execution
time) are met.
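The software-first approach described above can be sketched as a greedy loop. This is an illustration only, not the method of any particular tool: it assumes tasks run one after another, ignores communication cost, and picks the next task to move by time saved per gate, which is a selection rule chosen here for the example. The `Task` type and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;
    double sw_ms;     /* execution time in software */
    double hw_ms;     /* execution time in hardware */
    int    hw_gates;  /* area needed if moved to hardware */
    bool   in_hw;     /* current mapping */
} Task;

/* Total execution time, assuming serial execution and no
 * communication cost (simplifications made for this sketch). */
double total_time(const Task *t, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i].in_hw ? t[i].hw_ms : t[i].sw_ms;
    return sum;
}

/* Start with every task in software, then move one task at a time to
 * hardware until the deadline is met or nothing is left to move.
 * Returns the hardware area used, in gates. */
int greedy_partition(Task *t, size_t n, double deadline_ms)
{
    assert(t != NULL);
    int gates = 0;
    for (size_t i = 0; i < n; i++)
        t[i].in_hw = false;

    while (total_time(t, n) > deadline_ms) {
        size_t best = n;
        double best_gain = 0.0;
        for (size_t i = 0; i < n; i++) {
            if (t[i].in_hw)
                continue;
            assert(t[i].hw_gates > 0);
            /* Assumed selection rule: time saved per gate. */
            double gain = (t[i].sw_ms - t[i].hw_ms) / t[i].hw_gates;
            if (gain > best_gain) {
                best_gain = gain;
                best = i;
            }
        }
        if (best == n)
            break;                    /* no remaining move helps */
        t[best].in_hw = true;
        gates += t[best].hw_gates;
    }
    return gates;
}
```

The hardware-first variant is the mirror image: start with every `in_hw` flag set and move tasks back to software one at a time.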
Optimal Partitioning
• Exhaustive approaches try all possible combinations,
thereby always selecting the best option.
• Exhaustive approaches are generally computationally
intensive and can take hours or even days to find an
optimal partition.
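To make the cost of exhaustive search concrete, the sketch below enumerates every HW/SW assignment of n tasks as a bitmask, which means 2^n candidates. It is a simplified illustration: total time is taken as the plain sum of task times, the constraint is an area budget, and `TaskCost` and `exhaustive_partition` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double sw_ms;     /* execution time in software */
    double hw_ms;     /* execution time in hardware */
    int    hw_gates;  /* area needed in hardware */
} TaskCost;

/* Try all 2^n mappings (bit i set = task i in hardware) and return the
 * mask with the smallest total time that fits within max_gates.
 * Assumes serial execution and no communication cost. */
unsigned exhaustive_partition(const TaskCost *t, size_t n,
                              int max_gates, double *best_ms)
{
    assert(t != NULL && n < 8 * sizeof(unsigned));
    unsigned best_mask = 0;
    double best = -1.0;               /* negative = nothing found yet */

    for (unsigned mask = 0; mask < (1u << n); mask++) {
        double ms = 0.0;
        int gates = 0;
        for (size_t i = 0; i < n; i++) {
            if ((mask >> i) & 1u) {
                ms += t[i].hw_ms;
                gates += t[i].hw_gates;
            } else {
                ms += t[i].sw_ms;
            }
        }
        if (gates <= max_gates && (best < 0.0 || ms < best)) {
            best = ms;
            best_mask = mask;
        }
    }
    if (best_ms != NULL)
        *best_ms = best;
    return best_mask;
}
```

The outer loop doubles with every added task, which is why this approach is only practical for small task sets.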
[Figure: two schedules on PE0, PE1 and PE2. GDL result: A on PE0, B and C on PE2, time marks 4, 9, 13. Result of not considering descendants: A and B on PE0, C on PE2, time marks 3, 8, 12, 16.]
Another Example
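Task graph: A → B → C, with edge labels 10 (A to B) and 11 (B to C).
Execution times:
Task   PE0   PE1
A      1     2
B      2     2
C      20    1
GDL Result: A on PE0; B and C on PE1; time marks 11, 13, 14.
Optimal Solution: A, B and C on PE1; time marks 2, 4, 5.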
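The two schedules in this example can be checked with a few lines of code. The sketch below assumes that the edge labels 10 and 11 are communication delays paid only when consecutive tasks sit on different PEs, an interpretation that reproduces the time marks 14 and 5 shown above. The function names are hypothetical.

```c
#include <assert.h>

#define NUM_TASKS 3   /* A, B, C */
#define NUM_PES   2   /* PE0, PE1 */

/* exec_time[task][pe], taken from the table in this example. */
static const int exec_time[NUM_TASKS][NUM_PES] = {
    { 1,  2 },   /* A */
    { 2,  2 },   /* B */
    { 20, 1 },   /* C */
};

/* comm_cost[i]: delay between task i and task i+1, assumed to apply
 * only when the two tasks are mapped to different PEs. */
static const int comm_cost[NUM_TASKS - 1] = { 10, 11 };

/* Finish time of the chain A -> B -> C for a given task-to-PE mapping. */
int chain_finish_time(const int pe[NUM_TASKS])
{
    int t = 0;
    for (int i = 0; i < NUM_TASKS; i++) {
        assert(pe[i] >= 0 && pe[i] < NUM_PES);
        if (i > 0 && pe[i] != pe[i - 1])
            t += comm_cost[i - 1];
        t += exec_time[i][pe[i]];
    }
    return t;
}

/* Try every mapping (valid for two PEs only) and return the shortest
 * finish time; the winning mapping is written to best_pe. */
int best_chain_mapping(int best_pe[NUM_TASKS])
{
    int best = -1;
    for (unsigned code = 0; code < (1u << NUM_TASKS); code++) {
        int pe[NUM_TASKS];
        for (int i = 0; i < NUM_TASKS; i++)
            pe[i] = (int)((code >> i) & 1u);
        int t = chain_finish_time(pe);
        if (best < 0 || t < best) {
            best = t;
            for (int i = 0; i < NUM_TASKS; i++)
                best_pe[i] = pe[i];
        }
    }
    return best;
}
```

With the mapping {PE0, PE1, PE1} the chain finishes at 14, and with everything on PE1 it finishes at 5, so the single cheap choice for A costs far more in communication than it saves in execution.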
DADGP: Directed Acyclic Data Dependency Graph with Precedence
• Arrow represents dependence relationship
[Figure: flowchart with the steps Profiling, LD Path Search, Mapping, Scheduling and Finish, with Yes/No decision branches.]
Operation             SW EXE (ms)   HW EXE (ms)   HW Area (gates)
Gradient (Gx or Gy)   9.4           1.4           1200
Square                5.2           0.9           500
Add                   3.88          0.3           100
[Figure: data dependency graph with nodes Gx2, Gy2 and Add.]
Mask rows shown: Gx: -1 0 +1; Gy: -1 -2 -1
b22 = (a11*m11) + (a12*m12) + (a13*m13) + (a21*m21) + (a22*m22) + (a23*m23) + (a31*m31) + (a32*m32) + (a33*m33)
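The sum of products for b22 is a 3x3 mask applied to a pixel neighbourhood. A minimal sketch in C follows, assuming integer pixels and mask coefficients; `apply_mask3x3` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns b22 for the 3x3 neighbourhood a and the 3x3 mask m:
 * the sum of a[i][j] * m[i][j] over all nine positions. */
int apply_mask3x3(const int a[3][3], const int m[3][3])
{
    assert(a != NULL && m != NULL);
    int sum = 0;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            sum += a[i][j] * m[i][j];
    return sum;
}
```

Sliding this over every interior pixel of an image gives the filtered output; the nine multiplications per pixel are independent of each other, which is what makes the operation attractive for a hardware implementation.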
Sobel Edge Detection
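#include <stdlib.h>   /* abs() */

#define ROWS      256   /* assumed image height; not given in the slides */
#define COLS      256   /* assumed image width; not given in the slides */
#define Threshold 64    /* assumed edge threshold; not given in the slides */

int main(void) {
    /* static keeps the large arrays off the stack and zero-initialized */
    static unsigned char image_in[ROWS][COLS];
    static unsigned char image_out[ROWS][COLS];
    int r, c;    /* row and column array counters */
    int pixel;   /* temporary value of pixel */
    /* Filter the image and store result in output array */
    for (r = 1; r < ROWS-1; r++)
        for (c = 1; c < COLS-1; c++) {   /* Apply Sobel operator. */
            pixel = image_in[r-1][c+1] - image_in[r-1][c-1]
                  + 2*image_in[r][c+1] - 2*image_in[r][c-1]
                  + image_in[r+1][c+1] - image_in[r+1][c-1];
            /* Normalize and take absolute value */
            pixel = abs(pixel/4);
            /* Check magnitude */
            if (pixel > Threshold)
                pixel = 255;   /* EDGE_VALUE */
            /* Store in output array */
            image_out[r][c] = (unsigned char) pixel;
        }
    return 0;
}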
[Figure: task graphs built from Gx, Gy, Gx2, Gy2 and Add nodes, with edges labelled 0.1.]
[Figure: chart of execution time (Seconds, axis 0 to 40) against HW area (axis marks 0, 1200, 2400, 2900, 3400, 3500); plotted values 33.8, 23.68, 15.88, 10.68, 6.38, 2.8.]