
Performance Characterization PDF

The document discusses techniques for estimating delay in digital logic circuits. It covers: - Defining propagation delay and its components (rise/fall times) - Using RC models and equivalent circuits to estimate delay - Calculating logical effort and electrical effort to characterize a gate's ability to drive load - Examples of analyzing NAND and NOR gates, estimating delays, and transistor sizing to balance delays The overall aim is to provide methods to easily estimate delay without simulation for evaluating design tradeoffs.

Uploaded by

Sankalp Jain
Copyright
© Attribution Non-Commercial (BY-NC)

Performance Characterization

- Delay analysis
- Transistor sizing
- Logical effort
- Power analysis

ECE 261

Krish Chakrabarty 

1 

Delay Definitions
- tpdr: rising propagation delay
  - From input crossing VDD/2 to rising output crossing VDD/2
- tpdf: falling propagation delay
  - From input crossing VDD/2 to falling output crossing VDD/2
- tpd: average propagation delay
  - tpd = (tpdr + tpdf)/2
- tr: rise time
  - From output crossing 0.2 VDD to 0.8 VDD
- tf: fall time
  - From output crossing 0.8 VDD to 0.2 VDD

Simulated Inverter Delay
- Solving differential equations by hand is too hard
- SPICE simulator solves the equations numerically
  - Uses more accurate I-V models too!
- But simulations take time to write

Delay Estimation
- We would like to be able to easily estimate delay
  - Not as accurate as simulation
  - But easier to ask "What if?"
- The step response usually looks like a first-order RC response with a decaying exponential.
- Use RC delay models to estimate delay
  - C = total capacitance on output node
  - Use effective resistance R
  - So that tpd = RC
- Characterize transistors by finding their effective R
  - Depends on average current as gate switches

RC Delay Models
- Use equivalent circuits for MOS transistors
  - Ideal switch + capacitance and ON resistance
  - Unit nMOS has resistance R, capacitance C
  - Unit pMOS has resistance 2R, capacitance C
- Capacitance proportional to width
- Resistance inversely proportional to width

Example: 3-input NAND
- Sketch a 3-input NAND with transistor widths chosen to achieve effective rise and fall resistances equal to a unit inverter (R).
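The sizing rule behind this exercise can be sketched in a few lines of Python. This is a minimal sketch that assumes the RC model from the previous slide (unit nMOS resistance R, unit pMOS resistance 2R, resistance inversely proportional to width) and sizes for the worst case in which only one parallel transistor conducts. The function names are illustrative and do not come from the slides.

```python
def unit_drive_widths(gate, n_inputs):
    """Widths (in unit-transistor widths) giving worst-case resistance R."""
    if gate == "nand":
        # n nMOS in series must each be n wide; parallel pMOS stay at 2.
        return {"nmos": n_inputs, "pmos": 2}
    if gate == "nor":
        # parallel nMOS stay at 1; n pMOS in series must each be 2n wide.
        return {"nmos": 1, "pmos": 2 * n_inputs}
    raise ValueError(f"unsupported gate type: {gate}")


def worst_case_resistances(gate, n_inputs, R=1.0):
    """Worst-case pull-down (fall) and pull-up (rise) resistance."""
    w = unit_drive_widths(gate, n_inputs)
    r_n = R / w["nmos"]        # one nMOS of width w has resistance R / w
    r_p = 2.0 * R / w["pmos"]  # one pMOS of width w has resistance 2R / w
    if gate == "nand":
        return {"fall": n_inputs * r_n, "rise": r_p}
    return {"fall": r_n, "rise": n_inputs * r_p}


if __name__ == "__main__":
    print(unit_drive_widths("nand", 3))
    print(worst_case_resistances("nand", 3))
```

For the 3-input NAND this gives nMOS widths of 3 and pMOS widths of 2, so both the series pull-down and a single pull-up transistor present resistance R.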

3-input NAND Caps
- Annotate the 3-input NAND gate with gate and diffusion capacitance.

Elmore Delay
- ON transistors look like resistors
- Pullup or pulldown network modeled as RC ladder
- Elmore delay of RC ladder
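For an RC ladder with resistors R1..RN in series from the source and a capacitor Ci hanging on node i, the Elmore delay is R1·C1 + (R1 + R2)·C2 + ... + (R1 + ... + RN)·CN. The following is a minimal Python sketch of that sum. The function name and the list-based interface are assumptions made for illustration.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the resistor between node i-1 and node i (the source
    is node 0); capacitances[i] is the capacitor from node i to ground.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitor per resistor")
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r            # total resistance from source to this node
        delay += upstream_r * c    # each capacitor sees all upstream resistance
    return delay


if __name__ == "__main__":
    # Two-section ladder: 1*3 + (1 + 2)*4
    print(elmore_delay([1.0, 2.0], [3.0, 4.0]))
```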

Example: 2-input NAND
- Estimate the worst-case rising and falling propagation delays of a 2-input NAND driving h identical gates.
Delay Components
- Delay has two parts
  - Parasitic delay
    - 6 or 7 RC
    - Independent of load
  - Effort delay
    - 4h RC
    - Proportional to load capacitance
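The split into parasitic and effort delay can be reproduced for the 2-input NAND with the Elmore model. The sketch below assumes the usual unit-drive sizing (all four transistors of width 2), 6C of diffusion capacitance on the output, 2C on the internal node of the nMOS stack, and 4C of input capacitance per driven gate. These capacitance values are assumptions of the sketch rather than numbers quoted on the slide, but they reproduce its 6 or 7 RC and 4h RC terms.

```python
def elmore_delay(resistances, capacitances):
    """Sum over nodes of (resistance back to the source) * (node capacitance)."""
    delay, upstream_r = 0.0, 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r
        delay += upstream_r * c
    return delay


def nand2_delays(h, R=1.0, C=1.0):
    """Worst-case (rise, fall) delay of a unit NAND2 driving h identical gates."""
    c_out = (6 + 4 * h) * C   # assumed: 6C diffusion + h gates * 4C input cap
    c_internal = 2 * C        # assumed: diffusion on the node between the nMOS
    # Rising: a single width-2 pMOS (resistance R) charges the output.
    t_rise = elmore_delay([R], [c_out])
    # Falling: two width-2 nMOS (R/2 each) in series; the internal node
    # sits one transistor above ground, the output two.
    t_fall = elmore_delay([R / 2, R / 2], [c_internal, c_out])
    return t_rise, t_fall


if __name__ == "__main__":
    for h in (0, 1, 4):
        print(h, nand2_delays(h))
```

With h = 0 the result is the parasitic delay alone (6 RC rising, 7 RC falling); each additional driven gate adds 4 RC.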

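Contamination Delay
- Best-case (contamination) delay can be substantially less than propagation delay.
- Ex: If both inputs fall simultaneously
  - Both pMOS transistors pull up in parallel, halving the pull-up resistance, so the rising contamination delay of the 2-input NAND is (R/2)(6 + 4h)C = (3 + 2h) RC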

Diffusion Capacitance
- We assumed contacted diffusion on every s / d.
- Good layout minimizes diffusion area
- Ex: NAND3 layout shares one diffusion contact
  - Reduces output capacitance by 2C
  - Merged uncontacted diffusion might help too

Layout Comparison
- Which layout is better?

Resizing the Inverter
- Minimum-sized transistor: W = 3λ, L = 2λ
- To get equal rise and fall times, $\beta_n = \beta_p$ → $W_p = 3W_n$, assuming that electron mobility is three times that of holes
  - $W_p = 9\lambda$
- Sometimes the function being implemented makes resizing unnecessary!

[Figure: inverter layout with a 2λ poly gate crossing a 3λ-wide n-diffusion and a 9λ-wide p-diffusion]

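Analyzing the NAND Gate

[Figure: 3-input NAND with pMOS p1, p2, p3 in parallel between VDD and output F, and nMOS n1, n2, n3 in series between F and Gnd; inputs a, b, c]

- Pull-down: resistances are in series, so the conductances combine like resistors in parallel

$$\beta_{n,\mathrm{eff}} = \frac{1}{\dfrac{1}{\beta_{n1}} + \dfrac{1}{\beta_{n2}} + \dfrac{1}{\beta_{n3}}}$$

- If $\beta_{n1} = \beta_{n2} = \beta_{n3} = \beta_n$ then $\beta_{n,\mathrm{eff}} = \beta_n/3$
- Pull-down circuit has three times the resistance, one-third the conductance
- For pull-up, only one transistor has to be on: $\beta_{p,\mathrm{eff}} = \min\{\beta_{p1}, \beta_{p2}, \beta_{p3}\}$
- If $\beta_{p1} = \beta_{p2} = \beta_{p3} = \beta_p = \beta_n/3$ then $\beta_{n,\mathrm{eff}} = \beta_p$ → no resizing is necessary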

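Analyzing the NOR Gate

[Figure: 3-input NOR with pMOS p1, p2, p3 in series between VDD and the output, and nMOS n1, n2, n3 in parallel between the output and Gnd; inputs a, b, c]

- Pull-up: resistances are in series, so the conductances combine like resistors in parallel

$$\beta_{p,\mathrm{eff}} = \frac{1}{\dfrac{1}{\beta_{p1}} + \dfrac{1}{\beta_{p2}} + \dfrac{1}{\beta_{p3}}}$$

- If $\beta_{p1} = \beta_{p2} = \beta_{p3} = \beta_p$ then $\beta_{p,\mathrm{eff}} = \beta_p/3$
- Pull-up circuit has three times the resistance, one-third the conductance
- For pull-down, only one transistor has to be on: $\beta_{n,\mathrm{eff}} = \min\{\beta_{n1}, \beta_{n2}, \beta_{n3}\}$
- If $\beta_{n1} = \beta_{n2} = \beta_{n3} = \beta_n = 3\beta_p$ then $\beta_{n,\mathrm{eff}} = 9\beta_{p,\mathrm{eff}}$ → considerable resizing is necessary: $W_p = 9W_n$!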
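The NAND and NOR comparison can be checked numerically. The sketch below assumes that transistor gain is proportional to mobility times width, that electron mobility is three times hole mobility (as on the inverter slide), and that the worst case has only one parallel transistor conducting. The function names are illustrative.

```python
def series_beta(betas):
    """Effective gain of transistors in series (reciprocals add)."""
    return 1.0 / sum(1.0 / b for b in betas)


def balanced_width_ratio(gate, n_inputs, mobility_ratio=3.0):
    """Wp/Wn that equalizes worst-case pull-up and pull-down gain.

    mobility_ratio is electron mobility divided by hole mobility.
    """
    if gate == "nand":
        # series nMOS divide the pull-down gain by n, which offsets mobility
        return mobility_ratio / n_inputs
    if gate == "nor":
        # series pMOS divide the already weaker pull-up gain by n
        return mobility_ratio * n_inputs
    raise ValueError(f"unsupported gate type: {gate}")


if __name__ == "__main__":
    print(series_beta([3.0, 3.0, 3.0]))        # three equal devices in series
    print(balanced_width_ratio("nand", 3))     # equal widths suffice
    print(balanced_width_ratio("nor", 3))      # pMOS must be 9x wider
```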

Effect of Series Transistors

[Figure: layout of three series transistors, each with a poly gate of length L over a shared diffusion strip, shown as equivalent to a single transistor with gate length 3L]

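Effect of Series Transistors
- Transistor resizing example

[Figure: pull-up network of pMOS transistors a, b and c between VDD and the output, each initially of gain $\beta_p$, above a pull-down block]

- Resize the pull-up transistors to make pull-up times equal
- After resizing: a: $2\beta_p$, b: $2\beta_p$, c: $\beta_p$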

Transistor Placement (Series Stack)
- How to order transistors in a series stack?
- Body effect: Vt increases with Vsb

[Figure: pull-up stack above output F; nMOS transistors ta, tb, tc in series from F to Gnd with inputs a, b, c and node capacitances Ca, Cb, Cc]

- At time t = 0, a = b = c = 0, F = 1, capacitances are charged
- Ideally Vta = Vtb = Vtc ≈ 0.8 V
- However, Vta > Vtb > Vtc because of body effect
- If a, b, c become 1 at the same time, which transistor will switch on first?
  - tc will switch on first (Vsb for tc is zero); Cc will discharge, pulling Vsb for tb to zero
- If signals arrive at different times, how should the transistors be ordered?
- Design strategy: place the latest-arriving signal nearest to the output, so that early signals discharge the internal nodes

Transistor Placement

[Figure: two alternative orderings of inputs a, b, c in the series stack (transistors ta, tb, tc; node capacitances Ca, Cb, Cc; output F), driven from primary inputs that change simultaneously]

Some Design Guidelines
- Use NAND gates (instead of NOR) wherever possible
- Place inverters (buffers) at high fanout nodes to improve drive capability
- Avoid use of NOR completely in high-speed circuits: $A_1 + A_2 + \dots + A_n = \overline{\overline{A_1} \cdot \overline{A_2} \cdots \overline{A_n}}$
- Use limited fan-in (<10): high fan-in → long series stacks
- Use minimum-sized gates on high fan-out nodes: minimize load presented to driving gate

Logical Effort
- Chip designers face a bewildering array of choices
  - What is the best circuit topology for a function?
  - How many stages of logic give least delay?
  - How wide should the transistors be?
- Logical effort is a method to make these decisions
  - Uses a simple model of delay
  - Allows back-of-the-envelope calculations
  - Helps make rapid comparisons between alternatives
  - Emphasizes remarkable symmetries

Delay in a Logic Gate
- Express delays in process-independent unit
  - τ = 3RC
  - ≈ 12 ps in 180 nm process
  - 40 ps in 0.6 μm process
- Delay has two components: d = f + p
- Effort delay f = gh (a.k.a. stage effort)
  - Again has two components
  - g: logical effort
    - Measures relative ability of gate to deliver current
    - g ≡ 1 for inverter
  - h: electrical effort = Cout / Cin
    - Ratio of output to input capacitance
    - Sometimes called fanout
- Parasitic delay p
  - Represents delay of gate driving no load
  - Set by internal parasitic capacitance

Delay Plots
d = f + p
  = gh + p

[Figure: normalized delay d plotted against electrical effort h; for each gate the slope of the line is g and the intercept is p]

Computing Logical Effort
- Definition: Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter delivering the same output current.
- Measure from delay vs. fanout plots
- Or estimate by counting transistor widths

Catalog of Gates
- Logical effort of common gates

Gate type \ Number of inputs | 1 | 2 | 3 | 4 | n
Inverter | 1 | - | - | - | -
NAND | - | 4/3 | 5/3 | 6/3 | (n+2)/3
NOR | - | 5/3 | 7/3 | 9/3 | (2n+1)/3
Tristate / mux | 2 | 2 | 2 | 2 | 2

Catalog of Gates
- Parasitic delay of common gates
  - In multiples of pinv (≈ 1)

Gate type \ Number of inputs | 1 | 2 | 3 | 4 | n
Inverter | 1 | - | - | - | -
NAND | - | 2 | 3 | 4 | n
NOR | - | 2 | 3 | 4 | n
Tristate / mux | 2 | 4 | 6 | 8 | 2n
XOR, XNOR | - | 4 | 6 | 8 | -
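The NAND and NOR rows of both catalogs follow from counting transistor widths, which a short Python sketch can make concrete. It assumes unit-drive sizing (NAND: nMOS width n, pMOS width 2; NOR: nMOS width 1, pMOS width 2n), a unit inverter with 3C of input capacitance and 3C of output diffusion, and contacted diffusion on every output-connected transistor. The function names are illustrative.

```python
from fractions import Fraction


def logical_effort(gate, n_inputs=1):
    """Input capacitance of one input divided by the inverter's 3C."""
    if gate == "inv":
        return Fraction(1)
    if gate == "nand":
        return Fraction(n_inputs + 2, 3)      # nMOS n + pMOS 2
    if gate == "nor":
        return Fraction(1 + 2 * n_inputs, 3)  # nMOS 1 + pMOS 2n
    raise ValueError(f"unsupported gate type: {gate}")


def parasitic_delay(gate, n_inputs=1):
    """Output diffusion capacitance divided by the inverter's 3C."""
    if gate == "inv":
        return Fraction(1)
    if gate == "nand":
        # n pMOS of width 2 plus the top nMOS of width n touch the output
        return Fraction(2 * n_inputs + n_inputs, 3)
    if gate == "nor":
        # n nMOS of width 1 plus the bottom pMOS of width 2n touch the output
        return Fraction(n_inputs + 2 * n_inputs, 3)
    raise ValueError(f"unsupported gate type: {gate}")


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, logical_effort("nand", n), logical_effort("nor", n),
              parasitic_delay("nand", n))
```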

Example: Ring Oscillator
- Estimate the frequency of an N-stage ring oscillator
  - Logical Effort: g = 1
  - Electrical Effort: h = 1
  - Parasitic Delay: p = 1
  - Stage Delay: d = 2
  - Frequency: fosc = 1/(2*N*d) = 1/(4N), in units of 1/τ
- A 31-stage ring oscillator in a 0.6 μm process has a frequency of ~200 MHz
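The 200 MHz figure can be checked against the normalized result. This minimal sketch assumes τ = 40 ps for the 0.6 μm process, the value quoted on the "Delay in a Logic Gate" slide. The function names are illustrative.

```python
def stage_delay(g, h, p):
    """Normalized stage delay d = g*h + p, in units of tau."""
    return g * h + p


def ring_oscillator_frequency_hz(n_stages, tau_seconds, g=1.0, h=1.0, p=1.0):
    """Frequency of an N-stage ring of identical inverters.

    One period takes two trips around the ring (a rising and a falling
    edge must each propagate through all N stages).
    """
    d = stage_delay(g, h, p)
    period = 2 * n_stages * d * tau_seconds
    return 1.0 / period


if __name__ == "__main__":
    f = ring_oscillator_frequency_hz(31, 40e-12)
    print(f"{f / 1e6:.1f} MHz")
```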

Example: FO4 Inverter
- Estimate the delay of a fanout-of-4 (FO4) inverter
  - Logical Effort: g = 1
  - Electrical Effort: h = 4
  - Parasitic Delay: p = 1
  - Stage Delay: d = 5
- The FO4 delay is about
  - 200 ps in 0.6 μm process
  - 60 ps in a 180 nm process
  - f/3 ns in an f μm process
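The two ways of estimating FO4 delay on this slide agree with each other, which a short sketch can confirm. It assumes the τ values quoted earlier in the deck (40 ps at 0.6 μm, 12 ps at 180 nm) and treats the f/3 ns figure as a rule of thumb rather than an exact law. The function names are illustrative.

```python
def fo4_delay_ps(tau_ps, g=1, h=4, p=1):
    """FO4 inverter delay: d = g*h + p = 5 tau, converted to picoseconds."""
    return (g * h + p) * tau_ps


def fo4_rule_of_thumb_ps(feature_size_um):
    """Rule of thumb from the slide: about f/3 ns in an f-micron process."""
    return feature_size_um / 3.0 * 1000.0


if __name__ == "__main__":
    for feature_um, tau_ps in ((0.6, 40), (0.18, 12)):
        print(feature_um, fo4_delay_ps(tau_ps), fo4_rule_of_thumb_ps(feature_um))
```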

Multistage Logic Networks
- Logical effort generalizes to multistage networks
  - Path Logical Effort: $G = \prod g_i$
  - Path Electrical Effort: $H = C_{\text{out-path}} / C_{\text{in-path}}$
  - Path Effort: $F = \prod f_i = \prod g_i h_i$
- Can we write F = GH?

Paths that Branch
- No! Consider paths that branch:

[Figure: an inverter with input capacitance 5 drives two inverters of input capacitance 15 each; the on-path one drives a load of 90]

  - G = 1
  - H = 90 / 5 = 18
  - GH = 18
  - h1 = (15 + 15) / 5 = 6
  - h2 = 90 / 15 = 6
  - F = g1g2h1h2 = 36 = 2GH

Branching Effort
- Introduce branching effort
  - Accounts for branching between stages in path
  - $b = (C_{\text{on-path}} + C_{\text{off-path}}) / C_{\text{on-path}}$, $B = \prod b_i$
  - Note: $\prod h_i = BH$
- Now we compute the path effort
  - F = GBH
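The branching example from the previous slide can be rerun with exact fractions to confirm that F = GBH restores the equality. The capacitance values (5, 15, 15, 90) are the ones in that example; the function names are illustrative.

```python
from fractions import Fraction
from math import prod


def branching_effort(c_on_path, c_off_path):
    """b = (C_on-path + C_off-path) / C_on-path for one stage."""
    return Fraction(c_on_path + c_off_path, c_on_path)


def branching_example():
    g = [Fraction(1), Fraction(1)]                 # two inverters
    h = [Fraction(15 + 15, 5), Fraction(90, 15)]   # per-stage electrical effort
    G = prod(g)
    H = Fraction(90, 5)                            # path load / path input cap
    B = branching_effort(15, 15)                   # one on-path, one off-path
    F = prod(gi * hi for gi, hi in zip(g, h))      # product of stage efforts
    return G, B, H, F


if __name__ == "__main__":
    G, B, H, F = branching_example()
    print(G, B, H, F, G * B * H == F)
```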

Multistage Delays
- Path Effort Delay: $D_F = \sum f_i$
- Path Parasitic Delay: $P = \sum p_i$
- Path Delay: $D = \sum d_i = D_F + P$

Designing Fast Circuits
- $D = \sum d_i = D_F + P$
- Delay is smallest when each stage bears same effort: $\hat{f} = g_i h_i = F^{1/N}$
- Thus minimum delay of N stage path is $D = N F^{1/N} + P$
- This is a key result of logical effort
  - Find fastest possible delay
  - Doesn't require calculating gate sizes

Gate Sizes
- How wide should the gates be for least delay?
  - $\hat{f} = gh = g \, C_{\text{out}} / C_{\text{in}}$, so $C_{\text{in},i} = g_i C_{\text{out},i} / \hat{f}$
- Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
- Check work by verifying input cap spec is met.

Example: 3-stage path
- Select gate sizes x and y for least delay from A to B

[Figure: path from A to B through a NAND2 (input capacitance 8), a NAND3 of size x and a NOR2 of size y driving a load of 45; the NAND2 output branches 3 ways and the NAND3 output 2 ways]

  - Logical Effort: G = (4/3)*(5/3)*(5/3) = 100/27
  - Electrical Effort: H = 45/8
  - Branching Effort: B = 3 * 2 = 6
  - Path Effort: F = GBH = 125
  - Best Stage Effort: $\hat{f} = F^{1/3} = 5$
  - Parasitic Delay: P = 2 + 3 + 2 = 7
  - Delay: D = 3*5 + 7 = 22 = 4.4 FO4
- Work backward for sizes
  - y = 45 * (5/3) / 5 = 15
  - x = (15*2) * (5/3) / 5 = 10
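The whole example can be replayed in Python, including the final check that the first gate's input capacitance comes back to 8. The sketch assumes the stage order NAND2, NAND3, NOR2 with branching factors 3, 2 and 1, which is inferred from the logical effort, parasitic delay and branching numbers in the example. The function name and tuple layout are illustrative.

```python
from fractions import Fraction
from math import prod


def design_path(stages, c_in, c_load):
    """stages: list of (g, p, b) from input to output, where b is the
    number of identical loads the stage output branches to."""
    n = len(stages)
    G = prod(g for g, _, _ in stages)
    B = prod(b for _, _, b in stages)
    H = Fraction(c_load, c_in)
    F = G * B * H
    f_hat = float(F) ** (1.0 / n)          # best stage effort
    P = sum(p for _, p, _ in stages)
    D = n * f_hat + P                      # minimum path delay
    # Work backward: Cin_i = g_i * (b_i * Cin_next) / f_hat
    input_caps = []
    c_next = float(c_load)
    for g, _, b in reversed(stages):
        c_next = float(g) * b * c_next / f_hat
        input_caps.append(c_next)
    input_caps.reverse()
    return {"F": F, "f_hat": f_hat, "D": D, "input_caps": input_caps}


if __name__ == "__main__":
    stages = [(Fraction(4, 3), 2, 3),   # NAND2, output branches 3 ways
              (Fraction(5, 3), 3, 2),   # NAND3 (size x), branches 2 ways
              (Fraction(5, 3), 2, 1)]   # NOR2 (size y), drives the load
    print(design_path(stages, c_in=8, c_load=45))
```

The returned input capacitances are approximately 8, 10 and 15, so the first value confirms the input cap spec and the other two are x and y.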

Best Number of Stages
- How many stages should a path use?
  - Minimizing number of stages is not always fastest
- Example: drive 64-bit datapath with unit inverter
  - $D = N F^{1/N} + P = N(64)^{1/N} + N$
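Evaluating the formula for a few values of N shows why the single-stage design is not the fastest. This minimal sketch assumes a chain of inverters with g = 1 and pinv = 1, so that F = 64 and P = N as in the slide. The function names and the search limit are illustrative.

```python
def path_delay(n_stages, path_effort, p_inv=1.0):
    """D = N * F**(1/N) + N * p_inv for a chain of N inverters."""
    return n_stages * path_effort ** (1.0 / n_stages) + n_stages * p_inv


def best_stage_count(path_effort, max_stages=10):
    """Stage count in 1..max_stages with the smallest delay."""
    return min(range(1, max_stages + 1),
               key=lambda n: path_delay(n, path_effort))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, round(path_delay(n, 64), 2))
    print("best N:", best_stage_count(64))
```

For F = 64 the delays are 65, 18, 15 and about 15.3 for N = 1 to 4, so three stages are fastest.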

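Derivation
- Consider adding inverters to end of path
  - How many give least delay?
  - With $n_1$ stages of logic and $N - n_1$ added inverters: $D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1) p_{inv}$
  - Setting $\partial D / \partial N = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$
- Define best stage effort $\rho = F^{1/N}$
  - $p_{inv} + \rho (1 - \ln \rho) = 0$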

Best Stage Effort
- $p_{inv} + \rho (1 - \ln \rho) = 0$ has no closed-form solution
- Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
- For pinv = 1, solve numerically for ρ = 3.59
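The numerical solution can be sketched with plain bisection. The equation solved is p_inv + ρ(1 − ln ρ) = 0, the standard best-stage-effort condition from logical effort; the bracket [1, 10] and the tolerance are assumptions of this sketch.

```python
import math


def best_stage_effort(p_inv, tol=1e-9):
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection."""
    def residual(rho):
        return p_inv + rho * (1.0 - math.log(rho))

    lo, hi = 1.0, 10.0          # residual(1) = p_inv + 1 > 0
    if residual(hi) > 0:
        raise ValueError("p_inv too large for the bracket [1, 10]")
    # The residual decreases for rho > 1, so keep the sign change bracketed.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(round(best_stage_effort(0.0), 3))
    print(round(best_stage_effort(1.0), 2))
```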

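Review of Definitions

Term | Stage | Path
number of stages | 1 | N
logical effort | g | $G = \prod g_i$
electrical effort | $h = C_{\text{out}} / C_{\text{in}}$ | $H = C_{\text{out-path}} / C_{\text{in-path}}$
branching effort | $b = (C_{\text{on-path}} + C_{\text{off-path}}) / C_{\text{on-path}}$ | $B = \prod b_i$
effort | f = gh | F = GBH
effort delay | f | $D_F = \sum f_i$
parasitic delay | p | $P = \sum p_i$
delay | d = f + p | $D = \sum d_i = D_F + P$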

Method of Logical Effort
1) Compute path effort
2) Estimate best number of stages
3) Sketch path with N stages
4) Estimate least delay
5) Determine best stage effort
6) Find gate sizes

Limits of Logical Effort
- Chicken and egg problem
  - Need path to compute G
  - But don't know number of stages without G
- Simplistic delay model
  - Neglects input rise time effects
- Interconnect
  - Iteration required in designs with wire
- Maximum speed only
  - Not minimum area/power for constrained delay

Summary
- Logical effort is useful for thinking of delay in circuits
  - Numeric logical effort characterizes gates
  - NANDs are faster than NORs in CMOS
  - Paths are fastest when effort delays are ~4
  - Path delay is weakly sensitive to stages, sizes
  - But using fewer stages doesn't mean faster paths
  - Delay of path is about $\log_4 F$ FO4 inverter delays
  - Inverters and NAND2 best for driving large caps
- Provides language for discussing fast circuits
  - But requires practice to master

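Power and Energy
- Power is drawn from a voltage source attached to the VDD pin(s) of a chip.
- Instantaneous Power: $P(t) = i_{DD}(t) \, V_{DD}$
- Energy: $E = \int_0^T P(t) \, dt = \int_0^T i_{DD}(t) \, V_{DD} \, dt$
- Average Power: $P_{avg} = E / T = \frac{1}{T} \int_0^T i_{DD}(t) \, V_{DD} \, dt$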

Dynamic Power
- Dynamic power is required to charge and discharge load capacitances when transistors switch.
- One cycle involves a rising and falling output.
- On rising output, charge Q = CVDD is required
- On falling output, charge is dumped to GND
- This repeats T·fsw times over an interval of T

Dynamic Power Cont.

$$P_{dynamic} = \frac{1}{T} \int_0^T i_{DD}(t) \, V_{DD} \, dt = \frac{V_{DD}}{T} \left[ T f_{sw} C V_{DD} \right] = C V_{DD}^2 f_{sw}$$

Activity Factor
- Suppose the system clock frequency = f
- Let fsw = αf, where α = activity factor
  - If the signal is a clock, α = 1
  - If the signal switches once per cycle, α = ½
  - Dynamic gates: switch either 0 or 2 times per cycle, α = ½
  - Static gates: depends on design, but typically α = 0.1
- Dynamic power: $P_{dynamic} = \alpha C V_{DD}^2 f$

Short Circuit Current
- When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
- Leads to a blip of "short circuit" current.
- < 10% of dynamic power if rise/fall times are comparable for input and output

Example
- 200 Mtransistor chip
  - 20M logic transistors
    - Average width: 12 λ
  - 180M memory transistors
    - Average width: 4 λ
  - 1.2 V 100 nm process
  - Cg = 2 fF/μm

Dynamic Example
- Static CMOS logic gates: activity factor = 0.1
- Memory arrays: activity factor = 0.05 (many banks!)
- Estimate dynamic power consumption per MHz. Neglect wire capacitance and short-circuit current.
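One way to work this estimate is to total the gate capacitance of each transistor population and apply P = αCV²f. The sketch below uses the chip parameters from the Example slide and assumes λ = 0.05 μm (half the 100 nm feature size), which the slides do not state explicitly. The function names are illustrative.

```python
LAMBDA_UM = 0.05         # assumption: lambda is half the 100 nm feature size
CG_F_PER_UM = 2e-15      # gate capacitance per micron of width (2 fF/um)
VDD = 1.2                # supply voltage in volts


def gate_capacitance_farads(n_transistors, avg_width_lambda):
    """Total gate capacitance of a population of transistors."""
    return n_transistors * avg_width_lambda * LAMBDA_UM * CG_F_PER_UM


def dynamic_power_watts(alpha, capacitance, vdd, frequency_hz):
    """P = alpha * C * VDD^2 * f."""
    return alpha * capacitance * vdd ** 2 * frequency_hz


def dynamic_power_per_mhz():
    c_logic = gate_capacitance_farads(20e6, 12)    # about 24 nF
    c_memory = gate_capacitance_farads(180e6, 4)   # about 72 nF
    return (dynamic_power_watts(0.1, c_logic, VDD, 1e6)
            + dynamic_power_watts(0.05, c_memory, VDD, 1e6))


if __name__ == "__main__":
    print(f"{dynamic_power_per_mhz() * 1e3:.2f} mW/MHz")
```

Under these assumptions the estimate comes to roughly 8.6 mW per MHz.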

Static Power
- Static power is consumed even when chip is quiescent.
  - Ratioed circuits burn power in fight between ON transistors
  - Leakage draws power from nominally OFF devices

Ratio Example
- The chip contains a 32 word x 48 bit ROM
  - Uses pseudo-nMOS decoder and bitline pullups
  - On average, one wordline and 24 bitlines are high
- Find static power drawn by the ROM
  - β = 75 μA/V²
  - Vtp = -0.4 V
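A possible way to work this problem is sketched below; it is a reconstruction under stated assumptions, not the slide's own solution. It assumes VDD = 1.2 V (from the example chip), a saturated pMOS pull-up with square-law current I = (β/2)(VDD − |Vtp|)², and that a pseudo-nMOS output burns static power only while it is held low, so the 31 low wordlines and 24 low bitlines contribute. The function names are illustrative.

```python
def pullup_current_amps(beta, vdd, vtp):
    """Saturation current of an always-on pMOS pull-up (square-law model)."""
    return 0.5 * beta * (vdd - abs(vtp)) ** 2


def rom_static_power_watts(words=32, bits=48, wordlines_high=1,
                           bitlines_high=24, beta=75e-6, vdd=1.2, vtp=-0.4):
    """Static power of the pseudo-nMOS decoder and bitline pull-ups."""
    # Assumption: only outputs held low conduct from VDD to GND.
    low_outputs = (words - wordlines_high) + (bits - bitlines_high)
    return low_outputs * vdd * pullup_current_amps(beta, vdd, vtp)


if __name__ == "__main__":
    print(f"{pullup_current_amps(75e-6, 1.2, -0.4) * 1e6:.1f} uA per pull-up")
    print(f"{rom_static_power_watts() * 1e3:.2f} mW")
```

With these assumptions each pull-up draws about 24 μA and the ROM dissipates about 1.6 mW.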

Leakage Example
- The process has two threshold voltages and two oxide thicknesses.
- Subthreshold leakage:
  - 20 nA/μm for low Vt
  - 0.02 nA/μm for high Vt
- Gate leakage:
  - 3 nA/μm for thin oxide
  - 0.002 nA/μm for thick oxide
- Memories use low-leakage transistors everywhere
- Gates use low-leakage transistors on 80% of logic

Leakage Example Cont.
- Estimate static power (taking λ = 0.05 μm):
  - High leakage: 20M × 0.2 × 12λ × 0.05 μm/λ = 2.4 × 10⁶ μm of transistor width
  - Low leakage: 20M × 0.8 × 12λ × 0.05 μm/λ + 180M × 4λ × 0.05 μm/λ = 45.6 × 10⁶ μm of transistor width
- If no low leakage devices, Pstatic = 749 mW (!)
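The 749 mW figure can be reproduced from the numbers on the two Leakage Example slides. The sketch below assumes λ = 0.05 μm, VDD = 1.2 V, and that on average half of the transistors are OFF and therefore contribute subthreshold leakage, while gate leakage applies to the full width. Those three assumptions are not stated on the slides; they are chosen because they recover the quoted result. The names are illustrative.

```python
LAMBDA_UM = 0.05   # assumption: lambda is half the 100 nm feature size
VDD = 1.2          # assumption: supply voltage of the example chip


def static_power_watts(width_high_leak_um, width_low_leak_um):
    """Leakage power for given total widths of high- and low-leakage devices."""
    # Subthreshold leakage is halved (assume half the devices are OFF);
    # gate leakage is counted on the full width.
    i_high = width_high_leak_um * (20e-9 / 2 + 3e-9)
    i_low = width_low_leak_um * (0.02e-9 / 2 + 0.002e-9)
    return (i_high + i_low) * VDD


def leakage_example():
    logic_width = 20e6 * 12 * LAMBDA_UM     # total logic width in microns
    memory_width = 180e6 * 4 * LAMBDA_UM    # total memory width in microns
    mixed = static_power_watts(0.2 * logic_width,
                               0.8 * logic_width + memory_width)
    all_high = static_power_watts(logic_width + memory_width, 0.0)
    return mixed, all_high


if __name__ == "__main__":
    mixed, all_high = leakage_example()
    print(f"with low-leakage devices: {mixed * 1e3:.0f} mW")
    print(f"without:                  {all_high * 1e3:.0f} mW")
```

Under the same assumptions, the design that uses low-leakage devices as described comes to about 38 mW.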

Low Power Design
- Reduce dynamic power
  - α: clock gating, sleep mode
  - C: small transistors (esp. on clock), short wires
  - VDD: lowest suitable voltage
  - f: lowest suitable frequency
- Reduce static power
  - Selectively use ratioed circuits
  - Selectively use low Vt devices
  - Leakage reduction: stacked devices, body bias, low temperature
