Advanced VLSI Design: Dr. Premananda B.S
• Introduction to Syllabus
• Reference Books
• Scaling
• Design issues of VLSI Systems
• Review of CMOS logic
• Lab Experiments: Assignments/Projects
[Figure: D latch symbol (D, CLK, Q) and its CLK, D, Q timing]
Sequential logic circuit
• Timing metrics for sequential circuits
• Bistability principle
• Static latches (NAND & NOR based), Clocked
• Multiplexer-based Latches
• MS Edge-triggered Register using Multiplexers
• Improved static MS Edge-triggered Register
Multiplexer-based Latches
Positive latch built using Transmission Gates
D Latch Operation
[Figure: D latch operation: transparent (Q follows D) when CLK = 1, opaque (Q held) when CLK = 0; CLK and Q waveforms]
D Flip-flop Design
[Figure: D flip-flop built from two back-to-back latches clocked on opposite phases of CLK, with intermediate node QM]
D Flip-flop Operation
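[Figure: D flip-flop operation: with CLK = 0 the first latch passes D to QM while Q is held; with CLK = 1 QM is held and passed to Q; CLK and Q waveforms]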
Non-overlapping Clocks
• Non-overlapping clocks can prevent races
– As long as non-overlap exceeds clock skew
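[Figure: flip-flop built from two latches clocked by non-overlapping phases φ1 and φ2, with intermediate node QM; φ1 and φ2 waveforms]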
Dynamic Latches and Registers
• Dynamic transmission-gate edge-triggered registers
• Clocked CMOS register
• True single-phase clocked register
Definitions
• Static storage
– Static storage uses a bistable element with feedback to store its
state and thus preserves state as long as the power is on.
– Load new data: 1) cut the feedback path (mux based);
2) overpower the feedback path (SRAM based).
• Dynamic storage
– Dynamic storage holds state on parasitic capacitors, so the state is
held for only a limited time and requires periodic refresh.
– Dynamic storage is simpler (fewer transistors), faster, and lower
power, but because of noise-immunity issues the circuit is usually
modified so that it is pseudo-static.
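The limited hold time of a dynamic node can be estimated with a first-order charge balance, t = C · ΔV / I_leak. The sketch below is only an illustration: the function name and all numeric values (node capacitance, tolerable droop, leakage current) are assumptions, not figures from these slides.

```python
def retention_time(c_node_farads, delta_v_volts, i_leak_amps):
    """First-order time for leakage to shift the stored voltage by delta_v.

    Uses t = C * dV / I, assuming a constant leakage current.
    """
    if i_leak_amps <= 0:
        raise ValueError("leakage current must be positive")
    return c_node_farads * delta_v_volts / i_leak_amps


# Hypothetical values: 2 fF storage node, 0.3 V tolerable droop, 10 pA leakage.
t_hold_state = retention_time(2e-15, 0.3, 10e-12)
print(f"refresh needed within about {t_hold_state * 1e6:.0f} us")
```

With these assumed numbers the node must be refreshed (or rewritten by the clock) on a microsecond time scale, which is why dynamic registers need a minimum clock frequency.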
Storage Mechanisms
Dynamic Transmission-Gate Edge-Triggered Registers
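[Figure: dynamic master-slave register: D, T1, I1, QM, T2, I2, Q, with storage capacitances C1 and C2; T1 (master) and T2 (slave) are clocked on opposite phases of clk]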
tsu = tpd_tx
thold = zero
tc-q = 2 tpd_inv + tpd_tx
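The three timing parameters of the dynamic transmission-gate register (tsu = tpd_tx, thold = 0, tc-q = 2 tpd_inv + tpd_tx) follow directly from the gate delays. A minimal sketch, assuming hypothetical delay values in picoseconds and a helper name chosen only for illustration:

```python
def dynamic_tg_register_timing(t_pd_tx, t_pd_inv):
    """Timing of the dynamic transmission-gate master-slave register.

    t_pd_tx  : delay of one transmission gate
    t_pd_inv : delay of one inverter
    """
    return {
        "t_su": t_pd_tx,                 # D must pass T1 before the edge
        "t_hold": 0,                     # T1 turns off at the edge
        "t_cq": 2 * t_pd_inv + t_pd_tx,  # two inverters plus one transmission gate
    }


# Hypothetical gate delays in picoseconds.
timing = dynamic_tg_register_timing(t_pd_tx=30, t_pd_inv=20)
print(timing)
```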
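[Figure: register state on each clock phase: master transparent with slave hold, then master hold with slave transparent]
[Figure: the same register driven by non-overlapping clocks clk1 and clk2 (with !clk1 and !clk2), separated by tnon_overlap]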
Keep the clock non-overlap time large enough that no overlap occurs
even in the presence of clock skew.
But now there are 4 clock signals to route!
Pseudo-static Dynamic Latch
• Robustness considerations limit the use of dynamic FF:
– coupling between signal nets and internal storage nodes
can inject significant noise and destroy the FF state
– leakage currents cause state to leak away with time
– internal dynamic nodes don’t track fluctuations in VDD, which
reduces noise margins
[Figure: pseudo-static latch clocked by clk and !clk, with nodes QM and Q]
• Adding a weak feedback inverter costs some delay and power
consumption, but it improves noise immunity.
Dynamic Latches and Registers
• Dynamic transmission-gate edge-triggered registers
[Figure: transistor-level master-slave register (M1 to M8, storage nodes QM on C1 and Q on C2) in the master-transparent, slave-hold phase: clocked devices M3 and M4 are on, M7 and M8 are off]
[Figure: register with input D, internal node X, and output Q, clocked by clk]
Inferences
• In a flop-based system:
– Data launches on one rising edge
– Must set up before the next rising edge
– If it arrives late, system fails
– If it arrives early, time is wasted
– Flops have hard edges
Summary
• Static MS edge-triggered register
• Dynamic transmission-gate latch
• Dynamic transmission-gate edge-triggered register
• Clocked CMOS register
• Dual-edge register
• Demo on using Multisim CAD tool
Delay/area overhead is minimized by logic-based TSPC.
Only a single-phase clock is used. When CLK is high, the latch is
in evaluate mode. When CLK is low, the latch is in hold mode.
TSPC Register (TSPCR)
Master-Slave Flip-flops
Including Logic into TSPC Latch
[Figure: TSPC latch stage with a static logic block (PUN and PDN) between In and Out]
Contamination and Propagation Delays
Timing issues of Flip-flop and Latch systems
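[Figure: flip-flops F1 and F2 (clk) with combinational logic between Q1 and D2; timing diagram showing tpcq, tpd, and tsetup within Tc]
tpd ≤ Tc − (tsetup + tpcq), where tsetup + tpcq is the sequencing overhead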
• The overhead of flip-flops must be as small as possible in order to
maximize the available time for the combinational logic to carry out
more complicated functions
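The max-delay (setup) constraint for a flip-flop system, tpd ≤ Tc − (tsetup + tpcq), can be turned into a quick budget calculation. This is a sketch under assumed numbers: the delays below are hypothetical picosecond values, and the function names are illustrative.

```python
def max_logic_delay(t_c, t_setup, t_pcq):
    """Largest combinational delay allowed between two flip-flops."""
    return t_c - (t_setup + t_pcq)


def min_clock_period(t_pd, t_setup, t_pcq):
    """Smallest clock period that still meets the setup constraint."""
    return t_pd + t_setup + t_pcq


# Hypothetical flip-flop parameters, in picoseconds.
T_C, T_SETUP, T_PCQ = 1000, 50, 80
budget = max_logic_delay(T_C, T_SETUP, T_PCQ)
print(f"logic budget: {budget} ps of a {T_C} ps cycle")
print(f"minimum period for 900 ps of logic: {min_clock_period(900, T_SETUP, T_PCQ)} ps")
```

Here 130 ps of every cycle is sequencing overhead, which is the quantity the bullet above asks to minimize.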
Min-delay constraint
[Figure: flip-flops F1 and F2 (clk) with combinational logic CL between Q1 and D2; timing diagram showing tccq, tcd, and thold]
thold ≤ tccq + tcd
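The min-delay constraint thold ≤ tccq + tcd can be checked as a hold slack. The sketch below uses hypothetical picosecond values; the case tcd = 0 models two flip-flops wired back to back with no logic between them.

```python
def hold_slack(t_ccq, t_cd, t_hold):
    """Positive or zero slack means the hold constraint t_hold <= t_ccq + t_cd is met."""
    return t_ccq + t_cd - t_hold


# Hypothetical flip-flop parameters, in picoseconds.
T_CCQ, T_HOLD = 35, 50
for t_cd in (0, 20):
    slack = hold_slack(T_CCQ, t_cd, T_HOLD)
    status = "ok" if slack >= 0 else "HOLD VIOLATION"
    print(f"t_cd = {t_cd:2d} ps -> slack = {slack:4d} ps ({status})")
```

Note that the clock period does not appear anywhere in the check, so a violation cannot be removed by slowing the clock.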
Max-Delay: 2-Phase Latches
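[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1, with Combinational Logic 1 between Q1 and D2 and Combinational Logic 2 between Q2 and D3; timing diagram showing tpdq1, tpd1, tpdq2, tpd2 within Tc]
tpd = tpd1 + tpd2 ≤ Tc − 2tpdq, where 2tpdq is the sequencing overhead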
Min-Delay: 2-Phase Latches
thold ≤ tcd + tccq + tnonoverlap − tsu
[Figure: latches L1 (φ1) and L2 (φ2) with combinational logic CL between Q1 and D2; timing diagram showing tnonoverlap, tccq, tcd, and thold]
Inferences
• The design of a system with latches is much more difficult
than with flip-flops due to the transparent property inherently
associated with latches.
• The setup-time failure can be eliminated by elongating the
clock period, namely, slowing the operating clock, or by using
flip-flops with shorter setup time and/or clock-to-Q delay.
• The hold-time failure can only be fixed by redesigning the
logic circuit; it cannot be fixed by slowing the operating clock.
• Good practice is to design a system very conservatively in
order to avoid such failures because redesigning or modifying
a system or a chip is very expensive and time consuming.
Timing Issues in Clocked Systems
• Max and Min delay constraints
[Figure: clock skew tSK and clock jitter tJS on Clk; timing diagram with CLK1 and CLK2 offset by δ, clock edges 1 to 4, periods TCLK and TCLK + δ, and the hold window δ + thold]
• T: T + δ ≥ tc-q + tplogic + tsu, so T ≥ tc-q + tplogic + tsu − δ
• thold: thold + δ ≤ tcdlogic + tcdreg, so thold ≤ tcdlogic + tcdreg − δ
• δ > 0: Improves performance, but makes thold harder to meet.
• If thold is not met (race conditions), the circuit malfunctions
independent of the clock period!
• Clock skew has potential to improve the performance of
the circuit.
• Increasing skew makes the circuit more susceptible to
race conditions.
• If the minimum delay of the combinational logic block is
small, the inputs to R2 may change before R2’s first
rising edge.
• To avoid races, ensure that the minimum delay through
the register and logic is long enough that the inputs
to R2 remain valid for a hold time after that edge.
• Reducing the clock frequency can’t fix it!
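The trade-off between the two skew constraints, T ≥ tc-q + tplogic + tsu − δ and thold ≤ tcdlogic + tcdreg − δ, can be seen by sweeping the skew δ. The sketch below assumes hypothetical register and logic delays in picoseconds; function names are illustrative.

```python
def min_period_with_skew(t_cq, t_plogic, t_su, delta):
    """Minimum clock period: T >= t_cq + t_plogic + t_su - delta."""
    return t_cq + t_plogic + t_su - delta


def hold_met(t_hold, t_cdlogic, t_cdreg, delta):
    """Race check: t_hold <= t_cdlogic + t_cdreg - delta."""
    return t_hold <= t_cdlogic + t_cdreg - delta


# Hypothetical delays, in picoseconds.
T_CQ, T_PLOGIC, T_SU = 80, 700, 50
T_HOLD, T_CDLOGIC, T_CDREG = 30, 40, 30

for delta in (-50, 0, 50):
    period = min_period_with_skew(T_CQ, T_PLOGIC, T_SU, delta)
    ok = hold_met(T_HOLD, T_CDLOGIC, T_CDREG, delta)
    print(f"delta = {delta:4d} ps: T_min = {period} ps, hold {'met' if ok else 'VIOLATED'}")
```

With these assumed values, positive skew shortens the minimum period but breaks the hold check, while negative skew does the opposite.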
Negative Clock Skew
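• Clock and data flow in opposite directions
[Figure: registers R1 and R2 (In, D, Q) with combinational logic between them; clock arrival times tclk1 and tclk2, with clk routed through a delay; timing diagram with edges 1 to 4, periods T and T + δ, δ < 0]
[Figure: clock edge uncertainty from −tjitter to +tjitter]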
Flip-Flops
[Figure: combinational logic between two flops clocked by clk]
2-Phase Latches
[Figure: latches clocked by φ1, φ2, φ1, with combinational logic in each half-cycle]
Pulsed Latches
[Figure: combinational logic between two latches clocked by pulse φp of width tpw]
Timing Diagrams
[Figure: timing diagrams defining tpd and tcd for combinational logic (A to Y), tpcq and tccq for a flip-flop (clock to Q), and tpdq and tcdq for a latch (D to Q)]
How Much Borrowing?
2-Phase Latches
tborrow ≤ Tc/2 − (tsetup + tnonoverlap)
• In a latch-based system with a two-phase non-overlapping clocking
scheme, each latch is nominally assigned a half cycle regardless of
whether its logic is fast or slow.
[Figure: latches L1 (φ1) and L2 (φ2) with Combinational Logic 1 between Q1 and D2; timing diagram showing the nominal half-cycle 1 delay, Tc/2, tnonoverlap, tsetup, and tborrow]
Time Borrowing
• The slow logic can use more time than its designated
half cycle automatically without the need of any explicit
design changes.
• This ability is referred to as time borrowing, slack
borrowing, or cycle stealing.
• The maximum time borrowing in a latch-based system
can be derived from the timing diagram.
• Due to the feature of automatic time borrowing inherently
existing in a level sensitive latch-based system, a latch-
based system may have better overall performance than
a FF-based system.
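The borrowing limit for two-phase latches, tborrow ≤ Tc/2 − (tsetup + tnonoverlap), can be checked numerically. This is a simplified sketch: the delay values are hypothetical picosecond figures, and the borrowed time is taken as whatever a logic block needs beyond its nominal half cycle, ignoring latch propagation delay.

```python
def max_time_borrow(t_c, t_setup, t_nonoverlap):
    """Upper bound on borrowing: Tc/2 - (tsetup + tnonoverlap)."""
    return t_c / 2 - (t_setup + t_nonoverlap)


def borrowed_time(t_logic, t_c):
    """Time a logic block uses beyond its nominal half cycle (simplified)."""
    return max(0.0, t_logic - t_c / 2)


# Hypothetical values, in picoseconds.
T_C, T_SETUP, T_NONOVERLAP = 1000, 50, 20
limit = max_time_borrow(T_C, T_SETUP, T_NONOVERLAP)
need = borrowed_time(600, T_C)
print(f"borrowing limit: {limit} ps, slow stage needs {need} ps")
print("fits" if need <= limit else "does not fit")
```

A flip-flop system with the same 600 ps block would need a longer cycle, whereas the latch system absorbs the extra 100 ps as long as the following half cycle has matching slack.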
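Skew: Flip-Flops
[Figure: flip-flops F1 and F2 with combinational logic between Q1 and D2 and skew tskew between their clocks; setup timing diagram showing tpcq, tsetup, and tskew as sequencing overhead; hold timing diagram showing tccq, tcd, thold, and tskew]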
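When skew is treated as an unknown worst-case offset, it is commonly charged against both constraints: tpd ≤ Tc − (tpcq + tsetup + tskew) and tcd ≥ thold − tccq + tskew. These two forms are stated here as an assumption consistent with the skew-free constraints earlier in the deck; the values are hypothetical picosecond figures.

```python
def max_logic_delay_with_skew(t_c, t_pcq, t_setup, t_skew):
    """Setup side: skew adds to the sequencing overhead."""
    return t_c - (t_pcq + t_setup + t_skew)


def min_contamination_delay_with_skew(t_hold, t_ccq, t_skew):
    """Hold side: skew increases the minimum logic delay required."""
    return t_hold - t_ccq + t_skew


# Hypothetical values, in picoseconds.
T_C, T_PCQ, T_SETUP, T_HOLD, T_CCQ = 1000, 80, 50, 50, 35
for t_skew in (0, 40):
    t_pd_max = max_logic_delay_with_skew(T_C, T_PCQ, T_SETUP, t_skew)
    t_cd_min = min_contamination_delay_with_skew(T_HOLD, T_CCQ, t_skew)
    print(f"t_skew = {t_skew:2d} ps: t_pd <= {t_pd_max} ps, t_cd >= {t_cd_min} ps")
```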
Skew: Latches
2-Phase Latches
[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1, with Combinational Logic 1 between Q1 and D2 and Combinational Logic 2 between Q2 and D3]
tpd ≤ Tc − 2tpdq, where 2tpdq is the sequencing overhead
tborrow ≤ Tc/2 − (tsetup + tnonoverlap + tskew)
Summary
• Flip-Flops:
– Very easy to use, supported by all tools
• 2-Phase Transparent Latches:
– Lots of skew tolerance and time borrowing
• Pulsed Latches:
– Fast, some skew tolerance & borrow, hold time risk
[Figure: clock path with numbered labels: 2 Devices, 3 Interconnect, 4 Power Supply, 6 Capacitive Load]
• Clock-Signal Generation
• Manufacturing Device Variations
• Interconnect Variations
• Environmental Variations
• Capacitive Coupling
Clock Generation and Distribution Networks
Clock subsystem
• Clock generation
• Clock distribution
• Clock gaters
• Clock synchronous
Clock Generation Circuits
• Clock generators can be categorized into:
– multivibrators
– linear oscillators
• Two widely used multivibrators are:
– Ring oscillator and Schmitt-circuit-based oscillator
• A ring oscillator consists of a number of voltage-gain
stages to form a closed-feedback loop.
• A Schmitt-circuit-based oscillator is based on the charge
and discharge operations on timing capacitors.
– Both period and duty cycle can be controlled through
the time constants of charge and discharge paths.
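For a ring oscillator, a common first-order estimate of the oscillation frequency is f = 1 / (2 · N · tp), assuming N identical inverting stages (N odd) each with propagation delay tp. The formula, the function name, and the numbers below are illustrative assumptions rather than values from these slides.

```python
def ring_oscillator_frequency(n_stages, t_p_seconds):
    """First-order estimate f = 1 / (2 * N * tp) for N identical inverting stages."""
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("an inverter ring needs an odd number of stages, at least 3")
    if t_p_seconds <= 0:
        raise ValueError("stage delay must be positive")
    return 1.0 / (2 * n_stages * t_p_seconds)


# Hypothetical example: 5 stages, 20 ps per stage.
f_osc = ring_oscillator_frequency(5, 20e-12)
print(f"estimated frequency: {f_osc / 1e9:.2f} GHz")
```

The factor of 2 appears because a transition must travel around the loop twice to complete one full period.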
Clock Generation Circuits Contd..
• A linear oscillator (resonator-based oscillator) generates
a single-frequency sinusoidal wave.
• It is built from either an RC-tuned circuit or an LC-tuned circuit.
• An RC-tuned oscillator uses an RC-frequency-selective
feedback network to select the desired frequency.
• An LC-tuned oscillator uses an LC-frequency-selective
feedback network to select the desired frequency.
• The ring oscillator and Schmitt-circuit-based oscillator
are considered.
Ring Oscillators
PLL
[Figure: PLL-based clocking of two digital systems exchanging data: a crystal oscillator (fcrystal < 200 MHz) feeds a PLL that generates fsystem = N × fcrystal; a reference clock, divider, PLL, and clock buffer serve the second system]
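The frequency relation fsystem = N × fcrystal can be illustrated with a small calculation. The crystal frequency and divider value below are hypothetical; the 200 MHz bound on the crystal frequency is the one given in the figure.

```python
F_CRYSTAL_LIMIT_HZ = 200e6  # upper bound on the crystal frequency from the figure


def pll_system_frequency(f_crystal_hz, n_divider):
    """Locked PLL with a divide-by-N feedback: f_system = N * f_crystal."""
    if not 0 < f_crystal_hz < F_CRYSTAL_LIMIT_HZ:
        raise ValueError("crystal frequency should be positive and below 200 MHz")
    if n_divider < 1:
        raise ValueError("divider value N must be at least 1")
    return n_divider * f_crystal_hz


# Hypothetical example: 25 MHz crystal, divide-by-40 in the feedback path.
f_system = pll_system_frequency(25e6, 40)
print(f"system clock: {f_system / 1e9:.1f} GHz")
```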
PLL Block Diagram
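[Figure: PLL block diagram: the phase detector compares the reference clock with the local clock and drives Up/Down signals into the charge pump; the loop filter produces vcont for the VCO; the VCO output is the system clock and is fed back through a divide-by-N block as the local clock]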
Phase Detector
[Figure: phase detector output before filtering, and its transfer characteristic]
Phase-Frequency Detector
Clock-deskew circuits
• clk2 is fed back to the input of the PLL and compared with the
external clock clkin, so that clk2 tracks the phase of clkin and
the two clocks line up with each other.
Clock Generation and Distribution Networks
[Figure: global CLK distributed to digital circuits through voltage-controlled delay lines (VCDL), each locked by its own phase detector and charge pump/loop filter (CP/LF)]
DLL & PLL
Delay-Locked Loop (Delay Line Based)
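[Figure: delay-locked loop: fREF, phase detector (U/D outputs), charge pump, filter, delay line (DL), output fO]
[Figure: phase-locked loop: phase detector (PD), charge pump (CP), filter, VCO, ÷N feedback divider, output fO]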
Agenda
[Figure: pipeline from In through registers R1 to R4 (D, Q) separated by Logic Blocks #1, #2, and #3]
Self-timed design
[Figure: self-timed pipeline from In to Out through registers R1 to R3 and logic blocks F1 to F3]
[Figure: precharged carry chain (VDD, Start) with inputs P0 to P3 and K0 to K3 and carries C0 to C4; (b) completion signal]
Post-charge logic
Self-resetting 3-input OR
• Assume all inputs are low, and int is initially precharged.
• If A goes high, int will fall, causing out to go high, which in turn
causes the gate to precharge.
• When the PMOS precharge device is active, the inputs
must be in a reset state to avoid contention.
[Figure: self-resetting 3-input OR gate with inputs A, B, C, precharged node int, and output out]
3. Clock-Delayed Domino
[Figure: clock-delayed domino stage (VDD, pulldown network) with input D1 and output Q1, which is also D2 of the next stage]
Clock-Delayed Domino logic
• Interfacing circuits
• The switching current rate di/dt, and hence the L·di/dt noise,
can be reduced if the RC values are properly set.
• To control the resistance, control the slew rate of the output
buffer with the loading capacitance C on the predriver.
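The effect of slowing the output edge on the L·di/dt noise can be estimated with v = L · Δi / Δt. The sketch below assumes a linear current ramp; the inductance, current step, and edge times are hypothetical values chosen for illustration.

```python
def ldi_dt_noise(l_henries, delta_i_amps, delta_t_seconds):
    """Inductive noise voltage v = L * di/dt for a linear current ramp."""
    if delta_t_seconds <= 0:
        raise ValueError("transition time must be positive")
    return l_henries * delta_i_amps / delta_t_seconds


# Hypothetical values: 5 nH of package inductance, 20 mA switched per output.
L_PKG, DELTA_I = 5e-9, 20e-3
fast = ldi_dt_noise(L_PKG, DELTA_I, 1e-9)  # 1 ns edge
slow = ldi_dt_noise(L_PKG, DELTA_I, 4e-9)  # 4 ns slew-rate-controlled edge
print(f"fast edge: {fast * 1e3:.0f} mV, slow edge: {slow * 1e3:.0f} mV")
```

Under these assumptions, stretching the edge by a factor of four reduces the noise per switching output by the same factor.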
Differential-signal Scheme
[Figure: Human-body model (HBM): a 100 pF capacitor discharges through 1500 Ω into the device under test]
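The two component values of the human-body model fix the time scale and the peak current of the discharge. A minimal sketch, assuming an ideal RC discharge into a short circuit and a hypothetical precharge voltage of 2 kV:

```python
R_HBM_OHMS = 1500.0      # series resistance of the human-body model
C_HBM_FARADS = 100e-12   # body capacitance of the human-body model


def hbm_time_constant():
    """RC time constant of the HBM discharge."""
    return R_HBM_OHMS * C_HBM_FARADS


def hbm_peak_current(v_precharge_volts):
    """Peak current into a shorted device under test: I = V / R."""
    return v_precharge_volts / R_HBM_OHMS


tau = hbm_time_constant()
i_peak = hbm_peak_current(2000.0)  # 2 kV precharge is an assumed test level
print(f"tau = {tau * 1e9:.0f} ns, peak current = {i_peak:.2f} A")
```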